Sign in to follow this  
spraff

Please help me design this SQL query

Recommended Posts

spraff    100
Hello. I'm familiar with basic SQL but this is making my head hurt... I have a table of names and addresses, string fields FORENAME, SURNAME, ADDRESS. I want to select addresses by the number of distinct surnames therein. For example, If I have stored (FORENAME,SURNAME,ADDRESS) John,Smith,A Jill,Smith,A Adam,Jones,A Jack,Smith,B Jane,Smith,B I want to retrieve (ADDRESS,NUM_FAMILIES) A,2 B,1 Also since this is probably going to involve something like "SELECT ADDRESS, [expression] AS NUM_FAMILIES" is it legal to append "WHERE NUM_FAMILIES>x" even though the NUM_FAMILIES token has only just been defined? Incidentally, I'm trying to do this in OpenOffice Base, in case that matters. Thanks if you can help.

Share this post


Link to post
Share on other sites
spraff    100
Thanks for your suggestion, but that doesn't deal at all the problem! I am aware of COUNT etc, but the only way I can imagine to use them to solve this is imperatively in another language. How, within SQL can I get this result declaratively?

Thanks for your time.

Share this post


Link to post
Share on other sites
Telastyn    3777
Alimonster's answer is pretty much the best you're going to get without spelling it out for you or describing the entirety of group by/having behavior line by line.

That answer is probably more complete than I would give for something that smells decidedly like homework.

Share this post


Link to post
Share on other sites
Wan    1366
Hint: there exists a distinct keyword you can use in combination with the count function. Then group by what ever you need to distinct.

Share this post


Link to post
Share on other sites
spraff    100
The closest I've come is

SELECT
DISTINCT address AS a,
(SELECT COUNT(DISTINCT surname) AS n FROM Table1 WHERE address=a)
FROM Table1


This doesn't work, the inner query doesn't recognise 'a'. A stored procedure might tidy this up but I can't figure out how to make stored procedures work in OpenOffice Base. Any tips?

Share this post


Link to post
Share on other sites
Wan    1366
I think this should work (untested):
SELECT
DISTINCT address, COUNT(DISTINCT(surname))
FROM
Table1
GROUP BY
address
In the first column we're selecting all distinct addresses, in the second column we count the number of distinct surnames within a particular address.

Share this post


Link to post
Share on other sites
Codeka    1239
SELECT
ans.address, count(*)
FROM
(
SELECT
address, surname, count(*) AS num
FROM
table
GROUP BY
address, surname
) AS ans
GROUP BY
ans.address
I haven't tested this, but I think it'll work. Using your sample data, the inner SELECT should give results like this:

A,Smith,2
A,Jones,1
B,Smith,2

And the outer SELECT should return

A,2
B,1

It can probably be refined further (for example, you don't really need the COUNT(*) in the inner select, it's just for demonstration purposes)

Share this post


Link to post
Share on other sites
kirsis    170
Quote:
Original post by spraff
Thanks for your suggestion, but that doesn't deal at all the problem!


The reply by Alimonster was exactly what you asked for. It wasnt even pseudocode, it was a 100% valid SQL query that does what you asked for (+ background information).

It seems that you assumed it wouldn't work as-is and just "politely" requested someone else come up with a ready-made query for you.

Share this post


Link to post
Share on other sites
Codeka    1239
Quote:
Original post by kirsis
Quote:
Original post by spraff
Thanks for your suggestion, but that doesn't deal at all the problem!


The reply by Alimonster was exactly what you asked for. It wasnt even pseudocode, it was a 100% valid SQL query that does what you asked for (+ background information).
Actually, Alimonster's solution doesn't do what he asked for...

Share this post


Link to post
Share on other sites
ToohrVyk    1595
Quote:
Original post by Wan
I think this should work (untested):
SELECT
DISTINCT address, COUNT(DISTINCT(surname))
FROM
Table1
GROUP BY
address
In the first column we're selecting all distinct addresses, in the second column we count the number of distinct surnames within a particular address.


+1 (well, the syntax is COUNT(DISTINCT surname) and the GROUP BY makes the first DISTINCT pointless, but you get the idea).

Share this post


Link to post
Share on other sites
owl    376
I'm not sure if I understood you correctly but:


SELECT DISTINCT T1.surname, (select count(*) FROM [Table] AS T2 WHERE T2.surname=T1.surname) AS num_addresses FROM [Table] AS T1

Share this post


Link to post
Share on other sites
spraff    100
Thank you Wan, that works fine!

Is it a happy side-effect of GROUP BY that DISTINCT(surname) limits its scope to the address at that row, or is that exactly what GROUP BY was designed for?

Share this post


Link to post
Share on other sites
Wan    1366
Quote:
Original post by ToohrVyk
+1 (well, the syntax is COUNT(DISTINCT surname) and the GROUP BY makes the first DISTINCT pointless, but you get the idea).

Um yeah. Blame incremental development. :)

Quote:
Original post by spraff
Is it a happy side-effect of GROUP BY that DISTINCT(surname) limits its scope to the address at that row, or is that exactly what GROUP BY was designed for?

In this case the latter. Consider this obvious example:
SELECT
COUNT(DISTINCT(surname))
FROM
Table1

This will count the number of distinct surnames in the table, right?
But now we would like to split (group) them by address:
SELECT
COUNT(DISTINCT(surname))
FROM
Table1
GROUP BY
address

Profit!

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this