Sign in to follow this  
ArchG

[web] SQL Query (counting rows and returning data in one query)

Recommended Posts

Hello, I'm not quite sure if what I want to do is possible with one query, but I'll jump right into what I'm trying to do anyway.. This is our Setup
End Users Table
ID, Name, Email, Date_Created, Last_Date_Accessed

Uses Table
ID, End_User_ID, Date_Accessed
Every time a user logs into our system, a record is added to the Uses table. What I want to do, is generate a report, that contains the number of uses each user had in a given month, that would display like this Name, Email, User Since, Last Login, Number of Logins in X Month (x being a variable of any month I wish to generate) an example record for July would be Kyle, myEmail@gmail.com, 1/17/2008, 7/5/2008, 13 (meaning Kyle logged in 13 times this month..or that Kyle had 13 records in the Uses table with the Date_Accessed being between 7/1/2008, and 7/31/2008) This is what I know how to do
SELECT End_Users.Name, End_Users.Email, End_Users.Date_Created, End_Users.Last_Date_Accessed, Uses.Date_Accessed FROM End_Users INNER JOIN Uses ON End_Users.ID = Uses.End_User_ID

The problem with this, is if for example User '1' Logged on 10 times that month, it would return 10 rows for that user. Is there a way to count up the number of rows, and put that in a different field or something? Any hints on this would be appreciated, if this is simply not possible with SQL alone, please let me know, and i'll use some other alternative. Thank you, ArchG edit* Unsure of why the formatting seems off in firefox at least...I'm not sure how to fix that.

Share this post


Link to post
Share on other sites
I haven't tested these, but there is at least two ways you can do this.

First, use COUNT with a left join, like so:

SELECT Users.Name, Users.Email, Users.Date_Created, Users.Last_Date_Accessed,
COUNT(ALL Uses.ID) AS Number_Of_Logins
FROM End_Users AS Users
LEFT JOIN Uses
ON Uses.End_User_ID = Users.ID
AND Uses.Date_Accessed BETWEEN '7/1/2008' AND '7/31/2008'
GROUP BY Users.Name, Users.Email, Users.Date_Created, Users.Last_Date_Accessed



Second, use a nested select statement, like so:

SELECT Users.Name, Users.Email, Users.Date_Created, Users.Last_Date_Accessed,
(
SELECT COUNT(*)
FROM Uses
WHERE End_User_ID = Users.ID
AND Uses.Date_Accessed BETWEEN '7/1/2008' AND '7/31/2008'
) AS Number_Of_Logins
FROM End_Users AS Users



I personally prefer the second method as it makes adding additional columns easier as it doesn't require a GROUP BY clause.

Share this post


Link to post
Share on other sites
ArchG is correct on all counts. Just be aware that the subquery method may cause a very serious drop in performance on certain database engines if you have a lot of data.

Share this post


Link to post
Share on other sites
Actually, there's a third approach that I use most of the time:

SELECT Users.Name, Users.Email, Users.Date_Created, Users.Last_Date_Accessed, tmp.Number_Of_Logins
FROM End_Users AS Users
INNER JOIN
(
SELECT End_User_ID, COUNT(*) as Number_Of_Logins
FROM Uses
WHERE AND Uses.Date_Accessed BETWEEN '7/1/2008' AND '7/31/2008'
GROUP BY End_User_ID
) AS tmp on tmp.End_User_ID=Users.ID


This keeps the performance characteristics of Cyansoft's first example (the simple GROUP BY) while also keeping the query free of a top-level GROUP BY as in Cyansoft's second example. The performance improvement is because it's not performing a sub-query for each row--rather, it's performing one other query and then joining the result to your main table.

However, you should note that Cyansoft's second example will perform much better than either of the other two options if you later use a LIMIT clause to remove most of the result set and your subquery is doing something more substantial than a simple COUNT(*). That is, if you run this query with a LIMIT 10 (showing only the first ten users), Cyansoft's second example will calculate the number of logins only for those 10 users, rather than calculating all of them and then limiting the result set.

As always, a little benchmarking with real data goes a long way. If you have any substantial user base, your Uses table is going to get very large very fast, and you should plan for that.

Share this post


Link to post
Share on other sites
If you know what intervals you want to query on, you might consider updating your Uses table to have a count, say number of logins that month, rather than tracking every one. Keep the date field, but set it to the first day of the month when you create a new row.

Uses Table
ID, End_User_ID, Month_Accessed, Number_Accesses

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this