Jump to content
  • Advertisement
Sign in to follow this  
Michalson

[web] SQL problem solving

This topic is 4822 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I believe I've bitched about it before, but I don't think I can go much longer having to create a situation specific solution (that comes way too close to having holes and problems if the right is used) everytime I come up against this problem. For reference this is under SQL Server 2000. The problem is simple. You want to list the TOP 1 item from a list based on a specified sorting order. However you don't just want one item, you want to take a table/view and find the first item in the sorting order for each item in a group. Lets take a hypothetical example. We have a table for employees: CREATE TABLE Employees(Name varchar(100), Age integer, Salary integer, DepartmentID integer); As you can see, for each employee we know their age and their department and how much they make. What we want to do is find the oldest employee in each department, how old they are, and how much they make. If there is a tie, then we want to break it by the order of their name. For example to find the oldest in one department the query would look like this: SELECT * FROM Employees WHERE DepartmentID=# ORDER BY Age DESC, Name However we don't want one department at a time. We want to be able to list the oldest employee in each department in one query. Something along the lines of this (note, the FIRST aggregate function only exists in JET, and you can't make user defined aggregate functions unless you go to something like SQL Server 2005 or Oracle 9. Even in JET I'm not sure if this is valid or not as I'm not a pro-Access programmer). SELECT FIRST(Name), FIRST(age), FIRST(Salary), DepartmentID FROM Employees GROUP BY DepartmentID ORDER BY Age DESC, Name There is a big rate up if anyone can find a general purpose, robust solution to this problem. I'm also rating up to anyone with insightful comments or sucking up.

Share this post


Link to post
Share on other sites
Advertisement
You can do it with a correlated subquery, but that's horribly inefficient.

I'm thinking that there might be a better way of doing it with a clever outer join, joining the employee table with itself (possibly with group by).


SELECT employee.id, employee.name, employee.age FROM employee e where age = (SELECT MAX(age) FROM employee oldest WHERE oldest.dept = e.dept)


Mark

Share this post


Link to post
Share on other sites
Hmm ok, this does it a bit better:


SELECT e.id, e.name, e.dept, e.age, count(older.id) FROM
EMPLOYEE e
LEFT OUTER JOIN employee older
ON e.dept = older.dept AND e.age < older.age
GROUP BY e.id, e.name, e.dept, e.age
HAVING count(older.id) = 0


Effectively we're joining everyone with the group of rows of people in their own department who are older than then, and counting them with the count aggregate function.

Then using HAVING, we select only those where the count is 0, who must therefore be the oldest people in the department.

Mark

Share this post


Link to post
Share on other sites
Quote:
Original post by markr
You can do it with a correlated subquery, but that's horribly inefficient.

I'm thinking that there might be a better way of doing it with a clever outer join, joining the employee table with itself (possibly with group by).


SELECT employee.id, employee.name, employee.age FROM employee e where age = (SELECT MAX(age) FROM employee oldest WHERE oldest.dept = e.dept)


Mark



That's pretty much the horrible method I've been using now. The problem with it is that you get only one column, which breaks a lot of stuff in one of two ways. First is that you can't "sort" by two or more columns (the primary sort column might have only 2 or 3 possible values, making the second column very important to the ordering). The second problem is again related to that, in that the record chosen is based entirely on that one column. In a lot of cases that column is not unique. And you are right, you can do it with a join too.

Just for reference, the other horrible method involves user defined functions that execute the single item TOP 1 method. While technically robust, you can see how inefficient it is (worst plan ever).

Share this post


Link to post
Share on other sites
Quote:
Original post by markr
Hmm ok, this does it a bit better:


SELECT e.id, e.name, e.dept, e.age, count(older.id) FROM
EMPLOYEE e
LEFT OUTER JOIN employee older
ON e.dept = older.dept AND e.age < older.age
GROUP BY e.id, e.name, e.dept, e.age
HAVING count(older.id) = 0


Effectively we're joining everyone with the group of rows of people in their own department who are older than then, and counting them with the count aggregate function.

Then using HAVING, we select only those where the count is 0, who must therefore be the oldest people in the department.

Mark


I think I like it. I will have to try it out to see if it works with all my situations and what the speed is. You got your rate up.

Share this post


Link to post
Share on other sites
We're still running into the issue of sorting by more then one column. When 2 employees have the same age, they will both be included in the result set.

Share this post


Link to post
Share on other sites
Which of course we can solve with a simple condition:

(e.age < older.age OR (e.age = older.age and e.name > older.name))

Great solution markr.

Share this post


Link to post
Share on other sites

select *
From Employees e1
WHERE Name =(Select top 1 Name From Employees e2
WHERE e1.DepartmentID = e2.DepartmentID
ORDER BY e2.Age desc, e2.Name)


EDIT:

Source data:

Name Age Salary DepartmentID
Bob 27 15 1
Mike 57 35 1
Jim 28 17 2
Clive 17 10 3
John 18 10 3
Clem 18 14 3



Results:

Name Age Salary DepartmentID
Mike 57 35 1
Jim 28 17 2
Clem 18 14 3

Share this post


Link to post
Share on other sites
Best I can come up with is writing a user-defined function that takes the department and returns the record of choice. All the custom function would need to do is execute "SELECT TOP 1 FROM employees WHERE Dept = @deptVar ORDER BY Age, Name". Shouldn't leave too many holes and could be used as a template for similar sorts.

Then your main query would be "SELECT DISTINCT Dept, dept_Function(Dept) FROM employees;"

I'm not currently at work and do not have Enterprise manager up to test this, but like I said, it's the best I could come up with. Hope it helps.

cyn

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!