Jump to content
  • Advertisement
Sign in to follow this  
benryves

[web] [MySQL] Return matching "previous" and "next" values (or null if nonexistant)

This topic is 3162 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Here's one that's puzzling me; let's say you have a table tracking page views. Every time a page is visited its name is logged, along with an autoincrementing ID and a session ID (unique to each user). When pulling data off the table I would like to retrieve the current page ID, but also retrieve the previous and next page for each row. If there was no previous or next page, null should be returned instead. A couple of LEFT JOINs does the trick, and works beautifully on a 7 row table (named previous_next):
SELECT
    main.*,
    previous.page AS previous_page,
    next.page AS next_page
FROM
    previous_next AS main
LEFT JOIN (
    SELECT
        p_main.id AS id,
        p_previous.page AS page
    FROM
        previous_next AS p_main,
        previous_next AS p_previous
    WHERE
        p_previous.id<p_main.id AND p_previous.session=p_main.session
    GROUP BY p_previous.id
) AS previous ON (previous.id=main.id)
LEFT JOIN (
    SELECT
        n_next.id AS id,
        n_main.page AS page
    FROM
        previous_next AS n_main,
        previous_next AS n_next
    WHERE
        n_next.id<n_main.id AND n_next.session=n_main.session
    GROUP BY n_next.id
) AS `next` ON (next.id=main.id);
However, this is evidently not the way to go as it takes 70 seconds on a 50,000 row table (though it does produce the correct results). Is there a better way to do this? Edit: This question is now purely academic, as I now log the previous/next pages directly onto the log table every time a page is viewed. I'd still be interested to see the "correct" solution. [Edited by - benryves on October 25, 2009 10:48:53 AM]

Share this post


Link to post
Share on other sites
Advertisement
I would think that subqueries would work better than left joins. But I would guess if performance improved with that technique, it wouldn't reduce your query time enough from 70 seconds to be practical, either.

Have you defined indexes that cover the query? That is always the first thing to check when you have a really slow query. You should have indexes on id and session, at the least.


SELECT
main.*,
( SELECT
p_previous.page
FROM
previous_next AS p_previous
WHERE
p_previous.id < main.id AND p_previous.session=main.session
GROUP BY p_previous.id
) AS previous_page,
( SELECT
n_main.page AS page
FROM
previous_next AS n_next
WHERE
n_next.id < main.id AND n_next.session=main.session
GROUP BY n_next.id
) AS next_page
FROM
previous_next AS main;

Share this post


Link to post
Share on other sites
Yes, indices are set up correctly. In fact, the table has been directly imported from an old site, and I've been trying to recreate a report that used to run on the old site without access to its source. The old site would complete generating a report in under five seconds.

The subquery method is significantly slower, sadly:

SELECT
main.*,
( SELECT
p_previous.page
FROM
previous_next AS p_previous
WHERE
p_previous.id < main.id AND p_previous.session=main.session
GROUP BY p_previous.id
ORDER BY id DESC
LIMIT 1
) AS previous_page,
( SELECT
n_next.page AS page
FROM
previous_next AS n_next
WHERE
n_next.id > main.id AND n_next.session=main.session
GROUP BY n_next.id
ORDER BY id ASC
LIMIT 1
) AS next_page
FROM
previous_next AS main;

Share this post


Link to post
Share on other sites
Quote:
Original post by benryves
Edit: This question is now purely academic, as I now log the previous/next pages directly onto the log table every time a page is viewed. I'd still be interested to see the "correct" solution.
Is it possible the old site simply selected all required the rows and generated the report itself (i.e. in a procedural language)? That's probably the way I'd do it.

(Actually, I'd probably do what you're doing now and log the required data directly in the table)

Share this post


Link to post
Share on other sites
Quote:
Original post by Codeka
Quote:
Original post by benryves
Edit: This question is now purely academic, as I now log the previous/next pages directly onto the log table every time a page is viewed. I'd still be interested to see the "correct" solution.
Is it possible the old site simply selected all required the rows and generated the report itself (i.e. in a procedural language)? That's probably the way I'd do it.
Possibly. I hacked something together in a minute to populate the new "next page" and "previous page" fields for the historical data and it took a couple of minutes to run, but that could have been my naïve implementation.

Share this post


Link to post
Share on other sites
Quote:
Original post by benryves
Yes, indices are set up correctly. In fact, the table has been directly imported from an old site, and I've been trying to recreate a report that used to run on the old site without access to its source. The old site would complete generating a report in under five seconds.

The subquery method is significantly slower, sadly:

*** Source Snippet Removed ***


I'm curious, too. The GROUP BY might be the culprit. What happens if those are removed, and replaced with MIN or MAX, as appropriate?

Share this post


Link to post
Share on other sites
I'm no expert on query optimization, so I would ask if you've tried running "EXPLAIN" on your query to see if there's something else happening (assuming you're dealing MySQL)? Maybe, for some wild reason, an index isn't getting used. Or maybe it's fetching way more rows than it really needs to.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

Participate in the game development conversation and more when you create an account on GameDev.net!

Sign me up!