WTF Sleep sort


Scheduler behavior therefore becomes a property of the algorithm itself. A scheduler which doesn't preserve strict order can be O(1). Even an order-preserving scheduler can be O(1) at the expense of memory. It's impractical as a general-purpose algorithm, but it looks something like this:

std::list<Process*> next_to_execute[MAX]; // next_to_execute[i] = list of processes to execute at time i

(...)

For me, stable comparison sort is a special case of the problem, one which most schedulers will not be able to achieve due to the non-deterministic nature of threads. So we might as well go for an unstable non-comparison sort.
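A minimal sketch of the bucket-array scheduler described above, assuming a fixed upper bound MAX on wake times and a bare-bones Process type (both hypothetical names, as are schedule and tick); scheduling is O(1) per item and order within a time slot is preserved:

#include <cstdio>
#include <list>

constexpr int MAX = 1000;                  // hypothetical upper bound on wake times

struct Process { int value; };             // stand-in for a real process record

std::list<Process*> next_to_execute[MAX];  // next_to_execute[i] = processes to run at time i

// O(1): append to the bucket for the requested wake time;
// insertion order within a bucket is preserved
void schedule(Process* p, int wake_time) {
    next_to_execute[wake_time].push_back(p);
}

// run everything whose time has come, in the order it was scheduled
void tick(int current_time) {
    for (Process* p : next_to_execute[current_time])
        std::printf("%d\n", p->value);     // stand-in for resuming the process
    next_to_execute[current_time].clear();
}

int main() {
    Process items[] = {{3}, {1}, {4}, {1}, {5}};
    for (Process& p : items)
        schedule(&p, p.value);             // "sleep p.value" == wake at time p.value

    for (int t = 0; t < MAX; ++t)          // the clock driving the scheduler
        tick(t);                           // prints 1 1 3 4 5 -- bucket sort in disguise
}

Draining the buckets in index order is exactly a bucket sort, which is why the O(n) bound applies, at the cost of O(MAX) memory.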


OK, sure. I guess the question is whether most schedulers actually handle sleep this way. I was under the impression that either heaps or sorted lists were far and away the most common, but I'll accept otherwise. And yes, I do understand the idea that the "circle of confusion," as it were, can be thought of as a bucket, and so...


Bucket sort is O(n). If the original example can be shown to be equivalent, then it has the same complexity.

I see what you mean. My only qualm with this argument is that it seems to assume the scheduler works "too well": if we assume that the scheduler does something that makes it a bucket sort, why not go further and assume that the scheduler runs all of the threads in parallel in the most optimal way possible? If we do this, then we don't (ideally) have to worry about even looking at all of the numbers sequentially, and, as such, all of the O(n) terms drop out entirely. That leaves us with the n processes sleeping independently, so the "only" thing we have to worry about is the last thread finishing, which trivially runs in time proportional to the number it was assigned, that is, O(B).

In any case, I think we agree that if we assume we can always distinguish between numbers with arbitrary precision, then O(B) will always be at least O(n), so even the parallel version doesn't gain anything; still, I think that O(B) is the most general time complexity, and O(n log n) is what will actually happen on many machines.
-~-The Cow of Darkness-~-

In any case, I think we agree that if we assume we can always distinguish between numbers with arbitrary precision, then O(B) will always be at least O(n), so even the parallel version doesn't gain anything; still, I think that O(B) is the most general time complexity, and O(n log n) is what will actually happen on many machines.


What if you sort the set {1, 1, ..., 1}? Then O(B) is way way less than O(n) :D
All this talk about kernels completely distracted me from what I was originally thinking.


This only works if it's a bucket sort.


The first pass is to read all the numbers, or scan over them, to schedule the delays. This operation is O(n). Parsing/reading/scheduling a single number is O(1). Hence the value at index i will be parsed at parse_time(i).

The number at index i will therefore be scheduled at absolute time (value(i) + parse_time(i)).

Trivial example: [2, ...one million numbers..., 1]

// numbers will therefore be printed at the following times
2: 2 + 1 = 3
... one million numbers ...
1: 1 + 1,000,000 = 1,000,001
And so the algorithm is incorrect: the 1 is printed last even though it is the smallest value.

Therefore, unless the time at which an item is scheduled is subtracted from its delay, the algorithm isn't correct. And a comparison sort cannot be implemented without it.

The OS scheduler plays no role here; to implement this universal algorithm one would need a perfectly accurate clock, something which simply isn't available on any non-dedicated computer. Even with timing boards, there is deviation from accurate time due to factors internal to the CPU.


The reason the original algorithm works is that it clumps items into buckets which are vastly larger than the time quantum and the time needed to parse all the numbers. So the following happens:

2: 2 + 0.0000001 = 2.0000001
... one million numbers ...
1: 1 + 0.1 = 1.1

and the 1 now fires before the 2, so the output comes out in order.


The original sort uses sleep (sleep by a relative offset), not schedule_at_accurate_absolute_time (wake at an absolute time). And that only works if the delay times are such that we bucket things, which is why O(n) is possible. As a generalization, item I must be scheduled at time (k*I + C), where C is the time needed to parse all the numbers (a function of n) and k is an additional factor to compensate for the time needed to print individual numbers.

Even if the kernel internally stores delays as absolute times, our initial pass has drift, since we always specify relative offsets without compensating for the current time.
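A small simulation of that drift, reusing the [2, ...one million numbers..., 1] example and assuming a hypothetical fixed parse cost per number; the relative-offset scheme fires the 2 long before the 1, while scheduling against a fixed epoch plus the constant C from above keeps the order correct:

#include <cstdio>
#include <vector>

int main() {
    // same shape as the [2, ...one million numbers..., 1] example above
    std::vector<int> values(1000002, 5);
    values.front() = 2;
    values.back()  = 1;

    const double parse_cost = 1.0;                // hypothetical time to parse/schedule one number
    const double C = values.size() * parse_cost;  // time needed to parse all the numbers

    // wake time of an item under each scheme
    auto relative_wake = [&](size_t i) { return (i + 1) * parse_cost + values[i]; }; // sleep(value)
    auto absolute_wake = [&](size_t i) { return C + values[i]; };                    // schedule_at(C + value)

    std::printf("relative offsets: 2 fires at %.0f, 1 fires at %.0f (wrong order)\n",
                relative_wake(0), relative_wake(values.size() - 1));
    std::printf("fixed epoch + C:  2 fires at %.0f, 1 fires at %.0f (correct order)\n",
                absolute_wake(0), absolute_wake(values.size() - 1));
}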


If however we used schedule_at(absolute_time), then it would be O(log n) per item, since it would require the times to be kept sorted, either via a priority queue or via some other method - but it would preserve the order and end up as a comparison sort. Even then, it would still require items to be scheduled at (+C) time to compensate for reading and scheduling all the numbers.
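For comparison, a sketch of the schedule_at(absolute_time) variant, assuming the scheduler keeps pending wake-ups in a priority queue keyed on absolute wake time (C and pending are illustrative names); each insertion costs O(log n), and draining the queue hands the items back in wake-time order, so the scheduler is effectively performing a heap sort:

#include <cstdio>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

int main() {
    const double C = 8.0;                 // offset covering the time to read/schedule all input
    const int values[] = {3, 1, 4, 1, 5, 9, 2, 6};

    // min-heap of (absolute wake time, value)
    using Entry = std::pair<double, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pending;

    for (int v : values)
        pending.push({C + v, v});         // O(log n) per schedule_at call

    while (!pending.empty()) {            // popping in wake-time order == comparison-sorted order
        std::printf("%d\n", pending.top().second);
        pending.pop();
    }
}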


What if you sort the set {1, 1, ..., 1}? Then O(B) is way way less than O(n)
It's always at least O(n): whether or not the numbers are unique, one must still parse all n of them.

What if you sort the set {1, 1, ..., 1}? Then O(B) is way way less than O(n)

It's always at least O(n): whether or not the numbers are unique, one must still parse all n of them.

That's correct, but in the case of sorting the set {1000000000000000000000000000000000} it would be O(B) like he said. This is kind of a weird situation because technically you're only doing 1 operation, but the operation is taking the same time as 1000000000000000000000000000000000 operations.

On second thought I'm not sure how that translates into O() terms. I think one could make a solid case for either :-/ This is probably what your argument with cow boils down to.


That's correct, but in the case of sorting the set {1000000000000000000000000000000000} it would be O(B) like he said. This is kind of a weird situation because technically you're only doing 1 operation, but the operation is taking the same time as 1000000000000000000000000000000000 operations.


Waiting isn't part of O(n). If we are sorting 5 numbers, then 5 timers will need to be triggered. The fact that one of them will take a billion years doesn't change that. It's why asymptotic complexity is sometimes at odds with working implementations.

The O(n) in sorting algorithms is typically about a certain characteristic operation, such as the number of comparisons or the number of swaps. These can differ vastly between algorithms in the same class, but their asymptotic bounds remain the same.

If we allow the algorithm to signal when all numbers have been emitted - either via the required constant mentioned above or via an explicit signal - then the timers can be triggered instantly, since internally they are already adequately sorted. Hence a T(n) implementation is possible.
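A sketch of that signalling idea, assuming a single dispatcher thread and a pending map kept sorted by delay (all names here - pending, input_done, dispatcher - are illustrative, not a real kernel API). While input might still arrive the dispatcher honours the real delays, but once input_done is raised it drains the remaining timers immediately, since they are already stored in wake-up order:

#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <map>
#include <mutex>
#include <thread>

std::multimap<int, int> pending;      // delay -> value, kept sorted by the container
std::mutex m;
std::condition_variable cv;
bool input_done = false;

void dispatcher() {
    std::unique_lock<std::mutex> lock(m);
    while (!input_done || !pending.empty()) {
        if (pending.empty()) {            // nothing scheduled yet: wait for input
            cv.wait(lock);
            continue;
        }
        auto next = pending.begin();      // earliest pending wake-up
        if (input_done ||
            cv.wait_for(lock, std::chrono::milliseconds(100 * next->first)) ==
                std::cv_status::timeout) {
            std::printf("%d\n", next->second);   // the timer "fires"
            pending.erase(next);
        }
        // otherwise we were woken early (e.g. by the end-of-input signal)
        // and re-evaluate on the next iteration; a real timer wheel would
        // also remember how long it has already waited
    }
}

int main() {
    std::thread t(dispatcher);
    {
        std::lock_guard<std::mutex> g(m);
        const int values[] = {3, 1, 4, 1, 5, 9, 2, 6};
        for (int v : values)
            pending.insert({v, v});       // "sleep v" becomes a pending timer
        input_done = true;                // explicit end-of-input signal
    }
    cv.notify_all();
    t.join();                             // prints 1 1 2 3 4 5 6 9 without real waiting
}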

Waiting isn't part of O(n). If we are sorting 5 numbers, then 5 timers will need to be triggered. The fact that one of them will take a billion years doesn't change that. It's why asymptotic complexity is sometimes at odds with working implementations.

The O(n) in sorting algorithms is typically about a certain characteristic operation, such as the number of comparisons or the number of swaps. These can differ vastly between algorithms in the same class, but their asymptotic bounds remain the same.


That's what Prefect meant by "wall clock" time instead of CPU operations. You're right, we're not measuring operations, but we are still measuring the asymptotic behavior of something (the wall-clock time), which grows proportionally with n (well, B). It's still a perfectly valid use of Big O notation, and I would argue that in this case it's the best one, since the only part of the algorithm we know takes time is the "waiting." If the waiting time were actually a constant, that would be one thing, but since it's the dominant term in the limiting behavior (at least with respect to wall-clock time), it just doesn't make sense to "ignore" it. Using asymptotic notation with respect to waiting time isn't "incorrect" or anything, so I'm not sure where you got the idea that "waiting isn't part of O(n)"; I think it's clearly the right thing to measure in this case.

All this talk about kernels completely distracted me from what I was originally thinking.


This only works if it's a bucket sort.


The first pass is to read all the numbers, or scan over them, to schedule the delays. This operation is O(n). Parsing/reading/scheduling a single number is O(1). Hence the value at index i will be parsed at parse_time(i).

The number at index i will therefore be scheduled at absolute time (value(i) + parse_time(i)).

Trivial example:

(...)

And so the algorithm is incorrect: the 1 is printed last even though it is the smallest value.

Therefore, unless the time at which an item is scheduled is subtracted from its delay, the algorithm isn't correct. And a comparison sort cannot be implemented without it.

Yes, I've already acknowledged this. However, if we make this into a parallel algorithm (assuming arbitrarily many "cores" and all of the other things that go along with that sort of analysis), we can have each core take a number from the list, have them all start spin-waiting at the same time, and then have them each put their number on the end of a master list once they wake up. Obviously that's not exactly practical, as is the case with most algorithms that require cores proportional to the number of input elements, but in theory I guess it would take O(B) time and be bounded by O(n*B) work.
-~-The Cow of Darkness-~-
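For what it's worth, a minimal sketch of that parallel version, assuming one thread per element, a shared epoch so every "core" measures its delay from the same instant, delays scaled down to milliseconds, and sleep_until standing in for the spin-wait (which doesn't change the analysis). Ties or a slow scheduler can still reorder adjacent values, which is the usual sleep-sort caveat:

#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    const std::vector<int> values = {3, 1, 4, 1, 5, 9, 2, 6};

    std::vector<int> output;                 // the shared "master list"
    std::mutex output_mutex;

    // common epoch: every "core" measures its delay from the same instant,
    // which is the compensation for parse time discussed earlier in the thread
    const auto epoch = std::chrono::steady_clock::now();

    std::vector<std::thread> workers;
    for (int v : values) {
        workers.emplace_back([&, v] {
            // wake at epoch + v time units (scaled to milliseconds here)
            std::this_thread::sleep_until(epoch + std::chrono::milliseconds(100 * v));
            std::lock_guard<std::mutex> lock(output_mutex);
            output.push_back(v);             // append to the master list on wake-up
        });
    }
    for (auto& t : workers) t.join();

    for (int v : output) std::printf("%d ", v);   // most likely: 1 1 2 3 4 5 6 9
    std::printf("\n");
}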


