Sign in to follow this  
fredrikhcs

I don't get this book exercise

Recommended Posts

This is from Accelerated C++, third chapter. It's about vectors. 3-1. Suppose we wish to find the median of a collection of values. Assume that we have read some of the values so far, and that we have no idea how many values remain to be read. Prove that we cannot afford to discard any of the values that we have read. Hint: One proof strategy is to assume that we can discard a value, and then find values for the unread--and therefore unknown--part of our collection that would cause the median to be the value that we discarded. I don't understand what I should do (or why really; it seems very obvious why you can't discard any of the values). In what form should I prove this, and what is it exactly I should prove?

Share this post


Link to post
Share on other sites
This is sort of an odd question, I guess, since it really doesn't have anything to do with C++ vectors per se... it seems more about the mathematical concept of 'median.'

Anyway, you say its obvious that you can't discard any of the values -- can you articulate why its obvious? That should be sufficient enough. As the question states, if you can locate a case where discarding a value would produce an incorrect result for the median of the collection, you have proven that you can't discard any values and safely produce the correct result in all cases.

Share this post


Link to post
Share on other sites
Just a thought...

Why is it that the simplest things are the hardest to describe? Can complex devices, like speech, be used to describe simpler ones?

Well, yes, of course they can. It's just uncomfortable at times.

First step:

What's so obvious about not being able to throw one of the values out?

What is the importance of one number to finding the median?

Hell, what is the definition of a median?

Second step:

Try it.

Try finding the median after throwing out some of the values in the set.

Why can't you do it?

Third step:

In smart-people talk, describe how taking out values in the set disables the process of finding the median.

In other words, describe what just happened in the second step and WHY.

Fourth step:

Write the proof.

Share this post


Link to post
Share on other sites
Thanks for the responses. So, what do you make of these cases of incorrect results I came up with? Tell me if you think they are the "proof" the question wanted.

Case 1. If we discard more values from the left or right side of the collection's median than from the other side of it, the previous median is moved to the left or right in the sorted list of values, thus rendering the resulting median incorrect.

Case 2. If we discard the value that would otherwise have been the median, the resulting median is obviously incorrect, unless it is the same value as the previous median, or unless the sorted collection now has an even number of values, and the median of those values is the value we discarded.

Share this post


Link to post
Share on other sites
That's probably sufficient for the purposes of the book; they just want to get you thinking along more formal lines about problems.

I would approach it this way:

1) There exists a trivial sequence of arbitrary numbers such that they do not encode any information about that sequence.

2) This has the implication that finding the median is a matter of first identifying the (zero-based) index (or indicies) of the middle-most values.

3) You intend to read and discard X number of values from the sequence, leaving an unknown Y number of values still to be read.

4) The total number of values in the sequence is therefore given by: X + Y and the median index calculated as: (X + Y - 1) / 2 Where a fractional result indicates that there are two middle-most indices, being the floor and ceiling integer values.

5) Discarding those X number of values, the 'new' median index is given by ((X + Y - 1) - X) / 2 which simplifies to (Y - 1) / 2.

6) Since (Y - 1) / 2 == (X + Y - 1) / 2 is not an identity this shows that the original index of the median cannot be recalculated and, from point 2, the remaining Y values cannot be used to infer anything about the previous X values - Thus the original median is lost.

Obviously that's quite verbose, but I was being explanatory.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this