weighted random numbers

Started by
15 comments, last by alvaro 15 years, 11 months ago
Quote:Original post by alvaro
I don't think my solution is that complicated. Here's some code:
*** Source Snippet Removed ***
EDIT: You don't have to use doubles if your weights are integers. Modifying the code I provided for integer weights should be trivial.

EDIT: Added a few comments


I appreciate the code, I'll look over it more thoroughly tomorrow. I forget but I think I've read that in a for loop var++ and ++var do the same thing, is this correct?

edit: Also, is there a way to force a templated argument to be of a numeric type?
Advertisement
Quote:Original post by atonalpanic
I like where this is going, but I don't quite understand the bars stacking on top of each other.

What don't you understand about it? Picture four bars, of different heights, all stacked up.
Quote:Also if your solution is like alvaro's, how do you suppose I store the tree so that I can binary search it?
The CPD, by definition, is a sequence of numbers which are in order from smallest to biggest. You binary search it in the normal way.

BTW, I misunderstood cjb's approach on a first reading. His way is less efficient, but it is not "arbitrarily slow".
Quote:Original post by atonalpanic
I appreciate the code, I'll look over it more thoroughly tomorrow. I forget but I think I've read that in a for loop var++ and ++var do the same thing, is this correct?

For simple iterator types (integers or pointers) yes, they are about the same. However, if you have a non-trivial iterator type, var++ will make a copy of the old value of the iterator, while ++var won't. In general, I try to use ++var in loops since it never hurts and it sometimes helps.

Quote:edit: Also, is there a way to force a templated argument to be of a numeric type?

If you try to instantiate a template that expects a numeric type using something else, you'll get an ugly compiler error about some operation not being defined. There are two things that might allow you to say you want it to be numeric explicitly:
1) Use boost's static asserts (I don't know enough about the subject to know if this is even possible)
2) Wait for the new version of C++, which should have something called "concepts" that will allow you to specify some requirements on template types.

Quote:Original post by atonalpanic
This is related to my interests. It sounds a lot like the algorithm for Huffman coding. Please do provide more information.

It is exactly like Huffman coding. Just as with Huffman coding, you want the most common elements to be at the top of the tree. The optimal tree in both cases is the same.
Quote:Original post by Sneftel
a: 2/10.
a or b: 6/10.
a or b or c: 7/10.
a or b or c or d: 10/10.


Quote:Original post by Sneftel
BTW, I misunderstood clb's approach on a first reading. His way is less efficient, but it is not "arbitrarily slow".


Ours are exactly same, except I did a linear search, whereas you performed binary search. Difference, as we all know, is O(N) vs O(logN), and I think it would show in timings already with N as small as 26.

If you're doing a general purpose library, I suggest going alvaro's route, as you can modify frequencies in O(logN) time as well, instead of O(N) updates with the array approach.
Hello again,

I noticed that when I run the code you posted, it always returns the same 20 letters each run of the program. I added the line: std::srand( time( NULL ) );
in the constructor to seen the rand() function but now all it does is spew back a bunch of letter a's. I was wondering how to make it so that every run of the program spits back new random letters.

Edit: I take that back now when I run the program it exhibits the behavior I wanted...yet I didn't change anything. Ohh well, there is still the smaller problem that every run of the program starts with the same letter. I should also say that the letter that always shows first is not a letter with a high probability of showing.
You should call `srand(time(0))' only once at the beginning of the program (I usually do it at the top of main()).

EDIT: Actually, what I do (when I am not being lazy) is call time(0), print the result in the log and then call srand with that value. That way if I find a bug I can call srand with that value and reproduce the problem.

This topic is closed to new replies.

Advertisement