• Create Account

### #ActualiMalc

Posted 04 February 2013 - 12:51 AM


unordered_map<unsigned int,unordered_map<unsigned int,unordered_map<unsigned int,struct A*>>>
. . . that is constructed in a linear pass through the original array, but this doesn't work since the amortized allocations that must happen sum up to O(n^2) unless they are precomputed.

Actually, that's wrong. With the required geometric growth of a standard library container, and assuming the common growth factor of 2:
The total number of allocations is log(n),
The sum of the size of the allocations is 2n,
The overall running time is amortised O(n).
Nowhere does n*n come into it.

But each reallocation requires O(n) copying I would think? So if the total allocations is log(n) instead of n, which is reasonable, that's still O(n*log(n)).
Nope, that's not the case. This is how a vector grows, and an unordered_map is very similar. Starting from a typical initial size of 8:
Upon adding the 9th item, the existing 8 items are copied into the new block with room for 16 items.
Upon adding the 17th item, the existing 16 items are copied into the new block with room for 32 items.
Upon adding the 33rd item, the existing 32 items are copied into the new block with room for 64 items.
Upon adding the 65th item, the existing 64 items are copied into the new block with room for 128 items.
Upon adding the 129th item, the existing 128 items are copied into the new block with room for 256 items.
Total number of growth-related copies to expand to a total of 256 items: 8+16+32+64+128 = 248 which is very close to 256. continue adding more items up to the next growth point and it's the same - Always proportional to n.
If you don't follow my logic there, or for some reason don't believe it, just know that it really is amortised O(n) and you can probably search plenty of other more technical explanations as to why.

Stroustrup himself says:

People sometimes worry about the cost of std::vector growing incrementally. I used to worry about that and used reserve() to optimize the growth. After measuring my code and repeatedly having trouble finding the performance benefits of reserve() in real programs, I stopped using it except where it is needed to avoid iterator invalidation (a rare case in my code). Again: measure before you optimize.

I myself have found the same thing. It makes very little difference.

There is little sense, if any, in condiering anything else until you have measured the performance using unordered_map. You would be very very hard pressed to find anything better in practice.

### #1iMalc

Posted 04 February 2013 - 12:50 AM


unordered_map<unsigned int,unordered_map<unsigned int,unordered_map<unsigned int,struct A*>>>
. . . that is constructed in a linear pass through the original array, but this doesn't work since the amortized allocations that must happen sum up to O(n^2) unless they are precomputed.

Actually, that's wrong. With the required geometric growth of a standard library container, and assuming the common growth factor of 2:
The total number of allocations is log(n),
The sum of the size of the allocations is 2n,
The overall running time is amortised O(n).
Nowhere does n*n come into it.

But each reallocation requires O(n) copying I would think? So if the total allocations is log(n) instead of n, which is reasonable, that's still O(n*log(n)).

H

ave you tried an unordered_map, and what were the results?

Actually, yes. I'm trying out a set, since I really only care about the set of unique triples. I found that there's a preallocation option on construction of these sets, so I set it to twice the number of elements possible. It's ghastly slow--taking 500ms in release mode for a measly 250,000 elements. I think that may have something to do with how I'm implementing it, so I'm starting a new thread specifically.

Nope, that's not the case. This is how a vector grows, and an unordered_map is very similar. Starting from a typical initial size of 8:
Upon adding the 9th item, the existing 8 items are copied into the new block with room for 16 items.
Upon adding the 17th item, the existing 16 items are copied into the new block with room for 32 items.
Upon adding the 33rd item, the existing 32 items are copied into the new block with room for 64 items.
Upon adding the 65th item, the existing 64 items are copied into the new block with room for 128 items.
Upon adding the 129th item, the existing 128 items are copied into the new block with room for 256 items.
Total number of growth-related copies to expand to a total of 256 items: 8+16+32+64+128 = 248 which is very close to 256. continue adding more items up to the next growth point and it's the same - Always proportional to n.
If you don't follow my logic there, or for some reason don't believe it, just know that it really is amortised O(n) and you can probably search plenty of other more technical explanations as to why.

Stroustrup himself says:

People sometimes worry about the cost of std::vector growing incrementally. I used to worry about that and used reserve() to optimize the growth. After measuring my code and repeatedly having trouble finding the performance benefits of reserve() in real programs, I stopped using it except where it is needed to avoid iterator invalidation (a rare case in my code). Again: measure before you optimize.

I myself have found the same thing. It makes very little difference.

There is little sense, if any, in condiering anything else until you have measured the performance using unordered_map. You would be very very hard pressed to find anything better in practice.

PARTNERS