std::unordered_map::insert() Return Values

4 comments, last by Ectara 10 years, 8 months ago

Can anyone explain why some overloads of the insert() method of std::unordered_map return a pair with a boolean value indicating success, while others that differ only in providing a hint iterator don't return this boolean value? It seems like there'd be an equal chance of failure, so why does only one set of overloads return a boolean value?
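For reference, here is a minimal sketch of the two overloads being asked about. The helper name `compare_insert_overloads` and the `InsertResults` struct are made up for illustration; only the `std::unordered_map` calls are real.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

using Map = std::unordered_map<std::string, int>;

// Hypothetical result holder, for illustration only.
struct InsertResults {
    bool first_inserted;     // bool returned by insert(value)
    int value_after_hinted;  // mapped value seen through the hinted insert's iterator
    std::size_t final_size;  // number of elements after both calls
};

// Hypothetical helper: inserts the same key once through each overload.
InsertResults compare_insert_overloads() {
    Map m;

    // insert(value) returns std::pair<iterator, bool>.
    auto plain = m.insert(Map::value_type("a", 1));
    static_assert(std::is_same<decltype(plain), std::pair<Map::iterator, bool>>::value,
                  "insert(value) returns an iterator/bool pair");

    // insert(hint, value) returns a bare iterator. The key already exists,
    // so nothing is inserted and the iterator refers to the existing element.
    auto hinted = m.insert(m.end(), Map::value_type("a", 2));
    static_assert(std::is_same<decltype(hinted), Map::iterator>::value,
                  "insert(hint, value) returns only an iterator");

    return {plain.second, hinted->second, m.size()};
}
```

The second call fails to insert just like a plain insert() of a duplicate key would, but the caller only gets the iterator back.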


Pretty much you have two variations of return value, well, three if you count void (the range and initializer-list overloads). The single iterator return is actually the 'uncommon' case, and the 'hint' is the important bit. Basically, the idea is that you pass in an iterator to where you believe the element belongs, and the implementation may use it as a starting point to cut down on lookup overhead. Note that insert() never replaces the value of an existing key. The more common version is the pair return, whose bool tells you whether the iterator refers to a newly inserted element or to an existing element that blocked the insertion.

What use is this 'hint' variation? That's tricky, but it can be useful for optimizations (no guarantees, though, according to the standard). As this is in the beginners forum, you should probably ignore the versions that accept hints, as they are really an optimization you will only put to use at a later time.
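On the "replace" point: as far as I know, insert() on a unique-key container leaves an existing element untouched, so overwriting has to be done explicitly. A minimal sketch, where `insert_or_overwrite` is a hypothetical helper name and not part of the standard library:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

using Map = std::unordered_map<std::string, int>;

// Hypothetical helper: insert if the key is absent, otherwise overwrite the
// mapped value through the iterator that insert() hands back.
// Returns true if a new element was inserted.
bool insert_or_overwrite(Map& m, const std::string& key, int value) {
    std::pair<Map::iterator, bool> result = m.insert(Map::value_type(key, value));
    if (!result.second) {
        // The key already existed; insert() left the old value in place.
        result.first->second = value;
    }
    return result.second;
}
```

Here the bool from the pair is what tells the helper whether it still needs to assign, which is exactly the information the hinted overload does not return.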

I understand the differences between how the methods work; what I want to know is why the single insertion methods without a hint return a pair with a boolean value indicating success, while the methods accepting hints don't return a boolean value indicating whether or not they failed. From what I can tell, if an element already exists in the container, the insertion fails, and the methods accepting hints are no exception.

So, why does one return a bool, and the other does not?


As this is in the beginners forum you should probably ignore the versions which accept hints as it is really an optimization item you will only put to use at a later time.

I don't remember posting this in the For Beginners forum...

Returning an iterator is the usual insertion return value for any associative container - map, set, multiset, multimap. The iterator-bool pair is a pattern only used by associative containers with unique keys. This looks like a design compromise - the STL creators tried to make container interfaces match. So, I bet adding the iterator-bool pair version for the hint-based methods was a step too far for the designers. They were probably reluctant about the hint versions to begin with. They're only really supposed to be used when you know exactly where you want to insert something - you should know it will succeed. I doubt they wanted two separate versions for such a specialized method among associative containers.
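The difference in return types between unique-key and multi-key containers can be checked at compile time. This is only a sketch; the alias names and the small runtime helper `insert_twice_into_multimap` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

using UMap = std::unordered_map<std::string, int>;
using UMultiMap = std::unordered_multimap<std::string, int>;

// Return type of insert(value) for each container.
using UMapInsert = decltype(
    std::declval<UMap&>().insert(std::declval<const UMap::value_type&>()));
using UMultiInsert = decltype(
    std::declval<UMultiMap&>().insert(std::declval<const UMultiMap::value_type&>()));

// Return type of insert(hint, value) for each container.
using UMapHinted = decltype(
    std::declval<UMap&>().insert(std::declval<UMap::const_iterator>(),
                                 std::declval<const UMap::value_type&>()));
using UMultiHinted = decltype(
    std::declval<UMultiMap&>().insert(std::declval<UMultiMap::const_iterator>(),
                                      std::declval<const UMultiMap::value_type&>()));

// Unique keys: insert(value) can fail, so it returns iterator plus bool.
static_assert(std::is_same<UMapInsert, std::pair<UMap::iterator, bool>>::value,
              "unique-key insert returns a pair");
// Equivalent keys allowed: insert(value) returns just an iterator.
static_assert(std::is_same<UMultiInsert, UMultiMap::iterator>::value,
              "multi-key insert returns an iterator");
// The hinted overloads return a bare iterator in both cases.
static_assert(std::is_same<UMapHinted, UMap::iterator>::value, "hinted, unique");
static_assert(std::is_same<UMultiHinted, UMultiMap::iterator>::value, "hinted, multi");

// Hypothetical helper: a multimap accepts the same key twice.
std::size_t insert_twice_into_multimap() {
    UMultiMap mm;
    mm.insert(UMultiMap::value_type("a", 1));
    mm.insert(UMultiMap::value_type("a", 2));
    assert(mm.size() == 2);
    return mm.count("a");
}
```

So the hinted signature is the one that stays the same across both kinds of container, while the plain single-element insert is the one that differs.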

There could be a lot of why questions.

Why did they provide a hint version? The documentation from GCC explains it pretty clearly:

In the case of std::unordered_set and std::unordered_map you need to look through all bucket's elements for an equivalent one. If there is none the insertion can be achieved, otherwise the insertion fails. As we always need to loop though all bucket's elements, the hint doesn't tell us if the element is already present, and we don't have any constraint on where the new element is to be inserted, the hint won't be of any help and will then be ignored.

In the case of std::unordered_multiset and std::unordered_multimap equivalent elements must be linked together so that the equal_range(const key_type&) can return the range of iterators pointing to all equivalent elements. This is where hinting can be used to point to another equivalent element already part of the container and so skip all non equivalent elements of the bucket. So to be useful the hint shall point to an element equivalent to the one being inserted. The new element will be then inserted right after the hint. Note that because of an implementation detail inserting after a node can require to update the bucket of the following node. To check if the next bucket is to be modified we need to compute following node hash code. So if you want your hint to be really efficient it should be followed by another equivalent element, the implementation will detect this equivalence and won't compute next element hash code.
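To make the multimap case concrete, here is a rough sketch of hinting at an equivalent element. Whether the hint actually saves any work is implementation-specific (the quoted text describes GCC's library), so this only shows the calling pattern. The helper name `insert_with_equivalent_hint` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>

using MultiMap = std::unordered_multimap<std::string, int>;

// Hypothetical helper: inserts three elements with key "a", passing an
// iterator to an equivalent element as the hint for the later insertions,
// and returns how many elements equal_range("a") covers.
std::size_t insert_with_equivalent_hint() {
    MultiMap mm;
    // Reserve up front so the insertions below do not trigger a rehash,
    // which would invalidate the iterator used as the hint.
    mm.reserve(16);

    mm.insert(MultiMap::value_type("b", 10));
    MultiMap::iterator first_a = mm.insert(MultiMap::value_type("a", 1));

    // The hint refers to an element equivalent to the one being inserted.
    mm.insert(first_a, MultiMap::value_type("a", 2));
    mm.insert(first_a, MultiMap::value_type("a", 3));

    // All equivalent elements are reachable as one contiguous range.
    std::pair<MultiMap::iterator, MultiMap::iterator> range = mm.equal_range("a");
    return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

Since a multimap insertion cannot fail on a duplicate key, a bool would carry no information here, which fits the consistency argument.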

Why does the version that accepts the hint not include a flag indicating if the content was already there?

If you already have the value for the hint (as the docs mention above) "it won't be of any help and will then be ignored." So why provide it? For consistency. There is no penalty for having it, and adding it allows consistency between all eight containers.


They're only really supposed to be used when you know exactly where you want to insert something - you should know it will succeed.

So, basically, there's no real difference, and it is left up to the user to ensure that the insertion will succeed for this method only?


Why did they provide a hint version?

I do understand that. I wanted to know why they made the return value of one a pair, and left the other without an additional return value. MSDN documentation even says their implementation returns insert(val).first. If they could change both as easily as they did one, why didn't they?
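A sketch of what "returns insert(val).first" looks like in a user-written container. `ToyMap` is a hypothetical class that does a linear search over a vector instead of real hashing, so that only the shape of the two insert() overloads is on display.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy container, not a real hash table.
class ToyMap {
public:
    using value_type = std::pair<std::string, int>;
    using iterator = std::vector<value_type>::iterator;
    using const_iterator = std::vector<value_type>::const_iterator;

    // Unique-key insert: the bool reports whether a new element was added.
    std::pair<iterator, bool> insert(const value_type& v) {
        for (iterator it = items_.begin(); it != items_.end(); ++it) {
            if (it->first == v.first) {
                return {it, false};  // key already present, nothing inserted
            }
        }
        items_.push_back(v);
        return {items_.end() - 1, true};
    }

    // Hinted insert: ignores the hint, forwards to the plain overload and
    // drops the bool, returning only the iterator.
    iterator insert(const_iterator /*hint*/, const value_type& v) {
        return insert(v).first;
    }

    const_iterator end() const { return items_.end(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<value_type> items_;
};
```

The information is available inside the hinted overload and is simply thrown away, which is the asymmetry the question is about.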


Why does the version that accepts the hint not include a flag indicating if the content was already there?



If you already have the value for the hint (as the docs mention above) "it won't be of any help and will then be ignored." So why provide it? For consistency. There is no penalty for having it, and adding it allows consistency between all eight containers.

I don't think the answer truly answers that question, unless this is a roundabout way of saying "Nobody will use this unless it is for compatibility with other containers that don't fail to insert elements, so that's why they didn't bother to change the return value."

It has little importance in the grand scheme of things, but I have a hash map class with an interface like unordered_map, and I want to understand why for one of the methods, I return extra information, but the other, I discard it. Basically, I want to make sure I am implementing it right, by understanding why it does what it does.

If the answer is "Nobody will call this unless it's for compatibility," or "You should know whether or not it fails," then I guess I can accept that.
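If the success flag is needed from a hinted insert anyway, one possible workaround is to compare the container's size before and after the call. A minimal sketch, assuming a container with the standard insert(hint, value) signature; `hinted_insert_with_flag` is a hypothetical helper, not a standard function.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: performs a hinted insert and reconstructs the bool
// by checking whether the element count changed.
template <typename Map>
std::pair<typename Map::iterator, bool>
hinted_insert_with_flag(Map& m,
                        typename Map::const_iterator hint,
                        const typename Map::value_type& v) {
    const typename Map::size_type before = m.size();
    typename Map::iterator it = m.insert(hint, v);
    // A single-element insert either adds exactly one element or none.
    return std::make_pair(it, m.size() != before);
}
```

For a multi-key container the flag would always come out true, since those insertions do not fail on duplicates.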

