Sign in to follow this  
saejox

std::map registration

Recommended Posts

saejox    714
Hi,

i wrote a std::map, std::unordered_map, or any associative container register class.

here is the syntax
[CODE]
hashmap<int>::RegisterHashMap("hashmap_int", "int", engine);
hashmap<string>::RegisterHashMap("hashmap_string", "string", engine); // a better dictionary addon
hashmap<float>::RegisterHashMap("hashmap_float", "float", engine);
RegisterIntpair();
hashmap<intpair>::RegisterHashMap("hashmap_intpair", "intpair", engine); // std::pair
[/CODE]

script
[CODE]
hashmap_int<float> map;
int size = 100;
for(uint i = 0; i < size; ++i)
{
map[i] = 42.0f;
}
hashmap_int_iterator end = map.End();
for(hashmap_int_iterator it = map.Begin(); it != end; ++it)
{
float x = map.Get(it);
}
[/CODE]
[CODE]
intpair pair(10, -10);
hashmap_intpair<HashMapTest@> map;
@map[pair] = HashMapTest(11.0f);

hashmap_intpair_iterator it = map.Begin();
while(it != map.End())
{
float x = map[it].x;
dbg("x: " + x);
it = it.Next();
}
[/CODE]

code is only 1 day old. only tested barely.
it is not battle tested, there may be some leaks.

i hope it will be useful to someone.

Share this post


Link to post
Share on other sites
TheAtom    330
In script environment, one would like to be safe from the possibility of dereferencing bad iterators. Any thoughts on that?

In the code I am doing now, every container keeps track of iterators that point to it to invalidate them whenever the container's contents (anything that affects iterators) are changed. At that point it also updates internally kept end and begin iterators which are to be returned by reference, to save from excessive copying. GC references are kept back and forth. Dereferencing an iterator marked as invalid raises a script exception.

An alternative I considered was to keep the container's "version number" and increment it every time the contents change. The iterator is also keeping the container version number it was created for, and this value is passed over when copying the iterator. Dereferencing, incrementing etc. of an iterator that holds an outdated value would raise an exception. That relieves both the container and GC from extra iterators "herding" but is potentially unsafe in case of some heavy duty containers(then again it's not likely for the version number to go over 64 bit integer limit, is it).

Apart from the iterators I exposed all the basic stl templates. I still haven't decided on the manner I'm going to do the iterators, I would welcome any suggestions. Perhaps I am missing some obvious solution.

Share this post


Link to post
Share on other sites
saejox    714
why do you even need stored iterators. just store keys
they are needed most in linked lists where access is N. in every other case you have o(1) access.
linked list do not even invalidate iterators, so it is unnecessary to implement some sort of reference counting for iterators.
i use iterators only when i need to delete more than one element in one iteration.

that how i use them anyway. might be some other cases i dont know

Share this post


Link to post
Share on other sites
TheAtom    330
It's easy to produce, accidentally or on purpose, a code that leads to an undefined behaviour.

[code]
container.push_back(50);
iterator it = container.begin();
container.clear();
Log(it->value);
[/code]

My goal was to eliminate the possibility. That's again the issue of scripts supposed to be a sandbox environment. Much like disabling/enabling unsafe references, it's the matter of having a possible crash making code or not having it.

Share this post


Link to post
Share on other sites
saejox    714
how about you keep one copy of the container and use it exclusively for iterators.
this maybe the only way to make sure iterators do not crash your code.
java does it this way.

Share this post


Link to post
Share on other sites
WitchLord    4677
These issues are the main reasons why I haven't implemented iterators or for-each loops in the script yet.

It's definitely not possible to keep the iterators as lightweight as in C++ while still keeping the safety.

Your idea of the container keeping track of all live iterators is a good one. Assuming you don't have a lot of iterators to the same container at the same time this shouldn't add too much of an overhead, and is probably quicker than doing a fresh lookup with every access to the value. How would you determine if the iterator has to be invalidated or not? Perhaps an iterator is only invalidated if the value it is pointing to is deleted?

The version number suggestion is also not bad, even though there is minimal risk of the version number overflowing and restarting from 0. Any updates to the container would invalidate all iterators all times, so this can possibly add more overhead than the first one, but would be easier to implement.

Share this post


Link to post
Share on other sites
TheAtom    330
[quote name='saejox' timestamp='1341947783' post='4957737']
how about you keep one copy of the container and use it exclusively for iterators.
this maybe the only way to make sure iterators do not crash your code.
java does it this way.
[/quote]

How does it handle erasing by iterators or iterators range? For instance, container.erase(container.begin(), container.begin()+5);

[quote name='Andreas Jonsson' timestamp='1341950598' post='4957759']
How would you determine if the iterator has to be invalidated or not? Perhaps an iterator is only invalidated if the value it is pointing to is deleted?
[/quote]

This is hard to do in general and it will usually lead to iteration over all iterators to see what must be done about each one. There are some guarantees on the validity of iterators after calling some methods that modify the container. A rather messy document available at [url="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf"]http://www.open-std..../2010/n3225.pdf[/url] tries to sum them up in the chapter 23. For example, a swap() never invalidates the iterators (though of course after the call they point to different containers, and all GC references have to be updated anyway). An erase() on an associative container invalidates only the erased iterator - however there might still exist other copies of it, so again we have to iterate and check each one. A push_back() or pop_back() on a deque does not invalidate anything. A range erase() on a vector invalidates only the iterators past the new end. In this last case it is possible to verify an iterator at any point - it is sufficient that it lies between [begin(), end()) range since vector never shrinks the underlying array, excepting clear() call, which invalidates everything (in any other container too).

The need for iteration on almost every modification is one of the arguments for the "versioned containers" solution. But perhaps some hybrid approach wouldn't be a bad idea - in some cases we know that we have to invalidate everything (any clear()) and we do that by increasing the version counter, sometimes we handypick the iterators to be invalidated (erase(iterator) in map or set), and in case of a vector, maybe we don't care for invalidation and check the validity runtime (it entails but a pair of address comparisons, so the performance hit should not be severe).

I will take some time to think more about this. I am gone for good until 18.07 so I just wanted to pour all my thoughts here first. Edited by TheAtom

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this