Container Selection

Started by
5 comments, last by Stragen 10 years, 3 months ago

Just after some opinions,

I have a 2 element structure that I need to store, and am trying to consider the options available, and am not sure which approach i want to take, hopefully staying within the STD libraries.

The information being stored is a string name with an associated matrix. It needs to be searchable, but also referenced by index.

So far:

I've implemented this using a MAP, keyed by the string name, the search-ability is brilliant, however the need to access the index has come up.

I'm considering a struct based vector with searcher function, though i'd really like to avoid this if there is a more elegant solution available in library.

I've considered a list, however, the data needs to be randomly referenced, and when the index is already known i dont want to have to scan the entire list to get to the last reference (could be well over 5k references). Sets suffer from the same issue as maps, and arrays aren't intrinsically searchable or as expandable as i'd like.

I'm expecting that i'm not going to be able to avoid the Struct-Vector with search off the side, but if anyone has another idea, i'm all ears.

Advertisement

Depending on your requirements on the indexing, you could get away with sorting the vector based on the string and look up the name with std::binary_search which has the same logarithmic lookup performace as a map but allows linear time indexing into the vector as well. It does, however, require that your indices can be determined from the array, and not the other way around: for example, that all you need is some "arbitrary" unique integer that you can use to reference the object in the vector, instead of having the objects generate their own specific ID.

Another option depending on how dynamic the sets of objects are during runtime; keep a map and a vector, and then access the map with the key and the vector with the index.

Boost has a container that allows multiple types of indexing; I've never used it before.

Are you sure you must allow index lookup? How fast does it need to be? How many elements do you expect the container to hold?

Unless your map needs to be ordered, using std::unordered_map is faster for lookups than std::map.

If your index-based lookups must be fast, then use std::vector, and use a function to crawl the vector for string-based lookups.

If your string-based lookups must be fast, then use std::unordered_map, and use a function to crawl the vector for ID-based lookups.

If your iterations need to be fast, then use std::vector, and use a function to crawl the vector for string-based lookups.

If all need to be fast, and the container is huge, then you could use Boost's container that is basically a map and a vector combined, or roll your own wrapping std::vector and std::unordered_map.

By 'use a function', std::find_if(...container..., predicate) would work well, and you can pass a lambda or a functor, or wrap it for convience:


auto it = std::find_if(std::begin(data), std::end(data),
						   [&](const MyStruct &myStruct) -> bool
						   { return (myStruct.name == nameToLookup); });

http://ideone.com/QyaRJC

There are two options I'd consider:

A) Store matrices unordered in a vector. Also store a map of strings to indices. Wrap these two things in your own class.

B) Store structs containing with the matrix-string pair in a vector sorted by the string. Use the index to search by index. Use a binary search to find a string.

If you use a map, you can store pointers to the items which in most case are similar to index's. That way you have constant time pointer access, and logarithmic time access with the string.

Another option: a vector of matrices and a std::unordered_map<string,unsigned> that maps strings to indices. That gets you amortized constant time insertion, constant-time access by index and near-constant-time access by string (fast, in any case).

To answer the question around the need to be indexed, while also look up capable; the resource files that i'm using have multiple ways of referencing the same piece of data...

For example: one part of the file might refer to an element by its ID name, with another part referring to its index within the container, the format isnt mine, so i've got no real control over that.

this being said, a vector of pointers might be the trick, if the string name is used, use the map to find the reference, if an integer index is used, use the vector of pointers to the map.

This topic is closed to new replies.

Advertisement