Fastest data structure

9 comments, last by steg 10 years, 12 months ago

Hi,

I'm using a hash map to store my objects, which contain vertices and a world offset position. I use a string as the key - the object's x,y,z position stored as a string - which is used to look up whether an object has been drawn, needs drawing, or needs removing. Is there anything faster than a hash map for this?

This is in Java by the way :-)

Thanks

If it isn't working, take a bath, have a think and try again...


Not sure on speed, but assuming they are all unique (what happens atm if you have two objects at the same x,y,z?)

You could use a Matrix ...

If you can pre-assign each object an ID from 0 to numObjects-1, you can just use an array, indexed by IDs.
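For example (a rough sketch - Chunk, getId() and MAX_CHUNKS are placeholder names here, not from your code):

Chunk[] chunksById = new Chunk[MAX_CHUNKS];

// O(1) insert and lookup, no hashing and no string building
void addChunk(Chunk c) { chunksById[c.getId()] = c; }
Chunk getChunk(int id) { return chunksById[id]; }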

However:
Can you explain the algorithm that's causing you to require this structure?
Sounds like the faster option would be to not draw things in a way that requires these kinds of lookups to begin with.

Hi,

Firstly, two objects would never have the same x,y,z; they are all static objects - basically voxels, like Minecraft.

I have a chunk class which has an array of blocks (my block class):


private Block[][][] Blocks;

I also have a chunk manager class which has the hashmap in question:


private HashMap<String, Chunk> ChunkList = new HashMap<String, Chunk>();

Chunks are added to this HashMap at the start of the game. For instance (this is just a simple example - 4 x 4 chunks):


for(int x = 0; x < 4; x++)
{
	for(int z = 0; z < 4; z++)
	{
		ChunkManager.getInstance().AddChunk(ChunkType.ChunkType_Random, x, 0, z);
	}
}

Then when rendering the chunks - done in chunk manager:


public void render(float x, float z)
{
	for(Chunk c : ChunkList.values())
	{
		// only render chunks within 3 chunk-widths of the given position on each axis
		int calc = Math.abs((c.getX() * Chunk.CHUNK_SIZE) - (int)Math.abs(x));
		if(calc < Chunk.CHUNK_SIZE * 3)
		{
			calc = Math.abs((c.getZ() * Chunk.CHUNK_SIZE) - (int)Math.abs(z));
			if(calc < Chunk.CHUNK_SIZE * 3)
			{
				c.render();
			}
		}
	}
}

Was just wondering what structure could be better than a hash map? The hash map is being used (or will be) to quickly look up whether a chunk needs rendering, loading or unloading...

Thanks for your input.


If it isn't working, take a bath, have a think and try again...

PS - here is a clip of what I am doing: [embedded video clip]

If it isn't working, take a bath, have a think and try again...

get out your flamethrowers!

relational databases implemented as static arrays.

one list of vertices/meshes
one list of object types
one list of active objects

an object struct has an active field to indicate whether it's in use and should be drawn, etc.
it also has a type field, which is the index of its type in the list of object types.

the object type struct has a meshID field, which is its index into the mesh database.

to draw:
for (a = 0; a < maxobjs; a++)
{
    if (obj[a].active)
    {
        t = obj[a].type;
        m = objtype[t].meshID;
        drawmesh(obj[a].x, obj[a].y, obj[a].z, m);
    }
}

objrecs carry target-specific info like active, location, orientation, type, and damage.

objtype recs carry type info like meshID, hp, maxspeed, turnrate, etc.

for static objtypes, it may just be name, meshID, and hp.

but this will be the fastest data structure.
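in java (since that's what the op is using), that layout might look roughly like this - class and field names are just placeholders, and drawmesh() is assumed to exist, so treat it as a sketch rather than real code:

class ObjRec {
    boolean active;      // is this slot in use / should it be drawn
    float x, y, z;       // location
    int type;            // index into the objtype list
}

class ObjType {
    int meshID;          // index into the mesh list
    int hp;
}

ObjRec[] obj = new ObjRec[MAXOBJS];
ObjType[] objtype = new ObjType[MAXTYPES];

void drawAll() {
    for (int a = 0; a < obj.length; a++) {
        if (obj[a] == null || !obj[a].active) continue;
        drawmesh(obj[a].x, obj[a].y, obj[a].z, objtype[obj[a].type].meshID);
    }
}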

possible speedups include putting more info in the objrec to reduce some dereferenced addressing, but it's not worth it.
if the target list is large, it can be ordered on x,y,z and indexed on x, y, or z to speed up searches. i use that method on a list of 18,000 huts to draw. but i don't even need it for drawing - i use it for collision checks while drawing: "i'm drawing a plant. is there a hut in the way? ok, no plant then." the list is sorted on x, then y, then z. then the list is indexed on x, let's say. the index holds the offset of the first object with that x, and the number of objects with that x. when you need to know if there's an object at some x,y,z, you look up the start offset in the index, then loop through the list from there for the number of objects with that x. since it's ordered, you can bail as soon as x gets too big, or y gets too big for a given x, or z gets too big for a given x and y.

but that speedup is primarily for collision checks, not drawing. for drawing, looping through the list with "if active, draw" is plenty fast.

an even faster speedup for collision checks is to create a collision map from the list of objects: a simple monochrome bitmap with a 1 where there's an object and a 0 where there isn't. or use a 2d array of unsigned char, unsigned short, or unsigned int that stores the obj# (its index in the obj database) instead of 1, so you also know what you hit. for each object in your database, you fill in the collision map with its id#, and use something like -1 to indicate open space (no collision). when it's time to check for a collision:

obj_collided_with = collisionmap[x][z];
if (obj_collided_with != -1) handle_collision();

hard to get faster than that.
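a rough java version of that collision map might look like this - the world dimensions and names are just examples, not actual code:

static final int WORLD_X = 256, WORLD_Z = 256;   // example world size
int[][] collisionMap = new int[WORLD_X][WORLD_Z];

void buildCollisionMap(int[] objX, int[] objZ, boolean[] objActive) {
    for (int[] row : collisionMap)
        java.util.Arrays.fill(row, -1);           // -1 means open space
    for (int a = 0; a < objX.length; a++)
        if (objActive[a])
            collisionMap[objX[a]][objZ[a]] = a;   // store the object's index
}

int objectAt(int x, int z) {
    return collisionMap[x][z];                    // -1 if nothing there, else which object you hit
}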

"used to lookup if object has been drawn or needs drawing or removing."


sounds like you may have less than optimal flow control, as well as data structures.

i can't think of any reason why one would need to look up whether something had been drawn yet. this is something that your flow control code should know at all times, as it's what controls the drawing of objects.

your basic drawing routine should be something like:
for (a=0; a<maxobjs; a++)
{
if (!obj[a].active) continue;
if (far_away(a)) continue;
if (frustum_cull(a)) continue;
drawit(a);
}

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net

PLAY CAVEMAN NOW!

http://rocklandsoftware.net/beta.php

Let's start with the most important question -- is it too slow right now? If it works, and it isn't a bottleneck, then it's better to continue with the rest of your project and only worry about the HashMap if it turns into a performance issue, or interferes with other parts of your implementation.

It sounds to me like your system is flawed. You shouldn't be deciding if an object should be drawn or not based on its location. You should have a better process for adding/removing objects from your scene. Hodgman has the right of it, which is that you should be looking to control your objects by ID, rather than by position.

Hey many thanks guys,

No, the project is running OK - frustum culling will be implemented today on chunks, but this can be expensive if no form of spatial partitioning has been done - we don't want to loop through 100,000 objects, for instance...

Using an array of object IDs and using the ID as the index (kinda like a bucket sort) - I guess this will be a dynamic array?

There still needs to be something in place to determine what is visible and what is not; for a voxel-type world it looks like an octree is a good solution to speed this process up.

@Norman - my draw routine is basically like that, but that is flawed as you would be checking every object in that loop; it needs splitting/partitioning up first.

My voxels do have a boolean to indicate whether they are active or not.

The reason for wanting to know if something has been drawn is that at some point it will need unloading from memory - so I guess I should have said: has it been loaded and is it still in view? If it has been loaded but is not in view, release it from memory.

I implemented face culling yesterday - basically, don't draw faces that have a neighbouring block. This is all done before anything is sent to the GPU, so it gave some very good optimization. Of course, you could take this further and merge triangles - but, and don't shout ;-), I'm using quads at the moment.
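Roughly, the neighbour test is something like this (a simplified sketch - the Block[][][] array and the isSolid() helper are just illustrative, not my actual code):

boolean faceHidden(Block[][][] blocks, int x, int y, int z, int dx, int dy, int dz) {
    int nx = x + dx, ny = y + dy, nz = z + dz;
    if (nx < 0 || ny < 0 || nz < 0
            || nx >= blocks.length || ny >= blocks[0].length || nz >= blocks[0][0].length)
        return false;                                  // faces on the chunk border stay visible
    Block neighbour = blocks[nx][ny][nz];
    return neighbour != null && neighbour.isSolid();   // a solid neighbour hides this face
}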

Have any of you guys got some good advice on what to do when a collision with one of the blocks occurs?

Again, thanks for your appreciated advice :-)

Steve

If it isn't working, take a bath, have a think and try again...

To understand the suitability of any given container, it's quite useful to understand what is going on under the hood.

The hash table is basically an array of 'buckets'. When you retrieve an item, it generates a hash code from your key and uses that to decide which bucket the item is in. The buckets themselves are searched linearly, so if multiple items end up in one bucket, it must do a linear search to find the correct one.

A lot depends, then, on the hash function that generates the hash code from your key. I'd assume that the default Java string hashing function is pretty decent, so you can probably expect a reasonable spread of values, and therefore not too many entries going into the same bucket (an unhashed, linearly incrementing integer, on the other hand, would pretty much guarantee you the worst-case performance of O(n)).

Another thing to consider is the load factor. The default value is 0.75 - which basically means that once the number of items reaches three quarters of the total number of buckets (the default capacity is 16), the structure doubles the number of buckets, so you can expect a certain amount of reallocation of memory and other work going on when this happens. You probably don't want this to happen too often, especially if the hash table is large and you're mid game, so you need to consider how much your table is likely to grow, if at all.
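For example, if you know roughly how many chunks you'll have, you can size the map once up front and avoid rehashing mid-game (the figure of 1024 chunks below is just for illustration):

int expectedChunks = 1024;
Map<String, Chunk> chunkList =
        new HashMap<String, Chunk>((int) (expectedChunks / 0.75f) + 1, 0.75f);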

So far we've only considered the actual performance of the container itself, but don't forget that we're doing other work here as well. Converting the floating point coordinates to a string will involve a whole load of work, plus memory allocation for the string, and then the string's hashing algorithm has to turn that into a hash code. This additional work is probably quite significant compared to the lookup itself. I'd guess that the time taken to convert the coordinates to a string would completely dwarf the time spent in the rest of the algorithm.

If you're looking for a fast algorithm that you can call many thousands of times a frame, this probably isn't it. There's a lot of pointless work going on with all the string conversion and hashing, so if you're doing a lot of lookups, you're going to suffer. Unless your chunk table is actually quite large, I wouldn't be too surprised if it turned out that a dumb linear search just based on the raw coordinates was quicker.
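If it ever does become a hotspot, one way to skip the string work entirely would be to pack the three integer chunk coordinates into a single long and use that as the key. A sketch, where the 21-bits-per-axis split is just one arbitrary choice:

static long packKey(int x, int y, int z) {
    return ((long) (x & 0x1FFFFF) << 42)
         | ((long) (y & 0x1FFFFF) << 21)
         |  (long) (z & 0x1FFFFF);
}

Map<Long, Chunk> chunkList = new HashMap<Long, Chunk>();

// usage: chunkList.put(packKey(cx, cy, cz), chunk);
//        Chunk c = chunkList.get(packKey(cx, cy, cz));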

However if my understanding of your usage is correct, you're not actually searching this table that often. It's probably way off from being a performance bottleneck, in which case, don't worry about it.

This topic is closed to new replies.
