Jump to content

  • Log In with Google      Sign In   
  • Create Account


Thread-safe unique sequential index per class


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
16 replies to this topic

#1 TheChubu   Crossbones+   -  Reputation: 3760

Like
2Likes
Like

Posted 08 February 2014 - 12:28 PM

Hi! So I was hacking a bit the Java Artemis ECS framework  (ECS = Entity component system) and noticed something: There are two parts of it that require unique integer indices per class, not per object instance.

 

It needs an unique index for each class that extends Component (ComponentType serves this purpose) and for each class that extends EntitySystem (the SystemIndexManager static inside EntitySystem serves this purpose).

 

Both solutions more or less rely on a HashMap storing the relationship between a class and its index.

 

(EntitySystem class source and ComponentType source for reference)

 

Both are unsafe in multi threading usage, meaning that if you either add components of the same type to two separate entities in different threads or create EntitySystem objects in different threads, you get non-deterministic results.

 

First instinct was to replace the static ints for AtomicIntegers, yet that doesn't makes it thread safe since there is a whole method and a HashMap in the middle.

 

So I am asking how would I implement an unique index per class. It doesn't has to be unique in itself but unique indices for EntitySystems and unique indices for ComponentTypes separately. It has to be sequential since those are used for direct array access on various places (so I can't have an EntitySystem with an index of 3 million for example).

 

I could use synchronized methods and ConcurrentHashMap, following the same ideas of Artemis, but its possible that there is a lock-free way to do this? Or at least least intrusive than synchronizing related methods? As I mentioned, I'm using Java so there are quite a bit of threading stuff to use.


"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


Sponsor:

#2 SeanMiddleditch   Members   -  Reputation: 4086

Like
0Likes
Like

Posted 08 February 2014 - 01:07 PM

This generally isn't a problem that happens in practice. All the code that initializes the list of available components or whatever would run in the main thread during early engine initialization before any threads are even created. I have no experience with Artemis or Java games specifically, but that's how I'd expect it to be done in any language.

#3 Servant of the Lord   Crossbones+   -  Reputation: 17272

Like
1Likes
Like

Posted 08 February 2014 - 02:17 PM

Random crazy thought:

Each concrete function has it's own address in memory. Would it be possible to do something like using a function known to exist uniquely for each component (like the class constructor) and use its address as a unique ID?

 

Or if you have an overridden 'getID()' function (like your getTypeFor() function), have that function just return its own address?

 

(C++-inspired pseudocode)

//typedef uintptr_t ComponentID; //If you want.
 
class MyComponent : public BaseComponent
{
     //overrides virtual BaseComponent::getID()
     uintptr_t getID() override 
     {
          //Not sure if this does what I think it does or even compiles:
          return reinterpret_cast<uintptr_t>(&MyComponent::getID);
     }
}

This would only work at runtime, and the IDs couldn't be saved to file or anything, but it would be constant for the duration of the program's execution and maybe(?) across threads (do functions change their addresses for each thread?).


It's perfectly fine to abbreviate my username to 'Servant' rather than copy+pasting it all the time.

[Fly with me on Twitter] [Google+] [My broken website]

All glory be to the Man at the right hand... On David's throne the King will reign, and the Government will rest upon His shoulders. All the earth will see the salvation of God.                                                                                                                                                       [Need free cloud storage? I personally like DropBox]

Of Stranger Flames - [indie turn-based rpg set in a para-historical French colony] | Indie RPG development journal


#4 Pink Horror   Members   -  Reputation: 1090

Like
1Likes
Like

Posted 08 February 2014 - 02:30 PM


Random crazy thought:
Each concrete function has it's own address in memory. Would it be possible to do something like using a function known to exist uniquely for each component (like the class constructor) and use its address as a unique ID?

 

How is that usable as an array index, or in Java?



#5 Servant of the Lord   Crossbones+   -  Reputation: 17272

Like
0Likes
Like

Posted 08 February 2014 - 02:42 PM

How is that usable as an array index, or in Java?

 
Oops, I missed the part of the sequential indices and was focusing on the cross-thread consistency problem. I saw the hashmaps in the ComponentType Java code, so I figured any unique integers would suffice. biggrin.png


It's perfectly fine to abbreviate my username to 'Servant' rather than copy+pasting it all the time.

[Fly with me on Twitter] [Google+] [My broken website]

All glory be to the Man at the right hand... On David's throne the King will reign, and the Government will rest upon His shoulders. All the earth will see the salvation of God.                                                                                                                                                       [Need free cloud storage? I personally like DropBox]

Of Stranger Flames - [indie turn-based rpg set in a para-historical French colony] | Indie RPG development journal


#6 Madhed   Crossbones+   -  Reputation: 2522

Like
1Likes
Like

Posted 08 February 2014 - 03:25 PM

The problem here is that in a multi threaded environment different classes could be assigned the same index, right?

So I guess the AtomicInteger wold be the right solution, no? Am I missing something?

 

Edit: Should have read the code first...

 

Edit2: What about static initialization blocks? I haven't done extensive java programming but they seem like a better solution. Initialization is done *once* the class is first loaded. They are thread safe.


Edited by Madhed, 08 February 2014 - 03:50 PM.


#7 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 08 February 2014 - 04:07 PM

I'm not following how exactly would you use the static initializer in this case, could you provide an example?

 

The idea is not to burden the user of the library with generating their own indices. ie, the indices can be generated with information that is inside Artemis, outside users shouldn't touch the indices. When you extend EntitySystem, you should already get an index for the class you created.

 

This is what I came up with: http://pastebin.com/g6mgRah5

 

EDIT: Whoops, typo, it should be 

synchronized (ClassIndexer.class)
{
i = Integer.valueOf( getNextIndex( superType ) );
indexMap.put( type, i );
return i.intValue();
}
(note the lack of type.getClass() )

 

The mapping isn't straightforward because the indices depend on the superclass. You want to start counting from a particular superclass, ie: A counter for EntitySystem sublcasses and another one for Component subclasses.

 

The synchronized hit comes the first time the index is required, otherwise it should return before getting to the synchronized block.


Edited by TheChubu, 08 February 2014 - 04:17 PM.

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


#8 Bacterius   Crossbones+   -  Reputation: 8188

Like
0Likes
Like

Posted 08 February 2014 - 05:53 PM

Well since they must be strictly sequential it would seem kind of difficult to make sure there are no gaps or repeats without synchronizing the counter (or just doing the setup work in a single thread). Do the classes even need to know what their index is? The sequential requirement is regrettable, otherwise there would be interesting solutions to the problem, for instance encoding the class name as an arbitrary unique integer, which could let you map that index to the class and vice versa in a completely stateless fashion sad.png


The slowsort algorithm is a perfect illustration of the multiply and surrender paradigm, which is perhaps the single most important paradigm in the development of reluctant algorithms. The basic multiply and surrender strategy consists in replacing the problem at hand by two or more subproblems, each slightly simpler than the original, and continue multiplying subproblems and subsubproblems recursively in this fashion as long as possible. At some point the subproblems will all become so simple that their solution can no longer be postponed, and we will have to surrender. Experience shows that, in most cases, by the time this point is reached the total work will be substantially higher than what could have been wasted by a more direct approach.

 

- Pessimal Algorithms and Simplexity Analysis


#9 frob   Moderators   -  Reputation: 19011

Like
1Likes
Like

Posted 08 February 2014 - 08:23 PM

I think their system's use of a ConcurrentHashMap is probably the right idea. 

 

Why are you doing this?

 

Is this for serialization or anything that sticks around between runs? Storing class-specific details based on an arbitrarily assigned integer at runtime is a really bad idea. What happens if later in development you need to add a new class? Are you going to go through every save file, every game asset, every tool, every piece of code, and modify the values to simply +1 everything that came after it? What if you add more than one, and they are in different places in code?

 

Is there some actual performance reason you are trying to address? There are many good solutions to the problem of object lookup depending on your specific needs.

 

There are other, much better ways to get a unique identifier for each class.  A simple run of foo.getClass().getName() would work for a unique class identifier. If you need to see if it is an EntityType of CompoenentType (not sure how you would have lost that information in the first place) a simple cast would either return the object or null, so you're covered.

 

A unique reference to an object's type is a solved problem in the language. Every non-anonymous class has a unique canonical type. 

 

Making a list of every object in the system is also fairly easy with reflection and you get all their canonical names.

 

What exactly are you doing that absolutely must have unique consecutive integers that never change between runs?


Check out my personal indie blog at bryanwagstaff.com.

#10 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 09 February 2014 - 07:18 AM

 for instance encoding the class name as an arbitrary unique integer, which could let you map that index to the class and vice versa in a completely stateless fashion sad.png

You say as encoding the class name on runtime with a function?

 

Anyway, I asked because I thought there would be some simpler solution with AtomicInteger (or maybe some initialization trick like Madhed suggested).

 

 


What exactly are you doing that absolutely must have unique consecutive integers that never change between runs?

The better question would be, did you read the OP and followed the links? biggrin.png

 

As I mentioned, I don't need an unique identifier for each class (String or otherwise). I need a sequential index, its used for what it looks like, indexing.

 

The actual thing (its there in the links, BSD, you can poke it biggrin.png) uses Component and EntitySystem indices for two things AFAIK:

 

1. Generating aspects. Bit fields are used to mark which Components a system might be interested in, each component type has its own index, thus there can be Aspects that have marked which components are interested in with bitfields (with other kind of relations like, "only if it has this kind of component" or "only if it doesn't has this kind of component" and so on).

 

2. Entities have bit fields for marking what kind of components they have and what kind of systems they can be processed on. Again, Component and EntitySystem indices are used for marking those bits. ie, component Position has index 4, so in Entity, if the fourth bit is ticked, then it has a Position.

 

Most of the relations are actually implicit, not direct. ie, you can't instantiate Entities, you can ask for them via a World instance (so World owns entities), Entities don't actually have a collection of components, they're stored separately.

 

The framework exposes it as if it was all like you'd expect, Systems have all the entities they process, Entities are sacks of components. As an user, that's all you see. But inside the framework, it doesn't works like that.

 

New classes would be simply another bit in the BitSet, it shouldn't pose a problem. You shouldn't serialize the indices anyway, the whole idea is to generate them on runtime as they are needed.

 

Now, I didn't designed the framework, all I know is that it works (been ported to various languages, used in a few games) so I'm not going to redesign the whole thing because bit fields with per-class indices aren't the best thing to do. I'm not in the best position to do so either since I didn't figure out exactly how all interconnects yet.

 

I do care that if I do relatively "innocent" things like creating components on separate threads, and assign them to entities on separate threads, it should work as expected.

 

BTW, I'll check ConcurrentHashMap biggrin.png

 

EDIT2: Now I think what I did on the pastebin isn't thread safe at all. ConcurrentHashMap it is!

 

EDIT3: ... I'm out of ideas, seriously, can somebody help me with this one? http://pastebin.com/g6mgRah5 I'm not sure replacing HashMap with ConcurrentHashMap is enough.


Edited by TheChubu, 09 February 2014 - 08:26 AM.

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


#11 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 09 February 2014 - 11:01 AM

This is the best I could come up with:

 

http://pastebin.com/ch9ht9Q5

 

Its big but hopefully avoids synchronization once most indices have been initialized.

 

I'm done with this, thanks all of you for your suggestions and ideas!


"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


#12 rip-off   Moderators   -  Reputation: 7706

Like
1Likes
Like

Posted 09 February 2014 - 03:50 PM

It is not thread safe to access a variable like resolvedMap like that, where some access is synchronized and others are not. Even though some operations are "reads", you cannot guarantee the state is safe to read from if another thread is changing the state. Also, there is multi-processor consistency, to ensure other processors will see the changed value.

 

ConcurrentHashMap is lock free in the way you appear to be looking for:


However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access.



#13 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 09 February 2014 - 04:53 PM


It is not thread safe to access a variable like resolvedMap like that, where some access is synchronized and others are not. Even though some operations are "reads", you cannot guarantee the state is safe to read from if another thread is changing the state.

That's the idea, first access isn't supposed to be thread safe at all, is the second access that does the trick. Unless I'm missing something, if a thread changes resolvedMap inside the synchronized block, the change should be visible at least for the next thread that enters the synchronized block right?

 

Logic is this:

 

Two threads enter the method, two threads retrieve the index, two threads see its null at the same time.

 

Now here is the locking, one thread reaches the synchronized fallback method call, the other thread has to wait.

 

This is the first thing the synchronized block tries to do:

  1. Integer i = resolvedMap.get( type );
  2.        
  3. if ( i != null )
  4. {
  5.     return i.intValue();
  6. }

It tries to retrieve the same value again. Since it was null before, it will still be null, so the thread will follow the rest of the code and initialize the index, thus finishing executing the synchronized method and releasing the lock.

 

So far we've added a value to resolvedMap inside the synchronized block.

 

Now second thread enters the synchronized fallback method, and tries to retrieve the same value again (like the previous thread did before) and this time it isn't null so it has to return.

 

It doesn't matters if the change is made visible outside the synchronized block in a later time, as long as the change made inside it is visible for the next thread that enters the synchronized fallback method, it will work I think.

 

The idea is that first time, no matter how many threads try to get the index for the same class, they'll get all blocked, first thread that grabs the lock will initialize the value for the rest, then the others will retrieve that same value again in an orderly fashion.

 

After that, when the change is visible to other threads in a later time, all further calls for that same index will never reach the synchronized block.

 

Now the thing is ConcurrentHashMaps solve the access thing, but not the whole other operations I need to do. Say that I remove the synchronization and replace the HashMaps with Concurrent ones:

 

Two threads enter the same method, two threads ask for the same index, the two will see a null value, so the two threads will try to execute the initialization method, and since it isn't synchronized, they'll overwrite each other. The only advantage is that once a thread reaches the put() call, it will be made visible everywhere else, but that doesn't solves my race condition if other threads already entered the initialization method.


Edited by TheChubu, 09 February 2014 - 04:54 PM.

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


#14 rip-off   Moderators   -  Reputation: 7706

Like
1Likes
Like

Posted 10 February 2014 - 04:50 AM

I know what you're trying to do, it is called double checked locking, and unfortunately it is not guaranteed to work in Java. The synchronized keyword is not just a lock, it also inhibits certain optimisations that would be correct in single threaded code but that can cause issues in the presence of concurrency.

 

In addition, it is explicitly stated that HashMap is unsafe to read while another thread is updating it:


Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

 

A concurrent hash map has the properties you need.


Now the thing is ConcurrentHashMaps solve the access thing, but not the whole other operations I need to do. Say that I remove the synchronization and replace the HashMaps with Concurrent ones:

You don't remove the synchronized block - you still need that as you correctly point out. However, by using a ConcurrentHashMap with the synchronized block, you can get the lock-free optimisation you are trying to implement, in a way that is guaranteed to work.



#15 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 10 February 2014 - 08:40 AM

I see. So the problem is that, while the next thread will see it as initialized, since the "put()" operation is done inside the synchronized block, the constructor might not be finished in time before the object gets retrieved by the next thread.

 

So what you're proposing is using a ConcurrentHashMap instead of a HashMap. I'm not seeing how that changes the constructor issue?


"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator


#16 rip-off   Moderators   -  Reputation: 7706

Like
1Likes
Like

Posted 10 February 2014 - 09:03 AM

The documentation for the java.util.concurrent package states:


Memory Consistency Properties

Chapter 17 of The Java™ Language Specification defines the happens-before relation on memory operations such as reads and writes of shared variables. The results of a write by one thread are guaranteed to be visible to a read by another thread only if the write operation happens-before the read operation. The synchronized andvolatile constructs, as well as the Thread.start() and Thread.join() methods, can form happens-before relationships. In particular:

  • Each action in a thread happens-before every action in that thread that comes later in the program's order.
  • An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock (synchronized block or method entry) of that same monitor. And because the happens-before relation is transitive, all actions of a thread prior to unlocking happen-before all actions subsequent to any thread locking that monitor.
  • A write to a volatile field happens-before every subsequent read of that same field. Writes and reads of volatile fields have similar memory consistency effects as entering and exiting monitors, but do not entail mutual exclusion locking.
  • A call to start on a thread happens-before any action in the started thread.
  • All actions in a thread happen-before any other thread successfully returns from a join on that thread.

The methods of all classes in java.util.concurrent and its subpackages extend these guarantees to higher-level synchronization. In particular:

  • Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
  • Actions in a thread prior to the submission of a Runnable to an Executor happen-before its execution begins. Similarly for Callables submitted to an ExecutorService.
  • Actions taken by the asynchronous computation represented by a Future happen-before actions subsequent to the retrieval of the result via Future.get() in another thread.
  • Actions prior to "releasing" synchronizer methods such as Lock.unlock, Semaphore.release, and CountDownLatch.countDown happen-before actions subsequent to a successful "acquiring" method such as Lock.lock, Semaphore.acquire, Condition.await, and CountDownLatch.await on the same synchronizer object in another thread.
  • For each pair of threads that successfully exchange objects via an Exchanger, actions prior to the exchange() in each thread happen-before those subsequent to the corresponding exchange() in another thread.
  • Actions prior to calling CyclicBarrier.await and Phaser.awaitAdvance (as well as its variants) happen-before actions performed by the barrier action, and actions performed by the barrier action happen-before actions subsequent to a successful return from the corresponding await in other threads.

The implementation of ConcurrentHashMap must do whatever it needs under the hood to ensure these properties.



#17 TheChubu   Crossbones+   -  Reputation: 3760

Like
0Likes
Like

Posted 10 February 2014 - 11:52 AM

All right then, I replaced the HashMaps with ConcurrentHashMaps. Thanks a lot for the feedback :D


"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

 

My journals: dustArtemis ECS framework and Making a Terrain Generator





Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS