• FEATURED
• FEATURED
• FEATURED
• FEATURED
• FEATURED

View more

View more

View more

### Image of the Day Submit

IOTD | Top Screenshots

### The latest, straight to your Inbox.

Subscribe to GameDev.net Direct to receive the latest updates and exclusive content.

# How to do Ordered Multithreading: 2 or more threads executing 1 same function.

Old topic!

Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

25 replies to this topic

### #1asperatology  Members

Posted 18 October 2012 - 05:48 AM

Here's my current code and the output:

The ouput, in other words, are labeled as:

AAAAAABBBBBB

I would like to make it so that it does:

ABABABABABAB

In the past, I would create threads and make them work on things independent of one another, Thread A does X work, Thread B does Y work, Thread C does Z work, and so on. The X, Y, and Z do not conflict each other.

Now that I'm starting out on doing multithreading in the same function, I realized it's a unknown adventure for me. I would like someone to lead the way, and warn me, "It's too dangerous to go alone, take this!".

Would anyone like to help?

EDIT: I'm using Firefox 16 and Internet Explorer 9, both browsers appeared to wonk out the code formatting.

Edited by tom_mai78101, 18 October 2012 - 05:52 AM.

### #2Bacterius  Members

Posted 18 October 2012 - 06:48 AM

If all you want is minimal overlapping, you can just give each of your threads an "index" (from 0, 1, 2, ..., n - 1 where n is your thread count) and make each thread aware of how many threads it is working with. Then, each thread with index T does work unit T, T + n, T + 2n, T + 3n, ... there you have it, zero overlapping. For two threads, the first one will do 0, 2, 4, 6, ... and the second one will do 1, 3, 5, 7, ... all within the same "function".

Note you probably don't want to use System.out.println when working with threads, since they'll print their stuff in nondeterministic order depending on which thread got there first. You want to keep a memory buffer which'll store your thread's results (one slot for each work unit you're doing) and each thread writes there. Then, at the end, you can process it however you want.

Also, for multithreading to be of any use, you need your function to contain significant work that can be parallelized. If you're just trying to get two threads to somehow execute an arbitrary function "together" (your diagram is rather vague), that's not the way to go, especially considering each thread has its own stack and CPU context.

I find OpenMP to be rather refreshing when it comes to getting familiar with the concepts involved in scalable multithreading. Don't know if Java has that though.

Edited by Bacterius, 18 October 2012 - 06:53 AM.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

### #3asperatology  Members

Posted 18 October 2012 - 09:39 AM

The significant work I want to give to 2 or more threads, is a Rendering function. All it does is to blit bitmap images onto a Canvas, which will, according to my game logic, make the Canvas "refresh" quicker.

This is why I wanted the threads to execute in an orderly fashion, like a queue pool of threads running towards a goal and hitting a button, in a straight line.

If I index each thread with an ID or some sort, I wouldn't know how to implement it. I wouldn't be able to split the Rendering function in groups in whic they (group IDs) are divisible by the respective thread IDs. The Rendering function is based solely on the order of execution, so it's easier to sort their Z positions out (drawing from back to front). And another problem would be, if I don't do this in an orderly fashion, the rendering will be a catastrophe.

=========================================

Just off topic question. What do you do to println() properly for multithreading programs, if I shouldn't be using println(), as you've mentioned, but I don't want to use a memory buffer? I want to avoid using any sort of memory buffers as this program is to be executed on a hardware with very limited memory.

### #4SimonForsman  Members

Posted 18 October 2012 - 10:48 AM

The significant work I want to give to 2 or more threads, is a Rendering function. All it does is to blit bitmap images onto a Canvas, which will, according to my game logic, make the Canvas "refresh" quicker.

This is why I wanted the threads to execute in an orderly fashion, like a queue pool of threads running towards a goal and hitting a button, in a straight line.

If I index each thread with an ID or some sort, I wouldn't know how to implement it. I wouldn't be able to split the Rendering function in groups in whic they (group IDs) are divisible by the respective thread IDs. The Rendering function is based solely on the order of execution, so it's easier to sort their Z positions out (drawing from back to front). And another problem would be, if I don't do this in an orderly fashion, the rendering will be a catastrophe.

=========================================

Just off topic question. What do you do to println() properly for multithreading programs, if I shouldn't be using println(), as you've mentioned, but I don't want to use a memory buffer? I want to avoid using any sort of memory buffers as this program is to be executed on a hardware with very limited memory.

You pretty much can't control the order of execution when you have multiple threads without removing the parallell execution(Which really is the point of using threads)

Now, since you are blitting bitmaps onto a canvas you should probably give each thread its own section of the canvas to write to so if you got 2 threads you can just split the canvas in the middle and have ThreadA blit the left side while ThreadB blits the right side (with 4 threads you can split it both vertically and horizontally or into 4 "rows" or "columns"), that way there is no interference between threads.

as for println, there is no good way to solve this without synchronization (Which degrades performance), you can have each thread write to its own stream without any problems but if you want them to write to the same stream you will need synchronization. (println in java is synchronized so if one thread calls println on standard output any other thread trying to call println will block until the first one has completed its println call), your best bet is probably to write to separate streams, timestamp the writes and then merge the two streams at a later point. (printing to the console is slow anyway so its not a big deal to merge all the output at the end of each frame and then push it out)

Edited by SimonForsman, 18 October 2012 - 10:57 AM.

I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

### #5asperatology  Members

Posted 18 October 2012 - 11:15 AM

That, I did not realize the meaning until now.

What method design (function design) should I approach so that it would like ideally parallelized? And that would be it. I'll try to study the pattern or paradigm, and probably ask a few more questions about them if I get into some problems.

Posted 18 October 2012 - 03:43 PM

For good performance with multiple threads here's some rough guidelines of what to do:

- Output from each thread must go to a different bit of memory (i.e. on a separate cache line to memory any other threads touch). This may mean you need two copies of the data.
- Instead of using threads directly, use a thread pool that someone else has written and create one thread per CPU core if you get a choice.
- The main thread should split the work to be done into a bunch of independent tasks and adds them to the pool's work queue. It can then continue with other processing that won't affect the data the threads are using if you want. At some point the main thread probably needs to block and wait for the threads to complete their tasks.
- If the tasks can vary in how long they take to process it's a good idea to create significantly more tasks than you have CPU cores to distribute the workload more evenly, but making too many tasks will also add overheads.

By the way because of the way the CPU cache works in 64-byte cache lines you really don't want to have one task processing even elements of an array, and another task handling the odd elements. It's usually much quicker to give the first half of the array to one thread, and the second half to another to stop them fighting over memory accesses. The same goes for splitting an image vertically - do all the splitting horizontally instead and it should be quicker.

### #7rip-off  Moderators

Posted 18 October 2012 - 04:39 PM

POPULAR

At its core, parallelisation is about identifying distinct units of work that can be done at different times without depending on or interfering with one another. One approach to make a seemingly serial problem parallisable is by duplicating some of the data structures involved and coming up with a way of "merging" the results.

Two caveats:
• You must be careful to actually increase throughput overall - there is a threshold of work that must be exceeded before the overhead involved in threading, copying, communication, synchronisation can be offset by the gains of simultaneous execution.
• Find your bottleneck. Not being able to update the screen fast enough might not be limited by bitmap blitting onto a canvas - perhaps the limit is in sending the completed canvas over to the graphics card, for example. Optimising other areas may not result in the kind of speed boost you might be hoping for.

### #8Katie  Members

Posted 19 October 2012 - 02:10 AM

"Now, since you are blitting bitmaps onto a canvas you should probably give each thread its own section of the canvas to write to so if you got 2 threads you can just split the canvas in the middle and have ThreadA blit the left side while ThreadB blits the right side (with 4 threads you can split it both vertically and horizontally or into 4 "rows" or "columns"), that way there is no interference between threads."

Split vertically, not horizontally. Why? Because bitmap memory normally runs left to right, top to bottom. This means that the pixels either side of a vertical dividing line are next to each other in memory and there's a good chance caches will try and pre-load the data. If things next to each other in memory are accessed by different threads running on cores in different packages, they'll have to move ownership of the memory around between the cores' caches. This is slow (it's like a cache miss, but a bit less expensive).

If you split vertically, you minimise the number of pixels which border each other and might be repeatedly moved between the threads/cores.

### #9Bacterius  Members

Posted 19 October 2012 - 02:23 AM

Katie: wouldn't that be a horizontal split, though?

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

### #10asperatology  Members

Posted 19 October 2012 - 05:10 AM

There's a lot of discussion about splitting up the canvas into groups, and then let each thread draw the group. It gave me a whole lot of questions.

1. Wouldn't splitting up the Canvas into areas for threads to render on cause something that resembles tearing?

2. In Java, I do not know of a Java method that allows splitting bitmaps while it is being drawn. What is the logic for that? Test to see if the bitmap Rect area is over the threshold, and then stop drawing that area, while letting the other thread draw the undrawn area?

3. rip-off mentioned that I may be passing a gradually completing Canvas into methods. The possibility that this might be my bottleneck is actually quite high (as I haven't considered it before), as I have passed the same reference to the Canvas object through a lot of methods. Are there other alternatives to efficiently draw tons of bitmaps (of powers of 2, in my case) onto a Canvas?

Edited by tom_mai78101, 19 October 2012 - 05:12 AM.

### #11SimonForsman  Members

Posted 19 October 2012 - 05:23 AM

There's a lot of discussion about splitting up the canvas into groups, and then let each thread draw the group. It gave me a whole lot of questions.

1. Wouldn't splitting up the Canvas into areas for threads to render on cause something that resembles tearing?

2. In Java, I do not know of a Java method that allows splitting bitmaps while it is being drawn. What is the logic for that? Test to see if the bitmap Rect area is over the threshold, and then stop drawing that area, while letting the other thread draw the undrawn area?

3. rip-off mentioned that I may be passing a gradually completing Canvas into methods. The possibility that this might be my bottleneck is actually quite high (as I haven't considered it before), as I have passed the same reference to the Canvas object through a lot of methods. Are there other alternatives to efficiently draw tons of bitmaps (of powers of 2, in my case) onto a Canvas?

1,2 you can access a bitmaps data array in Java (atleast on Android) and copy the parts of the array you want to the buffer you're blitting to.

3. yes, this is quite possible, unless you are blitting tons of bitmaps or using some really ineffective blitting method sending the canvas to the GPU will be the slow part. The Canvas in Java seems to be singlebuffered so you should blit to a software buffer (BufferedImage should work) and then blit that buffer to the canvas. (which means you only do a single blit to the canvas). Also, if you have layers or sections of layers that rarely change you can blit those to their own buffered images at launch + when they change and then blit those images to the main bufferedimage each frame.

Edited by SimonForsman, 19 October 2012 - 05:49 AM.

I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!

### #12Katie  Members

Posted 19 October 2012 - 01:28 PM

"wouldn't that be a horizontal split, though?"

OK, I was terminologically unclear. Split it vertically into horizontal strips. Yes, the split line is horizontal.

### #13asperatology  Members

Posted 19 October 2012 - 07:31 PM

The main problem for splitting the Canvas for threads to work on is the splitting part. I have not found so far a way to split a Canvas class object area into at least two areas. They said that the class Canvas doesn't "own" pixel data. The canvas I'm using to make drawing calls is always attached to the bitmap images.

I'm compelled to create more Canvas objects, and run threads through them. Merging Canvas is going to be a new topic for me, and so does trying to get 2 canvas to draw the same bitmap at different locations to match up with its universal coordinates (originally intended coordinates), because once they are split up, the X and Y destination coordinates for the bitmaps are not entirely the same as they should be.

For Android, in my opinion, their method of using their subset of Java would involve something like using Canvas constructor to create one from a bitmap. Maybe that counts as an BufferedImage substitute.

I did checked out a few resources about parallel programming for Java. The conncurrency keyword, "synchronized" is a total mystery for me. They also mentioned that how a "synchronized" keyword can also affect immensely the execution of threads. It would seemed I have reached a Pandora's Box, and probably may get my project totalled if I keep nudging towards it.

### #14rip-off  Moderators

Posted 20 October 2012 - 09:32 AM

There are lots of techniques that can be used to speed up the game. Multi-threading is a very complex way - it generally requires large architectural changes to how your program works, and also can only provide at most a constant speedup N - where N is the number of distinct hardware "cores" that can be used simultaneously. The chances that your program will get such a boost are low, most real applications aren't perfectly parallisable.

There might be other ways which can offer a much larger speedup - algorithmic improvements, choosing the right data structures or improving your memory layout and access patterns.

I did checked out a few resources about parallel programming for Java. The conncurrency keyword, "synchronized" is a total mystery for me. They also mentioned that how a "synchronized" keyword can also affect immensely the execution of threads. It would seemed I have reached a Pandora's Box, and probably may get my project totalled if I keep nudging towards it.

Hold up - if you don't understand such things then you absolutely should not be using threads. Threads add some very complex behaviour to the otherwise simple to understand Java programming model. When two threads interact without correct communication, the results are undefined and can result in some bizarre, even seemingly "impossible" situations.

If you are serious about learning to write multi-threaded programs you need to build up this knowledge first.

### #15asperatology  Members

Posted 21 October 2012 - 01:32 AM

My game is basically a player controlling a ball with an accelerometer embedded in a hardware system (such as an Android phone). The ball is used to knock other balls into a hole (the goal) in order to complete the level. There are many levels with objects interacting with the player's ball and the balls that should be knocked into the goal.

EDIT: I should mentioned that the accelerometer can easily change its X axis, Y axis, and Z axis values, simply because you are holding the accelerometer with your hands. It is not supposed to be placed on a table or on a hard surface.

About the game logic, the game itself is split into two groups, the Update group and the Rendering group. It's multithreaded and it's done in a simple way of abusing the nature of threads. Like many other people have said time and time again, threads are run inconsistently together, causing many race conditions and multiple complex problems if not handled correctly. By understanding the nature of how threads compete one another on the CPU cores (in this case, I have two cores), I utilize a parallel programming paradigm.

One thread is used to update logic ONLY. The other thread is used to render objects ONLY. And that's it. All variables are read-only if thread A wants to read what thread B is using, and vice versa. I have not done anything with concurrency in order to make both of these threads communicate with each other. I simply made them as separate processes with "global-ish" variables wrapped in a Java class (namespace, as you would call it).

When I test the program out, all is fine. Everything run as expected, and there's no bugs on the rendering or updates. It's a perfect combination. But, there are times when I felt it is running slow after a long period of processing the threads, updates, and rendering.

I did checked out a few resources about parallel programming for Java. The conncurrency keyword, "synchronized" is a total mystery for me. They also mentioned that how a "synchronized" keyword can also affect immensely the execution of threads. It would seemed I have reached a Pandora's Box, and probably may get my project totalled if I keep nudging towards it.

Hold up - if you don't understand such things then you absolutely should not be using threads. Threads add some very complex behaviour to the otherwise simple to understand Java programming model. When two threads interact without correct communication, the results are undefined and can result in some bizarre, even seemingly "impossible" situations.

If you are serious about learning to write multi-threaded programs you need to build up this knowledge first.

The only thing I understand about threads, is that they are unpredictable. So, I used this only understanding and wrote a medium-sized program that goes with this rule of thumb. Technically, it's a little bit like multithreading, but in reality, I really don't know a better way to describe this. "Separated threading" sounds a bit off, but that's a better description of it.

If "synchronized" is the answer to all of my questions, then it's either me not looking at it correctly or it's something I find it hard to grasp, yet can't find out why. But I did some more research before posting this.

The function pipe I drew in the diagram may well be a Java synchronized method. In order to let two threads go in one by one and on a tight, repeating loop, I have to understand the concept of notify(), notifyAll() and wait() and how it works in such a loop... This is the part where I'm probably stuck in, and had me lost my focus, I believe.

Edited by tom_mai78101, 21 October 2012 - 02:12 AM.

### #16nife87  Members

Posted 21 October 2012 - 02:37 AM

The only thing I understand about threads, is that they are unpredictable. So, I used this only understanding and wrote a medium-sized program that goes with this rule of thumb. Technically, it's a little bit like multithreading, but in reality, I really don't know a better way to describe this. "Separated threading" sounds a bit off, but that's a better description of it.

Sounds interesting. One thread/function that manages and executes different tasks, like when a multi-tasking OS is running 2 threads simultaneously on a single core (or running N+X threads on N cores, where X > 0)? I assume this requires that your tasks can suspend themselves after a timeslice (to give control back to the main function). Resembles coroutines (used within Lua to resemble multi-tasking) to me.

### #17asperatology  Members

Posted 21 October 2012 - 07:24 AM

The only thing I understand about threads, is that they are unpredictable. So, I used this only understanding and wrote a medium-sized program that goes with this rule of thumb. Technically, it's a little bit like multithreading, but in reality, I really don't know a better way to describe this. "Separated threading" sounds a bit off, but that's a better description of it.

Sounds interesting. One thread/function that manages and executes different tasks, like when a multi-tasking OS is running 2 threads simultaneously on a single core (or running N+X threads on N cores, where X > 0)? I assume this requires that your tasks can suspend themselves after a timeslice (to give control back to the main function). Resembles coroutines (used within Lua to resemble multi-tasking) to me.

I think you could say that. I just don't have any better descriptive ways to say it though.

### #18rip-off  Moderators

Posted 21 October 2012 - 07:44 AM

My game is basically a player controlling a ball with an accelerometer embedded in a hardware system (such as an Android phone). The ball is used to knock other balls into a hole (the goal) in order to complete the level. There are many levels with objects interacting with the player's ball and the balls that should be knocked into the goal.

This game doesn't sounds complicated enough to require multi-threading when rendering. Arguably it might not require any threading at all.

About the game logic, the game itself is split into two groups, the Update group and the Rendering group. It's multithreaded and it's done in a simple way of abusing the nature of threads. Like many other people have said time and time again, threads are run inconsistently together, causing many race conditions and multiple complex problems if not handled correctly. By understanding the nature of how threads compete one another on the CPU cores (in this case, I have two cores), I utilize a parallel programming paradigm.

One thread is used to update logic ONLY. The other thread is used to render objects ONLY. And that's it. All variables are read-only if thread A wants to read what thread B is using, and vice versa. I have not done anything with concurrency in order to make both of these threads communicate with each other. I simply made them as separate processes with "global-ish" variables wrapped in a Java class (namespace, as you would call it).

When I test the program out, all is fine. Everything run as expected, and there's no bugs on the rendering or updates. It's a perfect combination.

Unfortunately, it sounds like you have a latent bug. Your current combination of code, hardware and JVM implementation might not manifest the bug, but a change in any of these could break your program.

The problem is simple: Java doesn't guarantee that a thread will see a "consistent" view of the state of the program modified by other threads in the absence of synchronized (or Java classes/methods that have the same memory model semantics as synchronized (e.g. Atomic classes, volatile variables or explicit locks).

There are two particularly interesting cases of bugs caused by this.

One is that thread A does not see changes made by thread B at all - each thread might be only reading and writing to its own registers and processor caches, and in the absence of synchronization there is no reason for the JVM to explicitly let the processor know it must flush these local changes to main memory, and for the second thread to reload any changes.

The second is that thread A sees some of the actions of thread B, but in an unexpected order or even in seemingly "impossible" orders. For example, it is possible for a thread to see a partially constructed object in the following code:
private static Foo sharedFoo = null;

public static Foo getFoo()
{
if(sharedFoo == null) {
Foo foo = new Foo(42);
sharedFoo = foo;
}
return sharedFoo;
}

Even though Foo's constructor sets some field to the passed value, if two threads race on this method then it is possible for the following interaction to take place:
Thread A - Calls getFoo().getValue()
Thread A - Enter method getFoo()
Thread A - sharedFoo == null
Thread A - Allocate memory for "foo" // Note: for this example we presume the JVM zeros the allocated memory.
Thread A - set sharedFoo to reference this memory
>>> Context switch <<<
Thread B - Enter method getFoo()
Thread B - sharedFoo != null
Thread B - getValue() returns 0 // !!!
>>> Context switch <<<
Thread A - Calls constructor for "foo"
Thread A - getValue() returns 42

It might seem odd that the constructor of "foo" is run after the reference "sharedFoo" has been set, but this is a perfectly valid optimisation for the JVM/hardware to perform in the absence of synchronisation. If the constructor was even more complicated, you could end up with one thread seeing a class instance with the invariants broken!

But, there are times when I felt it is running slow after a long period of processing the threads, updates, and rendering.

If your game is running slow I am positive there are much better ways to get better performance. In any case, your first step should be to profile the application and determine the source of the slowness.

Adding threads can slow an application down - unless done with care.

If "synchronized" is the answer to all of my questions, then it's either me not looking at it correctly or it's something I find it hard to grasp, yet can't find out why. But I did some more research before posting this.

The function pipe I drew in the diagram may well be a Java synchronized method. In order to let two threads go in one by one and on a tight, repeating loop, I have to understand the concept of notify(), notifyAll() and wait() and how it works in such a loop... This is the part where I'm probably stuck in, and had me lost my focus, I believe.

One could use synchronized to get the kind of pattern you described above. However, the code would actually run slower than if you had a single thread alternating between the two actions.

### #19asperatology  Members

Posted 21 October 2012 - 11:27 AM

Do you mean something like this? I happened to get this to work as I wanted to.

What it does is it alternates between printing out "Thread 1 - First Thread" and "Thread 2- Last Thread" over and over again 50 times. I gave it a few extra milliseconds of time for it to completely finish the synchronized run() method, but that's only because I wanted to learn more about the keyword, synchronized.

Anyway, if my current project is built on a latent bug in its concurrent state, I think the best method to increase performance would have to be what you recommended me to do, since you did say that if I were to modify the thread logic, I could break the build easily. If I just modify a small portion of the thread flow, it wouldn't hurt much at all.

My biggest proof to support that sentence is the 7 additional entities and 9 level maps I have added since I started the "separated threading" paradigm four months ago (July, at the time). I just kept adding and adding new game entities and maps, to the point that I'm just fixing small collision bugs here and there.

If I use only one single thread to work on both the Update group and the Rendering group, its work would be about twice the amount of work it takes for a thread to process either the Update group or the Rendering group, and not both. I did test to see if running on a single thread works or not, just by rewriting a small portion of my base code, where it is the place that initiated the separated threading. The result is actually slower than usual; the game runs sluggishly slow. I reverted it back, and it's now normal.

I really don't know how to say this, but I have to keep this "latent bug" existing on my build.

Gotta love JVM for this, I guess. Or in this case, gotta love Android's Dalvik VM.

EDIT: I hate my browser crashing...

Edited by tom_mai78101, 21 October 2012 - 11:38 AM.

### #20ppgamedev  Members

Posted 21 October 2012 - 02:03 PM

The "wait" "run" code seems wrong to me: