How to do Ordered Multithreading: 2 or more threads executing 1 same function.

Started by
24 comments, last by tom_mai78101 11 years, 6 months ago
ostfD.png

Here's my current code and the output:

[source lang="java"]public class Worker{
public static int id;

public static void print(){
System.out.println("Thread hashcode: " + Integer.toString(id));
}
}

public class Main implements Runnable {
public static void main(String[] arg){
Main m = new Main();
Thread t1 = new Thread(m);
Thread t2 = new Thread(m);

t1.start();
t2.start();
}

public synchronized void run(){
Thread t = Thread.currentThread();
for (int i = 0; i < 20; i++){
Worker.id = t.hashCode();
Worker.print();
}
}
}

//=================OUTPUT===========================
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 15868055
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
Thread hashcode: 26983814
[/source]

The ouput, in other words, are labeled as:

AAAAAABBBBBB

I would like to make it so that it does:

ABABABABABAB

In the past, I would create threads and make them work on things independent of one another, Thread A does X work, Thread B does Y work, Thread C does Z work, and so on. The X, Y, and Z do not conflict each other.

Now that I'm starting out on doing multithreading in the same function, I realized it's a unknown adventure for me. I would like someone to lead the way, and warn me, "It's too dangerous to go alone, take this!".

Would anyone like to help?

EDIT: I'm using Firefox 16 and Internet Explorer 9, both browsers appeared to wonk out the code formatting.
Advertisement
If all you want is minimal overlapping, you can just give each of your threads an "index" (from 0, 1, 2, ..., n - 1 where n is your thread count) and make each thread aware of how many threads it is working with. Then, each thread with index T does work unit T, T + n, T + 2n, T + 3n, ... there you have it, zero overlapping. For two threads, the first one will do 0, 2, 4, 6, ... and the second one will do 1, 3, 5, 7, ... all within the same "function".

Note you probably don't want to use System.out.println when working with threads, since they'll print their stuff in nondeterministic order depending on which thread got there first. You want to keep a memory buffer which'll store your thread's results (one slot for each work unit you're doing) and each thread writes there. Then, at the end, you can process it however you want.

Also, for multithreading to be of any use, you need your function to contain significant work that can be parallelized. If you're just trying to get two threads to somehow execute an arbitrary function "together" (your diagram is rather vague), that's not the way to go, especially considering each thread has its own stack and CPU context.

I find OpenMP to be rather refreshing when it comes to getting familiar with the concepts involved in scalable multithreading. Don't know if Java has that though.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

The significant work I want to give to 2 or more threads, is a Rendering function. All it does is to blit bitmap images onto a Canvas, which will, according to my game logic, make the Canvas "refresh" quicker.

This is why I wanted the threads to execute in an orderly fashion, like a queue pool of threads running towards a goal and hitting a button, in a straight line.

If I index each thread with an ID or some sort, I wouldn't know how to implement it. I wouldn't be able to split the Rendering function in groups in whic they (group IDs) are divisible by the respective thread IDs. The Rendering function is based solely on the order of execution, so it's easier to sort their Z positions out (drawing from back to front). And another problem would be, if I don't do this in an orderly fashion, the rendering will be a catastrophe.

=========================================

Just off topic question. What do you do to println() properly for multithreading programs, if I shouldn't be using println(), as you've mentioned, but I don't want to use a memory buffer? I want to avoid using any sort of memory buffers as this program is to be executed on a hardware with very limited memory.

The significant work I want to give to 2 or more threads, is a Rendering function. All it does is to blit bitmap images onto a Canvas, which will, according to my game logic, make the Canvas "refresh" quicker.

This is why I wanted the threads to execute in an orderly fashion, like a queue pool of threads running towards a goal and hitting a button, in a straight line.

If I index each thread with an ID or some sort, I wouldn't know how to implement it. I wouldn't be able to split the Rendering function in groups in whic they (group IDs) are divisible by the respective thread IDs. The Rendering function is based solely on the order of execution, so it's easier to sort their Z positions out (drawing from back to front). And another problem would be, if I don't do this in an orderly fashion, the rendering will be a catastrophe.

=========================================

Just off topic question. What do you do to println() properly for multithreading programs, if I shouldn't be using println(), as you've mentioned, but I don't want to use a memory buffer? I want to avoid using any sort of memory buffers as this program is to be executed on a hardware with very limited memory.


You pretty much can't control the order of execution when you have multiple threads without removing the parallell execution(Which really is the point of using threads)

Now, since you are blitting bitmaps onto a canvas you should probably give each thread its own section of the canvas to write to so if you got 2 threads you can just split the canvas in the middle and have ThreadA blit the left side while ThreadB blits the right side (with 4 threads you can split it both vertically and horizontally or into 4 "rows" or "columns"), that way there is no interference between threads.

as for println, there is no good way to solve this without synchronization (Which degrades performance), you can have each thread write to its own stream without any problems but if you want them to write to the same stream you will need synchronization. (println in java is synchronized so if one thread calls println on standard output any other thread trying to call println will block until the first one has completed its println call), your best bet is probably to write to separate streams, timestamp the writes and then merge the two streams at a later point. (printing to the console is slow anyway so its not a big deal to merge all the output at the end of each frame and then push it out)
[size="1"]I don't suffer from insanity, I'm enjoying every minute of it.
The voices in my head may not be real, but they have some good ideas!
That, I did not realize the meaning until now.

What method design (function design) should I approach so that it would like ideally parallelized? And that would be it. I'll try to study the pattern or paradigm, and probably ask a few more questions about them if I get into some problems.
For good performance with multiple threads here's some rough guidelines of what to do:

- Output from each thread must go to a different bit of memory (i.e. on a separate cache line to memory any other threads touch). This may mean you need two copies of the data.
- Read only input data can be shared between all threads.
- Instead of using threads directly, use a thread pool that someone else has written and create one thread per CPU core if you get a choice.
- The main thread should split the work to be done into a bunch of independent tasks and adds them to the pool's work queue. It can then continue with other processing that won't affect the data the threads are using if you want. At some point the main thread probably needs to block and wait for the threads to complete their tasks.
- If the tasks can vary in how long they take to process it's a good idea to create significantly more tasks than you have CPU cores to distribute the workload more evenly, but making too many tasks will also add overheads.

By the way because of the way the CPU cache works in 64-byte cache lines you really don't want to have one task processing even elements of an array, and another task handling the odd elements. It's usually much quicker to give the first half of the array to one thread, and the second half to another to stop them fighting over memory accesses. The same goes for splitting an image vertically - do all the splitting horizontally instead and it should be quicker.
At its core, parallelisation is about identifying distinct units of work that can be done at different times without depending on or interfering with one another. One approach to make a seemingly serial problem parallisable is by duplicating some of the data structures involved and coming up with a way of "merging" the results.

Two caveats:

  • You must be careful to actually increase throughput overall - there is a threshold of work that must be exceeded before the overhead involved in threading, copying, communication, synchronisation can be offset by the gains of simultaneous execution.

  • Find your bottleneck. Not being able to update the screen fast enough might not be limited by bitmap blitting onto a canvas - perhaps the limit is in sending the completed canvas over to the graphics card, for example. Optimising other areas may not result in the kind of speed boost you might be hoping for.

"Now, since you are blitting bitmaps onto a canvas you should probably give each thread its own section of the canvas to write to so if you got 2 threads you can just split the canvas in the middle and have ThreadA blit the left side while ThreadB blits the right side (with 4 threads you can split it both vertically and horizontally or into 4 "rows" or "columns"), that way there is no interference between threads."

Split vertically, not horizontally. Why? Because bitmap memory normally runs left to right, top to bottom. This means that the pixels either side of a vertical dividing line are next to each other in memory and there's a good chance caches will try and pre-load the data. If things next to each other in memory are accessed by different threads running on cores in different packages, they'll have to move ownership of the memory around between the cores' caches. This is slow (it's like a cache miss, but a bit less expensive).

If you split vertically, you minimise the number of pixels which border each other and might be repeatedly moved between the threads/cores.
Katie: wouldn't that be a horizontal split, though?

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

There's a lot of discussion about splitting up the canvas into groups, and then let each thread draw the group. It gave me a whole lot of questions.

1. Wouldn't splitting up the Canvas into areas for threads to render on cause something that resembles tearing?

2. In Java, I do not know of a Java method that allows splitting bitmaps while it is being drawn. What is the logic for that? Test to see if the bitmap Rect area is over the threshold, and then stop drawing that area, while letting the other thread draw the undrawn area?

3. rip-off mentioned that I may be passing a gradually completing Canvas into methods. The possibility that this might be my bottleneck is actually quite high (as I haven't considered it before), as I have passed the same reference to the Canvas object through a lot of methods. Are there other alternatives to efficiently draw tons of bitmaps (of powers of 2, in my case) onto a Canvas?

This topic is closed to new replies.

Advertisement