Jump to content

  • Log In with Google      Sign In   
  • Create Account

Threadpool design


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

  • You cannot reply to this topic
14 replies to this topic

#1 maya18222   Members   -  Reputation: 191

Like
1Likes
Like

Posted 30 September 2011 - 04:17 PM

First off, Im not trying to reinvent the wheel, I'm just doing this as an exercise.

Below is my ThreadPool class and first of all I was looking for some feedback on it. Secondly, you'll notice that queued tasks can only be dispatched to available threads in 2 places, when they are queued and then in the wait call. How would I go about implementing it so that tasks get dispatched as soon as a thread is free? Im thinking I'll need to have the ThreadPool operating on a thread of its own.

Thread Pool Code

#pragma once

#include <windows.h>
#include <vector>

template< int N >
class ThreadPool
{
public:
	ThreadPool()
	{
		for( int i = 0; i < N; ++i )
		{
			m_Threads[i] = CreateThread( 0, 0, Loop, &m_ThreadStates[i], 0, 0 );
		}
	}

	~ThreadPool()
	{
		for( int i = 0; i < N; ++i )
		{
			TerminateThread( m_Threads[i], 0 );
			CloseHandle( m_Threads[i] );
		}
	}

	void Queue( LPTHREAD_START_ROUTINE func, void* pParams )
	{
		Task t = { func, pParams };
		m_Tasks.push_back(t);

		for( int i = 0; i < N; ++i )
		{
			if( !m_ThreadStates[i].Working )
			{
				const Task& t = m_Tasks.back();
				m_ThreadStates[i].SetTask( t );
				m_Tasks.pop_back();
				return;
			}
		}
	}

	void WaitForAll()
	{
		while(true)
		{
			bool WorkStillinProcess = false;
			for( int i = 0; i < N; ++i )
			{
				if( m_ThreadStates[i].Working )
				{
					WorkStillinProcess = true;
				}
				else if( !m_Tasks.empty() )
				{
					const Task& t = m_Tasks.back();
					m_ThreadStates[i].SetTask( t );
					m_Tasks.pop_back();
					WorkStillinProcess = true;
				}
			}

			Sleep(0);

			if( !WorkStillinProcess )
				return;
		}
		return;	
	}

private:

	static DWORD WINAPI Loop( void* pParams )
	{
		ThreadState* pts = (ThreadState*)pParams;
		while(true)
		{
			if( pts->HasTask() )
			{
				pts->DoTask();
				pts->SetAvaliable();
			}

			Sleep(0);
		}
	}

	struct Task
	{
		LPTHREAD_START_ROUTINE Workfunc;
		void* pWorkParams;
	};

	struct ThreadState
	{
		ThreadState() : Working(0), Workfunc(0), pWorkParams(0) {}
		
		bool HasTask()
		{
			return Workfunc != 0;
		}

		void SetTask( const Task& t )
		{
			Working = 1;
			Workfunc = t.Workfunc;
			pWorkParams = t.pWorkParams;
		}

		void DoTask()
		{
			Workfunc( pWorkParams );
		}

		void SetAvaliable()
		{	
			Working = 0;
			Workfunc = 0;
			pWorkParams = 0;
		}

		int Working;
		LPTHREAD_START_ROUTINE Workfunc;
		void* pWorkParams;
	};

	ThreadState m_ThreadStates[N];
	HANDLE m_Threads[N];
	std::vector<Task> m_Tasks;
};

Usage

#include <iostream>
using namespace std;

#include "ThreadPool.h"

const unsigned int COUNT = 100000000;
const unsigned int THREADS = 4;

struct WorkParams
{
	float begin;
	float end;
	unsigned int count;	
	float result;
	float pad[16];
};

DWORD WINAPI ComputeAreaUnderCurve( void* pParams )
{
	WorkParams* plParams = (WorkParams*)pParams;
	double Area = 0.0f;
	double dx = ( plParams->end - plParams->begin ) / plParams->count;
	for( unsigned int i = 0; i < plParams->count; ++i )
	{
		double x = plParams->begin + (i * dx);
		Area += dx * x * x;
	}
	plParams->result = (float)Area;
	return 0;
}

int main()
{
	ThreadPool<THREADS> threadpool;

	while(true)
	{
		WorkParams wp[THREADS];
		float Result = 0.0f;

		for( int i = 0; i < THREADS; ++i )
		{
			wp[i].begin = i * (1.0f / THREADS);
			wp[i].end = (i+1) * (1.0f / THREADS);
			wp[i].count = (COUNT / THREADS);
			threadpool.Queue( ComputeAreaUnderCurve, &wp[i] );
		}

		threadpool.WaitForAll();

		for( int i = 0; i < THREADS; ++i )
		{
			Result += wp[i].result;
		}

		cout << "Result   : " << Result  << "\n";
	}

	cin.get();
}





Sponsor:

#2 Ravyne   GDNet+   -  Reputation: 7481

Like
1Likes
Like

Posted 30 September 2011 - 05:18 PM

Leaving everything else aside, its seems folly to use template parameters to determine the number of threads in the pool. Remember that templates are instantiated at compile time, one copy for each combination of template arguments. You don't really need separate copies for 1,2,4,8... capacity pools, and since you're kicking off (presumably) large, or at least non-trivial, jobs there's no argument against "overhead" involved in managing a dynamic number of threads.

From an API perspective, how does one set job priority? Thread affinity?

As for processing jobs as soon as a thread is able, you could use a job complete callback which would call on the job manager to kick off the next job as one way -- its non-deterministic though, which might be a problem in environments that expect less fluctuation (games, especially on consoles). The second approach is to just accept that the job manager operates on a fixed time step -- either along with some other fixed timestep (eg, framerate or physics time-step) or as a result of a timer-based callback or thread.

#3 _swx_   Members   -  Reputation: 944

Like
1Likes
Like

Posted 01 October 2011 - 05:56 AM

You should check out http://software.intel.com/en-us/articles/do-it-yourself-game-task-scheduling/

#4 Hodgman   Moderators   -  Reputation: 30415

Like
0Likes
Like

Posted 01 October 2011 - 08:11 AM

The ThreadPool class needs to be non-copyable (e.g. add a private, non-implemented copy-constructor and assignment-operator).
The destructor should wait for the threads to finish instead of calling TerminateThread.

The Queue function is not thread-safe -- only one thread can call this safely, but this fact is not documented with a comment.

The ThreadState logic contains dangerous race-conditions and is currently not safe to use:

Main thread calls:
                                m_ThreadStates[i].SetTask( t );
                  Which expands to:
                        Workfunc = t.Workfunc;//Has Task now begins returning true
                        pWorkParams = t.pWorkParams;//But DoTask is not safe to call until this line is finished

        Worker Thread:
		...
                        if( pts->HasTask() ) // returns Workfunc != 0
                        {
                                pts->DoTask(); // calls Workfunc( pWorkParams )


#5 phantom   Moderators   -  Reputation: 7317

Like
0Likes
Like

Posted 01 October 2011 - 08:22 AM

This thread has now been cleaned of the noise which has turned up.

Lets keep it on topic guys...

#6 maya18222   Members   -  Reputation: 191

Like
0Likes
Like

Posted 01 October 2011 - 11:25 AM

Thanks for the comments.

Ravyne - Yeah, the template is unnecessary, that was actually being used at an earlier point for something else. I hadn't really thought about job priority as I was thinking more along the lines of important jobs will need to be added first.

_swx_ - Thanks for the link

Hodgeman - Sorry, I should have mentioned how the ThreadPool is to be used. The threads never exit as they contain an infinite loop, by calling "wait" in the "main thread" before the end of the ThreadPool objects scope the threads can be terminated safely as all work is complete. If the threads exit when no more tasks are *currently* available then the threadpool cant be used again . And yes, the Queue function is designed to only be called on the thread the ThreadPool is created in. And thanks for pointing out the race condition.

phantom - I'm curious now as to what was said.

#7 Ravyne   GDNet+   -  Reputation: 7481

Like
0Likes
Like

Posted 01 October 2011 - 01:11 PM

So... You create a bunch of threads (jobs) that never die until the entire pool does? And a pool dies when it runs out of work, and presumably you create new pools over time as needed? If that's all you're doing, what's the advantage of this over individual thread wrappers? Otherwise, what are we missing in your explanation?

Maybe I'm not well-read on the subject but my understanding is that most threadpools just manage a list of jobs, and that the jobs end. When one job is finished, another gets to take the slot -- also, jobs are typically managed in a priority-queue (usually determined by priority assignment, deadline, or both). Generally, a thread-pool is a persistent thing -- Otherwise, what if you create 2+ threadpools that overlap in time? Now you've got 2 times or more the "optimal" number of threads operating. The whole point of a thread-pool, as I understand it, is to have global knowledge about how many threads are allocated to job-processing, synchronize between them (dependency graph) and to prioritize the work.

#8 maya18222   Members   -  Reputation: 191

Like
0Likes
Like

Posted 01 October 2011 - 01:26 PM

No. The ThreadPool dies when it goes out of scope, which could have the lifetime of the application. The threads stay alive until the ThreadPool is destructed. Tasks get queued. If there are tasks they get consumed by the threads. A thread gets a new task if its finished its current task and other tasks exist, else the threads just sleep.

#9 phantom   Moderators   -  Reputation: 7317

Like
0Likes
Like

Posted 01 October 2011 - 01:51 PM

One issue I have with your design is that while you 'sleep(0)' you are still effectively busy waiting as all that will happen is the OS will give up the quantum and then, if you are the only active app, give it right on back again.

A better solution might be to use events of some sort to signal to a thread 'wake up and work'; you could even use three events;

- 'wake up and work'
- 'time to die'
- 'exiting'

So your thread pool class posts some work to a thread and then raises the 'wake up and work' event which the thread is waiting on, the thread then goes off and does the work as before.

On shut down your thread pool posts 'time to die' and the thread wakes up, optionally clears any work in the queue (depending on design), then exits it's loop and signals 'exiting'. In the mean time, having signalled all the threads to shut down your main thread is waiting on all their 'exiting' events and when they are all signalled it knows they have shut down and it can continue to shut down as well.

And by 'events' I mean OS primatives or some abstraction there of, not checking for bools etc :)

#10 maya18222   Members   -  Reputation: 191

Like
0Likes
Like

Posted 01 October 2011 - 02:16 PM

What would you gain from using events though? If theres no work to be done, then you'd still just be iterating over some message loop, waiting for the WAKE_UP message.

And by 'events' I mean OS primatives or some abstraction there of, not checking for bools etc


Can you give an example? And whats wrong with checking bools?

#11 phantom   Moderators   -  Reputation: 7317

Like
1Likes
Like

Posted 01 October 2011 - 05:46 PM

Events let the OS put the thread to sleep, without burning CPU power, which can save energy and depending on the usage make your overall system more responsive. Basically instead of spinning up 3 thread and having them peg 3 cores of a 4 core system at 100% without doing any useful work they will sleep until woken up.

By 'OS primatives' I mean using something like, on Win32, CreateEvent which gives you back a handle you can wait on via a function such as WaitForSingleObject or WaitForMultipleObjects depending on your use case (in this case you'd want the latter as you'll be waiting for multiple thread events to signal).

This is better because, by using the OS functionality, you know it will do The Right Thing™. In threads you can get problems with non-atomic operations causing subtle and hard to track down bugs, such as situations where one thread starts to check something, gets swapped out, that value gets changed in memory but the thread doesn't see it.

Basically this is 'the right way' to do it, certainly when starting out as a common mistake people new to threading make is trying to create all manner of 'safety' features using bools, and operations on bools are not atomic and thus not thread safe :)

Windows sync functionality can be found here on MSDN; http://msdn.microsoft.com/en-us/library/ms686360(v=VS.85).aspx

#12 e‍dd   Members   -  Reputation: 2105

Like
0Likes
Like

Posted 01 October 2011 - 07:47 PM

And whats wrong with checking bools?


Where is the bool in your computer's memory hierarchy? Or more to the point, which copy of the bool is being checked in the conditional? Depending on the OS scheduler, the processor's memory model, the number of physical cores in your system, the compiler and optimizations employed, the load of the system and the direction the wind is blowing, this may or may not work as you intended.

The methods suggested by phantom ensure proper synchronisation and visibility.

#13 Hodgman   Moderators   -  Reputation: 30415

Like
0Likes
Like

Posted 01 October 2011 - 09:41 PM

The threads never exit as they contain an infinite loop, by calling "wait" in the "main thread" before the end of the ThreadPool objects scope the threads can be terminated safely as all work is complete. If the threads exit when no more tasks are *currently* available then the threadpool cant be used again . And yes, the Queue function is designed to only be called on the thread the ThreadPool is created in. And thanks for pointing out the race condition.

I'd still try to avoid using TerminateThread -- it's a very "last resort" kind of function, the threading equivalent of ctrl+alt+delete. Using it can cause resource leaks (on XP it's a guaranteed leak of ~1MB RAM), which at worst can lead to deadlock bugs (e.g. if the thread has locked a critical section and is then terminated without unlocking it).
I'd recommend changing your worker's loop check a thread-shared boolean (and break/return if it's true) and have the pool's destructor set the bool to true before waiting for the threads to finish.

Regarding the race condition, you can't fix that bug by rearranging lines of code either. The easiest way is to wrap the accessing of all those variables in a critical section. e.g.
lock( t.mutex );
  assert( t.workFunc == NULL );
  t.workFunc = f;
  t.workParams = p;
unlock( t.mutex );
...
lock( t.mutex );
  bool isAvailable = (t.workFunc == NULL);
  bool hasWork = (t.workFunc != NULL);
unlock( t.mutex );

How would I go about implementing it so that tasks get dispatched as soon as a thread is free? Im thinking I'll need to have the ThreadPool operating on a thread of its own.

To do this, you could share the queue between the workers, so they can do something like:
lock( pool.mutex );
  const Task& t = pool.m_Tasks.back();
  this->SetTask( t );
  pool.m_Tasks.pop_back();
unlock( pool.mutex );
this->DoTask();


#14 maya18222   Members   -  Reputation: 191

Like
0Likes
Like

Posted 02 October 2011 - 09:47 AM

Im going to redo the design using atomic operations as suggested from _swx_'s link. Phantom, I'm looking at the Wait() win32 calls and they don't mention anything about putting threads to sleep, but rather they just enter a wait state. However, at the moment I'm not really concerned with power efficiency.

Im going to have a good read through the win32 sync docs tonight.

#15 phantom   Moderators   -  Reputation: 7317

Like
0Likes
Like

Posted 02 October 2011 - 09:54 AM

'Putting threads to sleep' and 'entering a wait state' is, in this case, the same thing... and well, bad use of terms on my part as it could be confused with 'sleep'.

Basically when a thread enters a 'wait state' it is no longer considered for scheduling by the kernel until either a) an event it is waiting for happens, b) a time out occurs and, iirc, c) a RPC is performed or an IOCP completes.

In this instance we are intrested in reason 'a'; the thread won't be considered for scheduling until an event it is waiting for is signaled. This is all handed by windows under the hood.




Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.



PARTNERS