Task/Job Scheduler code review

Started by
5 comments, last by Antheus 11 years, 11 months ago
I've been working on a simple, cross-platform task/job scheduler, and I would like some help reviewing the code. Any feedback on general pitfalls or problems is appreciated!

I've posted the code on github, and below:
https://gist.github.com/2625036


/// This is not written to compile. I just copied everything into one file to make viewing easy
/// Here is a sample usage (task definitions and implementations omitted)
int main()
{
TaskScheduler scheduler;
MyTaskA taskA;
MyTaskB taskB;
BOOL running = true
while (running)
{
scheduler.pushTask(&taskA);
scheduler.pushTask(&taskB);
scheduler.runUntilDone();
}
}
#include "boost/thread.hpp"
class Task;
class TaskScheduler;
class WorkerThread;
//// HEADERS ////
/// Abstract interface for all tasks run by scheduler
class Task
{
public:
virtual ~Task() { }
/** Checks if all dependencies are met and the task can run
* @returns true if scheduler can run task, false otherwise
*/
virtual BOOL isReady() = 0;
/// User task code goes here
virtual void run() = 0;
/// Checks whether the task is marked complete
BOOL complete() { return m_complete; }
/** Sets the complete status of the task
* @param v Completetion status
*/
void setComplete(BOOL v) { m_complete = v; }
private:
/// Flag indicating task completion
BOOL m_complete;
};
/// Runs Tasks across multiple threads on multiple cores
class TaskScheduler
{
public:
/** Create scheduler with default number of worker threads.
* Default is 1 thread per 1 logical core (core or HT unit)
*/
TaskScheduler();
/** Create scheduler with specified number of worker threads
* @param n Number of worker threads (must be less than MAX_THREADS)
*/
TaskScheduler(U32 n);
/// Finishes all currently running tasks, then stops all worker threads.
~TaskScheduler();
/** Schedules a task for execection
* The task is assigned to the first idle work thread if there is one;
* otherwise, the task is put on the queue of a worker thread. Pushed tasks
* execute immediately.
*/
void pushTask(Task* t);
/// Blocks execution until all worker threads have completed there tasks
void runUntilDone();
/** Gets number of worker threads
* @return Number of worker threads
*/
U32 threadCount();
private:
U32 m_workerCount;
WorkerThread* m_workers[MAX_THREADS];
BOOL m_running;
U32 m_nextPushIndex;
friend class WorkerThread;
};
/** Implementation of a task-stealing worker thread for the TaskScheduler
* WorkerThreads steal tasks from other WorkerThreads if they are out of tasks or
* blocked by task dependencies. If a WorkerThread is truly out of tasks, it sleeps on a
* condition variable until new tasks are available.
*/
class WorkerThread
{
public:
/** Creates a new worker thread--only called by a TaskScheduler
* The system thread is started and enters idle, waiting on a condition variable.
* @param scheduler Parent TaskScheduler
*/
WorkerThread(TaskScheduler* scheduler);
/// Deletes underlying system thread
~WorkerThread();
/** Adds a task to the local task pool
* This method is called by the TaskScheduler from the main thread and uses
* a spinlock to access task queue. Execution of the task can begin immediately after
* the lock mutex is released.
* @param t Task to add to queue
*/
void pushTask(Task* t);
/** Checks whether the worker thread is currently idling
* @return true if idling, false otherwise
*/
BOOL idling();
/** Blocks execution until the worker thread has completed all tasks in its queue
* and can no longer find tasks to steal
*/
void blockUntilDone();
/** Wakes the thread and joins execution
* TaskScheduler must have m_running = false or method will block indefinitely
*/
void stop();
private:
void thread_proc();
BOOL run();
void idle();
BOOL steal();
BOOL stealFromWorker(WorkerThread* wt);
TaskScheduler* m_scheduler;
boost::thread* m_internalThread;
boost::mutex m_workerMutex;
boost::condition_variable m_wakeUp;
boost::condition_variable m_done;
std::queue<Task*> m_tasks;
U32 m_taskCount;
BOOL m_idling;
BOOL m_dependencyBlocked;
};
//// HEADERS ////
//// IMPLEMENTATIONS ////
/// TASK SCHEDULER ///
TaskScheduler::TaskScheduler()
{
m_running = true;
// use number of 'logical cores' as default number of worker threads
// on CPUs with Hyperthreading, each core counts as 2
m_workerCount = boost::thread::hardware_concurrency();
for (U32 i = 0; i < m_workerCount; i++)
m_workers = new WorkerThread(this);
m_nextPushIndex = 0;
}
TaskScheduler::TaskScheduler(U32 n)
{
m_workerCount = n;
for (U32 i = 0; i < m_workerCount; i++)
m_workers = new WorkerThread(this);
m_nextPushIndex = 0;
}
TaskScheduler::~TaskScheduler()
{
m_running = false;
for (U32 i = 0; i < m_workerCount; i++)
m_workers->stop();
}
void TaskScheduler::pushTask(Task* t)
{
// try to find an idling worker thread
for (U32 i = 0; i < m_workerCount; i++)
{
if (m_workers->idling())
{
m_workers->pushTask(t);
return;
}
}

// push it onto the next worker thread round-robin style
// if there isn't an idling thread
m_workers[m_nextPushIndex++]->pushTask(t);
if (m_nextPushIndex >= m_workerCount)
m_nextPushIndex = 0;
}
void TaskScheduler::runUntilDone()
{
for (U32 i = 0; i < m_workerCount; i++)
m_workers->blockUntilDone();
}
U32 TaskScheduler::threadCount()
{
return m_workerCount;
}
/// TASK SCHEDULER ///
/// WORKER THREAD ///
WorkerThread::WorkerThread(TaskScheduler* scheduler)
{
m_scheduler = scheduler;
m_taskCount = 0;
m_idling = false;
m_dependencyBlocked = false;
// creates system thread
// thread_proc begins execution immediately
m_internalThread = new boost::thread(&WorkerThread::thread_proc, this);
}
WorkerThread::~WorkerThread()
{
delete m_internalThread;
}
void WorkerThread::pushTask(Task* t)
{
while (!m_workerMutex.try_lock());
m_tasks.push(t);
m_taskCount++;
m_idling = false;
m_workerMutex.unlock();

// wake up the thread if idling
m_wakeUp.notify_all();
}
BOOL WorkerThread::idling()
{
return m_idling;
}
void WorkerThread::blockUntilDone()
{
if (m_idling)
return;
// the m_done condition variable is notified when the thread enters idle,
// which only happens when it is out of tasks and can't find any to steal
boost::unique_lock<boost::mutex> lock(m_workerMutex);
while(!m_idling)
m_done.wait(lock);
}
void WorkerThread::thread_proc()
{
// always idle when thread first starts because
// not all other worker threads are gauranteed to
// be initialized yet, and steal() relies on that.
// Also, there shouldn't be any tasks yet anyway.
idle();
for(;;)
{
if (!m_scheduler->m_running)
break;
// try to run a task. If there are not tasks available,
// try to steal tasks. If there are no tasks to steal
// and the task queue is empty, then idle. If there are
// dependency-blocked tasks, the thread does not idle.
if (!run())
{
if (!steal() && m_taskCount <= 0)
idle();
}
}
}
BOOL WorkerThread::run()
{
if (m_taskCount <= 0)
return false;

// Try to find a task to run. If all tasks in the queue
// have been checked, then return false so we can try to steal.
// If we have tasks waiting on dependencies, set m_dependencyBlocked
// so other WorkerThreads know not to steal from this one.
m_workerMutex.lock();
m_dependencyBlocked = false;
U32 numPops = 0;
Task* t;
while (numPops < m_taskCount)
{
t = m_tasks.front();
m_tasks.pop();
if (t->isReady())
{
m_taskCount--;
m_workerMutex.unlock();
t->run();
return true;
}
numPops++;
m_tasks.push(t);
}
if (numPops > 0)
m_dependencyBlocked = true;
m_workerMutex.unlock();
return false;
}
void WorkerThread::idle()
{
m_workerMutex.lock();
m_idling = true;
m_workerMutex.unlock();
// If another thread is waiting on us to finish execution,
// notify that this thread is done now
// Used by blockUntilDone()
m_done.notify_all();
boost::unique_lock<boost::mutex> lock(m_workerMutex);
while (m_idling)
m_wakeUp.wait(lock);
}
BOOL WorkerThread::steal()
{
// check each worker thread for extra work
U32 workerCount = m_scheduler->m_workerCount;
for (U32 i = 0; i < workerCount; i++)
{
WorkerThread* wt = m_scheduler->m_workers;
if (wt == this || wt->m_dependencyBlocked)
continue;
if (wt->m_taskCount > 0)
{
if (stealFromWorker(wt))
return true;
}
}
return false;
}
BOOL WorkerThread::stealFromWorker(WorkerThread* wt)
{
while (!wt->m_workerMutex.try_lock());

// steal half of the other thread's tasks,
// rounding up. If they don't have tasks
// (the check in steal() is not guaranteed because,
// it does not lock the mutex), then return false so
// we can try another worker thread.
U32 numToSteal;
U32 taskCount = wt->m_taskCount
if (wt->m_taskCount <= 0)
{
wt->m_workerMutex.unlock();
return false;
}
else
numToSteal = (wt->m_taskCount + 1) / 2;
m_workerMutex.lock();
for (U32 i = 0; i < numToSteal; i++)
{
m_tasks.push(wt->m_tasks.front());
wt->m_tasks.pop();
wt->m_taskCount--;
m_taskCount++;
}
wt->m_workerMutex.unlock();
m_workerMutex.unlock();
return true;
}
void WorkerThread::stop()
{
m_workerMutex.lock();
m_idling = false;
m_workerMutex.unlock();
m_wakeUp.notify_all();
m_internalThread->join();
}
/// WORKER THREAD ///
//// IMPLEMENTATIONS ////
Advertisement
There are several race conditions surrounding your use of state variables (e.g. m_idle). Also, spinning on try_lock() seems a bit pointless; you should instead use a blocking lock() call so that your thread can be turned off by the OS scheduler.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]

I've changed the WorkerThread implementation to avoid the race conditions. It passes testing through Helgrind at least.

https://gist.github.com/2625036

[source lang="cpp"]

[color=#999988]/// This is not written to compile. I just copied everything into one file to make viewing easy
[color=#999988]/// Here is a sample usage (task definitions and implementations omitted)

[color=#445588]int main()
{
TaskScheduler scheduler;
MyTaskA taskA;
MyTaskB taskB;

BOOL running = true
while (running)
{
scheduler.pushTask(&taskA);
scheduler.pushTask(&taskB);
scheduler.runUntilDone();
}
}

[color=#999999]#include "boost/thread.hpp"

class [color=#445588]Task;
class [color=#445588]TaskScheduler;
class [color=#445588]WorkerThread;

[color=#999988]//// HEADERS ////

[color=#999988]/// Abstract interface for all tasks run by scheduler
class [color=#445588]Task
{
public:
virtual ~Task() { }

[color=#999988]/** Checks if all dependencies are met and the task can run
[color=#999988]* @returns true if scheduler can run task, false otherwise
[color=#999988]*/
virtual BOOL isReady() = [color=#009999]0;

[color=#999988]/// User task code goes here
virtual [color=#445588]void run() = [color=#009999]0;

[color=#999988]/// Checks whether the task is marked complete
BOOL complete() { return m_complete; }

[color=#999988]/** Sets the complete status of the task
[color=#999988]* @param v Completetion status
[color=#999988]*/
[color=#445588]void setComplete(BOOL v) { m_complete = v; }

private:
[color=#999988]/// Flag indicating task completion
BOOL m_complete;

};

[color=#999988]/// Runs Tasks across multiple threads on multiple cores
class [color=#445588]TaskScheduler
{
public:
[color=#999988]/** Create scheduler with default number of worker threads.
[color=#999988]* Default is 1 thread per 1 logical core (core or HT unit)
[color=#999988]*/
TaskScheduler();

[color=#999988]/** Create scheduler with specified number of worker threads
[color=#999988]* @param n Number of worker threads (must be less than MAX_THREADS)
[color=#999988]*/
TaskScheduler(U32 n);

[color=#999988]/// Finishes all currently running tasks, then stops all worker threads.
~TaskScheduler();

[color=#999988]/** Schedules a task for execection
[color=#999988]* The task is assigned to the first idle work thread if there is one;
[color=#999988]* otherwise, the task is put on the queue of a worker thread. Pushed tasks
[color=#999988]* execute immediately.
[color=#999988]*/
[color=#445588]void pushTask(Task* t);

[color=#999988]/// Blocks execution until all worker threads have completed there tasks
[color=#445588]void runUntilDone();

[color=#999988]/** Gets number of worker threads
[color=#999988]* @return Number of worker threads
[color=#999988]*/
U32 threadCount();

private:
U32 m_workerCount;
WorkerThread* m_workers[MAX_THREADS];
BOOL m_running;
U32 m_nextPushIndex;

friend class [color=#445588]WorkerThread;
};

[color=#999988]/** Implementation of a task-stealing worker thread for the TaskScheduler
[color=#999988]* WorkerThreads steal tasks from other WorkerThreads if they are out of tasks or
[color=#999988]* blocked by task dependencies. If a WorkerThread is truly out of tasks, it sleeps on a
[color=#999988]* condition variable until new tasks are available.
[color=#999988]*/
class [color=#445588]WorkerThread
{
public:
[color=#999988]/** Creates a new worker thread--only called by a TaskScheduler
[color=#999988]* The system thread is started and enters idle, waiting on a condition variable.
[color=#999988]* @param scheduler Parent TaskScheduler
[color=#999988]*/
WorkerThread(TaskScheduler* scheduler);

[color=#999988]/// Deletes underlying system thread
~WorkerThread();

[color=#999988]/** Adds a task to the local task pool
[color=#999988]* This method is called by the TaskScheduler from the main thread and uses
[color=#999988]* a spinlock to access task queue. Execution of the task can begin immediately after
[color=#999988]* the lock mutex is released.
[color=#999988]* @param t Task to add to queue
[color=#999988]*/
[color=#445588]void pushTask(Task* t);

[color=#999988]/** Checks whether the worker thread is currently idling
[color=#999988]* @return true if idling, false otherwise
[color=#999988]*/
BOOL idling();

[color=#999988]/** Blocks execution until the worker thread has completed all tasks in its queue
[color=#999988]* and can no longer find tasks to steal
[color=#999988]*/
[color=#445588]void blockUntilDone();

[color=#999988]/** Wakes the thread and joins execution
[color=#999988]* TaskScheduler must have m_running = false or method will block indefinitely
[color=#999988]*/
[color=#445588]void stop();

private:
[color=#445588]void thread_proc();
BOOL run();
[color=#445588]void idle();
BOOL steal();
BOOL stealFromWorker(WorkerThread* wt);

TaskScheduler* m_scheduler;
boost::thread* m_internalThread;
boost::mutex m_workerMutex;
boost::mutex m_idleMutex;
boost::condition_variable m_wakeUp;
boost::condition_variable m_done;
std::queue<Task*> m_tasks;
U32 m_taskCount;
BOOL m_idling;
BOOL m_dependencyBlocked;

};

[color=#999988]//// HEADERS ////

[color=#999988]//// IMPLEMENTATIONS ////

[color=#999988]/// TASK SCHEDULER ///
TaskScheduler::TaskScheduler()
{
m_running = true;

[color=#999988]// use number of 'logical cores' as default number of worker threads
[color=#999988]// on CPUs with Hyperthreading, each core counts as 2
m_workerCount = boost::thread::hardware_concurrency();

for (U32 i = [color=#009999]0; i < m_workerCount; i++)
m_workers = new WorkerThread(this);
m_nextPushIndex = [color=#009999]0;
}

TaskScheduler::TaskScheduler(U32 n)
{
m_workerCount = n;
for (U32 i = [color=#009999]0; i < m_workerCount; i++)
m_workers = new WorkerThread(this);
m_nextPushIndex = [color=#009999]0;
}

TaskScheduler::~TaskScheduler()
{
m_running = false;
for (U32 i = [color=#009999]0; i < m_workerCount; i++)
m_workers->stop();
}

[color=#445588]void TaskScheduler::pushTask(Task* t)
{
[color=#999988]// try to find an idling worker thread
for (U32 i = [color=#009999]0; i < m_workerCount; i++)
{
if (m_workers->idling())
{
m_workers->pushTask(t);
return;
}
}

[color=#999988]// push it onto the next worker thread round-robin style
[color=#999988]// if there isn't an idling thread
m_workers[m_nextPushIndex++]->pushTask(t);
if (m_nextPushIndex >= m_workerCount)
m_nextPushIndex = [color=#009999]0;
}

[color=#445588]void TaskScheduler::runUntilDone()
{
for (U32 i = [color=#009999]0; i < m_workerCount; i++)
m_workers->blockUntilDone();
}

U32 TaskScheduler::threadCount()
{
return m_workerCount;
}
[color=#999988]/// TASK SCHEDULER ///

[color=#999988]/// WORKER THREAD ///
WorkerThread::WorkerThread(TaskScheduler* scheduler)
{
m_scheduler = scheduler;
m_taskCount = [color=#009999]0;
m_idling = false;
m_dependencyBlocked = false;

[color=#999988]// creates system thread
[color=#999988]// thread_proc begins execution immediately
m_internalThread = new boost::thread(&WorkerThread::thread_proc, this);
}

WorkerThread::~WorkerThread()
{
delete m_internalThread;
}

[color=#445588]void WorkerThread::pushTask(Task* t)
{
m_workerMutex.lock();
m_tasks.push(t);
m_taskCount++;
m_workerMutex.unlock();

m_idleMutex.lock();
m_idling = false;
m_idleMutex.unlock();

[color=#999988]// wake up the thread if idling
m_wakeUp.notify_all();
}

BOOL WorkerThread::idling()
{
boost::lock_guard<boost::mutex> lock(m_idleMutex);
return m_idling;
}

[color=#445588]void WorkerThread::blockUntilDone()
{
[color=#999988]// the m_done condition variable is notified when the thread enters idle,
[color=#999988]// which only happens when it is out of tasks and can't find any to steal
boost::unique_lock<boost::mutex> lock(m_idleMutex);
while(!m_idling)
m_done.wait(lock);
}

[color=#445588]void WorkerThread::thread_proc()
{
[color=#999988]// always idle when thread first starts because
[color=#999988]// not all other worker threads are gauranteed to
[color=#999988]// be initialized yet, and steal() relies on that.
[color=#999988]// Also, there shouldn't be any tasks yet anyway.
idle();

for(;;)
{
if (!m_scheduler->m_running)
break;

[color=#999988]// try to run a task. If there are not tasks available,
[color=#999988]// try to steal tasks. If there are no tasks to steal
[color=#999988]// and the task queue is empty, then idle. If there are
[color=#999988]// dependency-blocked tasks, the thread does not idle.
if (!run())
{
if (!steal() && m_taskCount <= [color=#009999]0)
idle();
}
}
}

BOOL WorkerThread::run()
{
if (m_taskCount <= [color=#009999]0)
return false;

[color=#999988]// Try to find a task to run. If all tasks in the queue
[color=#999988]// have been checked, then return false so we can try to steal.
[color=#999988]// If we have tasks waiting on dependencies, set m_dependencyBlocked
[color=#999988]// so other WorkerThreads know not to steal from this one.
m_workerMutex.lock();
m_dependencyBlocked = false;
U32 numPops = [color=#009999]0;
Task* t;
while (numPops < m_taskCount)
{
t = m_tasks.front();
m_tasks.pop();
if (t->isReady())
{
m_taskCount--;
m_workerMutex.unlock();
t->run();
return true;
}
numPops++;
m_tasks.push(t);
}
if (numPops > [color=#009999]0)
m_dependencyBlocked = true;
m_workerMutex.unlock();
return false;
}

[color=#445588]void WorkerThread::idle()
{
m_idleMutex.lock();
if (m_taskCount > [color=#009999]0)
{
m_idleMutex.unlock();
return;
}
m_idling = true;
m_idleMutex.unlock();

[color=#999988]// If another thread is waiting on us to finish execution,
[color=#999988]// notify that this thread is done now
[color=#999988]// Used by blockUntilDone()
m_done.notify_all();

boost::unique_lock<boost::mutex> lock(m_idleMutex);
while (m_idling)
m_wakeUp.wait(lock);
}

BOOL WorkerThread::steal()
{
[color=#999988]// check each worker thread for extra work
U32 workerCount = m_scheduler->m_workerCount;
for (U32 i = [color=#009999]0; i < workerCount; i++)
{
WorkerThread* wt = m_scheduler->m_workers;
if (wt == this)
continue;

if (stealFromWorker(wt))
return true;
}

return false;
}

BOOL WorkerThread::stealFromWorker(WorkerThread* wt)
{
wt->m_workerMutex.lock();

[color=#999988]// steal half of the other thread's tasks,
[color=#999988]// rounding up. If they don't have tasks
[color=#999988]// (the check in steal() is not guaranteed because,
[color=#999988]// it does not lock the mutex), then return false so
[color=#999988]// we can try another worker thread.
U32 numToSteal;
U32 taskCount = wt->m_taskCount;
if (wt->m_taskCount <= [color=#009999]0 || wt->m_dependencyBlocked)
{
wt->m_workerMutex.unlock();
return false;
}
else
numToSteal = (wt->m_taskCount + [color=#009999]1) / [color=#009999]2;

m_workerMutex.lock();
for (U32 i = [color=#009999]0; i < numToSteal; i++)
{
m_tasks.push(wt->m_tasks.front());
wt->m_tasks.pop();
wt->m_taskCount--;
m_taskCount++;
}
wt->m_workerMutex.unlock();
m_workerMutex.unlock();
return true;
}

[color=#445588]void WorkerThread::stop()
{
m_idleMutex.lock();
m_idling = false;
m_idleMutex.unlock();
m_wakeUp.notify_all();
m_internalThread->join();
}
[color=#999988]/// WORKER THREAD ///

[color=#999988]//// IMPLEMENTATIONS ////
[/source]
These sorts of things really aren't easy to get running cleanly. There are however a whole lot of existing implementations that you might want to just use, rather than deal with this whole can of worms [unless you're doing it for practice].

Lets say you have a processor with 4 threads. You push 5 tasks into your system, where each task N blocks on a task being done by task N+1, except for the last task which is free to complete unimpeded. Your system assigns the first 4 tasks, which in turn eventually block on the completion of the 5th, which never gets scheduled.
@oolala:
Deadlocks are always a problem of multithreaded applications anyway how you manage the actiivites.

@Craig_jb:
I would maintain a special running list besides the list of tasks.
You need not searching for the running tasks anymore. On state change they move into and out of the list.
I do that for a timer list, because I have to handle some thousands of timers in parallel.

@oolala:
Deadlocks are always a problem of multithreaded applications anyway how you manage the actiivites.
True, but this is an instance where the user of the library has designed parallel code that does not contain a deadlock, but the implementation of the scheduler induces deadlock. That's a big, big distinction.
I'm not a fan of such design or current implementation thereof.

There's still race conditions. Race conditions occur when two atomic operations occur one after another, yet they should be atomic as a unit (or transaction).

This, for example:
m_workerMutex.lock();
m_tasks.push(t);
m_taskCount++;
m_workerMutex.unlock();

<-- I steal a task here

m_idleMutex.lock();
m_idling = false;
m_idleMutex.unlock();
<-- I have no tasks, but I'm not idle


There is too much interlocked or duplicated state:
m_taskCount and m_tasks. Why?

Then there are m_workerMutex, m_idleMutex, m_wakeUp, m_done, m_idling, m_dependencyBlocked.

A thread is a state machine. All of the above can be converted into a single variable or a synchronization primitive. Draw a graph on paper on how worker works, which states it can be in and under which conditions it can transition. Such formalization also allows proof of correctness.

Querying for idle() is highly undesirable. It's serial action ("if (idle()) "). Conditions and serial execution harm the throughput. Worker knows exactly when it's idle (it's either executing a task, or it's waiting on a condition variable).

ready() - a bad idea. A thread is either executing a task or it's waiting on condition variable.

To manage dependencies, look into topological sort. Single responsibility principle and all that - workers work and no more. Deciding what and how to run is the job of scheduler. They only submit tasks once they know they can run.

Reason:Tasks a, b;

a::ready() {
return b->ready();
}
b::ready() {
return a->ready();
}


As far as concurrency goes, here's a hard rule: Always acquire shared resources in same order.

If scheduler is supposed to resolve dependencies, it should do so at task level. Or, instead of opaque ready(), do something like this:struct Task {
vector<Task *> depends_on();
}
Using topological sort, scheduler can now determine the order in which to execute, or abort if a cycle is detected.

This topic is closed to new replies.

Advertisement