I'm currently trying to get my head around how to implement dependencies between tasks in such a system.
For priority of tasks my task object stores a priority value -> priority_queue gives me the task with the highest priority.
The problem is defining dependencies, e.g. a render task that has to wait for a physics task to finish its work.
Can someone tell me, or give me an example of, how this is done right?
One way is for each task to simply have a list of things that depend on it.
Let's look at the simple case of each task having at most a single dependency. You have a global queue of pending tasks. Each task has a list of dependents, i.e. the tasks that are waiting on it. When a task is created, only add it to the global queue if it has no outstanding dependency. Otherwise, add it to the dependent list of the task it depends on. When a task is completed, move all the tasks in its dependent list to the pending task queue.
Extending this to allow multiple dependencies mostly just entails adding a counter of outstanding dependencies to each task. When a task completes, loop over its dependent tasks and decrement their "pending_dependency" count by 1. If the count hits 0, all dependencies have completed and the task is ready to be moved into the pending queue. If the count is greater than 0, the task is still waiting on something, so do nothing.
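A minimal single-threaded sketch of the counting scheme described above, combined with the priority_queue from the question. The names `Task`, `Scheduler`, `submit` and `run` are made up for illustration, and a real job system would need locking or atomics around the counter and the queue:

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical task type; field names are illustrative only.
struct Task {
    std::string name;
    int priority = 0;
    std::function<void()> work;
    std::vector<Task*> dependents;   // tasks that are waiting on this one
    int pending_dependency = 0;      // unfinished tasks this one is waiting on
    bool completed = false;
};

// Orders the queue so the highest priority value is on top.
struct LowerPriority {
    bool operator()(const Task* a, const Task* b) const {
        return a->priority < b->priority;
    }
};

class Scheduler {
public:
    // Register a task together with the tasks it depends on.
    void submit(Task& task, const std::vector<Task*>& dependencies = {}) {
        for (Task* dep : dependencies) {
            if (dep->completed) continue;      // already done, nothing to wait for
            dep->dependents.push_back(&task);
            ++task.pending_dependency;
        }
        // Only tasks with no outstanding dependencies go into the queue.
        if (task.pending_dependency == 0) ready_.push(&task);
    }

    // Run ready tasks until none are left; returns names in execution order.
    std::vector<std::string> run() {
        std::vector<std::string> order;
        while (!ready_.empty()) {
            Task* task = ready_.top();
            ready_.pop();
            if (task->work) task->work();
            task->completed = true;
            order.push_back(task->name);
            // Release dependents whose last dependency just finished.
            for (Task* dependent : task->dependents) {
                assert(dependent->pending_dependency > 0);
                if (--dependent->pending_dependency == 0) ready_.push(dependent);
            }
            task->dependents.clear();
        }
        return order;
    }

private:
    std::priority_queue<Task*, std::vector<Task*>, LowerPriority> ready_;
};
```

Note that priority only decides between tasks that are already ready. A high-priority render task still waits until every task it depends on has completed.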
If tasks can fail, all dependent tasks should also be marked as failed (recursively), so that a failure doesn't leave "stuck" tasks waiting forever on something that will never complete.
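One possible way to do that propagation, as a rough sketch. `FailableTask`, `Status` and `fail_with_dependents` are hypothetical names, and the walk uses an explicit stack instead of actual recursion so that long dependency chains can't overflow the call stack:

```cpp
#include <vector>

enum class Status { Pending, Done, Failed };

// Hypothetical task type reduced to what failure propagation needs.
struct FailableTask {
    Status status = Status::Pending;
    std::vector<FailableTask*> dependents;   // tasks that are waiting on this one
};

// Mark `task` and everything that transitively depends on it as failed.
void fail_with_dependents(FailableTask& task) {
    std::vector<FailableTask*> stack{&task};
    while (!stack.empty()) {
        FailableTask* current = stack.back();
        stack.pop_back();
        if (current->status == Status::Failed) continue;   // already handled
        current->status = Status::Failed;
        for (FailableTask* dependent : current->dependents) {
            stack.push_back(dependent);
        }
    }
}
```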
I would just avoid most of this, though. I can see good uses for dependencies in complex resource chain loading, but in main engine code I tend to advocate the fork-join model of threading. Instead of trying to queue up physics and render jobs simultaneously (do all render jobs depend on all physics jobs?), do one at a time. In your main loop, your physics update just adds a bunch of physics jobs. The loop then waits for all those jobs to complete. Then it might move on to rendering, generate all of its jobs and wait for them to finish. It's a _significantly_ easier model to work with, as you have a much stricter and easier to manage set of data dependencies between threads.
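To make the fork-join idea concrete, here is a small sketch using only the standard library. `run_phase` and `run_frame` are names invented for this example, and `std::async` stands in for whatever job system or thread pool the engine actually has:

```cpp
#include <functional>
#include <future>
#include <vector>

using Job = std::function<void()>;

// Fork: launch every job of one phase.
// Join: block until all of them have finished.
void run_phase(const std::vector<Job>& jobs) {
    std::vector<std::future<void>> pending;
    pending.reserve(jobs.size());
    for (const Job& job : jobs) {
        pending.push_back(std::async(std::launch::async, job));
    }
    for (std::future<void>& f : pending) {
        f.get();   // waits for the job; rethrows if the job threw
    }
}

// One frame: every physics job is done before any render job starts,
// so render jobs can read physics results without per-task dependencies.
void run_frame(const std::vector<Job>& physics_jobs,
               const std::vector<Job>& render_jobs) {
    run_phase(physics_jobs);
    run_phase(render_jobs);
}
```

Spawning a thread per job like this is only for illustration. The point is the shape of the main loop: each phase is a barrier, so the only cross-thread data hazards are between jobs of the same phase.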
Another small question: do I take tasks as they come, or do I first fill the queue with tasks for e.g. the entire frame and then schedule them?
Try both, see what works better. The only way you can _ever_ accurately reason about performance is to actually measure and compare, assuming that's your primary concern here.
I'm currently looking at Intel Threading Building Blocks. Is it worth using?
There's no strong reason not to if you like it and it does what you want. It's not horribly broken or anything.