Threads and performance!

19 comments, last by uncutno2 16 years, 11 months ago
Hello, I have a problem. I have two threads in an audio system I am creating.

Thread one renders the sound and needs to run as close to real time as possible (on Win32, a non-real-time OS). It should run all the time (I tried sleeping it, but then it sleeps for too long and the audio output stream breaks up). Thread two runs the GUI, which is the user's controller for the system. The GUI is implemented in OpenGL, so this thread refreshes as often as it can, but waits if the other thread is in trouble.

This is working, but not optimally. Is there a way I can give super-high (maximum possible) priority to the first thread, but so that it still lets the other thread run sometimes? If I sleep a high-priority thread, does it wait a shorter time before returning than an average-priority thread? How can you tune them? (No synchronizing wait/notify is possible.)

Next problem: I need to measure the processing time used by each thread, but since they run simultaneously, this is hard to measure. Does anybody know a technique for approximating it? I would like to know something like: Thread1 uses 60%, Thread2 uses 30%, and 10% is spent waiting. Again, synchronizing while testing is not an option.
I don't have experience with audio in particular, but in most such cases you're better off with some asynchronous mechanism for supplying the sound buffers.

I believe that recent Windows versions support these mechanisms natively via overlapped structures (I can't vouch for sound, but it holds for all other I/O).

So if you're having issues with real-time behaviour, you might want to look into that. Relying on a busy thread to provide data doesn't seem that reliable to me. This is very true for network I/O, for example, where async handling is the de facto optimal approach.

Which API are you using for sound, and how do you keep it supplied with buffers?

Also, you do not need a real-time OS to process sound in real time (apparently, everyone else does it without much problem).


Performance is best measured with a profiler; those will usually give you a breakdown at OS level, including by thread, process and kernel mode.
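Where a profiler is not at hand, a rough split can be approximated without any locking by letting each thread time its own work sections and add the result to its own atomic counter. The following is only a sketch of that idea: `BusyMeter`, `ScopedBusy` and `busy_percent` are hypothetical names, and the figure is wall-clock time spent inside the work section, so it overcounts whenever the thread is preempted in the middle of it.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: accumulates the time one thread spends doing work.
// Only the owning thread adds to it; any thread may read it. No locks.
class BusyMeter {
public:
    using Clock = std::chrono::steady_clock;

    void add(Clock::duration d) {
        const auto ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
        busy_ns_.fetch_add(static_cast<std::uint64_t>(ns),
                           std::memory_order_relaxed);
    }

    std::uint64_t busy_ns() const {
        return busy_ns_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> busy_ns_{0};
};

// RAII timer: measures the enclosing scope and reports it to a BusyMeter.
class ScopedBusy {
public:
    explicit ScopedBusy(BusyMeter& m)
        : meter_(m), start_(BusyMeter::Clock::now()) {}
    ~ScopedBusy() { meter_.add(BusyMeter::Clock::now() - start_); }
    ScopedBusy(const ScopedBusy&) = delete;
    ScopedBusy& operator=(const ScopedBusy&) = delete;

private:
    BusyMeter& meter_;
    BusyMeter::Clock::time_point start_;
};

// Share of a measurement window that was spent busy, in percent.
inline double busy_percent(std::uint64_t busy_ns, std::uint64_t window_ns) {
    assert(window_ns > 0);
    return 100.0 * static_cast<double>(busy_ns) /
           static_cast<double>(window_ns);
}
```

Each thread would wrap its render or draw call in a `ScopedBusy`, and a third party (for example the GUI, once per second) would read both counters and divide by the elapsed window. Whatever is left over is the time spent sleeping or waiting.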
As for sleep():

A high priority thread, when issuing a sleep(0), will only yield its timeslice to other high priority threads.

sleep(n > 0) works more or less the same regardless of priority (if two threads of different priority are scheduled to resume at the same time, the higher one goes first)
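How far a sleep overshoots is easy to check empirically. The sketch below uses the portable `std::this_thread::sleep_for` rather than the Win32 call discussed here, and `measure_sleep` is a hypothetical helper name; the numbers it reports depend on the OS timer granularity and on what else is running.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Sleeps for `requested` and returns how long the call actually took.
inline std::chrono::microseconds measure_sleep(
    std::chrono::milliseconds requested) {
    assert(requested.count() >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
}
```

Calling `measure_sleep(std::chrono::milliseconds(1))` a few hundred times from threads of different priority, and looking at the worst case rather than the average, shows whether priority makes any practical difference on a given machine.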

Quote:Original post by uncutno2
Thread one renders the sound, and needs to run as close to real time as possible (running on Win32, a non-real-time OS). This thread renders the sound, and should run all the time (I tried to sleep it, but then it sleeps for too long, and the audio output stream is broken)

I don't know what your particular situation is, but this problem is usually solved by making the audio buffers big enough so that even a long delay will not cause a problem.
Quote:Original post by uncutno2
This is working but not optimally. Is there a way I can give super-high (max possible) priority to the first thread, but so that it still lets the other thread run sometimes? If I sleep a high-priority thread, does it wait a shorter time before returning than an average-priority thread?
The answer to the first question is yes. Run the thread at a high priority and then sleep it when you want to give the lower priority thread a chance to run. The answer to the second question is maybe yes, but not always. Remember that there are other processes that have equal or higher priorities than yours, so your high priority thread could still end up waiting for one of those to complete.
John Bolton, Locomotive Games (THQ). Current Project: Destroy All Humans (Wii). IN STORES NOW!
Thanks guys!

OK, the problem is that it's a real-time audio synthesiser, so I need to reduce the time from when the "player" presses a key to when the output sound changes. This should be about 20 ms. I'm also simulating sounds (I don't play them back from a file), so a lot of processing has to be done in this small time window.

Since this system turns my computer into a synthesiser, I don't care if it blocks out other processes (and I'm not going to give the system to someone else, so I don't have to be nice :-) ).

I have looked at the VST audio plugin SDK and it seems to handle these things very well, but I want my native Win32 version to work without VST, and I think the problem now is my threads (that is why I'm asking for advice on how to pump out more power for this single real-time-ish thread). Too much time is used by the drawing thread, and more should be given to the sound-simulator thread.

So is this the best solution?
A = important thread
B = unimportant thread

A runs at very high priority, and sleeps whenever the sound buffer is full?
B runs at low priority

Should I maybe extend it with:
B checks if the sound buffer is full, and sleeps if it isn't, before it does unimportant stuff?
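The scheme for thread A can be sketched roughly as below. This is an illustration under assumptions, not the poster's code: `OutputQueue`, `run_audio_thread` and the 1 ms back-off are hypothetical, and the queue is reduced to a frame counter that some consumer (the audio device callback, not shown) would decrement. `SetThreadPriority` with `THREAD_PRIORITY_TIME_CRITICAL` is the real Win32 call; on other platforms the sketch leaves the priority alone.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#ifdef _WIN32
#include <windows.h>
#endif

// Raise the calling thread's priority (Win32 only in this sketch).
inline void raise_current_thread_priority() {
#ifdef _WIN32
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
#endif
}

// Hypothetical shared state: frames currently queued for the audio device.
struct OutputQueue {
    explicit OutputQueue(std::size_t cap) : capacity(cap) { assert(cap > 0); }
    const std::size_t capacity;
    std::atomic<std::size_t> queued{0};
    std::atomic<bool> running{true};
};

// Thread A: render while there is room, back off once the queue is full.
// `render_block(n)` stands in for the synth code producing n frames.
// Returns the number of blocks rendered before `running` was cleared.
template <typename RenderFn>
std::size_t run_audio_thread(OutputQueue& q, std::size_t block_frames,
                             RenderFn render_block) {
    raise_current_thread_priority();
    std::size_t blocks_rendered = 0;
    while (q.running.load(std::memory_order_acquire)) {
        if (q.queued.load(std::memory_order_acquire) + block_frames <=
            q.capacity) {
            render_block(block_frames);
            q.queued.fetch_add(block_frames, std::memory_order_release);
            ++blocks_rendered;
        } else {
            // Queue is full: sleeping here is what gives lower-priority
            // threads such as the GUI a chance to run.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }
    return blocks_rendered;
}
```

With this shape, thread B needs no special check of its own: it simply runs whenever A is asleep, which only happens while the queue is full.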
Are you sure that the problem is with the threads fighting for cpu time?

I would think that the big cause of latency would be the audio API and drivers in question. Check out FMOD.
Quote:Original post by uncutno2
Thanks guys!

OK, the problem is that it's a real-time audio synthesiser, so I need to reduce the time from when the "player" presses a key to when the output sound changes. This should be about 20 ms. I'm also simulating sounds (I don't play them back from a file), so a lot of processing has to be done in this small time window.
I realise you want realtime sound, but I think you're going a little overboard trying to get the timing accurate. Using a 100ms buffer should still be plenty close enough to real-time sound.
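To put the two figures side by side, the buffer size follows directly from the sample rate and the latency target. The sketch below assumes 44.1 kHz, 16-bit stereo output purely for illustration (the thread does not state the format), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Frames needed to hold `latency_ms` of audio at `sample_rate` Hz,
// rounded up to a whole frame.
inline std::uint32_t frames_for_latency(std::uint32_t sample_rate,
                                        std::uint32_t latency_ms) {
    assert(sample_rate > 0);
    const std::uint64_t scaled =
        static_cast<std::uint64_t>(sample_rate) * latency_ms;
    return static_cast<std::uint32_t>((scaled + 999) / 1000);
}

// Bytes occupied by that many frames of interleaved integer PCM.
inline std::uint32_t bytes_for_frames(std::uint32_t frames,
                                      std::uint32_t channels,
                                      std::uint32_t bytes_per_sample) {
    assert(channels > 0 && bytes_per_sample > 0);
    return frames * channels * bytes_per_sample;
}
```

Under those assumptions, 20 ms is 882 frames (3528 bytes) and 100 ms is 4410 frames (17640 bytes), so the smaller target leaves the render thread a fifth of the slack before the output underruns.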

One option to make the threads play nicer together is to have them do less work, i.e. burn fewer CPU cycles generating the sound samples. In other words, look into optimising the code that generates the sound samples.

Also, have you considered using lock-free data structures for passing data between threads?
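For reference, a common lock-free structure for exactly one writer thread and one reader thread is a single-producer/single-consumer ring buffer. The sketch below is a minimal version of that idea; `SpscRing` is a hypothetical name, 16-bit samples are an assumption, and it is not safe with more than one producer or more than one consumer.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-producer / single-consumer ring buffer of 16-bit samples.
class SpscRing {
public:
    // Capacity must be a power of two so indices can be masked.
    explicit SpscRing(std::size_t capacity_pow2)
        : buf_(capacity_pow2), mask_(capacity_pow2 - 1) {
        assert(capacity_pow2 >= 2 &&
               (capacity_pow2 & (capacity_pow2 - 1)) == 0);
    }

    // Producer side. Returns false (and writes nothing) if the ring is full.
    bool push(std::int16_t sample) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == buf_.size()) return false;
        buf_[head & mask_] = sample;
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Consumer side. Returns false if the ring is empty.
    bool pop(std::int16_t& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = buf_[tail & mask_];
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

private:
    std::vector<std::int16_t> buf_;
    const std::size_t mask_;
    std::atomic<std::size_t> head_{0};  // next slot to write (producer-owned)
    std::atomic<std::size_t> tail_{0};  // next slot to read (consumer-owned)
};
```

Neither side ever blocks: a full ring tells the producer it can afford to sleep, and an empty ring tells the consumer it has underrun.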
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Thanks again. I know that reducing the CPU cycles needed is also a must, but I want the absolute maximum out of this system, so if I can do both, that would be nice. (I have already optimized most of the heavy code to the point where every list is hashed and no multiplications are used in the calculations, only additions and bit shifts; they are super fast, I have measured them.) Also, all data is lock-free, which is why no synchronization is needed: not a single variable in the whole system needs to be locked (this is also working great). Almost no memory is allocated while the system is running, and there is little overhead (the system is as optimized as it should be).

I'm going to change drivers to the VST architecture (which uses a dynamic latency that is always adjusted to be minimal, based on performance), but I want my Win32 version to work as well as possible, so the question remains: how do you get your threads to run as much as possible, and how do you tune your two threads to get appropriate percentages of the processing time?
OK, well, it sounds like you're doing pretty well then.
However, you never know what incredible optimisations someone else might come up with.
In fact, if you'd like to treat it as a challenge for others to optimise your code further, then feel free to post some. You'll get a few takers, I'm sure.
Thanks. This is version 3 of my synth; version 1 is 4 years old, and my current version is 8 times faster. From version 1 to version 3 I have changed the whole representation of the sound problem and reduced the representation data by 50%, which means that 50% fewer calculations are needed to compute the same data. After that I optimized every critical area and separated the sound-calculating part of the system from the GUI, so that they run separately and only share a set of controller variables (e.g. volume) that don't need to be locked.

For example: instead of processing a rendered sine wave, I only process 3 variables: the wave's frequency, volume and phase. If I want to pitch the sine up, I multiply the frequency by two (instead of recalculating the rendered sine). Because of this I can handle about 10000 different waves or pulses every second, and they are all passed through a set of dynamic filters which I dynamically connect together with cables within the GUI. If I want an extra delay filter somewhere, I add a Delay module and connect its cables. If I want 55 delay modules, no problem. By having a large set of primitive filters, I can create millions of different sounds and moods. The final rendering of the sine waves is done without floats or doubles, using only additions and bit shifts. Both the wave's volume and frequency change over time, so this is not trivial.
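One well-known way to render a wave from just frequency, volume and phase with integer adds and shifts is a phase accumulator driving a lookup table. The sketch below shows that technique only; it is not the poster's implementation, and the names `sine_table` and `Oscillator`, the 256-entry table and the shift-based volume are all assumptions made for illustration. The table is built once at start-up using floating point; the per-sample path does not use floats.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// 256-entry sine table in signed 16-bit, built once on first use.
inline const std::array<std::int16_t, 256>& sine_table() {
    static const std::array<std::int16_t, 256> table = [] {
        std::array<std::int16_t, 256> t{};
        const double two_pi = 6.283185307179586;
        for (std::size_t i = 0; i < t.size(); ++i) {
            t[i] = static_cast<std::int16_t>(
                std::lround(32767.0 * std::sin(two_pi * i / 256.0)));
        }
        return t;
    }();
    return table;
}

// A wave described by three numbers instead of rendered samples.
struct Oscillator {
    std::uint32_t phase = 0;      // 32-bit phase; wraps once per cycle
    std::uint32_t increment = 0;  // phase step per sample (sets frequency)
    int volume_shift = 0;         // attenuation as a right shift (0 = full)

    // One output sample: a table lookup, one add and two shifts.
    std::int16_t next() {
        assert(volume_shift >= 0 && volume_shift < 16);
        const std::int16_t s = sine_table()[phase >> 24];
        phase += increment;  // unsigned overflow wraps, which is the intent
        // Right-shifting a negative value is arithmetic on mainstream
        // compilers; this sketch relies on that.
        return static_cast<std::int16_t>(s >> volume_shift);
    }

    // Double the frequency without re-rendering anything.
    void octave_up() { increment <<= 1; }
};
```

With a 32-bit phase, the output frequency is `increment * sample_rate / 2^32`, so doubling `increment` doubles the frequency, which matches the "multiply the frequency by two" step described above.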

I have now tried turning the thread priorities up to the maximum, and my GUI thread sleeps 15 ms after each rendered frame. When using sleep(0), the system froze (there is no other top-priority thread), so now I'm doing 10 ms sleeps (I know Win32 can't guarantee anything less than 10 ms). Any other ideas? Is there ANY way to squeeze more power out of the thread scheduling system?
