XAudio2 Code Design, Callbacks or Polling

2 comments, last by Krohm 11 years, 9 months ago
I've been working on a better audio module for my game and have come up with two different designs, but I've been unable to work out how the XAudio2 API is actually intended to be used in larger projects.


Design A - Callbacks:

XAudio2 provides a number of callbacks such as when a submitted audio buffer has completed playing.

My design here is to create a worker thread in my audio module to perform the heavy lifting (so as never to block the XAudio2 thread). It can run tasks as soon as possible, or delay them until at least some future time.
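A minimal sketch of such a worker, assuming a single background thread and a queue ordered by due time. The class name `AudioWorker` and its internals are hypothetical; only `addTask` and `addDelayedTask` mirror names used in this post, and nothing here touches XAudio2 itself.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical worker: runs tasks on one background thread, earliest due time first.
class AudioWorker {
public:
    using Clock = std::chrono::steady_clock;
    using Task  = std::function<void()>;

    AudioWorker() : thread_([this] { run(); }) {}

    ~AudioWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();  // tasks still pending at this point are dropped, not run
    }

    void addTask(Task task) { addDelayedTask(std::move(task), Clock::duration::zero()); }

    void addDelayedTask(Task task, Clock::duration delay) {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(Entry{Clock::now() + delay, nextSeq_++, std::move(task)});
        }
        wake_.notify_one();
    }

private:
    struct Entry {
        Clock::time_point due;
        unsigned long long seq;  // keeps FIFO order among tasks with equal due times
        Task task;
        bool operator>(const Entry& other) const {
            return due != other.due ? due > other.due : seq > other.seq;
        }
    };

    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stopping_) {
            if (queue_.empty()) {
                wake_.wait(lock);
            } else if (queue_.top().due > Clock::now()) {
                const Clock::time_point due = queue_.top().due;  // copy before unlocking
                wake_.wait_until(lock, due);
            } else {
                Task task = queue_.top().task;
                queue_.pop();
                lock.unlock();  // never hold the queue lock while a task runs
                task();
                lock.lock();
            }
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> queue_;
    unsigned long long nextSeq_ = 0;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist before run() starts
};
```

Note that this sketch has no way to cancel a task once queued, which matters for the voice-destruction problem described further down.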

So for example, for audio that I am streaming (e.g. music), when I get an XAudio2 buffer-completed callback, I can add a task to my worker to refill and submit the buffer.

[source lang="cpp"]
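STDMETHODIMP_(void) StreamSourceVoice::OnBufferEnd(THIS_ void* pBufferContext)
{
    // Runs on the XAudio2 thread, so only queue the refill here;
    // the worker thread does the actual I/O and decoding.
    if (!completed)
        audio->getWorkerThread()->addTask(
            std::bind(&StreamSourceVoice::fillAndSubmitBuffer, this));
}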
[/source]
I can also handle some things that need to be completed in the future entirely internally. E.g. if the game wants to stop a sound but wants any remaining effects (such as echo) to finish, I could stop the source voice, then add a destroy task to my worker for, say, 5 seconds later, and the game code need not worry about such details.
[source lang="cpp"]
void SourceVoice::end(bool finishEffects)
{
    if (!finishEffects)
    {
        // Example problem here: what if the audio worker thread has a pending fillAndSubmitBuffer task?
        // Could solve it here with a mutex, but if this were the destructor the mutex would be invalidated as well...
        // shared_ptr for everything that crosses threads?
        voice->DestroyVoice();
        voice = nullptr;
    }
    else
    {
        // XAUDIO2_PLAY_TAILS stops consuming source buffers but keeps processing
        // effect output; Stop(0) would cut the echo off immediately.
        voice->Stop(XAUDIO2_PLAY_TAILS);
        // Destroy the voice for real once the effect tail has had time to finish.
        audio->getWorkerThread()->addDelayedTask(
            std::bind(&SourceVoice::end, this, false),
            calcEffectTrailTime());
    }
}
[/source]

The problem I'm having with this design is one of resource management. Mainly, at the point of destroying a voice (either from the game code, or via one of my worker callbacks), how do I ensure this resource is no longer accessed (e.g. by another task that was already pending)? I think I might be able to make it work with appropriate locks and a way to cancel scheduled tasks (i.e. remove them from the task queue without executing them), but I can see that getting very complex design-wise.
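One possible way to sidestep explicit cancellation, sketched here under the assumption that voices are owned through `std::shared_ptr`: each queued task holds only a `std::weak_ptr`, so a task that outlives its voice turns into a no-op. `StreamVoice` and `makeRefillTask` are hypothetical names, and no real XAudio2 calls are made.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a streaming voice wrapper.
struct StreamVoice : std::enable_shared_from_this<StreamVoice> {
    int refills = 0;

    void fillAndSubmitBuffer() { ++refills; }  // stand-in for decode + SubmitSourceBuffer

    // Build a task that holds only a weak reference to this voice.
    std::function<void()> makeRefillTask() {
        std::weak_ptr<StreamVoice> weak = shared_from_this();
        return [weak] {
            // lock() either yields a strong reference that keeps the voice
            // alive for the duration of the call, or an empty pointer.
            if (auto self = weak.lock()) {
                self->fillAndSubmitBuffer();
            }
            // Otherwise the voice is already gone and the task does nothing.
        };
    }
};
```

The trade-off is that the voice's destructor now runs on whichever thread drops the last strong reference, which may be the worker thread if a task is in flight when the game lets go of the voice.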

Design B - Polling
Ignore the callbacks for buffers ending and such, and just provide some audio update methods to call x times per second, since I can query the number of queued buffers and such. Right now this is seeming somewhat simpler, since XAudio2 in that case already does basically all the synchronisation work required, e.g. it is always safe to call DestroyVoice from my own thread, and XAudio2 takes care of it.

[source lang="cpp"]
void StreamSourceVoice::update()
{
    if (!completed)
    {
        XAUDIO2_VOICE_STATE state;
        voice->GetState(&state);
        unsigned toFill = BUFFER_COUNT - state.BuffersQueued;
        while (toFill--) fillAndSubmitBuffer();
    }
}
[/source]

The potential downside is it leaves the audio I/O and decoding on the game thread, but in my tests streaming an ogg/vorbis was like 1% CPU usage for the entire app, so is that a problem anyway?



Clearly these 2 designs are very different, so I can't just change between them later. Design B seems simpler right now, but the API seems like it was built more for A, and perhaps I'm missing something and overcomplicating that design.

streaming an ogg/vorbis was like 1% CPU usage for the entire app, so is that a problem anyway?
Of course not.
I use a method which is more or less a hybrid.
First, I decide if a sound can be completely pooled or needs to be streamed.
Streamed sounds get slightly different buffer sizes: a base value plus a random offset of about 5%. That way they don't all run out of decoded samples in the same frame.
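A small sketch of that size jitter, assuming the random part is an extra 0 to 5% on top of the base and that buffers should stay a whole number of sample frames. The helper name and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <random>

// Hypothetical helper: base size plus up to 5% extra, rounded down to whole frames.
// frameBytes is the size of one sample frame (channels * bytes per sample), must be > 0.
std::size_t jitteredBufferBytes(std::size_t baseBytes, std::size_t frameBytes,
                                std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> extra(0, baseBytes / 20);  // 0..5% of base
    const std::size_t bytes = baseBytes + extra(rng);
    return bytes - bytes % frameBytes;  // never split a sample frame across buffers
}
```

Each streaming sound would call this once at creation, so two streams started in the same frame drift apart instead of needing a refill on the same tick every time.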
Then I allocate three buffers (same size) for each streaming sound, 1 playing, 1 queued, 1 decoded.
Callbacks just perform an atomic swap on the queue, marking 1 buffer completed. The next time the game ticks (it still has a whole buffer before time runs out) it will poll everything and update accordingly.
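One way to read this hybrid, as a rough sketch: the callback only bumps an atomic counter, and the game tick claims the count and does the real work. This is an interpretation rather than the poster's actual code; `StreamState` and its members are hypothetical names.

```cpp
#include <atomic>

// Hypothetical per-stream bookkeeping shared between the audio callback and the game thread.
struct StreamState {
    std::atomic<unsigned> finishedBuffers{0};
    unsigned refilled = 0;  // touched only by the game thread

    // Would be called from the voice callback's OnBufferEnd: no locks, no I/O.
    void onBufferEnd() {
        finishedBuffers.fetch_add(1, std::memory_order_release);
    }

    // Called once per game tick: claim every buffer finished since the last tick.
    unsigned tick() {
        const unsigned n = finishedBuffers.exchange(0, std::memory_order_acquire);
        for (unsigned i = 0; i < n; ++i) {
            ++refilled;  // stand-in for decoding and submitting one buffer
        }
        return n;
    }
};
```

With three buffers per stream, one finished buffer still leaves one playing and one queued, so the tick has roughly a whole buffer's duration to catch up.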


The problem I'm having with this design is one of resource management. Mainly at the point of destroying a voice (either from the game code, or via one of my worker callbacks) how to ensure this resource is no longer accessed (e.g. due to another callback that was already pending). I think I might be able to make it work with appropriate locks and a way to cancel scheduled tasks (i.e. remove them from the task queue without executing), but I can see that getting very complex design wise.
Hopefully worker threads won't destroy a voice just because they think it is reasonable. Your code must be integrated. And you cannot just have sources anyway (perhaps another component needs to compute occlusion, or a path, or whatever...), so some synchronization is necessary.

In my case that's not a problem, as the main thread has full control of the graph. From the moment a source is destroyed it will not generate new callbacks. Components destroying sources go through the main thread as well. XAudio2 guarantees this will not be a problem, unless you are running a callback at that specific instant... it was my understanding there was a lock on callbacks anyway, but I'm not sure.
So I'm actually interested in some elaborations as well.

Previously "Krohm"

Yes, I was designing non-streamed sounds as well, although I have not entirely decided where to put the boundary, since I guess some ambient and voice material can be fairly long and doesn't gain much from pooling either. In testing I was able to get 2 buffers of approx. 1 second each to run smoothly. Does the 3rd buffer gain anything except a bit of extra time if the disk is busy or something and your decode gets delayed?

Not sure what good an atomic swap does you in the callback if the game has to poll the status of your buffer anyway? Doesn't XAUDIO2_VOICE_STATE give you that info without the callback if you're polling (as per my code snippets)?


Well, XAudio2 guarantees that during/after a call to DestroyVoice it won't invoke a callback on that voice (or that voice's effect chain, AFAIK), and most XAudio2 methods are thread safe except the object destruction ones (since afterwards any other thread is left holding an invalid pointer). So the problem is what to do with tasks already scheduled on my worker thread when it comes time to delete a voice. That is where the problem with the end method comes from, or with anything else the game might do to terminate a voice: the task queue on my worker thread might still reference it. As I said, I guess that can be solved with mutexes and such, but does that effort gain anything?

I'm going to have some kind of audio graph regardless, I guess; it's just a question of whether the game thread polls that graph x times per second to perform maintenance (load stream buffers, clean up finished sounds, etc.) or whether those tasks are done in response to an XAudio2 callback. Having said that, the game touches most of the graph most frames anyway to update 3D positional data and such, so polling buffer status is perhaps not much extra. It may be best to forget about IXAudio2VoiceCallback and my own internal worker thread for the related tasks (which existed so as not to block the XAudio2 thread with, say, mutexes and disk I/O).

Not sure what good an atomic swap does you in the callback if the game has to poll the status of your buffer anyway? Doesn't XAUDIO2_VOICE_STATE give you that info without the callback if you're polling (as per my code snippets)?
No. It's not a full poll. I don't need the full information to drive this. As a side note, while the check granularity is the same **in the current implementation**, the event-based approach allows much higher precision in case I need it in the future.

so the problem is what to do with callbacks on my worker thread
It appears I haven't got the point across. 1st: no need for worker threads to manage 1% load. 2nd: your code must be integrated so this does not happen. In my case this does not happen because the playing and the decoding are effectively decoupled.

Previously "Krohm"
