XAudio2 management question

6 comments, last by Burnt_Fyr 10 years, 10 months ago

Hi

I am reading up on XAudio2 and I have some questions.

So the pipeline is as follows:

pBuffer (RAM) -> Source Voice -> Submix Voice -> Mastering Voice -> Sound Adapter.

Data is loaded into RAM in a format XAudio2 can consume, and you get a pointer to the audio buffer (pBuffer). This is then submitted to a source voice, which can then be played with Start().
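For reference, here is a minimal sketch of that pipeline as I understand the XAudio2 calls; wfx, pBuffer, and bufferBytes stand in for whatever the WAV loader produces, and error checking is omitted:

```cpp
#include <windows.h>
#include <xaudio2.h>

// Placeholders assumed to come from a WAV loader:
WAVEFORMATEX wfx = {};          // format of the PCM data
const BYTE*  pBuffer = nullptr; // the audio data in RAM
UINT32       bufferBytes = 0;   // its size in bytes

void SetupAndPlay()
{
    IXAudio2*               pXAudio2 = nullptr;
    IXAudio2MasteringVoice* pMaster  = nullptr;
    IXAudio2SubmixVoice*    pSubmix  = nullptr;
    IXAudio2SourceVoice*    pSource  = nullptr;

    CoInitializeEx(nullptr, COINIT_MULTITHREADED);
    XAudio2Create(&pXAudio2, 0, XAUDIO2_DEFAULT_PROCESSOR);
    pXAudio2->CreateMasteringVoice(&pMaster);        // feeds the sound adapter
    pXAudio2->CreateSubmixVoice(&pSubmix, 2, 44100); // the "mixer" stage

    // Route the source voice through the submix instead of straight to master.
    XAUDIO2_SEND_DESCRIPTOR send  = { 0, pSubmix };
    XAUDIO2_VOICE_SENDS     sends = { 1, &send };
    pXAudio2->CreateSourceVoice(&pSource, &wfx, 0,
                                XAUDIO2_DEFAULT_FREQ_RATIO, nullptr, &sends);

    XAUDIO2_BUFFER buf = {};
    buf.pAudioData = pBuffer;                // the pointer you got at load time
    buf.AudioBytes = bufferBytes;
    buf.Flags      = XAUDIO2_END_OF_STREAM;  // no more data after this buffer
    pSource->SubmitSourceBuffer(&buf);
    pSource->Start(0);
}
```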

My question regards how to manage buffers and Source voices.

Let's say I have a sound effect that consists of 5 short sounds to be played in a distinct sequence. I would then have one buffer for each sound and submit them to 5 different source voices, configure the source voices with different delays, then call the Start() method on all 5 source voices.
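As far as I can tell, Start() takes no delay parameter, so the spacing would have to be timed from my game loop. A rough sketch of what I mean (the TimedClip helper is invented for illustration):

```cpp
#include <vector>
#include <xaudio2.h>

// Each clip's buffer has already been submitted to its own source voice.
struct TimedClip {
    IXAudio2SourceVoice* voice;
    float                startAt;          // seconds from sequence start
    bool                 started = false;
};

// Called once per frame with the time since the effect was triggered.
void UpdateSequence(std::vector<TimedClip>& clips, float elapsed)
{
    for (TimedClip& c : clips) {
        if (!c.started && elapsed >= c.startAt) {
            c.voice->Start(0);             // fire this clip now
            c.started = true;
        }
    }
}
```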

So far so good.

But if I have 100 sound effects, each containing 5 short sounds, should I make 500 source voices, one for each buffer in RAM? Or should I only have as many as I need playing at the same time? Say 50, for a maximum of 10 simultaneous sound effects?

For C++, I would think that a sound (pBuffer) could be wrapped in its own class called ShortSound. Then have a SoundEffect class which holds many ShortSound instances (std::vector<ShortSound*>). Then the Audio class would hold many SoundEffects (std::map<EffectID, SoundEffect*>).

Then I could just call the method Audio::Play(EffectID).
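In outline, I'm imagining something like this (a rough sketch; EffectID and the member layout are just guesses):

```cpp
#include <map>
#include <vector>
#include <xaudio2.h>

class ShortSound {                 // wraps one audio buffer in RAM
public:
    XAUDIO2_BUFFER buffer = {};    // points at the PCM data
    float          delay  = 0.0f;  // offset within the effect, in seconds
};

class SoundEffect {                // an ordered group of short sounds
public:
    std::vector<ShortSound*> clips;
};

using EffectID = unsigned int;

class Audio {
public:
    void Play(EffectID id);        // looks up the effect, triggers its clips
private:
    std::map<EffectID, SoundEffect*> m_effects;
    IXAudio2* m_engine = nullptr;
};
```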

Is this a good idea?


I haven't worked with XAudio2 in a long while, and I've grown as a programmer, so I'd likely do things a bit differently now. But I'll describe what I've done below.

I tried to think of how I would do something in a studio, and based my classes around that. I play guitar, so I am quite familiar with the typical recording setup. First off was the mixer device. It created and destroyed source and submix voices (tracks and buses) as well as a mastering voice for output. It allows tweaking each of the tracks in the same way you could on a Mackie or other mixing board.

Next up was something to load and store all the various sounds/music I was using. I created a Sampler class, which was equivalent to your SoundEffect class. It had a std::vector<Sample*>, where the Sample class was like your ShortSound class: a wrapper around the buffer object to abstract away the Windows-specific code.

Calling Audio::Play("gunshot") would grab a free track from a pool that the mixer had reserved, and the sound buffer from the sampler, and play immediately. For music or ambiance I could request a specific track from the mixer, getting access to pan/fade/etc and play a buffer directly on that track. This let me have one song playing, and another buffered and ready to go.
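Roughly, that Play() path looked like this (a sketch with simplified names; Sample and GetFreeTrack stand in for my actual classes):

```cpp
#include <string>
#include <unordered_map>
#include <xaudio2.h>

struct Sample { XAUDIO2_BUFFER buffer = {}; };    // one loaded clip

class Audio {
public:
    void Play(const std::string& name)
    {
        Sample& s = m_samples.at(name);              // buffer from the sampler
        IXAudio2SourceVoice* track = GetFreeTrack(); // free track from the pool
        track->SubmitSourceBuffer(&s.buffer);        // queue the PCM data
        track->Start(0);                             // play immediately
    }
private:
    IXAudio2SourceVoice* GetFreeTrack();             // provided by the mixer
    std::unordered_map<std::string, Sample> m_samples;
};
```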

I would likely change a lot of what I've done in retrospect, but as it's working for me, I don't feel the need yet. I've got much bigger fish to fry, moving my graphics system from a DX9 Windows-specific library to a cross-platform DX11 and OGL4 setup.

The best advice I can give is KISS: implement what you need, nothing more.

Thanks for your input. But I have no clue whether making a thousand source voices, each with its own submitted pAudioBuffer, has any negative consequences.

The question is: do you need that much polyphony? I was able to survive with a pool of 32 tracks, as each sound effect only lasts a short while (1 sec or so). If one effect requires 5 sounds, why not just preprocess them into one sound clip and play it all on the same source voice?
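Also worth knowing: a source voice queues submitted buffers and plays them back to back, so the 5 clips don't strictly need preprocessing as long as they share a format. (Playback is gap-free, so any deliberate spacing still has to come from silence padding or timing.) Something like this, assuming clips[0..4] are valid XAUDIO2_BUFFERs and pVoice is an IXAudio2SourceVoice*:

```cpp
for (int i = 0; i < 5; ++i) {
    if (i == 4)
        clips[i].Flags = XAUDIO2_END_OF_STREAM;  // mark the final buffer
    pVoice->SubmitSourceBuffer(&clips[i]);       // appended to the voice's queue
}
pVoice->Start(0);  // plays all 5 in submission order, back to back
```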

Well, it would be nice to be able to space out the sound clips at run-time. Plus, if I can keep every clip in its own container without searching through a pool of available source voices, the code will be simpler.

There just aren't any good tutorials (at least in the first 50 Google pages) that explain the XAudio2 API, its functionality, and its limitations.

FMOD looks way easier to use, but I'd like to stick to DirectX.

The "spacing" of sound clips should be handled by the game logic. I'm envisioning something like a rocket, where you have a launch sound, a flight sound, and an explosion. So on firing, the launch sound is played, and then while in flight, the flight sound is looped continuously until it hits something, at which point the explosion sound is triggered. The pooling is as simple as a vector of source voices, with an iterator to the next to be used source voice. When you play a sound "immediately" it just grabs the voice pointed to by the iterator,, and increments the iterator by one. As long as everything sounds ok, you are good to go. if you notice considerable lag between an event and it's sound, then increase the pool size so that the source voices never have a backlog of sounds to play. In my test for the code, I was able to fire off a gunshot sound per frame at around 60 fps, with only 32 tracks in the free pool.
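The pool itself can be tiny; something like this sketch (the Stop/Flush safety net is an extra precaution, not strictly needed if the pool is large enough):

```cpp
#include <utility>
#include <vector>
#include <xaudio2.h>

class VoicePool {
public:
    // Voices are created up front by the mixer, all with a shared format.
    explicit VoicePool(std::vector<IXAudio2SourceVoice*> voices)
        : m_voices(std::move(voices)) {}

    IXAudio2SourceVoice* Next()
    {
        IXAudio2SourceVoice* v = m_voices[m_next];
        m_next = (m_next + 1) % m_voices.size();  // advance the "iterator"
        v->Stop(0);                // in case it is still playing something
        v->FlushSourceBuffers();   // drop any leftover queued buffers
        return v;                  // caller submits a buffer and Start()s it
    }
private:
    std::vector<IXAudio2SourceVoice*> m_voices;
    size_t m_next = 0;
};
```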

I agree with keeping clips in their own container; that is my Sample class, and the Sampler is essentially a "manager", or container, of all the samples loaded into the game.

Can the API handle a thousand source voices? If so, I don't see the reason to complicate it.

Only one way to find out, but it will likely come down to the individual sound card or onboard audio itself.

