OpenAL: why is there no group working on it?

29 comments, last by Krohm 11 years, 1 month ago

The next big thing in audio has to be the real-time modelling of acoustic spaces. The extra dimension of realism this would add would be eye-opening.

32 sources compared to hundreds of sources with proper occlusion and implicit environmental effects (i.e. they echo if they happen to be next to a stone wall, not because you explicitly told the source to use the 'stone room' effect) is an unimaginably huge difference. Audio really has been stagnating.

A lot of people have shitty PC speakers, yeah, but a lot of people also have cinema-grade speakers and/or very expensive headsets. Surround sound headsets are becoming very common with PC gamers at least.

Is it possible that in the future, instead of a dedicated audio processing card, we'll just be able to perform our audio processing on the (GP)GPU?

GeneralQuery, there's certainly some interesting work happening in that area - for example the 'aural proxies' work at http://gamma.cs.unc.edu/AuralProxies/ - but they are calling 5-10 ms on a single core "high performance", and I would suggest they need to do better than that for it to be widely accepted, especially since none of their examples show how the system scales up to double-digit numbers of sound sources.

Hodgman, there was some talk of the GPU over in the other thread that I linked to above. From what I understand, opinion is a bit divided as to whether the latency will be an issue. One poster there said he could get it down to 5 ms of latency, but that was reading from audio capture, presumably a constant stream of data, to the GPU; going the other direction, from CPU -> GPU -> PCIe audio device, may not be so quick, and even just a 10 ms delay will ruin the fidelity of a lot of reverb algorithms.


I had considered this before (using the GPU for some audio tasks), but haven't done much with it.


Probably a person doesn't need to realistically calculate every sample; many effects (echoes, muffling, ...) can be handled by feeding the samples through an FIR (or IIR) filter.
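
Something like this, as a rough sketch in plain C++ (the function name and the single mono float buffer are made up for the example, not taken from any real engine):

```cpp
#include <cstddef>
#include <vector>

// Minimal direct-form FIR filter: each output sample is a weighted sum of the
// most recent input samples. The coefficient array (the impulse response) is
// what encodes the effect -- a short dense tail for a small stone room, a
// longer one for a large hall.
std::vector<float> applyFir(const std::vector<float>& input,
                            const std::vector<float>& coeffs)
{
    std::vector<float> output(input.size(), 0.0f);
    for (std::size_t n = 0; n < input.size(); ++n)
    {
        float acc = 0.0f;
        // Convolution: y[n] = sum over k of h[k] * x[n - k]
        for (std::size_t k = 0; k < coeffs.size() && k <= n; ++k)
            acc += coeffs[k] * input[n - k];
        output[n] = acc;
    }
    return output;
}
```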

The problem then is probably mostly the cost of realistically calculating and applying these filters for a given scene.

Some of this could possibly be handled by enlisting the help of the GPU, both for calculating the environmental effects and for applying the filters (perhaps using textures and a lot of special shaders, or maybe OpenCL, or similar).

I have a few ideas here, mostly involving OpenGL, but they aren't really pretty. OpenCL or similar could probably be better here...


In my case, for audio hardware, I have an onboard Realtek chipset, and mostly use headphones.

I've seen a few VSTs for real-time processing (convolution reverbs, if I recall correctly) being accelerated with CUDA. I dunno how well they would work in a video game.

Searched for "cuda vst" on Google and a few things turn up, e.g. http://www.liquidsonics.com/software_reverberate_le.htm

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator


The latency is not such a problem for audio engineering but becomes problematic for real-time interactive applications.


How much is too much latency?

At least from what I've seen, latency is a problem in audio engineering and music production: people prefer to work with DAWs at <10 ms latency for maximum responsiveness (especially when dealing with MIDI controllers). Is 10 ms too much?

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator


Latency in a DAW is not a problem (I'm not talking about MIDI latency, but about the delay before you hear the result); even a few hundred milliseconds is certainly liveable. The problem with real-time, interactive applications like games is that the latency between what is seen and what is heard will pose problems and ruin the illusion.

I'm no expert, but considering the speed of sound (ca. 340 m/s) and the spacing between the ears (ca. 0.2 m), the difference between "sound comes from far left" and "sound comes from far right", which is pretty much the most extreme case possible, is around 0.6 ms. The ear is able to pick that up without any trouble (and obviously, it's able to pick up much smaller differences, too -- we are able to hear in a lot more detail than just "left" and "right").
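
To put rough numbers on that (a throwaway calculation with textbook constants, nothing measured):

```cpp
#include <cstdio>

int main()
{
    // Back-of-the-envelope figures for the argument above.
    const double speedOfSound = 343.0; // m/s at room temperature
    const double earSpacing   = 0.2;   // m, rough distance between the ears

    // Maximum interaural time difference: the extra distance sound travels
    // to the far ear for a source directly to one side.
    const double maxItdMs = earSpacing / speedOfSound * 1000.0;

    std::printf("max interaural difference: ~%.2f ms\n", maxItdMs); // ~0.58 ms
    std::printf("10 ms is roughly %.0fx that\n", 10.0 / maxItdMs);  // ~17x
    return 0;
}
```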

In that light, 10ms seems like... huge. I'm not convinced something that coarse can fly.

Of course we're talking about overall latency (on all channels), but the brain has to somehow integrate that with the visuals, too. And seeing how it's apparently doing that quite delicately at ultra-high resolution, I think it may not work out.


If all sounds are delayed the same, I think it might work. A 10 ms delay means the sound starts while the frame that triggered it is still being displayed.

You usually have some delay in all sound systems between when you tell a sound to start playing and when it actually plays, but I don't know how long it usually is... Longer on mobile devices, at least.

As long as it's below 100ms or so, I think most people will interpret it as "instantaneous".

Phase shifts and such in the same sound source reaching both ears are another thing.

It would be pretty easy to test...

Edit:

Also, to simulate sound and visual sync properly, you should add some delay. If someone drops something 3 m away, the sound should be delayed by about 10 ms.

I think this is good news. A minimum delay of 10 ms just means you can't accurately delay sounds closer than about 3 m, but that shouldn't be much of a problem, since 3 m is close enough that you wouldn't really notice the delay in real life either.
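
A quick sketch of that bookkeeping (hypothetical numbers: a fixed 10 ms output latency and a 48 kHz mix rate, just to show how the budget works out):

```cpp
#include <algorithm>
#include <cstdio>
#include <initializer_list>

int main()
{
    const double speedOfSound  = 343.0;   // m/s
    const double pipelineDelay = 0.010;   // s, assumed fixed output latency
    const double sampleRate    = 48000.0; // Hz, assumed mix rate

    for (double distance : {1.0, 3.0, 10.0, 30.0})
    {
        // Physically correct arrival delay for a source at this distance.
        const double propagation = distance / speedOfSound;

        // Extra delay still to be added on top of the pipeline latency.
        // Negative would mean the source is too close to delay accurately,
        // so it is clamped to zero (the sound plays as soon as it can).
        const double extra = std::max(0.0, propagation - pipelineDelay);

        std::printf("%5.1f m: propagation %5.1f ms, extra delay %5.1f ms (%.0f samples)\n",
                    distance, propagation * 1000.0, extra * 1000.0,
                    extra * sampleRate);
    }
    return 0;
}
```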

This topic is closed to new replies.
