Howdy!
Say, is it feasible to speed up audio processing (FFT and other DSP stuff) using the graphics hardware on those Google Android devices that have a GPU, via OpenGL ES?
I'd reckon one issue would be a bottleneck in memory transfer between the GPU and the CPU, increasing latency. That's just my assumption, though; I have no experience yet with such devices or with Android.
Btw., I know audio playback in general has high latency on Android, but that's supposed to be fixed in 4.2, which would be what I'd target then.
- unshaven
Feasible? GFX HW audio processing (OpenGL ES)
Probably not. I don't believe most devices have the raw horsepower needed to make GPGPU programming worthwhile.
Then again, if you are only talking about future 4.2 devices, perhaps if you cherry-pick that one magical future device, it may be worth it. Who knows?
It will definitely be a "hand picked" device; this isn't a project with the goal of supporting all devices on the planet. But other features will also influence the pick.
It's probably going to depend on what kind of GPU your platform has access to -- many of the newer SoCs are very programmable, and some even support various profiles of OpenCL. With that out of the way, it probably comes down to driver overhead more than anything. Since audio data is small compared to most GPGPU workloads, there's less compute to absorb the overhead, but you'll probably win some of that back by virtue of having to transfer less data in the first place.
That said, I wonder how integrated the GPU is with memory... With very modern integrated graphics on PCs, it's getting very close to the point where you don't have to move the data at all (though you compete with the system for bandwidth and sometimes cache). I'm not sure whether SoCs are further behind or ahead in this respect.
If the CPU is ARM and has NEON instructions, that is where I would start looking for this sort of thing. The NEON instructions are designed for this. I know OpenCV on Android uses the NEON instruction set when available and is quite efficient. In my experience a Galaxy S2 ran it faster than a Galaxy Nexus. You will of course have to use the Android NDK to take advantage of NEON.
I believe most of the mobile GPUs (PowerVR and Mali definitely) are based on tile-based rendering for memory and power (as in electric) efficiency, so reading stuff back from GPU memory would likely be inefficient in the extreme.
It is possible to do low latency audio on the latest versions of ICS (check out Caustic, http://singlecellsoftware.com/caustic), and seamless video playback on stock Android is possible if you are clever.
Example of NEON:
http://hilbert-space.de/?p=22
Info on NEON:
http://elinux.org/images/4/40/Elc2011_anderson_arm.pdf
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344d/DDI0344D_cortex_a8_r2p1_trm.pdf - See Chapter 13
http://en.wikipedia.org/wiki/ARM_architecture#Advanced_SIMD_.28NEON.29
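To give a feel for what NEON code looks like from C, here is a minimal sketch that multiplies an audio buffer by a window, four floats at a time. The function name apply_window is made up for this post, and the sketch assumes the compiler defines __ARM_NEON (or __ARM_NEON__) when NEON is enabled; on other targets it simply falls back to the scalar loop. It is an illustration of the intrinsics, not a tuned DSP kernel.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Multiply each sample by the matching window coefficient, in place.
 * apply_window is a hypothetical name used only for this illustration. */
void apply_window(float *samples, const float *window, size_t n)
{
    size_t i = 0;
#if HAVE_NEON
    /* Process four floats per iteration in 128-bit NEON registers. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t s = vld1q_f32(samples + i);
        float32x4_t w = vld1q_f32(window + i);
        vst1q_f32(samples + i, vmulq_f32(s, w));
    }
#endif
    /* Scalar tail; also handles the whole buffer when NEON is unavailable. */
    for (; i < n; ++i) {
        samples[i] *= window[i];
    }
}
```

Call it once per audio block before the FFT, e.g. apply_window(block, hann, 1024), where block and hann are your own buffers. Whether the hand-written version beats what the compiler auto-vectorizes is something you would have to measure on the device you pick.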
If you need to do a readback from GPU to CPU you're going to get pipeline stalls. Of course you could probably schedule such a readback asynchronously if you don't need the data immediately, but it's something you need to be aware of and something that will be far more significant than transfer rates.
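To put rough numbers on that trade-off, here is a small sketch of the latency arithmetic. One audio buffer lasts frames / sample_rate seconds, so a blocking readback has to finish inside that window, while deferring the readback by some number of buffers adds that many buffer periods of latency. All function names and the example figures are assumptions for illustration; the actual round-trip time has to be measured on the target device.

```c
#include <stddef.h>

/* Duration of one audio buffer in milliseconds. */
double buffer_period_ms(size_t frames, double sample_rate_hz)
{
    return 1000.0 * (double)frames / sample_rate_hz;
}

/* Extra latency added when the readback is deferred by
 * `buffers_in_flight` buffers instead of being waited on immediately. */
double pipelined_latency_ms(size_t frames, double sample_rate_hz,
                            unsigned buffers_in_flight)
{
    return (double)buffers_in_flight * buffer_period_ms(frames, sample_rate_hz);
}

/* Returns 1 if a blocking GPU round trip of `round_trip_ms` (a measured,
 * device-specific value) fits within one buffer period, 0 otherwise. */
int fits_in_period(double round_trip_ms, size_t frames, double sample_rate_hz)
{
    return round_trip_ms <= buffer_period_ms(frames, sample_rate_hz) ? 1 : 0;
}
```

For example, 480 frames at 48000 Hz is a 10 ms buffer; keeping two buffers in flight to hide the stall would add 20 ms on top of whatever the audio stack already costs you.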