Fast Direct3D MJPEG compression

6 comments, last by Michael Tanczos 14 years, 2 months ago
Does anybody know of a fast (fast!) way to encode Direct3D surfaces into JPEGs? With SlimDX and C# I tried Texture.ToStream, and that seemed pretty slow. The code below was also pretty slow, and I didn't even get to the JPEG compression yet; GetRenderTargetData alone clocked in at 15 ms on my computer:

// Copy the render target from video memory into a lockable system-memory surface.
_device.GetRenderTargetData(_prvsurface, _prvsurface_tmp);

// Lock the system-memory surface to get a CPU-readable pointer.
DataRectangle g = _prvsurface_tmp.LockRectangle(LockFlags.None);

// Wrap the locked pixels in a Bitmap. Use g.Pitch for the stride rather than
// assuming 4 * Width, since the driver may pad each row.
System.Drawing.Bitmap output = new Bitmap(_prvsurface.Description.Width, _prvsurface.Description.Height, g.Pitch, System.Drawing.Imaging.PixelFormat.Format32bppArgb, g.Data.DataPointer);

// The Bitmap wraps the locked pointer directly, so it must be used (or copied)
// before this unlock invalidates it.
_prvsurface_tmp.UnlockRectangle();


Adding this brings the clock up to 40 ms, so the GDI+ encoder is pretty slow:

System.IO.MemoryStream stm = new System.IO.MemoryStream();
output.Save(stm, System.Drawing.Imaging.ImageFormat.Jpeg);

Any other ideas for fast(er) JPEG compression of a surface? - Michael Tanczos
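For what it's worth, GDI+ does expose a JPEG quality knob through EncoderParameters, and lowering the quality setting reduces the encode work and output size somewhat. A minimal sketch; the helper name and the quality value 50 are just illustrative, not from the thread:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Linq;

class JpegQualitySketch
{
    // Find the built-in GDI+ JPEG encoder by MIME type.
    static ImageCodecInfo GetJpegEncoder() =>
        ImageCodecInfo.GetImageEncoders()
                      .First(c => c.MimeType == "image/jpeg");

    // Encode a bitmap at an explicit quality level and return the byte count.
    public static long EncodeLength(Bitmap bmp, long quality)
    {
        using (var stm = new MemoryStream())
        {
            var p = new EncoderParameters(1);
            p.Param[0] = new EncoderParameter(Encoder.Quality, quality);
            bmp.Save(stm, GetJpegEncoder(), p);
            return stm.Length;
        }
    }

    static void Main()
    {
        using (var bmp = new Bitmap(64, 64))
            Console.WriteLine(EncodeLength(bmp, 50L) > 0); // non-empty JPEG stream
    }
}
```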
On an unrelated note, it took all of 2 minutes for my post to show up in google. Wow. That's some seriously fast indexing time.
If you rendered the data in the same frame in which you are retrieving it, the GPU needs to flush all pending rendering operations on that target immediately in order to lock it for you to read. This breaks the parallelism between the CPU and GPU. 15 ms is roughly 1/60th of a second, which in turn sounds like a common refresh rate.

One way to rectify this is to use several buffers: while you're rendering to one of them, you can lock another one for which the rendering has already finished. Modern hardware can render from 1 to 4 frames in advance.
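The multi-buffer idea boils down to modular indexing over a small ring of staging surfaces: issue the copy into the newest slot each frame, but only lock the oldest slot, so the GPU gets N-1 frames to finish each copy before the CPU touches it. A sketch of just that indexing (ReadbackRing and the slot helpers are made-up names; the actual GetRenderTargetData/LockRectangle calls are omitted):

```csharp
using System;

class ReadbackRing
{
    // With n staging surfaces, write into slot (frame % n) and read the slot
    // written n-1 frames earlier, i.e. ((frame + 1) % n).
    public static int WriteSlot(long frame, int n) => (int)(frame % n);
    public static int ReadSlot(long frame, int n) => (int)((frame + 1) % n);

    static void Main()
    {
        const int N = 3;
        for (long frame = 0; frame < 5; frame++)
            Console.WriteLine(
                $"frame {frame}: copy into {WriteSlot(frame, N)}, lock {ReadSlot(frame, N)}");
    }
}
```

With N = 3, the slot locked at frame 2 is the one copied into at frame 0, so the GPU had two whole frames to complete that transfer without a stall.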

In any case, you probably need some kind of work queue approach.
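A work queue along those lines can be sketched with .NET's BlockingCollection: the render thread adds copied-out frames, and a couple of worker tasks consume and compress them. CompressionQueue and CompressAll are made-up names, and the byte[] payload stands in for pixels copied out of the locked surface:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class CompressionQueue
{
    // Render thread produces frames; worker tasks consume and compress them.
    public static int CompressAll(int frameCount, int workerCount)
    {
        // Bounded so the producer can't run unboundedly ahead of the workers.
        var queue = new BlockingCollection<byte[]>(boundedCapacity: 4);
        int compressed = 0;

        var workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
            workers[i] = Task.Run(() =>
            {
                foreach (var frame in queue.GetConsumingEnumerable())
                {
                    // JPEG-encode 'frame' here.
                    Interlocked.Increment(ref compressed);
                }
            });

        for (int f = 0; f < frameCount; f++)
            queue.Add(new byte[16]); // blocks if 4 frames are already pending

        queue.CompleteAdding(); // no more frames; workers drain and exit
        Task.WaitAll(workers);
        return compressed;
    }

    static void Main() => Console.WriteLine(CompressAll(10, 2)); // prints 10
}
```

The bounded capacity is the knob that decides what happens when compression falls behind: the render thread blocks rather than piling up uncompressed frames in memory.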

Also, a high-tech solution would be to implement the compression directly on the GPU using a compute shader, and read the compressed data back to the CPU :) However, this limits the target audience somewhat, and represents a considerable amount of work.

Niko Suni

I think the fastest way to encode D3D into MJPEG would be on the GPU. Getting that implemented may not be trivial, but it looks like some people have done it. For example, the University of Oslo apparently runs a competition to compress MJPEG as quickly as possible (see here and here). Googling "gpu jpeg compression" also turns up some results. I couldn't find any working code, although NVIDIA's site has sample code for the DCT in CUDA.
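For reference, the transform at the core of that CUDA sample, and of any JPEG/MJPEG encoder, is the 8x8 forward DCT. A naive C# version for illustration only; real encoders use a factored fast DCT, and a GPU version would process one block per thread group:

```csharp
using System;

class Dct8x8
{
    // Naive 2-D forward DCT-II over an 8x8 block (JPEG's transform).
    public static double[,] Forward(double[,] b)
    {
        var coeffs = new double[8, 8];
        for (int u = 0; u < 8; u++)
        for (int v = 0; v < 8; v++)
        {
            double sum = 0;
            for (int x = 0; x < 8; x++)
            for (int y = 0; y < 8; y++)
                sum += b[x, y]
                     * Math.Cos((2 * x + 1) * u * Math.PI / 16)
                     * Math.Cos((2 * y + 1) * v * Math.PI / 16);
            double cu = u == 0 ? 1 / Math.Sqrt(2) : 1;
            double cv = v == 0 ? 1 / Math.Sqrt(2) : 1;
            coeffs[u, v] = 0.25 * cu * cv * sum;
        }
        return coeffs;
    }

    static void Main()
    {
        // A flat block concentrates all its energy in the DC coefficient.
        var block = new double[8, 8];
        for (int x = 0; x < 8; x++)
            for (int y = 0; y < 8; y++)
                block[x, y] = 100;
        var d = Forward(block);
        Console.WriteLine(Math.Round(d[0, 0])); // 800
    }
}
```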
The obvious way to me to speed up the 15 ms read + 25 ms compress time is to multithread the compression. Two compression threads should easily keep up with reading one frame every 15 ms.

You may also find http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-jpeg-sample-and-performance-faqs/ handy - it's Intel's free optimized jpeg code.
Quote:Original post by Adam_42
The obvious way to me to speed up the 15ms read + 25ms compress time is to multi thread the compression. Two compression threads should easily keep up with reading one frame every 15ms.

You may also find http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-jpeg-sample-and-performance-faqs/ handy - it's Intel's free optimized jpeg code.


Intel's IPP isn't free, as far as I know.
IMHO the best idea (if you don't want to mess with shaders) is not to encode in real time, but to download the raw RGBA data from the surface, save it to a file, and create the JPEG data at the end.
This way it should be quite a bit faster, though it takes a lot of space during capture (800x600x4 ≈ 2 MB a frame, × 24 fps ≈ 48 MB every second).
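That capture-now, encode-later split can be sketched as appending each frame's raw bytes to one stream during capture, then replaying it a frame at a time afterwards. RawCapture is a made-up name, and a MemoryStream stands in for the file:

```csharp
using System;
using System.IO;

class RawCapture
{
    const int FrameSize = 800 * 600 * 4; // one raw 32-bit RGBA frame

    // Append 'frameCount' raw frames, then replay and count them.
    public static int RoundTrip(int frameCount)
    {
        // Capture phase: just append raw frames, no encoding.
        var stm = new MemoryStream();
        for (int f = 0; f < frameCount; f++)
            stm.Write(new byte[FrameSize], 0, FrameSize);

        // Encode phase: walk the stream one fixed-size frame at a time.
        stm.Position = 0;
        var frame = new byte[FrameSize];
        int frames = 0;
        while (stm.Read(frame, 0, FrameSize) == FrameSize)
            frames++; // JPEG-encode 'frame' here, at leisure

        return frames;
    }

    static void Main() => Console.WriteLine(RoundTrip(3)); // prints 3
}
```

At ~48 MB/s this is really a disk-bandwidth bet: a write-behind file stream keeps up on most drives, but capture length is limited by free space.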
Right now I'm thinking the best approach would be to transfer the data into system memory using GetRenderTargetData and then hand it off to another thread for compression, so that rendering can continue. Stalling the render thread to compress an image is a waste.

I tried locking the backbuffer after each Present to force the GPU to flush its rendering operations each frame, and then batched several GetRenderTargetData operations. It seems most of the time is spent waiting for the GPU to finish the flush. In any case, GetRenderTargetData is pretty fast at copying from several surfaces back to back. It seems to lock my framerate at 30 fps (I'd assume PresentInterval.One, i.e. vsync, might be the cause?).

If I use more than one backbuffer, though, will locking one while the others are in use still avoid causing a stall?

This topic is closed to new replies.
