I'm trying to use hardware encoding on an NVIDIA card using NVENC. I've successfully hooked DirectX and have the EndScene call, so I can get my hands on the backbuffer, which is in ARGB format.
The problem is that NVENC only accepts input resources in the NV12 format. This isn't mentioned in the official SDK documentation, but it's clear from various PowerPoint slides and from the SDK sample source itself. In the samples, the surface used as the NVENC input is created like so:
IDirectXVideoProcessorService *directx_services;
DXVA2CreateVideoService(d3ddev, IID_PPV_ARGS(&directx_services));
// Create an NV12 surface through the DXVA2 video service (error checks omitted).
directx_services->CreateSurface(desc.Width, desc.Height, 0,
    (D3DFORMAT)MAKEFOURCC('N', 'V', '1', '2'), D3DPOOL_DEFAULT, 0,
    DXVA2_VideoProcessorRenderTarget, &NV12_surface[0], NULL);
I wasn't even aware that D3D9 supported the NV12 format, so already I'm not entirely sure what's going on here. Nevertheless, the surface is created successfully, and it can be mapped in NVENC.
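For completeness, here's roughly how I register and map that surface with the encoder (paraphrased from memory; the function and struct names are from nvEncodeAPI.h, though the buffer-format enum is spelled NV_ENC_BUFFER_FORMAT_NV12_PL in older SDK headers):

// Register the D3D9 NV12 surface with NVENC, then map it as an input frame.
// nvenc is the NV_ENCODE_API_FUNCTION_LIST, encoder is the opened session.
NV_ENC_REGISTER_RESOURCE reg = { NV_ENC_REGISTER_RESOURCE_VER };
reg.resourceType = NV_ENC_INPUT_RESOURCE_TYPE_DIRECTX;
reg.resourceToRegister = NV12_surface[0];
reg.width = desc.Width;
reg.height = desc.Height;
reg.bufferFormat = NV_ENC_BUFFER_FORMAT_NV12;
nvenc.nvEncRegisterResource(encoder, &reg);

NV_ENC_MAP_INPUT_RESOURCE map = { NV_ENC_MAP_INPUT_RESOURCE_VER };
map.registeredResource = reg.registeredResource;
nvenc.nvEncMapInputResource(encoder, &map);
// map.mappedResource is then passed as NV_ENC_PIC_PARAMS::inputBuffer.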
So the problem now is how to convert the ARGB backbuffer data into NV12. This has to happen on the GPU, without copying the data to system memory.
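For reference, the per-pixel math itself isn't the hard part. This is the BT.601 (studio-swing) conversion I'd be computing, sketched as a plain C++ function just to show the formulas:

// BT.601 studio-swing RGB -> YUV for 8-bit values.
// For NV12, Y is stored per pixel; U and V are subsampled per 2x2 block.
void RGBToYUV(float R, float G, float B, float &Y, float &U, float &V)
{
    Y =  0.257f * R + 0.504f * G + 0.098f * B + 16.0f;
    U = -0.148f * R - 0.291f * G + 0.439f * B + 128.0f;
    V =  0.439f * R - 0.368f * G - 0.071f * B + 128.0f;
}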
The strategy I'm thinking of is: copy the backbuffer to its own texture, switch the render target to my NV12 surface, set a pixel shader, and draw the saved backbuffer texture onto the NV12 render target (can this even be done?), so that in my pixel shader I can compute the YUV components from the sampled RGB values. Then I'd have a filled NV12 surface to use as my input to the encoder.
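In D3D9 calls, the setup I have in mind looks roughly like this (untested; rgb_texture, nv12_target and yuv_shader are placeholder names for things I'd create elsewhere):

// 1. Snapshot the backbuffer into a texture that can be sampled.
IDirect3DSurface9 *backbuffer, *rgb_surface;
d3ddev->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, &backbuffer);
rgb_texture->GetSurfaceLevel(0, &rgb_surface); // rgb_texture: ARGB render-target texture
d3ddev->StretchRect(backbuffer, NULL, rgb_surface, NULL, D3DTEXF_NONE);

// 2. Point the pipeline at the NV12 surface and run the conversion shader.
d3ddev->SetRenderTarget(0, nv12_target); // nv12_target: the DXVA2-created surface
d3ddev->SetTexture(0, rgb_texture);
d3ddev->SetPixelShader(yuv_shader); // computes Y/UV from the sampled RGB
// ... draw a fullscreen quad here ...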
It sounds "somewhat" correct, but I'm confused about how a pixel shader is supposed to work across two such different formats. What the pixel shader sees as one pixel of the NV12 render target would actually correspond to four Y samples, at least for the luma plane, which makes up about 66% of the data, right?
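Alternatively, since I already have an IDirectXVideoProcessorService, I've been wondering whether DXVA2 could just do the conversion for me with VideoProcessBlt, something like the untested sketch below (most fields omitted; rgb_surface is the ARGB copy of the backbuffer from above):

// Create a video processor whose render target format is NV12, then blt
// the ARGB surface into the NV12 surface so DXVA2 does the conversion.
DXVA2_VideoDesc video_desc = {};
video_desc.SampleWidth = desc.Width;
video_desc.SampleHeight = desc.Height;
video_desc.Format = D3DFMT_A8R8G8B8;

IDirectXVideoProcessor *processor;
directx_services->CreateVideoProcessor(DXVA2_VideoProcProgressiveDevice,
    &video_desc, (D3DFORMAT)MAKEFOURCC('N', 'V', '1', '2'), 0, &processor);

DXVA2_VideoSample sample = {};
sample.SrcSurface = rgb_surface;
sample.SrcRect = sample.DstRect = { 0, 0, (LONG)desc.Width, (LONG)desc.Height };
sample.PlanarAlpha = DXVA2_Fixed32OpaqueAlpha();
// May also need: sample.SampleFormat.SampleFormat = DXVA2_SampleProgressiveFrame;

DXVA2_VideoProcessBltParams blt = {};
blt.TargetRect = sample.DstRect;
blt.Alpha = DXVA2_Fixed32OpaqueAlpha();
processor->VideoProcessBlt(NV12_surface[0], &blt, &sample, 1, NULL);

If something like that works, it would sidestep the pixel shader question entirely.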
Any pointers in the right direction would be appreciated. Thanks!