NewUser13

Make function faster (RGB-YUV)

This topic is 3024 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.



Hello, I have a function which converts pixels from an RGB surface to a YUV surface.

    if (m_overlay_back_surface->Lock(NULL,&ddsd,DDLOCK_WAIT,NULL) == DD_OK )
    {
        if (m_offscreen_surface->Lock(NULL,&ddsd_offscreen,DDLOCK_WAIT,NULL) == DD_OK )
        {
            CopyRGBSurfaceToYUVSurface(&ddsd_offscreen, &ddsd, UYVY);
            m_offscreen_surface->Unlock(NULL);
        }
        m_overlay_back_surface->Unlock(NULL);
    }

    bool overlay_renderer_t::CopyRGBSurfaceToYUVSurface(
        LPDDSURFACEDESC2 pddsd1, LPDDSURFACEDESC2 pddsd2, fourcc_enum eOverlayFormat)
    {
        if (pddsd1->dwWidth != pddsd2->dwWidth)
            return false;
        if (pddsd1->dwHeight != pddsd2->dwHeight)
            return false;

        DWORD w = pddsd1->dwWidth;
        DWORD h = pddsd1->dwHeight;
        LONG pitch1 = pddsd1->lPitch;
        LONG pitch2 = pddsd2->lPitch;
        unsigned __int32 *pPixels1 = (unsigned __int32 *)pddsd1->lpSurface;
        unsigned __int32 *pPixels2 = (unsigned __int32 *)pddsd2->lpSurface;
        unsigned __int32 color1;
        LONG offset1 = 0;
        LONG offset2 = 0;
        unsigned int R, G, B, i1, i2, i3, i4;
        BYTE yuv[4];

        if (eOverlayFormat == UYVY) // U Y V Y
        {
            i1 = 1; i2 = 0; i3 = 3; i4 = 2;
        }
        else // Y U Y 2
        {
            i1 = 0; i2 = 1; i3 = 2; i4 = 3;
        }

        // Go through the image 2 pixels at a time and convert to YUV
        for (unsigned int y=0; y<h; y++)
        {
            offset1 = y*pitch1/4;
            offset2 = y*pitch2/4;
            for (unsigned int x=0; x<w; x+=2)
            {
                color1 = pPixels1[offset1++];
                B = (color1) & 0xFF;
                G = (color1 >> 8) & 0xFF;
                R = (color1 >> 16) & 0xFF;
                yuv[i1] = (( 66*R + 129*G + 25*B + 128)>>8)+ 16;
                yuv[i2] = ((-38*R - 74*G + 112*B + 128)>>8)+128;

                color1 = pPixels1[offset1++];
                B = (color1) & 0xFF;
                G = (color1 >> 8) & 0xFF;
                R = (color1 >> 16) & 0xFF;
                yuv[i3] = (( 66*R + 129*G + 25*B + 128)>>8)+ 16;
                yuv[i4] = ((112*R - 94*G - 18*B + 128)>>8)+128;

                pPixels2[offset2++] = *((unsigned __int32 *)yuv);
            }
        }
        return true;
    }

This function is too slow for my needs (depending on the screen size, it can take up to a second to convert all pixels to YUV). Are there any tricks to make it faster?

The first thing you need to do is put timing code around the different parts of your function to see which parts are slowest. You might find that the Locks are actually a lot slower than the conversion itself, or that one of the Locks is much slower than the other.
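As a rough illustration of the timing suggestion, here is a minimal sketch using std::chrono. The helper name time_ms is hypothetical (it is not from the original post), and the commented usage lines simply reuse the surface names from the question.

```cpp
#include <chrono>

// Hypothetical helper (not part of the original code): runs fn once and
// returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Possible usage inside the renderer, timing each stage separately
// (names taken from the question, shown here only as comments):
//
//   double lock_overlay_ms = time_ms([&] {
//       m_overlay_back_surface->Lock(NULL, &ddsd, DDLOCK_WAIT, NULL);
//   });
//   double lock_offscreen_ms = time_ms([&] {
//       m_offscreen_surface->Lock(NULL, &ddsd_offscreen, DDLOCK_WAIT, NULL);
//   });
//   double convert_ms = time_ms([&] {
//       CopyRGBSurfaceToYUVSurface(&ddsd_offscreen, &ddsd, UYVY);
//   });
```

Averaging over several frames should give steadier numbers than a single measurement, since the first Lock may include one-time costs.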

Doing the conversion in a shader will speed up the conversion part, but if you need to recover the converted buffer back to system memory then you'll still have the cost of the Locks to contend with. If you don't need the converted buffer in system memory, you should definitely use a shader, otherwise you're copying from GPU to system memory, doing the conversion, then copying from system memory back to GPU memory again, which is nuts from a performance perspective.

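Break the swizzling into its own function, one per output format, so the byte positions are compile-time constants. Writing to those non-constant indices is going to slow you down a lot. Oh, and you can move that +16 and +128 inside the left-hand side of the shift by multiplying them by 256 and adding them to the rounding constant. Finally, marking pPixels1/2 as __restrict could possibly help, though it isn't a slam dunk here.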
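The suggestions above could look roughly like the following sketch. It assumes 32-bit XRGB source pixels (blue in the low byte, as in the question), an even row width, and a little-endian target. The function names are hypothetical, and __restrict is a compiler extension (accepted by MSVC, GCC and Clang) rather than standard C++.

```cpp
#include <cstddef>
#include <cstdint>

// Offsets folded into the rounding constant:
// ((x + 128) >> 8) + 16  == (x + 128 + 16 * 256)  >> 8 == (x + 4224)  >> 8
// ((x + 128) >> 8) + 128 == (x + 128 + 128 * 256) >> 8 == (x + 32896) >> 8
// With the offset folded in, every sum is non-negative for 0..255 inputs.
static inline std::uint32_t luma(int r, int g, int b)
{
    return static_cast<std::uint32_t>(66 * r + 129 * g + 25 * b + 4224) >> 8;
}
static inline std::uint32_t chroma_u(int r, int g, int b)
{
    return static_cast<std::uint32_t>(-38 * r - 74 * g + 112 * b + 32896) >> 8;
}
static inline std::uint32_t chroma_v(int r, int g, int b)
{
    return static_cast<std::uint32_t>(112 * r - 94 * g - 18 * b + 32896) >> 8;
}

// Converts two XRGB pixels; like the original, U comes from the first
// pixel and V from the second.
struct Yuv2 { std::uint32_t y0, u, y1, v; };

static inline Yuv2 convert_pair(std::uint32_t p0, std::uint32_t p1)
{
    const int r0 = (p0 >> 16) & 0xFF, g0 = (p0 >> 8) & 0xFF, b0 = p0 & 0xFF;
    const int r1 = (p1 >> 16) & 0xFF, g1 = (p1 >> 8) & 0xFF, b1 = p1 & 0xFF;
    return { luma(r0, g0, b0), chroma_u(r0, g0, b0),
             luma(r1, g1, b1), chroma_v(r1, g1, b1) };
}

// UYVY: bytes in memory are U, Y0, V, Y1. The swizzle is fixed at compile time.
void convert_row_uyvy(const std::uint32_t* __restrict src,
                      std::uint32_t* __restrict dst, std::size_t width)
{
    for (std::size_t x = 0; x + 1 < width; x += 2) {
        const Yuv2 c = convert_pair(src[x], src[x + 1]);
        dst[x / 2] = c.u | (c.y0 << 8) | (c.v << 16) | (c.y1 << 24);
    }
}

// YUY2: bytes in memory are Y0, U, Y1, V.
void convert_row_yuy2(const std::uint32_t* __restrict src,
                      std::uint32_t* __restrict dst, std::size_t width)
{
    for (std::size_t x = 0; x + 1 < width; x += 2) {
        const Yuv2 c = convert_pair(src[x], src[x + 1]);
        dst[x / 2] = c.y0 | (c.u << 8) | (c.y1 << 16) | (c.v << 24);
    }
}
```

The caller would pick convert_row_uyvy or convert_row_yuy2 once, outside the loop over rows, and pass each row's start pointer computed from the surface pitch. Whether this is measurably faster depends on the compiler and on how much of the total time the Locks account for.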
