RGB-HSV use SSE

Started by
6 comments, last by Emmanuel Deloget 16 years, 9 months ago

__inline void ConvertRGB2HSV(
unsigned char sR,unsigned char sG,unsigned char sB,int *dH,int *dS,int *dV)
{
      float fTemp,fDelta;
      int nMin,nMax;
      nMin = min( sR, min( sG, sB ));
      nMax = max( sR, max( sG, sB ));
      fDelta = nMax - nMin;
      *dV = nMax;
      if(nMax == nMin)
             *dH = *dS = 0;
      else
      {
             fTemp = fDelta / nMax;
             *dS= (int) (fTemp * 255);
             if(sR == nMax)
                   fTemp = (float)(sG - sB) / fDelta;
             else if(sG == nMax)
                   fTemp = 2.0 + ((float)(sB - sR) / fDelta);
             else
                   fTemp = 4.0 + ((float)(sR - sG) / fDelta);
             fTemp *= 60;
             if(fTemp < 0)
                   fTemp += 360;
             if(fTemp == 360)
                   fTemp = 0;
             *dH = (int)fTemp;
      }
}

unsinged char sR,sG,sB;
int dH,dS,dV;
int nMaxH,nMinH,nMaxS,nMinS,nMaxV,nMinV;
nMaxH = nMaxS = nMaxV = 0;
nMinH = nMinS = nMinV = 360;
RGBTRIPLE *pPixel = (RGBTRIPLE *)pSourceBuffer;
//for(int j = 0; j < 576; j++)
//for(int i = 0; i < 768; i++)
int i=0;
while(i++ < 768 * 576)
{
      sR = pPixel->rgbtRed;
      sG = pPixel->rgbtGreen;
      sB = pPixel->rgbtBlue;
      ConvertRGB2HSV( sR, sG, sB, &dH, &dS, &dV);
      if( dH > nMaxH)
            nMaxH = dH;
      if( dH < nMinH)
            nMinH = dH;
      if( dS > nMaxS)
            nMaxS = dS;
      if( dS < nMinS)
            nMinS = dS;
      if( dV > nMaxV)
            nMaxV = dV;
      if( dV < nMinV)
            nMinV = dV;
      pPixel++;
}
ConvertRGB2HSV(...) is too slow.I want to rewrite this function (or the all code)using SSE (C++ code with inline Assembly ,like as: _asm{ movaps xmm1,xmm2 ... }) The compiler is VC6(SP5),support inline assembly with SSE instructions. I want to use SSE,Who can help me?Please!
Advertisement
Please help me?
Your algorithm appears to have too many branches to benefit from SIMD reliably. About the only place where you could apply some parallelization is in the final "determine minmax" six comparisons, but these only account for 4% of your execution time anyway. Where did you intend to apply SSE?

My first check would be to determine whether the call is actually inlined (on VC2005, for instance, it isn't).
Quote:Original post by ToohrVyk
Your algorithm appears to have too many branches to benefit from SIMD reliably. About the only place where you could apply some parallelization is in the final "determine minmax" six comparisons, but these only account for 4% of your execution time anyway. Where did you intend to apply SSE?

My first check would be to determine whether the call is actually inlined (on VC2005, for instance, it isn't).


If the compiler is smart, it's not going to be inlined any time soon - the function is far too heavy.

Now, this alorithm is quite difficult to vectorize. However, there are some possible areas of improvement:
* Avoid unnecessary data type changes (and casts). And remember, 2.0 is a double, and 2.0f is a float.
* fDelta shall contain 1,0f / static_cast<float>(nMax - nMin) - you can then get rid of many divisions.

I would rewrite the code like this:
void ConvertRGB2HSV(unsigned char sR,                    unsigned char sG,                    unsigned char sB,                    int *dH,int *dS,int *dV){  float fTemp, fDelta;  int nMin, nMax;  nMin = std::min(sR, std::min(sG, SB));  nMax = std::max(sR, std::max(sG, SB));  *dV = nMax;  if (nMin == nMax) {    *dH = *dS = 0;  } else {    // no need to use any float to get dS    *dS = ((nMax - nMin) * 255) / nMax;    // we only compute fDelta here: we incorporate the    // the *60, and we store the inverse of the final value    fDelta = 60.0f / static_cast<float>(nMax - nMin);    // constants also include the *60, so we avoid the final    // multiplication    if(sR == nMax) {      fTemp = static_cast<float>(sG - sB) * fDelta;    } else if(sG == nMax) {        fTemp = 120.0f + static_cast<float>(sB - sR) * fDelta;    } else {      fTemp = 240.0f + static_cast<float>(sR - sG) * fDelta;    }    // cast now - integer comparisons and operations are    // going to be faster than the same operation on floats    *dH = static_cast<int>(fTemp);    if (*dH < 0) *dH += 360;    else if (*dH >= 360) *dH -= 360;  }}

Now, I won't say you that this version performs incrediby faster than your version. One thing that can be improved is to use a fake float-to-int conversion - since you're dealing with a range that allow you to remove all checks AND you are not using your result anymore.
You can include the small function:
namespace {  // this is, of course, heavily MS oriented. If you are using GCC,   // the syntax will change.  inline int fake_ftol(float v)  {    int retval;    __asm fld a    __asm fistp retval    return retval;      }}
*dH = static_cast<int>(fTemp);

becomes
*dH = fake_ftol(fTemp);

We are in the realm of micro-optimization, so don't expect a terrible boost.

You can also avoid passing 6 parameters for each function call - just pass a const reference to the RGBTRIPLE and a non-const reference to a HSVTRIPLE:
struct HSVTRIPLE{  int h, s, v;}void ConvertRGB2HSV(const RGBTRIPLE& rgb, HSVTRIPLE& hsv){  // ...}RGBTRIPLE *pSrc = (RGBTRIPLE *)pSourceBuffer;RGBTRIPLE *pSrcEnd = pPixel + (768 * 576)// let's say we're storing the results in an imageHSVTRIPLE *pDestBuffer = new HSVTRIPLE[768*576];HSVTRIPLE *pDest = pDestBuffer;while(pSrc < pSrcEnd){  ConvertRGB2HSV(*pPixel, *pDest);  // update the max and min values  // ...  ++pDestBegin;  ++pSrc;}

You'll avoid as much as 4 push per function call - and that's quite important for a function which is called that often.
Thank you very much!
Quote:Original post by Emmanuel Deloget
If the compiler is smart, it's not going to be inlined any time soon - the function is far too heavy.


If the compiler managed to determine that the ConvertRGB2HSV function was only called within this code, and was always called exactly once (per loop iteration) when that code was run, then it could inline it to save both time and space.

So, not inlining would be the situation of a compiler that is smart, but not smart enough [wink]

Entire algorithm could also be rewritten like this:
void ConvertRGB2HSV(unsigned char sR,                    unsigned char sG,                    unsigned char sB,                    int *dH,int *dS,int *dV){  float fTemp, fDelta;  int nMin, nMax;  nMin = std::min(sR, std::min(sG, SB));  nMax = std::max(sR, std::max(sG, SB));  *dV = nMax;  if (nMin == nMax) {    *dH = *dS = 0;  } else {    // no need to use any float to get dS    *dS = ((nMax - nMin) * 255) / nMax;  }}


Hue does not have minimum or maximum, it's a cyclical value. It's also not independant of other two - the relations between H, S and V are not linear in the way R, G and B are.

Whatever the resulting value ranges will be, they will have no real meaning - increasing single H, S or V value is non-linear to changes in RGB - as such, HSV min/max values are meaningless.

See the definition of hue for why finding min/max for it doesn't make sense.

Given the (17, 55, 18) and (56, 7, 100) H, S and V values, the minimum would be: 17, 7, 18) - which is meaningless color in RGB space. The only thing that would matter would be the last one (18 or 100), but you cannot calculate the values for H and S from that value alone, due to non-linearity.

And for finding brightness, you should use the YUV (YCrCb) transform, which gives meaningful value (from human perspective).
Quote:Original post by ToohrVyk
Quote:Original post by Emmanuel Deloget
If the compiler is smart, it's not going to be inlined any time soon - the function is far too heavy.


If the compiler managed to determine that the ConvertRGB2HSV function was only called within this code, and was always called exactly once (per loop iteration) when that code was run, then it could inline it to save both time and space.

So, not inlining would be the situation of a compiler that is smart, but not smart enough [wink]


Thou be ruining my expectations!

Thanks god, I'm drunk, so I don't care much [smile]

This topic is closed to new replies.

Advertisement