Sign in to follow this  

Log2

This topic is 1113 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I would like to implement a full-fledged method of calculation of the logarithm double. Whether I not the write requests, whether the search engine yet. The original idea will calculate the result bits as integer logarithm:

double _DLog2DEF(double _a)
{
    if (_a<=0) return 0;
    double a=0.00048828125;
    if (_a<1) {
      _a=1.0/_a;
      a=-0.00048828125;
    }
    unsigned _int32 b=0;
    if (18446744073709551616.0<=_a) {b|=131072;_a*=5.4210108624275221700372640043497E-20;}
    if (4294967296.0<=_a) {b|=65536;_a*=0.00000000023283064365386962890625;}
    if (65536.0<=_a) {b|=32768;_a*=0.0000152587890625;}
    if (256.0<=_a) {b|=16384;_a*=0.00390625;}
    if (16.0<=_a) {b|=8192;_a*=0.0625;}
    if (4.0<=_a) {b|=4096;_a*=0.25;}
    if (2.0<=_a) {b|=2048;_a*=0.5;}
    if (1.4142135623730950488016887242097<=_a) {b|=1024;_a*=0.70710678118654752440084436210485;}
    if (1.1892071150027210667174999705605<=_a) {b|=512;_a*=0.84089641525371454303112547623321;}
    if (1.0905077326652576592070106557607<=_a) {b|=256;_a*=0.91700404320467123174354159479414;}
    if (1.0442737824274138403219664787399<=_a) {b|=128;_a*=0.95760328069857364693630563514792;}
    if (1.0218971486541166782344801347833<=_a) {b|=64;_a*=0.97857206208770013450916112581344;}
    if (1.0108892860517004600204097905619<=_a) {b|=32;_a*=0.98922801319397548412912495906558;}
    if (1.0054299011128028213513839559348<=_a) {b|=16;_a*=0.99459942348363317565247768622217;}
    if (1.0027112750502024854307455884504<=_a) {b|=8;_a*=0.99729605608547012625765991384792;}
    if (1.0013547198921082058808815267841<=_a) {b|=4;_a*=0.99864711289097017358812131808592;}
    if (1.0006771306930663566781727848746<=_a) {b|=2;_a*=0.99932332750265075236028365984374;}
    //if (1.0003385080526823129533054818562f<=_a) {r+=0.00048828125f;_a*=0.99966160649624368394219686876282f;}
    double r=b;
    if (1.0003385080526823129533054818562<=_a) {
      _a=1+(_a-1.0003385080526823129533054818562)*2953.1628373988541728190892445809;
      r+=_a;
    } else {
      _a=(_a-1)*2954.1394719448279842425365870796;
      r+=_a;
    }
    return r*a;
}

In principle, everything is even faster than the FPU and accuracy can catch up to the desired state. But still this sse is not optimizer.exe. 
Found a little code to implement a fast calculation on SSE but it's only for float. and precision of 3 bits less than the original. 
The challenge is to filter values mantissa through the polynomial:

union conv8b {
  double valdouble;
  _int32 valint[2];
  unsigned _int32 valuint[2];
  char valarray[8];
};

double _DLog2DEF(double _a)
{
 conv8b c;
    c.valdouble=_a;

    c.valdouble=((c.valint[1] & 0x7FF00000)>>20)-1023;

    conv8b d;
    d.valdouble=_a;
    d.valint[1]=(d.valint[1]&0x000FFFFF) | 0x3FF00000;

    double rez=((((((-3.4436006e-2*d.valdouble)+3.1821337e-1)*d.valdouble)-1.2315303)*d.valdouble+2.5988452)*d.valdouble-3.3241990)*d.valdouble+3.1157899;
    return (d.valdouble-1.0f)*rez+c.valdouble;
}

Found only polynomial of the 5 factors, as these factors are calculated for log2 descriptions are unfortunately not found. Actually the question is who is aware of how you can calculate these coefficients for a polynomial of the 6-th,7 th and th grade.

 

source: http://jrfonseca.blogspot.ru/2008/09/fast-sse2-pow-tables-or-polynomials.html

Share this post


Link to post
Share on other sites

Yes I have looked into this method, but if you increase the order of accuracy is lost. And I went the other way. And looked on what principle is implemented _mm_log2_pd. It turned out that one can reduce the variance error creating a table of differences Log2(x)-F(x). The result was:
Log2(x)=F(x)+Log2Table[IdTable(x)];

double _DsLog2DEF(double _a)
{
  unsigned _int64 a1=((*(unsigned _int64*)&_a) & 0x000FFFFFFFFFFFFF) | 0x3F60000000000000;
  double s2=(*(double*)&a1);
  double s1=(double)(1.0f/(float)s2);
  s1=s1*1.44140625;
  unsigned _int64 a2=(a1 & 0xfffffffffc000000);
  double s3=(*(double*)&a2);
  s2-=s3;
  s1=floor(s1+0.5);
  s2=s2*s1;
  s3=s3*s1-1.44140625;

  int tableId=(((*(unsigned _int64*)&s1) >> 39)-0x0080ee20)>>4;
  double s4=((*(unsigned _int64*)&_a) >> 52)+table[tableId*2];

  s3=s3+s2;

  double s6=((((((-0.026810595806536702)*s3+0.046373904173250588)*s3
                         -0.083554326761742681)*s3+0.16058097174555816)*s3
                         -0.34719362445817492)*s3+0.00089412050833233693)*s3;

  return s6+s4+s3+table[tableId*2+1];
}

To use this method, the desired table, mantissa X for this table is 0.001953125 in the range from 1 to 2 only 512 pairs, the total size 8208 bytes. 

I don't know how to add the court file table, therefore, indicate a link to another source http://www.gamedev.ru/files/?id=101924

 

Yes there are a few tricks associated with lighting calculations on high and low part of the result.

For example:

s3=s3*s1-1.44140625+(s2-s3)*s1;

which can be simplified to:

s3=s2*s1-1.44140625;

but the point is that in S3 stores the high-order part of the mantissa, and (S2-S3) younger, S2 contains the mantissa completely. 
In this calculation the result is quite different precision.

Edited by foxes

Share this post


Link to post
Share on other sites

This topic is 1113 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this