Jump to content
  • Advertisement
Sign in to follow this  


This topic is 1450 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I would like to implement a full-fledged method of calculation of the logarithm double. Whether I not the write requests, whether the search engine yet. The original idea will calculate the result bits as integer logarithm:

double _DLog2DEF(double _a)
    if (_a<=0) return 0;
    double a=0.00048828125;
    if (_a<1) {
    unsigned _int32 b=0;
    if (18446744073709551616.0<=_a) {b|=131072;_a*=5.4210108624275221700372640043497E-20;}
    if (4294967296.0<=_a) {b|=65536;_a*=0.00000000023283064365386962890625;}
    if (65536.0<=_a) {b|=32768;_a*=0.0000152587890625;}
    if (256.0<=_a) {b|=16384;_a*=0.00390625;}
    if (16.0<=_a) {b|=8192;_a*=0.0625;}
    if (4.0<=_a) {b|=4096;_a*=0.25;}
    if (2.0<=_a) {b|=2048;_a*=0.5;}
    if (1.4142135623730950488016887242097<=_a) {b|=1024;_a*=0.70710678118654752440084436210485;}
    if (1.1892071150027210667174999705605<=_a) {b|=512;_a*=0.84089641525371454303112547623321;}
    if (1.0905077326652576592070106557607<=_a) {b|=256;_a*=0.91700404320467123174354159479414;}
    if (1.0442737824274138403219664787399<=_a) {b|=128;_a*=0.95760328069857364693630563514792;}
    if (1.0218971486541166782344801347833<=_a) {b|=64;_a*=0.97857206208770013450916112581344;}
    if (1.0108892860517004600204097905619<=_a) {b|=32;_a*=0.98922801319397548412912495906558;}
    if (1.0054299011128028213513839559348<=_a) {b|=16;_a*=0.99459942348363317565247768622217;}
    if (1.0027112750502024854307455884504<=_a) {b|=8;_a*=0.99729605608547012625765991384792;}
    if (1.0013547198921082058808815267841<=_a) {b|=4;_a*=0.99864711289097017358812131808592;}
    if (1.0006771306930663566781727848746<=_a) {b|=2;_a*=0.99932332750265075236028365984374;}
    //if (1.0003385080526823129533054818562f<=_a) {r+=0.00048828125f;_a*=0.99966160649624368394219686876282f;}
    double r=b;
    if (1.0003385080526823129533054818562<=_a) {
    } else {
    return r*a;

In principle, everything is even faster than the FPU and accuracy can catch up to the desired state. But still this sse is not optimizer.exe. 
Found a little code to implement a fast calculation on SSE but it's only for float. and precision of 3 bits less than the original. 
The challenge is to filter values mantissa through the polynomial:

union conv8b {
  double valdouble;
  _int32 valint[2];
  unsigned _int32 valuint[2];
  char valarray[8];

double _DLog2DEF(double _a)
 conv8b c;

    c.valdouble=((c.valint[1] & 0x7FF00000)>>20)-1023;

    conv8b d;
    d.valint[1]=(d.valint[1]&0x000FFFFF) | 0x3FF00000;

    double rez=((((((-3.4436006e-2*d.valdouble)+3.1821337e-1)*d.valdouble)-1.2315303)*d.valdouble+2.5988452)*d.valdouble-3.3241990)*d.valdouble+3.1157899;
    return (d.valdouble-1.0f)*rez+c.valdouble;

Found only polynomial of the 5 factors, as these factors are calculated for log2 descriptions are unfortunately not found. Actually the question is who is aware of how you can calculate these coefficients for a polynomial of the 6-th,7 th and th grade.


source: http://jrfonseca.blogspot.ru/2008/09/fast-sse2-pow-tables-or-polynomials.html

Share this post

Link to post
Share on other sites

Yes I have looked into this method, but if you increase the order of accuracy is lost. And I went the other way. And looked on what principle is implemented _mm_log2_pd. It turned out that one can reduce the variance error creating a table of differences Log2(x)-F(x). The result was:

double _DsLog2DEF(double _a)
  unsigned _int64 a1=((*(unsigned _int64*)&_a) & 0x000FFFFFFFFFFFFF) | 0x3F60000000000000;
  double s2=(*(double*)&a1);
  double s1=(double)(1.0f/(float)s2);
  unsigned _int64 a2=(a1 & 0xfffffffffc000000);
  double s3=(*(double*)&a2);

  int tableId=(((*(unsigned _int64*)&s1) >> 39)-0x0080ee20)>>4;
  double s4=((*(unsigned _int64*)&_a) >> 52)+table[tableId*2];


  double s6=((((((-0.026810595806536702)*s3+0.046373904173250588)*s3

  return s6+s4+s3+table[tableId*2+1];

To use this method, the desired table, mantissa X for this table is 0.001953125 in the range from 1 to 2 only 512 pairs, the total size 8208 bytes. 

I don't know how to add the court file table, therefore, indicate a link to another source http://www.gamedev.ru/files/?id=101924


Yes there are a few tricks associated with lighting calculations on high and low part of the result.

For example:


which can be simplified to:


but the point is that in S3 stores the high-order part of the mantissa, and (S2-S3) younger, S2 contains the mantissa completely. 
In this calculation the result is quite different precision.

Edited by foxes

Share this post

Link to post
Share on other sites
Sign in to follow this  

  • Advertisement

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!