
L. Spiro

Member Since 29 Oct 2003
Offline Last Active Yesterday, 02:20 PM

#5312829 Coding-Style Poll

Posted by L. Spiro on Yesterday, 07:50 AM

That being said, it sounds like you've been hired to produce code that conforms to certain formats and coding conventions. It would probably be a good idea if you just swallowed the bitter pill and wrote their ugly code for them. And be thankful you can define your own saner conventions for your own projects.

Opposite.
We are defining new standards and I am on the committee.
As for me, one of my policies is that happy coders are more productive coders, and one sure way to make coders unhappy is to force on them a style they strongly dislike.
A coding-style guideline should not be overly restrictive, and certainly not unnecessarily so. But the boundary between consistent code and “comfortable dissonance” is not an easy line to draw.

This poll is really as straightforward as it sounds; there are no hidden motives about being pushed into a style I don’t like, etc.
The more I know about which topics people consider “personal”, the better I can judge where to give people a “personal comfort” pass, weighed against the clarity and consistency of the resulting code base.


L. Spiro


#5310567 C++ Going Beyond Basics

Posted by L. Spiro on 13 September 2016 - 04:22 AM

I learned by doing, working my way up in small steps.

 

Besides “Hello world,” my first program was a “game” in which the computer chose a number and you had to guess it.

I moved up a single step by adding branching logic to make the computer insult you in various ways for getting it wrong.

I moved up a single step by adding a loop to keep the game going rather than halting after one pass.
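For illustration, a minimal C++ sketch of that kind of program (hypothetical names and insults, of course; just the three steps described above):

#include <cstdio>
#include <cstdlib>
#include <ctime>

int main() {
	std::srand( (unsigned)std::time( nullptr ) );
	for ( ;; ) {                                          // The loop step: keep the game going.
		int iSecret = std::rand() % 100 + 1;              // The computer chooses a number.
		int iGuess = 0;
		while ( iGuess != iSecret ) {
			std::printf( "Guess the number (1-100): " );
			if ( std::scanf( "%d", &iGuess ) != 1 ) { return 0; }
			// The branching step: insult the player in various ways for getting it wrong.
			if ( iGuess < iSecret )      { std::printf( "Too low. Pathetic.\n" ); }
			else if ( iGuess > iSecret ) { std::printf( "Too high. Embarrassing.\n" ); }
		}
		std::printf( "Finally. Again we go.\n" );
	}
}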

 

These projects were designed for me to learn the basics.

 

Every project was designed to push me to learn something new.  Each was the next step up from the previous, adding something I had not yet explored or understood.

 

There is nothing mysterious about the process.  Do something new in a project, learn, and repeat.

 

 

L. Spiro




#5310413 Faster Sin and Cos

Posted by L. Spiro on 12 September 2016 - 02:16 AM

It’s possible if I have time.

I am in the middle of writing a blog post about sin() and cos() at the moment, and it is going into a lot of detail (more than I originally planned).

I might include a note about your my_atan(), and possibly exp() and some others.  Did you come up with the original code?

 

 

L. Spiro




#5310275 Faster Sin and Cos

Posted by L. Spiro on 10 September 2016 - 02:16 PM

Then you may find these constants more to your liking:
1.00022506713867187500f 0.324211299419403076172f 0.0511969886720180511475f


L. Spiro




#5310241 Faster Sin and Cos

Posted by L. Spiro on 10 September 2016 - 07:47 AM

You should find that these coefficients give you the best-possible accuracy:
9.99934434890747070313e-01 3.32740783691406250000e-01 6.53432160615921020508e-02
return x / (0.999934434890747070313f + 0.33274078369140625f*x2 - 0.0653432160615921020508f*x2*x2);
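Wrapped into a callable function, this would read roughly as follows. This is only a sketch: the function name my_atan comes from the earlier discussion, and the |x| > 1 reduction via atan(x) = PI/2 - atan(1/x) is my assumption, since only the core rational expression was posted.

// Sketch only; the range reduction for |x| > 1 is an addition, not part of the posted expression.
inline float my_atan( float _fX ) {
	float fAbs = (_fX < 0.0f) ? -_fX : _fX;
	bool bFlip = fAbs > 1.0f;
	if ( bFlip ) { fAbs = 1.0f / fAbs; }               // atan(x) = PI/2 - atan(1/x).
	float x2 = fAbs * fAbs;
	float fRes = fAbs / (0.999934434890747070313f + 0.33274078369140625f*x2 - 0.0653432160615921020508f*x2*x2);
	if ( bFlip ) { fRes = 1.5707963267948966192313216916398f - fRes; }
	return (_fX < 0.0f) ? -fRes : fRes;                // Restore the sign.
}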


L. Spiro


#5310083 Looking for feedback on my game engine

Posted by L. Spiro on 09 September 2016 - 02:19 AM

This is the class that every game will inherit in order to create a game. In order to drive the game forward, is it "OK" for HiminbjorgRoot to keep a reference to each of the Core* and *Manager objects and simply call their respective update methods

That is completely backwards.
Managing messages, systems, and managers is the job of the highest-level module in the engine.
If everything is inheriting from this object, how can it possibly have information on any other type?

At the root you have basic systems and libraries. Math, common core functions, templates, allocators, file systems, primitive types, etc.
Then you have a layer of objects that use those core components to implement their functionality. Physics, sound, input, networking, graphics API (shader system implicitly here), image library, etc.
Then you have higher-level objects that build on top of these, bringing some of them together.

  • Sprite module.
    • Uses image library, graphics API, and physics (note that sprites do not perform any physics, they simply hold data the physics library needs so that another higher-level system can run physics.  This is the same for all objects below as well).
  • Model module.
    • Uses image library, file system (loads custom 3D model and animation formats, etc.), physics, graphics API, etc.
    • Runs skeleton animations and handles model properties, updating of bounding boxes, etc.
  • Terrain module.
  • Uses image library, file system (loads heightmaps and custom terrain files), physics, graphics API, etc.
  • Ocean module, etc.

Then, above all that, you start to couple more things together using the highest-level objects that exist in the engine.  This is the Engine Module.  This is where everything becomes an actual engine.

  • The 3D scene manager knows about terrain, models, static meshes, foliage, weather, oceans, cameras, physics, etc.  It manages the scene.  It connects 3D models (your character, for example) to the world around them so that they themselves don’t need to know what terrain is.  It takes the physics properties of objects, feeds them to the physics engine, and manages all high-level interactions between these otherwise unrelated systems.
    • If you had a render manager it would be invoked by the scene manager (the scene manager has access to all objects in the scene).  With generic data being fed to it by the scene manager, it would perform frustum culling, occlusion culling, render-queue sorting, etc.  It doesn’t know what a 3D model is or what terrain is or what an ocean is.
  • Game states are introduced at this level to allow changing from the main menu to the gameplay state, or credits screen, etc.  The rules of the game are coded into these.
  • A menu/UI system could be introduced here, or be sandwiched between this level and the one below.  It would use at least the sprite module and the font system.  Depending on your organization, it could directly access the input system as well, or it could just be fed generic input data such as “RClick( X, Y )”.
  • Game entities/actors and components are introduced here.  These are how to create instances of models, cameras, etc. for the scene manager to manage.
  • Input is handled here.
  • If you had a messaging system (in case you want to make your life hell and eventually have a reason to give up on the project) it would go here.
    • Hint: Don’t have a generic messaging system.  Why would you need one when you already have perfectly fine objects for binding things together and orchestrating the whole process?

There is no such thing as “everything must inherit from this” in an engine, and necessarily so.

An engine is what you get when you bring otherwise unrelated modules together.  The physics library is completely unrelated to the sound library.  You should mentally compartmentalize each different system and module such that they are each their own stand-alone libraries in order to help you clearly see how things should be separated and how high-level each system is.
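As a minimal sketch of this layering (all names hypothetical; this illustrates the composition idea, not a real engine), the high-level module owns and coordinates the lower-level ones, and nothing inherits from an engine root:

#include <vector>

struct PhysicsProps { float fMass; };

class PhysicsEngine {                       // Mid-level module: knows nothing above it.
public:
	void Submit( const PhysicsProps &_ppProps ) { /* Queue the body for this step. */ }
	void Step( float _fDt )                     { /* Integrate all queued bodies. */ }
};

class Model {                               // Mid-level module: holds data, runs no physics itself.
	PhysicsProps m_ppProps { 1.0f };
public:
	const PhysicsProps & Props() const { return m_ppProps; }
};

class SceneManager {                        // Engine-level module: wires the others together.
	PhysicsEngine m_pePhysics;
	std::vector<Model> m_vModels;
public:
	void Update( float _fDt ) {
		for ( const Model &mModel : m_vModels ) {
			m_pePhysics.Submit( mModel.Props() );   // Models never touch the physics engine directly.
		}
		m_pePhysics.Step( _fDt );
		// Frustum culling, render-queue sorting, etc. would also be driven from here.
	}
};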

 

In your proposal, you have input and rendering on the same level.  How does that make sense?  Input comes from the very top and has to go through the highest-level parts of the engine in order to be translated into contextual game actions.  Rendering, on the other hand, doesn’t have knowledge of what terrain, water, fog, foliage, skies, models, or even sprites are, and these are all modules/systems below the engine module.  The rendering API/module would necessarily be at least the next level down.

 

 

You don’t mention a “SoundManager” but I have to assume you have one.

Would you send an entire model, complete with bounding boxes, skeletons, meshes, physics properties, etc., to the sound manager when all you want to do is play a buzzing sound emanating from the object (a saw weapon in an FPS, for example)?

No, of course not.

Then why would you send an entire model, complete with bounding boxes, skeletons, meshes, physics properties, etc., to the render manager when all you want to do is render a mesh?
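To illustrate (hypothetical names; a sketch of the idea, not a prescribed interface), the scene manager would extract only the plain render data a render manager actually needs:

// The render manager never sees a Model; it sees only generic draw data.
struct RenderItem {
	unsigned uiVertexBuffer;   // Which mesh data to draw.
	unsigned uiShader;         // Which shader to bind.
	float    fWorld[16];       // World transform.
	float    fSortKey;         // Used for render-queue sorting.
};

class RenderManager {
public:
	void Submit( const RenderItem &_riItem ) { /* Add to the render queue. */ }
	void Render()                            { /* Sort by fSortKey, set states, draw. */ }
};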

 

 

You don’t have a clear hierarchy of modules/systems, and what you do have seems poorly defined in terms of responsibility and scope.

What does a “ResourceManager” manage?

Can you clearly and concisely define a “resource”?

Does it include the original Photoshop .PSD files and the original .MAX or .MB files?

Does the same class load your custom model format as well as .WAV files, even though the only thing common to each is that they are files that are loaded or streamed into memory?

Are shaders resources?

 

 

You provide way too little detail, and what you do provide doesn’t seem well defined.

 

 

L. Spiro




#5309921 Faster Sin and Cos

Posted by L. Spiro on 08 September 2016 - 01:13 AM

You can make this code more precise by using a higher-degree polynomial in x2, and by picking better coefficients (I pretty much picked these by hand).

Any takers?

Sure. I have plans for tonight but I can do it soon.

One thing that will make it faster and more accurate is to replace:
static const float pi_halves = float(std::atan(1.0)*2.0);
with:
const float pi_halves = 1.5707963267948966192313216916398f; // float(std::atan(1.0)*2.0)
Even going through atan(1)*2 on a high-precision calculator does not give exactly PI/2. Due to the accuracy of floats it will likely end up being the same constant either way, but writing the constant out ensures it remains correct at higher degrees of accuracy, and it guarantees we have exactly the best constant.
It also avoids the possibility that a given compiler does not implement atan() as an intrinsic evaluated at compile time.

The second and more important point is to remove static. A function-local static adds a lot of code to initialize pi_halves in a thread-safe way. It always adds a branch (critical to avoid in performance-sensitive code such as this) and may add instruction-cache misses due to the extra code for locking, applying the value, setting a flag, and unlocking.
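To illustrate why (a conceptual sketch of C++11 “magic statics”, with hypothetical names, not the literal code any particular compiler emits), a function-local static behaves roughly like this:

#include <cmath>

// Conceptual expansion of: static const float pi_halves = float(std::atan(1.0)*2.0);
// Real compilers emit a hidden guard variable and, on the Itanium ABI, calls to
// __cxa_guard_acquire()/__cxa_guard_release() for thread safety.
static bool  g_bPiHalvesInit = false;   // Guard flag, checked on every call.
static float g_fPiHalves;               // Uninitialized storage.

float GetPiHalves() {
	if ( !g_bPiHalvesInit ) {           // The unavoidable branch.
		// Locking would happen here.
		g_fPiHalves = float( std::atan( 1.0 ) * 2.0 );
		g_bPiHalvesInit = true;         // Set the flag.
		// Unlocking would happen here.
	}
	return g_fPiHalves;
}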

I see a few other things that could impact performance as well.
And I will tackle accuracy soon.


L. Spiro


#5309920 Transitioning game states when using a stack of states

Posted by L. Spiro on 08 September 2016 - 01:09 AM

As Kylotan says though, it is a fine line, and I am having trouble deciding what constitutes a "GameState" and a "ScreenState".

Not in this case.
There is a fine line when two states have very similar sets of rules, gameplay, or functionality.
Here, you have a single set of rules and gameplay (the GameState), and you want to keep it persistent while adding layers of menus on top of it.

Nothing tricky about this case. You want a single state and a menu/UI system on top of it. The UI system will implicitly allow layers and allow you to block input to the game, easily achieving the desired effect. I would also shy away from calling it a “screen”; that is just another layer of ambiguity.


Don’t overcomplexificationatize things.


L. Spiro


#5309776 Bullet Physics - Weird collision

Posted by L. Spiro on 07 September 2016 - 03:57 AM

P.S.: I don't get the meaning of the -1 on my post -.-"

It means your reply was obnoxious.
If you need more assistance with your problem, say so and provide more information. Your topic will still be bumped and people will be more willing to take a second look.
When asking for help, it is important to show others that you have tried on your own, which means giving some indication of what you tried during the days you waited.


L. Spiro


#5309763 Transitioning game states when using a stack of states

Posted by L. Spiro on 07 September 2016 - 02:16 AM

A stack of states only makes sense when you have what you consider the “current state” and the ability to push a sub-state on top of it to temporarily block it, such as telling the player that his or her WiiMote batteries are dead: the current state is interrupted, a screen appears indicating that the batteries need to be replaced, and once they are replaced the screen goes away and the game resumes normally.

In almost all situations you only need a current state.
View these posts on how to set it up.
General Game/Engine Structure
Passing Data Between Game States
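As a minimal sketch of the structure described above (hypothetical names): one real current state, plus an optional interrupt that can temporarily sit on top of it:

#include <memory>
#include <vector>

struct State {
	virtual ~State() = default;
	virtual void Update() = 0;
};

class Game {
	std::unique_ptr<State>              m_pCurState;    // The single current state.
	std::vector<std::unique_ptr<State>> m_vInterrupts;  // E.g. “replace your batteries”.
public:
	void SetState( std::unique_ptr<State> _pState )      { m_pCurState = std::move( _pState ); }
	void PushInterrupt( std::unique_ptr<State> _pState ) { m_vInterrupts.push_back( std::move( _pState ) ); }
	void PopInterrupt()                                  { if ( !m_vInterrupts.empty() ) { m_vInterrupts.pop_back(); } }
	void Update() {
		if ( !m_vInterrupts.empty() ) { m_vInterrupts.back()->Update(); }  // Blocks the game below it.
		else if ( m_pCurState )       { m_pCurState->Update(); }           // Normal play.
	}
};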


L. Spiro


#5309761 Faster Sin and Cos

Posted by L. Spiro on 07 September 2016 - 01:15 AM

For most of my tests, -PI to PI. I test beyond that for sanity checks to ensure it works properly, but because any value passed gets mod’ed into -PI to PI, only that range needs to be strictly checked for accuracy.
For the tests in my last post, I used samoth’s code, which ranges from -3.14 to 3.14.

Large float values are valid, but due to the way computers handle them, they decrease in accuracy away from 0 (the same holds for fsinf() and all implementations of sin/cos).


L. Spiro


#5309534 Faster Sin and Cos

Posted by L. Spiro on 05 September 2016 - 11:09 AM

Are you maybe compiling in 32-bit mode, or for an older architecture like PIII? This might be the reason. I'm compiling in 64-bit mode with -O3 -march=skylake. This processor has FMA available, and the compiler is using it (albeit for all functions, not just some). Quite possibly, however, a non-FMA build might have slightly different results, both in performance and in precision. It may be interesting to look at that.

I found the reason you are sometimes getting better results (I had a feeling it was this, and mentioned it in my last post).
You are using:

constexpr float sin_garrett_c11_s(float x)
{
	float x2 = x * x;
	return (((((-2.05342856289746600727e-08*x2 + 2.70405218307799040084e-06)*x2
		- 1.98125763417806681909e-04)*x2 + 8.33255814755188010464e-03)*x2
		- 1.66665772196961623983e-01)*x2 + 9.99999707044156546685e-01)*x;
}

This does the entire algorithm in double precision and then casts to float for the return. That is not a fair comparison against mine.
Additionally, your performance tests are skewed because this only handles values in the range -PI to PI. This is why Garrett’s versions fail most of your sanity checks.
To make this useful and support a natural range beyond -PI to PI, the manual fmodf( X, PI ) that my code does is necessary.


With these changes, the testing code should be:

constexpr float sin_garrett_c11_s( float _fX ) {
    int i32I = int( _fX * (1.0f / 3.1415926535897932384626433832795f) );	// 0.31830988618379067153776752674503 = 1 / PI.
    _fX = (_fX - float( i32I ) * 3.1415926535897932384626433832795f);

    float x2 = _fX * _fX;
    return (i32I & 1) ?
        (((((-2.05342856289746600727e-08f*x2 + 2.70405218307799040084e-06f)*x2
            - 1.98125763417806681909e-04f)*x2 + 8.33255814755188010464e-03f)*x2
            - 1.66665772196961623983e-01f)*x2 + 9.99999707044156546685e-01f)*-_fX :
        (((((-2.05342856289746600727e-08f*x2 + 2.70405218307799040084e-06f)*x2
            - 1.98125763417806681909e-04f)*x2 + 8.33255814755188010464e-03f)*x2
            - 1.66665772196961623983e-01f)*x2 + 9.99999707044156546685e-01f)*_fX;
}

The constants must be changed to floats to test accuracy fairly (otherwise my constants should be in double form), and the manual fmodf() must be present to test performance fairly (or it should be removed from my code).


With these changes, I get the exact same performance from both functions, and my results are as follows:

QPC resolution      = 3417021 ticks/sec
Iterations per test = 500000000

timings
sin               : 0             [0.000000 us per iteration] --> -nan(ind) : 1 // Cannot run the test on sin().
sin_garrett_c11   : 8745125       [0.005119 us per iteration] --> 0.0 : 1
sin_garrett_c11_s : 11423929      [0.006686 us per iteration] --> 0.0 : 1
sin_spiro_s       : 15761254      [0.009225 us per iteration] --> 0.0 : 1
sin_spiro         : 11677962      [0.006835 us per iteration] --> 0.0 : 1
sin_adam42_s      : 18002370      [0.010537 us per iteration] --> 0.0 : 1
sin_adam42        : 15559129      [0.009107 us per iteration] --> 0.0 : 1
sin_spiro_c11_s   : 10417944      [0.006098 us per iteration] --> 0.0 : 1
sin_spiro_c11     : 8995807       [0.005265 us per iteration] --> 0.0 : 1

error metrics
sin_garrett_c11   : emax=0.000000291691941 eavg=0.000000051244667 sse=0.000001752641653 rmse=0.000000059205433   0 values > 1.0
sin_garrett_c11_s : emax=0.000000974476979 eavg=0.000000089525331 sse=0.000007937254041 rmse=0.000000125994080   0 values > 1.0
sin_spiro_s       : emax=0.000000503048245 eavg=0.000000045625274 sse=0.000002303864921 rmse=0.000000067880261   1937 values > 1.0
	bracket around -3/2 pi where f(x) >  1 : [-4.712584292884689 ; -4.612388983364922]    interval = 0.100195309519767
	bracket around -pi/2   where f(x) < -1 : [-1.570942811169897 ; -1.470796329775129]    interval = 0.100146481394768
	bracket around  pi/2   where f(x) >  1 : [+1.570601014294897 ; +1.670796323814664]    interval = 0.100195309519768
	bracket around  3/2 pi where f(x) < -1 : [+4.712291324134689 ; +4.812388977404457]    interval = 0.100097653269768
sin_spiro         : emax=0.000000019798559 eavg=0.000000008708419 sse=0.000000063015472 rmse=0.000000011226350   0 values > 1.0
sin_adam42_s      : emax=0.000000627044312 eavg=0.000000053344115 sse=0.000003131099316 rmse=0.000000079134055   11181 values > 1.0
	bracket around -3/2 pi where f(x) >  1 : [-4.712584292884689 ; -4.711998355384690]    interval = 0.000585937499999
	bracket around -pi/2   where f(x) < -1 : [-1.570869568982396 ; -1.570405701794896]    interval = 0.000463867187500
	bracket around  pi/2   where f(x) >  1 : [+1.570405701794896 ; +1.570799378552709]    interval = 0.000393676757813
	bracket around  3/2 pi where f(x) < -1 : [+4.711998355384690 ; +4.712584292884689]    interval = 0.000585937499999
sin_adam42        : emax=0.000000019798559 eavg=0.000000008708419 sse=0.000000063015472 rmse=0.000000011226350   0 values > 1.0
sin_spiro_c11_s   : emax=0.000000663675650 eavg=0.000000076164470 sse=0.000005072194449 rmse=0.000000100719357   0 values > 1.0
sin_spiro_c11     : emax=0.000000167929271 eavg=0.000000061079306 sse=0.000002831597331 rmse=0.000000075254200   0 values > 1.0

// Sanity checks do not compile.

sin_garrett_c11_s : emax=0.000000974476979 eavg=0.000000089525331 sse=0.000007937254041 rmse=0.000000125994080 0 values > 1.0
sin_spiro_c11_s : emax=0.000000663675650 eavg=0.000000076164470 sse=0.000005072194449 rmse=0.000000100719357 0 values > 1.0


The error metrics still suffer from what I mentioned at the bottom of my post.
If you have a float pipeline, you will not be passing doubles into these approximations.
To measure fairly, we need to pass sin() a float input if we are going to measure against a float approximation.

This gives these results, which align with actual use-cases:

QPC resolution      = 3417021 ticks/sec
Iterations per test = 500000000

timings
sin               : 0             [0.000000 us per iteration] --> -nan(ind) : 1
sin_garrett_c11   : 8849303       [0.005180 us per iteration] --> 0.0 : 1
sin_garrett_c11_s : 11467390      [0.006712 us per iteration] --> 0.0 : 1
sin_spiro_s       : 15753936      [0.009221 us per iteration] --> 0.0 : 1
sin_spiro         : 11658452      [0.006824 us per iteration] --> 0.0 : 1
sin_adam42_s      : 17996489      [0.010533 us per iteration] --> 0.0 : 1
sin_adam42        : 15685189      [0.009181 us per iteration] --> 0.0 : 1
sin_spiro_c11_s   : 10425304      [0.006102 us per iteration] --> 0.0 : 1
sin_spiro_c11     : 8979611       [0.005256 us per iteration] --> 0.0 : 1

error metrics
sin_garrett_c11   : emax=0.000000291691941 eavg=0.000000051244667 sse=0.000001752641653 rmse=0.000000059205433   0 values > 1.0
sin_garrett_c11_s : emax=0.000000856372801 eavg=0.000000087126716 sse=0.000007333915867 rmse=0.000000121110824   0 values > 1.0
sin_spiro_s       : emax=0.000000389970995 eavg=0.000000039888538 sse=0.000001700549542 rmse=0.000000058318943   1937 values > 1.0
	bracket around -3/2 pi where f(x) >  1 : [-4.712584292884689 ; -4.612388983364922]    interval = 0.100195309519767
	bracket around -pi/2   where f(x) < -1 : [-1.570942811169897 ; -1.470796329775129]    interval = 0.100146481394768
	bracket around  pi/2   where f(x) >  1 : [+1.570601014294897 ; +1.670796323814664]    interval = 0.100195309519768
	bracket around  3/2 pi where f(x) < -1 : [+4.712291324134689 ; +4.812388977404457]    interval = 0.100097653269768
sin_spiro         : emax=0.000000019798559 eavg=0.000000008708419 sse=0.000000063015472 rmse=0.000000011226350   0 values > 1.0
sin_adam42_s      : emax=0.000000511392248 eavg=0.000000048573870 sse=0.000002527778524 rmse=0.000000071102441   11181 values > 1.0
	bracket around -3/2 pi where f(x) >  1 : [-4.712584292884689 ; -4.711998355384690]    interval = 0.000585937499999
	bracket around -pi/2   where f(x) < -1 : [-1.570869568982396 ; -1.570405701794896]    interval = 0.000463867187500
	bracket around  pi/2   where f(x) >  1 : [+1.570405701794896 ; +1.570799378552709]    interval = 0.000393676757813
	bracket around  3/2 pi where f(x) < -1 : [+4.711998355384690 ; +4.712584292884689]    interval = 0.000585937499999
sin_adam42        : emax=0.000000019798559 eavg=0.000000008708419 sse=0.000000063015472 rmse=0.000000011226350   0 values > 1.0
sin_spiro_c11_s   : emax=0.000000546977238 eavg=0.000000072763346 sse=0.000004468865118 rmse=0.000000094539570   0 values > 1.0
sin_spiro_c11     : emax=0.000000167929271 eavg=0.000000061079306 sse=0.000002831597331 rmse=0.000000075254200   0 values > 1.0

The changes were:

#define error(f, s) error_impl(#f, f, s)	// Forwards an "is float-precision" flag alongside the function.
template<typename F> void error_impl(const char* name, F f, bool isFloat = false, double low = -3.14, double high = +3.14)
double diff = std::abs(sin(isFloat ? (float)t : t) - f(t));	// Truncate the reference input to float when testing a float function.
	error(sin_garrett_c11, false);
	error(sin_garrett_c11_s, true);
	error(sin_spiro_s, true);
	error(sin_spiro, false);
	error(sin_adam42_s, true);
	error(sin_adam42, false);
	error(sin_spiro_c11_s, true);
	error(sin_spiro_c11, false);

This ensures we pass exactly the same value to sin() as we do to our approximations, which is necessary for error-checking.
My approximations win on max error and average error.
sin_garrett_c11_s : emax=0.000000856372801 eavg=0.000000087126716 sse=0.000007333915867 rmse=0.000000121110824 0 values > 1.0
sin_spiro_c11_s : emax=0.000000546977238 eavg=0.000000072763346 sse=0.000004468865118 rmse=0.000000094539570 0 values > 1.0

sin_garrett_c11 : emax=0.000000291691941 eavg=0.000000051244667 sse=0.000001752641653 rmse=0.000000059205433 0 values > 1.0
sin_spiro_c11 : emax=0.000000167929271 eavg=0.000000061079306 sse=0.000002831597331 rmse=0.000000075254200 0 values > 1.0

(My constants were not meant to be used as doubles, so there is a bit more average error here. Float constants and double constants should be derived separately.)


L. Spiro




#5309488 Faster Sin and Cos

Posted by L. Spiro on 05 September 2016 - 02:21 AM

I wrote this test which performs exactly the same windowing as in Garrett’s paper:

Spoiler: (test code collapsed in the original post)

Never mind the redundant casts, etc.; this section of code has been through a lot.

On http://coliru.stacked-crooked.com/ it prints:
Max: 0.000000656150266592492314998708025 Avg: 0.000000076220214520654925687580315 Max Val: 1.000000000000000000000000000000000
Max: 0.000000964399093460535650201848057 Avg: 0.000000089761670466902256275395285 Max Val: 1.000000000000000000000000000000000

On my machine (Visual Studio 2015, /fp:precise, /fp:strict, /fp:fast):
x86
Max: 0.000000656150266592492314998708025 Avg: 0.000000076220214520658102061132519 Max Val: 1.000000000000000000000000000000000
Max: 0.000000964399093460535650201848057 Avg: 0.000000089761670466905194420931073 Max Val: 1.000000000000000000000000000000000
x64
Max: 0.000000656150266592492314998708025 Avg: 0.000000076220214520184544469168361 Max Val: 1.000000000000000000000000000000000
Max: 0.000000964399093460535650201848057 Avg: 0.000000089761670466352121611043418 Max Val: 1.000000000000000000000000000000000



The important points:
#1: Error accumulation matters and should be done in double.
#2: Error should be measured against sin() rather than sinf(), but either way mine will be more accurate.
#3: It is not enough to simply make Garrett’s function return a float. If you don’t convert his constants to float (as mine are), then his function is going to be significantly more accurate and the test will not be fair. I wrapped mine in float() so that it is easier to switch between float() and double().
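As a sketch of points #1 and #2 (a hypothetical harness; sin_approx() is a stand-in for whichever float approximation is under test), accumulating in double while measuring against double-precision sin() looks like this:

#include <cmath>
#include <cstdio>

// Stand-in for the float approximation under test.
float sin_approx( float _fX ) { return std::sin( _fX ); }

void Measure( double _dLow, double _dHigh, int _iSamples ) {
	double dMax = 0.0, dSum = 0.0;                    // Point #1: accumulate error in double.
	for ( int I = 0; I < _iSamples; ++I ) {
		float fX = float( _dLow + (_dHigh - _dLow) * I / (_iSamples - 1) );
		// Point #2: compare against double-precision sin(), fed the same float input.
		double dErr = std::fabs( std::sin( double( fX ) ) - double( sin_approx( fX ) ) );
		if ( dErr > dMax ) { dMax = dErr; }
		dSum += dErr;
	}
	std::printf( "Max: %.33f Avg: %.33f\n", dMax, dSum / _iSamples );
}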


In terms of performance, they are exactly the same (which I had to measure on my own machine rather than online, for obvious reasons).
Important points:
#1: His function as posted in his paper does not handle wrapping around PI.
#2: His function does not handle up/down curves for each odd PI interval.

Naturally, his function was only posted as an example of Horner’s scheme, but if one took it directly and only tested from -PI to PI, one would not realize that it cannot go beyond that range.

Once you have added the code necessary to handle these cases, you get what I posted here, which performs exactly the same as my versions.

Neither accuracy nor performance is affected by the order of the constants—SinF_LS and SinF_LS2 produce exactly the same results in exactly the same amount of time.


L. Spiro


[EDIT]
Changing my test to check every single float value between -PI and PI gives me the following:
Sin_LS: Max: 0.000000546977237822487971641294280 Avg: 0.000000002610098455852473062433230 Max Val: 1.000000000000000000000000000000000
Sin_G:  Max: 0.000000879148802446346952499389715 Avg: 0.000000002789327931710123957033398 Max Val: 1.000000000000000000000000000000000


This was measured by taking every single one of the 2,157,060,024 valid 32-bit floats from -PI to PI, which means that these are the “objective truth” results.

Important notes:
#1: The average error dropped significantly. This is because the samples are not distributed evenly: they were taken by stepping through each successive floating-point value in order, which produces more samples near 0, where both functions are more accurate.
#2: The max error dropped slightly for both. This is because we pass only 32-bit float values to std::sin() (cast up to double), whereas previous tests passed double-precision values, which were cast to float for our approximations.


Point #2 is important because it is the case you will find in games. In the real world, we are not going to generate inputs at double-precision and then throw them into a single-precision sin().


Due to the nature of this test, the max error here is the reliable result. This test is immune to sample bias, and my function produces significantly better results.
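For reference, such an exhaustive sweep can be written by stepping through every representable float with std::nextafterf(). This is an illustrative sketch, not my actual test code; sin_approx() again stands in for the function under test:

#include <cmath>
#include <cstdio>

float sin_approx( float _fX ) { return std::sin( _fX ); }  // Stand-in for the function under test.

int main() {
	const float fPi = 3.14159265358979323846f;
	double dMax = 0.0, dSum = 0.0;
	unsigned long long ullCount = 0ULL;
	// Visit every representable 32-bit float in [-PI, PI], in order.
	for ( float fX = -fPi; fX <= fPi; fX = std::nextafterf( fX, 4.0f ) ) {
		double dErr = std::fabs( std::sin( double( fX ) ) - double( sin_approx( fX ) ) );
		if ( dErr > dMax ) { dMax = dErr; }
		dSum += dErr;
		++ullCount;
	}
	std::printf( "%llu floats, Max: %.33f Avg: %.33f\n", ullCount, dMax, dSum / double( ullCount ) );
}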
[/EDIT]




#5309425 Faster Sin and Cos

Posted by L. Spiro on 04 September 2016 - 02:45 PM

My tests have always shown my numbers to be more accurate, so I would like to run your test on my machine.
But your link is broken.


L. Spiro


#5309341 Faster Sin and Cos

Posted by L. Spiro on 03 September 2016 - 05:59 PM

I tested the branchless paths some time ago, because branchless should intuitively be the way to go.
They didn’t give the performance I desired on consoles; I didn’t investigate deeply why, because I was on the clock.

It will be one of my upcoming investigations. For now I am focusing on accuracy, and next I will focus on performance.


L. Spiro



