Why is math transformation taxing to most CPUs?

12 comments, last by tanzanite7 9 years, 4 months ago


Done poorly, a game can still overload the CPU with badly-written math operations.

Badly-written math operations, as in "poorly optimized math code"? Could you write an example showcasing a badly-written math operation and a well-written one? I'd definitely like to learn more.

I think what he is getting at is that if you don't take advantage of parallel operations, and if you take the naïve approach (i.e. math ops without any refactoring), then you get terrible performance.
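To make that contrast concrete, here is a minimal sketch, assuming column-major 4x4 matrices and SSE intrinsics (the function names are made up for this example, not taken from anyone's actual code):

#include <xmmintrin.h>  // SSE intrinsics (SSE1 is enough here)

// Naive version: sixteen scalar multiply-adds per vertex, nothing kept
// in registers between iterations.
void transform_naive(const float m[16], const float* in, float* out, int count)
{
    for (int i = 0; i < count; ++i) {
        const float* v = in + i * 4;
        for (int r = 0; r < 4; ++r)
            out[i * 4 + r] = m[r] * v[0] + m[4 + r] * v[1]
                           + m[8 + r] * v[2] + m[12 + r] * v[3];
    }
}

// Refactored version: the four matrix columns stay in SSE registers, and
// each vertex costs four multiply-adds over whole 4-float lanes.
void transform_sse(const float m[16], const float* in, float* out, int count)
{
    const __m128 c0 = _mm_loadu_ps(m + 0);
    const __m128 c1 = _mm_loadu_ps(m + 4);
    const __m128 c2 = _mm_loadu_ps(m + 8);
    const __m128 c3 = _mm_loadu_ps(m + 12);
    for (int i = 0; i < count; ++i) {
        const float* v = in + i * 4;
        __m128 r = _mm_mul_ps(c0, _mm_set1_ps(v[0]));
        r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_set1_ps(v[1])));
        r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_set1_ps(v[2])));
        r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_set1_ps(v[3])));
        _mm_storeu_ps(out + i * 4, r);
    }
}

The refactored loop is the kind of thing meant by "taking advantage of parallel operations": the matrix lives in registers and every instruction does four lanes of work at once.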

However, I would like to take a different stance than the others on this topic. I think the terms provided by the author are probably appropriate - taxing to a CPU can mean a lot of different things, and just like all things in computer graphics, it depends on the scene you are processing. If you have a simple scene, a CPU rasterizer can easily keep a high frame rate on modern CPUs. If you want to push the limits of current technology, then CPUs are not the processor of choice for graphics - you would obviously go for GPUs.

So the author's blanket statement (that the CPU is always heavily taxed by transformations) is incorrect, because it depends on the scene being rendered and the ops being executed. If you take a look at the latest WARP devices in D3D11, you can find some screaming-fast software-based rasterizers that work just fine for many situations.
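For reference, opting into WARP is a one-argument change at device creation, using the documented D3D11CreateDevice call. A minimal sketch (the helper name is made up):

#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

// Create a WARP (software rasterizer) device instead of a hardware one.
HRESULT create_warp_device(ID3D11Device** device, ID3D11DeviceContext** context)
{
    D3D_FEATURE_LEVEL level;
    return D3D11CreateDevice(
        nullptr,               // default adapter
        D3D_DRIVER_TYPE_WARP,  // software rasterizer instead of the GPU
        nullptr, 0,            // no software DLL, no creation flags
        nullptr, 0,            // accept the default feature-level list
        D3D11_SDK_VERSION,
        device, &level, context);
}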


[...] and threads (2-8× as many operations per clock, if being unrealistically ideal) [...]

I nearly blew a fuse reading that. Please tell me that's a typo, and not how you think threading improves performance.

RIP GameDev.net: launched 2 unusably-broken forum engines in as many years, and now has ceased operating as a forum at all, happy to remain naught but an advertising platform with an attached social media presence, headed by a staff who by their own admission have no idea what their userbase wants or expects. Here's to the good times; shame they exist in the past.

[...] and threads (2-8× as many operations per clock, if being unrealistically ideal) [...]

I nearly blew a fuse reading that. Please tell me that's a typo, and not how you think threading improves performance.

In general no, but if all we're talking about is vertex transforms or another "embarrassingly parallel" problem, then yes. You could very well write a software T&L engine and simply replicate it across any and all cores not already consumed with other duties, and achieve essentially linear speedup on vertex transformations, limited only by available memory bandwidth. The same properties that make this problem suitable for the massive parallelism of GPUs make this equally possible on CPUs. This is more or less what GPUs do, except that they're massively scaled up (and of course they have other optimizations appropriate for their problem domain).
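A minimal sketch of that replication, assuming the transform_sse routine from the earlier example and splitting the vertex range evenly across hardware threads (the helper name is hypothetical):

#include <algorithm>
#include <thread>
#include <vector>

// Replicate the single-threaded transform across all available hardware
// threads. Each worker owns a disjoint slice of the vertex arrays, so no
// synchronization is needed - the "embarrassingly parallel" shape.
// transform_sse is the routine sketched earlier in the thread.
void transform_parallel(const float m[16], const float* in, float* out, int count)
{
    const unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    const int chunk = (count + (int)workers - 1) / (int)workers;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < workers; ++t) {
        const int begin = (int)t * chunk;
        const int end   = std::min(count, begin + chunk);
        if (begin >= end) break;
        pool.emplace_back([=] {
            transform_sse(m, in + begin * 4, out + begin * 4, end - begin);
        });
    }
    for (std::thread& th : pool)
        th.join();  // scales near-linearly until memory bandwidth saturates
}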

throw table_exception("(╯°□°)╯︵ ┻━┻");

[...] and threads (2-8× as many operations per clock, if being unrealistically ideal) [...]

I nearly blew a fuse reading that. Please tell me that's a typo, and not how you think threading improves performance.

"if being unrealistically ideal".

His example was describing the upper bound, and is correct as such. Whether it is realistic is irrelevant in that context.

edit: or did you get the impression he is not talking about hardware threads (CPU cores, plus HT where available ... typically 2-8)?
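For what it's worth, that hardware-thread count is exactly what the standard library reports; a trivial check:

#include <iostream>
#include <thread>

int main()
{
    // Logical hardware threads: physical cores x HT, i.e. the
    // "typically 2-8" figure being discussed.
    std::cout << std::thread::hardware_concurrency() << " hardware threads\n";
}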

This topic is closed to new replies.
