flops on the GBA

4 comments, last by pcxmac 18 years ago
I am coding a game for the Game Boy Advance and have noticed that my physics is far slower than it should be. After stripping out float multiplications and generally optimizing where possible, I've seen only marginal improvement. At that point I realized I was essentially only doing a bunch of float additions per physics update (nothing too strenuous, and definitely not an algorithm issue), so I ran a few profiling tests of float additions vs. integer additions. These were my results:

1000 integer additions on the GBA: 269 ticks
1000 float additions on the GBA: 17975 ticks

This ratio holds for any number of additions: float addition is 66 to 67 times slower than integer addition. This means I should probably change my physics system to use only integers. I shudder to think how float multiplication performs on this machine.

Basically, I am wondering whether such a large difference is typical on modern CPUs, or whether this is reason to believe that floating point operations are emulated on the GBA. I couldn't find any definitive answers about the nature of the GBA's arithmetic hardware on Google.
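A comparison of this kind could be set up roughly as in the sketch below. This is an assumption about the shape of such a test, not the code that produced the numbers above: the function names are made up for illustration, and `clock()` is only a stand-in tick source (on the GBA itself you would read one of the hardware timers instead).

```c
#include <stdint.h>
#include <time.h>

#define N_ADDS 1000

/* volatile stops the compiler from folding the loop into a constant. */
static inline int32_t add_ints(int32_t step) {
    volatile int32_t acc = 0;
    for (int i = 0; i < N_ADDS; i++) {
        acc += step;
    }
    return acc;
}

static inline float add_floats(float step) {
    volatile float acc = 0.0f;
    for (int i = 0; i < N_ADDS; i++) {
        acc += step;
    }
    return acc;
}

/* Ticks spent on one run of each loop. clock() is a placeholder here;
   swap in whatever timer the target actually provides. */
static inline clock_t ticks_for_ints(void) {
    clock_t start = clock();
    (void)add_ints(3);
    return clock() - start;
}

static inline clock_t ticks_for_floats(void) {
    clock_t start = clock();
    (void)add_floats(0.5f);
    return clock() - start;
}
```

Comparing `ticks_for_floats()` against `ticks_for_ints()` over the same number of additions gives the kind of ratio quoted above.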
That sounds pretty certainly emulated to me.

You do know how to implement fixed point arithmetic, yes? :)

(Then again, just what kind of game are you implementing for the GBA that you *need* FP math/physics anyway? From what I've seen, the platform seems to emphasize tile-and-HUD-sprite based stuff...)
Floating point is emulated in software on the GBA's CPU. Fixed point is the way to go here.
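For anyone who hasn't used fixed point before, here is a minimal sketch of a 16.16 format in C. The type and function names are made up for illustration, and the choice of 16 fractional bits is just an assumption; pick whatever split suits your value ranges.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits hold the integer part,
   lower 16 bits hold the fraction. */
typedef int32_t fixed;

#define FIX_SHIFT 16
#define FIX_ONE   ((fixed)1 << FIX_SHIFT)

/* Multiply instead of shifting so negative inputs stay well defined. */
static inline fixed fix_from_int(int32_t n) {
    return (fixed)(n * FIX_ONE);
}

/* Right shift of a negative value is implementation-defined in C;
   GCC uses an arithmetic shift, so this rounds toward negative infinity. */
static inline int32_t fix_to_int(fixed x) {
    return x >> FIX_SHIFT;
}

/* Addition and subtraction are plain integer operations. */
static inline fixed fix_add(fixed a, fixed b) {
    return a + b;
}

/* Widen to 64 bits so the intermediate product cannot overflow. */
static inline fixed fix_mul(fixed a, fixed b) {
    return (fixed)(((int64_t)a * b) >> FIX_SHIFT);
}

/* Pre-scale the dividend so the quotient keeps its fractional bits. */
static inline fixed fix_div(fixed a, fixed b) {
    assert(b != 0);
    return (fixed)(((int64_t)a * FIX_ONE) / b);
}
```

Adds and subtracts cost the same as ordinary integer ops, which is the whole point. Division is still worth avoiding in inner loops, since the ARM7TDMI has no hardware divide instruction.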
"Fast" floating point operations are relatively recent for some of us - there was a time when PCs didn't even have hardware float operations. They were an optional extra added via an x87 coprocessor (a separate chip that plugged into a dedicated socket on the motherboard) until the 486DX integrated them into the same chip. The following chip, the Pentium, made the combination standard.

The 486DX, which started at 25 MHz, still had poor floating point performance. It wasn't until the Pentium that the modern "fast" floating point operations you're used to started to appear (which is why the float-intensive Quake 1 was much faster on a 66 MHz Pentium than on a 100 MHz 486DX4). The pre-Pentium days were filled with fixed point math in games, not just because some cheap fools had an SX or no coprocessor, but because even with an x87 some floating point operations could take more than 100 cycles.

One of the big reasons for the speed increase in floating point operations, besides integration into one chip, was transistor count - the 486DX had a little over 1 million, while the Pentium jumped to over 3 million. However, more transistors add cost, power consumption and heat. The Pentium introduced the *need* for proper heatsinks and fans, unlike the 486 and previous chips (I've literally booted and run cobbled-together 486s, in the summer, without remembering to attach the tiny heatsink, and there was no problem).

Now, taking all of that into consideration, look at what is in the GBA. You've got a custom 16 MHz ARM7 processor - a processor designed for integrated components, with low cost, heat and power usage as 3 out of the 4 main goals (the 4th goal is often integrating device support like COM ports and LCDs directly into the chip, so that the device doesn't need support chips).

Devices that use the ARM7 and similar processors don't need to do lots of floating point operations - when you need heavy floating point lifting, as in an MP3 player, you use a DSP chip (digital signal processor) instead. DSP chips excel at providing floating point calculations at low prices, but have pipelines that are much deeper. The ARM7 has a 3-stage pipeline, while most embedded DSP chips have a 5-10 stage pipeline. The cost of a longer pipeline is that branching hurts more: DSP chips are good for making lots of calculations, but not for making decisions. They also tend to be slightly more expensive.

So to answer your question:

-The GBA's CPU is by no means modern in terms of power - it's modern in terms of being extremely economical in both factory production costs and portable power usage. By PC standards the GBA is powered by a low-end 486SX with some custom operations and some very optimized assembly code for realtime per-scanline rendering (this method is very fast and memory efficient - the NES could render at 256x224 with sprites and scrolling backgrounds 60 times a second on a 1.79 MHz processor with 2 KB of video memory).

-AFAIK (I have no GBA dev kit, and if I did I would be under NDA so I couldn't tell you of any secret instructions) the ARM7 variant used in the GBA does not add any floating point operations (it would be hard to do anyway since it would probably double the transistor count). Typically when someone wants to do fast floating point operations with an ARM7, they pair it with an ARM VFP co-processor.
Thank you all for your wonderful explanations. With my suspicions affirmed, I'll work on implementing a fixed point physics system.

If you're wondering what type of game this is, it is a fast-paced space shooter puzzle game, so there are tons of objects moving about the screen at once. It seems I made a mistake in assuming float support, but I think I can do everything I need with integers.
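For lots of moving objects, an integer-only update could look something like the sketch below. It is only an illustration under assumed choices: positions and velocities are stored with 8 fractional bits (256 sub-pixel steps per pixel), and the `Body` struct and function names are hypothetical.

```c
#include <stdint.h>

/* 24.8 fixed point: 256 sub-pixel steps per screen pixel. */
#define SUBPIXEL_BITS 8

typedef struct {
    int32_t x, y;    /* position in sub-pixel units */
    int32_t vx, vy;  /* velocity in sub-pixel units per frame */
} Body;

/* One physics step is just two integer additions per object. */
static inline void body_step(Body *b) {
    b->x += b->vx;
    b->y += b->vy;
}

/* Drop the fractional bits to get the pixel to draw at. This assumes an
   arithmetic right shift for negative values, which GCC provides. */
static inline int32_t body_screen_x(const Body *b) {
    return b->x >> SUBPIXEL_BITS;
}

static inline int32_t body_screen_y(const Body *b) {
    return b->y >> SUBPIXEL_BITS;
}
```

A velocity of 64 then means a quarter of a pixel per frame, so slow drifting objects still move smoothly even though everything is stored as integers.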
Michalson, your Saturn pic is off the hook, kudos. Saturn is my all-time favorite platform, and the last one I really liked.

-Daytona USA
-PDS 1 & 2
-WSB 98
-Sega Rally
-NiGHTS, of course
-Marvel vs. Capcom (Japan import w/ cartridge)
-VF 2
- the list goes on and on. good stuff


OMG, woohoo.

Ah the memories, great happiness, thanks a bunch for that image / commercial.

