DISCLAIMER: this is a massive post and I probably made several computational errors along the way - bear with me.
----
I don't think Muzzy wants a floating-point version, but rather an arbitrary-precision integer arithmetic version. Well, let's see - how are numbers stored when we write them down in everyday life? Take 25 - that's a 2, representing the number of "tens", and a 5, representing the number of "units". That's nice, but how do computers do it? Surely they don't use decimal (base 10) notation, do they? No, they don't - they use binary notation, which is more convenient for them to process. In binary, each digit represents whether a specific power of two is a "part" of the number being represented. So basically, binary notation represents an arbitrary number as a sum of powers of two! For 25 we get:

25 = 16 + 8 + 1 = 2^4 + 2^3 + 2^0
To make it more explicit, we could write it as this:

25 = 1×2^0 + 0×2^1 + 0×2^2 + 1×2^3 + 1×2^4 + 0×2^5 + 0×2^6 + ...
Clearly when viewed in this way, the exponents will be the same regardless of the number being represented, so we can remove them as they are implied by binary notation. So we are left with the string "10011000....". By convention, this string is actually written backwards, so "....00011001". And the leading zeroes can be removed (just as in decimal notation), so we are left with "11001". This is binary for 25!
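To double-check the decomposition above, here is a quick Python sketch (just repeated division by 2 - Python is only used here as a calculator):

```python
# Build the binary digits of 25, least significant first,
# by repeatedly taking the remainder modulo 2.
n = 25
digits = ""
while n > 0:
    digits += str(n % 2)  # coefficient of the next power of two
    n //= 2
print(digits)        # least significant digit first: 10011
print(digits[::-1])  # conventional order: 11001
```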
You will notice a similarity with decimal notation. Take the 25 example again - we can write it as:

25 = 5×10^0 + 2×10^1 + 0×10^2 + 0×10^3 + ...
As in binary, the exponents are implied by decimal notation, so we are left with "5200...", written backwards "...0025", and with leading zeroes removed we get "25", which is base 10 notation for 25 (obviously this sounds weird - humans assume base 10 notation when they talk, so they don't usually state it explicitly). This pattern works for any integer base you could think of, like base 17 or base 3441 - except base 0 (which is meaningless) and base 1 (in which positional notation can't work, for obvious reasons). It's also possible to have non-integer (real) bases, but I won't mention those as they are not relevant to this question.
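The change-of-base procedure itself is the same for every base, and it's a short loop to sketch in Python (the function name `to_digits` is mine, not from any library):

```python
def to_digits(n, base):
    """Return the digits of n in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % base)  # least significant digit first
        n //= base
    return digits[::-1]          # reverse into conventional order

print(to_digits(25, 2))   # [1, 1, 0, 0, 1]
print(to_digits(25, 10))  # [2, 5]
print(to_digits(25, 17))  # [1, 8]
```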
Now how does this relate to the issue at hand? We could just keep adding higher and higher exponents to reach higher and higher numbers, right? True, but CPU registers have a practical limit on the number of bits they can handle simultaneously for operations such as addition. This is 32 and 64 bits respectively for a 32-bit and 64-bit processor. So they can't directly do math with bigger numbers using the native operators available.
So clearly plain binary in a register is a no-go for big numbers. But hey, this means numbers are represented, inside the CPU, as an array of bits, right? And we know that in software we can make arrays of bytes, of words (16-bit), of longwords (32-bit integers), or even of quadwords (64-bit) if the CPU supports it. Well, what's keeping us from representing numbers in base 256 (for bytes), or base 65536 (for words), in software? The math is sound, and we could implement our own addition/multiplication/etc... functions, right?
Well, let's try it with bytes. Represent, say, 441 in base 256:

441 = 1×256 + 185 = 1×256^1 + 185×256^0
Which gives the base 256 notation for 441 of "1:185" (I use colons to delimit the "digits" here since there are 256 possible values per digit, more than we have base 10 digits available - note this is a logistical problem rather than a mathematical one).
Of course, this sucks - 441 could be represented natively by any CPU. So let's try a larger number, say 4984994854993448178545902948759435848758439853487834939846735869. I'll spare you the working (there is an efficient algorithm to change bases - I'll let you look it up), but the base 256 notation for this massive integer works out to a string of 27 "digits", each between 0 and 255.
This is more manageable - each digit can be individually handled by a CPU, and addition with digit carry can be done efficiently. Similarly, subtraction, multiplication, and all other operations can be done. But this could be made better - a byte is quite wasteful considering CPUs can work with 32 or 64 bits at a time. What does it give in base 2^32 instead of base 2^8 (base 256)? We get a number of just 7 digits, whose last three (the only ones we'll need below) are:

... : 339451501 : 1015093636 : 2234169341
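If you want to verify a conversion like this yourself, Python's built-in integers are already arbitrary precision, which makes them a handy sanity-checker - this is just the same repeated-division trick, with base 2^32:

```python
N = 4984994854993448178545902948759435848758439853487834939846735869

# Split N into base 2^32 "digits" (limbs), most significant first.
limbs = []
n = N
while n > 0:
    limbs.append(n % 2**32)  # least significant limb first
    n //= 2**32
limbs.reverse()              # most significant first, as written above
print(":".join(str(d) for d in limbs))
```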
Much better. Now let's take another huge number and add the two together, as an example. Say 193849494855949383389292884 is our second number. Its form in base 2^32 is:

10508602 : 742607992 : 2470062420
Now the least significant digit is the last one, so that's where we want to start to do addition (remember primary school addition?). So let's take the last digit of both numbers, and add them together:
2234169341 + 2470062420 = 4704231761
Now remember primary school addition again: if you add, say, 7 and 8 together, you get 15, so you keep 5 and carry 1, right? Same here - if the result is bigger than the maximum digit allowed (which is 2^32 - 1 = 4294967295), we keep the remainder as the resulting digit and carry 1. Our result was 4704231761, which is bigger, so we remember to carry 1 on the next digit addition, and use the remainder as the final digit, which is 4704231761 - 4294967296 = 409264465.
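One digit-addition step, as done above, can be sketched like this (Python integers don't wrap, so the carry check is an explicit comparison here):

```python
BASE = 2**32

def add_digit(a, b, carry_in):
    """Add two base-2^32 digits plus an incoming carry.

    Returns (result_digit, carry_out)."""
    s = a + b + carry_in
    if s >= BASE:
        return s - BASE, 1  # keep the remainder, carry 1
    return s, 0

print(add_digit(2234169341, 2470062420, 0))  # (409264465, 1)
```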
Ok, next digits are 1015093636 and 742607992. We add them together, and we don't forget the carry:
1015093636 + 742607992 + 1 = 1757701629
This is smaller than our maximum digit 4294967295, so we just use that as the final digit and we have no carry for the next digit.
Next up are 339451501 and 10508602. Add them up, with no carry:
339451501 + 10508602 + 0 = 349960103
Smaller than the maximum digit, so use that as the final digit. Now we have no digits left in the second number, and we have no carry left, so we can just copy the remaining digits of the first number (you could do it manually by padding the second number with leading zero digits and working through it, if you wanted). So the sum is the first number with its three least significant digits replaced, ending in:

... : 349960103 : 1757701629 : 409264465
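Putting the whole procedure together - conversion plus digit-by-digit addition with carry - might look like this in Python (a sketch; real libraries store limbs least significant first in fixed-width machine words, but the logic is the same):

```python
BASE = 2**32

def to_limbs(n):
    """Base 2^32 digits of n, most significant first."""
    limbs = []
    while n > 0:
        limbs.append(n % BASE)
        n //= BASE
    return limbs[::-1] or [0]

def add_limbs(a, b):
    """Schoolbook addition of two limb arrays, with carry propagation."""
    length = max(len(a), len(b))
    a = [0] * (length - len(a)) + a  # pad with leading zero digits
    b = [0] * (length - len(b)) + b
    result, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):  # least significant first
        s = x + y + carry
        carry = 1 if s >= BASE else 0
        result.append(s - BASE if carry else s)
    if carry:
        result.append(1)  # final carry becomes a new leading digit
    return result[::-1]

x = 4984994854993448178545902948759435848758439853487834939846735869
y = 193849494855949383389292884
print(add_limbs(to_limbs(x), to_limbs(y)))
```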
This is the correct sum, and you can convert it back to base 10 to display it for a human to read if you wish.
That's basically it. That's how GMP, Java's BigInteger, or any other such library works. They define arithmetic in a base the CPU is comfortable with (the best choice is almost always the CPU's native word size, e.g. 32 or 64 bits for 32-bit and 64-bit CPUs respectively), and do math on each "digit" in that base individually (which the CPU can do). All imaginable operations (addition, division, multiplication, modulus, integer square root, etc...) can be implemented this way, for numbers whose sizes are bounded only by the available memory in the system (needed to represent them) instead of by the CPU register size.
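As an example of one of those other operations, here is a sketch of schoolbook multiplication on the same kind of limb array (quadratic in the number of digits, which is why the big libraries switch to smarter algorithms for very large numbers; limbs are stored least significant first here because it makes the inner loop indexing simpler):

```python
BASE = 2**32

def mul_limbs(a, b):
    """Schoolbook multiplication of two limb arrays (least significant first)."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            t = result[i + j] + x * y + carry
            result[i + j] = t % BASE  # digit stays below the base
            carry = t // BASE         # excess moves up to the next digit
        result[i + len(b)] = carry
    while len(result) > 1 and result[-1] == 0:
        result.pop()  # strip leading zero limbs
    return result

# e.g. (2^32 - 1) * (2^32 - 1) = [1, 4294967294], i.e. 1 + 4294967294 * 2^32
```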
Of course the point is to make it work FAST, since now the CPU will handle individual digits instead of the whole number in a single shot, so performance will suffer as numbers get bigger and bigger. You can look up the GMP documentation for more information - I think they call the digits "limbs".
I hope this made sense
PS: in the addition example, if your CPU was 32-bit, it would seem that you couldn't tell whether the sum of two individual digits exceeded the maximum digit 2^32 - 1, since the sum naturally wraps around because of the CPU's limited register size. How do you handle this situation? There is a trivial solution in assembly, which just involves checking the CPU's carry flag - if it is set, the operation wrapped around, and you get the remainder for free too. There is also a comparison-based solution (the wrapped sum is smaller than either operand exactly when a carry occurred), but it only works for addition/subtraction, not for multiplication.
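Simulating a 32-bit register in Python (by masking to 32 bits), the comparison-based trick looks like this - if the truncated sum comes out smaller than one of the operands, the addition must have wrapped around:

```python
MASK = 2**32 - 1  # simulate a 32-bit register by truncating to 32 bits

def add_with_carry(a, b):
    """32-bit addition detecting overflow by comparison: if the wrapped
    result is smaller than an operand, a carry out of bit 31 occurred."""
    s = (a + b) & MASK  # what the CPU register would actually hold
    carry = 1 if s < a else 0
    return s, carry

print(add_with_carry(2234169341, 2470062420))  # (409264465, 1)
print(add_with_carry(1015093636, 742607992))   # (1757701628, 0)
```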
Edited by Bacterius, 09 June 2012 - 04:59 AM.