implementing an arbitrary size integer class

Started by
21 comments, last by Melekor 14 years, 8 months ago
I'm working on an integer class (C++) that can contain numbers of arbitrary size, ie. the programmer should be able to use numbers of infinite length, limited only by memory. I am aware of the existance of many classes which already handle this sort of stuff, but I am doing this as a programming exercise so those are pretty useless to me ;) Obviously I need to define the behaviour for all of the basic operators such as +, -, *, /, % and =, as well as the behaviour for equality operators (+=, -=, *=, /=, %=, ==), and finally conversion operators, etc... I will begin with operator+ as it's behaviour is relatively simple. My question is, what is the most efficient way to do this? Ideally I could store the number within the class as a vector of words and perform all my calculation in binary. Is there a better way? Are there any optimizations I can make when calculating in binary (like some sort of tricky binary bitshift magic?) I'm not too concerned with the other operators for the time-being - it would be nice to work on them one at a time and get them optimised before moving onto the rest. It's also likely that I'll have to come up with some sort of square root function... is there any way to calculate the square root of such large numbers, or would I have to rely on a lookup table? (virtually impossible for me to do)
Advertisement
Why roll your own bignum class? Take a look at GMP: http://gmplib.org/
Like I said, I'm doing it as a programming exercise. For project Euler actually.

Using an existing bigInt library would be cheating, and It would be nice to know that I'm capable of implementing such a thing myself ;)
http://en.wikipedia.org/wiki/Methods_of_computing_square_roots

Correct me if I'm wrong, but are you trying to optimize your code before you have written it? ...and you are writing it to make sure you can? Personally I would use the largest 10-based number below uint-max and use that as a base.
not really trying to optimise it, so much as making it easier to write... The algorithm for adding base10 numbers seems simple enough when you're adding numbers on paper, but actually implementing it is what escapes me. I'll keep at it.
1) You don't need a big int class to solve the linked problem. There is a much, much easier way to do it. (Apologies if you already know that.)

2) To answer your more direct question, I would implement a software ALU. While I have not done it myself, you could consider keeping an array of 64-bit numbers. You'll have to do the carrying manually (i.e. add the low order pair of 64-bit numbers together.. if you detect an overflow manually add 1 to the next higher order 64-bit numbers. It may help to look up some very basic computer engineering stuff to see how an ALU works so that you can imitate it.. you'd use the hardware though to do 64 bits at a time.. certainly not bit-by-bit :)

Detecting the overflow is easy.. C = A + B. If C < A, you overflowed. Addition is the easy operation though. I'd suggest taking a look at how Melekor's linked code works.

3) Wikipedia has a page specifically about how to find square roots, but using Newton's Method is just dead simple. Your equation would be 0 = x^2 - Y where x is your answer and Y is the number you're taking the square root of. I'm sure an algorithm at http://en.wikipedia.org/wiki/Methods_of_computing_square_roots is much faster though (particularly the one using digit by digit work on binary numbers.)

BTW, don't try to think in base 10. Think in base 2 or 16. A + 9 = 14. That's an overflow to the next digit (the 1) and you have 4 left for the ones column. The next column is the 16s column, the next is the 256s column, etc. F00 is 15*16*16 + 0*16 + 0*1 = 3840
The problem at projecteuler doesn't need explicit bignums, you can just use the high-school number adding algorithm to calculate one digit at a time. Since there are 100 numbers, and you work in base 10, the sum of a single column easily fits in 32 bits, and similarly since there are only 50 columns, carries won't get too large.

char * numbers[] ={"37107287533902102798797998220837590246510135740250","46376937677490009712648124896970078050417018260538","74324986199524741059474233309513058123726617309629","91942213363574161572522430563301811072406154908250","23067588207539346171171980310421047513778063246676","89261670696623633820136378418383684178734361726757","28112879812849979408065481931592621691275889832738","44274228917432520321923589422876796487670272189318","47451445736001306439091167216856844588711603153276","70386486105843025439939619828917593665686757934951","62176457141856560629502157223196586755079324193331","64906352462741904929101432445813822663347944758178","92575867718337217661963751590579239728245598838407","58203565325359399008402633568948830189458628227828","80181199384826282014278194139940567587151170094390","35398664372827112653829987240784473053190104293586","86515506006295864861532075273371959191420517255829","71693888707715466499115593487603532921714970056938","54370070576826684624621495650076471787294438377604","53282654108756828443191190634694037855217779295145","36123272525000296071075082563815656710885258350721","45876576172410976447339110607218265236877223636045","17423706905851860660448207621209813287860733969412","81142660418086830619328460811191061556940512689692","51934325451728388641918047049293215058642563049483","62467221648435076201727918039944693004732956340691","15732444386908125794514089057706229429197107928209","55037687525678773091862540744969844508330393682126","18336384825330154686196124348767681297534375946515","80386287592878490201521685554828717201219257766954","78182833757993103614740356856449095527097864797581","16726320100436897842553539920931837441497806860984","48403098129077791799088218795327364475675590848030","87086987551392711854517078544161852424320693150332","59959406895756536782107074926966537676326235447210","69793950679652694742597709739166693763042633987085","41052684708299085211399427365734116182760315001271","65378607361501080857009149939512557028198746004375","35829035317434717326932123578154982629742552737307","94953759765105305946966067683156574377167401875275","88902802571733229619176668713819931811048770190271","25267680276078003013678680992525463401061632866526","36270218540497705585629946580636237993140746255962","24074486908231174977792365466257246923322810917141","91430288197103288597806669760892938638285025333403","34413065578016127815921815005561868836468420090470","23053081172816430487623791969842487255036638784583","11487696932154902810424020138335124462181441773470","63783299490636259666498587618221225225512486764533","67720186971698544312419572409913959008952310058822","95548255300263520781532296796249481641953868218774","76085327132285723110424803456124867697064507995236","37774242535411291684276865538926205024910326572967","23701913275725675285653248258265463092207058596522","29798860272258331913126375147341994889534765745501","18495701454879288984856827726077713721403798879715","38298203783031473527721580348144513491373226651381","34829543829199918180278916522431027392251122869539","40957953066405232632538044100059654939159879593635","29746152185502371307642255121183693803580388584903","41698116222072977186158236678424689157993532961922","62467957194401269043877107275048102390895523597457","23189706772547915061505504953922979530901129967519","86188088225875314529584099251203829009407770775672","11306739708304724483816533873502340845647058077308","82959174767140363198008187129011875491310547126581","97623331044818386269515456334926366572897563400500","42846280183517070527831839425882145521227251250327","55121603546981200581762165212827652751691296897789","32238195734329339946437501907836945765883352399886","75506164965184775180738168837861091527357929701337","62177842752192623401942399639168044983993173312731","32924185707147349566916674687634660915035914677504","99518671430235219628894890102423325116913619626622","73267460800591547471830798392868535206946944540724","76841822524674417161514036427982273348055556214818","97142617910342598647204516893989422179826088076852","87783646182799346313767754307809363333018982642090","10848802521674670883215120185883543223812876952786","71329612474782464538636993009049310363619763878039","62184073572399794223406235393808339651327408011116","66627891981488087797941876876144230030984490851411","60661826293682836764744779239180335110989069790714","85786944089552990653640447425576083659976645795096","66024396409905389607120198219976047599490197230297","64913982680032973156037120041377903785566085089252","16730939319872750275468906903707539413042652315011","94809377245048795150954100921645863754710598436791","78639167021187492431995700641917969777599028300699","15368713711936614952811305876380278410754449733078","40789923115535562561142322423255033685442488917353","44889911501440648020369068063960672322193204149535","41503128880339536053299340368006977710650566631954","81234880673210146739058568557934581403627822703280","82616570773948327592232845941706525094512325230608","22918802058777319719839450180888072429661980811197","77158542502016545090413245809786882778948721859617","72107838435069186155435662884062257473692284509516","20849603980134001723930671666823555245252804609722","53503534226472524250874054075591789781264330331690"};#include <vector>#include <cstdio>int digit(int i, int j) {	return numbers[50-j-1] - '0';}void main(){	std::vector<int> sum_digits;	int r = 0, q = 0;	for(int j = 0; j < 50 || q > 0; ++j) {		// pretend that we have as many extra columns of all zeros as needed		// to flush out all the leftover bits in our accumulator.		int column_sum = 0;		for(int i = 0; j < 50 && i < 100; ++i) {			column_sum += digit(i,j);		}		r = (column_sum+q)%10;		q = (column_sum+q)/10;		sum_digits.push_back(r);	}	printf("The first 10 digits are: ");	for(int i = 0; i < 10; ++i) {		printf("%d", sum_digits.back());		sum_digits.pop_back();	}	printf("\n");}
Quote:Original post by Melekor
The problem at projecteuler doesn't need explicit bignums, you can just use the high-school number adding algorithm to calculate one digit at a time. Since there are 100 numbers, and you work in base 10, the sum of a single column easily fits in 32 bits, and similarly since there are only 50 columns, carries won't get too large.

*** Source Snippet Removed ***

2 ^ 32 = 4294967296...

the largest datatype supported in visual studio is an unsigned long long or an __int64; 2 ^ 64 = 18446744073709551616. That's a far cry from 50 digits. What makes you think 32 bits is enough for 50 digits?

edit: nevermind, I misread your post. You meant that the sum of a column would fit in 32 bits, not the entire sum :)
Lookign at the problem it's only asking for the first ten digits (ie: the most significant?). So if you do some sort of controlled overflow where you add in the next column and then divide out the least significant digits, you should (might?) be able to handle the sum with a supported data type.
[size=2]Darwinbots - [size=2]Artificial life simulation
I have solved that problem using haskell that support big integers. Others have done the same thing. You can probably do it yourself if you like but I think there are others more interesting problems and the use of a library like GMP is acceptable.

This topic is closed to new replies.

Advertisement