I've been implementing my own big integer class in C++. I've been trying to implement the Karatsuba algorithmas an alternate method of doing multiplication once the bit length reaches a certain threshold. My current implementation seems to get the first 3 or 4 elements correct, but after that the values are incorrect.

I've tried reproducing the algorithm using C# and .Net's BigInteger, and I've been getting the same results. So, I believe that my actual math functions are correct (Multiply, Add, Subtract, LeftShift), I'm guessing I'm doing something incorrectly in the algorithm itself. I'm hoping some other sets of eyes might see something I missed.

My big integer class stores binary values in vectors of unsigned integers, the lowest order bits at the beginning of the array. All values are stored in the positive, along with a signed bit. Since the signs don't actually affect multiplication, the function is not concerned about them.

typedef std::vector<std::uint32_t> storage_t inline static void AlternateMultiplication(const storage_t& Lhs, const storage_t& Rhs, storage_t& Product) { /*Get the size of each input*/ const auto LhsSize(Lhs.size()); const auto RhsSize(Rhs.size()); /*If one of the inputs is a single element long, use regular multiplication*/ if((LhsSize <= 1) || (RhsSize <= 1)) { /* Multiplication ignoring the signs. First two parameters are inputs by const reference, 3rd parameter is the output by reference */ AbsoluteMultiplication(Lhs, Rhs, Product); return; } /*Find the length of the longest input*/ const auto LongestSize(std::max(LhsSize, RhsSize)); /*Find half of the longest input. The +1 helps it round up*/ const auto HalfLongestSize((LongestSize + 1) / 2); /*Hold over from messing with the algorithm at the moment*/ const auto HalfLongestSizeSafe(HalfLongestSize); /*Vector's to copy the split values*/ storage_t LhsHigh(HalfLongestSizeSafe); storage_t LhsLow(HalfLongestSizeSafe); storage_t RhsHigh(HalfLongestSizeSafe); storage_t RhsLow(HalfLongestSizeSafe); /*Iterators marking where the splits will occur*/ const auto LhsBegin(std::begin(Lhs)); const auto LhsEnd(std::end(Lhs)); const auto LhsMid(LhsSize >= HalfLongestSize ? LhsBegin + HalfLongestSize : LhsEnd); const auto RhsBegin(std::begin(Rhs)); const auto RhsEnd(std::end(Rhs)); const auto RhsMid(RhsSize >= HalfLongestSize ? RhsBegin + HalfLongestSize : RhsEnd); /*Split the inputs*/ storage_t LhsLow(LhsBegin, LhsMid); Truncate(LhsLow); storage_t LhsHigh(LhsMid, LhsEnd); Truncate(LhsHigh); storage_t RhsLow(RhsBegin, RhsMid); Truncate(RhsLow); storage_t RhsHigh(RhsMid, RhsEnd); Truncate(RhsHigh); storage_t Z0; AlternateMultiplication(LhsLow, RhsLow, Z0); storage_t Z2; AlternateMultiplication(LhsHigh, RhsHigh, Z2); storage_t Z1; AbsoluteAddition(LhsHigh, LhsLow, LhsHigh); AbsoluteAddition(RhsHigh, RhsLow, RhsHigh); AlternateMultiplication(LhsHigh, RhsHigh, Z1); /*Just to be safe, tracking the sign of Z1, since subtraction is involved*/ bool Z1Sign(true); Subtraction(Z1Sign, Z1, true, Z2, Z1Sign, Z1); Subtraction(Z1Sign, Z1, true, Z0, Z1Sign, Z1); /*Left shift doesn't affect the sign. Convert elements length of number of bits to shift*/ AbsoluteLeftShift(Z1, (HalfLongestSize * (sizeof(T) * CHAR_BIT)), Z1); bool Z2Sign(true); AbsoluteLeftShift(Z2, (LongestSize * (sizeof(T) * CHAR_BIT)), Z2); Addition(Z2Sign, Z2, Z1Sign, Z1, Z2Sign, Z2); bool ProductSign(true); Addition(Z2Sign, Z2, true, Z0, ProductSign, Product); }

In the pseudocode on Wikipedia, the example is in base 10, where my code is in binary. If I had to guess, the issue is in the left shifting or that I have to keep breaking down a single element down into a single bytes or smaller, since this is base 2 math.

storage_t Input = {0x2fad3fd6,0xb14e9f81,0x00000006}; storage_t Output; AlternateMultiplication(Input, Input, Output);

Output - 0xeb2706e4,0x1d2f3c5b,0x7627584b,0xca7d4ae8,0x00000008

Correct - 0xeb2706e4,0x1d2f3c5b,0x7627584b,0xca7d4ac4,0x0000002c

As you can see, the first 3 elements are correct, as well as all but the highest order byte in the 4th. This is similar if the inputs get bigger. If the inputs are smaller, the answer comes out correct.

I'm trying to avoid posting the entire big integer class, since it is massive. Any insight or suggestions would be helpful.