Sign in to follow this  
Wavesonics

SIMD for Modulus?

Recommended Posts

I'm looking at doing modulus on a large data set, and I'm wondering if using some SIMD could help here. Does anyone know some good tutorials on SIMD for GCC? Google hasn't been much help so far, mostly mailing lists...

Share this post


Link to post
Share on other sites
I'm assuming you're talking about SIMD on x86 through the MMX/SSE instruction sets.

You can't calculate integer modulus directly with MMX.
I guess you could use a floating-point division and calculate the remainder manually, though it almost certainly won't be worth bothering to use SSE-2 doubles if you need more than 25-bit precision.
If the divisor is (mostly) constant you could use reciprocal multiplication instead. Calculating the proper scale and fudge factors takes some work but you could borrow some code from the Hacker's Delight site.

Share this post


Link to post
Share on other sites
Thanks for the great response!

I definitely need the precision, I'm using unsigned 64bit Ints to get as large a number as possible, I'm calculating the modulus to checking primes, so the divisor will be completely constant, so I'll take a look at Hacker's Delight and hopefully it'll help, thanks again!



Ah, looks like the look up tables that his method requires become in inordinately large for large divisors like I will have :/

Share this post


Link to post
Share on other sites
Quote:
Original post by Wavesonics
Ah, looks like the look up tables that his method requires become in inordinately large for large divisors like I will have :/
That's only one of the possible methods. He actually covers a particularly neat method for handling exactly this problem, testing for zero remainders after division by a constant that is.

Anyway here's some code you might use without buying the book:
bool is_divisible(unsigned dividend, unsigned divisor) {
unsigned inverse;
unsigned limit;
unsigned temp;

// The divisor must be odd. Shouldn't be a problem for prime testing applications
assert(divisor % 2);

// Work out the multiplicative inverse using Newton's method
inverse = divisor;

do {
temp = 2 - divisor * inverse;
inverse *= temp;
} while(temp != 1);

// Calculate the direct fixed-point inverse as well
limit = (unsigned) -1 / divisor;

// Do the actual testing.
// Can be repeated arbitrarily many times on the same constants
return (dividend * inverse <= limit);
}
To be honest I'm not entirely sure why this works myself..

Unfortunately x86 has no 64-bit multiplication instructions so you'll have to combine four 32-bit operations to make it work, well, either that or use an x64 system.
Of course the only way to perform 64-bit division in hardware on 32-bit systems is through 80-bit long doubles, and making those work properly is messy unless you're writing assembly code.

Share this post


Link to post
Share on other sites
Perfect! I had not read it thoroughly enough!

Luckily I am doing this all on x86_64 (AMD Phenom) + Ubuntu 9.04 Server 64bit, so the 32bit issue won't be a problem.

I'll give this a try right away!

*edit*
About to try this, but thinking now, I wounder if some of the instruction set extensions my specific CPU supports would help here: MMX, SSE, SSE2, SSE3, SSE4a,3DNOW! Professional

Also, as an aside, for my processor, does anyone know if any of the processor specific GCC flags apply?

AMD Phenom X4 9350e Agena 2.0GHz

[Edited by - Wavesonics on June 5, 2009 4:04:21 PM]

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this