SIMD for Modulus?

Started by
3 comments, last by Wavesonics 14 years, 10 months ago
I'm looking at doing modulus on a large data set, and I'm wondering if using some SIMD could help here. Does anyone know some good tutorials on SIMD for GCC? Google hasn't been much help so far, mostly mailing lists...
==============================
A Developers Blog | Dark Rock Studios - My Site
Advertisement
I'm assuming you're talking about SIMD on x86 through the MMX/SSE instruction sets.

You can't calculate integer modulus directly with MMX.
I guess you could use a floating-point division and calculate the remainder manually, though it almost certainly won't be worth bothering to use SSE-2 doubles if you need more than 25-bit precision.
If the divisor is (mostly) constant you could use reciprocal multiplication instead. Calculating the proper scale and fudge factors takes some work but you could borrow some code from the Hacker's Delight site.
Thanks for the great response!

I definitely need the precision, I'm using unsigned 64bit Ints to get as large a number as possible, I'm calculating the modulus to checking primes, so the divisor will be completely constant, so I'll take a look at Hacker's Delight and hopefully it'll help, thanks again!


Ah, looks like the look up tables that his method requires become in inordinately large for large divisors like I will have :/
==============================
A Developers Blog | Dark Rock Studios - My Site
Quote:Original post by Wavesonics
Ah, looks like the look up tables that his method requires become in inordinately large for large divisors like I will have :/
That's only one of the possible methods. He actually covers a particularly neat method for handling exactly this problem, testing for zero remainders after division by a constant that is.

Anyway here's some code you might use without buying the book:
bool is_divisible(unsigned dividend, unsigned divisor) {	unsigned inverse;	unsigned limit;	unsigned temp;	// The divisor must be odd. Shouldn't be a problem for prime testing applications	assert(divisor % 2);	// Work out the multiplicative inverse using Newton's method	inverse = divisor;	do {		temp = 2 - divisor * inverse;		inverse *= temp;	} while(temp != 1);	// Calculate the direct fixed-point inverse as well	limit = (unsigned) -1 / divisor;	// Do the actual testing.	// Can be repeated arbitrarily many times on the same constants	return (dividend * inverse <= limit);}
To be honest I'm not entirely sure why this works myself..

Unfortunately x86 has no 64-bit multiplication instructions so you'll have to combine four 32-bit operations to make it work, well, either that or use an x64 system.
Of course the only way to perform 64-bit division in hardware on 32-bit systems is through 80-bit long doubles, and making those work properly is messy unless you're writing assembly code.
Perfect! I had not read it thoroughly enough!

Luckily I am doing this all on x86_64 (AMD Phenom) + Ubuntu 9.04 Server 64bit, so the 32bit issue won't be a problem.

I'll give this a try right away!

*edit*
About to try this, but thinking now, I wounder if some of the instruction set extensions my specific CPU supports would help here: MMX, SSE, SSE2, SSE3, SSE4a,3DNOW! Professional

Also, as an aside, for my processor, does anyone know if any of the processor specific GCC flags apply?

AMD Phenom X4 9350e Agena 2.0GHz

[Edited by - Wavesonics on June 5, 2009 4:04:21 PM]
==============================
A Developers Blog | Dark Rock Studios - My Site

This topic is closed to new replies.

Advertisement