# [HLSL] Coping without bitwise operators


## Recommended Posts

Hi, I've got a function which I want to move onto the GPU. Unfortunately, I am using DX9, so I don't have access to SM4.0, and my HLSL therefore has no bitwise operators.

So how can I express bitwise functions in HLSL? For example: int i = val & 15; and if ( i & 1 ) DoStuff;

Thanks for any advice! My binary maths is pretty poor!

Simon

##### Share on other sites
What are you actually trying to do?

##### Share on other sites
It's a simplex noise generator. It uses a lot of bitwise & but no shifting.

##### Share on other sites
Given that there is no integer instruction set either, you're not really dealing with integers despite your int types.

I'd imagine a good compromise is a look-up texture of some kind, especially if point-filtered. A 256x256 texture covers every combination of two 8-bit operands, indexed via 0..1 texture coordinates. Point filtering and texture addressing convert your colour values to integers and do the bitwise comparison all in one go [cool]

if( tex2D( sampLookup, float2( A, B ) ).r > 0.5f ) /* stuff */
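For what it's worth, here is a rough CPU-side sketch in C++ of how such a table could be filled and addressed. It is only an illustration: `build_and_table`, `texel_centre` and `sample_point` are made-up names, the table stores `a & b` in a single 8-bit channel, and `sample_point` is just a stand-in for a point-filtered `tex2D` fetch, not real D3D code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: 256x256 single-channel table, texel (a, b) holds a & b.
// On the GPU this would be uploaded as an 8-bit texture and point sampled.
std::vector<std::uint8_t> build_and_table()
{
    std::vector<std::uint8_t> table(256 * 256);
    for (int b = 0; b < 256; ++b)
        for (int a = 0; a < 256; ++a)
            table[b * 256 + a] = static_cast<std::uint8_t>(a & b);
    return table;
}

// Texture coordinate that hits the centre of texel v (0..255).
// The half-texel offset keeps the lookup away from texel borders.
float texel_centre(int v)
{
    return (static_cast<float>(v) + 0.5f) / 256.0f;
}

// CPU stand-in for a point-filtered lookup with u, v in [0, 1).
std::uint8_t sample_point(const std::vector<std::uint8_t>& table, float u, float v)
{
    int x = static_cast<int>(u * 256.0f);
    int y = static_cast<int>(v * 256.0f);
    assert(x >= 0 && x < 256 && y >= 0 && y < 256);
    return table[y * 256 + x];
}
```

In a shader the fetched channel would come back as value/255 rather than an integer, so it would need scaling by 255 and rounding before use. If only a yes/no answer is needed, the table can store 0 or 255 instead and the shader can compare against 0.5 as in the line above.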

hth
Jack

##### Share on other sites
I've never actually tried this before, but I'm sure there are plenty of ways you can be clever if you really want to emulate bitwise operations.

For instance, take bitwise-AND. If you have the case x & (2^N - 1), such as your first example, the result is just frac(x / 2^N) * 2^N. In other words, you divide by the power of two, take the fractional component of the result, and multiply that by the power of two. Since these are all floating-point operations you might not get exact results, but for the purposes of conditionals and the like it should suffice.

If you have x & 2^N, then you first take floor(x / 2^N) and check whether the result is even or odd. If it's even then the result of the whole operation is 0, otherwise it's 2^N. This can be expressed mathematically as ceil(frac(floor(x / 2^N) / 2)) * 2^N, because the even/odd check frac(floor(x / 2^N) / 2) will be either 0.0 or 0.5, which we can 'ceil' to get 0.0 or 1.0 and then multiply by the original power of two.

And finally, for some other value x & K, just remember that K is a sum of powers of two and repeat the previous test for each one. You can get even more clever here with certain numbers. Take x & 239. Since there is only a single 0 bit to check here (239 = 255 - 16), it makes more sense to do x & 255 and then subtract x & 16 than it does to individually sum x & 1, x & 2, ..., all the way up to x & 128.
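To make the arithmetic above concrete, here is a small C++ sketch that mimics the HLSL intrinsics on the CPU so the results can be compared against the real & operator. The function names (`hlsl_frac`, `and_low_bits`, `and_single_bit`, `and_239`) are invented for illustration, and the single-bit test is written with an explicit floor before the even/odd check. It assumes x is a non-negative whole number small enough for a float to hold exactly.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for the HLSL frac() intrinsic: x - floor(x).
float hlsl_frac(float x) { return x - std::floor(x); }

// x & (2^n - 1): keep the low n bits.
float and_low_bits(float x, int n)
{
    assert(n >= 0 && n < 24);        // floats are exact for integers only up to 2^24
    float p = std::ldexp(1.0f, n);   // p = 2^n
    return hlsl_frac(x / p) * p;
}

// x & 2^n: isolate a single bit.
float and_single_bit(float x, int n)
{
    assert(n >= 0 && n < 23);
    float p = std::ldexp(1.0f, n);                     // p = 2^n
    float half = hlsl_frac(std::floor(x / p) * 0.5f);  // 0.0 if bit clear, 0.5 if set
    return std::ceil(half) * p;
}

// x & 239 (binary 11101111): mask to 8 bits, then knock out bit 4.
float and_239(float x)
{
    return and_low_bits(x, 8) - and_single_bit(x, 4);
}
```

On the CPU these come out exact because dividing by a power of two is exact in floating point. On shader hardware the division may be less precise, so adding a small bias before the floor or ceil is probably a sensible precaution.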

Along those same lines you can probably devise tests for OR, XOR, NOT, NOR, or whatever else you need. Just remember at all times to be careful when checking the results and computing values, since they're all floating-point computations, and you should be fine.
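As one possible way of getting the other operators, here is a C++ sketch (again a CPU stand-in for the shader maths, with made-up names like `and8` and `xor8`) that handles two variable 8-bit operands. It builds AND one bit at a time from floor, multiply and subtract, then derives the rest from the identities a | b = a + b - (a & b), a ^ b = a + b - 2(a & b) and ~a = 255 - a for 8-bit values.

```cpp
#include <cassert>
#include <cmath>

// a & b for two variable operands in 0..255, one bit per iteration.
float and8(float a, float b)
{
    assert(a >= 0.0f && a < 256.0f && b >= 0.0f && b < 256.0f);
    float result = 0.0f;
    float weight = 1.0f;                                  // 2^i for the current bit
    for (int i = 0; i < 8; ++i) {
        float bit_a = a - 2.0f * std::floor(a * 0.5f);    // lowest bit of a (0 or 1)
        float bit_b = b - 2.0f * std::floor(b * 0.5f);    // lowest bit of b (0 or 1)
        result += weight * bit_a * bit_b;                 // product of two bits is their AND
        a = std::floor(a * 0.5f);                         // "shift right" by one
        b = std::floor(b * 0.5f);
        weight *= 2.0f;
    }
    return result;
}

// The remaining operators follow from AND by plain arithmetic.
float or8(float a, float b)  { return a + b - and8(a, b); }
float xor8(float a, float b) { return a + b - 2.0f * and8(a, b); }
float not8(float a)          { return 255.0f - a; }
float nor8(float a, float b) { return not8(or8(a, b)); }
```

The fixed eight-iteration loop is something a shader compiler could unroll, but that is a lot of instructions per operation, so the constant-mask tricks above are likely cheaper whenever one operand is known in advance.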

##### Share on other sites
Jack,

Nice approach, but DX9 can't do texture lookups in the vertex shader, right?

Zipster, I will ponder this further. Looks interesting.

Before I embark on such a mission, does anyone have any theory about how much performance I could expect using a GPU instead of CPU to manipulate a Vertex Buffer each frame?

Ball park guess?

Factor of 2? 10?

Thanks

Si

##### Share on other sites
Quote:
 Original post by sipickles: Nice approach, but DX9 can't do texture lookups in the vertex shader, right?
VS_3_0 can on Nvidia hardware, but otherwise no. You never mentioned VS, though [razz]

Quote:
 Original post by sipickles: Before I embark on such a mission, does anyone have any theory about how much performance I could expect using a GPU instead of CPU to manipulate a Vertex Buffer each frame?
But unless you're using R2VB how are you expecting to store the data? Or are you just offloading it all so that you're re-computing static results every time it's rendered?

I guess you're doing some sort of noise-function based perturbation of vertex data. This sounds like a good parallelisable task, so throw a bounded buffer and OpenMP based solution at it and you should be fine for it on the CPU.

Jack

##### Share on other sites
Quote:
 This sounds like a good parallelisable task, so throw a bounded buffer and OpenMP based solution at it and you should be fine for it on the CPU.

Care to elaborate?!

##### Share on other sites
Damn, I thought I could pull off that whole sounding clever thing...

The exact MP mechanics will vary depending on how you've got your app set up, but you can conceptualise the deformation of a vertex buffer by a noise function as a "task". Usually this'll be very nice as you've got separate inputs and you're not going to have to deal with synchronization or locking or nasty stuff with sharing writable memory with other threads.

OpenMP ships as part of VS'05 and is pretty easy to use (although I'm no expert) so you can set up a bounded buffer and have each 'task' running in the thread pool, crunching away as fast as it can. Obviously performance scales for dual and quad core CPUs. These CPU based tasks are the producers, and the idea is that the GPU is the consumer and a simple lock-copy-unlock operation sends the generated data up to be rendered.
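A minimal sketch of the per-vertex part in C++ might look like the following. It only shows the parallel loop, not the bounded buffer or the lock-copy-unlock upload, and everything named here (`Vertex`, `placeholder_noise`, `deform`) is hypothetical. The sine/cosine function is just a placeholder standing in for a real noise function.

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical placeholder for the noise function. Any pure function of
// position and time would work the same way here.
float placeholder_noise(float x, float z, float time)
{
    return 0.5f * std::sin(x * 1.3f + time) * std::cos(z * 0.7f + time);
}

// Each iteration writes only its own vertex, so no locking is needed.
void deform(std::vector<Vertex>& vertices, float time)
{
    // OpenMP 2.0 (the version in VS 2005) wants a signed loop index.
    assert(vertices.size() <= static_cast<std::size_t>(INT_MAX));
    const int count = static_cast<int>(vertices.size());

    #pragma omp parallel for
    for (int i = 0; i < count; ++i) {
        vertices[i].y = placeholder_noise(vertices[i].x, vertices[i].z, time);
    }
}
```

Built with OpenMP enabled (/openmp in MSVC, -fopenmp in GCC) the loop is split across the available cores. Without the switch the pragma is ignored and the same code simply runs serially, which makes it easy to compare the two.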

You also have more control over how often the geometry is updated - you may want to render faster than it needs changing (e.g. I had a noise-based water renderer that only updated at 10 Hz despite rendering as fast as it could).

hth
Jack

##### Share on other sites
Wow, I was not aware of OpenMP. It looks fantastic.

My problem is, as an indie developer, I haven't got £1000 to throw at MSVC2005 Pro, so am running Standard. No OpenMP :(

Strange that there is an option in Properties>Configuration>C++>Language to enable OpenMP support if MSVC Standard doesn't support it.

Microsoft give you hope then they take it away! :)

----

EDIT: Hmm, I even have vcomp.dll in C:\Program Files\Microsoft Visual Studio 8\VC\redist\x86\Microsoft.VC80.OPENMP
