Efficient array shuffling in pure HLSL

Björn Näf · 2014-10-31T15:24:33

Hi all! I am looking for an efficient way to shuffle an array in plain HLSL (i.e., create a random sequence where every index is used exactly once). I've learned so far that the Fisher–Yates shuffle (Knuth shuffle) algorithm would do the job, but I haven't found any HLSL implementations of it. So I tried implementing the algorithm myself, but the naive approach of porting code from a different language like C++ to HLSL produces rather slow results. Any ideas for a really fast way to achieve this?
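For reference, here is a minimal sketch of the Fisher–Yates shuffle mentioned above, written in C++ rather than HLSL so it can be compiled and checked on the CPU. The names `xorshift32` and `shuffledIndices`, and the choice of xorshift as the random source, are assumptions made for illustration; they are not taken from any existing HLSL implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: one xorshift32 step. It uses only shifts and XORs,
// which also exist for uint in HLSL.
inline uint32_t xorshift32(uint32_t& state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Fisher-Yates (Knuth) shuffle of the indices 0..n-1.
// Every index appears exactly once in the result.
inline std::vector<uint32_t> shuffledIndices(uint32_t n, uint32_t seed) {
    assert(seed != 0);  // xorshift32 stays at zero forever if seeded with zero
    std::vector<uint32_t> a(n);
    for (uint32_t i = 0; i < n; ++i) a[i] = i;

    uint32_t state = seed;
    for (uint32_t i = n; i > 1; --i) {
        // Pick j in [0, i). The slight modulo bias is ignored in this sketch.
        uint32_t j = xorshift32(state) % i;
        std::swap(a[i - 1], a[j]);
    }
    return a;
}
```

Each swap depends on the random state left behind by the previous one, so the loop is inherently sequential. That is presumably part of why a direct port is slow when every shader invocation has to repeat the whole loop.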


Started by Meltac October 29, 2014 10:37 AM

15 comments, last by Meltac 9 years, 5 months ago

Meltac

508

Author

October 30, 2014 03:25 PM

> Yes, but it still doesn't explain why you can't do this math once on the CPU, generate a texture (once), and then just bind it in your post-process shader...

HOLY COW! Why do people here always question why something is the way it is? Can't you just accept that THIS IS NOT POSSIBLE IN MY PARTICULAR CASE !???

Even if you might not believe or not be able to imagine it - there ARE developers that do NOT have the privilege to build their own engine / host application or build effects onto some open environment! I just can't do it on the CPU and can't generate, bind, or access any textures in my case! And I don't need to explain here why! It's just the way it is.

Sorry for reacting so upset, but I'm really getting absolutely sick and tired of repeating myself all the time!

JohnnyCode

1,084

October 30, 2014 10:36 PM

You need a deterministic function over at least 1 million values that returns enough noise across them (since you have a determined index for every pixel).

A goniometric (trigonometric) function might come pretty close to this requirement: you pick a period size and a way to perturb the y values over a sufficiently long domain.

If you reduce to 100000 values and pick the domain 0.0-100.0, then x would be the product of the index and the constant 1/100.0. You may then decide to perturb 10 periods of the goniometric function with a polynomial of 10th degree (20 multiplications) - this randomizing polynomial is predefined and does not change.

You can pick the period size and the ratio of polynomial size to period size, which lets you scale the noise and the size of the repeating pattern.

There are more ways to achieve noise. You may also vary the polynomial's definition (its constants) based on the index (still deterministic), or you can experiment with the closest prime number to the index.
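As a rough illustration of trigonometric noise, here is a C++ sketch of a one-liner that is commonly seen in shader code. It is a swapped-in technique in the same spirit, not the polynomial scheme described above; the names `fract` and `trigNoise` are hypothetical, and the constants are conventional rather than derived.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Fractional part, analogous to HLSL's frac().
inline double fract(double v) { return v - std::floor(v); }

// Hypothetical trig-based noise: a deterministic value in [0, 1) per index.
// 12.9898 and 43758.5453 are the constants usually seen in shader snippets.
inline double trigNoise(uint32_t index) {
    double x = static_cast<double>(index) * 12.9898;
    double r = fract(std::sin(x) * 43758.5453);
    assert(r >= 0.0 && r < 1.0);  // sanity check on the output range
    return r;
}
```

The large multiplier pushes the smooth sine curve far enough apart that only the fast-changing low-order digits survive the `fract`. Values from a function like this can repeat, so on its own it gives noise, not a shuffle where every index is used exactly once.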

MattSutherlin

1,210

October 31, 2014 12:14 AM

> HOLY COW! Why do people here always question why something is the way it is? Can't you just accept that THIS IS NOT POSSIBLE IN MY PARTICULAR CASE !???

I get your frustration, but at the same time, people are not questioning the specifics of your situation just to be difficult. It's possible that a better solution could be arrived at through an entirely different process, possibly one you didn't even consider to be plausible or know to exist. Maybe not, but why needlessly limit yourself and the quality and/or quantity of potential answers you could get?

That said, you indicated that your C++ -> HLSL port was running slow. Is it otherwise running correctly? If so, it might be a question of optimization rather than a new algorithm. Posting the HLSL you have might help you get more concrete answers to speed it up. If not, I don't directly have a good answer for you. I'm sorry! What you're asking for is somewhat difficult since you're not going to have a good time trying to store state between pixels or between frames without some kind of help from the CPU. The best I can do right now is to give you a few links that I think are tangential to your problem but might help you come up with a workable idea.

Nathan Reed talks about PRNG on the GPU, and how a hashing function can be helpful there.

Alan Wolfe talks about creating a random shuffle operator.

Like I said, neither of those links is going to give you exactly what you want. They both provide part of the solution but have limitations that might prove unworkable for you given the limitations you mention. Hopefully they can spark an idea for you, but at the least, I think that they're both interesting reads.
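The linked pages are not reproduced here. As an illustration of the kind of integer hash often used for random numbers on the GPU, here is a C++ sketch of the widely circulated Wang hash. The helper names `hashToUnit` and `randomBelow` are hypothetical additions for this example.

```cpp
#include <cassert>
#include <cstdint>

// Wang hash: mixes the bits of a 32-bit integer using only XOR, shifts and
// multiplications, all of which map directly to HLSL uint operations.
inline uint32_t wangHash(uint32_t seed) {
    seed = (seed ^ 61u) ^ (seed >> 16);
    seed *= 9u;
    seed = seed ^ (seed >> 4);
    seed *= 0x27d4eb2du;
    seed = seed ^ (seed >> 15);
    return seed;
}

// Hypothetical helper: map the hash to a double in [0, 1).
inline double hashToUnit(uint32_t index) {
    return wangHash(index) * (1.0 / 4294967296.0);
}

// Hypothetical helper: map the hash to [0, n). Different indices can
// collide here, so this is noise, not a shuffle.
inline uint32_t randomBelow(uint32_t index, uint32_t n) {
    assert(n > 0);
    return wangHash(index) % n;
}
```

Because the hash depends only on its input, each pixel can evaluate it independently, with no state stored between pixels or frames.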

Good luck!

TheChubu

9,484

October 31, 2014 01:21 AM

> I get your frustration, but at the same time, people are not questioning the specifics of your situation just to be difficult. It's possible that a better solution could be arrived at through an entirely different process, possibly one you didn't even consider to be plausible or know to exist. Maybe not, but why needlessly limit yourself and the quality and/or quantity of potential answers you could get?

That.

Sometimes, when we're deep inside a problem, we miss the forest for the trees. It's not that we doubt your skills, OP, it's just that a fresh look at the problem might yield not an implementation of XYZ idea, but a different ZYX idea altogether.

Now, if you don't like ZYX idea and still want to do it the XYZ way, that's fine too. Just try to consider other ideas first, they might save you lots of time.

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

Meltac

508

Author

October 31, 2014 10:16 AM

> You need a deterministic function over at least 1 million values that returns enough noise across them (since you have a determined index for every pixel).

Thanks. I don't have to do it for every pixel. That was, as I said, a simplification I had to make to avoid needing to tell a whole-day story here. Actually I pixelate ("downsample" in some sense) the screen to some extent, say 48 x 48 blocks. Would you then still suggest going the way of goniometric functions, or is there a simpler approach?

> The best I can do right now is to give you a few links that I think are tangential to your problem but might help you come up with a workable idea.

> Nathan Reed talks about PRNG on the GPU, and how a hashing function can be helpful there.

> Alan Wolfe talks about creating a random shuffle operator.

I've already found those pages myself, but thanks anyway.

> Sometimes, when we're deep inside a problem, we miss the forest for the trees. It's not that we doubt your skills, OP, it's just that a fresh look at the problem might yield not an implementation of XYZ idea, but a different ZYX idea altogether.

> Now, if you don't like ZYX idea and still want to do it the XYZ way, that's fine too. Just try to consider other ideas first, they might save you lots of time.

I'm sorry but you're still off-topic, as *repeatedly* questioning me not being able to do it CPU-wise is off-topic. Don't get me wrong here, it's completely ok to ask if I couldn't do it on the CPU side - ONCE. But sticking to that and asking the same thing over and over is just annoying and neither helpful nor constructive.

JohnnyCode

1,084

October 31, 2014 12:28 PM

> Actually I pixelate ("downsample" in some sense) the screen to some extent, say 48 x 48 blocks. Would you then still suggest going the way of goniometric functions, or is there a simpler approach?

You can scale the goniometric function and the noise multiplier however it suits you to noise your area. For noising a 1000x1000 screen it would do quite a job. My understanding was that you map each [x,y] pixel to an index i, where every i differs for each unique [x,y] vector. If you run a 2D function instead, it is an even more suitable solution.

And if some of the very reasonable other suggestions here are not possible for you to carry out, you should politely say so.

Meltac

508

Author

October 31, 2014 03:24 PM

Hash! I need a hash, as simple as that! I should have come to that conclusion myself in the first place.

> And if some of the very reasonable other suggestions here are not possible for you to carry out, you should politely say so.

I think I have explained myself well enough to make clear why I reacted the way I did. It wasn't just some "other suggestion" that made me mad but the fact that some people insist on such an "other suggestion" even after I have stated clearly that it is not an option in my case.

Nonetheless, your hint on goniometric functions, even though not exactly what I was looking for, has led me to the solution - a simple and stupid hash.

Thanks again.
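The thread does not show the hash that was finally used. As one possible way a hash can produce a shuffle where every index is used exactly once, without storing any state, here is a C++ sketch that combines an invertible bit mixer with "cycle walking". The names `mixBits` and `permuteIndex`, the `key` parameter, and the mixing constants are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical invertible mixer on k-bit values (mask = 2^k - 1).
// XOR with a key, multiplication by an odd constant modulo 2^k, and
// x ^= x >> s are each reversible, so the whole function is a
// permutation of [0, 2^k).
inline uint32_t mixBits(uint32_t x, uint32_t mask, uint32_t key) {
    x ^= key & mask;
    x = (x * 0x27d4eb2du) & mask;
    x ^= x >> 3;
    x = (x * 0x9E3779B1u) & mask;
    x ^= x >> 2;
    return x;
}

// Map index i in [0, n) to a unique shuffled index in [0, n).
inline uint32_t permuteIndex(uint32_t i, uint32_t n, uint32_t key) {
    assert(n > 0 && i < n);
    // Smallest mask of the form 2^k - 1 that covers n - 1.
    uint32_t mask = n - 1;
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    // Cycle walking: re-apply the mixer until the value lands back in
    // [0, n). This terminates because i itself lies on the cycle.
    uint32_t x = i;
    do {
        x = mixBits(x, mask, key);
    } while (x >= n);
    return x;
}
```

Each index can be evaluated on its own, so a pixel or block only needs its own index and the key. Since n is more than half of the power-of-two range, the loop is expected to run fewer than two iterations on average, though the mixing quality of this particular sketch has not been evaluated.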

This topic is closed to new replies.
