The goal is the following:
I need to shoot rays from my camera into the scene every frame. At the intersection points I create virtual cameras, which are then used to render the scene again (or only a subset of the objects). Creating a virtual camera is analogous to creating a viewProjection matrix.
The current approach is:
I render the scene from my camera and output the world positions and the normals via multiple render targets to two different textures. So I have two textures on the GPU that contain the world positions and normals of the scene as seen from the camera. To create my viewProjection matrices I only need the position and the normal from those textures (I always assume the up-vector to be vec3(0, 1, 0)). At the moment I download these two textures, create the matrices on the client side, and then send them back to the server side for further rendering.
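For reference, building one such matrix from a fetched position/normal pair could look roughly like the sketch below (using GLM; the projection parameters are placeholders, and the guard against a normal parallel to the up-vector is my own addition, not part of the original setup):

```cpp
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Build a viewProjection matrix for a virtual camera placed at `position`
// and looking along `normal`, with a fixed world up-vector of (0, 1, 0).
glm::mat4 makeVirtualViewProjection(const glm::vec3& position,
                                    const glm::vec3& normal)
{
    glm::vec3 up(0.0f, 1.0f, 0.0f);

    // lookAt degenerates when the view direction is (nearly) parallel to up,
    // so fall back to another axis in that case (assumption on my part).
    if (glm::abs(glm::dot(glm::normalize(normal), up)) > 0.999f)
        up = glm::vec3(1.0f, 0.0f, 0.0f);

    glm::mat4 view = glm::lookAt(position, position + normal, up);

    // Placeholder projection: 90 degree fov, square aspect, near/far planes.
    glm::mat4 projection =
        glm::perspective(glm::radians(90.0f), 1.0f, 0.1f, 100.0f);

    return projection * view;
}
```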
Models with 200K vertices are by no means absurd. My GPU runs fine at 40-60 fps in a scene with about 1 million vertices, and the frame rate goes even higher when not all vertices are inside the camera frustum.