Computing Matrices on the GPU


The goal is the following:

I need to shoot rays from my camera into the scene every frame. At the intersection points I create virtual cameras, which are then used to render the scene again (or only a subset of objects). Creating a virtual camera is analogous to creating a viewProjection matrix.

The current approach is:

I render the scene from my camera and, using multiple render targets, output the world positions and the normals to two different textures. So I have two textures on the GPU containing the world positions and normals of the scene as seen from the camera. To create my viewProjection matrices I only need the position and the normal from those textures (I always assume the up vector to be vec3(0, 1, 0)). At the moment I download these two textures, create the matrices on the client (CPU) side, and send them back to the server (GPU) side for further rendering.
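For reference, the matrix construction itself is just a standard right-handed lookAt built from the hit position and normal. A minimal GLSL sketch, assuming the virtual camera looks along the normal and up is fixed at vec3(0, 1, 0) as described (the function name is illustrative):

```glsl
// Build a right-handed view matrix for a virtual camera sitting at
// worldPos and looking along the surface normal. Up is assumed (0, 1, 0),
// so this degenerates when the normal is (anti)parallel to the up vector.
mat4 viewFromHit(vec3 worldPos, vec3 normal)
{
    vec3 f = normalize(normal);                          // forward
    vec3 s = normalize(cross(f, vec3(0.0, 1.0, 0.0)));   // right
    vec3 u = cross(s, f);                                // true up
    // Column-major mat4 constructor: each vec4 is one column.
    return mat4(
        vec4( s.x,  u.x, -f.x, 0.0),
        vec4( s.y,  u.y, -f.y, 0.0),
        vec4( s.z,  u.z, -f.z, 0.0),
        vec4(-dot(s, worldPos), -dot(u, worldPos), dot(f, worldPos), 1.0));
}
```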

Models with 200K vertices are by no means absurd. My GPU runs fine at 40-60 fps in a scene with about one million vertices, and the frame rate goes even higher when not all vertices are inside the camera frustum.


What technique are you implementing? It sounds like you render the entire scene from the player's camera, output the world-space positions and normals, and then pick 100 of those positions at random (or with some other camera-selection algorithm). I can't think of what that would be useful for unless you are trying to do some kind of reflection or GI, but rendering from 100 cameras would be pretty expensive unless they render to very small textures.

You could bind the two textures and run a shader that outputs your matrices to four float textures. Or, as suggested already, it will probably be faster to sample the normal/position for each vertex and recompute the matrix over and over again.
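A minimal sketch of that first option, assuming the matrix-baking pass runs a full-screen fragment shader over the position/normal textures and writes one matrix column per render target (the texture and output names are illustrative, and viewFromHit is the construction sketched earlier):

```glsl
#version 330 core
// Read the hit position and normal, then emit the four columns of the
// view matrix to four RGBA32F render targets. Assumes viewFromHit()
// from the earlier sketch is pasted above main().
uniform sampler2D uWorldPos;
uniform sampler2D uNormal;

in vec2 vUV;

layout(location = 0) out vec4 outCol0;
layout(location = 1) out vec4 outCol1;
layout(location = 2) out vec4 outCol2;
layout(location = 3) out vec4 outCol3;

void main()
{
    vec3 p = texture(uWorldPos, vUV).xyz;
    vec3 n = texture(uNormal,  vUV).xyz;
    mat4 view = viewFromHit(p, n);
    outCol0 = view[0];
    outCol1 = view[1];
    outCol2 = view[2];
    outCol3 = view[3];
}
```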

You could also move all of this to the CPU and do the raycasting there. Any physics library will work, and Intel has a ray-tracing library (Embree) as well. If I'm right that you are casting the rays from the player's perspective, can't you cache some of the values between frames, since the player sees mostly the same thing from one frame to the next? That would be a bonus of CPU-side raycasting.

NBA2K, Madden, Maneater, Killing Floor, Sims http://www.pawlowskipinball.com/pinballeternal

Just calculate it in the vertex shader to start with, then profile to see where your bottleneck is. If the vertex shader calculations turn out to be the problem, then you can start thinking about how to optimize them.
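A minimal sketch of that per-vertex version, assuming GLSL 330 and that a uniform (here called uCameraTexel, an illustrative name) tells the shader which texel of the position/normal textures holds the current virtual camera:

```glsl
#version 330 core
// Rebuild the view matrix from the position/normal textures on every
// vertex instead of baking it out in a separate pass. Reuses the
// viewFromHit() construction sketched earlier in the thread.
uniform sampler2D uWorldPos;
uniform sampler2D uNormal;
uniform ivec2 uCameraTexel;   // which texel holds this virtual camera
uniform mat4 uProjection;

layout(location = 0) in vec3 aPosition;

void main()
{
    vec3 p = texelFetch(uWorldPos, uCameraTexel, 0).xyz;
    vec3 n = texelFetch(uNormal,  uCameraTexel, 0).xyz;
    mat4 view = viewFromHit(p, n);
    gl_Position = uProjection * view * vec4(aPosition, 1.0);
}
```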

"But wouldn't this mean two texture accesses (normal and position texture) plus matrix construction per vertex thread? "

Yes. But those reads will probably end up living in cache, and you're saving an entire render-pass setup sequence -- it's one less program binding and execution, along with all the GPU management overhead that entails.

It's also simpler, so it gets you running and (as kalle_h says) profiling faster.

GPUs are insanely fast -- particularly vertex programmes. It's really quite hard to saturate the vertex side before you're saturating the pixel side.

And you don't need code to run as fast as possible -- it only needs to be fast enough. Fixating on being the besty-best-bestest-best-best instead of "good enough for what you need" is what kills amateur projects.

