
Posts I've Made

In Topic: Computing Matrices on the GPU

09 June 2016 - 07:47 AM

The goal is the following:

I need to shoot rays from my camera into the scene every frame. At the intersection points I create virtual cameras, which are then used to render the scene again (or only a subset of the objects). Creating a virtual camera amounts to creating a viewProjection matrix.

The current approach is:

I render the scene from my camera and, using multiple render targets, write the world positions and the normals to two different textures. So I have two textures on the GPU which contain the world positions and normals of the scene as seen from the camera. To create my viewProjection matrices I only need the position and the normal from those textures (I always assume the up vector to be vec3(0, 1, 0)). At the moment I download these two textures and create the matrices on the client side, which are then sent back to the server side for further rendering.
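The client-side construction described above — a virtual camera built from a hit position and surface normal, with a fixed up vector — can be sketched roughly as follows. This is an illustrative plain-Python sketch, not the poster's actual code (which presumably lives in C++/GLSL); all names are made up, and note that the construction degenerates when the normal is parallel to the assumed up vector (0, 1, 0):

```python
import math

def normalize(v):
    l = math.sqrt(sum(c * c for c in v))
    return tuple(c / l for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def look_along(eye, direction, up=(0.0, 1.0, 0.0)):
    """Right-handed view matrix looking along `direction` from `eye`.
    For the virtual cameras: eye = hit position, direction = surface normal.
    Degenerates when `direction` is parallel to `up`."""
    f = normalize(direction)        # forward
    s = normalize(cross(f, up))     # right
    u = cross(s, f)                 # recomputed up
    # Row-major 4x4; the translation folds in -eye.
    return [
        [ s[0],  s[1],  s[2], -dot(s, eye)],
        [ u[0],  u[1],  u[2], -dot(u, eye)],
        [-f[0], -f[1], -f[2],  dot(f, eye)],
        [ 0.0,   0.0,   0.0,   1.0],
    ]

def perspective(fovy_rad, aspect, near, far):
    """Standard OpenGL-style perspective projection matrix (row-major)."""
    t = 1.0 / math.tan(fovy_rad / 2.0)
    return [
        [t / aspect, 0.0, 0.0, 0.0],
        [0.0, t, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# viewProjection for one virtual camera:
#   vp = mat_mul(perspective(fov, aspect, near, far),
#                look_along(hit_position, hit_normal))
```

On the CPU this is done once per intersection point after reading back the two textures; the question in this thread is whether the same construction can instead run on the server side, reading position and normal directly from the textures.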


Models with 200K vertices are by no means absurd. My GPU runs fine at 40-60 fps in a scene with about 1 million vertices, and goes even higher when not all vertices are inside the camera frustum.

In Topic: Computing Matrices on the GPU

08 June 2016 - 11:27 AM


I don't have experience doing this on the GPU, but are you currently threading your work on the CPU? 100 objects doesn't seem like much.


I'd suggest that threading isn't even necessary here. This reads a lot like a misguided attempt to save memory by storing just 6 floats per object rather than all 16 of a 4x4 matrix, and in this case it may very well be a more useful optimization to burn the extra memory in exchange for more efficient ALU usage. 100 objects is, quite frankly, chickenfeed: 1996-class hardware could deal with that easily enough.


It's not about saving memory. As I have stated, I need to create MVP matrices based on normals and positions which are stored in two different textures. Doing this on the CPU implies downloading the textures from the server side, extracting the positions and normals, computing the matrices, and then sending them back to the server. I am already doing this, but downloading textures every frame is too much of a performance killer.

In Topic: Computing Matrices on the GPU

08 June 2016 - 06:22 AM

But wouldn't this mean two texture accesses (normal and position texture) plus a matrix construction per vertex thread? I have models with a vertex count of 200,000 and more.

It would surprise me if this were faster than my proposed method. 

In Topic: Computing Matrices on the GPU

07 June 2016 - 05:38 AM

I am using version 4.5. Unfortunately, the proprietary framework I am working with does not support compute shaders yet.

In Topic: Atomic Add for Float Type

26 May 2016 - 04:52 PM

I understand. =) I will stick to the NV extension for now, but try this out later. Thanks! =)