skyfire360

Per-Vertex FP32 Matrix4x4 in Hardware a pipe dream?

This topic is 4881 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hiya! I realize that my issue is somewhat graphics related, but it's the storage requirement issue that I figured you guys in the D3D forums might have run into before.

I'm working on an application that requires a 4x4 matrix to be calculated per-vertex, per-frame. I've been brainstorming ways to deal with this requirement, and I've come up with a few ways to accomplish it, but unfortunately each has its downsides.

Recalculate matrix in vertex shader every frame
+ Really easy to code
+ Single-pass
- Vertex program length unknown
- Slow, as every vertex would be recalculated every frame
- Can't interpolate/blend more than one matrix from surrounding verts

Two pass: one to calculate the matrix and store it into a 2048x2048 FP32 texture, one to render the image
+ Storage for about 260000 vert matrices (2048*2048 texels / 16 floats per matrix)
+ Could recalculate per frame, or on call
+ If no need to recalculate, could skip pass
- Accessing values per texel is inaccurate (fixable with NEAREST filter?)
- Requires 16 texture lookups - only supported on 6800GT/Ultra
- Can't figure out a way to index and/or calculate texture coordinates per-vertex

Has anyone else had to do something like this before? It seems like the 2nd way is much more robust and appears to be used in particle engines, fluid simulations etc., but I can't find any open source examples. Does anyone have any ideas or suggestions of other ways of accomplishing this?

Thanks a bunch!
-Sky
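On the last downside of the two-pass approach (calculating texture coordinates per vertex), here is a minimal C++ sketch of one possible index mapping. It assumes the layout implied by the post: one float per texel, with each matrix occupying 16 consecutive texels within a single row. The names `matrixTexel`, `TexCoord` and the constants are hypothetical, not taken from any API. Sampling at texel centres, as done here, is also what makes NEAREST/point filtering return the stored value unblended.

```cpp
#include <cstdint>

// Assumed layout: single-channel FP32 texture, one float per texel,
// each matrix stored as 16 consecutive texels in one row.
constexpr std::uint32_t kTexSize = 2048;         // texture width and height
constexpr std::uint32_t kFloatsPerMatrix = 16;
constexpr std::uint32_t kMatricesPerRow = kTexSize / kFloatsPerMatrix;  // 128
constexpr std::uint32_t kMaxMatrices = kMatricesPerRow * kTexSize;      // 262144

struct TexCoord {
    float u;
    float v;
};

// Normalized coordinate of the centre of the texel that holds element
// `element` (0..15) of the matrix belonging to vertex `vertexIndex`.
inline TexCoord matrixTexel(std::uint32_t vertexIndex, std::uint32_t element) {
    const std::uint32_t x =
        (vertexIndex % kMatricesPerRow) * kFloatsPerMatrix + element;
    const std::uint32_t y = vertexIndex / kMatricesPerRow;
    // The +0.5 moves from the texel's corner to its centre.
    return { (static_cast<float>(x) + 0.5f) / static_cast<float>(kTexSize),
             (static_cast<float>(y) + 0.5f) / static_cast<float>(kTexSize) };
}
```

The coordinate of element 0 could be precomputed once per vertex and stored as an ordinary vertex attribute; the remaining 15 texels then sit at fixed offsets of 1/2048 along u. Packing four floats into each RGBA texel would be a different layout that needs 4 lookups per matrix instead of 16.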

It's useful to bear in mind that, if you're talking about using shaders here, GPUs have ridiculously powerful arithmetic units when used properly. Consequently, I'd wager that computing the matrix in the shader is probably faster than that huge texture cache thing you're on about.

You might be able to find some sort of hybrid, mind: re-compute in each shader pass, but if any parts of the equations remain static/constant for any length of time, either cache them or factor them out as constants (etc.) to save yourself a few cycles.

hth
Jack

Since your second approach would store the matrices in a texture, you could also calculate them in a shader and render to texture. Would that do you any good? At least you could use the GPU to calculate that stuff, which should really be faster than your CPU for that matter.

As far as the length of the vertex program being unknown goes, the way to resolve that is to prototype it [smile]

If you could tell us more about what this matrix is and how it's calculated, we might be able to get a better idea...

Another thing you could do is to use two streams. One has your regular vertex data, and the other has 4 float4s per vertex. So you recalculate the second stream without touching the first, and feed them both into the vertex pipeline together - giving you instant access to your matrix in the vertex shader without needing to do any texture lookups.
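A rough CPU-side sketch of the two-stream idea, in plain C++ with no graphics API calls. The struct layouts, the function names and the placeholder matrix calculation are all assumptions for illustration, since the thread never says how the real matrix is computed.

```cpp
#include <cstddef>
#include <vector>

// Stream 0: regular per-vertex data, uploaded once (hypothetical layout).
struct BaseVertex {
    float position[3];
    float normal[3];
};

// Stream 1: one 4x4 matrix per vertex, stored as four float4 rows.
struct MatrixVertex {
    float rows[4][4];
};

static_assert(sizeof(MatrixVertex) == 16 * sizeof(float),
              "stream 1 stride is expected to be 64 bytes");

// Placeholder for the undisclosed per-vertex calculation: an identity
// matrix with a translation along the vertex normal, scaled by `time`.
inline MatrixVertex computeMatrix(const BaseVertex& v, float time) {
    MatrixVertex m = {};  // all elements start at zero
    m.rows[0][0] = m.rows[1][1] = m.rows[2][2] = m.rows[3][3] = 1.0f;
    m.rows[3][0] = v.normal[0] * time;
    m.rows[3][1] = v.normal[1] * time;
    m.rows[3][2] = v.normal[2] * time;
    return m;
}

// Rewrites stream 1 only; stream 0 is read but never modified.
inline void updateMatrixStream(const std::vector<BaseVertex>& base,
                               std::vector<MatrixVertex>& matrices,
                               float time) {
    matrices.resize(base.size());
    for (std::size_t i = 0; i < base.size(); ++i) {
        matrices[i] = computeMatrix(base[i], time);
    }
}
```

In Direct3D 9 the two arrays would typically live in separate vertex buffers bound to stream indices 0 and 1, with a single vertex declaration describing both and the four rows exposed to the shader as four float4 elements. At 64 bytes per vertex, the second stream costs 64 * N bytes of upload every time it is recalculated.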

Guest Anonymous Poster
What are you _doing_ with the matrix?

jollyjeffers: Thanks, I'll keep that in mind. I have to find out from my advisor exactly what parameters are necessary to create the matrices. If it's per-frame data, then shoving it into the vertex shader would work well. If it's a sizable amount of per-vertex data, I'll have to figure out some other way of getting the data into the GPU.

matches81: Rendering to a texture is what I'm kind of working on at the moment. However, I read somewhere that fragment shaders are noticeably slower at matrix operations than vertex shaders. Is this the case?

superpig: Interesting! I wasn't aware that you could read multiple float4s from a stream (effectively streaming matrices) on a per-vertex basis. I'll have to look into this...

AP: I wish I could elaborate, but there are two reasons why I can't say much: one, I'm not exactly sure myself, as I'm joining a graduate student research team and I'm the lowly undergrad who has to get up to speed on exactly how the program works; two, their final paper/presentation is scheduled for next spring, so there's an understood NDA attached to it... sorry! About the most I can say is that every vertex needs to be manipulated and deformed independently. But thanks a bunch for the info, it'll really help!
