cg # instructions

Started by
9 comments, last by Bosh 18 years, 7 months ago
According to the Cg compiler, the maximum number of instructions in a fragment program for the fp40 profile is something like 4096. However, I want to iterate a number of commands 32*32 times, and if unrolled it definitely ends up being much more than the limit. Is there a way around this, or is there a profile that accepts more instructions than this? I read in GPU Gems 2 that Shader Model 3.0 on GeForce 6 series GPUs (I'm using an NVIDIA 6800 GT) accepts up to 65,536 instructions for a fragment program, which is enough for what I need. According to this book, that is. Any suggestions for this problem would be greatly appreciated.
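To put rough numbers on the question, here is a minimal sketch of the budget per iteration. It assumes the loop is fully unrolled and uses the two limits quoted above as given, not as verified hardware figures. The helper name perIterationBudget is made up for illustration.

```cpp
#include <cassert>

// Side length of the shadow mask and the resulting iteration count (32*32).
constexpr int kMaskSize = 32;
constexpr int kIterations = kMaskSize * kMaskSize;

// Hypothetical helper: how many instructions each iteration may use if the
// loop is fully unrolled and the whole program must fit in instructionLimit.
// Integer division rounds down, and any work outside the loop is ignored.
int perIterationBudget(int instructionLimit, int iterations) {
    assert(instructionLimit > 0);
    assert(iterations > 0);
    return instructionLimit / iterations;
}
```

With the 4096 figure this leaves about 4 instructions per mask texel, and with the 65,536 figure about 64, before counting any setup code outside the loop.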
What are you trying to do that you need to iterate it that many times? And is it possible to send a lot of the info to the vertex shader?
Hi,

This is part of an application I'm trying to implement that does global illumination effects by collecting incident radiance from an HDR environment map interactively (or as much so as possible). The calculations are actually to be done per-vertex, but since fragment programs have more capabilities I thought I'd try doing it in the fragment program, render it to a 2D texture, then access the color for each vertex from the texture.

32 by 32 is the size of the shadow mask I'm using for each vertex, so I need to iterate over each pixel in this mask. This obviously raises another problem: I need either a uniform texture of size 32*32*(number of vertices), which I think is not feasible, or some way to pass a different texture for each vertex (fragment-representing-vertex, in this case).

Thank you for the advice.

leon
Obviously, I might have to pass a lot of this to the CPU, but I'm trying to put as much onto the GPU as possible. Also, splitting the work into chunks on the CPU and chunks on the GPU might not be practical, because the sets of data are pretty much interdependent. I want to calculate the BRDF and the shadows in the same program, so that instead of storing each transfer function independently per vertex I can just calculate the color of the vertex before the shader ends. It's hard to explain in a short message.

Maybe you could sample your shadow mask pseudo randomly. The eye tolerates a bit of noise, and you could reduce the number of iterations by quite a lot.
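A rough CPU-side sketch of that idea, in C++ rather than Cg so it can be run on its own: instead of visiting all 1024 texels, average a smaller number of randomly chosen ones. The function name, the seed parameter, and the use of std::mt19937 are assumptions for illustration only. In a real shader the sample positions would have to come from somewhere else, for example a precomputed table.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Number of texels in one 32x32 shadow mask.
constexpr std::size_t kMaskTexels = 32 * 32;

// Hypothetical helper: estimates the average mask value from sampleCount
// uniformly random texels instead of all 1024. Fewer samples means fewer
// iterations but more noise in the result.
float estimateMaskAverage(const std::vector<float>& mask,
                          int sampleCount,
                          unsigned seed) {
    assert(mask.size() == kMaskTexels);
    assert(sampleCount > 0);

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, kMaskTexels - 1);

    float sum = 0.0f;
    for (int i = 0; i < sampleCount; ++i) {
        sum += mask[pick(rng)];  // one random texel per iteration
    }
    return sum / static_cast<float>(sampleCount);
}
```

With 64 samples the loop is 16 times shorter than the full 32*32 pass. How much noise that introduces depends on the mask contents.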

Also, with the texture problem, 32*32*numverts might not be as big a problem as it first appears. You can fit 1024 32*32 textures in a 1024*1024 texture. Conveniently, that's one in each row if you flatten each mask into a single row of 1024 texels. So if you don't mind sending your data in 1024-vertex batches, or if you can afford an even bigger texture, you might be OK. After all, it seems batch size will be the least of your worries...
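Here is a minimal sketch of how that addressing could work, assuming each 32*32 mask is flattened row-major into one 1024-texel row of the big texture. The struct and function names are hypothetical, and this is only one possible layout.

```cpp
#include <cassert>

constexpr int kMaskSide = 32;     // one shadow mask is 32x32 texels
constexpr int kAtlasSide = 1024;  // the big texture is 1024x1024 texels

// Texel coordinates inside the 1024x1024 texture.
struct AtlasTexel {
    int x;
    int y;
};

// Hypothetical helper: maps texel (maskX, maskY) of the mask belonging to
// vertex number vertexInBatch (0..1023) to its position in the big texture.
// Row y holds the whole mask for one vertex, and x walks through the mask
// row by row.
AtlasTexel maskTexelToAtlas(int vertexInBatch, int maskX, int maskY) {
    assert(vertexInBatch >= 0 && vertexInBatch < kAtlasSide);
    assert(maskX >= 0 && maskX < kMaskSide);
    assert(maskY >= 0 && maskY < kMaskSide);
    return AtlasTexel{maskY * kMaskSide + maskX, vertexInBatch};
}
```

A shader would divide these integer positions by the texture size (plus a half-texel offset) to get normalized coordinates. That step is left out here.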
___________________________________________________
David Olsen
If I've helped you, please vote for PigeonGrape!
Thanks, that's a good idea. The only problem I foresee is that I would have to change the number of textures depending on the number of vertices in the model. A way to automate this would be to generate the Cg code at run time depending on the number of vertices in the model that is read, but that's for much later.
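For reference, the number of textures (or batches) is just a rounded-up division. Here is a small sketch assuming 1024 masks fit in each 1024*1024 texture as suggested above. The function name is made up.

```cpp
#include <cassert>

// Assumed capacity: 1024 masks of 32x32 texels per 1024x1024 texture.
constexpr int kMasksPerAtlas = 1024;

// Hypothetical helper: how many 1024x1024 textures (equivalently, how many
// 1024-vertex batches) a model with vertexCount vertices would need.
int atlasTexturesNeeded(int vertexCount) {
    assert(vertexCount >= 0);
    // Integer ceiling division: any partial batch still needs a texture.
    return (vertexCount + kMasksPerAtlas - 1) / kMasksPerAtlas;
}
```

If the model is drawn one batch at a time, the host code could loop over this count and bind a different texture per batch, which may avoid generating shader code per model. Whether that fits the rest of the algorithm is a separate question.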
In the process of doing this, I've been having a problem. I'm relatively new to Cg:

I'm using a graphics card that should support all Cg language profiles (NVIDIA 6800 GT), but when I run my C++ code an error appears saying that the chosen profile is not supported. I've tried all of them: vp40, vp30, vp20, and arbvp1. All give me the same error. Am I missing something? I downloaded the Cg toolkit very recently and copied all the .lib and .h files into the lib and include directories, respectively. Any help or comments would be appreciated.
Starting with something nice and simple, I see [rolleyes]. Not sure about your problem. Can you run other people's shaders?
Yeah, I can run the examples in the NVIDIA Cg toolkit, although I haven't compiled them; I've only run the executables. Both my C++ code and my Cg code compile (in Visual Studio 2005 Beta 2 and cgc, respectively). I've looked at the code from these examples and I do everything in the same order, including all the calls to the Cg runtime.

Actually, I've narrowed this first program to something relatively simple and straightforward, the only "complicated" part is the looping (32*32) and the size of the textures, which for now are just dummy variables for the most part. I'm just testing out how fast this vertex program runs to see if my algorithm is feasible.

Actually, I just compiled nvidia's example source code and it worked. Everything seems like it should work. I have enough variables in my one shader that it's possible I'm calling one of the parameters the wrong way, but the error wouldn't say it's the profile that is not supported if that were the case.

