arbfp1: too many ALU instructions

I'm rather new to fragment programming, so to get started I tried to put together a simple fragment program that does lighting. I support point lights at the moment, with code that does Blinn-like shading with all the trimmings like specularity. Now I want to use at most 8 point lights (perhaps more, depending on how it works out), so I placed a simple for loop in there, which the documentation states will be unrolled by the Cg compiler. I tested with only 4 lights and I already get 'too many ALU instructions'. So I wonder if that's already it, if my venture into fragment programs has already ended at this early point. How is it possible to handle around 8 lights in a fragment program of the arbfp1 profile, or is it just not possible at all?
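
For illustration, roughly the kind of loop I mean, trimmed down to the essentials (all names made up, shown with the 4 lights I tested):

// Blinn-like term per light; with 4 iterations unrolled this
// already exceeds the arbfp1 ALU instruction limit for me
float4 main( float3 worldPos : TEXCOORD0,
             float3 normal   : TEXCOORD1,
             float3 toEye    : TEXCOORD2,
             uniform float3 lightPos[ 4 ],
             uniform float3 lightColor[ 4 ] ) : COLOR
{
    float3 n = normalize( normal );
    float3 e = normalize( toEye );
    float3 accum = float3( 0.0, 0.0, 0.0 );

    for( int i=0; i<4; i++ ){
        float3 l = normalize( lightPos[ i ] - worldPos );
        float3 h = normalize( l + e );
        float diff = max( dot( n, l ), 0.0 );
        float spec = pow( max( dot( n, h ), 0.0 ), 32.0 );
        accum += lightColor[ i ] * ( diff + spec );
    }

    return float4( accum, 1.0 );
}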

You can try to move as many instructions as you can outside of the loop, if you haven't done this already. What I mean is: if some parameters can be precalculated and are the same for all 8 lights, do it before the loop. Also, try to limit the number of instructions. Normalizing a vector with math takes 3 instructions; if you do it with a normalization cube map, you need just one. Same with attenuation maps. Generally, try to do as many computations as you can with texture lookups.
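
The lookup itself is a one-liner; a sketch of the idea (sampler name made up):

// each texel of the cube map stores the unit vector pointing towards
// it, packed from [-1,1] into [0,1]; sampling it with any unnormalized
// vector returns the normalized form in a single TEX instruction
float3 cubeNormalize( samplerCUBE normCube, float3 v )
{
    return 2.0 * texCUBE( normCube, v ).xyz - 1.0;
}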

If even then you can't get it to run, you can always do multiple rendering passes.

I've heard this a few times now, using texture maps to look things up. What exactly is the idea there? I mean, how do you look up a normal, for example? An attenuation map somehow makes sense, but a normalization cube map could get rather large.
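
EDIT: From what I can tell the map only stores directions, so it apparently doesn't need to be large; 64x64 per face should be plenty. My (untested) stab at filling the +X face; the other five faces work the same with the axes swapped according to the cube map face conventions:

#include <math.h>
#include <stdlib.h>
#include <GL/gl.h>

/* each texel stores the unit vector pointing at it, packed into RGB */
static void buildPosXFace( int size )
{
    unsigned char *pixels = malloc( size * size * 3 );
    int s, t;

    for( t=0; t<size; t++ ){
        for( s=0; s<size; s++ ){
            float sc = 2.0f * ( s + 0.5f ) / size - 1.0f;
            float tc = 2.0f * ( t + 0.5f ) / size - 1.0f;
            float x = 1.0f, y = -tc, z = -sc; /* +X face axes */
            float len = ( float )sqrt( x*x + y*y + z*z );
            unsigned char *p = pixels + ( t * size + s ) * 3;

            p[ 0 ] = ( unsigned char )( 255.0f * ( 0.5f * x / len + 0.5f ) );
            p[ 1 ] = ( unsigned char )( 255.0f * ( 0.5f * y / len + 0.5f ) );
            p[ 2 ] = ( unsigned char )( 255.0f * ( 0.5f * z / len + 0.5f ) );
        }
    }

    glTexImage2D( GL_TEXTURE_CUBE_MAP_POSITIVE_X, 0, GL_RGB8,
        size, size, 0, GL_RGB, GL_UNSIGNED_BYTE, pixels );
    free( pixels );
}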

I consider multi-pass with a pass per light to be very slow. I don't know how you folks see it, but with that kind of multi-pass you get a heap of duplicate calculations. I say this because I use staged skins, which can be built from more than one image with effects. Calculating the final fragment color from all the stages every time I do lighting (once per pass)... I guess this drags the FPS down like hell.

But I'll try it out to see how heavy the impact is.

Quote:

I consider multi-pass with a pass per light to be very slow. I don't know how you folks see it, but with that kind of multi-pass you get a heap of duplicate calculations. I say this because I use staged skins, which can be built from more than one image with effects. Calculating the final fragment color from all the stages every time I do lighting (once per pass)... I guess this drags the FPS down like hell.


There are solutions to work around this. First of all, if all you need is diffuse lighting, you can use multiple passes to accumulate just the lighting in the framebuffer, and with one final pass you modulate it with the skins. No duplicated calculations there.

If you wish to include specular lighting this won't work, as speculars need to be applied after the textures, but there are other ways. Off the top of my head: you can first make a pass with only the skin effects (no lighting) and save that to a texture. In the following passes you don't need to recalculate the effects; you just sample that texture using the window coordinates of the fragment.
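
For the diffuse accumulation variant, the pass setup could look roughly like this (drawSceneWithLight/drawSceneWithSkins are placeholders for whatever submits your geometry):

#include <GL/gl.h>

extern void drawSceneWithLight( int light );
extern void drawSceneWithSkins( void );

/* assumes the color buffer was cleared to black and an earlier
 * pass (ambient, for example) already laid down the depth values */
void renderLightsThenModulate( int numLights )
{
    int i;

    glEnable( GL_BLEND );
    glDepthFunc( GL_EQUAL );

    /* one pass per light: sum the contributions in the framebuffer */
    glBlendFunc( GL_ONE, GL_ONE );
    for( i=0; i<numLights; i++ ){
        drawSceneWithLight( i );
    }

    /* final pass: modulate the accumulated lighting with the skins,
     * dst = src * dst */
    glBlendFunc( GL_DST_COLOR, GL_ZERO );
    drawSceneWithSkins();

    glDisable( GL_BLEND );
    glDepthFunc( GL_LESS );
}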

Why does the specularity have to be applied afterwards? Because it can 'over-brighten'? Shouldn't it be possible to just pump it into the lighting pass as well?

I've found a tutorial which describes a similar process. It uses glBlendFunc (rather, the DirectX equivalent, but that doesn't matter) to achieve this effect by storing the lighting value in the alpha channel of the color buffer. Now there is one problem: I can not do transparency in the same go unless I can access the current color+alpha of the color buffer from inside my fragment program. Is this possible or a no-go?

If you provide a link to that tutorial, I may be able to help better. Anyway, you can't access the color buffer from a fragment program, but you can use render-to-texture functionality to save it to a texture, which you can then access.

Tutorial

It's a DirectX tutorial, but written rather generally, so it's not that difficult to translate to OpenGL.

In the meantime I've read somewhere that normalization cube maps are not really that fast compared to direct normalization in the Blinn-Phong fragment shader.

EDIT: If I could use two shaders, one doing the texture-color calculation with my effects routines and one doing the entire lighting, that would be a rather good solution. But for that I need to hand the attenuation value over from my lighting shader to the texture shader.

I looked into the NVIDIA Cg documentation, and it lists two color inputs for a fragment program: COLOR0 and COLOR1. COLOR0 is obvious, but what is COLOR1? Can it be hijacked to carry the lighting information?
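
Something like this is what I have in mind, assuming COLOR1 is the secondary (specular) color interpolant (untested, all names made up):

// a vertex program would stuff the per-vertex lighting data into the
// secondary color, and the fragment program picks it up again here
struct FragIn {
    float4 primary   : COLOR0;   // regular interpolated vertex color
    float4 secondary : COLOR1;   // hijacked to carry the attenuation
    float2 uv        : TEXCOORD0;
};

float4 main( FragIn IN, uniform sampler2D skin ) : COLOR
{
    // use the secondary color's red channel as the attenuation value
    return tex2D( skin, IN.uv ) * IN.secondary.x;
}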

Quote:
Original post by RPTD
I consider multi-pass with a pass per light to be very slow. I don't know how you folks see it, but with that kind of multi-pass you get a heap of duplicate calculations. I say this because I use staged skins, which can be built from more than one image with effects. Calculating the final fragment color from all the stages every time I do lighting (once per pass)... I guess this drags the FPS down like hell.

But I'll try it out to see how heavy the impact is.


I've considered combining lighting passes before, but there are problems with it that I can't think of at the moment :), but trust me, it can fall flat on its face. Then again, if you're not doing any too complicated shading equations (not using shadows, for a start) you might get it to work.

Also, you will need to write different shaders depending on how many lights are in the scene. You could use a loop in the shader, but (depending on the hardware) using a non-constant variable as the loop bound doesn't work.
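
A sketch of that with the Cg runtime (file and entry point names made up): compile one variant per light count by injecting a define, so the shader's loop bound stays a compile-time constant:

#include <stdio.h>
#include <Cg/cg.h>
#include <Cg/cgGL.h>

/* inside lighting.cg the for loop counts up to NUM_LIGHTS */
CGprogram compileLightingVariant( CGcontext ctx, int numLights )
{
    char define[ 32 ];
    const char *args[ 2 ] = { define, 0 };

    sprintf( define, "-DNUM_LIGHTS=%d", numLights );
    return cgCreateProgramFromFile( ctx, CG_SOURCE, "lighting.cg",
        CG_PROFILE_ARBFP1, "main", args );
}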

You might want to check out 'deferred shading'.

Prior to reading this I had already poked around in papers about deferred rendering. Currently I see myself in the following situation:

- have to support multiple texture stages with texture animation effects.
- have to support alpha textures among those stages.
- have to support a (preferably) unrestricted number of lights (ideally a low count, of course).

Now, for the beginning I can ignore the alpha problem, as it will be a headache no matter what technique I use later on. It is also not a big deal to leave it out at this stage, as my engine design allows me to completely swap out the rendering module without tampering with anything outside of it.

Deferred rendering caught my eye quickly, as I might even solve the alpha problem if I do not a real deferred rendering but a hacked one. I came up with something that might work, but there is one problem now: I need a second color buffer of the same size as the front buffer. That by itself is not the problem, it just eats memory. The problem is that once I have rendered my model into this texturing buffer (as I call it), I need to somehow apply it to the front buffer. In this texturing buffer I would have color values paired with alpha values (thus 32 bit), and I'd have to copy/blend it over to the back buffer (where I could mask things out using a stencil buffer, or perhaps the depth buffer only). This would not need 3D rendering, as it would be a 2D-like operation. In DirectX I can think of a way to do it, but how does it look in OpenGL?

I'll try to rephrase it. I have something in mind to solve my problem, but I am not sure whether it pays off or even works. For it to work I need to be able to do the following:
- render the texture-mapped and lit model into an off-screen render target of the same size as the back buffer.
- blend this buffer with the back buffer in a way that only the rendered parts are copied over (perhaps even blended using an alpha value).

The first point is not a problem, just some fiddling with render targets. The tricky one is the second: can I copy/blend some arbitrary render target onto another one (the back buffer here), with or without blending?

PBuffers would do, but they are scarcely implemented (and my testing system does not have them).

>>- render the texture-mapped and lit model into an off-screen render target of the same size as the back buffer.<<

The easiest method: create a texture the same size as the window, do all your drawing into the back buffer, use glCopyTexSubImage2D(..) to copy the buffer into the texture, then clear the screen and draw the normal scene.
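
In code that step boils down to a couple of calls (the texture is assumed to have been created once at window size with glTexImage2D, no mipmaps):

#include <GL/gl.h>

void copyBackbufferToTexture( GLuint tex, int width, int height )
{
    glBindTexture( GL_TEXTURE_2D, tex );
    /* copies the lower-left width x height block of the framebuffer
     * straight into the texture; no round trip to system memory */
    glCopyTexSubImage2D( GL_TEXTURE_2D, 0, 0, 0, 0, 0, width, height );
}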

>>- blend this buffer with the back buffer in a way that only the rendered parts are copied over (perhaps even blended using an alpha value).<<

Most likely use a blend of src = GL_SRC_COLOR, dst = XXX (depending on what effect you want).
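
For the "only the rendered parts" behaviour an alpha-keyed blend may be the simpler fit; a sketch of pasting the saved texture over the back buffer as a 2D quad (untested, uses GL_SRC_ALPHA rather than GL_SRC_COLOR):

#include <GL/gl.h>

void blendTextureOverBackbuffer( GLuint tex, int width, int height )
{
    /* 2D overlay: pixel-aligned orthographic projection */
    glMatrixMode( GL_PROJECTION );
    glPushMatrix();
    glLoadIdentity();
    glOrtho( 0.0, width, 0.0, height, -1.0, 1.0 );
    glMatrixMode( GL_MODELVIEW );
    glPushMatrix();
    glLoadIdentity();

    glDisable( GL_DEPTH_TEST );
    glEnable( GL_BLEND );
    /* only texels with alpha > 0, the rendered parts, affect the
     * back buffer; everything else keeps the existing pixels */
    glBlendFunc( GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA );
    glEnable( GL_TEXTURE_2D );
    glBindTexture( GL_TEXTURE_2D, tex );

    glBegin( GL_QUADS );
    glTexCoord2f( 0.0f, 0.0f ); glVertex2f( 0.0f, 0.0f );
    glTexCoord2f( 1.0f, 0.0f ); glVertex2f( ( float )width, 0.0f );
    glTexCoord2f( 1.0f, 1.0f ); glVertex2f( ( float )width, ( float )height );
    glTexCoord2f( 0.0f, 1.0f ); glVertex2f( 0.0f, ( float )height );
    glEnd();

    glDisable( GL_BLEND );
    glEnable( GL_DEPTH_TEST );

    glMatrixMode( GL_PROJECTION );
    glPopMatrix();
    glMatrixMode( GL_MODELVIEW );
    glPopMatrix();
}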

Quote:
>>- render the texture-mapped and lit model into an off-screen render target of the same size as the back buffer.<<

The easiest method: create a texture the same size as the window, do all your drawing into the back buffer, use glCopyTexSubImage2D(..) to copy the buffer into the texture, then clear the screen and draw the normal scene.

For this one I would need to sacrifice the content of the back buffer. Unfortunately I can not afford that, as the buffer is already filled with 2D/3D content (I've got a Swing-like widget GUI system running inside OpenGL). Thus I can not touch the back buffer while doing any extraneous render work.

Quote:
>>- blend this buffer with the back buffer in a way that only the rendered parts are copied over (perhaps even blended using an alpha value).<<

Most likely use a blend of src = GL_SRC_COLOR, dst = XXX (depending on what effect you want).

So I need to bind the off-screen buffer as a texture and render it as a full-screen quad, right? What's the performance penalty of that method? I heard glCopyTexSubImage2D is slow.

>>I heard glCopyTexSubImage2D is slow.<<

Depends on the data format and card/driver, I suppose. FWIW, I can do over 3 billion pixels/sec with it, i.e. update an area of 1024x768 at over 300 fps.

I'll look into this one after I've got my graphics module update done. For the time being I'll just snoop around for solutions and focus on the non-graphics parts of the engine for the moment. Not thinking about it all the time might give me some ideas ^_^

