Pixel Shader limitations


I'm currently working on a pretty sweet lighting model using shaders, bump maps, light maps, etc., but I'm slowly running into trouble with Pixel Shader version 1.4. Are there any common tricks to pass more data from the vertex shader to the pixel shader than through the two colour registers? Ideally, I'd need four colour registers, but that's obviously not possible. The restriction that you're only allowed to use them in phase 2 is annoying as well, but I've overcome this so far by doing everything that's possible in any way in phase 1. I've tried to find some compression technique for the colour registers as well, but to no avail. Any ideas?
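To illustrate the phase restriction I mean, roughly (register usage is made up, and this is from memory, so don't quote me on the syntax):

    ps.1.4
    texcrd r0.rgb, t0     // phase 1: texture coordinates are available...
    // mov r1, v0         // ...but this would be illegal: v0/v1 are phase 2 only
    phase
    texld r1, t1
    mul r0, r1, v0        // only after the phase marker can the colour registers be read

Thanks
- JQ
Full Speed Games. Coming soon.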

Yeah, we need more colour iterators, that's a fact.

If you don't need all 4 texture units, then you can abuse one (or more) as additional colour registers. There is a pixelshader command that converts 3D texture coordinates into r,g,b triples.
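In D3D assembly that should be the texcrd instruction; I mostly work on the OpenGL side, so treat this as a sketch rather than gospel:

    ps.1.4
    texcrd r0.rgb, t2     // no texture bound to stage 2: the interpolated
                          // coordinate set simply arrives as an r,g,b triple
    dp3_sat r0, r0, v0    // and can be used like any other colour value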

/ Yann

It is not easy to answer your question on such an abstract level. Just show us what you've got so far.
What did you put into which register?
You have six texture stages and the possibility to multipass as much as you like.

- Wolf

I haven't got anything so far in the version 1.4 shader, though my 1.1 shader is done; I still have to try it out, but it should work. I'll post it as soon as I've tried it. (I don't have much time for game development right now unfortunately, damn day job!) I'm using every component of both colour registers (rgb: (light) vectors, alpha: intensities).
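i.e. on the vertex shader side, something like this (from memory and untested; the constant/register assignments are just how I happen to have it set up):

    vs.1.1
    // r0 = normalized tangent-space light vector, r1.w = intensity, c10 = 0.5
    mad oD0.xyz, r0, c10, c10   // range-compress [-1,1] into [0,1] for the iterator
    mov oD0.w, r1.w             // the intensity rides along in alpha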
I came up with the idea of abusing unused texture coordinates (i.e. I only need u and v, so I can use w for something else), but I'm not sure if that'll work out - I'll probably have to use u and v for reading from the texture in one phase and read the w value in the other phase, and I don't know if it lets me do that. And anyway, despite the fact that there are four coordinates, you can only ever use 3 (according to the DirectX SDK), which means the trick can't work for projective textures...
Yann: actually, I would like to use all the textures, as that is the main idea behind implementing a 1.4 shader, for me anyway.
Maybe I'll only use 5 though, that's still better than 4.
wolf: I know about multipass, but obviously I'm trying to get as much as possible into one pass.
I'll post more as soon as I've got some evidence to back my theory up.
Thanks for the input so far though!

- JQ
Full Speed Games. Coming soon.

You are using all 6 textures? Hmm. Are you sure you really need them all? Or can you optimize one or two away, e.g. by using separate alpha components of other units, replacing lookup textures with pixel shader math, or similar? What kind of pixel processing do you do? Frankly, I don't know of many effects (even very high-end ones) that would require 6 units.
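E.g. a specular power lookup can often be replaced by a few multiplies. A sketch (exponent 8; I'm assuming a tangent-space normal map on stage 0 and a half vector iterated in v0):

    ps.1.4
    texld r0, t0                // tangent-space normal map
    dp3_sat r1, r0_bx2, v0_bx2  // N.H from two range-compressed vectors
    mul r1, r1, r1              // ^2
    mul r1, r1, r1              // ^4
    mul r1, r1, r1              // ^8 - no lookup texture, no texture unit burned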

quote:

And anyway, despite the fact that there are four coordinates, you can only ever use 3 (according to the DirectX SDK), which means the trick can't work for projective textures.


It works on all 4 coords: if no texture is used on that particular unit, you can directly convert them to an RGBA value (at least in OpenGL; I guess it'll also work in D3D).

It might get complicated though, if you try to remap a single texture coordinate onto a specific colour channel while still having a texture on the same unit filling in the other channels. I'm not sure if that will work; I never tried it. And in this case you can't use the 4th coordinate, since it will be used for the homogeneous perspective divide while accessing the texture. So a 2D texture will need 3 coordinates, and a 3D or projective texture needs all 4.
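If D3D exposes the coordinate selectors the way I think it does, grabbing the 4th coordinate on a unit without a texture might look like this (pure speculation on my part, check the docs):

    ps.1.4
    texcrd r0.rgb, t0.xyw   // route x, y and w of the coordinate set into r, g, b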

/ Yann

[edited by - Yann L on September 10, 2002 8:47:26 AM]

quote:
Original post by Yann L
You are using all 6 textures? Hmm. Are you sure you really need them all? Or can you optimize one or two away, e.g. by using separate alpha components of other units, replacing lookup textures with pixel shader math, or similar? What kind of pixel processing do you do? Frankly, I don't know of many effects (even very high-end ones) that would require 6 units.


*cough* yeah, I know. Me and my crazy ideas. It's got to do with applying more than one shadow map, and possibly a light map, in one pass, and combining that with DOT3 bump mapping and standard texture mapping (see the other thread). I know it's crazy, but then again, my crazy ideas often actually work.
I just thought of the next crazy optimization though. You guys here make me think too much.

quote:
It works on all 4 coords: if no texture is used on that particular unit, you can directly convert them to an RGBA value (at least in OpenGL; I guess it'll also work in D3D).

Hrm. I'll have to try it, I guess, but I'm pretty sure you can only use .rgb or .rga for some reason. (It says something like that in the SDK for the, uh, texcrd instruction, I think; can't quote right now.)

quote:
It might get complicated though, if you try to remap a single texture coordinate onto a specific colour channel while still having a texture on the same unit filling in the other channels. I'm not sure if that will work; I never tried it. And in this case you can't use the 4th coordinate, since it will be used for the homogeneous perspective divide while accessing the texture. So a 2D texture will need 3 coordinates, and a 3D or projective texture needs all 4.

Yeah, I know, the concept is still a bit wobbly, but I can possibly pull it off by doing one of the two operations in the first 1.4 "phase", and the other in the second. That, plus the optimization I just thought of, might get me YET better performance. OK, I'll think about it tonight.

Thanks for the input.

EDIT: by the way, sorry about my incoherent writing style, but I'm at work, performing mundane tasks (writing ASP/SQL web applications) while thinking about pixel shaders, and my brain went screwy after a while.

- JQ
Full Speed Games. Coming soon.

[edited by - JonnyQuest on September 10, 2002 10:31:22 AM]

quote:

*cough* yeah, I know. Me and my crazy ideas. It's got to do with applying more than one shadow map, and possibly a light map, in one pass, and combining that with DOT3 bump mapping and standard texture mapping (see the other thread). I know it's crazy, but then again, my crazy ideas often actually work.


Hehe, OK, I see...

quote:

Hrm. I'll have to try it, I guess, but I'm pretty sure you can only use .rgb or .rga for some reason. (It says something like that in the SDK for the, uh, texcrd instruction, I think; can't quote right now.)


Really, I don't know about D3D, it was just a thought (since all major 3D hardware supports that feature and it's available under OpenGL). I haven't done very much pixelshader stuff under D3D, so you are probably right.

Hmm, again I don't know if D3D gives you access to that, but in OpenGL (at least with nVidia's extensions) you can abuse the fog factor as an additional arbitrary monochromatic value. It isn't really very useful (since it's restricted to the final combiner stage), but sometimes it can save your butt if you desperately need an additional iterator.

You can also do some interesting tricks with specially prepared dual-use textures. E.g. 3D textures, where the 2D part is a normal colour texture, and the (very small) third dimension is a lookup table for an additional (alpha) iterator. Very useful for layered volumetric per-pixel fog, for example.
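On the sampling side it costs nothing extra; all the cleverness is in how you build the volume. A sketch:

    ps.1.4
    texld r0, t0    // 3D texture: u,v picks the colour texel as usual, while w
                    // walks through the small third dimension, so r0.a comes
                    // back as an extra iterated lookup value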

/ Yann

quote:

Hmm, again I don't know if D3D gives you access to that, but in OpenGL (at least with nVidia's extensions) you can abuse the fog factor as an additional arbitrary monochromatic value. It isn't really very useful (since it's restricted to the final combiner stage), but sometimes it can save your butt if you desperately need an additional iterator.


Yeah, you're right, you actually can set that in the vertex shader. I doubt it'll be of any use in my current experiments though.
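For reference, setting it looks something like this, if I read the docs right (untested; v7 is just wherever you happened to stuff the value):

    vs.1.1
    mov oFog.x, v7.w    // abuse the fog register to iterate one extra per-vertex scalar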

quote:

You can also do some interesting tricks with specially prepared dual-use textures. E.g. 3D textures, where the 2D part is a normal colour texture, and the (very small) third dimension is a lookup table for an additional (alpha) iterator. Very useful for layered volumetric per-pixel fog, for example.

Hmmm, sounds interesting! I'll think about it; maybe I can use this somewhere, but I'll probably run out of arithmetic instructions before I have to resort to it... The main problem I'm having is still the limited per-vertex information, as I'm doing really simple stuff, but unfortunately lots of it. I'll try your texture coordinate idea though.

- JQ
Full Speed Games. Coming soon.

The trick is to balance the calculations between vertex shader and pixel shader in such a way that the intermediary results can run through the limited connection pipeline you have (basically 2 colour and 6 texcoord iterators). Sometimes per-vertex calculations should be performed per-pixel, even if per-pixel precision would not be needed from a visual point of view. But it can save an iterator.
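For example: instead of iterating a per-vertex attenuation factor in a colour alpha, derive it per-pixel from a vector you are passing down anyway. A sketch (the constant and coordinate assignments are arbitrary):

    ps.1.4
    def c0, 1.0, 1.0, 1.0, 1.0
    def c1, -1.0, -1.0, -1.0, -1.0
    texcrd r0.rgb, t1        // light vector, prescaled by 1/range in the vertex shader
    dp3 r1, r0, r0           // squared distance, now computed per-pixel
    mad_sat r1, r1, c1, c0   // attenuation = 1 - d^2, and the alpha iterator stays free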

If you want, post the pipeline setup you have in mind; there are surely hidden optimizations somewhere.

/ Yann

[edited by - Yann L on September 10, 2002 12:38:46 PM]

quote:
Original post by Yann L
The trick is to balance the calculations between vertex shader and pixel shader in such a way that the intermediary results can run through the limited connection pipeline you have (basically 2 colour and 6 texcoord iterators). Sometimes per-vertex calculations should be performed per-pixel, even if per-pixel precision would not be needed from a visual point of view. But it can save an iterator.

Hmm. Good point. It doesn't work for me though, I think: I'm sending light vectors in texture space and a light intensity (distance-dependent) from vertex to pixel shader. That's hardly anything I can compute in pixel shaders, and it can't really be precomputed in textures either...

quote:
If you want, post the pipeline setup you have in mind; there are surely hidden optimizations somewhere.

I was going to today, but I forgot the CD with the source back home this morning. I'll try again tomorrow, but like I said, I got an idea for minimizing textures after you mentioned it; I'll try implementing that.


- JQ
Full Speed Games. Coming soon.

I'm soon to start on shaders. ATI asked me to support v1.4, but it seems like that's worse than earlier versions? Surely the point of a new version is to offer better functionality more easily? Just what are the differences between 1.4 and anything else - why would I want to use it?



Read about my game, project #1
NEW (9th September): diaries for week #5

Also I'm selling a few programming books very useful to games coders. Titles on my site.


John 3:16

ps.1.4 gives you 22 instructions, where ps.1.1 gives you only 11. Additionally, you are able to program some things more efficiently, because you can re-use texture coordinates in the pixel shader.
Having six instead of four textures is another advantage.
Last but not least, ps.1.4 shows you the syntax that ps.2.0 will use.
Every upcoming graphics card has to be backward compatible with ps.1.4.

The four shader tutorials on www.gamedev.net compare ps.1.1 and ps.1.4. You will find a more detailed explanation there.

- Wolf

quote:
Original post by wolf
ps.1.4 gives you 22 instructions, where ps.1.1 gives you only 11.


I'm not sure where you're getting that info from, but it's incorrect...
Pixel shaders 1.0 through 1.3:
8 arithmetic instructions, 4 texture instructions (=12)
Pixel shader 1.4:
2 phases with up to 8 arithmetic instructions and 6 texture instructions each. (=28)
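Schematically (not a useful shader, just to show the two phases; the register assignments are made up):

    ps.1.4
    texld r0, t0          // phase 1: up to 6 texture instructions...
    texcrd r1.rgb, t1
    dp3 r2, r0_bx2, r1    // ...plus up to 8 arithmetic instructions
    phase
    texld r3, r2          // phase 2: 6 more texture instructions, including
                          // dependent reads using phase 1 results as coordinates
    mad r0, r3, v0, v1    // 8 more arithmetic; v0/v1 are readable from here on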

To get back onto the topic of the thread, I'll describe what I'm trying to do in about a week or so; there's just no time to write it all down at the moment.

- JQ
Full Speed Games. Period.

quote:
Original post by d000hg
I'm soon to start on shaders. ATI asked me to support v1.4, but it seems like that's worse than earlier versions? Surely the point of a new version is to offer better functionality more easily? Just what are the differences between 1.4 and anything else - why would I want to use it?



A complete rundown of all the differences would be a novel, but here's a short list of the differences between 1.1 and 1.4:


All < 1.4 shaders can be up to 12 (4 texture + 8 arithmetic) instructions. Although, some of the dependent texture operations are quite sophisticated.

Major difference: 1.4 supports two phases, each with 6 texloads with textures bound to the output result registers, and multiple texloads can be done from the same texcrd. Each phase can have 8 instructions; v registers can only be accessed from the second phase. Alphas are killed upon entry to the second phase. The first phase can be thought of as a dependent texcoord generator for the second phase (it can do more than just that though).

Additionally, in ps_1_4, arbitrary write masks are allowed, and all replicate swizzles are allowed. The intermediate range is also defined to be at least -8 to 8. _x8, _d4 and _d8 dest modifiers are provided, as well as an _x2 source modifier. The bias restriction is lifted from non-sat'd values.

You are provided 6 temp registers, and no texture registers can be used as temps (note that texture registers have a port constraint of 2 in ps_1_1 and 3 in ps_1_2+; the no-negate-from-_sat restriction was also lifted in ps_1_2+).

The ps_1_1 model is (nearly) a complete subset of ps_1_4. The only exception is the vspec instruction, which cannot be done as written on ps_1_4 because of the way the eye vector is packed.
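A few of those in action (just a sketch to show the syntax; register and constant choices are arbitrary):

    ps.1.4
    def c0, 0.5, 0.5, 0.5, 0.5
    texld r0, t0
    mul_x8 r1.gb, r0, c0    // arbitrary write mask plus the _x8 dest modifier
    mov r2, r0.a            // replicate swizzle: broadcast alpha to all channels
    mad r3, r0_x2, c0, r1   // the new _x2 source modifier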








You are right ... I didn't remember it correctly :-)

The bottom line is: ps.1.4 - or, in DX9 syntax, ps_1_4 :-) - is much more flexible.

- Wolf
