Excerpt from Offset dev blog, clarifying sampler usage [SOLVED]

Started by
3 comments, last by n00body 15 years, 9 months ago
From this entry (paragraphs 3 & 4): http://www.projectoffset.com/blog.php?id=21 What does he mean by "as far as the driver (and HLSL compiler) is concerned, I want to use all 16 possible textures"? Does he mean that even if you don't have a texture bound to a sampler, it still counts as a used sampler when the shader is compiled? Thanks for any help you can provide. [Edited by - n00body on July 17, 2008 11:30:39 AM]

[Hardware:] Falcon Northwest Tiki, Windows 7, Nvidia Geforce GTX 970

[Websites:] Development Blog | LinkedIn
[Unity3D :] Alloy Physical Shader Framework

No, what he's saying is that even though each individual pixel accesses only a limited number of samplers, he still needs to have more than that bound at the time he issues the draw call. Take something like this, for instance:

sampler2D sampler1 : register( s0 );
sampler2D sampler2 : register( s1 );

float4 PS(float2 texCoord : TEXCOORD0) : COLOR0
{
    if(texCoord.x > 0.5)
        return tex2D(sampler1, texCoord);
    else
        return tex2D(sampler2, texCoord);
}


For each run of this simple shader, it will never sample more than one texture. However, you need both textures bound at draw time, since any given pixel could access either of them. This is the problem he's talking about, just multiplied.
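To make "just multiplied" a bit more concrete, here is a rough sketch of the same pattern with more layers. It is not taken from the Offset blog: the layer names, thresholds, and register assignments are all made up for illustration, and it assumes a ps_3_0-style target. The gradients are computed before the branches and passed to tex2Dgrad, which is one common way to keep texture fetches well defined inside per-pixel flow control.

```hlsl
// Hypothetical four-sampler material. Names, thresholds, and registers
// are illustrative assumptions, not from the blog.
sampler2D layerMask : register( s0 );
sampler2D grassMap  : register( s1 );
sampler2D rockMap   : register( s2 );
sampler2D snowMap   : register( s3 );

float4 PS(float2 texCoord : TEXCOORD0) : COLOR0
{
    // Take the derivatives outside the branches so the fetches inside
    // them do not depend on gradients computed in divergent flow control.
    float2 dx = ddx(texCoord);
    float2 dy = ddy(texCoord);

    // Every pixel reads the mask, then exactly one of the three layers.
    float selector = tex2D(layerMask, texCoord).r;

    if(selector < 0.33)
        return tex2Dgrad(grassMap, texCoord, dx, dy);
    else if(selector < 0.66)
        return tex2Dgrad(rockMap, texCoord, dx, dy);
    else
        return tex2Dgrad(snowMap, texCoord, dx, dy);
}
```

Any one pixel touches two textures here, but the compiled shader references four sampler registers, so all four need valid textures bound when the draw call is issued.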
So, the real problem was branching, and not the samplers?

Yes and no.

Branching says to you and me (as developers) that we optionally choose to use a given resource. However, the runtime (and hence the compiler) doesn't know which ones we're going to choose, so it has to provide us with all of them.

The key is that HLSL and GPU programming are very resource constrained, with the limit of 16 unique samplers being the crucial constraint here.

In essence it says "you can have dynamic branching, but you must choose from a small subset that you tell me about in advance". As described in that blog, for what they're doing it is hard (if not impossible) to fit the entire set of possible options into such a constrained pool.
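As a rough illustration of "a small subset that you tell me about in advance", the sketch below declares its pool up front as a sampler array and picks one entry using a per-draw constant. The array name, its size, and the lightIndex constant are hypothetical. It also relies on my understanding that, on ps_3_0-class targets, a sampler array index has to resolve to a literal at compile time, which the constant loop bound allows once the loop is unrolled.

```hlsl
// Hypothetical pool size and names, chosen only for illustration.
#define NUM_LIGHT_TEXTURES 4

// The array is assumed to occupy consecutive sampler registers s0..s3,
// which come out of the 16 available (s0..s15).
sampler2D lightTextures[NUM_LIGHT_TEXTURES] : register( s0 );

// Hypothetical per-draw constant set by the application.
int lightIndex;

float4 PS(float2 texCoord : TEXCOORD0) : COLOR0
{
    float2 dx = ddx(texCoord);
    float2 dy = ddy(texCoord);

    float4 result = 0;

    // Constant bound, so the compiler can unroll the loop and turn each
    // lightTextures[i] into a reference to one fixed sampler register.
    for(int i = 0; i < NUM_LIGHT_TEXTURES; ++i)
    {
        if(i == lightIndex)
            result = tex2Dgrad(lightTextures[i], texCoord, dx, dy);
    }

    return result;
}
```

Even though lightIndex is fixed for the whole draw call, the compiler cannot know its value, so all four registers count against the budget and all four need something bound.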

Something similar happens in the CPU world, but the levels of memory (L1->L2->L3->SYSRAM->VRAM) and the wonders of caching/paging tend to hide this from us entirely.

In real terms, for lighting, a multi-pass approach will usually let you sidestep this problem. Heading in the direction that blog suggests is what led to the "uber shader" fascination we had for a while [smile]
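A minimal sketch of what one pass of such a multi-pass setup could look like is below. Everything in it is an assumption for illustration: the sampler and constant names, the simple N.L lighting, and the idea that the application has prepared a per-light shadow mask that can be read with the same texture coordinates. The application would draw the geometry once per light with additive blending (for example SrcBlend = ONE, DestBlend = ONE) and rebind shadowMask and the light constants between passes.

```hlsl
// One light per pass: the shader only ever references two samplers,
// no matter how many lights the scene contains.
sampler2D diffuseMap : register( s0 );
sampler2D shadowMask : register( s1 ); // hypothetical mask for this pass's light

// Hypothetical per-pass constants set by the application.
float3 lightColor;
float3 lightDir;   // assumed normalized, surface-to-light, same space as the normal

float4 PS(float2 texCoord : TEXCOORD0,
          float3 normal   : TEXCOORD1) : COLOR0
{
    float3 albedo = tex2D(diffuseMap, texCoord).rgb;
    float  shadow = tex2D(shadowMask, texCoord).r;

    // Simple Lambert term, clamped to [0, 1].
    float nDotL = saturate(dot(normalize(normal), lightDir));

    // This pass's contribution; additive blending sums the passes.
    return float4(albedo * lightColor * nDotL * shadow, 1.0);
}
```

The trade-off is extra draw calls and repeated geometry work in exchange for a sampler count that stays constant per pass.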

hth
Jack

Jack Hoxley [ Forum FAQ | Revised FAQ | MVP Profile | Developer Journal ]

Okay, thanks for the input guys. It will be put to good use. ;)

