I kept thinking about this and there are probably hardware limitations.
There are several types of divergence:
- All pixels in wavefront A access texture[0]; all pixels in wavefront B access texture[1]. This is very likely to happen because the wavefronts belong to different draws.
- Pixel A accesses texture[0]; Pixel B accesses texture[1]. Both are Texture2D.
The first case is fine.
But the second one is not. You must specify NonUniformResourceIndex if you expect to be in scenario 2.
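As a rough sketch of scenario 2, assuming Shader Model 5.1 or later. The names g_textures, g_sampler, PSInput and materialIndex are made up for illustration, as is the array size:

```hlsl
// Hypothetical resources; the array size and registers are arbitrary.
Texture2D    g_textures[64] : register(t0);
SamplerState g_sampler      : register(s0);

struct PSInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
    // Per-primitive index; integer inputs cannot be interpolated.
    nointerpolation uint materialIndex : MATERIAL_INDEX;
};

float4 PSMain(PSInput input) : SV_Target
{
    // Two pixels in the same wavefront may carry different indices,
    // so the index is explicitly marked as non-uniform.
    return g_textures[NonUniformResourceIndex(input.materialIndex)].Sample(g_sampler, input.uv);
}
```

If the index came from a constant buffer or root constant instead (scenario 1), the NonUniformResourceIndex wrapper would not be needed.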
On top of that, to cover what you want, we would have to add several more scenarios where the type of texture could be divergent, not just the texture index.
Some hardware out there definitely cannot do that.
Note however, nothing's preventing you from doing this:
Texture2D texture_array2d[5000] : register(t0); // We assume the first 5000 textures are 2D
Texture3D texture_array3d[5000] : register(t5000); // We assume textures [5000; 10000) are 3D
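A minimal sketch of how a shader could then pick between the two arrays from a single slot number. The helper SampleAny, the sampler g_sampler and the choice of mip 0 are assumptions for illustration only:

```hlsl
Texture2D    texture_array2d[5000] : register(t0);    // slots [0; 5000) are 2D
Texture3D    texture_array3d[5000] : register(t5000); // slots [5000; 10000) are 3D
SamplerState g_sampler             : register(s0);    // hypothetical sampler

// Hypothetical helper: 'slot' is an index into the combined [0; 10000) range.
float4 SampleAny(uint slot, float3 uvw)
{
    // SampleLevel is used because implicit-gradient Sample calls are a
    // problem inside flow control that can diverge between pixels.
    if (slot < 5000)
    {
        // 2D path: only two coordinates are passed.
        return texture_array2d[NonUniformResourceIndex(slot)].SampleLevel(g_sampler, uvw.xy, 0.0f);
    }
    else
    {
        // 3D path: a different overload taking three coordinates.
        return texture_array3d[NonUniformResourceIndex(slot - 5000)].SampleLevel(g_sampler, uvw, 0.0f);
    }
}
```

The branch on the texture type is written out in the shader itself, so the compiler still knows at compile time which kind of sample each path issues.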
Also note that Tier 1 supports up to 256 textures, so arrays totalling 10,000 textures will lock you out of a lot of cards (Haswell & Broadwell Intel cards, Fermi NVIDIA).
Edit: Just to clarify why it's not possible (or very difficult, or would add a lot of pointless overhead). There are three parts:
- Information about textures like format (e.g. RGBX8888 vs Float_R16, etc) and resolution. In some hardware it lives in a structure in GPU memory (GCN), in other hardware it lives in a physical register (Intel).
- Information about how to sample the texture (bilinear vs point vs trilinear, mip lod bias, anisotropy, border/clamp/wrap, etc). In GCN most of this information lives in an SGPR that points to a cached region of memory. The border colour (for the border colour mode) lives in a register table. In Haswell this information lives in a physical register IIRC.
- Information about the type of the texture, which affects how it is sampled (1D vs 2D vs 2D Array vs 3D vs Cube vs Cube Array). In GCN, sampling a cubemap requires issuing more instructions (V_CUBE*_F32 family if I recall); sampling 3D textures requires providing more VGPRs (since more data is needed) than for sampling 2D textures.
Your assumption is that the type of texture lives in GPU memory alongside the format and resolution (point 1). But this is not the case. It lives in the ISA instructions (point 3).
In fact D3D12 provides some level of abstraction: you think the format and resolution live in GPU memory, when in fact on Intel GPUs they live in physical registers (that's where the 256 limit of Tier 1 comes from, btw. D3D11 by spec allowed up to 128 textures, and it happens that both Fermi & Intel supported up to 256).
Therefore, it becomes too cumbersome to support this sort of generic-type texture you want.