I'm actually doing the opposite, and I'm not sure if it's the correct way: in D3D I'm emulating OGL-style textures (i.e. embedded sampler state). When you declare a texture uniform in the shader, I implicitly create a sampler uniform to go with it. If the user binds a different sampler instead, the HLSL compiler will strip out the unused generated sampler state, so there's no overhead.
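A rough sketch of what that implicit pairing could look like on the HLSL side -- the naming scheme here is hypothetical, just to illustrate the idea:

```hlsl
// What the user declares (GL-style, no explicit sampler):
Texture2D g_Diffuse;

// What the abstraction implicitly generates alongside it:
SamplerState g_Diffuse_AutoSampler;

float4 SampleDiffuse(float2 uv)
{
    // If the user binds their own SamplerState and samples with that
    // instead, g_Diffuse_AutoSampler is never referenced, and the HLSL
    // compiler strips it from the compiled shader -- no runtime cost.
    return g_Diffuse.Sample(g_Diffuse_AutoSampler, uv);
}
```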
On modern GPUs, the D3D10+/GL3+ way of using samplers is slightly more efficient than the D3D9/GL2/GLES way.
When submitting a draw-call, the driver has to allocate a block of memory and copy into it all of the 'information' that the shaders used by that draw-call will need. This includes the header structures for buffers and textures (including the pointers to the buffer/texture data), and the structures that define samplers.
With the D3D10/GL3 model, you can greatly reduce the number of sampler structures used by a shader by sharing one sampler between many textures. If you emulate the D3D9/GL2 model, you're forced to have one sampler per texture, which gives each shader a larger 'information' packet. That means slightly more work for the CPU as the driver produces these packets, and more work on the GPU as the pixels/vertices all read the larger packets from memory.
So if you're going to pick one abstraction, I would emulate D3D11 samplers on GL, rather than emulating GL samplers on D3D11.
Abstraction is a sane approach, but a complicating factor here is that D3D12 is quite different from D3D11 -- more different from D3D11 than OpenGL is, in many ways. The threading model and the much more manual resource management are very different from previous graphics APIs. In practice, this means it will probably be impractical to unify the two styles under a single low-level or mid-level abstraction; there's just not enough wiggle-room to hammer out the differences. You could almost certainly implement a D3D11-style low-level API on top of D3D12, but you'd lose many, if not most, of the benefits of D3D12 in doing so.
The big difference is slot-based APIs vs bindless APIs.
In my abstract API, it's mostly slot-based, but with a few compromises. Instead of having "texture slots" in the API, I instead expose "resource-list slots". The user can create a resource-list object, which contains an array of textures/buffers, and then they can bind that entire list to an API slot.
This maps fairly naturally to both bindless and slot-based APIs: a shader program in a slot-based API will reserve a range of contiguous slots for each resource-list it uses, while a shader program in a bindless API will just define that structure.
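A minimal sketch of how such a resource-list might map onto both backend styles -- all names and layouts here are illustrative assumptions, not the actual API described above:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using ResourceHandle = std::uint32_t;  // opaque texture/buffer handle

// A resource-list groups textures/buffers so the whole set binds as one unit.
struct ResourceList {
    std::vector<ResourceHandle> resources;
};

// Slot-based backend (e.g. D3D11): the shader program reserves a contiguous
// range of hardware slots per resource-list; binding copies the handles in.
struct SlotBackend {
    ResourceHandle slots[128] = {};
    void BindList(int baseSlot, const ResourceList& list) {
        for (std::size_t i = 0; i < list.resources.size(); ++i)
            slots[baseSlot + i] = list.resources[i];
    }
};

// Bindless backend: the list itself is the shader-visible structure --
// binding is just pointing the shader at the array.
struct BindlessBackend {
    const ResourceHandle* boundTable = nullptr;
    void BindList(const ResourceList& list) {
        boundTable = list.resources.data();
    }
};
```

The slot-based path does per-bind copying, while the bindless path is effectively free, which is why the abstraction costs little on either style of API.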
For things like samplers, you hardly ever use a large number of them, so a slot-based API works fine (and can be implemented on bindless APIs fairly efficiently).
As long as you're aware of all the underlying APIs when designing your abstraction, you can come up with something that's still very low-level but also fairly efficiently implemented across the board.
Personally, I have "techniques" (shader-program objects with permutations), resource-lists, samplers, cbuffers, state-groups, passes (render-targets/viewports) and draw-calls using the same abstraction across PS3/PS4, Xbox 360/One, D3D9/11, GL and Mantle.
Edited by Hodgman, 28 August 2014 - 01:50 AM.