Create a command list, create some resources, execute a command list with a set of resources as inputs, done.
so, how do you keep a game (engine) flexible while knowing up front _all_ the states etc. that you don't want to create at runtime? (assume pipeline creation can take as long as shader linking in OpenGL, which is 1s in bad cases.)
there is no driver anymore that does that for you in a background thread, as fast as possible.
assume you have about 1024 shader combinations; add the stencil, rasterizer, blend and render target permutations that might be part of the GPU setup and are therefore included in the one static state you have to create.
assume the state creation is not cross-platform but per driver+GPU, so you cannot really do it offline before you ship the game.
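to make the scale of the problem concrete, here is a minimal sketch of a lazily filled pipeline cache keyed by the full state description. everything in it (`PipelineDesc`, `PipelineCache`, `permutationCount`, the field names) is hypothetical and not taken from Mantle or D3D12; the `Pipeline` struct only stands in for the expensive driver object. if you assume just 4 variants each for stencil, rasterizer, blend and render target on top of the 1024 shader combinations, the upper bound is 1024 * 4 * 4 * 4 * 4 = 262144 pipelines, which at 1s each would be roughly 73 hours of creation time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical, simplified description of everything baked into one pipeline.
struct PipelineDesc {
    std::uint32_t shaderCombo;   // index into the ~1024 shader combinations
    std::uint8_t  stencilMode;
    std::uint8_t  rasterMode;
    std::uint8_t  blendMode;
    std::uint8_t  targetFormat;

    bool operator==(const PipelineDesc& o) const {
        return shaderCombo == o.shaderCombo && stencilMode == o.stencilMode &&
               rasterMode == o.rasterMode && blendMode == o.blendMode &&
               targetFormat == o.targetFormat;
    }
};

// Packs all fields into one 64-bit key and hashes that.
struct PipelineDescHash {
    std::size_t operator()(const PipelineDesc& d) const {
        std::uint64_t key = (std::uint64_t(d.shaderCombo) << 32) |
                            (std::uint64_t(d.stencilMode) << 24) |
                            (std::uint64_t(d.rasterMode) << 16) |
                            (std::uint64_t(d.blendMode) << 8) |
                            std::uint64_t(d.targetFormat);
        return std::hash<std::uint64_t>{}(key);
    }
};

// Stand-in for the driver object; a real one is what costs up to ~1s to build.
struct Pipeline { int id; };

class PipelineCache {
public:
    // Returns the cached pipeline, creating it on first use only.
    const Pipeline& get(const PipelineDesc& desc) {
        auto it = cache_.find(desc);
        if (it == cache_.end()) {
            ++creations_;  // this is where the expensive creation would happen
            it = cache_.emplace(desc, Pipeline{creations_}).first;
        }
        return it->second;
    }
    int creations() const { return creations_; }

private:
    std::unordered_map<PipelineDesc, Pipeline, PipelineDescHash> cache_;
    int creations_ = 0;
};

// Upper bound on distinct pipelines if every permutation can occur.
constexpr std::uint64_t permutationCount(std::uint64_t shaders, std::uint64_t stencil,
                                         std::uint64_t raster, std::uint64_t blend,
                                         std::uint64_t targets) {
    return shaders * stencil * raster * blend * targets;
}
```

a cache like this only avoids creating the same state twice. it does not remove the hitch on first use, which is exactly the problem: you either pre-warm it on the user's machine (load time) or you eat the stall at runtime.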
The rest of the changes are conceptual changes to simplify the resources model (no more different kinds of buffers, simpler texture semantics, etc.).
there still are. check out the links in the 2nd post. it's split into 2 stages:
1. you allocate a bunch of memory
2. you prepare it for a specific use case, e.g. as render target or vertex buffer.
now assume you want to use a texture as render target and then use it as source in the 2nd drawcall (e.g. some temporary texture you use in post processing). you need to state that to the API.
assume further you take advantage of the new multithreaded command generation, so you can't keep track of the state of an object inside the object; you need to track states per command buffer/thread instead.
assume further you don't want to do redundant state conversions, as those might be quite expensive (changing the layout of a buffer to make it best suited for texture sampling, for rendering, for vertex access), so you'd somehow need to merge the states of the resources you use in consecutive command buffers.
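the tracking described above could be sketched roughly like this. each recording thread keeps, per resource, only the first state it needs and the last state it leaves behind; at submit time you walk the command buffers in order and emit a conversion only where the incoming state really differs. all names (`CommandBufferTracker`, `resolve`, `ResourceState`, ...) are made up for illustration and the states are a simplification, not the actual Mantle/D3D12 API.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical set of usage states; Undefined must stay first (value 0).
enum class ResourceState { Undefined, RenderTarget, ShaderRead, VertexBuffer };

using ResourceId = int;

struct Transition { ResourceId resource; ResourceState from; ResourceState to; };
struct Usage { ResourceState first; ResourceState last; };

// One tracker per command buffer/thread, so no shared state while recording.
class CommandBufferTracker {
public:
    void use(ResourceId res, ResourceState needed) {
        auto it = usage_.find(res);
        if (it == usage_.end()) {
            // First use here: the incoming state is unknown until submit time.
            usage_.emplace(res, Usage{needed, needed});
            order_.push_back(res);
            return;
        }
        if (it->second.last != needed) {
            // State change inside this command buffer can be recorded right away.
            internal_.push_back({res, it->second.last, needed});
            it->second.last = needed;
        }
    }
    const std::vector<ResourceId>& resources() const { return order_; }
    const Usage& usage(ResourceId res) const { return usage_.at(res); }
    const std::vector<Transition>& internalTransitions() const { return internal_; }

private:
    std::unordered_map<ResourceId, Usage> usage_;
    std::vector<ResourceId> order_;      // first-use order, for deterministic output
    std::vector<Transition> internal_;
};

// A transition that has to run before the command buffer with the given index.
struct Patch { std::size_t commandBufferIndex; Transition transition; };

// Walks command buffers in submit order and emits only non-redundant transitions.
std::vector<Patch> resolve(const std::vector<const CommandBufferTracker*>& submitOrder,
                           std::unordered_map<ResourceId, ResourceState>& globalState) {
    std::vector<Patch> patches;
    for (std::size_t i = 0; i < submitOrder.size(); ++i) {
        const CommandBufferTracker& cb = *submitOrder[i];
        for (ResourceId res : cb.resources()) {
            const Usage& u = cb.usage(res);
            ResourceState& current = globalState[res];  // unseen resources start Undefined
            if (current != u.first) {
                patches.push_back({i, {res, current, u.first}});
            }
            current = u.last;  // what this command buffer leaves behind
        }
    }
    return patches;
}
```

with three command buffers that use the same texture as render target, shader source and shader source again, only the first two need a conversion; the third one is skipped. that merging step is the kind of bookkeeping the driver used to do for you.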
the more explicit threading model (only particularly relevant if you want/need render threading), and the more explicit device model (pick which GPU you use for what on multi-GPU systems).
you know you have to test and balance all that? CrossFire works across different GPUs. you can have an APU GPU + some mid-range Radeon HD 7700 + a top-notch R9 290X.
and with D3D, there is a generic driver that might execute asymmetrically on those GPUs. it's something you'd need to handle.
I'm not saying that's impossible, but for the majority of devs it can end up either in a lot of work (testing all kinds of configurations in various parts of your game) or in disappointing some high-end users because their expensive 4x CrossFire is no faster than 3x CrossFire, or is even buggy.
a lot of work that drivers did before will end up in the hands of devs, and it's not optional, it's what you'll have to do. you might ship a perfectly fine running game, and some new GPU might take advantage of something that hasn't been used before and uncover a bug in your 1-year-old game that people still play. and AMD/NV won't release a driver fix; you need to release a patch.
I see the benefits you've mentioned, but I also see all the drawbacks.
I like low level programming on consoles, below what Mantle/D3D12 offers, but I'm not sure about the PC side. when there was Glide/S3 Metal/RRedline/... and even GL was working differently (MiniGL/PowerSGL/..), every developer felt relieved when it ended with D3D. and the RefRas was actually pushed by game devs, to be able to validate that something is a driver bug. now it all seems forgotten and feels like a step back.
the Cass Everitt talk really seems like the best balance of both worlds to me (if it were extended a little bit).