
#5278241 AMD - Anisotropic filtering with nearest point filtering not working properly?

Posted by phantom on 26 February 2016 - 02:32 AM

As I said, it works fine on Nvidia and Intel chipsets. But on AMD it doesn't.

Just as a minor point and to reinforce something Hodgman said - just because it works as expected somewhere doesn't mean it is 'right'.

I have found in the past, typically with OpenGL, that when it comes to implementation AMD tend to stick closer to the letter of the spec, whereas NV take a more 'whatever works' approach. This can mean that NV will let things go where AMD will throw up errors or do what you don't expect - annoying, but there it is :|

#5278034 Right way to handle people complaining about price?

Posted by phantom on 25 February 2016 - 02:51 AM

A few people have passed comment that it was stupid/daft/whatever of them to write an engine for the game, but consider this: five years ago the engine landscape was very different.

Unity had only recently come to Windows (and, from what I recall of it at the time, wasn't that great), UE4's open source model didn't exist, and most other engines were either utter tosh or required you to pay to get access to them; in the context of THEN, starting from the ground up made sense.

If we were having this discussion in 2021 then I'd agree it might be a bit silly, but as it was...

#5275906 instancing with multiple meshes

Posted by phantom on 16 February 2016 - 07:15 AM

IIRC DX11 doesn't have the equivalent of that; you have to issue multiple indirect draw calls, which still requires N draw calls for N sets of instances. (A bit of API trickery in the GL world hides this, but it still breaks down to multiple draw packets on the backend.)

AMD supports it in DX11 via their "AMD GPU Services" extension library (in their GPUOpen GitHub). They also support it in HW - the GPU's command-processing front-end decodes the single multi-draw call.

Ah, I stand corrected.
Note to self; re-read the GCN docs at some point...

#5275893 instancing with multiple meshes

Posted by phantom on 16 February 2016 - 05:17 AM

IIRC DX11 doesn't have the equivalent of that; you have to issue multiple indirect draw calls, which still requires N draw calls for N sets of instances. (A bit of API trickery in the GL world hides this, but it still breaks down to multiple draw packets on the backend.)

If you want to do 'multiple meshes in one draw call' then, depending on your hardware target, the OP's original idea will work.

- You need one buffer with all the vertex data in it, and another buffer which lets you offset into it at a per-instance (or, I guess, per-group-of-instances) frequency.
- Bind both buffers as shader inputs, but not as traditional vertex buffers.
- Issue your instanced draw with the correct vertex count and instance count (which would be the total required); the vertex and index buffers can be 'null' for this.
- In your vertex shader, use the incoming values (vertex ID and instance ID) to index into the offset and vertex buffers as required and pull the vertex data yourself.
- At which point you proceed as normal.

One important point with this method is that your meshes have to have the same vertex count; the vertex count is fixed by the input to the draw call, so you'll always get, say, 64 vertices processed per instance. You could, of course, use degenerate triangles to pad out meshes with fewer output verts, but you'll still pay the vertex shader cost for them.

D3D12 (and I assume Vulkan) offer more interesting ways of doing this as you can construct your own multi-draw indirect buffers from a compute shader which also allows changing of vertex buffers per draw packet; that's a tad more complicated however ;)

On the D3D11/vertex shader route, however, this presentation covers what I wrote above, but maybe in a slightly clearer way: http://www.gdcvault.com/play/1020624/Advanced-Visual-Effects-with-DirectX - the bit in question is 'Merge Instancing'.

#5275711 Vulkan is Next-Gen OpenGL

Posted by phantom on 15 February 2016 - 03:04 AM

AFAIK (and I hope I'm out of date in my knowledge):
• AMD fully supports this -- for example, shadow-mapping draw-calls are bottlenecked by the HW rasterizer, leaving the compute cores mostly idle. Post-processing on the other hand generally fully saturates all the compute cores. Async compute means that you can render your shadow maps and perform post processing at exactly the same time, making use of compute resources that traditionally would've been sitting idle during that time.
• NVidia pretends to support this -- they'll let you set up the graphics/general queue and the dedicated compute-only queue, and let you submit your shadow draw calls into the former and your post-process dispatch calls into the latter... but in the hardware, they'll still only be working on one at a time - either the draw calls, or the dispatch calls, but never both at once.

Pretty much, although based on my reading of the hardware spec the Maxwell 1 hardware can work on compute and gfx at the same time, but due to the way the dispatcher is arranged it won't always do so.

So on AMD hardware you have the ACEs and the Gfx hardware queues feeding into a dispatcher for the CU blocks - the work can be picked from any of them and is balanced based on what is going on. So on a high-end card you'll have 9 hardware pipes backed by 65 software pipes feeding into the CU array.
(This is true of all GCN hardware, it is the number of ACEs that differ.)

NV on the other hand work in one of two modes;
- Graphics/Mixed mode
- Compute only

In compute-only mode, ever since Kepler '2' (aka Titan/780), the hardware has had 32 queues (although NV's lack of documentation doesn't say what kind; I'm assuming 'software') to pull work from.

However, Maxwell 1 and earlier could only pull from a single queue in 'mixed mode' and relied on the dispatcher to throw more than one lot of work into their CUDA array.
Now, when all this hit, I put forward the idea that you could, with driver assistance on a per-game basis, probably work around this problem - you'd need to massage the commands going into the queue so you could interleave gfx and compute sanely. (I'm assuming they must do something like this to let PhysX and gfx play nicely? Or maybe PhysX and gfx don't play nicely...) However, that raises the question of 'why not for this benchmark?' given they apparently had access to it for some time; couldn't, or didn't think it would matter? (And 'couldn't' feeds into my PhysX/gfx question.)

With Maxwell 2 this, afaik, largely goes away - mixed mode can pull from 32 queues as with compute (1 gfx, 31 compute; assuming 'software' as before) - but there is still the central dispatcher setup, which might still cause some limitations with regards to the amount of work 'in flight'. It is, after all, effectively doing the work of all of AMD's ACEs and gfx front end in one chunk of silicon, but a lack of technical details makes it hard to say what is going on and how it'll be impacted.

I fully expect Pascal to fix all these problems for NV, however - if it doesn't have a reworked front end for this kind of workload I'll be surprised.
(Also, the fact MW1 and MW2 are set up like this doesn't surprise me; imo NV hardware has always been a 'good enough' implementation of things, in that it'll do what is required for now but doesn't hold up as well into the future. GCN seems to be more 'future proof' in some regards.)

It'll be also interesting to see what AMD's Polaris brings to the table; they appear to be adding some more dispatcher logic so I'm looking forward to hopefully getting a GCN-style deep dive on that data.

(Relevant link for the compute/dispatch stuff: AMD dives deep on async shading.)

#5275675 D3D12: Copy Queue and ResourceBarrier

Posted by phantom on 14 February 2016 - 04:58 PM

The copy queue 'could' be optimised; it is a bit hardware specific.

For example, on Intel I doubt you'd get any payback; AMD however have dedicated DMA hardware on their GPU, so copying can be handled separately from other operations. (Same with compute queues; GCN has up to 8 hardware queues, each servicing up to 8 software queues - although if memory serves you can currently only create one unique queue per type with D3D12.)
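For illustration, a minimal sketch of what a dedicated copy queue looks like in D3D12, so uploads can be scheduled on the DMA engine alongside graphics work. This assumes an existing ID3D12Device named device and is a fragment, not a complete program:

```cpp
// Describe a queue of the COPY type; on hardware with a DMA engine
// (e.g. GCN's SDMA units) the driver can map this to that engine.
D3D12_COMMAND_QUEUE_DESC copyDesc = {};
copyDesc.Type = D3D12_COMMAND_LIST_TYPE_COPY;
copyDesc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_NORMAL;

Microsoft::WRL::ComPtr<ID3D12CommandQueue> copyQueue;
HRESULT hr = device->CreateCommandQueue(&copyDesc, IID_PPV_ARGS(&copyQueue));
// Command lists recorded from a COPY-type allocator can then be
// submitted here, independently of the direct (graphics) queue.
```

Whether this actually overlaps with graphics work is, as the post says, hardware specific; the API merely expresses the opportunity.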

#5275673 Vulkan is Next-Gen OpenGL

Posted by phantom on 14 February 2016 - 04:55 PM

Maxwell 1 supports it, but there is an 'issue' with how the hardware dispatches work which means that if not fed in a certain way problems can appear.

As I recall Maxwell 2 has a slightly changed dispatcher front end which makes the problem vanish (or if not 'vanish' then isn't noticeable).

"Supports" is always a fun word in graphics land.

(If you go and read NV's whitepaper on Maxwell 1 hardware it is clear why it would have a problem with certain workloads - the driver could probably work around some of them on a case by case basis.)

FWIW I generally think GCN is a better overall arch, although that might be down to AMD telling us things and not hiding details; NV have had a massive driver advantage for some time, however, and a few hardware tricks which are better for games when combined with tuning.

#5275488 Vulkan is Next-Gen OpenGL

Posted by phantom on 12 February 2016 - 06:51 PM

From an IHV point of view Fermi is dead and gone; by the time Vulkan focused engines appear (as in totally killed GL support) it'll be another couple of card releases down the line and matter even less.
Financially it makes zero sense to invest in Fermi based tech at this point - it is already two generations old and isn't selling after all. (Pascal is due later this year too...)

Sucks for the developers who can't get one, but such is life... and it's not like GL is going away any time soon so you can still continue development and, to be frank, chances are the majority of games going forward will be either AAA in-house or built on engines such as UE4 and Unity, both of which will support the tech so it Just Works on target platforms.

Same reason you won't see Vulkan on pre-GCN hardware from AMD, and I'd be surprised if the 1.0 hardware got much love at this point...

#5275060 C++ and Lumberyard engine

Posted by phantom on 09 February 2016 - 05:44 PM

Unity reuses their engine between major releases and just rewrites the parts that are new, whereas Epic claims to rewrite their engine with every major release

Both engines are continual evolutions; bits of UE4 were indeed rewritten from UE3, but a lot was still the same on release (the majority of the core systems, for example). I think there was some miscommunication and people made some leaps/assumptions which led to this "UE4 was rewritten" legend.

Don't get me wrong; bits of both engines have been rewritten and reworked over time, what was good yesterday isn't going to be good today after all, but no one does a dump and rewrite, and certainly not in the time scale suggested.

(Having worked for both companies I feel I've got a degree of authority on this subject ;) )

#5273143 Vulkan is Next-Gen OpenGL

Posted by phantom on 29 January 2016 - 02:56 AM

OK, firstly, there was nothing about 'one API per hardware vendor being bad' said - it was "we do not need another API. OpenGL is fine".
That is what is said.

Secondly, yes "write the code once with one API and get it to work everywhere" is a nice idea.. but it is a pipedream even today.

D3D : Does a pretty good job of ironing out the wrinkles of the hardware but even for that you end up with "work around for driver funkiness" code paths which apply per driver/GPU.
OpenGL : See above but worse with added extension funkiness.
OpenGL|ES : Android is where 'one code path' goes to die.

On top of that you have the PS4's API, the subtle differences which are the Xbox360's API, the Xbox One's API, the PS3's API, Wii/WiiU, PSP and more recently Metal.

People are, and were, already writing multiple backends to get the best from hardware so that piece of FUD was already a reality for many.
(And, frankly, I don't expect Vulkan to change that all that much either - Windows will still require per-GPU workarounds and code paths for Reasons; maybe Android will improve if Google somehow get a handle on things, but even that'll suffer the same problem.)

But, again, that wasn't the argument made.

The argument made was that 'OpenGL is all you need' and that 'another API wasn't needed'.

Which was my whole point back in the first post that touched on this, and yet people seem to be having a problem understanding a simple concept.

#5272915 Vulkan is Next-Gen OpenGL

Posted by phantom on 27 January 2016 - 06:54 PM

Much like Longs Peak before it what we've seen of Vulkan looks sane and good...

Unfortunately Longs Peak went from a good idea to a pile of rubbish after a 6 month media blackout.
The same group of self serving companies are involved this time.

So, ya know, moan about it being 'unfair' as much as you like, the companies involved have previous when it comes to having good ideas and failing to execute...

(And let's not forget that, not long before they announced this new API plan, the same people involved went to great pains to bang the drum saying "another API wasn't needed" and that "opengl extensions will do"... NV have already started poking holes to allow GL/Vulkan to play together, likely in some non-standard way which will lead to dicking everyone else over while shifting the blame to the standard-supporting people - yes, they have previous with this as well. And until I see a driver I'm not convinced Apple are going to get involved either; currently neither iOS nor OSX are on the list of 'developing SDKs' from the last news post on the Vulkan site, which was posted over a month ago on the day they missed their release target...)

#5269390 Criticism of C++

Posted by phantom on 05 January 2016 - 06:10 AM

Yes, because the deprecation of some library features is the same as a 'rethink of the language'...

And as the level of conversation here has apparently descended to describing things as 'fail' and bitter sarcasm, I'm out; nothing productive will come of this.

#5269374 Criticism of C++

Posted by phantom on 05 January 2016 - 05:00 AM

I seriously think someone should re-think C++ from scratch instead of driving away into another language.

A 'rethink from scratch' would just give you another language.

Anything which breaks backwards compatibility is a new language.

End of story.

And people have tried; D, which was brought up earlier, is indeed this idea incarnate, yet it has failed to catch on.

A 'rethink of C++' is no longer C++.

#5269065 Criticism of C++

Posted by phantom on 03 January 2016 - 03:33 PM

Which is why I'm having big expectations for D (being developed by, basically, two people) once it matures a bit.

Yep, probably around the time of the Year Of The Linux Desktop...
(D is already 14 years old and has had two people working on it since 2006... if you look into the distance you can see the boat it missed sailing off with all the other languages which have come since partying on it...)

#5268117 Enable MSAA in DirectX 11

Posted by phantom on 27 December 2015 - 08:24 AM

On point 2 I think you've misread it; the data pointer must be null when creating a multisampled texture, as there is no way to upload multisample data to the GPU. Multisample textures are only useful as render targets (and for sampling from once they have been rendered to) and thus cannot be immutable.
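To make that concrete, a sketch of a 4x MSAA render-target description in D3D11 - note the null initial data and the DEFAULT (not IMMUTABLE) usage. This assumes an existing ID3D11Device named device and is a fragment, not a complete program:

```cpp
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = 1280;
desc.Height = 720;
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 4;           // 4x MSAA
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_DEFAULT;    // IMMUTABLE is invalid for MSAA textures
desc.BindFlags = D3D11_BIND_RENDER_TARGET;

ID3D11Texture2D* msaaTex = nullptr;
// Second parameter (pInitialData) must be null for a multisampled texture;
// the only way to fill it is to render into it.
HRESULT hr = device->CreateTexture2D(&desc, nullptr, &msaaTex);
```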