Matias Goldberg

Member Since 02 Jul 2006
Offline Last Active Today, 07:08 PM

#5215051 What are your opinions on DX12/Vulkan/Mantle?

Posted by Matias Goldberg on 06 March 2015 - 05:53 PM

There is something I don't really understand in Vulkan/DX12: the "descriptor" object. Apparently it acts as a GPU-readable chunk of data that holds texture pointer/size/layout and sampler info, but I don't understand how the descriptor set/pool concept works; this sounds a lot like an array of bindless texture handles to me.

Without going into detail: it's because only AMD and NVIDIA cards support bindless textures in their hardware; there's one major desktop vendor that doesn't support it even though its hardware is DX11-class. Also keep in mind that both Vulkan and DX12 want to support mobile hardware as well.
You will have to give the API a table of textures based on frequency of updates: One blob of textures for those that change per material, one blob of textures for those that rarely change (e.g. environment maps), and another blob of textures that don't change (e.g. shadow maps).
It's very analogous to how we have been doing constant buffers with shaders (provide different buffers based on frequency of update).
And you put those blobs into a bigger blob and tell the API "I want to render with this big blob which is a collection of blobs of textures"; so the API can translate this very well to all sorts of hardware (mobile, Intel on desktop, and bindless like AMD's and NVIDIA's).

If all hardware were bindless, this set/pool wouldn't be needed because you could change one texture anywhere with minimal GPU overhead like you do in OpenGL4 with bindless texture extensions.
Nonetheless this descriptor pool set is also useful for non-texture stuff, (e.g. anything that requires binding, like constant buffers). It is quite generic.
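As a rough mental model of the "blobs inside a bigger blob" idea above, here is a toy sketch. It is not Vulkan or D3D12 code; `DescriptorSet`, `DescriptorTable` and the texture names are all made up for illustration, and real descriptors are opaque GPU-visible data, not Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class DescriptorSet:
    """One 'blob' of bindings that are all updated at the same frequency (hypothetical)."""
    name: str
    bindings: list = field(default_factory=list)
    rebuilds: int = 0  # how many times this blob had to be rewritten

    def update(self, bindings):
        self.bindings = list(bindings)
        self.rebuilds += 1


@dataclass
class DescriptorTable:
    """The 'bigger blob': an ordered collection of sets handed over at draw time."""
    sets: list

    def flatten(self):
        # What a backend without bindless support might translate the table into:
        # a flat list of texture slots.
        return [b for s in self.sets for b in s.bindings]


static_set = DescriptorSet("static")          # textures that don't change
rare_set = DescriptorSet("rare")              # textures that rarely change
material_set = DescriptorSet("per_material")  # textures that change per material

static_set.update(["shadow_map"])
rare_set.update(["environment_map"])
table = DescriptorTable([static_set, rare_set, material_set])

draws = []
for textures in (["brick_diffuse", "brick_normal"], ["wood_diffuse", "wood_normal"]):
    material_set.update(textures)  # only the high-frequency blob is rewritten
    draws.append(table.flatten())
```

The point of the grouping is visible in the counters: after two draws the per-material blob was rebuilt twice while the other two were built once.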

#5215030 What are your opinions on DX12/Vulkan/Mantle?

Posted by Matias Goldberg on 06 March 2015 - 03:50 PM

I feel like to fully support these APIs I need to almost abandon the previous APIs support in my engine since the veil is so much thinner, otherwise I'll just end up adding the same amount of abstraction that DX11 does already, kind of defeating the point.

But it depends. For example if you were doing AZDO OpenGL, many of the concepts will already be familiar to you.
However, for example, AZDO never dealt with textures as thin as Vulkan or D3D12 do so you'll need to refactor those.
If you weren't following AZDO, then it's highly likely that the way you were using the old APIs is incompatible with the new ways.

Actually there are ways to do a kind of multithreading in OpenGL 4: (...). There is also glBufferStorage + IndirectDraw, which allows you to access a buffer of instanced data that can be written like any other buffer, e.g. concurrently.
But it's not as powerful as what Vulkan or DX12 offer, which allow issuing any command and not just instanced ones.

Actually DX12 & Vulkan are following exactly the same path glBufferStorage + IndirectDraw did. It just got easier, was made thinner, and can now handle other misc aspects from within multiple cores (texture binding, shader compilation, barrier preparation, etc).

The rest was covered by Promit's excellent post.

#5214737 Vulkan is Next-Gen OpenGL

Posted by Matias Goldberg on 05 March 2015 - 08:36 AM

subjecting yourself to the tortures that OpenGL driver writers had to endure for so long (and still will unless they got promoted).
The OpenGL API is significantly flawed, which is specifically why these kinds of major upgrades have been requested for so long(’s Peak).


That might be fun as a pet project but otherwise I don’t see the point(...)

IMO the point is that instead of having one GL implementation per vendor, we could have just one running on top of Vulkan. So if it doesn't work on my machine due to an implementation bug, I can at least be 90% certain it won't work on your machine either.
In principle it's no different from ANGLE which translates GL calls and shaders into DX9.
However ANGLE is limited to ES2/WebGL-like functionality and DX9 is a high level API with high overhead; while running on top of Vulkan could deliver very acceptable performance and support the latest GL functionality.

#5214490 Vulkan is Next-Gen OpenGL

Posted by Matias Goldberg on 04 March 2015 - 08:57 AM

THIS. A lot of people don't seem to get these are very low level APIs with a focus on raw memory manipulation and baking of objects/commands that are needed very frequently. You destroyed a texture while it was still in use?

Come on, times have changed. Current game engines use multithreading, and multithreading is one of the best ways to kill your game project; still, people are able to code games.

It's not really the same. Multithreading problems can be debugged and there's a lot of literature and tools to understand them.
It's much harder to debug a problem that locks up your entire system every time you try to analyze it.

I'm currently at the state of handling many things by buffers and in the application itself and that with OGL2.1 (allocate buffer, manage double/triple buffering yourself, handling buffer sync yourself etc.). Most likely I use only a few % of the API at all. I think that a modern OGL architecture (AZDO, using buffers everywhere including UBOs etc) will be close to what you could expect from vulkan and that if they expose some vulkan features as extensions (command buffer), then switching over to vulkan will not be a pain in the ass.

If you're already doing AZDO with explicit synchronization then you will find these new APIs pleasing indeed. However there are breaking changes like how textures are being loaded and bound. Since there's no hazard tracking, you can't issue a draw call that uses a texture until it is actually in GPU memory. Drivers were also handling residency for you, but since now they don't, out-of-GPU-memory errors can be much more common unless you write your own residency solution. Also how textures are bound is going to change.
Then, in the case of D3D12, there are PSOs, which fortunately you should already be emulating for forward compatibility.

Indeed, professional developers won't have many problems; whatever annoyance they may have is obliterated by the happiness from the performance gains. I'm talking from a rookie perspective.

#5214486 Litterature about GPU architecture ?

Posted by Matias Goldberg on 04 March 2015 - 08:39 AM

Perhaps this is a bit of shameless self-promotion, but I talked a bit about memory operations on modern hardware; it may be of interest to you.

They're a bit outdated, but the ATI Radeon 2000 programming guide and Depth In Depth from Emil Persson explain a lot of background concepts that are still relevant today (Hi Z, Z Compression, Early Z, Fast Z Clear, dynamic branching and divergence).
Seeing his two recent talks for modern archs is also useful to find the differences.

#5214367 Vulkan is Next-Gen OpenGL

Posted by Matias Goldberg on 03 March 2015 - 11:03 PM

Remember, Vulkan is going to be a huge pain in the ass compared to GL. The Vulkan API is _much_ cleaner, yes, but it also eschews all the hand-holding and conveniences of GL and forces you to manage all kinds of hardware state and resource migration manually. Vulkan does not _replace_ OpenGL; it simply provides yet another alternative.

The same is true in Microsoft land: D3D11.3 is being released alongside D3D12, bringing the new hardware features to the older API because the newer API is significantly more complicated to use due to the greatly thinner abstractions; it's expected that the average non-AAA developer will want to stick with the older, easier APIs.

THIS. A lot of people don't seem to get these are very low level APIs with a focus on raw memory manipulation and baking of objects/commands that are needed very frequently. You destroyed a texture while it was still in use? BAM! Graphics corruption (or worse, BSOD). You wrote to a constant buffer while it was still in use? Let the random jumping of objects begin! You manipulated the driver buffers and had an off-by-1 error? BAM! Crash or BSOD. Your shader has a loop and is reading the count from uninitialized memory? BAM! TDR kicks in or system becomes highly unresponsive.
You need to change certain states more frequently than you thought? Too bad, turns out you need to make some architectural modifications to do what you want efficiently.
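To make the "destroyed a texture while it was still in use" case concrete: one common way to stay out of trouble is to defer destruction until the GPU can no longer be reading the resource. The sketch below only models that bookkeeping. `DeferredDeleter` and the fixed `frames_in_flight` latency are assumptions for illustration; a real engine would normally wait on a fence rather than count frames.

```python
from collections import deque


class DeferredDeleter:
    """Toy model: keep retired resources alive until enough frames have passed."""

    def __init__(self, frames_in_flight=3):
        # Assumption: the GPU is never more than this many frames behind the CPU.
        self.frames_in_flight = frames_in_flight
        self.current_frame = 0
        self.pending = deque()  # (frame in which the resource was retired, resource)
        self.destroyed = []     # stands in for the actual "free GPU memory" call

    def retire(self, resource):
        # The application is done with the resource, but the GPU may not be.
        self.pending.append((self.current_frame, resource))

    def end_frame(self):
        self.current_frame += 1
        # Entries are queued in frame order, so only the front needs checking.
        while self.pending and self.pending[0][0] + self.frames_in_flight <= self.current_frame:
            _, resource = self.pending.popleft()
            self.destroyed.append(resource)


deleter = DeferredDeleter(frames_in_flight=2)
deleter.retire("old_texture")
deleter.end_frame()  # one frame later: still possibly in use, nothing is destroyed
deleter.end_frame()  # two frames later: now considered safe to destroy
```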

It's hard. But I love it; with great power comes great responsibility. None of this is a show-stopper for people used to low level programming. But it is certainly not newbie friendly like D3D11 or GL were (if you considered those newbie friendly). Anyway, a lot of people learned hardcore programming back in the DOS days when it was a wild west. So maybe this is a good thing.

#5213319 Render Queue Design

Posted by Matias Goldberg on 27 February 2015 - 08:41 AM

You seem to be missing the base theory on which L. Spiro built his posts/improvements.

The article Order your draw calls around from 2008 should shed light on your questions.
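For a quick feel of the general technique (sort-key based render queues), here is a minimal sketch: each draw gets one integer key and the queue is simply sorted by it. The field layout, bit widths and draw names are my own assumptions for illustration, not the article's exact layout.

```python
DEPTH_BITS = 24
MATERIAL_BITS = 16
DEPTH_MAX = (1 << DEPTH_BITS) - 1


def make_sort_key(layer, translucent, depth, material_id):
    """Pack draw state into one integer; a smaller key is drawn earlier.

    Assumes depth is normalized to [0, 1] and material_id < 2**MATERIAL_BITS.
    """
    depth_q = max(0, min(DEPTH_MAX, int(depth * DEPTH_MAX)))
    if translucent:
        # Translucent: depth dominates and is inverted so far objects draw first.
        low = ((DEPTH_MAX - depth_q) << MATERIAL_BITS) | material_id
    else:
        # Opaque: material dominates to reduce state changes, then front to back.
        low = (material_id << DEPTH_BITS) | depth_q
    # Most significant fields: layer, then the translucency flag.
    return (((layer << 1) | int(translucent)) << (DEPTH_BITS + MATERIAL_BITS)) | low


# (name, layer, translucent, depth, material_id)
draws = [
    ("glass_far", 0, True, 0.9, 7),
    ("crate_b", 0, False, 0.5, 3),
    ("glass_near", 0, True, 0.2, 7),
    ("crate_a", 0, False, 0.1, 3),
    ("wall", 0, False, 0.3, 1),
    ("hud", 1, False, 0.0, 9),
]
ordered = [d[0] for d in sorted(draws, key=lambda d: make_sort_key(*d[1:]))]
```

With this layout the opaque draws come out grouped by material, the translucent ones back to front, and the HUD layer last.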

#5213316 does g_p2DTex->SetResource(); moves GPU memory?

Posted by Matias Goldberg on 27 February 2015 - 08:23 AM

It just changes pointers.


But what happens inside of D3D11 is much more complex actually. The driver may have decided to page out Texture B from GPU memory because you were not using it (and probably it was running out of VRAM). If that's the case, setting Texture B means the driver will copy the data back from system RAM to VRAM.

And if it's really really really running out of space; it may page out Texture A to make room for Texture B (though it is extremely rare that a driver will page out a texture for another when both are going to be used in the same frame, in this case the driver will probably signal an out of GPU memory error; but if tex A was used in the previous frame and tex B in the next one, this might happen)


Also on a lot of hardware out there switching texture is a "relatively costly" CPU-side driver overhead as the driver needs to prepare all the texture descriptors that have changed. On some hardware this is quite cheap (almost free), on other hardware this has a cost as all their hardware texture registers have to be reset.


All of this is a lot of overhead. While GPU-side this is just switching pointers, internally:

  • The driver needs to track how often textures are being used, and decide to page out the ones that have remained unused for some time.
  • The driver needs to check if the texture needs to be paged in.
  • For some hardware, the driver may need to set all texture descriptors again (not just the ones that have changed) and bring the GPU to a temporary "mini-halt".
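To picture the first two bullets, here is a toy residency tracker with a least-recently-used policy. It is purely illustrative: real drivers use their own heuristics (and, as said above, try not to evict textures needed in the same frame); `ResidencyManager`, the budget and the sizes are made up.

```python
from collections import OrderedDict


class ResidencyManager:
    """Toy model of a driver paging textures in and out of a fixed VRAM budget."""

    def __init__(self, vram_budget):
        self.vram_budget = vram_budget
        self.resident = OrderedDict()  # texture name -> size, least recently used first
        self.used = 0
        self.page_ins = 0  # counts copies from system RAM to VRAM

    def bind(self, name, size):
        if name in self.resident:
            # Already in VRAM: just record the use, "it just changes pointers".
            self.resident.move_to_end(name)
            return
        if size > self.vram_budget:
            raise MemoryError("texture larger than the whole budget")
        # Page out the least recently used textures until the new one fits.
        while self.used + size > self.vram_budget:
            _, evicted_size = self.resident.popitem(last=False)
            self.used -= evicted_size
        self.resident[name] = size
        self.used += size
        self.page_ins += 1


vram = ResidencyManager(vram_budget=100)
vram.bind("A", 40)
vram.bind("B", 40)
vram.bind("A", 40)  # touching A makes B the least recently used
vram.bind("C", 40)  # does not fit, so B is paged out
```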

OpenGL4 with the bindless texture extension gets rid of all this driver overhead because it places the burden of managing texture residency on yourself (however **only** DX11-level hardware from NVIDIA and AMD supports bindless; Intel cards can't support it due to hw limitations); and DX12 promises to place the burden on the developer too (which is a good thing for us performance squeezers).


While we wait for the future to arrive, texture arrays are the next best thing; which allow you to choose between textures in the shader and only call SetResource very infrequently; while indirectly controlling residency (if you pack 16 textures together in the same array, the driver has to page them in/out as a whole pack). Though it has its disadvantages too (textures must share same pixel format, same resolution, have lower granularity for paging in/out).

#5213020 DXVA-HD Question

Posted by Matias Goldberg on 25 February 2015 - 10:35 PM

But the one thing that I can't find is how to specify the input file.

You don't.
The DXVA interface doesn't deal with file formats like mp4/mkv. You need to open the file yourself, demux it, read the video stream, and send it to the DXVA interface for decoding. Basically you have the engine but not a car or the wheels. You can use the engine to power a boat.


If, for example, your project is about replaying live streaming, then you don't need to deal with mp4 files or demuxing. You send the raw stream in your own format via UDP and send it directly for decoding once it arrives on your client PC.

For reference I'd recommend taking a look at the source code of Media Player Classic - Home Cinema (MPC-HC). It is open source and the best video player I've seen for Windows.

#5212911 Appending to an append buffer several times

Posted by Matias Goldberg on 25 February 2015 - 07:36 AM

You can, but keep an eye on performance. The more you write to a UAV, the less scalable the performance will be across multithreading hardware, which means performance may be greatly affected with each additional use if the GPU can't hide the latency.

#5212298 Succesful titles from non AAA studios (recent)

Posted by Matias Goldberg on 22 February 2015 - 12:26 PM

To answer OP's question... Flappy Bird.

Now I better run before I get shot and a war starts.

#5211726 Hiding savedata to prevent save backup

Posted by Matias Goldberg on 19 February 2015 - 12:39 PM

1. Just name the save "sound.bak" or something. Really simple but also very easy to "crack"!

Just mask it as an asset, exploiting a file format which allows putting more stuff at the end of the stream while regular file viewers will ignore your save data (e.g. png, jpg, pdf), like AngeCryption does (see slides).
Just make sure you don't really depend on that asset in case the file saving goes corrupt.
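A minimal sketch of the plain "trailing data" variant of this trick, using PNG: anything after the IEND chunk is normally ignored by image viewers. This is not AngeCryption itself, which is more elaborate; the helper names and the tiny 1x1 test image are assumptions for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, data):
    # PNG chunk layout: length, type, data, CRC over type + data.
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png():
    """Build a valid 1x1 8-bit grayscale PNG to act as the 'asset'."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")


def png_end(blob):
    """Return the offset just past the IEND chunk by walking the chunk list."""
    if blob[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 12 <= len(blob):
        (length,) = struct.unpack(">I", blob[pos:pos + 4])
        kind = blob[pos + 4:pos + 8]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if kind == b"IEND":
            return pos
    raise ValueError("no IEND chunk found")


def hide_save(png_blob, save_data):
    # Replaces any previously appended save data.
    return png_blob[:png_end(png_blob)] + save_data


def extract_save(blob):
    return blob[png_end(blob):]


asset = tiny_png()
disguised = hide_save(asset, b"level=3;gold=120")
recovered = extract_save(disguised)
```

Walking the chunks instead of searching for the bytes `IEND` avoids being fooled by save data that happens to contain that sequence.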

2. Save the data to some silly folder like "C:/appdata/flashdata/fakecompany/sound.bak". But it's ugly to create a folder on the user's computer, and what if this folder is cleaned out (since it's not supposed to be affiliated with the game)? Then the user will lose their progress.

If you do that, your program enters malware territory.

3. Save a timestamp to the savefile and keep a registry of the timestamps somewhere. If the savefile is replaced they will mismatch and you can refuse to load that savegame. But what if the player backs up the registry too? That means I have to "hide" the registry file as well.

What happens if the clock goes kaputt? Quite common if the battery died. You'll just annoy your users.
Timestamps aren't reliable.

Also be aware that the process of safely saving a file (that is, following good practices) inherently involves performing an internal backup: (assuming no obfuscation) You first rename your Save.dat as Save.dat.old; then write your new Save.dat; and finally delete Save.dat.old
If the system crashes or power goes off, you first check if there's Save.dat.old and verify Save.dat is valid and doesn't crash if loaded. Once Save.dat is known to be ok, delete Save.dat.old; otherwise delete Save.dat and rename Save.dat.old as Save.dat
This way your user won't lose their entire progress, just the last progress they did (the power went off while saving... after all).
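The procedure above can be sketched roughly like this. `safe_save`, `recover` and the `is_valid` callback are hypothetical names, and what counts as a "valid" save is up to your game; call `recover` at startup, before the first save.

```python
import os


def safe_save(path, data):
    """Write a save file while keeping the previous one as a temporary backup."""
    old = path + ".old"
    if os.path.exists(path):
        os.replace(path, old)  # step 1: Save.dat -> Save.dat.old
    with open(path, "wb") as f:  # step 2: write the new Save.dat
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before dropping the backup
    if os.path.exists(old):
        os.remove(old)  # step 3: the new save is complete, delete the backup


def recover(path, is_valid):
    """Run at startup: if a backup is lying around, the last save was interrupted."""
    old = path + ".old"
    if not os.path.exists(old):
        return  # the last save finished cleanly
    if os.path.exists(path) and is_valid(path):
        os.remove(old)  # the new save made it, the backup is stale
    else:
        if os.path.exists(path):
            os.remove(path)  # half-written or corrupt save
        os.replace(old, path)  # fall back to the previous save
```

Usage would be something like `recover("Save.dat", my_validator)` when the game starts and `safe_save("Save.dat", blob)` whenever the player saves.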

Keep in mind that with solutions that rely on writing to two or more locations to verify the save hasn't been tampered with, you have to be very careful that writing to all those files ends up as an atomic operation; otherwise your "anticheat" technology will turn against your honest users who just experienced a system crash or a power outage and now have a valid save file with a corrupt verification file.

Why prevent cheating on single player games? Cheating is part of the fun. Otherwise TAS Video communities wouldn't prosper.

#5211441 Strange CPU cores usage

Posted by Matias Goldberg on 18 February 2015 - 08:30 AM

If you check the docs from the libs you're using, audio stuff in SDL is multithreaded.


Starting with Windows Vista, all audio is software based; unlike Win XP which could have hardware acceleration. This could easily explain the higher cpu usage.

Just check with a profiler or with ProcessExplorer which threads are active.

#5209331 glTexSubImage3D, invalid enum

Posted by Matias Goldberg on 07 February 2015 - 05:52 PM

Then you've been using GL wrongly or have been out of touch with the driver team (following the devs on Twitter is also a good idea). Often they've fixed my bug reports within a week and included the fix in the next driver update.
Yes, sRGB textures got broken in one of their releases and got fixed in the next driver release, which was a long time ago by now. I've been doing very bleeding edge OpenGL 4.4 and lots of low level management and haven't run into problems that haven't been fixed after being reported.

#5208465 SDL2 and Linux [RANT]

Posted by Matias Goldberg on 03 February 2015 - 03:10 PM

Roots was correct, my anger was excessive considering it is free software.


However, a good part of that anger was fueled by the fact that one major bug (maximizing, resizing and restoring) was not only reported in 2012, but also had multiple patches proposed that were never applied. This made me question the will of the developers to push the sw forward on the Linux platform.

Add that to the other bugs, and my anger went off the charts. I mean, a program that hangs if the video drivers aren't really, really up to date (i.e. we first try to create a GL context, and if that fails we try again with a lower spec) can't be deployed (the amount of bug reports would be too high); which means I would have to seriously reconsider using SDL.
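For context, the "try again with a lower spec" pattern looks roughly like the sketch below. `create_context` is a hypothetical callable standing in for whatever your windowing library provides (it is not an SDL function), and the version list is just an example.

```python
def create_context_with_fallback(create_context, versions=((4, 4), (3, 3), (2, 1))):
    """Try each GL version from highest to lowest and return the first that works.

    Assumes `create_context(major, minor)` raises RuntimeError on failure
    instead of hanging, which is exactly what the bug above violated.
    """
    errors = []
    for major, minor in versions:
        try:
            return create_context(major, minor), (major, minor)
        except RuntimeError as exc:
            errors.append(f"{major}.{minor}: {exc}")
    raise RuntimeError("no usable GL context: " + "; ".join(errors))
```

The pattern only works if a failed attempt returns control to the caller; a hang on the first attempt means the lower-spec attempts never run.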


However, the fact that two of those major bugs (which strongly affect end-user deployment) were fixed within a day of this post restores my faith in the software; it is living up to its good reputation.