Promit

Member Since 29 Jul 2001
Online Last Active Today, 11:32 AM

#5215592 What are your opinions on DX12/Vulkan/Mantle?

Posted by Promit on 10 March 2015 - 12:22 AM


On the flip side, I am a bit concerned about sync issues. Sync between CPU and GPU (or even the GPU with itself) can lead to some really awful, hard-to-track down bugs. It's bad because you might think that you're doing it right, but then you make a small tweak to a shader and suddenly you have artifacts. It's hard enough dealing with that for one hardware configuration, so it's a little scary to imagine what could happen for PC games that have to run on everything. Hopefully there will be some good debugging/validation functionality available for tracking this down, otherwise we will probably end up with drivers automatically inserting sync points to prevent corruption (and/or removing unnecessary syncs for better performance). Either way, beginners are probably in for a rough time. 

Don't worry, a variety of shipping professional games will somehow make a complete mess of it in the final build too.




#5215538 Vulkan is Next-Gen OpenGL

Posted by Promit on 09 March 2015 - 06:21 PM


I've access to a machine with 32 cores, and believe me, for the vast vast vast majority of end-user computing, 4 cores is plenty.

Yeah but as a dev my 6x2-core CPU is bottlenecking my builds against my SSD RAID array.




#5215026 curious question, how good is Source engine?

Posted by Promit on 06 March 2015 - 03:25 PM

What he said. Source 1 is a vintage design, and to be quite candid it doesn't even seem to be a great design for the timeframe. There's no useful information about Source 2 upon which to speculate.




#5215019 What are your opinions on DX12/Vulkan/Mantle?

Posted by Promit on 06 March 2015 - 02:56 PM

Many years ago, I briefly worked at NVIDIA on the DirectX driver team (internship). This is Vista era, when a lot of people were busy with the DX10 transition, the hardware transition, and the OS/driver model transition. My job was to get games that were broken on Vista, dismantle them from the driver level, and figure out why they were broken. While I am not at all an expert on driver matters (and actually sucked at my job, to be honest), I did learn a lot about what games look like from the perspective of a driver and kernel.

 

The first lesson is: Nearly every game ships broken. We're talking major AAA titles from vendors who are everyday names in the industry. In some cases, we're talking about blatant violations of API rules - one D3D9 game never even called BeginScene/EndScene. Some are mistakes or oversights - one shipped bad shaders that heavily impacted performance on NV drivers. These things were day to day occurrences that went into a bug tracker. Then somebody would go in, find out what the game screwed up, and patch the driver to deal with it. There are lots of optional patches already in the driver that are simply toggled on or off as per-game settings, and then hacks that are more specific to games - up to and including total replacement of the shipping shaders with custom versions by the driver team. Ever wondered why nearly every major game release is accompanied by a matching driver release from AMD and/or NVIDIA? There you go.

 

The second lesson: The driver is gigantic. Think 1-2 million lines of code dealing with the hardware abstraction layers, plus another million per API supported. The backing function for Clear in D3D 9 was close to a thousand lines of just logic dealing with how exactly to respond to the command. It'd then call out to the correct function to actually modify the buffer in question. The level of complexity internally is enormous and winding, and even inside the driver code it can be tricky to work out how exactly you get to the fast-path behaviors. Additionally the APIs don't do a great job of matching the hardware, which means that even in the best cases the driver is covering up for a LOT of things you don't know about. There are many, many shadow operations and shadow copies of things down there.

 

The third lesson: It's unthreadable. The IHVs sat down starting from maybe circa 2005, and built tons of multithreading into the driver internally. They had some of the best kernel/driver engineers in the world to do it, and literally thousands of full blown real world test cases. They squeezed that system dry, and within the existing drivers and APIs it is impossible to get more than trivial gains out of any application side multithreading. If Futuremark can only get 5% in a trivial test case, the rest of us have no chance.

 

The fourth lesson: Multi GPU (SLI/CrossfireX) is fucking complicated. You cannot begin to conceive of the number of failure cases that are involved until you see them in person. I suspect that more than half of the total software effort within the IHVs is dedicated strictly to making multi-GPU setups work with existing games. (And I don't even know what the hardware side looks like.) If you've ever tried to independently build an app that uses multi GPU - especially if, god help you, you tried to do it in OpenGL - you may have discovered this insane rabbit hole. There is ONE fast path, and it's the narrowest path of all. Take lessons 1 and 2, and magnify them enormously. 

 

Deep breath.

 

Ultimately, the new APIs are designed to cure all four of these problems.

* Why are games broken? Because the APIs are complex, and validation varies from decent (D3D 11) to poor (D3D 9) to catastrophic (OpenGL). There are lots of ways to hit slow paths without knowing anything has gone awry, and often the driver writers already know what mistakes you're going to make and are dynamically patching in workarounds for the common cases.

* Maintaining the drivers with the current wide surface area is tricky. Although AMD and NV have the resources to do it, the smaller IHVs (Intel, PowerVR, Qualcomm, etc) simply cannot keep up with the necessary investment. More importantly, explaining to devs the correct way to write their render pipelines has become borderline impossible. There are too many failure cases. It's been understood for quite a few years now that you cannot max out the performance of any given GPU without having someone from NVIDIA or AMD physically grab your game source code, load it on a dev driver, and do a hands-on analysis. These are the vanishingly few people who have actually seen the source to a game, the driver it's running on, and the Windows kernel it's running on, and the full specs for the hardware. Nobody else has that kind of access or engineering ability.

* Threading is just a catastrophe and is being rethought from the ground up. This requires a lot of the abstractions to be stripped away or retooled, because the old ones required too much driver intervention to be properly threadable in the first place.

* Multi-GPU is becoming explicit. For the last ten years, it has been AMD and NV's goal to make multi-GPU setups completely transparent to everybody, and it's become clear that for some subset of developers, this is just making our jobs harder. The driver has to apply imperfect heuristics to guess what the game is doing, and the game in turn has to do peculiar things in order to trigger the right heuristics. Again, for the big games somebody sits down and matches the two manually. 

 

Part of the goal is simply to stop hiding what's actually going on in the software from game programmers. Debugging drivers has never been possible for us, which meant a lot of poking and prodding and experimenting to figure out exactly what it is that is making the render pipeline of a game slow. The IHVs certainly weren't willing to disclose these things publicly either, as they were considered critical to competitive advantage. (Sure they are guys. Sure they are.) So the game is guessing what the driver is doing, the driver is guessing what the game is doing, and the whole mess could be avoided if the drivers just wouldn't work so hard trying to protect us.

 

So why didn't we do this years ago? Well, there are a lot of politics involved (cough Longs Peak) and some hardware aspects but ultimately what it comes down to is the new models are hard to code for. Microsoft and ARB never wanted to subject us to manually compiling shaders against the correct render states, setting the whole thing invariant, configuring heaps and tables, etc. Segfaulting a GPU isn't a fun experience. You can't trap that in a (user space) debugger. So ... the subtext that a lot of people aren't calling out explicitly is that this round of new APIs has been done in cooperation with the big engines. The Mantle spec is effectively written by Johan Andersson at DICE, and the Khronos Vulkan spec basically pulls Aras P at Unity, Niklas S at Epic, and a couple guys at Valve into the fold.

 

Three out of those four just made their engines public and free with minimal backend financial obligation.

 

Now there's nothing wrong with any of that, obviously, and I don't think it's even the big motivating raison d'etre of the new APIs. But there's a very real message that if these APIs are too challenging to work with directly, well the guys who designed the API also happen to run very full featured engines requiring no financial commitments*. So I think that's served to considerably smooth the politics involved in rolling these difficult to work with APIs out to the market, encouraging organizations that would have been otherwise reticent to do so.

[Edit/update] I'm definitely not suggesting that the APIs have been made artificially difficult, by any means - the engineering work is solid in its own right. It's also become clear, since this post was originally written, that there's a commitment to continuing DX11 and OpenGL support for the near future. That also helped the decision to push these new systems out, I believe.

 

The last piece to the puzzle is that we ran out of new user-facing hardware features many years ago. Ignoring raw speed, what exactly is the user-visible or dev-visible difference between a GTX 480 and a GTX 980? A few limitations have been lifted (notably in compute) but essentially they're the same thing. MS, for all practical purposes, concluded that DX was a mature, stable technology that required only minor work and mostly disbanded the teams involved. Many of the revisions to GL have been little more than API repairs. (A GTX 480 runs full featured OpenGL 4.5, by the way.) So the reason we're seeing new APIs at all stems fundamentally from Andersson hassling the IHVs until AMD woke up, smelled competitive advantage, and started paying attention. That essentially took a three year lag time from when we got hardware to the point that compute could be directly integrated into the core of a render pipeline, which is considered normal today but was bluntly revolutionary at production scale in 2012. It's a lot of small things adding up to a sea change, with key people pushing on the right people for the right things.

 

 

Phew. I'm no longer sure what the point of that rant was, but hopefully it's somehow productive that I wrote it. Ultimately the new APIs are the right step, and they're retroactively useful to old hardware which is great. They will be harder to code. How much harder? Well, that remains to be seen. Personally, my take is that MS and ARB always had the wrong idea. Their idea was to produce a nice, pretty looking front end and deal with all the awful stuff quietly in the background. Yeah it's easy to code against, but it was always a bitch and a half to debug or tune. Nobody ever took that side of the equation into account. What has finally been made clear is that it's okay to have difficult to code APIs, if the end result just works. And that's been my experience so far in retooling: it's a pain in the ass, requires widespread revisions to engine code, forces you to revisit a lot of assumptions, and generally requires a lot of infrastructure before anything works. But once it's up and running, there's no surprises. It works smoothly, you're always on the fast path, anything that IS slow is in your OWN code which can be analyzed by common tools. It's worth it.

 

(*See this post by Unity's Aras P for more thoughts. I have a response comment in there as well.)




#5214901 Vulkan is Next-Gen OpenGL

Posted by Promit on 06 March 2015 - 02:06 AM

Alright, so today's Vulkan slides are now up:

https://www.khronos.org/developers/library/2015-gdc

Here's the main slideset (PDF): https://www.khronos.org/assets/uploads/developers/library/2015-gdc/Khronos-Vulkan-GDC-Mar15.pdf

Explicit multi-GPU is a go, including heterogeneous multi-vendor.




#5214362 Game Perfomance

Posted by Promit on 03 March 2015 - 10:24 PM

Moving this over to Game Programming.




#5214309 Vulkan is Next-Gen OpenGL

Posted by Promit on 03 March 2015 - 03:59 PM

There's nothing even vaguely unusual or improper about those excerpts. It's somewhat reflective of the standard internal MS dysfunction that has afflicted them for many years, but it's certainly not misconduct.




#5214288 Vulkan is Next-Gen OpenGL

Posted by Promit on 03 March 2015 - 02:10 PM

AMD published a new release: http://community.amd.com/community/amd-blogs/amd-gaming/blog/2015/03/03/one-of-mantles-futures-vulkan

The main point being that Vulkan is essentially an iterated cross platform version of Mantle. I like that AMD was willing to describe their own press release hours earlier as cryptic.




#5214223 Vulkan is Next-Gen OpenGL

Posted by Promit on 03 March 2015 - 09:36 AM

 


Why would MS have to do anything?
They don't support OpenGL, Mantle, OpenCL and CUDA and yet they all work just fine... this is no different.

 

Not on tablet/phone hardware they don't. Neither is there VS support, without wacky plugins from IHVs. Still, I don't know if I dare to dream that the new standards-attentive MS will actually boost Vulkan to first class support.




#5214221 Next-Gen OpenGL To Be Shown Off Next Month

Posted by Promit on 03 March 2015 - 09:34 AM

If nobody minds, I think it would be best if we moved to this thread: http://www.gamedev.net/topic/666286-vulkan-is-next-gen-opengl/




#5213481 Questions with GLSL Shaders

Posted by Promit on 28 February 2015 - 07:21 AM

In GLSL, you can use interpolation qualifiers to select between interpolation of attributes across the triangle (smooth) or a single value taken from the provoking vertex (flat). So using the value set in the last vertex is an option, though not the default.
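
In the shader itself the choice is just the qualifier on the declaration, e.g. `flat in vec3 color;` versus `smooth in vec3 color;`. To make the difference concrete, here is a rough CPU-side sketch in C++ of what each qualifier means for a single fragment. The struct and function names are made up for illustration, perspective correction is left out, and the default provoking vertex is assumed to be the last one, as in OpenGL's default convention.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper type: barycentric weights of one fragment inside a triangle.
// The three weights are expected to sum to 1.
struct Barycentric { float w0, w1, w2; };

// "smooth": blend the three per-vertex values with the fragment's barycentric weights.
// (Real hardware also applies perspective correction, which is omitted here.)
float interpolateSmooth(const std::array<float, 3>& v, Barycentric b)
{
    return v[0] * b.w0 + v[1] * b.w1 + v[2] * b.w2;
}

// "flat": every fragment of the triangle receives the provoking vertex's value unchanged.
// Index 2 (the last vertex) is assumed as the default, matching OpenGL's default convention.
float interpolateFlat(const std::array<float, 3>& v, int provokingVertex = 2)
{
    assert(provokingVertex >= 0 && provokingVertex < 3);
    return v[provokingVertex];
}
```

With vertex values {0, 0, 1} and weights {0.25, 0.25, 0.5}, the smooth path blends the three values while the flat path simply returns whatever the last vertex carried.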


#5212403 Designing a graphics programming portfolio

Posted by Promit on 23 February 2015 - 12:59 AM

As a portfolio type piece, I wouldn't even bother putting out anything that doesn't include full per pixel lightmapping with a basic tangent space bump map implementation. As adam said above, most of that list is just a waste of time. Post proc, HDR, deferred shading, forward+, PBR - these are things that make for portfolio items. I wouldn't count skeletal either unless you do something above and beyond a simple model playing simple full body animations.




#5211841 How to limit your FPS ?

Posted by Promit on 20 February 2015 - 01:41 AM

Traditionally, the advice is not to limit it and simply let the machine do what it wants to do. However, recently we've all become more aware of battery life and the importance of games (at least, some games) being kinder to the battery. This leads to an argument that artificially limited framerates are actually useful in order to conserve power. Personally I'd recommend that you simply request vertical sync in your game and let that control things. 

 

That said, I'm going to give you enough rope to hang yourself. Please don't use this unless vsync really doesn't work for your situation, for whatever reason. If you DO have to use it, and again I'd like to emphasize that you probably shouldn't, but if you do - just put it at the very top or very bottom of your top level render loop function.

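	// Requires <windows.h>; timeGetTime() also needs winmm.lib and returns milliseconds.
	const double updateInterval = 1000.0 / 60;	// target frame time in milliseconds
	static double lastUpdate = timeGetTime();
	double time = timeGetTime();
	while(time - lastUpdate < updateInterval)
	{
		Sleep(0);	// yields the time slice, but this is still a polling loop
		time = timeGetTime();
	}
	lastUpdate = time;	// start timing the next frame from here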
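
If you aren't on Windows, or you want the thread to actually sleep rather than poll, the same idea can be sketched with the standard library. This is only one possible shape, not the snippet above rewritten: `FrameLimiter` is a name invented for this example, and how precisely `sleep_until` wakes up depends on the OS scheduler, so don't expect perfect pacing from it.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper, not from the original post: blocks until the next frame deadline.
class FrameLimiter
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(double targetFps)
        : interval_(toInterval(targetFps)),
          next_(Clock::now() + interval_)
    {
    }

    // Call once per frame, at the very top or very bottom of the render loop.
    void wait()
    {
        std::this_thread::sleep_until(next_);  // sleeps until the deadline instead of polling
        const Clock::time_point now = Clock::now();
        // Advance the deadline. If the frame overran it, restart from now
        // rather than trying to catch up with a burst of short frames.
        next_ = (next_ + interval_ > now) ? next_ + interval_ : now + interval_;
    }

private:
    // Converts a frame rate into the clock's native duration type.
    static Clock::duration toInterval(double fps)
    {
        assert(fps > 0.0);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / fps));
    }

    Clock::duration interval_;
    Clock::time_point next_;
};
```

Usage is a `FrameLimiter limiter(60.0);` created once outside the loop and a `limiter.wait();` per frame. Same caveat as before: prefer vsync if it works for you.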




#5211599 What's a good general max fps for a 3d graphics engine?

Posted by Promit on 18 February 2015 - 08:39 PM


I don't want it to dip below 60

Only part that matters. Recommend you test on a wide variety of hardware. Everything else you said is largely unimportant, especially without a game in place.




#5211574 GLSL Driver optimizer

Posted by Promit on 18 February 2015 - 05:27 PM

 

By the way, the real reason they had problems was because they just pumped desktop shaders through a translator and expected that to work on mobile GPUs. It's no surprise that it performed badly.

If you want performance you have to hand code the shader specifically targeting the mobile GPU, using low precision, etc. and reading the manufacturer's recommended practices.

 

Yes, it's true that some of their inefficiencies were coming from translation - same as this thread. But in general you don't want to be burdened with removing redundant operations, temporaries, constant folds, etc by hand. That's why we created optimizers. These things are easy to do in a machine fashion and difficult/time consuming for developers. Sitting down and hand tuning shaders for each individual platform, particularly when it's stupid stuff like redundant copy removal, is ridiculous.
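
To show how mechanical this kind of work is, here is a toy constant folder in C++. It is a sketch over a made-up four-node expression tree (all type and function names are hypothetical), nothing like the IR a real shader compiler uses, but the pass has the same shape: walk bottom-up and collapse any operation whose operands are both constants.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature expression tree; real shader compilers use a much richer IR.
struct Expr
{
    enum class Kind { Constant, Variable, Add, Mul };
    Kind kind = Kind::Constant;
    float value = 0.0f;              // used by Constant
    std::string name;                // used by Variable
    std::unique_ptr<Expr> lhs, rhs;  // used by Add and Mul
};

using ExprPtr = std::unique_ptr<Expr>;

ExprPtr constant(float v)
{
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::Constant;
    e->value = v;
    return e;
}

ExprPtr variable(std::string n)
{
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Kind::Variable;
    e->name = std::move(n);
    return e;
}

ExprPtr binary(Expr::Kind k, ExprPtr l, ExprPtr r)
{
    auto e = std::make_unique<Expr>();
    e->kind = k;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Folds bottom-up: an Add or Mul whose operands are both constants becomes one constant.
ExprPtr fold(ExprPtr e)
{
    if (e->kind != Expr::Kind::Add && e->kind != Expr::Kind::Mul)
        return e;  // leaves cannot be simplified further
    e->lhs = fold(std::move(e->lhs));
    e->rhs = fold(std::move(e->rhs));
    if (e->lhs->kind == Expr::Kind::Constant && e->rhs->kind == Expr::Kind::Constant)
    {
        const float a = e->lhs->value;
        const float b = e->rhs->value;
        return constant(e->kind == Expr::Kind::Add ? a + b : a * b);
    }
    return e;
}
```

Feeding it `x * (2 + 3)` gives back `x * 5`. That's a few dozen lines for a machine and a tedious, error-prone chore to do by hand across every shader on every platform.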





