# OpenGL Vulkan is Next-Gen OpenGL

I did recompile all the loader and layer libraries from source using the VS 2015 compiler, actually, and am using the debug versions.  Though I guess since they're MSVC-generated a lot or all of the debugging information is only in the PDBs?  Maybe I should give up on trying to use MinGW for this...

I was wondering how many AAA games would be using Vulkan. I heard that the Frostbite engine's Mantle renderer would be converted to Vulkan, but it would only be used on platforms that don't support DirectX 12.

Aha!  I finally managed to solve the problem!

I was reading this page more closely, and it turns out that the loader needs to be able to find the layers using registry keys that point to the `VkLayer_xxx.json` manifests.  I added them and it all works perfectly now, even through MinGW instead of VS2015!
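
For anyone hitting the same thing, here is a minimal sketch of what those registry entries look like, written as a small C++ helper that generates the text of a `.reg` file. It assumes the explicit-layer key documented for the Windows loader (`HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\Vulkan\ExplicitLayers`), where each value name is the full path to a manifest and the DWORD data is 0. The function names and the example path are made up for illustration, and are not from the post above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Escape backslashes the way .reg files expect inside quoted value names.
std::string escapeForReg(const std::string& path) {
    std::string out;
    for (char c : path) {
        if (c == '\\') out += "\\\\";
        else out += c;
    }
    return out;
}

// Build the text of a .reg file that registers each manifest as an explicit
// layer. Each value name is the full path to a manifest; DWORD 0 enables it.
std::string makeLayerRegFile(const std::vector<std::string>& manifestPaths) {
    std::string reg = "Windows Registry Editor Version 5.00\n\n";
    reg += "[HKEY_LOCAL_MACHINE\\SOFTWARE\\Khronos\\Vulkan\\ExplicitLayers]\n";
    for (const std::string& path : manifestPaths) {
        assert(!path.empty() && "manifest path must not be empty");
        reg += "\"" + escapeForReg(path) + "\"=dword:00000000\n";
    }
    return reg;
}
```

Calling `makeLayerRegFile({"C:\\VulkanSDK\\Bin\\VkLayer_xxx.json"})` (a hypothetical path) yields a file you could write out in text mode and import with regedit. Adjust the path to wherever your rebuilt layer manifests actually live.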

I was wondering how many AAA games would be using Vulkan. I heard that the Frostbite engine's Mantle renderer would be converted to Vulkan, but it would only be used on platforms that don't support DirectX 12.

It makes sense for an engine to have a Vulkan spin, since it makes an Android port (almost) straightforward. The GL/GLES "ambiguity" is gone, and the render pass concept makes things like a G-buffer achievable on tiled architectures while bringing some memory bandwidth savings on desktop (so it makes sense to invest in render passes even if you don't target mobile at first).

I remember XCOM being ported to iOS while still keeping all of its mechanics and graphical complexity, and I think Vulkan may help with that kind of development.

It looks like although the Mali T880 and the Adreno 530 are marketed as Vulkan compatible, their drivers seem far from working:

http://vulkan.gpuinfo.org/displayreport.php?id=128

http://vulkan.gpuinfo.org/displayreport.php?id=133

The Mali doesn't expose the swapchain extension and reports no MSAA support. And the maximum grid size for compute workloads is 0...

Apparently both GPUs support the DX11 feature level, but the tessellation and geometry shader stages are not exposed.

I wonder how they got UE4 running on the Galaxy S7. Or maybe they are saving a more formal Vulkan announcement for Android N.
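
To make the checks above concrete, here is a rough sketch of how one might flag those gaps from a device report. `DeviceReport` and `findProblems` are hypothetical names. The struct is a plain-data stand-in for values that real code would read via `vkEnumerateDeviceExtensionProperties`, `vkGetPhysicalDeviceProperties` and `vkGetPhysicalDeviceFeatures`. Mapping "grid max size" to `maxComputeWorkGroupCount` is my assumption about which limit the report showed as 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical plain-data mirror of a few fields from a device report.
struct DeviceReport {
    std::vector<std::string> extensions;    // device extension names
    uint32_t maxComputeWorkGroupCount[3];   // VkPhysicalDeviceLimits field
    uint32_t framebufferColorSampleCounts;  // VkSampleCountFlags bitmask
    bool tessellationShader;                // VkPhysicalDeviceFeatures field
    bool geometryShader;                    // VkPhysicalDeviceFeatures field
};

// Return a human-readable list of the gaps discussed in the post.
std::vector<std::string> findProblems(const DeviceReport& r) {
    std::vector<std::string> problems;
    if (std::find(r.extensions.begin(), r.extensions.end(),
                  "VK_KHR_swapchain") == r.extensions.end())
        problems.push_back("no VK_KHR_swapchain");
    // Bit value 1 means single-sampled; any higher bit is an MSAA mode.
    if ((r.framebufferColorSampleCounts & ~1u) == 0)
        problems.push_back("no MSAA");
    for (uint32_t n : r.maxComputeWorkGroupCount) {
        if (n == 0) {
            problems.push_back("compute work group count is 0");
            break;
        }
    }
    if (!r.tessellationShader) problems.push_back("no tessellation shader");
    if (!r.geometryShader) problems.push_back("no geometry shader");
    return problems;
}
```

A report with no extensions, a sample-count mask of 1, zeroed compute limits and both shader features off would produce all five entries, which is roughly the picture described above.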

Async compute is looking to be a killer feature for DX12 / Vulkan titles. The benchmarks indicate that Pascal's implementation of async compute isn't half as good as AMD's.

Ah, one should be careful with such a blanket statement (unless you have some more detailed information?). Benchmarks are rarely unbiased and never fair, and interpreting the results can be tricky. I'm not trying to defend nVidia here, and you might quite possibly even be right, but based solely on the linked results, I think this is a bit of a hasty conclusion.

From what one can see in that video, one possible interpretation is "Yay, AMD is so awesome, nVidia sucks". However, another possible interpretation might be "AMD sucks less with Vulkan than with OpenGL". Or worded differently, AMD's OpenGL implementation is poor, with Vulkan they're up-to-par.

Consider the numbers. The R9 Fury is meant to attack the GTX980 (which by the way is Maxwell, not Pascal). With Vulkan, it has more or less the same FPS, give or take one frame. It's still way slower than the Pascal GPUs, but comparing against these wouldn't be fair since they're much bigger beasts, so let's skip that. With OpenGL, it is however some 30+ percent slower than the competing Maxwell card.

All tested GPUs gain from using Vulkan, but for the nVidia ones it's in the 6-8% range while for the AMDs it's in the 30-40% range. I think there are really two ways of interpreting that result.

Not really, because in every game that uses async compute, AMD gains much more than Pascal, including DirectX titles. Check this out if you don't believe me:

http://wccftech.com/nvidia-geforce-gtx-1080-dx12-benchmarks/

Does Doom have a configuration option to disable async compute? That would let you measure the baseline GL->Vulkan gain, and then the async gain separately.

We all knew from the existing data, though, that GCN was built for async and Nvidia is only now being pressured into it by their marketing department.

Not really, because in every game that uses async compute, AMD gains much more than Pascal, including DirectX titles. Check this out if you don't believe me:

http://wccftech.com/nvidia-geforce-gtx-1080-dx12-benchmarks/

That benchmark shows that samoth is correct. Async Compute seems to be improving performance by 2-5% for AMD, which is far from the massive improvement in Doom. The "AMD is awful at OpenGL" theory seems even more likely now.

There's an instructive comparison here: https://www.youtube.com/watch?v=ZCHmV3c7H1Q

At about the 4:35 mark it compares CPU frame times between OpenGL and Vulkan on both AMD and NVIDIA and across multiple generations of hardware.  The take-home I got from that was that yes, AMD does have significantly higher gains than NVIDIA, but those gains just serve to bring the two vendors level.  In other words, AMD's OpenGL CPU frame time was significantly worse than NVIDIA's to begin with.

So for example, the R9 Fury X went from 16.2 ms to 10.3 ms, whereas the GTX 1080 went from 10.7 ms to 10.0 ms, all of which supports the "AMD's OpenGL implementation is poor, with Vulkan they're up-to-par" reading.
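
As a quick worked check of those numbers, here is a small sketch that turns the before/after frame times into relative reductions. The helper name `frameTimeReduction` is made up. The four frame times are the ones quoted from the video above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Fractional reduction in frame time going from beforeMs to afterMs.
double frameTimeReduction(double beforeMs, double afterMs) {
    assert(beforeMs > 0.0);
    return (beforeMs - afterMs) / beforeMs;
}

// Print the reductions for the two cards quoted above.
void printQuotedReductions() {
    std::printf("R9 Fury X: %.1f%%\n", 100.0 * frameTimeReduction(16.2, 10.3));
    std::printf("GTX 1080:  %.1f%%\n", 100.0 * frameTimeReduction(10.7, 10.0));
}
```

That works out to roughly a 36% reduction for the Fury X against roughly 6.5% for the GTX 1080, while the resulting Vulkan frame times (10.3 ms and 10.0 ms) differ by only about 3%.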

It'd be downright embarrassing if AMD wasn't up-to-par with Vulkan, since they pretty much designed it (or at least its earlier incarnation).

The simple truth is that for quite some time now NV have had better drivers than AMD, both in DX11 and OpenGL, when it came to performance.
They did a lot of work and took advantage of the fact you could push work to other threads and use that time to optimise the crap out of things.

Vulkan and DX12, however, have different architectures which don't allow NV to burn CPU time to get their performance up. It also shows that when you get the driver out of the way in the manner that these new APIs allow, AMD's hardware can suddenly stretch its legs a bit and get on par with NV.

That is where the majority of the gains come from.

As for Async; firstly it seems that pre-Pascal forget it on NV. If they were going to enable things to work sanely on those cards they would have done so by now; I suspect the front end just simply doesn't play nice.

After that, it seems like a gain BUT it depends on what work is going on and it is going to vary per GPU as to how much of a win it will be.

I suspect the reason you see slightly more gains on AMD is down to the front end design; the 'gfx' command queue process can only keep so many work units in flight, let's pretend that number is 10. So when 10 work units have been dispatched into the ALU array it'll stop processing until there is spare space. I think it'll only retire in order, so even if WU1 finishes before WU0 it still won't have any spare dispatch space. However, AMD also have their ACEs, which can keep more work units in flight; even if you only have access to one of those, that'll be 2 more WU in flight in the array (iirc), so you can do more work if you have the space.
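
To illustrate the in-order retirement guess, here is a toy model, and only that: it is not how any real GPU front end is known to work. It counts the ticks needed to push a list of work units through a dispatcher with a fixed number of in-flight slots, freeing slots either strictly in dispatch order or as soon as a unit finishes. The function name and the tick-based model are my own assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: ticks needed to run all work units through a front end that can
// keep at most `slots` units in flight. Durations are in ticks (>= 1).
int ticksToFinish(const std::vector<int>& durations, std::size_t slots,
                  bool retireInOrder) {
    assert(slots > 0);
    std::vector<int> inFlight;  // remaining ticks, kept in dispatch order
    std::size_t next = 0;       // index of the next unit to dispatch
    int ticks = 0;
    while (next < durations.size() || !inFlight.empty()) {
        // Dispatch while a slot is free.
        while (inFlight.size() < slots && next < durations.size())
            inFlight.push_back(durations[next++]);
        // Advance time by one tick.
        ++ticks;
        for (int& remaining : inFlight)
            if (remaining > 0) --remaining;
        if (retireInOrder) {
            // A finished unit keeps its slot until all older units are done.
            while (!inFlight.empty() && inFlight.front() == 0)
                inFlight.erase(inFlight.begin());
        } else {
            // Any finished unit frees its slot immediately.
            inFlight.erase(std::remove(inFlight.begin(), inFlight.end(), 0),
                           inFlight.end());
        }
    }
    return ticks;
}
```

With durations `{4, 1, 1, 1}` and 2 slots, in-order retirement takes 5 ticks because the short WU1 sits on its slot until WU0 completes, while out-of-order retirement takes 4. Giving the in-order case 2 extra slots (4 in total), loosely standing in for the extra work an ACE could keep in flight, also brings it down to 4.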

NV, on the other hand, seems to have a more unified front end (they are piss poor at telling people things so a bit of guess work involved), which I suspect means they can get more work in flight so the same amount of spare time might not be there to take advantage of.

This is all guesswork, however; the main takeaway is that async can be a win, but it very much depends on the work going on within the GPU and the GPU it is being run on.

The big wins, however, are the API changes, which get the driver out of the way, let you use less CPU time setting things up per thread AND let you set up across threads.

Yeah it's both.

AMD's drivers have traditionally been (CPU) slower than NV's. Especially GL, which is an inconceivable amount of driver code (for comparison, NV's GL driver likely dwarfs Unreal Engine's code base).

A lot of driver work is now engine work, letting AMD catch up on CPU performance by handing half their responsibilities to smart engine devs who can use design instead of heuristics now :)

Resource barriers also give engine devs some opportunity to micro-optimize a bit of GPU time, which was the job of magic driver heuristics previously -- and NV's heuristic magic was likely smarter than AMD's.

AMD were the original Vulkan architects (and probably had disproportionate input into D3D12 as well - the benefits of winning the console war), so both APIs fit their HW architecture perfectly (closer API fit than NV).

AMD actually can do async compute right (again: perfect API/HW fit) allowing modest gains in certain situations (5-30%). Which could mean as much as 5ms of extra GPU time per frame :o
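
As a rough sanity check on that 5 ms figure, assuming a 60 FPS target (the frame budget is not stated above, so this is only one way to arrive at the number):

$$
t_{\text{frame}} = \frac{1000\ \text{ms}}{60} \approx 16.7\ \text{ms}, \qquad
0.30 \times 16.7\ \text{ms} \approx 5\ \text{ms}, \qquad
0.05 \times 16.7\ \text{ms} \approx 0.8\ \text{ms}
$$

So under that assumption the quoted 5-30% range corresponds to somewhere between just under 1 ms and about 5 ms of GPU time per frame.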

Axel Gneiting from id Software is porting Quake to Vulkan: https://twitter.com/axelgneiting/status/755988244408381443

This is cool; it's going to be a really nice working example and we'll be able to see exactly how each block of modern Vulkan code relates to the original cruddy old legacy OpenGL 1.1 code.

As for Async; firstly it seems that pre-Pascal forget it on NV. If they were going to enable things to work sanely on those cards they would have done so by now

It seems the problem isn't so much limited to async compute, but applies to compute and transfers in combination altogether.

I can confirm that from my immediate experience with using Blender (which on NV uses CUDA) on a desktop computer with a Maxwell GPU in comparison to my notebook which only has the Skylake CPU's integrated graphics chip.

Yeah, Blender on a notebook, and setting the 3D viewport to "render", what a fucked up idea, this cannot possibly work. Guess what, it works way better than on the desktop computer with dedicated GPU.

Now looking at the performance counters shown by Process Explorer, it turns out the Maxwell has two shader units busy on average (two!) whereas the cheap integrated Intel GPU has them all 95-100% busy. So I guess if the GPU does not spend its time doing compute, the time must go into doing DMA and switching engines between "transfer" and "compute". Otherwise I couldn't explain why the GPU usage is so darn low.

Now, since this is not an entirely unknown issue, I would seriously expect Pascal to schedule/pipeline that kind of mode switching much better, or even do it in parallel seamlessly (I think I have read something about them adding an additional DMA controller at one point, too, though I believe that was even for Maxwell, which would suggest that on yet older generations it's still worse?).

Although it's true that Vulkan performs much faster than OpenGL, I don't think OpenGL is going anywhere anytime soon. Now, engines such as Unreal and Unity will use Vulkan under the hood, but nobody in their right mind will code in Vulkan (have you seen this "simple" Hello Triangle?). However, I think it'll be interesting to see what happens. Who knows, maybe we'll all end up coding with Vulkan. Computers will continue to get faster, but so will our ambitions, so Vulkan seems like the future of graphics processing, but OpenGL will be here to stay due to its "simplicity" (at least compared to Vulkan).  :D

Modern OpenGL is also ridiculously complex and requires pages of code to render a triangle using the recommended "fast path". No one should be programming in GL either, except a small number of people within engine development teams :D

This.

OpenGL's reputation for "simplicity", I suspect, probably stems from John Carmack's comparison with D3D3 dating back to 1996-ish.  Nobody in their right mind would write production-quality performance-critical OpenGL code in that style any more either.

nobody in their right mind will code in Vulkan (have you seen this "simple" Hello Triangle?).

This isn't a triangle example, it's a simple rendering engine for triangles with an RGB colour at each vertex that has an easy to find hardcoded triangle in the middle as an example of the example:

    // Setup vertices
    std::vector<Vertex> vertexBuffer =
    {
        { {  1.0f,  1.0f, 0.0f }, { 1.0f, 0.0f, 0.0f } },
        { { -1.0f,  1.0f, 0.0f }, { 0.0f, 1.0f, 0.0f } },
        { {  0.0f, -1.0f, 0.0f }, { 0.0f, 0.0f, 1.0f } }
    };
    uint32_t vertexBufferSize = static_cast<uint32_t>(vertexBuffer.size()) * sizeof(Vertex);

    // Setup indices
    std::vector<uint32_t> indexBuffer = { 0, 1, 2 };

In example code, the priority is calling the API properly, not good abstractions (which would be confusing).

EDIT: quoting loses links. The "triangle example" is https://github.com/SaschaWillems/Vulkan/blob/master/triangle/triangle.cpp from the well-known Vulkan examples repository by Sascha Willems.
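
For readers who want to compile the excerpt on its own, here is a self-contained sketch. The `Vertex` layout (three floats of position, three of colour) is an assumption inferred from the initializer lists, and `makeTriangleVertices` and `byteSize` are hypothetical helpers that are not part of the linked example. Casting the whole product to `uint32_t`, rather than only `size()`, is a small variation that avoids an implicit narrowing conversion.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed layout: position followed by an RGB colour, all floats.
struct Vertex {
    float position[3];
    float color[3];
};

// The hardcoded triangle from the excerpt.
std::vector<Vertex> makeTriangleVertices() {
    std::vector<Vertex> vertices = {
        { {  1.0f,  1.0f, 0.0f }, { 1.0f, 0.0f, 0.0f } },
        { { -1.0f,  1.0f, 0.0f }, { 0.0f, 1.0f, 0.0f } },
        { {  0.0f, -1.0f, 0.0f }, { 0.0f, 0.0f, 1.0f } }
    };
    return vertices;
}

// Size in bytes of a vector's contents, as a buffer-creation call would need.
template <typename T>
uint32_t byteSize(const std::vector<T>& v) {
    return static_cast<uint32_t>(v.size() * sizeof(T));
}
```

With this layout, `byteSize(makeTriangleVertices())` is three times `sizeof(Vertex)`, and the index list `{ 0, 1, 2 }` occupies 12 bytes.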

I think that's not what these quotes (both towards Vulkan and modern GL) are meant to be or what they should be read as.

Sure, the draw-triangle code is concise, straightforward, and clear (both in GL4 and Vulkan), but it takes about 100 lines of code just to set up a context that can do anything at all in GL, and about three times as much in Vulkan. Plus, it takes like half a page of code to do what you would wish to be nothing more than "CreateTexture()" or the like.

Hence the assumption "no sane person will...". Almost everybody, almost all the time, will want to be using an intermediate library which does the Vulkan heavy lifting.

This looks to be a good assumption... Once there is a really good low-level library, we can foresee new higher-level libraries that provide easier ways to use it. But then we might end up with a plethora of libraries, surely not compatible with each other, each with its good and bad points.

It's also possible that future OpenGL implementations, and even Direct3D, will be built on top of Vulkan... But maybe I'm wrong here.

Sure, the draw-triangle code is concise, straightforward, and clear (both in GL4 and Vulkan), but it takes about 100 lines of code just to set up a context that can do anything at all in GL, and about three times as much in Vulkan. Plus, it takes like half a page of code to do what you would wish to be nothing more than "CreateTexture()" or the like.

Hence the assumption "no sane person will...". Almost everybody, almost all the time, will want to be using an intermediate library which does the Vulkan heavy lifting.

The thing is, creating a context is an absolutely standard task that is also something you will typically write code for once and once only, then reuse that code in subsequent projects (or alternatively, grab somebody else's code off the internet).
