Looking at the vulkan api so far, it could solve many of my rendering performance issues in my engine. My engine was based on OpenGL 1.2, followed by a transition to OGL 2.0 and lot of extension later left me with a more or less modern deferred render engine. Still, there exists some really old and ugly rendering code, most famous the GUI code. My gui code is based on pure immediate mode commands, calculating and rendering every single icon and every single text letter every single frame. According to gDebugger a screen full of text adds more than 15k api calls !
But... and this is the reason I never refactored it earlier, the performance impact was relative small. On my workstation the performance difference by enabling/disabling the gui is negligible, so API calls alone are not the reason for low performance. Thought this might have more impact on slower PCs. I took comfort in thinking, that once lot of text is displayed on the screen, atleast the performance impact while rendering the 3d world is not that obvious, better said, hidden by the wall of text ;-)
Immediate mode will (most likely ;-) ) be not available in the Vulkan API, so, it would be a good idea to refactor the gui first and take the gui renderer as test object for my first vulkan based rendering approach. Thought the API is not officially available yet, it seems that it will work concurrently with OpenGL. My gui renderer although shares some of the more interesting core stuff of the 3d render engine, that is texture/shader/pipeline management.
So, what did I do to refactor my gui engine ?
First off, it is based on buffers only. Buffers are accessed by un-/mapping it, gui elements are calculated, cached and batched. I implemented a buffer allocation mechanism (multi-core ready) including defragmentation, a command queue (multi-core ready) including multiple sub-queues. And eventually my shaders needed to be updated to work with the old and new system.
I can toggle both renderer during run-time and so far both work perfectly. The performance increase depends on the hardware, my laptop with i5 running the game on an intel HD 4000 benefits most of it, but the overall performance on a HD4000 is really bad, so the felt impact isn't that great. Nevertheless, some quick tests show, that fullscreen text rendering consumed up to 30/40ms per frame (horrible!) with the old approach , and 1-2ms with the new approach. I think, that the performance could be increaded further by utilizing the buffer better (eg double buffering), but the real performance killer is the 3d-rendering engine (too many state changed, no instancing, low batching utilization).
Next I will exchange my own math library implementation with glm, maybe I can get some more performance improvements out of it.
PS: after verifying the performance again, 30/40ms seems to be just wrong. It is more like taking it down from 5-6ms to 3-4ms. Thought the game is really slow on a HD4000. At higher settings I get only 7fps, the GPU needs 112 ms longer than the CPU to finish its work, without any further API calls or whatever involved. Reducing the settings and render resolution helps a lot, but it is not really comparable to the mobile dedicated NVidia GPU I have in my notebook, where it runs flawless .