Showing results for tags 'Vulkan' in content posted in Graphics and GPU Programming.

Found 111 results

  1. Folks continue to tell me that Vulkan is a viable general-purpose replacement for OpenGL, and with the impending demise of OpenGL on Mac/iOS, I figured it's time to take the plunge. On the surface this looks like going back to the pain of old-school AMD/NVidia/Intel OpenGL driver hell. What I'm trying to get a grasp of is where the major portability pitfalls are up front, and what my hardware test matrix is going to look like.

     The validation layers seem useful. Do they work sufficiently well that a program which validates is guaranteed to at least run on another vendor's drivers (setting aside performance differences)?

     I assume I'm going to need to abstract across the various queue configurations, i.e. a single queue on Intel, graphics+transfer on NVidia, graphics+compute+transfer on AMD? That seems fairly straightforward to wrap with a framegraph and bake the framegraph down to the available queues.

     Memory allocation seems like it's going to be a pain. Obviously there are some big memory-scaling knobs like render target resolution, enabling/disabling post-process effects, asset resolution, etc. But at some point someone has to play Tetris with the available memory on a particular GPU, and I don't really want to shove that off into manual configuration. Any pointers for techniques to deal with this in a sane manner? Any other major pitfalls I'm liable to run into when trying to write Vulkan code to run across multiple vendor/hardware targets?

     As for the test matrix, I assume I'm going to need to test one each of recent AMD/NVidia/Intel, plus MoltenVK for Apple. Are the differences between successive architectures large enough that I need to test multiple generations of cards from the same vendor? How bad is the driver situation in the Android chipset space?
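
     A minimal sketch of the kind of queue-family probing such an abstraction would start from (pickQueueFamilies is an illustrative name, not from any particular engine):

        #include <vulkan/vulkan.h>
        #include <vector>

        struct QueueFamilies { uint32_t graphics, compute, transfer; };

        QueueFamilies pickQueueFamilies(VkPhysicalDevice gpu) {
            uint32_t count = 0;
            vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
            std::vector<VkQueueFamilyProperties> props(count);
            vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, props.data());

            QueueFamilies f{UINT32_MAX, UINT32_MAX, UINT32_MAX};
            for (uint32_t i = 0; i < count; ++i) {
                VkQueueFlags flags = props[i].queueFlags;
                if ((flags & VK_QUEUE_GRAPHICS_BIT) && f.graphics == UINT32_MAX)
                    f.graphics = i;
                // Prefer a compute-only family (async compute) if one exists.
                if ((flags & VK_QUEUE_COMPUTE_BIT) && !(flags & VK_QUEUE_GRAPHICS_BIT))
                    f.compute = i;
                // Prefer a transfer-only family (DMA engine) if one exists.
                if ((flags & VK_QUEUE_TRANSFER_BIT) &&
                    !(flags & (VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT)))
                    f.transfer = i;
            }
            // Intel-style single-queue hardware: fall back to the graphics family.
            if (f.compute == UINT32_MAX)  f.compute = f.graphics;
            if (f.transfer == UINT32_MAX) f.transfer = f.graphics;
            return f;
        }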
  2. Hi, I'm searching for a good implementation of Vulkan in C#. It doesn't need to be the fastest, as it's for an editor, but I need the most complete and most actively maintained one. I saw a few, but none with updates in the past five months. I don't want to adopt a "dead" library, since I'll be using it for the next three to four years. Any ideas? Thanks.
  3. Hi everyone, I think my question boils down to "How do I feed shaders?" I was wondering what the good strategies are for storing mesh transformation data (world matrices) to then be used in the shader for transforming vertices (performance being the priority). I'm talking about a game scenario with quite a lot of both moving entities and static ones, which aren't repeated enough to be worth instanced drawing. So far I've only tried these naive methods:

     DX11:
     - Store the transforms of ALL entities in one constant buffer (and give each entity an index into the buffer for later modification), or
     - Store ONE transform in a constant buffer, and change it to the entity's transform before each draw call.

     Vulkan:
     - Use push constants to send the entity's transform to the shader before each draw call, and maybe use a separate device-local uniform buffer for static entities?

     The same question applies to lights. Any suggestions?
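
     A minimal sketch of the push-constant path described above (pipeline-layout creation omitted; the spec's guaranteed 128-byte minimum for push constants comfortably fits one 4x4 matrix):

        // Pipeline layout declares a push-constant range for the vertex stage.
        VkPushConstantRange range{};
        range.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
        range.offset     = 0;
        range.size       = sizeof(float) * 16;   // one mat4 world matrix

        // Per draw call: push the entity transform, then draw.
        void drawEntity(VkCommandBuffer cmd, VkPipelineLayout layout,
                        const float worldMatrix[16], uint32_t indexCount) {
            vkCmdPushConstants(cmd, layout, VK_SHADER_STAGE_VERTEX_BIT,
                               0, sizeof(float) * 16, worldMatrix);
            vkCmdDrawIndexed(cmd, indexCount, 1, 0, 0, 0);
        }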
  4. Cannot get rid of z-fighting (severity varies between no errors at all and ~40% failure).
     * The up-to-date validation layer has nothing to say.
     * The pipelines are nearly identical (differences: color attachments, descriptor sets for textures, depth write, depth compare op - LESS for the prepass and EQUAL later).
     * I did not notice anything funny when comparing the draw commands via NSight either - except, see the end of this post.
     * "invariant gl_Position" on all participating vertex shaders makes no difference ('invariant' does not show up in the decompile, but is present in the SPIR-V).
     * The gl_Position calculations are identical for all of them (also using identical source data: push constants + vertex attributes).

     However, when decompiling the SPIR-V back to GLSL via NSight, I noticed something rather strange: the depth prepass has "gl_Position.z = 2.0 * gl_Position.z - gl_Position.w;" added to it. What is this!? "gl_Position.y = -gl_Position.y;", which is always added to everything, I can understand - Vulkan's NDC is vertically flipped by default in comparison to OpenGL. That is fine. What is the muckery with z there for? And why is it only selectively added? Looking at my perspective projection code (the usual matrix multiplication, just simplified):

        vec4 projection(vec3 v) {
            return vec4(v.xy * par.proj.xy, v.z * par.proj.z + par.proj.w, -v.z);
        }

     All it ends up doing is doubling the w-part of 'proj' in z (proj = vec4(1.0, 1.33.., -1.0, 0.2)). How does anything show at all, given that I draw with compare op EQUAL? Decompile bug? I am out of ideas.
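
     For what it's worth, the mystery line is algebraically consistent with a depth-range conversion rather than a behavioral change: if \(z_v\) is clip-space z whose NDC depth \(z_v/w\) lies in Vulkan's \([0,1]\) range, remapping to OpenGL's \([-1,1]\) convention gives \(z_{gl}/w = 2(z_v/w) - 1\), i.e. \(z_{gl} = 2 z_v - w\) - exactly "2.0 * gl_Position.z - gl_Position.w". So one plausible (not authoritative) reading is that the decompiler emits GL-convention GLSL and inserts that remap itself, and its presence in only one shader is a decompile inconsistency rather than a real difference in the SPIR-V.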
  5. I have a rather specific question. I'm trying to learn about linked multi-GPU in Vulkan 1.1; the only real source I can find (other than the spec itself) is a single conference video. Anyway, each node in the linked configuration gets its own internal heap pointer. You can swizzle the node mask to your liking to make one node pull from another's memory. However, the only way to perform the "swizzling" is to rebind a new VkImage / VkBuffer instance to the same VkDeviceMemory handle (but with a different node configuration). This is effectively aliasing the memory between two instances with identical properties. I'm curious whether this configuration requires special barriers. How do image barriers work in this case? Does a layout transition on one alias automatically affect the other? I'm coming from DX12 land, where placed resources require custom aliasing barriers and each placed resource has its own independent state. It seems like Vulkan functions a bit differently. Thanks.
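
     For reference, a minimal sketch of the rebinding described above, using the Vulkan 1.1 core structs (device-group creation and image creation omitted; handles are illustrative):

        // Bind a second VkImage to the same memory, remapping per device:
        // pDeviceIndices[i] names whose memory instance device i binds to.
        uint32_t deviceIndices[2] = {1, 1};  // device 0 pulls from device 1's memory

        VkBindImageMemoryDeviceGroupInfo groupInfo{};
        groupInfo.sType            = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_DEVICE_GROUP_INFO;
        groupInfo.deviceIndexCount = 2;
        groupInfo.pDeviceIndices   = deviceIndices;

        VkBindImageMemoryInfo bindInfo{};
        bindInfo.sType        = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO;
        bindInfo.pNext        = &groupInfo;
        bindInfo.image        = aliasImage;   // second VkImage aliasing the memory
        bindInfo.memory       = memory;
        bindInfo.memoryOffset = 0;

        vkBindImageMemory2(device, 1, &bindInfo);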
  6. Trying to figure out why an input attachment reads as black with the NSight VS plugin - and failing. This is what I can see at the invocation point of the shader: the attachment is filled with correct data (just a clear to bright red in the previous render pass) and is used by the fragment shader:

        // SPIR-V decompiled to GLSL
        #version 450
        layout(binding = 0) uniform sampler2D accum;
        // originally: layout(input_attachment_index=0, set=0, binding=0) uniform subpassInput accum;
        layout(location = 0) out vec4 fbFinal;
        void main() {
            fbFinal = vec4(texelFetch(accum, ivec2(gl_FragCoord.xy), 0).xyz + vec3(0.0, 0.0, 1.0), 1.0);
            // originally: fbFinal = vec4(subpassLoad(accum).rgb + vec3(0.0, 0.0, 1.0), 1.0);
        }

     Yet the resulting image is bright blue, instead of the expected bright purple (red+blue). How can this happen? The 'fbFinal' format is B8G8R8A8_UNORM and the 'accum' format is R16G16B16A16_UNORM - i.e. nothing weird.
  7. Hello everyone! For my engine, I want to be able to automatically generate pipeline layouts based on shader resources. That works perfectly well in D3D12, since shader resources are not required to specify descriptor tables, so I use the reflection system and map different shader registers to tables as I need. In Vulkan, however, it looks like descriptor sets must be specified both in the SPIR-V bytecode and when creating the pipeline layout (why is that?). So it looks like I will have to mess around with the bytecode to tweak bindings and descriptor sets. I looked at SPIRV-Cross, but it seems it can't emit SPIR-V (funnily enough). I also use glslang to compile GLSL to SPIR-V, and for some reason the binding decoration is only present for those resources where I explicitly defined it. Does anybody know if there is a tool to change bindings in SPIR-V bytecode?
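
     Since set/binding live in plain OpDecorate instructions, one option is to patch the SPIR-V words directly. A minimal sketch, assuming a well-formed module (opcode/enum values are from the SPIR-V spec; no validation performed):

        #include <cstdint>
        #include <vector>

        // Rewrite the Binding/DescriptorSet literals of OpDecorate instructions
        // that target 'resultId', in place.
        void patchBinding(std::vector<uint32_t>& spirv, uint32_t resultId,
                          uint32_t newSet, uint32_t newBinding) {
            const uint32_t OpDecorate = 71, DecoBinding = 33, DecoDescriptorSet = 34;
            size_t i = 5;                               // skip the 5-word header
            while (i < spirv.size()) {
                uint32_t wordCount = spirv[i] >> 16;    // high 16 bits: word count
                uint32_t opcode    = spirv[i] & 0xFFFF; // low 16 bits: opcode
                if (wordCount == 0) break;              // corrupt module guard
                if (opcode == OpDecorate && wordCount == 4 && spirv[i + 1] == resultId) {
                    if (spirv[i + 2] == DecoDescriptorSet) spirv[i + 3] = newSet;
                    if (spirv[i + 2] == DecoBinding)       spirv[i + 3] = newBinding;
                }
                i += wordCount;
            }
        }

     SPIRV-Reflect and spirv-tools also expose higher-level APIs for this kind of remapping.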
  8. Hi, I am having problems with all of my compute shaders in Vulkan. They are not writing to resources, even though there are no problems in the debug layer, every descriptor seems correctly bound in the graphics debugger, and the shaders definitely take time to execute. I understand that this is probably a bug in my implementation, which is a bit complex (it tries to emulate a DX11-style rendering API), but maybe I'm missing something trivial in my logic here? Currently I am doing the following:
     - Set descriptors, such as VK_DESCRIPTOR_TYPE_STORAGE_BUFFER for a read-write structured buffer (which is a non-formatted buffer).
     - Bind the descriptor table / validate correctness via the debug layer.
     - Dispatch on the graphics/compute queue, the same one that is feeding the graphics rendering commands.
     - Insert a memory barrier with both stage masks as VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, srcAccessMask VK_ACCESS_SHADER_WRITE_BIT and dstAccessMask VK_ACCESS_SHADER_READ_BIT.
     - Also insert a buffer memory barrier just for the storage buffer I wanted to write.

     My application behaves like the buffers are empty, and the Nsight debugger also shows empty buffers (seems like everything is initialized to 0). I also tried the most trivial shader, writing a value of 1 to the first element of a uint buffer. Am I missing something trivial here? What would be another way to debug this further?
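
     For comparison, a minimal sketch of the dispatch-then-read barrier described above, with narrower stage masks (handles like 'storageBuffer' are illustrative):

        // After the dispatch: make compute writes to 'storageBuffer' visible
        // to subsequent shader reads on the same queue.
        VkBufferMemoryBarrier barrier{};
        barrier.sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
        barrier.srcAccessMask       = VK_ACCESS_SHADER_WRITE_BIT;
        barrier.dstAccessMask       = VK_ACCESS_SHADER_READ_BIT;
        barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
        barrier.buffer              = storageBuffer;
        barrier.offset              = 0;
        barrier.size                = VK_WHOLE_SIZE;

        vkCmdDispatch(cmd, groupsX, groupsY, 1);
        vkCmdPipelineBarrier(cmd,
                             VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,    // producer
                             VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT |
                             VK_PIPELINE_STAGE_VERTEX_SHADER_BIT |
                             VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,   // consumers
                             0, 0, nullptr, 1, &barrier, 0, nullptr);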
  9. Hi, running Vulkan with the latest SDK and validation layers enabled, I just got the following warning:

     That is really strange, because in DX11 we can have 15 constant buffers per shader stage, and my device (an Nvidia GTX 1050) is DX11-compatible, of course. Did anyone else run into the same issue? How is it usually handled? I would prefer not to enforce a lower number of CBs for the Vulkan device, and to stay as closely compliant with DX11 as possible. Any idea what could be the reason behind this limitation?
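
     The relevant device limit can at least be queried up front - presumably maxPerStageDescriptorUniformBuffers, whose spec-guaranteed minimum (12) is below DX11's 15 slots. A minimal sketch:

        #include <vulkan/vulkan.h>
        #include <cstdio>

        void printUniformBufferLimit(VkPhysicalDevice gpu) {
            VkPhysicalDeviceProperties props{};
            vkGetPhysicalDeviceProperties(gpu, &props);
            // Spec minimum is 12; the actual value varies by driver/hardware.
            printf("maxPerStageDescriptorUniformBuffers = %u\n",
                   props.limits.maxPerStageDescriptorUniformBuffers);
        }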
  10. Hi, I finally managed to get the DX11-emulating Vulkan device working, but everything is flipped vertically now because Vulkan has a different clip space. What are the best practices out there to keep these implementations consistent? I tried using a vertically flipped viewport, and while it works on an Nvidia 1050, the Vulkan debug layer is throwing error messages that this is not supported by the spec, so it might not work on other hardware. There is also the possibility of flipping the clip-space Y coordinate before writing it out from the vertex shader, but that requires changing and recompiling every shader. I could also bake it into the camera projection matrices, though I want to avoid that because then I'd need to track down everywhere in the engine where I upload matrices... Any chance of an easy extension or something? If not, I will probably go with changing the vertex shaders.
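
     For what it's worth, negative viewport heights were made legal by VK_KHR_maintenance1 (promoted to core in Vulkan 1.1), which is likely why the debug layer complains without it. A minimal sketch:

        // With VK_KHR_maintenance1 (or Vulkan 1.1+), a negative height flips Y:
        // move the origin to the bottom edge and negate the height.
        VkViewport viewport{};
        viewport.x        = 0.0f;
        viewport.y        = static_cast<float>(height);  // bottom edge as origin
        viewport.width    = static_cast<float>(width);
        viewport.height   = -static_cast<float>(height); // negative => Y flip
        viewport.minDepth = 0.0f;
        viewport.maxDepth = 1.0f;
        vkCmdSetViewport(cmd, 0, 1, &viewport);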
  11. I have pretty good experience with multi-GPU programming in D3D12. Now, looking at Vulkan, although there are a few similarities, I cannot wrap my head around a few things, due to the extremely sparse documentation (typical Khronos...).

     In D3D12, you create a resource on GPU0 that is visible to GPU1 by setting the VisibleNodeMask to 00000011, where the last two bits set mean it is visible to GPU0 and GPU1.

     In Vulkan, I can see there is the VkBindImageMemoryDeviceGroupInfoKHR struct, which you add to the pNext chain of VkBindImageMemoryInfoKHR before calling vkBindImageMemory2KHR. You also set the device indices, which I assume are the same as the VisibleNodeMask, except that instead of a mask it is an array of indices. So far so good. Let's look at a typical SFR scenario: render the left eye on GPU0 and the right eye on GPU1. You have two textures: pTextureLeft is exclusive to GPU0, and pTextureRight is created on GPU1 but is visible to GPU0, so it can be sampled from GPU0 when we want to draw it to the swapchain. That is the D3D12 world. How do I map this to Vulkan? Do I just set the device indices for pTextureRight to { 0, 1 }?

     Now comes the command buffer submission part, which is even more confusing. There is the struct VkDeviceGroupCommandBufferBeginInfoKHR. It accepts a device mask, which I understand is similar to creating a command list with a certain NodeMask in D3D12. So for GPU1, since I am only rendering to pTextureRight, I need to set the device mask to 2 (00000010)? And for GPU0, since I only render to pTextureLeft and finally sample both pTextureLeft and pTextureRight to render to the swapchain, I need to set the device mask to 1 (00000001)? Does the same apply to VkDeviceGroupSubmitInfoKHR?

     Now, the fun part is that it does not work. Both command buffers render to their textures correctly - I verified this by reading back the textures and storing them as PNGs. The left texture is sampled correctly in the final composite pass, but I get black in the area where the right texture should appear. Is there something I am missing here? Here is a code snippet too:

        void Init() {
            RenderTargetInfo info = {};
            info.pDeviceIndices = { 0, 0 };
            CreateRenderTarget(&info, &pTextureLeft);
            // Need to share this on both GPUs
            info.pDeviceIndices = { 0, 1 };
            CreateRenderTarget(&info, &pTextureRight);
        }

        void DrawEye(CommandBuffer* pCmd, uint32_t eye) {
            // Do the draw
            // Begin with a device mask depending on the eye
            pCmd->Open((1 << eye));
            // If eye is 0, we need to do some extra work to composite pTextureRight and pTextureLeft
            if (eye == 0) {
                DrawTexture(0, 0, width * 0.5, height, pTextureLeft);
                DrawTexture(width * 0.5, 0, width * 0.5, height, pTextureRight);
            }
            // Submit to the correct GPU
            pQueue->Submit(pCmd, (1 << eye));
        }

        void Draw() {
            DrawEye(pRightCmd, 1);
            DrawEye(pLeftCmd, 0);
        }
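
     For reference, a minimal sketch of how the device masks plug into the raw API (Vulkan 1.1 core names; handles and function names are illustrative):

        // Record a command buffer that executes only on the GPU(s) in deviceMask.
        void beginForDevices(VkCommandBuffer cmd, uint32_t deviceMask) {
            VkDeviceGroupCommandBufferBeginInfo groupBegin{};
            groupBegin.sType      = VK_STRUCTURE_TYPE_DEVICE_GROUP_COMMAND_BUFFER_BEGIN_INFO;
            groupBegin.deviceMask = deviceMask;   // e.g. 0b01 = GPU0, 0b10 = GPU1

            VkCommandBufferBeginInfo begin{};
            begin.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
            begin.pNext = &groupBegin;
            vkBeginCommandBuffer(cmd, &begin);
        }

        // Submit with a matching per-command-buffer device mask.
        void submitForDevices(VkQueue queue, VkCommandBuffer cmd, uint32_t deviceMask) {
            VkDeviceGroupSubmitInfo groupSubmit{};
            groupSubmit.sType                     = VK_STRUCTURE_TYPE_DEVICE_GROUP_SUBMIT_INFO;
            groupSubmit.commandBufferCount        = 1;
            groupSubmit.pCommandBufferDeviceMasks = &deviceMask;

            VkSubmitInfo submit{};
            submit.sType              = VK_STRUCTURE_TYPE_SUBMIT_INFO;
            submit.pNext              = &groupSubmit;
            submit.commandBufferCount = 1;
            submit.pCommandBuffers    = &cmd;
            vkQueueSubmit(queue, 1, &submit, VK_NULL_HANDLE);
        }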
  12. I am publishing our ray tracing engines and products, built on a modern graphics API (C++, Vulkan API, GLSL 4.60, SPIR-V): https://github.com/world8th/satellite-oem For end users I have no further products or test builds, but there is one simple glTF viewer example (source code only). The original 2016 idea was a replacement for screen-space reflections, but in 2018 we resolved to re-profile the project as the basis of a render engine; in Q3 2017 it was finally migrated to the Vulkan API.
  13. vkQueuePresentKHR is busy-waiting - i.e. wasting CPU cycles while waiting for vsync. The expected, sane behavior would of course be something akin to Sleep(0) until it can finish. Windows 7, GeForce GTX 660. Is this a common problem? Is there anything I can do to make it behave properly?
  14. I am working on reusing as many command buffers as I can by pre-recording them at load time. This gives a significant boost on the CPU, although now I cannot get the GPU timestamps, since there is no way to read them back. I map the readback buffer before reading and unmap it after reading is done. Does this mean I need a persistently mapped readback buffer?

        void Init() {
            beginCmd(cmd);
            cmdBeginQuery(cmd);
            // Do a bunch of stuff
            cmdEndQuery(cmd);
            endCmd(cmd);
        }

        void Draw() {
            CommandBuffer* cmd = commands[frameIdx];
            submit(cmd);
        }

     The begin and end query functions do exactly what the names say.
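
     One alternative that sidesteps the mapped readback buffer entirely (a sketch, assuming a query pool with two timestamp queries and a per-frame fence): pre-record vkCmdWriteTimestamp into the reusable command buffer, then fetch results on the CPU with vkGetQueryPoolResults once the work has completed.

        // Pre-recorded once: surround the workload with two timestamp writes.
        vkCmdResetQueryPool(cmd, queryPool, 0, 2);
        vkCmdWriteTimestamp(cmd, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, queryPool, 0);
        // ... bunch of stuff ...
        vkCmdWriteTimestamp(cmd, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, queryPool, 1);

        // Per frame, after the submission's fence has signaled:
        uint64_t ticks[2] = {};
        vkGetQueryPoolResults(device, queryPool, 0, 2,
                              sizeof(ticks), ticks, sizeof(uint64_t),
                              VK_QUERY_RESULT_64_BIT);
        // timestampPeriod (ns per tick) comes from VkPhysicalDeviceLimits.
        double ms = (ticks[1] - ticks[0]) * timestampPeriod * 1e-6;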
  15. https://www.phoronix.com/scan.php?page=article&item=vulkan-on-mac&num=1 Isn't there a similar Khronos project for DX12 to Vulkan that would cover Xbox?
  16. Hi, right now building my engine in Visual Studio involves a shader-compiling step to build HLSL 5.0 shaders. I have a separate project which only includes the shader sources, and the compiler is the Visual Studio-integrated fxc compiler. I like this method because on any PC that has Visual Studio installed I can just download the solution from GitHub and everything builds without additional dependencies, using the latest version of the compiler. I also like that the shaders are included in the solution explorer and easy to browse and double-click to open (opening files can be a real pain in Visual Studio run in admin mode). It's also nice that VS displays the build output/errors in the output window. But now I have the HLSL 6 compiler and want to build HLSL 6 shaders as well (and, as I understand it, I can also compile Vulkan-compatible shaders with it later). Any idea how to do this nicely? I want only a single project containing the shader sources, like now, but to build them for different targets. I guess adding different build projects that reference the shader-source project would be the way to go? But how would they determine the shader type of each source (e.g. pixel shader, compute shader, etc.)? Right now the shader-building project stores the shader type for each shader; how can other build projects reference that? Anyone with some experience in this?
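
     For reference, a hedged sketch of what custom build steps per target might invoke (dxc is the HLSL 6 compiler; file names and entry points are illustrative):

        rem FXC (SM 5.0) and DXC (SM 6.x) invocations for the same source;
        rem -T selects the target profile (shader type + model), -E the entry point.
        fxc /T ps_5_0 /E main /Fo shader_ps_50.cso shader_ps.hlsl
        dxc -T ps_6_0 -E main -Fo shader_ps_60.cso shader_ps.hlsl
        rem DXC can also emit SPIR-V for Vulkan:
        dxc -T ps_6_0 -E main -spirv -Fo shader_ps.spv shader_ps.hlsl

     Since the shader type is encoded in the target profile, per-file metadata (like the type stored in the current project) maps naturally onto the -T argument of each build step.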
  17. I am working on a compute shader in Vulkan which does some image processing and has 1024 * 5 = 5120 loop iterations (5 outer and 1024 inner). If I run this, I get a device-lost error on the call to queueSubmit succeeding the image-processing queueSubmit:

        // Image processing dispatch
        submit();
        waitForFence();
        // All calls to submit after this will give the device-lost error

     If I lower the number of inner loops from 1024 to 256, i.e. 5 * 256 = 1280 loop iterations, it works fine. The shader does some pretty heavy arithmetic operations, but the number of resources bound is 3 (one SRV, one UAV, and one sampler). The thread group size is x=16, y=16, z=1. So my question: is there a hardware limit on the number of loop executions/number of instructions per shader?
  18. I need to index into a texture array using indices which are not dynamically uniform. This works fine on NVIDIA chips, but you can see the artifacts on AMD due to the wavefront problem: a lot of pixel invocations get the wrong index value. I know you fix this in HLSL by using NonUniformResourceIndex. Is there an equivalent for Vulkan GLSL? This is the shader code for reference. As you can see, index is an arbitrary value per pixel and is not dynamically uniform. I fix this in HLSL by using NonUniformResourceIndex(index).

        layout(set = 0, binding = 0) uniform sampler textureSampler;
        layout(set = 0, binding = 1) uniform texture2D albedoMaps[256];
        layout(location = 0) out vec4 oColor;

        void main() {
            uint index = calculate_arbitrary_texture_index();
            vec2 texCoord = calculate_texcoord();
            vec4 albedo = texture(sampler2D(albedoMaps[index], textureSampler), texCoord);
            oColor = albedo;
        }

     Thank you
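
     The GLSL counterpart lives in the GL_EXT_nonuniform_qualifier extension (paired with VK_EXT_descriptor_indexing on the API side). A minimal sketch of the same lookup with the index wrapped:

        #extension GL_EXT_nonuniform_qualifier : require

        // nonuniformEXT marks the index as potentially divergent across the
        // invocation group, like NonUniformResourceIndex in HLSL.
        vec4 albedo = texture(
            sampler2D(albedoMaps[nonuniformEXT(index)], textureSampler),
            texCoord);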
  19. I wanted to see how others are currently handling descriptor heap updates and management. I've read a few articles, and there tend to be three major strategies:
     1) Split up descriptor heaps per shader stage (i.e. one for the vertex shader, pixel, hull, etc.).
     2) Have one descriptor heap for an entire pipeline.
     3) Split up descriptor heaps by update frequency (i.e. EResourceSet_PerInstance, EResourceSet_PerPass, EResourceSet_PerMaterial, etc.).
     The benefit of the first two approaches is that they make it easier to port existing code, and descriptor management and updating tend to be easier, but they seem less efficient. The benefit of the third approach seems to be that it's the most efficient, because you only manage and update objects when they change.
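
     A minimal sketch of what the frequency-based split (strategy 3) looks like on the Vulkan/GLSL side, with set numbers ordered from least to most frequently updated (set names and contents are illustrative):

        // set 0: per-pass data (bound once per render pass)
        layout(set = 0, binding = 0) uniform PerPass     { mat4 viewProj; vec4 cameraPos; };
        // set 1: per-material data (rebound when the material changes)
        layout(set = 1, binding = 0) uniform PerMaterial { vec4 baseColor; float roughness; };
        layout(set = 1, binding = 1) uniform sampler2D albedoMap;
        // set 2: per-instance data (rebound per draw, or fed via dynamic offsets)
        layout(set = 2, binding = 0) uniform PerInstance { mat4 world; };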
  20. Hello guys, my math is failing and I can't get my orthographic projection matrix to work in Vulkan 1.0 (my implementation works great in D3D11 and D3D12). Specifically, nothing is drawn on the screen when using the ortho matrix, but my perspective projection matrix works fantastically! I use glm with the defines GLM_FORCE_LEFT_HANDED and GLM_FORCE_DEPTH_ZERO_TO_ONE (to handle 0-to-1 depth). This is how I define my matrices:

        m_projection_matrix = glm::perspective(glm::radians(fov), aspect_ratio, 0.1f, 100.0f);
        m_ortho_matrix = glm::ortho(0.0f, (float)width, (float)height, 0.0f, 0.1f, 100.0f);
        // I also tried 0.0f and 1.0f for depth near and far, the same values I set
        // and which work for D3D, but in Vulkan it doesn't work either.

     Then I premultiply both matrices with a "fix matrix" to invert the Y axis:

        glm::mat4 matrix_fix = {1.0f,  0.0f, 0.0f, 0.0f,
                                0.0f, -1.0f, 0.0f, 0.0f,
                                0.0f,  0.0f, 1.0f, 0.0f,
                                0.0f,  0.0f, 0.0f, 1.0f};
        m_projection_matrix = m_projection_matrix * matrix_fix;
        m_ortho_matrix = m_ortho_matrix * matrix_fix;

     This fix matrix works well in tandem with GLM_FORCE_DEPTH_ZERO_TO_ONE. The model/world matrix is the identity matrix:

        glm::mat4 m_world_matrix(1.0f);

     And finally, this is how I set my view matrix:

        // Yes, I use Euler angles (don't bring the gimbal lock topic here, lol).
        // They work great with my cameras in D3D too!
        m_view_matrix = glm::yawPitchRoll(glm::radians(m_rotation.y), glm::radians(m_rotation.x), glm::radians(m_rotation.z));
        m_view_matrix = glm::translate(m_view_matrix, -m_position);

     That's all, guys. In my shaders I correctly multiply all three matrices with the position vector and, as I said, the perspective matrix works really well but the ortho matrix displays no geometry. EDIT: My vertex data is also on the right track; I use the same geometry in D3D and it works great: 256.0f units means 256 points/dots/pixels wide. What could I possibly be doing wrong or missing? Big thanks guys, any help would be greatly appreciated. Keep on coding, cheers.
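
     For reference, a hedged sanity check: the left-handed, zero-to-one-depth ortho matrix (the D3D-style convention the glm defines are meant to reproduce) is

     \[ O = \begin{pmatrix} \frac{2}{r-l} & 0 & 0 & -\frac{r+l}{r-l} \\ 0 & \frac{2}{t-b} & 0 & -\frac{t+b}{t-b} \\ 0 & 0 & \frac{1}{f-n} & -\frac{n}{f-n} \\ 0 & 0 & 0 & 1 \end{pmatrix} \]

     Note the depth row: with \(n = 0.1\) and \(f = 100\), any geometry sitting at \(z = 0\) (a common case for 2D drawn with an identity world matrix) maps to a depth just below 0 and is clipped by the near plane - one plausible reason nothing shows, worth ruling out alongside the Y-flip premultiplication.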
  21. As the title says, I am explicitly creating a descriptor pool that is too small, which should NOT support the resources I am going to allocate from it:

        std::array<vk::DescriptorPoolSize, 3> type_count;

        // Initialize our pool with these values
        type_count[0].type = vk::DescriptorType::eCombinedImageSampler;
        type_count[0].descriptorCount = 0;
        type_count[1].type = vk::DescriptorType::eSampler;
        type_count[1].descriptorCount = 0;
        type_count[2].type = vk::DescriptorType::eUniformBuffer;
        type_count[2].descriptorCount = 0;

        vk::DescriptorPoolCreateInfo createInfo = vk::DescriptorPoolCreateInfo()
            .setPNext(nullptr)
            .setMaxSets(iMaxSets)
            .setPoolSizeCount(type_count.size())
            .setPPoolSizes(type_count.data());
        pool = aDevice.createDescriptorPool(createInfo);

     I have an allocation function which looks like this; I am allocating a uniform buffer, a combined image sampler, and a regular sampler. But if my pool is empty, this should not work?

        vk::DescriptorSetAllocateInfo alloc_info[1] = {};
        alloc_info[0].pNext = NULL;
        alloc_info[0].setDescriptorPool(pool);
        alloc_info[0].setDescriptorSetCount(iNumToAllocate);
        alloc_info[0].setPSetLayouts(&iDescriptorLayouts);

        std::vector<vk::DescriptorSet> tDescriptors;
        tDescriptors.resize(iNumToAllocate);
        iDevice.allocateDescriptorSets(alloc_info, tDescriptors.data());
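
     One hedged note: before VK_KHR_maintenance1 (core in Vulkan 1.1), allocating past a pool's declared capacity was undefined behavior rather than a required failure, so a driver may happily hand out sets from an "empty" pool. A minimal sketch of detecting exhaustion where it is reported:

        // maintenance1 makes exhaustion observable as an error code instead of UB.
        vk::Result result = iDevice.allocateDescriptorSets(alloc_info, tDescriptors.data());
        if (result == vk::Result::eErrorOutOfPoolMemory) {
            // The pool really was too small: create a bigger pool or recycle sets.
        }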
  22. It is a bit unclear to me what kinds of tasks you would want to create a new command buffer for, and how to use them. Is it ideal to have a command buffer per draw call? Per material? Per render pass? I know that in DX12 command lists can have complete rendering pipelines recorded, but I am a bit unsure how to think about command buffers in Vulkan.
  23. I'm working my way through some Vulkan examples, and I just want to make sure that my understanding of queues is correct.

     Each physical device has a set \(Q\) of queue families, and each queue family \(q \in Q\) has some number \(N_q\) of queues that can be used. When creating a logical device, you specify some number of logical queues to create, such that the sum of the queueCount properties for each queue family \(q\) is not greater than \(N_q\) (maybe some unforeseen circumstance has led us to a VkDeviceQueueCreateInfo array of [(family=0, cnt=2), (family=1, cnt=1), (family=0, cnt=1)]). The driver takes care of multiplexing the queues; e.g. if I create two logical devices with 8 queues each from the same queue family, whether the driver assigns both sets of logical queues \(0, \dots, 7\) to physical queues \(0, \dots, 7\) or to \(0, \dots, 15\) is none of my concern - the plumbing is all taken care of by the driver.

     Different queues can be submitted to in parallel, but extra care should be taken to make sure that the command buffers don't interfere with each other if they interact with the same object. Queues are retrieved in the order they were created: e.g. if my VkDeviceQueueCreateInfo array looked like [(cnt=2, priorities=[1.0, 0.5]), (cnt=1, priorities=[0.2])], I can expect that vkGetDeviceQueue(device, family, [0, 1, 2]) yields priorities [1.0, 0.5, 0.2]. Queue priorities are relative numbers, such that the following metaphor makes sense: if each queue is represented as a thread, a priority of 1.0 means the thread should work as hard as it possibly can, while a priority of 0.5 means it should only work half as hard, with the union of all threads representing the queue-processing power of the entire physical device.

     If I said something wrong, please feel free to correct me. I want to make sure I'm not misunderstanding something fundamental.
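
     A minimal sketch of the creation/retrieval pairing described above (error handling omitted; the family index 0 is illustrative):

        float priorities[2] = {1.0f, 0.5f};

        VkDeviceQueueCreateInfo queueInfo{};
        queueInfo.sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
        queueInfo.queueFamilyIndex = 0;           // illustrative family
        queueInfo.queueCount       = 2;           // must not exceed the family's queue count
        queueInfo.pQueuePriorities = priorities;

        VkDeviceCreateInfo deviceInfo{};
        deviceInfo.sType                = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
        deviceInfo.queueCreateInfoCount = 1;
        deviceInfo.pQueueCreateInfos    = &queueInfo;

        VkDevice device = VK_NULL_HANDLE;
        vkCreateDevice(physicalDevice, &deviceInfo, nullptr, &device);

        // Retrieval is by (family, index within that family), matching the
        // priorities declared at creation.
        VkQueue q0, q1;
        vkGetDeviceQueue(device, 0, 0, &q0);      // priority 1.0
        vkGetDeviceQueue(device, 0, 1, &q1);      // priority 0.5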
  24. I'm wondering whether it's practically possible to do order-independent transparency on mobile using render passes. OIT requires the fragment stage to store every non-depth-rejected result - both color and depth. There are two known ways to do this: one is per-pixel linked lists stored in a buffer, but that misses data locality. The other is a per-pixel array (I think it's called a k-buffer) plus a per-pixel atomic counter (or shader interlock, if available). This one is interesting, since all the data is pixel-local, and the resolve pass only has to access the buffer values from the previous pass. However, does it fit Vulkan's render pass semantics? As far as I understand, subpass dependencies express read-from-render-target dependencies, but in the case of OIT the k-buffer is written using image stores, not the ROPs. Additionally, the large memory requirement of a k-buffer (typically 8 RGBA8 values and 8 float depth values per pixel) may not fit in on-chip tile memory on a mobile device.
  25. I'm starting to think about some of the issues involved in incorporating Vulkan into an engine. If I understand correctly, it's not safe to destroy a device memory resource while it's still referenced by a command buffer. In other words, Vulkan doesn't do any internal reference counting, so a destroy operation takes effect immediately, even if the GPU is currently using the resource (or will do so in the near future).

     Obviously the goal of this design approach was to allow engine builders to roll special-case solutions. So I was wondering what approaches people are taking to deal with this problem.

     I've run across this before, when implementing a PS3 version of an engine that was designed for DX9. The engine assumed that it could destroy GPU resources safely at any time (which is true for DX9), so for my low-level PS3 code I had to provide the same guarantees. To do this, I kept a list of referenced resources along with each command buffer. Periodically, I checked for completed command buffers and could then dereference the associated resources. My goal was to reduce the cost in the most common case (i.e., nothing being destroyed) to as little as possible. This solution was OK, but a little awkward, because the list of resources could get quite long, and (in that particular engine) some resources tended to be referenced many times by the same list - so I actually ended up sorting the list to avoid adding duplicates. That worked well, because it duplicated the kind of behaviour we expected from DX9.

     Another option might be to group related resources together into a "box"... We could then do reference counting on the entire box, so that all contained resources get destroyed together. It would require some extra structure in the engine, but might reduce the low-level overhead. That would be handy for streaming character models in and out, where multiple resources will typically be evicted at the same time.

     Another possibility would be to always delay any resource deallocation until all active command buffers have completed (regardless of whether the resource is actually referenced). In other words, we could assume that all resources are referenced by all command buffers... That would introduce the absolute minimum overhead in the normal case, but it would mean that deallocation never completes rapidly. It would also cause problems if even a single command buffer isn't promptly ended and submitted.

     What approaches are people here taking to this issue?
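
     A minimal sketch of the "delay until the frame's work completes" variant, keyed on per-frame fences rather than per-resource tracking (a common pattern, not from any particular engine):

        #include <vulkan/vulkan.h>
        #include <deque>
        #include <functional>

        // Destruction requests are queued with the frame number that last used
        // them, and flushed once the GPU has passed that frame (fence signaled).
        class DeferredDeleter {
        public:
            void defer(uint64_t frame, std::function<void()> destroy) {
                pending.push_back({frame, std::move(destroy)});
            }
            // Call once per frame with the highest frame index the GPU has finished.
            void collect(uint64_t completedFrame) {
                while (!pending.empty() && pending.front().frame <= completedFrame) {
                    pending.front().destroy();
                    pending.pop_front();
                }
            }
        private:
            struct Entry { uint64_t frame; std::function<void()> destroy; };
            std::deque<Entry> pending;
        };

        // Usage: instead of destroying immediately,
        //   deleter.defer(currentFrame, [=]{ vkDestroyBuffer(device, buf, nullptr); });
        // and after vkWaitForFences shows frame N is done:
        //   deleter.collect(N);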