

Community Reputation

662 Good

About Silverlan

  1. Following problem: I have a bunch of meshes that need to be rendered in one batch (they're not the same, so I can't use instancing). I've created a secondary command buffer, which does exactly that (pseudocode):

          VkCommandBuffer cmdSec = new SecondaryCommandBuffer;
          int subPass = 0;
          vkBeginCommandBuffer(cmdSec,COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE,renderPass,framebuffer,subPass);
          vkCmdBindPipeline(cmdSec,pipeline);
          foreach(mesh)
          {
              vkCmdBindVertexBuffers(cmdSec,...);
              vkCmdDraw(cmdSec);
          }
          vkEndCommandBuffer(cmdSec);

      The secondary command buffer is later executed each frame from within the primary command buffer:

          VkCommandBuffer cmdPrim = new PrimaryCommandBuffer;
          vkBeginRenderPass(cmdPrim,renderPass,framebuffer,VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS);
          vkCmdExecuteCommands(cmdPrim,cmdSec);
          vkEndRenderPass(cmdPrim);

      So far so good. The problem is, to render the meshes, I also need to push some additional data (e.g. a matrix) to the pipeline, and this data changes every frame. Push constants are not an option, since they can't be used in a render pass with the VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS flag: (Source: https://www.khronos.org/registry/vulkan/specs/1.0/apispec.html#vkCmdBeginRenderPass)

      That means my only(?) option is to use a descriptor set. The idea is to bind the descriptor set inside the secondary command buffer recording, then update the descriptor set with the new data every frame, right before executing the secondary command buffer. Now, I'm still new at this, so I'd like someone to confirm whether this is correct or not. There are a couple of things I have to take into account:

      • Since the memory of the descriptor set's buffer changes every frame (=non-coherent), it has to be created without the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT flag.
      • vkFlushMappedMemoryRanges has to be called on the host, after the updated memory has been mapped.
      • I'm not sure about this part: since the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT flag isn't set, do I still need a pipeline barrier? (Or does vkFlushMappedMemoryRanges already take care of that?)

      The result would be this (pseudocode):

          VkCommandBuffer cmdSec = new SecondaryCommandBuffer;
          int subPass = 0;
          vkBeginCommandBuffer(cmdSec,COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE,renderPass,framebuffer,subPass);
          vkCmdBindPipeline(cmdSec,pipeline);
          vkCmdBindDescriptorSets(cmdSec,descSet);
          foreach(mesh)
          {
              vkCmdBindVertexBuffers(cmdSec,...);
              vkCmdDraw(cmdSec);
          }
          vkEndCommandBuffer(cmdSec);

          vkMapMemory(descSetBufferMemory);
          // Write data to mapped memory
          vkUnmapMemory(descSetBufferMemory);
          vkFlushMappedMemoryRanges(descSetBufferMemory);

          VkCommandBuffer cmdPrim = new PrimaryCommandBuffer;
          vkBeginCommandBuffer(cmdPrim);
          // Pipeline Barrier?
          vkBeginRenderPass(cmdPrim,renderPass,framebuffer,VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS);
          vkCmdExecuteCommands(cmdPrim,cmdSec);
          vkEndRenderPass(cmdPrim);
          vkEndCommandBuffer(cmdPrim);

      Would that be correct so far?

      Another thing I'm wondering about: the memory of the buffer is updated and used by the pipeline every frame. What happens if a frame has been queued already, but not fully drawn, and I'm already updating the buffer for the next frame? Would/could that affect the queued frame? If so, could that be avoided with an additional barrier (source = VK_ACCESS_SHADER_READ_BIT, destination = VK_ACCESS_HOST_WRITE_BIT ("Wait for all shader reads to be completed before allowing the host to write"))? Would it be better to use more than 1 buffer/descriptor set (+ more than 1 secondary command buffer), and swap between them each frame? If so, would 2 be enough (even for mailbox present mode), or would I need as many as I have swapchain images?

      I'd mostly just like to know if my general idea is correct, or if I'm missing/misinterpreting something.
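The "swap between several buffers" idea from the last paragraph is often organised as a ring of per-frame slots. The following is only a minimal sketch of the bookkeeping, not a confirmed answer to the question. `FrameRing` is a hypothetical helper (it is not part of Vulkan), and the sketch assumes that the application waits on a fence for the submission that last used a slot before the host writes to that slot again. Under that assumption the number of slots follows from how many frames the application allows in flight, which need not be the swapchain image count.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Vulkan type): picks which of N per-frame copies
// (uniform buffer region, descriptor set, secondary command buffer) the host
// may write while older frames are still executing on the device.
struct FrameRing {
    std::uint32_t slotCount;   // number of copies, e.g. max frames in flight
    std::uint64_t frameNumber; // monotonically increasing frame counter

    explicit FrameRing(std::uint32_t slots) : slotCount(slots), frameNumber(0) {
        assert(slots > 0);
    }

    // Slot used by the frame that is currently being recorded.
    std::uint32_t currentSlot() const {
        return static_cast<std::uint32_t>(frameNumber % slotCount);
    }

    // Call once per frame after submitting. Assumption: before writing to the
    // next slot, the application waits on the fence of the submission that
    // last used it.
    void advance() { ++frameNumber; }
};
```

With two slots the host alternates between slot 0 and slot 1, so it never overwrites data that the immediately preceding frame may still be reading.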
  2. My Vulkan program is running extremely slowly, and I'm trying to figure out why. I've noticed that even a few draw calls already drain the performance far more than they should. For instance, here's an extract (pseudocode) for rendering a few meshes:

          int32_t numCalls = 0;
          int32_t numIndices = 0;
          for(auto &mesh : meshes)
          {
              auto vertexBuffer = mesh.GetVertexBuffer();
              auto indexBuffer = mesh.GetIndexBuffer();
              vk::DeviceSize offset = 0;
              drawCmd.bindVertexBuffers(0,1,&vertexBuffer,&offset); // drawCmd = CommandBuffer for all drawing commands (single thread)
              drawCmd.bindIndexBuffer(indexBuffer,offset,vk::IndexType::eUint16);
              drawCmd.drawIndexed(mesh.GetIndexCount(),1,0,0,0);
              numIndices += mesh.GetIndexCount();
              ++numCalls;
          }

      There are 238 meshes being rendered, with a total vertex index count of 52050. The GPU is definitely not overburdened (the shaders are extremely cheap). If I run my program with the code above, the frame is rendered in approximately 46ms. Without it, it's a mere 9ms. I'm using fifo present mode with 2 swapchain images. There is only a primary command buffer at this time (no secondary command buffers/pre-recorded buffers), and the same buffer is used for all frames.

      My problem is, I don't really know what to look for. These few rendering calls should barely make a dent, so the source of the problem must be somewhere else. Can anyone give me any hints on how I should tackle this? Are there any profilers around for Vulkan already? I just need a nudge in the right direction.

      // EDIT: So, it looks like vkDeviceWaitIdle takes about 32ms to execute if all 238 meshes are rendered. (If none are rendered, it's < 1ms.) Most of the stalling stems from there, but I still don't know what to do about it.
  3. I'm trying to implement shadow mapping with Vulkan. The rendering for the actual shadow map works fine, however in my scene shader I need to do manual perspective division (since gl_Position is already in use) to transform the scene depth values to light space (to be able to compare them with the samples from the shadow map), which produces weird results. So I ended up doing some tests, and I've noticed very weird behavior with depth values in general.

      Here are some example cases:

      #1 If I use the default fragment shader (implicit depth output), the depth values are written correctly.

      Vertex Shader:
          void main() { gl_Position = MVP *vec4(vertexPosition,1.0); }
      Fragment Shader:
          void main() {} // Also works with gl_FragDepth = gl_FragCoord.z;
      Result (Everything is as expected): (Depth values are linearized in post-processing to make them more clearly visible.)

      #2 This case is an odd one which I've stumbled upon by accident. If I have any reference to gl_FragDepth anywhere in the fragment shader, the resulting depth values change for no apparent reason:

      Vertex Shader (Same as #1):
          void main() { gl_Position = MVP *vec4(vertexPosition,1.0); }
      Fragment Shader:
          void main() { float d = gl_FragDepth; }

      According to the OpenGL docs, "If depth buffering is enabled and no shader writes to gl_FragDepth, then the fixed function value for depth will be used[...]". I'm assuming this also applies when using Vulkan(?). Obviously gl_FragDepth is not being written to here, and yet the result is different from case #1: (Values are very close to 0, but don't quite reach 0.)

      #3 Manual perspective division
      If I don't use the default perspective division, and attempt to do it manually, only about half of the depth values for the geometry are being written.

      Vertex Shader:
          void main() { gl_Position = MVP *vec4(vertexPosition,1.0); gl_Position.xyz /= gl_Position.w; gl_Position.w = 1.0; }
      Fragment Shader:
          void main() {}
      Result:

      #4 Writing depth value manually
      Attempting to calculate and pass the depth value to the fragment shader manually has similar results to case #3.

      Vertex Shader:
          out float test_depth;
          void main() { gl_Position = MVP *vec4(vertexPosition,1.0); test_depth = gl_Position.z /gl_Position.w; }
      Fragment Shader:
          #version 320
          in float test_depth;
          void main() { gl_FragDepth = test_depth; }
      Result:

      The depth range for the viewport is set to [0,1], in case that could make a difference. I'm using a 32bit floating point depth format, so precision shouldn't be a problem. I have the latest AMD beta Vulkan drivers installed, and I'm using Vulkan

      Can someone explain this to me? What's the 'proper' way of doing perspective division manually with Vulkan (i.e. replicating the default behavior manually)?
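For reference, the arithmetic behind a manual perspective division for a shadow-map lookup can be sketched on the host side. This is not a verified fix for the cases above. `clipToShadowCoord` and the `Vec3`/`Vec4` types are hypothetical names, and the sketch assumes the light's clip-space position uses the [0,1] viewport depth range mentioned in the post, so only x and y are remapped from [-1,1] to [0,1]. In a shader this step is commonly applied per fragment to an interpolated four-component light-space position, leaving gl_Position untouched.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// Hypothetical helper: converts a light clip-space position
// (lightMVP * vertexPosition) into shadow-map lookup coordinates.
// Assumes a [0,1] depth range, so z is used as-is after the division.
Vec3 clipToShadowCoord(const Vec4& clip) {
    assert(clip.w != 0.0f);
    const float invW = 1.0f / clip.w;              // perspective division
    const Vec3 ndc{clip.x * invW, clip.y * invW, clip.z * invW};
    // x and y go from [-1,1] to [0,1] texture space; z is the depth to compare.
    return Vec3{ndc.x * 0.5f + 0.5f, ndc.y * 0.5f + 0.5f, ndc.z};
}
```

The returned z is the value that would be compared against the depth sampled from the shadow map at (x, y).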
  4. I've tried setting the tiling to linear without copying, and I've tried copying from a linear staging image to optimal, but the result is always the same.
  5. I'm still struggling with compressed images. Here's what the specification says about that:   Source: https://www.khronos.org/registry/dataformat/specs/1.1/dataformat.1.1.html#S3TC   Source: https://www.khronos.org/registry/vulkan/specs/1.0/xhtml/vkspec.html#resources-images   I'm using GLI to load the dds-data (Which is supposed to work with Vulkan, but I've also tried other libraries). Here's my code for loading and mapping the data: struct dds load_dds(const char *fileName) {     auto tex = gli::load_dds(fileName);     auto format = tex.format();     VkFormat vkFormat = static_cast<VkFormat>(format);     auto extents = tex.extent();     auto r = dds {};     r.texture = new gli::texture(tex);     r.width = extents.x;     r.height = extents.y;     r.format = vkFormat;     return r; } void map_data_dds(struct dds *r,void *imgData,VkSubresourceLayout layout) {     auto &tex = *static_cast<gli::texture*>(r->texture);     gli::storage storage {tex.format(),tex.extent(),tex.layers(),tex.faces(),tex.levels()};     auto *srcData = static_cast<uint8_t*>(tex.data(0,0,0));     auto *destData = static_cast<uint8_t*>(imgData); // Pointer to mapped memory of VkImage     destData += layout.offset; // layout = VkImageLayout of the image     auto extents = tex.extent();     auto w = extents.x;     auto h = extents.y;     auto blockSize = storage.block_size();     auto blockCount = storage.block_count(0);     //auto blockExtent = storage.block_extent();     auto method = 0; // All methods have the same result     if(method == 0)     {         for(auto y=decltype(blockCount.y){0};y<blockCount.y;++y)         {             auto *rowDest = destData +y *layout.rowPitch;             auto *rowSrc = srcData +y *(blockCount.x *blockSize);             for(auto x=decltype(blockCount.x){0};x<blockCount.x;++x)             {                 auto *pxDest = rowDest +x *blockSize;                 auto *pxSrc = rowSrc +x *blockSize; // 4x4 image block                 
memcpy(pxDest,pxSrc,blockSize); // 64Bit per block                 //memset(pxDest,128,blockSize); // 64Bit per block             }         }     }     else if(method == 1)         memcpy(destData,srcData,storage.size());     else     {         memcpy(destData,tex.data(0,0,0),tex.size(0)); // Just one layer for now         //destData += tex.size(0);     } } Here's my code for initializing the texture (Which is 1:1 the same as the cube demo from the SDK, except for the dds-code): static void demo_prepare_texture_image(struct demo *demo, const char *filename,                                        struct texture_object *tex_obj,                                        VkImageTiling tiling,                                        VkImageUsageFlags usage,                                        VkFlags required_props) {     VkResult U_ASSERT_ONLY err;     bool U_ASSERT_ONLY pass;    /* const VkFormat tex_format = VK_FORMAT_R8G8B8A8_UNORM;     int32_t tex_width;     int32_t tex_height;     if (!loadTexture(filename, NULL, NULL, &tex_width, &tex_height)) {         printf("Failed to load textures\n");         fflush(stdout);         exit(1);     }     */     tiling = VK_IMAGE_TILING_OPTIMAL;     struct dds ddsData = load_dds("C:\\VulkanSDK\\\\Demos\\x64\\Debug\\iron01.dds");     VkFormat tex_format = ddsData.format;     int32_t tex_width = ddsData.width;     int32_t tex_height = ddsData.height;     tex_obj->tex_width = tex_width;     tex_obj->tex_height = tex_height;     const VkImageCreateInfo image_create_info = {         .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,         .pNext = NULL,         .imageType = VK_IMAGE_TYPE_2D,         .format = tex_format,         .extent = {tex_width, tex_height, 1},         .mipLevels = 1,         .arrayLayers = 1,         .samples = VK_SAMPLE_COUNT_1_BIT,         .tiling = tiling,         .usage = usage,         .flags = 0,         .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED,     };     VkMemoryRequirements mem_reqs;     err =      
   vkCreateImage(demo->device, &image_create_info, NULL, &tex_obj->image);     assert(!err);     vkGetImageMemoryRequirements(demo->device, tex_obj->image, &mem_reqs);     tex_obj->mem_alloc.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;     tex_obj->mem_alloc.pNext = NULL;     tex_obj->mem_alloc.allocationSize = mem_reqs.size;     tex_obj->mem_alloc.memoryTypeIndex = 0;     pass = memory_type_from_properties(demo, mem_reqs.memoryTypeBits,                                        required_props,                                        &tex_obj->mem_alloc.memoryTypeIndex);     assert(pass);     /* allocate memory */     err = vkAllocateMemory(demo->device, &tex_obj->mem_alloc, NULL,                            &(tex_obj->mem));     assert(!err);     /* bind memory */     err = vkBindImageMemory(demo->device, tex_obj->image, tex_obj->mem, 0);     assert(!err);     if (required_props & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {         const VkImageSubresource subres = {             .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,             .mipLevel = 0,             .arrayLayer = 0,         };         VkSubresourceLayout layout;         void *data;         vkGetImageSubresourceLayout(demo->device, tex_obj->image, &subres,                                     &layout);         err = vkMapMemory(demo->device, tex_obj->mem, 0,                           tex_obj->mem_alloc.allocationSize, 0, &data);         assert(!err);         // DDS         map_data_dds(&ddsData,data,layout);         //        // if (!loadTexture(filename, data, &layout, &tex_width, &tex_height)) {        //     fprintf(stderr, "Error loading texture: %s\n", filename);         //}         vkUnmapMemory(demo->device, tex_obj->mem);     }     tex_obj->imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;     demo_set_image_layout(demo, tex_obj->image, VK_IMAGE_ASPECT_COLOR_BIT,                           VK_IMAGE_LAYOUT_PREINITIALIZED, tex_obj->imageLayout,                           VK_ACCESS_HOST_WRITE_BIT);     /* 
setting the image layout does not reference the actual memory so no need      * to add a mem ref */ } I've uploaded the entire demo here. The only things I've changed from the cube demo from the Vulkan SDK are the functions above. I've tried various different images, with different compressions (BC1/2/3), none of them work.   Examples: #1: turns into: (Not the cube demo, but same principle)   #2: turns into:     Any hints would be much appreciated.
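As a cross-check for the block arithmetic in the copy loop above, the sizes of block-compressed data can be computed independently of any library. This is a small sketch with hypothetical helper names (`bcBlockSize`, `bcBlockCount`, `bcLevelSize`). It relies on the layout of the S3TC/BC formats: each block covers 4x4 texels, BC1 (DXT1) blocks are 8 bytes, and BC2 (DXT3) and BC3 (DXT5) blocks are 16 bytes, so the "64Bit per block" comment only holds for BC1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class BcFormat { BC1, BC2, BC3 };

// Size in bytes of one 4x4 texel block.
// BC1 (DXT1): 8 bytes; BC2 (DXT3) and BC3 (DXT5): 16 bytes.
std::size_t bcBlockSize(BcFormat format) {
    return format == BcFormat::BC1 ? 8 : 16;
}

// Number of 4x4 blocks needed to cover `texels` along one axis (rounds up).
std::uint32_t bcBlockCount(std::uint32_t texels) {
    assert(texels > 0);
    return (texels + 3) / 4;
}

// Tightly packed size in bytes of a single mip level.
std::size_t bcLevelSize(BcFormat format, std::uint32_t width, std::uint32_t height) {
    return static_cast<std::size_t>(bcBlockCount(width)) *
           bcBlockCount(height) * bcBlockSize(format);
}
```

Comparing `bcLevelSize` with the size reported by the loader for mip level 0 is a quick way to confirm that the source data is laid out as expected before copying it.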
  6. #Q1 - Descriptor Binding Points I feel like I'm starting to get the hang of the API, but there's still a couple of things that are unclear to me. For instance, how do I actually change the binding point for a descriptor? I have a small test fragment shader with 2 uniform descriptor sets at binding point 0: #version 400 #extension GL_ARB_separate_shader_objects : enable #extension GL_ARB_shading_language_420pack : enable out vec4 fs_color; layout(std140, set = 0, binding = 0) uniform testa { vec4 color; } u_testa; layout(std140, set = 1, binding = 0) uniform testb { vec4 color; } u_testb; void main() { fs_color = (u_testa.color +u_testb.color) /2.0; } In my test program they're initialized like this: (This is essentially a copy of the "multiple_sets" demo from the Vulkan SDK) static const unsigned descriptor_set_count = 2; const int binding_point = 0; // The shader binding point VkDescriptorSetLayoutBinding uniform_binding[1] = {}; uniform_binding[0].binding = binding_point; uniform_binding[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; uniform_binding[0].descriptorCount = 1; uniform_binding[0].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; uniform_binding[0].pImmutableSamplers = NULL; VkDescriptorSetLayoutCreateInfo uniform_layout_info[1] = {}; uniform_layout_info[0].sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; uniform_layout_info[0].pNext = NULL; uniform_layout_info[0].bindingCount = 1; uniform_layout_info[0].pBindings = uniform_binding; VkDescriptorSetLayoutBinding uniform_binding2[1] = {}; uniform_binding2[0].binding = binding_point; uniform_binding2[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; uniform_binding2[0].descriptorCount = 1; uniform_binding2[0].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT; uniform_binding2[0].pImmutableSamplers = NULL; VkDescriptorSetLayoutCreateInfo uniform_layout_info2[1] = {}; uniform_layout_info2[0].sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; uniform_layout_info2[0].pNext = NULL; 
uniform_layout_info2[0].bindingCount = 1; uniform_layout_info2[0].pBindings = uniform_binding2; // Create multiple sets, using each createInfo static const unsigned uniform_set_index = 0; static const unsigned uniform_set_index2 = 1; VkDescriptorSetLayout descriptor_layouts[descriptor_set_count] = {}; auto res = vkCreateDescriptorSetLayout(device, uniform_layout_info, NULL, &descriptor_layouts[uniform_set_index]); assert(res == VK_SUCCESS); res = vkCreateDescriptorSetLayout(device, uniform_layout_info2, NULL, &descriptor_layouts[uniform_set_index2]); assert(res == VK_SUCCESS); // Create pipeline layout with multiple descriptor sets VkPipelineLayoutCreateInfo pipelineLayoutCreateInfo[1] = {}; pipelineLayoutCreateInfo[0].sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO; pipelineLayoutCreateInfo[0].pNext = NULL; pipelineLayoutCreateInfo[0].pushConstantRangeCount = 0; pipelineLayoutCreateInfo[0].pPushConstantRanges = NULL; pipelineLayoutCreateInfo[0].setLayoutCount = descriptor_set_count; pipelineLayoutCreateInfo[0].pSetLayouts = descriptor_layouts; VkPipelineLayout pipeline_layout; res = vkCreatePipelineLayout(device, pipelineLayoutCreateInfo, NULL, &pipeline_layout); assert(res == VK_SUCCESS); // Create a single pool to contain data for our two descriptor sets VkDescriptorPoolSize type_count[1] = {}; type_count[0].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; type_count[0].descriptorCount = 2; VkDescriptorPoolCreateInfo pool_info[1] = {}; pool_info[0].sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO; pool_info[0].pNext = NULL; pool_info[0].maxSets = descriptor_set_count; pool_info[0].poolSizeCount = sizeof(type_count) / sizeof(VkDescriptorPoolSize); pool_info[0].pPoolSizes = type_count; VkDescriptorPool descriptor_pool[1] = {}; res = vkCreateDescriptorPool(device, pool_info, NULL, descriptor_pool); assert(res == VK_SUCCESS); VkDescriptorSetAllocateInfo alloc_info[1]; alloc_info[0].sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; alloc_info[0].pNext = 
NULL; alloc_info[0].descriptorPool = descriptor_pool[0]; alloc_info[0].descriptorSetCount = descriptor_set_count; alloc_info[0].pSetLayouts = descriptor_layouts; // Populate descriptor sets VkDescriptorSet descriptor_sets[descriptor_set_count]; descriptor_sets[descriptor_set_count] = {}; res = vkAllocateDescriptorSets(device, alloc_info, descriptor_sets); assert(res == VK_SUCCESS); // Using empty brace initializer on the next line triggers a bug in older // versions of gcc, so memset instead VkWriteDescriptorSet descriptor_writes[2]; memset(descriptor_writes, 0, sizeof(descriptor_writes)); VkDescriptorBufferInfo buffer_info; buffer_info.buffer = colorBuffer; buffer_info.offset = 0; buffer_info.range = sizeof(glm::vec4); // Populate with info about our uniform buffer descriptor_writes[0] = {}; descriptor_writes[0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; descriptor_writes[0].pNext = NULL; descriptor_writes[0].dstSet = descriptor_sets[uniform_set_index]; descriptor_writes[0].descriptorCount = 1; descriptor_writes[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; descriptor_writes[0].pBufferInfo = &buffer_info; descriptor_writes[0].dstArrayElement = 0; descriptor_writes[0].dstBinding = binding_point; VkDescriptorBufferInfo buffer_info2; buffer_info2.buffer = colorBuffer2; buffer_info2.offset = 0; buffer_info2.range = sizeof(glm::vec4); // Populate with info about our sampled image descriptor_writes[1] = {}; descriptor_writes[1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; descriptor_writes[1].pNext = NULL; descriptor_writes[1].dstSet = descriptor_sets[uniform_set_index2]; descriptor_writes[1].descriptorCount = 1; descriptor_writes[1].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; descriptor_writes[1].pBufferInfo = &buffer_info2; descriptor_writes[1].dstArrayElement = 0; descriptor_writes[1].dstBinding = binding_point; vkUpdateDescriptorSets(device, descriptor_set_count, descriptor_writes, 0, NULL); [...] 
      'binding_point' is used in the 'VkDescriptorSetLayoutBinding' and 'VkWriteDescriptorSet' structures.

      During rendering, the descriptor sets are bound using:
          vkCmdBindDescriptorSets(cmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline_layout, 0, descriptor_set_count, descriptor_sets, 0, NULL);
      This works fine, as long as the binding point is 0. If I change 'binding_point' to 1, and the binding points in the shader accordingly, the program crashes at the 'vkUpdateDescriptorSets' call with a write access violation.

      What am I missing?

      #Q2 - Uniform Buffer Memory Barriers
      Another question, regarding uniform buffers: say I have a uniform for my MVP matrix. This uniform changes for every object that has to be rendered, so my approach was this:
      1) Bind the shader pipeline
      2) Map the memory for the MVP buffer, write the object matrix, unmap the memory
      3) Bind the descriptor set for the uniform
      4) Bind the vertex buffer
      5) Run the draw call
      6) Repeat for all objects from step 2)

      This works if there's just 1 object. If there's more than one, they all use the MVP from the very last object, most likely because the draw calls are delayed, so the MVP memory is overwritten before the previous draw call has finished. What can I do to prevent this? My first thought was to create a memory pipeline barrier with VK_ACCESS_UNIFORM_READ_BIT as source access mask and VK_ACCESS_HOST_WRITE_BIT as destination access mask (=all uniform reads should finish before the host tries to write to the memory again), but that didn't seem to have any effect.

      #Q3 - Compressed Images
      One more question, regarding compressed images. How can I achieve the equivalent of glGetCompressedTexImage with DXT1, DXT3 and DXT5 compressions in Vulkan? The equivalent formats should be:
          GL_COMPRESSED_RGBA_S3TC_DXT1_EXT -> VkFormat::VK_FORMAT_BC1_RGBA_SRGB_BLOCK
          GL_COMPRESSED_RGBA_S3TC_DXT3_EXT -> VkFormat::VK_FORMAT_BC2_SRGB_BLOCK
          GL_COMPRESSED_RGBA_S3TC_DXT5_EXT -> VkFormat::VK_FORMAT_BC3_SRGB_BLOCK
      Is that correct? Since compressed images are supported directly by the hardware, I should be able to use the same image data I've used for glGetCompressedTexImage, and map it to the memory of a VkImage, true? So far the image has always ended up corrupted. Is there anything else I need to take into account?
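Regarding #Q2, one commonly used alternative to rewriting a single MVP slot between draw calls is to give every object its own region in one larger uniform buffer and to select the region per draw, for example through a VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC descriptor and the dynamic offsets passed to vkCmdBindDescriptorSets. The sketch below covers only the offset arithmetic and is an illustration, not the confirmed solution to the problem described above. `alignUp` and `objectOffset` are hypothetical helper names, and `minAlignment` stands for the device's minUniformBufferOffsetAlignment limit, which is a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `size` up to the next multiple of `alignment`.
// `alignment` must be a non-zero power of two.
std::uint64_t alignUp(std::uint64_t size, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

// Byte offset of the region for object `index` in a buffer that holds one
// aligned region of `dataSize` bytes per object.
std::uint64_t objectOffset(std::uint32_t index, std::uint64_t dataSize,
                           std::uint64_t minAlignment) {
    return index * alignUp(dataSize, minAlignment);
}
```

Because each object reads from its own region, the host can write all matrices before submission and no draw depends on memory that is overwritten later in the same frame.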
  7. I'm having a lot of problems with collisions between objects, and I'm kind of lost on what I can do to combat these issues. I've recorded a video which should make it clearer what I mean: https://youtu.be/XhdhyUH6Nf4

      • The player (Kinematic Capsule Controller) pushes dynamic objects away at full force. This is especially problematic, because objects can be pushed out of the world this way. I'd like to know if I can either 1) disable being able to push objects completely, so I can write my own implementation, or 2) decrease the force by which objects are being pushed if possible.
      • Dynamic physics objects can push other objects out of the world, as demonstrated in the video. All of the objects are of type btRigidBody (including the ragdoll body parts). The mass of the box is 80.
      • Small objects are sometimes very jittery when on the ground due to gravity (noticeable with the ragdoll). However, without applying gravity on the ground, objects wouldn't slide down slopes, so I'm not sure what to do against the jittering.
      • Constraints are too lenient. As you can see in the video, when the player walks into the ragdoll, the ragdoll parts sort of 'stretch'. This is because the ragdoll parts are moving apart even though they shouldn't be able to, and then return to the correct position. All of the ragdoll constraints are conetwist constraints with softness = 1, bias factor = 0.3 and relaxation factor = 1.

      My simulation rate is 60 ticks per second. CCD is enabled for all objects. Some of these problems might be related, but I'm out of ideas at this point.
  8. I have a vector A, which represents a velocity, and a normalized vector B, which represents a direction. I'd like to modify A so that only the part of the velocity that doesn't go in the direction of B remains.

      Examples:
      #1 A = (175,25,33) B = (0,1,0) Result = (175,0,33) -- Only x and z remain
      #2 A = (175,25,33) B = (1,0,0) Result = (0,25,33) -- Only y and z remain
      #3 A = (175,25,33) B = (-0.116872,0.943617,0.309723) Result = ?

      How can I accomplish this? I suppose I'd have to 'map' the velocity A onto a plane orthogonal to B, I'm just not sure how. Another idea I've had is:
      1) Set C = (0,1,0)
      2) Calculate quaternion rotation from B to C using the cross product
      3) Rotate A by the result
      4) Set A.y = 0
      5) Rotate A by the inverse of the result

      Would that work?
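Mapping A onto the plane orthogonal to B, as suggested in the post, is the standard vector rejection: subtract from A its projection onto B, i.e. A' = A - (A·B)B for a normalized B. A minimal sketch follows; the `Vec3` type and the function names are placeholders for whatever math library is in use.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Removes from `a` the component that points along `b`.
// `b` is assumed to be normalized (checked loosely by the assert).
Vec3 removeComponentAlong(const Vec3& a, const Vec3& b) {
    assert(std::fabs(dot(b, b) - 1.0f) < 1e-3f);
    const float d = dot(a, b); // length of a's projection onto b
    return Vec3{a.x - d * b.x, a.y - d * b.y, a.z - d * b.z};
}
```

For examples #1 and #2 this reproduces (175,0,33) and (0,25,33). For example #3 the result is the vector whose dot product with B is zero up to floating-point error.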
  9. One more thing I forgot to ask: previously I had one buffer per light, each containing the respective light data, and then I just bound (using glBindBufferBase) the buffers of the visible lights to the slots of the LightSources array.

      How would I do that with this approach? I'd need the opposite of glBindBufferRange (to bind a buffer to a range of an indexed buffer target), but that doesn't seem to exist. Do I have to use a single buffer for all light data? That would leave me with two alternatives:
      - Store ALL lights within a large buffer and impose a limit of max lights allowed in a level (MAX_LIGHTS only referred to max visible (in the view frustum)/active lights thus far), or
      - Use a smaller buffer (sizeof(lightData) *MAX_LIGHTS) and re-upload the light data of all visible lights whenever there's a change.

      I don't really like either solution. Is there a third option I'm missing?
  10. Have you tried what Mathias suggested? Redefining the array inside the block, not outside?
          layout (std140) uniform LightSourceBlock { LightSource lights[MAX_LIGHTS]; } LightSources;
      This works. It's how I define UBOs. Tried it on nVidia hardware (GT21x, Fermi), and AMD (HD 5xxx, HD 4xxx), GLSL 330.

      I wanted to give both approaches a shot, but after trying Mathias' solution it worked immediately. Thanks!

      However, I've stumbled over another shader error on Nvidia, which doesn't occur on my AMD card:
          error C5208: Sampler needs to be a uniform (global or parameter to main), need to inline function or resolve conditional expression
      The line it gives me just points to my main function which does the processing, which doesn't help me at all. The error doesn't make much sense to me either, since all of my samplers are uniforms. The only possible cause I could think of is my shadow samplers, which are a uniform array:
          uniform sampler2DShadow shadowMaps[MAX_LIGHTS];

      The sampler for each light is then accessed by shadowMaps[LightSources.lights[i].shadowMapID]. (shadowMapID is an int.) Could this be the cause? If so, is there a simple way to get around it?

      I apologize for the sparse information. I would just do some tests myself, but I can only get access to the Nvidia machine for testing sporadically, and the shaders work fine on my AMD setup. Maybe someone has encountered this error before?
  11. Thanks, sadly the same error still occurs. Changing the version to "#version 420 core" also didn't help.

      v330 introduced the intBitsToFloat/floatBitsToInt functions, which allow bitwise/reinterpret casts. You can use a 32bit int buffer and then reinterpret its elements as floats as required.

      Are buffer texture lookups as expensive as regular texture lookups? I have 12 variables(=512 bytes) per light, and since the data types differ (int,float,vec3,mat3,...) I suppose GL_R32I makes the most sense for the texture format? However, that would result in a lot of lookups.

      Also, I've noticed that texelFetch (https://www.opengl.org/sdk/docs/man/html/texelFetch.xhtml) returns a 4-component vector. So, if I have a 1-component texture format, would that give me the data of 4 texels?
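On the host side, filling such a 32-bit integer buffer with mixed int and float fields means storing each float's bit pattern rather than its converted value, so that intBitsToFloat in the shader recovers it exactly. The following is a sketch of that packing. The C++ functions below only mirror the GLSL names for readability, and `packLightEntry` with its two-field layout is a made-up example, not the layout used in the post.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side counterpart of GLSL floatBitsToInt: copies the bit pattern.
std::int32_t floatBitsToInt(float value) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "expects 32-bit float");
    std::int32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Host-side counterpart of GLSL intBitsToFloat.
float intBitsToFloat(std::int32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// Hypothetical layout: dst[0] holds an int field as-is,
// dst[1] holds a float field as raw bits.
void packLightEntry(std::int32_t* dst, std::int32_t shadowMapId, float intensity) {
    assert(dst != nullptr);
    dst[0] = shadowMapId;
    dst[1] = floatBitsToInt(intensity);
}
```

Since the bits are copied unchanged, this avoids the rounding concern that comes with storing integers as floats and casting them back.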
  12. Hm... My uniform block contains both integer and float data, I suppose I'd have to add the integers as floats, and then cast them back in the shader? Couldn't that result in imprecision errors? (Although I suppose I could just round the value to be sure)     I didn't realize there was such a thing. Well, I tried running my example through the glslangValidator and no errors were generated.
  13. I have a uniform block in my shader, which I'm accessing within a loop:

      #version 330 core
      const int MAX_LIGHTS = 8; // Maximum amount of lights
      uniform int numLights; // Actual amount of lights (Cannot exceed MAX_LIGHTS)
      layout (std140) uniform LightSourceBlock {
          vec3 position;
          [...]
      } LightSources[MAX_LIGHTS]; // Light Data
      void Test() {
          for(int i=0;i<numLights;i++) {
              vec3 pos = LightSources[i].position; // Causes "index must be constant expression" error on Nvidia cards
              [...]
          }
      }

      This works fine on my AMD card; on an Nvidia card, however, it generates the error "index must be constant expression". I've tried changing the shader to this:

      #version 330 core
      const int MAX_LIGHTS = 8; // Maximum amount of lights
      uniform int numLights; // Actual amount of lights (Cannot exceed MAX_LIGHTS)
      layout (std140) uniform LightSourceBlock {
          vec3 position;
          [...]
      } LightSources[MAX_LIGHTS]; // Light Data
      void Test() {
          for(int i=0;i<MAX_LIGHTS;i++) {
              if(i >= numLights) break;
              vec3 pos = LightSources[i].position; // Causes "index must be constant expression" error on Nvidia cards
              [...]
          }
      }

      I figured this way it might consider "i" to be a constant, but the error remains.

      So how can I access "LightSources" with a non-const index, without having to break up the loop and paste the same code several times in a row?
  14. Actually, the solution was to add noexcept to the destructor. Someone on Stack Overflow (http://stackoverflow.com/a/33099844/2482983) posted this idea; I'll just paste his response here for better visibility, in case anyone else has the same problem:
  15. I've used the example from http://www.rasterbar.com/products/luabind/docs.html#deriving-in-lua to define a class in C++ that I can derive from in Lua:

      class base {
      public:
          base(const char* s) { std::cout << s << "\n"; }
          virtual void f(int a) { std::cout << "f(" << a << ")\n"; }
      };

      struct base_wrapper : base, luabind::wrap_base {
          base_wrapper(const char* s) : base(s) {}
          virtual void f(int a) { call<void>("f", a); }
          static void default_f(base* ptr, int a) { return ptr->base::f(a); }
      };

      ...

      module(L) [
          class_<base, base_wrapper>("base")
              .def(constructor<const char*>())
              .def("f", &base::f, &base_wrapper::default_f)
      ];

      I've then created a derived class in Lua:

      class 'base_derived' (base)
      function base_derived:__init(str)
          base.__init(self,str)
      end
      function base_derived:f()
          this_function_doesnt_exist()
      end

      Any call to 'f' is supposed to throw a Lua error, which works fine if I do it in Lua:

      local x = base_derived("Test")
      x:f() -- Throws "attempt to call a nil value..." error

      I'd like to do the equivalent of that, but in C++:

      auto g = luabind::globals(l);
      auto r = g["base_derived"];
      if(r) {
          luabind::object o = r("Test");
          auto gm = luabind::object_cast<base_wrapper*>(o);
          if(gm != nullptr) {
              try {
                  luabind::call_member<void>(o,"f",5);
              }
              catch(luabind::error &e) {
                  std::cout<<"[LUA] Error: "<<e.what()<<std::endl;
              }
          }
          o.push(l);
      }

      However, the 'luabind::call_member' call causes an abort in 'luabind/detail/call_member.hpp', line 258:

      // Code snippet of luabind/detail/call_member.hpp
      ~proxy_member_void_caller() {
          if (m_called) return;
          m_called = true;
          // don't count the function and self-reference
          // since those will be popped by pcall
          int top = lua_gettop(L) - 2;
          // pcall will pop the function and self reference
          // and all the parameters
          push_args_from_tuple<1>::apply(L, m_args);
          if (pcall(L, boost::tuples::length<Tuple>::value + 1, 0)) {
              assert(lua_gettop(L) == top + 1);
      #ifndef LUABIND_NO_EXCEPTIONS
              ////////////////////////////////////////////
              throw luabind::error(L); // LINE 258
              ////////////////////////////////////////////
      #else
              error_callback_fun e = get_error_callback();
              if (e) e(L);
              assert(0 && "the lua function threw an error and exceptions are disabled."
                          "If you want to handle this error use luabind::set_error_callback()");
              std::terminate();
      #endif
          }
          // pops the return values from the function
          stack_pop pop(L, lua_gettop(L) - top);
      }

      The exception in that line isn't actually thrown, but it is what causes the abort. However, the abort only happens if the Lua function raises a Lua error. If I comment out the 'this_function_doesnt_exist()' call, both the Lua and C++ versions run just fine. Why is 'throw luabind::error(L);' causing an abort, and what can I do to safely call the function from C++ even with potential Lua errors?
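One likely explanation, assuming the code is built as C++11 or later: destructors are implicitly noexcept there, so an exception leaving a destructor such as `~proxy_member_void_caller()` ends in std::terminate instead of reaching the surrounding catch block. The sketch below is not luabind code; `DeferredCall` and `runDeferredCall` are hypothetical names that only illustrate how declaring the destructor `noexcept(false)` lets the exception propagate.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a proxy object that performs its work, and
// reports failure, in its destructor. This is not luabind code.
struct DeferredCall {
    bool shouldFail;
    explicit DeferredCall(bool fail) : shouldFail(fail) {}

    // Without noexcept(false), C++11 and later treat this destructor as
    // noexcept, and the throw below would call std::terminate.
    ~DeferredCall() noexcept(false) {
        if (shouldFail) throw std::runtime_error("deferred call failed");
    }
};

// Runs the deferred call and returns the error text, or "ok" on success.
inline std::string runDeferredCall(bool fail) {
    try {
        DeferredCall call(fail); // destructor runs at the end of this scope
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "ok";
}
```

Even with `noexcept(false)`, a destructor that throws while another exception is already unwinding the stack still terminates the program, so this pattern is only safe when the object is destroyed during normal control flow.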