Member Since 02 May 2013
Offline Last Active Yesterday, 06:25 AM

Posts I've Made

In Topic: [Vulkan] Descriptor binding point confusion / Uniform buffer memory barriers...

25 March 2016 - 01:49 AM

Based on my understanding of the spec, it's because you're mapping an image created with VK_IMAGE_TILING_OPTIMAL, whose texels are not necessarily laid out in linear scanlines. The fix is to create a staging image with the same format and VK_IMAGE_TILING_LINEAR instead, and then copy from the LINEAR image to the OPTIMAL one with vkCmdCopyImage.

I've tried setting the tiling to linear without copying, and I've tried copying from a linear staging image to optimal, but the result is always the same.


In Topic: [Vulkan] Descriptor binding point confusion / Uniform buffer memory barriers...

23 March 2016 - 01:02 AM

I'm still struggling with compressed images.

Here's what the specification says about that:


Compressed texture images stored using the S3TC compressed image formats are represented as a collection of 4×4 texel blocks, where each block contains 64 or 128 bits of texel data. The image is encoded as a normal 2D raster image in which each 4×4 block is treated as a single pixel.

Source: https://www.khronos.org/registry/dataformat/specs/1.1/dataformat.1.1.html#S3TC



For images created with linear tiling, rowPitch, arrayPitch and depthPitch describe the layout of the subresource in linear memory. For uncompressed formats, rowPitch is the number of bytes between texels with the same x coordinate in adjacent rows (y coordinates differ by one). arrayPitch is the number of bytes between texels with the same x and y coordinate in adjacent array layers of the image (array layer values differ by one). depthPitch is the number of bytes between texels with the same x and y coordinate in adjacent slices of a 3D image (z coordinates differ by one). Expressed as an addressing formula, the starting byte of a texel in the subresource has address:

// (x,y,z,layer) are in texel coordinates

address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*texelSize + offset

For compressed formats, the rowPitch is the number of bytes between compressed blocks in adjacent rows. arrayPitch is the number of bytes between blocks in adjacent array layers. depthPitch is the number of bytes between blocks in adjacent slices of a 3D image.

// (x,y,z,layer) are in block coordinates

address(x,y,z,layer) = layer*arrayPitch + z*depthPitch + y*rowPitch + x*blockSize + offset;

arrayPitch is undefined for images that were not created as arrays. depthPitch is defined only for 3D images.

For color formats, the aspectMask member of VkImageSubresource must be VK_IMAGE_ASPECT_COLOR_BIT. For depth/stencil formats, aspect must be either VK_IMAGE_ASPECT_DEPTH_BIT or VK_IMAGE_ASPECT_STENCIL_BIT. On implementations that store depth and stencil aspects separately, querying each of these subresource layouts will return a different offset and size representing the region of memory used for that aspect. On implementations that store depth and stencil aspects interleaved, the same offset and size are returned and represent the interleaved memory allocation.


Source: https://www.khronos.org/registry/vulkan/specs/1.0/xhtml/vkspec.html#resources-images
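
A minimal C++ sketch of the compressed-format addressing rule quoted above, restricted to a single-layer 2D image (z = 0, layer = 0). The helper names `blocks_along` and `block_offset` are made up for illustration, and the 4×4 block extent is assumed from the S3TC description; in real code the offset and rowPitch would come from vkGetImageSubresourceLayout.

```cpp
#include <cstddef>
#include <cstdint>

// Number of 4x4 blocks needed to cover `texels` texels along one axis
// (rounded up, so a 10-texel-wide image needs 3 blocks per row).
constexpr std::uint32_t blocks_along(std::uint32_t texels) {
    return (texels + 3u) / 4u;
}

// Byte offset of block (bx, by) in a single-layer 2D subresource,
// i.e. the quoted formula with z = 0 and layer = 0.
// blockSize is 8 bytes for 64-bit blocks and 16 bytes for 128-bit blocks.
constexpr std::size_t block_offset(std::size_t offset, std::size_t rowPitch,
                                   std::size_t blockSize,
                                   std::uint32_t bx, std::uint32_t by) {
    return offset + by * rowPitch + bx * blockSize;
}
```

For a 256×256 image with 64-bit blocks, one row holds 64 blocks, so a tightly packed source uses 512 bytes per row of blocks; the rowPitch reported by the driver may be larger than that.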


I'm using GLI to load the DDS data (which is supposed to work with Vulkan, but I've also tried other libraries).

Here's my code for loading and mapping the data:

struct dds load_dds(const char *fileName)
{
    auto tex = gli::load_dds(fileName);
    auto format = tex.format();
    VkFormat vkFormat = static_cast<VkFormat>(format);
    auto extents = tex.extent();
    auto r = dds {};
    r.texture = new gli::texture(tex);
    r.width = extents.x;
    r.height = extents.y;
    r.format = vkFormat;
    return r;
}

void map_data_dds(struct dds *r,void *imgData,VkSubresourceLayout layout)
{
    auto &tex = *static_cast<gli::texture*>(r->texture);
    gli::storage storage {tex.format(),tex.extent(),tex.layers(),tex.faces(),tex.levels()};

    auto *srcData = static_cast<uint8_t*>(tex.data(0,0,0));
    auto *destData = static_cast<uint8_t*>(imgData); // Pointer to mapped memory of VkImage
    destData += layout.offset; // layout = VkSubresourceLayout of the image subresource
    auto extents = tex.extent();
    auto w = extents.x;
    auto h = extents.y;
    auto blockSize = storage.block_size();
    auto blockCount = storage.block_count(0);
    //auto blockExtent = storage.block_extent();

    auto method = 0; // All methods have the same result
    if(method == 0)
    {
        // Copy block by block, honoring the destination row pitch
        for(auto y=decltype(blockCount.y){0};y<blockCount.y;++y)
        {
            auto *rowDest = destData +y *layout.rowPitch;
            auto *rowSrc = srcData +y *(blockCount.x *blockSize);
            for(auto x=decltype(blockCount.x){0};x<blockCount.x;++x)
            {
                auto *pxDest = rowDest +x *blockSize;
                auto *pxSrc = rowSrc +x *blockSize; // 4x4 image block
                memcpy(pxDest,pxSrc,blockSize); // 64 or 128 bits per block, depending on the format
                //memset(pxDest,128,blockSize);
            }
        }
    }
    else if(method == 1)
    {
        memcpy(destData,tex.data(0,0,0),tex.size(0)); // Just one layer for now
        //destData += tex.size(0);
    }
}
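
Method 0 above can also be written as one memcpy per row of blocks, since the blocks within a row are contiguous in both the source and the destination. The following is only a sketch with a hypothetical helper name (`copy_block_rows`); it assumes a tightly packed source and a destination pitch at least as large as one row of blocks. It is equivalent to the per-block loop, not a fix for the problem described.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies blockRows rows of compressed blocks from a tightly packed source
// into a destination whose rows start rowPitch bytes apart.
// Assumes rowPitch >= blocksPerRow * blockSize.
inline void copy_block_rows(const std::uint8_t *src, std::uint8_t *dst,
                            std::size_t blocksPerRow, std::size_t blockRows,
                            std::size_t blockSize, std::size_t rowPitch) {
    const std::size_t srcRowBytes = blocksPerRow * blockSize;
    for (std::size_t y = 0; y < blockRows; ++y) {
        // One contiguous copy per row of blocks; padding bytes between
        // destination rows are left untouched.
        std::memcpy(dst + y * rowPitch, src + y * srcRowBytes, srcRowBytes);
    }
}
```

In the code above this would be called with blockCount.x, blockCount.y, blockSize and layout.rowPitch after adding layout.offset to the mapped pointer.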

Here's my code for initializing the texture (Which is 1:1 the same as the cube demo from the SDK, except for the dds-code):

static void demo_prepare_texture_image(struct demo *demo, const char *filename,
                                       struct texture_object *tex_obj,
                                       VkImageTiling tiling,
                                       VkImageUsageFlags usage,
                                       VkFlags required_props) {
    VkResult U_ASSERT_ONLY err;
    bool U_ASSERT_ONLY pass;
   /* const VkFormat tex_format = VK_FORMAT_R8G8B8A8_UNORM;
    int32_t tex_width;
    int32_t tex_height;
    if (!loadTexture(filename, NULL, NULL, &tex_width, &tex_height)) {
        printf("Failed to load textures\n");
    struct dds ddsData = load_dds("C:\\VulkanSDK\\\\Demos\\x64\\Debug\\iron01.dds");

    VkFormat tex_format = ddsData.format;
    int32_t tex_width = ddsData.width;
    int32_t tex_height = ddsData.height;

    tex_obj->tex_width = tex_width;
    tex_obj->tex_height = tex_height;

    const VkImageCreateInfo image_create_info = {
        .pNext = NULL,
        .imageType = VK_IMAGE_TYPE_2D,
        .format = tex_format,
        .extent = {tex_width, tex_height, 1},
        .mipLevels = 1,
        .arrayLayers = 1,
        .samples = VK_SAMPLE_COUNT_1_BIT,
        .tiling = tiling,
        .usage = usage,
        .flags = 0,
        .initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED,

    VkMemoryRequirements mem_reqs;

    err =
        vkCreateImage(demo->device, &image_create_info, NULL, &tex_obj->image);

    vkGetImageMemoryRequirements(demo->device, tex_obj->image, &mem_reqs);

    tex_obj->mem_alloc.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    tex_obj->mem_alloc.pNext = NULL;
    tex_obj->mem_alloc.allocationSize = mem_reqs.size;
    tex_obj->mem_alloc.memoryTypeIndex = 0;

    pass = memory_type_from_properties(demo, mem_reqs.memoryTypeBits,

    /* allocate memory */
    err = vkAllocateMemory(demo->device, &tex_obj->mem_alloc, NULL,

    /* bind memory */
    err = vkBindImageMemory(demo->device, tex_obj->image, tex_obj->mem, 0);

    if (required_props & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT) {
        const VkImageSubresource subres = {
            .aspectMask = VK_IMAGE_ASPECT_COLOR_BIT,
            .mipLevel = 0,
            .arrayLayer = 0,
        VkSubresourceLayout layout;
        void *data;

        vkGetImageSubresourceLayout(demo->device, tex_obj->image, &subres,

        err = vkMapMemory(demo->device, tex_obj->mem, 0,
                          tex_obj->mem_alloc.allocationSize, 0, &data);

        // DDS

       // if (!loadTexture(filename, data, &layout, &tex_width, &tex_height)) {
       //     fprintf(stderr, "Error loading texture: %s\n", filename);

        vkUnmapMemory(demo->device, tex_obj->mem);

    demo_set_image_layout(demo, tex_obj->image, VK_IMAGE_ASPECT_COLOR_BIT,
                          VK_IMAGE_LAYOUT_PREINITIALIZED, tex_obj->imageLayout,
    /* setting the image layout does not reference the actual memory so no need
     * to add a mem ref */

I've uploaded the entire demo here. The only things I've changed from the cube demo from the Vulkan SDK are the functions above.

I've tried various images with different compression formats (BC1/2/3); none of them work.





[image: source texture] turns into: [image: rendered result]

(Not the cube demo, but same principle)

[image: source texture] turns into: [image: rendered result]




Any hints would be much appreciated.

In Topic: GLSL Error C1502 (Nvidia): "index must be constant expression"

16 October 2015 - 12:29 AM

Use this instead:
struct LightSource
{
    vec3 position;
};

layout (std140) uniform LightSourceBlock
{
    LightSource lights[MAX_LIGHTS];
} LightSources;

//Index like this: LightSources.lights[i]
This uses one constant buffer and places all lights in that single buffer. That's way more efficient, more compatible, and the method everyone uses.


One more thing I forgot to ask:

Previously I had one buffer per light, each containing the respective light data, and then I just bound (Using glBindBufferBase) the buffers of the visible lights to the slots of the LightSources array.


How would I do that with this approach? I'd need the opposite of glBindBufferRange (To bind a buffer to a range of an indexed buffer target), but that doesn't seem to exist.

Do I have to use a single buffer for all light data? That would leave me with two alternatives:

- Store ALL lights in one large buffer and impose a limit on the number of lights allowed in a level (so far, MAX_LIGHTS only referred to the maximum number of visible (in the view frustum) / active lights), or

- Use a smaller buffer (sizeof(lightData) *MAX_LIGHTS) and re-upload the light data of all visible lights whenever there's a change.
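
As a rough C++ sketch of the second option: gather the visible lights into one contiguous CPU-side array and upload that array in a single call whenever the visible set changes. Everything here is an assumption made for illustration: the `LightData` struct mirrors only the `vec3 position` member shown in the shader snippet, `pack_visible_lights` is a hypothetical helper, and the value of MAX_LIGHTS is made up.

```cpp
#include <cstddef>
#include <vector>

// CPU-side mirror of one std140 array element. A struct holding a single
// vec3 has a 16-byte array stride under std140, hence the explicit padding.
struct LightData {
    float position[3];
    float pad;
};
static_assert(sizeof(LightData) == 16, "must match the std140 array stride");

constexpr std::size_t MAX_LIGHTS = 8; // assumed value, not taken from the thread

// Gathers the visible lights into a contiguous staging array, capped at
// MAX_LIGHTS. Out-of-range indices are skipped. The result could then be
// uploaded in one call, for example with glBufferSubData.
inline std::vector<LightData> pack_visible_lights(
        const std::vector<LightData> &allLights,
        const std::vector<std::size_t> &visibleIndices) {
    std::vector<LightData> packed;
    packed.reserve(MAX_LIGHTS);
    for (std::size_t idx : visibleIndices) {
        if (packed.size() == MAX_LIGHTS) break;
        if (idx < allLights.size()) packed.push_back(allLights[idx]);
    }
    return packed;
}
```

The upload size is then packed.size() * sizeof(LightData), so only the lights actually in use are transferred.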

I don't really like either solution, is there a third option I'm missing?

In Topic: GLSL Error C1502 (Nvidia): "index must be constant expression"

15 October 2015 - 01:29 PM


Thanks, sadly the same error still occurs.
Have you tried what Mathias suggested? Redefining the array inside the block, not outside?
layout (std140) uniform LightSourceBlock
{
    LightSource lights[MAX_LIGHTS];
} LightSources;

This works. It's how I define UBOs. Tried it on nVidia hardware (GT21x, Fermi), and AMD (HD 5xxx, HD 4xxx), GLSL 330.

I wanted to give both approaches a shot, but after trying Mathias' solution it worked immediately. Thanks!


However, I've stumbled over another shader error on Nvidia, which doesn't occur on my AMD-card:

error C5208: Sampler needs to be a uniform (global or parameter to main), need to inline function or resolve conditional expression

The line it reports just points to my main function, which does the processing, so it doesn't help me at all. The error doesn't make much sense to me either; all of my samplers are uniforms.

The only possible cause I can think of is my shadow samplers, which are a uniform array:

uniform sampler2DShadow shadowMaps[MAX_LIGHTS];


The sampler for each light is then accessed by shadowMaps[LightSources.lights[i].shadowMapID]. (shadowMapID is an int)

Could this be the cause? If so, is there a simple way to get around it?


I apologize for the sparse information. I would just run some tests myself, but I can only get access to the Nvidia machine sporadically, and the shaders work fine on my AMD setup.

Maybe someone has encountered this error before?

In Topic: GLSL Error C1502 (Nvidia): "index must be constant expression"

15 October 2015 - 03:32 AM

lol yeah, I remember when I got this issue (same GLSL version, nVidia hardware), apparently "const int" isn't constant enough for nVidia. As Mathias said, a #define works fine in this case.


Thanks, sadly the same error still occurs.

Changing the version to "#version 420 core" also didn't help.



My uniform block contains both integer and float data, I suppose I'd have to add the integers as floats, and then cast them back in the shader?

v330 introduced the intBitsToFloat/floatBitsToInt functions, which allow bitwise/reinterpret casts. You can use a 32-bit int buffer and then reinterpret its elements as floats as required.
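
On the CPU side, filling such an integer buffer means storing each float's bit pattern rather than converting its value. A minimal C++ sketch, assuming 32-bit IEEE 754 floats; the helper names `float_to_bits` and `bits_to_float` are made up for illustration:

```cpp
#include <cstdint>
#include <cstring>

// Stores the bit pattern of a float in a 32-bit integer, so it can live in
// an integer buffer and be recovered in GLSL with intBitsToFloat().
inline std::int32_t float_to_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "float must be 32 bits");
    std::int32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Inverse operation, the CPU-side counterpart of intBitsToFloat().
inline float bits_to_float(std::int32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Integer members are written to the buffer unchanged, and float members go through float_to_bits before being written; the shader then calls intBitsToFloat only on the elements that hold floats.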


Are buffer texture lookups as expensive as regular texture lookups? I have 12 variables(=512 bytes) per light, and since the data types differ (int,float,vec3,mat3,...) I suppose GL_R32I makes the most sense for the texture format? However, that would result in a lot of lookups.


Also, I've noticed that texelFetch returns a 4-component vector. So, if I have a 1-component texture format, would that give me the data of 4 texels?