
#5140371 Array of samplers vs. texture array

Posted by Chris_F on 19 March 2014 - 12:08 PM

A texture array is a native resource type: a single texture object containing a stack of equally sized, mip-mapped 2D slices, any of which can be indexed dynamically in the shader.


An array of samplers is syntactic sugar over declaring a large number of individual samplers.

It's the same as having sampler0, sampler1, sampler2, etc... If you're using them in a loop that the compiler can unroll, then they're fine. If you're randomly indexing into the array, then I'd expect waterfalling to occur (i.e. if(i==0) use sampler0 elif(i==1) use sampler1, etc...)


I'm sure on old hardware that is the case. Old hardware doesn't even support bindless textures, so that is not really an issue anyway.


A quick experiment that renders 1024 random sprites per frame from an array of 32 images shows a performance difference of about 0.05ms when comparing a TEXTURE_2D_ARRAY to an array of bindless handles stored in a SSBO.


Edit: I just changed the SSBO to a UBO and now there is no performance difference at all.
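To make the comparison concrete, here is a hedged GLSL sketch of the two access patterns (the names, binding points, and the 32-image count are illustrative; the bindless path assumes ARB_bindless_texture, and dynamic indexing of the handle array assumes a dynamically uniform index):

```glsl
#version 430
#extension GL_ARB_bindless_texture : require

// (a) Native texture array: one sampler, slice selected per fragment.
layout(binding = 0) uniform sampler2DArray sprite_array;

// (b) Bindless handles stored in a uniform block (originally an SSBO).
layout(std140, binding = 1) uniform Handles {
    sampler2D sprites[32];
};

in vec2 uv;
flat in int image_index;
out vec4 color;

void main() {
    // Path (a): slice index passed as the third texture coordinate.
    color = texture(sprite_array, vec3(uv, float(image_index)));
    // Path (b): index into the array of bindless handles instead.
    // color = texture(sprites[image_index], uv);
}
```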

#5139762 [TEXTURE] BC7

Posted by Chris_F on 17 March 2014 - 12:26 PM

Using BC5 you have to compute the vector from the normal map like that for each pixel


A fairly trivial calculation by today's standards. The advantage is that the R and G channels are completely uncorrelated, and you have 8 bpp divided between just two channels instead of three. If anyone happens to know of an in-depth comparison of BC5 and BC7 specifically for normal maps, I'd be interested to see it.
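The reconstruction in question is just recovering z from the two stored channels; a common GLSL sketch, assuming the normal's x and y were remapped into [0, 1] for storage (the function and sampler names are illustrative):

```glsl
vec3 fetch_normal(sampler2D normal_map, vec2 uv)
{
    // Undo the [0, 1] -> [-1, 1] remap on the two BC5 channels.
    vec2 xy = texture(normal_map, uv).rg * 2.0 - 1.0;
    // Unit-length normals satisfy x^2 + y^2 + z^2 = 1, so solve for z.
    float z = sqrt(max(0.0, 1.0 - dot(xy, xy)));
    return vec3(xy, z);
}
```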


Does that mean BC7 can't be used because of that, or is it just a particular case where BC7 should not be used?


I would say to use BC1 over BC7 whenever you can get away with it, i.e. whenever the quality difference will not be noticeable.

#5139597 [TEXTURE] BC7

Posted by Chris_F on 16 March 2014 - 08:54 PM

BC7 isn't really a he so much as an inanimate and gender-neutral it. BC7 can't represent HDR, so for that you will want BC6H. BC7 isn't ideal for normal maps, so for that you will want BC5. If you have multiple uncorrelated single-channel textures, combining them in a BC7 texture would introduce cross-talk, so for that you will want BC4. Not all textures need the highest quality possible. For some textures BC7 will be too large, so for those you will want BC1.


No single format is good at everything. That is, until ASTC is implemented on desktop GPUs; then you will be able to replace all of those formats with one.

#5138277 Having a VBO/VAO for each object.

Posted by Chris_F on 11 March 2014 - 07:46 PM

Isn't a shader storage buffer object considerably slower than a uniform buffer object?


When I'm only rendering a few thousand sprites to the screen, there is no noticeable difference in speed. Using a UBO does put a relatively small limit on the number of sprites, though, since even newer GPUs typically cap a single uniform block at around 64 KB.

#5138053 Having a VBO/VAO for each object.

Posted by Chris_F on 11 March 2014 - 02:40 AM



I'm working on a 2D Game Framework for a college assignment, I need to handle Basic sprite creation and animation, along with text display.


You don't need a VBO then. About 7 years ago when I started OpenGL I used to draw all my models with glBegin()/glEnd(), drawing some 100,000 vertices at 60 fps or more. So for a basic 2D game, say Mario, VBOs aren't even needed. Whatever you decide to do, I wouldn't worry too much about architecture for managing sprites. Managing sprites/particles in a next-gen 3D renderer is when you need to get down to managing sprite VBOs.



The link in the OP's post mentions "modern OpenGL", and I would recommend that he pursue something with the OpenGL 3.x/4.x core profile instead of going after the compatibility profile or fixed-function OpenGL, which has been deprecated for the better part of a decade.


I played around with my idea (this is my first time working with this kind of thing.) I create a SSBO (UBO would work too) that contains the position, scale, rotation, and texture index for each of my sprites. I upload my sprites into a 2D texture array. Then I call glDrawArraysInstanced. In my vertex shader I get the vertex position using gl_VertexID and I look up my sprite attributes in the SSBO using gl_InstanceID.


2048 sprites rendered to the screen with one OpenGL call and no attribute/element buffers needed. If anyone knows of a better way to handle this, I'd like to hear about it.

Attached Thumbnails

  • sprites.jpg
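A GLSL sketch of the vertex shader for the setup described above (the struct layout, binding point, and names are hypothetical, and the quad here is generated as a 4-vertex triangle strip from gl_VertexID):

```glsl
#version 430

// Hypothetical per-sprite data uploaded to the SSBO.
struct Sprite {
    vec2  position;
    vec2  scale;
    float rotation;
    float texture_index;
};

layout(std430, binding = 0) buffer Sprites {
    Sprite sprites[];
};

out vec3 uvw; // xy = texture coords, z = array slice

void main() {
    // Unit-quad corner from gl_VertexID: (0,0), (1,0), (0,1), (1,1).
    vec2 corner = vec2(gl_VertexID & 1, gl_VertexID >> 1);

    // Per-instance attributes fetched with gl_InstanceID.
    Sprite s = sprites[gl_InstanceID];
    float c = cos(s.rotation), sn = sin(s.rotation);
    vec2 p = (corner - 0.5) * s.scale;
    p = vec2(p.x * c - p.y * sn, p.x * sn + p.y * c) + s.position;

    uvw = vec3(corner, s.texture_index);
    gl_Position = vec4(p, 0.0, 1.0);
}
```

Drawn with a single glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, sprite_count), matching the one-call approach in the post.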

#5134015 Support multiple lights in Opengl ES 2.0 (no MRT)

Posted by Chris_F on 23 February 2014 - 11:31 PM

By "utilizes multiple lights with multiple render target" do you mean to say that it was a deferred renderer?


Light Pre-Pass (deferred lighting) doesn't require MRT and neither does forward shading. There is technically no limit to the number of lights either is capable of supporting. If you want to port your engine more or less as it is currently, you could maybe consider targeting OpenGL ES 3.0 instead. It's still pretty new, but there are already phones and tablets that support it, and ES 3.0 has MRT.

#5133720 OpenGL supporting libs

Posted by Chris_F on 22 February 2014 - 09:19 PM

I disagree. GLFW has MANY bugs.. I ran into enough that I just gave up on it.


I hope you reported those bugs so that they can be fixed. I haven't run into any thus far.


And as for assimp, I use assimp in my engine because my engine supports modding and I want modders to be able to load any file type that they want. Think of an engine like Unity. If it used a special format, no one would use it, because it would be too complicated to convert EVERY model to the proprietary format.


Unity is more than a game engine; it is a game engine and an advanced editor. There is absolutely no logical reason that the engine needs to load models from a number of different textual file formats. Have your assimp support built into the editor, so your modders can load assets to their hearts' content, and have the editor convert them to a binary representation specific to your engine. That way, when someone actually distributes a game, it can load the assets quickly. People don't like long loading times, and that is exactly what you will get if the engine has to parse every single model from a text or XML file.

#5132821 TexSubImage2D performance

Posted by Chris_F on 19 February 2014 - 06:31 PM

I was curious to see the performance of texture uploads with my configuration using OpenGL, and noticed something I think is odd. I create a 4K texture using glTexStorage2D with one MIP level and a format of GL_RGBA8. Then, every frame, I use glTexSubImage2D to re-upload a static image buffer to the texture. Based on the frame rate I get about 5.19 GB/s. Next, I changed the format of the texture to GL_SRGB8_ALPHA8 and retried the experiment. This time I am getting 2.81 GB/s, a significant decrease. This seems odd, because as far as I know there shouldn't be anything different about uploading sRGB data versus uploading RGB data, as no conversion should be taking place (sRGB conversion takes place in the shader, during sampling).


Some additional information. All I'm rendering is a fullscreen quad with a pixel shader that simply outputs vec4(1); I'm not even sampling from the texture or doing anything else each frame other than calling glTexSubImage2D. For the first test I use GL_RGBA and GL_UNSIGNED_INT_8_8_8_8_REV in the call to glTexSubImage2D, as this is what the driver tells me is ideal. For the second test I use GL_UNSIGNED_INT_8_8_8_8, as per the driver's suggestion. A bit of testing confirms that these are the fastest formats to use, respectively. This is using an Nvidia GPU.

#5132698 Which version of OpenGL is recommended for beginners?

Posted by Chris_F on 19 February 2014 - 11:42 AM

Don't learn/use OpenGL 1.x or fixed function. It won't teach you anything about modern graphics and the newer APIs aren't actually any harder to learn. Learn with whatever version of the API you think you are actually going to be using. I would recommend OpenGL 3.3 at least. OpenGL ES 3.0 is catching on and WebGL 2.0 probably will soon as well, so there is no reason to waste much time on older versions.

#5132398 Port HLSL to x86 Assembly

Posted by Chris_F on 18 February 2014 - 12:23 PM

I'm not aware of any HLSL compiler that will output x86 assembly.


What you could maybe do is use the Microsoft HLSL compiler to output HLSL IR, then convert the HLSL IR to GLSL, then use Mesa to convert GLSL to LLVM IR, and finally use LLVM to emit x86 assembly.

#5132078 DDS loader and question

Posted by Chris_F on 17 February 2014 - 11:59 AM



Haven't used it myself, but I do use GLM regularly.

#5131733 Why C++?

Posted by Chris_F on 16 February 2014 - 10:36 AM

If you were to try and re-implement something like Unreal Engine 4 in Java or C# it might become apparent why the industry uses C++ instead.


For simple indie games, I don't suppose it matters quite as much.

#5130110 SampleLevel Texture coordinate format

Posted by Chris_F on 09 February 2014 - 11:14 AM

As far as white noise goes, I'm playing around with this shader at the moment.

#version 430

layout(location = 0) uniform float time;
out vec4 frag_color;

uint hash(uint x) {
    x = ((x >> 16) ^ x) * 0x85ebca6bu;
    x = ((x >> 13) ^ x) * 0xc2b2ae35u;
    x = ((x >> 16) ^ x);
    return x;
}

uint hash(uvec2 v) { return hash(v.x ^ hash(v.y)); }
uint hash(uvec3 v) { return hash(v.x ^ hash(v.y) ^ hash(v.z)); }
uint hash(uvec4 v) { return hash(v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w)); }

// Build a float in [0, 1) from the low 23 bits of a hash.
float floatConstruct(uint m) {
    m &= 0x007FFFFFu;   // keep the mantissa bits
    m |= 0x3F800000u;   // set the exponent so the value lies in [1, 2)

    float f = uintBitsToFloat(m);
    return f - 1.0;
}

float random(float x) { return floatConstruct(hash(floatBitsToUint(x))); }
float random(vec2 v) { return floatConstruct(hash(floatBitsToUint(v))); }
float random(vec3 v) { return floatConstruct(hash(floatBitsToUint(v))); }
float random(vec4 v) { return floatConstruct(hash(floatBitsToUint(v))); }

void main() {
    float r = random(vec3(gl_FragCoord.xy, time));
    frag_color = vec4(r, r, r, 1.0);
}

It costs 0.28 ms @ 1080p on my GTX 760, which is not all that much slower than the fract/sin implementation. The results, however, are a lot better. The fract/sin implementation produces a lot of noticeable patterns for me, and because trig functions are implemented differently on various GPUs, it can have a lot of variability. This is GLSL, by the way, but I'm sure you can modify it if you want to try it.

#5129495 AMD's Mantle API

Posted by Chris_F on 06 February 2014 - 11:31 PM

My greatest hope for Mantle is that it will pressure the OpenGL ARB into developing an OpenGL 5.0 that doesn't fall massively short, as each major version update has.

#5128287 how to prevent save cheating? (for games where saving is NOT allowed)

Posted by Chris_F on 02 February 2014 - 07:02 PM

Don't even bother. If a person wants to cheat, let them cheat. They bought the game. Let them play it the way they want.


This is of course assuming we are talking about single player games where cheating doesn't affect anyone else.