# OpenGL Does it save on performance/memory a whole lot to use vec3 instead of vec4?


By default, in OpenGL's fixed-function pipeline, things like position are actually 4-element vectors. I have also read that when doing SIMD on vectors, processors have room for 4-element arrays.

I'm guessing sending things to OpenGL as 3-element vectors saves on bandwidth or something, but when writing actual shaders, does it matter at all whether I use vec3, or can I just use vec4 all over the place with no penalty? What about vec2?

##### Share on other sites
It depends on which part of the pipeline you are talking about, and on the context in which you use the float3 or float4.

SIMD typically allows 4 floating-point operations to be performed simultaneously. In this regard, float4 is better than float3*
* but if you pack 4 float3s together (12 floats), you need 3 SIMD operations instead of the 4 you would need if each float3 were treated as its own float4 (3 floats, padding, 3 floats, padding, and so on)
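To make the footnote concrete, here is a rough C++ sketch. It is not real SIMD code: it models a 4-wide unit as a group of four consecutive floats and counts how many groups a component-wise operation touches. The helper names (`simd_ops_padded`, `simd_ops_packed`, `scale_packed`, `footnote_example`) are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; a "SIMD op" here means one
// operation applied to a group of 4 consecutive floats.

// Each float3 padded out to its own float4: one op per vector.
constexpr std::size_t simd_ops_padded(std::size_t float3_count) {
    return float3_count;
}

// float3s stored back to back (xyzxyzxyz...): ceil(3 * count / 4) ops.
constexpr std::size_t simd_ops_packed(std::size_t float3_count) {
    return (float3_count * 3 + 3) / 4;
}

// Scales tightly packed floats in groups of four and returns the number of
// groups processed. The inner loop stands in for a single SIMD multiply.
inline std::size_t scale_packed(std::vector<float>& data, float s) {
    std::size_t ops = 0;
    for (std::size_t i = 0; i < data.size(); i += 4) {
        const std::size_t end = std::min(i + 4, data.size());
        for (std::size_t lane = i; lane < end; ++lane) {
            data[lane] *= s;
        }
        ++ops;
    }
    return ops;
}

// Usage: four float3s (12 floats) take 3 groups instead of 4.
inline void footnote_example() {
    std::vector<float> four_float3s(12, 1.0f);
    const std::size_t ops = scale_packed(four_float3s, 2.0f);
    assert(ops == simd_ops_packed(4));
    assert(ops < simd_ops_padded(4));
}
```

Note that this only pays off for operations that treat every component the same way, such as a uniform scale or an add. Anything that needs to know where one vector ends and the next begins (a dot product, for example) does not map onto the packed layout this directly.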

If you are talking about the GPU, then yes, the same principles still apply, but the compiler will generally fix this for you. Set up some simple examples, then look at the assembly output of your shader compiler. It's even smart enough to realize that normalize(normalize(N)) is pointless.

However, IIRC:
* vertex buffers are best kept minimal to reduce bandwidth these days, as opposed to rigidly sticking to register-aligned strides
* vertex shader outputs (interpolator stage) are best packed into float4s, as each output *is* its own register, and the compiler will not do this for you automatically because it has to be done consistently at each stage boundary of the pipeline (vertex -> geometry, vertex -> pixel, etc.)
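For the vertex buffer point, a small C++ sketch of the size difference may help. The two layouts below (`VertexTight`, `VertexPadded`) and the helper `buffer_bytes` are hypothetical, chosen only to show how padding every attribute to four floats inflates the stride; the byte counts assume a 4-byte `float`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vertex layouts for illustration.
struct VertexTight {    // position (3 floats) + texcoord (2 floats)
    float position[3];
    float uv[2];
};

struct VertexPadded {   // same data, each attribute padded to 4 floats
    float position[4];
    float uv[4];
};

static_assert(sizeof(VertexTight) == 5 * sizeof(float), "no padding expected");
static_assert(sizeof(VertexPadded) == 8 * sizeof(float), "no padding expected");

// Bytes occupied by `vertex_count` vertices stored at the given stride.
constexpr std::size_t buffer_bytes(std::size_t vertex_count, std::size_t stride) {
    return vertex_count * stride;
}

// Usage: the tight layout needs 5/8 of the memory of the padded one.
inline void stride_example() {
    const std::size_t tight = buffer_bytes(1000, sizeof(VertexTight));
    const std::size_t padded = buffer_bytes(1000, sizeof(VertexPadded));
    assert(tight * 8 == padded * 5);
}
```

The stride you would hand to the API is simply `sizeof` of whichever struct you pick; whether the smaller buffer is measurably faster depends on the hardware and on how bandwidth-bound the draw is.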

##### Share on other sites
When sending vertex attribs, I would expect no difference. Hardware has been unpacking `vec3` for quite a while; you can bet it's full speed.
When we leave the GPU pipeline, most libraries dealing with `vec3` allocate a `vec4` instead to take advantage of aligned reads. While it is possible to use the "pack" trick pointed out above, I doubt it's feasible in general.

##### Share on other sites
Every GPU register (with some exceptions that are not relevant here, such as loop counter registers) is already a 4-element vector, so you're not getting any memory saving here. It is interesting to note that the old gl_TexCoord[n], etc., slots were also specified as 4-element vectors.

As a general rule, the GPU works very differently to your CPU, so what's relevant for one won't always be relevant - or even intuitive - for the other.

Where this does become useful is if you're hitting the maximum number of register slots, as you could pack e.g. 2 vec2s into a single vec4 and get an extra slot that way. A good shader compiler should make this optimization for you, but I don't know if it's specified and I certainly wouldn't rely on it either way.
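The packing idea can be sketched on the CPU side in C++ as below. The types and helpers (`Vec2`, `Vec4`, `pack`, `unpack_xy`, `unpack_zw`) are hypothetical stand-ins; in GLSL the same thing would be written with the `vec4(a, b)` constructor on the writing side and the `.xy` / `.zw` swizzles on the reading side.

```cpp
#include <cassert>

// Hypothetical stand-ins for the shader types.
struct Vec2 { float x, y; };
struct Vec4 { float x, y, z, w; };

// Two 2-component values share one 4-component slot: a -> xy, b -> zw.
inline Vec4 pack(Vec2 a, Vec2 b) { return Vec4{a.x, a.y, b.x, b.y}; }

// The consuming stage has to agree on the same layout to get them back.
inline Vec2 unpack_xy(Vec4 v) { return Vec2{v.x, v.y}; }
inline Vec2 unpack_zw(Vec4 v) { return Vec2{v.z, v.w}; }

// Usage: two texture coordinate sets travel in a single slot.
inline void packing_example() {
    const Vec4 slot = pack(Vec2{0.25f, 0.5f}, Vec2{1.0f, 2.0f});
    assert(unpack_xy(slot).x == 0.25f);
    assert(unpack_zw(slot).y == 2.0f);
}
```

Both ends of the interface have to use the same convention, which is why this is usually done by hand rather than left to the compiler.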

##### Share on other sites
-GPUs:
Modern GPUs (NVidia since G80/GTX8800 and ATI since GCN/7970) no longer operate on vectors but on single elements; that way the compiler can optimize more easily and a much higher utilization of the GPU is possible. In this case, using vec3 can be a benefit, as the compiler might not always know that some vector elements are 0 or 1 and could be dropped entirely (e.g. when those 'default'/'neutral' values come from constant registers). Registers are also organized as individual elements, so saving space yourself (if the compiler cannot) can further raise utilization, and therefore speed.
-CPUs:
If you do simple math, it will usually all be done in a scalar way. Some compilers can vectorize the code, but whether you're using vec3 or vec4 can sometimes give you a boost or a hit, depending on what operations you are doing and whether the compiler can detect what you intend to do. If it's a common pattern, it will vectorize it, no matter whether it's vec3 or vec4.
If you're writing hand-optimized vector code, e.g. using SSE, it's usually critical to have vec4 data; vec3 leads to a lot of operations to reorganize the vector layout to vec4 before you do the math. Basically, the overhead is more than the benefit in most simple cases.
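As a rough illustration of that reorganization cost, the C++ sketch below copies tightly packed `Vec3` data into 16-byte-aligned `Vec4` values before any math is done. The types and the helper `expand_to_vec4` are hypothetical, no actual SSE intrinsics are used, and the sizes assume a 4-byte `float`; with C++17 the vector's storage also honors the 16-byte alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration.
struct Vec3 { float x, y, z; };                  // 12 bytes, tightly packed
struct alignas(16) Vec4 { float x, y, z, w; };   // 16 bytes, register sized

static_assert(sizeof(Vec3) == 3 * sizeof(float), "Vec3 has no padding");
static_assert(sizeof(Vec4) == 4 * sizeof(float), "Vec4 has no padding");
static_assert(alignof(Vec4) == 16, "Vec4 is 16-byte aligned");

// The "reorganize" step: copy every Vec3 into a padded Vec4, filling w.
// This extra pass over the data is the overhead described above.
inline std::vector<Vec4> expand_to_vec4(const std::vector<Vec3>& in, float w) {
    std::vector<Vec4> out;
    out.reserve(in.size());
    for (const Vec3& v : in) {
        out.push_back(Vec4{v.x, v.y, v.z, w});
    }
    return out;
}

// Usage: positions typically get w = 1, directions w = 0.
inline void expand_example() {
    const std::vector<Vec3> positions = {{1.0f, 2.0f, 3.0f}};
    const std::vector<Vec4> padded = expand_to_vec4(positions, 1.0f);
    assert(padded.size() == 1);
    assert(padded[0].z == 3.0f && padded[0].w == 1.0f);
}
```

Storing the data as `Vec4` from the start avoids this pass entirely, at the price of one third more memory.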
-memory:
-Disk space:
That's fully worth it. Usually any kind of saving on disk will save you time; even if you do some funky decompression or conversion (e.g. using float16 instead of float32), you will still come out ahead. I would never waste space, even on SSDs.
