You are right: position and scale fit perfectly into a single float array, and this is exactly what I'm using, with a single call to glUniform1fv, which allows me to pass the array to the shader. I don't think this is much different from a vec4, is it?
I'm not an expert on this and it might even be implementation-dependent, but I believe there will be differences. A single uniform location can store four float values, so I fear an array of single floats will waste three fourths of that space.
Even assuming the graphics card can pack the array tightly and doesn't waste extra time on every glUniform* call, I would worry whether the compiled shader code would be optimal (I'm not sure how well the swizzle operators on a vec4 would translate to array indexing).
1) My unit cube has its coordinates set once, for a single cube. The reason I had avoided using an array of attributes is that since I have 36 vertices in my unit cube I was under the impression that if I were to use an array of attributes I would need to copy the data for each cube 36 times to make sure each vertex has its own data. I felt this would defeat my purpose of reducing the data being sent down to the card.
First, reducing the data being sent is not always the best solution. Transferring data from the system to the graphics card is expensive, yes, but not as expensive as it used to be, and significantly increasing the number of GL calls instead will probably turn it into a fool's errand. GL calls are expensive, and some more so than others.
That said, I'm not even sure what you mean by not using an array. If you need individual normals for each cube face, then the data has to be duplicated. It doesn't matter if you use the deprecated API (glVertex*, glNormal*) or glDrawArrays (with or without a VBO), the same data has to be sent to the card behind the scenes.
That said, either with uniforms or glVertexAttribDivisor you just need one unit cube for the whole lifetime of the program. Upload the data once to a VBO. The only question is how to transfer and store the (position, scale) information. That depends heavily on how much change is expected from frame to frame, how many cubes there actually are, and whether some are more static than others.
2) Now suppose that rather than drawing one cube at a time I create, say, 64 copies of the cube with glVertexPointer. This way I can buffer the data and reduce the number of calls associated with the attribute - especially since, if I understand correctly, I only need to set attribute values 64 times, and NOT 64*36 times, if I use glVertexAttribDivisor. Then the number of calls to pass down the attribute values would be reduced by a factor of 64 (or whatever multiple I choose).
As said above, the only sensible thing is putting the cube data into a VBO; it is completely immutable for the whole run time. If you then need to render N cubes at different positions and sizes, you add a vertex attribute containing a vec4 (for example position in xyz and scale in w) and call glVertexAttribDivisor(theAttributeIndex, 1). The array backing it must then contain N vec4s, and you call glDrawArraysInstanced with an instance count of N.
3) From what I understand, when working with the vertex shader after a call to glDrawArrays I don't really have any control over the order in which the vertices come through. But it appears that glDrawArraysInstanced gives me some control over this. Is this correct?
The graphics card executes dozens or even hundreds of vertex/fragment shaders in parallel, so speaking of order makes no sense here. Inside an individual shader invocation you can access some information, like the index of the current vertex (gl_VertexID) or the index of the current instance (gl_InstanceID) (see this page for more details), but the nature of the massive parallelism limits both the information available and what you can do with it.
Edited by BitMaster, 14 January 2014 - 02:11 AM.