Return array to use in glColor3fv

Started by
8 comments, last by torakka 17 years, 6 months ago
Hello all, I'm trying to write a function that takes 3 input parameters in the range 0-255 and return an array of 3 floats {R, G, B} to use in the glColor3fv function.

float* rgb_converter(int r, int g, int b); //prototype


float *color = rgb_converter(0, 255, 0); //to get pure green

glColor3fv(color);

float* rgb_converter(int r, int g, int b)
{
   float vector[3];
   
   vector[0] = ((float)r)/255; //color value in the 0-1 range
   vector[1] = ((float)g)/255;
   vector[2] = ((float)b)/255;

   return vector;
}
My function isn't working the right way because the compiler gives me a warning: "returning address of local variable or temporary" and instead of drawing a green polygon it's drawn in pure black. I appreciate you help, thank you.
Advertisement
The compiler gave you that warning because you are doing exactly that. The vector[3] went out of scope when the function ends, so the address became bad. I suggest you return by value instead.
Quote:Original post by mello
My function isn't working the right way because the compiler gives me a warning: "returning address of local variable or temporary"


This is coz vector in the case of your function is a local variable which is destroyed as soon as the procedure exits and it doesnt make sense in returning somehting which is not there.

Try something like:

float* rgb_converter(int r, int g, int b); //prototypefloat *color = rgb_converter(0, 255, 0); //to get pure greenglColor3fv(color);float* rgb_converter(int r, int g, int b){   float* vector = (float*) malloc (sizeof (float) * 3 ) ;   vector[0] = ((float)r)/255; //color value in the 0-1 range   vector[1] = ((float)g)/255;   vector[2] = ((float)b)/255;   return vector;}


Since you are explicitly asking for memory, its destruction is your responsibility and hence is not destroyed when your procedure exits.






Is there a reason you're doing this instead of just

glColor3f(r,g,b)
You could do:

struct color3f
{
float r,g,b;
};

color3f bla(...) { ...

Or:

void bla(float* color, ... )
{
color[0] = ...;
...

OpenGL also has functions to use small than float components for the API calls. This saves space, why store 24 bit pieces of data into 96 bits? That's waste of bandwidth & memory.

glVertex*() calls are not very optimal either. use glDrawArrays() or glDrawElements(), if you use these with VBO you should get nice speed. Unless you use display list. ;)
glColor3ub(r, g, b);

Takes values 0-255 as opposed to glColor3f, which takes values 0.0f-1.0f.

And that malloc code is evil, especially since you're casting the return value of malloc, which means you're using C++, not C.

Σnigma
Thank you guys, for your help.

Quote:Original post by chowe6685
Is there a reason you're doing this instead of just

glColor3f(r,g,b)


That's because RGB colors are usually specified in the 0-255 range.

Using some links or Photoshop you can easily pick the color that you want. I never saw a color list using the 0.0f-0.1f range.

Quote:Original post by Enigma
glColor3ub(r, g, b);

Takes values 0-255 as opposed to glColor3f, which takes values 0.0f-1.0f.

And that malloc code is evil, especially since you're casting the return value of malloc, which means you're using C++, not C.

Σnigma


Oh, really nice trick! I didn't knew that function. Thanks! :)

I'll use the glColor3ubv() that accepts a vector of GLubytes... is there any performance penalty in using this function instead of the traditional glColor3fv()?

I learned a lot with all you answers... thanks!
You can't really "return an array" in C++ (or C). You can return a pointer to some stuff, but returning a pointer (or reference for that matter) to local variables (i.e. stuff that's on the stack) is Bad News(TM), because it doesn't exist (as far as you are concerned) after the function returns, which is exactly when you need it.

You *can* return a struct which has an array as a data member, and that will be treated as a value. You can even give it a member function that overloads operator[], and then it will behave just like an array except that it will actually behave like an object.

But don't re-invent the wheel in cases where you need that; use boost::array instead. And anyway, you don't need it here.

All the GL functions tend to have several versions along these lines, exactly because you might not want to have to mess around doing these kinds of conversions. They will, in effect, do the conversion *for you*, as needed, in the best way they know how. You should do it this way (i.e. call the function that accepts the data you *already have*, where possible; otherwise do the simplest conversion you can) because

(a) Less code for you.
(b) The library writers designed it this way, and it's their job to know what they're doing; otherwise, *you* would be the one writing the code that translates coordinates and colours and camera positions into bytes in video memory, and I'm sure you don't want all *that*.
(c) Who knows which format is fastest? It might actually be that the ubv() version is "native", and 3uf() or ufv() does conversion internally in order to call the ubv() version. Or it might be yet some other way around. In which case, there's a high probability that you're doing work that will only be cancelled out (possibly introducing needless inaccuracies, as well).
I see that the OP's problem is solved, but I just want to mention this for any newbies browsing this thread:

Never, never, never allocate an object and assume it's going to be destroyed properly. Always make sure you take care of it as soon as you don't need it.

Code like this is bad:

int* makeArray(int n,int a,int b){  int* p = malloc(n*sizeof(int));  for (int i=0;i<b-a;i++)    p = a+i;  return p;};int main(){  int* p = makeArray(12,3,9);  free(p);  return 0;};


It's not obvious to somebody else that the variable needs freed. Keep in mind that 6 months later, it won't be obvious to you either. If you must do something along these lines, it's best to allocate the memory before the function, so that it is created and destroyed in the same scope.

int makeArray(int* p,int a,int b){  if (p == null) return 0;  for (int i=0;i<b-a;i++)    p = a+i;  return 1;};int main(){  int n = 12;  int* p = malloc(n*sizeof(int));  makeArray(p,3,9);  free(p);  return 0;};
We''re sorry, but you don''t have the clearance to read this post. Please exit your browser at this time. (Code 23)
If the data doesn't change over time, it's best to store it in display list or vertex buffer object (VBO). This way the driver can store the data in graphics card's local memory.

Desktop graphics cards are connected to the host system with PCI, AGP, PCIe and other low-bandwidth buses, the local bandwidth can be 20-30 GB/s which is a big benefit. Also, when you use glDrawArrays() or glDrawElements() you lose the overhead of glBegin, glVertex* and other calls. These calls must *build* the dataset dynamically. The graphics processor (GPU) works asynchronously with the CPU so it means the GPU has to be told what to do. This is implemented as a stream of commands to the GPU. More commands, slower the rendering.

The immediate mode (glBegin/glEnd paradirm) usually dispatches the drawing command after glEnd(), when all the required data is collected and processed into format that can be handled by the GPU. This is significant CPU overhead. Then the data has to be transfered over the bus to the GPU's local memory. This introduces latency, the command cannot start processing until it can guarantee that all data is ready to be used. Ofcourse a good driver architechture doesn't poll but rather works async internally aswell. So there must exist a buffer for commands inside the GPU. Now, if the buffer is filled with commands that use data which is already sitting there in the GPU's local memory, the commands can execute without delay.

This is important, because OpenGL specification has a strict drawing order, it cannot process a draw command before other even if all the data for later command is already present: the GPU has to stall until it can process the pending orders before the one that is ready to go as it were.

Look at what parameters are in glDrawArrays() and glDrawElements(). The glDrawArrays() is minimal amount of data to transfer. glDrawElements() requries transfer of index array, but that can be eliminated if the array is placed on a VBO, then the overhead for this call aswell is minimized.

You really don't want to use glBegin/End paradigm. It's inefficient. Also, if you steer away from it you get additional benefit: OpenGL ES doesn't have immediate mode so your applications will work on mobile graphics hardware with less modifications to the sourcecode.

About datatypes. The OpenGL GLSL 1.10 shading language pretty much encourages the use of "big" datatypes. Think of "int" and "float" (and to a degree, "half", aka. OpenEXR float, or float16). Internally, you could think, that vec4 means 4 x 32-bit float .. ivec4 means 4 x 32-bit int and so on. This means that internally, these are the formats the data is converted after reading from arrays or immediate mode "packets" (I encapsulate the informaiton regarding implementation detail with this term in quotes ;)

The biggest thing about datatypes is savings in bandwidth and storage. The rule-of-thumb is that your data is going to be converted to either bool, int or float for the vertex and fragment processing (accuracy withstanding, can be modified with mediump, etc. keywords as far as the hw implementation will leverage)

If you have 32-bit colors in, say, ABGR8888 format it makes sense to store it as such in the arrays. There is absolutely no benefit in storing the color in 4xfloat even if that is what it will be used as in the vertex program. The reason is that reading from arrays in different formats is implemented in silicon. If you don't use the type conversion then the transistors implementing this functionality sit idle there in the GPU. It doesn't make any difference from this point of view if you do conversion or not.

Where it DOES make a difference is caching schemes inside the GPU. There is large amount of "slow" DDR memory. Be it DDR2, DDR3, DDR4 of any flavour you can imagine or whatever they invent next. But there is always faster on-chip memory which is not very large (if it were the chips would be even larger and consume even more power and would require even more powerful cooling.. the cards are noisy enough as they are ;)

If your data is 32 bits wide, keep it in that format. Don't explode it to 128 bits as no additional information is stored. If the implementation is smart, as you can think it is for companies which been at this for a decade now (or more), it will keep the data as small as possible as late as possible. The byte-to-float, short-to-float etc. conversions are practically free. The cost been paid already. The transistors are there already. It's not a free lunch but it's pre-paid lunch. Take advantage of it. Make your application more efficient.

You may not notice it on the latest super-duper card if your load isn't too high, but bring the same work to lower-end graphics chip and you might begin to suffer.

Thinking in hardware terms like GPU is different from thinking in software developer's terms. A software developer, if not thinking things through, will automatically assume that "there is type conversion, it must be instruction of some sort" and that leads to thinking that it's like a CPU program that this instruction is called in sequence of other instructions. It doesn't work that way. It's a lot more complicated than the example I am going to give, but bear with me it is only for illustration purposes.

Let'a assume we have a hypothetical OpenGL implementation. We have glDrawArrays() call which we are implementing. We have arrays which have stride and base address and datatype (GL_FLOAT, number of elements, that sort of thing).

We store this stuff into VBO. We can choose how to implement this. We have all the power in the world. We are GODS of our own creation. We want efficient hardware. We want good memory bandwidth.

So. We store the array in as small footprint of memory as possible. We want to *pack* different arrays into SOA format so that when we want one "packet" of data (namely a vertex), we want to do that with as few reads as possible. If we scatter the data around the GPU's local memory that means we need more memore read requests. Bad. Even more bad is that we cannot keep so many locations in cache at the same time as cache uses fixed size blocks (there are reasons for this aswell but I am going into 100 tangents already as it were ;)

Since we packing the data, it makes sense to keep it as small as possible aswell. So we don't blow up byte to float and so on. We want circuitry in the chip which can "decompress" these structs which are dynamically configurable. So, we need unit in the chip which can do this decompression. It returns, say, "4 x float" data presentation to the processing units implementing for example vertex shader. After and even before this point it depends heavily on the architechture what we really want to do. The GLSL (and Cg/HLSL) design drives the hardware design to specific directions but it doesn't mean all designs will end up identical.

There are many ways to skin a cat. But some things are same for all: memory bandwidth and footprint issues. For everyone the fast on-chip memory is expensive, off-chip stock memory is cheaper. These facts are same for all, they drive the design. These facts drive the way what is sound software engineering practise to drive the hardware. Don't promote datatype prematurely. But don't be shy using float arrays either when you need the precision and range.

This topic is closed to new replies.

Advertisement