# Decomposing and recomposing color (pixel)


18 replies to this topic

### #1 fir  Members

Posted 27 June 2014 - 09:50 AM

In my kind of app I quite often need to do this: the color of a pixel is usually an unsigned value in ARGB format, so to do something with it (dimming, color mixing, adding, etc.) I need to decompose it, something like this:

int red = (color >> 16) & 0xff;

int green = (color >> 8) & 0xff;

int blue = (color ) & 0xff;

then do something with the channels, then recompose them like this:

if(red<0) red = 0;

if(green<0) green = 0;

if(blue<0) blue = 0;

if(red>255) red = 255;
if(green>255) green = 255;
if(blue>255) blue = 255;

color = (red <<16) + (green<<8) + blue;

This strikes me as both ugly and probably inefficient. Is there maybe some way to make this better (this decomposition and recomposition)?

Also, when doing this, what type should I use for the intermediate values (red, green and blue above)? Should it be int, or maybe unsigned char?
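The unpack, modify, clamp, repack sequence from the post can be wrapped in small helpers. The following is only a sketch: `clamp_channel` and `scale_argb` are hypothetical names, the pixel is assumed to be a 32-bit `0xAARRGGBB` value, and `den` is assumed to be positive.

```c
#include <stdint.h>

/* Clamp an intermediate channel value to the 0..255 range. */
static int clamp_channel(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return v;
}

/* Hypothetical helper: scale the R, G and B channels of a packed ARGB
   pixel by num/den (den assumed > 0). The alpha byte is kept as is. */
static uint32_t scale_argb(uint32_t color, int num, int den)
{
    /* Decompose: int intermediates leave room for values outside 0..255. */
    int red   = (int)((color >> 16) & 0xff);
    int green = (int)((color >> 8)  & 0xff);
    int blue  = (int)( color        & 0xff);

    red   = clamp_channel(red   * num / den);
    green = clamp_channel(green * num / den);
    blue  = clamp_channel(blue  * num / den);

    /* Recompose: each channel is now guaranteed to fit in 8 bits. */
    return (color & 0xff000000u)
         | ((uint32_t)red   << 16)
         | ((uint32_t)green << 8)
         |  (uint32_t)blue;
}
```

With these assumptions, `scale_argb(pixel, 1, 2)` dims a pixel to half brightness, while a factor above 1 relies on the clamp to saturate at 255.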

### #2 Vortez  Members

Posted 27 June 2014 - 10:17 AM

int red = (color >> 16) & 0xff;
int green = (color >> 8) & 0xff;
int blue = (color ) & 0xff;

Why store them in an int? Use a BYTE instead (or even better, a struct of 3 bytes), which makes this

Ex: BYTE red = (BYTE)((color >> 16) & 0xff);

if(red<0) red = 0;
if(green<0) green = 0;
if(blue<0) blue = 0;

if(red>255) red = 255;
if(green>255) green = 255;
if(blue>255) blue = 255;

totally unnecessary.

When "recomposing", use a DWORD or UINT instead of an int; there is no need for a signed variable in this code.

Edited by Vortez, 27 June 2014 - 10:25 AM.

### #3 fir  Members

Posted 27 June 2014 - 11:36 AM

Why store them in a int?

Some operations on r, g, b overflow (for example, adding one RGB to another), and I often need saturation there, so I need to use int and then clip with ifs. Still, this unpacking and repacking seems like overhead to me, and I don't know what to do about it.

Also, for cases where unsigned char would suffice, I'm not sure whether declaring unsigned char r, g, b would paradoxically make things slower, since the compiler would have to constrain the arithmetic to unsigned char when it may be easier for it to operate on processor words. Hard to say.

In general, passing the color as one unsigned int is handier for me, but it seems that passing it as three separate values, and thus avoiding some of this unpacking/packing, could (theoretically) be a bit quicker, though I'm not sure it has any measurable effect on real efficiency. As I said, I'm not quite sure.
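The saturating add mentioned above (adding one RGB to another without wrap-around) might look like this in plain C. This is a sketch under assumptions: `add_sat8` and `add_argb_saturate` are hypothetical names, pixels are 32-bit `0xAARRGGBB` values, and the alpha of the first pixel is simply carried over.

```c
#include <stdint.h>

/* Add two 8-bit channel values, saturating at 255 instead of wrapping. */
static uint32_t add_sat8(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;            /* at most 510, cannot overflow */
    return sum > 255u ? 255u : sum;
}

/* Hypothetical helper: saturating add of the R, G and B channels of two
   packed ARGB pixels. The alpha byte of x is kept unchanged. */
static uint32_t add_argb_saturate(uint32_t x, uint32_t y)
{
    uint32_t r = add_sat8((x >> 16) & 0xff, (y >> 16) & 0xff);
    uint32_t g = add_sat8((x >> 8)  & 0xff, (y >> 8)  & 0xff);
    uint32_t b = add_sat8( x        & 0xff,  y        & 0xff);
    return (x & 0xff000000u) | (r << 16) | (g << 8) | b;
}
```

Because the sum of two bytes never goes below zero, using unsigned intermediates removes the three `< 0` checks from the original code; only the upper clamp remains.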

### #4 Vortez  Members

Posted 27 June 2014 - 03:44 PM


As others have pointed out to you many times, just stop worrying about micro-optimizations like that. Today it's almost impossible to beat the compiler's optimizations in release builds. I tried it (something similar to this, color packing/unpacking) with MMX and SSE: I could beat a debug build, but not a release build (it was a tie), because the compiler (Visual Studio) uses SSE optimizations in release builds when it can. So I basically did that test for nothing, except for learning that such asm optimizations are worthless now in most cases.

With that said, it's always useful to do some profiling to find the real bottlenecks and optimize the algorithms where it counts.

Edited by Vortez, 27 June 2014 - 03:56 PM.

### #5 HappyCoder  Members

Posted 27 June 2014 - 10:25 PM

I agree with Vortez. Optimize it when it becomes a problem. Unless you are doing some serious image processing it shouldn't be a big problem.

I would also recommend packing the color in a struct:

struct Color
{
    unsigned char r, g, b, a;

    // constructors, operators, etc.
};

Behind the scenes, the compiler will be doing the bit mask and bit shifts for you but with much cleaner code.

EDIT: I assumed you are using c++. Is that correct?

Edited by HappyCoder, 27 June 2014 - 10:28 PM.


### #6 Samith  Members

Posted 27 June 2014 - 11:51 PM

EDIT: It's late...

Like everyone else in this thread, I think these kinds of micro-optimizations are usually wasted effort. If I were you, I would try to determine if I was performing more decomposition/recompositions than I needed to and try to minimize that, first. I usually find way bigger performance gains by making my code do less stuff than I do by trying to make my code do more stuff quickly.

Edited by Samith, 27 June 2014 - 11:54 PM.

### #7 fir  Members

Posted 28 June 2014 - 02:34 AM

I agree with Vortez. Optimize it when it becomes a problem. Unless you are doing some serious image processing it shouldn't be a big problem.

I would also recommend packing the color in a struct:

struct Color
{
    unsigned char r, g, b, a;

    // constructors, operators, etc.
};

Behind the scenes, the compiler will be doing the bit mask and bit shifts for you but with much cleaner code.

EDIT: I assumed you are using c++. Is that correct?

I'm using C [but compiling in C++ mode].

The struct idea may be a good hint, thanks. I forgot about this option.

1) If my color mode is ARGB, meaning blue is in the lowest bits (0-7), shouldn't it be

struct Color { unsigned char b, g, r, a; };

I'm not sure whether such structs are laid out according to the endianness of the machine or are endian-independent.

Then I could probably use it by casting.

Though if I used it, I'm not sure how it would be passed and held in memory and in code (in one register or in four?). If I just pass it by value, foo(Color color), will it be passed just like a 32-bit unsigned int, or in some other way?
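On the layout question: struct members are stored at increasing addresses in declaration order, so which field lines up with the low bits of a 32-bit integer depends on the machine's byte order. The sketch below checks this directly. `color_from_u32` and `is_little_endian` are hypothetical names, and the code assumes the struct has no padding (which the static assertion verifies).

```c
#include <stdint.h>
#include <string.h>

/* Channel order chosen so that, on a little-endian machine, the struct
   occupies the same bytes as a 0xAARRGGBB value. */
struct Color { unsigned char b, g, r, a; };

/* Assumption made explicit: four chars pack into exactly 32 bits. */
_Static_assert(sizeof(struct Color) == sizeof(uint32_t),
               "struct Color is expected to be 4 bytes");

/* Copy the bytes of a packed pixel into the struct. memcpy avoids the
   aliasing problems of a pointer cast; compilers commonly reduce it to
   a single 32-bit move. */
static struct Color color_from_u32(uint32_t packed)
{
    struct Color c;
    memcpy(&c, &packed, sizeof c);
    return c;
}

/* Returns 1 if the first byte in memory is the low byte of an integer. */
static int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a little-endian machine such as x86, `color_from_u32(0x11223344)` yields `b == 0x44` and `a == 0x11`; on a big-endian machine the fields come out reversed. How a 4-byte struct is passed by value is decided by the platform ABI rather than by the C language, so inspecting the generated assembly is the reliable way to see whether it travels in one register.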

As for the "don't do that" advice: I've written about this before. It is not an answer but what I call "propaganda" (it trashes this forum with valueless propaganda, repeated without change, more than it is proper technical discussion). "Profile your code to find out if this is a bottleneck" is also propaganda. I've heard it here for the 20th time (literally, or close to it), so there is no need to repeat it a 60th or 70th time

- especially as I'm probably doing profiling 100x more than those propaganda givers.

(1) As it happens, this is in my bottleneck code, which shades/colors 100k triangles per frame (and even if it weren't, I just like to understand some code, so propaganda doesn't suit that attitude). (2) As for such optimizations, I often profile and optimize, and I find that, taken together, all these kinds of micro-optimizations do speed up my code, contrary to the propaganda people repeat here.

(In a recent case I started with a frame time of nearly 35 ms; after searching hard for every micro-optimization I could think of, it dropped to 16.5 ms.)

Edited by fir, 28 June 2014 - 03:25 AM.

### #8 Khatharr  Members

Posted 28 June 2014 - 02:49 AM

Not really an optimization, but I'd like to throw a little fuel on the fire...

union uColor {
struct {unsigned char blue, green, red, alpha;};
unsigned int uint;
unsigned char channels[4];
};



More importantly, take my advice and use a profiler on your code before you try to optimize it.
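A short sketch of how such a union could be used, giving both named access and a loop over the channels. Several things here are assumptions: `invert_rgb` is a hypothetical helper, the anonymous struct needs C11 (or a compiler extension), reading a union member other than the one last written is allowed in C but not guaranteed by the C++ standard, and the `blue, green, red, alpha` order only matches `0xAARRGGBB` on little-endian machines.

```c
#include <stdint.h>

union uColor {
    struct { unsigned char blue, green, red, alpha; }; /* anonymous: C11 */
    uint32_t packed;
    unsigned char channels[4];
};

/* Hypothetical helper: invert the three color channels by iterating
   over the array view. On a little-endian machine channels[0..2] are
   blue, green and red, and channels[3] (alpha) is left untouched. */
static uint32_t invert_rgb(uint32_t argb)
{
    union uColor c;
    c.packed = argb;
    for (int i = 0; i < 3; ++i)
        c.channels[i] = (unsigned char)(255 - c.channels[i]);
    return c.packed;
}
```

The same value can be read as `c.red` where a named field is clearer, or as `c.channels[i]` where the same operation applies to every channel.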

void hurrrrrrrr() {__asm sub [ebp+4],5;}

There are ten kinds of people in this world: those who understand binary and those who don't.

### #9 Ohforf sake  Members

Posted 28 June 2014 - 03:05 AM

As for the "don't do that" advice: I've written about this before. It is not an answer but what I call "propaganda" (it trashes this forum with valueless propaganda, repeated without change, more than it is proper technical discussion). "Profile your code to find out if this is a bottleneck" is also propaganda. I've heard it here for the 20th time (literally, or close to it), so there is no need to repeat it a 60th or 70th time

You know, I was about to give you a code snippet that shows how this can be done in SSE using pack/unpack instructions, but man, you really have a way of discouraging people from helping you.

### #10 fir  Members

Posted 28 June 2014 - 03:12 AM

As for the "don't do that" advice: I've written about this before. It is not an answer but what I call "propaganda" (it trashes this forum with valueless propaganda, repeated without change, more than it is proper technical discussion). "Profile your code to find out if this is a bottleneck" is also propaganda. I've heard it here for the 20th time (literally, or close to it), so there is no need to repeat it a 60th or 70th time

You know, I was about to give you a code snippet that shows how this can be done in SSE using pack/unpack instructions, but man, you really have a way of discouraging people from helping you.

Heh, if you are interested in such optimizations, I think you should better understand what I'm saying about this anti-optimization propaganda (which is worthless on the merits and so often repeated here). You seem not to, but in my opinion you should.

What is so hard to understand here? That propaganda is really worthless to someone who wants to do this anyway ;\

SSE intrinsics? Yeah, I forgot I had to learn them ;k

Edited by fir, 28 June 2014 - 03:26 AM.

### #11 fir  Members

Posted 28 June 2014 - 03:17 AM

Not really an optimization, but I'd like to throw a little fuel on the fire...

union uColor {
struct {unsigned char blue, green, red, alpha;};
unsigned int uint;
unsigned char channels[4];
};



More importantly, take my advice and use a profiler on your code before you try to optimize it.

That's good. I forgot about this; I haven't used unions for 12 years.

It might also be good to do this for some other structures, for example a triangle (of 3 vertices): sometimes it is good to access it by named fields, but sometimes it would be nice to iterate over it in a loop.

### #12 fir  Members

Posted 28 June 2014 - 03:39 AM

Ex: BYTE red = (BYTE)((color >> 16) & 0xff);

PS. It is also nice that C works this way:

int x=0x10203040;

unsigned char y = x;   //y gives 0x40 - handy thing

And with int x=0x102030f0; char y = x; y gives -16 (where char is signed), which is also fine.
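The narrowing behaviour described here can be sketched as follows. Converting to `unsigned char` is well defined: the value is reduced modulo 256 when a char is 8 bits, which is the same as masking with `0xff`. The signed case is shakier, because whether plain `char` is signed and what an out-of-range conversion yields are both implementation-defined in C, so the sketch computes the signed value explicitly. `low_byte` and `low_byte_signed` are hypothetical names, and 8-bit chars are assumed.

```c
#include <stdint.h>

/* Low 8 bits of x; for 8-bit chars this equals (x & 0xff). */
static unsigned char low_byte(uint32_t x)
{
    return (unsigned char)x;
}

/* Low 8 bits of x interpreted as a two's-complement signed value,
   computed without relying on implementation-defined conversions. */
static int low_byte_signed(uint32_t x)
{
    int v = (int)(x & 0xff);
    return v >= 128 ? v - 256 : v;
}
```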

### #13 fastcall22  Moderators

Posted 28 June 2014 - 07:17 AM

color = (red <<16) + (green<<8) + blue;

this strikes me both as an ugly and probably inefficient..

Yeah, me too...
You should really be using bitwise-or:
color = (red<<16) | (green<<8) | blue;

There we go -- much better.

Edited by fastcall22, 28 June 2014 - 07:18 AM.


### #14 fir  Members

Posted 28 June 2014 - 07:32 AM

color = (red <<16) + (green<<8) + blue;

this strikes me both as an ugly and probably inefficient..

Yeah, me too...
You should really be using bitwise-or:
color = (red<<16) | (green<<8) | blue;

There we go -- much better.

why much? iznt add one cycle and well optymized?

Posted 28 June 2014 - 08:00 AM

What you want is saturating arithmetic. SSE provides opcodes for this kind of task.
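As an illustration of the saturating opcodes mentioned here, the SSE2 intrinsic `_mm_adds_epu8` (the `paddusb` instruction) adds sixteen unsigned bytes at once and clamps each to 255, which covers four 32-bit pixels per step. The sketch below is an assumption-laden example rather than a tuned routine: `add_pixels_saturate` is a hypothetical name, all four bytes of each pixel (alpha included) are saturated, and a scalar loop handles leftover pixels or builds without SSE2.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>               /* SSE2 intrinsics */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: dst[i] = saturating per-byte sum of dst[i] and
   src[i] for n packed 32-bit pixels. */
static void add_pixels_saturate(uint32_t *dst, const uint32_t *src, size_t n)
{
    size_t i = 0;
#if HAVE_SSE2
    /* Four pixels (16 bytes) per iteration; unaligned loads and stores
       so the buffers need no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(a, b));
    }
#endif
    /* Scalar tail (or the whole array when SSE2 is unavailable). */
    for (; i < n; ++i) {
        uint32_t out = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            uint32_t sum = ((dst[i] >> shift) & 0xff)
                         + ((src[i] >> shift) & 0xff);
            out |= (sum > 255u ? 255u : sum) << shift;
        }
        dst[i] = out;
    }
}
```

No unpacking into separate channels is needed for this particular operation, since the instruction already treats the register as independent bytes.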

### #16 fastcall22  Moderators

Posted 28 June 2014 - 08:00 AM

why much? iznt add one cycle and well optymized?

Depends.

### #17 Bacterius  Members

Posted 28 June 2014 - 08:15 AM


color = (red <<16) + (green<<8) + blue;

this strikes me both as an ugly and probably inefficient..

Yeah, me too...
You should really be using bitwise-or:
color = (red<<16) | (green<<8) | blue;

There we go -- much better.

why much? iznt add one cycle and well optymized?

No, there is no time difference: addition and bitwise OR take the same time on most hardware (see the Intel docs on throughput/latency of both instructions; they are identical and quite fast indeed). On (very) old hardware bitwise OR could even be slightly faster since you don't need to carry bits, but good luck measuring that. There is also no difference in the result at runtime as long as red, green and blue are no larger than a byte. But it's slightly more readable, because when packing bytes into a single word you are not really doing any addition in the usual sense, you're just... packing bits. So in this sense bitwise OR is better than addition, not that it matters much (both will give wrong answers if red, green and blue are wider than 8 bits anyway).
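The point about results can be checked with a small sketch. `pack_or`, `pack_add` and `pack_masked` are hypothetical names; the first two mirror the two expressions discussed in the thread, and the third masks each channel first so an out-of-range input cannot spill into a neighbouring channel.

```c
#include <stdint.h>

/* Packing with bitwise OR, as suggested in the thread. */
static uint32_t pack_or(uint32_t r, uint32_t g, uint32_t b)
{
    return (r << 16) | (g << 8) | b;
}

/* Packing with addition, as in the original post. */
static uint32_t pack_add(uint32_t r, uint32_t g, uint32_t b)
{
    return (r << 16) + (g << 8) + b;
}

/* Defensive variant: mask each channel to 8 bits before combining. */
static uint32_t pack_masked(uint32_t r, uint32_t g, uint32_t b)
{
    return ((r & 0xff) << 16) | ((g & 0xff) << 8) | (b & 0xff);
}
```

For in-range channels the shifted values share no bits, so no carries occur and `pack_or` and `pack_add` agree. With `b = 0x1ff` and `g = 1` they disagree with each other (`0x1ff` versus `0x2ff`), and neither is a meaningful color, which is why clamping or masking before packing matters more than the choice of operator.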

How can you claim with a straight face that you've properly profiled your code "100x more" and identified likely bottlenecks when you are still questioning in this very thread whether bitwise OR is less "optymized" than addition? You keep getting tons of very useful advice that you really should follow, but you keep brushing it off as "propaganda" as if you were too good for it. It's getting very repetitive. If you think you know better, why are you asking for advice? If you are not looking for help, why are you making threads?

My final advice to you is: get off your high horse and face the possibility that you actually might not know everything (or anything) about optimization. Then try and modify your code and see what changes in the resulting assembly to learn what your compiler does and does not do. Read up a bit on how CPU hardware works, and get familiar with at least the basics of your own architecture (probably x86 Pentium 3 or Core 2). Find existing C/C++ code on github or whatever. There have to be dozens of software rasterizers online - you could study a few and see how they implemented various parts of their pipeline. Learn from other people's code, compare it to yours. It is hard work, yes. But asking vague questions on a forum unfortunately only gets you so far - to learn to write fast code, you must work at it. There's no secret. If you don't want to take this advice, your loss. I will have only wasted 15 minutes writing it.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

### #18 fir  Members

Posted 28 June 2014 - 08:57 AM

How can you claim with a straight face that you've properly profiled your code "100x more" and identified likely bottlenecks when you are still questioning in this very thread whether bitwise OR is less "optymized" than addition? You keep getting tons of very useful advice that you really should follow, but you keep brushing it off as "propaganda" as if you were too good for it. It's getting very repetitive. If you think you know better, why are you asking for advice? If you are not looking for help, why are you making threads?

Well, I'm not questioning this. fastcall suggested that this is better, so I'm asking whether it really is (I have some, say, medium/moderate knowledge of assembly, and I suspected there is no big difference).

The trouble is that at the C level you just can't fully express your intention, whether you use chars or ints: the compiler is forced to generate code that conforms to many other rules of how such types work, not to your intentions, where you need only some of those rules.

At the assembly level there might not be a big difference in optimization, but I'm interested in this kind of thing just for the science of it. So the constant "propaganda" against it (that I should not be interested in what I'm interested in) is not very appropriate and is a waste of words here.

### #19 fir  Members

Posted 28 June 2014 - 09:06 AM

My final advice to you is: get off your high horse and face the possibility that you actually might not know everything (or anything) about optimization. Then try and modify your code and see what changes in the resulting assembly to learn what your compiler does and does not do. Read up a bit on how CPU hardware works, and get familiar with at least the basics of your own architecture (probably x86 Pentium 3 or Core 2). Find existing C/C++ code on github or whatever. There have to be dozens of software rasterizers online - you could study a few and see how they implemented various parts of their pipeline. Learn from other people's code, compare it to yours. It is hard work, yes. But asking vague questions on a forum unfortunately only gets you so far - to learn to write fast code, you must work at it. There's no secret. If you don't want to take this advice, your loss. I will have only wasted 15 minutes writing it.

I'm doing that (I mean studying rasterization, SSE assembly and so on, but it goes slowly). A forum is for talking, so I am both asking here (and on other sites) and studying separately (a forum can be quicker and better). That is what this kind of forum is for.

I know that assembly is not such a popular topic these days, so it may be a bit hard to discuss here. [If I found a better place for this kind of question, I would move there.]

PS. My software "engine" after the previous optimizations:

https://www.dropbox.com/s/b1ae8l2u7tybb2o/tie57.zip

Now at 1200x1000 I get 40-50-60 ms. It would be very nice to bring that down to 30-40-50, but I feel that to do this I would need to dabble a bit in these intrinsics optimizations,

so I welcome anyone who wants to talk about this.
