Write mask and swizzle syntax for C++

11 comments, last by snk_kid 17 years, 4 months ago
This is pretty much just a repost from the Boost Developer list, but I thought I'd post here to get other impressions:

I have uploaded a small prototype that demonstrates the syntax mentioned in the gmane thread. It is in the Boost Vault under Math - Geometry as swizzle_demo_01.zip.

Here's a small example showing its use:

float4 b(1, 2, 3, 4);

float4 a = b.xxxx(); // a is now 1, 1, 1, 1
a = b.wwwz(); // a is now 4, 4, 4, 3
a = b.yzzw(); // a is now 2, 3, 3, 4
a = b.wzyx(); // a is now 4, 3, 2, 1

float4 c(10, 20, 30, 40);
c.yz() = b.zy(); // c is now 10, 3, 2, 40
This mimics Cg, HLSL, and GLSL syntax closely. The swizzle syntax is implemented as function calls so that it doesn't increase the size of the vector struct (important for using those structs directly for graphics API calls). The prototype code does not implement arithmetic operators for simplicity's sake, but my local copy has them implemented. This isn't meant as a full library, only a demonstration showing that the swizzle syntax can be achieved rather simply using the Boost PP and TypeTraits libraries.

One item that could be changed is the multiple template parameters to the classes. In the discussion that took place during the review of Andy Little's library, the prospect of heterogeneous vectors came up. In my opinion heterogeneous vectors work with little expense to the library writer and transparently to the user. They would also "just work" with Andy's Quan library, or the other Units library being developed. This is pretty off-topic for the syntax that I am demonstrating; I just wanted to mention that I hadn't removed it from this demo. It is in no way necessary, but removal would require the macros to be changed a little bit.

I have only compiled using VS 7.1 and 8.0. I am by no means a PP or template guru, so I would welcome your thoughts and critiques.

--Michael Fawcett
[Edited by - mfawcett on June 17, 2008 4:12:26 PM]
--Michael Fawcett
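For readers without access to the Vault download, here is a rough sketch of how swizzle member functions returning a vector of references might look. It is not the prototype's code: the real version generates every swizzle with Boost PP and TypeTraits, while this one writes two swizzles by hand and assumes a C++11 compiler (for reference collapsing in vec4<T&>). The name swizzle_demo is made up for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the prototype's vector type. Only two
// swizzles are written out; the real code generates them with macros.
template <typename T>
struct vec4 {
    T x, y, z, w;

    vec4(T x_, T y_, T z_, T w_) : x(x_), y(y_), z(z_), w(w_) {}

    // Converting copy: builds a vec4<float> from a vec4<float&>.
    template <typename U>
    vec4(const vec4<U>& o) : x(o.x), y(o.y), z(o.z), w(o.w) {}

    // Element-wise assignment. For vec4<float&> this writes through the
    // references into the vector the swizzle was taken from.
    template <typename U>
    vec4& operator=(const vec4<U>& o) {
        x = o.x; y = o.y; z = o.z; w = o.w;
        return *this;
    }

    // Swizzles are member functions, so they add no data members.
    vec4<T&> xxxx() { return vec4<T&>(x, x, x, x); }
    vec4<T&> wzyx() { return vec4<T&>(w, z, y, x); }
};

typedef vec4<float> float4;

inline void swizzle_demo() {
    float4 b(1, 2, 3, 4);

    float4 a = b.xxxx();   // read swizzle: a copies 1, 1, 1, 1
    assert(a.x == 1 && a.y == 1 && a.z == 1 && a.w == 1);

    float4 c(0, 0, 0, 0);
    c.wzyx() = b;          // write mask: c becomes 4, 3, 2, 1
    assert(c.x == 4 && c.y == 3 && c.z == 2 && c.w == 1);
}
```

Note that this sketch assigns element by element, so a write mask whose source overlaps its destination (for example a.wzyx() = a) would need a temporary copy first.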
It's been a few days with no comments. Is this because the files don't compile or have people just not found it interesting? I was really hoping some people with a lot of preprocessor and/or template experience would comment on the implementation.

There is one issue that has been reported to me that I can't figure out a suitable solution for.
float4 red(1, 0, 0, 1);
glColor4fv(&red); // works
glColor4fv(&red.xxxw()); // doesn't work

The problem is that the type returned by a swizzle function is vec4<float &>, not vec4<float> which is what float4 is a typedef for.

Does anyone have any ideas on how to solve this?

My first solution was to dispatch the address-of operator to a template class that was specialized for reference types. The specialized dispatch implementation copied the values to a function local static vec4<T> and returned the address of that, but that 'solution' breaks down in situations like:
doStuff(&red.xxxw(), &red.xxxw());

since both pointers will be pointing at the same vec4. Without incurring large performance penalties, I'm at a loss as to how to solve this.
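The failure mode is easy to reproduce in isolation. The sketch below is only an illustration of the idea, not the dispatch code itself: address_of_swizzle is a hypothetical stand-in that copies the swizzled values into one function-local static and returns its address.

```cpp
#include <cassert>

struct float4 { float x, y, z, w; };

// Hypothetical stand-in for the specialized address-of dispatch: every
// call copies into the same function-local static and returns its address.
inline const float4* address_of_swizzle(float a, float b, float c, float d) {
    static float4 storage;
    storage = float4{a, b, c, d};
    return &storage;
}

inline bool static_buffer_aliases() {
    float4 red = {1.0f, 0.0f, 0.0f, 1.0f};
    const float4* p = address_of_swizzle(red.x, red.x, red.x, red.w); // xxxw
    const float4* q = address_of_swizzle(red.w, red.y, red.y, red.x); // wyyx
    // Both calls handed back the same object, so p now sees the values
    // written by the second call instead of 1, 1, 1, 1.
    return p == q && p->y == 0.0f;
}
```

Any call that takes two such addresses at once, like the doStuff example, ends up with two pointers to whichever swizzle was evaluated last.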

--Mike

[Edited by - mfawcett on December 4, 2006 2:43:04 PM]
--Michael Fawcett
Quote:Original post by mfawcett
There is one issue that has been reported to me that I can't figure out a suitable solution for.
float4 red(1, 0, 0, 1);
glColor4fv(&red); // works
glColor4fv(&red.xxxw()); // doesn't work

The problem is that the type returned by a swizzle function is vec4<float &>, not vec4<float> which is what float4 is a typedef for.

Does anyone have any ideas on how to solve this?

My first solution was to dispatch the address-of operator to a template class that was specialized for reference types. The specialized dispatch implementation copied the values to a function local static vec4<T> and returned the address of that, but that 'solution' breaks down in situations like:
doStuff(&red.xxxw(), &red.xxxw());

since both pointers will be pointing at the same vec4. Without incurring large performance penalties, I'm at a loss as to how to solve this.

--Mike
Note, I haven't got time to look over your code at the moment, but:

First of all, I think you have to ask yourself: What does someone want to take the address of it for? Do they intend to modify the original through the swizzled copy? If so, you can prevent that completely by returning a "const vec4<float>".

Secondly, typically taking the address of the return value of a function is a mistake (taking the address of a temporary). Rather than trying to make it work, I suggest considering whether or not the user really needs to do what they want to do, and if so redesigning it so that this can be done in a nice and safe way. Not that I have any ideas on that at this point.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
I think you could get it to work by having your vec4<T&> class hold 4 references and an array of 4 Ts. Hardly ideal considering this is quite a low-level concept, and the overhead in keeping twice as much information around (and keeping the array synchronised with the references) could be substantial.

Also, the & operator for vec4<T&> would return a T*, not a vec4<T&>* - which is best avoided if possible.
Quote:Original post by iMalc
First of all, I think you have to ask yourself: What does someone want to take the address of it for? Do they intend to modify the original through the swizzled copy? If so, you can prevent that completely by returning a "const vec4<float>".


That is the kicker though. I didn't draw attention to it, but this syntax works for write masks as well.
reversed.wzyx() = temp;

Returning a vec4<float> would make the above not work (modifying the temporary, not 'reversed').
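A two-component sketch of the difference, using hypothetical helper names (yx_by_value, yx_by_reference, float2_ref) that are not part of the prototype: assigning to a swizzle returned by value compiles but only changes a temporary, while a swizzle made of references writes through to the original.

```cpp
#include <cassert>

struct float2 { float x, y; };

// Hypothetical reference proxy, playing the role of vec<float&>.
struct float2_ref {
    float& x;
    float& y;
    float2_ref& operator=(const float2& v) { x = v.x; y = v.y; return *this; }
};

// Swizzle returned by value: the result is a detached copy.
inline float2 yx_by_value(float2& v) { return float2{v.y, v.x}; }

// Swizzle returned as references: the result aliases v's members.
inline float2_ref yx_by_reference(float2& v) { return float2_ref{v.y, v.x}; }

inline void write_mask_demo() {
    const float2 temp = {10.0f, 20.0f};

    float2 a = {1.0f, 2.0f};
    yx_by_value(a) = temp;       // assigns to a temporary; a is unchanged
    assert(a.x == 1.0f && a.y == 2.0f);

    float2 b = {1.0f, 2.0f};
    yx_by_reference(b) = temp;   // writes through: b.y = 10, b.x = 20
    assert(b.x == 20.0f && b.y == 10.0f);
}
```

Returning a const value, as suggested in the quoted post, would turn the first assignment into a compile error rather than a silent no-op, but it still could not act as a write mask.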

Quote:Original post by iMalc
Secondly, typically taking the address of the return value of a function is a mistake (taking the address of a temporary). Rather than trying to make it work, I suggest considering whether or not the user really needs to do what they want to do, and if so redesigning it so that this can be done in a nice and safe way. Not that I have any ideas on that at this point.


I think I absolutely agree with this. I think it could be done pretty easily like so:
typename boost::remove_reference<T>::type *operator&()
{
	BOOST_STATIC_ASSERT( (boost::is_reference<T>::value == false) );
	return static_cast<T *>(&x);
}
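A self-contained version of the same idea, as a sketch only: it swaps the Boost facilities for the C++11 equivalents (static_assert and the std type traits) and uses a cut-down aggregate vec4 rather than the prototype's class. The name address_of_demo is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Cut-down vector type, kept as an aggregate for brevity.
template <typename T>
struct vec4 {
    T x, y, z, w;

    // Address-of yields a pointer to the first component, but only for
    // value vectors. Taking the address of a vector of references is
    // rejected at compile time once this function is instantiated.
    typename std::remove_reference<T>::type* operator&() {
        static_assert(!std::is_reference<T>::value,
                      "cannot take the address of a swizzle result");
        return &x;
    }
};

inline void address_of_demo() {
    vec4<float> red = {1.0f, 0.0f, 0.0f, 1.0f};
    float* p = &red;                 // fine: T is float
    assert(p == &red.x && *p == 1.0f);

    vec4<float&> view = {red.x, red.x, red.x, red.w};
    (void)view;
    // float* q = &view;             // would not compile: static_assert fires
}
```

With this in place the glColor4fv(&red.xxxw()) case becomes a compile error with a readable message, and the caller has to copy into a float4 first.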

--Michael Fawcett
Quote:Original post by Nitage
I think you could get it to work by having your vec4<T&> class hold 4 references and an array of 4 Ts. Hardly ideal considering this is quite a low-level concept, and the overhead in keeping twice as much information around (and keeping the array synchronised with the references) could be substantial.


I thought of that, but I was hoping for a more ideal solution. What I like about the current implementation is that since they are all still vector classes, all of the operators still work, so you can do:
temp.ywxz() *= another.zyxw();


If I change the class layout and implementation for only the type returned by the swizzle functions I would have to re-implement all of the operators as well.

Quote:Original post by Nitage
Also, the & operator for vec4<T&> would return a T*, not a vec4<T&>* - which is best avoided if possible.


Agreed. I was using boost::remove_reference to handle that.

[Edited by - mfawcett on December 4, 2006 3:57:33 PM]
--Michael Fawcett
Quote:Original post by mfawcett
It's been a few days with no comments. Is this because the files don't compile or have people just not found it interesting?
I don't want to be a party-pooper, but I consider this stuff to be outright harmful. It will have serious performance issues because you're taking addresses left and right to make this syntactical "sugar" work. The problem is that taking addresses causes needless aliasing, which screws with the compiler's ability to remove redundant loads and stores, which in turn negatively affects scheduling, hoisting of data, and loop-based optimizations.

Also, if you were to try to make this work with SIMD, you'd find that e.g. gcc will barf all over your attempts at merging SIMD, keeping things in registers, and using references (i.e. resulting in particularly crappy code). Other compilers may or may not have the same issues.

I'm afraid this falls in the category of just because something can be done doesn't mean it's a good idea.
Quote:Original post by Christer Ericson
I don't want to be a party-pooper, but I consider this stuff to be outright harmful. It will have serious performance issues because you're taking addresses left and right to make this syntactical "sugar" work. The problem is that taking addresses causes needless aliasing, which screws with the compiler's ability to remove redundant loads and stores, which in turn negatively affects scheduling, hoisting of data, and loop-based optimizations.

Also, if you were to try to make this work with SIMD, you'd find that e.g. gcc will barf all over your attempts at merging SIMD, keeping things in registers, and using references (i.e. resulting in particularly crappy code). Other compilers may or may not have the same issues.

I'm afraid this falls in the category of just because something can be done doesn't mean it's a good idea.


I think you are seriously overstating your claims. Your post read like a list of every optimization that can't be done to references. Abolish references and pointers due to aliasing issues. Good plan. At any rate...

In most cases, by the time any actual code sees the values it will be a vector of values, not references.
float4 white = red.xxxw();

There's nothing crazy going on there. No "taking of addresses left and right". It's simply syntactic sugar for:
float4 white = float4(red.x, red.x, red.x, red.w);


We could look at the generated assembly for write masks. There we might see aliasing issues come up.

If that's not good enough I could always get rid of the write mask ability and just have the swizzle functions return by value so there would be no references. At that point all of your arguments go bye-bye.
--Michael Fawcett
Quote:Original post by mfawcett
Quote:Original post by Christer Ericson
I'm afraid this falls in the category of just because something can be done doesn't mean it's a good idea.

I think you are seriously overstating your claims.
That's certainly possible, of course, but perhaps unlikely as I've implemented SIMD libraries on e.g. PS2 and PS3 and seen happen exactly what I mentioned. It is well-known that e.g. gcc is notoriously bad in this respect, and will spill to memory as soon as you take a reference to a vector object (whether you inline or not). On e.g. the PS3 (and the XBOX 360), needlessly spilling registers to memory will kill CPU performance and the type of code-shenanigans you posted are a complete no-no for that reason. These problems are well-known in console developer circles and are frequently discussed in the (closed access) developer fora.

Perhaps you should try implementing a SIMD-based vector library for yourself and see. You might be surprised!
Pointer to data members might be helpful to you, take a look here for some inspiration.
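The linked material did not survive in this copy of the thread, so the following is only a guess at the kind of thing meant: a read-only swizzle selected by pointers to data members passed as template arguments. The names swizzle and reversed are hypothetical, and the sketch returns by value, so it covers swizzles but not write masks.

```cpp
#include <cassert>

struct float4 { float x, y, z, w; };

// Each template argument is a pointer to a data member of float4, fixed
// at compile time. The result is returned by value, so no references to
// the source vector escape.
template <float float4::*A, float float4::*B,
          float float4::*C, float float4::*D>
float4 swizzle(const float4& v) {
    return float4{v.*A, v.*B, v.*C, v.*D};
}

// Convenience wrapper equivalent to the wzyx swizzle.
inline float4 reversed(const float4& v) {
    return swizzle<&float4::w, &float4::z, &float4::y, &float4::x>(v);
}

inline void member_pointer_demo() {
    const float4 b = {1.0f, 2.0f, 3.0f, 4.0f};
    const float4 r = reversed(b);
    assert(r.x == 4.0f && r.y == 3.0f && r.z == 2.0f && r.w == 1.0f);
}
```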

This topic is closed to new replies.
