• entries
    146
  • comments
    436
  • views
    197621

Type Traits, Part 1

Sign in to follow this  
Washu

296 views

Wow, was this ever long in coming?

Type Traits
So, what are type traits? Well, type traits are essentially a compile-time mechanism that allows you to:

  • Obtain information about types

  • Modify types, such as adding or removing cv (const-volatile) qualifiers

  • Make compile-time decisions using metafunctions to direct the compiler down certain code paths



In case you have been in a box for a while, there is a new extension to the standard, named "Technical Report on C++ Library Extensions". This extension adds a great many faculties to the existing standard library, amongst which are type traits.

In the following few posts I'm going to examine some of the abilities of type traits and demonstrate how you can use them for a variety of purposes.

Type Information
The first thing I'm going to focus on is the ability to query for type information. To facilitate this ability, TR1 defines the following set of metafunctions:
Quote:

template
struct integral_constant {
static const T value = v;
typedef T value_type;
typedef integral_constant type;
};
typedef integral_constant true_type;
typedef integral_constant false_type;
// [4.5.1] primary type categories:
template struct is_void;
template struct is_integral;
template struct is_floating_point;
template struct is_array;
template struct is_pointer;
template struct is_reference;
template struct is_member_object_pointer;
template struct is_member_function_pointer;
template struct is_enum;
template struct is_union;
template struct is_class;
template struct is_function;
// [4.5.2] composite type categories:
template struct is_arithmetic;
template struct is_fundamental;
template struct is_object;
template struct is_scalar;
template struct is_compound;
template struct is_member_pointer;
// [4.5.3] type properties:
template struct is_const;
template struct is_volatile;
template struct is_pod;
template struct is_empty;
template struct is_polymorphic;
template struct is_abstract;
template struct has_trivial_constructor;
template struct has_trivial_copy;
template struct has_trivial_assign;
template struct has_trivial_destructor;
template struct has_nothrow_constructor;
template struct has_nothrow_copy;
template struct has_nothrow_assign;
template struct has_virtual_destructor;
template struct is_signed;
template struct is_unsigned;
template struct alignment_of;
template struct rank;
template struct extent;

All of these metafunctions derive from the base class (either directly or indirectly) integral_constant, with the T in integral_constant being bool, and v either a true or a false. Basically, these classes will derive from true_type or false_type depending on if the type T provided matches with the type trait.
All of these classes also provide an operator type() const; which just returns an instance of the integral_constant::type typedef.

The function of the majority of these is fairly clear, although the function of the last three may not be obvious at first. alignment_of essentially will return the minimum alignment requirement of the type passed to the metafunction (in integral_constant::value, the alignment is of type size_t). The rank function will return the number of dimensions an array has. For instance: rank::value will return 3, as the array has 3 dimensions. The extent metafunction will return the number of elements a particular dimension can hold (in an array). An example usage would be: extent::value which will return 2 (0 based counter).

So, of what use is this information to us? Well, lets look at a simple example of an optimized copy algorithm. First of all, we know that if a type has a trivial assignment operator then it must be a POD type or a fundamental type. If it is either of those, then we can use a memcpy (or memmove) to do a bulk copy. So, lets start off with a basic copy function:

template<class T1, class T2, bool b>
T2 internal_copy(T1 start, T1 stop, T2 dest, integral_constant<bool, b> const&) {
while(start != stop) {
*dest = *start;
++start;
++dest;
}
return dest;
}
template<class T>
T* internal_copy(T* start, T* stop, T* dest, true_type const&) {
memcpy(dest, start, (stop - start) * sizeof(T));
return dest + (stop - start);
}
template<class T1, class T2>
T2 copy(T1 start, T1 stop, T2 dest) {
typedef typename std::iterator_traits::value_type value_type;
return internal_copy(start, stop, dest, has_trivial_assign());
}



Seems simple enough... basically, if it has a trivial assignment operator and if T1 and T2 are pointers, it will dispatch to the second internal_copy function, otherwise it will fall back to the first internal_copy function. I am using iterator traits here because it will properly dereference pointers to give me the base type.

Well, this seems like a rather nice copy function, however we can do even better. If for the first internal copy we again dispatch to another copy, in this case depending on if the iterator is a random access iterator or not, then we can provide the compiler with an opportunity for loop unrolling. So, changing the first internal_copy, we get:

template<class T1, class T2>
T2 internal_copy_2(T1 start, T1 stop, T2 dest, std::random_access_iterator_tag const&) {
typedef typename std::iterator_traits::difference_type difference_type;
for(difference_type count = stop - start; count > 0; --count) {
*dest = *start;
++start;
++dest;
}

return dest;
}
template<class T1, class T2>
T2 internal_copy_2(T1 start, T1 stop, T2 dest, std::forward_iterator_tag const&) {
while(start != stop) {
*dest = *start;
dest++;
start++;
}
return dest;
}
template<class T1, class T2, bool b>
T2 internal_copy(T1 start, T1 stop, T2 dest, integral_constant<bool, b> const&) {
typedef typename std::iterator_traits::iterator_category iterator_catagory;
return internal_copy_2(start, stop, dest, iterator_catagory());
}



Note that copy requires a minimum of a forward iterator in order to work, hence why the second internal_copy_2 take a constant reference to it.

Ok, so we've got the copy optimized for random access iterators, and pointer types with trivial assignment operators, but what about aligned types? First, I should note that memcpy will most often do this by default, however for the sake of an example, I'm going to show it here. If we change the second internal_copy to dispatch to another function, which will use a different copy mechanism depending on the alignment of the type, then we can elicit a further speed boost.

template<class T>
T* internal_copy_1(T* start, T* stop, T* dest) {
//optimized copy for 4 byte aligned types
__asm {
mov esi, start
mov edi, dest
mov ecx, stop
sub ecx, esi
shr ecx, 2
rep movsd
}
return dest + (stop - start);
}
template<class T>
T* internal_copy_1(T* start, T* stop, T* dest, bool b) {
memcpy(dest, start, (stop - start) * sizeof(T));
return dest + (stop - start);
}
template<class T>
T* internal_copy(T* start, T* stop, T* dest, true_type const&) {
if(alignment_of::value % 4) {
return internal_copy_1(start, stop, dest, true);
} else {
return internal_copy_1(start, stop, dest);
}
}



Now, if you are thinking: 'But Washu, that if statement is going to cost me runtime speed!', then you would be completely wrong! In fact, the compiler will see that alignment_of::value % 4 is either always true or always false for a type and will optimize it out. Furthermore, the function calls will all be inlined (as the compiler can also see that it's just dispatching to another function). In the end you get a whole lot of compile-time flexability, and lose nothing in terms of speed.
Sign in to follow this  


4 Comments


Recommended Comments

Trading memory for speed.
Looking forward to future articles of this. Since I'm still sketchy on all this template meta programming stuff. ;)

Share this comment


Link to comment
Oh dear, this was SWEEEEEEEEEEEEEEEEEEEEEEEEEEEET. Thanks a million, Washu [smile]

The more I get into C++ metaprogramming, the more I'm impressed and scared at the same time. This stuff is very powerful, yet very "heavy" to the mind, IMO. Scary.

Share this comment


Link to comment
The more I think about this. The more I want to file it into http://en.wikipedia.org/wiki/Accidental_complexity.
All this template meta programming seems overly complex, mainly because the cost from what I keep seeing out weigh gains. (Like poorly designed coprocessors etc)

I'm still having trouble with the purpose. I see how it works, but the applications still don't make sense. I'd like to see a comparision between this and a similar systems not using meta programming. Doesn't mean anyone is obligated, and maybe when I get all my other projects out of the way I'll do some sort of benchmark.

Share this comment


Link to comment
I guess this is a little late, but would this happen to be the same functionality provided by the "DTI" (Dynamic Type Information) class described near the beginning of Game Programming Gems 2?

Looks like it is.

(Again, sorry this is so late, I would normally have asked in #gamedev, but I don't care to wait for Washu to be done with...whatever he's doing.)

Share this comment


Link to comment

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now