Type Traits
So, what are type traits? Well, type traits are essentially a compile-time mechanism that allows you to:
- Obtain information about types
- Modify types, such as adding or removing cv (const-volatile) qualifiers
- Make compile-time decisions using metafunctions to direct the compiler down certain code paths
In case you have been in a box for a while, there is a new extension to the standard, named "Technical Report on C++ Library Extensions". This extension adds a great many faculties to the existing standard library, amongst which are type traits.
In the following few posts I'm going to examine some of the abilities of type traits and demonstrate how you can use them for a variety of purposes.
Type Information
The first thing I'm going to focus on is the ability to query for type information. To facilitate this ability, TR1 defines the following set of metafunctions:
Quote:template struct integral_constant { static const T value = v; typedef T value_type; typedef integral_constant type;};typedef integral_constant true_type;typedef integral_constant false_type;// [4.5.1] primary type categories:template struct is_void;template struct is_integral;template struct is_floating_point;template struct is_array;template struct is_pointer;template struct is_reference;template struct is_member_object_pointer;template struct is_member_function_pointer;template struct is_enum;template struct is_union;template struct is_class;template struct is_function;// [4.5.2] composite type categories:template struct is_arithmetic;template struct is_fundamental;template struct is_object;template struct is_scalar;template struct is_compound;template struct is_member_pointer;// [4.5.3] type properties:template struct is_const;template struct is_volatile;template struct is_pod;template struct is_empty;template struct is_polymorphic;template struct is_abstract;template struct has_trivial_constructor;template struct has_trivial_copy;template struct has_trivial_assign;template struct has_trivial_destructor;template struct has_nothrow_constructor;template struct has_nothrow_copy;template struct has_nothrow_assign;template struct has_virtual_destructor;template struct is_signed;template struct is_unsigned;template struct alignment_of;template struct rank;template struct extent;
All of these metafunctions derive from the base class (either directly or indirectly) integral_constant, with the T in integral_constant being bool, and v either a true or a false. Basically, these classes will derive from true_type or false_type depending on if the type T provided matches with the type trait.
All of these classes also provide an operator type() const; which just returns an instance of the integral_constant::type typedef.
The function of the majority of these is fairly clear, although the function of the last three may not be obvious at first. alignment_of essentially will return the minimum alignment requirement of the type passed to the metafunction (in integral_constant::value, the alignment is of type size_t). The rank function will return the number of dimensions an array has. For instance: rank::value will return 3, as the array has 3 dimensions. The extent metafunction will return the number of elements a particular dimension can hold (in an array). An example usage would be: extent::value which will return 2 (0 based counter).
So, of what use is this information to us? Well, lets look at a simple example of an optimized copy algorithm. First of all, we know that if a type has a trivial assignment operator then it must be a POD type or a fundamental type. If it is either of those, then we can use a memcpy (or memmove) to do a bulk copy. So, lets start off with a basic copy function:
template<class T1, class T2, bool b>T2 internal_copy(T1 start, T1 stop, T2 dest, integral_constant<bool, b> const&) { while(start != stop) { *dest = *start; ++start; ++dest; } return dest;}template<class T>T* internal_copy(T* start, T* stop, T* dest, true_type const&) { memcpy(dest, start, (stop - start) * sizeof(T)); return dest + (stop - start);}template<class T1, class T2>T2 copy(T1 start, T1 stop, T2 dest) { typedef typename std::iterator_traits::value_type value_type; return internal_copy(start, stop, dest, has_trivial_assign());}
Seems simple enough... basically, if it has a trivial assignment operator and if T1 and T2 are pointers, it will dispatch to the second internal_copy function, otherwise it will fall back to the first internal_copy function. I am using iterator traits here because it will properly dereference pointers to give me the base type.
Well, this seems like a rather nice copy function, however we can do even better. If for the first internal copy we again dispatch to another copy, in this case depending on if the iterator is a random access iterator or not, then we can provide the compiler with an opportunity for loop unrolling. So, changing the first internal_copy, we get:
template<class T1, class T2>T2 internal_copy_2(T1 start, T1 stop, T2 dest, std::random_access_iterator_tag const&) { typedef typename std::iterator_traits::difference_type difference_type; for(difference_type count = stop - start; count > 0; --count) { *dest = *start; ++start; ++dest; } return dest;}template<class T1, class T2>T2 internal_copy_2(T1 start, T1 stop, T2 dest, std::forward_iterator_tag const&) { while(start != stop) { *dest = *start; dest++; start++; } return dest;}template<class T1, class T2, bool b>T2 internal_copy(T1 start, T1 stop, T2 dest, integral_constant<bool, b> const&) { typedef typename std::iterator_traits::iterator_category iterator_catagory; return internal_copy_2(start, stop, dest, iterator_catagory());}
Note that copy requires a minimum of a forward iterator in order to work, hence why the second internal_copy_2 take a constant reference to it.
Ok, so we've got the copy optimized for random access iterators, and pointer types with trivial assignment operators, but what about aligned types? First, I should note that memcpy will most often do this by default, however for the sake of an example, I'm going to show it here. If we change the second internal_copy to dispatch to another function, which will use a different copy mechanism depending on the alignment of the type, then we can elicit a further speed boost.
template<class T>T* internal_copy_1(T* start, T* stop, T* dest) { //optimized copy for 4 byte aligned types __asm { mov esi, start mov edi, dest mov ecx, stop sub ecx, esi shr ecx, 2 rep movsd } return dest + (stop - start);}template<class T>T* internal_copy_1(T* start, T* stop, T* dest, bool b) { memcpy(dest, start, (stop - start) * sizeof(T)); return dest + (stop - start);}template<class T>T* internal_copy(T* start, T* stop, T* dest, true_type const&) { if(alignment_of::value % 4) { return internal_copy_1(start, stop, dest, true); } else { return internal_copy_1(start, stop, dest); }}
Now, if you are thinking: 'But Washu, that if statement is going to cost me runtime speed!', then you would be completely wrong! In fact, the compiler will see that alignment_of::value % 4 is either always true or always false for a type and will optimize it out. Furthermore, the function calls will all be inlined (as the compiler can also see that it's just dispatching to another function). In the end you get a whole lot of compile-time flexability, and lose nothing in terms of speed.
Looking forward to future articles of this. Since I'm still sketchy on all this template meta programming stuff. ;)