Code bloat : Templates

15 comments, last by phresnel 13 years, 11 months ago
I am trying to understand what leads to code bloat when using templates. For example, suppose I have a class as follows:


#include<iostream>
using namespace std;

template< class T>
class CBloat
{
T* pvar;
public:
  CBloat(T *x):pvar(x){}
  void print()
  {
      cout<<*pvar<<endl;
  }
};

int main()
{
    int x =10;
    float y=20.0;
    CBloat<int> bInt(&x);
    CBloat<float> bFlt(&y);

    bInt.print();
    bFlt.print();

    return 0;
}



I understand the compiler will generate/instantiate two versions of the class CBloat. If the same needs to be prevented, what techniques are available?

EDIT: I am unable to wrap my head around the 'gets compiled multiple times' issue. I came across statements such as 'if vector is included in two different classes, the compiler will compile all functions related to vector twice', i.e. once for each class. Ideally, shouldn't it link directly to the object code for vector just once (assuming both instantiate the same type)?
"I think there is a world market for maybe five computers." -- Thomas Watson, Chairman of IBM, 1943
Quote:
If the same needs to be prevented, what techniques are available?


To my knowledge, there are none. That's the way C++ templates work. Why does it matter?
It is the same bloat as you get in

  void print(int x)
  {
      cout<<x<<endl;
  }

  void print(float x)
  {
      cout<<x<<endl;
  }


(and overloaded functions are not specific to C++)

Also note that instruction caches on current desktop CPUs are large relative to the hot code of most programs, so that is another vote for "Why does it matter?".
Curiosity :)

I did run across this..

http://www.linuxtopia.org/online_books/programming_books/c++_practical_programming/c++_practical_programming_122.html

I found the 2nd example complicated to follow :(
Btw, C++ templates are instantiated lazily, so unless explicitly asked for, only what you use is put into the final binary. In that respect, you can sometimes even end up with smaller executables than you get by manually typing overloads.
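A minimal sketch of that lazy behaviour, reusing the CBloat class from the first post. The printChar() member and the showInt()/showString() helpers are made up for illustration, and print() takes a stream here only so the result can be checked.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Same shape as CBloat from the first post, plus a hypothetical printChar().
template <class T>
class CBloat
{
    T* pvar;
public:
    explicit CBloat(T* x) : pvar(x) {}

    // Writes the pointed-to value to the given stream.
    void print(std::ostream& os) const
    {
        os << *pvar << '\n';
    }

    // Only valid when T has a c_str() member. For T = int this body would
    // not compile, yet CBloat<int> is fine as long as nobody calls it.
    void printChar(std::ostream& os) const
    {
        os << pvar->c_str() << '\n';
    }
};

// Hypothetical helper: uses only print(), so CBloat<int>::printChar
// is never instantiated and no code is generated for it.
std::string showInt(int value)
{
    std::ostringstream out;
    CBloat<int> b(&value);
    b.print(out);
    return out.str();
}

// Hypothetical helper: here T = std::string, so printChar() is valid
// and is instantiated because it is actually called.
std::string showString(std::string value)
{
    std::ostringstream out;
    CBloat<std::string> b(&value);
    b.printChar(out);
    return out.str();
}
```

If you add a call to printChar() on a CBloat<int>, the compiler should reject it at that point, which is a handy way to convince yourself that the member was not being compiled before.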
Quote: Original post by phresnel
Btw, C++ templates are instantiated lazily, so unless explicitly asked for, only what you use is put into the final binary. In that respect, you can sometimes even end up with smaller executables than you get by manually typing overloads.

Do you mean to say that if I add another function to my template class, say printChar(), but there is no reference to it in my code, the compiler will not generate code for it?

Thanks!
Quote:
EDIT:
I am unable to wrap my head around the 'gets compiled multiple times' issue. I came across statements such as 'if vector is included in two different classes, the compiler will compile all functions related to vector twice', i.e. once for each class. Ideally, shouldn't it link directly to the object code for vector just once (assuming both instantiate the same type)?

Unlike ordinary functions, templates normally have no separation of declaration and definition.
With a normal function you can do:

  // foo.h
  int foo( float );

  // foo.cpp
  int foo( float a ) { std::cout << a << std::endl; return 0; }


With templates you have to place all of that in the header.
  // foo.h
  template< typename T >
  int foo( T a ) { std::cout << a << std::endl; return 0; }


A C++ build runs one compile step per source (.cpp) file, and then one link step to combine all the resulting object files.

In the non-template case, that means anything including foo.h only compiles the declaration of "foo()". Only foo.cpp has to compile the definition of foo() and create code for it. Everything links against the single definition of foo() later.

In the template case, every source file that includes foo.h has to parse the template, then look at its uses and generate code for each instantiation it needs. It is then up to the linker to notice that there are now dozens of copies of "int foo<float>(float)" in dozens of object files. A good linker will fold the duplicate definitions into one, and a bad one will leave multiple redundant copies.

On the other hand, it is easier to optimize inside a compile unit than it is to optimize in a link unit. Since the template version (or an inline function in a header) provides all the code inside a compile unit, the compiler will have an easier time optimizing the use. This is especially useful for the small getter/setter type functions that can be inlined.
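If you would rather not depend on the linker to fold the copies, one option is explicit instantiation. The sketch below assumes a C++11 compiler (for "extern template") and squeezes the hypothetical foo.h / foo.cpp split into one listing, with comments marking where each part would live. useFoo() is a made-up caller.

```cpp
#include <iostream>

// ---- foo.h (hypothetical header) ----
template <typename T>
int foo(T a)
{
    std::cout << a << std::endl;
    return 0;
}

// Explicit instantiation declaration (C++11): asks every file that
// includes the header not to instantiate foo<float> on its own.
extern template int foo<float>(float);

// ---- foo.cpp (hypothetical source file) ----
// Explicit instantiation definition: the one place where code for
// foo<float> is generated.
template int foo<float>(float);

// ---- any other .cpp that includes foo.h ----
int useFoo()
{
    return foo(1.5f);  // refers to foo<float>, resolved at link time
}
```

The trade-off is the one described above: files that only see the "extern template" line give up some of the easy inlining in exchange for compiling foo<float> once.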
If this is really an issue, take a look at the thin template pattern.
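A rough sketch of what a thin template can look like, assuming a container that only stores pointers. PtrStackBase and PtrStack are names invented for this example, not taken from any library.

```cpp
#include <cstddef>
#include <vector>

// Non-template core: the real work is compiled once, on void*.
class PtrStackBase
{
    std::vector<void*> items;
protected:
    void pushRaw(void* p) { items.push_back(p); }

    // Caller must make sure the stack is not empty.
    void* popRaw()
    {
        void* p = items.back();
        items.pop_back();
        return p;
    }
public:
    std::size_t size() const { return items.size(); }
};

// Thin typed wrapper: each instantiation adds only trivial inline casts,
// so PtrStack<int> and PtrStack<float> share the code in PtrStackBase.
template <class T>
class PtrStack : private PtrStackBase
{
public:
    void push(T* p) { pushRaw(p); }
    T* pop() { return static_cast<T*>(popRaw()); }
    using PtrStackBase::size;
};
```

The wrapper keeps type safety at the call site, while the part that could bloat is written once against void*. It only works this simply because every pointer type converts to void*; value-storing containers need more care.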
Interesting, thanks! That should bring down the bloat for multiple types.

As for the bloat from multiple objects of the same type, it can be brought down with the following model by Bruce Eckel.

http://www.linuxtopia.org/online_books/programming_books/c++_practical_programming/c++_practical_programming_135.html

Cheers!
Just to quickly address the 'Why care' bit: those desiring to pursue a career in programming should very much care, simply because the platform you're working on may not always have a big cache to exploit. If you're programming on a highly restrictive platform where (literally) every byte counts, then it certainly behooves you to know the consequences you may face when using templates :)

I have heard many stories from colleagues whose previous employers simply banned the use of templates and recursion (among other things) because they had to keep the executable as small as possible and the stack was (is) extremely limited.

