self-registering factory in C++

11 comments, last by Washu 12 years, 6 months ago
Is there a way to do the thing in which you use static instances of derived classes to register the derived class with a factory that doesn't depend on undefined behavior regarding the order in which static variables are initialized?
I don't have the references here atm, but the defined behaviour (à la the original Stroustrup C++ book - so it's NOT undefined behaviour) is that:


- global variables have program scope, and are initialised before main() is called.
- global primitives are zero-initialised (0/NULL) first, then
- global types* with copy / constructor are initialised.

So

[source lang="cpp"]
int globalA;
MyClass globalB(0);
int *globalC;
[/source]

globalB is guaranteed to be initialised AFTER both globalA and globalC, which are both 0/NULL.
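A minimal sketch of that ordering claim, assuming everything lives in one translation unit. The `Probe` type and the `checkProbe` function are hypothetical names made up for the illustration, not part of the post above. Zero-initialisation of static-storage objects happens before any constructor runs, so the constructor sees zeroes even for variables defined after it.

```cpp
#include <cassert>

extern int globalA;    // defined further down, with no initialiser
extern int *globalC;   // likewise

// Hypothetical helper: records what it sees while it is being constructed.
struct Probe {
    int seenA;
    int *seenC;
    Probe() : seenA(globalA), seenC(globalC) {}
};

Probe globalB;   // dynamic initialisation: the constructor runs at start-up

int globalA;     // zero-initialised before any constructor runs
int *globalC;    // zero-initialised to a null pointer

// Checks that the constructor observed the zero-initialised values.
void checkProbe() {
    assert(globalB.seenA == 0);
    assert(globalB.seenC == nullptr);
}
```

Note that this only shows the zero-before-constructors guarantee. It says nothing about the order of constructors in different translation units.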

As a result, the automatic-registration pattern can capitalise on this to have the 'registration' class add itself to an array or linked list of types, ie:
[source lang="cpp"]
//in animal.cpp
AnimalCreateInfo *factory_list;
//in dog.cpp
AutoRegister _dogRegister(Dog::Create, TYPE_DOG);
//in cat.cpp
AutoRegister _catRegister(Cat::Create, TYPE_CAT);
[/source]

which allows your AutoRegister class to, in its constructor (and therefore, on initialisation of the global _dogRegister and _catRegister), place construction data into factory_list, which may be a simple linked list or some other construct.
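Here is one way the pieces could fit together, as a rough single-file sketch. The layout of `AnimalCreateInfo`, the `type()` method, `createAnimal`, `checkAnimals` and `TYPE_BIRD` are all assumptions made for the example; the post does not show them. The registrar objects are named without a leading underscore because such names are reserved at global scope.

```cpp
#include <cassert>

enum AnimalType { TYPE_DOG, TYPE_CAT, TYPE_BIRD };

struct Animal {
    virtual ~Animal() {}
    virtual AnimalType type() const = 0;
};

struct Dog : Animal {
    AnimalType type() const override { return TYPE_DOG; }
    static Animal *Create() { return new Dog; }
};

struct Cat : Animal {
    AnimalType type() const override { return TYPE_CAT; }
    static Animal *Create() { return new Cat; }
};

// Assumed layout: one node of an intrusive singly linked list.
struct AnimalCreateInfo {
    Animal *(*create)();
    AnimalType type;
    AnimalCreateInfo *next;
};

// No initialiser, so this is a null pointer before any constructor runs.
AnimalCreateInfo *factory_list;

// Each registrar owns its list node, so the constructor allocates nothing
// and only touches the zero-initialised head pointer.
class AutoRegister {
public:
    AutoRegister(Animal *(*create)(), AnimalType type) {
        info.create = create;
        info.type = type;
        info.next = factory_list;   // push onto the front of the list
        factory_list = &info;
    }
private:
    AnimalCreateInfo info;
};

AutoRegister dogRegister(Dog::Create, TYPE_DOG);   // would live in dog.cpp
AutoRegister catRegister(Cat::Create, TYPE_CAT);   // would live in cat.cpp

// Walks the list and builds the first matching type, or returns null.
Animal *createAnimal(AnimalType type) {
    for (const AnimalCreateInfo *p = factory_list; p != nullptr; p = p->next) {
        if (p->type == type) return p->create();
    }
    return nullptr;
}

// Small self-check for the sketch.
void checkAnimals() {
    Animal *dog = createAnimal(TYPE_DOG);
    assert(dog != nullptr && dog->type() == TYPE_DOG);
    delete dog;
    assert(createAnimal(TYPE_BIRD) == nullptr);   // never registered
}
```

The list head is a plain pointer on purpose: a `std::vector` or `std::map` global would itself need a constructor call, which brings the ordering question straight back.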


[size="1"]* I can't remember if 'int globalA = 1' counts as a primitive initialisation or not and is guaranteed to precede complex types - I think it does, but I might have to go find the Stroustrup book again sometime.[/size]

[quote]- global variables have program scope, and are initialised before main() is called.[/quote]

No, that's not guaranteed. A global variable may be initialized after main() begins as long as it's initialized before the first use of any function or variable defined in the same translation unit. (See section 3.6.2 paragraph 3 in the C++03 standard or 3.6.2 paragraph 4 in the C++11 standard.)
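For what it's worth, a common way to stop caring about the relative order of those initialisations is the construct-on-first-use idiom: keep the registry in a function-local static. This is a different technique from the one posted above, sketched here with made-up names (`registry`, `CreateFn`, `create`, `checkRegistry`). It only removes the ordering dependency. It does not stop an implementation from deferring the registrar's initialisation, or a linker from dropping it.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Animal {
    virtual ~Animal() {}
};

typedef Animal *(*CreateFn)();

// The map is constructed the first time registry() is called, whoever
// calls it first, so no registrar can ever see an unconstructed map.
std::map<std::string, CreateFn> &registry() {
    static std::map<std::string, CreateFn> instance;
    return instance;
}

// Hypothetical registrar: adds one creator to the registry when constructed.
struct AutoRegister {
    AutoRegister(const std::string &name, CreateFn fn) { registry()[name] = fn; }
};

struct Dog : Animal {
    static Animal *Create() { return new Dog; }
};

// May run before or after registrars in other translation units;
// either order works because registry() builds the map on demand.
static AutoRegister dogRegister("dog", Dog::Create);

// Looks up a creator by name; returns null for unknown names.
Animal *create(const std::string &name) {
    std::map<std::string, CreateFn>::const_iterator it = registry().find(name);
    return it == registry().end() ? nullptr : it->second();
}

// Small self-check for the sketch.
void checkRegistry() {
    Animal *dog = create("dog");
    assert(dog != nullptr);
    delete dog;
    assert(create("cat") == nullptr);   // nothing registered under "cat"
}
```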
I remember researching it extensively the first time I used it, years ago, but since then I've just developed a pattern which works reliably in my projects. Arguably, I'm not using the globals in my simplified version above, rather:

[source lang="cpp"]
//dog.h
class Dog {
static AutoRegister _register;
};

//dog.cpp
AutoRegister Dog::_register(...);
[/source]

Obviously putting Dog::_register() in the factory's translation unit would be guaranteed to work, but I'll have to revisit the standards to see if my dog.cpp approach isn't horribly broken...
It's an annoying thing because some compilers are even more happy to optimize and delay than others (i.e. making it work in GCC was harder than VC).

Too many versions flying around, but what eventually worked in both was this (pretty much what was already posted):


[source lang="cpp"]
// Registers Class with FactoryClass under the given id when constructed.
template <class Class, class FactoryClass>
class RegisterClass
{
public:
    typedef typename FactoryClass::AbstractType Abstract;
    typedef typename FactoryClass::IDType IDType;

    // Creator function handed to the factory.
    static Abstract* create() { return new Class; }
    explicit RegisterClass(const IDType& id) { FactoryClass::registerCreator(id, create); }
};
[/source]


You create a static global registration object that does the registration in the constructor.
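To make that concrete, here is a sketch of how the template might be used. `ShapeFactory`, `Shape`, `Triangle` and `checkShapes` are hypothetical; the only things taken from the post are the template itself and the three names it expects from the factory (`AbstractType`, `IDType`, `registerCreator`). Everything sits in one file here, so none of the linker problems discussed in this thread can show up.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};

// Hypothetical factory exposing the names RegisterClass relies on.
class ShapeFactory {
public:
    typedef Shape AbstractType;
    typedef std::string IDType;
    typedef AbstractType *(*Creator)();

    static void registerCreator(const IDType &id, Creator creator) {
        creators()[id] = creator;
    }

    // Returns a new object for a known id, or null otherwise.
    static AbstractType *create(const IDType &id) {
        std::map<IDType, Creator>::const_iterator it = creators().find(id);
        return it == creators().end() ? nullptr : it->second();
    }

private:
    // Function-local static, so the map exists before the first registration.
    static std::map<IDType, Creator> &creators() {
        static std::map<IDType, Creator> instance;
        return instance;
    }
};

// The template as posted above.
template <class Class, class FactoryClass>
class RegisterClass
{
public:
    typedef typename FactoryClass::AbstractType Abstract;
    typedef typename FactoryClass::IDType IDType;

    static Abstract* create() { return new Class; }
    explicit RegisterClass(const IDType& id) { FactoryClass::registerCreator(id, create); }
};

struct Triangle : Shape {
    int sides() const override { return 3; }
};

// The static global registration object; would normally sit in triangle.cpp.
static RegisterClass<Triangle, ShapeFactory> registerTriangle("triangle");

// Small self-check for the sketch.
void checkShapes() {
    Shape *shape = ShapeFactory::create("triangle");
    assert(shape != nullptr && shape->sides() == 3);
    delete shape;
    assert(ShapeFactory::create("square") == nullptr);   // never registered
}
```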

The funny part is that doing basically the same with a simple variable like
static bool classXisRegistered = register...

will be removed by both compilers, unless you reference the variable somewhere. Even then, GCC would still happily ignore the side effects of your register function and delay initialization until you first access the variable. So even with the approach of a global object instead of a variable I wouldn't be surprised to see this fail on some compilers.
f@dz http://festini.device-zero.de
I like low-tech factory functions like this:
[source lang="cpp"]
// base.cpp

#include <memory>
#include <string>

#include "base.h"
#include "derived1.h"
#include "derived2.h"

std::shared_ptr<Base> factory(std::string const &description) {
    if (description == "DERIVED1")
        return std::shared_ptr<Base>(new Derived1());
    if (description == "DERIVED2")
        return std::shared_ptr<Base>(new Derived2());
    // Otherwise issue some error complaining about an unknown type of Base.
    // Returning an empty pointer keeps every path returning a value.
    return std::shared_ptr<Base>();
}
[/source]


It works every time, and everyone can understand what's going on. Automatic registration might seem neat, but I don't think it's worth the pain.

[quote]
I remember researching it extensively the first time I used it, years ago, but since then I've just developed a pattern which works reliably in my projects. Arguably, I'm not using the globals in my simplified version above, rather:

[source lang="cpp"]
//dog.h
class Dog {
static AutoRegister _register;
};

//dog.cpp
AutoRegister Dog::_register(...);
[/source]
[/quote]


That's somewhat similar to what I used for my own unit test framework for registering test procedures. The guts of it is:
[source lang="cpp"]
//header file
#include <map>
#include <string>
#include <vector>

typedef void (*testMethod)();

extern std::vector<std::string> testMethodNames;
extern std::map<std::string, testMethod> testMethods;

void registerTest(std::string methodName, testMethod method);

// Declares the test function, defines a global object whose constructor
// registers it, then opens the definition of the test function itself.
#define STRING(x) #x
#define TESTMETHOD(x) \
    void x(); \
    struct x##_type { \
        x##_type() { \
            registerTest(STRING(x), &x); \
        } \
    } x##_var; \
    void x
[/source]


[source lang="cpp"]
//source file
// Ok so a few globals - you'll live
std::vector<std::string> testMethodNames;
std::map<std::string, testMethod> testMethods;

void registerTest(std::string methodName, testMethod method)
{
    // These statics are a hack around the static initialisation order problem
    static std::vector<std::string> myTestMethodNames;
    static std::map<std::string, testMethod> myTestMethods;
    if (method != NULL)
    {
        myTestMethodNames.push_back(methodName);
        myTestMethods[methodName] = method;
    }
    else // This is where the magic happens
    {
        testMethodNames.swap(myTestMethodNames);
        testMethods.swap(myTestMethods);
    }
}
[/source]



[source lang="cpp"]
// In main
int main()
{
    registerTest("", NULL); // Get the tests

    // booyah! Our global list of tests is now guaranteed safely populated!
}
[/source]


[source lang="cpp"]
// In another source file elsewhere
TESTMETHOD(ExampleTestMethod)()
{
    //Do some tests
}
[/source]


Maybe something similar will work for you, or I hope it at least gives you an idea. There are definitely other ways of doing it though.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
It's possible only if you rely on linker-specific behaviour that is in no way guaranteed. For example, you'll often find that some of the techniques mentioned in this thread will work fine when the code is linked directly from object files, but not when it's put in a static library. The Gory details are here (applies just as well to Visual C++ as to GCC).
I'll just chime in to say that automatic-registration systems in C++ are evil. The linker can and will be a source of bugs at one time or another, usually when it's most painful for you, such as suddenly deciding to dead-strip a single registration object two days before burning your gold-master... There are steps you can take to ensure your registration objects are not dead-stripped by the linker, but they are different for every compiler.

IMHO, it's much cleaner (and non-reliant on implementation-defined behaviour) to simply have a single point of registration that you add all of your factories to:
[source lang="cpp"]
//FactoryRegistration.cpp
#define REGISTER_FACTORY(x) void Register_##x(FactoryList*); Register_##x(list);
void RegisterFactories( FactoryList* list )
{
    REGISTER_FACTORY( MyFactory );
    REGISTER_FACTORY( Foo );
    REGISTER_FACTORY( Bar );
}

//MyFactory.cpp
class MyFactory
{
};
void Register_MyFactory( FactoryList* list )
{
    list->Add( new MyFactory() );
}

//main.cpp
int main()
{
    FactoryList factories;
    RegisterFactories( &factories );
}
[/source]
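The snippet above leaves `FactoryList` undefined, so here is a single-file sketch that fills in the blanks under some assumptions: the `Factory` base class, the `Count` method, `countRegistered`, and the use of `std::unique_ptr` for ownership are all made up for the example. Only two factories are registered to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Assumed common base so the list can own factories of any kind.
struct Factory {
    virtual ~Factory() {}
};

// Assumed container: takes ownership of whatever is passed to Add.
class FactoryList {
public:
    void Add(Factory *factory) { items.push_back(std::unique_ptr<Factory>(factory)); }
    std::size_t Count() const { return items.size(); }
private:
    std::vector<std::unique_ptr<Factory>> items;
};

// Would live in MyFactory.cpp
class MyFactory : public Factory {};
void Register_MyFactory(FactoryList *list) { list->Add(new MyFactory()); }

// Would live in Foo.cpp
class Foo : public Factory {};
void Register_Foo(FactoryList *list) { list->Add(new Foo()); }

// Would live in FactoryRegistration.cpp: declare each function, then call it.
#define REGISTER_FACTORY(x) void Register_##x(FactoryList*); Register_##x(list);
void RegisterFactories(FactoryList *list)
{
    REGISTER_FACTORY(MyFactory);
    REGISTER_FACTORY(Foo);
}

// Small self-check: registers into a fresh list and reports its size.
std::size_t countRegistered() {
    FactoryList factories;
    RegisterFactories(&factories);
    assert(factories.Count() == 2);
    return factories.Count();
}
```

Since `RegisterFactories` calls every registration function by name, the linker has a real reference to each one and nothing depends on static initialisation.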

[quote]It's possible only if you rely on linker-specific behaviour that is in no way guaranteed. For example, you'll often find that some of the techniques mentioned in this thread will work fine when the code is linked directly from object files, but not when it's put in a static library. The Gory details are here (applies just as well to Visual C++ as to GCC).[/quote]


Ah.. I remember when another project happened to use the same approach. It all worked fine when used as a lib, but then the factory was used for a static lib that was used to build a dynamic dll and of course the linker figured "why would I need all this stuff?". While I managed to make the linker keep those unreferenced statics, I'd really stay away unless you are absolutely sure you will always use the same compiler/linker for the project. For the project itself, people decided not to make it a library. It's kind of a solution, but not really a satisfying one.

Though there is a good reason to dislike that "central registration function". It makes things unneat. Instead of adding a single cpp file, you now have to add a header file, include that header file in some central file and modify existing code. Pretty much all the things you don't want.
f@dz http://festini.device-zero.de

