greggman

C++ Serialization Woes


greggman    134
Recently I needed to serialize some data in C++ and it drove me nuts. Does anyone know of any good solutions?

Serialization is the process of saving data and loading it back. With a good library or a good language it can be very easy to save a giant tree of data with very little work. For example, in Perl, if you have a hash of hashes of arrays of values (some massively complicated structure), you can save it all to a file with a couple of lines of code and load it back with a couple more. In a language like C you'd have to write hundreds or thousands of lines of code to save all the data and then reload it.

Java and C# come with serialization built into their standard libraries, so it's relatively easy to load and save. Java's in particular supposedly has lots of issues, but I'm sure there are workarounds.

C++ has no standard serialization, and C++ is still an evolving language, which means there are newer *features* that are not always compatible with older code. For example, there are the boost libraries, which are often created by the very people who work on the C++ spec but which have not made it into the spec yet. One of those is a serialization library. Unfortunately it requires a relatively new C++ feature called Runtime Type Identification, or RTTI. RTTI code is incompatible with non-RTTI code, and the stuff I'm writing is a plugin to another program. That program doesn't use RTTI, so I can't use RTTI in my plugin. So much for using the boost library.

Supposedly it has some support for non-RTTI builds, but I tried the examples and they would not compile until I turned RTTI on. (Yes, I removed the class that was using RTTI, but the compiler still complained that it couldn't include the serialization header files unless I turned on RTTI.) Worse, I tried a simple example with RTTI on just to see if it worked, and as far as I could tell it didn't. I made a simple structure with a couple of floats, serialized the struct as per the docs and looked at the data; the floats were not in there. Maybe that's an issue with VC++ 2003, although the docs claim it works with no problems.

I found the s11n library. It doesn't say whether it requires RTTI. Unfortunately it does say it doesn't support VC++ and doesn't support graphs, which is something I need. A graph in data terms means you have things that point to each other. For example, a house may have a reference to its landlord, and the landlord might have a list of references to the houses she owns. A system that doesn't support graphs could not save that data; it could only save the connection one way. Either houses could save their landlord or landlords could save their houses, but the connection going the opposite way would be lost.

I found the Eternity library. It also requires RTTI.

Someone suggested trying to pull out the CArchive feature of MFC (Microsoft Foundation Classes). It supports serialization without RTTI (as RTTI didn't exist when MFC was created), but it requires you to derive every object you want to serialize from CObject, and I really did not want to get into multiple inheritance issues.

Another minor requirement is that I'd like the serialization library to save in XML. XML is relatively easy for a human to read, so having the option to load and save as XML means it's easy to open the data in a text editor and check for problems. Java and C# both support this. It's one extra line of code in Perl. The boost library is supposed to support this as well, but as I already pointed out, it doesn't seem I can use that library.

To give you an example of how important a serialization library is: in my current project I have 2 tools. One tool writes a bunch of data; the second tool reads that data and processes it. If I had a working serialization library it would take as little as a single line of code in each tool to load and save the data. As I added new data to tool 1 it would automatically be loadable in tool 2. But as I don't have a working serialization library, saving in tool 1 and loading in tool 2 took between 8 and 14 hours to implement. That included finding an XML library that wasn't giant, writing support functions for it, and then writing the code that uses it to deserialize, with all the various error checking, and that was only the first time. Last week I added a bunch of new data, and getting it to load and save took another 6 hours. If I had a serialization library that time would be close to 0 hours, maybe 30 minutes at most to write the few support functions needed.

I learned a lot about serialization writing Tanjunka. In that project I spent almost no time getting a fairly complicated data graph serialized. Unfortunately, writing a good serialization library is not a trivial task. Maybe it is for someone who's done it before, but my best guess is that it would take me at least a couple of weeks to write a good, easy to use, unobtrusive serialization library for C++.

This is one of those unfun tradeoffs. I actually wasted a week trying out different solutions and fudging around trying to write my own, until I decided just to do it the old-fashioned way and manually write saving and loading functions. After a week it was clear to me it was going to take another week or 2, and I didn't have the time. At the same time, every time I add even a little bit of data I've got to update both programs, something that would basically be handled automatically if I had a serialization library.

If you know of a library, please point it out.

- it must serialize/deserialize to XML
- it must support STL containers
- it must support boost::shared_ptr (and containers of shared pointers)
- it must not require RTTI
- it must handle versioning
- it must be easy to use, not massively cumbersome
- it would be nice if it wasn't a bazillion lines of code (the boost serialization libraries compile to 7 MB!)

paulecoyote    1065
Well, the serialisation can be achieved in a pretty standard way by overloading the << and >> operators to save and restore object state:


//.h
#pragma once

#include <iostream>
#include <iomanip>


enum BoardAlignment
{
    BA_FIGHTING=-1, //Position could go either way
    BA_NEUTRAL=0, //Position owned by no one
    BA_LEFT_PLAYER=1, //Player on the left owns position
    BA_RIGHT_PLAYER=2, //Player on the right owns position
    BA_BOTH=3 //Both players own position
};

// For classes you would need to declare these "friend" in the class definition.
//... this would allow the functions to get at data members as if they are part
//of the class despite being separate.
// Output takes a const reference so const objects and temporaries can be written.
std::ostream& operator << (std::ostream& out, const BoardAlignment& posType);
std::istream& operator >> (std::istream& in, BoardAlignment& posType);



// .cpp

#include "..\include\BoardAlignment.h"


std::ostream& operator << (std::ostream& out, const BoardAlignment& boardAlignment)
{
    return out << (int)boardAlignment << " ";
}

std::istream& operator >> (std::istream& in, BoardAlignment& boardAlignment)
{
    int i = 0;

    // Only assign if the read succeeded, so a bad stream leaves the value untouched.
    if (in >> i)
        boardAlignment = (BoardAlignment)i;

    return in;
}
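
As a rough sketch of the "friend" comment above, here is the same pattern applied to a class and round-tripped through an in-memory stream. The `Score` class, its fields and the `roundTrip` helper are made up for illustration and are not from the post.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Hypothetical example class; the name and fields are invented for this sketch.
class Score
{
public:
    Score() : points_(0), lives_(0) {}
    Score(int points, int lives) : points_(points), lives_(lives) {}

    int points() const { return points_; }
    int lives() const { return lives_; }

    // friend lets the free operators read and write the private members.
    friend std::ostream& operator<<(std::ostream& out, const Score& s);
    friend std::istream& operator>>(std::istream& in, Score& s);

private:
    int points_;
    int lives_;
};

// Write the fields as whitespace-separated text.
std::ostream& operator<<(std::ostream& out, const Score& s)
{
    return out << s.points_ << ' ' << s.lives_ << ' ';
}

// Read the fields back in the same order; commit them only if both reads succeed.
std::istream& operator>>(std::istream& in, Score& s)
{
    int points = 0;
    int lives = 0;
    if (in >> points >> lives)
    {
        s.points_ = points;
        s.lives_ = lives;
    }
    return in;
}

// Save to an in-memory buffer and load the text back into a fresh object.
Score roundTrip(const Score& original)
{
    std::stringstream buffer;
    buffer << original;
    Score copy;
    buffer >> copy;
    assert(!buffer.fail()); // the text we just wrote should always parse
    return copy;
}
```

Swapping the `std::stringstream` for a `std::ofstream` and `std::ifstream` pair gives file persistence with the same operators.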










I realise that's probably nothing you don't already know, judging from the way you are talking, but hey ho, perhaps someone else will be able to reply with something more useful. Plus it might be useful to any newbies reading the topic.

Useful search terms could be along the lines of "persistent objects" as well as the whole serialisation thing.
Here's what Google just came up with for me:
http://s11n.net/

[Edited by - paulecoyote on June 29, 2005 4:54:14 AM]

Deyja    920
Quote:
Or check out Boost::serialization


Quote:
So for example there are the boost libraries ... One of those is a serialization library.


ffx; read the post next time.

OP - You're probably going to have to write it yourself. For saving graphs and trees, there is a simple technique. Iterate over the objects and assign each one an integer ID. Iterate again, writing them to file, but wherever you have to record a link, record the target's ID instead of the pointer. When you load, you read in every object, then iterate over them replacing IDs with addresses. It takes an intermediate data structure, but it works.
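
A minimal sketch of that two-pass ID scheme, assuming a made-up `Node` type whose names contain no whitespace and a plain text format. `Graph`, `saveGraph`, `loadGraph` and `cloneViaStream` are hypothetical names for illustration, not part of any library mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical graph node: a name plus raw pointers to other nodes (cycles allowed).
struct Node
{
    std::string name;          // assumed to contain no whitespace
    std::vector<Node*> links;  // non-owning
};

using Graph = std::vector<std::unique_ptr<Node>>;  // owns every node

void saveGraph(const Graph& graph, std::ostream& out)
{
    // Pass 1: give every object an integer ID (here, its index).
    std::map<const Node*, std::size_t> ids;
    for (std::size_t i = 0; i < graph.size(); ++i)
        ids[graph[i].get()] = i;

    // Pass 2: write each object, recording target IDs instead of pointers.
    out << graph.size() << '\n';
    for (const auto& node : graph)
    {
        out << node->name << ' ' << node->links.size();
        for (const Node* target : node->links)
        {
            auto it = ids.find(target);
            assert(it != ids.end()); // every link must point into this graph
            out << ' ' << it->second;
        }
        out << '\n';
    }
}

Graph loadGraph(std::istream& in)
{
    std::size_t count = 0;
    in >> count;

    // Read every object first, parking link IDs in an intermediate structure.
    Graph graph;
    std::vector<std::vector<std::size_t>> pendingLinks(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        std::unique_ptr<Node> node(new Node);
        std::size_t linkCount = 0;
        in >> node->name >> linkCount;
        pendingLinks[i].resize(linkCount);
        for (std::size_t j = 0; j < linkCount; ++j)
            in >> pendingLinks[i][j];
        graph.push_back(std::move(node));
    }

    // Then replace the IDs with the addresses of the freshly created objects.
    // at() throws std::out_of_range if the file names an ID that does not exist.
    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t id : pendingLinks[i])
            graph[i]->links.push_back(graph.at(id).get());

    return graph;
}

// Usage: deep-copy a graph by round-tripping it through memory.
Graph cloneViaStream(const Graph& graph)
{
    std::stringstream buffer;
    saveGraph(graph, buffer);
    return loadGraph(buffer);
}
```

Because links are stored as IDs, a cycle such as house -> landlord -> house comes back intact, with both directions pointing at the newly loaded objects.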
For STL containers, it's trivial. There are no standard graph structures, so you don't have to worry about links. Just save everything in the container in iteration order, even for tree-based structures, and the container can easily be rebuilt when loaded.
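
For instance, a `std::map` could be handled along these lines. This is only a sketch under the assumption that keys and values each stream as a single whitespace-free token; `saveMap`, `loadMap` and `copyViaStream` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Write the element count, then every key/value pair in iteration order.
template <typename K, typename V>
void saveMap(const std::map<K, V>& m, std::ostream& out)
{
    out << m.size() << '\n';
    for (const auto& entry : m)
        out << entry.first << ' ' << entry.second << '\n';
}

// Read the pairs back and let the container rebuild its own tree.
// Returns false and leaves m untouched if the input is malformed.
template <typename K, typename V>
bool loadMap(std::map<K, V>& m, std::istream& in)
{
    std::size_t count = 0;
    if (!(in >> count))
        return false;

    std::map<K, V> result;
    for (std::size_t i = 0; i < count; ++i)
    {
        K key;
        V value;
        if (!(in >> key >> value))
            return false;
        // Pairs were saved in sorted order, so hinting at end() keeps each insert cheap.
        result.insert(result.end(), std::make_pair(key, value));
    }
    m.swap(result);
    return true;
}

// Usage: round-trip a map through an in-memory buffer.
template <typename K, typename V>
std::map<K, V> copyViaStream(const std::map<K, V>& m)
{
    std::stringstream buffer;
    saveMap(m, buffer);
    std::map<K, V> copy;
    const bool ok = loadMap(copy, buffer);
    assert(ok); // what we just wrote should always parse
    (void)ok;
    return copy;
}
```

The loader never needs to know the tree's internal shape; inserting the pairs is enough for the map to rebalance itself.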

DrEvil    1148
I was under the impression that RTTI was a project level setting that didn't break compatibility, and that for example having it disabled for your main application, and enabled inside a dll plugin would still be compatible. Is that not the case?

MrEvil    970
Quote:
Original post by DrEvil
I was under the impression that RTTI was a project level setting that didn't break compatibility, and that for example having it disabled for your main application, and enabled inside a dll plugin would still be compatible. Is that not the case?


As I understand it, adding RTTI in changes the size/layout of the structure, making it difficult to link RTTI/non-RTTI code.

This is done purposely, too, according to This ABI spec, section 3.4.4.2

greggman    134
Thanks for the advice.

The C++ FAQ Lite covers how to serialize graphs if you are rolling your own, but I was hoping to find a solution that didn't require me to write my own. The whole point is to save time, and writing my own means spending more time (at least in the short term) than doing it the old-fashioned way. If I could find an existing solution that would be great, but if I have to roll my own I can already tell it's going to take longer than expected (things like needing special constructors for filling in constant fields, references, etc.).

For example if you have a class


class Something
{
    const string _name;
    SomethingElse& _somethingElse;
public:
    Something(const string& name, SomethingElse& se)
        : _name(name)
        , _somethingElse(se)
    { }

    void serialize(Archive& ar);
};


this is NOT going to work


void Something::serialize(Archive& ar)
{
    if (ar.isWriting())
    {
        ar << _name;
        ar << _somethingElse;
    }
    else
    {
        ar >> _name; // ERROR! _name is const!
        ar >> _somethingElse; // ERROR! you can't set a reference
    }
}


And there are a bunch of other little things like that which I'm sure there are solutions for but that's why I knew it would take me quite a while to roll my own.
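
One common way around the const and reference members, sketched here under assumptions rather than taken from any particular library, is to split saving from loading: saving stays a member function, while loading becomes a static factory that reads into locals and then calls the normal constructor. The stream-based `save` and `load`, the accessors, the `SomethingElse` definition and the `copyViaStream` helper below are all hypothetical, and the referenced object is simply supplied by the caller instead of being looked up by ID.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the referenced type; its contents are invented for this sketch.
struct SomethingElse
{
    int value;
};

class Something
{
    const std::string _name;         // assumed to contain no whitespace
    SomethingElse& _somethingElse;
public:
    Something(const std::string& name, SomethingElse& se)
        : _name(name)
        , _somethingElse(se)
    { }

    const std::string& name() const { return _name; }
    SomethingElse& somethingElse() const { return _somethingElse; }

    // Saving only reads members, so const and reference members are no problem.
    void save(std::ostream& out) const
    {
        out << _name << ' ';
    }

    // Loading reads into locals first, then builds the object in one step,
    // so nothing is ever assigned to a const member or a reference.
    static Something load(std::istream& in, SomethingElse& se)
    {
        std::string name;
        in >> name;
        return Something(name, se);
    }
};

// Usage: write one object out and construct a new one bound to a different target.
Something copyViaStream(const Something& original, SomethingElse& target)
{
    std::stringstream buffer;
    original.save(buffer);
    Something copy = Something::load(buffer, target);
    assert(!buffer.fail()); // the text we just wrote should always parse
    return copy;
}
```

The cost is that the single symmetric `serialize` function is gone, and every class with such members needs its own loading constructor or factory, which is exactly the kind of extra work described above.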

Yesterday I was hoping there was an option in VC++ that would let me specify no RTTI for certain classes. Kind of like how you can do this

extern "C" {
void myfunc (int c);
}

well if I could do this

extern "NORTTI" {
#include <plugin/api.h>
}

then the problem would be solved but I didn't find any such solution :-(

JY    289
Sometimes you just have to write your own, but if you can write it in a generic way then you should only have to do it once.

I created my own framework based on the Composition design pattern plus a couple of others which I use in all my new projects.

Guest Anonymous Poster   

You might consider www.webEbenezer.net. It does some of what you
mentioned.

We support STL containers, don't require RTTI, and in my opinion
it is easy to use.

We don't support XML, boost::shared_ptr or versioning. I don't
think s11n.net supports versioning either.

Brian

Guest Anonymous Poster   
"One of those is a serialization library. Unfortunately it requires a relatively newer C++ feature called Runtime Type Identification or RTTI. RTTI code is incompatible with non RTTI code and the stuff I'm writing is a plugin to another program. That program didn't use RTTI so I can't use RTTI in my plugin. So much for using the boost library. Supposedly they have some support for non RTTI but I tried the examples and they would not compile until I turned RTTI on. (Yes, I removed the class that was using RTTI but the compiler still complained that it couldn't include the header files for serialization unless I turned on RTTI). Worse, I tried some simple example with RTTI just to see if it worked and as far as I could tell it didn't actually work. I made a simple structure with a couple of floats, serialized the struct as per the docs and looked at the data, the floats were not in there. Maybe that's an issue with VC++ 2003 although the docs claimed it works with no problems."

I am the author of the Boost Serialization Library. The library includes dozens of tests and numerous examples which all compile and run with VC 7.1 (VC 2003). The library can run in an environment that does not support RTTI.

Robert Ramey

