Serializing without breaking encapsulation

Started by
17 comments, last by wood_brian 16 years, 8 months ago
Lately I've been working on using Python's pickling system to make save files for my game. I'm wrapping C++ classes into Python via Boost.Python. The problem comes when I write the pickle suite class, which forms the tuples of information that need to be saved and then translates that information back into the class. This pickler class requires that I know about some of the internals of the class. Usually it's nothing vital, and much of it was already part of the public interface. But sometimes it gets nastier and nastier, exposing things that I had no previous need to expose, even if it doesn't hurt to do so. It feels messy, like a hack. I don't like it.

Since Boost.Python basically mandates a separate class so that it can inherit from pickle_suite, I find it forces me to expose this stuff to the whole program. Ick. The obvious solution to me is friend classes, but I'm not sure that they're a good solution. What is a good solution?

On a side note, some of the things I'm pickling are part of another library, and some of them do not give you all the information that you need to completely reconstruct the class nicely. For example, my physics engine's solid class takes in some shapes when you construct it, but there's no way to get those shapes back after it's constructed. I've gotten around this by writing a wrapper class which stores the shapes as they are passed in. It holds redundant information and is a little annoying to write, but it works. Again, what is a good solution for this?
Writing your own wrappers to store the information needed to recreate objects that don't give access to their internals sounds like a good solution. And if the shape information is already being pulled from some data source like a file, then to recreate it you just need to go back to that same data.
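
The wrapper idea can be sketched roughly as follows. This is only an illustration under assumptions: `Shape`, `Solid`, and `RecordingSolid` are hypothetical names standing in for a physics library that accepts shapes at construction but never hands them back. They are not the API of any real engine.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for a third-party physics library.
struct Shape {
    double radius;
};

class Solid {
public:
    explicit Solid(const std::vector<Shape>& shapes) : shape_count_(shapes.size()) {}
    std::size_t shape_count() const { return shape_count_; }
private:
    std::size_t shape_count_;  // the library keeps only its own internal form
};

// Wrapper that remembers what was passed in, so the object can be rebuilt later.
class RecordingSolid {
public:
    explicit RecordingSolid(std::vector<Shape> shapes)
        : shapes_(std::move(shapes)), solid_(shapes_) {}

    const std::vector<Shape>& shapes() const { return shapes_; }  // data to save
    Solid& solid() { return solid_; }

private:
    std::vector<Shape> shapes_;  // declared first, so it is initialized before solid_
    Solid solid_;
};

inline void example_usage() {
    RecordingSolid original(std::vector<Shape>{Shape{1.0}, Shape{2.5}});
    // Rebuild from the recorded shapes, as a loader would after reading a save file.
    RecordingSolid rebuilt(original.shapes());
    assert(rebuilt.solid().shape_count() == original.solid().shape_count());
}
```

The cost is the redundant copy of the shapes; the benefit is that the save code never needs anything the library refuses to expose.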

C++ wasn't designed for serializing objects, so I think anything that does serialize them is going to look like a hack in comparison to the serialization available in some more modern languages. You could give a class public methods that aren't exposed to Python and that are used only by your serializing class to save and load an instance's state.
Impossible; that's one of the reasons why people don't use serialization in real products. Serialization is dependent on implementation. Therefore, if you write a class to disk for version 1.0, then change that class even slightly in version 1.1, you won't be able to resurrect that old data. To do this you need a way to separate the file representation of a class from its implementation.

The standard way to do this is with an additional class called a gateway: it takes an object and writes it to, or reads it from, the disk, a database, or whatever data storage mechanism you use. That way you can keep all of the reading/writing code in one place, making it easier to deal with version changes while also keeping your implementation hidden.
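
A minimal sketch of such a gateway, assuming a hypothetical `Player` class and a made-up whitespace-separated text format whose first field is a version number. All names and the format itself are illustrative assumptions, not something from the original posts.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical game class; it knows nothing about files.
class Player {
public:
    Player(std::string name, int health, int armor)
        : name_(std::move(name)), health_(health), armor_(armor) {}
    const std::string& name() const { return name_; }
    int health() const { return health_; }
    int armor() const { return armor_; }
private:
    std::string name_;
    int health_;
    int armor_;  // imagined as a field added in version 2 of the game
};

// Gateway: the only place that knows the stored layout.
// Assumes names contain no whitespace, to keep the format trivial.
class PlayerGateway {
public:
    static void write(std::ostream& out, const Player& p) {
        out << 2 << ' ' << p.name() << ' ' << p.health() << ' ' << p.armor() << '\n';
    }

    static Player read(std::istream& in) {
        int version = 0;
        std::string name;
        int health = 0;
        if (!(in >> version >> name >> health) || version < 1 || version > 2) {
            throw std::runtime_error("malformed or unsupported player record");
        }
        int armor = 0;  // default for version 1 records, which had no armor field
        if (version == 2 && !(in >> armor)) {
            throw std::runtime_error("missing armor field");
        }
        return Player(name, health, armor);
    }
};

inline void example_usage() {
    std::istringstream old_record("1 hero 80\n");  // written by an older version
    const Player p = PlayerGateway::read(old_record);
    assert(p.armor() == 0);  // missing field falls back to the default
}
```

Because the version check lives in the gateway, adding a field to `Player` changes one `read` function rather than invalidating every existing save.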
One good solution is to create a separate class that reads/writes the object to/from a data source. That way changes to the class do not require changes to the data format, and they may not even require changes to the reader/writer class(es).
Programming since 1995.
Quote:Original post by T1Oracle
One good solution is to create a separate class that reads/writes the object to/from a data source. That way changes to the class do not require changes to the data format, and they may not even require changes to the reader/writer class(es).


I just said that.
@Vorpy - Unfortunately, I can't just go back to the file they are loaded from (this *is* the file they are loaded from). :( But I did just find out that my physics engine has "GetData" and "SetData" routines which return a simple struct that contains all the information to store and restore them, so that solves that particular problem. Speaking of which, is this a good way to go about it? To have GetData and SetData routines that fill out a basic struct and then can return themselves to that state.

@Aressera - I don't see how serializing something is any different from any other method of saving and/or loading files. So I guess I'm saying that I don't really know what you're getting at with your first paragraph.

@Aressera's second paragraph and T1Oracle - I think I already have what you're talking about. Python reads and writes it to and from the disk for me. I don't know any of the details about that operation. I'm worried about the process of getting information from a class into Python's pickling system without breaking the encapsulation of said class. ASCII diagram time!
*---------*         *----------------*          *--------*
* MyClass * ------> * MyClassPickler * -------> * pickle *
*---------*         *----------------*          *--------*
              ^^
              ||
    Here's the encapsulation problem.
    MyClassPickler needs to know
    about MyClass, and I find
    that icky.

MyClass - some class that I want to save and then load from a file
MyClassPickler - turns the current state of a MyClass into a tuple, and can use a tuple to return MyClass back to a state
pickle - Python's pickling system. It takes the tuple and writes it to a file, and can then load from the file and turn it back into a tuple.
Quote:Original post by Ezbez
*---------*         *----------------*          *--------*
* MyClass * ------> * MyClassPickler * -------> * pickle *
*---------*         *----------------*          *--------*
              ^^
              ||
    Here's the encapsulation problem.
    MyClassPickler needs to know
    about MyClass, and I find
    that icky.



You should not feel icky. You have to see those two classes as working together. They form a system, an interface for working with the data in MyClass.

To do this cleanly in C++, you don't even have to create new accessors or use friend, you can simply nest the classes.

class MyClass
{
public:
    // Nested class: as a member of MyClass, it can reach MyClass's private data.
    class MyClassPickler
    {
    };
};


You get free access and a clear notion that they are strongly related.
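
To make the nesting concrete, here is a rough standalone sketch. It uses `std::tuple` in place of a Python tuple so that it compiles without Boost.Python; the members `x_` and `y_` and the function names `getstate`/`setstate` are assumptions chosen for illustration, and a real suite would additionally derive from Boost.Python's pickle_suite, which is omitted here.

```cpp
#include <cassert>
#include <tuple>

class MyClass {
private:
    int x_;     // hypothetical private state, never exposed through getters
    double y_;

public:
    MyClass(int x, double y) : x_(x), y_(y) {}

    // Nested helper: as a member of MyClass it may touch private data
    // (guaranteed by C++11 and later).
    class MyClassPickler {
    public:
        using State = std::tuple<int, double>;

        // Pack the private fields into a tuple.
        static State getstate(const MyClass& obj) {
            return State(obj.x_, obj.y_);
        }

        // Restore the private fields from a tuple.
        static void setstate(MyClass& obj, const State& state) {
            obj.x_ = std::get<0>(state);
            obj.y_ = std::get<1>(state);
        }
    };
};

inline void example_usage() {
    MyClass saved(3, 1.5);
    MyClass restored(0, 0.0);
    MyClass::MyClassPickler::setstate(restored, MyClass::MyClassPickler::getstate(saved));
    assert(MyClass::MyClassPickler::getstate(restored) ==
           MyClass::MyClassPickler::getstate(saved));
}
```

Nothing outside `MyClass` gains access to `x_` or `y_`; only the nested pickler does, which keeps the exposure as narrow as a friend declaration would.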

If you don't want to do that or use friend, you can have MyClass emit a state object (it could simply be a struct with a copy of all the data) that contains everything that needs to be saved, and have MyClassPickler work on that. MyClass then needs another function to reload itself from the state.
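
The state-object alternative might look something like the sketch below. The fields (`name`, `score`), the method names `get_state`/`set_state`, and the comma-separated text format are all assumptions made up for the example.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Plain data snapshot; this is the only thing the pickler ever sees.
struct MyClassState {
    std::string name;
    int score;
};

class MyClass {
public:
    MyClass(std::string name, int score) : name_(std::move(name)), score_(score) {}

    MyClassState get_state() const { return MyClassState{name_, score_}; }
    void set_state(const MyClassState& s) {
        name_ = s.name;
        score_ = s.score;
    }

    void add_points(int n) { score_ += n; }

private:
    std::string name_;
    int score_;
};

// Works on the state struct, not on MyClass internals.
struct MyClassPickler {
    static std::string dump(const MyClassState& s) {
        return s.name + "," + std::to_string(s.score);
    }

    static MyClassState load(const std::string& text) {
        const std::string::size_type comma = text.rfind(',');
        if (comma == std::string::npos) {
            throw std::invalid_argument("expected 'name,score'");
        }
        return MyClassState{text.substr(0, comma), std::stoi(text.substr(comma + 1))};
    }
};

inline void example_usage() {
    MyClass original("hero", 10);
    original.add_points(5);
    const std::string saved = MyClassPickler::dump(original.get_state());

    MyClass restored("", 0);
    restored.set_state(MyClassPickler::load(saved));
    assert(restored.get_state().score == 15);
}
```

Here the pickler depends only on `MyClassState`, so `MyClass` is free to change its private layout as long as it can still produce and consume that struct.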
Quote:Original post by Ezbez
@Aressera - I don't see how serializing something is any different from any other method of saving and/or loading files. So I guess I'm saying that I don't really know what you're getting at with your first paragraph.


You probably just don't understand serialization. Serialization is a method of putting a class in memory into a representation that can be written to disk. This memory representation is totally dependent on the implementation of the class and totally breaks encapsulation. From the serialized class, you can tell what its fields are and what methods it has; whether they are public or private has no effect on their visibility. Therefore, if you decide to change the internal implementation of a class (its private members) later on, you will be unable to resurrect the older versions that have already been serialized. Basically, if the way a class is arranged in memory changes at all, your program will break.

That is why you use a "gateway" object that uses data abstractions as well as getters/setters to write a format-independent version of that class to disk. Then when you go to resurrect a class, it doesn't matter what the actual structure of the class is, just as long as its abstracted data matches.
read this:
http://en.wikipedia.org/wiki/Serialization
Thank you, I think I see what you mean now. I will read up on nested classes, something which I have never used before.

This topic is closed to new replies.
