File Loading design

Started by
5 comments, last by Drew_Benton 19 years, 4 months ago
I've spent the last few months trying to wrap my head around how I should handle file loading. I usually go into a problem knowing a few standard ways of solving it, then specialize one to fit what I'm doing. For example, if I were designing an event system, I know I would look into messages, callbacks, and probably boost::signals. For file loading, there don't seem to be any standard techniques.

Let me explain how I've progressed through ideas. The first system I used involved self-serialization. Each object implemented the left and right shift operators (<< and >>) to write and read itself to ofstreams and ifstreams. This worked, but it tied the file format to the objects' implementation, and I knew there was a better way.

So I decided to go all out and make everything as generic as possible. I jumped into learning more about the STL (I had stayed away from it before) and looked heavily into Boost. I made a tree based on boost::any, designed to mimic the usage of the STL: passing iterators around and whatnot. Using boost::spirit, I wrote a parser. Debugging something so abstract was a nightmare, and it quickly became apparent that although I could get this to work, it might not be worth all the effort.

I decided that I needed inspiration from real-life examples. I downloaded the source for Quake 2, Doom, and Wolfenstein. Unfortunately, it's all C, and using fread is not my idea of good C++ programming. I also looked at the 3ds file format. What struck me was its use of ID chunks to tell the parser what to expect next. Is this how it's usually done?

I tried searching for generalized file loading algorithms, but that doesn't seem to be a particularly hot topic. Can anyone point me in the right direction? I'm looking for an article, a design pattern, or techniques for designing file loaders. I'd like to be able to use the same framework and change what it loads by plugging in different grammars.

How do you handle file loading? I'm more interested in what techniques your file loader uses to translate file data into something your project can handle, not how to write a loader for a specific format. If I'm on the wrong track, somebody slap me on the back of the head.
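For readers wondering what the ID-chunk idea looks like in C++ streams rather than fread, here is a minimal sketch. Everything in it is an assumption made for illustration: the chunk layout (a 16-bit ID and a 32-bit payload size, both little-endian) is invented and is not the real 3ds layout, and `ChunkLoader` and `makeChunk` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chunk layout (NOT the real 3ds layout):
//   uint16 id, uint32 payload size, then `size` bytes of payload.
// Integers are decoded byte by byte as little-endian, so the result does not
// depend on host endianness or struct padding.
class ChunkLoader {
public:
    using Handler = std::function<void(const std::vector<char>& payload)>;

    // Bind a handler to a chunk ID. The file format lives in these bindings,
    // not in the objects being loaded.
    void on(std::uint16_t id, Handler h) {
        assert(h && "handler must be callable");
        handlers_[id] = std::move(h);
    }

    // Reads chunks until the stream ends or a chunk is truncated. Returns how
    // many chunks had a registered handler; unknown chunks are skipped.
    int load(std::istream& in) const {
        int handled = 0;
        for (;;) {
            std::uint32_t id = 0, size = 0;
            if (!readLE(in, 2, id) || !readLE(in, 4, size)) break;
            std::vector<char> payload(size);  // a real loader should cap `size`
            if (size > 0 && !in.read(payload.data(), size)) break;
            auto it = handlers_.find(static_cast<std::uint16_t>(id));
            if (it != handlers_.end()) { it->second(payload); ++handled; }
        }
        return handled;
    }

private:
    static bool readLE(std::istream& in, int bytes, std::uint32_t& out) {
        out = 0;
        for (int i = 0; i < bytes; ++i) {
            char c;
            if (!in.get(c)) return false;
            out |= static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << (8 * i);
        }
        return true;
    }

    std::map<std::uint16_t, Handler> handlers_;
};

// Helper for trying the loader out: encodes one chunk in the layout above.
std::string makeChunk(std::uint16_t id, const std::string& payload) {
    std::string out;
    out += static_cast<char>(id & 0xFF);
    out += static_cast<char>(id >> 8);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    for (int i = 0; i < 4; ++i) out += static_cast<char>((n >> (8 * i)) & 0xFF);
    return out + payload;
}
```

The appeal of this shape is that a loader can ignore chunks it does not understand, so older code can still read newer files. Swapping what gets loaded means registering a different set of handlers rather than changing the reading loop.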
Hi.

If you want a generic file loader that operates on grammars, then you want to write a parser. Writing a parser from scratch is crazy and not worth the effort. Instead, there are automated tools that spit out parser classes for you (based on a grammar, usually in BNF).

When I was in school taking a compilers course, we wrote a compiler in Java. We used JFlex to scan the input (split it into tokens). We then used CUP to check whether the tokens conform to the grammar. If I recall correctly, CUP also spit out an AST for us to walk.

There are equivalents in the C++ world. In fact, I'm pretty sure JFlex and CUP were based on existing C/C++ scanners and parsers. I seem to recall the prof mentioning something called "bison" and "yacc".

Anyway, for a generic file loader (if you are serious about writing one) I would suggest looking into topics such as "parsing" and "compiling". Just remember that you're in it not to compile programs, but to compile and store data.

But I also suggest you reconsider your actual needs for your projects. Do you really need a generic loader? Could you not whip up a loader more quickly, in a couple of days, given a file format? Worse yet, what if you do write a generic loader, but one day you need the loader to do something it can't do (e.g. load on demand or via the web)? Sometimes, reusing a logical format but writing special code for each project is the way to go. Sometimes.

I hope this was helpful.

-j
Jonathan Mak | queasy games | gate 88 | email: jon.mak@utoronto.ca
Quote:Original post by aaron_ds
Let me explain how I've progressed through ideas.
The first system I used involved self serialization. Each object implemented the right and left shift operators to write and read themselves to ofstreams and ifstreams. While this worked, it tied the file format to the objects' implementation. Although this worked, I knew there was a better way.


Have you checked out boost::serialization? I'm currently working on a similar lib myself, but with some changes to make things a bit nicer. (I'm using XML a lot, so I want a name/value pair to be a requirement, and having to use pair all the time is somewhat annoying. Additionally, I want to use the same serialization for networking, which requires a way to bind additional properties to data).

In any case, the method used is great for separating the file-format from the implementation, since all you get are a series of values, and how you actually write them to file is up to the class passed in. In fact, that's what gave me the idea for using it for networking as well. If a value isn't tagged for replication, it's skipped. A little more overhead than writing a separate function, but then again, it's not like gathering the data is the bottleneck for networking anyway :)
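To make the separation concrete, here is a minimal hand-rolled sketch of the idea. It is not Boost's actual API: `Player`, `TextWriter`, `TextReader`, `field`, `save`, and `load` are all hypothetical names, and the "key=value" text format is just an assumption chosen to keep the example short. The object lists its fields once as name/value pairs, and whichever archive is passed in decides what to do with them.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// The object describes its fields once, as name/value pairs. It knows nothing
// about the file format; that is entirely up to the archive passed in.
struct Player {
    std::string name;
    int health = 0;

    template <class Archive>
    void serialize(Archive& ar) {
        ar.field("name", name);
        ar.field("health", health);
    }
};

// Hypothetical writing archive: emits one "key=value" line per field.
class TextWriter {
public:
    template <class T>
    void field(const char* key, const T& value) { out_ << key << '=' << value << '\n'; }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;
};

// Hypothetical reading archive: parses all lines up front, then fills each
// field by name. Missing keys leave the field untouched. Values containing
// newlines are not supported in this sketch, and std::stoi throws on bad input.
class TextReader {
public:
    explicit TextReader(const std::string& text) {
        std::istringstream in(text);
        std::string line;
        while (std::getline(in, line)) {
            const auto eq = line.find('=');
            if (eq != std::string::npos) values_[line.substr(0, eq)] = line.substr(eq + 1);
        }
    }
    void field(const char* key, std::string& value) {
        const auto it = values_.find(key);
        if (it != values_.end()) value = it->second;
    }
    void field(const char* key, int& value) {
        const auto it = values_.find(key);
        if (it != values_.end()) value = std::stoi(it->second);
    }
private:
    std::map<std::string, std::string> values_;
};

template <class T>
std::string save(T& obj) { TextWriter w; obj.serialize(w); return w.str(); }

template <class T>
void load(T& obj, const std::string& text) { TextReader r(text); obj.serialize(r); }
```

An XML archive, a binary archive, or a network archive that skips fields not tagged for replication would each be another class with a `field` member, and `Player::serialize` would not change.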
@nuvem: Exactly. File loading and networking are very much related. Both must convert data between a linear representation (files and packets) and its representation within the program.

No, I don't want to write my own generic parser. Using boost::spirit and phoenix is confusing enough. I can't even begin to imagine writing something like that.

Thanks for mentioning boost::serialization. It sounds like a great library. Maybe I'll be able to incorporate it somehow.

@Queasy:
Quote:Sometimes, reusing a logical format, but writing special code for each project is the way to go. Sometimes.


I agree. I'm struggling to come up with the logical format part. I'm looking for parts of file loading that each format has in common. I can then write specializations for all the formats I want to deal with.
The parser doesn't really have to be incredibly specialized. :D

I've written a simple one that just loads a line of text and tokenizes it. Every time you call a specific function, it returns a token type that indicates what the parsed text is: NUMBER, STRING, and so on. It also stores the text so you can just grab it and do what is needed with it.

So, a parser doesn't have to be really hardcore, you can just leave it as something simple like this.
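A minimal sketch of that kind of tokenizer might look like the following. The names (`LineTokenizer`, `TokenType`, `next`, `text`) and the exact token rules are assumptions for illustration, not the code described above. Numbers are deliberately loose here: any run of digits and dots counts, and a leading minus sign comes out as a separate symbol.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

enum class TokenType { Number, String, Word, Symbol, End };

class LineTokenizer {
public:
    explicit LineTokenizer(std::string line) : line_(std::move(line)) {}

    // Advances to the next token and reports its type. The matched text is
    // stored and can be fetched afterwards with text().
    TokenType next() {
        text_.clear();
        while (pos_ < line_.size() && std::isspace(at(pos_))) ++pos_;
        if (pos_ >= line_.size()) return TokenType::End;

        if (std::isdigit(at(pos_))) {
            // Loose number rule: digits and dots (so "1.2.3" is not rejected).
            while (pos_ < line_.size() && (std::isdigit(at(pos_)) || line_[pos_] == '.'))
                text_ += line_[pos_++];
            return TokenType::Number;
        }
        if (line_[pos_] == '"') {
            ++pos_;  // skip the opening quote
            while (pos_ < line_.size() && line_[pos_] != '"') text_ += line_[pos_++];
            if (pos_ < line_.size()) ++pos_;  // skip the closing quote, if present
            return TokenType::String;
        }
        if (std::isalpha(at(pos_)) || line_[pos_] == '_') {
            while (pos_ < line_.size() && (std::isalnum(at(pos_)) || line_[pos_] == '_'))
                text_ += line_[pos_++];
            return TokenType::Word;
        }
        text_ += line_[pos_++];  // anything else is a one-character symbol
        return TokenType::Symbol;
    }

    const std::string& text() const { return text_; }

private:
    // <cctype> functions require a value representable as unsigned char.
    unsigned char at(std::size_t i) const { return static_cast<unsigned char>(line_[i]); }

    std::string line_;
    std::string text_;
    std::size_t pos_ = 0;
};
```

The caller loops on `next()` until it returns `TokenType::End`, switching on the type and reading `text()` whenever it needs the actual characters.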
Mort, Duke of Sto Helit: NON TIMETIS MESSOR -- Don't Fear The Reaper
@Endar: I've also considered using a tokenizer along with an FSM.
The tokenizer would split up the input (duh), and then each token would be passed into the FSM. Each FSM node would be bound to a functor that acts on the token it is given.

I started preliminary work on this concept, but instead of using something like boost::tokenizer, I used boost::spirit to tokenize the input. I got stuck when I was templating the token type. I was using a std::map<boost::any, FSMnode*> to map the input tokens to nodes, to determine which node to go to next. Ewww.

Using boost::tokenizer, where every output token is the same type, would be excellent. I can't believe I didn't think of that; I've used boost::tokenizer before.
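Here is a rough sketch of the tokenizer-plus-FSM idea with every token as a plain std::string, which avoids the boost::any key problem. To stay self-contained it splits on whitespace with a std::istringstream instead of using boost::tokenizer. The toy "key = value" grammar, the `"*"` wildcard convention, and the names `State`, `Transition`, and `parseKeyValues` are all assumptions for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>

enum class State { ExpectKey, ExpectEquals, ExpectValue };

// One edge of the machine: where to go next, plus a functor that acts on the
// token that triggered the edge.
struct Transition {
    State next;
    std::function<void(const std::string&)> action;
};

// Parses whitespace-separated "key = value" triples into `out`.
// Returns false if the token sequence does not fit the machine.
bool parseKeyValues(const std::string& input, std::map<std::string, std::string>& out) {
    std::string key;

    // Edges are keyed by (current state, token). "*" is this sketch's
    // convention for "any token that has no exact edge".
    std::map<std::pair<State, std::string>, Transition> table;
    table[{State::ExpectKey, "*"}] =
        {State::ExpectEquals, [&key](const std::string& t) { key = t; }};
    table[{State::ExpectEquals, "="}] =
        {State::ExpectValue, [](const std::string&) {}};
    table[{State::ExpectValue, "*"}] =
        {State::ExpectKey, [&key, &out](const std::string& t) { out[key] = t; }};

    State state = State::ExpectKey;
    std::istringstream in(input);
    std::string token;
    while (in >> token) {
        auto it = table.find({state, token});                  // exact edge first
        if (it == table.end()) it = table.find({state, "*"});  // then the wildcard
        if (it == table.end()) return false;                   // no edge: reject
        it->second.action(token);
        state = it->second.next;
    }
    return state == State::ExpectKey;  // input must not stop mid-triple
}
```

Because the table is plain data, loading a different format under this scheme would mean building a different table and set of functors while the driving loop stays the same.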

(Yeah, I'm a huge fan of Boost. Before this project I was scared of using libraries, even the STL. Oh, how I've grown.)

EDIT: There are so many ways to do file loading that it's hard to choose the 'right' one.
For my game engine, I designed my own XML_Parser class. I worked on this class forever, but I think it's finally *done*. XML files do add overhead because of the tags and such, but you can cut that down easily. I suggest looking into XML for all your file needs, because XML can store any type of data as long as it isn't binary, and you could always convert your XML file into a binary form later. If you're interested, I can give you some good pointers on how your file loading class should be designed, since I can say mine's pretty darn good [lol]. Anyway, I know you said:

Quote:
How do you handle file loading? I'm more interested in what techniques your file loader uses in order to translate file data into something your project can handle, not how to write a loader for a specific format.


But I haven't found anything I *can't* do with my XML parser, even though I can't modify the data and re-save it *yet*. I hope this helps, and good luck!

This topic is closed to new replies.
