Parsing a text file

9 comments, last by evolutional 19 years, 2 months ago
I'm beginning to implement a scripting system, but I'm rusty with string functions. An example (data) script might look something like:
SCRIPT FILE
<Police Car>
INT Driver 12345
STRING PoliceForce "LAPD"
<Vehicle>
<Axle>
<Wheel>
FLOAT radius 0.25
</Wheel>
<Wheel>
FLOAT radius 0.25
</Wheel>
</Axle>
<Axle>
<Wheel>
FLOAT radius 0.25
</Wheel>
<Wheel>
FLOAT radius 0.25
</Wheel>
</Axle>
</Vehicle>
</Police Car>
What string manipulation functions are going to be helpful here? I need to be able to read the file line by line, and from each line I should be able to tell:
  • it's the start/end of a block
  • what the block name is
  • if it's a data member
  • what the data type and value is
I'm not interested in implementing a super-flexible system using XML or anything like that. Just something simple, where I define the possible data types in C++ and every script must follow simple rules. Thank you for any help.
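For the hand-rolled route, here is a minimal sketch of how one line of the format above could be classified using only the standard library. All names (`LineKind`, `ParsedLine`, `parseLine`, `trim`) are made up for illustration, and it assumes every line is either a `<Name>` / `</Name>` marker or a `TYPE name value` triple.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical names throughout; none of these come from the original post.
enum class LineKind { BlockStart, BlockEnd, DataMember, Unknown };

struct ParsedLine {
    LineKind kind = LineKind::Unknown;
    std::string blockName;  // set for BlockStart / BlockEnd
    std::string type;       // e.g. "INT", "FLOAT", "STRING"
    std::string name;       // data member name
    std::string value;      // raw value text, surrounding quotes stripped
};

// Strip leading and trailing spaces, tabs and line endings.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

ParsedLine parseLine(const std::string& raw) {
    ParsedLine out;
    const std::string line = trim(raw);
    if (line.empty()) return out;

    // Block markers: <Name> opens, </Name> closes. Names may contain spaces.
    if (line.size() >= 3 && line.front() == '<' && line.back() == '>') {
        const bool closing = (line[1] == '/');
        const std::size_t start = closing ? 2 : 1;
        out.blockName = trim(line.substr(start, line.size() - start - 1));
        if (!out.blockName.empty())
            out.kind = closing ? LineKind::BlockEnd : LineKind::BlockStart;
        return out;
    }

    // Data member: TYPE name value. The value is the rest of the line.
    std::istringstream in(line);
    if (!(in >> out.type >> out.name)) return out;
    std::string rest;
    std::getline(in, rest);
    rest = trim(rest);
    if (rest.empty()) return out;  // no value present, leave as Unknown
    if (rest.size() >= 2 && rest.front() == '"' && rest.back() == '"')
        rest = rest.substr(1, rest.size() - 2);
    out.value = rest;
    out.kind = LineKind::DataMember;
    return out;
}
```

Reading the file is then a loop of `std::getline` over a `std::ifstream`, calling `parseLine` on each line and pushing or popping a block stack on `BlockStart` / `BlockEnd`. The raw `value` can be converted with `std::stoi` or `std::stof` depending on `type`.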
    That's more of a configuration file than a script.
    Chess is played by three people. Two people play the game; the third provides moral support for the pawns. The object of the game is to kill your opponent by flinging captured pieces at his head. Since the only piece that can be killed is a pawn, the two armies agree to meet in a pawn-infested area (or even a pawn shop) and kill as many pawns as possible in the crossfire. If the game goes on for an hour, one player may legally attempt to gouge out the other player's eyes with his King.
    I'd argue that it'd be simpler to implement this system using TinyXML than it would be to hand-parse the files yourself. In fact, I'm working on a similar system myself, basing things off a generic entity object and implementing a 'class' hierarchy structure for these entities.

    Your dataset could easily be restructured into XML:

    <entity class="policeCar">
      <attribute name="force" type="string">LAPD</attribute>
      <attribute name="driver" type="int">12345</attribute>
      <entity class="vehicle">
        <entity class="axle">
          <entity class="wheel">
            <attribute name="radius" type="float">0.25</attribute>
          </entity>
        </entity>
      </entity>
    </entity>


    I like this way as you can set up a loop in your entity factory to scan through all the entity elements, looking for the relevant script classes and adding/setting attributes and hierarchy structure appropriately.

    Of course, you could fully implement it in a similar way using named elements:

    <policeCar>
      <attribute name="force" type="string">LAPD</attribute>
      <attribute name="driver" type="int">12345</attribute>
      <vehicle>
        <axle>
      .... SNIP ....
    OK so the way it's written is easily converted to XML. But then I still have to write a parser, no? Or is this TinyXML a library for parsing XML? If so how does it look getting the information out of the file in C++ - do you have an example or good link?
    I'll want to add data like 3D vectors, how would that be shown as XML?
    Quote:Original post by smart_idiot
    That's more of a configuration file than a script.
    Yeah, that's why I said a (data) script - though a fairly loose set of rules would let the same parser load a data config file and a level script, no?
    Quote:Original post by d000hg
    OK so the way it's written is easily converted to XML. But then I still have to write a parser, no? Or is this TinyXML a library for parsing XML?


    TinyXML is a small, lightweight parser for XML. It transforms the XML into a DOM-style tree in memory allowing you to easily navigate the data.

    Quote:
    If so how does it look getting the information out of the file in C++ - do you have an example or good link?


    Take a look

    Quote:
    I'll want to add data like 3D vectors, how would that be shown as XML?


    Any way you want. I'd choose <vector x="12.3" y="45.6" z="78.9" /> myself.
    That was a useful article; I found some other information on TinyXML. It looks quite handy, so maybe I'll reconsider, although I want the option for these script files to be binary rather than text. Binary files are easier to parse, smaller to download, and make it harder for people to hack your game and cheat! Being able to load both a text script and a 'compiled' one would be best.
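To make the 'compiled' idea concrete, here is a rough sketch of writing one float data member to a binary stream and reading it back. The layout (a 32-bit name length, the name bytes, then the float), the names `FloatMember`, `writeMember`, `readMember` and the `kMaxNameLength` cap are all assumptions made for illustration. It writes values in host byte order, so files would not be portable between machines with different endianness.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>  // std::stringstream is handy for trying this in memory
#include <string>
#include <utility>

// Hypothetical record type: one named float, e.g. "radius" = 0.25.
struct FloatMember {
    std::string name;
    float value;
};

// Arbitrary sanity cap so a corrupt length field cannot trigger a huge allocation.
const std::uint32_t kMaxNameLength = 4096;

// Assumed layout: uint32 name length, name bytes, float value (host byte order).
void writeMember(std::ostream& out, const FloatMember& m) {
    const std::uint32_t len = static_cast<std::uint32_t>(m.name.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(m.name.data(), len);
    out.write(reinterpret_cast<const char*>(&m.value), sizeof m.value);
}

// Returns false on end of stream, truncated data, or an implausible length.
bool readMember(std::istream& in, FloatMember& m) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    if (len > kMaxNameLength) return false;
    std::string name(len, '\0');
    if (len > 0 && !in.read(&name[0], len)) return false;
    float value = 0.0f;
    if (!in.read(reinterpret_cast<char*>(&value), sizeof value)) return false;
    m.name = std::move(name);
    m.value = value;
    return true;
}
```

A loader that accepts both formats could look at the first few bytes of the file (a magic number, say) and pick the text or binary reader accordingly. Note that a binary file only raises the effort needed to tamper with the data; it does not prevent it.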
    One question about the XML thing - in the article he seems to force the elements to come in a certain order. Can I easily let things come in any order? For instance position and rotation elements of an object?
    Sure, you can do anything you need. In the article the elements can be read in any order. The way TinyXML works is that you effectively say "give me all elements that match 'alien'" and parse them, then move on to the next kind of element. So unless your application NEEDS a strict order to work, it doesn't matter.
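One way to picture the "order doesn't matter" point, independent of any XML library: look each element up by name in a table of handlers instead of expecting them in sequence. This is only a sketch; `GameObject`, `Element` and `applyElements` are hypothetical, and position and rotation are reduced to single floats to keep it short.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical object with two properties that may appear in any order.
struct GameObject {
    float position = 0.0f;
    float rotation = 0.0f;
};

// (element name, text value), as it might come out of any parser.
using Element = std::pair<std::string, std::string>;

// Applies every recognised element to obj and returns how many were handled.
// Unknown element names are skipped rather than treated as errors.
int applyElements(GameObject& obj, const std::vector<Element>& elements) {
    using Handler = std::function<void(GameObject&, const std::string&)>;
    static const std::map<std::string, Handler> handlers = {
        {"position", [](GameObject& o, const std::string& v) { o.position = std::stof(v); }},
        {"rotation", [](GameObject& o, const std::string& v) { o.rotation = std::stof(v); }},
    };

    int handled = 0;
    for (const Element& e : elements) {
        const auto it = handlers.find(e.first);
        if (it == handlers.end()) continue;
        it->second(obj, e.second);
        ++handled;
    }
    return handled;
}
```

Feeding it `position` then `rotation`, or the other way round, leaves the object in the same state. Adding a new property means adding one entry to the table.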

    As for the Binary option, TinyXML can't help you there. However, I recall seeing an article in one of the Game Programming Gems books in which they presented an XML reader that could be serialised to Binary. I'll see if I can dig out the ISBN of the book that it's in.
    I've gone one better than that, I've found the system the article was talking about. It's called XDS - eXtensible Data Stream. It can use both XML and XDS (I think) and has tools for setting up your project for XDS use.

    I've not used it so can't recommend it, but from reading the GPG4 article it certainly looks like it could be very useful for what you're trying to do.

    Also, you can get the Home Edition of XMLSpy for editing your XML. Looks decent, installing now =)

    [Edited by - evolutional on February 10, 2005 7:39:34 AM]
    Cool. I think I might just try using TinyXML for now. Learning how to parse would be useful at some point but I'd rather progress the game...
    One little question:
    <attribute name="attribute name" type="string">This is a string with <bold>bold</bold> text</attribute>

    How would this be structured in the DOM? We get an element with attributes, but what does the bold modifier do in the middle of the attribute?
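As a rough picture of how a DOM-style tree usually models this kind of mixed content: the outer element gets three children in document order, a text node, a `bold` element (which holds its own text node), and another text node. The sketch below uses a made-up `Node` type, not TinyXML's classes, and a real parser's whitespace handling may differ.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, minimal DOM node: either an element or a run of text.
struct Node {
    enum class Kind { Element, Text };
    Kind kind;
    std::string value;           // tag name for Element, characters for Text
    std::vector<Node> children;  // only used by elements
};

Node text(const std::string& s) {
    return Node{Node::Kind::Text, s, {}};
}

Node element(const std::string& tag, std::vector<Node> kids) {
    return Node{Node::Kind::Element, tag, std::move(kids)};
}

// Concatenates all text beneath a node, ignoring the markup around it.
std::string flattenText(const Node& n) {
    if (n.kind == Node::Kind::Text) return n.value;
    std::string out;
    for (const Node& child : n.children) out += flattenText(child);
    return out;
}

// The tree for the <attribute> line above, built by hand.
Node makeExample() {
    return element("attribute", {
        text("This is a string with "),
        element("bold", { text("bold") }),
        text(" text"),
    });
}
```

So the `bold` tag is not part of the string at all; it is a child element sitting between two text nodes. Code that only reads the first text child would see just "This is a string with ", while walking all children (as `flattenText` does) recovers the full sentence.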
