lo1989

xml

lo1989    122
Hi! I want to use XML for my game config and my GUI config. Until now I have used binary files, but for the redesign of my GUI system I want to switch to XML files because they are easier to change. Which (unmanaged) C++ XML parsers do you prefer? And is loading XML much slower than loading/reading my binary files? (They are only small files.)

paic    645
Hi,

If you just want a "readable" file format that allows fast editing of the GUI, and your files are quite simple, then you should write your own ASCII format. It will be faster and easier to implement.
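A minimal sketch of what such a home-grown ASCII format could look like, assuming one `key = value` pair per line and `#` for comments. The format itself and the names `trim` and `parseConfig` are made up for illustration; nothing here comes from an existing library.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing whitespace from a string.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines from a stream (hypothetical format).
// Everything after '#' is a comment; lines without '=' are ignored.
std::map<std::string, std::string> parseConfig(std::istream& in) {
    std::map<std::string, std::string> values;
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);  // drop comment
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;            // not a key/value line
        const std::string key = trim(line.substr(0, eq));
        if (key.empty()) continue;                        // "= value" has no key
        values[key] = trim(line.substr(eq + 1));
    }
    return values;
}
```

Pass it a `std::ifstream` opened on the config file, or a `std::istringstream` for testing. Values come back as strings, so numeric settings still need a conversion such as `std::stoi` on top.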

As for XML, I used MSXML. I think there are simpler XML libraries, but I didn't want to include another .lib and more headers in my project, so I used MSXML.

[Edited by - paic on November 10, 2005 4:40:23 AM]

Dean Harding    546
I believe MSXML 3.0 is included with IE6, but I don't think MSXML 4.0 is included with any system components (maybe IE7, but that's only in beta at the moment).

Still, since MSXML is COM-based, you don't actually have to link to any libraries, you just need the typelib or headers...

paic    645
eh, I shouldn't have specified "4.0" ^^
I'm just including <msxml2.h>. I never went into the details so I don't really know how it works internally. But as far as I know, it works on any Windows machine it has been tried on.

Red Ant    471
I normally use libxml2. It's a small, compact, validating XML parser / writer. I discovered it just at the right moment to keep me from going insane with Xerces *BARF*.

swiftcoder    18426
Quote:
Original post by lo1989
Or is it much slower than loading/reading my binary files, (only small files)?

For small files the difference is not really noticeable, but for large files it is a pain. My current project uses TinyXML to load the entire world structure: 40-50 XML files of several hundred lines each (generated by another tool; I didn't write them all by hand), which takes 45+ seconds on my machine. In contrast, I am switching to a binary format (file content debugging is largely over), which takes only 4-6 seconds to load the same data.
I recall simple ASCII files are somewhere in between (depending on the complexity of the formatting, of course).
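For anyone who wants to check numbers like these on their own data, a small timing helper is enough to compare loaders. This is only a sketch; the name `elapsedMs` is made up, and the loader you pass in is whatever your project already has.

```cpp
#include <chrono>
#include <utility>

// Run `work` once and return the elapsed wall-clock time in milliseconds.
template <typename Work>
double elapsedMs(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call it with a lambda that wraps each loader in turn, for example `elapsedMs([&] { loadWorldFromXml(); })`, where `loadWorldFromXml` stands in for your own function. Run each loader several times, since the first run also pays for reading the files from disk into the OS cache.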

ROBERTREAD1    100
Consider using SQLite instead of XML?

http://www.sqlite.org/

It should be a lot faster and smaller than XML, and a lot more resilient if you design your schema correctly and write good SQL.

I personally would only use XML for moving information between machines.

Simagery    732
Quote:
Original post by swiftcoder
My current project uses TinyXML to load the entire world structure: 40-50 XML files of several hundred lines each (generated by another tool; I didn't write them all by hand), which takes 45+ seconds on my machine. In contrast, I am switching to a binary format (file content debugging is largely over), which takes only 4-6 seconds to load the same data.


Really? 45+ seconds? I've not used TinyXML yet, but I was hoping for performance at least on par with a naive text file parser. What type of machine are you running on? Are the files coming off a CD, the network, or a local hard drive?

I'd be really curious to see a profile of that session to see what's soaking up all the processor cycles. Just a brief glance at the TinyXML source didn't seem to suggest that kind of performance hit (and if that's true, then I guess I'll have to mark it off my list).

Makes me think I may need to do some perf tests right away...

Jon Watte has a great library that does XML tokenization. It's definitely not "real" XML parsing, but depending on the complexity of your XML (how many XML features you leverage beyond tags and attributes) it may be a good fit for the runtime (it reads XML only). Basically, you give it a text buffer representing the XML and it tokenizes it (returns offsets and lengths, I believe) for each XML "node." It does it all in-line (just like strtok) thus requires no memory allocation (which makes it exceptionally fast).
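To make the idea concrete, here is a rough sketch of that style of in-place scanning. To be clear, this is not XMLSCAN's actual API; the names `Token`, `TokenKind` and `nextToken` are invented for illustration. It assumes very plain XML: it ignores attributes, does not handle comments, CDATA, entities or a `>` inside a quoted attribute value, and reports a self-closing tag as an open tag only.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

enum class TokenKind { OpenTag, CloseTag, Text, End };

// A token is just a window into the caller's buffer: nothing is copied.
struct Token {
    TokenKind kind;
    std::size_t offset;  // where the tag name or text run starts
    std::size_t length;  // how many characters it spans
};

// Scan the next token starting at `pos` and advance `pos` past it.
// No memory is allocated; the buffer is never modified.
Token nextToken(const char* buf, std::size_t size, std::size_t& pos) {
    assert(buf != nullptr);
    auto isSpace = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    while (pos < size && isSpace(buf[pos])) ++pos;  // skip whitespace
    if (pos >= size) return {TokenKind::End, size, 0};

    if (buf[pos] != '<') {
        // Text run: everything up to the next '<'.
        const std::size_t start = pos;
        while (pos < size && buf[pos] != '<') ++pos;
        return {TokenKind::Text, start, pos - start};
    }

    ++pos;  // skip '<'
    TokenKind kind = TokenKind::OpenTag;
    if (pos < size && buf[pos] == '/') {
        kind = TokenKind::CloseTag;
        ++pos;
    }
    // The tag name ends at whitespace, '/' or '>'.
    const std::size_t start = pos;
    while (pos < size && buf[pos] != '>' && buf[pos] != '/' && !isSpace(buf[pos])) ++pos;
    const std::size_t length = pos - start;
    // Skip the rest of the tag (attributes are ignored in this sketch).
    while (pos < size && buf[pos] != '>') ++pos;
    if (pos < size) ++pos;  // skip '>'
    return {kind, start, length};
}
```

You call `nextToken` in a loop until it returns `TokenKind::End`, and compare names with something like `std::strncmp(buf + t.offset, "widget", t.length)`. Because every token is only an offset and a length into the original buffer, the scan costs one pass over the text and no allocations.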


swiftcoder    18426
Quote:
Original post by Simagery
Really? 45+ seconds? I've not used TinyXML yet, but I was hoping for performance at least on par with a naive text file parser. What type of machine are you running on? Are the files coming off a CD, the network, or a local hard drive?

XML files are in the same folder as the app, no problems there.
Quote:
I'd be really curious to see a profile of that session to see what's soaking up all the processor cycles. Just a brief glance at the TinyXML source didn't seem to suggest that kind of performance hit (and if that's true, then I guess I'll have to mark it off my list).

I think the main problem is that an unformatted text dump of my average-size data from a single file takes ~100 kb, whereas the XML takes closer to ~750 kb, and the extra is all XML markup. So TinyXML has to build a large document tree in memory, and most of the time seems to be spent reading that markup.
Quote:
Makes me think I may need to do some perf tests right away...

I would like to see any results you find; it isn't necessarily a drawback of TinyXML itself.
Quote:
Jon Watte has a great library that does XML tokenization. It's definitely not "real" XML parsing, but depending on the complexity of your XML (how many XML features you leverage beyond tags and attributes) it may be a good fit for the runtime (it reads XML only). Basically, you give it a text buffer representing the XML and it tokenizes it (returns offsets and lengths, I believe) for each XML "node." It does it all in-line (just like strtok) thus requires no memory allocation (which makes it exceptionally fast).

This sounds like it would suit my needs much better than TinyXML, as I use none of XML's advanced features, and the overheads should be much lower. Do you have a link for it?

Simagery    732
Jon Watte's (hplus's) XMLSCAN library. I'd also recommend digging through the rest of his site. He's a very smart man and I respect his opinion when I hear it.

It surprises me because we use an XML parser (custom-written, but essentially fully conforming) internally that parses through hundreds of megabytes of XML, with complex href usage, and builds giant in-memory DOMs in 1/10th of that time. Of course, when you do that hundreds or thousands of times, the 4-5 seconds is still not fast enough, so we now optionally support a binary form of the data (essentially binary XML) that's more compact and loads almost instantaneously.

SQLite looks intriguing. Someone had mentioned "embedded SQL databases" and I had been meaning to look into them. It's hard to beat the expressiveness of a true relational database.
