Best XML Parser?

What's the best XML parser out there? My use case isn't very specific - I just need to parse files that my program has written out in XML. Is there anything else I need for this? (I doubt it, but I just want to make sure.)

Original post by Nitage
What do you mean by 'best'?

Easiest to use?
Most features?
XPATH support?

Smallest code size, at the expense of features and maybe performance?
Smallest memory footprint, sacrificing performance and "deluxe" features?
Fastest, even if it is bloated and/or not quite correct due to cutting corners?
Most correct implementation of the XML spec and related standards?

In what programming language?
On which operating systems?

XML parser selection should be goal-driven:

1) What should the whole XML input phase do for my program? Extracting simple data, building complex application objects, building a representation of the XML document for writing it back, transforming an XML document into another one? This determines what API style is suitable.
2) What libraries are available for my platform, and how well do they fit my application architecture?
3) Among the libraries that I can use without problems, which ones have the most useful combination of good performance, nice-to-have features, a pleasant API, and so on?

For example, the Java projects I develop at work mix JAXB (complex serialized objects, each read from and written to its own file), SAX (reading huge documents to build complex data structures), dom4j (acquiring configuration files for later querying), and a custom DOM-based framework for web services.

Parsing XML files is fairly easy, so you could develop a parser yourself within an hour using a parser generator such as ANTLR, flex/bison, ...

On the other hand, ask yourself whether you need to use XML at all.

Representing complex meshes with XML:
a) wastes space on the file side,
b) makes parsing a lot more difficult than parsing a simple token stream, and
c) the added parsing complexity increases load times a lot.

For example, I recently wrote a parser grammar for the q3radiant .map file format. The file I tested it with (200,000 lines) took around 48 seconds to parse, and most of that time was wasted parsing fixed-point floating-point numbers. That's due to the LL(k) (constant k) algorithm used in most parser generators: they only allow a constant lookahead, so you need to work with syntactic predicates.

On the other hand, if you only have to parse a text file that consists of a simple token stream, such as

tri <idx> <v1.x> <v1.y> <v1.z> <v2.x> <v2.y> <v2.z> <v3.x> <v3.y> <v3.z>

you can simply use the shift operators to read in the data, since istreams offer tokenization at whitespace.

One difficulty, however, is reading in strings in key/value pairs such as

keyname "some string here"

When reading this with an istream, it's better to use istream::get().

So before you consider using XML files for anything other than config files or shader definitions, make sure you think about the alternatives.

If you load in large datasets, a whitespace-tokenized format is worth considering. It requires a little more effort on your side (writing the parser), but you gain a whole lot of read performance.

I'm using libxml for my project. Works pretty well, though the documentation for it is kind of... challenging. :P You have to work with it a bit, but I really like it.
