[C++] XML parser and std::wstring

Hi. I'm looking for a good XML parser written in C++ that I could use with std::wstring (UTF-16) without additional hassle. For example, I could use libxml2, but that would force me to convert the returned data from UTF-8 to UTF-16 (with iconv), and I don't want to do that. So, any suggestions?

It's not hard to convert between the different UTF encodings yourself. You can read about the formats at www.unicode.org, or just use their example code. I've been looking into Unicode just recently, and I would advise against using wstring everywhere, because wchar_t is not 16 bits wide on every compiler: GCC's wchar_t is actually UCS-4 (32-bit). Great for compatibility, but a poor use of memory :)
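
To give an idea of how small the conversion can be, here is a minimal sketch of a UTF-8 to UTF-16 decoder. It is not the unicode.org sample code; the function name utf8_to_utf16 is made up for illustration, and it assumes a C++11 compiler so that it can return std::u16string, which is 16-bit on every platform (unlike std::wstring). It rejects malformed input by throwing rather than substituting a replacement character, which is only one of several reasonable policies.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from unicode.org or any library):
// decodes UTF-8 bytes into UTF-16 code units, throwing on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes;
    // anything smaller is an "overlong" encoding and is rejected.
    static const char32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points and values past U+10FFFF.
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On a compiler where wchar_t is 16 bits wide, the result can be copied into a std::wstring one unit at a time; where it is 32 bits wide, you would store the decoded code point directly and skip the surrogate step.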

That conversion code is a bit ugly, but thanks anyway. =) Actually, memory is the thing I need least; what I need is stability and ease of use. It seems that I have to write my own pseudo-XML parser. =)
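
For a sense of scale, a hand-written pseudo-XML parser that works directly on std::wstring can be fairly short. The following is only a sketch under stated assumptions: Node and MiniXml are hypothetical names invented here, and it handles just nested elements, double-quoted attributes, character data and self-closing tags. Entities, comments, CDATA, processing instructions and namespaces are all ignored, and names are limited to what std::iswalnum accepts in the current locale plus '_' and '-'.

```cpp
#include <cstddef>
#include <cwctype>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tree node, for illustration only.
// Note: std::vector of an incomplete type is formally guaranteed only since
// C++17, although common compilers accept it in C++11 mode as well.
struct Node {
    std::wstring name;
    std::map<std::wstring, std::wstring> attrs;
    std::wstring text;            // character data, concatenated
    std::vector<Node> children;
};

// Hypothetical recursive-descent parser over a std::wstring.
class MiniXml {
public:
    explicit MiniXml(const std::wstring& src) : s_(src), i_(0) {}

    Node parse() {
        skip_ws();
        Node root = element();
        skip_ws();
        if (i_ != s_.size()) fail("trailing data after root element");
        return root;
    }

private:
    [[noreturn]] void fail(const char* msg) const { throw std::runtime_error(msg); }
    bool at(wchar_t c) const { return i_ < s_.size() && s_[i_] == c; }
    void expect(wchar_t c) { if (!at(c)) fail("unexpected character"); ++i_; }
    void skip_ws() { while (i_ < s_.size() && std::iswspace(s_[i_])) ++i_; }

    std::wstring name() {
        const std::size_t start = i_;
        while (i_ < s_.size() &&
               (std::iswalnum(s_[i_]) || s_[i_] == L'_' || s_[i_] == L'-')) ++i_;
        if (i_ == start) fail("expected a name");
        return s_.substr(start, i_ - start);
    }

    Node element() {
        Node n;
        expect(L'<');
        n.name = name();
        skip_ws();
        while (!at(L'>') && !at(L'/')) {          // attributes: key="value"
            const std::wstring key = name();
            skip_ws(); expect(L'='); skip_ws();
            expect(L'"');
            const std::size_t start = i_;
            while (i_ < s_.size() && s_[i_] != L'"') ++i_;
            n.attrs[key] = s_.substr(start, i_ - start);
            expect(L'"');
            skip_ws();
        }
        if (at(L'/')) { ++i_; expect(L'>'); return n; }  // self-closing tag
        expect(L'>');
        for (;;) {                                 // content until "</"
            if (i_ >= s_.size()) fail("unterminated element");
            if (s_[i_] != L'<') { n.text += s_[i_++]; continue; }
            if (i_ + 1 < s_.size() && s_[i_ + 1] == L'/') break;
            n.children.push_back(element());
        }
        i_ += 2;                                   // skip "</"
        if (name() != n.name) fail("mismatched closing tag");
        skip_ws();
        expect(L'>');
        return n;
    }

    std::wstring s_;   // owned copy, so temporaries are safe to pass in
    std::size_t i_;    // current read position in s_
};
```

Calling MiniXml(L"<a x=\"1\"><b>hi</b><c/></a>").parse() would return a Node named "a" with one attribute and two children. Errors surface as std::runtime_error, which keeps the calling code simple at the cost of losing the position of the error.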

I recently wrote one using lex & yacc, and it was nearly complete. But I have lost patience with lex & yacc (flex & bison, actually), so I have been writing my own parser and am in the process of rewriting my various parsers so that they no longer use lex & yacc.

If you want to use mine once it has been revised, be my guest. It should only take a few days (because I'm wrangling with INTERNAL COMPILER ERRORs).

Original post by daerid
You could use Boost.Spirit for parsing

Spirit would be a good solution. Does anyone have a 'Spirit hello world'? I hate reading long documentation, so I'd prefer to learn from an example. =)

Spirit is actually very well documented, and the documentation is good reading. I do find that I get long compile times with boost::spirit, however.
Boost.Spirit user guide.

Here's a program that may parse very simple XML. It probably doesn't work, but you get the idea.

#include <boost/spirit/include/classic.hpp> //Spirit "classic"; Boost releases from 2005 used <boost/spirit.hpp>
#include <iostream>
#include <fstream>
#include <string>
using namespace boost::spirit::classic; //was namespace boost::spirit in those older releases
using namespace std;

//In practice, you'd probably want to parse XML
//into an abstract syntax tree.

//Define a custom actor that prints out its arguments.
struct print_actor{

    string name;

    print_actor(string const& name): name(name){}

    //Spirit calls this with the range of input that the parser matched.
    template<class IteratorT>
    void operator()(IteratorT begin, IteratorT end) const{
        cout << this->name << ": " << string(begin, end) << endl;
    }
};

print_actor print_a(string const& name){
    return print_actor(name);
}

//Grammars are used to allow rules to operate on
//different types of scanners.
struct xml_grammar: public grammar<xml_grammar>{

    template<class ScannerT>
    struct definition{
        typedef rule<ScannerT> rule_type;
        definition(xml_grammar const& self){

            element = opening_tag >> middle >> closing_tag;
            opening_tag = ('<' >> tag_name >> !(attributes) >> '>')[print_a("opening_tag")];
            closing_tag = ("</" >> tag_name >> ">")[print_a("closing_tag")];
            //Content is any mix of nested elements and plain text.
            middle = *(element | text);
            text = lexeme_d[+(anychar_p - '<')];
            name = lexeme_d[+alnum_p]; //Probably not adhering to xml grammar here,
            //but as I said, it's a simple parser
            tag_name = name; //a rule assigned from a rule refers to it
            attribute_name = name;
            attributes = +attribute;
            //Note: in Spirit, !p means "optional p", so "anything but a quote" is ~ch_p('"').
            attribute = attribute_name[print_a("attribute")] >> "=" >> lexeme_d['"' >> (*~ch_p('"'))[print_a("value")] >> '"'];
            //I think confix_p would work here too.
        }

        rule_type element, opening_tag, middle, closing_tag, text, name, tag_name, attribute_name, attributes, attribute;

        rule_type const& start() const{
            return element;
        }
    };
};

int main(int argc, char** argv){

    string s;
    ifstream ifs;
    istream* stream;

    //Read from stdin, or from the file named on the command line.
    if(argc == 1)
        stream = &cin;
    else{
        ifs.open(argv[1]);
        stream = &ifs;
    }

    string line;
    while(getline(*stream, line))
        s += line + '\n';

    xml_grammar g;
    //end_p skips trailing whitespace and then requires the end of input.
    if(parse(s.begin(), s.end(), g >> end_p, space_p).full){
        cerr << "Yayz. It worked." << endl;
        return 0;
    }
    cerr << "Unlucky meight, your XML sucks." << endl;
    return 1;
}

[Edited by - MrEvil on June 24, 2005 6:32:32 AM]

Well, I think this might help. It's an XML parser / system I wrote a while back. It uses a custom string object in a few places, but that's easily replaced by std::string or whatever you want (the STL wasn't available for the project I wrote this for, so a little tweaking here and there will clear it up).

Even if you don't use it directly, the load and loadXML functions of DOMDocument might help you out. (It's a DOM-style parser, sort of.)

Anyways, hope this helps somehow!
