XML Parsing

Basically, I have decided to try using XML as a way of building a scene in my 3D engine. I figured that, in conjunction with fx files, I could write a single app and then change the XML and fx files to give it a completely different look, graphics, physical properties, etc., without having to touch the code at all.

So, having a pretty flexible scene handler in place, I decided to start on the XML code, using the Apache Xerces parser. The problem is that I don't seem to be able to actually get much from the XML file with it (all the examples seem more interested in counting errors and printing statistics than in extracting and using data).

I was wondering if anyone here has used this parser (or any other, for that matter; I'm not 100% stuck on using this one) to extract information from the following XML sample. It is roughly what the final XML will be like, and it parses OK. I feel like I'm at my wits' end trying to get Xerces to get anything from the XML.
<?xml version="1.0" encoding="utf-8"?>
<root desc="Level1">
    <objects>
        <object>
            <Filename File="Sphere.x"/>
            <Position>
                <Pos x="0"/>
                <Pos y="0"/>
                <Pos z="0"/>
            </Position>
            <Properties>
                <Property Type="Dynamic"/>
                <Property Mass="100"/>
                <Property Shape="Sphere"/>
            </Properties>
        </object>
        <object>
            <Filename File="Plane.x"/>
            <Position>
                <Pos x="0"/>
                <Pos y="0"/>
                <Pos z="0"/>
            </Position>
            <Properties>
                <Property Type="Static"/>
                <Property Mass="100"/>
                <Property Shape="Box"/>
            </Properties>
        </object>
    </objects>
    <Shaders>
        <Filename File="Effects.fx">
            <Shader Code="VS_ZDRAW" Pass="1"/>
            <Shader Code="VS_LIGHTING" Pass="2"/>
        </Filename>
    </Shaders>
</root>

Kind regards,
Neil

[EDIT] I'm using C++ [/EDIT]

WHATCHA GONNA DO WHEN THE LARGEST ARMS IN THE WORLD RUN WILD ON YOU?!?!

[edited by - thedo on November 8, 2003 4:49:48 PM]

I'm totally confused as to what your problem is. You seem to be doing fine... Is the problem that you can't actually read the XML from the file, or is it a dry spell in creativity about what to put in it?

Well, I'm not entirely sure what to code.

I have this class definition:

XERCES_CPP_NAMESPACE_USE

class FIRE_API CFireXMLParseHandlers : public HandlerBase
{
public:
    void startElement(const XMLCh* const name, AttributeList& attributes);
};


and the code:


void CFireXMLParseHandlers::startElement(const XMLCh* const name, AttributeList& attributes)
{
}


But when I try to read the name parameter I just get garbage. When I look at the attribute data, none of it seems to be populated with anything.

Maybe I'm overriding the wrong method, or expecting the wrong thing from this.

I kind of expected to be able to go:

if name='Object'
{
    if attribute.name='Filename'

}

[EDIT] Obviously I wouldn't compare strings that way - just pseudocode [/EDIT]
But as my code seems to get no data, I can't even do that. I know it's looking at the correct file because, if I invalidate it, the parse crashes.

Neil

[edited by - thedo on November 8, 2003 7:31:56 PM]

I've cut and pasted some of my own code for parsing part of an XMLized GUI below. It uses a DOM parser (instead of a progressive parse or a SAX parser). The parser setup is a bit more complicated because it uses some special features of Xerces. Sorry, I can't go over everything in detail; it would become a 10-page tutorial. I hope looking at someone else's code might help. Good luck!

Try to read the Xerces documentation for all the classes.

1) First create the DOM parser:

int cWidgetFactory::createParser() {
    // create the parser and attach the error handler
    try {
        XMLPlatformUtils::Initialize();
    }
    catch (const XMLException & toCatch) {
        char * message = XMLString::transcode(toCatch.getMessage());
        fprintf(gui_errlog, "***Error during initialization! :\n %s\n", message);
        XMLString::release(&message);
        return -1;
    }

    Assert(parser==NULL, "xml parser already exists");
    parser = new XercesDOMParser();
    resolver = new cLyraResolver();
    parser->setEntityResolver(resolver);
    parser->setValidationScheme(XercesDOMParser::Val_Always);
    parser->setDoNamespaces(true);
    parser->setCreateCommentNodes(false);          // ignore comments
    parser->setIncludeIgnorableWhitespace(false);  // ignore whitespace
    parser->setCreateEntityReferenceNodes(false);  // convert entity ref nodes to inline text substitution

    Assert(parserErrHandler==NULL, "xml error handler already exists");
    parserErrHandler = new cParserErrorHandler();
    parser->setErrorHandler(parserErrHandler);

    // create the Writer
    // (the L"..." literals compile only where XMLCh is wchar_t, as on Windows)
    DOMImplementation * pDOMImplementation = DOMImplementationRegistry::getDOMImplementation(L"LS");
    if (!pDOMImplementation)
        return -1;

    writer = ((DOMImplementationLS*)pDOMImplementation)->createDOMWriter();
    if (!writer)
        return -1;

    // set some features on this serializer
    writer->setEncoding(L"UTF-8");
    if (writer->canSetFeature(XMLUni::fgDOMWRTDiscardDefaultContent, true))
        writer->setFeature(XMLUni::fgDOMWRTDiscardDefaultContent, true);
    if (writer->canSetFeature(XMLUni::fgDOMWRTFormatPrettyPrint, true))
        writer->setFeature(XMLUni::fgDOMWRTFormatPrettyPrint, true);

    writerErrHandler = new cWriterErrorHandler();
    writer->setErrorHandler(writerErrHandler);

    return 0;
}


2) Then parse in the DOM tree:
   
Assert(parser!=NULL, "trying to parse skin, parser doesn't exist yet");

// because we are doing multiple parses with the same instance of Xerces,
// and we want to recover from errors which may leave behind remnants of a document,
// be sure there is no current document.
parser->resetDocumentPool();
parser->resetErrors();

try {
    parser->parse(xmlSkinFile);
    if (parserErrHandler->errorsOccurred()) {
        parserErrHandler->resetErrorStatus();
        fprintf(gui_errlog, "*Terminating parse due to warnings/errors\n");
        fflush(gui_errlog);
        return -1;
    }
    // extract the root node
    rootNode = (DOMNode*)(parser->getDocument()->getDocumentElement());

    // idrefs is freed in endParse()
    idrefs.reserve(500);
}
catch (const XMLException & toCatch) {
    char * message = XMLString::transcode(toCatch.getMessage());
    fprintf(gui_errlog, "***XML Exception message: %s\n", message);
    XMLString::release(&message);
    return -1;
}
catch (const DOMException & toCatch) {
    char * message = XMLString::transcode(toCatch.msg);
    fprintf(gui_errlog, "***DOM Exception message:\n %s\n DOMException code is: %d\n",
            message, toCatch.code);
    XMLString::release(&message);
    return -1;
}
catch (...) {
    fprintf(gui_errlog, "***Unexpected Exception Occurred\n");
    return -1;
}


3) Get the root node:

rootNode = (DOMNode*)(parser->getDocument()->getDocumentElement());


4) From the root node you can use:

DOMNode * objectsElement = rootNode->getFirstChild();
DOMNode * shadersElement = objectsElement->getNextSibling();
DOMNode * objectElement = objectsElement->getFirstChild();
char buffer[256];
while (objectElement != 0) {
    DOMNode * fileElement = objectElement->getFirstChild();

    // Xerces uses Unicode (UTF-16) as its native encoding, so attribute values
    // come back as XMLCh strings. getAttribute() is a DOMElement method, hence the cast.
    // (L"File" compiles only where XMLCh is wchar_t, as on Windows.)
    const XMLCh * filename = static_cast<DOMElement*>(fileElement)->getAttribute(L"File");

    // transcode to the native code page
    XMLString::transcode(filename, buffer, 255);

    // now buffer holds the ASCII value of the File="" attribute
    DOMNode * positionElement = fileElement->getNextSibling();

    // ... etc ...

    // advance to the next <object>
    objectElement = objectElement->getNextSibling();
}
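
As a rough illustration of why the transcode step matters (and probably why reading the name parameter directly as a char string looks like garbage): XMLCh is a 16-bit UTF-16 code unit, not a char. The sketch below is plain C++ with no Xerces in it. narrowAscii is a hypothetical stand-in for what XMLString::transcode does, and it only handles 7-bit characters, so treat it as a picture of the idea rather than a replacement.

```cpp
#include <string>

// Hypothetical helper (not part of Xerces): narrows a null-terminated UTF-16
// string to a plain char string. Characters outside 7-bit ASCII become '?'.
std::string narrowAscii(const char16_t* text) {
    std::string out;
    for (; *text != 0; ++text) {
        // Each UTF-16 code unit is 16 bits wide; copy only what fits in ASCII.
        out.push_back(*text < 0x80 ? static_cast<char>(*text) : '?');
    }
    return out;
}
```

With real Xerces code you would call XMLString::transcode on the XMLCh string instead, and release the returned buffer with XMLString::release as in the error handling above.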


edit - fixed up the code a bit

edit - editing a post with multiple source tags merges the source tags!

[edited by - Z01 on November 8, 2003 8:53:41 PM]

The loop in the last bit of code needs to advance to the next object on each pass, with a line like:

objectElement = objectElement->getNextSibling();

(If I re-edit the above post, I'll have to fix all the source tags again because of the problems the forums have with multiple source tags in one post.)

It's better to work with a DTD and let the parser validate the XML, instead of using the code you posted:

if name='Object'{
if attribute.name='Filename'
...
}

Remember that without a DTD, you can't eliminate whitespace nodes so you have to test for them when analyzing your DOM tree.
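
A minimal sketch of that whitespace test, assuming a simplified node type. Node, NodeType, and the helper names here are hypothetical and are not Xerces classes; with Xerces the equivalent check would compare DOMNode::getNodeType() against DOMNode::ELEMENT_NODE.

```cpp
#include <string>
#include <utility>

// Hypothetical, minimal stand-in for a DOM node (not the Xerces DOMNode class).
enum class NodeType { Element, Text };

struct Node {
    Node(NodeType t, std::string v)
        : type(t), value(std::move(v)), firstChild(nullptr), nextSibling(nullptr) {}
    NodeType type;
    std::string value;   // tag name for elements, raw text for text nodes
    Node* firstChild;
    Node* nextSibling;
};

// Returns n itself if it is an element, otherwise the next sibling that is one.
Node* skipToElement(Node* n) {
    while (n != nullptr && n->type != NodeType::Element) {
        n = n->nextSibling;
    }
    return n;
}

// Element-only versions of the first-child / next-sibling walk.
Node* firstChildElement(Node* parent) { return skipToElement(parent->firstChild); }
Node* nextSiblingElement(Node* n) { return skipToElement(n->nextSibling); }
```

Walking the tree through helpers like these means the line breaks and tabs between tags never show up as the "first child" by accident.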

[edited by - Z01 on November 8, 2003 9:13:01 PM]

Wow, thanks. I'll have a look and try to digest it.

A quick question about DOM parsing: from what I understand, SAX is event driven (i.e. OnStartElement events, etc.), whereas I assume DOM isn't. I'm assuming (from a quick glance at your code) that DOM parses the document and builds a tree structure. Would this assumption be correct?

I was considering using a DTD, but I already have a good SGML parser, which I would have used instead. I've some experience of SGML DTDs, but not XML DTDs/Schemas. I may look into this later.

Many thanks for the push in the right direction.

Neil

There are two types of parsers:

DOM - This parses the whole XML document into memory and creates a tree. You traverse it as a tree and linked list.

SAX - This is event-driven parsing, as you mentioned, used to access the XML serially. This is good for things like counting the number of a particular tag in a document. It's also good if you're processing really large XML documents, because the whole tree doesn't need to be in memory at once.
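
To make the event-driven style concrete, here is a toy sketch of the tag-counting case. It is not a real XML parser and it is not the Xerces SAX API: scanStartTags and countTag are hypothetical names, and the scanner ignores comments, CDATA sections, and entities. It only shows the shape of the idea, which is that the scanner walks the text once and calls your code for each start tag without keeping a tree.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Toy event source: calls onStart(name) for every start tag in the text.
void scanStartTags(const std::string& xml,
                   const std::function<void(const std::string&)>& onStart) {
    for (std::size_t i = 0; i < xml.size(); ++i) {
        if (xml[i] != '<') continue;
        std::size_t j = i + 1;
        // Skip end tags, the XML declaration, and <!...> constructs.
        if (j >= xml.size() || xml[j] == '/' || xml[j] == '?' || xml[j] == '!') continue;
        const std::size_t start = j;
        while (j < xml.size() && xml[j] != '>' && xml[j] != '/' &&
               !std::isspace(static_cast<unsigned char>(xml[j]))) {
            ++j;
        }
        onStart(xml.substr(start, j - start));
        i = j - 1;  // resume scanning right after the tag name
    }
}

// The "handler": counts how many start tags have the given name.
int countTag(const std::string& xml, const std::string& tag) {
    int count = 0;
    scanStartTags(xml, [&](const std::string& name) {
        if (name == tag) ++count;
    });
    return count;
}
```

A real SAX handler works the same way in spirit: the parser owns the loop and your startElement override is the callback.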

With DOM you can also do something called a progressive parse, where the document is parsed a piece at a time instead of in one call.

In nearly all XML APIs, DOM parsers are built upon the SAX parser. DOM is higher level and more appropriate for the type of application you are writing, so you should use it if possible to save yourself a lot of grief.
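
One way to picture a tree being built on top of events is the sketch below. It is an illustration under simplifying assumptions, not how Xerces is actually implemented: Element and TreeBuilder are hypothetical names, and attributes and text are left out. Each start event creates a node under the currently open element, and each end event closes it.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: just a name and owned children.
struct Element {
    explicit Element(std::string n) : name(std::move(n)) {}
    std::string name;
    std::vector<std::unique_ptr<Element>> children;
};

// Turns a stream of start/end events into a tree.
class TreeBuilder {
public:
    // Start tag: create a node, attach it to the open element, make it current.
    void startElement(const std::string& name) {
        std::unique_ptr<Element> node(new Element(name));
        Element* raw = node.get();
        if (open_.empty()) {
            root_ = std::move(node);
        } else {
            open_.back()->children.push_back(std::move(node));
        }
        open_.push_back(raw);
    }

    // End tag: the enclosing element becomes current again.
    void endElement() {
        if (!open_.empty()) open_.pop_back();
    }

    const Element* root() const { return root_.get(); }

private:
    std::unique_ptr<Element> root_;
    std::vector<Element*> open_;  // stack of elements that are still open
};
```

Once the events stop, root() is the whole document in memory, which is the trade-off described above: easier to walk, but everything is held at once.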

The actual DOM and SAX APIs are nearly a standard. If you know how to use one, then you can pick up others very easily (except Microsoft's, which decided to mix up COM with their parser).

Try reading "The XML Thread" section on this page:

http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/index.html

It will teach you about DTDs. Yes, the link is Java related, but lots of the information carries over to C++ easily. Most people shy away from DTDs, but once you learn how to use them, you'll realize that you were wasting a lot of time doing error checking that the parser could do for you.
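
For the sample XML at the top of the thread, a DTD might look roughly like the sketch below. The content models are guesses from the two objects shown, not something the original poster specified, and the file name scene.dtd is made up for illustration. Note that Filename is used in two ways in the sample (empty under object, containing Shader elements under Shaders), so one declaration has to cover both.

```dtd
<!-- scene.dtd (hypothetical name), referenced from the XML with:
     <!DOCTYPE root SYSTEM "scene.dtd"> -->
<!ELEMENT root (objects, Shaders)>
<!ATTLIST root desc CDATA #REQUIRED>

<!ELEMENT objects (object*)>
<!ELEMENT object (Filename, Position, Properties)>

<!ELEMENT Filename (Shader*)>
<!ATTLIST Filename File CDATA #REQUIRED>

<!ELEMENT Position (Pos+)>
<!ELEMENT Pos EMPTY>
<!ATTLIST Pos x CDATA #IMPLIED
              y CDATA #IMPLIED
              z CDATA #IMPLIED>

<!ELEMENT Properties (Property*)>
<!ELEMENT Property EMPTY>
<!ATTLIST Property Type  (Dynamic | Static) #IMPLIED
                   Mass  CDATA #IMPLIED
                   Shape CDATA #IMPLIED>

<!ELEMENT Shaders (Filename*)>
<!ELEMENT Shader EMPTY>
<!ATTLIST Shader Code CDATA #REQUIRED
                 Pass CDATA #REQUIRED>
```

Because every container here has element-only content, a validating parser can tell that the whitespace between tags is ignorable, which is what lets a setting like setIncludeIgnorableWhitespace(false) drop it from the tree.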
