Guys and Girls,
Looking for some feedback on an XML reader that I've thrown together. Simply put, it reads the file in full and, using vectors, recreates the structure of the XML file in memory, which can then be searched.
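To make the layout concrete, here is a rough sketch of the kind of structure described, assuming a C++11 compiler. The member names nodeType, elements and childNodes match the function further down; the XMLELEMENT fields (name, value) and the FindFirst search helper are assumptions made up for illustration, not the actual code.

```cpp
#include <string>
#include <vector>

// Hypothetical layout: the field names of XMLELEMENT are assumptions.
struct XMLELEMENT {
    std::string name;
    std::string value;
    explicit XMLELEMENT(const std::string& n, const std::string& v = std::string())
        : name(n), value(v) {}
};

struct XMLNODE {
    std::string nodeType;              // tag name, e.g. "mesh"
    std::vector<XMLELEMENT> elements;  // attributes and text values
    std::vector<XMLNODE> childNodes;   // nested tags, in document order
};

// Hypothetical search: depth-first, returns the first node whose tag name
// matches, or nullptr if there is none.
const XMLNODE* FindFirst(const XMLNODE& node, const std::string& type) {
    if (node.nodeType == type) return &node;
    for (const XMLNODE& child : node.childNodes) {
        if (const XMLNODE* hit = FindFirst(child, type)) return hit;
    }
    return nullptr;
}
```

With this layout, FindFirst(root, "mesh") would walk the tree from root and hand back a pointer to the first "mesh" node it meets.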
I am using a COLLADA file as a benchmark, specifically the larger of the two files in the rar at this location: http://www.wazim.com/Downloads/AstroBoy_Walk.rar. To be clear, that is astroBoy_walk_Max.DAE.
The file is 1.5 MB decompressed and comprises ~14,000 lines of XML-structured data.
Currently, based on my basic benchmarking, I'm reading the file and creating the structure in 9.5 seconds on average. I feel this is too slow, especially if I plan to read in a number of these files. To be fair, though, this is a marked improvement over my first effort, which took 45 seconds on average to read the same file.
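For anyone who wants to compare numbers, a timing like this can be taken with something along the lines of the sketch below. It assumes a C++11 compiler with <chrono>; TimeSeconds is a hypothetical helper, and the callable passed in would wrap the read-and-parse step.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in seconds.
template <typename Work>
double TimeSeconds(Work work) {
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Usage would be something like TimeSeconds([&]{ root.PopulateNode(contents); }), where root and contents are placeholder names for the top-level node and the file contents.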
// Member function of XMLNODE; requires <string> and <vector>.
// The file contents are taken by const reference so the string is not copied on every call.
int PopulateNode(const std::string& fC){
    const std::string::size_type npos = std::string::npos;
    std::string::size_type lStart = 0, lEnd = 0;
    int tagsOpen = 0;
    std::vector<XMLNODE*> nodeStack;
    nodeStack.push_back(this);
    // Skip leading whitespace (and blank lines), then take everything up to the next newline.
    while(((lStart = fC.find_first_not_of(" \t\r\n", lStart)) != npos) && ((lEnd = fC.find('\n', lStart)) != npos)){
        std::string curLine = fC.substr(lStart, lEnd - lStart);
        if (curLine.at(0) != '<'){
            //no open tag found, but there is data on the line... approach as though elements of the node.
            // curLine is already the extracted line; indexing it with the file offsets
            // (lStart, lEnd) would read the wrong range or throw.
            nodeStack.back()->elements.push_back(XMLELEMENT("value", curLine));
        }else{
            switch (curLine.at(1)){
                case '/':
                {
                    //close tag
                    nodeStack.pop_back();
                    tagsOpen--;
                    break;
                }
                case '?':
                {
                    //question mark found; nothing is stored for these lines
                    break;
                }
                default:
                {
                    //not closed/not question
                    std::string::size_type bSBoundE = curLine.find('>');
                    // Split the text between '<' and '>' on spaces: entry 0 is the tag name, the rest are attributes.
                    std::vector<std::string> tempStrVec = SubStrDelim(curLine.substr(1, bSBoundE - 1), ' ');
                    std::string closeTag = std::string("</").append(tempStrVec.at(0));
                    std::string selfClose = "/>";
                    if(!nodeStack.back()->nodeType.empty()){
                        // Current node is already in use, so this tag becomes a new child of it.
                        nodeStack.back()->childNodes.push_back(XMLNODE());
                        nodeStack.push_back(&nodeStack.back()->childNodes.back());
                    }
                    nodeStack.back()->nodeType = tempStrVec.at(0);
                    for(std::vector<std::string>::size_type i = 1; i < tempStrVec.size(); ++i){
                        nodeStack.back()->elements.push_back(XMLELEMENT(tempStrVec.at(i)));
                    }
                    std::string::size_type closeTagPos = curLine.find(closeTag);
                    std::string::size_type selfCloseTag = curLine.find(selfClose);
                    if(closeTagPos != npos){
                        // Open and close tag on the same line: the text between them is the value.
                        nodeStack.back()->elements.push_back(XMLELEMENT("value", curLine.substr(bSBoundE+1, closeTagPos-bSBoundE-1)));
                    }
                    if((closeTagPos == npos) && (selfCloseTag == npos)){
                        //no close tag on line.
                        nodeStack.back()->childNodes.push_back(XMLNODE());
                        nodeStack.push_back(&nodeStack.back()->childNodes.back());
                        tagsOpen++;
                    }else{
                        nodeStack.pop_back();
                    }
                    break;
                }
            }
        }
        lStart = lEnd + 1;
    }
    return 0;
}
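SubStrDelim is not shown above. A minimal sketch of what it is assumed to do (split a string on a single delimiter character, skipping empty pieces) might look like this; the signature and the skipping of empty pieces are guesses from how it is called, not the original helper.

```cpp
#include <string>
#include <vector>

// Assumed behaviour: split `text` on `delim`, dropping empty pieces.
std::vector<std::string> SubStrDelim(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= text.size()) {
        std::string::size_type end = text.find(delim, start);
        if (end == std::string::npos) end = text.size();  // last piece runs to the end
        if (end > start) parts.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}
```

Note that splitting on spaces this way also splits any attribute value that itself contains a space, for example name="my mesh".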
In this case the input is a char* / std::string holding the full file, read into memory.
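A sketch of one way to get the whole file into a std::string before parsing is below. ReadWholeFile is a hypothetical name, not the actual loading code, and it assumes the file fits comfortably in memory.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the entire file as one string
// (an empty string if the file cannot be opened).
std::string ReadWholeFile(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return std::string();
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the whole stream into the string buffer
    return buffer.str();
}
```

The file is opened in binary mode here, so any \r\n line endings reach the parser unchanged.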
I appreciate this might not be the tidiest or most error-safe code. I'm working on getting it reading first and then adding error trapping; for now I'm using the errors (if any are generated) to stop me in my tracks, as I'm doing this rapid style.
To head off questions of "why not use a library out there?": I'm using this as a learning experience, and would like to learn ways of improving my own code rather than using someone else's.
So I'd like to ask for some C&C, pointers, or even just an idea of how long it takes others to read in this same file, to see whether my expectations are unfair.
Thanks.