Processing a file full of XML

2 comments, last by mrmrcoleman 15 years, 10 months ago
Hey guys,

I have a file full of XML with seemingly random carriage returns and spaces; basically it's one big mess. What I need to do is find every instance of <a>foo, then skip forwards in the file and output the matching <d>bar</d> for that record, ignoring any carriage returns or whitespace encountered on the way.

I believe my saving grace here is that every XML record has the same fields. Do you guys have any ideas on how I can achieve this on the command line in Linux? I could hack together a small tool, but I'd really rather avoid that!

Thanks in advance for any help on this,
Mark
I'd do it in Python. :)
Use TinyXPath, or a similar XPath library. A solution that doesn't go through a real XML parser is pretty much guaranteed to be fragile and hackish, and programmers who know how XML works will grumble about it later. Don't be that guy.
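To make the two suggestions above concrete, here is a rough sketch in Python using the standard library's xml.etree.ElementTree, which supports a small subset of XPath. It rests on several guesses, since the thread never shows the real data: each record is an element with direct <a> and <d> children, the file has no XML declaration, and "<a>foo" means the text of <a> equals "foo" once surrounding whitespace is trimmed. The function name matching_d_values and the <rec> element name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def matching_d_values(xml_text, a_value="foo"):
    """Return the trimmed text of <d> for every record whose <a> equals a_value."""
    # Wrap the input in a synthetic root so that a file holding several
    # top-level records still parses as a single document.
    # Assumption: the file does not start with an XML declaration.
    root = ET.fromstring("<root>" + xml_text + "</root>")

    results = []
    # Treat any element that has direct <a> and <d> children as a record.
    for record in root.iter():
        a = record.find("a")
        d = record.find("d")
        if a is None or d is None:
            continue
        # Stray newlines and spaces inside the tags are trimmed before comparing.
        if (a.text or "").strip() == a_value:
            results.append((d.text or "").strip())
    return results


if __name__ == "__main__":
    # Hypothetical messy input, with whitespace scattered between and inside tags.
    sample = """
    <rec>
      <a>foo
      </a><b>x</b>
         <c>y</c><d>
      bar</d>
    </rec>
    <rec><a>other</a><b>x</b><c>y</c><d>baz</d></rec>
    <rec><a> foo</a>
    <d>qux</d></rec>
    """
    print(matching_d_values(sample))  # ['bar', 'qux']
```

Because the parser deals with whitespace between tags, the carriage returns stop mattering. If <a> only needs to begin with "foo", the equality check can be swapped for startswith.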
Thanks for the feedback, guys. I managed it in the end with a combination of sed and awk, Googling and learning just enough to get by.

It's not pretty, but I didn't have to do any coding, and I've learned something!

