TheChubu

Why is XML all the rage now?


Well, that's the question.

 

It seems like every tool promotes its support of some XML-based format and everyone is supposed to go all like "omg xml! yay!" yet I can't quite fathom why it is such a big deal.

 

I don't particularly consider XML a "pretty" format for writing your configuration files. It uses so many symbols, and not in a particularly "easy to look at" way.

 

What do you think?


There are so many existing tools and libraries for working with XML that you don't have to do much of your own implementation -- for most languages you can be writing or parsing XML files within a few minutes and only a few lines of code.
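As a rough illustration of the "few lines of code" point, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The `<config>` document and the helper names `read_window_size` and `write_config` are made up for this example; they don't come from any particular tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document, invented for this example.
CONFIG_XML = '<config><window width="800" height="600"/><title>Demo</title></config>'


def read_window_size(xml_text):
    """Parse the document and return the window size as (width, height)."""
    root = ET.fromstring(xml_text)
    window = root.find("window")
    return int(window.get("width")), int(window.get("height"))


def write_config(width, height, title):
    """Build the same structure from scratch and serialize it to a string."""
    root = ET.Element("config")
    ET.SubElement(root, "window", width=str(width), height=str(height))
    ET.SubElement(root, "title").text = title
    return ET.tostring(root, encoding="unicode")


print(read_window_size(CONFIG_XML))
print(write_config(1024, 768, "Demo"))
```

Most mainstream languages ship or have an equivalent library, so the amount of code should be in the same ballpark elsewhere.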

 

JSON or YAML are both becoming popular alternatives.

 

//EDIT: What SiCrane said -- XML was all the rage a couple of years back, and because of that there are loads of tools and libraries supporting it, but I'd agree with the feeling that JSON and YAML are now being preferred by most people.

Edited by jbadams


I had to check the date of this post; I thought it had been necroed.

 

I'm sure the tools support XML mostly for legacy reasons.  Some people refuse to change, and there are people who still think XML is the best thing ever.

Edited by alnite


Aww man. Well, in my defense, 2/3 of the tools you find have some sort of XML file for configurations/formats/serialization/etc (say, build tools, or formats like COLLADA).

 

It's just that it looked too ugly for me to think "Oh, I see what problem this is solving!"

 

I'm glad it's being phased out. I knew a bit about JSON but YAML looks much, much better.


From a professional standpoint, a lot of businesses still use XML as their data markup language. I think that carries over into the gaming industry as well simply because of the sheer number of objects that support it.


YAML is taking off? I have yet to see anything use it. JSON though, yes, it seems that lately everybody and their dog is using JSON. Probably because we're in full HTML5 mode, and JSON data is valid JavaScript, so using it is a no-brainer as you don't even need a parser (I wonder if anybody understands the implications of loading data as code though).
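The "data as code" worry is about JavaScript, but the same trade-off can be sketched in Python: a real JSON parser only ever builds data, whereas handing the text to an evaluator would run whatever it contains. This is only an analogy under that assumption; the payloads and the `load_untrusted` helper are invented for the example.

```python
import json


def load_untrusted(text):
    """Parse text strictly as JSON data; nothing in it is ever executed."""
    return json.loads(text)


good = '{"name": "player", "hp": 100}'
# An expression rather than JSON data. An evaluator such as eval() would run it.
hostile = '__import__("os").getcwd()'

print(load_untrusted(good)["hp"])

try:
    load_untrusted(hostile)
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```

The parser rejects the second payload because it is not JSON, which is the safety you give up when the data is loaded by evaluating it.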

 

Personally, I prefer INI files anyway (well, INI-like at least). Yeah, call me old-fashioned, but they're a lot easier to deal with. XML is good when you need tree-style nesting, but most of the time you don't, really (and even then, those using XML more often than not abuse it resulting in ridiculously complex formats for no real reason).
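For comparison, here is roughly what reading an INI-style file looks like with Python's standard `configparser` module. The `[window]` section, its keys, and the `load_settings` helper are hypothetical names picked for this sketch.

```python
import configparser

# Hypothetical settings file contents, invented for this example.
INI_TEXT = """
[window]
width = 800
height = 600
fullscreen = no
"""


def load_settings(text):
    """Read the [window] section and convert each value to a proper type."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    window = parser["window"]
    return {
        "width": window.getint("width"),
        "height": window.getint("height"),
        "fullscreen": window.getboolean("fullscreen"),
    }


print(load_settings(INI_TEXT))
```

The format is flat (sections of key/value pairs), which is exactly why it stays simple as long as no tree-style nesting is needed.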


Many of the tools that are using XML were likely written several years ago when XML was more popular and there were no standard alternatives. Rewriting their configuration to use an alternative format might seem like a waste of time that might be better spent on the tool's core.


Yes, that does make sense. I guess that in a few years we'll see more tools with current markup (or similar) languages.

 

Anyway, thanks for the answers! At least I know I'm not the only one who doesn't like to look at XML :D

Edited by TheChubu


I hold the firm belief that given time, the open-source world will achieve its ultimate goal of reducing every piece of software in the world down to operations on a key/value store (see the rise of plist, JSON, Lua, and NoSQL).

 

Then we can resurrect the INI file, and be done with it.


XML is falling out of favor, but it's still used in a lot of commercial software, particularly .NET/Java software, where ease of serialization makes it very accessible. YAML and JSON are saner options for simpler structured data files, and are typically good enough for simpler use cases like configuration files. Importantly, both of those formats are a lot less verbose, and the support libraries are much smaller than for XML.

 

On the other hand, YAML and JSON are no substitute for XML in many applications, where more robust data formatting is needed, or where, you know... you're actually doing markup. There are a lot more tools available for XML -- XSLT for one, XSL for another, and being able to verify the document structure with DTDs for a third.

 

XML is far from a bad technology, it's just the sledgehammer everyone seems to use to hang pictures on their walls.

I've been working with YAML lately and have come to the conclusion that it needs to die screaming. I have not found a single library (other than libyaml in C) which follows the spec correctly (I need a library for .Net - several are available but none of them work). The spec itself is incredibly difficult to read, making it hard to write your own library. YAML written out by existing tools is often incompatible with other tools (indicating that one or the other isn't following the spec).

JSON is widely supported. Libraries usually just work or are easy to fix if they don't work. It's easy to write a library from scratch due to the extremely simple rules.

XML is extremely well supported and it's hard to find libraries that have show-stopping bugs in them. Writing an XML library is much harder than JSON but you usually don't need to.

I don't LIKE any of these formats, but XML and JSON at least are fairly easy to work with if you need to.

Edited by Nypyren


Points about JSON > XML aside...

 

As a game developer, the big deal to me is that it's a flexible standardized text format which means:

  • I don't have to create my own libraries to read, write, or navigate it.
  • At least for the purposes of developing and debugging tools that use, generate, or convert it, it's human readable.
  • It's diff-able and potentially merge-able, which to me makes it first-class revisionable.
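To make the diff-able point concrete, here is a small sketch that diffs two revisions of a text data file with Python's standard `difflib`. The `<level>` documents, the file labels, and the `text_diff` helper are invented for the example.

```python
import difflib

# Two hypothetical revisions of the same data file.
OLD = """<level>
  <enemy type="orc" hp="30"/>
  <enemy type="bat" hp="5"/>
</level>
"""
NEW = """<level>
  <enemy type="orc" hp="45"/>
  <enemy type="bat" hp="5"/>
</level>
"""


def text_diff(old, new):
    """Return a unified diff of the two revisions as a list of lines."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile="level.xml (old)", tofile="level.xml (new)", lineterm=""))


for line in text_diff(OLD, NEW):
    print(line)
```

Only the changed `hp` line shows up as removed and added, which is what makes line-oriented text formats comfortable under version control. A binary format would need a dedicated diff tool to get the same result.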


The open source engine I use came with solid XML support that is open source and works well. I know how to use it to print out valid properly formatted XML. Thus XML works extremely well for any purpose in the engine since it can read it and write it.

 

I agree that the verbosity is somewhat annoying. But it's widely supported, easy to use and understand, and has free libraries.

 

It's just so convenient. Maybe it's slow or bad for large documents or w/e. But it's worth it.


FWIW, some of my stuff uses S-Expressions... (mostly for structured data: ASTs, world delta-messages, ...)

 

I had considered a few ideas a few times for other concise syntax designs partly combining S-Expressions and XML, but haven't done much with it.

 

some of my stuff also uses a binary serialized (and Huffman coded) variation of S-Expressions.

a few other things use a binary serialized XML variant.

 

there are tradeoffs either way. the main advantage S-Expressions have is that their in-memory form can be a lot easier and more efficient to work with. it takes a bit of work to get good performance from a DOM-like system (and generally involves "tricks"), whereas S-Expressions can be handled straightforwardly and moderately efficiently by implementing Lisp-like APIs.
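To show how little machinery an S-Expression reader needs, here is a minimal sketch in Python that turns the text form into nested lists. It is not the implementation described in this post: the function names are invented, atoms are left as plain strings, and there is no handling of quoted strings or unbalanced parentheses.

```python
def tokenize(text):
    """Split the input into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front of the list and return one expression."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    return token  # an atom, kept as a string


def read_sexpr(text):
    """Parse a single S-Expression into nested Python lists."""
    return parse(tokenize(text))


tree = read_sexpr("(add (mul 2 3) x)")
print(tree)
```

The in-memory result is just lists of lists, so walking or rewriting it needs nothing beyond ordinary list operations.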

 

sometimes, the key/value nature of XML attributes is useful, and doing similar using S-Expressions is less efficient than with explicit key/value pairs.

this is one area XML (and JSON) have an advantage.

 

one option is to extend S-Expressions, potentially ending up with a format like EDN.

 

another option is mostly to strip down and streamline the XML notation, and constrain/tweak it in a few ways to permit a more efficient implementation.

 

...

 

 

a lot depends on use-case though.

generally, I was using things more for structured data, such as world-delta messages and compiler ASTs.

 

OTOH, for configuration files I have most often used line-oriented command-driven text formats (the analogy being using batch-files or shell-scripts as config files).


I'm still using XML because it works, because I'm used to the library (TinyXML) that I'm using it with, because I have existing code that makes use of it, and because it doesn't make any real difference.

 

XML is ugly? Sure, but who told you that you're entitled to look at the files? People always seem to imply that every file must be inspected in a hex or text editor, everything must be human readable, and everything that remotely looks like one might be able to edit by hand must be edited by hand. Why?

 

XML is overly complicated, redundant, bloated, etc...? Read again the last paragraph. You need not look at it if you don't like it. You need not edit it, The Program will read/write its data just fine without you interfering.

 

XML takes way too much storage space? Wait, did you hear that? That's the world's saddest song playing on the world's smallest violin. Seriously, you have an office package installed that takes half a gigabyte of disk space only for a text editor and a spreadsheet, you have 2 TiB of MP3s on your hard disk, and you worry whether a puny XML file is 4 kiB or 8 kiB? Tell you what, there is WinZip if you need to worry about 4 kiB. Right, the XML files in your content pipeline aren't precisely 4 kiB, they're more like 40 MiB. Good grief, I'm shocked.

 

Sure, there are other formats that are more comprehensible and more space-efficient. And sure, I'd rather use msgpack or protocol buffers when data has to go over a wire. If I was starting from zero, I'd probably choose something different for on-disk storage, too.

But as it is, for most things, XML works well enough. It isn't pretty, but who cares.

Edited by samoth


IMHO XML is the better-suited format for representing data. JSON is worthwhile when you use JavaScript, or want to save some bits (AJAX traffic). JSON comes at a cost, namely the flexibility and the technology that XML offers:

XSD - you can validate the data easily before reading it with your application.

XSLT - you can transform the data into almost any other representation. You can even create JSON out of your XML with a pretty small XSL-Transformation, or merge multiple XML-Files to create a new one. There are almost no limits!

XPath - you can search/access single/multiple fields of data. This can be used in your code, in XSLTs or just by other tools (editors, IDEs for example).
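As a small taste of the XPath point, Python's standard `xml.etree.ElementTree` supports a limited subset of XPath, which is enough for simple lookups. The `<level>` document and the two helper functions are invented for this sketch; full XPath, XSD validation and XSLT need a third-party library and are not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical data file, invented for this example.
LEVEL_XML = """
<level>
  <enemy type="orc" hp="30"/>
  <enemy type="bat" hp="5"/>
  <item type="potion"/>
</level>
"""

root = ET.fromstring(LEVEL_XML)


def enemy_types(root):
    """Select every <enemy> child of the root and return its type attribute."""
    return [enemy.get("type") for enemy in root.findall("./enemy")]


def find_by_type(root, kind):
    """Select any element, at any depth, whose type attribute equals kind."""
    return root.findall(f".//*[@type='{kind}']")


print(enemy_types(root))
print([element.tag for element in find_by_type(root, "potion")])
```

The same path strings could be used from an editor or another tool, which is the reuse this post is pointing at.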

 

Your thoughts:

XML takes a lot of space! - Use compression. The difference from JSON is pretty small after that.

XML is painful to edit! - Use a proper editor with auto-completion and auto-validation. JSON with a deep hierarchy is also not easy to edit due to a lot of brackets. JSON also requires escaping more characters than XML, which can cause a lot of problems when editing by hand.
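The compression claim is easy to measure yourself. The sketch below writes the same made-up records as JSON and as attribute-style XML, then compares raw and zlib-compressed sizes. The record layout and the `sizes` helper are assumptions for the example, and the actual numbers will depend on your data, so treat it as a way to check rather than as proof.

```python
import json
import zlib

# Made-up records, written out once as JSON and once as XML.
records = [{"id": i, "name": f"item{i}", "weight": i % 7} for i in range(200)]

json_text = json.dumps(records)
xml_text = "<items>" + "".join(
    f'<item id="{r["id"]}" name="{r["name"]}" weight="{r["weight"]}"/>'
    for r in records
) + "</items>"


def sizes(text):
    """Return (raw bytes, zlib-compressed bytes) for the given text."""
    raw = text.encode("utf-8")
    return len(raw), len(zlib.compress(raw))


for label, text in (("JSON", json_text), ("XML", xml_text)):
    raw, packed = sizes(text)
    print(f"{label}: {raw} bytes raw, {packed} bytes compressed")
```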



XML is overly complicated, redundant, bloated, etc...? Read again the last paragraph. You need not look at it if you don't like it. You need not edit it, The Program will read/write its data just fine without you interfering.

It matters though. If you make the structure complex, the program will become just as complex. Granted, the issue is not so much with XML, but rather with the fact it got abused like crazy, but that still makes the point stand. Just because something can be relegated to a program doesn't mean it's going to be easier to maintain.

 


XML takes way too much storage space? Wait, did you hear that? That's the world's saddest song playing on the world's smallest violin. Seriously, you have an office package installed that takes half a gigabyte of disk space only for a text editor and a spreadsheet, you have 2 TiB of MP3s on your hard disk, and you worry whether a puny XML file is 4 kiB or 8 kiB? Tell you what, there is WinZip if you need to worry about 4 kiB. Right, the XML files in your content pipeline aren't precisely 4 kiB, they're more like 40 MiB. Good grief, I'm shocked.

Just checked. About 6GB for 8385 files, and I know there's some redundancy there. If you have 2TB in music you probably have other matters to worry about (especially since 2TB are some of the largest hard disks available - 3TB is not that common yet).

 

And that kind of thinking is what results in modern computers feeling just as crap as early ones even though they're thousands of times more powerful (or in the case of memory, millions of times). I know some stuff does indeed require more power, but this idea that we should waste resources just because we can waste is just plain stupid.


And that kind of thinking is what results in modern computers feeling just as crap as early ones even though they're thousands of times more powerful (or in the case of memory, millions of times). I know some stuff does indeed require more power, but this idea that we should waste resources just because we can waste is just plain stupid.

QFE.

My quad-core i7 should be able to launch Microsoft Word faster than a 386 in the mid-90s. And yet... it takes 10x longer.

How much of that is due to picking inferior approaches just "because"?


What is so horribly bloated with:

 

<someobject arg0="value" arg1="value"/>

 

The quotation marks? The slash?

Not much to argue about I think...

 

Sure, a bit more bloat if the object needs a variable amount of sub-objects (like an array of whatever):

 

<someobject arg0="value" arg1="value">
  <subobject arg0="value"/>
  <subobject arg0="anothervalue"/>
</someobject>
 
But I think it is rather easy-to-read...  And any sane editor will help you with closing tags and such...
 
If you only need the JSON-style data definitions, you don't really need much more from XML than that...
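To back that up, here is a rough sketch that turns the attribute-style XML above into JSON-style data with Python's standard library. The `element_to_dict` helper and the shape of the resulting dictionaries are one possible mapping chosen for this example, not a standard one.

```python
import json
import xml.etree.ElementTree as ET

# The attribute-style example from this post.
SNIPPET = """
<someobject arg0="value" arg1="value">
  <subobject arg0="value"/>
  <subobject arg0="anothervalue"/>
</someobject>
"""


def element_to_dict(element):
    """Map an element to a dict holding its tag, attributes and children."""
    node = {"tag": element.tag, "attributes": dict(element.attrib)}
    children = [element_to_dict(child) for child in element]
    if children:
        node["children"] = children
    return node


data = element_to_dict(ET.fromstring(SNIPPET))
print(json.dumps(data, indent=2))
```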
 
Then of course you can fuck up and make something like:
 
<obj type="someobject">
<argument name="arg0" value="value"/>
<argument name="arg1" value="value"/>
<subobjects>
  <obj type="subobjecttype">
    <argument name="arg0" value="value"/>
  </obj>
  <obj type="subobjecttype">
    <argument name="arg0" value="anothervalue"/>
  </obj>
 
</subobjects>
</obj>
 
But that's just stupid.... and not really XML's fault...
Edited by Olof Hedman
