## YAML vs JSON vs XML?

22 replies to this topic

### #1 stein102  Members

Posted 20 August 2013 - 04:32 PM

I'm working on a 2D RPG in Java using LibGdx, and need to find a way to store my game data. The things I've considered are YAML, JSON and XML. I only have experience in working with XML, but the others can't be too hard to learn. What I need to achieve is being able to define items to be generated by listing the components that the item would have and my EntityGenerator would turn them into game objects.

Additionally, I would like to be able to build myself an "Item creator" utility where I can use a GUI to simply check off the components to add for my item and it would add that into the data file.

Any suggestions?

Edited by stein102, 20 August 2013 - 04:32 PM.

### #2 runnerjm87  Members

Posted 20 August 2013 - 05:03 PM

I can't really speak to the pros/cons of YAML; I wrote a quick and dirty parser for it a while ago for a webapp, but that's my entire experience. As far as JSON is concerned, if you're not using Javascript, there isn't much point. The reason that JSON is so nice in web development is that it is instantly recognized by the browser's JS interpreter as an object so there's no additional parsing necessary. This is only true in Javascript, though. Any other language requires a custom library to interpret JSON.

Personally, I would say work with XML. For one thing, you already know it. For another, a lot of people already know it, so if you end up bringing another person in on your project, they won't be stuck trying to familiarize themselves with a new data format (and if the person you're bringing in isn't at least somewhat comfortable with XML, you might need to ask if they're a good fit).

### #3 Ravyne  Members

Posted 20 August 2013 - 05:47 PM

Those are the three semi-popular go-to choices.

The downside to YAML, as I've heard in a thread similar to this one, is that of the few library options that exist, no two implement the format in exactly the same way. Thus YAML is only loosely standardized in practice.

JSON is widely popular and relatively compact, but lacks real tooling.

XML is also widely popular, and has an extensive tools ecosystem (editors, validation via DTDs, XPath, transformations via XSLT), but XML formats tend to be very, very verbose. Verbosity is fine during development, but it's burdensome when you've got to write that data to every user's disk, or push it over the wire to them.

Lua can also be used for data interchange, in addition to scripting duties.

The question, really, is what do you actually need? XML is a structured data format language, whereas JSON and YAML are (loosely) structured data interchange formats. In other words, you can use XML to define complex, hierarchical, rigid, versionable data formats, and then use them to exchange data using that format; JSON or YAML is structured in a sense, but only by convention -- your program only understands it as well as it's kept up with the latest changes, and no alarm bells go off when unexpected data is found.

Personally, I like XML for some things as a content-authoring format, but I dislike it as a content-delivery format. If you use XML to author content, you can use XSLT to transform the authoring format into a format that's more suitable for delivery, which might be JSON or a stripped-down, lighter-weight XML schema -- perhaps JSON data wrapped in a CDATA section within an XML root node that's just used for versioning.
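
A minimal sketch of that authoring-to-delivery step in Java (the language the original poster is using), assuming the XSLT 1.0 processor that ships with the JDK behind `javax.xml.transform`. The `<items>` wrapper element, the stylesheet, and the compact `id:name;` output format are all made up for illustration; a real pipeline might emit JSON or a slimmer XML instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DeliveryTransform {
    // Hypothetical stylesheet: turns every <item> under <items> into "id:name;".
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/items\">"
      + "<xsl:for-each select=\"item\">"
      + "<xsl:value-of select=\"@id\"/>:<xsl:value-of select=\"@name\"/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to verbose authoring XML and returns the compact delivery text.
    static String toDelivery(String authoringXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(authoringXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }
}
```

Running this as a build step keeps the verbose, tool-friendly XML on the authoring side only, while the game ships whatever the stylesheet emits.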

throw table_exception("(ノ ゜Д゜)ノ ︵ ┻━┻");

### #4 stein102  Members

Posted 20 August 2013 - 06:39 PM

I think I'll stick to XML, as was suggested above. No need to learn extra technologies to achieve something that can be done with a format that I already know.

Does this look like a good structure to you? Or is there a better way?

<?xml version="1.0"?>
<item>
  <id>1</id>
  <name>sword</name>
  <components>
    <wieldable></wieldable>
  </components>
</item>


### #5 Ravyne  Members

Posted 20 August 2013 - 06:52 PM

I think at the very least you could abbreviate that format -- remember that XML is really meant to be a semantic format -- I might do something like the following:

<item id="1" name="sword" category="weapon">
  <description> Just a regular sword.</description>
  <components>
    <wieldable />
  </components>
</item>

There's quite a bit of art to designing an XML schema, but good designs use elements and properties (attributes, in XML terms) together in a way that reduces redundancy and encourages correct use. You can define your format using a DTD and validate such files before they touch your game or content pipeline.

Keep in mind that <item> might or might not be a valid element in your schema depending on how much commonality all the different kinds of items have with one another. Say, for example, that it makes no sense for some sub-class of items to have any components, but another sub-class of items might require at least one component. You could enforce this just in the DTD, but it could make maintaining the DTD complicated -- it might be better to not have a generic <item> element, but instead have elements for the different subsets, say <weapon> and <potion>. Like I said, there's some art and intuition behind these kinds of decisions, just think carefully about them rather than tossing the first thing that comes to mind together.
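
As a rough illustration of validating against a DTD before a file reaches the game, here is a sketch using the JDK's built-in DOM parser. The DTD itself is hypothetical (it requires a `<description>` and at least one component inside `<components>`, and invents a `consumable` component), and it is prepended as an internal subset only to keep the example self-contained; it assumes the item XML passed in has no XML declaration of its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdCheck {
    // Hypothetical DTD for the <item> layout shown above.
    static final String DOCTYPE =
        "<!DOCTYPE item [\n"
      + "<!ELEMENT item (description, components)>\n"
      + "<!ATTLIST item id CDATA #REQUIRED name CDATA #REQUIRED category CDATA #IMPLIED>\n"
      + "<!ELEMENT description (#PCDATA)>\n"
      + "<!ELEMENT components (wieldable | consumable)+>\n"
      + "<!ELEMENT wieldable EMPTY>\n"
      + "<!ELEMENT consumable EMPTY>\n"
      + "]>\n";

    // Returns true only if the item parses and satisfies the DTD.
    static boolean isValid(String itemXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Without a handler, validity errors are only reported, not thrown.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + itemXml)));
            return true;
        } catch (Exception e) {
            // A sketch: real code would report e.getMessage() to the content author.
            return false;
        }
    }
}
```

With this DTD, an item whose `<components>` element is empty is rejected, which is the kind of rule that gets awkward to maintain once different item sub-classes need different constraints.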

Edited by Ravyne, 20 August 2013 - 07:00 PM.


### #6 stein102  Members

Posted 20 August 2013 - 08:02 PM

I'm not fully sure what I'll need for my items yet, as I'm just starting development. I like the idea of not using a generic <item> element, but instead using elements for the different categories of items there could be. Do you have any other suggestions I should keep in mind? I'm sure I'll be running into complications in no time otherwise.

### #7 Ravyne  Members

Posted 20 August 2013 - 10:03 PM

Well, like I said, it's all very dependent on your particular situation. The closest things I can give as rules of thumb are:

• Include versioning information in your root element for maximum robustness. An application that reads your XML file should be able to understand all schema versions within a major version number -- it may have too little information (the program should choose reasonable defaults) or too much information (the program should ignore what it doesn't expect), so the experience may be degraded, but strive to make it work.
• Increment the minor version number with each schema change.
• Increment the major version number when the schema change is such that an application reading the file can no longer provide reasonable defaults for necessary information (e.g. a breaking change in the schema).
• If a thing that exists in your schema can have children, or large and/or complex content, make that thing an element.
• If a thing that exists in your schema can have siblings under the same parent element, make that thing an element.
• If a set of things that exist in your schema are logical siblings (such as components); that is, they represent the same concept, but do not share an element name, consider creating an element whose only job is to contain those things (like <components> above).
• If a thing that exists in your schema does not have children and is small, simple, or otherwise relatively "atomic", it's a good candidate for being a property.
• Don't be afraid of using properties to store data that you have to parse later. Examples in common use are properties containing universal timestamps, or JavaScript expressions in the On<X> event handlers in HTML. However, if this data could get long-winded, consider allowing that property to be defined by a child element (or other element, by way of reference) as an option to maintain good readability.

The trouble, really, is that how to apply the decision criteria I've given above is often not immediately clear, especially as requirements evolve. That's really the best argument I can give you for aggressive versioning, and robust handling of incomplete/extra information in the XML file.
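
The major/minor versioning rules above could be sketched in Java roughly as follows. Everything here is an assumption made for illustration: the `major.minor` string format of the version attribute, the supported version constants, and the enum names.

```java
class SchemaVersion {
    // Hypothetical: the newest schema version this build of the game was written against.
    static final int SUPPORTED_MAJOR = 1;
    static final int SUPPORTED_MINOR = 2;

    enum Compatibility { FULL, OLDER_USE_DEFAULTS, NEWER_IGNORE_EXTRAS, UNREADABLE }

    // Classifies a version string such as "1.3" taken from the root element.
    static Compatibility check(String versionAttr) {
        String[] parts = versionAttr.trim().split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected major.minor, got: " + versionAttr);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        if (major != SUPPORTED_MAJOR) {
            return Compatibility.UNREADABLE;          // breaking schema change
        }
        if (minor < SUPPORTED_MINOR) {
            return Compatibility.OLDER_USE_DEFAULTS;  // file may lack data: fill in defaults
        }
        if (minor > SUPPORTED_MINOR) {
            return Compatibility.NEWER_IGNORE_EXTRAS; // file may have extra data: skip it
        }
        return Compatibility.FULL;
    }
}
```

A loader would call `check` on the root element's version attribute first and refuse the file only in the `UNREADABLE` case, degrading gracefully in the other two.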

Edited by Ravyne, 20 August 2013 - 10:07 PM.


### #8 DaBono  Members

Posted 21 August 2013 - 02:17 AM

As for the verbosity of XML, it is fairly simple to use zlib to compress the XML. In the case of network transfer, it may be well worth spending the extra CPU cycles to send an XMLZ file.

You then have the tooling of XML in a fairly (albeit not the most) compact format.
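
In Java this needs no extra library, since `java.util.zip` implements the zlib format. A minimal round-trip sketch (the class and method names are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class XmlZ {
    // Compresses XML text into zlib-format bytes.
    static byte[] compress(String xml) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray(); // closing the stream above flushed the final block
    }

    // Restores the original XML text from zlib-format bytes.
    static String decompress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because XML repeats the same tag and attribute names over and over, item files like the ones in this thread tend to compress well.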

### #9 markr  Members

Posted 21 August 2013 - 07:31 AM

Remember the various security pitfalls of parsing XML from an untrusted source. If you're not doing that, then you're fine though.
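
For reference, the best-known pitfall is external entity expansion (XXE) through a DOCTYPE. A hedged sketch of a locked-down parser in Java follows; it assumes the parser bundled with the JDK, because the `disallow-doctype-decl` feature URI is a Xerces feature that other parser implementations may not recognize. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class SafeXml {
    // Parses XML from an untrusted source, rejecting any document that carries a DOCTYPE.
    static Document parseUntrusted(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Xerces-specific: refuse DOCTYPE declarations, which rules out entity tricks.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setXIncludeAware(false);
            factory.setExpandEntityReferences(false);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("rejected XML input", e);
        }
    }
}
```

Note that this conflicts with DTD-based validation of the same input, so it only suits files that are validated some other way (or not at all).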

### #10 lerno  Members

Posted 21 August 2013 - 08:47 AM

As far as I can tell, many systems are moving towards json for config etc. I can't speak for the rest, but I personally switched from xml (or xml-plists actually) to json because editing (and finding errors) is so much easier in json. Not to mention that json can be much more compact while still retaining readability.

My vote's for json.

### #11 nevS  Members

Posted 21 August 2013 - 11:32 AM

...
There's quite a bit of art to designing an XML schema, but good designs use elements and properties together in a way that reduces redundancy and encourages correct use. You can define your format using a DTD and validate such files before they touch your game or content pipeline.
...

+1, but use XSD instead of DTD for validation purposes. DTD isn't formally deprecated, but XSD is far more expressive (data types, namespaces).

As far as I can tell, many systems are moving towards json for config etc. I can't speak for the rest, but I personally switched from xml (or xml-plists actually) to json because editing (and finding errors) is so much easier in json. Not to mention that json can be much more compact while still retaining readability.

My vote's for json.

Well, in my experience it's a lot easier to spot formatting errors in XML than in json. If your text editor did not spot it already, just use XSD validation; it tells you exactly where the error is. Finding errors in data content is even harder in json.
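
A small sketch of XSD validation with the JDK's `javax.xml.validation` API, showing how the validator reports the position of the first error. The schema is hypothetical and deliberately tiny (an `<item>` with a `<description>` child, a numeric `id` and a `name`); it also shows a data-content check that a DTD cannot express, namely that `id` must be a positive integer.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class XsdCheck {
    // Hypothetical schema for a simplified <item>.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"item\">"
      + "<xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name=\"description\" type=\"xs:string\"/>"
      + "</xs:sequence>"
      + "<xs:attribute name=\"id\" type=\"xs:positiveInteger\" use=\"required\"/>"
      + "<xs:attribute name=\"name\" type=\"xs:string\" use=\"required\"/>"
      + "</xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns null if the document is valid, otherwise "line:column message" for the first error.
    static String firstError(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber() + " " + e.getMessage();
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("could not run validation", e);
        }
    }
}
```

An item with `id="abc"` is well-formed XML, so a plain parser accepts it, but `firstError` returns a message pointing at the offending location.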

### #12 Olof Hedman  Members

Posted 21 August 2013 - 01:18 PM

I think at the very least you could abbreviate that format -- remember that XML is really meant to be a semantic format -- I might do something like the following:

<item id="1" name="sword" category="weapon">
  <description> Just a regular sword.</description>
  <components>
    <wieldable />
  </components>
</item>

I think you could abbreviate that a bit further, by removing the <components> tag, since it is kind of redundant. Any child elements of "item" could be seen as components of the item, and the "description" is one of them.

Personally, I try to put everything in tags and attributes too, since I find it easier to write the parser for that with TinyXML.

So I'd write

<description text="Just a regular sword."/>

That way, you also avoid the end tags, and get the format a little bit more brief.

So this:

<item id="1" name="sword" category="weapon">
  <description text="Just a regular sword."/>
  <wieldable/>
</item>

or something...

That's one problem with xml, so many ways to do it.

But I like it.

Edited by Olof Hedman, 21 August 2013 - 01:25 PM.

### #13 Serapth  Members

Posted 21 August 2013 - 01:30 PM

As far as JSON is concerned, if you're not using Javascript, there isn't much point. The reason that JSON is so nice in web development is that it is instantly recognized by the browser's JS interpreter as an object so there's no additional parsing necessary. This is only true in Javascript, though. Any other language requires a custom library to interpret JSON.

This is becoming increasingly less true. Many of the NoSQL databases, for example, expect and return JSON. More and more web services are moving from XML to JSON as well. There are JSON parsers for more and more languages, often many times lighter than XML parsers. Frankly, if you aren't having to describe/discover the data's type, XML is overkill.

Since you are working in Java (C++ is another option here), I would also consider checking out Protocol Buffers. They are ultra lightweight and, better yet, can actually generate Java code to create a Builder for the type you define.

### #14 BGB  Members

Posted 21 August 2013 - 05:07 PM

As for the verbosity of XML, it is fairly simple to use zlib to compress the XML. In the case of network transfer, it may be well worth spending the extra CPU cycles to send an XMLZ file.

You then have the tooling of XML in a fairly (albeit not the most) compact format.

this works pretty good IME.

though, granted, it is possible to get the speed/compression tradeoff a little better with a specialized binary serialization, but this is a bit more work (IOW: may involve working with entropy-coded bitstreams, ...).

a minor issue with XML+Deflate is that it isn't by itself ideally suited to transmitting a stream of messages, so it generally needs to be wrapped in some sort of container. in a typical example, this will be a tag value, followed by a length, followed by the compressed data.
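
A sketch of such a container in Java, assuming a 4-byte tag and a 4-byte big-endian length in front of each payload. The tag value, the size limit, and the class name are all hypothetical; the payload is treated as opaque bytes (for instance, already-deflated XML).

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class Framing {
    // Hypothetical tag: the ASCII bytes "XMLZ" packed into one int.
    static final int TAG_XMLZ = 0x584D4C5A;
    // Hypothetical sanity limit so a corrupt length cannot trigger a huge allocation.
    static final int MAX_PAYLOAD = 16 * 1024 * 1024;

    // Writes one message as: tag, payload length, payload bytes.
    static void writeFrame(DataOutputStream out, int tag, byte[] payload) throws IOException {
        out.writeInt(tag);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Reads one message and returns its payload, failing if the tag or length looks wrong.
    static byte[] readFrame(DataInputStream in, int expectedTag) throws IOException {
        int tag = in.readInt();
        if (tag != expectedTag) {
            throw new IOException("unexpected tag: 0x" + Integer.toHexString(tag));
        }
        int length = in.readInt();
        if (length < 0 || length > MAX_PAYLOAD) {
            throw new IOException("implausible payload length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload); // blocks until the whole payload has arrived
        return payload;
    }
}
```

The explicit length is what lets the reader find the start of the next message without having to decompress anything first.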

in some cases, it may also make sense to "escape code" the data, such that a tag may never appear in data. this may allow things like resynchronizing with a data stream.

as for data representations:

I also use S-Expressions to some extent, though these are not as widely popular.

another option is basically creating a byte-oriented data serialization, and then feeding this through zlib / deflate.

Edited by cr88192, 21 August 2013 - 05:23 PM.

### #15 rnlf  Members

Posted 22 August 2013 - 04:16 AM

I've had very positive experiences with Protocol Buffers. They are fast, compact and have text and binary representations (even convertible from one form into the other using just a small tool). They are so little work and so versatile that I am using them for lots of different things: network protocols, configuration files, save games, ...

I have found few libraries that made my life as much easier as protobuf did.

### #16 Olof Hedman  Members

Posted 22 August 2013 - 04:46 AM

Protocol buffers look really nice for anything that isn't a tree. (or am I missing something?)

I guess XML is at its best when you have to describe a tree, but I will definitely look into protocol buffers some more for things that aren't, and for sending stuff

Edited by Olof Hedman, 22 August 2013 - 04:49 AM.

### #17 stein102  Members

Posted 22 August 2013 - 08:39 AM

Basically all I need this to do is the following:

- Have an "Item" class and an "ItemLoader" class

- ItemLoader reads the data file and creates an instance of Item for each item in the file

- Item has very limited fields/methods (id/description/name)

- ItemLoader reads the data file and adds all components to the item (e.g. consumable, wieldable, weapon, shield, etc.)

- Easy to manipulate via API (going to be editing and creating items from a GUI)
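
One way those requirements might look in Java, using the DOM parser that ships with the JDK and the attribute-based layout suggested earlier in the thread. This is only a sketch under assumptions: the `<items>` root element is invented here, and components are stored as plain element names, where a real game would map each name to a component object.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class Item {
    final int id;
    final String name;
    final String description;
    // Component names such as "wieldable"; a stand-in for real component objects.
    final List<String> components = new ArrayList<>();

    Item(int id, String name, String description) {
        this.id = id;
        this.name = name;
        this.description = description;
    }
}

class ItemLoader {
    // Builds one Item per <item> element found in the given XML text.
    static List<Item> load(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<Item> items = new ArrayList<>();
            NodeList nodes = root.getElementsByTagName("item");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                Element description = firstChild(e, "description");
                Item item = new Item(
                    Integer.parseInt(e.getAttribute("id")),
                    e.getAttribute("name"),
                    description == null ? "" : description.getTextContent().trim());
                // Every element inside <components> counts as one component.
                Element components = firstChild(e, "components");
                if (components != null) {
                    for (Node c = components.getFirstChild(); c != null; c = c.getNextSibling()) {
                        if (c instanceof Element) {
                            item.components.add(((Element) c).getTagName());
                        }
                    }
                }
                items.add(item);
            }
            return items;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not load items", ex);
        }
    }

    // Returns the first direct child element with the given tag name, or null.
    private static Element firstChild(Element parent, String tag) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && ((Element) c).getTagName().equals(tag)) {
                return (Element) c;
            }
        }
        return null;
    }
}
```

Because the loader works on a DOM tree, a GUI item editor could reuse the same tree to add or remove component elements and write the file back out.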

### #18 Karsten_  Members

Posted 22 August 2013 - 11:44 AM

Perhaps just wrap the data loading one level higher (behind your own interface), so you can easily switch the format out depending on the job at hand.

At work, we tend to swap between JSON and XML weekly ;)

http://tinyurl.com/shewonyay - Thanks so much for those who voted on my GF's Competition Cosplay Entry for Cosplayzine. She won! I owe you all beers

Mutiny - Open-source C++ Unity re-implementation.
Defile of Eden 2 - FreeBSD and OpenBSD binaries of our latest game.

### #19 rnlf  Members

Posted 23 August 2013 - 12:09 AM

Protocol buffers look really nice for anything that isn't a tree. (or am I missing something?)

I guess XML is at its best when you have to describe a tree, but I will definitely look into protocol buffers some more for things that aren't, and for sending stuff

message X {
  optional X x = 1;
  optional X y = 2;
}


This works just fine. Trees!

### #20 Alpheus  GDNet+

Posted 23 August 2013 - 01:38 AM

<?xml version="1.0"?>
<item id="1">
  <name>sword</name>
  <components>
    <component wieldable="true" />
  </components>
</item>

External Articulation of Concepts Materializes Innate Knowledge of One's Craft and Science