Alternative to JSON and XML?


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

29 replies to this topic

#1 Nairou   Members   -  Reputation: 418

Posted 19 November 2012 - 04:14 PM

Currently, in my game, I use XML for most of my data files. Not because I particularly like XML, but because the library I'm using (PugiXML) is particularly friendly. However, because the files are XML, they feel rather unwieldy to edit.

In the past I tried switching to JSON. Ignoring the fact that I couldn't find a C++ JSON parsing library that didn't spit out tons of compiler warnings, I found JSON to be just as annoying to use as XML. My primary interest in JSON was that it was a simpler format, and seemed to have less extraneous text required to be a valid and easily-readable file. However, after using it, I found that to not be the case.
  • The entire contents of the file has to be surrounded in curly braces. Not a big deal, but bothers me as it feels unnecessary. I don't care about it being parseable by Javascript, I just want it to hold some data from my game, that only my game will look at. I can't just dive in, create a new file, I first have to start with the curly braces to make it valid.
  • All variable names have to be surrounded by quotes. If the file is a decent size, pretty soon all you see is a big sea of quotes. They add noise and extra work to what should otherwise be a simple layout.
  • JSON is very strict about commas. If you create an array of items, every item must end with a comma, EXCEPT for the very last item, which must NOT end with a comma. It is very common for me to get parse errors because I added or removed something from an array, and didn't notice that the commas were wrong.
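
The comma complaint is easy to reproduce. The following is a minimal sketch in Python (chosen only for illustration; the thread itself is about C++) using the standard-library json module. The helper name try_parse and the sample strings are made up for this example.

```python
import json

# A valid document: commas between items, none after the last one.
valid = '{"indices": [0, 1, 2]}'

# The same data with a trailing comma left behind after an edit.
trailing = '{"indices": [0, 1, 2,]}'


def try_parse(text):
    """Return the parsed value, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


print(try_parse(valid))     # parsed dictionary
print(try_parse(trailing))  # None, because of the trailing comma
```

A strict parser rejects the second string outright, which is the behaviour described in the list above.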
I recently found this blog post which gives pretty much the same list, and shows an example of an alternative layout. I think the result he comes up with is pretty good. If there was a library to parse that format, I'd use it.

But I think it could be taken even further. Considering that this is for storing data for a game to use, I don't remotely care about any features that exist for the sake of Javascript or the web. All I need is a list of key-value pairs, where the value could be a single value or an array. Something like this (as a random example):

name "Test model"			// Simple key-value pair
version 1.0				// Numeric values are allowed
geometry {				// Value is an array
	vertices {
		0, 0, 0			// Array of values. No key given, so accessible by index position
		1, 0, 0			// Another array of values, also accessible by index
		1, 1, 0			// ...
		0, 1, 0			// ...
	}
	indices 0, 1, 2, 2, 3, 0	// Value is an array. Array values are inline, so don't need curly braces.
}

To me (and I know this is personal opinion), this format looks great. Simple, no extra characters, easy to understand and write.
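
For a sense of how little code such a format might need, here is a rough line-based parser sketch in Python. It is not an existing library, and the grammar is guessed from the example above: "//" starts a comment (so "//" may not appear inside a string), a line starting with a letter is "key value" or "key {", a lone "}" closes a block, and any other line is an unnamed row stored under its index. The names parse, parse_value and parse_scalar are hypothetical.

```python
def parse_scalar(token):
    """Convert one token to a string, int, or float."""
    token = token.strip()
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    try:
        return int(token)
    except ValueError:
        return float(token)


def parse_value(text):
    """A value is one scalar, or a comma-separated list of scalars."""
    if text.startswith('"'):
        return parse_scalar(text)
    parts = text.split(",")
    if len(parts) == 1:
        return parse_scalar(text)
    return [parse_scalar(part) for part in parts]


def parse(text):
    """Parse the sketch format into nested dictionaries."""
    root = {}
    stack = [root]  # stack[-1] is the innermost open block
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("//", 1)[0].strip()  # drop comments
        if not line:
            continue
        current = stack[-1]
        if line == "}":
            if len(stack) == 1:
                raise ValueError(f"line {number}: unmatched '}}'")
            stack.pop()
        elif line[0].isalpha() or line[0] == "_":
            parts = line.split(None, 1)
            if len(parts) != 2:
                raise ValueError(f"line {number}: key without a value")
            key, rest = parts
            if rest == "{":
                child = {}
                current[key] = child
                stack.append(child)
            else:
                current[key] = parse_value(rest)
        else:
            # Unnamed row: store it under the next integer index.
            index = len([k for k in current if isinstance(k, int)])
            current[index] = parse_value(line)
    if len(stack) != 1:
        raise ValueError("unexpected end of input: missing '}'")
    return root


SAMPLE = '''
name "Test model"          // Simple key-value pair
version 1.0                // Numeric values are allowed
geometry {
    vertices {
        0, 0, 0
        1, 0, 0
        1, 1, 0
        0, 1, 0
    }
    indices 0, 1, 2, 2, 3, 0
}
'''

model = parse(SAMPLE)
print(model["name"])
print(model["geometry"]["vertices"][2])
print(model["geometry"]["indices"])
```

A real loader would also need escape sequences in strings and better error messages, but the core fits in well under a hundred lines.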

So, I guess I'm wondering:
  • Why do most people seem to be using XML or JSON for their data files? In the context of writing a game, where you don't need to communicate with an external piece of software, it seems like overkill.
  • Does there already exist a file format (and accompanying library) that is extremely simple, like what I've written here? If not, I may have to write it myself, but no sense rewriting something that already exists.



#2 FLeBlanc   Crossbones+   -  Reputation: 3101

Posted 19 November 2012 - 04:22 PM

I've never liked XML, and I have many of the same objections to JSON that you do. I'm not aware of any library that does exactly as you are asking for, though. Personally, I like to use Lua for my game files and data. It's quite handy for data description, even if I don't use it as an embedded script but just to load and move data.

#3 saejox   Members   -  Reputation: 714

Posted 19 November 2012 - 08:02 PM

i use 2 formats for serialization and transfer. Binary and XML.
There is practically no reason to use human readable format if you transfer data.

i prefer XML because it is more flexible. I have event scripts defined in XML files, lots of CDATA sections etc...
JSON's strict syntax with commas and double quotes makes it hard for me to write.

#4 ApochPiQ   Moderators   -  Reputation: 15698

Posted 19 November 2012 - 08:37 PM

XML is primarily useful if you take advantage of the hundreds of tools that exist for working with XML. JSON is in a similar boat.

If you're editing XML by hand, you're doing it wrong. At the bare minimum, use an XML editing tool and a schema to validate your content for you.

#5 Telastyn   Crossbones+   -  Reputation: 3726

Posted 19 November 2012 - 09:10 PM

Because I can serialize and deserialize json, XML, or binary in one line and move on to things that matter.
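
As an illustration of the "one line" point, here is a small sketch using Python's standard json module. The poster did not say which language or library they use, so this is only an example; the settings dictionary is made up.

```python
import json

settings = {"name": "Test model", "version": 1.0, "indices": [0, 1, 2, 2, 3, 0]}

text = json.dumps(settings)   # serialize in one line
restored = json.loads(text)   # deserialize in one line

print(restored == settings)
```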

#6 Nairou   Members   -  Reputation: 418

Posted 19 November 2012 - 11:10 PM

If you're editing XML by hand, you're doing it wrong. At the bare minimum, use an XML editing tool and a schema to validate your content for you.


But that's just it. I don't care about the schema, or using a special editor. I just want an easy to use, easy to read file format for storing structured data. Whether it is XML, or JSON, or Lua, or something else, doesn't really matter so long as it fits those two goals.

Edited by Nairou, 19 November 2012 - 11:11 PM.


#7 dmatter   Crossbones+   -  Reputation: 3093

Posted 20 November 2012 - 03:31 AM

Why do most people seem to be using XML or JSON for their data files? In the context of writing a game, where you don't need to communicate with an external piece of software, it seems like overkill.

XML and JSON are well understood formats with easy-to-use, mature parser implementations in many languages (often with some kind of support for at least XML built into the language); when something ticks those boxes it's difficult to call it overkill. In the context of games you frequently are communicating with other bits of software: you could have a complex toolchain spanning a number of platforms with tools written in various languages.

Does there already exist a file format (and accompanying library) that is extremely simple, like what I've written here? If not, I may have to write it myself, but no sense rewriting something that already exists.

Take a look at YAML, it's pretty much a format expressly intended to alleviate the problems you highlighted.

Edited by dmatter, 20 November 2012 - 03:31 AM.


#8 shadowomf   Members   -  Reputation: 315

Posted 20 November 2012 - 05:46 AM

I never understood why anyone would call xml human readable. It's a complete mess and most of the parsers only implement half of it.
JSON is more readable to me.

But if you really need something only for yourself/inhouse I would always choose ini or an ini-based format.
You can choose arbitrary section names and use them to build hierarchies (e.g. [Engine\Renderer]).
If you need lists or tables, just use a csv-string, or make up your own format.

Of course you shouldn't use GetPrivateProfileString for performance and portability reasons. Write your own parser or use something like SDL_Config.
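
The ini idea can be sketched with Python's standard configparser module, used here only as a stand-in for whatever parser you end up with. The section names, keys and values below are invented for the example, and the backslash-to-hierarchy step is an assumption about how such section names might be interpreted.

```python
import configparser

INI_TEXT = r"""
[Engine\Renderer]
width = 1280
height = 720
clear_color = 0.1,0.2,0.3,1.0

[Engine\Audio]
volume = 0.8
"""

config = configparser.ConfigParser()
config.read_string(INI_TEXT)

renderer = config[r"Engine\Renderer"]
width = renderer.getint("width")

# A list stored as a csv-string: split it and convert each item.
clear_color = [float(part) for part in renderer["clear_color"].split(",")]

# Rebuild the hierarchy implied by the backslash-separated section names.
tree = {}
for section in config.sections():
    node = tree
    for part in section.split("\\"):
        node = node.setdefault(part, {})
    node.update(config[section])

print(width, clear_color)
print(sorted(tree["Engine"]))
```

Values come back as strings unless you ask for a typed getter such as getint, so the conversion rules stay under your control.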

Edited by shadowomf, 20 November 2012 - 05:47 AM.


#9 Bregma   Crossbones+   -  Reputation: 5133

Posted 20 November 2012 - 06:31 AM

The main advantage to XML and JSON is they're fully buzzword-compliant. There are plenty of books published on them to earn the authors money, not to mention websites to sell views, and it's easy to write filters for scanning resumes based on those keywords.

Another advantage is you can just grab an available library and integrate it into your project. Except that they tend to be so complex and do-everything that by the time you've bent the API to your will and got everything integrated into your project, you could have written and tested your own simple data loader twice over.

A small data loader is easy to write, and it is easy to write tests to validate it. JFDI.
Stephen M. Webb
Professional Free Software Developer

#10 Strewya   Members   -  Reputation: 1433

Posted 20 November 2012 - 07:01 AM

might want to look at libconfig, which i discovered because of this topic, and on first glance it might be just what you're looking for.
http://www.hyperrealm.com/libconfig/
(check out the documentation link on their site)

devstropo.blogspot.com - Random stuff about my gamedev hobby


#11 NightCreature83   Crossbones+   -  Reputation: 2824

Posted 20 November 2012 - 08:01 AM

I never understood why anyone would call xml human readable. It's a complete mess and most of the parsers only implement half of it.
JSON is more readable to me.

But if you really need something only for yourself/inhouse I would always choose ini or an ini-based format.
You can choose arbitrary section names and use them to build hierarchies (e.g. [Engine\Renderer]).
If you need lists or tables, just use a csv-string, or make up your own format.

Of course you shouldn't use GetPrivateProfileString for performance and portability reasons. Write your own parser or use something like SDL_Config.

I wouldn't ever pick a comma to be a delimiter in my own formats to be honest. If you ever plan on distributing it and also support float types in that file you might run into trouble with locale on certain machines.

Remember that in Europe most countries use "," as the decimal separator and not ".". This means that floats written out will be 0,05 instead of 0.05; the .x and .obj formats have tripped me up with this on several occasions.

Again, on a home project this shouldn't be an issue, but it's worth knowing about.
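
The collision between decimal commas and comma delimiters can be shown in a few lines. This is a sketch in Python, whose float() always expects "." regardless of locale; in C and C++ the analogous concern is locale-sensitive routines such as strtod, which follow the current LC_NUMERIC setting. The function name parse_float_list and the choice of ";" as the delimiter are assumptions made for the example.

```python
def parse_float_list(text, delimiter=";"):
    """Split on the delimiter and parse each item with '.' as the decimal point."""
    return [float(item) for item in text.split(delimiter)]


# With '.' decimals and ';' between items, the row is unambiguous.
print(parse_float_list("0.05;1.5"))

# The same two values written with decimal commas and comma delimiters
# split into four tokens, so the original values cannot be recovered.
print("0,05,1,5".split(","))
```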
Worked on titles: CMR:DiRT2, DiRT 3, DiRT: Showdown, GRID 2, Mad Max

#12 swiftcoder   Senior Moderators   -  Reputation: 9992

Posted 20 November 2012 - 10:36 AM

YAML is the answer to your prayers.

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#13 Nairou   Members   -  Reputation: 418

Posted 20 November 2012 - 11:58 AM

might want to look at libconfig, which i discovered because of this topic, and on first glance it might be just what you're looking for.

Looks great! I just wish it wasn't LGPL. Dynamic linking just for a tiny config library seems silly.

YAML is the answer to your prayers.

I actually tried YAML very briefly. My only issue with it is the lack of good C++ libraries. I tried the three listed on the YAML website, but each was either a pain to use (API design) or liked to spit out tons of compiler warnings.

#14 swiftcoder   Senior Moderators   -  Reputation: 9992

Posted 20 November 2012 - 12:14 PM


YAML is the answer to your prayers.

I actually tried YAML very briefly. My only issue with it is the lack of good C++ libraries. I tried the three listed on the YAML website, but each was either a pain to use (API design) or liked to spit out tons of compiler warnings.

Really? I've used yaml-cpp for a couple of years now, with nary an issue.

It does issue warnings here and there if your compiler is out of date, but no more than any other template-based library.

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#15 NightCreature83   Crossbones+   -  Reputation: 2824

Posted 20 November 2012 - 01:40 PM



YAML is the answer to your prayers.

I actually tried YAML very briefly. My only issue with it is the lack of good C++ libraries. I tried the three listed on the YAML website, but each was either a pain to use (API design) or liked to spit out tons of compiler warnings.

Really? I've used yaml-cpp for a couple of years now, with nary an issue.

It does issue warnings here and there if your compiler is out of date, but no more than any other template-based library.

Then the people who are writing the library need to realise that having warnings in a lib is bad and that they should change the compile settings to treat warnings as errors.
Worked on titles: CMR:DiRT2, DiRT 3, DiRT: Showdown, GRID 2, Mad Max

#16 swiftcoder   Senior Moderators   -  Reputation: 9992

Posted 20 November 2012 - 02:23 PM

Then the people who are writing the library need to realise that having warnings in a lib is bad and that they should change the compile settings to treat warnings as errors.

Meh. I don't think I've ever seen a serious piece of software without at least some compiler warnings. How often do you recompile dependent libraries anyway?

Tristam MacDonald - Software Engineer @Amazon - [swiftcoding]


#17 shadowomf   Members   -  Reputation: 315

Posted 20 November 2012 - 03:39 PM


I never understood why anyone would call xml human readable. It's a complete mess and most of the parsers only implement half of it.
JSON is more readable to me.

But if you really need something only for yourself/inhouse I would always choose ini or an ini-based format.
You can choose arbitrary section names and use them to build hierarchies (e.g. [Engine\Renderer]).
If you need lists or tables, just use a csv-string, or make up your own format.

Of course you shouldn't use GetPrivateProfileString for performance and portability reasons. Write your own parser or use something like SDL_Config.

I wouldn't ever pick a comma to be a delimiter in my own formats to be honest. If you ever plan on distributing it and also support float types in that file you might run into trouble with locale on certain machines.

Remember that in Europe most countries use "," as the decimal separator and not ".". This means that floats written out will be 0,05 instead of 0.05; the .x and .obj formats have tripped me up with this on several occasions.

Again, on a home project this shouldn't be an issue, but it's worth knowing about.


Good that you mention it. I'm from Germany and we use a decimal comma.
You could replace the comma in csv with another character and use whatever floats your boat.

Maybe one could put strings in quotes and accept any character (including commas) inside, and use a backslash to escape quotes, tabs, backslashes and newlines.
And for floats I would use the format with a dot (just something simple, no scientific notation). Most of the time you only need the value, and you do the localization only when you have to put it on screen.
For lists, use csv or, if you like, switch to semicolons, which are a common alternative. But allow quoted strings in csv-data, in case you need to put a comma/semicolon in a table.
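
A semicolon-delimited table with quoted strings and backslash escapes can be sketched with Python's standard csv module. The column names and row contents are invented for the example; only the dialect settings matter. Note that this module treats the character after a backslash literally, so escapes like a backslash followed by "n" for a newline would need extra handling of your own.

```python
import csv
import io

# Second line of the data is:  "Sword; rusty";1.5;"says \"hi\""
DATA = 'name;weight;note\n"Sword; rusty";1.5;"says \\"hi\\""\n'

reader = csv.reader(
    io.StringIO(DATA),
    delimiter=";",      # semicolon instead of comma between fields
    quotechar='"',      # quoted strings may contain the delimiter
    escapechar="\\",    # backslash escapes a quote inside a string
    doublequote=False,
)
header, row = list(reader)

# Floats use '.' in the file; localize only when displaying them.
weight = float(row[1])

print(header)
print(row, weight)
```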

#18 sungupta   Members   -  Reputation: 103

Posted 27 November 2012 - 04:07 AM

I find using liquid xml studio makes it a little bit easier when working with xml files, mainly because it's got everything I need including extensive schemas, wsl, xml, schema, editing parsing etc etc.

#19 shadowomf   Members   -  Reputation: 315

Posted 13 December 2012 - 04:18 PM

extensive schemas, wsl, xml, schema, editing parsing etc etc.


Aside from the fact that this smells like spam...

I would guess most of the time developers won't use these features.

At least I can't imagine a game developer that is starting to work with schemas and validation unless he/she is exporting it to software not developed inhouse. And even then I would probably not write a schema or use an existing one for validation.

It takes time to write a schema (and even more to learn all about writing them) and is not exactly something that is fun to do. I have also worked with xslt, and if I had to do the same thing again in the future, I wouldn't use xslt and would try to write my own tool that loads and works with the tree. Or even better, not use xml.

#20 samoth   Crossbones+   -  Reputation: 4771

Posted 14 December 2012 - 05:01 AM


Remember that in Europe most countries use "," as the decimal separator and not ".". This means that floats written out will be 0,05 instead of 0.05; the .x and .obj formats have tripped me up with this on several occasions.

Good that you mention it. I'm from Germany and we use a decimal comma.

Gah. That was the biggest stupidity since the invention of computers, it annoys the hell out of me regularly. Prior to "general availability" of computers (or before anyone ever wasted a thought on localization), pocket calculators had used '.' for decades, and everyone was fine with that. Why couldn't they just for f...'s sake keep it that way.

Especially since the '.' key on your numpad is ',' too -- which is usually exactly what you don't want, it renders your numpad useless for its main use (entering a series of numbers quickly).

Except in Excel and its free clones of course, where you actually need comma as a decimal separator (and have to translate function names like SUM and AVERAGE -- wonder which genius came up with that idea).

I never understood why anyone would call xml human readable.

That's probably because a lot of people are used to SGML due to writing or having written HTML by hand once upon a time.

When I first heard about XML a decade or so ago, my first thought was "bah, looks just like cheap HTML, what a rip-off". My second thought was: "Wait, this works for arbitrary data, there is a readily working parser, and it looks like HTML, how cool is that...".

So yeah, XML is a poor format, bloated, too explicit, too much this, too little that, call it what you like. But you and your mother can write it with eyes closed and one arm tied to your back, which is what counts. Most of the negative properties of XML just don't count when you use it as input format in your production pipeline. It's only parsed once during build, and packed to some binary format anyway.

Of course the same is true for JSON or YAML once you're used to them, and at that point they're probably more intuitive.
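
The "parsed once during build, packed to binary" pipeline might look roughly like the following Python sketch, built on the standard xml.etree.ElementTree and struct modules. The XML layout, the binary layout (a little-endian 32-bit count followed by three 32-bit floats per vertex) and the function names are all assumptions made for illustration, not a format anyone in the thread described.

```python
import struct
import xml.etree.ElementTree as ET

XML_TEXT = """
<model name="Test model">
  <vertex x="0" y="0" z="0"/>
  <vertex x="1" y="0" z="0"/>
  <vertex x="1" y="1" z="0"/>
</model>
"""


def pack_vertices(xml_text):
    """Build step: parse the XML once and emit a compact binary blob."""
    root = ET.fromstring(xml_text.strip())
    vertices = [
        (float(v.get("x")), float(v.get("y")), float(v.get("z")))
        for v in root.iter("vertex")
    ]
    blob = struct.pack("<I", len(vertices))
    for vertex in vertices:
        blob += struct.pack("<3f", *vertex)
    return blob


def unpack_vertices(blob):
    """Runtime step: read the blob back without touching XML at all."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return [struct.unpack_from("<3f", blob, 4 + 12 * i) for i in range(count)]


blob = pack_vertices(XML_TEXT)
print(len(blob))
print(unpack_vertices(blob))
```

With this split, the verbosity of the text format only costs time at build, which is the point being made above.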



