
LEON - A simple text-based data format


5 replies to this topic

#1 Scarabus2   Members   -  Reputation: 539

Posted 01 April 2013 - 09:25 AM

Between having some issues with XML at our small independent game company and just wanting to scratch an itch, I decided to create my own text-based data format. It's primarily an in-house format for our artists/designers to use, but part of the idea was also to make something that would be useful to inexperienced game programmers and the indie community as a whole. So I've made the code available on SourceForge.
 
Source code: https://sourceforge.net/projects/leon-serialize/
Specification: https://docs.google.com/document/d/1Jx4OLn_c22RwZ3GIpucylFZGSU-JyT-steDfjpjubH0/edit
 
What is it, and why?
In short, it's an API that takes a LEON-formatted text file and gives you a data structure for accessing the values in a straightforward way.
// LEON
value : 100
 
// C++
int iValue = leon["value"].AsInteger();
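
For illustration only, here is a minimal sketch of the idea behind that lookup: read "name : value" lines into a map and convert the stored text when a type is requested. This is not LEON's implementation. The names Trim, ParseKeyValues and AsInteger are hypothetical, and the sketch assumes a flat file with one named value per line.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of LEON: strips spaces, tabs and CRs from both ends.
std::string Trim(const std::string& s) {
    const char* ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Reads "name : value" lines into a map, skipping blank lines and // comments.
std::map<std::string, std::string> ParseKeyValues(const std::string& text) {
    std::map<std::string, std::string> values;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = Trim(line);
        if (line.empty() || line.compare(0, 2, "//") == 0) continue;
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;  // unnamed values are ignored in this sketch
        values[Trim(line.substr(0, colon))] = Trim(line.substr(colon + 1));
    }
    return values;
}

// The stored text is only interpreted when the caller asks for a type.
int AsInteger(const std::string& text) {
    return std::stoi(text);
}
```

Calling ParseKeyValues on the two-line LEON snippet above and passing the entry stored under "value" to AsInteger would yield the integer 100.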
 
I like it when complexity comes from syntax and not the other way around, so I limited myself to a handful of structural characters, and they only become necessary as complexity grows. Rather than expecting strict rules to be followed, the LEON parser will generally deal with what it's given.
You can have complex tree-based structures but they're not required. Similarly you don't have to put names on things if you don't need to. If you just want to throw some arbitrary text and numbers into a file, that is fine too.
 
The most important point to me was that I wanted to give a LEON-file to someone who hasn't seen it (like one of our artists) and they should intrinsically understand the syntax and discern how to make changes, without me telling them how. That's why I opted out of using JSON, for instance.
 
Whether or not I've actually managed to do that is up for debate, but I still think it's a neat little API and so far it's worked out okay. At the end of the day this is just something I did for fun.
 
Why you should care
I often recommend TinyXML to anyone who dabbles with text files, rather than writing their own crude text readers. LEON is for those of you who use text files but don't want the syntactical or implementational burden that comes with XML, nor wish to write your own parser. It sits somewhere in between those two cases.
 
Features
  • Supports common data types: strings, numbers, colors, lists, tables, etc. as well as custom types.
  • Straightforward syntax and code.
  • Parser intelligently trims off irrelevant whitespace.
  • URI-naming convention for look-ups of values.
  • Syntax error reporting with line, column.
  • Ability to enable stricter parser rules to avoid unexpected results.
  • Currently used in production code.
Caveats
  • Types are evaluated by the user. A list could be a list, a table, or a sequence of unrelated values.
  • The API offers many functions in both a const and a non-const flavor. The const methods will always return a value, even if it's a null-value or zero. The non-const methods will throw an exception if a result can't be given. For convenience there are accompanying reader and writer classes.
  • So far there is no API documentation other than the specification, which includes example data and code.
  • The code has not been optimized beyond regular memory-aware programming.
  • Uses STL internally.
  • This might look and behave exactly like another format that I am not aware of. In that case, now there is one more solution to that problem.
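
The const/non-const caveat above can be sketched roughly as follows. This is a hypothetical stand-in and not LEON's real classes: SketchNode, Set and Get are made-up names, and the empty string plays the role of the "null-value".

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a LEON node; the real class names and signatures may differ.
class SketchNode {
public:
    void Set(const std::string& name, const std::string& value) { values_[name] = value; }

    // Const flavor: always returns something, falling back to an empty "null" value.
    const std::string& Get(const std::string& name) const {
        static const std::string kNull;
        const auto it = values_.find(name);
        return it != values_.end() ? it->second : kNull;
    }

    // Non-const flavor: throws when no result can be given.
    std::string& Get(const std::string& name) {
        const auto it = values_.find(name);
        if (it == values_.end()) throw std::out_of_range("no value named '" + name + "'");
        return it->second;
    }

private:
    std::map<std::string, std::string> values_;
};
```

With this shape, read-only code that holds a const reference never has to handle exceptions, while code that holds a mutable node finds out immediately when a name is missing.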
Sample LEON data & C++ code:
leDataDocument leon("sample.txt");
Given the document loaded into a variable "leon"...
 
// Some prime numbers
2, 3, 5, 7, 11, 13, 17, 19, 23, 29
leDataStringList primes = leon.GetAttribute("#1").AsList();
 
// Some names
J.J. Abrams
Jon Favreau
Francis F. Coppola
Joss Whedon
Michael Bay
leDataStringList names = leon.GetRoot().AsList();
 
Enemy (
    name: Goomba
    health: 50
    strength: 10
    sprite: sprites/goomba.png
)

Enemy (
    name: Turtle
    health: 80
    strength: 20
    sprite: sprites/turtle.png
)
std::string enemyName = leon["Enemy#2.name"];
 
Element #gold
(
    name: "Au (Gold)"
    "atomic number": 79
    description: "Gold is a dense, soft, shiny,
                  malleable and ductile metal."
)
int goldNumber = leon["gold.atomic number"].AsInteger();
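
To make the look-up strings in the samples above a little more concrete, here is a rough sketch of how a path like "Enemy#2.name" could be split into steps. It is only an illustration under my own assumptions: PathStep and SplitPath are hypothetical names, '.' is taken as the step separator, and a trailing "#" followed by digits is read as a 1-based position. The specification is the authority on the actual rules.

```cpp
#include <string>
#include <vector>

// One step of a lookup path: a name plus an optional 1-based position (0 = not given).
struct PathStep {
    std::string name;
    int index = 0;
};

// Splits a path such as "Enemy#2.name" on '.' and peels a trailing "#<digits>" off each step.
std::vector<PathStep> SplitPath(const std::string& path) {
    std::vector<PathStep> steps;
    std::size_t begin = 0;
    while (begin <= path.size()) {
        std::size_t end = path.find('.', begin);
        if (end == std::string::npos) end = path.size();
        std::string part = path.substr(begin, end - begin);

        PathStep step;
        const std::size_t hash = part.rfind('#');
        const bool numeric = hash != std::string::npos && hash + 1 < part.size() &&
            part.find_first_not_of("0123456789", hash + 1) == std::string::npos;
        if (numeric) {
            step.index = std::stoi(part.substr(hash + 1));
            part.erase(hash);  // keep only the name in front of '#'
        }
        step.name = part;
        steps.push_back(step);
        begin = end + 1;
    }
    return steps;
}
```

Under these assumptions "Enemy#2.name" becomes two steps (name "Enemy" with position 2, then name "name"), and "gold.atomic number" keeps the space inside its second step.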

visualnovelty.com - Novelty - Visual novel maker


#2 LorenzoGatti   Crossbones+   -  Reputation: 2311

Posted 03 April 2013 - 01:48 AM

 

Rather than expecting strict rules to be followed the LEON parser will generally deal with what it's given.
You can have complex tree-based structures but they're not required. Similarly you don't have to put names on things if you don't need to. If you just want to throw some arbitrary text and numbers into a file that is fine too.
 
The most important point to me was that I wanted to give a LEON-file to someone who hasn't seen it (like one of our artists) and they should intrinsically understand the syntax and discern how to make changes, without me telling them how. That's why I opted out of using JSON, for instance.

This is only easy and simple if "it works". The first time the parser doesn't guess your intent, your artist wins a trip to debugging hell.

  • Parser intelligently trims off irrelevant whitespace.
  • URI-naming convention for look-ups of values.
  • Ability to enable stricter parser rules to avoid unexpected results.

Rather scary features. Training each author to understand a straightforward file format (I'm thinking of JSON, with its modest amount of boring details like escaping characters in strings or which ways to write numbers are allowed) before using it would be more prudent and less expensive than starting "intuitively" and later hitting a brick wall.

If you want a useful text-based file format, you should invest your semantic and syntactic complexity budget to do something XML and JSON aren't particularly good at, like templates for partly random data or succinctly defining large or complex non-tree data structures, not to second-guess the user or spare some delimiters.
Produci, consuma, crepa

#3 Scarabus2   Members   -  Reputation: 539

Posted 03 April 2013 - 03:21 AM

This is only easy and simple if "it works". The first time the parser doesn't guess your intent, your artist wins a trip to debugging hell.

 

Perhaps I wasn't very clear. The parser doesn't make any guesses. I meant to say that you only get what you put in. Omitting syntax merely results in a less complex data structure. I really wanted to avoid having a lot of boilerplate syntax like <Key name="Foo" value="Hello"/> or symbols that by themselves don't tell you anything.
 

 

Rather scary features. Training each author to understand a straightforward file format (I'm thinking of JSON, with its modest amount of boring details like escaping characters in strings or which ways to write numbers are allowed) before using it would be more prudent and less expensive than starting "intuitively" and later hitting a brick wall.

 

Naturally I would love for everyone on my team to be thoroughly schooled in the various formats they use so they could avoid making errors, or at least debug them without involving me or my co-workers, but in reality most have only superficial experience with XML, JSON or whatever. Making changes is usually fine, but only a few of us will actually create new data files and write the code to parse them. Finding a missing space between two XML attributes could hold up two people for 30 minutes even if we know which file to look in.

This format isn't going to solve typos, but it alleviates the syntactical burden of the other formats we've used so far by being relatively light.
 

To me LEON is about convenience, and that obviously looks scary to hardened developers, but I believe it should be more than fine for the rest of us.


visualnovelty.com - Novelty - Visual novel maker

#4 LorenzoGatti   Crossbones+   -  Reputation: 2311

Posted 04 April 2013 - 01:39 AM


Perhaps I wasn't very clear. The parser doesn't make any guesses. I meant it as to say that you only get what you put in. Omitting syntax merely results in a less complex data structure. I really wanted to avoid having a lot of boilerplate syntax like <Key name="Foo" value="Hello"/> or symbols that by themselves doesn't tell you anything.


A parser that allows "omitting syntax" isn't guessing in the formal sense of the word (choosing arbitrarily when given an ambiguous input), but it's enacting your own higher level guessing about what the users might omit and what they could mean; such a language follows complex and implicit rules, and sooner or later users are going to have to understand these rules (probably failing and/or seeing them as bugs) instead of the simple rules of a less fancy language.

If you don't see the threat, I can only recommend that you invest in tools, so that manual editing of text files is minimized; GUI-based editors can serialize and deserialize data without manual intervention (a task for which JSON and XML have vast library support), while noninteractive tools can process data files and ensure they are written correctly.
Can you tell us more about which types of LEON files you are going to use and about their place in the workflow?
Produci, consuma, crepa

#5 Scarabus2   Members   -  Reputation: 539

Posted 04 April 2013 - 08:36 AM

Oh, if you are in a position to set up a proper XML schema/tool workflow, do that! LEON was created for our own needs and made available to anyone who isn't able or willing to make that leap from a more Notepad-based workflow. And to avoid misunderstanding, let me reiterate that LEON partially stems from my own desire to create something for the fun of it.

We're developing a war game where the player researches and upgrades weapons to fight waves of attacking enemies. All of the weapon and enemy details (including stats, behaviour and asset information) are stored in CSV files and edited with OpenOffice or Excel, while other information is typically stored in XML files. Some files have multiple localized versions. Gameplay balancing is divided between one designer and a few artists, while we programmers focus on implementing the systems.

 

Most syntax hiccups have been related to XML, but CSV is finicky with Subversion. It's a very "horizontal" format, and in practice only one person can work on a file at any one time.

We've only recently begun to replace some of these files with LEON files, to test the stability and feasibility of the format. So far it's gone smoothly on both the designer side and the programming side. One of the programmers has used LEON to script tutorial sequences, connecting UI elements and game mechanics.
 

I agree that ambiguity is bad, but I honestly don't believe LEON is a large enough format to warrant much worry about that. Some rules are implicit, yes, but only where it makes sense.

If someone writes:

 

description: "He was a tall
              and handsome man."

 

the parser (or I, rather) assumes they didn't want to keep the whitespace at the start of the second line, but just in case there is a way to retain that whitespace (although I'm not sure whether to keep this feature or not):
 

description = ("He was a tall
               and handsome man.")
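
The default trimming described above can be sketched like this. It is a simplified illustration rather than the parser's actual code: TrimContinuationIndent is a hypothetical name, and the sketch assumes the line break itself is kept while the leading spaces and tabs of each following line are dropped.

```cpp
#include <sstream>
#include <string>

// Drops leading spaces and tabs from every line after the first, keeping the line breaks.
std::string TrimContinuationIndent(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::string result;
    bool first = true;
    while (std::getline(in, line)) {
        if (!first) {
            const std::size_t start = line.find_first_not_of(" \t");
            line = (start == std::string::npos) ? "" : line.substr(start);
            result += '\n';
        }
        result += line;
        first = false;
    }
    return result;
}
```

Applied to the first description value above, the second line would come out as "and handsome man." with no indentation in front of it.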

 

The worst example I can think of is if someone meant to write this:

enemy-health: 100

 

but instead wrote:

enemy-health 100

that would be a valid value to LEON but not a valid value to the game. But this isn't much different from any typo in any format, and by setting a flag in the parser you can catch this error at runtime.
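
As a rough idea of what such a check amounts to, here is a sketch that flags lines with text but no ':' separator. It is not how the parser's strict flag is implemented. FindUnnamedLines is a hypothetical helper, and it assumes a flat file in which every non-comment line is supposed to be a "name: value" pair, so it would wrongly flag structural lines such as "Enemy (".

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based numbers of lines that hold text but no ':' separator.
std::vector<int> FindUnnamedLines(const std::string& text) {
    std::vector<int> flagged;
    std::istringstream in(text);
    std::string line;
    int number = 0;
    while (std::getline(in, line)) {
        ++number;
        const std::size_t start = line.find_first_not_of(" \t\r");
        if (start == std::string::npos) continue;         // blank line
        if (line.compare(start, 2, "//") == 0) continue;  // comment
        if (line.find(':') == std::string::npos) flagged.push_back(number);
    }
    return flagged;
}
```

Run at load time over a balancing file, a check like this would report the line number of the "enemy-health 100" typo instead of letting it through as an unnamed value.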


Edited by Scarabus2, 04 April 2013 - 08:53 AM.

visualnovelty.com - Novelty - Visual novel maker

#6 mikenovemberoscar   Members   -  Reputation: 215

Posted 04 April 2013 - 10:34 AM

LEON looks really cool, good job.





