• 03/01/14 12:15 PM

    YAML Basics and Parsing with yaml-cpp

    Engines and Middleware


    YAML - YAML Ain't a Markup Language

    YAML is a general-purpose data serialization language. Its most common uses are storing configuration, data persistence and messaging between online services. As its name states, YAML is not a markup language, which makes for more readable files. Example XML code:

        <settings>
          <graphics>
            <vsync>true</vsync>
            <quality>ultrahigh</quality>
            <resolution>
              <width>1920</width>
              <height>1080</height>
            </resolution>
          </graphics>
          <gameplay>
            <difficulty>hard</difficulty>
            <invert_y>false</invert_y>
          </gameplay>
        </settings>

    Equivalent YAML code:

        settings:
          graphics:
            vsync: true
            quality: ultrahigh
            resolution:
              width: 1920
              height: 1080
          gameplay:
            difficulty: hard
            invert_y: false

    Very Brief Introduction to YAML

    This is a short, yet objective, introduction. By reading this you should be able to create and edit simple YAML files (it's not that hard, but it takes a few minutes to get the hang of the syntax). If you want a complete overview, go to yaml.org and look at the specification; you'll be amazed by how flexible (or inflexible, sometimes) it can be. YAML, a data serialization language, is designed to be easily readable by humans. It can store any type of data (even in binary form, if you so want).

        name: Dejaime #this is a comment
        #the character ":" is used for value attribution, so the line above
        # means "name = Dejaime"

    YAML has three basic structures: scalars, sequences and mappings. A document is formed by several nodes (or objects, if you prefer). A node can be a scalar node, which holds information, or a map or sequence node, which holds other nodes. In graph terms, they are either branch or leaf nodes.
    • Scalars: as the name hints, these are simple values, sometimes with an individual identifying name, sometimes without. They can be strings, numbers...

        name: Dejaime   #this is a simple string scalar
        age: 22         #this is me mistyping my age in a number scalar so it makes me look younger
        description: >  #This is a string scalar that spans over two lines.
          needs a haircut,
          badly.

    • Sequences: these are simply a list of nodes without any special identifications. They are easily accessible by index.
    meals: ['0930', '1230', '1600', '2130'] #format hhmm, this is a string sequence
    • Mappings: also referred to as hashes and dictionaries, this is a structure that lets you relate identifiers to their information in a more direct way. These identifiers (also referred to as keys) are usually a simple string name, and the information in them can be of any type: numbers, strings or even other maps and sequences. It is comparable with a std::map (the YAML specification requires the keys of a mapping to be unique).

        #same examples above, they are actually a map, where I give names to values.
        person:
          name: Dejaime   #mapped value named "name".
          age: 22         #mapped value named "age"
          description: >  #mapped value named ... I guess you got it.
            needs a haircut,
            badly.
          meals: ['0930', '1230', '1600', '2130']   #sequence mapped to a name, "meals".
          days_skipped_gym:   #this is a sequence of sequences mapped to a name (phew!)
            - [Oct2013, 31]
            - [Nov2013, 30]
            - [Dec2013, 31]
            - [Jan2014, 31]
            - [Feb2014, 14, and_growing]   #they do not need to be uniform!

    Of course, these can be used together to create more and more complex documents, which, in turn, can be used to store any kind of information. Just don't create a password vault with this, it wouldn't be a good idea. The language itself has some neat features, such as unique key identifiers (marked by a "? ", called complex keys in the specification) and variables marked by the '&' and '*' characters (called anchors and aliases in the specification).

        ? [sprite, zombie1]   #this is a unique sequence identifier
        : sprite_file: zombie1.png
          sprite_sheet_folder: &sprites_folder \home\dejaime\GameDev\Spritesheets\   #sprites_folder is a reusable reference
        ? [sprite, zombie2]
        : sprite_file: zombie2.png
          sprite_sheet_folder: *sprites_folder   #reusing the string above

    Having GUIDs is a great way to serialize any type of asset or object data, but I personally don't like YAML's syntax for these unique keys (actually I hate it, but well, them's the breaks). In that example, I use a sequence (denoted by "[]") as the key, which allows me to use, for example, [sound, zombie1] or [data, zombie1] later, when I want to store different types of data. This unique sequence key feature is extremely useful. Using reference variables is also very handy, especially when you may need to change a value shared by several items. In this example, if I change the folder where I hold my spritesheets, all I need to do is change that one single line, instead of editing every object that references that value (or using a global variable in my engine).
    The YAML syntax is quite tricky to get right and there are lots of gotchas; your first 10~20 tries will probably be invalid. Worry not! Here are some syntax gotchas:
    • No tabs allowed. The blocks in YAML are all defined by indentation, and tabs are banned for that purpose.
    • White space is meaningful at the start of a line and is used to identify blocks (through indentation).
    • ": " isn't necessarily ":". While "name: Dejaime" means that name is a key whose value is the scalar "Dejaime", name:Dejaime (no space) is read as one single scalar, the string name:Dejaime, and not as a key and a value. This is so we can have colons inside values, like in "time_created: 19:03".
    For those who want to try YAML, there's an online syntax checker called yamllint. It validates your text and then spits out a version optimized for Ruby, which'll be of no use for us. The important part is just checking the syntax validity. This should be enough for the article, but if you want to go deeper and into the fancy stuff, dive into the official specification.

    Our Example Problem

    We want to load sprites and their definitions, including all possible animations, frame duration, spritesheet location, and anything necessary, from a single data file. We'll be using the following (public domain) spritesheet:
    The original is available here: http://opengameart.org/content/dutone-tileset-objects-and-character

    We will assume that all frames of an animation are at the same horizontal level, with no border (just like in the sheet above), and that each frame may have an independent duration. We'll also assume a sprite has a unique name and can change between more than one animation (each with its own size). All right. Now that we have our definitions and assumptions, we need to define our YAML file structure, so we can store the information necessary for the whole sprite. So let's list the basic information we need to store:

    Sprite:
    • Unique Name
    • Spritesheet ID
    • Animation List
      • Animation Name
        • Initial SpriteSheet Offset
        • Animation Size
        • Frame List
          • Frame Duration
    Basically, that is all the information we will want in our YAML file. It is a complex map, so let's break it down piece by piece. The first thing we need is the identifying name of the sprite (it can be a number if you prefer):

        Player:

    Notice how I didn't use Name: Player, but simply inserted Player: directly. This means we have something called Player, and not that we have a node called Name with the value Player. Now we add the spritesheet reference. This can be a numeric ID, the path to the spritesheet file or something else. I will be using the spritesheet file path, as we won't be using any file loader that would handle UIDs (we newbies :D [or not]). This takes us to our next line.

        SpriteSheet: /Resources/Textures/duotone.png

    We now have the basic information for the sprite, and need to detail the animations themselves. As each animation has its own name, let us list these names. In our example:

        Anim_Names: [run, idle, jump, die]

    These are the four animations for the player sprite, all added in a sequence called Anim_Names, so we can look them up later. Now that we know, in advance, the names of our sprite's animations, we can map them using their names with no problem!

        run:
          #more code
        idle:
          #more code
        jump:
          #more code
        die:
          #more code

    These animations also carry specific information: their size and their offset in the spritesheet.

        Offset: {x: 0, y: 0}
        Size: {w: 32, h: 32}

    The animations also need to know how many frames they have, as well as the duration of each of these frames. We'll do the same thing we did to list the animation names.

        Frame_Durations: [80, 80, 80, 80, 80, 80]

    We have an animation with six frames, each lasting 80 ms. This is the last piece of information we need to add to the animation, and to the sprite itself.
    Which takes us to our Player sprite configuration:

        Player:
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [run, idle, jump, die]
          run:
            Offset: {x: 0, y: 0}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80, 80]
          idle:
            Offset: {x: 0, y: 32}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 120, 80, 30, 30, 130] #Notice the different durations!
          jump:
            Offset: {x: 0, y: 64}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 120, 80, 80, 0] #Can I say 0 means no skipping?
          die:
            Offset: {x: 0, y: 192} #192? Yup, it is the last row in that sheet.
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80] #this one has only 5 frames.

    And to add the remaining sprites:

        Monster: #lam nam
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [hover, die]
          hover:
            Offset: {x: 0, y: 128}
            Size: {w: 32, h: 32}
            Frame_Durations: [120, 80, 120, 80]
          die:
            Offset: {x: 0, y: 160}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80]
        Gem:
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [shine]
          shine:
            Offset: {x: 0, y: 96}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80, 80]

    Now we have a small problem. As the entire file is a map, we'll need to know the unique names of our sprites (in this case, Player, Monster and Gem). In addition, there's no way to access them by a numeric index. It won't be a problem when we have some sort of level definition, specifying all of its objects and their respective sprites by name, referencing our sprite definitions file. But even then, this line won't hurt:

        Sprites_List: [Player, Monster, Gem]

    So, this is our definitive Sprites.yaml file:

        Sprites_List: [Player, Monster, Gem]
        Player:
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [run, idle, jump, die]
          run:
            Offset: {x: 0, y: 0}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80, 80]
          idle:
            Offset: {x: 0, y: 32}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 120, 80, 30, 30, 130] #Notice the different durations!
          jump:
            Offset: {x: 0, y: 64}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 120, 80, 80, 0] #Can I say 0 means no skipping?
          die:
            Offset: {x: 0, y: 192} #192? Yup, it is the last row in that sheet.
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80] #this one has only 5 frames.
        Monster: #lol that lam nam
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [hover, die]
          hover:
            Offset: {x: 0, y: 128}
            Size: {w: 32, h: 32}
            Frame_Durations: [120, 80, 120, 80]
          die:
            Offset: {x: 0, y: 160}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80]
        Gem:
          SpriteSheet: /Resources/Textures/duotone.png
          Anim_Names: [shine]
          shine:
            Offset: {x: 0, y: 96}
            Size: {w: 32, h: 32}
            Frame_Durations: [80, 80, 80, 80, 80, 80]
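    To make the meaning of Offset, Size and Frame_Durations concrete, here is a small sketch of how an engine might use them at playback time. It relies only on the assumptions stated earlier (frames side by side on one row, no border, durations in milliseconds). The Rect type and the functions FrameRect and FrameAt are hypothetical names made up for this illustration; they are not part of the article's code or of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type; a real engine would use its own.
struct Rect { int x, y, w, h; };

// Source rectangle of frame `index`, assuming frames sit side by side on one
// row with no border, starting at the animation's Offset.
Rect FrameRect(int offsetX, int offsetY, int w, int h, std::size_t index) {
    return Rect{offsetX + static_cast<int>(index) * w, offsetY, w, h};
}

// Index of the frame shown `elapsedMs` after the animation started, looping
// back to the first frame when the end is reached.
std::size_t FrameAt(const std::vector<int>& durations, int elapsedMs) {
    assert(!durations.empty() && elapsedMs >= 0);
    int total = 0;
    for (int d : durations) total += d;
    if (total <= 0) return 0;            // avoid dividing by zero
    int t = elapsedMs % total;           // position inside the current loop
    for (std::size_t i = 0; i < durations.size(); ++i) {
        if (t < durations[i]) return i;  // t falls inside frame i
        t -= durations[i];
    }
    return durations.size() - 1;         // not reached when total > 0
}
```

    For the idle animation, FrameAt with 200 ms elapsed lands on the third frame (index 2), and FrameRect(0, 32, 32, 32, 2) gives the rectangle starting at x = 64, y = 32. Note that this sketch treats a duration of 0 as a frame that is never shown; giving 0 a special meaning, as hinted in the jump animation's comment, would need extra handling.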


    The library yaml-cpp has two major versions right now, 0.3 and 0.5, both stable enough for use. The version we will be using here is 0.5 (0.5.1, as of this writing), since it has a revamped API, one that makes better use of C++ in my view. It is available at http://code.google.com/p/yaml-cpp/ under the X11 (MIT) license, so no worries here.

    Building yaml-cpp

    The compilation is simple, but yaml-cpp 0.5 depends on the Boost library, so you'll need that first. Under Linux platforms, to compile boost, you'll only need to run ./bootstrap.sh && ./b2 and it will be built. You may want to issue a sudo ./b2 install so boost is installed in your system (under /usr/local). You can also install boost with apt-get or Synaptic, but you'll get a slightly outdated version. Notice that building the entirety of the boost library can take long, but you can build only the parts you're interested in (I personally always build it all). After you install boost on your system, building yaml-cpp is just as simple. Create a Build folder in the yaml-cpp root directory and issue cmake .. && make inside it, and it will be built. If you want, a sudo make install will install yaml-cpp in your system. If you didn't want to install boost, you may need to set the correct path in the boost variables of the yaml-cpp CMakeLists file (or use ccmake, or even a cmake gui, if you prefer). If you have problems or questions in this regard, please refer to their official building guides...

    Parsing the File

    As we are getting into the code part, I must put a license on it.
    To the extent possible under law, Dejaime Antonio de Oliveira Neto has waived all copyright and related or neighboring rights to YAML-CPP C++ Example Code. This work is published from: Brazil.
    Now that we have our file, and we understand how it was created, we can go ahead and create our parser. First, we need to define the structures that will hold that information inside our code. Starting with the Sprite itself, this is what I'll use:

        class Sprite {
            std::string m_sName;
            std::string m_sSpritesheetPath;
            Animation::p_vector m_pvAnimations; //Animation must be defined before Sprite
            bool m_isLoaded;
        public:
            typedef std::vector<Sprite*> p_vector;
            bool IsLoaded () const { return m_isLoaded; }
            bool Load (std::string file, std::string name);
            static bool LoadAll (std::string file, p_vector *target);
        };

    It has variables to hold the name of the sprite, the filepath to the spritesheet and a vector for the animations. The Sprite::Load(string, string) function takes a filepath to the .yaml file (not to be confused with the spritesheet) and the name of a sprite. It can be used directly or by calling Sprite::LoadAll (string, vector<Sprite*>*), which will create, load and push all sprites in the file into the passed sprite vector. Our Sprites also depend on different Animations.

        struct Animation {
            typedef std::vector<Animation*> p_vector;
            v2 m_Offset;
            v2 m_Size;
            std::string m_sName;
            std::vector<float> m_fvDurations;
            Animation () {}
            Animation (std::string p_name, v2 p_offset, v2 p_size) {
                m_sName = p_name;
                m_Offset = p_offset; //Overloaded = operator.
                m_Size = p_size;
            }
        };

    This is nothing more than a fancy struct. As you can see, all the information necessary for any sprite can be stored in these, if it follows our initial assumptions. If you have special needs, you can just alter it to suit them. Maybe move the spritesheet into the animation to allow the animations to live in independent textures, or even add more information to every frame, such as size, to allow an animation to change in size on each frame. Another useful piece of info could be the origin of each frame (like the head, or where to render the "poison cloud"). You get it, this example is indeed using a minimalist approach.

    The Load function

        //Returns false on error
        bool Sprite::Load (std::string file, std::string name) {
            if (m_isLoaded) return false; //Already loaded.

    Since we have our basic structure to hold the information, we can now retrieve it from the file and store it in our objects, in order to use it. The first thing we should do is open our .yaml file. To do this, we need a YAML node that will take the role of the root node of the file. If you don't know what a node is, you can go back to our YAML introduction or look at the specs. The yaml-cpp library works under the YAML namespace, and has a type for a YAML node, the YAML::Node. I guess I didn't really need to say that... Anyway, we need to declare a YAML::Node and assign the root node of our file to it. Note that YAML::Load parses a string of YAML text; to read from a path we need YAML::LoadFile, which throws an exception if the file cannot be opened.

            YAML::Node baseNode;
            try {
                baseNode = YAML::LoadFile(file);
            } catch (const YAML::Exception &) {
                return false; //File not found, or not valid YAML
            }
            if (baseNode.IsNull()) return false; //Empty file?

    Now that we opened the file and have our root node at baseNode, we need to find the node for our sprite. The name of our sprite was passed to us as an argument, and that's what we are going to use. Here is where the library author used C++ operator overloading to give us a really nice API, as you'll see:

            YAML::Node spriteNode = baseNode[name];
            if (!spriteNode.IsDefined()) return false; //Sprite Not Found?

    We simply use our string as the index, and it will find the correct node. If it is not found, we get back an undefined node, which is not the same thing as a null node, so we check it with IsDefined() rather than IsNull(). So, Sprite found, we can now start to load up our information, starting with the name. Of course, there's no need to look the name up, as we just received it as an argument, but we still need to look up the SpriteSheet path.

            //Set the name, that we know exists in the file.
            m_sName = name;
            //Set the SSheet path by casting the value of the SpriteSheet field
            m_sSpritesheetPath = spriteNode["SpriteSheet"].as<std::string>();
            m_isLoaded = true; //point of no return

    With the sprite specifics in place, the next step is to put all animations into the animation vector. Our code will need to know how many animations there are, but that'll be no problem:

            //Now, we need to parse the info on the animations
            const std::size_t totalAnimations = spriteNode["Anim_Names"].size();
            for (std::size_t i = 0; i < totalAnimations; ++i) {

    Every animation is a node, so we should now retrieve it.

                //We get the animation by looking up the string value of the
                // i-th entry on the Anim_Names sequence.
                std::string tmpName = spriteNode["Anim_Names"][i].as<std::string>();
                YAML::Node animNode = spriteNode[ tmpName ];

    With the animation node retrieved, we can now create the animation and populate it with the information from the file. Notice that the size is read from the Size node, using its w and h keys.

                Animation *tmpAnim = new Animation();
                tmpAnim->m_sName = tmpName;
                tmpAnim->m_Offset.x = animNode["Offset"]["x"].as<float>();
                tmpAnim->m_Offset.y = animNode["Offset"]["y"].as<float>();
                tmpAnim->m_Size.x = animNode["Size"]["w"].as<float>();
                tmpAnim->m_Size.y = animNode["Size"]["h"].as<float>();

    The last thing we need to do now is to get the duration of each of our animation's frames, and then store the finished animation in the sprite.

                const std::size_t totalFrames = animNode["Frame_Durations"].size();
                for (std::size_t f = 0; f < totalFrames; ++f) {
                    tmpAnim->m_fvDurations.push_back(
                        animNode["Frame_Durations"][f].as<float>() );
                }
                m_pvAnimations.push_back(tmpAnim); //Keep the animation in the sprite
            }//Finished!
            return true;
        }

    And voilà! All our animations are now inside our Sprite object! Here is the complete code: https://gist.github.com/dejaime/9129611

    The functions we used here were:

        YAML::LoadFile(filepath);  //Load a yaml file
        YAML::Node::as<T>();       //Retrieve the value cast to a specific type.
        YAML::Node::IsNull();      //Find out whether the node is of null type.
        YAML::Node::IsDefined();   //Find out whether the node exists at all.
        YAML::Node::size();        //Gets the size of a sequence Node.

    Want to try?

    Start by creating a valid YAML file. You can test your syntax at yamllint.com. It will probably take some tries, but once you learn it, you will mostly get it right on the first try. A single entry like your personal info will do just fine. After that, create a simple structure to hold that information and load it manually, inside your main.cpp directly if you prefer. Then, move the loading procedure into a class that can load the information itself and, lastly, make several entries and load them independently. Move on to more complex documents and you'll master it before you know it. Thanks to jbeder for the yaml-cpp library! It is so convenient I'm getting lazy. Also, thanks to Gaiden for the helpful review!

    Misc info: yaml-cpp version used: 0.5.1, compiled with boost 1.55.0
