Parsing Commands

For Beginners

Started by Chad Smith March 20, 2013 05:49 AM

8 comments, last by DrSuperSocks 11 years ago

Chad Smith

1,344

Author

March 20, 2013 05:49 AM

I wanted to get some "basic" design ideas on parsing and processing commands that the user has typed in. I already have a basic system that reads the user's input, separates the command from its arguments, and stores both. What I am really asking about is the proper way to handle the commands once they are parsed.

I have done this before, but only for really simple cases: I would just "hard code" an if/else-if chain against a few simple commands. That may work for a small application, but as the command set grows, a more sophisticated design becomes basically mandatory.

I would like a way to "easily" add a command when needed, or to change how many arguments a command requires. I am aiming for a "sophisticated" design, though this isn't by any means a complicated parser, scripting language, or anything like that.

Example of how simple:

>add 10 10

"add" would be the command and the arguments are everything afterwards. I already have a basic system that gives me the command and arguments the user typed in, so I guess parsing the commands and arguments isn't the problem. I can also "split" the arguments up and do something with them. In the previous example I can split 10 and 10 up and give them to add so it can do its job. That is the part that feels very "hackish," though: giving the arguments to the command and processing the command. It feels very hardcoded.

Quick Example of what my code may look like right now:


// ..more stuff here

std::string input = GetUserInput();

// the command the user typed in
std::string command = ExtractCommand(input);
// arguments for the command
std::string arguments = ExtractArguments(input);

// "Mode" or Command we need to use
// CommandMode would just be an enum that takes in the command and returns what "mode" we need to be in
CommandMode userMode = GetMode(command);

switch(userMode)
{
     // ... all the different cases/commands, would run the function needed for that mode/command
}

// ... more stuff here

Sure, it works, but it seems hackish and requires work whenever anything changes, such as adding more commands. It also does nothing to check that the correct arguments were entered, or even whether the command takes any arguments at all.

At this time I also just assume that the arguments passed in are of the correct "type", though later on I am looking to expand it to check that the argument list matches what the command expects, both in types and in the number of arguments.

Thanks for any thoughts and ideas!

L. Spiro

25,818

March 20, 2013 06:16 AM

Frankly I would consider Flex/Bison essential tools for this job.

There is an initial learning curve but it is well worth it. They create parsers for you so all you have to do is specify the syntax (for Bison) and the tokens (for Flex) for your language and you are then basically free to create any language you want, complex or simple, and the parsing of that language will be stable and fast.

Your feeling that you are going about this the wrong way is correct.

You really want to get a real parser with look-ahead capabilities. And with Flex/Bison around, there is no reason to make your own parser from scratch except for learning purposes.

L. Spiro

I restore Nintendo 64 video-game OST’s into HD! https://www.youtube.com/channel/UCCtX_wedtZ5BoyQBXEhnVZw/playlists?view=1&sort=lad&flow=grid

Ryan_001

3,477

March 20, 2013 06:25 AM

Personally I was never a big fan of flex/bison. If they don't suit your needs, perhaps try Boost::Spirit. For smaller/simpler languages it's clean and succinct.

cr88192

1,572

March 20, 2013 08:12 AM

yeah, agreed.

I have generally held the opinion that flex/bison are often a bit overkill for what one is getting from them.

for very simple "languages", my general preference is basically to have a "split" function which takes a string and splits it up into an array of strings (tokens), and then the first token is used to lookup the command (typically with some way to register new commands, if relevant).

or, if the syntax is not line orientated, to basically just start reading-off tokens and matching against various patterns.
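The split-and-lookup approach described above might look roughly like the following. This is only a minimal sketch: `Split`, `Handler`, and `CommandRegistry` are hypothetical names chosen for illustration, and whitespace is assumed to be the only token separator.

```cpp
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Split a line into whitespace-separated tokens.
std::vector<std::string> Split(const std::string& line)
{
    std::istringstream stream(line);
    std::vector<std::string> tokens;
    std::string token;
    while (stream >> token)
        tokens.push_back(token);
    return tokens;
}

// A handler receives the arguments (everything after the command name)
// and returns true on success.
using Handler = std::function<bool(const std::vector<std::string>&)>;

// Hypothetical registry: maps a command name to its handler.
class CommandRegistry
{
public:
    void Register(const std::string& name, Handler handler)
    {
        handlers_[name] = std::move(handler);
    }

    // Returns false for empty input, unknown commands, or handler failure.
    bool Execute(const std::string& line) const
    {
        std::vector<std::string> tokens = Split(line);
        if (tokens.empty())
            return false;

        auto it = handlers_.find(tokens[0]);   // first token selects the command
        if (it == handlers_.end())
            return false;

        std::vector<std::string> args(tokens.begin() + 1, tokens.end());
        return it->second(args);
    }

private:
    std::map<std::string, Handler> handlers_;
};
```

Adding a command then only means one more `Register` call, for example with a lambda that checks `args.size()` before doing its work; the dispatch code itself never changes.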

for more advanced syntax designs, my general preference has been for hand-written recursive descent parsers.
these scale in a fairly straightforward manner over a fairly wide range of syntactic complexities.

for example, both my scripting language and C parser use recursive descent, ...
many common compilers (GCC, Clang, ...) also use recursive descent, ...

granted, there is a little funkiness when it comes to languages with C-like syntax in a few areas (like regarding parsing things like declarations and similar), and different languages have addressed these in different ways.

ADD: explanation: a recursive descent parser is basically just code for matching against tokens, but typically involves the use of multiple functions which may call themselves recursively to parse things (typically, each type of expression will have its own function).

at a more basic level though, there will be a tokenization function, whose main job is mostly to read off and classify the various kinds of tokens. the tokenizer (or lexer) is typically what deals with the actual input characters, and will generally work by matching various patterns in the characters, and will generally return either individual tokens, or an array of tokens.

(my parsers have most often read tokens one-at-a-time, but many parsers work by splitting the whole input buffer into an array of tokens in advance).
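To make the description concrete, here is a small sketch of a tokenizer plus a recursive descent parser for integer arithmetic. It is not taken from any of the parsers mentioned above; the grammar, `Tokenize`, and `Parser` are assumptions made purely for illustration. It tokenizes the whole input in advance and uses one function per kind of expression.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Tokenizer: turns characters into number tokens and one-character operator tokens.
std::vector<std::string> Tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size())
    {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isdigit(c))
        {
            std::size_t start = i;
            while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i])))
                ++i;
            tokens.push_back(text.substr(start, i - start));
        }
        else if (std::string("+-*/()").find(text[i]) != std::string::npos)
        {
            tokens.push_back(std::string(1, text[i]));
            ++i;
        }
        else
            throw std::runtime_error("unexpected character");
    }
    return tokens;
}

class Parser
{
public:
    explicit Parser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

    int Parse()
    {
        int value = ParseExpression();
        if (pos_ != tokens_.size())
            throw std::runtime_error("trailing tokens");
        return value;
    }

private:
    // expression := term (('+' | '-') term)*
    int ParseExpression()
    {
        int value = ParseTerm();
        while (Peek() == "+" || Peek() == "-")
        {
            std::string op = Next();
            int rhs = ParseTerm();
            value = (op == "+") ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    int ParseTerm()
    {
        int value = ParseFactor();
        while (Peek() == "*" || Peek() == "/")
        {
            std::string op = Next();
            int rhs = ParseFactor();
            if (op == "/" && rhs == 0)
                throw std::runtime_error("division by zero");
            value = (op == "*") ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := number | '(' expression ')'
    int ParseFactor()
    {
        std::string token = Next();
        if (token == "(")
        {
            int value = ParseExpression();   // the recursion happens here
            if (Next() != ")")
                throw std::runtime_error("expected ')'");
            return value;
        }
        if (token.empty() || !std::isdigit(static_cast<unsigned char>(token[0])))
            throw std::runtime_error("expected a number");
        return std::stoi(token);
    }

    // Peek/Next return an empty string once the input is exhausted.
    std::string Peek() const
    {
        if (pos_ < tokens_.size()) return tokens_[pos_];
        return std::string();
    }
    std::string Next()
    {
        if (pos_ < tokens_.size()) return tokens_[pos_++];
        return std::string();
    }

    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
};
```

Operator precedence falls out of the call structure: `ParseExpression` calls `ParseTerm`, which calls `ParseFactor`, so multiplication binds tighter than addition without any explicit precedence table.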

Current Status / Downloads: http://cr88192.mooo.com:8080/wiki/index.php/BGB_Current_Status

YouTube Channel: http://www.youtube.com/user/BGBTech

Main Page: http://cr88192.mooo.com:8080/wiki/index.php/Main_Page

Buckshag

899

March 20, 2013 09:16 AM

I have built a command system for this and use it extensively in our tools.

My design has the following classes:

- CommandManager: holds the history for undo and the registered commands, and is called by the user to execute commands.

- CommandLine: a string such as "-posX 10 -posY 14.53 -activated true -name "test"", which is split into a list of parameters and values.

- CommandSyntax: a list of optional and required parameters per command, and their types, and descriptions of each parameter (a parameter is like posX or name).

- Command: the command base class

- CommandGroup: a group of commands, for example when one action consists of several commands and you want a single Undo to undo all of them together.

The CommandSyntax class can validate the syntax easily: it checks whether all required parameters/arguments were provided and whether they are of the right type (whether floats or ints are really valid, etc.). In the tool I also have a command lookup system that automatically generates the documentation for the commands based on the CommandSyntax.
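As a rough illustration of that kind of validation (not the actual CommandSyntax implementation, which is not shown in this thread), a sketch could look like this. `ParamType`, `ParamSpec`, `IsValidValue`, and `Syntax` are hypothetical names, narrow strings are used instead of wide ones for brevity, and the parameters are assumed to have already been split into a name-to-value map.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

enum class ParamType { String, Int, Float, Bool };

struct ParamSpec
{
    std::string name;
    ParamType   type;
    bool        required;
};

// Returns true if 'text' is a valid value of the given type.
bool IsValidValue(const std::string& text, ParamType type)
{
    if (type == ParamType::String)
        return true;
    if (text.empty())
        return false;

    char* end = nullptr;
    switch (type)
    {
    case ParamType::Int:   std::strtol(text.c_str(), &end, 10); return *end == '\0';
    case ParamType::Float: std::strtod(text.c_str(), &end);     return *end == '\0';
    case ParamType::Bool:  return text == "true" || text == "false";
    default:               return false;
    }
}

// Hypothetical stand-in for a per-command syntax description.
class Syntax
{
public:
    void AddRequired(const std::string& name, ParamType type) { specs_.push_back({name, type, true}); }
    void AddOptional(const std::string& name, ParamType type) { specs_.push_back({name, type, false}); }

    // 'values' maps parameter names (without the leading '-') to their text.
    // On failure, 'error' describes the first problem found.
    bool Validate(const std::map<std::string, std::string>& values, std::string& error) const
    {
        for (const ParamSpec& spec : specs_)
        {
            auto it = values.find(spec.name);
            if (it == values.end())
            {
                if (spec.required)
                {
                    error = "missing required parameter: " + spec.name;
                    return false;
                }
                continue;   // optional and absent: nothing to check
            }
            if (!IsValidValue(it->second, spec.type))
            {
                error = "invalid value for parameter: " + spec.name;
                return false;
            }
        }
        return true;
    }

private:
    std::vector<ParamSpec> specs_;
};
```

This sketch does not reject unknown parameters or handle default values; both would be natural additions, and the same list of `ParamSpec` entries could also drive generated documentation.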

Each Command has an Execute function and Undo function it needs to implement.
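The Execute/Undo pairing can be sketched in a few lines. This is an assumption about the general shape only, not the real classes: `CommandHistory` stands in for the undo part of a manager, and `AddCommand` is a made-up example command.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Base class: every command knows how to run and how to reverse itself.
class Command
{
public:
    virtual ~Command() = default;
    virtual bool Execute() = 0;
    virtual void Undo() = 0;
};

// Hypothetical history holder: owns the executed commands for undo.
class CommandHistory
{
public:
    // Runs the command and remembers it only if it succeeded.
    bool Execute(std::unique_ptr<Command> command)
    {
        if (!command || !command->Execute())
            return false;
        history_.push_back(std::move(command));
        return true;
    }

    // Undoes the most recent command; returns false if there is nothing to undo.
    bool Undo()
    {
        if (history_.empty())
            return false;
        history_.back()->Undo();
        history_.pop_back();
        return true;
    }

private:
    std::vector<std::unique_ptr<Command>> history_;
};

// Example command: adds an amount to a counter and can take it back.
class AddCommand : public Command
{
public:
    AddCommand(int& target, int amount) : target_(target), amount_(amount) {}
    bool Execute() override { target_ += amount_; return true; }
    void Undo() override { target_ -= amount_; }

private:
    int& target_;
    int  amount_;
};
```

A command group could be modelled as another `Command` whose `Execute` runs its children in order and whose `Undo` reverses them in the opposite order.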

Some syntax init is implemented like this:


// init the syntax of the command
void CommandSaveActor::InitSyntax()
{
	GetSyntax().AddRequiredParameter(L"filename",	L"The filename of the actor file.", CommandSyntax::PARAMTYPE_STRING);
	GetSyntax().AddRequiredParameter(L"actorID",	L"The id of the actor to save.", CommandSyntax::PARAMTYPE_INT);
	GetSyntax().AddParameter(L"littleEndian",       L"True in case the actor setup shall be saved in little endian, false in case of big endian. The default value of the parameter will be little endian.", CommandSyntax::PARAMTYPE_BOOLEAN, L"true");
}

Inside the Execute of the command you can then easily extract the parameters/arguments using the commandLine class:


uint32	actorID		= parameters.GetValueAsInt(L"actorID", this);
bool	littleEndian    = parameters.GetValueAsBool(L"littleEndian", this);

String filename;
parameters.GetValue(L"filename", this, &filename);

I can also link callbacks to commands. This is useful, for example, when a model is loaded through a command and other parts of the tool need to react to it, perhaps to update some interfaces. You can hook callbacks to the command and implement those reactions there.

To execute a command you can just do like:


if (commandManager.Execute(L"SaveActor -filename c:/actors/test.actor -littleEndian false") == false)
    LogError(L"Failed to execute the command");

It is quite a big system, but it works very well and it doesn't depend on any external parsers, which I think would be overkill since it is really simple to parse simple command lines.

Hope this gives you some ideas :)

EMotion FX - Character Animation System

L. Spiro

25,818

March 20, 2013 12:03 PM

Spirit is a total piece of crap that does not compile for anyone anywhere.

You can’t find tutorials or examples that work with the latest version.

I downloaded it. Tried it. The classes are recognized which means my linker and include settings are fine, but some (not all) methods are unrecognized.

Okay, search online, find a different version (who knows what version I had, even though it was from their official site and obviously the most recent), same problem just different methods.

Firstly, Spirit doesn’t work.

Simple.

But even if it did ever work on any compiler, that compiler is not a console compiler.

Spirit absolutely 100% does not work on PlayStation Vita, for example. It doesn’t even work on normal GCC compilers because it is too heavily template-based.

When porting your work to GCC you often have to refactor template code. But in the case of Spirit that means entirely re-engineering the whole thing.

It was a nice study in metaprogramming, but it is not practical or useful.

Avoid Spirit by all means.

As someone who has tried them both, I can’t fathom why you would prefer Spirit/Wave over Flex/Bison.

They both have problems. But one of them is actually workable in the end.

I can literally point to Spirit as a source of trauma in my growth as a programmer, and that is not something I can say about many things.

L. Spiro


Ryan_001

3,477

March 20, 2013 12:28 PM

LOL fair enough. I only made small parsers with it and haven't used it for a while. In the end I wrote my own parser.

SiCrane

11,840

March 20, 2013 12:43 PM

To be honest what you're looking at seems sufficiently complicated that I'd consider embedding an existing scripting language if that's an option. I often use Python for embedding a command interpreter in C++, but there are certainly other possibilities.

Chad Smith

1,344

Author

March 20, 2013 02:09 PM

Thanks everyone for the replies. They weren't quite what I was looking for, though after rereading my original post I can see it was hard to know exactly what I meant. I apologize. The replies are very interesting, though, and spark some ideas I will look up pretty soon.

Basically, let's take the example of a text-based adventure game. In those games you usually have commands like move, look, pickup, drop, and so on. While it's easy enough for a small game to just hardcode those, processing them could become complicated, and the code could not easily be moved to another game/application without a lot of changes. Still, I feel that rolling my own scripting language or embedding an existing one might be a bit too much for a system like this, though it does provide some interesting ideas.

DrSuperSocks

258

March 20, 2013 10:49 PM

Well, for your small number of commands that looks good to me. If you wanted to get semi-fancy, you could have an array of pointers to functions, where each function executes the appropriate command. Then your CommandMode would be the index of the function in your array. The functions would take an int for the number of args submitted and an array of strings for the actual args. Just an idea.
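A sketch of that idea, using the text-adventure commands from earlier in the thread, might look like this. The enum values, handler functions, and the choice to return the response as a string are all assumptions made for illustration; `GetMode` here is a simple linear search over the table.

```cpp
#include <string>

// The enum value doubles as the index into the function table.
enum CommandMode { MODE_MOVE, MODE_LOOK, MODE_PICKUP, MODE_COUNT, MODE_UNKNOWN = MODE_COUNT };

// Each handler gets the argument count and the argument strings.
typedef std::string (*CommandFunc)(int argc, const std::string* argv);

std::string DoMove(int argc, const std::string* argv)
{
    if (argc != 1) return "usage: move <direction>";
    return "You move " + argv[0] + ".";
}

std::string DoLook(int, const std::string*)
{
    return "You look around.";
}

std::string DoPickup(int argc, const std::string* argv)
{
    if (argc != 1) return "usage: pickup <item>";
    return "You pick up the " + argv[0] + ".";
}

struct CommandEntry
{
    const char* name;
    CommandFunc func;
};

// Order must match the CommandMode enum.
const CommandEntry kCommands[MODE_COUNT] = {
    { "move",   DoMove   },
    { "look",   DoLook   },
    { "pickup", DoPickup },
};

CommandMode GetMode(const std::string& command)
{
    for (int i = 0; i < MODE_COUNT; ++i)
        if (command == kCommands[i].name)
            return static_cast<CommandMode>(i);
    return MODE_UNKNOWN;
}

// Looks up the command and calls its function through the table.
std::string Run(const std::string& command, int argc, const std::string* argv)
{
    CommandMode mode = GetMode(command);
    if (mode == MODE_UNKNOWN)
        return "Unknown command: " + command;
    return kCommands[mode].func(argc, argv);
}
```

Compared with a switch statement, adding a command means adding one enum value, one function, and one table row; the dispatch in `Run` stays the same.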