More compact code?

Dear thread reader, I'm working on my own scripting engine and am currently building the lexer. For a single token I have this structure:
struct TScriptToken
{
  int            Type; //Example: constant
  int            SubType; //Example: if Type is a constant, SubType may be string
  int            LineNum; //Line number this token came from (for neat error output and debug data)
  char*          StrData;
  int            IntData;
  unsigned int   UIntData;
  short          Int16Data;
  unsigned short UInt16Data;
  char           Int8Data;
  unsigned char  UInt8Data;
};

Is there a way to make this code cleaner and maybe easier to use? These structures are stored in a linked list (also homemade).

You might want to consider using a union. If something is an integer, why store string data as well? And vice versa:

Linky to cplusplus.com
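
As a rough sketch of the union idea, assuming C++ and that only one payload is live per token: a tag records which member is valid, and small helpers keep the tag and the payload in sync. The names TokenKind, TokenValue, makeInt, makeString and intValue are made up for illustration and do not come from the original structure.

```cpp
#include <cassert>

// Hypothetical tag saying which union member is currently valid.
enum TokenKind { TK_STRING, TK_INT, TK_UINT };

// All members share the same storage, so only one holds a value at a time.
union TokenValue
{
  const char*  str;  // not owned by the token in this sketch
  int          i;
  unsigned int u;
};

struct TScriptToken
{
  TokenKind  kind;     // which member of `value` may be read
  int        lineNum;  // line the token came from
  TokenValue value;
};

// Illustrative helpers that set the tag and the payload together.
inline TScriptToken makeInt(int lineNum, int v)
{
  TScriptToken t;
  t.kind = TK_INT;
  t.lineNum = lineNum;
  t.value.i = v;
  return t;
}

inline TScriptToken makeString(int lineNum, const char* s)
{
  TScriptToken t;
  t.kind = TK_STRING;
  t.lineNum = lineNum;
  t.value.str = s;
  return t;
}

// Checks the tag before reading, because reading a member other than
// the one last written is undefined behaviour in C++.
inline int intValue(const TScriptToken& t)
{
  assert(t.kind == TK_INT);
  return t.value.i;
}
```

The token is then as large as its biggest payload rather than the sum of all of them, at the cost of having to check the tag by hand on every read.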

-Stenny

EDIT: Glad I could help a fellow Dutchie[smile]

Thanks, that looks good :D

I am assuming C++.

Consider using enumerated types for the Type and SubType.
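
For instance, a minimal sketch of what enumerated types could look like here. The enumerator names and the typeName helper are invented for illustration; a real lexer would define its own set.

```cpp
#include <cassert>
#include <string>

// Hypothetical token categories.
enum TokenType    { TT_CONSTANT, TT_IDENTIFIER, TT_OPERATOR };
enum TokenSubtype { TS_NONE, TS_STRING, TS_INT };

struct TScriptToken
{
  TokenType    type;
  TokenSubtype subtype;
  int          linenum;
};

// A switch over an enum is easier to read than comparisons against bare
// ints, and many compilers warn when an enumerator is left unhandled.
inline std::string typeName(TokenType t)
{
  switch (t)
  {
    case TT_CONSTANT:   return "constant";
    case TT_IDENTIFIER: return "identifier";
    case TT_OPERATOR:   return "operator";
  }
  return "unknown";
}
```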

Use a std::string for the StrData. This isn't even really a "consider"; you're in for a world of pain otherwise. Similarly, don't use your own linked list; use a standard library container. std::list is a linked list. It might not be the container you actually want, though; there are several to choose from.
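
A small sketch of how that might look, assuming a simplified token with only three fields. TokenList, addToken and countOnLine are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

struct TScriptToken
{
  int         type;
  int         lineNum;
  std::string text;  // owns its characters, so no manual new/delete
};

typedef std::list<TScriptToken> TokenList;

// Appends one token; the container takes care of node allocation.
inline void addToken(TokenList& tokens, int type, int lineNum,
                     const std::string& text)
{
  TScriptToken t;
  t.type = type;
  t.lineNum = lineNum;
  t.text = text;
  tokens.push_back(t);
}

// Counts the tokens that came from a given source line.
inline std::size_t countOnLine(const TokenList& tokens, int lineNum)
{
  std::size_t n = 0;
  for (TokenList::const_iterator it = tokens.begin(); it != tokens.end(); ++it)
    if (it->lineNum == lineNum)
      ++n;
  return n;
}
```

Changing the typedef to std::vector<TScriptToken> (and including <vector>) would leave the rest of this sketch unchanged, which is one reason to prefer the standard containers over a homemade list.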

Consider whether you really need to distinguish all those different integer types for your scripting engine. The reason those types are in C++ is so that the programmer can control memory usage at a byte level. If you're doing "scripting", you shouldn't have to worry about such things.

Instead of a union, consider just having a std::string that holds the textual representation of the token, and providing helper functions to extract the value as whatever other type. Again, this is easy with the standard library:


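#include <cassert>
#include <sstream>
#include <string>

// TokenType and TokenSubtype are assumed to be enums defined elsewhere,
// with STRING and INT among the TokenSubtype values.
struct TScriptToken {
  TokenType type;
  TokenSubtype subtype;
  int linenum;
  std::string text;

  const std::string& asText() const {
    assert(subtype == STRING);
    return text;
  }

  int asNumber() const {
    assert(subtype == INT);
    int result = 0;
    // Parse outside the assert: asserts are compiled out under NDEBUG,
    // and the extraction would disappear with them.
    std::istringstream in(text);
    const bool parsed = !(in >> result).fail();
    assert(parsed);
    (void)parsed;
    return result;
  }
};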

boost::any

http://www.boost.org/doc/html/any.html

cdiggins::any

http://www.codeproject.com/cpp/dynamic_typing.asp

And if you still need to know the actual type:

struct TScriptToken
{
  TokenType type;  // your own enum describing the kind of token
  any value;       // boost::any or cdiggins::any
};
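
To show how a value is read back out, here is a sketch that assumes a C++17 compiler and uses std::any, the standard library counterpart of boost::any; boost::any_cast has the same pointer and value forms. TokenType, its enumerators and the intOr helper are made up for this example.

```cpp
#include <any>
#include <cassert>
#include <string>

// Hypothetical type tag.
enum TokenType { TT_STRING, TT_INT };

struct TScriptToken
{
  TokenType type;
  std::any  value;
};

// Returns the int payload, or `fallback` if the token holds something else.
inline int intOr(const TScriptToken& t, int fallback)
{
  // The pointer form of any_cast yields a null pointer on a type
  // mismatch instead of throwing.
  const int* p = std::any_cast<int>(&t.value);
  return p ? *p : fallback;
}
```

The value form, std::any_cast<int>(t.value), throws std::bad_any_cast on a mismatch, which may be preferable when a wrong type indicates a bug in the lexer.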

Actually, I'd say boost.variant is better suited to this problem than boost.any.


#include <stdexcept>
#include <string>
#include <boost/variant.hpp>

enum Type
{
  string_t,
  int_t,
  char_t
};

// can hold a string, an int, or a char
typedef boost::variant<std::string, int, char> DataType;

struct Token
{
  Type type;
  DataType data;
};

// emits a push instruction
// (current_script is assumed to be the script being compiled, defined elsewhere)
struct PushData : boost::static_visitor<>
{
  // this will be called if the data is a string
  void operator()(const std::string&) const
  {
    throw std::logic_error("strings cannot be pushed on the stack");
  }

  // this will be called if it is an int
  void operator()(int i) const
  {
    current_script.emit_push(reinterpret_cast<char*>(&i), sizeof(i));
  }

  // this will be called if it is a char
  void operator()(char c) const
  {
    current_script.emit_push(&c, 1);
  }
};

int main()
{
  Token t;
  t.type = int_t;
  t.data = 5;
  // calls the version of operator() that takes an int
  boost::apply_visitor(PushData(), t.data);
  // we have a bug and accidentally store a string instead of a char
  t.data = std::string("hello");
  // but this still calls the string version, which throws the exception
  // and lets us know before bad things happen
  boost::apply_visitor(PushData(), t.data);
}

In my opinion, union types are one of the many things OCaml got right, and they help a lot in this kind of situation. Their machine-level implementation is quite simple: an initial segment describes the type, followed by one or more arguments, which are pointers to the associated values.

However, C++ lacks both pattern matching and reference semantics, so I'll use value semantics, and handle matching through the vptr of each class. The idea is to have a Token base class, from which we inherit all other tokens. Then, using a visitor pattern, we apply operations to the token to extract its content. This allows us to keep type-safety, handle pattern matching elegantly without a switch, and to add any properties we wish to the derived types.


#include <string>

namespace Tokens
{
  // Forward declarations so that Visitor can name the token classes.
  class Token;
  class StringLiteral;
  class IntLiteral;

  class Visitor
  {
  public:
    virtual ~Visitor() {}
    virtual void visit(const Token &) = 0;
    virtual void visit(const StringLiteral &) = 0;
    virtual void visit(const IntLiteral &) = 0;
  };

  class Token
  {
  public:
    const int lineNum;
    Token(int lineNum) : lineNum(lineNum) {}
    virtual ~Token() {}  // tokens are used through base-class references
    virtual void visit(Visitor &) const = 0;
  };

  class StringLiteral : public Token
  {
  public:
    const std::string value;
    StringLiteral(int lineNum, const std::string & value)
      : Token(lineNum), value(value) {}
    void visit(Visitor & v) const { v.visit(*this); }
  };

  class IntLiteral : public Token
  {
  public:
    const int value;
    IntLiteral(int lineNum, const int value)
      : Token(lineNum), value(value) {}
    void visit(Visitor & v) const { v.visit(*this); }
  };
}



Then, simply code your parsing automaton transitions as visitors, and everything should work out correctly.
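
As an illustration of what such a visitor might look like, here is a self-contained sketch that repeats a trimmed-down version of the hierarchy (without the visit(const Token &) overload) and adds one concrete visitor. Describe and describe are hypothetical names; a real parser transition would do something more useful than building a string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace Tokens
{
  class StringLiteral;
  class IntLiteral;

  class Visitor
  {
  public:
    virtual ~Visitor() {}
    virtual void visit(const StringLiteral&) = 0;
    virtual void visit(const IntLiteral&) = 0;
  };

  class Token
  {
  public:
    const int lineNum;
    explicit Token(int lineNum) : lineNum(lineNum) {}
    virtual ~Token() {}
    virtual void visit(Visitor&) const = 0;
  };

  class StringLiteral : public Token
  {
  public:
    const std::string value;
    StringLiteral(int lineNum, const std::string& value)
      : Token(lineNum), value(value) {}
    void visit(Visitor& v) const { v.visit(*this); }
  };

  class IntLiteral : public Token
  {
  public:
    const int value;
    IntLiteral(int lineNum, int value) : Token(lineNum), value(value) {}
    void visit(Visitor& v) const { v.visit(*this); }
  };

  // Hypothetical visitor: one overload per token class plays the role
  // of one branch of a pattern match.
  class Describe : public Visitor
  {
  public:
    std::string result;
    void visit(const StringLiteral& s) { result = "string \"" + s.value + "\""; }
    void visit(const IntLiteral& i)
    {
      std::ostringstream out;
      out << "int " << i.value;
      result = out.str();
    }
  };

  // Dispatches on the dynamic type of the token through its vptr.
  inline std::string describe(const Token& t)
  {
    Describe d;
    t.visit(d);
    return d.result;
  }
}
```

Adding a new token class then forces every visitor to gain a matching overload, since the new pure virtual function makes incomplete visitors fail to compile.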
