
Preprocessor


12 replies to this topic

#1 Deyja   Members   -  Reputation: 920

Posted 08 September 2004 - 03:26 PM

I've just uploaded a new version of the preprocessor I posted in 'Binding somethingorother'. It now supports function macros. It has passed all my tests, but I would appreciate it if anyone could run some macro code through it and tell me whether it works as expected! A complete listing of the script and a list of all the defined macros are output to the console right before the script runs.

Check the test script (script.txt) for a minor quirk in the syntax. Because my lexer strips whitespace, I can't detect whether the () in a function macro immediately follows the macro's name ( #define macro(a) is valid, #define macro (a) is not! ). I instead had to include the # character before the (). It MUST go #define NAME #(args) ... Note the placement of spaces.

Keep in mind that this is a work in progress. The code is very messy. It supports everything I need for my own project now, so I probably won't be adding new features unless I see a demand for them. I will, however, be cleaning up the code. Specifically, I'm going to add a file-loader functor argument, and I still need to figure out how to support relative paths for #include. Clicky! [edit]They replaced UBB with HTML, didn't they? :/[/edit]

#2 Andreas Jonsson   Moderators   -  Reputation: 3416

Posted 09 September 2004 - 01:30 AM

Thank you very much for this great contribution to the AngelScript community. I'm sure many people will find it very useful.

I will upload this as soon as possible. I'm a bit swamped at work right now so I don't have much time, but I'm sure it will clear up in a couple of days, and I'll get back to work on AngelScript as normal.
AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

#3 Deyja   Members   -  Reputation: 920

Posted 09 September 2004 - 10:42 AM

No rush. You'll just have to replace it anyway. :D!

#4 Rizwan Khalid   Members   -  Reputation: 122

Posted 09 September 2004 - 07:15 PM

Hello Deyja,
I have also made a preprocessor as part of my project that uses AngelScript.
As for the problem of relative paths and the default path for #include, I have solved that, so if you want I can help you in this regard. (You have to deal with GetCurrentDirectory() and SetCurrentDirectory().)

The area where I am stuck is also macros, and I cannot use my own convention as you have (# before ()) because I have to parse C++ header files. Simple definitions such as
#define INT int
are not difficult (and I have coped with them).

I would really appreciate any solution (even a partial one), or a list of the problems you have identified in this regard.

Regards
Rizwan Khlid


#5 Deyja   Members   -  Reputation: 920

Posted 10 September 2004 - 12:29 PM

The simplest solution (and the one I will probably employ) is to NOT strip whitespace during the lex phase, but instead generate a lexeme of type 'whitespace'. I'm already doing this with newlines. You have to deal with all the extra tokens later on, though. I did this at one point, but it was much easier to just use my nasty syntax than to deal with the excessive whitespace!
Another way would be to lex on the fly, not all at once. You then end up with something very close to an actual compiler, and it can change the lexer state before lexing the define name so that it can check for that space in there. I won't be doing this, because it makes the splicing operations I do nearly impossible.
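
A minimal sketch of the first approach (keeping whitespace as its own lexeme), in C++. The preprocessor's actual source isn't shown in this thread, so `Kind`, `Lexeme`, `lex`, and `isFunctionMacro` are hypothetical names chosen for illustration, and the lexer is deliberately tiny:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical lexeme kinds; the real preprocessor's types are not shown in the thread.
enum class Kind { Identifier, Punct, Whitespace, Newline };

struct Lexeme {
    Kind kind;
    std::string text;
};

// Split source text into lexemes, keeping runs of spaces/tabs as Whitespace
// lexemes instead of discarding them. Anything that is not an identifier,
// whitespace, or a newline is emitted as a one-character Punct lexeme.
std::vector<Lexeme> lex(const std::string& src) {
    std::vector<Lexeme> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        std::size_t start = i;
        if (c == '\n') {
            ++i;
            out.push_back({Kind::Newline, "\n"});
        } else if (c == ' ' || c == '\t') {
            while (i < src.size() && (src[i] == ' ' || src[i] == '\t')) ++i;
            out.push_back({Kind::Whitespace, src.substr(start, i - start)});
        } else if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            out.push_back({Kind::Identifier, src.substr(start, i - start)});
        } else {
            ++i;
            out.push_back({Kind::Punct, std::string(1, src[start])});
        }
    }
    return out;
}

// True if the directive is a function macro in the C sense: the '(' must
// follow the macro name with no Whitespace lexeme in between.
// Expected shape: '#' 'define' WHITESPACE NAME '(' ...
bool isFunctionMacro(const std::vector<Lexeme>& lx) {
    if (lx.size() < 5) return false;
    return lx[0].text == "#" && lx[1].text == "define" &&
           lx[2].kind == Kind::Whitespace &&
           lx[3].kind == Kind::Identifier &&
           lx[4].kind == Kind::Punct && lx[4].text == "(";
}
```

With this, `isFunctionMacro(lex("#define macro(a) a"))` is true and `isFunctionMacro(lex("#define macro (a) a"))` is false. The cost is the one described above: every later pass has to step over the extra Whitespace lexemes.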

I am VERY interested in your relative-path code. I'm already tinkering with a simple way of making the algorithm recursive, so that includes in included files are relative to the included file, not the 'root' file. That probably didn't make sense.
File A includes file B/C. File C includes file D. File D is actually at B/D relative to A, but is just D relative to C. With my current system, it will look for D in the same directory as A, NOT in B.
The only hitch is 'adding' the paths. Given the paths A/B/C.txt and ../D.txt, the result should be A/D.txt. I'm sure I could do it if I just sat down and worked it out.
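
One way the path 'addition' could be worked out, as a sketch in C++. The function name `resolveInclude` is made up for illustration, and the code assumes relative paths with '/' separators only:

```cpp
#include <string>
#include <vector>

// Resolve `includePath` relative to the directory of `includingFile`,
// collapsing "." and ".." segments.
// Assumption: both paths are relative and use '/' as the separator.
std::string resolveInclude(const std::string& includingFile,
                           const std::string& includePath) {
    // Strip the file name off the including file: "A/B/C.txt" -> "A/B/".
    std::size_t slash = includingFile.find_last_of('/');
    std::string dir = (slash == std::string::npos)
                          ? std::string()
                          : includingFile.substr(0, slash + 1);
    std::string joined = dir + includePath;

    // Walk the segments, letting each ".." cancel the segment before it.
    std::vector<std::string> parts;
    std::size_t pos = 0;
    while (pos <= joined.size()) {
        std::size_t next = joined.find('/', pos);
        if (next == std::string::npos) next = joined.size();
        std::string seg = joined.substr(pos, next - pos);
        if (seg == "..") {
            if (!parts.empty() && parts.back() != "..") parts.pop_back();
            else parts.push_back(seg);  // nothing left to cancel; keep the ".."
        } else if (!seg.empty() && seg != ".") {
            parts.push_back(seg);
        }
        pos = next + 1;
    }

    // Glue the surviving segments back together.
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i) result += '/';
        result += parts[i];
    }
    return result;
}
```

Calling `resolveInclude("A/B/C.txt", "../D.txt")` gives `A/D.txt`, and `resolveInclude("B/C.txt", "D.txt")` gives `B/D.txt`, which covers the A/B/C/D case above as long as each file's resolved path is passed along when its own includes are processed.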

#6 Andreas Jonsson   Moderators   -  Reputation: 3416

Posted 10 September 2004 - 03:16 PM

It's very interesting to hear about the progress in your preprocessors.

I think I'll try to expose some way for a preprocessor to adjust the line and cursor position in the code it outputs. That way any error that AngelScript reports would still be given the correct line and column even if the preprocessor has added extra code.

I'll probably do this as a special token that will be treated as whitespace by the compiler but that can be detected when the stream position is converted into line and column number.

I'm not sure when this will be done though. I'll have to analyze it some more first.

#7 Deyja   Members   -  Reputation: 920

Posted 10 September 2004 - 04:07 PM

I'm going to have to look into the AngelScript source myself. If I can bypass AngelScript's lexer, or work directly with the lexeme stream it produces, I imagine I will gain substantial speed benefits. At least I can avoid dumping it all back into a raw buffer, just for AngelScript to lex it again.

Right now, my preprocessor doesn't do much of any error reporting. It generally just fails silently, and bludgeons on through the rest of the script. I haven't yet found an error state that won't eventually lead AngelScript to complain, though. If you have a bad define, but don't use it, everything is fine. If you do use it, AngelScript will complain: chances are the define won't be expanded, and will result in an unknown identifier.
As for error reporting inside AngelScript, defines can't have newlines in them, so they won't change the number of lines when they are removed. The only problem is included files. I can easily preserve the number of lines, and the position of lines, in a single script. Because of the way AngelScript works, I do not actually have to splice included files into a single chunk of source code. I could load them all into a module, using the filename as the section name. I'd merely have to preprocess all the files with a single define table!
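
A rough sketch of the single-define-table idea, in C++. It handles only simple `#define NAME value` lines (no function macros), and `DefineTable` and `preprocess` are hypothetical names; the call that hands each result to AngelScript as its own section is left out, since its exact signature isn't given in this thread:

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

using DefineTable = std::map<std::string, std::string>;

// Preprocess one file. "#define NAME value" lines are recorded in `defines`
// and replaced by an empty line, so the output keeps the input's line count.
// Identifiers found in `defines` are expanded everywhere else.
std::string preprocess(const std::string& source, DefineTable& defines) {
    std::istringstream in(source);
    std::string line, out;
    while (std::getline(in, line)) {
        if (line.compare(0, 8, "#define ") == 0) {
            std::istringstream def(line.substr(8));
            std::string name, value;
            def >> name;
            std::getline(def >> std::ws, value);  // rest of the line is the body
            defines[name] = value;
            out += '\n';                          // keep the line, drop its text
            continue;
        }
        std::size_t i = 0;
        while (i < line.size()) {
            unsigned char c = static_cast<unsigned char>(line[i]);
            if (std::isalpha(c) || c == '_') {
                std::size_t start = i;
                while (i < line.size() &&
                       (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_')) ++i;
                std::string word = line.substr(start, i - start);
                auto it = defines.find(word);
                out += (it != defines.end()) ? it->second : word;
            } else {
                out += line[i++];
            }
        }
        out += '\n';
    }
    return out;
}
```

Passing the same `DefineTable` to every call means a define seen in one file is expanded in the files processed after it, while each file keeps its own line numbering.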
If I decide to implement the / preprocessor operator, I might have some trouble. In the meantime, don't worry about it. You've already supplied all the tools we need!

#8 Deyja   Members   -  Reputation: 920

Posted 10 September 2004 - 04:25 PM

Wow. I don't think I've ever seen a recursive descent Tokenizer before. Any particular reason you didn't use a state machine?

#9 Andreas Jonsson   Moderators   -  Reputation: 3416

Posted 11 September 2004 - 03:07 AM

No special reason, it was just the first working solution I came up with, and since it worked very well I didn't feel the need to try something else.

Would a state machine provide much improvement?

My tokenizer doesn't produce a lexeme stream like the one you're looking for; it simply identifies the first token in a string. The parser manually moves the position in the character stream to identify each token. I thought that was a better solution than producing an intermediate lexeme stream.
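
To illustrate the general shape of that design (this is not AngelScript's actual tokenizer, and `tokenLength` and `countTokens` are invented names), a C++ sketch might look like this:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Length of the first token in `text` starting at `pos`.
// A run of whitespace counts as one token so the caller can skip it.
std::size_t tokenLength(const std::string& text, std::size_t pos) {
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(text[i]); };
    std::size_t i = pos;
    if (i >= text.size()) return 0;
    if (std::isspace(at(i))) {
        while (i < text.size() && std::isspace(at(i))) ++i;
    } else if (std::isalnum(at(i)) || at(i) == '_') {
        while (i < text.size() && (std::isalnum(at(i)) || at(i) == '_')) ++i;
    } else {
        ++i;  // single-character punctuation
    }
    return i - pos;
}

// The "parser" side: it owns the position and advances it token by token.
// No list of lexemes is ever allocated. Here it just counts the
// non-whitespace tokens it walks over.
std::size_t countTokens(const std::string& text) {
    std::size_t pos = 0, count = 0;
    while (pos < text.size()) {
        std::size_t len = tokenLength(text, pos);
        if (!std::isspace(static_cast<unsigned char>(text[pos]))) ++count;
        pos += len;
    }
    return count;
}
```

For example, `countTokens("int x = 5;")` walks over five tokens (`int`, `x`, `=`, `5`, `;`) without building any intermediate structure.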



#10 Deyja   Members   -  Reputation: 920

Posted 11 September 2004 - 10:46 AM

For most applications, it is probably much faster. I produce a lexeme stream so that I can manipulate it in the form of a linked list, which makes inserting and removing lexemes much easier and faster. Whenever I work on it again, I'm going to make changes to preserve line numbers. I'm not so sure about column numbers, though. I can preserve whitespace exactly (though it's going to make parsing things a real bitch) but, of course, expanding a define immediately screws the column numbers to hell. Anything you can add to allow us to correct column numbers would be great. The # character isn't used anywhere in the script language, so you could use it to signify some sort of column-changing command. It fits well with the preprocessor, too, and the preprocessor will strip out any that script writers put in themselves. It has to be relative, though. '#>5' could add five to the column number. '#<5' could subtract five.
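
The linked-list manipulation can be sketched with `std::list` in C++. The lexemes here are plain strings and `expand` is a made-up name, so this shows the splicing idea rather than the preprocessor's real code:

```cpp
#include <list>
#include <string>

using LexemeList = std::list<std::string>;

// Replace every occurrence of the lexeme `name` in `stream` with a copy of
// `body`. The inserted lexemes are not rescanned, so a body that mentions
// its own name cannot loop forever.
void expand(LexemeList& stream, const std::string& name, const LexemeList& body) {
    for (auto it = stream.begin(); it != stream.end(); ) {
        if (*it == name) {
            LexemeList copy(body);         // splice moves nodes, so work on a copy
            auto next = stream.erase(it);  // unlink the macro name
            stream.splice(next, copy);     // relink the body in front of `next`
            it = next;                     // carry on after the inserted body
        } else {
            ++it;
        }
    }
}
```

Neither `erase` nor `splice` shifts the rest of the list, which is the advantage over editing a flat character buffer.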

It seems like a lot of hassle for error messages.
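
For what it's worth, here is how the relative '#>N' / '#<N' markers might be interpreted when turning a position into a column. This is purely a sketch of the proposal in this post, not anything AngelScript supports; `reportedColumn` is a hypothetical name and columns are counted from zero:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Column to report for the character at `offset` in a preprocessed line.
// Markers of the form "#>N" / "#<N" occupy no columns themselves and shift
// every later column by +N / -N. (Hypothetical syntax from this thread.)
int reportedColumn(const std::string& line, std::size_t offset) {
    int column = 0;
    std::size_t i = 0;
    while (i < offset && i < line.size()) {
        bool isMarker = line[i] == '#' && i + 2 < line.size() &&
                        (line[i + 1] == '>' || line[i + 1] == '<') &&
                        std::isdigit(static_cast<unsigned char>(line[i + 2]));
        if (isMarker) {
            int sign = (line[i + 1] == '>') ? 1 : -1;
            int n = 0;
            i += 2;  // skip '#' and the direction character
            while (i < line.size() && std::isdigit(static_cast<unsigned char>(line[i]))) {
                n = n * 10 + (line[i] - '0');
                ++i;
            }
            column += sign * n;
        } else {
            ++column;
            ++i;
        }
    }
    return column;
}
```

Suppose `x = MAX;` was expanded to `x = 1000000` and the preprocessor emitted `x = 1000000#<4;`. The `;` sits at offset 14 in the output, and `reportedColumn` maps it back to column 7, where it was in the original line.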

#11 Andreas Jonsson   Moderators   -  Reputation: 3416

Posted 11 September 2004 - 12:58 PM

Yes, a lexeme stream is probably faster if you have to manipulate it. But AngelScript simply reads from it and never changes it, so allocating all those list nodes would be a waste of time, I think. Anyway, I might change this in the future when/if I add improved support for preprocessors. Possibly a plug-in-able preprocessor (just a thought that hit me now).

For now I'll leave everything as is. I'll see what I can do in the future about the position adjustment command.



#12 Deyja   Members   -  Reputation: 920

Posted 12 September 2004 - 02:51 AM

I got relative include paths working. It proved simpler than I thought, but I don't know if my solution is portable. It seems Windows is fully capable of following a path like 'scripts/../includes/include.txt', so all I had to do was strip the filename off of one path and append the other to it.

I have also suddenly found a need for #ifdef, so I'm going to implement those before I release again.

#13 Andreas Jonsson   Moderators   -  Reputation: 3416

Posted 12 September 2004 - 03:21 AM

I'll upload your preprocessor to my site today.

How about making a download page with the latest version of the preprocessor? That way I can link to it from my site.

