Parse C code for symbols

Started by
4 comments, last by desertcube 18 years, 7 months ago
I'm currently creating an IDE for a C compiler for the 8051 microcontroller using C#. Everything's going smoothly: I've integrated the compiler, created a project explorer, and I've even got syntax highlighting for the C source code. The main feature left to implement is a sort of IntelliSense.

What I want to be able to do in the source code editor is hover over a variable with the mouse and have a little box come up with the variable's type. I also want to be able to right-click on the variable to bring up a context menu with the option to jump to where the variable was declared.

To do this, I think I need to parse the C code, generating a table of variable names and where they were declared. When the mouse hovers over a word, I'll search the table for the variable and then display how it was declared. Simple! As I'm parsing C code only, not C++, I'm not too bothered about auto-complete for structs or even listing their members.

I've tried googling, but I'm not entirely sure what I'm searching for! I've found this on lcc, but I'm not sure how to use it. Another idea I've had is to create a separate program to parse the C code in the background, which I could write in C++ using Spirit; in particular, there's a C grammar parser that might be helpful.

Any ideas, guys, or even what to search for, would be much appreciated.
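For illustration, the table of variable names and declaration sites described above might look something like the following in C++. This is only a minimal sketch: `SymbolInfo`, `SymbolTable` and their fields are hypothetical names, and it ignores scope entirely (one entry per name).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical record for one declaration; the field names are illustrative.
struct SymbolInfo {
    std::string type;  // declared type as written, e.g. "unsigned char"
    std::string file;  // source file containing the declaration
    int line;          // 1-based line of the declaration
    int column;        // 1-based column of the declared name
};

// Maps a variable name to where and how it was declared.
// Assumption: no scoping, so a later declaration replaces an earlier one.
class SymbolTable {
public:
    void add(const std::string& name, SymbolInfo info) {
        assert(!name.empty());
        symbols_[name] = std::move(info);
    }

    // Returns the stored declaration, or nullptr if the name is unknown.
    const SymbolInfo* find(const std::string& name) const {
        auto it = symbols_.find(name);
        return it == symbols_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, SymbolInfo> symbols_;
};
```

On hover, the editor would call `find` with the word under the mouse and show `type` in the tooltip; "jump to declaration" would use `file`, `line` and `column`.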
Bison is nice and has many grammars available. I've tried both Spirit and Bison and found Bison somewhat simpler to use and considerably faster to compile. Take a look at code examples for both and see which one you like more.
Well, one thought on how to implement the hovering would be to find the offset of your main text box relative to the parent window. Get the mouse coordinates and see if they're within the x/y range of your input window.

If they're within the range and the cursor hasn't moved within N seconds, do the following check:

Store the height and width of the current font used in the editor window. This will only work if you have a fixed-width font (every letter and symbol takes up the same amount of space). Using this, you can work out which line and which column the mouse is sitting on. From there you'll know what string, if any, exists there. Then I would look up the variable in a declaration table.
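A rough sketch of that calculation in C++, assuming a fixed-width font, mouse coordinates already made relative to the top-left corner of the text area, and no scrolling or tab expansion. `positionFromPixel` and `wordAt` are made-up names for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

struct TextPosition { int line; int column; };  // both 0-based

// Convert non-negative pixel coordinates inside the text area to a
// line/column pair. Assumes every character cell is charWidth x lineHeight.
TextPosition positionFromPixel(int x, int y, int charWidth, int lineHeight) {
    assert(charWidth > 0 && lineHeight > 0);
    return TextPosition{ y / lineHeight, x / charWidth };
}

static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Return the run of identifier characters under the given position, or an
// empty string if the position is outside the text or not on such a run.
// Note: a purely numeric run such as "123" is returned as well.
std::string wordAt(const std::vector<std::string>& lines, TextPosition pos) {
    if (pos.line < 0 || pos.line >= static_cast<int>(lines.size())) return "";
    const std::string& text = lines[pos.line];
    if (pos.column < 0 || pos.column >= static_cast<int>(text.size())) return "";
    if (!isIdentChar(text[pos.column])) return "";

    std::size_t begin = static_cast<std::size_t>(pos.column);
    std::size_t end = begin;
    while (begin > 0 && isIdentChar(text[begin - 1])) --begin;   // extend left
    while (end < text.size() && isIdentChar(text[end])) ++end;   // extend right
    return text.substr(begin, end - begin);
}
```

The word returned by `wordAt` is what would then be looked up in the declaration table.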

Probably not the best method but I think that would work.
-------------------------------Sometimes I ~self();
Sorry about the late reply, I've been without internet connection for the past couple of days.

private_ctor: Since I'm using C#, the mouse hover event is already provided; the problem I need to figure out is how to parse the C files. Sorry about the confusion.

255: I've already used Spirit and never really liked the idea of using Bison or yacc, but that's just personal preference. Thanks for the suggestion.

What I really wanted to know is how to read through C source code and find out information about all the variable and function definitions. I'd ideally like to do this using C# and parse it in a low-priority thread in the background, but I can use C++ and create a separate program that I can execute from my C# code. I may be able to hack something which scans through C code for variable/function declarations and their locations, but how do I handle a variable's scope (i.e. which variables will be visible at the current level)?

I think I might try to change the Spirit C grammar example to call a function whenever it encounters a declaration. I think I would also need to keep track of the current scope, perhaps in some form of tree structure. Some random example:
On scope begin:
    Create a node with the current column/line position and add it to the nodes collection.
    Set the current node to the newly added one.
On scope end:
    Set the current node's end position to the current column/line position.
    Set the current node to the current node's parent.

Do you think something like this could work? Also, since I'll be using C++, what would be the best data structure? A custom one, or a couple of std::lists?
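As a sketch of the scope-tracking idea above, one possible custom structure in C++ is a tree of scope nodes, each owning its children and holding its own declarations. All names here (`Position`, `Scope`, `ScopeTree`) are hypothetical, positions are 1-based line/column pairs, and lookups are assumed to happen after parsing has finished, so every scope has its end position set.

```cpp
#include <climits>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Position { int line; int column; };

// True if a is at or before b in the source text.
static bool atOrBefore(Position a, Position b) {
    return a.line < b.line || (a.line == b.line && a.column <= b.column);
}

struct Scope {
    Scope(Position b, Scope* p) : begin(b), end(b), parent(p) {}
    Position begin;
    Position end;
    Scope* parent;
    std::vector<std::unique_ptr<Scope>> children;
    std::map<std::string, Position> declarations;  // name -> declaration site
};

class ScopeTree {
public:
    // The root node stands for file scope and spans the whole file.
    ScopeTree() : root_(new Scope(Position{1, 1}, nullptr)), current_(root_.get()) {
        root_->end = Position{INT_MAX, INT_MAX};
    }

    // On scope begin: add a child node and make it the current node.
    void beginScope(Position pos) {
        current_->children.push_back(std::unique_ptr<Scope>(new Scope(pos, current_)));
        current_ = current_->children.back().get();
    }

    // On scope end: record the end position and return to the parent.
    void endScope(Position pos) {
        current_->end = pos;
        if (current_->parent != nullptr) current_ = current_->parent;
    }

    void declare(const std::string& name, Position pos) {
        current_->declarations[name] = pos;
    }

    // Find the declaration of name visible at pos: start in the innermost
    // scope containing pos and walk outwards. A declaration only counts if
    // it appears at or before pos. Returns nullptr if nothing is visible.
    const Position* lookup(const std::string& name, Position pos) const {
        for (const Scope* s = innermostAt(pos); s != nullptr; s = s->parent) {
            auto it = s->declarations.find(name);
            if (it != s->declarations.end() && atOrBefore(it->second, pos))
                return &it->second;
        }
        return nullptr;
    }

private:
    const Scope* innermostAt(Position pos) const {
        const Scope* scope = root_.get();
        bool descended = true;
        while (descended) {
            descended = false;
            for (const auto& child : scope->children) {
                if (atOrBefore(child->begin, pos) && atOrBefore(pos, child->end)) {
                    scope = child.get();
                    descended = true;
                    break;
                }
            }
        }
        return scope;
    }

    std::unique_ptr<Scope> root_;
    Scope* current_;
};
```

The grammar's actions would call `beginScope` on `{`, `endScope` on `}` and `declare` on each declaration; the hover code would call `lookup` with the word and mouse position. Walking outwards from the innermost scope is what makes an inner declaration shadow an outer one of the same name.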
Sounds like a simple tree would work. If you want to use a C++ thread/subprocess, your biggest concern will probably be synchronization and sharing the data structure with C#. I guess it'd be easiest to have the parser work as a thread (not a separate program) and use simple data structures that both languages can understand, or to find a C# parser generator.

Have you looked at ANTLR? It seems mature and advertises C# output.
Thank you 255, ANTLR seems to fit the bill, as it works with C# and has an example C grammar!

Cheers for all your help.

This topic is closed to new replies.
