[C] C compiler macro trouble

8 comments, last by Ectara 12 years, 4 months ago
I'm halfway through writing my preprocessor, and I've hit a snag: macros. I can do simple, dumb text replacement, but it gets more complicated with complex (function-like) macros that take parameters. Such a macro may invoke other complex macros, which may pass their own tokens along with a previous parameter as a new parameter to yet another complex macro, and the whole thing quickly spirals.

For instance,

#define macro1(x) (x)
#define macro2(x) macro1(2 + x)
#define macro3(x) macro2(4 + x)
#define macro4(x) macro3(8 + x)

macro4(23 + 321);
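
For reference, here is what a conforming C preprocessor makes of the example above, written as a small check. The function name expanded_value is a hypothetical helper added only so the result can be inspected; the comment lists the text after each level of replacement.

```c
#define macro1(x) (x)
#define macro2(x) macro1(2 + x)
#define macro3(x) macro2(4 + x)
#define macro4(x) macro3(8 + x)

/* Replacement, one level per step:
     macro4(23 + 321)
     macro3(8 + 23 + 321)
     macro2(4 + 8 + 23 + 321)
     macro1(2 + 4 + 8 + 23 + 321)
     (2 + 4 + 8 + 23 + 321)          */
static int expanded_value(void) {
    return macro4(23 + 321);   /* 2 + 4 + 8 + 23 + 321 */
}
```

Each level only needs the argument text handed to it by the level above, which is why the parameters can be resolved one invocation at a time.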

I have an array of tokens onto which I push new tokens as they are processed. If the next token is a macro name, I paste in the macro's definition instead (recursively, if it isn't a complex macro). However, I'm stuck trying to keep track of all the parameters when macros invoke other macros and build upon each other's parameters. It seems like there should be a simpler way, but I haven't found it yet. Can anyone offer advice?
Without a specific question, or at least code showing what you're doing, the best I can offer is generic advice: look at the macro expansion algorithm used by other preprocessors, such as boost::wave. boost::wave has the additional benefit of having source that you can look at.
I suppose my inquiry was too vague. I did read up on Boost's Wave and read a formal description of the macro expansion process. Unfortunately, it told me only what I already knew.

I was asking how one would implement complex macro expansion, with the possibility of other complex macros being invoked in its defined body. The code I could provide would be irrelevant; it is about to be re-factored, and removed entirely.

I believe much of the problem arises from my urge to conserve resources at all costs. To that end, I was trying to expand all of the macros in one pass, which required keeping a parameter state for every macro currently being evaluated. This would conserve memory and reduce allocations for the token buffer, but it would increase storage for the parameters, not to mention make the whole thing far too complex. Thus, it needed removing.

After reading the description of Wave, I think I may simply evaluate the complex macro and replace the parameter names in a temporary buffer. Then, I'll use a second buffer to expand the macros in the temporary buffer. After each level of expansion, I'll free or reuse the previous buffer and continue to expand while alternating buffers, until the bottom level, where the final buffer is appended to the token output. This leaves me the issue of ensuring that macros cannot form infinite loops: I would keep track of which macros were already expanded, and treat such a macro as a raw token the next time it is encountered. What comes to mind is:

#define orange(x) (x + 4)
#define green(x) (orange(x) + 3)
#define purple(x) (orange(x) + green(x))

purple(4);

In this example, it would expand to:

(orange(4) + green(4))

then

((4 + 4) + (orange(4) + 3))

However, orange has already been expanded previously, yet this seems perfectly valid. I'm not sure how to prevent circular invocation without breaking well-defined behavior.
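
The alternating-buffer idea described above can be sketched roughly as follows. This is a simplified illustration, not the poster's code: it works on text rather than tokens, every macro takes exactly one parameter, and the names Macro, pass, and expand_all are hypothetical. It uses the macro1 to macro4 example from the first post and only terminates when no macro is recursive.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { BUF = 256 };   /* arbitrary buffer size for the sketch */

typedef struct { const char *name, *param, *body; } Macro;   /* hypothetical record */

static const Macro macros[] = {
    {"macro1", "x", "(x)"},
    {"macro2", "x", "macro1(2 + x)"},
    {"macro3", "x", "macro2(4 + x)"},
    {"macro4", "x", "macro3(8 + x)"},
};

/* Length of the identifier or number at p, or 1 for any other character. */
static size_t word_len(const char *p) {
    size_t n = 0;
    while (isalnum((unsigned char)p[n]) || p[n] == '_') n++;
    return n ? n : 1;
}

/* Append n bytes of s to out, truncating instead of overflowing. */
static void put(char *out, const char *s, size_t n) {
    size_t len = strlen(out);
    if (n > (size_t)BUF - 1 - len) n = (size_t)BUF - 1 - len;
    memcpy(out + len, s, n);
    out[len + n] = '\0';
}

static const Macro *lookup(const char *s, size_t n) {
    for (size_t i = 0; i < sizeof macros / sizeof macros[0]; i++)
        if (strlen(macros[i].name) == n && memcmp(macros[i].name, s, n) == 0)
            return &macros[i];
    return NULL;
}

/* open points at '('; returns the matching ')' or NULL. */
static const char *match_paren(const char *open) {
    int depth = 0;
    for (const char *p = open; *p; p++) {
        if (*p == '(') depth++;
        else if (*p == ')' && --depth == 0) return p;
    }
    return NULL;
}

/* One pass: copy src to dst, replacing each invocation with its body and
   substituting the parameter. Returns the number of replacements made. */
static int pass(const char *src, char *dst) {
    int count = 0;
    dst[0] = '\0';
    for (const char *p = src; *p; ) {
        size_t n = word_len(p);
        const Macro *m = lookup(p, n);
        const char *close = (m && p[n] == '(') ? match_paren(p + n) : NULL;
        if (!close) { put(dst, p, n); p += n; continue; }
        const char *arg = p + n + 1;
        for (const char *b = m->body; *b; ) {
            size_t w = word_len(b);
            if (w == strlen(m->param) && memcmp(b, m->param, w) == 0)
                put(dst, arg, (size_t)(close - arg));   /* parameter -> argument text */
            else
                put(dst, b, w);
            b += w;
        }
        count++;
        p = close + 1;
    }
    return count;
}

/* Alternate between buffers a and b until a pass finds nothing to expand. */
static const char *expand_all(const char *input, char *a, char *b) {
    char *src = a, *dst = b;
    a[0] = '\0';
    put(a, input, strlen(input));
    while (pass(src, dst) > 0) {
        char *tmp = src; src = dst; dst = tmp;
    }
    return dst;   /* the last pass copied the final text here unchanged */
}
```

Calling expand_all("macro4(23 + 321);", a, b) with two BUF-sized arrays takes four replacing passes plus one pass that finds nothing left to do. Only two buffers are ever live, at the cost of rescanning the whole text on each pass.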
Use functions instead then...

What you have there is not circular invocation, nor is it an infinite loop.

Use functions instead then...


You missed the point where I'm writing a preprocessor. That answer doesn't make sense; I'm parsing macros and doing text replacement.

[quote name='Ectara' timestamp='1323900195' post='4893992']
#define orange(x) (x + 4)
#define green(x) (orange(x) + 3)
#define purple(x) (orange(x) + green(x))

purple(4);

In this example, it would expand to:

(orange(4) + green(4))

then

((4 + 4) + (orange(4) + 3))

However, orange has already been expanded previously, yet this seems perfectly valid. I'm not sure how to prevent circular invocation without breaking well-defined behavior.


What you have there is not circular invocation, nor is it an infinite loop.
[/quote]

I'm well aware. I said it is perfectly valid, and well-defined. If I simply keep track of which macros have been invoked, it would break the misquoted example. However, if I don't keep track of them, circular invocation would be allowed. So, there must be a way to keep track of which have been invoked, in a way that doesn't break the above.
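
One way to get both properties is the rule the C standard itself uses for rescanning: a macro name is left unreplaced only while that macro's own replacement is being rescanned, including any nested replacements. Sibling invocations are unaffected. The sketch below is a simplified, text-based illustration of that rule with hypothetical names (Macro, active, expand). It assumes one parameter per macro, no whitespace between the name and the opening parenthesis, and no # or ## operators.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { BUF = 256 };   /* arbitrary buffer size for the sketch */

/* Hypothetical record; "active" is set while the macro's own
   replacement text is being rescanned. */
typedef struct { const char *name, *param, *body; int active; } Macro;

static Macro macros[] = {
    {"orange", "x", "(x + 4)", 0},
    {"green",  "x", "(orange(x) + 3)", 0},
    {"purple", "x", "(orange(x) + green(x))", 0},
    {"loop",   "x", "loop(x + 1)", 0},   /* directly recursive */
};

/* Length of the identifier or number at p, or 1 for any other character. */
static size_t word_len(const char *p) {
    size_t n = 0;
    while (isalnum((unsigned char)p[n]) || p[n] == '_') n++;
    return n ? n : 1;
}

/* Append n bytes of s to out, truncating instead of overflowing. */
static void put(char *out, const char *s, size_t n) {
    size_t len = strlen(out);
    if (n > (size_t)BUF - 1 - len) n = (size_t)BUF - 1 - len;
    memcpy(out + len, s, n);
    out[len + n] = '\0';
}

static Macro *lookup(const char *s, size_t n) {
    for (size_t i = 0; i < sizeof macros / sizeof macros[0]; i++)
        if (strlen(macros[i].name) == n && memcmp(macros[i].name, s, n) == 0)
            return &macros[i];
    return NULL;
}

/* open points at '('; returns the matching ')' or NULL. */
static const char *match_paren(const char *open) {
    int depth = 0;
    for (const char *p = open; *p; p++) {
        if (*p == '(') depth++;
        else if (*p == ')' && --depth == 0) return p;
    }
    return NULL;
}

/* Append the full expansion of src to out (out must start as a valid string). */
static void expand(const char *src, char *out) {
    for (const char *p = src; *p; ) {
        size_t n = word_len(p);
        Macro *m = lookup(p, n);
        const char *close = (m && !m->active && p[n] == '(') ? match_paren(p + n) : NULL;
        if (!close) {               /* ordinary text, or a macro that is disabled */
            put(out, p, n);
            p += n;
            continue;
        }
        char raw[BUF] = "", arg[BUF] = "", body[BUF] = "";
        put(raw, p + n + 1, (size_t)(close - (p + n + 1)));
        expand(raw, arg);           /* the argument is expanded before substitution */
        for (const char *b = m->body; *b; ) {
            size_t w = word_len(b);
            if (w == strlen(m->param) && memcmp(b, m->param, w) == 0)
                put(body, arg, strlen(arg));
            else
                put(body, b, w);
            b += w;
        }
        m->active = 1;              /* disabled only during its own rescan */
        expand(body, out);
        m->active = 0;
        p = close + 1;
    }
}
```

With this, purple(4) expands completely, because orange is no longer active by the time green's body is rescanned, while loop(1) stops after one step and leaves loop(1 + 1) in the output. The flag costs one int per macro instead of a parameter state per level.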
A simple way is just a counter of the current number of expansions. If it gets unreasonable, bail out.
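
A minimal sketch of the counter idea, assuming parameterless (object-like) macros to keep it short. The limit MAX_DEPTH and all names here are illustrative choices, not anything from an existing preprocessor.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { BUF = 256, MAX_DEPTH = 64 };   /* both limits are arbitrary */

typedef struct { const char *name, *body; } Macro;   /* hypothetical record */

static const Macro macros[] = {
    {"BASE",  "10"},
    {"WIDTH", "(BASE * 2)"},
    {"PING",  "PONG"},   /* mutually recursive pair */
    {"PONG",  "PING"},
};

/* Length of the identifier or number at p, or 1 for any other character. */
static size_t word_len(const char *p) {
    size_t n = 0;
    while (isalnum((unsigned char)p[n]) || p[n] == '_') n++;
    return n ? n : 1;
}

/* Append n bytes of s to out, truncating instead of overflowing. */
static void put(char *out, const char *s, size_t n) {
    size_t len = strlen(out);
    if (n > (size_t)BUF - 1 - len) n = (size_t)BUF - 1 - len;
    memcpy(out + len, s, n);
    out[len + n] = '\0';
}

static const Macro *lookup(const char *s, size_t n) {
    for (size_t i = 0; i < sizeof macros / sizeof macros[0]; i++)
        if (strlen(macros[i].name) == n && memcmp(macros[i].name, s, n) == 0)
            return &macros[i];
    return NULL;
}

/* Appends the expansion of src to out. Returns 0 on success,
   or -1 once nesting goes deeper than MAX_DEPTH. */
static int expand(const char *src, int depth, char *out) {
    if (depth > MAX_DEPTH) return -1;   /* unreasonable nesting: bail out */
    for (const char *p = src; *p; ) {
        size_t n = word_len(p);
        const Macro *m = lookup(p, n);
        if (m) {
            if (expand(m->body, depth + 1, out) != 0) return -1;
        } else {
            put(out, p, n);
        }
        p += n;
    }
    return 0;
}
```

Note that this turns a recursive macro into an error, whereas a standard C preprocessor would simply leave the recursive name unexpanded.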

Another way is to remember the order in which macros are defined, and only expand macros that were defined before the current one. This prevents both directly recursive and mutually recursive macros.

You might be able to use an existing C preprocessor, depending on what you are doing. Compilers typically have a flag that stops after the preprocessing stage (for example, -E with GCC and Clang). Likewise, I'm sure there must be liberally licensed implementations of the preprocessor you can examine.

Hm. What if I assign each macro an incrementing ID number, and only expand the macro if the parent macro has a higher ID number? I might be able to piece together a system to keep track of that. I'll see if I can find potential flaws before I implement it.
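
The ID check itself is small. Here is a sketch with hypothetical names (Macro, define_macro, may_expand), assuming IDs are handed out in definition order and that a NULL parent means the invocation sits in ordinary source text. One flaw to weigh: standard C lets a macro body name a macro that is defined later, since replacement happens at the point of use, and this rule would leave such a name unexpanded.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { const char *name; uint64_t id; } Macro;   /* hypothetical record */

static uint64_t next_id = 1;   /* incremented once per definition */

static Macro define_macro(const char *name) {
    Macro m = { name, next_id++ };
    return m;
}

/* A child may be expanded inside a parent's body only if it was defined
   earlier, i.e. has a lower ID. Top-level text (parent == NULL) may
   invoke any macro. */
static int may_expand(const Macro *parent, const Macro *child) {
    return parent == NULL || child->id < parent->id;
}
```

Because IDs strictly decrease along any chain of nested expansions, no chain can revisit a macro, which rules out both direct and mutual recursion.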
Changing how the tokens were added to the main token list helped my recursive macro processing greatly, and assigning the macros 64-bit IDs solved the problem of circular invocation (unless someone defines (2**64) - 1 macros in one program). I have yet to finish implementing complex macros. It should be simple once I get to it: the recursive macro expansion function is passed the list of tokens from the macro's invocation, which allows me to parse the parameters from it as I go. Each iteration does not need to worry about any others.

Finally getting around to implementing a vector container drastically simplified the arrays of strings I was constantly maintaining and resizing.
