
Comment Before Header Guards

This topic is 4196 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


I've noticed I and others put a comment first on a header file, even before the include guards. Would this slow down compilation, or is "the compiler" or "the preprocessor" smart enough to see that it is just comments before the include guards? [Edited by - Boder on June 16, 2006 2:49:49 AM]

Quote:
Original post by Boder
I've noticed I and others put a comment first on a header file, even before the include guards.

Would this slow down compilation, or is "the compiler" or "the preprocessor" smart enough to see that it is just comments before the include guards?


As far as I know, comments are flat out ignored (perhaps with some exceptions, such as on preprocessor lines, where they might not be treated as comments), and are probably removed before the compiler even sees the code. I wouldn't expect this to slow compilation by more than a millisecond, even if the comment had to be stripped again on every inclusion and the header was included hundreds of times.

But...I'm guessing!
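
For what it's worth, the guess about preprocessor lines can be checked directly. In standard C and C++, each comment is replaced by a single space in an earlier translation phase than the one that executes directives, so a comment on a #define line never becomes part of the macro. A minimal sketch (the macro and function names are made up for illustration):

```cpp
// Hypothetical macros, only here to show comments on directive lines.
#define ANSWER 40 /* stripped before the directive is processed */ + 2
#define LIMIT 10 // a trailing comment is not part of the macro either

// If either comment had survived into the replacement text, these
// functions would not compile.
int answer() { return ANSWER; }            // expands to: return 40 + 2;
int doubled_limit() { return LIMIT * 2; }  // expands to: return 10 * 2;
```

Calling answer() should give 42 and doubled_limit() should give 20, which only works if both comments were dropped before the macros were recorded.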

Yeah I'm going to guess that the comments won't mess up the "Include Guard Detection Algorithm"

Thanks for the advice [grin]

[Edited by - Boder on June 16, 2006 2:15:53 AM]

The compiler and preprocessor have to be smart enough to understand comments, otherwise they wouldn't work.

The file has to be opened and processed anyway to get to the include guards, and the (trivial) recognition of comments is nothing in the scheme of what a compiler has to do. Consider that even if you have your #ifndef INCLUDE_GUARD as the first line of the file, the whole file needs to be processed to get to the #endif...

I'm okay with it getting opened and parsed once, but I want to make sure it gets the special flag saying "this header has include guards" so it isn't opened/parsed again until the next .cpp file.

It seems that you are misunderstanding something. There is no such thing as a header guard detection algorithm. The preprocessor interprets the preprocessor directives in a way that prevents redefinition of symbols:
#ifndef SOME_SYMBOL
#define SOME_SYMBOL
class A { ... };
#endif // SOME_SYMBOL

This simply means that if SOME_SYMBOL is defined, the part between #ifndef and #endif is not copied into the compilation unit. If SOME_SYMBOL is not defined, we make sure that it will be defined, to prevent another possible inclusion of the file. That's all. No special algorithm, just a flat-out interpretation of what's written.
This is the same for comments. Everything that is between /* and */ is ignored, and everything after // is ignored until the end of the line. That's all.
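
To see this interpretation at work without a second file, here is a small sketch in which the guarded block is simply written out twice, as if the header had been included twice. The struct member and the guarded_value function are invented for the example:

```cpp
// First copy of the guarded block, as if header.h had just been #include'd.
#ifndef SOME_SYMBOL
#define SOME_SYMBOL
struct A { int value = 1; };
#endif // SOME_SYMBOL

/* A comment ahead of the guard changes nothing about how it is evaluated. */
// Second copy: SOME_SYMBOL is already defined, so everything up to the
// matching #endif is skipped and the compiler never sees this definition.
#ifndef SOME_SYMBOL
#define SOME_SYMBOL
struct A { int value = 2; };
#endif // SOME_SYMBOL

// Hypothetical helper reporting which definition of A survived.
int guarded_value() { return A{}.value; }
```

If the second block were not skipped, the file would fail to compile with a redefinition of A; as written, guarded_value() returns 1.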

There is nothing complex in a C/C++ preprocessor :)

Regards,

Emmanuel Deloget, thank you for the explanation. You made me stop and think about what is actually going on.

There was a thread mentioning #pragma once and how it might allow faster compilation compared to traditional include guards. But it was also mentioned that recent compilers like GCC and Visual Studio could detect include guards just like #pragma once to prevent needless re-reading and re-parsing of the file. I want to make sure that if something like this happens

#include <header.h>
#include <header.h>

The preprocessor won't bother at all with the second #include, but of course it would have to if the header wasn't completely wrapped in header guards. Of course it also has to be careful of #undef but I'm sure it looks for that.

Quote:

There was a thread mentioning #pragma once and how it might allow faster compilation compared to traditional include guards. But it was also mentioned that recent compilers like GCC and Visual Studio could detect include guards just like #pragma once to prevent needless re-reading and re-parsing of the file. I want to make sure that if something like this happens


I'd assume any header guard detection optimization would be smart enough to ignore comments. I also wouldn't worry too much about this - your compiler will probably spend an insignificant amount of time in the preprocessor compared to the rest of the compilation.

Quote:

#include <header.h>
#include <header.h>

The preprocessor won't bother at all with the second #include, but of course it would have to if the header wasn't completely wrapped in header guards. Of course it also has to be careful of #undef but I'm sure it looks for that.


It will open and process the file both times. It has to. An explanation:

// header.h
#ifndef HEADER_H_
#define HEADER_H_

class Foo { /* details */ };

#endif // HEADER_H_

// somefile.cpp
#include "header.h" // "" versus <> irrelevant to discussion, btw
#include "header.h"

int main(void) { /* details */ }


After preprocessing of somefile.cpp, it will look like this (I have assumed the preprocessor will not remove comments, to provide some clarification):


// somefile.cpp
// header.h
#ifndef HEADER_H_

class Foo { /* details */ };

// HEADER_H_
// header.h
// HEADER_H_

int main(void) { /* details */ }


Note how the #ifndef / #define / #endif bits were not copied to the preprocessed translation unit (as Emmanuel Deloget said). header.h was opened and processed twice, but its content was only pasted once because of how we used the preprocessor.

This is the typical use of header files, and the way you'd typically expect them to behave. Unfortunately there is no mandate that headers / preprocessor includes must be used this way, and so the preprocessor should not be written to assume that once a file is preprocessed (#include'd) for a given translation unit (e.g., somefile.cpp), it will be ignored (not opened or processed at all) if it was #include'd again from within that original translation unit.

For example, there are techniques for generating enumeration values and such by recursively or multiply-including a header file (with different sets of #define's) that would break if the preprocessor ignored repeated inclusions implicitly. While those sorts of techniques are amazingly ugly and hard to follow, they are not technically exploiting a bug, flaw, or standardization loophole in the preprocessor's behavior.

That's why many development suites offer some kind of #pragma once option that can be used to mark the file as ignorable if multiply-included.

I'm saying, given that a header file is wrapped in traditional include guards, why can't the preprocessor safely ignore any subsequent attempts to #include the header?

It can, because as long as the symbol (HEADER_H_) remains defined, absolutely nothing is copied into the translation unit, effectively making it a noop.

If I were a compiler writer, I would realize people use include guards and check whether a particular header uses them. If it does, I would map the header's filename to the particular symbol it uses and whether it is currently defined. After the first #include, the header would be added to the map and the symbol would be defined.

If I find #undef HEADER_H_ then I ... I guess I would have to maintain a reverse map of symbols to header files. I would turn off the symbol and #include the header again wherever I see the directive.
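
The bookkeeping described here could be sketched roughly as follows. This is a toy model, not how any real compiler is implemented, and the type and member names are invented. It also suggests the reverse map may not be needed: if the check is made at the moment an #include is seen, looking up whether the guard symbol is still defined is enough.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical model of a preprocessor's include-guard bookkeeping.
struct IncludeCache {
    // header file name -> guard macro that wraps the whole file
    std::unordered_map<std::string, std::string> guard_of;
    // macros currently defined in this translation unit
    std::unordered_set<std::string> defined;

    // Called after a header was fully processed and found to be guarded.
    void record_guarded_header(const std::string& file,
                               const std::string& macro) {
        guard_of[file] = macro;
        defined.insert(macro);
    }

    // Called when an #undef directive is processed.
    void undef(const std::string& macro) { defined.erase(macro); }

    // True if a repeated #include of `file` would contribute nothing,
    // so the file does not need to be opened again.
    bool can_skip(const std::string& file) const {
        auto it = guard_of.find(file);
        return it != guard_of.end() && defined.count(it->second) != 0;
    }
};
```

With this model, can_skip("header.h") is false before the first inclusion, true after record_guarded_header("header.h", "HEADER_H_"), and false again after undef("HEADER_H_").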

Guest Anonymous Poster
Quote:
Original post by jpetrie
After preprocessing of somefile.cpp, it will look like this (I have assumed the preprocessor will not remove comments, to provide some clarification):


// somefile.cpp
// header.h
#ifndef HEADER_H_

class Foo { /* details */ };

// HEADER_H_
// header.h
// HEADER_H_

int main(void) { /* details */ }



ITYM

// somefile.cpp
// header.h

class Foo { /* details */ };

// HEADER_H_
// header.h
// HEADER_H_

int main(void) { /* details */ }


Quote:

Note how the #ifndef / #define / #endif bits were not copied to the preprocessed translation unit (as Emmanuel Deloget said). header.h was opened and processed twice, but its content was only pasted once because of how we used the preprocessor.

This is the typical use of header files, and the way you'd typically expect them to behave. Unfortunately there is no mandate that headers / preprocessor includes must be used this way, and so the preprocessor should not be written to assume that once a file is preprocessed (#include'd) for a given translation unit (e.g., somefile.cpp), it will be ignored (not opened or processed at all) if it was #include'd again from within that original translation unit.

[ snip ]

That's why many development suites offer some kind of #pragma once option that can be used to mark the file as ignorable if multiply-included.


You're missing the point. The include guards are a very common idiom, and an idiom that is easily detected. Thus, it's easy to write a preprocessor that can watch for this idiom and have it behave exactly like a "#pragma once" (under C's "as if" rule).

The preprocessor isn't written to assume that once a file has been #include'd in a translation unit it should be ignored if #include'd again in the same translation unit.

The preprocessor is written to assume that once a file with include guards has been #include'd in a translation unit it should be ignored if #include'd again in the same translation unit.

Quote:

For example, there are techniques for generating enumeration values and such by recursively or multiply-including a header file (with different sets of #define's) that would break if the preprocessor ignored repeated inclusions implicitly. While those sorts of techniques are amazingly ugly and hard to follow, they are not technically exploiting a bug, flaw, or standardization loophole in the preprocessor's behavior.


assert.h/cassert is another example that relies on being able to #include a file more than once.

Quote:

ITYM
(snipped AP's corrections)

Yeah, that is what I meant, thanks for catching that.

Quote:

The include guards are a very common idiom, and an idiom that is easily detected.


I disagree with the latter half of that statement:

//header.h

#ifndef FOO
#define FOO
// maybe other code
#endif

#ifndef BAR
#define BAR
// maybe other code
#endif


Which is the include guard? Maybe neither is a guard, maybe both are guards for some multiple-include trick?

Quote:

The preprocessor is written to assume that once a file with include guards has been #include'd in a translation unit it should be ignored if #include'd again in the same translation unit.

I can only see this happening in situations where "include guard detection" is possible, however -- which isn't always easy to do robustly or correctly. I suspect that the benefit of not actually opening or processing the file is negligible against the effort (also admittedly small) it would take somebody to implement this, especially since it wouldn't always be reliable so it'd need to be a very conservative operation. Whereas allowing #pragma once to explicitly say "don't include me any more, include guards or not" is, in contrast, both reliable and much easier to implement.

This hardly constitutes a broad test, however:

//header.h

#ifndef HEADER_H_
#define HEADER_H_

class foo
{
// ...
};

#endif

//main.cpp
#include "header.h"
#include "header.h"

int main(void)
{
// ...
}


Visual C++ 2003 produces the following output when this project is built with the /showIncludes option (I stripped the path names down for brevity):


Compiling...
main.cpp
Note: including file: header.h
Note: including file: header.h
Linking...


Adding #pragma once to header.h yields the expected result (only one "Note:" message). Visual C++ 2005 produces similar results. I don't have another IDE/compiler available to test where I am, unfortunately.

I just don't see why anybody implementing the preprocessor would bother implementing such an assumption when they have better ways to do it.

If I were writing a preprocessor I would simply cache parse trees of recently included files which would allow me to perform multiple inclusion optimisations at least as efficiently as a #pragma once directive.

Σnigma

The preprocessor is a consequence of C++'s single-pass compilation process, which mandates header files (along with the forward declaration requirement - itself a function of the single-pass compile), and is thus largely an anachronism.

If you want to do something productive, write a multi-pass C and C++ compiler.

That said, "inclusion guards" are not trivially identified, for the very reasons jpetrie mentions: they are not a unique functionality, but rather a mere idiom put to that popular use and others. Trying to optimize for something that has no impact on compilation time or complexity is inefficient.

Finally, assert is not dependent on being able to include more than once. I suggest you learn what a compilation unit is.

Quote:
Original post by Oluseyi
Finally, assert is not dependent on being able to include more than once. I suggest you learn what a compilation unit is.


Yes it is. The C standard mandates that the assert.h header be able to be included multiple times to enable/disable the behavior of the assert() macro depending on whether or not the NDEBUG preprocessor symbol is defined at the time of inclusion.
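
A small sketch of what that looks like in practice, using the C++ spelling of the header. The counter and function names are invented for the example; the only behavior relied on is that assert is redefined according to NDEBUG each time the header is included.

```cpp
#undef NDEBUG
#include <cassert>   // first inclusion: assert is active

int checks_run = 0;

// Hypothetical helper that records that it was evaluated.
bool note_check() { ++checks_run; return true; }

// Compiled while assert is active, so note_check() is evaluated.
void checked() { assert(note_check()); }

#define NDEBUG
#include <cassert>   // second inclusion: assert now expands to a no-op

// Compiled while assert is disabled, so note_check() is never evaluated.
void unchecked() { assert(note_check()); }

#undef NDEBUG
#include <cassert>   // third inclusion: assert is active again
```

After calling checked() and then unchecked(), checks_run should be 1. A preprocessor that refused to process the header a second time could not produce this behavior.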

Where is most of the time spent during compilation? I thought hard drive access was a large component, which is why I was worried about the header being included multiple times.

Do assert(), exit(), and abort() cause memory leaks if there are global objects that allocated memory?

Quote:

Where is most of the time spent during compilation? I thought hard drive access was a large component, which is why I was worried about the header being included multiple times.

Hard drive access is a factor, but not due to multiple inclusion of headers. You should prevent unnecessary multiple-inclusion (by actually not using #include if you don't need to, not via #pragmas or guards) because that can cause dependencies to refresh, meaning more files will be compiled than are necessary.

During compilation and link, you'll probably hit the disk more due to reading/writing object files and other intermediate cruft than due to header inclusion. The majority of compilation time will be taken up actually compiling the files, depending on dependency chains, the complexity and size of the code, and compiler options (especially optimizations).

The complexity of the link operation is (iirc) n-squared for Visual Studio, so reducing the number of actual .obj files generated can speed that up (and reduce disk access, maybe, but at the cost of making each .obj more complex to compute and maybe introducing more dependency chains). Optimization options like "whole program optimization" can affect the speed of the link as well (WPO can basically perform some per-obj optimizations across all .obj files... which can be slow).

Quote:

Do assert(), exit(), and abort() cause memory leaks if there are global objects that allocated memory?


This is sort of off-topic for the thread, but:

  • assert(): When you assert, and the assertion fails, you typically hit a breakpoint of some sort. If a debugger is present, you break. If not, you crash. Technically, you leak.

  • exit(): When you call exit(), the calling process is terminated after cleanup and functions registered with atexit() are called. Static objects are destroyed in the opposite order they were created. This is basically equivalent to returning from main().

  • abort(): When you call abort(), you shoot the calling process in the head. Technically, you leak.



When your process goes down, the OS can clean up some resources (dynamic memory, et cetera), but others it might not be able to take care of.

Quote:
Original post by jpetrie

  • exit(): When you call exit(), the calling process is terminated after cleanup and functions registered with atexit() are called. Static objects are destroyed in the opposite order they were created. This is basically equivalent to returning from main().



Objects with automatic storage duration (on the stack) do not get destroyed on calling exit(), so may leak resources.

Guest Anonymous Poster
Quote:
Original post by jpetrie
Quote:

The include guards are a very common idiom, and an idiom that is easily detected.


I disagree with the latter half of that statement:

//header.h

#ifndef FOO
#define FOO
// maybe other code
#endif

#ifndef BAR
#define BAR
// maybe other code
#endif


Which is the include guard? Maybe neither is a guard, maybe both are guards for some multiple-include trick?


Depending on how sophisticated it is, it may work. It would be easy to detect sections of the form #ifndef FOO_H/#define FOO_H/.../#endif and remember whether FOO_H is still defined (or, in your case, both FOO and BAR) when another #include attempt is found. Hey, look at that, it even works when it's not an include guard!

Quote:

Quote:

The preprocessor is written to assume that once a file with include guards has been #include'd in a translation unit it should be ignored if #include'd again in the same translation unit.

I can only see this happening in situations where "include guard detection" is possible, however -- which isn't always easy to do robustly or correctly. I suspect that the benefit of not actually opening or processing the file is negligible against the effort (also admittedly small) it would take somebody to implement this, especially since it wouldn't always be reliable so it'd need to be a very conservative operation. Whereas allowing #pragma once to explicitly say "don't include me any more, include guards or not" is, in contrast, both reliable and much easier to implement.


But it isn't that hard to detect robustly or correctly since the idiom is so common and varies so little.

Quote:

Visual C++ 2003 produces the following output when this project is built with the /showIncludes option (I stripped the path names down for brevity):


Compiling...
main.cpp
Note: including file: header.h
Note: including file: header.h
Linking...


Adding #pragma once to header.h yields the expected result (only one "Note:" message). Visual C++ 2005 produces similar results. I don't have another IDE/compiler available to test where I am, unfortunately.

I just don't see why anybody implementing the preprocessor would bother implementing such an assumption when they have better ways to do it.


Then I assume you don't see any reason to use #pragma once?

And, for what it's worth, gcc implements this optimization. Compiled with the -H option and the include guards it prints ". header.h" only once. If the include guards are removed it prints ". header.h" twice (and gives an error about redefining 'class foo').

Quote:

Depending on how sophisticated it is, it may work. It would be easy to detect sections of the form #ifndef FOO_H/#define FOO_H/.../#endif and remember whether FOO_H is still defined (or, in your case, both FOO and BAR) when another #include attempt is found. Hey, look at that, it even works when it's not an include guard!


It seems you're missing the point now. FOO and BAR in that example might not be inclusion guards at all. If your optimization guessed they were and failed to include the file multiple times from the same translation unit, you may have just broken my build (and since not having include guards and multiply-including files is not in violation of any kind of specification w.r.t. the preprocessor, your implementation is thus buggy and broken).

Quote:

But it isn't that hard to detect robustly or correctly since the idiom is so common and varies so little.

"Robustly and correctly" here refers to being able to correctly determine that a set of preprocessor commands is an include guard in the general case, so that you can apply your proposed optimization, or not, as required to maintain the build. You cannot do this, you can only guess, and must be very conservative about your guess. You can probably apply the optimization only when the top nesting level of #if* / #endif pairs contains only one pair.

(EDIT: And no other code or preprocessor directives at that level. Tests suggest that this is exactly what GCC does, see below).

Quote:

Then I assume you don't see any reason to use #pragma once?

On the contrary. #pragma once is useful for doing what #pragma once does (typically): tell the preprocessor never to include this file again. #pragma once is an explicit declaration to the tools that you never want this file included again in this translation unit. The preprocessor commands #ifndef, #define, and #endif, on the other hand, are mechanisms for controlling the preprocessing of source code that happen to have a common idiomatic usage pattern that helps prevent multiple-definition errors when a file gets included multiple times from the same translation unit. They are not the same.

Quote:

And, for what it's worth, gcc implements this optimization. Compiled with the -H option and the include guards it prints ". header.h" only once. If the include guards are removed it prints ". header.h" twice (and gives an error about redefining 'class foo').

So it does. It's also -- as I suspected it would have to be -- very, very conservative about applying it. Any non-comment characters outside of the first #ifndef block cause it to abort the optimization (interestingly enough, those characters don't even have to be legal code). You can also manually break it by doing #undef HEADER_H_ or whatever after the include, which does suggest that it associates the filename with the symbol used to guard it, as you theorized, and checks to make sure it's still defined.
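
To make the "very conservative" rule concrete, here is a toy detector written under simplifying assumptions: it works on a list of lines, only understands // comments and blank lines outside the guard, and gives up on anything unusual. It illustrates the rule observed in the tests above and is not GCC's actual implementation; detect_guard and its helpers are invented names.

```cpp
#include <string>
#include <vector>

// Strip leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const char* ws = " \t";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

static bool starts_with(const std::string& s, const std::string& p) {
    return s.compare(0, p.size(), p) == 0;
}

// Returns the guard macro if the whole file is one #ifndef X / #define X /
// ... / #endif block with only blank lines and // comments outside it.
// Returns an empty string whenever there is any doubt.
std::string detect_guard(const std::vector<std::string>& lines) {
    std::string guard;
    int depth = 0;               // current #if nesting level
    bool opened = false;         // the candidate guard block has been seen
    bool expect_define = false;  // next directive must be #define <guard>
    for (const std::string& raw : lines) {
        const std::string line = trim(raw);
        const bool ignorable = line.empty() || starts_with(line, "//");
        if (depth == 0) {
            if (ignorable) continue;
            // Anything else outside the guard, or a second top-level
            // block, disqualifies the file.
            if (opened || !starts_with(line, "#ifndef ")) return "";
            guard = trim(line.substr(8));
            if (guard.find_first_of(" \t") != std::string::npos) return "";
            opened = true;
            expect_define = true;
            depth = 1;
            continue;
        }
        if (expect_define) {
            if (ignorable) continue;
            if (line != "#define " + guard) return "";
            expect_define = false;
            continue;
        }
        if (starts_with(line, "#if")) ++depth;          // #if, #ifdef, #ifndef
        else if (starts_with(line, "#endif")) --depth;
    }
    return (opened && depth == 0) ? guard : "";
}
```

Fed the single-guard header.h from earlier in the thread, this returns "HEADER_H_"; fed the FOO/BAR header, or a header with stray text after the #endif, it returns an empty string and the file would simply be reopened as usual.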

[Edited by - jpetrie on June 16, 2006 5:33:37 PM]

Quote:
Original post by SiCrane
Quote:
Original post by jpetrie

  • exit(): When you call exit(), the calling process is terminated after cleanup and functions registered with atexit() are called. Static objects are destroyed in the opposite order they were created. This is basically equivalent to returning from main().



Objects with automatic storage duration (on the stack) do not get destroyed on calling exit(), so may leak resources.


Here's a demonstration of that point (notice how foo's destructor is never called):
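#include <cstdlib>
#include <iostream>

class foo
{
public:
    ~foo() { std::cout << "foo's destructor called\n"; }
};


class bar
{
public:
    ~bar() { std::cout << "bar's destructor called\n"; }
};


int main()
{
    foo a;          // automatic storage: not destroyed by std::exit()
    static bar b;   // static storage: destroyed during std::exit()

    std::exit(0);
}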


Outputs:
bar's destructor called

Guest Anonymous Poster
Quote:
Original post by jpetrie
Quote:

Depending on how sophisticated it is, it may work. It would be easy to detect sections of the form #ifndef FOO_H/#define FOO_H/.../#endif and remember whether FOO_H is still defined (or, in your case, both FOO and BAR) when another #include attempt is found. Hey, look at that, it even works when it's not an include guard!


It seems you're missing the point now. FOO and BAR in that example might not be inclusion guards at all. If your optimization guessed they were and failed to include the file multiple times from the same translation unit, you may have just broken my build (and since not having include guards and multiply-including files is not in violation of any kind of specification w.r.t. the preprocessor, your implementation is thus buggy and broken).


Here's the thing, it doesn't matter whether it's an include guard or not so long as the optimization only catches cases that behave as if they were include guards. From the code you showed, your FOO and BAR look like include guards. I don't know if gcc would catch this because having two guards in one file is much less common than having a single guard.

Quote:

Quote:

And, for what it's worth, gcc implements this optimization. Compiled with the -H option and the include guards it prints ". header.h" only once. If the include guards are removed it prints ". header.h" twice (and gives an error about redefining 'class foo').

So it does. It's also -- as I suspected it would have to be -- very, very conservative about applying it. Any non-comment characters outside of the first #ifndef block cause it to abort the optimization (interestingly enough, those characters don't even have to be legal code). You can also manually break it by doing #undef HEADER_H_ or whatever after the include, which does suggest that it associates the filename with the symbol used to guard it, as you theorized, and checks to make sure it's still defined.


Well, calling this "very, very conservative" and saying a #undef "breaks" it seems a bit extreme to me. As it is, the vast majority of include guards will be robustly and correctly optimized without violation of the standard. For further optimization, you'd have to beef up the preprocessor to actually understand C/C++ (instead of just being able to tokenize it).

Quote:

Quote:

But it isn't that hard to detect robustly or correctly since the idiom is so common and varies so little.

"Robustly and correctly" here refers to being able to correctly determine that a set of preprocessor commands is an include guard in the general case, so that you can apply your proposed optimization, or not, as required to maintain the build. You cannot do this, you can only guess, and must be very conservative about your guess. You can probably apply the optimization only when the top nesting level of #if* / #endif pairs contains only one pair.

(EDIT: And no other code or preprocessor directives at that level. Tests suggest that this is exactly what GCC does, see below).


This seems no great limitation to me. I don't know that I've ever seen code outside include guards except in cases where it was intended to be included multiple times. Thus, in my experience, any time #pragma once could've been applied, include guard optimization could also be applied.

Quote:

Quote:

Then I assume you don't see any reason to use #pragma once?

On the contrary. #pragma once is useful for doing what #pragma once does (typically): tell the preprocessor never to include this file again. #pragma once is an explicit declaration to the tools that you never want this file included again in this translation unit. The preprocessor commands #ifndef, #define, and #endif, on the other hand, are mechanisms for controlling the preprocessing of source code that happen to have a common idiomatic usage pattern that helps prevent multiple-definition errors when a file gets included multiple times from the same translation unit. They are not the same.


I don't claim that they're the same, I do claim that the #ifndef/#define/#endif is so idiomatic that any programmer with a reasonable amount of experience in C/C++ will recognize it as an include guard (just like they wouldn't have to parse "for (int i = 0; i <
