[C] Skipping comments in a text file

I need to read values from a text file while skipping comments. A comment starts with a '#' and continues until the end of the line, but the catch is that it can start anywhere, not just at the beginning of a line. For example, this is a valid file:
# comment
5
6   #comment
#comment
7
I want to write a function that skips all comments up to a new value, but I can't think of a clean way to do it. Here's what I came up with (no error checking):
void skipLine(FILE *file) {
    char ch;

    while ((ch = (char)fgetc(file)) != '\n')
        ;
}

void skipComments(FILE *file) {
    char ch;

    while (1) {
        fscanf(file, "%c", &ch);
        if (ch == '#')
            skipLine(file);
        else {
            ungetc(ch, file);
            return;
        }
    }
}


What bothers me the most is the use of ungetc() - it just doesn't feel very elegant. Is there another way to do this? Note that this has to be done in straight C. Thanks in advance.

Mu. Don't make a "skipComments" function, make a "readLine" function which strips out any comments it happens to come across. Or "readInt" or whatever.

Or, for non-whitespace sensitive parsing, use recursion:

int skip(FILE *fp)
{
    int c = fgetc(fp);
    while (isspace(c)) c = fgetc(fp);
    if (c == '#')
    {
        while (c != EOF && c != '\n') c = fgetc(fp);
        return skip(fp); // teh recursive bitz
    }
    return c;
}

Not tested, but the general idea. Deals with several comments in a row. Rest of the parser then uses skip like:

type next(FILE *fp)
{
    int c = skip(fp);
    if (c == EOF) return EndOfFile;
    // etc
}

Approach I've been using with my scripting languages for years and my computer hasn't exploded yet.

Quote:
 Original post by EasilyConfused...

I think your function reads an extra character, which is a problem because the file can contain numbers with several digits. This is why I used ungetc().
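One way to address the extra-character concern without calling ungetc() is to keep the single character of lookahead inside the parser itself. The following is only a sketch under that assumption; the Reader struct and the names reader_init, skip_blanks and next_number are hypothetical and do not come from any post in this thread. It handles unsigned decimal numbers only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical reader: the stream plus the one character read ahead. */
typedef struct {
    FILE *fp;
    int look;   /* next unconsumed character, or EOF */
} Reader;

/* Prime the lookahead with the first character of the stream. */
void reader_init(Reader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->look = fgetc(fp);
}

/* Advance past whitespace and '#' comments. On return, r->look holds the
   first significant character, or EOF. */
static void skip_blanks(Reader *r)
{
    for (;;) {
        while (r->look != EOF && isspace(r->look))
            r->look = fgetc(r->fp);
        if (r->look != '#')
            return;
        /* Inside a comment: discard everything up to the end of the line. */
        while (r->look != EOF && r->look != '\n')
            r->look = fgetc(r->fp);
    }
}

/* Returns 1 and stores a number in *out, 0 at end of input,
   or -1 if an unexpected character is found. */
int next_number(Reader *r, long *out)
{
    long value = 0;

    assert(r != NULL && out != NULL);
    skip_blanks(r);
    if (r->look == EOF)
        return 0;
    if (!isdigit(r->look))
        return -1;
    /* The character that ends the number stays in r->look for the next call. */
    while (r->look != EOF && isdigit(r->look)) {
        value = value * 10 + (r->look - '0');
        r->look = fgetc(r->fp);
    }
    *out = value;
    return 1;
}
```

Because the character that terminates a number is kept in the reader rather than thrown away, a '#' that directly follows a digit is still seen as the start of a comment on the next call.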

Quote:
 Original post by Sneftel
Don't make a "skipComments" function, make a "readLine" function which strips out any comments it happens to come across.

But wouldn't such a function also need to use skipComments()?

Quote:
 Original post by Gage64
But wouldn't such a function also need to use skipComments()?

No. There's no such thing as skipping. Skipping is simply what happens when you fail to take an action as a result of reading something. If I read three numbers, then add the second and third ones and print the result, have I not skipped the first number?

Quote:
Original post by Sneftel
Quote:
 Original post by Gage64
But wouldn't such a function also need to use skipComments()?

No. There's no such thing as skipping. Skipping is simply what happens when you fail to take an action as a result of reading something. If I read three numbers, then add the second and third ones and print the result, have I not skipped the first number?

Maybe I'm missing the point, but if you want to read three numbers from a text file, you have to separate the characters that belong to the numbers from the characters that don't. This is what I refer to as skipping and is the functionality that I'm trying to implement.

So, I don't understand what you're trying to say here.

/* pretend that this is actually safe; I'm insufficiently bored to write
   the C code necessary for the memory allocation to be correct */
char * read_line(FILE * file) {
  char * line = malloc(LINE_SIZE);
  fgets(line, LINE_SIZE, file);
  char * comment_start = strstr(line, "#");
  if (comment_start) *comment_start = 0;
  return line;
}
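As a rough sketch of how a line-stripping reader like this might be used with a caller-supplied buffer instead of malloc(), the following reads one value per line and ignores lines that are blank once the comment is removed. The names read_stripped_line and read_value are hypothetical. It assumes at most one integer per line and that no line, comment included, is longer than the buffer; a longer line would be split by fgets() and the remainder of its comment misread as data.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of read_line: fills buf with the next line, cut off
   at the first '#'. Returns buf, or NULL at end of input. */
char *read_stripped_line(FILE *fp, char *buf, size_t size)
{
    char *hash;

    assert(fp != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;
    hash = strchr(buf, '#');
    if (hash != NULL)
        *hash = '\0';   /* terminate the string where the comment starts */
    return buf;
}

/* Returns 1 and stores the next integer in *out, or 0 at end of input.
   Assumes at most one value per line. */
int read_value(FILE *fp, int *out)
{
    char line[256];

    while (read_stripped_line(fp, line, sizeof line) != NULL) {
        if (sscanf(line, "%d", out) == 1)
            return 1;
        /* Blank or comment-only line: fall through and try the next one. */
    }
    return 0;
}
```

Nothing here needs ungetc(): the comment is removed from the in-memory line before any parsing happens.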

Quote:
Original post by Gage64
Quote:
Original post by Sneftel
Quote:
 Original post by Gage64
But wouldn't such a function also need to use skipComments()?

No. There's no such thing as skipping. Skipping is simply what happens when you fail to take an action as a result of reading something. If I read three numbers, then add the second and third ones and print the result, have I not skipped the first number?

Maybe I'm missing the point, but if you want to read three numbers from a text file, you have to separate the characters that belong to the numbers from the characters that don't. This is what I refer to as skipping and is the functionality that I'm trying to implement.

So, I don't understand what you're trying to say here.

No, no. Listen carefully.

Suppose I have a file which consists of the following text, in its entirety: "a3d4ba6gh". I am responsible for adding up all the numbers (which are assumed to each be one digit), while ignoring all letters.

Here's the first way I can do it: I write a function SkipLetters, which tries to discard any letters from the stream, without killing any digits. I write a function GetChar, which gets the next char and returns it. I call these two functions alternately, each time assuming the result of GetChar is a digit (because I've discarded any letters). The problem is implementing SkipLetters, assuming I don't want to use ungetc.

Here's the second way I can do it: I write a function GetDigitOnly, which, in a loop, tries to read a digit from the stream. Each time it fails to get a digit (because it has instead gotten a letter) it just tries again. When it successfully reads a digit, it returns it. I simply call this function until I run out of file. And I didn't need to use ungetc.

Note that this is exactly what EC's code is doing, except with recursion converted to iteration.
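The second approach might look roughly like this in C. This is a minimal sketch: get_digit_only and sum_digits are hypothetical spellings of the functions described above, and -1 is an assumed end-of-input marker.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Returns the value (0-9) of the next digit in the stream,
   or -1 once the input is exhausted. */
int get_digit_only(FILE *fp)
{
    int c;

    assert(fp != NULL);
    /* Keep reading; anything that is not a digit is simply never returned. */
    while ((c = fgetc(fp)) != EOF) {
        if (isdigit(c))
            return c - '0';
    }
    return -1;
}

/* Adds up every single-digit number in the stream. */
int sum_digits(FILE *fp)
{
    int sum = 0;
    int d;

    while ((d = get_digit_only(fp)) != -1)
        sum += d;
    return sum;
}
```

For the example text "a3d4ba6gh", sum_digits adds 3, 4 and 6. No letter is ever explicitly "skipped"; letters are just read and never returned.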

I have something similar: my comments start with // and can appear at the beginning of a line or anywhere on it. The code is in C++, but it can be converted to C easily. In fact, it used C some time ago, but I changed it to use streams.
Here is the function:
bool STUtil::skipComments(std::istream& stream)
{
    bool bComment = false;
    int iCharRead = 0;
    char c = '0';

    do {
        stream >> c; // fgetc or something
        iCharRead++;

        if (c == '/') {
            stream >> c; // fgetc or something
            iCharRead++;
            if (c == '/') {
                // yay its a comment
                return false;
            }
        }
        else if (c == TAB || c == SPACE || c == LINEFEED)
            bComment = true;
        else if (stream.eof())
            return false;
        else
            bComment = false;
    } while (bComment);

    stream.seekg(-iCharRead, std::ios_base::cur); // fseek?
    return true;
}

Quote:
 Original post by SiCrane...

I think that's exactly what I wanted (but I'm too tired to be sure). Thank you.

Quote:
 Original post by Sneftel
Here's the second way I can do it: I write a function GetDigitOnly, which, in a loop, tries to read a digit from the stream. Each time it fails to get a digit (because it has instead gotten a letter) it just tries again. When it successfully reads a digit, it returns it. I simply call this function until I run out of file. And I didn't need to use ungetc.

What if the numbers can be more than one digit and I want to use fscanf() to read them and not do the parsing myself?
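One possible way to keep fscanf() doing the number parsing is to treat a failed "%d" conversion as a hint that a comment may start here. The sketch below relies on the standard behaviour that "%d" skips leading whitespace and, on a matching failure, leaves the offending character unread. The name read_int and its return codes are hypothetical, and edge cases such as a lone sign character directly before a '#' are not handled.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 and stores the next integer in *out, 0 at end of input,
   or -1 if something other than a number or a '#' comment is found. */
int read_int(FILE *fp, int *out)
{
    assert(fp != NULL && out != NULL);
    for (;;) {
        int rc = fscanf(fp, "%d", out);
        int c;

        if (rc == 1)
            return 1;           /* a value was converted */
        if (rc == EOF)
            return 0;           /* nothing left but whitespace */

        /* Matching failure: the character that stopped "%d" is still unread. */
        c = fgetc(fp);
        if (c != '#')
            return -1;          /* not a comment, so the input is malformed */

        /* Discard the rest of the comment line, then try again. */
        while ((c = fgetc(fp)) != EOF && c != '\n')
            ;
    }
}
```

The caller never touches ungetc(); fscanf() itself leaves the '#' in the stream when the conversion fails.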
