# Parsing a file line by line

This topic is 4733 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

This has always confused me. Does anybody know how to parse a file line by line? For instance, I want to read this file:

print(hello)
sayhi
do_nothing
just_read
exit

How would I parse that file line by line in C? I just don't get it.

##### Share on other sites
Check out fscanf:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/html/_crt_fscanf.2c_.fwscanf.asp

##### Share on other sites
It depends on how long the file is and how efficient you want the parsing algorithm to be.
If it's only going to be a few lines long, you could just use a loop that examines each character of a line until it finds the newline character...

[EDIT]
You would have to load the file in a buffer first...
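
A minimal sketch of the character-by-character loop described above, in standard C. The helper name `read_line_chars` is hypothetical. It swaps in `fgetc` on the open stream instead of loading the file into a buffer by hand, relying on the buffering that stdio already does internally.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one line from f into out (at most cap - 1
   characters), dropping the trailing "\n" or "\r\n".
   Returns the line length, or -1 if the stream was already at end of file. */
int read_line_chars(FILE *f, char *out, int cap)
{
    int c;
    int len = 0;
    int got_any = 0;

    assert(f != NULL && out != NULL && cap > 0);

    /* Walk the stream one character at a time until a newline or EOF. */
    while ((c = fgetc(f)) != EOF && c != '\n') {
        got_any = 1;
        if (len < cap - 1) {
            out[len++] = (char)c;   /* characters beyond cap - 1 are discarded */
        }
    }
    if (len > 0 && out[len - 1] == '\r') {
        len--;                      /* strip the '\r' of a Windows line ending */
    }
    out[len] = '\0';

    if (c == EOF && !got_any) {
        return -1;                  /* nothing left to read */
    }
    return len;
}
```

Calling it in a loop until it returns -1 visits every line, including empty ones and a final line that has no newline.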

##### Share on other sites
If you simply want to read a file line-by-line until there are no more lines to read (until you are at the end of the file), then check out the following code. I have put comments in to explain what is going on. Make sure you include <stdio.h>...

// Step 1: Open the file and store a handle to the file in pFile
//
FILE* pFile = fopen("somefile.txt", "r");    // <<< The "r" specifies that you want read only access
if ( !pFile )
{
    // Failed to open
    return;
}

// Step 2: Read the file until we are at the end of the file
//
char szStringBuf[256];
while ( fgets( szStringBuf, 256, pFile ) )    // fgets returns NULL once there is nothing left to read
{
    // szStringBuf now stores the current line that you are parsing
}

// Step 3: Make sure to close the file once you have finished
//
fclose(pFile);

##### Share on other sites
A line isn't always 256 characters, etc...

Usually a line ends with \n (newline); \0 marks the end of a string in memory, not the end of the file. You can use the iostream library to grab just one line and then use sscanf() to pull things out of that line.

cin.getline(buffer, maxlen, '\n'); // buffer first, then the maximum length, then the delimiter; note that cin reads standard input, not a file. www.msdn.microsoft.com :)
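
Since a line can be longer than any fixed buffer, here is a hedged sketch in plain C of one way to cope: keep calling `fgets` and appending until the chunk ends in a newline. The helper name `read_whole_line` and the 256-byte chunk size are assumptions for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the next line of f as a malloc'ed string
   without its trailing newline, or NULL at end of file or on allocation
   failure. The caller frees the result. */
char *read_whole_line(FILE *f)
{
    char chunk[256];
    char *line = NULL;
    size_t len = 0;

    assert(f != NULL);

    while (fgets(chunk, sizeof chunk, f) != NULL) {
        size_t n = strlen(chunk);
        char *grown = realloc(line, len + n + 1);
        if (grown == NULL) {
            free(line);
            return NULL;
        }
        line = grown;
        memcpy(line + len, chunk, n + 1);   /* copies the '\0' too */
        len += n;

        if (len > 0 && line[len - 1] == '\n') {
            line[--len] = '\0';             /* complete line: drop the newline */
            break;
        }
        /* No newline yet: the line was longer than chunk, keep reading. */
    }
    return line;
}
```

If the file was opened in binary mode, a `'\r'` from a Windows line ending stays at the end of the returned string and has to be removed separately.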

##### Share on other sites
If you don't have unix man pages, but would like the benefit, you can google "man fprintf" and usually the first link will be a manpage that someone put online. Very useful. There are also sites where you can search through manpages specifically, but I lost the link.

:Found one.

##### Share on other sites
If I do this:

// Step 1: Open the file and store a handle to the file in pFile
//
FILE* pFile = fopen("somefile.txt", "r"); // <<< The "r" specifies that you want read only access
if ( !pFile )
{
    // Failed to open
    return;
}

// Step 2: Read the file until we are at the end of the file
//
while ( !feof(pFile) )
{
    char szStringBuf[256];
    fgets( szStringBuf, 256, pFile );

    if (szStringBuf == CMD_PRINT)
    {
        printf("CMD_PRINT successfully called!");
    }
    else if( szStringBuf == CMD_WAIT )
    {
        printf("CMD_WAIT successfully called!");
    }
    else
    {
        printf("COMMAND NOT VALID!");
    }

    // szStringBuf now stores the current line that you are parsing
}

// Step 3: Make sure to close the file once you have finished
//
fclose(pFile);

Would that work?
If so, how would I get params?

I'm trying to implement a command-based language, and I'm trying to keep it simple by using only stdio.h.

##### Share on other sites
That's not likely to work...
Firstly, I'd have to see what the definitions are for CMD_PRINT and CMD_WAIT.
The code you wrote will probably fail because of the way it handles the strings of text read from the file you opened...
If CMD_PRINT and CMD_WAIT are string literals, the == operator compares the addresses of the two strings rather than their characters, so the test against szStringBuf will never be true...

According to your first post, you are trying to read a text file that has several commands, one on each line...
Example:

Wait
Print

I would use fgets to get a line from the file handle you opened...then I would use strcmp to compare the string that was read against a list of commands inside your app....depending on the result, I'd run the matching command, and I'd repeat this in a loop until EOF, when all commands have been processed...
Look for information on MSDN...if you still have trouble, I could write an easy example to illustrate the method...

If you are looking for performance, there are far more efficient algorithms...
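
A minimal sketch of the fgets and strcmp loop described above, assuming one command per line. The helper name `run_commands` is hypothetical, and the values of `CMD_PRINT` and `CMD_WAIT` are assumed to be plain string literals.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CMD_PRINT "print"   /* assumed definitions, for illustration */
#define CMD_WAIT  "wait"

/* Hypothetical helper: reads f line by line and compares each line with
   the known commands. Returns the number of recognised commands. */
int run_commands(FILE *f)
{
    char line[256];
    int recognised = 0;

    assert(f != NULL);

    /* fgets returns NULL at end of file, so no separate feof() test is needed. */
    while (fgets(line, sizeof line, f) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* cut the line at the first '\r' or '\n' */

        if (strcmp(line, CMD_PRINT) == 0) {
            printf("CMD_PRINT successfully called!\n");
            recognised++;
        } else if (strcmp(line, CMD_WAIT) == 0) {
            printf("CMD_WAIT successfully called!\n");
            recognised++;
        } else {
            printf("COMMAND NOT VALID: %s\n", line);
        }
    }
    return recognised;
}
```

The newline has to be removed before the comparison because fgets keeps it in the buffer, so an untrimmed line holding "print" plus a newline would not equal "print".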

##### Share on other sites
As of now I'm not really looking for performance. I looked on MSDN and found no help :S
If you could, would you write an example that just opens a file, parses it line by line, and compares the strings against a command like print (forget about wait)? After that I have to implement params. Please, if you could :)

##### Share on other sites
Well, I got it to read the string.
However, the last thing is params.

This is my code (thanks to everyone):

#include <stdio.h>
#include <string.h>

#define CMD_PRINT "print"
#define CMD_WAIT "wait"

int main()
{
    char szStringBuf[256];

    // Step 1: Open the file and store a handle to the file in pFile
    //
    FILE* pFile = fopen("s", "rb"); // <<< "rb" opens the file read-only in binary mode
    if ( !pFile )
    {
        printf("ERROR");
        return 1;
    }

    // Step 2: Read the file until we are at the end of the file
    //
    while ( fgets( szStringBuf, 256, pFile ) ) // fgets returns NULL once there is nothing left to read
    {
        // Cut off the "\r\n" or "\n" that fgets keeps; a line ending in a newline would never match "print"
        szStringBuf[strcspn( szStringBuf, "\r\n" )] = '\0';

        // stricmp is a non-standard, case-insensitive strcmp (Visual C++); POSIX calls it strcasecmp
        if ( stricmp (szStringBuf, CMD_PRINT) == 0 )
        {
            printf("CMD_PRINT successfully called!");
        }
        else
        {
            printf("COMMAND NOT VALID!");
        }
    }

    // Step 3: Make sure to close the file once you have finished
    //
    fclose(pFile);
    return 0;
}

So how would I get print to be able to use params? If you could include that in the example :)

##### Share on other sites
Why don't you just read in a number of bytes and then sequentially store the lines? Something like this:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
using namespace std;

#define BUFFER_SIZE 1000
#define MAX_LINES   1000

int main() {
   char buffer[BUFFER_SIZE];
   char * line[MAX_LINES];
   FILE * f = fopen ("text.txt","rb");
   int count = 0;
   int size = fread (buffer, 1, 1000, f);
   while (size != 0) {
      //Find the line
      int linesize = strcspn (buffer, "\r\n");
      if (strcspn (buffer, "\n") < linesize) {linesize = strcspn (buffer, "\n");}
      line[count] = (char *) malloc(linesize + 1);
      memcpy (line[count],buffer,linesize);
      line[count][linesize] = '\0';
      cout << line[count] << endl;
      count++;
      if (buffer[linesize] == '\r') linesize++;
      fseek (f, ftell(f) - size + linesize + 1, SEEK_SET);
      size = fread (buffer, 1, 1000, f);
   }
   return 0;
}

Bas

##### Share on other sites
Quote:
 Original post by Machlana
 Well i got it to read the string
 however the last thing is params
 so how would i get print to be able to use params if you could include that in the example :)

http://www.cplusplus.com/ref/cstdio/sscanf.html

example: you want to read in an int at line[2][4] (line[2] is "x = 85")
int n;
sscanf(line[2] + 4, "%d", &n); // start at character 4; scanning from line[2] itself would stop at the 'x'
cout << n << endl; //shows 85

By the way, it seems to me that you are writing some sort of interpreter. I would strongly advise you to read the whole file into separate lines first, because they are easier to handle (using string.h)

Bas
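
For the params question, a hedged sketch of how sscanf could pull a command name and one parameter out of a line written like print(hello) in the first post. The helper name `split_command` and the buffer sizes (32 and 64 bytes) are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: splits a line such as "print(hello)" into a command
   name and an optional parameter. cmd must hold at least 32 bytes and
   param at least 64. Returns the number of fields found: 0, 1 or 2. */
int split_command(const char *line, char *cmd, char *param)
{
    int fields;

    assert(line != NULL && cmd != NULL && param != NULL);

    cmd[0] = '\0';
    param[0] = '\0';

    /* %31[^( \r\n]  reads the name up to '(', a space or the line end.
       (             must match literally for a parameter to follow.
       %63[^)\r\n]   reads the parameter up to ')' or the line end. */
    fields = sscanf(line, " %31[^( \r\n] ( %63[^)\r\n] )", cmd, param);

    return fields < 0 ? 0 : fields;   /* sscanf returns EOF for a blank line */
}
```

A line without parentheses, such as sayhi, yields one field, so the caller can tell commands with a parameter from those without.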

##### Share on other sites
That's what gets me: I CANNOT write a parser, or anything that deals with strings and files. I need an example (or tutorial) to get me started :S

BTW: Pse said he'd write a simple one :)
So I don't know, I'm just confused about that,
but I do understand the code I posted.

##### Share on other sites
I don't think there are a lot of tutorials on that subject to be honest. I've just written a complete gunzip decompressor and a bittorrent tracker file reader and you just need to use the string.h libc stuff bigtime! http://www.cplusplus.com/ref/cstdio/index.html is all you will need, and a healthy knowledge of pointers, arrays and file handling. Actually, the file handling is not so much of an issue as you only have to do it at the start.

Why don't you just use my code as a starting point? It gives you all the lines of a file in an array of pointers. You can get individual characters with line[line_number][char_number]. If you have a problem I'll help :)

Bas

##### Share on other sites
I get this error

c:\program files\microsoft visual studio\vc98\include\eh.h(32) : fatal error C1189: #error : "eh.h is only for C++!"

EDIT: Never mind, I'm dumb, lol. I should have known: the file was a .C, not a .CPP.

##### Share on other sites
It doesn't work, though: it's an access violation.
I made an error handler for the file open. I think the problem is where you allocate the size:

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
using namespace std;

#define BUFFER_SIZE 1000
#define MAX_LINES 1000

int main()
{
    char buffer[BUFFER_SIZE];
    char * line[MAX_LINES];

    FILE * f = fopen ("s","rb"); // declared outside the if so that f is still usable below
    if (!f)
    {
        printf("ERROR");
        return 1;
    }

    int count = 0;
    int size = fread (buffer, 1, 1000, f); // HERE
    while (size != 0) // CONFLICTS HERE
    {
        //Find the line
        int linesize = strcspn (buffer, "\r\n");
        if (strcspn (buffer, "\n") < linesize)
        {
            linesize = strcspn (buffer, "\n");
        }

        line[count] = (char *) malloc(linesize + 1);
        memcpy (line[count],buffer,linesize);
        line[count][linesize] = '\0';
        cout << line[count] << endl;
        count++;
        if (buffer[linesize] == '\r') linesize++;
        fseek (f, ftell(f) - size + linesize + 1, SEEK_SET);
        size = fread (buffer, 1, 1000, f);
    }

    return 0;
}

##### Share on other sites
Can I see the file 's' which you use to test it?

Bas

It says:

Hi

##### Share on other sites
I don't understand it, it works perfectly here (GCC) and I don't see what would be wrong with the code... I have no experience with Visual C++ though so hopefully someone else will understand what's going on.

Try this in that file:
Hi
This
is
a
test

Check the value of size to see whether anything is read at all.
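
One possible explanation, offered as a guess since the failing build is not shown: fread fills buffer with raw bytes and does not append a '\0', while strcspn keeps scanning until it meets one. With a short file such as Hi, the scan can run past the bytes that were actually read, which is undefined behaviour and may appear to work with one compiler and crash with another. Below is a sketch in C that terminates the buffer before searching it. `split_lines` is a hypothetical name, `BUFFER_SIZE` and `MAX_LINES` mirror the code above, and the fseek step is replaced by walking a pointer through a single chunk.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 1000
#define MAX_LINES   1000

/* Hypothetical helper: reads at most BUFFER_SIZE bytes of f and stores up to
   MAX_LINES malloc'ed lines in line[]. Returns the number of lines stored.
   Anything beyond BUFFER_SIZE bytes is ignored in this sketch. */
int split_lines(FILE *f, char *line[MAX_LINES])
{
    char buffer[BUFFER_SIZE + 1];             /* one spare byte for the '\0' */
    size_t size;
    char *p = buffer;
    int count = 0;

    assert(f != NULL && line != NULL);

    size = fread(buffer, 1, BUFFER_SIZE, f);
    buffer[size] = '\0';                      /* fread does not terminate the data */

    while (*p != '\0' && count < MAX_LINES) {
        size_t linesize = strcspn(p, "\r\n"); /* length up to the first '\r' or '\n' */

        line[count] = malloc(linesize + 1);
        if (line[count] == NULL) {
            break;
        }
        memcpy(line[count], p, linesize);
        line[count][linesize] = '\0';
        count++;

        p += linesize;
        if (*p == '\r') p++;                  /* skip "\r\n", a lone "\r" or a lone "\n" */
        if (*p == '\n') p++;
    }
    return count;
}
```

The caller owns the returned strings and should free each line[i] when done.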

##### Share on other sites
Bas, I'd still use fgets...it's much simpler than parsing the text by hand...as he's just looking for newline characters...
On the other hand...if you want to pass arguments to your commands...you could do it in several ways...you could use Bas' algorithm and replace the "\r\n" search string with something like a "(" (which would indicate the start of the parameter "zone")...however, this approach is not used much...as it adds overhead for parsing the syntax...
If you really need arguments, you could define fixed-size structures, one per line, which is much more efficient.
So, for example, each command would have a predefined structure like this:

struct
{
    char Cmd[6];
    char Argu1[10];
    char Argu2[10];
    char Argu3[10];
    char EndLine[2]; // Holds the "\r\n" that ends each line; an unsized member with an initializer would not compile
}HAHA;

So...if you create the command file using this syntax, you could just load each line directly into this structure and have it automatically parsed :D (sort of)...
Of course...a single byte error in the file would create a mess, so you have to be careful programming it...
A command inside the command file would look something like:

PRINT Argument1 Argument2 Argument3
(...)

Notice how each parameter has the size specified by each array:
"PRINT " (including the space) has 6 bytes.
"Argument1 " (including the space) has 10 bytes.
And so on!

Obviously it depends on how you want to implement each function, and how much flexibility you need, etc...
Now that you can read lines by yourself, my example would be useless... I'd advise you to stick to Bas' method (which more or less does the parsing I described earlier, but searches for the newline characters manually), or, if you like, I can write an example of this method as well...
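
A hedged sketch of the fixed-width idea in C. Rather than reading bytes straight into the structure, it slices a line that has already been read (for example with fgets) at the offsets 0, 6, 16 and 26 implied by the field sizes above, and gives each field one extra byte for a terminating '\0'. The names `Command`, `copy_field` and `parse_fixed` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Field widths taken from the structure above: 6 + 10 + 10 + 10 bytes. */
typedef struct {
    char cmd[6 + 1];     /* +1 so every field can hold a terminating '\0' */
    char argu1[10 + 1];
    char argu2[10 + 1];
    char argu3[10 + 1];
} Command;

/* Copies up to width bytes starting at line[offset] into out, treating bytes
   past the end of the line as missing, then trims trailing spaces and
   line-ending characters. */
static void copy_field(const char *line, size_t len, size_t offset,
                       size_t width, char *out)
{
    size_t n = 0;

    if (offset < len) {
        n = len - offset < width ? len - offset : width;
        memcpy(out, line + offset, n);
    }
    while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\r' || out[n - 1] == '\n')) {
        n--;
    }
    out[n] = '\0';
}

/* Hypothetical helper: slices one fixed-width line into its four fields. */
void parse_fixed(const char *line, Command *c)
{
    size_t len;

    assert(line != NULL && c != NULL);

    len = strlen(line);
    copy_field(line, len, 0, 6, c->cmd);
    copy_field(line, len, 6, 10, c->argu1);
    copy_field(line, len, 16, 10, c->argu2);
    copy_field(line, len, 26, 10, c->argu3);
}
```

Slicing by offset keeps the speed of the fixed layout, while a short or truncated line simply produces empty trailing fields instead of shifting every later record.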

##### Share on other sites

And Bas, I really have no clue. I'm working on it.
