Parsing a file line by line

Started by
19 comments, last by Machlana 19 years, 3 months ago
Why don't you just read in a number of bytes and then sequentially store the lines? Something like this:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
using namespace std;

#define BUFFER_SIZE 1000
#define MAX_LINES   1000

int main() {
   char buffer[BUFFER_SIZE];
   char * line[MAX_LINES];
   FILE * f = fopen ("text.txt","rb");
   int count = 0;
   int size = fread (buffer, 1, 1000, f);
   while (size != 0) {
      //Find the line
      int linesize = strcspn (buffer, "\r\n");
      if (strcspn (buffer, "\n") < linesize) {linesize = strcspn (buffer, "\n");}
      line[count] = (char *) malloc(linesize + 1);
      memcpy (line[count],buffer,linesize);
      line[count][linesize] = '\0';
      cout << line[count] << endl;
      count++;
      if (buffer[linesize] == '\r') linesize++;
      fseek (f, ftell(f) - size + linesize + 1, SEEK_SET);
      size = fread (buffer, 1, 1000, f);
   }
   return 0;
}


Bas
Quote: Original post by Machlana
Well, I got it to read the string;
however, the last thing is params.

So how would I get print to be able to use params? If you could include that in the example :)


http://www.cplusplus.com/ref/cstdio/sscanf.html

Example: you want to read in an int starting at line[2][4] (line[2] is "x = 85"):
int n;
sscanf(&line[2][4], "%d", &n); // start at index 4, just past "x = "
cout << n << endl; // shows 85
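For the question about giving print its params, the same sscanf family can also split a line into a command word and the rest of the line. This is only a minimal sketch, assuming a command line looks like `print hello world`; the name `split_command` and the buffer sizes are made up for illustration and are not from the thread.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: splits "print hello world" into the command word
// ("print") and everything after it ("hello world").
// Returns false if the line holds no command word at all.
bool split_command(const char* text, std::string& cmd, std::string& params)
{
    char word[32] = "";
    char rest[256] = "";

    // %31s reads one whitespace-delimited word (at most 31 characters),
    // the space skips any blanks, and %255[^\r\n] reads up to the line end.
    int matched = std::sscanf(text, "%31s %255[^\r\n]", word, rest);
    if (matched < 1) {
        return false;            // empty or blank line
    }

    cmd = word;
    params = rest;               // stays empty when the command has no params
    return true;
}
```

Called as split_command(line[2], cmd, params), it leaves the command in cmd and the unparsed parameter text in params, which can then be handed to whatever implements print.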


By the way, it seems to me that you are writing some sort of interpreter. I would strongly advise you to read the whole file into separate lines first, because they are easier to handle (using string.h).

Bas
That's what gets me: I CANNOT write a parser, nothing that deals with strings and files. I need an example (or tutorial) to get me started :S

BTW: Pse said he'd write a simple one :)
so I don't know, I'm just confused on that,
but the code I had posted I understand.
I don't think there are a lot of tutorials on that subject to be honest. I've just written a complete gunzip decompressor and a bittorrent tracker file reader and you just need to use the string.h libc stuff bigtime! http://www.cplusplus.com/ref/cstdio/index.html is all you will need, and a healthy knowledge of pointers, arrays and file handling. Actually, the file handling is not so much of an issue as you only have to do it at the start.

Why don't you just use my code as a starting point? It gives you all the lines of a file in an array of pointers. You can get individual characters with line[line_number][char_number]. If you have a problem I'll help :)

Bas
I get this error

c:\program files\microsoft visual studio\vc98\include\eh.h(32) : fatal error C1189: #error : "eh.h is only for C++!"


EDIT: NVM, I'm dumb lol, I should've known it was a .C not .CPP.
It doesn't work though; it's an access violation.
I made an error handler for the file open. I think it's where you allocate the size.

#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
using namespace std;

#define BUFFER_SIZE 1000
#define MAX_LINES 1000

int main()
{
   char buffer[BUFFER_SIZE];
   char * line[MAX_LINES];

   if (!(FILE * f = fopen ("s","rb")))
   {
      printf("ERROR");
      return;
   }

   int count = 0;
   int size = fread (buffer, 1, 1000, f); // HERE
   while (size != 0) // CONFLICTS HERE
   {
      //Find the line
      int linesize = strcspn (buffer, "\r\n");
      if (strcspn (buffer, "\n") < linesize)
      {
         linesize = strcspn (buffer, "\n");
      }

      line[count] = (char *) malloc(linesize + 1);
      memcpy (line[count],buffer,linesize);
      line[count][linesize] = '\0';
      cout << line[count] << endl;
      count++;
      if (buffer[linesize] == '\r') linesize++;
      fseek (f, ftell(f) - size + linesize + 1, SEEK_SET);
      size = fread (buffer, 1, 1000, f);
   }

   return 0;
}
Can I see the file 's' which you use to test it?

Bas
it says


Hi
I don't understand it, it works perfectly here (GCC) and I don't see what would be wrong with the code... I have no experience with Visual C++ though so hopefully someone else will understand what's going on.

Try this in that file:
Hi
This
is
a
test


Check the value of size to see whether anything is read at all.
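One thing worth ruling out, though the thread never confirms it as the cause: fread does not write a '\0' after the bytes it reads, while strcspn keeps scanning until it finds one. On a short file such as one holding only Hi, the search can run into whatever happened to be in the rest of buffer, which could explain behaviour that differs between compilers. The following is a minimal sketch of the same chunk-and-seek idea with the buffer terminated and the line count bounded. The name `read_lines_chunked` is made up here, and std::string stands in for the malloc'd copies to keep it short.

```cpp
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical rework of the loop above. The file should be opened in
// binary mode ("rb") so that the byte offsets used with fseek are valid.
std::vector<std::string> read_lines_chunked(std::FILE* f, std::size_t max_lines)
{
    const std::size_t BUFFER_SIZE = 1000;
    char buffer[BUFFER_SIZE];
    std::vector<std::string> lines;
    std::string current;                 // line being assembled
    long pos = std::ftell(f);            // offset of the next unread byte

    while (lines.size() < max_lines) {
        // Leave one byte free so the chunk can be '\0'-terminated.
        std::size_t size = std::fread(buffer, 1, BUFFER_SIZE - 1, f);
        if (size == 0) break;
        buffer[size] = '\0';

        // Bytes before the first '\n' (or before the terminator).
        std::size_t n = std::strcspn(buffer, "\n");
        current.append(buffer, n);

        if (n < size) {
            // Line ended inside this chunk: drop a trailing '\r', store it.
            if (!current.empty() && current[current.size() - 1] == '\r')
                current.erase(current.size() - 1);
            lines.push_back(current);
            current.clear();
            pos += static_cast<long>(n + 1);   // skip the '\n' as well
        } else {
            pos += static_cast<long>(n);       // line continues in next chunk
        }
        std::fseek(f, pos, SEEK_SET);
    }

    // Last line of a file that does not end with '\n'.
    if (!current.empty()) {
        if (current[current.size() - 1] == '\r')
            current.erase(current.size() - 1);
        lines.push_back(current);
    }
    return lines;
}
```

With this version a file containing only Hi yields a single entry, and lines[i][j] still gives individual characters as with the array of pointers.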
Bas, I'd still use fgets... it's much simpler than parsing the text by hand, as he's just looking for newline characters.
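A minimal sketch of the fgets route, for comparison. The name `read_lines_fgets` and the 256-byte buffer are assumptions made for this example; lines longer than the buffer are joined from several reads.

```cpp
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: reads every line of an already opened file.
std::vector<std::string> read_lines_fgets(std::FILE* f)
{
    char buffer[256];
    std::vector<std::string> lines;
    std::string current;
    bool pending = false;    // true while a line has been only partly read

    // fgets stops after a '\n', at end of file, or when the buffer is full,
    // and it always '\0'-terminates what it stored.
    while (std::fgets(buffer, sizeof buffer, f)) {
        std::size_t n = std::strlen(buffer);
        bool complete = (n > 0 && buffer[n - 1] == '\n');
        if (complete) --n;                   // do not keep the '\n'
        current.append(buffer, n);
        pending = true;

        if (complete) {
            // Files opened with "rb" on Windows still carry the '\r'.
            if (!current.empty() && current[current.size() - 1] == '\r')
                current.erase(current.size() - 1);
            lines.push_back(current);
            current.clear();
            pending = false;
        }
    }

    // Last line of a file that does not end with '\n'.
    if (pending) {
        if (!current.empty() && current[current.size() - 1] == '\r')
            current.erase(current.size() - 1);
        lines.push_back(current);
    }
    return lines;
}
```

There is no seeking and no manual search for the line ending here, which is the simplification being argued for.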
On the other hand, if you want to pass arguments to your commands, you could do it in several ways. You could use Bas' algorithm and replace the "\r\n" search string with something like a "(" (which would indicate the start of the parameters "zone"). However, this approach is not much used, as it incurs overhead for parsing the syntax.
In case you really need arguments, you could define structures, separated by lines, which is much more efficient.
So, for example, each command would have a predefined structure like this:

struct
{
   char Cmd[6];
   char Argu1[10];
   char Argu2[10];
   char Argu3[10];
   char EndLine[2]; // Room for the "\r\n" read from the file, so we don't have to append it later =)
} HAHA;

So...if you create the command file using this syntax, you could just load each line directly into this structure and have it automatically parsed :D (sort of)...
Of course...a single byte error in the file would create a mess, so you have to be careful programming it...
A command inside the command file would look something like:

PRINT Argument1 Argument2 Argument3
(...)

Notice how each parameter has the size specified by each array:
"PRINT " (including the space) has 6 bytes.
"Argument1 " (including the space) has 10 bytes.
And so on!
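A minimal sketch of loading one such fixed-width line, assuming the field widths from the struct above. The names `CommandRecord`, `parse_record` and `field_to_string` are made up for this example. It copies field by field and pads short lines with spaces, since the sample line as written is one byte short of the full 36 unless Argument3 is followed by a space.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical record type with the field widths described above.
struct CommandRecord {
    char Cmd[6];
    char Argu1[10];
    char Argu2[10];
    char Argu3[10];
};

// Copies a fixed-width field into a std::string and drops the padding.
// The fields are not '\0'-terminated, so the width is passed explicitly.
std::string field_to_string(const char* field, std::size_t width)
{
    std::size_t len = width;
    while (len > 0 && (field[len - 1] == ' ' || field[len - 1] == '\0'))
        --len;
    return std::string(field, len);
}

// Fills a record from one line that has already had its "\r\n" removed.
CommandRecord parse_record(std::string line)
{
    CommandRecord rec;
    const std::size_t total = sizeof rec.Cmd + sizeof rec.Argu1
                            + sizeof rec.Argu2 + sizeof rec.Argu3;
    line.resize(total, ' ');     // pad short lines, cut overlong ones

    const char* p = line.data();
    std::memcpy(rec.Cmd,   p, sizeof rec.Cmd);   p += sizeof rec.Cmd;
    std::memcpy(rec.Argu1, p, sizeof rec.Argu1); p += sizeof rec.Argu1;
    std::memcpy(rec.Argu2, p, sizeof rec.Argu2); p += sizeof rec.Argu2;
    std::memcpy(rec.Argu3, p, sizeof rec.Argu3);
    return rec;
}
```

For the sample line, field_to_string(rec.Cmd, sizeof rec.Cmd) gives the command word without its trailing space, and the same call on each Argu field gives the arguments.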

Obviously it depends on how you want to implement each function, and how much flexibility you need, etc...
Now that you can read lines by yourself, my example would be useless... I'd advise you to stick to Bas' method (which more or less does the parsing I told you about earlier, but manually searching for the newline characters), or, if you like, I can write an example of this method as well...

