reading large files
I'm just wondering about the best way to
read large text files.
Some of my files are 10 MB+.
I use Visual C++ 6.0 with MFC on a brand-new
1.8 GHz Dell, and I'm expecting it
to happen quite fast.
However, the algo I'm using now
takes a few seconds.
Essentially the file is made up of 50,000+ lines with
13 or more numbers to be read in from each line.
Thanks in advance.
Well, I don't know if this is the best way, but you could do something like this. It's general and assumes that you do something with the data (copy it to another buffer, transmit it over the internet, process it...) before the next read.
#define BLOCK_SIZE 2048
.
.
.
char sBuffer[BLOCK_SIZE];
size_t nRet;
FILE *fp = fopen("yourfile.txt", "rb");
if (fp != NULL)
{
    /* fread's arguments are (buffer, element size, element count, stream);
       checking its return value also avoids the feof() loop processing
       the last block twice. */
    while ((nRet = fread(sBuffer, sizeof(char), BLOCK_SIZE, fp)) > 0)
    {
        // Do something with the data in the buffer; you have nRet
        // characters in there.
    }
    fclose(fp);
}
...
I don't think this is anything new, but this is how I wrote a file transfer program. It was able to transfer a 10 MB file over a network in a couple of seconds (well, maybe a couple more than a couple). You can play with the block size to see if you get different results; keep it a power of 2, though.
Also, you can check what the file size is, allocate a buffer of that size, and do one read operation, but something just tells me that I shouldn't trust reading 10 MB in one call.
But that might just be me.
Jason Mickela
ICQ : 873518
E-Mail: jmickela@pacbell.net
------------------------------
"Evil attacks from all sides
but the greatest evil attacks
from within." Me
------------------------------
The way I was doing it was by reading
each number one at a time into my prog.
This was slow.
Then I started reading line by line
and using sscanf to get the numbers, which
sped it up a lot.
I was just wondering if there was an even faster way, like
reading the entire file into memory and then reading the numbers from there.
Not too sure how reading is taken care of.
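That idea (slurp the file once, then parse numbers straight out of the buffer) can be sketched with strtod, which advances a pointer past each number it parses and so avoids re-scanning the line the way repeated sscanf calls do. This is only an illustration, not anyone's posted code; the file name and the whitespace-separated format are assumptions.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Read the whole file into one buffer, then pull every number out
// with strtod. strtod skips leading whitespace (spaces and newlines),
// so the line structure doesn't matter for parsing.
std::vector<double> ReadAllNumbers(const char* path)
{
    std::vector<double> numbers;
    FILE* fp = fopen(path, "rb");
    if (fp == NULL)
        return numbers;

    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);          // total bytes in the file
    fseek(fp, 0, SEEK_SET);

    std::vector<char> buffer(size + 1);
    size_t got = fread(&buffer[0], 1, size, fp);
    buffer[got] = '\0';             // strtod needs a terminator
    fclose(fp);

    char* p = &buffer[0];
    char* end;
    for (;;)
    {
        double value = strtod(p, &end);
        if (end == p)               // nothing more to parse
            break;
        numbers.push_back(value);
        p = end;                    // continue after the last number
    }
    return numbers;
}
```

Whether this beats line-by-line sscanf depends on the file, but it does replace 50,000 read calls with one.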
This is not directly related, but for a college project I have to write an algorithm that searches through large (200+ MB) DNA text files, to be executed on a supercomputer. The problem is that we do not know the exact size of each file. You mentioned something about checking the file size first, then allocating enough memory for the text. How exactly do I check the file size, and how do you allocate a non-constant amount of memory?
Any help appreciated.
You can read the entire file into a buffer with fopen and fread(buffer, ...). Look in the documentation for those functions.
You can check the file size by doing fseek(file, 0, SEEK_END), then position = ftell(file); position will hold the number of bytes in the file.
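Putting the fseek/ftell trick together with a malloc of exactly that size answers both halves of the question (finding the size, and allocating a non-constant amount of memory). A minimal sketch; the function name and the trailing NUL terminator are my additions, and error handling is kept to the bare minimum:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Find the file's size with fseek/ftell, allocate exactly that much,
// and read the whole file in one fread call. Caller frees the buffer.
char* ReadWholeFile(const char* path, long* outSize)
{
    FILE* fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    fseek(fp, 0, SEEK_END);
    long size = ftell(fp);          // number of bytes in the file
    fseek(fp, 0, SEEK_SET);         // back to the beginning

    char* buffer = (char*)malloc(size + 1);
    size_t got = fread(buffer, 1, size, fp);
    buffer[got] = '\0';             // convenient for text searches
    fclose(fp);

    *outSize = (long)got;
    return buffer;
}
```

For a 200 MB DNA file this is one allocation and one read, which is about as little I/O overhead as stdio allows; just check that malloc didn't return NULL on a memory-starved machine.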
If you include &lt;fstream&gt; you can use std::ifstream for input and std::ofstream for output. Open it with ios_base::in for input or ios_base::out for output. Read n characters with infile.read(myBuffer, n);
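That iostream route looks like the following sketch. The function wrapper, file name, and block size are mine, not from the post; the key detail is gcount(), which tells you how many characters read() actually delivered (the last block of a file is usually short).

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Open a file with std::ifstream and read up to n characters into
// buffer. Returns the number of characters actually read, or -1 if
// the file could not be opened.
std::streamsize ReadBlock(const char* path, char* buffer, std::streamsize n)
{
    std::ifstream infile(path, std::ios_base::in | std::ios_base::binary);
    if (!infile)
        return -1;
    infile.read(buffer, n);
    return infile.gcount();     // characters placed in buffer
}
```

Looping this until it returns 0 gives the same block-at-a-time pattern as the fread version above, just with streams instead of stdio.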
For the DNA file, you can get file info by calling the function
stat(fname, &file_stat_struct); the st_size field of the struct holds the size in bytes.
HSZuyd
This topic is closed to new replies.