
File reading/writing in C


I don't know how I manage to do it, but file reading and writing is the one thing I've never quite understood in C. I've got cref, so I've read about fopen and fread, I've seen it applied in tutorials and on man pages, but I just can't do it myself.

For example, what's wrong with this:

#include <stdlib.h>
#include <stdio.h>

int main() {

    FILE *myFile = NULL;
    char *txtArray;

    myFile = fopen("loadme", "r");
    fread(txtArray, 1, 2, myFile);

    printf("%s\n", txtArray);

    return(0);
}

I would think nothing, but it just gives me a Bus Error. I just want to read some text! Here are a couple of more specific questions, but first, the fread definition:

The fread() function reads, into the array pointed to by ptr, up to nitems members whose size is specified by size in bytes, from the stream pointed to by stream.

Ok, so for example is two words two items? And wouldn't their size in bytes differ across different lengths of words? Example "supercalifragalisticexpyalidocious" vs. "hello". And what if I don't know the lengths of the word or amount of "items" in the file?

Any help would be much appreciated.

Thanks!

The buffer that you read into must already be allocated.


txtArray = (char*)malloc(BUFFER_SIZE);

or

char txtArray[BUFFER_SIZE];


Also, fread/fwrite define size in bytes, so to read/write a string you will have to read/write every byte in the string.

Your code there will read two bytes from the file.
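
Putting those two points together, a corrected version of the original read might look roughly like the sketch below. The helper name read_bytes is made up for illustration; only the file name "loadme" comes from the original post. Note that fread does not add a '\0' on its own, so the sketch does that itself before the buffer is treated as a string.

```c
#include <stdio.h>

/* Hypothetical helper: read up to bufsize - 1 bytes from path into buf and
 * null-terminate the result. Returns the number of bytes read, or -1 if the
 * file could not be opened. */
long read_bytes(const char *path, char *buf, size_t bufsize)
{
    FILE *fp;
    size_t n;

    if (bufsize == 0)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    n = fread(buf, 1, bufsize - 1, fp);  /* items of size 1 are bytes */
    buf[n] = '\0';                       /* fread never terminates the data */
    fclose(fp);
    return (long)n;
}
```

With char txtArray[3]; a call to read_bytes("loadme", txtArray, sizeof txtArray) reads the same two bytes as the original code, and printf("%s\n", txtArray) is then safe because the buffer exists and is terminated.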

If you don't know the lengths of the words in the file, there are two binary options and one main text option.

Binary options: Write the string's length before the string so that you know how many characters to read (this is my typical approach), or write the string out and include the null terminator byte (the problem here is that you don't know how large to make your buffer before you start reading).

If you want to use the above methods, you need to use a "binary file". This means that the file will store information in much the same way that a program stores values in memory. To fopen a binary file, you use "rb" or "wb" for the second parameter, where the 'b' of course means binary.
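
As a rough sketch of the first binary option (length first, then the characters), something like the following could work. The function names write_string and read_string are invented for this example, and the length is stored as a raw unsigned long, which is an assumption that makes the file readable only on machines with the same integer size and byte order. The FILE pointer is expected to have been opened with "wb" or "rb".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write the length first, then the characters (no terminator is stored). */
int write_string(FILE *fp, const char *s)
{
    unsigned long len = (unsigned long)strlen(s);

    if (fwrite(&len, sizeof len, 1, fp) != 1)
        return -1;
    if (len > 0 && fwrite(s, 1, len, fp) != len)
        return -1;
    return 0;
}

/* Read the length, allocate exactly enough room, then read the characters.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
char *read_string(FILE *fp)
{
    unsigned long len;
    char *s;

    if (fread(&len, sizeof len, 1, fp) != 1)
        return NULL;
    s = malloc(len + 1);               /* +1 for the '\0' we add ourselves */
    if (s == NULL)
        return NULL;
    if (len > 0 && fread(s, 1, len, fp) != len) {
        free(s);
        return NULL;
    }
    s[len] = '\0';
    return s;
}
```

Because the length is known before any characters are read, the reader never has to guess a buffer size, which is the advantage over storing a terminator byte in the file.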


Text options: If you want to use a plain text file so that you can hand edit it, I recommend using 'fgets' and 'fputs' instead of 'fread' and 'fwrite'. fgets reads a line of text from the file: it stops after a newline character ('\n'), at the end of the file, or when the buffer you gave it is full. If no characters could be read, it returns NULL, which lets you know that you should stop reading in lines.

An 'item' is just a sequential chunk of bytes in a file that you define the size of.

For example, "supercalifragalisticexpyalidocious" could be one item of size 34 (fread(buf, 1, 34, fp)) and "hello" another item of size 5 (fread(buf, 1, 5, fp)).

Usually if you are reading a file you know the structure/format of that file beforehand. For example, say you have a file that contains a list of words separated by newlines. You can read each word using this code:

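#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[80] = {0};
    int total = 0;
    FILE *fp = fopen("wordlist.txt", "r");
    if (fp == NULL)
        return 1;

    /* fgets stores at most sizeof buf - 1 characters plus the '\0'. */
    while (fgets(buf, sizeof buf, fp)) {
        buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline, if any */
        printf("Read string:%s of length:%zu\n", buf, strlen(buf));
        total++;
    }
    printf("Total words in the file:%i\n", total);
    fclose(fp);
    return 0;
}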




For text files you can use fgets() to read in a file line by line.

In conclusion, you must know (or decide on) the format of the file before trying to write code to read in the data.

[Edited by - Jack Sotac on May 31, 2006 1:54:16 PM]

fgets eh?

Well thanks, I'll check that out! I was just wondering if I'd have to go find myself a text-to-number-of-bytes convertor or something. :p


Thanks a bunch!

Ah, just a funky issue:

I have a text file with this text in it:

Hello World
Earl

I can load the first two words separately like this:

FILE *myFile = NULL;
char txtArray[0];
char itmChar[0];
myFile = fopen("loadme.txt", "r");
fgets(txtArray, 7, myFile);
printf("%s\n", txtArray);
fgets(itmChar, 6, myFile);
printf("%s\n", itmChar);

And it will display them fine. But for some reason, to get "Hello " (including the space, on one line, so that "world" is not indented) I need to set the bytes to 7. So far I've figured that 1 byte is one letter. That means 6, 5 for "hello" and 1 for the space, but instead it's 7. And then 6 for the 5 letter word "world". That seems strange for this reason:

Before, if I loaded them all at once it only took 12 bytes to grab it all, now I need 13. Of course, 12 did not conform to my "1 byte per letter" rule either, I was guessing that the extra one was the beginning of line character or something, but now there's even ONE MORE byte.


What's going on?

Thanks!

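Your problem is that strings in C are just special character arrays, and the program needs an extra byte to figure out where the string ends. This special byte is the null character, written '\0' or just the numeric constant 0. fgets(buf, n, fp) reads at most n - 1 characters and then appends that terminator, which is why you have to pass 7 to get the six characters of "Hello ".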

So this:

char text[] = "hello ";


is the same as this:

char text[7] = {'h', 'e', 'l', 'l', 'o', ' ', '\0' };
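
To see the extra byte for yourself, here is a small sketch; the function names storage_for and show_sizes are just for illustration. strlen counts the visible characters, while sizeof on the array also counts the terminator.

```c
#include <stdio.h>
#include <string.h>

/* Bytes of storage needed to hold s as a C string: its characters plus
 * the terminating '\0'. */
size_t storage_for(const char *s)
{
    return strlen(s) + 1;
}

void show_sizes(void)
{
    char text[] = "hello ";

    /* prints: strlen = 6, sizeof = 7 */
    printf("strlen = %zu, sizeof = %zu\n", strlen(text), sizeof text);
}
```

The same arithmetic explains the counts passed to fgets: "hello " needs 7 and "world" needs 6.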

0) What you have doesn't really work. You are writing into memory that is pointed to by txtArray and itmChar (since the array names are interpreted as pointers to the beginnings of the arrays), but the arrays, being of zero size, are not big enough to hold the indicated data. In C (and also C++) arrays are not first-class objects; you simply can't "pass an array" to a function, but just pass the pointer. The fgets() function has no idea how much space is available, which is why you have to tell it with the provided parameter. By writing the code as shown, you *lie* to fgets(), which is forced to trust you, and thus write data into the bytes of memory next to the array - *which do not necessarily belong to you*. According to the language specification, *anything is allowed to happen at this point*. Including appearing to work correctly.

This is one of many, MANY reasons I would STRONGLY urge you to do this in C++, using proper tools from its standard library, instead. There are very, very few valid reasons for using C any more.

1) The fgets() (read "file-get-string") appends a null terminator to the read-in data; i.e. it reads in what poor C programmers call a "string". This extra \0 character is written in place after the data so that *other* functions can see where the end of the data is (because the length count doesn't get passed around with it). There is no "beginning of line" character, BTW; just end-of-line characters (carriage return and/or line feed).
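
If you do stay with C, points 0 and 1 together suggest a fix along these lines. This is only a sketch: the function name read_two_chunks is invented here, and it assumes the file starts with "Hello World" as in the post above. The important part is that each buffer really is as large as the count handed to fgets.

```c
#include <stdio.h>

/* Read two consecutive chunks from the start of the file at path.
 * Each size must be the true size of the corresponding buffer.
 * Returns 0 on success, -1 if the file cannot be opened or a read fails. */
int read_two_chunks(const char *path,
                    char *first, int first_size,
                    char *second, int second_size)
{
    FILE *fp = fopen(path, "r");
    int ok;

    if (fp == NULL)
        return -1;
    /* fgets stores at most size - 1 characters and then a '\0', so it can
     * never write past the end of a buffer whose real size it is given. */
    ok = fgets(first, first_size, fp) != NULL &&
         fgets(second, second_size, fp) != NULL;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Declared as char txtArray[7]; and char itmChar[6]; and passed with sizeof txtArray and sizeof itmChar, the two buffers receive "Hello " and "World" without touching memory that does not belong to them.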

In C++ you have access to a "file stream" object with a nicer interface than FILE* provides. But more importantly, it provides a real string object which does two very important things for you:

a) It handles all kinds of memory management automatically, so you never have to think about how much space is needed (you can't really read from a file directly into a string, but the provided code sample will show how to work around that, using another library widget with similar nice properties). The string automatically resizes itself if more data comes in than the current allocation can hold; holds on to its own allocation of memory which is guaranteed writable and won't be "aliased"; and cleans up after itself properly in all situations.

b) It remembers the length of the string data, as well as the allocated space, so that you never are stuck passing any "extra" parameters.
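
For comparison, getting similar "never think about the size" behaviour in plain C means doing the resizing by hand. The sketch below is one possible way, not the only one: read_line is a hypothetical helper that keeps calling fgets and doubles a malloc'd buffer with realloc until it has a whole line. (POSIX systems also offer getline, which is not part of standard C.)

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a malloc'd string without the trailing newline (caller frees),
 * or NULL at end of file or on allocation failure. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    /* Each fgets call fills the free space; keep going until a newline. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';
            return buf;
        }
        if (cap - len < 2) {           /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    if (len == 0) {                    /* nothing read: end of file */
        free(buf);
        return NULL;
    }
    return buf;                        /* last line had no newline */
}
```

The caller still has to remember to free each returned line, which is exactly the bookkeeping the C++ string does automatically.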

2) Another good reason to do this in C++ is that you are allowed to declare variables at their first use, and thus can initialize to a meaningful value right off the bat, rather than a dummy value. You can also do this implicitly by specifying constructor parameters on the declaration line.

3) Like you were told, there's no way to tell where one "item" ends in the file and the next begins except by having the file data indicate that in some way. The usual recommendation for string data is to prepend the string lengths.

4) I will hold your hand, even though I really should insist that you take it back to For Beginners first, and provide a full (not tested, but should be close) example for reading and writing:


#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Binary I/O
template <typename T>
T read_primitive(istream& is) {
    T result;
    is.read(reinterpret_cast<char*>(&result), sizeof(T));
    return result;
}

template <typename T>
void write_primitive(ostream& os, const T& t) {
    os.write(reinterpret_cast<const char*>(&t), sizeof(T));
}

string read_string_binary(istream& is) {
    // The std::vector handles memory allocation of a buffer for us.
    int len = read_primitive<int>(is);
    vector<char> buffer(len);
    is.read(&buffer[0], len);
    return string(buffer.begin(), buffer.end());
}

void write_string_binary(ostream& os, const std::string& s) {
    // Write the length as an int so it matches read_primitive<int> above.
    write_primitive(os, static_cast<int>(s.length()));
    os.write(s.c_str(), s.length());
}

// Text I/O
string read_string_text(istream& is) {
    int len;
    is >> len;
    is.get(); // skip the single space written after the length
    // We represented the length count in human-readable format, but we still
    // want to use .read() for the actual string data. Using the shift operator
    // just reads one token ("word").
    vector<char> buffer(len);
    is.read(&buffer[0], len);
    return string(buffer.begin(), buffer.end());
}

void write_string_text(ostream& os, const std::string& s) {
    // The space keeps the length apart from strings that start with a digit.
    os << s.length() << ' ' << s;
}

void test_binary() {
    ofstream test("foo.bin", ios::binary);
    // The char* literal will be implicitly converted to a std::string object.
    write_string_binary(test, "hello ");
    write_string_binary(test, "world");
    test.close();
    ifstream test2("foo.bin", ios::binary);
    string x = read_string_binary(test2);
    string y = read_string_binary(test2);
    cout << x + y << endl; // yes, this works with std::strings!
    // Of course, you could also just output them sequentially...
    // No need to close test2 at this point. I will explain that later if requested...
}

void test_text() {
    // All the same...
    ofstream test("foo.txt"); // text mode is default.
    write_string_text(test, "hello ");
    write_string_text(test, "world");
    test.close();
    ifstream test2("foo.txt");
    string x = read_string_text(test2);
    string y = read_string_text(test2);
    cout << x + y << endl;
}

int main() {
    test_binary();
    test_text();
}

Guest Anonymous Poster
Quote:
Original post by OneMoreToGo
I don't know how I manage to do it, but file reading and writing is the one thing I've never quite understood in C. I've got cref, so I've read about fopen and fread, I've seen it applied in tutorials and on man pages, but I just can't do it myself.

For example, what's wrong with this:

#include <stdlib.h>
#include <stdio.h>

int main() {

FILE *myFile = NULL;
char *txtArray;

myFile = fopen("loadme", "r");
fread(txtArray, 1, 2, myFile);

printf("%s\n", txtArray);

return(0);
}

I would think nothing, but it just gives me a Bus Error. I just want to read some text! But here are a couple of more specific questions, but first, the fread definition:

The fread() function reads, into the array pointed to by ptr, up to nitems members whose size is specified by size in bytes, from the stream pointed to by stream.


Ok, so for example is two words two items? And wouldn't their size in bytes differ across different lengths of words? Example "supercalifragalisticexpyalidocious" vs. "hello". And what if I don't know the lengths of the word or amount of "items" in the file?

Any help would be much appreciated.

Thanks!





You might also think about doing error checking in your code.

fopen() returns NULL if it can't open the file, and on most systems the global errno variable will then contain an error code.


if ((myFile = fopen("loadme", "r")) == NULL)
{
// print an error msgs "couldnt open file 'loadme' errno=" here...
return ...
}




Don't forget to fclose() your file as well.
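
A filled-in version of that check could look something like this sketch. The function name print_first_line and the 256-byte line buffer are assumptions made for the example. It also relies on fopen setting errno on failure, which POSIX specifies but the C standard itself does not promise.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print the first line of a file, reporting failures.
 * Returns 0 on success and -1 if the file cannot be opened or closed. */
int print_first_line(const char *path)
{
    char line[256];
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        /* strerror turns the errno code into readable text. */
        fprintf(stderr, "couldn't open file '%s': %s (errno=%d)\n",
                path, strerror(errno), errno);
        return -1;
    }
    if (fgets(line, (int)sizeof line, fp) != NULL)
        fputs(line, stdout);
    if (fclose(fp) != 0)
        return -1;
    return 0;
}
```

Calling print_first_line("loadme") on a missing file then produces a readable message instead of a crash further down.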



I choose not to use C++, because C is far more clean, and results in easier to understand code. If that requires that I may need to do a little more work, then perhaps that is what I shall do. I see C++ as somewhat overcomplicated, although perhaps I just don't like it because of its silly >> & << operands. Who the heck put those in anyway? ><

Besides, using C++ would mean having to re-learn certain aspects of the language, and also having to adapt the code on a project for which these file reading functions are destined, seeing as it would be stupid to only change my file extensions. If I were to use C++ for this work, I would have to go through my code making it C++'ish, otherwise there's no point at all.

You don't need to change your file extensions. C++ compilers will compile .c files fine.

And because the standard C++ streams are synchronized with C stdio by default (that is what ios::sync_with_stdio() controls), you can freely mix C-style and C++-style reading and writing functions on the console.

I'm afraid I can't agree that:

printf("%d\n",i);

is cleaner than

cout << i << endl;

And the reason for << is that it creates an extensible input/output system that can be extended to support user-defined types, unlike printf, fgets and all the other motley crew of functions which are by their nature error-prone and susceptible to array overflows.

Having said that, I quite like C programming as well so please feel free to ignore the above. :)

Ah yes, but what about those :: things. Why must they insist we have those dotted around our program like mines? What do they denote, other than, as far as I know, a sub function.

And that private and public stuff, what good is it? If you don't want to access a variable... DON'T ACCESS IT. There is no need to denote it as "private" to stop yourself from doing so.

Argh. Maybe I'll read a book on the language. :/

Ah, yes, I suppose that it's true.

Anyway, new problem! :)

Standard cpp never compiles on my computer (I'm using standard gcc on a mac.) I've had this issue before, and so decided to use C.

However, if I change the C++, so that it is not like that found on the internet or in books (which I'm assuming is standard, simple c++) so that all my cout's and cin's have std:: before them, it works. Now, it's doing the same thing but with my ifstream's and ofstreams. Say this is my code:

#include <iostream.h>
#include <fstream.h>


ifstream in_stream;
ofstream out_stream;

char some_variable;

int main() {

in_stream.open("test.txt");

in_stream >> some_variable;

std::cout << some_variable << endl;

return 0;
}

I would get these errors:



In file included from /usr/include/gcc/darwin/4.0/c++/backward/iostream.h:31,
from test.cpp:1:
/usr/include/gcc/darwin/4.0/c++/backward/backward_warning.h:32:2: warning: #warning This file includes at least one deprecated or antiquated header. Please consider using one of the 32 headers found in section 17.4.1.2 of the C++ standard. Examples include substituting the <X> header for the <X.h> header for C++ includes, or <iostream> instead of the deprecated header <iostream.h>. To disable this warning use -Wno-deprecated.
/usr/bin/ld: Undefined symbols:
std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
std::basic_ifstream<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode)
std::basic_ifstream<char, std::char_traits<char> >::basic_ifstream()
std::basic_ifstream<char, std::char_traits<char> >::~basic_ifstream()
std::basic_ofstream<char, std::char_traits<char> >::basic_ofstream()
std::basic_ofstream<char, std::char_traits<char> >::~basic_ofstream()
std::ios_base::Init::Init()
std::ios_base::Init::~Init()
std::cout
std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char)
std::basic_istream<char, std::char_traits<char> >& std::operator>><char, std::char_traits<char> >(std::basic_istream<char, std::char_traits<char> >&, char&)
___gxx_personality_v0
collect2: ld returned 1 exit status


And I'm not sure what they mean.

Thanks for any help you guys/gals can give me!

Quote:
Original post by OneMoreToGo
I choose not to use C++, because C is far more clean, and results in easier to understand code.


This is some kind of joke?


// C++
#include <iostream>
#include <string>
#include <algorithm>

int main() {
std::string hw("hello");
hw += ", world!";
std::sort(hw.begin(), hw.end());
hw += "%d"; // not for formatting, but to force workarounds in the C version ;)
std::cout << hw << 123 << std::endl;
}


vs.


// C
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int chrcmp(const void* x, const void* y) {
return *x - *y;
}

void main() {
char* h = "hello";
char* hw = malloc(strlen(h) + strlen(", world!") + strlen("%d") + 1);
strcat(strcpy(hw, h), ", world!");
qsort(hw, strlen(hw), 1, chrcmp);
strcat(hw, "%d"); /* I was clever enough to allocate ahead of time for this */
printf("%s%d\n", hw, 123);
free(hw);
}


In the C code you have to:

- Do all your own memory management - and either be unusually clever about it, or end up with repeated allocations. This is "cleaner"?
- Slow things down by repeatedly checking string lengths, or else clutter the code and complicate the logic by caching calculated string length values. The C++ string object carries the length with it.
- Invoke functions with names like "strcpy". This is "easier to understand"?
- Deal with format specifiers to output multiple things. The "silly" >> and << operands have the nice properties that they (a) can be made to work with any data type, while printf-style output can only ever work with primitives; (b) are typesafe - you never have a format string that needs to be kept in sync with the data, and you never have mysterious crashes when you mess that up.
- Write your own comparison function to compare two pointed-at characters! (If you think you can just use strcmp, well, it may appear to work, but it will also be inefficient, and do meaningless random comparisons of characters that follow the ones that the sort algorithm is currently interested in.) The C++ library sort has a templated less-than comparison available and also uses it by default - but you can supply something else if you need to.
- Tell the sort algorithm the size of your objects. And it's schizophrenic, too: when you write the comparison function, it's up to you to make sure that it understands the same "block size" that qsort does.

Quote:
Besides, using C++ would mean having to re-learn certain aspects of the language


No, it would require you to learn a new language that coincidentally happens to have a lot in common with C. Treating it as "an improved C" is a wrong mindset for learning the language. But it *is* definitely "improved", compared to C.

Quote:
and also having to adapt the code on a project for which these file reading functions are destined, seeing as it would be stupid to only change my file extensions. If I were to use C++ for this work, I would have to go through my code making it C++'ish, otherwise there's no point at all.


In the long run your investment will pay off. This is the year 2006; what are you still using C for? C++ is designed to give you all the convenience I illustrated above at the absolute minimum cost for what is provided, as a result of the careful efforts of lots of very smart people. Plus, a hell of a lot more research is being done on optimizing C++ compilers these days. C has almost no advantages in terms of expressive power to the l33t h4xor optimizer any more (the 'restrict' keyword is one of them, but its application is limited and it shifts a serious burden of static proof-of-correctness onto the programmer). When do you think was the last time anyone touched your vendor's qsort() implementation?

Quote:
Original post by OneMoreToGo
Standard cpp never compiles on my computer (I'm using standard gcc on a mac.) I've had this issue before, and so decided to use C.

However, if I change the C++, so that it is not like that found on the internet or in books (which I'm assuming is standard, simple c++) so that all my cout's and cin's have std:: before them, it works. Now, it's doing the same thing but with my ifstream's and ofstreams. Say this is my code:

<snip />

I would get these errors:

<snip />

And I'm not sure what they mean.

Thanks for any help you guys/gals can give me!


Did you actually read the error messages?

Quote:
Your compiler
Examples include substituting the <X> header for the <X.h> header for C++ includes, or <iostream> instead of the deprecated header <iostream.h>

It's telling you what to do! The headers <iostream.h> and <fstream.h> are both deprecated. Use <iostream> and <fstream> instead. Then put std:: in front of your ifstream and ofstream definitions:
#include <iostream>
#include <fstream>

std::ifstream in_stream;
std::ofstream out_stream;
char some_variable;

int main() {
in_stream.open("test.txt");
in_stream >> some_variable;
std::cout << some_variable << std::endl;
return 0;
}


And just to prove how hard it is to write error-free C code Zahlman (who has way more experience than you do) made an error in a 13 line example program (his chrcmp function tries to dereference a void *), and I nearly made one earlier on a similar length example (in fact there may well still be errors in the C code I wrote earlier).

And I really don't understand what you mean by:
Quote:
However, if I change the C++, so that it is not like that found on the internet or in books (which I'm assuming is standard, simple c++) so that all my cout's and cin's have std:: before them, it works.

What internet and/or books are you reading? Standard C++ uses namespaces and has done for the last eight years.

Σnigma

No luck, adding the std::'s and changing the header names has no effect, if I try std::basic_ifStream, like the errors say, I get told I'm missing arguments.

This is why I don't use C++. I'd rather be actually working and having to find more workarounds, than be stuck with problems with the simplest code.

Enigma, I tried your exact code, and it returns these errors:

/usr/bin/ld: Undefined symbols:
std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
std::basic_ifstream<char, std::char_traits<char> >::open(char const*, std::_Ios_Openmode)
std::basic_ifstream<char, std::char_traits<char> >::basic_ifstream()
std::basic_ifstream<char, std::char_traits<char> >::~basic_ifstream()
std::basic_ofstream<char, std::char_traits<char> >::basic_ofstream()
std::basic_ofstream<char, std::char_traits<char> >::~basic_ofstream()
std::ios_base::Init::Init()
std::ios_base::Init::~Init()
std::cout
std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char)
std::basic_istream<char, std::char_traits<char> >& std::operator>><char, std::char_traits<char> >(std::basic_istream<char, std::char_traits<char> >&, char&)
___gxx_personality_v0
collect2: ld returned 1 exit status

*shrug*

Also, in response to your question:

The C++ book I have is "Problem Solving with C++", by Walter Savitch. I like the way it teaches, but it leaves out all these strange std:: tags which appear necessary.

EDIT: Look: http://www.cplusplus.com/doc/tutorial/files.html

They don't use std::, but they're on the new headers? Could someone PLEASE explain this?

[Edited by - OneMoreToGo on June 1, 2006 5:18:07 PM]

Your problem is your compiler, not the language. How are you invoking the compiler? It looks like you're not linking the standard library.

If your C++ textbook doesn't mention namespaces then it doesn't matter how well written it is, because it's not teaching you proper C++. It would be a bit like a medical journal written by a Booker prize winner. I'm sure it would be a great read, but the information it contained would almost certainly be worthless. Get a new textbook.

<edit>
Quote:
Original post by OneMoreToGo
EDIT: Look: http://www.cplusplus.com/doc/tutorial/files.html

They don't use std::, but they're on the new headers? Could someone PLEASE explain this?
Quote:
http://www.cplusplus.com/doc/tutorial/files.html
using namespace std;

They have a using namespace declaration. A decent textbook would have taught you about this, as would an earlier tutorial at www.cplusplus.com.

</edit>

Σnigma

I added the namespace declaration, but no luck still. But no worries, I'll try resolving this on my own.

As for how I'm compiling it, I'm using GCC command line, with the simple invocation:

gcc test.cpp -o test

This has always worked before with C, perhaps cpp requires that I actually write in the framework inclusions?


I'll try an xCode C++ project next.


Thanks for all your help guys/gals!

You're great people!

EDIT: I also find it funny that these std:: things are now necessary, what purpose do they serve? I never needed them when I took two C++ coding courses 2 years back.

EDIT 2: Stuck it in an xCode project and I'll be darned, it worked. But the project does not appear to have any extra linking commands...

Share this post


Link to post
Share on other sites
Although I believe you can compile c++ code by invoking gcc directly usually one invokes g++. Try g++ test.cpp -o test

Σnigma

Quote:
Original post by Enigma
Although I believe you can compile c++ code by invoking gcc directly usually one invokes g++. Try g++ test.cpp -o test

Σnigma

Using gcc to compile C++ doesn't link the standard c++ library, which would explain the OP's problem.

Quote:
Original post by bytecoder
Quote:
Original post by Enigma
Although I believe you can compile c++ code by invoking gcc directly usually one invokes g++. Try g++ test.cpp -o test

Σnigma

Using gcc to compile C++ doesn't link the standard c++ library, which would explain the OP's problem.


Ah, that I did not know.

Quote:
Original post by Enigma
And just to prove how hard it is to write error-free C code Zahlman (who has way more experience than you do) made an error in a 13 line example program (his chrcmp function tries to dereference a void *)


Hee hee, good point. I need to cast those eh? :) Naturally I have void*'s coming in because that's the function prototype that qsort() expects. Typesafety? What's that? :)
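
For the record, a version of the comparison function with the casts in place might look like the sketch below. chrcmp is the name from the earlier post; sort_chars is a made-up wrapper added here so the call to qsort is visible. Comparing as unsigned char is a choice made for this example so that the ordering does not depend on whether plain char is signed.

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator const void * pointers, so cast them back to
 * the real element type before dereferencing. */
int chrcmp(const void *x, const void *y)
{
    unsigned char a = *(const unsigned char *)x;
    unsigned char b = *(const unsigned char *)y;

    return (a > b) - (a < b);   /* negative, zero or positive */
}

/* Sort the characters of a writable string in place. */
void sort_chars(char *s)
{
    qsort(s, strlen(s), sizeof *s, chrcmp);
}
```

The compiler cannot check that the element size given to qsort and the type assumed inside chrcmp agree, which is the type-safety gap being joked about above.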
