
The C++ std::vector and file I/O

This topic is 4358 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I want to use std::vector to manage a block of allocated memory so that I don't have to remember to free it afterwards. The block will be filled with data from a file, but the file I/O routines I've been using (ReadFile(), _read(), fread(), for example) require a pointer to allocated memory, and I don't know how to use them with std::vector. I tried this (silly) workaround, using _open() and _read() for simplicity:
int const FHandle = _open (/*...*/) ;
unsigned int const DataSize = /*...*/ ;

//...

std::vector<unsigned char *> pData ;
pData.resize (DataSize) ;

//...

if (_read (FHandle, &pData.at (0), DataSize) == -1) /*...Error...*/ ;
Please don't nitpick the _open() and _read() functions; I know they're as old as the Earth [smile]. I took the address of the first element in the vector and wrote directly to it. This works, but of course it feels unsafe. Can you point out the proper way to do this? Feel free to throw out any other STL types I should learn in order to work it out [grin].

A std::vector is perfectly suitable for this, but for a more generic approach you might want to look at smart pointers (standard C++ provides auto_ptr, which C++11 superseded with std::unique_ptr and std::shared_ptr, and the Boost libraries provide a number of additional smart pointer types) and templated classes that make resource management via the RAII idiom very easy.
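
As a rough sketch of the RAII idea for a resource other than memory, assuming a C++11 compiler: a std::unique_ptr with a custom deleter can own a FILE* so that fclose runs automatically when the owner goes out of scope. The names FileCloser, FilePtr and openFile are made up for this illustration and do not come from any library.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical names for illustration: FileCloser, FilePtr, openFile.
// The deleter is only invoked when the owned pointer is non-null.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file; the returned FilePtr is empty if fopen failed.
inline FilePtr openFile(const char* path, const char* mode) {
    assert(path != nullptr && mode != nullptr);
    return FilePtr(std::fopen(path, mode));
}
```

A caller would test the result with `if (!file)` and pass `file.get()` to fread and friends; no explicit fclose is needed on any exit path.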

You're on the right track with your example code:

const size_t DataSize = /* ... */;

// specify the initial size immediately
std::vector<unsigned char> vec(DataSize);

// or you can size it later (just don't forget!)
vec.resize(DataSize);

if (_read(FHandle, &*vec.begin(), DataSize) == -1) { /* ... handle error ... */ }


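&*vec.begin() is a bit weird looking until you think about what it's doing: *vec.begin() dereferences the iterator to give a reference to the first element, and & takes that element's address. You can also use the &vec[0] syntax if you prefer it, but I avoid it because it has slightly less chance of catching a particular class of error. Either way, the vector must not be empty, because then there is no first element to take the address of.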
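
For a portable variant of the same pattern, here is a minimal sketch using fread instead of _read, and assuming C++11 so that vector::data() is available as a third way to get the pointer. The helper name readBytes is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads up to `count` bytes from an already-open C
// stream and returns the bytes that were actually read.
inline std::vector<unsigned char> readBytes(std::FILE* file, std::size_t count) {
    assert(file != nullptr);
    std::vector<unsigned char> buffer(count);  // the elements must exist first
    if (buffer.empty()) return buffer;         // nothing to read
    const std::size_t got = std::fread(buffer.data(), 1, buffer.size(), file);
    buffer.resize(got);                        // drop the part that was not filled
    return buffer;
}
```

Shrinking the vector to the byte count that fread reports keeps size() honest, so later code never looks at elements the read did not fill.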

Thanks. In the meantime I have implemented a template container class and used std::auto_ptr to manage it. What worries me a little is that I'm modifying the memory directly, without the std::vector object knowing about the modification. That could cause weird behavior in the vector class, which I guess is something people discourage (I haven't used the STL long enough to know).

So, if I want to modify the data owned by a std::vector, am I free to mutate it as long as I stay within the allocated memory?

Or is there another way to modify it that ensures the vector won't be corrupted if something is wrong in my code?

And a minor detail: I have to use reinterpret_cast</*...*/> when I take the pointer to the vector element, which also feels unsafe.

Anyway thanks for your help.

The standard requires that a vector's storage be contiguous, so yes: as long as you've called resize() in advance (or used the vector constructor that takes a size) so that the required number of elements already exists, you're free to modify the contents.

This specific technique is discussed in item 78 of Sutter and Alexandrescu's C++ Coding Standards, and item 16 of Meyers' Effective STL (and probably other places too).
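
To make the "sized in advance" condition concrete, here is a small sketch. fillPattern is a made-up stand-in for any C-style function that writes through a raw pointer, and makeFilled is likewise a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for a C-style API that writes `n` bytes through a raw pointer
// (hypothetical; a read()- or fread()-like function behaves the same way).
inline void fillPattern(void* dest, std::size_t n) {
    std::memset(dest, 0xAB, n);
}

inline std::vector<unsigned char> makeFilled(std::size_t n) {
    std::vector<unsigned char> v;
    v.resize(n);                       // creates n elements
    assert(v.size() == n);
    if (!v.empty())
        fillPattern(&v[0], v.size());  // writing within [0, size()) is fine
    return v;
}

// By contrast, v.reserve(n) only allocates capacity. size() stays 0, so
// writing through a pointer into that storage is undefined behaviour, and
// the vector would not know the data is there.
```

The vector keeps no bookkeeping about element values, only about its size and capacity, so overwriting existing elements cannot confuse it.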

You shouldn't need a reinterpret_cast (or any cast at all) if your syntax is correct. The example code below compiles for me without any warnings or errors (-W -Wall with GCC 4):


typedef unsigned char byte;
static const size_t MAX_BUFFER_SIZE = 512;

/* ... */

std::vector<byte> buffer(MAX_BUFFER_SIZE);

/* ... */

if (read(fd, &*buffer.begin(), buffer.size()) < 0) {
    /* handle error */
}


The result of evaluating &*buffer.begin() is a byte* (aka unsigned char*), which converts implicitly to void* without any cast.
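
A cast does become necessary when the target function takes char* rather than void*, as std::istream::read does. The sketch below contrasts the two cases; copyRaw, copyInto and readStream are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <vector>

// Stand-in for a C API that takes void*, as _read() and fread() do
// (hypothetical name, for illustration only).
inline void copyRaw(void* dest, const void* src, std::size_t n) {
    std::memcpy(dest, src, n);
}

// Copies `n` bytes from `src` into the front of `buf`.
inline void copyInto(std::vector<unsigned char>& buf,
                     const unsigned char* src, std::size_t n) {
    assert(n <= buf.size());
    if (n == 0) return;
    copyRaw(&buf[0], src, n);  // unsigned char* -> void*: implicit, no cast
}

// Fills `buf` from a stream and returns the number of bytes read.
inline std::size_t readStream(std::istream& in, std::vector<unsigned char>& buf) {
    if (buf.empty()) return 0;
    // istream::read takes char*; unsigned char* does not convert to it
    // implicitly, so this call needs the reinterpret_cast.
    in.read(reinterpret_cast<char*>(&buf[0]),
            static_cast<std::streamsize>(buf.size()));
    return static_cast<std::size_t>(in.gcount());
}
```

Declaring the buffer as std::vector<char> avoids the cast for stream reads, at the cost of a signed element type on most platforms.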

If you want to read in the entire file, you can use std::copy and file iterators:


vector<myType> v;

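copy(istreambuf_iterator<char>(myFile), (istreambuf_iterator<char>()),
     back_inserter(v));

// Use istreambuf_iterators to read raw bytes; use istream_iterators to
// read text files (i.e. formatted I/O).
// The extra parentheses are harmless in a call like this one. They matter
// when the same two iterators are passed straight to a vector constructor,
// where without them the line would be parsed as a function declaration.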
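
A self-contained sketch of the whole-file read, written against std::istream so that it works for an ifstream opened in binary mode as well as for an in-memory stream. The helper name readAll is hypothetical, and the element type is fixed to char for simplicity.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical helper: slurps everything left in `in` into a byte vector.
inline std::vector<char> readAll(std::istream& in) {
    // The parentheses around the first argument stop the compiler from
    // parsing this line as a function declaration (the "most vexing parse").
    std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return bytes;
}
```

Because istreambuf_iterator works on the underlying stream buffer, whitespace and newlines are kept as they are rather than skipped.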


If you only want to read a specific amount, you can try std::copy_n, which started as an SGI extension and has been standard since C++11. Looping manually and reading individual items also works. For binary files, I like to have a templated helper function that reads a single item (if you have serialized non-POD structs/classes, you can then specialize it so that it reads your serialized form correctly) and then loop with it:


// Both these interfaces are useful depending on the situation ;)

template <typename T>
void readBinary(ifstream& source, T& dest) {
    source.read(reinterpret_cast<char*>(&dest), sizeof(T));
}

template <typename T>
T readBinary(ifstream& source) {
    T result;
    readBinary(source, result);
    return result;
}

//...
vector<myType> v;
for (int i = 0; i < numItems; ++i) {
    // This overload can't deduce T from its arguments, so name it explicitly.
    v.push_back(readBinary<myType>(myFile));
}
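
One possible variation on the helper above, sketched under a few assumptions: it takes std::istream& so it can be exercised without a real file, reports short reads through its return value, and uses C++11's std::is_trivially_copyable to reject types that cannot safely be filled byte by byte. readItems is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <type_traits>
#include <vector>

// Reads one T from a binary stream; returns false if the stream ran short.
// Only meaningful for data written with the same layout and byte order.
template <typename T>
bool readBinary(std::istream& source, T& dest) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte reads need a trivially copyable type");
    source.read(reinterpret_cast<char*>(&dest),
                static_cast<std::streamsize>(sizeof(T)));
    return static_cast<bool>(source);
}

// Reads up to numItems values, stopping early on a short read.
template <typename T>
std::vector<T> readItems(std::istream& source, std::size_t numItems) {
    std::vector<T> items;
    items.reserve(numItems);
    for (std::size_t i = 0; i < numItems; ++i) {
        T item;
        if (!readBinary(source, item)) break;  // incomplete item is discarded
        items.push_back(item);
    }
    return items;
}
```

Comparing the size of the returned vector with numItems tells the caller whether the file held as many items as expected.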

Thanks Zahlman. I will consider that approach.
After quickly replacing some of my code with STL code (std::vector, string and some algorithms), I couldn't believe how a large amount of code was reduced to a few simple lines. How silly of me to refuse to use the STL earlier [grin], for no decent reason.

Sorry I can't rate you again [smile].

No problem. :)

It is important (in order to sound like you know what you are talking about ;) ) to distinguish between the standard library "containers" (data structures, like vector, string etc.) and "algorithms" (like copy, fill etc.). But anyway, you can get plenty of information on both from SGI's reference. (Warning: This documents the STL, not the C++ standard library. So it leaves out things like iostream, and includes some extensions, such as copy_n, that only became standard in C++11.)
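
A tiny sketch of how the two halves fit together: the containers own the data, and the algorithms work on iterator ranges without caring which container supplied them. paddedCopy is a hypothetical function written only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical demo: copies `text` into a vector of `width` chars,
// padding the remainder with '.'.
inline std::vector<char> paddedCopy(const std::string& text, std::size_t width) {
    assert(width >= text.size());
    std::vector<char> out(width);                      // container
    std::fill(out.begin(), out.end(), '.');            // algorithm on a vector
    std::copy(text.begin(), text.end(), out.begin());  // algorithm reading a string
    return out;
}
```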
