Nairou

vector<char> to int


Alright, again this feels like a stupid question, but I can't think of the right way to do it. I've got a vector<char> buffer, and I'm reading binary data into it from a file. The data is structured: some of it consists of bytes, and other parts consist of words and dwords. What's the best way to read an int out of a particular point in this vector? For example:
vector<char> buffer;
unsigned int stringsize;

// file is read into the buffer here...

stringsize = buffer[28];
I know the last line is incorrect, but that's the sort of idea I'm trying to get at. I'm sure there's a better way to do it than something like this:
stringsize = (buffer[29] << 8) | buffer[28];
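For reference, the shift-and-or approach can be made portable with a couple of small helpers. The sketch below assumes the file stores its integers in little-endian order (as ZIP does); the names `byte_at`, `read_u16_le` and `read_u32_le` are made up for illustration. Note that the one-liner above can sign-extend on platforms where plain `char` is signed, which is why each byte is cast to an unsigned type first.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fetch one byte as an unsigned value.
// at() throws std::out_of_range for a bad offset, and the cast
// prevents sign extension when plain char is signed.
inline std::uint8_t byte_at(const std::vector<char>& buf, std::size_t pos)
{
    return static_cast<std::uint8_t>(buf.at(pos));
}

// Decode a 16-bit little-endian value starting at pos.
inline std::uint16_t read_u16_le(const std::vector<char>& buf, std::size_t pos)
{
    return static_cast<std::uint16_t>(byte_at(buf, pos) | (byte_at(buf, pos + 1) << 8));
}

// Decode a 32-bit little-endian value starting at pos.
inline std::uint32_t read_u32_le(const std::vector<char>& buf, std::size_t pos)
{
    return  static_cast<std::uint32_t>(byte_at(buf, pos))
         | (static_cast<std::uint32_t>(byte_at(buf, pos + 1)) << 8)
         | (static_cast<std::uint32_t>(byte_at(buf, pos + 2)) << 16)
         | (static_cast<std::uint32_t>(byte_at(buf, pos + 3)) << 24);
}
```

With these, the example becomes `stringsize = read_u16_le(buffer, 28);`, and the result does not depend on the host's byte order or alignment rules.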

This should work for basic types on an x86, but don't assume it to be portable:


template<typename Type>
const std::vector<char>& operator>>(const std::vector<char>& buf, Type& out)
{
    out = (Type*)&vector.at(offset);
    return buf;
}

void test()
{
    std::vector<char> buffer;
    // fill the buffer
    int value;
    buffer >> value;
}
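As posted, the operator refers to `vector` and `offset`, neither of which is defined, and it assigns a pointer to `out`. A compilable take on the same idea might look like the sketch below. The function name `read_raw` and the explicit `offset` parameter are assumptions added for illustration, and `std::memcpy` is swapped in for the pointer cast to sidestep alignment and aliasing problems. Like the original, it yields the host's byte order, so it is not portable across endianness.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical helper: copy sizeof(Type) bytes starting at offset into a value.
// The bytes are interpreted in the host's byte order, so this only gives the
// right answer when the file and the machine agree on endianness.
template <typename Type>
Type read_raw(const std::vector<char>& buf, std::size_t offset)
{
    static_assert(std::is_trivially_copyable<Type>::value,
                  "Type must be trivially copyable");
    if (offset > buf.size() || buf.size() - offset < sizeof(Type))
        throw std::out_of_range("read_raw: not enough bytes in buffer");

    Type out;
    std::memcpy(&out, buf.data() + offset, sizeof(Type));
    return out;
}
```

Usage would be along the lines of `int value = read_raw<int>(buffer, 28);`.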

Hmm. That seems almost as bad as doing it manually, and portability is a pretty high priority. Pointers are a definite option: since vectors are contiguous, I could treat the buffer as a char array and interpret it as anything, but I was hoping to avoid that.

Also, pulling an int out of the vector was just one case, there are several other data types I need to pull out, such as strings (preferably std::string).
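For the string case, one possible sketch: given an offset and a length that were already decoded from the surrounding fields, copy that byte range into a `std::string`. The helper name `read_string` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: copy `length` bytes starting at `offset` into a std::string.
// The bytes are taken as-is; no terminator is expected in the buffer.
std::string read_string(const std::vector<char>& buf, std::size_t offset, std::size_t length)
{
    if (offset > buf.size() || buf.size() - offset < length)
        throw std::out_of_range("read_string: range exceeds buffer");

    const std::vector<char>::const_iterator first =
        buf.begin() + static_cast<std::ptrdiff_t>(offset);
    return std::string(first, first + static_cast<std::ptrdiff_t>(length));
}
```

Because the length is passed explicitly, this works for length-prefixed names that are not null-terminated.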

One option was to read it as a struct vector instead of char vector, but the data contains variable-length text strings, which makes that just as complex.

I suppose I could always do something like:


vector<char> buffer;

int *data = reinterpret_cast<int*>(&buffer[28]); // explicit cast needed: char* does not convert to int* implicitly
int value = *data;


But it seems very round-about. Is there no standard way of doing this?

I would suggest that you fill the appropriate variables when reading in your file, instead of reading it all to a vector and then trying to parse the vector.

You might also want to look at the section on settings in the Enginuity series here, if you don't like the previous suggestion.
[edit] What is it you are trying to do, exactly, before I send you in the wrong direction?

Note: joanusdmentia's code seems like it confuses istream and vector; it needs offset defined. Also, vector.at should read buf.at

Regards,
jflanglois

[Edited by - jflanglois on December 4, 2004 9:50:32 PM]

Reading individual values might be the best idea, I was just trying to minimize disk access.

What I'm doing right now is parsing zip files, reading the tables to locate the compressed file headers and read out the file info and names.

Quote:
Original post by Nairou
Reading individual values might be the best idea, I was just trying to minimize disk access.


Don't try (not in this way, anyway); the stream object does this for you (by reading a "page" of the file at a time and maintaining an internal buffer).

What are you using to read your file? ifstream? CreateFile? fopen?
With ifstream, it would look like this:

#include <iostream>
#include <fstream>

using namespace std;

// Assumes a little-endian host with a 4-byte unsigned and a 2-byte short.
int main() {

    ifstream fin( "test.zip", ios::binary );

    // Jump to where the 22-byte end-of-central-directory record starts
    // when the archive has no trailing comment
    fin.seekg( -22, ios_base::end );

    // Scan backwards one byte at a time for the record's signature
    unsigned sig = 0;
    fin.read( reinterpret_cast < char * > ( &sig ), 4 );
    while ( fin && sig != 0x06054b50 ) {
        fin.seekg( -5, ios_base::cur );
        fin.read( reinterpret_cast < char * > ( &sig ), 4 );
    }
    if ( !fin ) {
        cout << "Not a zip file?" << endl;
        return 1;
    }

    // skip three 2-byte fields to reach the total number of files
    fin.seekg( 6, ios_base::cur );

    // get number of files
    unsigned short numfiles;
    fin.read( reinterpret_cast < char * > ( &numfiles ), sizeof( numfiles ) );

    // get length of name of first element
    // (offset 26 of the first local file header, assumed to start at byte 0)
    fin.seekg( 26 );
    unsigned short strlength;
    fin.read( reinterpret_cast < char * > ( &strlength ), sizeof( strlength ) );

    // get name of first element (skip the 2-byte extra field length first)
    char *filename = new char[strlength + 1];
    fin.seekg( 2, ios_base::cur );
    fin.read( filename, strlength );
    filename[strlength] = '\0';

    cout << "First file: \"" << filename << "\"" << endl;

    delete [] filename;

    cin.get();
    return 0;
}






Output:
Quote:
First file: "sorcerers.txt"


What spec are you using?

Quote:
Original post by Zahlman
Quote:
Original post by Nairou
Reading individual values might be the best idea, I was just trying to minimize disk access.


Don't try (not in this way, anyway); the stream object does this for you (by reading a "page" of the file at a time and maintaining an internal buffer).


I don't understand what you mean by this. The ifstream will keep a cursor on the file. How does that mean that it is not a good idea to read individual values from the file instead of the whole file to a buffer, and then trying to parse that?

Thanks jflanglois, that clears it up a lot. I'm using ifstream (via boost::filesystem), I just didn't want to assume the read calls were being buffered in advance. :)

I'm using the PKZip specs off the pkware site, and it's all working well now. I now have it reading the bulk of the header directly into a struct, then reading the variable parts (like filenames) as I need them afterwards.

You're welcome. That's the spec I used as well. Make sure you don't read the central directory for file headers because according to the site the CD can be compressed (unless of course you have code to decompress it).

Damn struct padding... looks like I may have to read the values individually after all...
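To illustrate the padding problem: the fixed part of a ZIP local file header is 30 bytes on disk, but a struct with the same fields is typically 32 bytes in memory on common compilers, because the 32-bit CRC field gets aligned to a 4-byte boundary. A portable alternative is to decode each field in order. The sketch below assumes the header bytes are already in a `vector<char>`; `LocalHeader`, `ByteReader` and `parse_local_header` are made-up names, not anything from the spec or the posts above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of the fixed 30-byte local file header.
struct LocalHeader {
    std::uint32_t signature;         // 0x04034b50
    std::uint16_t version_needed;
    std::uint16_t flags;
    std::uint16_t compression;
    std::uint16_t mod_time;
    std::uint16_t mod_date;
    std::uint32_t crc32;             // compilers usually insert 2 padding bytes before this
    std::uint32_t compressed_size;
    std::uint32_t uncompressed_size;
    std::uint16_t name_length;
    std::uint16_t extra_length;
};

// Small cursor that decodes little-endian fields one after another.
class ByteReader {
public:
    explicit ByteReader(const std::vector<char>& buf) : buf_(buf), pos_(0) {}

    std::uint16_t u16() {
        const std::uint16_t lo = next();   // separate statements fix the read order
        const std::uint16_t hi = next();
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }
    std::uint32_t u32() {
        const std::uint32_t lo = u16();
        const std::uint32_t hi = u16();
        return lo | (hi << 16);
    }

private:
    // at() throws std::out_of_range if the buffer is too short.
    std::uint8_t next() { return static_cast<std::uint8_t>(buf_.at(pos_++)); }

    const std::vector<char>& buf_;
    std::size_t pos_;
};

LocalHeader parse_local_header(const std::vector<char>& buf)
{
    ByteReader in(buf);
    LocalHeader h;
    h.signature         = in.u32();
    h.version_needed    = in.u16();
    h.flags             = in.u16();
    h.compression       = in.u16();
    h.mod_time          = in.u16();
    h.mod_date          = in.u16();
    h.crc32             = in.u32();
    h.compressed_size   = in.u32();
    h.uncompressed_size = in.u32();
    h.name_length       = in.u16();
    h.extra_length      = in.u16();
    return h;
}
```

The struct's in-memory layout no longer matters here, since nothing is read into it wholesale.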

Common zip files can have CD compression? I haven't run into one that does so far. I just assumed it was one of those unused features, like Zip64, that I wouldn't have to worry about in a game environment. I was actually reading the individual file headers to begin with, then read that they might not always be accurate or complete, and to use the CD instead... [rolleyes]

About struct padding, you might want to look at this (at the end).

I read that the CD can be compressed on the pkzip site spec. Those headers contain a bit more data, but nothing that seems to be essential. It would sort of defeat the purpose if the local headers could be inaccurate (corruption aside).
Quote:
With the introduction of the Central Directory Encryption feature in version 6.2 of this specification, the Central Directory Structure may be stored both compressed and encrypted. Although not required, it is assumed when encrypting the Central Directory Structure, that it will be compressed for greater storage efficiency. Information on the Central Directory Encryption feature can be found in the section describing the Strong Encryption Specification. The Digital Signature record will be neither compressed nor encrypted.

Quote:
Original post by jflanglois
About struct padding, you might want to look at this (at the end).


Thanks for the link, though my main problem is that I need it to be portable. From what I've read here in the forums, there doesn't appear to be one portable method of packing structs.

Quote:
Original post by jflanglois
I read that the CD can be compressed on the pkzip site spec. Those headers contain a bit more data, but nothing that seems to be essential. It would sort of defeat the purpose if the local headers could be inaccurate (corruption aside).


I think it was the "compressed size" and "uncompressed size" fields I was thinking of, which may or may not be present in the local headers, but which are always present in the CD. I need those values to stream files out of the zip as needed. Hmm... I guess I'll look into it more. I still don't know if CD compression is something I need to worry much about; I can always just require all my zips to have uncompressed CDs (which seems to be the default; I haven't seen one compressed so far).
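As far as I can tell from the spec, whether the local header carries real sizes is signalled by bit 3 of the general purpose bit flag: when it is set, the CRC and both size fields in the local header are zero and the real values follow the file data in a data descriptor. A minimal check, with made-up names, might be:

```cpp
#include <cstdint>

// Bit 3 (mask 0x0008) of the general purpose bit flag.
// The constant and function names here are illustrative only.
const std::uint16_t kDataDescriptorFlag = 0x0008;

// True when the sizes in the local header are placeholders and the
// real values live in a trailing data descriptor (or the central directory).
inline bool sizes_in_data_descriptor(std::uint16_t general_purpose_flags)
{
    return (general_purpose_flags & kDataDescriptorFlag) != 0;
}
```

When this returns true, the central directory entry is the simpler place to get the sizes from.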

Quote:
Original post by jflanglois
Quote:
Original post by Zahlman
Quote:
Original post by Nairou
Reading individual values might be the best idea, I was just trying to minimize disk access.


Don't try (not in this way, anyway); the stream object does this for you (by reading a "page" of the file at a time and maintaining an internal buffer).


I don't understand what you mean by this. The ifstream will keep a cursor on the file. How does that mean that it is not a good idea to read individual values from the file instead of the whole file to a buffer, and then trying to parse that?


LOL! No, I didn't mean "Don't try to read individual values", I meant "Don't try to minimize disk access like this". The ifstream does the buffering already, so any buffering you implement yourself is redundant - just read a variable at a time.
