
File format reading


OK... I just need a hand getting started with this. I have a file type definition from Wotsits format and I want to know how to go about reading it. Here is the first part I want to read; it is the first 4 bytes in the file:

local file header signature 4 bytes (0x04034b50)

How would I open the file, read out the 4 bytes, and check whether they match the definition above, to see if I am reading the right type of file?

Thanks,
David

First of all, you don't even mention what language you're using.

Second of all, why exactly do you need someone to tell you how to do this? The things you ask are so simple and basic that any answer would probably be 4-5 lines of code you'd copy and paste. You need to look up the commands, in whatever language you're using, that:

1) Open a file for reading
2) Read from a file
3) Compare a value with another value

If you do all that and still have problems, post them here, but you have to try it yourself first.

I am using C++, and I am having problems with this because I am not sure what mode I am supposed to open the file in or how to check the values, because they are in hex. And I have tried myself; I just wanted some advice and forgot to mention that I was using C++. Here is the code:


ifstream zipFile(argv[1], ios::in | ios::binary);
char fileHeaderSig[4];
zipFile.read(fileHeaderSig, 4);
if(fileHeaderSig != (char*)0x504B0304)
{
cout << "The signature found in the file was: " << fileHeaderSig << endl;
cout << "This is not a zip file signature" << endl;
system("PAUSE");
}


When I output the value I get PK♥♦\RC, which is right apart from the \RC on the end, which I believe is some sort of escape sequence. But what is it doing, and how do I get rid of it so I can check the value properly?

You may want to use zlib to handle zip files (the minizip code in its contrib directory reads zip archives), unless you're doing it for learning purposes.

Oh and this would probably work:


ifstream zipFile(argv[1], ios::in | ios::binary);
unsigned int fileHeaderSig;
zipFile.read(&fileHeaderSig, 4);
if(fileHeaderSig != 0x504B0304)
{
cout << "The signature found in the file was: " << fileHeaderSig << endl;
cout << "This is not a zip file signature" << endl;
system("PAUSE");
}

fileHeaderSig is an array. C++ can't compare whole arrays like you try to do. When you write just 'fileHeaderSig' it behaves like a pointer. "fileHeaderSig != (char*)0x504B0304" means "compare the address of fileHeaderSig array with 0x504B0304", which is not what you want to do. It does not compare the 4 bytes of fileHeaderSig with 0x504B0304, which is just an integer. So, if you want to compare with an integer value, read an integer, which is 4 bytes:

unsigned int fileHeaderSig;
zipFile.read((char*)fileHeaderSig, 4);
if(fileHeaderSig != 0x504B0304)
{
...
}
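
A side note on the integer approach: the snippet above casts the value of fileHeaderSig rather than its address, and the constant 0x504B0304 only matches if the 4 bytes are interpreted most-significant byte first. The format description gives the signature as 0x04034b50, and ZIP stores multi-byte values little-endian, so the bytes on disk are 50 4B 03 04. A minimal sketch that avoids depending on the host's int size or byte order is to read the bytes and assemble the value by hand. The names kLocalFileHeaderSig, readU32LE and demoReadSignature are made up for this sketch; they do not come from the thread or from any library.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Signature value as printed in the format description.
const std::uint32_t kLocalFileHeaderSig = 0x04034b50u;

// Hypothetical helper: reads 4 bytes and assembles them least-significant
// byte first, so the result is the same on any host.
bool readU32LE(std::istream& in, std::uint32_t& out) {
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), 4)) {
        return false;  // short read or stream error
    }
    out = static_cast<std::uint32_t>(b[0])
        | static_cast<std::uint32_t>(b[1]) << 8
        | static_cast<std::uint32_t>(b[2]) << 16
        | static_cast<std::uint32_t>(b[3]) << 24;
    return true;
}

// Usage sketch on an in-memory stream instead of a real file.
void demoReadSignature() {
    std::istringstream fake(std::string("PK\x03\x04", 4));
    std::uint32_t sig = 0;
    bool ok = readU32LE(fake, sig);
    assert(ok && sig == kLocalFileHeaderSig);
}
```

With a real file you would pass an ifstream opened with ios::binary instead of the istringstream.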

Neither of these 2 methods works. Any other suggestions? They won't compile because of bad conversions, and when I mess around with them I get crashes or the wrong text in the output.

0) Like mikeman said (typo-fix and all). Except, don't read into an int for this; the size isn't guaranteed to match and the endianness might differ as well.

Read into the character buffer as before, but compare each individual character.

I recommend doing this with the standard library algorithm std::equal, from the <algorithm> header:


const char expectedSig[] = { 0x50, 0x4B, 0x03, 0x04 };
const int sig_size = sizeof(expectedSig) / sizeof(expectedSig[0]);
char fileHeaderSig[sig_size];

ifstream zipFile(argv[1], ios::in | ios::binary);
zipFile.read(fileHeaderSig, sig_size);
if (!std::equal(fileHeaderSig, fileHeaderSig + sig_size, expectedSig)) {
// report error
}
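
For anyone who wants to try the std::equal approach in isolation, here is one way it might be wrapped up as a function. This is only a sketch: hasZipSignature and demoHasZipSignature are names invented here, and the gcount() check for a file shorter than 4 bytes is an addition that the snippet above leaves out.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: true if the next 4 bytes of the stream are
// 'P' 'K' 0x03 0x04, compared byte by byte.
bool hasZipSignature(std::istream& in) {
    static const char expectedSig[] = { 0x50, 0x4B, 0x03, 0x04 };
    const std::streamsize sigSize = sizeof(expectedSig);
    char buf[sizeof(expectedSig)];

    in.read(buf, sigSize);
    if (in.gcount() != sigSize) {
        return false;  // fewer than 4 bytes were available
    }
    return std::equal(buf, buf + sigSize, expectedSig);
}

// Usage sketch on an in-memory stream instead of a real file.
void demoHasZipSignature() {
    std::istringstream fake(std::string("PK\x03\x04", 4) + "payload");
    bool ok = hasZipSignature(fake);
    assert(ok);
}
```

With a real file, the stream would be an ifstream opened with ios::in | ios::binary, as in the code above.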


Quote:
Original post by win_crook
and when I output the value I get PK♥♦\RC


1) That's because the buffer contents are not null-terminated. That's fine; you don't want to interpret this thing as a string, but as a 4-byte "magic number".

Quote:
which I believe is some sort of escape sequence.


No. That's just the contents of memory that happened to follow the buffer, until a null terminator was found. When you attempt to output a char[] buffer, the buffer name is interpreted as a pointer, and the ostream operator<< overload for char* assumes you have a null-terminated string.
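
If you still want to look at the 4 bytes while debugging, one option is to print them with an explicit length so nothing past the buffer is touched. The sketch below prints each byte as two hex digits; printHex and demoPrintHex are hypothetical names for illustration only. (ostream::write(buffer, 4) would be the equivalent for raw, unformatted output.)

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes exactly n bytes as two-digit uppercase hex
// values separated by spaces. No null terminator is needed.
void printHex(std::ostream& out, const char* data, std::size_t n) {
    std::ios_base::fmtflags savedFlags = out.flags();
    char savedFill = out.fill('0');
    out << std::hex << std::uppercase;
    for (std::size_t i = 0; i < n; ++i) {
        if (i != 0) {
            out << ' ';
        }
        // Go through unsigned char so bytes >= 0x80 do not sign-extend.
        out << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(data[i]));
    }
    out.flags(savedFlags);  // restore the caller's formatting state
    out.fill(savedFill);
}

// Usage sketch with a buffer that has no null terminator.
void demoPrintHex() {
    const char sig[4] = { 0x50, 0x4B, 0x03, 0x04 };
    std::ostringstream text;
    printHex(text, sig, sizeof(sig));
    assert(text.str() == "50 4B 03 04");
}
```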

You will never see "an escape sequence" in output; you will see the actual character that the escape sequence represents. These sequences exist so that you can represent those character values *in code*, in places where you otherwise couldn't (because the raw character would confuse the parser; e.g. a double quote would be interpreted as ending a string literal).
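
A small illustration of that point, as a sketch only (quotedGreeting and demoEscapes are made-up names): each escape sequence in the source turns into exactly one character in the compiled string.

```cpp
#include <cassert>
#include <string>

// The literal contains two \" escapes and one \n escape; each becomes a
// single character in the resulting string.
std::string quotedGreeting() {
    return "say \"hi\"\n";
}

void demoEscapes() {
    std::string s = quotedGreeting();
    assert(s.size() == 9);   // s,a,y,space,",h,i,",newline
    assert(s[4] == '"');     // the backslash itself is not stored
    // Hex escapes are just another spelling of the same characters.
    assert(std::string("\x50\x4B") == "PK");
}
```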
