Binary file input (ifstream)

Started by
6 comments, last by Rhaal 17 years, 8 months ago
I'm trying to learn a little more about file formats and different file structures. I figured I'd start with the WAVE file format and see if I could extract information from it. My first attempt is to extract the number of channels from a wav file, which according to http://www.ringthis.com/dev/wave_format.htm, is 23 bytes into the file. So I said, "Hey, I know how to use filestreams! I've done it in my text games, so it can't be that hard for binary :)". Well, now I stand here scratching my head at the results :(
#include <fstream>
#include <iostream>
#include <windows.h>
using namespace std;

int main(int argc, const char** argv)
{
  char memblock[20];
  ZeroMemory(memblock, 20);

  ifstream InFile("C:\\wavtest\\music.wav", ios::binary | ios::in);
  InFile.seekg(23, ios::beg);
  InFile.read(memblock, sizeof(WORD));
  InFile.close();

  cout << (int)memblock << endl;
  return 0;
}
I consistently get "1244856", but it doesn't seem like what I want, which is the number of channels - 23 bytes in, 2 bytes in size.
- A momentary maniac with casual delusions.
I think the problem here is that '(int)memblock' doesn't do what you're expecting it to do.

What you're wanting is to interpret the first two bytes of memblock as an unsigned short. Here's one way to do this:
unsigned short result = (unsigned char)memblock[0] | ((unsigned char)memblock[1] << 8);
(I think I got that right...)

You could also use reinterpret_cast<>.
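
For what it's worth, here is a rough sketch of the byte-combining approach as a standalone function. The name read_u16_le is made up for this example, and it assumes the two bytes are stored low byte first, which is how WAV header fields are laid out. The detour through unsigned char matters because plain char may be signed, so a byte of 0x80 or above would otherwise sign-extend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the thread): combine two bytes stored
// low byte first (little-endian) into one 16-bit value.
std::uint16_t read_u16_le(const char* bytes)
{
    assert(bytes != nullptr);

    // Convert through unsigned char first: plain char may be signed, and
    // a negative byte would sign-extend and corrupt the combined value.
    const unsigned lo = static_cast<unsigned char>(bytes[0]);
    const unsigned hi = static_cast<unsigned char>(bytes[1]);

    return static_cast<std::uint16_t>(lo | (hi << 8));
}
```

With the original program this would be called as read_u16_le(memblock) after the read(). Because the bytes are combined arithmetically, the result should not depend on the byte order of the machine running the code.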
Updated to: cout << reinterpret_cast<unsigned short>(memblock) << endl;

However, now I think something's going on with Infile.seekg, because no matter what I put for the first argument, I get the same results. It should be moving the read position 23 bytes from the beginning of the file if I'm understanding http://msdn2.microsoft.com/en-us/library/y2d6fx99.aspx correctly.
- A momentary maniac with casual delusions.
Also note that you want to seek to position 22: offsets are zero-based, so the 23rd byte is at offset 22. I would do it like this (although you might want to replace unsigned short with a platform-independent type, such as uint16_t from boost/cstdint.hpp):

#include <fstream>

int main() {
  short unsigned channels = 0;
  std::ifstream fin( "C:\\wavtest\\music.wav", std::ios::binary );
  fin.seekg( 22 );
  fin.read( reinterpret_cast< char * >( &channels ), sizeof( channels ) );
}
Quote: Original post by Rhaal
Updated to: cout << reinterpret_cast<unsigned short>(memblock) << endl;

However, now I think something's going on with Infile.seekg, because no matter what I put for the first argument, I get the same results. It should be moving the read position 23 bytes from the beginning of the file if I'm understanding http://msdn2.microsoft.com/en-us/library/y2d6fx99.aspx correctly.


The problem is that when you cast a stack-based array to a numerical value C++ will do so in the only way it knows how: it will implicitly convert the array to a pointer to the first element and then convert that pointer value to a numerical value. So the value you are printing is the address of the array (minus any overflow because of the size of the type). Instead you need to reinterpret the address of the array as being the address of a numerical value and dereference that, i.e. *reinterpret_cast< unsigned short * >(memblock). jflanglois's solution using an actual unsigned short is better though, both in expressiveness and avoiding problems of alignment.
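
To make the difference concrete, here is a small sketch contrasting the two interpretations. The helper names load_u16 and address_of are invented for this example. It uses std::memcpy rather than *reinterpret_cast<unsigned short*>(memblock), which is a swapped-in technique: memcpy has no alignment requirement, so it sidesteps the alignment concern mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the first two bytes of the buffer into a real
// 16-bit variable. The bytes are taken in the host's byte order, so on a
// little-endian machine the first byte becomes the low byte.
std::uint16_t load_u16(const char* buffer)
{
    assert(buffer != nullptr);
    std::uint16_t value = 0;
    std::memcpy(&value, buffer, sizeof value);
    return value;
}

// Hypothetical helper: the buffer's address as an integer. This is what a
// cast of the array itself computes, since the array first decays to a
// pointer to its first element. The buffer contents are never looked at.
std::uintptr_t address_of(const char* buffer)
{
    return reinterpret_cast<std::uintptr_t>(buffer);
}
```

Printing address_of(memblock) gives some stack address that stays the same whatever seekg() and read() did, which would explain why changing the seek position appeared to have no effect. load_u16(memblock) is the value that actually depends on the file contents.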

Σnigma
jflanglois,

That was extremely helpful. I'm kinda ashamed to admit it, but I kept saying to myself "How the hell will I know the sizeof the variable before I read the data into it?" Needless to say, it's been a while since I programmed anything serious :P I can be such a noob! Here's the final result (I'll look at platform independence next):

#include <fstream>
#include <iostream>
using namespace std;

int main(int argc, const char** argv)
{
  unsigned short wavChannels = 0;
  unsigned long  wavSampleRate = 0;
  unsigned short wavSampleSize = 0;

  // Open a file stream for reading from the wav file.
  ifstream InFile("C:\\wavtest\\music.wav", ios::binary);

  // Read in the number of channels (stereo/mono).
  InFile.seekg(22);
  InFile.read(reinterpret_cast<char*>(&wavChannels), sizeof(wavChannels));

  // Read in the sample rate.
  InFile.seekg(24);
  InFile.read(reinterpret_cast<char*>(&wavSampleRate), sizeof(wavSampleRate));

  // Read in the sample size.
  InFile.seekg(34);
  InFile.read(reinterpret_cast<char*>(&wavSampleSize), sizeof(wavSampleSize));

  InFile.close();

  cout << "Channels: " << wavChannels << endl;
  cout << "Sample Rate: " << wavSampleRate << endl;
  cout << "Sample Size: " << wavSampleSize << endl;
  cout << endl;
  return 0;
}
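
On the platform independence point, one possible direction is sketched here using the fixed-width types from <cstdint>. The main hazard in the version above is unsigned long: it is 4 bytes on Windows but 8 bytes on typical 64-bit Linux and macOS builds, where the sample rate read would pull in 8 bytes instead of 4. The names WavFormat, read_at, read_wav_format and read_wav_format_from_file are all invented for this example. It assumes the simple header layout used in this thread (fields at offsets 22, 24 and 34) and a little-endian machine, and it reports failure instead of printing zeros when the file is missing or too short.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>

// Hypothetical struct holding the three fields read in the thread.
struct WavFormat {
    std::uint16_t channels = 0;      // 2 bytes at offset 22
    std::uint32_t sampleRate = 0;    // 4 bytes at offset 24
    std::uint16_t bitsPerSample = 0; // 2 bytes at offset 34
};

// Seek to an absolute offset and read sizeof(T) raw bytes into `out`.
// Returns false if the seek or the read failed (e.g. the file is too short).
template <typename T>
bool read_at(std::istream& in, std::streamoff offset, T& out)
{
    in.seekg(offset, std::ios::beg);
    in.read(reinterpret_cast<char*>(&out), sizeof out);
    return static_cast<bool>(in);
}

// Read the three header fields; stops at the first failure.
bool read_wav_format(std::istream& in, WavFormat& fmt)
{
    return read_at(in, 22, fmt.channels)
        && read_at(in, 24, fmt.sampleRate)
        && read_at(in, 34, fmt.bitsPerSample);
}

// Convenience wrapper: open the file in binary mode, then read the fields.
bool read_wav_format_from_file(const char* path, WavFormat& fmt)
{
    assert(path != nullptr);
    std::ifstream in(path, std::ios::binary);
    return in.is_open() && read_wav_format(in, fmt);
}
```

A call would look like read_wav_format_from_file("C:\\wavtest\\music.wav", fmt), checking the bool before printing fmt.channels and the rest. This still copies bytes in host order, so a big-endian machine would need the bytes combined by hand instead.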



Enigma,

That was extremely helpful as well, because it was a nice little refresher on casting in general! I actually went back to the first method I was using to read the channels and fixed it :D. I won't be using this, but it felt good to read your reply, and say "AHA!" and go fix it:

#include <string>
#include <stdio.h>
#include <stdlib.h>
using namespace std;

int main (int argc, const char** argv)
{
  std::string filename = "c:\\wavtest\\music.wav";
  FILE* pFile = NULL;
  char buffer[255];
  unsigned short block;

  // Open the file for reading, and in binary (rb).
  if ((pFile = fopen(filename.c_str(), "rb")) == NULL)
  {
    printf( "Can't find %s\n", filename.c_str() );
    exit(0);
  }

  // Seek to offset 22 (the 23rd byte) from the start of the file.
  fseek(pFile, 22, SEEK_SET);
  block = fgetc( pFile );

  // Read the 2 bytes for mono/stereo
  for( int i=0; (i < 2) && ( feof( pFile ) == 0 ); i++ )
  {
    buffer[i] = (unsigned short)block;
    block = fgetc( pFile );
  }

  // Print the first (low) byte of the channel count.
  printf( "%d\n", *buffer );
  fclose(pFile);
  return 0;
}


Again, thanks both of you!
- A momentary maniac with casual delusions.
iostreams work perfectly well, and if you're smart enough to be using real strings (std::string), you should be smart enough to use modern C++ file I/O (iostream). You should also be smart enough to use modern algorithms (from the <algorithm> header) for utility tasks like zeroing out a buffer: instead of ZeroMemory(), use std::fill() (specifying 0 as the fill value).
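
As a minimal sketch of the std::fill() suggestion, the ZeroMemory(memblock, 20) call from the original program could be replaced by something like the following. The wrapper name zero_buffer is made up for this example; the call to std::fill is the relevant part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: set every byte in [buffer, buffer + size) to zero,
// which is the portable equivalent of ZeroMemory(buffer, size).
void zero_buffer(char* buffer, std::size_t size)
{
    assert(buffer != nullptr || size == 0);
    std::fill(buffer, buffer + size, '\0');
}
```

In the original code that would be zero_buffer(memblock, sizeof memblock), or simply std::fill(memblock, memblock + 20, 0) inline. For a freshly declared array, writing char memblock[20] = {}; zero-initializes it with no call at all.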

And yes, the problem, as Enigma pointed out, is that converting "the buffer" to an int converts the pointer value to an int, not any buffer contents.

But you don't need a "buffer" anyway.

Instead of reading into a character buffer and trying to interpret some subsequence of those bytes as an integer value (which you might not be able to do, in general, because of hardware issues with "unaligned reads"), set up the destination variable directly, and interpret *its* location in memory as a location where you can start writing characters.

#include <fstream>
#include <iostream>
#include <windows.h> // for WORD typedef :/
using namespace std;

int main() {
  WORD channelCount;
  ifstream InFile("C:\\wavtest\\music.wav", ios::binary);
  // it's already an *i*fstream, so you don't need to specify ios::in.
  InFile.seekg(22, ios::beg);
  InFile.read(reinterpret_cast<char*>(&channelCount), sizeof(WORD));
  // You don't need to .close() explicitly; the file destructor does that.
  // .close() is provided for special circumstances where you need more control.
  cout << channelCount << endl;
  // You don't need to return 0 explicitly; the standard provides that, as a
  // special case, main() will return 0 if it reaches the end. (Don't do this
  // for other functions. It's designed this way so that you can treat main()
  // *as if* it were a void function, for situations where you don't care.)
}
Good advice, but I was mostly experimenting. I'm not even using that second implementation that I posted, and I'm satisfied with the first one.
- A momentary maniac with casual delusions.

