C++ Binary Level I/O

I'm trying to read a file, break it down to the bit level, and then apply an encryption algorithm I've written, but I'm not quite sure how to go about it. The encryption algorithm is already written and accepts 64 bits of input, so I need to break the file into segments of 64 bits each, which shouldn't be a problem once I have the file in binary form. However, I have no clue how to get an arbitrary file into binary form (individual 1's and 0's). C++ file streaming seems more concerned with the content of the file than with the file itself. I want to break any file down to 1's and 0's regardless of its content or type. Any ideas how to go about that?

I was dabbling in some encryption stuff myself and had some problems reading files in properly (though my problem wasn't with getting the binary data). Anyway, take a look at these snippets. They do a simple, reversible XOR encryption, but you can see how they use std::transform() and file streams (opened in binary mode) to read and write binary data in the files I'm encrypting.

...
typedef char Byte;
typedef long KeyType;
...

#pragma once

#include "Encryptor.h"
#include <algorithm>
#include <iostream>
#include <iterator>

class XOREncryptor : public Encryptor
{
public:
    XOREncryptor(KeyType Key) : mKey(Key), mPass(0) {}
    virtual ~XOREncryptor() {}

    // XOR the byte with the next byte of the key, cycling through the key.
    virtual Byte operator()(const Byte& b)
    {
        mPass %= sizeof(KeyType);
        return b ^ reinterpret_cast<Byte*>(&mKey)[mPass++];
    }

    virtual bool encrypt(std::istream& In, std::ostream& Out)
    {
        if (!In || !Out)
            return false;
        In.seekg(0, std::ios::beg);
        Out.seekp(0, std::ios::beg);
        std::transform(std::istreambuf_iterator<Byte>(In),
                       std::istreambuf_iterator<Byte>(),
                       std::ostreambuf_iterator<Byte>(Out),
                       *this);
        return true;
    }

    virtual KeyType getKey() const { return mKey; }

private:
    KeyType mKey;         // Encryption key.
    unsigned short mPass; // Index of the key byte to use next.
};



#include <iostream>
#include <fstream>
#include <string>
#include "XOREncryptor.h"

using std::cout;
using std::cin;
using std::string;
using std::fstream;
using std::ios;

int main(int NumArgs, char** Args)
{
    string inFileName, outFileName;
    KeyType key = 1;

    cout << "Input: ";
    cin >> inFileName;
    cout << "Output: ";
    cin >> outFileName;
    cout << "Key: ";
    cin >> key;
    if (cin.fail())
        key = 1;

    Encryptor* e = new XOREncryptor(key);
    // ios::trunc ensures the output file is created if it doesn't exist yet.
    fstream fin(inFileName.c_str(), ios::in | ios::out | ios::binary);
    fstream fout(outFileName.c_str(), ios::in | ios::out | ios::trunc | ios::binary);
    e->encrypt(fin, fout);
    delete e;
    return 0;
}



The main() function is just a little driver program, but you can see how it opens the streams and how XOREncryptor uses std::transform() to handle the data. Hope this helps. Cheers.

Edit: Sorry for any confusing variable names; I went back and changed 'em. Also, I opened fin and fout for both reading and writing for testing purposes (so I could easily swap the outputs and inputs). An ifstream and an ofstream would do fine as well, but I'm sure you knew that. :-)

I imagine you should just read the file as you would any normal file, but make sure it's opened in binary mode (ios::binary if you're using ifstream). Then each character you read is just a byte of data that can be converted to an equivalent 8-bit binary value.

There might already be a function to convert a char into an 8-bit binary string, but if not it shouldn't be too hard to write one yourself using some bitwise operators. Then you could either save the string of '1's and '0's to another file, or plug the data straight into your encryption algorithm.
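Something like this, perhaps (untested; the helper name and file name are just placeholders):

#include <fstream>

void readAllBytes(const char* fileName)
{
    std::ifstream in(fileName, std::ios::binary); // works for any file type
    char c;
    while (in.get(c))
    {
        unsigned char byte = static_cast<unsigned char>(c);
        // 'byte' is 8 raw bits of file data; convert it to a string of
        // '1's and '0's here, or feed it straight to the encryption code.
    }
}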

I still prefer this:

#include <cstdio>

FILE *pFile = fopen("file.ext", "rb"); // "rb" = read, binary
if (pFile)
{
    // use fread/fgetc/fseek/etc.; for example, read one 64-bit block at a time:
    unsigned char block[8];
    size_t bytesRead;
    while ((bytesRead = fread(block, 1, sizeof(block), pFile)) > 0)
    {
        // process the 'bytesRead' bytes just read (pad the final block if short)
    }
    fclose(pFile);
}



Very readable and easy to understand.

You should be using ifstream's read() function rather than the streaming operators, so as to bypass formatting. Also be sure to open the file with ios::binary.

C++ provides no mechanism to address individual bits of data directly. The usual procedure is to maintain an array or buffer of chars, using the bit-shifting operators to do your jiggery-pokery.
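For instance, a rough sketch of pulling the file in one 64-bit block at a time (the function name is a placeholder, and padding the last block is left to you):

#include <fstream>

void readBlocks(const char* fileName)
{
    std::ifstream in(fileName, std::ios::binary);
    char block[8]; // 8 chars = one 64-bit block
    while (in)
    {
        in.read(block, sizeof(block));
        std::streamsize got = in.gcount(); // may be less than 8 at the end
        if (got == 0)
            break;
        // hand 'got' bytes of 'block' to the encryption routine
    }
}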

Admiral

It sounds like you want a long string of the *characters* '1' and '0'. You don't need that.

Your data *is* in binary format when you have a chunk of chars. There's no need to rewrite it to consist only of the ASCII values for '0' and '1'.
A char is a byte, which is 8 bits.
If you need 64 bits of input data, you take 8 bytes (or two 32-bit ints).
If you gave it a text string containing 64 '1's and '0's, that would be 64 bytes (512 bits) of data.

Quote:
Original post by Spoonbender
It sounds like you want a long string of the *characters* '1' and '0'. You don't need that.

Actually, I kind of do: the algorithm was part of a college assignment, so it was supposed to be written to print out its operation in ASCII characters.

I think I have a clear picture of how to take the input now, but I'm still lost as to how to translate it into ASCII "binary" characters and how to translate that back for file output.

So the data is stored in binary format, but how can I access it and get the 1's and 0's?

You could AND each byte of data with different powers of two to test the individual bits. That could be slow, but I'm sure things don't need to be all that efficient for an assignment.

...
typedef char Byte;
Byte inputByte = getNextInputByte(); // however you fetch the next byte
if (inputByte & 1)
{
    // the lowest bit of inputByte is 1
}
else
{
    // the lowest bit of inputByte is 0
}
...


Something like this, perhaps? (Plain decimal constants 1, 2, 4, 8 and so on work just as well as the hex forms 0x01, 0x02, etc.; they're the same values.)

Hope this helps. Someone correct me if I did something wrong. :-)

You have to do some bit twiddling; here's some quick and dirty code:

#include <string>

unsigned char data = 20;
std::string binaryString = "";

// loop through all eight bits, most significant first
for (int i = 0; i < 8; ++i)
{
    // mask that selects bit (7 - i)
    unsigned char bit = (1 << (7 - i));

    // append the matching character to the string
    if ((bit & data) == bit)
        binaryString += '1';
    else
        binaryString += '0';
}


This uses the bit-shift operator to change the value of the 'bit' mask on each loop iteration. It then ANDs the mask with the 'data' variable and, depending on the result, appends a '1' or a '0' to the output string. This converts your byte of data into an 8-character string of 1's and 0's. Wikipedia's page on bitwise operations has some useful info too.

Hope that was what you were looking for.

Quote:
Original post by knowyourrole
You have to do some bit twiddling, here's some quick and dirty code:

*** Removed ***
You're not guaranteed that one byte is eight bits; you should probably use CHAR_BIT from the standard <climits> header instead.

Other than that, C++ does offer a quick (length-wise) solution:

#include <bitset>
#include <climits>
#include <string>

std::string Dummy(unsigned char ucData)
{
    return std::bitset<CHAR_BIT>(ucData).to_string();
}
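The reverse direction is just as short. A sketch, with a hypothetical fromBinary() helper (note that bitset's string constructor throws std::invalid_argument if the string contains anything but '0' and '1'):

unsigned char fromBinary(const std::string& s)
{
    return static_cast<unsigned char>(std::bitset<CHAR_BIT>(s).to_ulong());
}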

Quote:
Original post by raz0r
You're not guaranteed that one byte is eight bits

Are you sure about that?
I can't provide evidence, but I'm sure I read somewhere that a C++ char is guaranteed to always be 8 bits long. If I were less tired (and drunk [looksaround]) I'd check the C++ standard myself.

Admiral

Quote:
Original post by TheAdmiral
Are you sure about that?
Aye. I believe the standard only guarantees that sizeof(char) will always be one.

sizeof(char) == 1, by definition.

CHAR_BIT >= 8. The fact that it isn't necessarily exactly 8 is largely why the constant exists at all.

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long).

Further, int is *suggested* to represent the "native" size of integers on the machine; i.e. the type with which operations are fastest, or the width of the internal bus, or some other such logical measure.

Regardless, it makes sense to use int when you don't particularly care about the size of the data type (which is most of the time) - that's why the name is shorter than the others, to encourage its use ;)
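If you're curious, a quick way to see what your own platform uses (the output varies by compiler and machine):

#include <climits>
#include <iostream>

int main()
{
    std::cout << "CHAR_BIT:      " << CHAR_BIT << '\n';
    std::cout << "sizeof(short): " << sizeof(short) << '\n';
    std::cout << "sizeof(int):   " << sizeof(int) << '\n';
    std::cout << "sizeof(long):  " << sizeof(long) << '\n';
    return 0;
}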

Quote:

So the data is stored in binary format, but how can I access it and get the 1's and 0's?


A byte is logically composed of bits. Each bit *is* a 1 or a 0. However, these are different things from the *symbols* '1' and '0' that we use to print numbers in binary. You need, conceptually, to extract each individual bit value from a byte and display the corresponding symbol according to that value.


Thanks for the help, guys. I've managed to apply bit twiddling to my data buffer, and now I've got my data nicely split up into a number of 64-bit blocks ready for encryption. I've stopped coding for today as I'm pretty tired right now. Anyway, I think I might have some trouble getting my ASCII binary back to the original format, so any thoughts on that?

Well, each digit represents a bit, right? We created them by examining the bits of the number, and generating the corresponding symbols. Therefore, to go back, we examine the symbols, and set the bits of a generated number, in the corresponding positions, according to whether we see a '1' symbol or a '0' symbol for each bit.
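A minimal sketch of that idea, packing a string of '0'/'1' symbols back into bytes (it assumes the string's length is a multiple of 8, and the function name is just for illustration):

#include <string>
#include <vector>

std::vector<unsigned char> symbolsToBytes(const std::string& s)
{
    std::vector<unsigned char> bytes;
    for (std::string::size_type i = 0; i + 8 <= s.size(); i += 8)
    {
        unsigned char value = 0;
        for (int bit = 0; bit < 8; ++bit)
        {
            value <<= 1;           // make room for the next bit
            if (s[i + bit] == '1') // set it if the symbol says so
                value |= 1;
        }
        bytes.push_back(value);
    }
    return bytes;
}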

Yes, but my blocks are no longer byte-sized; they're 64 bits each. Having to turn them back into byte-sized blocks (of unsigned char) wouldn't be too efficient. Furthermore, those byte-sized blocks aren't const char*s that std::ofstream would accept for writing. I've done all the encryption and the groundwork for decryption, but I haven't tested it yet because I stopped at file output. There's probably an easy way around it, but I'm too exhausted to think after several hours of coding and bug hunting. Any thoughts are welcome :)

Doesn't matter. The size of the block used to represent the '1' and '0' symbols has nothing to do with the size of the block used to represent the value itself.

We output a 64-bit number by examining each bit, and outputting either a '1' or '0' symbol in turn, according to whether the bit is set or clear. Thus we output 64 symbols to show the number.

We input a 64-bit number by examining each symbol, and either setting or clearing a corresponding bit in turn, according to whether the symbol is '1' or '0'. Thus we look at 64 symbols to determine the number.
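In code, the two directions might look something like this (a sketch only; it assumes the block lives in an unsigned long long, which most compilers offer as a 64-bit type):

#include <string>

std::string writeBlock(unsigned long long block)
{
    std::string out;
    for (int i = 63; i >= 0; --i)              // most significant bit first
        out += ((block >> i) & 1) ? '1' : '0'; // emit the matching symbol
    return out;
}

unsigned long long readBlock(const std::string& symbols)
{
    unsigned long long block = 0;
    for (std::string::size_type i = 0; i < symbols.size() && i < 64; ++i)
    {
        block <<= 1;           // shift earlier bits up
        if (symbols[i] == '1') // set the new bit if the symbol says so
            block |= 1;
    }
    return block;
}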
