# How to implement a Symmetric Key Cipher

## Recommended Posts

sooner123    269
Recently, I've been looking into ways to encrypt and protect my files. I've always kept hashes of my files to verify their integrity, but now I'd like to encrypt them as well.

The first method that came to mind (since I will be the only one encrypting or decrypting) was to XOR the files bit-by-bit (or byte-by-byte depending on how the i/o stream works) with some key. Basically a Symmetric Key Cipher.

I'm not sure how to generate the key though. Could I just use a password-length key and do the following?

for (i=0; i<length(plaindata); i++) encrypted[i] = key[i%length(key)]^plaindata[i];

It seems simple enough, secure enough, reversible, and at first glance I don't see anything horribly inefficient about it.

However, I've never done anything like this before so I wouldn't be surprised if I was doing something silly or pointless.

##### Share on other sites
ibebrett    205
its ridiculously insecure. if you encrypt a well known file type, i can attack using the certain magic strings contained within the file. if its source for cpp, good guess that the word "include" is used, etc.

##### Share on other sites
sooner123    269
How so?

I'm not using a single substitution cipher.

Where 'include' might be encrypted to one string, elsewhere it'd be likely with a probability of (key_length - 1) / (key_length) to be encrypted to something different.

I think you might have confused this method for a single substitution cipher which would have the flaw you're describing.

##### Share on other sites
ibebrett    205
ive done it for project euler

http://projecteuler.net/index.php?section=problems&id=59

quite easily. unless im misunderstanding your scheme, and if so i apologize for the confusion

##### Share on other sites
ibebrett    205
i will also note that you can break xor keys into pieces, so that if the work 'the' shows up in my attack. i can keep the key the same in that section that happened to overlap those letters, and change the rest. Normally you cant do that with most crypto systems. Basically breaking a small part of the key actually makes it easier to break the rest.

##### Share on other sites
If you don't want to use a readily available cipher, you should read up on Feistel networks, which are the underlying principle of the majority of ciphers.

You should really consider using a well-known implementation, though. Implementing a fast cipher is trivial, and implementing a strong cipher is trivial. However, implementing a cipher that is both fast and strong is a hard problem (which is the reason why you would probably want to use a well-known cipher rather than making your own).

##### Share on other sites
ibebrett    205
whats wrong with using RSA? its fast.

##### Share on other sites
sooner123    269
Quote:
 Original post by ibebrettive done it for project euler http://projecteuler.net/index.php?section=problems&id=59quite easily. unless im misunderstanding your scheme, and if so i apologize for the confusion

Yep. That problem is exactly the way I was thinking of doing it. My pseudocode in my original post did exactly that: repeatedly cycling the key due to it being shorter than the plaintext.

As for security, I don't really see any feasible way to crack this without knowing the key.

I considered RSA and I've studied it in the past, but this seemed much easier and quicker.

Can anyone point out an actual security flaw with this approach?

Most of the files would be in the 10 kilobytes range and the keys used would be roughly 64 to 128 bits.

##### Share on other sites
ibebrett    205
what made it easy for me is, that you use a dictionary, and try permutations of the keys. now if you find a match the dictinoary anywwhere, leave those bits alone but continue permutation the rest. This is because the bits in the key and the bits in the encoded text match up. you can't do that with rsa. so as soon as a fine one 'the' in your text, 3 bits are taken off your key. i can breack it in small bits.

##### Share on other sites
kman12    174
This is basically just a polyalphabetic substitution cipher. See here for more information and ways to crack it. It's a little different since you are working with whole bytes instead of just letters but it's the same general idea.

##### Share on other sites
sooner123    269
Reading that page, I see the potential vulnerability described, but it doesn't seem likely.

You'd have to use a fairly short key and have a LOT of instances of words being repeated very often (at least as many times as the length of the key) to have an even chance of creating a repeated string that can lead to decryption.

I guess you could say that as the ratio of (message length) to (key length) grows, the chance of this vulnerability occurring increases.

And I imagine I could solve this problem completely by using a Merssene twister as the key. Or something like:

md5(first 32 bits of md5(small memorable key)).md5(second 32 bits of md5(small memorable key)).md5(third 32 bits of md5(small memorable key)).md5(last 32 bits of md5(small memorable key))

Which just turns a small memorable password into a 512 bit monstrosity.

##### Share on other sites
ibebrett    205
yes, but you don't need repeats as i mentioned to posts above. you can break the key into pieces because of they way the xor is applied.

##### Share on other sites
alvaro    21247
Imagine I manage to make you encrypt a file that only contains a long list of zero bytes. The encoding would be the key repeating for the length of the file.

Now imagine I simply happen to know a file that you are encrypting with this method. I can now XOR the original file with the encrypted file and again I will get the key repeating for the length of the file.

If you are encoding a C++ file, I can make a guess that the first few characters will be "#include <" or "#include<". By XORing this string with the beginning of the encrypted file, I will find the first 80 bits of the key. I can now try to guess the length of the key by XORing with the part I know at different parts of the file and seeing which get converted to apparently reasonable C++ code. Then guessing the characters before and after some of the windows that I have already decoded wouldn't be too hard, and that would give me more bits of the key.

As you see, it's not too hard to make an attack of this type.

It might be fun for you to encode some longish C++ code with this method and let us have a go at attacking it.

##### Share on other sites
If you really want to stick with XOR encryption instead of a "real" cipher, you should make it something like this:

for(...)
out[i] = in[i] ^ f(key, out[i-1])

where f(key, x) returns some value that depends on key and x in some non-obvious, non-linear way. Shifts paired with adds, multiplications modulo 16 bits, etc.

And you would absolutely need to add a random initialisation vector to every message.

Though if you need a symmetric cipher, I don't see why you can't just use something like for example TEA. Even the old, non-revised version is better than what 90% of all coders will come up with if they try to make something on their own.
If you want it safer, add 10 or 20 more rounds. If you don't care so much, but want it as fast as possible, reduce it to 4 or 8 rounds, it will still be better than XOR.

Or, just use AES, which is hardware-accelerated on the most recent CPUs and not too slow in software either.

##### Share on other sites
Mantear    251
I second the AES recommendation. The specification can be foud here:
csrc.nist.gov/publications/fips/fips197/fips-197.pdf

It explains how to implement it in great detail and even provides test vectors with expected output to verify your implementation. It even has intermediate step outputs to help debug. Works with 128-bit, 192-bit, or 256-bit keys.

The toughest part will be coming up with a key that you don't write down/store somewhere in plain text. Perhaps make a second application that takes an input text phrase and generates a key from that.