Battery of Tests For IO Library

10 comments, last by Ectara 11 years, 4 months ago

It's implemented in such a way that the encryption algorithm and mode don't make a difference. The stream makes use of my encryption module, and makes sure that block ciphers and stream ciphers can both work through the same code path. For instance, AES ECB and RC4 can both be used through the same stream implementation (not at the same time); all it requires is that you set up the encryption information correctly, and use it to attach an encrypted stream to the data. The encryption stream requires seeking backward, so if the underlying stream is not seekable, its data must first be cached in a seekable stream type.

Wouldn't that be relatively inefficient, though? Consider a stateful mode like CBC: to encrypt at position N you'd need to encrypt everything before position N. How does your stream deal with this? Do you forbid seeking ahead?

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”


Refactor to allow reading from and writing to abstract sources and sinks. A simple implementation of the source interface can then be trivially implemented to allow reading from character buffers for testing purposes. The same can be done for sinks/writing.

I'm not quite sure what you mean, as my streams are already as abstract as it gets. I have a stream type that reads from and writes to a block of memory. The only problem is, I need to test that, too.

The streams are not always simple file descriptors. If I'm testing an AES or compression stream, it's incredibly time-consuming to test every output for correctness by hand if I simply check the written bytes in memory. For the most part the streams are working, but I need to test edge cases.


You can also have sources/sinks that fail at specific points for testing failure cases deterministically, or at random points for fuzz testing should that be required.


However, having tests fail on purpose is useful. One option is a target stream that purposely returns values that don't make sense, to see how the higher-level parts of the code deal with it.
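The fault-injecting sink suggested above could look roughly like the following. This is only a minimal Python sketch, not the library discussed in this thread: `MemorySink`, `FailAfter`, and `write_all` are hypothetical names, and the wrapper assumes that a sink exposes a single `write` call returning the number of bytes accepted.

```python
class MemorySink:
    """Hypothetical in-memory sink: collects every byte written to it."""

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk: bytes) -> int:
        self.data += chunk
        return len(chunk)


class FailAfter:
    """Wraps a sink and injects an error once `limit` bytes were accepted."""

    def __init__(self, sink, limit: int):
        self.sink = sink
        self.limit = limit
        self.written = 0

    def write(self, chunk: bytes) -> int:
        room = self.limit - self.written
        if room <= 0:
            raise OSError("injected failure")
        # A chunk that straddles the limit becomes a short write.
        accepted = self.sink.write(chunk[:room])
        self.written += accepted
        return accepted


def write_all(sink, data: bytes) -> int:
    """Higher-level code under test: loops until all data is written."""
    total = 0
    while total < len(data):
        total += sink.write(data[total:])
    return total
```

Because the failure point is a plain constructor argument, a test can pin it to an exact byte offset, or draw it from a seeded random number generator to get reproducible fuzzing.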


Wouldn't that be relatively inefficient, though?

Inefficient, yes. However, the goal of the stream interface is to abstract away the implementation to provide one uniform way of accessing and storing data. Doing it this way ensures that one code path works for all sources or destinations (like being able to load an image from anywhere through one interface). Callers who need the utmost efficiency can use the encryption module directly: their code then assumes that the input is encrypted in the way they expect, and they can handle the data and the encryption any way they choose. The goal of the encryption stream is just to get the job done in a way that conforms to a stream interface, after all.


Consider a stateful mode like CBC: to encrypt at position N you'd need to encrypt everything before position N. How does your stream deal with this? Do you forbid seeking ahead?

Well, it's fairly simple. The streams function similarly to C's standard IO, as that was my inspiration when I started. In order to write at position N in a C FILE, you'd need to have N - 1 bytes already written to the file; files can't be grown without writing data to extend the file. Thus, to write at position N, you must have written or seeked past N - 1 existing bytes. The data in my encrypted streams are already encrypted. If you attach an encryption stream to another stream, you are making the assumption that the existing data is encrypted in the way that you expect. If the data is not yet encrypted, then you need to read the data and write it through the encryption stream again in order to encrypt it.

So, first, you'd find the block where byte N resides. Then, you'd take the previous ciphertext (or the IV, if it is the first block), decrypt the current block, and XOR the two to get the plaintext. You'd then modify the plaintext, then XOR it with the previous ciphertext or the IV, and encrypt it again. It will, however, require re-encryption of all following blocks, due to the nature of CBC.

Since CBC uses the ciphertext of the previous block, reading or locating a write at an arbitrary point requires examining only the ciphertext of the previous block. However, modifying the plaintext requires re-encrypting all following blocks until the end of the stream (or until you find one that encrypts to the same block, which is extraordinarily unlikely). When I do implement different block cipher modes, it wouldn't be too much trouble to make it agnostic to the actual cipher, and add in more optional bookkeeping depending on the mode.
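The CBC steps described above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the 8-byte "cipher" is a toy (XOR with the key plus a byte rotation) that is trivially breakable and stands in for a real block cipher such as AES, and the function names are hypothetical rather than part of the library discussed in this thread.

```python
BLOCK = 8  # toy block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # NOT secure: XOR with the key, then rotate the bytes left by 3.
    mixed = xor(block, key)
    return mixed[3:] + mixed[:3]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    # Inverse of toy_encrypt_block: rotate right by 3, then XOR.
    return xor(block[-3:] + block[:-3], key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_read_block(key: bytes, iv: bytes, ciphertext: bytes, index: int) -> bytes:
    # Random-access read: only the previous ciphertext block (or IV) is needed.
    prev = iv if index == 0 else ciphertext[(index - 1) * BLOCK:index * BLOCK]
    cur = ciphertext[index * BLOCK:(index + 1) * BLOCK]
    return xor(toy_decrypt_block(key, cur), prev)


def cbc_write_block(key: bytes, iv: bytes, ciphertext: bytes,
                    index: int, new_plain: bytes) -> bytes:
    # Recover the plaintext of the following blocks from the old ciphertext,
    # then re-encrypt the modified block and everything after it.
    count = len(ciphertext) // BLOCK
    tail = [cbc_read_block(key, iv, ciphertext, k) for k in range(index + 1, count)]
    prev = iv if index == 0 else ciphertext[(index - 1) * BLOCK:index * BLOCK]
    out = [ciphertext[:index * BLOCK]]
    for plain in [new_plain] + tail:
        prev = toy_encrypt_block(key, xor(plain, prev))
        out.append(prev)
    return b"".join(out)
```

In this sketch, blocks before the modified one are copied through untouched, while the modified block and every block after it get new ciphertext.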

As another example, when seeking backwards, it has optional bookkeeping to reset stream ciphers to their initial states, then quickly burn through iterations until it reaches the new position, so that the higher-level code can seek an RC4 stream backwards without knowing the difference, for instance.
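A rough Python sketch of that reset-and-burn approach for RC4 follows. The class and method names (`SeekableRC4`, `seek`, `crypt`) are hypothetical and chosen only for illustration; the key schedule and keystream generation are standard RC4.

```python
class SeekableRC4:
    """RC4 keystream with emulated seeking (hypothetical interface)."""

    def __init__(self, key: bytes):
        if not key:
            raise ValueError("key must not be empty")
        self._key = key
        self._reset()

    def _reset(self) -> None:
        # Key-scheduling algorithm: restores the cipher's initial state.
        s = list(range(256))
        j = 0
        for i in range(256):
            j = (j + s[i] + self._key[i % len(self._key)]) % 256
            s[i], s[j] = s[j], s[i]
        self._s, self._i, self._j = s, 0, 0
        self.pos = 0

    def _next_byte(self) -> int:
        s = self._s
        self._i = (self._i + 1) % 256
        self._j = (self._j + s[self._i]) % 256
        s[self._i], s[self._j] = s[self._j], s[self._i]
        self.pos += 1
        return s[(s[self._i] + s[self._j]) % 256]

    def seek(self, target: int) -> None:
        # Seeking backward: reset, then discard keystream up to the target.
        if target < self.pos:
            self._reset()
        while self.pos < target:
            self._next_byte()

    def crypt(self, data: bytes) -> bytes:
        # Encryption and decryption are the same XOR with the keystream.
        return bytes(b ^ self._next_byte() for b in data)


cipher = SeekableRC4(b"Key")
ciphertext = cipher.crypt(b"Plaintext")
cipher.seek(3)                      # backward seek, invisible to the caller
tail = cipher.crypt(ciphertext[3:])  # decrypts from offset 3 onward
```

A backward seek costs time proportional to the target position, which is the inefficiency the emulation accepts in exchange for a uniform stream interface.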

There will be inefficiencies, but people who need to avoid them never have to hit them, because they arise only when emulating functionality that other stream types take for granted. Of course, these inefficiencies don't really occur when you read or write straight through from beginning to end without changing direction or modifying the existing data, which is the most common way to serialize data through a stream. Again, the stream's goal is to provide an interface to simply get the job done the way the caller needs it, which is immensely useful.

