PDF secret sharing
Finally exams are done and I have (barely) survived!
Even though classes are over for me, I am continuing to work on a project for one of my professors, who has an interest in data security and privacy. The goal is to create a program that can encrypt and decrypt PDF files effectively and efficiently using Shamir's Secret Sharing.
If you don't know what secret sharing is in the context of encryption, here is the idea: we take a 'secret' and generate an arbitrary number of shares from it, n, which can then be distributed safely across whatever medium. Individually these shares are useless, but when a specific number of them, k, are combined, they reconstruct the original secret. An easier way to think of it: say you have a switch to deploy a bomb, and it is kept protected by 3 locks. 5 people hold keys, and any 3 of them have to use their keys together to open the 3 locks in order to flip the switch. Read the Wiki for more details on the technique. In my project, each character in the PDF document is treated as a secret and split into 5 shares. 5 PDF documents are created to hold the corresponding shares for the entire document, and any 3 share files are enough to reconstruct the original document. Reconstruction is done by calculating Lagrange basis polynomials. Again, read the Wiki for more info.
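To make the 3-of-5 idea concrete, here is a minimal sketch of splitting one character and rebuilding it with Lagrange basis polynomials. It is not the code from my project: the names `split_secret` and `reconstruct_secret` and the choice of the prime 257 (the smallest prime above 255, so any byte value fits) are assumptions made for illustration.

```python
import secrets

PRIME = 257  # assumed field size: smallest prime above 255


def split_secret(secret, n=5, k=3):
    """Return n points (x, y) on a random degree k-1 polynomial with f(0) = secret."""
    # Constant term is the secret; the other k-1 coefficients are random.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_secret(shares):
    """Evaluate the interpolating polynomial at x = 0 from at least k shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME        # (0 - xj)
                den = (den * (xi - xj)) % PRIME  # (xi - xj)
        # Lagrange basis value at 0 is num / den, done with a modular inverse.
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(ord("a"))
print(chr(reconstruct_secret(shares[:3])))
```

Any 3 of the 5 points give back the same character. Note that `pow(den, -1, PRIME)` needs Python 3.8 or newer, and that a share value can be 256, so shares need slightly more than one byte of storage under this assumption.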
One problem that arises is that this technique, implemented naively, is weak to frequency analysis attacks, because within each share file every character is directly mapped to the same substitute character. So say we have the letter 'a' as a secret to be shared; then, as an example, the following shares are generated for it:
Share 1: b
Share 2: D
Share 3: j
Share 4: 8
Share 5: y
This seems fine, as there is no way someone looking at any one of those shares could figure out our secret is 'a'. However, the problem lies in the fact that EVERY TIME the letter 'a' is shared in our document it will appear as 'b' in share 1, 'D' in share 2, etc., so someone holding a single share file can count how often each character appears and guess the mapping. This is fixed by modifying how we create the shares, which now uses the summation of the degrees of our polynomials as the functions for computing shares.
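The weakness is easy to see in a small sketch. The code below is only an illustration, not my project's fix: `share_text_fixed` reuses the same polynomial coefficients for every character (the weak behaviour described above), while `share_text_fresh` shows the textbook remedy of drawing new random coefficients for each character. The function names, the fixed coefficients `(11, 42)`, the prime 257, and the restriction to ASCII text are all assumptions.

```python
import secrets

PRIME = 257  # assumed field size, large enough for ASCII/byte values


def share_value(secret, x, coeffs):
    """Evaluate f(x) = secret + coeffs[0]*x + coeffs[1]*x^2 + ... (mod PRIME)."""
    return (secret + sum(c * x ** (i + 1) for i, c in enumerate(coeffs))) % PRIME


def share_text_fixed(text, x, coeffs=(11, 42)):
    """Weak variant: every character is shared with the same coefficients."""
    return [share_value(ord(ch), x, coeffs) for ch in text]


def share_text_fresh(text, x):
    """Textbook variant: fresh random coefficients for every character."""
    return [
        share_value(ord(ch), x, [secrets.randbelow(PRIME) for _ in range(2)])
        for ch in text
    ]


print(share_text_fixed("banana", x=1))  # the three 'a's get identical values
print(share_text_fresh("banana", x=1))  # repeated letters no longer line up
```

With fixed coefficients, one share file is just a substitution cipher of the document, which is exactly what frequency analysis breaks. With fresh coefficients, each value in a single share file is uniformly random on its own.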
Secret sharing is not restricted to text. Since we already share the numerical equivalent of each character, what is stopping us from taking an image and sharing its RGB values? Nothing at all! If we take a pixel from an image and create shares of the R, G and B values of that pixel, our image is virtually destroyed in the shares, and it can be reconstructed in the same manner as text. This is, however, more computationally expensive: instead of performing one 'sharing' operation per character, we are performing 3 operations for every pixel.
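As a rough sketch of what per-pixel sharing could look like (my image code is still in progress, so this is not it), the snippet below shares each of the three channels independently and rebuilds the pixel from any 3 share pixels. The helper names, the prime 257, and representing a pixel as an `(r, g, b)` tuple are assumptions for the example.

```python
import secrets

PRIME = 257  # assumed field size: channel values 0..255 all fit


def split_value(value, n=5, k=3):
    """Share one channel value; element i of the result belongs to x = i + 1."""
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    return [sum(c * x ** i for i, c in enumerate(coeffs)) % PRIME
            for x in range(1, n + 1)]


def split_pixel(pixel, n=5, k=3):
    """Three sharing operations per pixel: one each for R, G and B."""
    per_channel = [split_value(channel, n, k) for channel in pixel]
    return list(zip(*per_channel))  # one (r, g, b) share tuple per share image


def recover_value(points):
    """Lagrange interpolation at x = 0 over (x, y) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def recover_pixel(indexed_shares):
    """indexed_shares: list of (x, (r, g, b)) taken from k different share images."""
    return tuple(
        recover_value([(x, share[c]) for x, share in indexed_shares])
        for c in range(3)
    )


shares = split_pixel((255, 128, 0))
print(recover_pixel([(2, shares[1]), (4, shares[3]), (5, shares[4])]))
```

One caveat under these assumptions: a share channel can come out as 256, which does not fit in a standard 8-bit image channel, so a real implementation would need a wider pixel format or a different field.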
As for the progress of the project, I have implemented the text features fully and am currently working on image sharing. Unfortunately, taking a ~2 week break for exams has made me feel rusty on the project, and I REALLY wish I had commented my code better at the time.
Anyways, I hope I enlightened you slightly on a strong encryption technique which isn't too hard to implement in code yourself, as the computer will deal with all the actual math for you! If you have any questions, feel free to ask. Time for me to reacquaint myself with my code!