Steganography questions

4 comments, last by frob 6 years, 10 months ago

I want to hide data inside the bit planes of a bitmap. I have a few questions:

1. I obviously only want to hide the data in noisy parts, but what's the best way to determine which parts are noisy enough? Should I use some statistical calculation of the pixels, or what?

2. I suppose I should divide the image into small segments (like 8x8 or 16x16), so I can separate, and then flag each noisy segment. What's a good size to use? It seems like the larger the size is, the more likely it will contain shapes that are not noisy enough to hide data (like if I used a quarter of the image as a segment), but if they're too small, they may not tell me anything useful.

3. Here's a tough one. How do I retain the information about which segments are actually being used? If I just store data wherever I want, and then when I try to read it out, I measure the noise of the segments the same way, I've already altered the data in the image by storing hidden data into it, so I might not get the same results, and it wouldn't correctly know where my data is stored. Alternatively, it seems like I could take the flags I've made to tell me that information, and store them into the file, so I can read them out first, and that would tell me where the actual hidden data is stored. But if I do this, where should I store the flags? If I just store them in some arbitrary part of the image, wouldn't I have the same problem of risking storing them somewhere that isn't noisy enough? Also, might the flags use a significant amount of the space that I have for storing the data?

(I posted this in the "For Beginners" section, but it got locked, because supposedly it's not a beginners topic, and not directly about game programming. But whenever I posted before, I've always been told that everything should be in the beginners section, no matter what it was - I'm not sure why. But hopefully this one won't get locked. Please don't lock it.)

I've always been told that everything should be in the beginners section, no matter what it was

Citation needed. Everything should definitely not be in the beginners section. Why else would we have more forums?

1. I obviously only want to hide the data in noisy parts, but what's the best way to determine which parts are noisy enough? Should I use some statistical calculation of the pixels, or what?

You don't "obviously" want to hide the data in only the "noisy" parts. You can hide it anywhere in the image, typically across the whole image, by spreading it into low order bit values that result in variations in color that cannot be picked up by the human eye. The Wikipedia entry has a reasonable overview. If you do want to go for an approach that only uses certain regions of the image, one way to look for noisiness might be to look at the image in the frequency domain via a Fourier transform and compare local minima and maxima, maybe.

Note that this is different from storing the data in an image that is noisy overall. That's a reasonable idea because it makes the low-order bit modifications you might be making even harder to detect.
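The whole-image, low-order-bit approach described above can be sketched roughly as follows. This is only a minimal illustration, not a recommended scheme: it assumes the image has already been decoded into a flat run of 8-bit channel values, and the names `embed_lsb` and `extract_lsb` are made up for this example.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytearray:
    """Hide payload in the lowest bit of each channel byte, in order."""
    # Unpack the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit in one bit plane")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # clear the low bit, then set it
    return out


def extract_lsb(pixels: bytes, length: int) -> bytes:
    """Read `length` bytes back from the lowest bit of each channel byte."""
    result = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for channel in pixels[start:start + 8]:
            value = (value << 1) | (channel & 1)
        result.append(value)
    return bytes(result)


cover = bytearray(range(200, 256)) * 2        # 112 stand-in channel bytes
stego = embed_lsb(cover, b"hi frob")
assert extract_lsb(stego, 7) == b"hi frob"
# No channel value moves by more than one step out of 256.
assert max(abs(a - b) for a, b in zip(cover, stego)) <= 1
```

Because the bits go into fixed positions, the reader needs no map of where the data lives, only the payload length (assumed here to be known out of band).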

2. I suppose I should divide the image into small segments (like 8x8 or 16x16), so I can separate, and then flag each noisy segment. What's a good size to use? It seems like the larger the size is, the more likely it will contain shapes that are not noisy enough to hide data (like if I used a quarter of the image as a segment), but if they're too small, they may not tell me anything useful.

Segmenting the image like this is potentially one way to proceed, but not the only way, especially given that it is predicated on your "noisy" assumption, which as above is not the only way to go (and probably not the best way to go, as I'll get to below).

Here's a tough one. How do I retain the information about which segments are actually being used? If I just store data wherever I want, and then when I try to read it out, I measure the noise of the segments the same way, I've already altered the data in the image by storing hidden data into it, so I might not get the same results, and it wouldn't correctly know where my data is stored.

Yes. This is probably the key flaw in any kind of approach that doesn't spread the data uniformly (or uniformly after some predefined offset) across the image. You can find "noisy" areas as I noted above, but finding the same noisy areas again after you've altered the input data by embedding your secret isn't guaranteed in the general case. That's why uniformly distributed storage mechanisms, like the one used by the tree/cat picture in the Wikipedia entry, are, I would say, more common.
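A small, contrived sketch of that flaw: the noise test below (population variance against a threshold) is a hypothetical stand-in for whatever statistic the encoder might use, and the block values and threshold are chosen so that the result flips.

```python
from statistics import pvariance


def is_noisy(block, threshold=0.2):
    """Hypothetical noise test: population variance of the pixel values."""
    return pvariance(block) > threshold


def embed_bits(block, bits):
    """Overwrite the low-order bit of each pixel with one payload bit."""
    return [(p & ~1) | b for p, b in zip(block, bits)]


block = [100, 101] * 32               # one 8x8 block of faint noise
stego = embed_bits(block, [0] * 64)   # payload happens to be all zero bits

assert is_noisy(block)       # the encoder flags the block and writes to it
assert not is_noisy(stego)   # the decoder measures again and skips it
```

The encoder and decoder disagree about the same block, so the decoder never looks where the data was written.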

Also, this is an amazing field of research with all kinds of crazy little options. The less data you have to encode and the more surface you have to encode it in, the more options you've got.

You have to deal with both encryption (even someone who knows the algorithm cannot unlock the data) and obscurity (someone who knew where to look could find the data).

As one example out of many, assuming you've got a small amount of already encrypted data, you might encode seven bits within 128 arbitrary pixels (or a whole byte within 256) by flipping only a single bit: look at the natural parity of the pixels, use your private knowledge of which parity you are looking for, then flip the low-order bit of the one pixel that produces the parity you want. This is a variation on a classic logic puzzle: the guards have set up a checkerboard with pieces across the entire board. Depending on the variation of the puzzle, either the guard picks a square to be identified or the prisoners must choose the same piece. One prisoner flips a single piece on the board, then leaves. The other prisoner enters the room, reviews the arbitrary board that had a single piece flipped, and finds the information that completes the puzzle and saves their lives.
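One common solution to that checkerboard puzzle, sketched here on pixel values, is to XOR together the indices of every cell whose bit is set. With N cells (N a power of two) a single flip can then select any value from 0 to N-1, so 128 pixels carry seven bits and a full byte needs 256. The function names are made up for this example, and this is only one possible reading of the parity idea described above.

```python
from functools import reduce
from operator import xor


def read_value(pixels):
    """XOR together the indices of every pixel whose low-order bit is set."""
    return reduce(xor, (i for i, p in enumerate(pixels) if p & 1), 0)


def hide_value(pixels, value):
    """Flip exactly one low-order bit so that read_value() returns value."""
    n = len(pixels)
    if n & (n - 1) or not 0 <= value < n:
        raise ValueError("need a power-of-two block and 0 <= value < len(pixels)")
    out = list(pixels)
    # Flipping the bit at index k changes the XOR of set indices by k.
    # Index 0 contributes nothing, so it is the harmless "no change" flip.
    flip = read_value(out) ^ value
    out[flip] ^= 1
    return out


cover = [(i * 37 + 11) % 256 for i in range(256)]   # stand-in pixel values
stego = hide_value(cover, 0xA7)

assert read_value(stego) == 0xA7
assert sum(a != b for a, b in zip(cover, stego)) == 1   # one pixel changed
```

The reader needs no knowledge of the original image, only the agreed block of pixels and the XOR rule.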

It is a big field of research with many fun twists and turns.

Josh:

When I said that everything belongs in the beginners forum, I meant that everything I post, I'm always told to put it in the beginners forum, so now I tend to post everything there.

If I don't hide it only in the noisy sections, then won't it leave a clue that something is hidden there?

Thanks for verifying what I said about it changing the data and making it impossible to then find the noisy areas again. But then what should I do instead?

frob:

I hadn't thought of using parity, so thanks for the suggestion. I'm not sure it applies to this situation, though, and the same goes for your ideas about spreading the data thinly through the file I'm hiding it in, because the data I want to hide is so big that it takes up a significant portion of the file: a whole bit plane, or about half of one (depending on the size of the image, of course, but I'm using a pretty high-res image). That's why I thought that if I spread it over multiple bit planes, but only in noisy areas, it would still fit but be less obvious. And I can't split it across multiple images, because in this case I need to hide the whole thing in a single image.
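For a rough sense of scale, the capacity of a bit plane can be worked out directly. The sketch below assumes an uncompressed image with 8 bits per channel and counts one bit per channel per pixel as "one bit plane"; the 1920x1080 size and the 1,000,000-byte payload are made-up figures, not numbers from this thread.

```python
def plane_capacity_bytes(width, height, channels=3):
    """Bytes that fit when one bit is used per channel per pixel."""
    return width * height * channels // 8


def planes_needed(payload_bytes, width, height, channels=3):
    """Smallest number of whole bit planes that can hold the payload."""
    capacity = plane_capacity_bytes(width, height, channels)
    return -(-payload_bytes // capacity)   # ceiling division


assert plane_capacity_bytes(1920, 1080) == 777_600
assert planes_needed(1_000_000, 1920, 1080) == 2
```

Every extra bit plane used doubles the largest possible change to a channel value, which is the trade-off behind restricting the data to areas where such changes are less visible.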

But then what should I do instead?

Visit Amazon.com or a good library and get some books on the techniques. Study them.
Research graphics formats, graphics compression algorithms, and vision techniques: how data is interpreted, and how similar encoding patterns produce more or less pleasing results. For example, look at Google's research on slightly modifying JPEG compression routines so that they still produce legal JPEG files but select slightly different encoding patterns that are simultaneously better looking and more tightly compressed.
Study how countermeasures work, such as how encodings are discovered through minor details; for instance, one compression pattern may hold for most of the image while certain portions or a bit channel are encoded differently. You mention encoding within a color plane's low-order bits, but that is such a basic approach that it is trivially identified by anyone with knowledge of the field.

Just like your other topics and posts asking about information hiding, it is a subject that requires years of study and often requires highly advanced math. Not as insane as encryption, where current algorithms are only deeply understood by a few hundred people on the planet, but steganography is still an extremely advanced mathematical field.

This topic is closed to new replies.