Virtual memory and compression

I need to be able to work with images that are usually stored in 48-bit color and can be very large (they can easily exceed 16384x16384). Image data will be accessed both as solid chunks (any MxN block of the image) and at various zoom levels, large or small (e.g. displaying the whole image on a 1024x768 screen).

What is a convenient method of paging to disk? Are there resources out there for people with the same problem, or would I be better off coming up with my own system that supports my specific needs? I don't want to be limited by Windows' paging file - the only size limit should be the user's hard drive. Also, would it be better to organize the data in little chunks (1024x1024 or whatever) or as one straight stream? What kind of lossless compression techniques exist that could aid in this? Let me know if you need more information. Thanks.

Cheers
-Scott
Wow, image data larger than 1.6 gigabytes? I think storing it in chunks would be better because you would only have to load the chunks that are needed. Also, definitely don't explicitly load the whole file into memory. Look into using memory mapped files for whatever OS you are using; a memory mapped file allows the memory system and the file system to work together to only load the pages that are accessed.
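Roughly what that looks like with the Win32 API (an untested sketch, error handling mostly omitted, and the file name is just a placeholder):

#include <windows.h>
#include <cstdint>

int main()
{
    // Open the raw image file that already exists on disk.
    HANDLE file = CreateFileA("huge_image.raw", GENERIC_READ, FILE_SHARE_READ,
                              0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
    if (file == INVALID_HANDLE_VALUE) return 1;

    HANDLE mapping = CreateFileMappingA(file, 0, PAGE_READONLY, 0, 0, 0);
    if (!mapping) return 1;

    // Map the whole file; the OS pages 4 KB blocks in and out on demand.
    // (Fine in a 64-bit process; a 32-bit process would have to map smaller views.)
    const uint8_t* pixels = static_cast<const uint8_t*>(
        MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));

    // ... read only the bytes you need; untouched pages never hit RAM ...

    UnmapViewOfFile(pixels);
    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}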
What about using wavelets for multiresolution analysis and data compression?
Paging in Windows usually works with 4 KB pages. A 64x64 tile contains 4096 pixels, so each 4 KB page holds one byte per pixel; with 48-bit color (three 16-bit components, i.e. 6 bytes per pixel) a 64x64 tile therefore occupies six pages. A tile of that size makes a convenient unit for an in-memory cache.
The data on disk should be stored compressed to reduce bandwidth when accessing it. With wavelets you store a low-resolution image plus the difference information needed to reconstruct the higher resolutions. When a block of the image is needed, the data is decompressed on the fly and the result is cached, for instance with an LRU scheme.
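Something like this for the cache itself (a rough, hypothetical C++ sketch - the wavelet decode is only a stub, the point is the LRU bookkeeping):

#include <cstddef>
#include <cstdint>
#include <list>
#include <map>
#include <utility>
#include <vector>

// Hypothetical LRU cache of decompressed 64x64 tiles, keyed by (tileX, tileY).
class TileCache {
public:
    typedef std::pair<int, int> TileKey;      // (tileX, tileY)
    typedef std::vector<uint16_t> Tile;       // 48-bit pixels: 3 x 16-bit channels

    explicit TileCache(size_t maxTiles) : maxTiles_(maxTiles) {}

    // Return the decoded tile, decompressing on the fly and evicting the
    // least recently used tile when the cache is full.
    const Tile& get(const TileKey& key) {
        std::map<TileKey, ListIt>::iterator it = index_.find(key);
        if (it != index_.end()) {                        // hit: move tile to the front
            lru_.splice(lru_.begin(), lru_, it->second);
            return it->second->second;
        }
        if (!lru_.empty() && lru_.size() >= maxTiles_) { // full: drop the LRU tile
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        lru_.push_front(std::make_pair(key, decodeTile(key)));
        index_[key] = lru_.begin();
        return lru_.front().second;
    }

private:
    typedef std::list<std::pair<TileKey, Tile> > TileList;
    typedef TileList::iterator ListIt;

    // Stub: real code would reconstruct this tile from the stored wavelet data.
    Tile decodeTile(const TileKey&) { return Tile(64 * 64 * 3); }

    size_t maxTiles_;
    TileList lru_;
    std::map<TileKey, ListIt> index_;
};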
Hope that gives you some ideas.
Yeah, I was looking into some of the wavelet compression schemes that are used for things such as satellite imagery. Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely :/ Also, it is just as likely that the whole image will need to be viewed at once as it is that a chunk of it will be viewed at some zoom level, so the data needs to be easily accessible at both extremes.

Now, this is Windows-specific, but I was looking at the Virtual* functions, and they seem like a good first thought. Given my options (or lack thereof), though, it seems just as good if not better to simply keep the whole image raw on disk (let the user choose their fastest hard drive, or whatever) and access it as if it were memory. I don't particularly like this idea, but I'm not sure what other options are available. I guess the best version of this would have x number of mipmap levels (zoom/detail-level caches, whatever you want to call them) and have the data sorted into chunks of the image (MxN blocks).
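To make that concrete, here is roughly how I picture the offset math for a raw tiled file with mipmap levels (a purely hypothetical layout: levels stored one after another, 1024x1024 tiles row-major within each level, 6 bytes per pixel):

#include <cstdint>

const uint64_t kTileDim       = 1024;                    // pixels per tile side
const uint64_t kBytesPerPixel = 6;                       // 48-bit color
const uint64_t kTileBytes     = kTileDim * kTileDim * kBytesPerPixel;

// Number of tiles across one axis of a level with the given dimension.
uint64_t tilesAcross(uint64_t levelDim)
{
    return (levelDim + kTileDim - 1) / kTileDim;         // round up
}

// Byte offset of tile (tx, ty) at mip level 'level', for an image whose
// full-resolution size is width x height. Level 0 is full resolution and
// each further level halves both dimensions.
uint64_t tileOffset(uint64_t width, uint64_t height,
                    unsigned level, uint64_t tx, uint64_t ty)
{
    uint64_t offset = 0;

    // Skip all the levels stored before the requested one.
    for (unsigned l = 0; l < level; ++l)
        offset += tilesAcross(width >> l) * tilesAcross(height >> l) * kTileBytes;

    // Then index row-major within the requested level.
    offset += (ty * tilesAcross(width >> level) + tx) * kTileBytes;
    return offset;
}

With something like that, showing any MxN region at any zoom level is just a matter of working out which tiles it overlaps and reading (or mapping) those.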

Thanks for the suggestions - that's where my thoughts and research *were* going too. Any other ideas?

Cheers
-Scott
Quote:Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely


It isn't, unless the image data has a particularly unfavourable distribution.

The problem with lossless wavelet compression comes from the increasing precision required.

With a simple averaging function, each additional level needs at least one more bit to represent the deltas; in the worst case you have to be able to represent the maximum possible deviation if you cannot sacrifice any accuracy.


But the reason lossless compression works at all is the distribution of those deltas: they are very well suited to entropy coding (Huffman, arithmetic, RLE, for example). The only remaining problem is finding efficient variable-bit math operations and storage.

Real images (photographs) without any unusual noise patterns will typically compress by 20-50%. If you can accept a one-time accuracy loss, converting to a YUV model may gain you some 50-60% "lossless" compression.

Just look at the image histogram - not all values are equally represented, so even a simple Huffman encoding will produce at least some gains.
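A quick way to check is to take the horizontal deltas of one channel and estimate their entropy. Rough sketch for a single 16-bit channel (hypothetical helper, not from any library):

#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Estimate the entropy (bits per sample) of horizontal deltas in one 16-bit
// channel. A result well below 16 means entropy coding would save space.
double deltaEntropyBits(const std::vector<uint16_t>& channel,
                        size_t width, size_t height)
{
    std::map<int, size_t> histogram;
    size_t count = 0;

    for (size_t y = 0; y < height; ++y)
        for (size_t x = 1; x < width; ++x) {
            int delta = (int)channel[y * width + x] - (int)channel[y * width + x - 1];
            ++histogram[delta];
            ++count;
        }

    double bits = 0.0;
    for (std::map<int, size_t>::const_iterator it = histogram.begin();
         it != histogram.end(); ++it) {
        double p = (double)it->second / (double)count;
        bits -= p * std::log(p) / std::log(2.0);         // Shannon entropy
    }
    return bits;
}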
Quote:Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely

What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially (who decides this stuff, anyway?) accepted as usable for any type of image analysis, it may not be used :/ (reason #1 being lawsuits, and then going into a whole list of arguments supporting the no-lossless-compression "policy").

Otherwise that would be my prime focus right now, because I think it fits the problem area perfectly. Also, I doubt the noise already present in a nano-picture would be noticeably disrupted by the minute artifacts of certain wavelet compression schemes, but it really doesn't matter since I have no say in the matter! Oh well. If anyone has any other thoughts, let me know. Thanks again for the input.

Cheers
-Scott
Use Windows' file mapping functions for accessing the huge file. This will let the OS handle the job of paging parts of the file into memory as needed. For handling different zoom levels you'll probably need to use some sort of mipmapping-type technique to avoid needing to touch every page of the file when viewing the whole image at once.
Quote:What kind of lossless compression techniques exist that could aid in this?

Quote:Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely

Quote:What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially (who decides this stuff, anyway?) accepted as usable for any type of image analysis, it may not be used :/ (reason #1 being lawsuits, and then going into a whole list of arguments supporting the no-lossless-compression "policy").
I'm guessing that you mean you cannot afford for a portion of the image to be loaded and then manipulated in such a way that the same part of the image would now compress much worse, i.e. to a larger block size? That is certainly always going to be a possibility with compression. However, if you break the image up into multiple files (which may be a good idea anyway) then you can quite happily deal with this.
Definitely store it in little blocks anyway, as opposed to hugely long scanlines, since many types of image manipulation work on nearby pixels from other rows, e.g. convolution filters.
Since you're storing the image data at a higher number of bits per pixel than the display can draw, have you given any thought to dithering it down to 32- or 16-bit for display?
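The crudest version just keeps the top byte of each channel, something like this sketch (a real implementation might add ordered dithering or error diffusion before truncating):

#include <cstddef>
#include <cstdint>
#include <vector>

// Collapse 48-bit pixels (3 x 16-bit channels) to 24-bit for display by
// keeping only the most significant byte of each channel.
std::vector<uint8_t> toDisplay24(const std::vector<uint16_t>& src)
{
    std::vector<uint8_t> dst(src.size());
    for (size_t i = 0; i < src.size(); ++i)
        dst[i] = (uint8_t)(src[i] >> 8);                 // 16 bits -> 8 bits per channel
    return dst;
}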
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Quote:What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially


Standards and scientific community don't mix well. NIHilitis and all that.

Your original problem description asked about image manipulation, not storage. What you do while manipulating is irrelevant - in the same way that there is no official decision on which quicksort partitioning scheme to use - just find whatever is most suitable.

One way to manipulate such images is through off-line processing. You create a downsampled proxy of reasonable dimensions (each proxy pixel averaged from many source pixels), and the UI works with that. Whenever the user zooms in, higher-resolution data is loaded.

But to produce the end result, the image is processed off-line and all the transformations are applied to the entire image.
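Building the proxy is essentially a box filter run over the full-resolution tiles as they stream through. A rough sketch for one 16-bit channel, assuming the dimensions divide evenly by the reduction factor:

#include <cstddef>
#include <cstdint>
#include <vector>

// Box-filter one 16-bit channel down by an integer factor to build the
// lower-resolution proxy that the UI edits against.
std::vector<uint16_t> downsample(const std::vector<uint16_t>& src,
                                 size_t width, size_t height, size_t factor)
{
    size_t outW = width / factor, outH = height / factor;
    std::vector<uint16_t> dst(outW * outH);

    for (size_t oy = 0; oy < outH; ++oy)
        for (size_t ox = 0; ox < outW; ++ox) {
            uint64_t sum = 0;
            for (size_t dy = 0; dy < factor; ++dy)
                for (size_t dx = 0; dx < factor; ++dx)
                    sum += src[(oy * factor + dy) * width + (ox * factor + dx)];
            dst[oy * outW + ox] = (uint16_t)(sum / (factor * factor));
        }
    return dst;
}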

There was a graphics package a long time ago (when 32 MB was a lot of memory) that allowed editing of arbitrarily sized images (50000x50000 or more) using this technique, but I no longer remember its name.
Quote:Original post by Vorpy
Use Windows' file mapping functions for accessing the huge file. This will let the OS handle the job of paging parts of the file into memory as needed. For handling different zoom levels you'll probably need to use some sort of mipmapping-type technique to avoid needing to touch every page of the file when viewing the whole image at once.
That will work great right up until you run out of virtual address space a little short of 2 GB, and your code falls apart in a mess of boiling tar with toothpicks and paper clips stuck in the smoking remains. At which point everyone will laugh at you.
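The usual workaround is to map a bounded window of the file and remap it as the user moves around, rather than mapping the whole thing. A sketch with a hypothetical helper (the offset handed to MapViewOfFile must be a multiple of the allocation granularity, usually 64 KB):

#include <windows.h>
#include <cstddef>
#include <cstdint>

struct MappedWindow {
    const uint8_t* data;   // points at the byte that was asked for
    void*          base;   // pass this to UnmapViewOfFile when finished
};

// Map 'bytes' bytes of the file mapping starting at 'offset', rounding the
// offset down to the allocation granularity as MapViewOfFile requires.
MappedWindow mapWindow(HANDLE mapping, uint64_t offset, size_t bytes)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    uint64_t aligned = offset - (offset % si.dwAllocationGranularity);
    size_t   slack   = (size_t)(offset - aligned);

    void* base = MapViewOfFile(mapping, FILE_MAP_READ,
                               (DWORD)(aligned >> 32),
                               (DWORD)(aligned & 0xFFFFFFFFu),
                               bytes + slack);

    MappedWindow w;
    w.base = base;
    w.data = base ? (const uint8_t*)base + slack : 0;
    return w;
}

When the viewer scrolls to a different region, UnmapViewOfFile(w.base) and map a new window; address space use stays bounded no matter how big the file is.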
