Virtual memory and compression


I need to work with images that are usually stored in 48-bit color and can be very large (they can easily exceed 16384x16384). The data will be accessed both as solid chunks (any MxN block of the image at full resolution) and at small or large zoom factors (e.g. displaying the whole image on a 1024x768 screen). What is a convenient method of paging to disk? Are there resources out there for people with the same problem, or would I be better off building my own system around these specific needs? I don't want to be limited by Windows' paging file - the only size limit should be the user's hard drive. Also, would it be better to organize the data in small chunks (1024x1024 or whatever) or as one straight stream? What kinds of lossless compression techniques exist that could help with this? Let me know if you need more information. Thanks.

Cheers
-Scott

Wow, image data larger than 1.6 gigabytes? I think storing it in chunks would be better because you would only have to load the chunks that are needed. Also, definitely don't explicitly load the whole file into memory. Look into using memory mapped files for whatever OS you are using; a memory mapped file allows the memory system and the file system to work together to only load the pages that are accessed.
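For example, a minimal sketch of that with the Win32 file-mapping API (the file name is a placeholder, and mapping the whole file in one go is only an illustration; it assumes the file fits in the process's address space):

#include <windows.h>
#include <cstdint>
#include <cstdio>

int main()
{
    // Open the raw image file (placeholder path).
    HANDLE file = CreateFileA("huge_image.raw", GENERIC_READ, FILE_SHARE_READ,
                              nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return 1;

    // Create a read-only mapping object covering the whole file.
    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    if (!mapping) { CloseHandle(file); return 1; }

    // Map the entire file into the address space.
    const uint8_t* pixels = static_cast<const uint8_t*>(
        MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));
    if (pixels) {
        // Touching pixels[offset] makes the OS page in just that 4 KB page.
        std::printf("first byte: %u\n", unsigned(pixels[0]));
        UnmapViewOfFile(pixels);
    }

    CloseHandle(mapping);
    CloseHandle(file);
    return 0;
}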

What about using wavelets for multiresolution analysis and data compression?
Paging in Windows usually works with 4 KB pages, so a 64x64 block of single-byte samples fills exactly one page. At 48 bits per pixel (16 bits for each of the three color components), a 64x64 tile needs 24 KB, i.e. six pages. A block of that size makes a natural unit for an in-memory caching mechanism.
Data on disk should be compressed to reduce the bandwidth needed when accessing it. With wavelets you store a low-resolution image plus the difference information needed to reconstruct the higher resolutions. When a block of the image is needed, it is decompressed on the fly and the result is cached, for instance with an LRU scheme.
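To illustrate the caching part, here is a minimal LRU tile cache sketch; TileKey and decompress_tile are made-up names for the example, and the stub decode would be replaced by real disk reads:

#include <cstdint>
#include <functional>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

struct TileKey {
    uint32_t x, y, level;
    bool operator==(const TileKey& o) const { return x == o.x && y == o.y && level == o.level; }
};

struct TileKeyHash {
    size_t operator()(const TileKey& k) const {
        return std::hash<uint64_t>()((uint64_t(k.level) << 48) ^ (uint64_t(k.y) << 24) ^ k.x);
    }
};

// Stub: real code would read and decode the tile from disk.
std::vector<uint16_t> decompress_tile(const TileKey&)
{
    return std::vector<uint16_t>(64 * 64 * 3, 0);
}

class TileCache {
public:
    explicit TileCache(size_t maxTiles) : maxTiles_(maxTiles) {}

    const std::vector<uint16_t>& get(const TileKey& key) {
        auto it = map_.find(key);
        if (it != map_.end()) {
            // Cache hit: move the tile to the front of the recency list.
            lru_.splice(lru_.begin(), lru_, it->second.second);
            return it->second.first;
        }
        // Cache miss: evict the least recently used tile, then decompress and insert.
        if (map_.size() >= maxTiles_) {
            map_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(key);
        auto res = map_.emplace(key, std::make_pair(decompress_tile(key), lru_.begin()));
        return res.first->second.first;
    }

private:
    size_t maxTiles_;
    std::list<TileKey> lru_;
    std::unordered_map<TileKey,
                       std::pair<std::vector<uint16_t>, std::list<TileKey>::iterator>,
                       TileKeyHash> map_;
};
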
Hope that gives you some ideas.

Yeah, I was looking into some of the wavelet compression schemes that are used for things such as satellite imagery. Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely :/ Also, it is just as likely that the whole image will need to be viewed at once as that a chunk of it will be viewed at some zoom level, so the data needs to be easily accessible both ways.

Now, this is Windows-specific, so I was looking at the Virtual* functions, and they seem like a good first thought, but given my options (or lack thereof) it seems just as good, if not better, to keep the whole image raw on disk (let the user choose their fastest hard drive, or whatever) and access it as if it were memory. I don't particularly like this idea, but I'm not sure what other options are available. I guess the best version of this would have some number of mipmap levels (zoom/detail-level caches, whatever you want to call them) and keep the data sorted in chunks of the image (MxN blocks).

Thanks for the suggestions - that's where my thoughts and research *were* going too. Any other ideas?

Cheers
-Scott

Quote:
Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely


It doesn't have to be, unless the image has a particular distribution that makes it incompressible.

The problem with lossless wavelet compression comes from the precision growing at each level.

With a simple averaging function, each additional level requires at least one more bit to represent the deltas; the worst case requires representing the maximum possible deviation if you cannot sacrifice any accuracy.


But the reason lossless compression works is the distribution of the deltas: they cluster around zero and are very suitable for entropy coding (Huffman, arithmetic or RLE, for example). So the only real problem becomes finding efficient variable-bit math operations and storage.
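To make that concrete, here is a sketch of one reversible average/difference step on 16-bit samples (the integer S-transform used in lossless wavelet coders); the difference needs one extra bit in the worst case but clusters around zero, which is exactly what the entropy coder exploits:

#include <cstdint>
#include <cstdio>

// One reversible average/difference step on a pair of 16-bit samples.
// The average stays within 16 bits; the difference needs 17 bits in the
// worst case, but for real images it is usually tiny and compresses well.
void forward(uint16_t a, uint16_t b, int32_t& avg, int32_t& diff)
{
    diff = int32_t(a) - int32_t(b);            // range [-65535, 65535]
    avg  = (int32_t(a) + int32_t(b)) >> 1;     // floor of the mean
}

void inverse(int32_t avg, int32_t diff, uint16_t& a, uint16_t& b)
{
    // Arithmetic right shift of the signed value assumed
    // (the usual compiler behavior, guaranteed since C++20).
    a = uint16_t(avg + ((diff + 1) >> 1));     // exact reconstruction
    b = uint16_t(int32_t(a) - diff);
}

int main()
{
    uint16_t a = 51234, b = 51190, ra, rb;
    int32_t avg, diff;
    forward(a, b, avg, diff);
    inverse(avg, diff, ra, rb);
    std::printf("avg=%d diff=%d roundtrip=%u,%u\n", avg, diff, ra, rb);
    return 0;
}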

Real images (photographs) without any unusual noise patterns will typically give 20-50% compression. If you can accept a one-time accuracy loss, converting to a YUV model may gain you some 50-60% "lossless" compression.

Just look at the image histogram - not all values are equally represented, so even a simple Huffman encoding will produce at least some gain.
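A quick way to check is to estimate the entropy from the histogram; a sketch (the sample buffer here is just skewed dummy data):

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Estimate the bits per sample an entropy coder could reach, from a histogram.
double estimated_bits_per_sample(const std::vector<uint16_t>& samples)
{
    std::vector<uint64_t> hist(65536, 0);
    for (uint16_t s : samples) ++hist[s];

    double bits = 0.0;
    const double n = double(samples.size());
    for (uint64_t count : hist) {
        if (count == 0) continue;
        const double p = double(count) / n;
        bits -= p * std::log2(p);          // Shannon entropy contribution
    }
    return bits;                           // compare against the raw 16 bits/sample
}

int main()
{
    // Dummy data: most samples share a few values, as in a real histogram.
    std::vector<uint16_t> samples(1 << 20);
    for (size_t i = 0; i < samples.size(); ++i)
        samples[i] = uint16_t((i % 7 == 0) ? (i % 1024) : 512);
    std::printf("~%.2f bits/sample\n", estimated_bits_per_sample(samples));
    return 0;
}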

Quote:
Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely

What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially (who decides this stuff anyway?) accepted as usable for any type of image analysis, it may not be used :/ (reason #1 being lawsuits, followed by a whole list of arguments supporting the no-lossless-compression "policy").

Otherwise that would be my prime focus right now, because I think it fits the problem perfectly. Also, I doubt the noise already present in a nano-picture would be horribly disrupted by the minute artifacts of certain wavelet compression schemes, but it really doesn't matter since I have no say in it! Oh well. If anyone has other thoughts, let me know. Thanks again for the input.

Cheers
-Scott

Use windows' file mapping functions for accessing the huge file. This will let the OS handle the job of paging parts of the file into memory as needed. For handling different zoom levels you'll probably need to use some sort of mipmapping type technique to avoid needing to touch every page of the file when viewing the whole image at once.

Quote:
What kind of lossless compression techniques exist that could aid in this?

Quote:
Only problem is, I was just told very firmly that lossless (or any) compression is out of the picture, completely

Quote:
What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially (who decides this stuff anyway?) accepted as usable for any type of image analysis, it may not be used :/ (reason #1 being lawsuits, followed by a whole list of arguments supporting the no-lossless-compression "policy").
I'm guessing you mean that you cannot afford for a portion of the image to be loaded and then manipulated in such a way that the same part of the image would now compress much worse, i.e. to a larger block size? That's certainly always going to be a possibility with compression. However, if you break the image up into multiple files (which may be a good idea anyway), then you can deal with this quite happily.
Definitely store it in small blocks rather than hugely long scanlines, as many kinds of image manipulation work on nearby pixels from other rows, e.g. convolution filters.
Since you're storing the image data at a higher bit depth than the display can draw, have you given any thought to dithering down to 32-bit or 16-bit for display?
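On that last point, a minimal sketch of reducing each 16-bit channel to the 8 bits of a typical 32-bit display format, using plain rounding (a proper ordered or error-diffusion dither could be layered on top):

#include <cstdint>

// Reduce one 16-bit sample to 8 bits with correct rounding; good enough for
// display, while the full-precision data stays untouched on disk.
inline uint8_t to_display_8bit(uint16_t s)
{
    return uint8_t((uint32_t(s) * 255u + 32768u) / 65535u);
}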

Quote:
What I meant was that unless there is a method of lossless compression that the scientific community as a whole has officially


Standards and scientific community don't mix well. NIHilitis and all that.

Your original problem description asked about image manipulation, not storage. What you do while manipulating is irrelevant - in the same way there is no official decision on which quicksort partitioning scheme to use - just find whatever is most suitable.

One way to manipulate such images is through off-line processing. You create a downsampled proxy image of reasonable dimensions and the UI works with that. Whenever the user zooms in, higher-resolution data is loaded.

But to produce the end result, the image is processed off-line and all the transformations are applied to the entire image.

There was a graphics package a long time ago (when 32 MB was a lot of memory) that allowed editing of arbitrarily sized images (50000 x 50000 or more) using this technique, but I no longer remember its name.

Quote:
Original post by Vorpy
Use windows' file mapping functions for accessing the huge file. This will let the OS handle the job of paging parts of the file into memory as needed. For handling different zoom levels you'll probably need to use some sort of mipmapping type technique to avoid needing to touch every page of the file when viewing the whole image at once.
That will work great right up until you run out of virtual address space a little short of 2 GB, and your code falls apart in a mess of boiling tar with toothpicks and paper clips stuck in the smoking remains. At which point everyone will laugh at you.

Quote:
Original post by Promit
Quote:
Original post by Vorpy
Use windows' file mapping functions for accessing the huge file. This will let the OS handle the job of paging parts of the file into memory as needed. For handling different zoom levels you'll probably need to use some sort of mipmapping type technique to avoid needing to touch every page of the file when viewing the whole image at once.
That will work great right up until you run out of virtual address space a little short of 2 GB, and your code falls apart in a mess of boiling tar with toothpicks and paper clips stuck in the smoking remains. At which point everyone will laugh at you.


Oops. I guess the program still needs to do some address space management, or run on 64-bit Windows, where the address space wouldn't be a problem.
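For what it's worth, a sketch of that kind of address space management: map only a window of the file at a time, aligning the view offset to the system's allocation granularity (the mapping handle is assumed to come from CreateFileMapping as in the earlier sketch, and the window is assumed to stay within the file):

#include <windows.h>
#include <cstdint>

// Map a limited window of a huge file, starting near 'byteOffset'.
// Returns the mapped base; *adjust is how far into the view the caller's
// offset actually lies, since views must start on an allocation-granularity
// boundary (usually 64 KB).
const uint8_t* map_window(HANDLE mapping, uint64_t byteOffset, size_t windowBytes,
                          size_t* adjust)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    const uint64_t base = byteOffset - (byteOffset % si.dwAllocationGranularity);
    *adjust = size_t(byteOffset - base);

    return static_cast<const uint8_t*>(
        MapViewOfFile(mapping, FILE_MAP_READ,
                      DWORD(base >> 32), DWORD(base & 0xFFFFFFFFu),
                      windowBytes + *adjust));
}

// Usage: read a pixel somewhere deep inside the file, then unmap the view.
//   size_t adjust;
//   const uint8_t* view = map_window(mapping, offset, 64 * 1024 * 1024, &adjust);
//   if (view) { uint8_t value = view[adjust]; UnmapViewOfFile(view); }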

I think the best way to store this data is definitely to tile it:

0123ghijwxyz
4567klmnABCD
89abopqrEFGH
cdefstuvIJKL
MNOPetc...
QRST
UVWX
YZ01

This will make loading a tile quick - one sequential read. Storing like this:

0123456789ab
cdefghijklmn
opqrstuvwxyz
ABCDEFGHIJKL
MNOPQRSTUVWX
YZ01etc...

would require a sequence of reads and seeks to load a tile, which is much slower, even on a fast hard disk.
Additionally, precompute the mipmaps for each tile and store them with the tile:

tile0
tile0mip0
tile0mip1
tile0mip2
tile0mip3
etc..
tile1
tile1mip0
etc..

The calculations to find a specific tile and mipmap are then really straightforward, especially with a tile size of, say, 256x256, with the image dimensions rounded up to the next multiple of 256 so it divides evenly into tiles.

You would then have a mapping between the visible tiles and the tile data on disk; editing the image updates the tile and its mipmaps on disk.
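For illustration, the offset arithmetic could look something like this sketch, assuming 256x256 tiles, 6 bytes per pixel, a mip chain stored right after each tile all the way down to 1x1, and no file header:

#include <cstdint>

// Bytes in one WxH block at 6 bytes per pixel (48-bit color).
inline uint64_t block_bytes(uint32_t w, uint32_t h) { return uint64_t(w) * h * 6; }

// Size of one on-disk record: a 256x256 tile followed by its mip chain.
inline uint64_t tile_record_bytes()
{
    uint64_t total = 0;
    for (uint32_t s = 256; s >= 1; s /= 2)
        total += block_bytes(s, s);
    return total;
}

// Offset of mip 'level' within a record (level 0 being the 128x128 mip).
inline uint64_t mip_offset_in_record(uint32_t level)
{
    uint64_t off = block_bytes(256, 256);
    for (uint32_t s = 128; level > 0; s /= 2, --level)
        off += block_bytes(s, s);
    return off;
}

// File offset of tile (tx, ty), with the image padded to tilesPerRow tiles across.
inline uint64_t tile_offset(uint32_t tx, uint32_t ty, uint32_t tilesPerRow)
{
    return (uint64_t(ty) * tilesPerRow + tx) * tile_record_bytes();
}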

Skizz

Quote:
Original post by Skizz
I think the best way to store this data is definitely to tile it:

0123ghijwxyz
4567klmnABCD
89abopqrEFGH
cdefstuvIJKL
MNOPetc...
QRST
UVWX
YZ01
Why not take this a step further and define the image recursively as a quad-tree (down to 64x64 blocks)? With mipmaps stored at each node, this allows for fairly uniform loading times at any zoom level.
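For illustration, a sketch of addressing the quad-tree leaves with a Morton (Z-order) index; the 64x64 leaf size follows the suggestion above, the rest is just one possible layout:

#include <cstdint>

// Interleave the bits of x and y to get a Morton (Z-order) index.
// Blocks that are close in 2D end up close in the 1D ordering, which is
// convenient for laying quad-tree leaves out on disk.
uint64_t morton_index(uint32_t x, uint32_t y)
{
    uint64_t index = 0;
    for (uint32_t bit = 0; bit < 32; ++bit) {
        index |= uint64_t((x >> bit) & 1u) << (2 * bit);
        index |= uint64_t((y >> bit) & 1u) << (2 * bit + 1);
    }
    return index;
}

// Leaf index of the 64x64 block containing pixel (px, py).
uint64_t leaf_index(uint32_t px, uint32_t py)
{
    return morton_index(px / 64, py / 64);
}

// The parent node at the next-lower resolution is just the index shifted
// down by two bits, so walking up the quad-tree is a couple of shifts.
uint64_t parent_index(uint64_t index) { return index >> 2; }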

Admiral
