Jump to content
  • Advertisement
Sign in to follow this  
way2lazy2care

noob memory use question related to number storage

This topic is 2629 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hola guys,
I am going to be storing potentially a lot of number values in an array. These numbers will always come in increasing in value (possibly staying the same) and will always be greater than zero. What I am wondering is if it is possible to store these values in a data structure in such a way that for values less than a certain threshold it will use the smallest size of memory needed to store it. For ex. 0-255 would be stored as bytes, 256-65,535 would be stored as shorts, etc. It seems like this could save a lot of data for very large data sets.

I'm probably just going to implement it with longs for now as this is very much a premature optimization, but I was just wondering if there was a way to do this or if it would cause more trouble than it would help.

More than likely the simpler way to do it would come down to compressing my huge number of points into fewer points, but I thought it would be illuminating to know the answer to this question :)

Share this post


Link to post
Share on other sites
Advertisement
Given those constraints, you could certainly do that. Different languages would make that easier/worse.

You could also store only the delta between each number if you can guarantee that to be small and don't need random access. For example:


0 2 0 4 3


instead of


0 2 2 6 9

Share this post


Link to post
Share on other sites

Given those constraints, you could certainly do that. Different languages would make that easier/worse.

C# is the language

You could also store only the delta between each number if you can guarantee that to be small and don't need random access. For example:


0 2 0 4 3


instead of


0 2 2 6 9

[/quote]
I like this. At the very least I should be able to cut the data in half because the differences should be fairly small.

Share this post


Link to post
Share on other sites
Millions of longs isn't a big deal on modern desktops. Hell, it's probably not that big of a deal on most mobile devices either once it's in memory.

Share this post


Link to post
Share on other sites

Millions of longs isn't a big deal on modern desktops. Hell, it's probably not that big of a deal on most mobile devices either once it's in memory.


Yea. I'm aware it shouldn't be a huge deal, but it got me curious how one might go about doing it efficiently.

Share this post


Link to post
Share on other sites
Telastyn's suggestion is probably the best. If you can constrain the difference between any two successive numbers to a given range, you can compact it to store, say, a byte or short per delta; otherwise, you could probably do some kind of variable-length encoding magic to get that down even further. I don't know if you really want to mess with bit-stream variable encoding in C#, though.

What really matters, though, is the usage pattern of this data:
  • Are you adding values on the fly? If so, how often?
  • How do you retrieve values and what do you do with them?
  • What access patterns to the value list do you need?

    For instance, if you know you'll only ever access the last 25 numbers, just use a circular buffer of size 25 and voila, no memory problems at all!

Share this post


Link to post
Share on other sites

What really matters, though, is the usage pattern of this data:
  • Are you adding values on the fly? If so, how often?
  • How do you retrieve values and what do you do with them?
  • What access patterns to the value list do you need?


It's the histogram thingy that I made another thread about. Values can be added on the fly to a database, and then I grab the new stuff from the db periodically. Access patterns is something I'm still dealing with as some of the histogram algorithms I've seen require access to all the data at some point in different patterns (usually either binary search or in order). I think what I'm going to end up doing is reading in all my data over whatever range and storing it in a kind of hi res simple histogram that's then used to generate the more dynamic scalable histogram.

Kind of like this:
1. 100,000-1,000,000s of data points.
2. Downscale to 5,000-10,000 weighted points (some time at startup, but afterwards it would just be incrementing the last containers weight with new data and if it runs out of space it will merge neighbors.)
3. Use 5,000 weighted points to generate a ~200 point histogram that can zoom/scale/pan/update fairly quickly.

I was looking at the visual usefulness of my graph, and anything over 200 points seems to be a waste for the size of the graph I'm using.

The problem is mostly due to the panning and zooming that makes the 200 point final histogram have to access more detailed data than it contains itself if I want to keep it balanced. I suppose I could also do something like having a fixed depth tree that does something similar to the above except the 5,000 weighted points would just be the lowest level of the tree. I will think about this some more actually, because now I think I've just sold myself on doing it that way.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!