Calculating distance between very large values.


Here's what my position descriptor looks like:


    struct bounds_t
    {
        math::i64vec3    interstellar;    // 1 unit = 1 kpc
        math::i64vec3    global;          // 1 unit = 1 km
        math::vec3       local;           // 1 unit = 1 m, clamped at 1000 m
    };

i64vec3 is a signed vector of 64-bit integers and vec3 is a 3-vector of single precision floating point numbers. The three tiers correspond to:

local: 1 unit = 1 meter, clamped at 1000 meters

global: 1 unit = 1 km, clamped at +/- 943932 light years or 298 kpc (roughly the size of our galaxy)

interstellar: 1 unit = 1 kpc, clamped at a bloody large universe (presently chosen to be 64 bits as 32 bits does not cover the 93 bn light year / ~30 Gpc size of the observable universe)
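(For illustration only, here's a minimal, self-contained sketch of how the three tiers could be kept normalized, i.e. how local overflow would be carried into the global tier. The type and function names are my own stand-ins, not part of the actual engine; the unit constants simply follow the list above.)

    #include <cstdint>
    #include <cmath>

    // Stand-ins for the engine types; units follow the tiers listed above:
    //   interstellar : 1 unit = 1 kpc
    //   global       : 1 unit = 1 km
    //   local        : 1 unit = 1 m, clamped at 1000 m
    struct tiered_pos_t
    {
        std::int64_t interstellar[3];
        std::int64_t global[3];
        float        local[3];
    };

    // Carry whole kilometers out of the local tier so |local| stays below 1000 m.
    // (A similar carry from global to interstellar would use 1 kpc ~= 3.0857e+16 km.)
    inline void normalize(tiered_pos_t& p)
    {
        for (int i = 0; i < 3; ++i)
        {
            const std::int64_t carry_km =
                static_cast<std::int64_t>(std::floor(p.local[i] / 1000.0f));
            p.global[i] += carry_km;
            p.local[i]  -= static_cast<float>(carry_km) * 1000.0f;
        }
    }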

Let's, for the time being, ignore the purposefulness (or lack thereof) of such a layout and, in particular, the efficacy of using 64-bit datatypes, and instead consider the need for a fastish way of calculating the distance between two such points.

The problem here is that while local coordinates can hit a squared-distance ceiling of (1000x1000 + 1000x1000 + 1000x1000) = 3,000,000 units², global coordinates can overflow much more easily.

1 kpc = 3.086e+16 km, which can expand to a maximum of 3 x (3.086e+16)² = ~2.8570188e+33 km² during a squared-distance calculation. Which, according to my smallish brain, takes ~111 bits of precision to store.
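(A quick sanity check on that bit count, as a throwaway snippet rather than anything engine-related:)

    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double kpc_in_km = 3.086e+16;                   // 1 kpc in km
        const double max_sq    = 3.0 * kpc_in_km * kpc_in_km; // worst-case x^2 + y^2 + z^2
        // Prints roughly 2.857e+33 km^2 and ~111.1 bits.
        std::printf("max squared distance ~ %g km^2, needs ~%.1f bits\n",
                    max_sq, std::log2(max_sq));
    }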

Without resorting to more complex approximations, what would your suggestion be to sidestep (or rather alleviate) this problem? With this in mind I'd like to stress that this is my first foray into numbers of such magnitude and I'm not even sure how to go about performing a square root on a value that large.

Would you conditionally use a 128-bit proxy if the Manhattan distance is larger than some threshold? Sample down to 32 bits during distance calculations? Something else (hopefully much more) sinister?
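(To make the 128-bit option concrete, here's roughly what it could look like on compilers that provide __int128, e.g. GCC or Clang; this is a sketch under that assumption, not a recommendation, and the function name is made up:)

    #include <cstdint>
    #include <cmath>

    // Squared distance of two global-tier (km) coordinates accumulated in 128 bits,
    // so the per-axis squares cannot overflow a 64-bit intermediate. The square
    // root itself is done by converting to long double; there is no 128-bit sqrt
    // in the standard library, but the precision loss is negligible for a distance.
    inline long double distance_km_128(const std::int64_t a[3], const std::int64_t b[3])
    {
        unsigned __int128 sq = 0;
        for (int i = 0; i < 3; ++i)
        {
            const __int128 d = static_cast<__int128>(a[i]) - b[i]; // no overflow on subtraction
            sq += static_cast<unsigned __int128>(d * d);
        }
        // Note: if the coordinates really span the full +/- 2^63 km range, 3 * d^2 can
        // just exceed 128 bits; clamping a little tighter (or accumulating in long
        // double) sidesteps that edge case.
        return std::sqrt(static_cast<long double>(sq));
    }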

Unless for some reason you need ridiculously precise distance computations, you could subtract your two bounds_t objects (resulting in something like a bounds_t object that perhaps violates the clamping requirements, but you don't care), convert that to a traditional vector of three doubles (say, using km as the unit) and use the length of that. I can't see where that might go wrong.
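(Something along those lines, I suppose; a sketch only. The stand-in types mirror the bounds_t from the first post, the member layout and the dvec3 helper are my assumptions, and 1 kpc is taken as ~3.0857e+16 km:)

    #include <cmath>
    #include <cstdint>

    // Stand-ins for the engine types from the first post.
    struct i64vec3  { std::int64_t x, y, z; };
    struct vec3     { float x, y, z; };
    struct bounds_t { i64vec3 interstellar; i64vec3 global; vec3 local; };

    struct dvec3 { double x, y, z; };

    // Subtract the two positions tier by tier, fold everything into kilometers
    // as doubles, and take the length of the result.
    inline double distance_km(const bounds_t& a, const bounds_t& b)
    {
        constexpr double KPC_TO_KM = 3.0857e+16;
        constexpr double M_TO_KM   = 1.0e-3;

        const dvec3 d = {
            double(a.interstellar.x - b.interstellar.x) * KPC_TO_KM
                + double(a.global.x - b.global.x)
                + double(a.local.x  - b.local.x) * M_TO_KM,
            double(a.interstellar.y - b.interstellar.y) * KPC_TO_KM
                + double(a.global.y - b.global.y)
                + double(a.local.y  - b.local.y) * M_TO_KM,
            double(a.interstellar.z - b.interstellar.z) * KPC_TO_KM
                + double(a.global.z - b.global.z)
                + double(a.local.z  - b.local.z) * M_TO_KM
        };
        return std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    }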

SIMSpace 8.0 uses floats for meters. A sector is 1M meters, and a quadrant is 1M sectors. Sectors and quadrants are stored in ints. The game world is 14E17 meters across (a cube of 1000 stars at approximately the average star density of the Milky Way galaxy).

I use a range routine that returns BBOX distance and the unit of measure (meters, sectors, or quadrants). Until you get down to meters for combat and close maneuvers, BBOX is adequate, and you avoid the squared term in true Euclidean 3D distance calculations, which seems to be the first thing to overflow in 3D games. Once the range gets low you can use floats, doubles, or fixed point for true 3D distance calculations if required.
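(My own reconstruction of what such a routine might look like; this is not SIMSpace code, and the names and the unit enum are made up:)

    #include <cstdint>
    #include <cmath>
    #include <algorithm>

    enum class range_unit { meters, sectors, quadrants };
    struct range_t { std::int64_t value; range_unit unit; };

    // Chebyshev ("BBOX") distance on one tier: max of per-axis absolute deltas.
    // No squared terms, so nothing here can overflow.
    inline std::int64_t cheb(const std::int64_t a[3], const std::int64_t b[3])
    {
        std::int64_t m = 0;
        for (int i = 0; i < 3; ++i)
        {
            std::int64_t d = a[i] - b[i];
            if (d < 0) d = -d;
            m = std::max(m, d);
        }
        return m;
    }

    // Report range in the coarsest unit where the two positions actually differ.
    inline range_t bbox_range(const std::int64_t a_quad[3], const std::int64_t a_sec[3], const float a_m[3],
                              const std::int64_t b_quad[3], const std::int64_t b_sec[3], const float b_m[3])
    {
        if (const std::int64_t q = cheb(a_quad, b_quad)) return { q, range_unit::quadrants };
        if (const std::int64_t s = cheb(a_sec,  b_sec))  return { s, range_unit::sectors };

        std::int64_t m = 0;
        for (int i = 0; i < 3; ++i)
            m = std::max(m, static_cast<std::int64_t>(std::fabs(a_m[i] - b_m[i])));
        return { m, range_unit::meters };
    }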

Norm Barrows

Rockland Software Productions

"Building PC games since 1989"

rocklandsoftware.net


Frankly, a loss of precision on these scales is a non-issue. I knew that much before I started this thread :). That being said, on the one hand that's really not the point of the discussion, and on the other hand I hadn't considered directly casting stuff to doubles, which on closer inspection results in far less error than I initially figured, considerably alleviating the problem without the need for much further thought.

However... I did do some further calculations to see what the limitations would be if I wanted to encapsulate the entire observable universe and calculate distances at a precision that would be absolutely minuscule with respect to the numbers involved. And I think I've figured out a somewhat reasonable, mathematically sound scheme to do so, which is in no shape or form overly fascinating. Or useful. In fact, without changing the above data structure AFAICT distances can be calculated to roughly within one micrometer across the entire observable universe, while only accepting some - though arguably considerable - waste of storage space. All of which is something every science nerd needs more of. Right?

tl;dr: please stop reading now. I'm way overthinking this.

In case the tl;dr didn't discourage you, here are some facts and presumptions:

1) the size (diameter) of the observable universe is 93 billion light years.

1.1) I was initially thinking of using 1 AU (astronomical unit) as the cutoff distance, but I wanted to be able to store a normal-sized solar system within a single precision range, which an AU cannot do

2) speed and storage are not really issues

3) however, I don't want to use arbitrary-precision math, i.e. everything should stay within the pipeline a regular 64-bit CPU can handle

4) the biggest presumption here is that my math is actually correct. Which it may very well not be.

And here's the rundown:

a) assume overflows are unacceptable and that it is undesirable to make use of the upper range of either double precision floating point or some spatially non-symmetric fixed-point distance metric. The maximum squared distance between two 64-bit coordinates located within a cube hence becomes:

x² + x² + x² = 3x² = 2^63 = 9223372036854775808 km²
Which resolves to:
x² = 3074457345618258602.6(6) km² =>
x = 1753413056.1902003251168019428139 km, or =>
x = 5.6824247180601 pc

Which is ~200 times less than the kiloparsec range I was initially trying to stretch the global coordinates to. No worries. Let's just reduce the global scale to the range [1 km; 1 pc].

b) with this there's enough precision to calculate distances directly without fear of overflow (sketched in code after this rundown). The precision cutoff point of 1 full unit in double precision floating point format (i.e. where the precision dips below one full unit) is around 1e+20, which is notably ~10x larger than the 9223372036854775808 km² upper limit for the squared distance. Which is actually a pretty nice fit.

c) the bigger problem here is wasted storage space. Using 64 bits to store distances from 1 km to 1 kiloparsec nets a whopping log2(974932) - log2(1000) = 9.9291577864 bits of wasted space per axis. Reducing the upper limit to one parsec bumps this up to 19.8949420711 bits.

Which is a total of ~60 bits of unused space when storing a single coordinate. However, that's not all.

d) the same logic can be applied to the intergalactic scope, which is also a 64-bit integer 3-vector, boosting the amount of wasted space to around 15 bytes per coordinate. Which is a LOT.

e) that being said, using 44 bits of precision per axis on the intergalactic tier on top of the 1-parsec global scale amounts to a maximum universe size of (2^44 * 3.26) ly / 93000000000 ly = ~616x the size of the observable universe.

Success! Right? Well, yeah, as long as you don't consider that each coordinate wastes more space than is used by a full-blown 3-vector used for rendering.
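(For anyone reading along, here's roughly what point (b) boils down to in code; a sketch under my own assumptions about the coordinate layout, with the clamp left as a comment rather than enforced:)

    #include <cstdint>
    #include <cmath>

    // Point (b) in code: with each global-tier axis clamped to roughly
    // +/- 1.75e+9 km (the sqrt(2^63 / 3) limit derived above), the squared
    // distance fits a 64-bit integer and the final square root is done in double.
    inline double global_distance_km(const std::int64_t a[3], const std::int64_t b[3])
    {
        std::uint64_t sq = 0;
        for (int i = 0; i < 3; ++i)
        {
            const std::int64_t d = a[i] - b[i];
            sq += static_cast<std::uint64_t>(d * d); // sum stays at or below 2^63 under the clamp
        }
        return std::sqrt(static_cast<double>(sq));
    }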

There are a couple of upshots to this, however:

a) the extra bits can optionally be used to store rotational or other intrinsic data, such as brightness and/or color.

b) assuming most of the universe is procedurally generated and can hence be quickly discarded on demand, the number of objects that need to be stored with extended precision at any one time is actually relatively small. Likely in the upper hundreds or lower thousands. Which doesn't really amount to too much wasted space in the end.

So - voila. Here's to 2 hours well spent! Because SCIENCE!

Incidentally, if anyone's bored, please feel free to check my math.

