Sign in to follow this  
gohkgohk

Is it possible to make a custom type of integer?

Recommended Posts

if i know that i will only store a list of integers between -10 & 10, so there will be a lot of unwanted memory space wasted if i create an integer type for them. Is it possible to make a custom type of integer that only let you to store integers between -10 & 10, so no space will be wasted?

Share this post


Link to post
Share on other sites

If it's C/C++ you could use bitfields in a structure like this:

struct SmallIntEntry
{
int value1 : 5;
int value2 : 5;
int value3 : 5;
};

With correct alignment, sizeof( SmallIntEntry) will be 2 bytes for 3 values.
Creating a list of this will be kind of nasty though.

Share this post


Link to post
Share on other sites
Then the answer is no. Data types in C++ must always have a size of at least 1, which is minimally 8 bits. It's impossible to create a data type smaller than this.

Share this post


Link to post
Share on other sites
Quote:
Original post by SiCrane
Then the answer is no. Data types in C++ must always have a size of at least 1, which is minimally 8 bits. It's impossible to create a data type smaller than this.

Oh...
:(
But wt is the meaning of PsyQo's reply?
i want to create a very big array(100000) which only store integers between -10 to 10.

Share this post


Link to post
Share on other sites
something to consider your 100000 entity array still takes up less then 1/5th of a megabyte you still be able to run it on an old 486 with 4mb ram with out having to worry about memory. even on the gba you would be fine with memory so you could create the array using char instead of an int so it only takes up 1/10th of a ~ megabyte

Share this post


Link to post
Share on other sites
PsyQo's post concerns bitfields, which are variables that are part of a data type. You can't use a bitfield as a type, but you can create a type that contains bitfields. However, such a data type will always use at least one byte.

Share this post


Link to post
Share on other sites
Quote:
Original post by gohkgohk
Is it possible to make a custom type of integer that only let you to store integers between -10 & 10, so no space will be wasted?

Well, what type are you using now? With signed char, you'd only waste 3 bits.

Share this post


Link to post
Share on other sites
It's actually quite possible to create a new data type that will be constrained to an integral range and takes up a minimum memory footprint within certain constraints. You would do this through template magicery and operator overloading.

If you templateize you class on the range (min and max), yu can select one of the native C++ types for representation to minimize memory usage. Operator overloading is straightforward, but conversions between oter integral types could cause problems so that's where you would have to be very careful.

The real question is, is it worth all the time and trouble?

--smw

Share this post


Link to post
Share on other sites
Quote:
Original post by Bregma
If you templateize you class on the range (min and max), yu can select one of the native C++ types for representation to minimize memory usage.


See SiCrane's post:

Quote:
Original post by SiCrane
Then the answer is no. Data types in C++ must always have a size of at least 1, which is minimally 8 bits. It's impossible to create a data type smaller than this.


Even with template trickery, you'd still only be able to go as small as a char to represent your type, so it doesn't really add anything.

Share this post


Link to post
Share on other sites
Of course, it's often possible to pack *multiple* values into a single integer — see, for instance, std::vector<bool> which truly uses just a single bit for each bool stored, plus a bit of padding at the end of the vector.

Share this post


Link to post
Share on other sites
Quote:
Original post by DevFred
Well, what type are you using now? With signed char, you'd only waste 3 bits.


The information carried by the number is about 4.4 bits. Since storage quantum is 1 bit, the wasted space will be 0.6 bit, or about 12%.

The only way to avoid wasting space is through some form of compression.

Quote:
if i know that i will only store a list of integers between -10 & 10


Do you need to access this list? If not, run it through zlib, and it should come out close to 4.4 bits/value, probably even less.

Share this post


Link to post
Share on other sites
I think this is what Bregma had in mind. While you can't make a datatype smaller than a char, we can create an array to simulate it. If we extend the range to between -16 and 15 and assuming chars are 8-bits we can create an index system (the following is written to use a range of 0 to 31, it can be modified to use signed characters if desired):

struct smallIntArray
{
unsigned char * data;
unsigned int size;
smallIntArray(unsigned int size) : size(size);
{
data = new unsigned char[size*5 / 8 + 1];
}
~smallIntArray()
{
delete[] data;
}
// should overload operator[] for this;
unsigned char getInt(unsigned char x)
{
assert(x < size * 5 / 8+1);
unsigned int offset = x * 5 / 8;
if(0 == x % 8)
{
return (data[0+offset] & 0x1F);
}
else if(1 == x % 8)
{
return (data[0+offset] & 0xE0 >> 5) | (data[1+offset] & 0x03 << 3);
}
else if(2 == x % 8)
{
return (data[1+offset] & 0x7C >> 2);
}
else if(3 == x % 8)
{
return (data[1+offset] & 0x80 >> 7) | (data[2+offset] & 0x0F << 1);
}
else if(4 == x % 8)
{
return (data[2+offset] & 0xF0 >> 4) | (data[3+offset] & 0x01 << 4);
}
else if(5 == x % 8)
{
return (data[3+offset] & 0x3E >> 1);
}
else if(6 == x % 8)
{
return (data[3+offset] & 0xC0 >> 6) | (data[4+offset] & 0x07 << 2);
}
else if(7 == x % 8)
{
return (data[4+offset] & 0xF8 >> 3);
}
}
// should overload operator [] for this as well
void setInt(unsigned char value, unsigned int index)
{
// exercise for the reader
}
};


This is untested, but should give an idea how to implement it 'for real'. Template magic could allow for adjusting the math needed for different ranges, but I'm not knowledgable enough to do so. At least, I'm not going to do so whether or not I can figure it out.

Hope this helps.

Share this post


Link to post
Share on other sites
Quote:
Original post by PsyQo

If it's C/C++ you could use bitfields in a structure like this:

struct SmallIntEntry
{
int value1 : 5;
int value2 : 5;
int value3 : 5;
};

With correct alignment, sizeof( SmallIntEntry) will be 2 bytes for 3 values.
Creating a list of this will be kind of nasty though.
Thus will most likely use 4 bytes. changing int to short should do the trick though. But it's still up to the compiler, and it may align it to integer boundaries anyway.

Share this post


Link to post
Share on other sites
Sorry, I don't understand the problem.

100,000 [tt]int[/tt]s take 400,000 bytes, this will not only fit into every reasonably old computer's main memory easily, but will also fit entirely into every reasonably modern processors' L2 cache. Let's not even think about how much 400k affect today's multi-hundred-gigabyte hard disks if the data needs to be written to disk.
Using [tt]int[/tt] requires no extra code and is the most efficient way for the processor to access the data. Any technique to pack the data into a few less bits will considerably complicate everything and make accessing the data on the order of 5-20 times slower. Using non-native data sizes also pretty much rules out any later optimisations using SSE (at least, without additional troubles).

If anything, you might think about using [tt]short[/tt] instead of [tt]int[/tt], then the data will even fit into the L1 cache of most modern processors, which might possibly make it even faster.

Share this post


Link to post
Share on other sites
You can also try this:

An unsigned int can hold values from 0 to 4294967295
if you divide the number in parts, you can fit in there 5 numbers with 2 digits:

42|94|96|72|95

So, considering that -10 will be represented as 0 and 10 will be represented as 20 (and so on for other values), every int can store up to 5 of your variables.

If you had, for example, an array of 5 "-10 10 ints" instead of having:
{-10,1,5,-1,10} (20 bytes)
you would have:
0011150920 (4 bytes)

So, instead of having 100,000 ints you could have the same data in 20,000 ints.
It could be quite troublesome to write the data, but reading should be quite easy... (I'm to lazy to write code right now, but for reading some simple divisions should do the trick...).

Share this post


Link to post
Share on other sites
Quote:
Original post by samoth
Sorry, I don't understand the problem.

100,000 [tt]int[/tt]s take 400,000 bytes, this will not only fit into every reasonably old computer's main memory easily, but will also fit entirely into every reasonably modern processors' L2 cache.


Not every program written today runs on a computer. ;)

(Also, the 'tt' HTML tag is not reflected in the board code here; use raw HTML for it.)

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this