C++ size_t for everything?

Started by
30 comments, last by frob 4 years, 5 months ago

Hi everybody!
Should we use size_t for everything?

That means:
- size/length/capacity
- function parameters for index
- all variables related to array index
- return of function when it's an index

Thanks for your feedback on this topic!


A semantic question, isn't it?

OpenGL sometimes uses signed integers where i'd intuitively choose an unsigned, so to avoid warnings or casting we might want to submit to the circumstances and use an int, too?

When looping around to fill arrays or buffers and the like we sometimes count down to >=0, an unsigned would be suboptimal then ...

What about choosing the datatype that demands the least type casting in a given situation?

Dunno ... i'd say no, not for everything. But for sizes, yes.

Imagine:

10 "guidance is internal" .. 8 .. (and so on) .. 1 .. 0 .. 65535 .. 65534 ...

 
 
 
 
2 hours ago, Green_Baron said:

When looping around to fill arrays or buffers and the like we sometimes count down to >=0, an unsigned would be suboptimal then ...

Nope, just use:


for (size_t i = x + 1; i-- > 0; )
...

as an alternative for


for (int64_t i = x; i >= 0; --i)
...
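
To make the difference concrete, here is a minimal sketch of the countdown idiom applied to a real container. The helper name reversed_copy is made up for illustration. The point is that the condition tests i before decrementing it, so the loop stops at zero, whereas a condition like i >= 0 is always true for an unsigned counter and never terminates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: visits indices n-1, n-2, ..., 0 with an unsigned counter.
// `i-- > 0` compares the old value of i, then decrements, so the body sees
// i-1 and the loop ends cleanly instead of wrapping around past zero.
std::vector<int> reversed_copy(const std::vector<int>& in)
{
    std::vector<int> out;
    out.reserve(in.size());
    for (std::size_t i = in.size(); i-- > 0; )
        out.push_back(in[i]);
    return out;
}
```

Starting at in.size() instead of x + 1 also sidesteps the corner case where x is already the largest representable value.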

 

Regarding the topic:

What I don't like about size_t is that its memory footprint varies with your build setup. As a programming control freak, I only use size_t when it is necessary, mostly when messing around with STL container sizes and raw memory. For everything else, I think about the expected value range and pick one of the following typedefs:

 


#include <cstdint>

using I8 = std::int8_t;
using I16 = std::int16_t;
using I32 = std::int32_t;
using I64 = std::int64_t;

using U8 = std::uint8_t;
using U16 = std::uint16_t;
using U32 = std::uint32_t;
using U64 = std::uint64_t;

 

If there is no special requirement I mostly stick to 32-bit integers (loops, indexing, etc.). The value range is large enough in most cases, and if your compiler is able to vectorize your code it can be up to twice as fast as with 64-bit integers (or size_t in a 64-bit environment), since twice as many 32-bit values fit into one SIMD register. However, for the things you mentioned, I think there is no real drawback in using size_t.

Only if memory footprint or (vectorized) performance becomes important should you think about alternatives.
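
A rough sketch of the footprint argument, assuming 8-bit bytes and a 128-bit SIMD register (both are assumptions for illustration, as are the function names). It only shows the sizes involved, not measured performance.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// The fixed-width aliases have the same width on every platform that provides them.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is exactly 64 bits");

// size_t follows the build target: typically 32 bits on x86, 64 bits on x64.
constexpr std::size_t size_t_bits() { return sizeof(std::size_t) * CHAR_BIT; }

// Hypothetical figure: how many elements of type T fit into one 128-bit
// (16-byte) SIMD register. More lanes means more elements per instruction.
template <typename T>
constexpr std::size_t lanes_per_128bit_register() { return 16 / sizeof(T); }
```

With these assumptions a register holds four 32-bit integers but only two 64-bit ones, which is where the factor of two comes from in the best case.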

Greetings

As a very picky programmer, I don't use size_t but define my own one. I think it is important to have different sized types because of the difference between x86 and x64 systems. While an x64 system can address a far larger range of memory than even an unsigned 32-bit integer can cover, you are limited to exactly that range on an x86 system. The same is true for the memory management APIs that the OS provides and that aren't hidden behind any library.

So what I do in the numerics.h header file of my SDK is define my own type as something that can store at least the size of a pointer, and conditionally make size_t an alias of the same size:


typedef SE::TypeTraits::TypeSelector<sizeof(void*), Unsigned>::Result varying;
typedef SE::TypeTraits::TypeSelector<sizeof(void*), Unsigned>::Result uptrint;
typedef SE::TypeTraits::TypeSelector<sizeof(void*), Signed>::Result ptrint;

#if !defined(_SIZE_T_DEFINED)
#define _SIZE_T_DEFINED 1
typedef varying size_t;
#endif

My classes like Array.h then work with the varying type to handle platform-dependent sized integer values. However, the only cast I perform is from varying to size_t when targeting an OS API that requires it.

EDIT: The TypeTraits::TypeSelector declaration is template metaprogramming that selects, from the provided list, a platform-dependent integer type that fits the desired size, or fails with a compiler error. So an int32, for example, is guaranteed to be signed and to cover (at least) 32 bits.
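
For readers who have not seen this kind of selector, here is one way it could look using only the standard library. This is a guess at the idea, not the actual SE::TypeTraits code: the struct below, its two tag types and the restriction to 4 and 8 bytes are all assumptions made for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for the tags and the selector described above.
struct Signed {};
struct Unsigned {};

template <std::size_t Bytes, typename Sign>
struct TypeSelector
{
    static constexpr bool is_signed = std::is_same<Sign, Signed>::value;

    // Fail at compile time if no candidate has the requested size.
    static_assert(Bytes == 4 || Bytes == 8, "no integer type of the requested size");

    using Result = typename std::conditional<
        Bytes == 4,
        typename std::conditional<is_signed, std::int32_t, std::uint32_t>::type,
        typename std::conditional<is_signed, std::int64_t, std::uint64_t>::type
    >::type;
};

// Pointer-sized integers, chosen per build target.
using uptrint = TypeSelector<sizeof(void*), Unsigned>::Result;
using ptrint  = TypeSelector<sizeof(void*), Signed>::Result;

static_assert(sizeof(uptrint) == sizeof(void*), "uptrint matches the pointer size");
static_assert(std::is_signed<ptrint>::value, "ptrint is signed");
```

Where available, the standard aliases std::uintptr_t and std::intptr_t from <cstdint> cover the same pointer-sized use case.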

14 hours ago, Green_Baron said:

When looping around to fill arrays or buffers and the like we sometimes count down to >=0, an unsigned would be suboptimal then ...

This works no matter if signed or unsigned, and also it looks cool :) :


for (unsigned int i=x; i-->0; )

 

Adding to the question, i use int for iterating std::vectors, because mostly i need the index so iterators feel cumbersome, and i do indexing math with int.

To get rid of the warnings i use #pragma warning ( disable : 4267 ) // 'argument': conversion from 'size_t' to 'const int32_t', possible loss of data

But this looks like bad practice. If anyone knows something better let me know.
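
One possible alternative to disabling the warning globally is to do the narrowing explicitly in a single small helper. The names isize and sum_neighbours below are invented for this sketch, and the assert only catches oversized containers in debug builds.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the element count as int, so the conversion
// from size_t happens in exactly one visible place.
template <typename Container>
int isize(const Container& c)
{
    assert(c.size() <= static_cast<std::size_t>(INT_MAX)); // debug-only range check
    return static_cast<int>(c.size());
}

// Example use: the index math stays in int, with a cast only at the subscript.
int sum_neighbours(const std::vector<int>& v, int i)
{
    int total = 0;
    for (int k = i - 1; k <= i + 1; ++k)
        if (k >= 0 && k < isize(v))
            total += v[static_cast<std::size_t>(k)];
    return total;
}
```

If C++20 is an option, std::ssize from <iterator> returns the size as a signed type and serves a similar purpose.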

I use std::size_t where the standard library uses std::size_t, for consistency. For anything involving actual arithmetic, I prefer to use signed types. It's just so easy to accidentally underflow an unsigned variable.
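
A small sketch of how that underflow typically sneaks in. Both function names are made up for the example, and the cast in the signed variant assumes the size fits into std::ptrdiff_t.

```cpp
#include <cstddef>
#include <vector>

// Pitfall: on an empty vector, size() - 1 wraps around to the largest
// size_t value, so this "last index" is huge rather than -1.
std::size_t last_index_unsigned(const std::vector<int>& v)
{
    return v.size() - 1;
}

// Signed variant: convert first, then subtract. -1 now means "no last element".
std::ptrdiff_t last_index_signed(const std::vector<int>& v)
{
    return static_cast<std::ptrdiff_t>(v.size()) - 1;
}
```

The unsigned version compiles without a warning, which is what makes this mistake easy to miss.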

Ha, since everyone is riding that wave: sure, i am aware of a bunch of solutions to the "problem". But i have just recently committed such an error: as a result of a refactoring i overlooked a comparison and the respective compiler warning.

So, apparently i am in no position to offer a general statement on do or don't use this or that datatype. Personally, i prefer one of the intx_t/uintx_t types and size_t when that type is expected/returned. I don't use it generally.

Yes. std::size_t is always the appropriate type to express sizes.

An index is not a size, it's a difference.  It should be expressed as a signed integer.

If you need a specifically-sized integer value, use a specifically-sized integer value. You rarely need such a thing, except for marshalling/unmarshalling data (which can include file I/O, network I/O, and communications with a coprocessor through an API).

Stephen M. Webb
Professional Free Software Developer

5 hours ago, Bregma said:

An index is not a size, it's a difference.  It should be expressed as a signed integer.

That's contrary to the choices that have been made by the designers of the standard library: vector::at and the similar operator[] both take size_type, which is an unsigned integer type.
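
Both views can be checked against the container's own typedefs. The sketch below uses std::vector<int> as an example; the alias Vec and the function offset_of are just names picked for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

using Vec = std::vector<int>;

// Sizes and subscripts use size_type, which is unsigned.
static_assert(std::is_unsigned<Vec::size_type>::value, "size_type is unsigned");

// Distances between iterators use difference_type, which is signed.
static_assert(std::is_signed<Vec::difference_type>::value, "difference_type is signed");

// Offset of `it` from the start of `v`, expressed in the signed type.
Vec::difference_type offset_of(const Vec& v, Vec::const_iterator it)
{
    return std::distance(v.begin(), it);
}
```

So the standard library uses an unsigned type for operator[] and at, but a signed one wherever two positions are subtracted.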

 

12 hours ago, Bregma said:

If you need a specifically-sized integer value, use a specifically-sized integer value. You rarely need such a thing, except for marshalling/unmarshalling data (which can include file I/O, network I/O, and communications with a coprocessor through an API). 

Specifically sized values are also helpful if you're targeting more than one platform, because when a variable isn't the size you expect it to be, it can occasionally cause subtle bugs.

In addition, some style guides (e.g. Google's) suggest avoiding unsigned types where possible, as they can cause bugs if you're not careful (e.g. subtracting two unsigned integers silently wraps around to a huge value whenever the mathematical result would be negative).

This topic is closed to new replies.
