C++ size_t for everything?

7 minutes ago, JoeJ said:

This is correct: for (whatEverType i=stuff.size(); i-->0;)

This is wrong: for (whatEverType i=stuff.size()+1; i-->0;)

You are right. I got a little confused by the fact that the unsigned version always needs to start with an index one higher than the signed version. However, the signed version has a -1 after the size:


for (int i = a.size() - 1; i >= 0; --i)
    std::cout << a.at(i) << std::endl;

So the unsigned version is as you said:


for (size_t i = a.size(); i-- > 0;)
    std::cout << a.at(i) << std::endl;

I think I am getting old ... -.-

 

28 minutes ago, Green_Baron said:

Haha, now, if that discussion is not an argument for signed integers in loops, just to avoid confusion, then I don't know what is ...

Just trying to be funny?

 

P.S.: Have you tried it with a std::array and the range-checking at(), counting down?

 

The thing is that always talking about this stuff confuses me more than actually working with it xD. I will continue using unsigned integers for values that are inherently non-negative. Usually, the compiler warns me about those signed/unsigned issues.
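
For example (an illustrative snippet, not from the original post), a forward loop with a signed counter compared against vector::size() is the typical case that trips a -Wsign-compare warning on GCC or Clang:


#include <iostream>
#include <vector>

void print_all(const std::vector<int>& v)
{
    // 'i' is signed, while v.size() returns an unsigned std::size_t,
    // so the comparison below usually triggers a signed/unsigned warning.
    for (int i = 0; i < v.size(); ++i)
        std::cout << v[i] << '\n';
}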

 

Greetings


Sorry if I sounded disrespectful; it wasn't meant that way. What I meant was that the discussion just shows the possible confusion. You have my respect.

------

After leaving the loop, the unsigned counter would still be wrapped around. That's no problem if it's defined locally in the loop, but if it lives outside for later use, one should keep that in mind ... I would argue that -1 is more intuitive than the maximum of the size type ...
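
A minimal sketch of what that means (hypothetical example): if the counter is declared outside the loop, it holds the wrapped-around value afterwards:


#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a{1, 2, 3};

    std::size_t i = a.size();   // lives outside the loop
    while (i-- > 0)
        std::cout << a.at(i) << '\n';

    // The loop exits only after i-- wrapped around from 0:
    // i now holds the maximum value of std::size_t, not -1.
    std::cout << i << '\n';     // e.g. 18446744073709551615 on a 64-bit system
}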

1 hour ago, DerTroll said:

I think I am getting old ... -.-

The unsigned version works for the signed case too, so memorizing just that makes it a bit easier because the code is simpler.

To be honest, it took me a long time to memorize this single line, so I no longer need to look up older code where I knew I had used it before. Seems our brains are not made for counting backwards, haha :)
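
For illustration, the same one-liner works unchanged with a signed counter:


#include <iostream>
#include <vector>

int main()
{
    std::vector<int> a{1, 2, 3};

    // Same pattern, signed type: start at size(); the first body iteration sees size() - 1.
    for (int i = static_cast<int>(a.size()); i-- > 0;)
        std::cout << a.at(i) << '\n';
}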

54 minutes ago, Green_Baron said:

I would argue that -1 is more intuitive than the maximum of the size type ...

Most people agree with this, and to avoid confusion and bugs, using signed types for indices seems to be generally preferred.

Personally, I only got used to unsigned numbers because some GPU API forced me to.

Similarly, std::size_t has to be unsigned, because one may use a huge array requiring the extra bit.

Still, if I can, I use signed types. And I conclude that disabling the warning is a lesser evil than messing with conversions or adding casts everywhere. I've never had an issue because of muting the warning, but I have had some in the other cases. :|

Edit: Seems a matter of personal preference. But we can choose; Java people cannot. I really hated its lack of unsigned types. :D

 

 

2 hours ago, Green_Baron said:

Sorry if I sounded disrespectful; it wasn't meant that way.

Don't worry, you didn't ;)

1 hour ago, JoeJ said:

Seems a matter of personal preference.

I think so too. Sure, working with unsigned types can lead to additional errors, but in the end, if you avoid the errors mentioned, it doesn't matter in most cases whether you use unsigned or signed ints.

Greetings

16-bit legacy from DOS and Windows 3.1
I see the use of size_t as poor hardware abstraction and one of the early non-deterministic mistakes made when C++ was originally designed. Unspecified behavior like this has caused countless software bugs in C++ but is hard to get away from because of outdated standard libraries and people who might call your functions with larger types. When computers were very slow, unspecified size made programs faster by not having to mask out the extra bits after a perfectly valid serial calculation. Remember how common overflow was on 16-bit systems when you didn't use a double-word long?

Modern hardware
With access to more bits than we actually use, unspecified size is no longer a performance gain, because we now have built-in SIMD ALUs for 8-, 16- and 32-bit vector operations that can finish 16 elements while the scalar processor is still on the first of many cycles for a single element. In isolation, you can increase the robustness of your code significantly by only using fixed-size integers. This also makes it easy to optimize by knowing the upper bounds for tight bit packing.
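
As a sketch of that approach, the exact-width types from <cstdint> make the bit count part of the type (note they are optional in the standard and exist only where the hardware supports them):


#include <cstdint>

// Exact-width integer types: ranges and wrap-around behavior are the same on every
// platform that provides them, which makes upper bounds explicit for bit packing.
struct PackedColor {
    std::uint8_t r, g, b, a;   // each channel is exactly 8 bits, 0..255
};

static_assert(sizeof(PackedColor) == 4, "four 8-bit channels should pack into 4 bytes");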

Dawoodoz, those are implementation-defined, not undefined or unspecified. Those words have meaning. You are partly right that it is about performance, but the performance comes from being able to choose something that fits the hardware rather than something that fits the language.

Early versions of Java suffered badly from that, with Java implementations split between doing what the Java language specified for math operations (especially floating point) and doing whatever the hardware did, even if it didn't quite match what the language specified. The Java language mandated exact reproducibility for floating point, which most FPUs couldn't deliver; Java's handling of infinity and NaN results differed from the IEEE hardware behavior, and there were differences with IEEE floating point traps (invalid operation, overflow, underflow, division by zero, and inexact results). Because Java over-specified the implementation of floating point, organizations in the late 1990s and early 2000s were forced to either use hardware-supported operations and fail Sun's Java certifications, or use slow software-based floating point computation and pass Sun's certifications. People doing advanced numerical processing were split on it; game developers hated the slow versions and wanted what the hardware already did.

In the case of size_t and similar types, the underlying hardware and the implementation details define the number of bits, the sizes of data types, and more. C and C++ both specify minimum sizes, but leave the exact sizes to the implementation so they can match the hardware.

When I first learned to program, we had an old PDP machine that operated on 18-bit words with 9-bit bytes, just as 16-bit machines typically use 8-bit bytes. C worked just fine on it, because C doesn't mandate specific sizes. If a language mandated that a byte held 256 values, then a device whose bytes held 512 values couldn't be used. Similarly, an int must hold at least 65536 values, but holding 262144 values is also legal. The implementation defines the range.
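
A small sketch of how to ask the implementation what it actually chose, using only standard headers:


#include <climits>
#include <cstdio>

int main()
{
    // CHAR_BIT must be at least 8, but may be larger (9 on the PDP-style machines above).
    std::printf("bits per char: %d\n", CHAR_BIT);
    // An int must cover at least -32767..32767; the implementation may provide more.
    std::printf("int range:     %d .. %d\n", INT_MIN, INT_MAX);
    std::printf("sizeof(int):   %zu\n", sizeof(int));
}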

Over the years I've worked with a bunch of hardware, including DSPs and FPGAs that did away with the old 8-bit concepts entirely. The data types char, short int, int, and long int were all 32-bit values. That is also covered by the language definitions, and a 32-bit char is perfectly acceptable in the language standard. This means that sizeof(char), sizeof(bool), sizeof(char16_t), and sizeof(char32_t) were all equal to 1.

Just like the other sizes, size_t is an implementation-defined type. It is defined everywhere, but the exact size is up to the compiler.

In the latest C it is defined as: "the unsigned integer type of the result of the sizeof operator".

In the latest C++ it is defined as: "an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object", with a recommendation that its integer conversion rank be no greater than that of signed long int unless a larger size is necessary to hold all the values.

Any other use is outside the standard. It may be valid in one compiler implementation but not valid in another. That's part of the joy of using implementation-defined behavior: it is up to the individual compiler implementation what it means, and it may or may not be the same on a different compiler.
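
The definitions quoted above can be checked directly; a minimal C++17 sketch:


#include <cstddef>
#include <type_traits>

// By definition, sizeof yields std::size_t ...
static_assert(std::is_same_v<decltype(sizeof(int)), std::size_t>,
              "sizeof yields std::size_t");

// ... and it is unsigned, but its width is up to the implementation.
static_assert(std::is_unsigned_v<std::size_t>,
              "std::size_t is an unsigned type");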

I have to say, the idea of a computer that doesn't support addressing at 8-bit byte boundaries horrifies me.

There are good reasons that it has stabilized where it is today, but that doesn't mean it will be that way forever.

Who knows what changes will be made with future computer advancements. Designing and building parallel algorithms is a major shift for people who focused on single-threaded processing; even concurrency is problematic for those game programmers who are light on theory.

When we get quantum computing, who knows what changes we'll see. It is better to learn theory, which is universal and applies regardless of architecture and language, than to focus on what happens to be commonplace today, only to be left in the dustbin of history tomorrow.

Not using size_t for the index results in one more line in the compiler's assembly output:
https://godbolt.org/z/ndwMQf
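
The linked comparison isn't reproduced here, but a pair of functions along these lines (hypothetical names) is easy to paste into Compiler Explorer; on a 64-bit target the 32-bit signed index often costs an extra sign-extension instruction before it can be used as an address offset:


#include <cstddef>

int sum_sizet(const int* data, std::size_t n)
{
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)   // index already matches the pointer width
        sum += data[i];
    return sum;
}

int sum_int(const int* data, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)           // 32-bit index, may need widening per access
        sum += data[i];
    return sum;
}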

