Microsoft Checked C

Started by
7 comments, last by Alberth 7 years, 10 months ago
Just thought I would share this project with you guys in case you had not come across it yet.

http://www.theregister.co.uk/2016/06/16/microsoft_releases_open_source_bugbomb_in_the_rambling_house_of_c/
https://github.com/Microsoft/checkedc/releases/download/v0.5-final/checkedc-v0.5.pdf
http://research.microsoft.com/en-us/projects/checkedc/default.aspx
https://github.com/Microsoft/checkedc

It looks really interesting. It basically extends the C language to include something similar to std::weak_ptr<T>. Even if a developer does not use C, it is still very relevant to them because 99% of languages are written in C, and often when using native libraries (such as OpenGL, SDL etc...) the developer will need a binding (again, written in C). Checked C has a reference implementation using clang and llvm. It basically solves the issue of C being so unsafe and archaic but at the same time a critical and necessary underpinning technology.

I developed a similar project called libstent (https://github.com/osen/stent) that tries to achieve the same thing but it uses MACRO hacks instead (however it does also provide exception safety using finalizers). However, having safe functionality built into the language itself is certainly nice to have.

What do you guys reckon? Would you throw away a bit of portability between C compilers to use Checked C?
http://tinyurl.com/shewonyay - Thanks so much for those who voted on my GF's Competition Cosplay Entry for Cosplayzine. She won! I owe you all beers :)

Mutiny - Open-source C++ Unity re-implementation.
Defile of Eden 2 - FreeBSD and OpenBSD binaries of our latest game.
No. This doesn't solve anywhere near enough real problems compared to just using modern C++.

It also most definitely does not include anything even close to weak_ptr. weak_ptr, shared_ptr, and unique_ptr solve ownership and lifetimes, which Checked C does not address at all. Modern C++ also has all the rest of Checked C's additions built in or easily buildable yourself (e.g. GSL's span, observer_ptr, non_null, etc.).
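To make the ownership point concrete, here is a minimal sketch in standard C++ of what weak_ptr actually does: it observes an object owned by a shared_ptr and reports expiry once the last owner is gone. The function name `weak_observer_expires` is made up for illustration, and nothing in it is Checked C.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: returns true if the weak_ptr reports that the
// object is gone once the last shared_ptr owner has released it.
bool weak_observer_expires() {
    std::weak_ptr<int> observer;
    {
        auto owner = std::make_shared<int>(42);
        observer = owner;                 // non-owning reference
        auto locked = observer.lock();    // temporarily share ownership
        assert(locked && *locked == 42);  // the object is alive here
    }   // owner and locked are destroyed, so the int is freed
    return observer.expired() && observer.lock() == nullptr;
}
```

The expiry tracking comes from the shared control block that shared_ptr allocates, which is a lifetime mechanism rather than a bounds check.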

Sean Middleditch – Game Systems Engineer – Join my team!

It also most definitely does not include anything even close to weak_ptr.


Are you sure? When reading it through, it looked like ptr<T> was designed to NULL out when the original data was invalid. That is the functionality from std::weak_ptr<T> that would be very nice.

No. This doesn't solve anywhere near enough real problems compared to just using modern C++.


C++ can't solve this issue because it is itself based upon potentially dangerous C libraries. As a library consumer, C++ (even Java and C#) seems better than C, but remembering that the same dangers can (and almost certainly do) still lurk underneath all their hoods removes all the fun ;). Plus, these languages end up running more C than a C program does, due to their additional layers written in C, so they are still not ideal.

Checked C aims to remove these issues at a much lower level than possible by just bolting on another random language.


The paper says it is about bounds-checking. A weak_ptr has nothing to do with bounds-checking. The introduction is entirely about bounds-checking. It mentions that problems such as using already-deleted memory are beyond the scope of Checked C.

C++ can at least partially solve the issue, because you can use templates to create the checked pointer classes. That's what the team who made this did as a prototype.
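As a rough idea of what such a template could look like, here is a minimal sketch of a pointer that carries its bounds and checks every index. The names `checked_array_ptr` and `sum_checked` are hypothetical and are not taken from the Checked C prototype, and throwing an exception on failure is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked view over an existing array.
// It does not own the memory and says nothing about its lifetime.
template <typename T>
class checked_array_ptr {
public:
    checked_array_ptr(T* data, std::size_t count) : data_(data), count_(count) {
        assert(data != nullptr || count == 0);  // a null view must be empty
    }

    // Every access is validated against the bounds stored with the pointer.
    T& operator[](std::size_t index) const {
        if (index >= count_) {
            throw std::out_of_range("checked_array_ptr: index out of bounds");
        }
        return data_[index];
    }

    std::size_t size() const { return count_; }

private:
    T* data_;
    std::size_t count_;
};

// Sums a buffer through the checked view instead of a raw pointer.
int sum_checked(int* values, std::size_t count) {
    checked_array_ptr<int> view(values, count);
    int total = 0;
    for (std::size_t i = 0; i < view.size(); ++i) {
        total += view[i];
    }
    return total;
}
```

A library wrapper like this only checks at run time; it cannot give the compile-time diagnostics a language extension could.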

Just building with Checked C instead of C gets you nothing. You have to use the new types. Converting one of those potentially dangerous libraries to Checked C means putting in the correct changes to use bounds-checking everywhere they would belong, and making no mistakes or oversights while doing so. As long as you're rewriting chunks of the library, why not rewrite them in C++? What's the difference? How does the library go from potentially dangerous to knowing it's safe? How do you know you did the conversion correctly? How do you know the library isn't dangerous because of something that wasn't covered, like memory allocation?

If, for some reason, you have a code base that's in C, and you're willing to rewrite every part of it that uses pointers, and compiling it in C++ scares you and/or you're not willing to create the helper templates yourself, but compiling it in a new version of C from Microsoft is something you're willing to do, and you're having trouble with buffer overflows, and compile-time checks will help or run-time crashing is ok, then sure, Checked C might solve a problem for you.

What do you guys reckon? Would you throw away a bit of portability between C compilers to use Checked C?

Honestly, I'd just use C++. I would welcome this work being looked at for inclusion in the next C standard, but I suspect that would be unlikely. On most platforms today C++ is just as available as C -- even on something as tiny as certain varieties of 8-bit microcontrollers you can do C++ with free and open toolchains. Only relatively few (and in general, esoteric, legacy, or both) platforms that support C don't support C++; usually those are the ones with proprietary toolchains.

Between the family of C++ smart pointers and the work going on around the C++ Core Guidelines, with Microsoft providing a proof-of-concept implementation (and working with Bjarne Stroustrup), C++ is already meant to solve many of these things.

Or, there's the Rust language, which has commercial support from Mozilla and a great community, and which has a great C-linkage story + works even on bare-metal/embedded platforms.

It's a decent enough idea, but I don't see Checked C gaining any level of real support.

throw table_exception("(ノ ゜Д゜)ノ ︵ ┻━┻");

Just thought I would explain my interest in Checked C a little bit more. I am not here to recommend writing a game in it, I am suggesting that this could be a great technology to use for the parts of software that are currently needing to be written in ANSI C. For example things like SDL, Mesa, glew, glut, compilers etc...

No matter how much we (effectively the consumers in this context) prefer our own different languages, sooner or later we all depend on a C library that could potentially be made safer by using Checked C.

So things like Rust and C++ would still potentially benefit from Checked C. And that I think is pretty cool.

Rust is written in ~1.6% C (which is a massive amount of code, larger than many games in fact)
The Clang compiler is written in ~21.6% C (again massive and that's forgetting the standard library wrapping a lot of C)

If Checked C can alert us to bugs in these projects, then everyone's a winner :)
Very little of any software "has to be" written in C anymore.

Entire huge games have been shipped with no C in them, for example. Of the things you listed, the C implementation is usually for compatibility with non-C languages, not to espouse C directly!

I have an entire language and compiler toolchain written with 0 dependencies on raw C. It has support for speaking the C ABI, but that has nothing to do with the C language being a mandatory component of anything.

The pieces of software that are in C are usually some of the most hardened and well-tested portions, for several reasons - not least of which being the fact that they usually exist to interface with non-C languages.

The effort of rewriting C code into Checked C is nontrivial. The reward is marginal at best for already-bulletproofed code. I just don't see a compelling reason for adoption.


It's a nifty project from a languages standpoint, but I don't see it being revolutionary.

Wielder of the Sacred Wands
[Work - ArenaNet] [Epoch Language] [Scribblings]


And of course, if it "has to be" written in C, you cannot write it in Checked C.

The whole thing has bothered me from the start of this thread. I have had this little voice in the back of my head "it's the embrace and extend trick! watch out!", but while it's definitely a possibility, let's assume they really want a better C.

Philosophically, the C language trusts the programmer, and in my view, this is why C is so popular. I get total control over the code. In low-level code or really high-performance code, I want that control, since I am in a hurry. The compiler shouldn't add extra stuff that I didn't program. C in itself is not a nice language for building really large and complicated programs[1]. Like others have said already, unless you need that much control, other languages are often better for the task.

Now Checked C breaks that idea. The compiler adds bounds-checking code at every point where I use a special type (as far as I understood). So even if I have thought super long about some point in the code, and it's really fine, and I wrote a proof, published it, and got a Nobel prize for it, even then the compiler can decide to add a check, since it didn't read my proof.

This goes so much against the idea behind C, that in my view, calling it <anything>-C, is just plain wrong.

To me, this makes the name "Checked C" just a cheap marketing trick to position the language.

Then the implementation. You make a new language ("embrace and extend C"!). As others have said, that defeats the purpose. The current eco-system is C. If you want to improve C code, making a new incompatible language doesn't work. People are not going to re-implement the zillions of C code lines, unless MS starts requiring that from their suppliers. "All new drivers must be in Checked C". Well, at least that would work for embrace and extend :P

I think a much better approach is to give the programmers more information and to educate them. They don't intentionally add exploits, they just fail to see the possibility. Build a really good analysis tool to point out the problems. Give warnings when you don't find counter-proof. Write articles about "better safe than sorry", "reliability is more important than performance" (most times), and how to minimize impact on performance. Show that with good branch prediction, it's close to 0, so there isn't much reason not to do it. It would just fit in the current eco-system, and educate programmers about today's CPUs, and customer needs (which unlike what we believe is mostly not about performance, I think).

This would however be a much larger effort, and possibly have less impact, than bluntly just adding checks everywhere, and destroying some performance without actual need.

Maybe we should give up on that total control, and move on beyond C. I don't expect that to happen for another 20-30 years, tbh.

In that case however, "Checked-C" is a wrong name too :)

[1] Yes exceptions exist, both good ones and bad ones.

This topic is closed to new replies.
