C's target isn't to be portable
``C has been characterized (both admiringly and invidiously) as a portable assembly language,'' -- Dennis M. Ritchie
At the time, porting code meant translating it to a different assembly language. C was invented so that all the common assembly instructions could be represented by high-level code that was still ambiguous enough to be "portable" to every different CPU.
"Portable" has come to mean many different things over the years, from this idea of not having to re-write your assembly code, to Java's "WORA" (or Write Once, Debug Everywhere)... For discussions on portability, you've really got to define your terms.
If someone says that C is more portable than C#, they're correct because for almost any embedded system that you want to hack, you'll likely be able to find a C compiler for it.
If someone says that C# is more portable than C, they're correct because you can run the same binary code on a Windows and Linux system without re-compiling.
Apples and oranges...
It's much easier to learn to deal with .NET's GC than to learn proper memory management in C++, whether done manually or through the 6-7 "smart" pointers available.
Yes, C# is more optimized for general programmer productivity than C++, not many would dispute that, but the topic isn't which one is easier.
Handling memory in C++ is complicated, and there's no 'right' way to do it.
IMHO, manual use of new/delete is bad, smart pointers are bad, and GC is also bad. The only method left that I like is scopes.
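To make "scopes" a bit more concrete, here is a minimal sketch of scope-based allocation on top of a linear arena. The names LinearArena and ArenaScope are made up for illustration, and restricting it to trivially destructible types is a simplification I've assumed to keep it short; a real version would also need to run destructors when the scope closes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical linear allocator over a caller-provided buffer.
// Allocation just bumps an offset; nothing is freed individually.
class LinearArena {
public:
    LinearArena(void* buffer, std::size_t size)
        : base_(static_cast<std::uint8_t*>(buffer)), size_(size), used_(0) {}

    // 'align' must be a power of two. Returns nullptr when the arena is full.
    void* allocate(std::size_t bytes, std::size_t align) {
        const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(base_);
        const std::uintptr_t cur = start + used_;
        const std::uintptr_t aligned =
            (cur + (align - 1)) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t newUsed = static_cast<std::size_t>(aligned - start) + bytes;
        if (newUsed > size_) return nullptr;
        used_ = newUsed;
        return reinterpret_cast<void*>(aligned);
    }

    std::size_t mark() const { return used_; }
    std::size_t used() const { return used_; }

    // Release everything allocated since 'm' was taken.
    void rewind(std::size_t m) {
        assert(m <= used_);
        used_ = m;
    }

private:
    std::uint8_t* base_;
    std::size_t size_;
    std::size_t used_;
};

// RAII scope: remembers the arena position on entry and rewinds on exit,
// so every object created through it dies together with the scope.
class ArenaScope {
public:
    explicit ArenaScope(LinearArena& arena) : arena_(arena), mark_(arena.mark()) {}
    ~ArenaScope() { arena_.rewind(mark_); }

    ArenaScope(const ArenaScope&) = delete;
    ArenaScope& operator=(const ArenaScope&) = delete;

    template <class T, class... Args>
    T* create(Args&&... args) {
        // Simplifying assumption: no destructors need to run on rewind.
        static_assert(std::is_trivially_destructible<T>::value,
                      "this sketch only supports trivially destructible types");
        void* p = arena_.allocate(sizeof(T), alignof(T));
        return p ? new (p) T(std::forward<Args>(args)...) : nullptr;
    }

private:
    LinearArena& arena_;
    std::size_t mark_;
};
```

The idea is that object lifetimes are tied to a scope (a frame, a level, a function call) instead of being tracked one by one, so there is no per-object delete, no reference counting, and no collector to budget for.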
In the last game I shipped, the GC (Lua, not C#) was consuming too much time per frame (or more to the point, it was consuming too much time every n frames). So, you introduce a limit on the amount of time it can use per frame, and move on... but then you realise that now garbage is building up every frame, and you run out of RAM and crash... so you have to go and fix your code to stop generating garbage, which means not creating objects.
So now the easy memory solution (GC) is out of bounds, and you're left struggling with all the "complex" methods that you use in unmanaged C++ anyway.
Performance on modern hardware is less about optimizing instructions and more about memory access patterns and data layout.
I think most game engines being written in C++ is simply because C and C++ are extremely portable languages.
Optimising for memory access patterns is the primary optimisation these days; both C++ and C# give you tools to do this, but IMHO, it's easier to perform these optimisations in C or C++, because you can treat RAM as a big array of bytes if you want to.
Every game console has a C/C++ compiler, whereas writing a console game in C# is a lot more of a hassle. This gives C++ a lot of inertia.
For an example of memory optimisation, I've got a "shader-collection" class, that has an operation to find the index of a particular "technique" object by name.
Inside this class's memory allocation (which is one contiguous block, for its members, and all sub-objects, and their members, and so on) there is an array of "string-offsets", which are a pair of a 16-bit hash, and a 16-bit offset value. You hash the input "name" string, then binary-search this array for a matching hash (the array is small and contiguous, so it is a linear, prefetchable block of memory). When you find a match, you've also found an offset ahead into the "shader collection"'s allocation, to a contiguous array of character data, where all the string data is actually stored; if a memcmp there matches, then you've found your index.
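Roughly, that lookup could be sketched like this. This is not the actual engine code: the blob layout (a 16-bit count, then the sorted table, then the names), the hash16 function, buildBlob and findIndex are all assumptions made up for the example, and I've assumed the technique index is simply the entry's position in the sorted table. It also reads the table with memcpy rather than raw casts, and compares with strcmp instead of memcmp because this sketch stores the names NUL-terminated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical 16-bit string hash: 32-bit FNV-1a folded down to 16 bits.
inline std::uint16_t hash16(const char* s, std::size_t len) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;
    }
    return static_cast<std::uint16_t>((h >> 16) ^ (h & 0xFFFFu));
}

struct StringOffset {
    std::uint16_t hash;    // hash of the name
    std::uint16_t offset;  // bytes from the start of the blob to the name
};
static_assert(sizeof(StringOffset) == 4, "table entries are packed 4-byte pairs");

// Assumed layout: [uint16 count][StringOffset x count, sorted by hash][names...]
inline std::vector<unsigned char> buildBlob(const std::vector<std::string>& names) {
    struct Item { std::uint16_t hash; std::string name; };
    std::vector<Item> items;
    for (const std::string& n : names) items.push_back({hash16(n.data(), n.size()), n});
    std::stable_sort(items.begin(), items.end(),
                     [](const Item& a, const Item& b) { return a.hash < b.hash; });

    const std::uint16_t count = static_cast<std::uint16_t>(items.size());
    std::vector<unsigned char> blob(sizeof(count) + items.size() * sizeof(StringOffset));
    std::memcpy(blob.data(), &count, sizeof(count));
    for (std::size_t i = 0; i < items.size(); ++i) {
        assert(blob.size() <= 0xFFFFu);  // offsets must fit in 16 bits
        const StringOffset e{items[i].hash, static_cast<std::uint16_t>(blob.size())};
        std::memcpy(blob.data() + sizeof(count) + i * sizeof(StringOffset), &e, sizeof(e));
        const std::string& n = items[i].name;
        blob.insert(blob.end(), n.c_str(), n.c_str() + n.size() + 1);  // keep the NUL
    }
    return blob;
}

// Returns the entry's position in the sorted table, or -1 if the name is absent.
inline int findIndex(const unsigned char* blob, const char* name) {
    std::uint16_t count = 0;
    std::memcpy(&count, blob, sizeof(count));
    const unsigned char* table = blob + sizeof(count);
    const std::uint16_t h = hash16(name, std::strlen(name));

    // Binary search for the first entry whose hash is >= h.
    std::size_t lo = 0, hi = count;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        StringOffset e;
        std::memcpy(&e, table + mid * sizeof(StringOffset), sizeof(e));
        if (e.hash < h) lo = mid + 1; else hi = mid;
    }
    // Walk the run of equal hashes; the string compare resolves collisions.
    for (std::size_t i = lo; i < count; ++i) {
        StringOffset e;
        std::memcpy(&e, table + i * sizeof(StringOffset), sizeof(e));
        if (e.hash != h) break;
        const char* stored = reinterpret_cast<const char*>(blob + e.offset);
        if (std::strcmp(stored, name) == 0) return static_cast<int>(i);
    }
    return -1;
}
```

The point of the layout is that the search only touches the 4-byte table entries, and the string data is only read once a hash has already matched.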
If you batch up these queries, so you're converting a collection of names into indices at once, then this entire data set can be made to be resident in the cache for the majority of the searches, speeding them up 100-fold. Moreover, the data is carefully arranged so that only data that is required by the current operation exists within a contiguous block, so there is minimal cache pollution.
C-style casting makes writing this kind of stuff really easy, while C#'s tools for programming at this level of abstraction are extremely verbose and complex.