Found this article and thought I should share this with you guys:
http://sebastiansylvan.com/2015/04/13/why-most-high-level-languages-are-slow/
That jibes well with my experience working in Android. Dalvik's garbage collection is pretty bad, but it wouldn't be a problem if 'idiomatic' Java didn't allocate tiny objects pretty much everywhere.
As a concrete example, we were forced to remove every single instance of 'for each' loops over collections from our code base, because each such loop allocates a new iterator object every time it is entered, which then triggers garbage collection...
First, the elephant in the room: C++ is a high-level language. (Shock! Horror! Let's stop using it!) As is C. As is Pascal. As is COBOL. And the list goes on...
If someone implies that C++ isn't high level, but C# is... then chances are they're using a different definition than you are... so instead of shouting "WRONG!" at them, it's better to assume good faith and adopt their terminology for the purpose of understanding their point.
In school we first learnt that you had a hierarchy consisting of: machine language, assembly languages, high level languages, and domain specific languages.
Later we learnt to categorize them, taking classes in systems programming in C, or application programming in Java. You learn that Java is higher-level than C...
It's pretty obvious he's comparing C#/Java/Python-esque languages to C/FORTRAN/Pascal-esque languages.
He may have part of a point here simply because C# gives you less control than C++ - but that's kind of the point. You're letting the runtime determine more things for you because you think you can afford the potential cost as a tradeoff for development time and lack of bugs.
Just saying that productivity trumps performance is changing the subject, is it not? In a discussion about how language design affects performance, the impacts of your design choices in other areas are interesting, but not the topic at hand.
It's possible to come up with choices that let you have your productivity-cake and eat your manual-cache-performance too, like C#'s StructLayoutAttribute, but that then enters the argument about having to fight the language and use non-idiomatic means to regain said cake eating.
Can you write faster code in C++? Yes. If you have a few extra years.
You can also write faster code in assembler, but you don't see people screaming about how inefficient C++ is and telling everyone to use that, now do you?
It's not that simple.
C++ is a less productive (harder) language, so an inexperienced C# team might take longer to produce a product if forced to use C++, and their product may also be slower and buggier, than what they would've produced in C#.
Meanwhile, a team of trained C++ veterans might produce a product in C++ much faster than they would in C#, and also win on performance.
Just because C++ has lots of performance tools at your disposal, it doesn't mean that people use them correctly. Most people don't. But they're still useful tools for a craftsman to have at their disposal.
As for assembler, when doing systems/embedded programming, usually you have a very good idea of what the resulting assembly will look like. You basically ARE writing the program in assembly, using C/C++ as your assembler. If the compiled code doesn't line up with your expectations, then you tweak your code so that the compiler does produce what you expected of it. It's low-level thinking in a high-level language.
Moreover, the compiler is usually better at asm than us, so it takes our intentions and cleverly optimizes them better than we ever could, using its super-human knowledge of instruction scheduling, dependency chain tracking, register juggling abilities, etc...
Compilers are getting smarter all the time - and a higher level language is generally better at expressing programmer intent which allows the compiler to make optimizations it might otherwise have been unable to do.
My bolded statement above does not apply to the behavior of C#/Java/C++ code when it comes to smart data layout. C/C++ sucks at this too!
None of these languages have the power to automatically fix structural problems... which is kind of the point. There is no magic "fix my cache misses" algorithm.
It comes down to the language having tools to better allow you to describe the structure of the data.
Things like ispc's soa keyword, or the scope-stack allocation pattern, or Jai's "joint allocations", are, unfortunately, new language developments, but are ideas that have been developed manually forever (in languages that allow you to do such things manually, such as C)...
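To make the "describe the structure of the data" point concrete, here is a minimal C++ sketch of doing the structure-of-arrays transformation by hand, which is roughly what a keyword like ispc's soa would automate. The type and function names (ParticleAoS, ParticlesSoA, total_mass) are made up for illustration and are not from the article.

```cpp
#include <cstddef>
#include <vector>

// Array-of-structures: all fields of one particle sit together in memory.
struct ParticleAoS {
    float x, y, z;
    float mass;
};

// Structure-of-arrays: each field gets its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z, mass;

    void push(float px, float py, float pz, float m) {
        x.push_back(px);
        y.push_back(py);
        z.push_back(pz);
        mass.push_back(m);
    }
    std::size_t size() const { return mass.size(); }
};

// Only `mass` is needed, but x/y/z are pulled through the cache as well.
float total_mass(const std::vector<ParticleAoS>& particles) {
    float sum = 0.0f;
    for (const ParticleAoS& p : particles) sum += p.mass;
    return sum;
}

// Only the `mass` array is touched, so each fetched cache line is fully used.
float total_mass(const ParticlesSoA& particles) {
    float sum = 0.0f;
    for (float m : particles.mass) sum += m;
    return sum;
}
```

Both functions compute the same result; the difference is purely in how much unrelated data travels through the cache, and nothing in the language picks the better layout for you.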
As for memory management, the dichotomy of a C-style manual free-for-all vs a C#-style catch-all GC is false. There are so, so many ways to manage resources, but not that many with language support.
It's also a two-way street. I've shipped lots of games that use GCs, and had to deal with all the show-stopping bugs caused by the GC that delayed shipping dates.
e.g. the GC is consuming too much CPU time per frame, blowing the performance budget. Ok, do a quick fix by capping how long it can run per frame, easy! The memory usage is now growing steadily until we run out and crash... Fix it properly by fighting the language, doing a pass over the ENTIRE code-base, and micro-optimizing everywhere to eliminate temporary objects.
A lot of games I've worked on have also successfully used stack-based allocations entirely (no malloc/new/gc)... There are endless options here, 99% of which have no language support. The GC pattern isn't the one true solution to rule them all. The point is that high(er) level languages have already made more choices for you, deciding which tools you're allowed to use, and which ones of them are running at full voltage.
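As a rough illustration of what a stack-based allocation scheme can look like, here is a minimal C++ sketch of a bump allocator with a scope guard that rewinds it. FrameArena and ArenaScope are hypothetical names, and this is only one of many possible designs, not the one used in any particular shipped game.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-size bump allocator: allocating is a pointer increment, and
// everything is released at once by rewinding to a saved marker.
// Assumption: it only hands out raw memory, so it suits trivially
// destructible data (no destructors are ever run on the contents).
class FrameArena {
public:
    explicit FrameArena(std::size_t bytes) : buffer_(bytes), used_(0) {}

    // Returns nullptr when exhausted; no heap call after construction.
    void* allocate(std::size_t bytes,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        std::uintptr_t start =
            (base + used_ + align - 1) & ~(std::uintptr_t(align) - 1);
        std::size_t offset = static_cast<std::size_t>(start - base);
        if (offset > buffer_.size() || bytes > buffer_.size() - offset) {
            return nullptr;
        }
        used_ = offset + bytes;
        return buffer_.data() + offset;
    }

    std::size_t mark() const { return used_; }
    std::size_t used() const { return used_; }
    void rewind(std::size_t marker) {
        assert(marker <= used_);
        used_ = marker;
    }

private:
    std::vector<unsigned char> buffer_;
    std::size_t used_;
};

// Frees everything allocated from the arena during this scope.
class ArenaScope {
public:
    explicit ArenaScope(FrameArena& arena)
        : arena_(arena), marker_(arena.mark()) {}
    ~ArenaScope() { arena_.rewind(marker_); }
    ArenaScope(const ArenaScope&) = delete;
    ArenaScope& operator=(const ArenaScope&) = delete;

private:
    FrameArena& arena_;
    std::size_t marker_;
};
```

A typical use would be one ArenaScope per frame or per function: temporaries are carved out of the arena and vanish when the scope ends, with no per-object free and nothing for a collector to trace.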
The simpler a language is, the easier it is to take a different evolutionary path through the tree of data management ideas. New languages will hopefully be able to give us the productivity gains of C#/Java, but the efficiency that we currently can only achieve by doing manual data juggling in C... This advancement will be similar to the advancement from asm-programming to C-programming, but for structure this time, not logic.
Also, he calls out the .NET Native stuff and points out that while it'll help with instructions, it won't help with the layout of memory, and memory latency is a horrible problem which isn't getting better.
In fact, it's getting worse.
To use made up numbers - CPU performance doubles every 2 years, but memory performance doubles every 10. This means that after 10 years, your CPU is 32x faster, but your RAM is only 2x faster. If you normalize that for CPU power, what's happened is that your RAM has gotten ~94% slower!
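The arithmetic behind made-up numbers like these can be checked with a small sketch (helper names are hypothetical). A CPU that doubles every 2 years gains 2^5 = 32x in a decade. Memory that doubles every 10 years gains 2x, which leaves it at 1/16 of its former speed relative to the CPU; even a 10x memory gain would leave it at roughly 31%.

```cpp
#include <cmath>

// Speed-up factor after `years` if performance doubles every
// `doubling_period` years.
double growth(double years, double doubling_period) {
    return std::pow(2.0, years / doubling_period);
}

// Memory speed measured in units of CPU speed: 1.0 means memory kept pace,
// smaller values mean memory fell behind.
double relative_memory_speed(double cpu_growth, double memory_growth) {
    return memory_growth / cpu_growth;
}
```

For example, relative_memory_speed(growth(10, 2), growth(10, 10)) evaluates to 0.0625, and relative_memory_speed(32.0, 10.0) to 0.3125.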
Every year, instructions-per-memory-fetch just goes up and up. C#/Java were born out of the era of 90's programming when this issue wasn't as much of a big deal. This decade it's almost the defining problem of software performance... and none of our languages are built to deal with it.
Fun fact: C++ has garbage collection. Though it goes by a different name - "destructors".
Funnier fact: No, it does not, but one can be optionally plugged into it. By GC, people (i.e. Bjarne Stroustrup[1] or the BoehmGC authors[2], on their respective websites) mean a mechanism that automatically frees memory that is no longer accessible by the program being run. C++ does not have that, Java and C# do. You are the first person I've ever seen claiming that destructors == garbage collection.
1. http://www.stroustrup.com/bs_faq.html#garbage-collection
2. ( http://www.hboehm.info/gc/ links to) http://www.iecc.com/gclist/GC-faq.html#Common%20questions
I don't recall him saying he hates anything; he was just pointing out why languages like C# and Java, with their design choices, are slower than a language like C++, which pushes more control back to the developer.
More to the point he points out that for many people they either don't care or it isn't a problem.
And there is no getting around the fact he is right.
C# with its heap-for-all-the-things and various other things will cause you cache misses.
GCs can cause horrible problems with cache lines and unpredictable runtime issues.
At the end of the day, languages have their pros and cons; I see nothing 'wrong' in what he said, nor did he conclude that 'high level languages are bad', just that you should be aware of things and why they are as they are.
Destructors are deterministic garbage collectors in the sense that both automatically clean up resources when the program is done with them in a manner that the programmer doesn't have to directly manage (the compiler inserts calls, not the programmer).
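The "compiler inserts calls" part can be seen in a small C++ sketch. The Resource class and event_log helper are made up for illustration; the log simply records the order in which things happen.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records events so the order of acquisition and release is visible.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

class Resource {
public:
    explicit Resource(std::string name) : name_(std::move(name)) {
        event_log().push_back("acquire " + name_);
    }
    ~Resource() { event_log().push_back("release " + name_); }
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    std::string name_;
};

void use_resources() {
    Resource a("a");
    {
        Resource b("b");
    }  // the compiler inserts the call to b's destructor here
    event_log().push_back("after inner scope");
}  // ...and the call to a's destructor here
```

Calling use_resources() logs "acquire a", "acquire b", "release b", "after inner scope", "release a", in that order. The release points are fixed by scope, which is the "deterministic" part; whether that counts as garbage collection is exactly what the thread is arguing about.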
By your definition of a GC ("clean up resources so the programmer doesn't have to"), automatic reference counting used in Swift is also GC because it "just works", "compiler does it, not the programmer", yet you just said yourself that Swift has no GC.
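For comparison, reference counting can be sketched in C++ with std::shared_ptr. This is only an analogy for Swift's ARC, which differs in detail (the compiler inserts the retain/release calls), and the Texture type is hypothetical.

```cpp
#include <memory>

struct Texture {
    static int live;  // number of Texture objects currently alive
    Texture() { ++live; }
    ~Texture() { --live; }
};
int Texture::live = 0;

// Returns how many Textures are alive right after the last owner lets go.
int live_after_last_owner_leaves() {
    std::shared_ptr<Texture> first = std::make_shared<Texture>();  // count 1
    {
        std::shared_ptr<Texture> second = first;                   // count 2
    }                                                              // count 1
    first.reset();  // count 0: ~Texture() runs right here, not in a later pass
    return Texture::live;
}
```

The object is freed the moment the count reaches zero, without the programmer writing a delete. Unlike a tracing collector, though, plain reference counting never reclaims objects that refer to each other in a cycle.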
I still contend that if you're going to use a GC language, then you need to play nice with the GC
I'm curious if you have tried doing this on a large scale?
I've spent an awful lot of time the last couple of years refactoring swathes of Java code (or porting it to C++), to reach the 'zero allocations' bar that one absolutely must hit if one wants a fluid and responsive Android application. I'm not kidding about this - you will not be able to reliably hit 60fps if there are any allocations in your rendering or layout loops. And Java doesn't have struct types, so that means no short-lived compound types at all...
I'm trying to point out the argument cuts both ways. You can't say that "C# is slow by design" and ignore that the de-facto "high-performance" language (C++) also has slow features (e.g. virtual dispatch).
I would argue: "Everything on the heap" is at the core of C#. Virtual dispatch is not at the core of C++ in the same way.
As I already pointed out - C# does not require the heap for everything, and garbage collected (or at least loosely-pointed) memory has its own advantages (memory compaction).
It doesn't, but then you're fighting the language. Trying to manage an array of value types comes with significant limitations in C#. True though, memory compaction is a potential advantage. And in practice, if you use an array of reference types and pre-allocate all the objects at the same time, they tend to end up sequential in heap memory anyway - so that does mitigate some of the cache performance issues.