Quote:Original post by Decrius
Prove me wrong then; I like to see facts, not opinions.
Some time back, I had to write a web page analysis system, and I used C. Needless to say, I spent an entire month getting all the string manipulation to work correctly while eliminating any overhead I could identify. Then, I ran the stuff, and although it seemed to run pretty fast in the beginning, after a day of running it simply ground to a halt. I didn't have time to optimize it, so I killed the process and used the partial results.
What I learned later was that someone else on the same team had to do work quite similar to mine. But instead of using C, they used an in-house scripting language that compiled to Java bytecode, with bindings to the same C library I was using. They finished coding all the algorithms in a single week (because they didn't have to care about memory management or buffer overflows, they had access to sane string manipulation operators, and the scripting language had functional programming support) instead of a full month like I did. The ability to write "a + b" instead of "measure, allocate, strcpy, strcat, deallocate, return" does give you that kind of productivity gain. And believe me, when you're dealing with web pages, you don't want to be using fixed-size buffers.
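To make the contrast concrete, here is a minimal sketch; the in-house language from the story isn't public, so OCaml stands in for the high-level side (an illustration, not the project's actual code):

(* Sketch: string handling in a garbage-collected language with real
   string operators, standing in for the in-house language above. *)
let greet name = "Hello, " ^ name ^ "!"

let () = print_endline (greet "world")

(* The equivalent hand-written C is the dance described above:
   strlen both parts, malloc strlen(a) + strlen(b) + 1 bytes,
   strcpy the first part in, strcat the second, and remember to
   free the buffer later. Forget the +1 for the terminator and
   you have a buffer overflow. *)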
Then they ran it. The program ran slowly overall, and ground to a halt after one day. They profiled it, found a bottleneck in one of the hash tables, and removed it by clearing the hash table every 10,000 pages. The program still ran slowly, but it kept a constant speed and delivered the full results after another day of computation. Looking at their profiling results some time later, I noticed that my program had hit the same bottleneck; applying the same fix to my program (they told me, once I had left the team) did indeed solve its speed problem as well.
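The fix, by the way, really does fit in one statement wherever the table is used as a cache. A minimal sketch in OCaml rather than the in-house language, with a hypothetical table keyed by URL (the names are invented for illustration):

(* Minimal sketch of the one-statement fix, assuming a hash table
   used as a cache keyed by URL; the names are invented here. *)
let seen : (string, int) Hashtbl.t = Hashtbl.create 1024

let process_all pages =
  List.iteri
    (fun i url ->
       (* The fix: empty the table every 10,000 pages so it never
          degenerates into the bottleneck described above. *)
       if i mod 10_000 = 0 then Hashtbl.clear seen;
       Hashtbl.replace seen url i)
    pages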
The point of this comparison is that a hand-written C program took a month to complete and was too slow to provide full results, whereas a similar program written in a scripting language delivered the full results in two weeks. The entire time I spent writing careful C string manipulation code improved a part of the program that accounted for only 1% of the execution time, while 99% of the execution time was (unpredictably) concentrated in the hash table processing. Moreover, a single statement was all it took to obtain a drastic increase in performance, where all my overhead-removal code had failed.
In short: obtaining a performance increase in C isn't free; you have to spend time and effort to make it happen. Had that time been spent working on algorithmic bottlenecks instead of memory allocation handling, the program would in many cases have been faster.
Quote:Quote:Original post by ToohrVyk
The quality of a programming language is defined by what this language prevents you from doing. A language which would let you do anything you want would be, by its very design, a bad language.
May I ask you why quality is connected to what a language is not?
There are two main schools of thought on what makes a programming language great. The first school of thought, the older of the two, believes that programmers are close approximations of perfect beings with infinite knowledge, infinite brain power, and the ability to concentrate on an infinite number of different concepts at the same time, and that great results will be achieved by giving them maximum freedom. This is generally summarized as "the programmer knows what he is doing". This school of thought thus proposes that a perfect language would be expressive enough to let the programmer do whatever he wishes to achieve the end result.
This leads to languages such as C, C++, PHP or LISP: broad enough to let you do whatever you wish, even if it involves shooting yourself in the foot as soon as you forget even the tiniest detail.
The amount of foot-shooting led to the appearance of a second school. This school of thought believes that, although truly exceptional programmers exist, the vast majority of programmers are in fact frail and fragile creatures with limited knowledge and limited concentration skills. A system with infinite freedom thus has two consequences: first, since there are several equally difficult ways of achieving any result, programmers tend to choose different ways and then have trouble understanding each other's code; and second, since programmers are prone to making mistakes, freedom is more of a downside than an advantage. This school of thought therefore believes that a language should shoehorn the programmer into a simple mindset by restricting his choice of actions to a subset of what the technology would allow him to do, in a way that makes most foot-shooting activities unthinkable.
This leads to languages such as early Java, XSL, Haskell or OCaml: these force you into a specific way of developing, and prevent many of the classic mistakes that come from the languages with more freedom of action.
I used to belong to the first category of people. I saw myself as an infinitely competent programmer, able to notice all the bugs that appeared whenever I happened to make them. And then, I got a job. My job involved developing a PHP content management system, and I had to report my project's status (with a prototype demo) every week. While I do consider myself a good developer, the number of errors that can happen with PHP is staggering: sure, the absence of static types makes the language more expressive than about every single other language except Java and C# (and even then, object can only get you so far), but that usually means you can express original and unusual errors in the language as well. Also, the vast freedom on offer had to be restrained: the contract specified that the code had to be readable, so any clever manipulations involving PHP oddities were out of the question. Shortly after that, I came to appreciate the calm safety of restrictive languages.
OCaml is my personal favorite in this respect. See, most mistakes in code come from trying to do X but achieving Y instead. This, in turn, derives from the fact that the code for doing Y is very similar to the code for doing X. In a language such as OCaml, the structure of the code is such that the code for doing Y is, most of the time, extremely different from the code for doing X. The only exception here would be constants, where it's easy to use one constant instead of another or to commit off-by-one errors, but the standard library of the language (and, more generally, the philosophy of the language) tends to prevent such things.
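A classic instance of X-versus-Y is assignment against comparison. In C, "if (x = 0)" and "if (x == 0)" differ by one character and both compile; in OCaml the two actions don't even look alike, and mixing them up is a type error rather than a silent bug. A small illustrative sketch, not from any particular codebase:

let x = ref 0

let check () =
  if !x = 0 then print_endline "zero"  (* comparison: = on ints *)

(* Writing [if x := 0 then ...] instead would be rejected outright:
   [x := 0] has type unit, and an [if] condition must be a bool. *)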
The core reason for this is algebraic types. Consider for a short moment the C type system:
type = basic-types | structures | type [N] | type * | type (*)(type)
Out of these, only the basic types, structures and pointers are used frequently in average C code. Function pointers do appear sometimes, but they are usually kept extremely simple. In short, it's very difficult to end up with many different types in C. Now, consider the OCaml type system:
type = basic-types | structures | type * type | type -> type | type type
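Concretely, each combinator in that grammar reads like this (illustrative declarations only):

(* Illustrative uses of each combinator in the grammar above. *)
type point = { x : float; y : float }   (* structure (record) *)
type pair = int * string                (* product: type * type *)
type handler = int -> unit              (* arrow: type -> type *)
type names = string list                (* application: type type *)
type shape =                            (* an algebraic (sum) type *)
  | Circle of float
  | Rectangle of float * float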
First, unlike in C, types tend to be much deeper in OCaml. It's certainly not unusual to have a (string list, int option) Hashtbl.t list lying around (if you're wondering, that would be the data structure for supporting "undo" in a virtual file system). You also have more combinators, especially ones that involve more than one type. In the end, this means that an average OCaml program will involve an order of magnitude more types than a C program. Thus, a subtle mistake in the code will result in a type mismatch with a much higher probability than in C code.
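To make that last claim concrete, here is a small sketch built around the type quoted above; the undo stack and its helper are invented for illustration:

(* With deep types, grabbing the wrong level of a structure is a
   compile-time type error. [undo_stack] mirrors the type quoted
   above; everything here is invented for illustration. *)
let undo_stack : (string list, int option) Hashtbl.t list = []

let latest_table () =
  match undo_stack with
  | [] -> None
  | table :: _ -> Some table

(* A subtle mistake, such as passing the whole list where a single
   table is expected, as in [Hashtbl.length undo_stack], is rejected
   by the type checker with a mismatch between
   [(string list, int option) Hashtbl.t list] and [('a, 'b) Hashtbl.t].
   The C analogue, one void* handed over in place of another, would
   often compile without a word. *)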