Question
Why is C++ used at all now that a faster language (Go) has been created?
Just wanted to throw in two cents about binaries from Go supposedly being faster than those from C++. Recently a C++ compiler made my jaw drop. At my job I heard a riddle: would a modern C++ compiler, at a sufficient optimization level, be able to convert this recursive function into a loop? (AFAIK most languages' compilers would not.)
int sum(int n) {
    return n ? sum(n - 1) + n : 0;
}
Generally the answer is yes: gcc and icc will turn it into a loop and, with extra optimization, unroll it a few times. But then we tried clang 5.0, and it produced the arithmetic-series summation formula, making this function O(1)!
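For reference, clang's output is roughly equivalent to the closed form below (a sketch of what the generated code computes, not the literal assembly; the exact expression clang emits may be shaped differently, e.g. to handle n = 0):

int sum(int n) {
    // Gauss's arithmetic-series formula for 0 + 1 + ... + n,
    // evaluated in constant time instead of n recursive calls.
    return n * (n + 1) / 2;
}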
One could argue that this is just a cheap trick… The thing is, there could be thousands of such “cheap tricks” in C++ compilers, and it takes long years of well-planned development to get them into production.
Answer
I “grew up” with optimizing compilers. At one point, Carnegie Mellon University's Computer Science department had one of the most advanced optimizing compilers of its day (the 1970s). But I've read the code emitted by GNU and Microsoft C++, and today's compilers make our efforts look like a good amateur attempt at optimization. Current compilers can compile 50,000 lines per minute (at least!) and produce code with inline expansion of functions, interprocedural optimizations, and mind-boggling optimization techniques.
I once read code in which function A called function B, but instead of passing parameters, the Microsoft C++ compiler determined that some of the inputs were constants or pass-by-value arguments that were never modified; so instead of pushing parameters onto the stack and calling function B, function A preloaded some registers with values and BRANCHED INTO THE MIDDLE of function B's code. The registers were precisely correct, and it avoided a hundred or more instructions that were unnecessary because the inputs were known to be read-only. I would fire someone who did that in assembly code, but the compiler gets away with it because it re-validates the optimization against every source change.
I've seen the Microsoft C++ compiler collapse multiply-instantiated templates into a single code body. I've seen output from GNU gcc that makes my head hurt trying to understand it, but which is perfectly correct.
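As a toy illustration of that family of interprocedural optimizations (my own example, not the code described above): with optimization enabled, mainstream compilers such as gcc, clang, and Microsoft C++ will typically propagate the constant arguments into the callee, fold the arithmetic, and reduce the caller to returning a constant, with no call, no stack traffic, and no parameter passing at all.

// scale() looks like an ordinary function with parameters...
static int scale(int x, int factor) {
    return x * factor;
}

// ...but at -O2 the compiler can see that both arguments are
// compile-time constants, so it inlines the call, folds the multiply,
// and compiles caller() down to the equivalent of "return 42".
int caller() {
    return scale(21, 2);
}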
I have no idea what a “faster” language than C++ would even mean, but I would want to see nontrivial metrics (forget Towers of Hanoi, the Sieve of Eratosthenes, and similar toy programs; compare the code size and performance of systems of 100K SLOC (Source Lines of Code) or more). I suspect that Microsoft C++ and gcc can run circles around Go. And take a look at Java HotSpot's dynamic compilation and optimization before comparing Java to C++ or any other language. We used to see the same silly comparisons for language complexity, because you could implement “Hello, world” in BASIC in only one line, while every other language took many lines and correct punctuation. But you couldn't maintain 100K SLOC written in BASIC (and I mean the original BASIC, not languages like Visual BASIC, which are modern languages with a few pieces of syntax borrowed from the original Dartmouth BASIC).
A good optimizing compiler can take you through a dozen levels of abstraction and generate half an instruction. I don't trust anyone who says “Language X is faster than Language Y” unless they give details of the metrics they used. Anyone can write a compiler that makes trivial code examples run quickly. Professional compilers can make 100K SLOC smaller or faster, independent of the source language.
Each time we added an optimization to the compiler, we knew that it might improve the code by only 1%–3%. But put a couple hundred such optimizations in, and “compound interest” wins big: two hundred compounded 2% improvements multiply out to roughly a 50× gain. Some years ago, IBM was dismayed that on the SPECMark array benchmarks their compiler scored 45 while Sun's scored 61 (bigger is better). So they wrote a FORTRAN program that took, as input, a description of the cache behavior of the different RISC System/6000 models along with a FORTRAN program, and produced, as output, another FORTRAN program that did the exact same computation but segmented the data to maximize cache hits. They ran the array benchmark through this and got a score of… wait for it… 900! Many modern FORTRAN compilers now incorporate these transformations internally (this experiment was done in the early 1990s).
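The transformation behind that result is what is now commonly called cache blocking, or loop tiling. Here is a minimal C++ sketch of the idea (my illustration, not IBM's actual tool; N and BLOCK are made-up values, and BLOCK would in practice be tuned to the target machine's cache):

const int N = 1024;
const int BLOCK = 64;  // tile size; assumed to fit comfortably in cache

// Naive traversal: reads b column-by-column, so for large N nearly
// every access to b misses the cache.
void transpose_naive(double a[N][N], const double b[N][N]) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            a[i][j] = b[j][i];
}

// Blocked traversal: the exact same computation, restructured so each
// BLOCK x BLOCK tile of a and b stays resident in cache while in use.
void transpose_blocked(double a[N][N], const double b[N][N]) {
    for (int ii = 0; ii < N; ii += BLOCK)
        for (int jj = 0; jj < N; jj += BLOCK)
            for (int i = ii; i < ii + BLOCK; ++i)
                for (int j = jj; j < jj + BLOCK; ++j)
                    a[i][j] = b[j][i];
}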
Debates about “faster” languages are just plain silly. It’s the compiler that makes the difference.