A "quick" microbenchmark
The other day I decided to do some microbenchmarking across several different languages. I used an array of 1 M integers and applied the quicksort sorting algorithm as the test code and also included the native language sort routines where applicable.
I wrote the runner using Cake and Coffeescript which compiles to Javascript and runs on node.js. Cake is a very simple build system and is similar to make/rake, as you probably already guessed.
Each respective test will run the sort algorithm 5 times (some only 3 because they take so long already it doesn't matter) and the minimum execution time is printed. I ran each one a few times and took an eyeball average of those runs. I am not that interested in a +/- of a few milliseconds when sorting 1 M integers. I am way more interested in orders of magnitude.
Results
Language | Version | Time in milliseconds | |
---|---|---|---|
1 | C++ | MS 16.00.30319.01 for 80x86 | 75 |
2 | C# Array.Sort() | .NET 3.5 | 100 |
3 | Java | 1.6.0_23 | 105 |
4 | Groovy api (aka java) | 1.8.5 | 110 |
5 | Java Arrays.sort() | 1.6.0_23 | 121 |
6 | C# | .NET 3.5 | 126 |
7 | Scala | 2.9.1.final | 128 |
8 | Node.js (v8) | 0.6.7 | 175 |
9 | Chrome JavaScript | 16.0.912.77 | 182 |
10 | Ruby api array.sort! | 1.9.2p290 | 250 |
11 | Firefox JavaScript | 10.0 | 271 |
12 | IE JavaScript | 9.0.8112.16421 | 307 |
13 | IE JavaScript api sort() | 9.0.8112.16421 | 375 |
14 | Node.js (v8) api sort() | 0.6.7 | 480 |
15 | Chrome JavaScript api sort() | 16.0.912.77 | 520 |
16 | Python api sort() | 3.1.3 | 814 |
17 | PHP api sort() | 5.3.8 | 1441 |
18 | Firefox JavaScript api sort() | 10.0 | 3490 |
19 | Ruby | 1.9.2p290 | 3520 |
20 | Groovy | 1.8.5 | 4100 |
21 | Python | 3.1.3 | 8100 |
22 | PHP | 5.3.8 | 9700 |
Things I will take away from this exercise
Switching between too many languages during a day is tough and not very efficient. I haven't done much coding in some of these languages, or not at least for awhile. I didn't try to use the adopted paradigms in each language. I did try to keep the overall structure the same for each language. When you do a google search for quicksort some of the hits for ruby, python, javascript and scala show these "elegant" 1-3 liners of code that make me think I am looking at perl. Needless to say, I tried a few but they ran quite a bit slower than my bloated version. Thats not to say that some performance tuning could be done for each language's nuances, but as I mentioned, I am interested in the orders of magnitude comparison rather than a pure speed test.
It was interesting to see how easy some of the features were to find in some languages compared to others. Command line parameters, random numbers, benchmarking code etc. Lets take a look at command line parameters in each language. They are all similar (some variation of an arg[sv] array) but yet different (array offset, parsing, brackets, semi-colon). Its these subtle differences that make a lot of language context switching difficult.
C++ int len = atoi(argv[1]); C# var len = int.Parse(args[0]); java: int len = Integer.parseInt(args[0]); scala: val len = Integer.parseInt(args(0)) php: $len = $argv[1]; node.js (coffeescript): len = process.argv[2] groovy: len = args[0].toInteger() python: alen = int(sys.argv[1]) # len is a reserved word ruby: len = ARGV[0].to_i
A small but effective change included changing pivot = arr[(left+right) / 2]
to pivot = arr[(left+right) >> 1]
. In some languages it generated a noticeable increase in performance. As well it prevented a necessary cast/floor to an int preventing an array lookup of arr[3.5].
Pythons top google links are tied to v2 rather than v3. Using the old print statement rather than the print() function is pretty frustrating as you view the examples from google and the code seems correct. The error message was not at all helpful either. I have seen this in other languages/frameworks as well.
Then we have the browser wars, aka the js engine war. Chrome posted the best time for the quicksort. For some reason I was expecting some overhead compared to node.js but the in browser version was pretty much identical in both test cases. IE was the most consistent and by far had the best time using the Array API sort(). Firefox was really slow on the API call... no idea why. It was slow enough that I actually got the warning saying the script was running too long. During the algorithm Firefox was "Not Responding" as well. I didn't bother running the code in older versions as that wasn't the goal of this exercise.
Consective runs in IE and FireFox produced consistent results. However, this was not true for Chrome. Chrome's performance decreased by about double for the second run and a little more after that to what seemed to be a ceiling. It probably has to do with GC since on a page refresh the time returned back to the fast time.
Groovy gets to piggy back off of the JVM for API calls.
Most of the API sorts were faster. This is not surprising since the api would generally be running in the interpreters native language and are generally written in c++ so the sorts are running native code at that point. I was surprised that the quicksort algorithm was faster than the API calls for Java and JavaScript. In Java it is somewhat negligible and if you look at the source the Arrays sort() API is a "tuned" quicksort. Without spending too much time on it, it appears that the v8 implementation is doing a Quicksort written in js but makes extra calls to the required compare function among other things. The compare is necessary as by default the method sorts the elements alphabetically which is obviously not what we wanted. I assume this is where the overhead comes from, but that seems expensive as its almost 3x longer.
I was also surprised at how well all of the JavaScript engines did compared to the other main stream (server) languages. The node.js v8 implementation was 10x - 25x faster than most of them and was less than 2x Java/C#.
You can view the code for each implementation on github.
Although I did have the idea before seeing this, the following blog post provided some good information http://stronglytypedblog.blogspot.com/2009/07/java-vs-scala-vs-groovy-performance.html
You can see the results with PHP 5.4 ... A "quick" microbenchmark update PHP 5.4