Monday, May 24, 2010

New results for Perl vs Scala vs Go vs C

This post revisits the tests from my previous post, measuring how long various languages took to process a file.

After various people suggested optimisations, I have some new results. As before, the tests themselves are available at

I also have a user-submitted C version of the test, which performed very well, though its CSV parsing is possibly not quite as rigorous as the others'.

The results are:

Small file:
Perl: 0.744 seconds
Scala: 0.842 seconds
Go: 1.55 seconds
C: 0.083 seconds

Medium file:
Perl: 7.12 seconds
Scala: 3.28 seconds
Go: 15.1 seconds
C: 0.780 seconds

Big file:
Perl: 71.2 seconds
Scala: 23.9 seconds
Go: 153 seconds
C: 7.83 seconds

Note the memory sizes:
Scala: 114 MB
Perl: 6 MB
Go: 2 MB
C: 0.5 MB

I find it interesting that the Scala code originally took 115 seconds for the largest file. While I am not a Scala expert, the code was still reasonably straightforward and not actually *wrong*. However, by changing the code around quite a bit, and using a different numeric formatting and output engine, the performance more than quadrupled (115 seconds down to 23.9 seconds).

I will welcome any patches for the Go version - I'm sure it must be possible to make it go much faster!

Tuesday, May 18, 2010

Perl vs Go vs Scala performance

I've been playing with Go and Scala a bit lately, and was curious to benchmark them doing something vaguely similar to my real-world Perl tasks. I know that there are already other benchmarks out there, but I wanted to see how things went on what is Perl's home turf - simple file processing and data manipulation.

Thus I created some tests, which you can see at

I expected Perl to do badly here, as it was up against the natively-compiled Go, and the JVM, which I've heard is highly optimised (although it always seems to take a while to load for me, and uses tonnes of RAM).

However, Perl did surprisingly well!
I repeated the tests for three sizes of data set. It's curious to note that Perl and Go scaled linearly with the size of the input data, while Scala did better, taking the lead once the file sizes increased.

Edit: After some suggestions from the crowd, I updated and re-ran the tests.

The results follow:

Language   100k rows   1m rows    10m rows
Perl       1.089 s     10.96 s    111.3 s
Scala      1.857 s     9.835 s    89.05 s
Go         1.682 s     16.77 s    154.3 s
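To check the scaling claim, the timings above can be converted to a cost per row. The following is a small sketch of that arithmetic. The numbers are copied from the results above, while the type and function names are made up for illustration.

```go
package main

import "fmt"

// timing holds one language's run times in seconds for the
// 100k, 1m and 10m row files, copied from the results above.
type timing struct {
	lang    string
	seconds [3]float64
}

// rows lists the three input sizes in the same order as timing.seconds.
var rows = [3]float64{1e5, 1e6, 1e7}

// microsPerRow converts a total run time into microseconds per input row.
func microsPerRow(seconds, nRows float64) float64 {
	return seconds / nRows * 1e6
}

func main() {
	results := []timing{
		{"Perl", [3]float64{1.089, 10.96, 111.3}},
		{"Scala", [3]float64{1.857, 9.835, 89.05}},
		{"Go", [3]float64{1.682, 16.77, 154.3}},
	}
	fmt.Println("microseconds per row (100k, 1m, 10m):")
	for _, r := range results {
		fmt.Printf("%-6s", r.lang)
		for i, s := range r.seconds {
			fmt.Printf(" %7.2f", microsPerRow(s, rows[i]))
		}
		fmt.Println()
	}
}
```

The per-row cost stays roughly flat for Perl (about 11 microseconds) and Go (about 15 to 17 microseconds), which is what linear scaling looks like. Scala's falls from about 18.6 to about 8.9 microseconds per row as the file grows. That would be consistent with a fixed start-up cost being spread over more rows, although these tests do not measure that directly.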

For these tests, I was running Ubuntu 10.04 64-bit, and:
Perl 5.10.1 w/Text::CSV and Text::CSV_XS
Go (May 2010 build) w/csv.go
Scala 2.8.0.RC1 on Java 1.6 w/

NOTE: This post has now been updated/superseded by: