Thursday, November 11, 2010

Google AI Contest

I'm competing in the Google-sponsored AI Contest, under the username wintrmute.

I'm entering a Perl 'bot, which isn't doing amazingly well, but it's fun.

You can see how the Perl entries are doing on the Perl rankings page. Perl doesn't have that many entries compared to the C++ and Java guys, nor are we doing that well overall. :(
Why not come along and join us?

I have a modified Perl starter package on GitHub, here:
(It's similar to the default one, but with a few minor improvements. The coding-style is still a bit ugh, IMO. Would be nice to rewrite it all from scratch, but I don't have the time right now.)

Monday, May 24, 2010

New results for Perl vs Scala vs Go vs C

This post revisits the tests from my previous post, measuring how long various languages took to process a file.

After various people suggested optimisations, I have some new results. As before, the tests themselves are available at

I also have a user-submitted C version of the test, which performed very well, though its CSV parsing is possibly less rigorous than the others'.

The results are:

Small file:
Perl: 0.744 seconds
Scala: 0.842 seconds
Go: 1.55 seconds
C: 0.083 seconds

Medium file:
Perl: 7.12 seconds
Scala: 3.28 seconds
Go: 15.1 seconds
C: 0.780 seconds

Big file:
Perl: 71.2 seconds
Scala: 23.9 seconds
Go: 153 seconds
C: 7.83 seconds

Note the memory sizes:
Scala: 114 MB
Perl: 6 MB
Go: 2 MB
C: 0.5 MB

I find it interesting that the Scala code was originally taking 115 seconds for the largest file. While I am not a Scala expert, the code was still reasonably straightforward and not actually *wrong*. However, after restructuring the code quite a bit and switching to a different numeric formatting and output engine, it ran more than four times faster (115 s down to 23.9 s).

I will welcome any patches for the Go version - I'm sure it must be possible to make it go much faster!

Tuesday, May 18, 2010

Perl vs Go vs Scala performance

I've been playing with Go and Scala a bit lately, and was curious to benchmark them doing something vaguely similar to my real-world Perl tasks. I know that there are already other benchmarks out there, but I wanted to see how things went on what is Perl's home turf - simple file processing and data manipulation.

Thus I created some tests, which you can see at

I expected Perl to do badly here, as it was up against the natively-compiled Go, and the JVM, which I've heard is highly optimised. (Although it always seems to take a while to load for me, and uses tonnes of RAM.)

However Perl did surprisingly well!
I repeated the tests for three sizes of data set. It's curious to note that Perl and Go scaled linearly with the size of the input data, while Scala did better, taking the lead once the file sizes increased.

Edit: After some suggestions from the crowd, I updated and re-ran the tests.

The results follow:

Language   100k rows   1m rows   10m rows
Perl       1.089 s     10.96 s   111.3 s
Scala      1.857 s     9.835 s   89.05 s
Go         1.682 s     16.77 s   154.3 s
For these tests I was running Ubuntu 10.04 64-bit, and:
Perl 5.10.1 w/Text::CSV and Text::CSV_XS
Go (May 2010 build) w/csv.go
Scala 2.8.0.RC1 on Java 1.6 w/
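
To put a number on the scaling observation, here is a small sketch that computes how much longer each language took every time the input grew tenfold. The timings are copied from the results table in this post; the helper name growth_factors is just something made up for illustration, not part of the test suite.

```perl
use strict;
use warnings;

# Timings in seconds, copied from the results table in this post.
my @sizes   = ('100k', '1m', '10m');
my %seconds = (
    Perl  => [1.089, 10.96, 111.3],
    Scala => [1.857, 9.835, 89.05],
    Go    => [1.682, 16.77, 154.3],
);

# Hypothetical helper: growth factor between consecutive data sets.
# Each step has 10x more rows, so a factor near 10 means linear scaling.
sub growth_factors {
    my @t = @_;
    return map { $t[$_] / $t[$_ - 1] } 1 .. $#t;
}

for my $lang (sort keys %seconds) {
    my @f = growth_factors(@{ $seconds{$lang} });
    printf "%-5s  %s -> %s: x%.2f   %s -> %s: x%.2f\n",
        $lang, @sizes[0, 1], $f[0], @sizes[1, 2], $f[1];
}
```

Perl and Go stay close to a factor of 10 at each step, while Scala's first step is only about 5.3x. That would be consistent with a fixed start-up cost being spread over more rows, though I haven't measured the start-up time separately.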

NOTE: This post has now been updated/superseded by:

Monday, March 22, 2010

Hacked nVidia drivers for Ion LE

So I bought a new Samsung N510 netbook, which comes with the nVidia Ion LE GPU.

The "LE" is notable because that version is crippled - but purely in software. I've hacked up the current nVidia drivers to enable the full power (well, as much power as the non-LE version gets). Might be of interest to anyone else who has a laptop with an "Ion LE" chipset, anyway..

Here's a link:
It's based on the 195.62 drivers; I might release newer versions as time goes by.

(Info on how to do it was found on the site.)

Wednesday, January 13, 2010

Perl date validation performance

So I was benchmarking some code, and noticed that DateTime accounted for a significant portion of the runtime.
Edit: Updated to include more modules.

I was curious to find out what the fastest way to validate dates was, so I ran a test using DateTime, Date::Calc, Time::Piece and Time::Local.
The results are below, but they show that Date::Calc was by far the quickest method, followed by Time::Piece (even though that one doesn't natively do date checking, i.e. I had to write extra code around it).

The tests below are for checking "bad" and "good" dates, in case there was a difference. (I expect 99% of dates to be valid, after all.)
As it happens, it only made any difference to Time::Local.

I also tried testing Date::Tiny, DateTime::Tiny, and DateTime::LazyInit, however the first two didn't actually do date validation in any useful way, and the latter failed to install. (And I suspect would just return the same performance as DateTime..)

So, here are the results!

tobyc@arya:~/git/cbt/performance$ ./
# Higher numbers are better.
gooddatetime 7328/s
baddatetime 7802/s
badtimelocal 11471/s
goodtimelocal 39665/s
goodtimepiece 63021/s
badtimepiece 63743/s
gooddatecalc 399124/s
baddatecalc 400509/s

The actual code I used was:

# Benchmark script for Date validation methods.
use strict;
use warnings;
use Benchmark qw(:all :hireswallclock);
use DateTime;
use Time::Piece;
use Time::Local qw(timelocal);
use Date::Calc qw(check_date);

my $gooddate = '2009-02-01';
my $baddate = '2009-02-31';
our $dtre = qr/^(\d{4})-(\d\d)-(\d\d)$/;
our $format = '%Y-%m-%d'; # Also %F

# Check these all pass the good date:
die unless datetime($gooddate);
die unless tlocal($gooddate);
die unless timepiece($gooddate);
die unless datecalc($gooddate);

# Check they fail the invalid date:
die if datetime($baddate);
die if tlocal($baddate);
die if timepiece($baddate);
die if datecalc($baddate);

cmpthese(-5, { # ie. 5 seconds each
    gooddatetime  => sub { datetime($gooddate) },
    goodtimepiece => sub { timepiece($gooddate) },
    baddatetime   => sub { datetime($baddate) },
    badtimepiece  => sub { timepiece($baddate) },
    goodtimelocal => sub { tlocal($gooddate) },
    badtimelocal  => sub { tlocal($baddate) },
    gooddatecalc  => sub { datecalc($gooddate) },
    baddatecalc   => sub { datecalc($baddate) },
});

# DateTime: the constructor dies on an invalid date, so trap that.
sub datetime {
    my $date = shift;
    eval {
        $date =~ $dtre;
        my $dt = DateTime->new(
            year  => $1,
            month => $2,
            day   => $3
        );
    };
    return(1) unless $@;
    return 0;
}

# Time::Piece: no native validation, so parse the date, format it
# back out, and check that the result matches the input.
sub timepiece {
    my $date = shift;
    my $match;
    eval {
        my $tp = Time::Piece->strptime($date, $format);
        my $reverse = $tp->strftime($format);
        $match = ($date eq $reverse);
    };
    return $match;
}

# Time::Local: timelocal() dies if the day is out of range.
sub tlocal {
    my $date = shift;
    eval {
        $date =~ $dtre;
        my $time = timelocal(0,0,0, $3, ($2-1), $1);
    };
    return(1) unless $@;
    return 0;
}

# Date::Calc: check_date() returns true or false directly.
sub datecalc {
    my $date = shift;
    $date =~ $dtre;
    return check_date($1, $2, $3);
}
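
One caveat with the regex-based subs above: if the pattern doesn't match, $1..$3 keep whatever the last successful match left in them, so the benchmark subs are only safe for input that is already shaped like YYYY-MM-DD. Below is a minimal sketch of a guarded validator using only core modules. The name is_valid_date is made up for illustration and isn't part of the benchmark, and the extra checks mean it will be slower than the figures above.

```perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper (not from the benchmark script): validates a
# YYYY-MM-DD string using only core modules.
sub is_valid_date {
    my ($date) = @_;
    return 0 unless defined $date;

    # Reject anything not shaped like YYYY-MM-DD before parsing, so
    # stale capture variables from an earlier match are never used.
    return 0 unless $date =~ /\A\d{4}-\d\d-\d\d\z/;

    # strptime may die on input it cannot parse, or quietly normalise
    # an impossible day into a different date. Comparing the round
    # trip against the input catches both cases.
    my $round_trip = eval {
        Time::Piece->strptime($date, '%Y-%m-%d')->strftime('%Y-%m-%d');
    };
    return (defined $round_trip && $round_trip eq $date) ? 1 : 0;
}

print "$_: ", (is_valid_date($_) ? "valid" : "invalid"), "\n"
    for '2009-02-01', '2009-02-31', 'not-a-date';
```

The same shape check could be put in front of check_date() to keep Date::Calc's speed while closing the stale-capture hole.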