drio


Nine years ago I was doing a lot of bioinformatics work. I was also writing Go code to learn the language and see whether I could use it for my daily programming tasks.

Around the same time I came across a set of benchmarks that measured how fast different programming languages could read FASTQ files. The winner was a heavily optimized C program.

Naturally, I decided to write an implementation in Go. My first attempt did not beat the C version, though it was not far off (same order of magnitude, if I remember correctly). With the help of the Go community, I ended up with a version that was faster than the C implementation.

Fast forward nine years. I came across this article that discusses how successive Go releases have consistently improved performance.

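I decided to rerun the benchmarks with the latest versions of gcc (which is Apple clang on this machine), Go and Lua. You can see the results below. Go still performs significantly better than the C version: about 8.3 s of wall-clock time versus 14.9 s, roughly 1.8x faster. Lua takes just over a minute.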

Code here.

  > make
Input file has 25134480 reads
gcc --version 2>/dev/null | head -1
Apple clang version 13.0.0 (clang-1300.0.29.30)
time  cat *.fastq | ./out > /dev/null

real    0m14.930s
user    0m13.730s
sys     0m2.565s
go version
go version go1.17.6 darwin/arm64
time cat *.fastq | go run ./readfq.go >/dev/null

real    0m8.314s
user    0m6.272s
sys     0m3.541s
lua -v
Lua 5.4.4  Copyright (C) 1994-2022 Lua.org, PUC-Rio
time cat data.fastq| lua ./readfq.lua
25134480        3770172000      3770172000

real    1m1.813s
user    1m0.779s
sys     0m2.307s