Nine years ago I was doing a lot of bioinformatics work. I was also writing Go code to learn the language and see whether I could use it for my daily programming tasks.
Around the same time I came across a set of benchmarks that tested how fast different programming languages could read FASTQ files. The winner was a heavily optimized C program.
Naturally, I decided to write an implementation in Go. My first attempt did not beat the C version, though it was not far off (same order of magnitude, if I remember correctly). With the help of the Go community, I ended up with a version that was faster than the C implementation.
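Fast forward nine years. I came across this article that discusses how successive Go releases have consistently improved performance.
I decided to rerun the benchmarks with the latest versions of gcc (which on this machine is Apple clang), Go and Lua. You can see the results below. Go still performs significantly better than the C version: about 8.3 s versus 14.9 s of wall-clock time, roughly 1.8 times faster, and the Go figure includes the go run build step. Lua takes just over a minute.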
Code here.
> make
Input file has 25134480 reads

gcc --version 2>/dev/null | head -1
Apple clang version 13.0.0 (clang-1300.0.29.30)
time cat *.fastq | ./out > /dev/null
real 0m14.930s
user 0m13.730s
sys 0m2.565s

go version
go version go1.17.6 darwin/arm64
time cat *.fastq | go run ./readfq.go >/dev/null
real 0m8.314s
user 0m6.272s
sys 0m3.541s

lua -v
Lua 5.4.4 Copyright (C) 1994-2022 Lua.org, PUC-Rio
time cat data.fastq| lua ./readfq.lua
25134480 3770172000 3770172000
real 1m1.813s
user 1m0.779s
sys 0m2.307s