Abstract
With an ever-increasing amount of available data, scalability problems are much 
more of a concern than they were 20 years ago, and climate change calls 
for greener computing. In bioinformatics, a few methods have become de facto 
standards to approach specific issues such as gene-calling, multiple sequence 
alignment, or protein domain annotation. These tools are now the basis of numerous 
annotation pipelines that are executed thousands of times daily. Academia, 
however, favors scientific novelty over efficiency, and being written in C or 
C++ is often deemed enough for software to be categorized as high-performance.
In this work, we use profiling to identify critical parts of established 
bioinformatics methods. We then employ several optimization techniques to 
make these tools more efficient. We show how inlining a few functions halves 
the runtime of the PRANK aligner; how caching alignment matrices used by trimAl 
reduces the runtime 10-fold; how SIMD speeds up the gene scoring step of 
Prodigal; and how parallel hashing in FastANI increases efficiency on 
multi-core machines.
The goal of this software engineering experiment is to introduce some efficient 
programming habits to the community, and to change how we look at the 
ubiquitous software we run in our pipelines without a second thought. More practically, we 
provide several of these patches as Python packages to be used as drop-in 
replacements for the originals. As an outlook, we present some general 
figures about the energy cost of computing, and how much CO2 can be saved 
with the aforementioned optimizations.