Alistair built the tools. He worked with colleagues in the geosciences and in open computing communities to develop new software tools such as Zarr, a new approach for storing large, complex data sets like those produced by sequencing many mosquitoes, each of which has 250 million base pairs of DNA.
“[To do mosquito genomics,] you really need to be able to do a lot of exploratory analysis, iterate quickly and visualise data to get ideas about new directions,” explains Dr. Miles. “And that pushes you more into that space where you need to figure out a better way of doing computing. It pushes you towards things like cloud computing, and new approaches to scientific computing.”
Data from the more than 10,000 sequenced Anopheles samples in Vector Observatory have been stored in the cloud, and Dr. Miles' team, together with collaborators at the Liverpool School of Tropical Medicine (LSTM) and Imperial College, have created a Python software library so that anyone with a laptop and a basic internet connection can start to analyse the data.