The genomes of 66,000 UK species are to be sequenced as part of a global effort to sequence all known eukaryotic species on Earth
The genetic code of 66,000 UK species will be sequenced by the Wellcome Sanger Institute in a major collaboration with EMBL’s European Bioinformatics Institute (EMBL-EBI) and other partner organisations as part of a global effort to sequence all 1.5 million known species of animal, plant, protozoa and fungi on Earth.
The UK project, known as the Darwin Tree of Life Project, launched on 1 November alongside the global effort, the Earth BioGenome Project. The Earth BioGenome Project will ultimately create a new foundation for biology to drive solutions for preserving biodiversity and sustaining human societies.
Building an open database
Once the genomes have been sequenced by the Wellcome Sanger Institute, EMBL-EBI will assist in annotating them and assessing how the data can be stored and accessed.
“How the information is presented is going to be very important,” explains Richard Durbin, a professor in the department of genetics at the University of Cambridge and an associate faculty member at the Wellcome Sanger Institute. “EMBL-EBI’s experience in dealing with very large numbers of reference genomes, and identifying the relationships between the genes within them, will provide the project with an essential tool. The aim is to create an open data infrastructure on which researchers can build, and EMBL-EBI has a very central role to play.”
The Darwin Tree of Life project is now possible because of recent and expected advances in sequencing technology: costs have fallen significantly in recent years, while throughput capacity has increased. These advances also make it possible to produce genomes of much higher quality. Researchers can now sequence continuous stretches of tens of thousands of base pairs of DNA, more than 100 times longer than was possible with the methods historically used in genomics.
The project is estimated to cost approximately £100 million over the first five years, and the sequencing of 66,000 species’ genomes will take around 10 years.
“The Darwin Tree of Life project is an exciting opportunity to understand life, evolution, ecosystems and biodiversity by leveraging genomics and our experience in creating biological data resources that are freely available to everyone in the world,” adds Paul Flicek, a senior scientist and group leader at EMBL-EBI.