Facilitating transcontinental human data exchange

Federated data sharing made possible on an unprecedented scale through the transcontinental CINECA project. PHOTO: Spencer Phillips/EMBL-EBI

CINECA project to unite more than a million human data sets from across Africa, Canada and Europe

Registered researchers will be able to analyse population-scale genomic and biomolecular data with the launch of the Common Infrastructure for National Cohorts in Europe, Canada and Africa (CINECA). The international project is led by EMBL’s European Bioinformatics Institute (EMBL-EBI). Data from 1.4 million individuals will be accessible to approved researchers around the world through CINECA’s federated cloud-based network.

Rapid access to clinical research data allows scientists to share their findings. It also reduces the need to duplicate costly studies and accelerates research.

CINECA comprises 18 partner organisations across three continents and brings together data from eleven cohorts. The dataset provides a diverse representation of studies in rare disease, common disease and national cohorts followed over time (longitudinal).

Personalised medicine

Within the next five years it is predicted that the majority of human genomes will be generated through national-scale healthcare initiatives. Federated analysis tools, like those within the CINECA initiative, could help identify relevant treatments for patients on an individual basis. There are also hopes that in the near future, personalised medicine programmes worldwide will be using cloud technology.

“By enabling access to genetic data from diverse human populations, CINECA will support the development of treatments tailored to each individual patient’s genetic profile, the ultimate goal of personalised medicine,” says Thomas Keane, Team Leader at EMBL-EBI. “Clinicians need to be able to compare a patient’s genome to a large set of healthy people and sick people, in order to understand the underlying genetics of the patient. And by ‘large’, we mean hundreds of thousands or even millions of other people.”

Tools for discovery

A key aim of CINECA is developing tools that enable rapid data discovery, secure access and authorisation within the cloud. Such tools will enable researchers to quickly discover data that are relevant to ongoing research projects, without duplicating studies. This raises the potential for novel discoveries about the causes of rare and common diseases such as cancer and diabetes.

“The project provides an avenue for us to align with international best practices, and contribute to these from an African and resource-limited perspective,” says Nicola Mulder, Head of Computational Biology at University of Cape Town and Principal Investigator of H3ABioNet (a Pan African bioinformatics network for H3Africa). “At the same time as contributing our own expertise in working with diverse African genetic data, we hope to gain experience in new technologies for data sharing and clinical implementations.”

The challenges

Federated international sharing of human data presents ethical and technical challenges, and addressing them is tightly embedded within the CINECA project. Delivering a solution that meets the ethical and security requirements for international health data sharing is a key aim of CINECA.

To protect patient privacy, access to the federated data cohorts will follow the established structure used by the Global Alliance for Genomics and Health (GA4GH). This means researchers must formally apply for data access on an individual basis.

“This project is a large-scale implementation of almost all of the GA4GH standards, in particular, data use and Researcher ID standards,” says Keane. “The goal of this implementation is to accelerate the process of accessing datasets in a safe and secure way. All the control of the datasets remains with the local cohorts, as we’re not trying to create a centralised resource, but a federated one.”

Read the full press release on EMBL-EBI News.