HYPERGENES

European Network for Genetic-Epidemiological Studies

Progress

HYPERGENES Final Report!

Final Report with all project results (.pdf, 427 KB)

Project final summary

The HYPERGENES project focuses on defining a comprehensive genetic epidemiological model of complex traits such as Essential Hypertension (EH) and intermediate phenotypes of hypertension-dependent or hypertension-associated Target Organ Damage (TOD), as well as other endophenotypes such as the pharmacogenomic pattern of two drugs widely used in EH.

Discovering the genetic component of common complex diseases is extremely challenging: most of them are multifactorial, and the genetic component is likely to arise from interactions among several genes in the disease pathway, each of which predisposes to the disease only imperceptibly. HYPERGENES adopted the Genome Wide Association (GWA) approach to identify common variants contributing to the inherited component of common diseases.

The project's Technical and Scientific objectives were the following:

  • To identify the common genetic variants relevant for EH and TOD.
  • To design and implement appropriate computational tools.
  • To develop a comprehensive Biomedical Information Infrastructure (BII).
  • To create a “Web-Based Portal” providing access to the BII in order to disseminate knowledge.
  • To develop new methods, protocols and standards for genomic association analysis, gene annotation and molecular pathways.
  • To develop a set of Decision Support Systems tools combining genetic, clinical and environmental information.
  • To develop a simple, inexpensive genetic diagnostic chip that can be validated in our existing well-characterized cohorts.
  • To strengthen the existing clinician-basic scientist collaborative network on the genetic mechanisms of EH.
  • To generate educational tools to support professional training on all aspects of the project, favouring mobility of PhD students and post-docs.
  • To disseminate HYPERGENES achievements through scientific meetings, teaching in tutorial sessions, publication in high-impact scientific journals etc.
  • To exploit the results in a translational scenario.

The HYPERGENES project was structured in three steps:

  • STEP 1: Discovery
  • STEP 2: Validation
  • STEP 3: Dissemination & Results Exploitation

The HYPERGENES Discovery Phase was carried out during the first two years of the project and focused on building the methodological and technical framework to support the genome-wide association analysis, which was performed on 4,000 Caucasian subjects recruited from historical, well-characterized European cohorts.

The need to integrate observations from different studies posed significant challenges, which were addressed through an integrated epidemiological and bioinformatics approach. The Biomedical Information Infrastructure (BII) was developed to support the entry, persistent storage and retrieval of data and knowledge relevant to EH, including clinical, environmental and genotypic data.

Genotyping was performed on high-throughput Illumina platforms, thanks to the coordinated efforts of the laboratories of UNIMI and UNIL.

Both classical and machine learning techniques were used for the genetic analysis, producing an enriched list of SNPs associated with EH, TOD, or other endophenotypes relevant to hypertension. The case-control association study led to hundreds of significant associations, which overlapped only partially with the results of previous studies.
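As an illustration of what a classical case-control association test looks like for a single SNP, the following minimal sketch runs a Pearson chi-square test on a 2x2 table of allele counts. It is not the HYPERGENES analysis pipeline, which this summary does not describe in detail. The function names and the allele counts are hypothetical, and the sketch assumes every row and column of the table contains at least one allele.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are alternate and reference alleles.
    Assumes all row and column totals are non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def allelic_odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


# Hypothetical allele counts for one SNP (not project data).
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
odds = allelic_odds_ratio(60, 40, 40, 60)
print(f"chi2={chi2:.2f} p={p_value:.4g} OR={odds:.2f}")
```

In a genome-wide setting the same test is repeated for every genotyped SNP, so the resulting p-values need a multiple-testing correction before any association is called significant.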

The SNPs most strongly associated with EH and TOD in the Discovery Sample, together with candidate SNPs already known to be associated with the phenotypic traits of interest, were used to build a custom Illumina iSelect HD chip including 15,000 SNPs. This chip was used to validate the Discovery Phase results in an additional independent sample of 8,000 subjects.
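The selection of chip content described above can be sketched as follows: known candidate SNPs are combined with the best-ranked discovery SNPs until the chip capacity is reached. This is only an illustration. The function name, the SNP identifiers and the rule that candidates are placed first are assumptions, not the documented HYPERGENES procedure.

```python
def select_chip_snps(discovery_pvalues, candidate_snps, capacity):
    """Return at most `capacity` SNP IDs for a custom chip.

    Candidate SNPs are placed first (an assumption of this sketch), then
    discovery SNPs are added in order of increasing p-value, skipping
    any SNP that has already been selected.
    """
    selected = []
    seen = set()

    for snp in candidate_snps:
        if len(selected) >= capacity:
            break
        if snp not in seen:
            selected.append(snp)
            seen.add(snp)

    ranked = sorted(discovery_pvalues.items(), key=lambda item: item[1])
    for snp, _p_value in ranked:
        if len(selected) >= capacity:
            break
        if snp not in seen:
            selected.append(snp)
            seen.add(snp)

    return selected


# Hypothetical identifiers and p-values (not project data).
discovery = {"snp_1": 1e-8, "snp_2": 0.5, "snp_3": 1e-4, "cand_1": 0.2}
chip = select_chip_snps(discovery, ["cand_1", "cand_2"], capacity=4)
print(chip)
```

With a capacity of 15,000 the same logic would fill the chip with the candidate list plus the top of the discovery ranking.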

The regions showing the most promising results were sequenced to characterize the nucleotide sequence of each region in more detail. Sequencing was performed on 92 subjects.

A risk prediction algorithm for EH and TOD was then developed.

A lab-on-chip (LoC) designed to test the most promising SNPs found to be associated with microalbuminuria was developed. The LoC was then cross-validated on 100 samples analyzed blind by UNIMI and UNIL.

The BII exposes the commonalities among the dozens of HYPERGENES cohorts for multi-cohort analysis, while preserving their differences and allowing researchers to access the original cohort data. On top of the BII, two Machine Learning tools were developed: the “Bioclinical Data Mining Tool”, a set of general-purpose algorithms built into a generic framework for streaming data, and the “SNP weighting tool”, an algorithm specifically developed to exploit the existing knowledge base in SNP association analysis. These tools are available online to HYPERGENES partners.

These tools contributed to the development of a disease risk prediction model. Given a set of known EH and TOD risk factors for which progressive data are available, the model predicts the future clinical range of each parameter and also explores the contribution of genomic data to the prediction. To further integrate genomic information into the model, a previously developed gene network was used, with each gene in every pathway represented by its most strongly associated SNP. Good prediction performance was demonstrated for measures with sufficient samples (over one thousand).
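The step of representing each gene by its most strongly associated SNP can be sketched as below, taking "strongest" to mean the smallest association p-value. That reading is an assumption, since the summary does not state the ranking criterion, and the function name, gene labels and values are hypothetical.

```python
def strongest_snp_per_gene(associations):
    """Map each gene to its (snp_id, p_value) pair with the smallest p-value.

    `associations` is an iterable of (snp_id, gene, p_value) tuples.
    If two SNPs in a gene tie, the first one encountered is kept.
    """
    best = {}
    for snp_id, gene, p_value in associations:
        if gene not in best or p_value < best[gene][1]:
            best[gene] = (snp_id, p_value)
    return best


# Hypothetical association results (not project data).
results = [
    ("snp_a", "GENE_1", 3e-4),
    ("snp_b", "GENE_1", 2e-6),
    ("snp_c", "GENE_2", 0.04),
    ("snp_d", "GENE_2", 0.01),
]
for gene, (snp_id, p_value) in strongest_snp_per_gene(results).items():
    print(gene, snp_id, p_value)
```

The resulting one-SNP-per-gene table can then be joined to the pathway membership of each gene before it is fed into a prediction model.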

In addition, UNIMI performed a pathway analysis combining HYPERGENES results with the 29 variants identified in ICBP-GWAS, a recent genome-wide association study involving 200,000 individuals, with the aim of verifying whether the two studies share a common pathway. HYPERGENES added a further relevant player, NOS3, to the vasodilator pathway for blood pressure regulation.

HYPERGENES results apply to a variety of fields, as expected from the inherently multidisciplinary nature of the project. They can be summarized by area as follows:

  • IT tools to support multidisciplinary research
  • Development of a BII that allows integration, harmonization and standardization of clinical, environmental and genomic data.
  • Development of the SNP-Ranker software, which takes into account both experimental data and existing knowledge about genomic regions
  • Creation of a tool (IBM Bioclinical Data Mining Tool) that allows users to execute machine learning and data mining algorithms on large data sets.
  • Representation of HYPERGENES analysis results consistently with the patient data representation, which makes the results easier to use in clinical decision support applications, in particular for comparing genomic data
  • Representation of external knowledge fetched from publicly available sources so that it is easily merged with newly created HYPERGENES knowledge, facilitating the refinement of the final outcome
  • Predictive biomarkers for the design and the production of diagnostic chips
  • Identification of a ranked list of SNPs associated with EH and TOD in a sample of 4,000 well-characterized subjects through a high-density array, and validation of the results in an independent sample of 8,000 individuals
  • Identification of a new genetic player (eNOS, encoded by NOS3) likely to be associated with EH pathogenesis
  • Development of a LoC for the detection of genetic polymorphisms associated with microalbuminuria
  • Capacity building on the use of LoC technology and validation of the LoC in non-testing conditions
  • Model allowing prediction about EH and associated TOD:
  • Creation of a model that predicts the future clinical range of each measure from the available clinical data, integrating the contribution of the genomic data into the prediction. This model, based on a Machine Learning approach, demonstrated good prediction performance.
  • Identification of several functional pathways that concern vasodilator and natriuretic mechanisms, electrolyte homeostasis and hypertrophic processes
  • Development of two strategies for clinical decision support systems that rely on mining patient data
  • Facilitation of the creation of mixed data marts that combine subjects' data with knowledge-based computed fields from the core warehouse, optimized both for analysis users and for clinical decision support systems that use machine learning techniques to refine their reasoning processes.

HYPERGENES gave great importance to dissemination activities, through the publication of many papers in peer-reviewed journals, participation in international conferences and workshops, and regular updates of the project website.

HYPERGENES results will have a strong impact on the scientific community involved in research on EH and other complex diseases. A unique European cohort, gathering a large amount of genetic and well-characterized phenotypic data, has been built and represents a valuable resource for future projects. Thanks to HYPERGENES, collaborations and synergies between different centers worldwide have been strengthened, and the project effort will contribute to building European leadership in the domain of advanced genomics.

The dissemination of HYPERGENES results will support the definition of new strategies for prevention and treatment in EH. Furthermore, the results will play a central role in improving the appropriateness of therapeutic protocols, moving toward more personalized treatment and a reduction in drug side effects and pharmaceutical health expenses.

Concerning the commercial impact, the Consortium collected and analyzed all the foreground generated and identified the best exploitation strategy for each result. One patenting procedure is ongoing.