UCL DEPARTMENT OF COMPUTER SCIENCE

Research

Why are so many neurodegenerative diseases – such as Alzheimer’s, Parkinson’s and Huntington’s – associated with the formation of insoluble amyloid inclusions? Do these inclusions represent a protective mechanism by the cell to eliminate toxic unfolded proteins? Or are they directly responsible for the death and dysfunction of neurons? Can Bioinformatics and Systems Biology help understand complex diseases such as these by organizing and clarifying the various disease mechanisms that have been proposed?

Unlike many bioinformatics research labs that tend to focus on particular techniques, we take a systems biology approach that requires a detailed understanding of a focused biological system. In particular, our lab applies a variety of theoretical approaches to help understand polyglutamine expansion diseases, although we also have a broader interest in other amyloid diseases (amyloidoses).

We believe this disease-centric approach is vital to ensure sufficient expertise to appropriately apply theoretical techniques. We currently have three main research threads: integration and analysis of high-throughput and biological network data; prediction of amyloid protein structures and the process of amyloidosis; and understanding tissue-specific and stem cell aspects.

Integrative Systems Biology of Polyglutamine Expansion Diseases

One of our key approaches is to integrate high-throughput and biological network data that are currently separated by artificial barriers arising from differences in methodology and data formats. For instance, we study polyglutamine expansion diseases using data and methods from a number of resources, including:

Key resources and their descriptions:

ENSEMBL, GenBank: Genomics resources used to characterize DNA sequences and the genomic context of genes affected by polyglutamine expansion across different species.

GEO, ArrayExpress, HDBase: Transcriptomics resources containing extensive experimental data on mRNA expression levels for polyglutamine expansion diseases. These include both human studies across different tissues (e.g. GSE3790 - brain, GDS1331 - peripheral blood, GSE8762 - lymphocytes, muscle) and animal studies (GSE10263 - R6/2, Hdh4/Q80, CHL2 mice).

Proteomics data: Proteomics databases are in their infancy compared to transcriptomics. Proteomics data for polyglutamine diseases are generally only available from the original literature, such as: Mochel F, Charles P, Seguin F, Barritault J, Coussieu C, et al. (2007) Early Energy Deficit in Huntington Disease: Identification of a Plasma Biomarker Traceable during Disease Progression. PLoS ONE 2(7): e647.

Metabolomics data: As with proteomics data, we generally have to go to the original sources to obtain metabolomics data for polyglutamine diseases, such as: Tsang TM, Woodman B, McLoughlin GA, Griffin JL, Tabrizi SJ, Bates GP, Holmes E (2006) Metabolic Characterization of the R6/2 Transgenic Mouse Model of Huntington's Disease by High-Resolution MAS 1H NMR Spectroscopy. J. Proteome Res. 5(3): 483-492.

Reactome, KEGG: Biological pathway resources used to integrate within and between different levels of high-throughput data.

OMIM: Human disease database that provides a wealth of information about polyglutamine and other amyloid diseases.

Our key aim within this work is to strengthen the detection of important differences between normal and diseased states by allowing the data from different sources to corroborate each other. In this way potential biomarkers can be detected for which there is insufficient support from any one source.
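To illustrate how weak evidence from several sources can add up, the sketch below combines per-source p-values for a single candidate biomarker using Fisher's method. This is only one standard way of pooling evidence and is not necessarily the technique used in our work. The function name and the three p-values are hypothetical, and the calculation assumes the sources are statistically independent, which real transcriptomics, proteomics and metabolomics data may not be.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom, where k = len(p_values).
    For even degrees of freedom the upper tail has a closed form:
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Accumulate the series sum_{i=0}^{k-1} (X/2)^i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical p-values for one gene product, one per data source
# (illustrative numbers only, not taken from any dataset).
evidence = {"transcriptomics": 0.06, "proteomics": 0.07, "metabolomics": 0.08}
combined = fisher_combined_p(list(evidence.values()))
print(f"combined p-value: {combined:.4f}")
```

In this toy case none of the individual sources reaches a 0.05 threshold, but the combined p-value does, which is the kind of corroboration described above.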

In doing this we employ a wide range of methodologies from Bioinformatics, Systems Biology and Computer Science in general. These include data modelling and transformation techniques (Java, Perl, XML processing, relational and native XML databases), high-throughput data analysis (R, Bioconductor, MATLAB), biological network analysis (XSLT, Cytoscape), and machine learning and statistical analysis (Weka, R, MATLAB).

 

Prediction and Modelling of Huntingtin and Amyloid Structures

Amyloidoses are diseases involving the formation of insoluble amyloid aggregates. These diseases gained very public recognition, at least in the UK, through Bovine Spongiform Encephalopathy (BSE, also known as 'Mad Cow Disease'), which resulted from cows being fed the carcasses of other cows. The unusual infectious agent believed to be responsible for this disease is an extremely stable misfolded form of the prion protein, essentially an amyloid protein. Furthermore, this amyloid protein could jump between species and resulted in an outbreak of new-variant Creutzfeldt-Jakob disease (nvCJD) in people who ate BSE-infected meat. Both BSE and nvCJD are neurodegenerative diseases, like other diseases associated with the formation of amyloid protein structures.

The Bioinformatics Group, headed by Prof. David Jones, is a world-leading group in protein structure prediction, using both fold recognition methods and, more fundamentally, ab initio structure prediction (new fold) techniques.

Our lab is applying these methods to understand the structure and formation of amyloid fibrils. In particular, pre-amyloid structures have gained recent attention since these are believed to represent the cytotoxic elements.

Moreover, the native proteins involved in polyglutamine expansion diseases, such as Huntingtin, tend to be very large multidomain proteins with segments of intrinsic disorder. Structural studies of these proteins could provide valuable information about their native cellular functions, which in many cases remain elusive. However, these structural studies are hampered by the sheer size of the proteins involved and by their intrinsically disordered regions. We are developing sequence-based prediction methods that use Machine Learning techniques to predict both protein disorder (DISOPRED) and protein domain structures (DomPred). In this way, we hope to help structurally characterize these large proteins, which are currently difficult to resolve using crystallography or NMR.

 

Tissue Specific Data Analysis and Stem Cells

The 9 currently known polyglutamine repeat diseases show similarities that tentatively suggest a common underlying mechanism, for instance similarities in the threshold length of glutamine repeats (around 35 +/- 10 glutamines) and similar relationships between age of onset and polyglutamine expansion length.
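As a small illustration of the repeat-length threshold, the sketch below finds the longest uninterrupted run of glutamine (Q) residues in a protein sequence and compares it with a cut-off. It is a minimal sketch under stated assumptions: the function names and toy sequences are hypothetical, the default threshold of 35 is simply the approximate central value quoted above (actual thresholds differ between diseases), and real repeat tracts may contain interruptions that this simple pattern match ignores.

```python
import re


def longest_glutamine_run(protein_sequence):
    """Return (start_index, length) of the longest uninterrupted run of 'Q'.

    Returns (-1, 0) if the sequence contains no glutamine at all.
    If several runs share the maximum length, the first one is reported.
    """
    best_start, best_length = -1, 0
    for match in re.finditer(r"Q+", protein_sequence.upper()):
        length = match.end() - match.start()
        if length > best_length:
            best_start, best_length = match.start(), length
    return best_start, best_length


def exceeds_repeat_threshold(protein_sequence, threshold=35):
    """True if the longest Q run is at least `threshold` residues long.

    The default of 35 is an illustrative assumption, not a clinical cut-off.
    """
    _, length = longest_glutamine_run(protein_sequence)
    return length >= threshold


# Toy sequences (hypothetical): the same flanking residues with a short
# and a long glutamine tract.
prefix, suffix = "MATLEKLMKAFESLKSF", "PPPPPP"
short_tract = prefix + "Q" * 21 + suffix
long_tract = prefix + "Q" * 45 + suffix

for name, seq in [("short", short_tract), ("long", long_tract)]:
    start, length = longest_glutamine_run(seq)
    print(name, start, length, exceeds_repeat_threshold(seq))
```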

However, each polyglutamine repeat disease also has unique characteristics. One striking aspect is that each disease involves degeneration of a distinct subset of neurons, although more general atrophy is often present. For example, microarray studies indicate that Huntington's disease (HD) causes different gene expression changes within different regions of the human brain (GSE3790, [1]). HD-specific changes in gene expression also occur within different tissues such as blood and muscle (see the Key Resources above). Extensive gene expression datasets are now also available for different human tissues (e.g. GSE7307 has over 600 samples covering more than 90 different types of tissue).

One of our key goals is to develop successful meta-analysis techniques to characterize tissue-specific gene expression differences and help understand tissue-specific differences that arise due to polyglutamine expansion diseases.

We are also carrying out meta-analysis studies on stem cells. These aim to improve our understanding of cell-type specific differences and of stem cells themselves, given their involvement in regenerative therapies that are in development for the treatment of various neurodegenerative diseases [2]. Along these lines, we have developed the Adam database within a large stem cell consortium effort supported by the Wellcome Trust.

 


[1] Hodges A, Strand AD, Aragaki AK, Kuhn A, Sengstag T, Hughes G, Elliston LA, Hartog C, Goldstein DR, Thu D, Hollingsworth ZR, Collin F, Synek B, Holmans PA, Young AB, Wexler NS, Delorenzi M, Kooperberg C, Augood SJ, Faull RL, Olson JM, Jones L, Luthi-Carter R (2006) Regional and cellular gene expression changes in human Huntington's disease brain. Hum Mol Genet. 15(6): 965-77.

[2] Clelland CD, Barker RA, Watts C (2008) Cell therapy in Huntington disease. Neurosurg Focus. 24(3-4): E9.

 

This page last modified 21 March, 2008 by Kevin Bryson

"We build too many walls and not enough bridges" (Isaac Newton 1643-1727)

 


Bioinformatics Group - University College London - Gower Street - London - WC1E 6BT - Telephone: +44 (0)20 7679 0409 - Copyright © 1999-2005 UCL

