COMPM058 - Bioinformatics

This database contains the 2017-18 versions of syllabuses. Syllabuses from the 2016-17 session are available here.

Note: Whilst every effort is made to keep the syllabus and assessment records correct, the precise details must be checked with the lecturer(s).

Code: COMPM058 (also taught as COMPGI10)
Year: 4 (Masters)
Prerequisites: Students are expected to be familiar already with the principles of techniques such as neural networks, Support Vector Machines, and ideally Hidden Markov Models from earlier parts of their degree course. Students must also have taken the Supervised Learning option (COMPM055).
Term: 2
Taught by: David Jones (66%), Kevin Bryson (33%)

The overall aim of this course is to introduce students to the new field of bioinformatics (computational biology) and how machine learning techniques can be employed in this area.

The course is designed for students with no previous knowledge of biology, so Part 1 gives a basic introduction to molecular biology as background for bioinformatics.

Part 2 will concentrate on modern bioinformatics applications, particularly those which make good use of pattern recognition and machine learning methods.

Learning Outcomes
  • To have a basic knowledge of modern molecular biology and genomics.
  • To understand the advantages and disadvantages of different machine learning techniques in bioinformatics and how the relative merits of different approaches can be evaluated by correct benchmarking techniques.
  • To understand how theoretical approaches can be used to model and analyse complex biological systems.


Part 1: Basic molecular biology (6 lectures)

  • Introduction to Basic Cell Chemistry: Cell chemistry and macromolecules. Biochemical pathways e.g. Glycolysis. Protein structure and functions.
  • Cell Structure and Function: Cell components. Different types of cell. Chromosome structure and organisation. Cell division.
  • The Hereditary Material: DNA structure, replication and protein synthesis. Structure and roles of RNA. Genetic code. Mechanism of protein synthesis: transcription and translation. Mutation.
  • Recombinant DNA Technology: Restriction enzymes. Hybridisation techniques. Gene cloning. Polymerase chain reaction.
  • Genomics and Structural Genomics: Genes, genomes, mapping and DNA sequencing.

Part 2: Bioinformatics Applications (3 lectures per subject)

  • Biological Databases: Overview of the use and maintenance of different databases in common use in biology.
    • Case study: the CATH database of protein structure.
  • Gene Prediction: Methods for analysing genomic DNA to identify genes. Techniques: neural networks and HMMs.
  • Detecting Distant Homology: Methods for inferring remote relationships between genes and proteins. Techniques: dynamic programming, HMMs, hierarchical clustering.
  • Protein Structure Prediction: Methods for predicting the secondary and tertiary structure of proteins. Techniques: neural networks, SVMs, genetic algorithms and stochastic global optimization.
  • Transcriptomics: Methods for analysing gene expression and microarray data. Techniques: clustering, SVMs.
  • Agent-based Genome Analysis: Automation of genome analysis using intelligent software agents.
  • Drug Discovery Informatics: Approaches to drug discovery using bioinformatics techniques.

Method of Instruction

Lecture presentations with associated class problems, and group presentation/discussion of key research papers.


The course has the following assessment components:

  • Written Examination (2.5 hours, 85%)
  • Coursework Section (1 individual mini-project, 15%)

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined;
  • Obtain a minimum mark of 40% in each component worth ≥ 30% of the module as a whole.
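
As an illustration only, the two pass conditions above can be sketched as a small check. The sketch assumes that marks are percentages, that the overall mark is the weighted average of the components (85% examination, 15% coursework), and that the per-component minimum applies to components worth at least 30% of the module. The function and variable names are hypothetical; the precise rules should be confirmed with the lecturer(s).

```python
# Hypothetical sketch of the pass rule; not an official calculator.
# Weights are whole-number percentages of the module, taken from the
# assessment components listed above.
WEIGHTS = {"exam": 85, "coursework": 15}


def passes_module(marks, weights=WEIGHTS, overall_pass=50,
                  component_min=40, weight_threshold=30):
    """Return True if both pass conditions hold.

    marks   -- dict mapping component name to a mark in percent (0-100)
    weights -- dict mapping component name to its share of the module (percent)
    """
    # Condition 1: weighted overall mark of at least `overall_pass`.
    overall = sum(marks[name] * weights[name] for name in weights) / 100

    # Condition 2: at least `component_min` in every component whose
    # weight is at least `weight_threshold` (assumed reading of the rule).
    floors_met = all(
        marks[name] >= component_min
        for name, weight in weights.items()
        if weight >= weight_threshold
    )
    return overall >= overall_pass and floors_met


# Example: exam 55%, coursework 30% gives an overall mark of 51.25%.
print(passes_module({"exam": 55, "coursework": 30}))
```

With these weights only the examination is large enough to trigger the per-component minimum, and an overall mark of 50% already requires more than 40% in the examination, so under these assumptions the second condition never decides the outcome on its own for this module.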

The examination rubric is:

Answer THREE questions out of FIVE. All questions carry equal marks.


Reading list available via the UCL Library catalogue.