COMPGA16 - Malware

This database contains 2017-18 versions of the syllabuses. For current versions please see here.

Code: COMPGA16 (also taught as COMPM066)
Prerequisites: Undergraduate courses in logic and discrete mathematics, assembly, and imperative programming.
Taught by

David Clark (60%) [Module Leader]

Jens Krinke (20%)

Earl Barr (20%)

Hector Menendez (90%) [LAB support]


Aims

To provide students with:

  1. Specialist understanding of the issues and techniques in malware detection and classification.
  2. Broad understanding of the human, social, economic and historical context in which malware occurs.

Learning Outcomes

Successful completion of this course will provide students with a specialist understanding of the nature of malware, its capabilities, and how it is combatted through detection and classification. Students will understand the underlying scientific and logical limitations on society’s ability to combat malware. Furthermore, students should gain a broad understanding of the social, economic and historical context in which malware occurs.


Laboratory work (24% of assessment): nine 2-hour labs

Topics: Introduction (malware analysis, tools list). Lab 1: architecture; Labs 2 and 3: 8086 instructions; Lab 4: from C to assembly; Labs 5 and 6: radare2; Lab 7: static analysis; Lab 8: dynamic analysis (Wireshark, Pin); Lab 9: packing and unpacking (YARA, PEiD).


  1. The taxonomy of malware and its capabilities: viruses, Trojan horses, rootkits, backdoors, worms, targeted malware
  2. History of malware

The social and economic context for malware

  1. crime, anti-malware companies, legal issues, the growing proliferation of malware

Basic Analysis

  1. Signature generation and detection
  2. clone detection methods
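
As a rough illustration of signature-based detection, the sketch below scans a byte string for known byte patterns. The signature names and patterns are invented for this example and do not correspond to any real malware or to any tool used in the labs.

```python
# Hypothetical signature database: name -> byte pattern (made up for illustration).
SIGNATURES = {
    "demo-marker-a": bytes.fromhex("deadbeef"),
    "demo-marker-b": b"EVIL_PAYLOAD",
}


def scan(data: bytes) -> list:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)


clean = b"hello world"
infected = b"\x00\x01" + bytes.fromhex("deadbeef") + b"\x02"

print(scan(clean))     # []
print(scan(infected))  # ['demo-marker-a']
```

A plain substring match like this fails as soon as the pattern bytes change, which is one reason the obfuscation topics later in the syllabus matter.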

Static analysis theory

  1. program semantics
  2. abstract interpretation framework

Static Analysis

  1. System calls: dependency analysis issues in assembly languages; semantic invariance of system call sequences
  2. abstract interpretation as a formal framework for detection
  3. taint-based analyses
  4. semantic clones

Dynamic Analysis

  1. virtualization: semantic gap
  2. reverse engineering
  3. hybridisation with static analysis

Similarity metrics

  1. Kolmogorov complexity
  2. association metrics
  3. other entropy-based metrics
  4. NLP-based approaches
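
Kolmogorov complexity is not computable, so practical similarity metrics usually approximate it with a real compressor. The sketch below computes the normalized compression distance (NCD) using zlib as that stand-in. The choice of zlib and the sample byte strings are assumptions made for illustration, not part of the module material.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a proxy for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Illustrative inputs: a repetitive "program", a slight variant, and seeded noise.
sample = b"mov eax, ebx; push eax; call foo; " * 50
variant = sample + b"ret; "
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(2000))

print(ncd(sample, variant))  # small: the variant shares almost all structure
print(ncd(sample, noise))    # close to 1: nothing in common to compress away
```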

Problems in large scale classification

  1. scalability
  2. triage methods
  3. required false-positive (FP) rate
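
The false-positive requirement can be motivated with a base-rate calculation: when malware is rare among scanned files, even a small FP rate means most alerts are false alarms. The sketch below applies Bayes' rule. The prevalence and detection rates are hypothetical numbers chosen only to show the effect.

```python
def precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Fraction of flagged files that are truly malicious (Bayes' rule)."""
    true_alerts = prevalence * tpr          # malicious and flagged
    false_alerts = (1 - prevalence) * fpr   # benign but flagged
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical setting: 1 in 1000 files is malicious, detector catches 99% of them.
print(precision(0.001, 0.99, 0.01))    # about 0.09: most alerts are false alarms
print(precision(0.001, 0.99, 0.0001))  # about 0.91: a 100x lower FP rate fixes this
```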


Obfuscation

  1. Polymorphism
    1. compression
    2. encryption
    3. virtualization
  2. Metamorphism
    1. high level code obfuscation engines
    2. on-board metamorphic engines
    3. semantics-preserving rewritings
  3. Frankenstein

The theory of malware

  1. Rice’s theorem and the undecidability of semantic equivalence
  2. Adleman’s proof of the undecidability of the presence of a virus
  3. Cohen’s experiments on detectability and self-obfuscation

Method of instruction

Lectures, classroom-based exercises, and labs


Assessment

The module has the following assessments:

  • Examination (2.5 hours) (70%)
  • Coursework (30%)

To pass this module, students must:

  • Obtain an overall pass mark of 50% for all components combined.


Reading list available via the UCL Library catalogue.