COMP0047 Data Analytics
This database contains the 2018-19 versions of syllabuses.
Note: Whilst every effort is made to keep the syllabus and assessment records correct, the precise details must be checked with the lecturer(s).
The module introduces data analytics and provides some fundamental data-science tools. Students will learn statistical tools to identify regularities and to discover patterns and laws in complex datasets, together with instruments to analyse, characterize, validate, parameterize and model complex data. Practical issues in business data analysis and statistics will be covered through specific case studies, including ones developed in collaboration with industry partners.
On successful completion of the module, a student will be able to:
- analyse the main statistical features of complex datasets;
- understand how to analyse and empirically characterize complex data;
- understand how to compute relevant statistical quantities and quantify their confidence intervals;
- understand how to build sensible models and how to parameterize and validate these models;
- understand how to quantify inter-dependency/causality structure between different variables;
- understand how to use the outcome of data-analytics to develop better tools for forecasting.
Availability and prerequisites
This module delivery is available for selection on the below-listed programmes. The relevant programme structure will specify whether the module is core, optional, or elective.
In order to be eligible to select this module as optional or elective, where available, students must meet all prerequisite conditions to the satisfaction of the module leader. Places for students taking the module as optional or elective are limited and will be allocated according to the department’s module selection policy.
Programmes on which available:
In order to be eligible to select this module, students must have a good knowledge of basic mathematics and statistics.
Empirical investigation of complex data
Essential practical familiarization with complex and big data, and with the software packages most commonly used to analyse them. Typical challenges with real data. Basics of data acquisition, manipulation, cleaning, filtering, representation and plotting.
Univariate and multivariate statistics
Marginal probability, joint probability and conditional probability. Empirical estimation of probability distributions. Measures of dependency. Cause and effect, Granger causality. Information theoretic measures: mutual information, transfer entropy. Spurious correlations and regularization. Forecasting and regressions. Hypothesis testing and validation.
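As an illustration of how joint and marginal probabilities combine into an information-theoretic dependency measure, the following is a minimal sketch of a plug-in (frequency-count) estimate of mutual information for discrete samples. The function name `mutual_information` and the toy inputs are assumptions made for this example and are not taken from the module material; plug-in estimates are known to be biased for small samples.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y), in bits, from paired discrete samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # empirical joint counts
    px = Counter(xs)              # empirical marginal counts of X
    py = Counter(ys)              # empirical marginal counts of Y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), expressed with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Two identical binary sequences: knowing one fully determines the other.
identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])

# Every (x, y) combination occurs equally often: no dependency in the sample.
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Unlike a linear correlation coefficient, this quantity is zero only when the empirical joint distribution factorizes into its marginals, so it also responds to non-linear dependency.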
Modelling and filtering through networks
Basics of complex networks: definitions and properties. Construction of networks of interactions from correlation and causality measures. Information filtering through networks.
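One way to picture building a network from correlations and then filtering it is sketched below: pairwise Pearson correlations are turned into distances and only the edges of a minimum spanning tree are kept. The choice of the minimum spanning tree, the distance sqrt(2(1 - rho)), the function names and the toy series are all assumptions made for illustration; the syllabus does not state which filtering methods the module uses.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def correlation_mst(series):
    """Return minimum-spanning-tree edges (u, v, distance) of the
    complete graph whose edge weights are correlation distances."""
    names = list(series)
    edges = []
    for u, v in combinations(names, 2):
        rho = pearson(series[u], series[v])
        # Strong positive correlation maps to a short distance.
        edges.append((sqrt(max(0.0, 2.0 * (1.0 - rho))), u, v))
    edges.sort()

    parent = {name: name for name in names}  # union-find forest

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    tree = []
    for dist, u, v in edges:  # Kruskal: add shortest edge that joins two components
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, dist))
    return tree


# Hypothetical toy data: four short series of five observations each.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.1, 2.0, 2.9, 4.2, 5.0],
    "C": [5.0, 3.0, 4.0, 1.0, 2.0],
    "D": [2.0, 1.0, 4.0, 3.0, 5.0],
}
mst = correlation_mst(toy)
```

The filtered network keeps N - 1 of the N(N - 1)/2 possible links, retaining the strongest correlations that still connect every variable.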
Constructing predictive probabilistic models from data. Testing and validating model performance. Selecting between alternative models.
Applications and case studies
The material and methods studied will be applied to practical cases and real data within the course through case studies developed in collaboration with industry partners. Some case studies will be discussed in class and used as demonstrations of the methodologies covered during the lectures. Other case studies will instead be given as assignments, and will represent the core material for the coursework.
An indicative reading list is available via http://readinglists.ucl.ac.uk/departments/comps_eng.html.
The module is delivered through a combination of lectures, practical exercises, in-class demonstrations and case studies.
This module delivery is assessed as below:
Data Analytics reports
In order to pass this module delivery, students must achieve an overall weighted module mark of 50%.