COMPG011 - Data Analytics

This database contains the 2017-18 versions of syllabuses. Syllabuses from the 2016-17 session are available here.

Note: Whilst every effort is made to keep the syllabus and assessment records correct, the precise details must be checked with the lecturer(s).

Code: COMPG011
Year: MSc
Prerequisites

Good knowledge of basic mathematics and statistics.

Term: 2
Taught By

Tomaso Aste (50%)
Fabio Caccioli (50%)

Aims

The course aims to introduce students to data analytics by providing some basic data-science tools. Statistical tools for identifying regularities and discovering patterns and laws in complex datasets will be introduced, together with instruments to analyse, characterize, validate, parameterize and model complex data. Practical issues in business data analysis and statistics will be covered through specific case studies, including in collaboration with industrial partners.

Learning Outcomes

Students will be able to analyse the main statistical features of complex datasets. On successful completion of the course, a student should have a good understanding of: 1) how to analyse and characterize complex data empirically; 2) how to compute relevant statistical quantities and quantify their confidence intervals; 3) how to build sensible models and how to parameterize and validate these models; 4) how to quantify the inter-dependency/causality structure between different variables; 5) how to use the outcome of data analytics to develop better tools for forecasting.
Applications:
There is a great need to increase data-analytics capability in the business community, and data scientists are in increasing demand. The instruments and tools provided by this course are essential to understand, model and make practical use of the very large quantity of data that most businesses are currently collecting.
Further information and material are available to students on the course Moodle page.

Content

Empirical investigation of complex data

Essential practical familiarization with complex and big data. Typical challenges with real business data. Basics of data acquisition, manipulation, cleaning, filtering, representation and plotting.

Univariate and multivariate statistics

Marginal probability, joint probability and conditional probability. Empirical estimation of probability distributions. Measures of dependency. Cause and effect. Granger causality, mutual information, transfer entropy. Spurious correlations and regularization. Forecasting and regressions. Calibration, validation, hypothesis testing.
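
As an illustration of one of the dependency measures listed above, the following is a minimal sketch of a plug-in (empirical frequency) estimate of mutual information between two discrete variables. It is not taken from the course material: the function name `mutual_information`, the synthetic data and all parameter values are assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) for two discrete sequences.

    Hypothetical helper: probabilities are replaced by empirical frequencies.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # empirical joint counts
    px = Counter(xs)               # empirical marginal counts of x
    py = Counter(ys)               # empirical marginal counts of y
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi

# Synthetic example (assumed data): y_dep copies x 80% of the time,
# y_ind is drawn independently of x.
random.seed(0)
xs = [random.randint(0, 3) for _ in range(2000)]
ys_dep = [x if random.random() < 0.8 else random.randint(0, 3) for x in xs]
ys_ind = [random.randint(0, 3) for _ in range(2000)]

mi_dep = mutual_information(xs, ys_dep)
mi_ind = mutual_information(xs, ys_ind)
```

The dependent pair should give a clearly larger value than the independent pair. The plug-in estimator is biased upwards on finite samples, so the independent pair typically yields a small positive value rather than exactly zero.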

Modelling and filtering through networks

Basics of complex networks: definitions and properties. Construction of networks of interactions from correlation and causality measures. Information filtering through networks.
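
One common way to build a filtered network from correlations is a minimum spanning tree over a correlation-based distance. The sketch below is an assumption about how such a construction might look, not the specific method taught in the course: the distance `sqrt(2 * (1 - rho))`, the helper names `pearson` and `correlation_mst`, and the synthetic series are all illustrative choices.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def correlation_mst(series):
    """Minimum spanning tree (Kruskal) over the distance sqrt(2 * (1 - rho)).

    `series` maps a variable name to its list of observations.
    Returns a list of (name_a, name_b, distance) edges.
    """
    names = list(series)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(series[a], series[b])
            edges.append((math.sqrt(max(0.0, 2.0 * (1.0 - rho))), a, b))
    edges.sort()  # shortest distance (strongest correlation) first

    parent = {name: name for name in names}  # union-find forest

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    tree = []
    for dist, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:       # keep the edge only if it joins two components
            parent[root_a] = root_b
            tree.append((a, b, dist))
    return tree

# Synthetic example (assumed data): A and B share one common factor, C and D another.
random.seed(1)
f1 = [random.gauss(0, 1) for _ in range(500)]
f2 = [random.gauss(0, 1) for _ in range(500)]
series = {
    "A": [v + 0.3 * random.gauss(0, 1) for v in f1],
    "B": [v + 0.3 * random.gauss(0, 1) for v in f1],
    "C": [v + 0.3 * random.gauss(0, 1) for v in f2],
    "D": [v + 0.3 * random.gauss(0, 1) for v in f2],
}
tree = correlation_mst(series)
```

For N variables the tree keeps N - 1 of the N(N - 1)/2 possible links. In this example the strongly correlated pairs (A, B) and (C, D) are retained, plus one weaker edge that connects the two groups.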

Applications and case studies

The studied material and methods will be applied to practical cases and real data within the course, through case studies developed in collaboration with industrial partners.

Method of Instruction

3 hours of lectures per week, practical exercises, case studies.

Assessment

The course has the following assessment components:

  • Coursework (100%)

To pass this course, students must:

  • Obtain an overall pass mark of 50% for all sections combined

Resources

Reading list available via the UCL Library catalogue.