High-Dimensional Data Analysis
A short course presented by
Olivier Thas (Ghent University, Belgium and University of Wollongong, Australia)
Monday and Tuesday, 5-6 February 2018
Building 6 Room 210
University of Wollongong Campus
High-dimensional data are generated by modern high-throughput technologies that have emerged in many disciplines, such as genomics, brain imaging, and the environmental sciences. Dimension reduction is often required to explore, visualise and analyse such data. This course offers practical, intermediate-level coverage of the analysis of high-dimensional data, with examples drawn from genomics, medicine, and the environmental and social sciences. A modern approach is taken to lay a foundation for data analysis, ranging from data exploration, visualisation and summarisation to large-scale hypothesis testing and building prediction models in the high-dimensional setting.
This 2-day course provides the fundamental statistical background for the analysis of high-dimensional data. It focuses on choosing the appropriate analysis to answer research questions about data exploration, data visualisation, prediction (regression and classification) and large-scale hypothesis testing (multiple testing).
- Basics of multivariate data exploration and traditional dimension reduction methods: Singular Value Decomposition (SVD), with Principal Components Analysis (PCA) and Multidimensional Scaling (MDS, also known as Principal Coordinates Analysis, PCoA) as special cases
- SVD, PCA and MDS for high-dimensional data (sparse methods)
- Visualisation of high-dimensional data, including biplots and t-SNE
- High-dimensional prediction/classification models, including model selection, feature selection, regularisation, and model assessment and validation
- Basics of Deep Learning
- Large-scale hypothesis testing, including False Discovery Rate (FDR) control and Empirical Bayes methods
Participants will receive a printed copy of the notes and slides used in the presentations and of the example computer programs. Participants are required to bring a laptop with a recent version of R installed. Loaner laptops may be available with sufficient prior notice. Additional material for R will be made available during the course.
This course is appropriate for any statistician or researcher with a solid statistics background and is useful for anyone who works with high-dimensional datasets.
Olivier Thas is a Professor of Statistics in the Faculty of Bio-Science Engineering, BioStat Research Group, Department of Mathematical Modelling, Statistics and Bio-Informatics of Ghent University in Belgium. He is also an Honorary Professorial Fellow in the National Institute for Applied Statistics Research Australia (NIASRA) at the University of Wollongong.
Olivier’s research covers many aspects of nonparametric and semiparametric statistics, with a particular focus on hypothesis testing and statistical methods for high-throughput and high-dimensional genomics data.
Fees and Information
|High-Dimensional Data Analysis||5-6 Feb||$1000||$900||$400|
|Location:||Smart Building (Building 6) Room 210|
|Duration:||Monday 5 Feb: Registration at 9:00 am; course from 9:15 am to 4:30 pm. Tuesday 6 Feb: Course from 9:15 am to 4:30 pm|
Morning and afternoon coffee/tea and a sandwich lunch on both days are included in the course fee.
Places are strictly limited and registrations will be processed as they are received.