SDA Statistics and Data Analysis
Course Description
This course introduces fundamental concepts of statistics and data analysis, ranging from dimension reduction to analysis of variance. By the end of the course, students should be able to rigorously address a wide range of standard statistical questions that naturally arise when trying to understand real data. This requires knowing both the theoretical foundations of the methods and their practical implementation. We encourage students to keep both perspectives in mind at all times. Typically:
- when working on theoretical aspects, ask yourself what they imply on the practical side;
- before applying a method to data, make sure that its theoretical assumptions are satisfied.
For students planning to take more advanced statistics courses in the coming years, the tools provided in this course already make it possible to tackle active research questions, and we can provide references and resources for those interested.
Solving Real Problems with Statistical Methods
Software
- The learning materials are developed for R version 3.6.0 or later. We also recommend installing the latest version of RStudio.
- To install all requirements, copy and paste this line of code into your R console:
source("https://quentin-duchemin.github.io/ENPC-SDA/install.R")
- For a reminder of basic functions in R and R Markdown, you may download the following file: https://cran.r-project.org/doc/contrib/Kauffmann_aide_memoire_R.pdf
- All data can be downloaded locally by following this link.
- For your personal projects, we give a short list of websites where you may find interesting datasets.
Learning materials
- Introduction
- Chi-Squared Test of Independence
- Two-sample Homogeneity Tests
- Fisher and Student Tests
- Two-sample Homogeneity Non-Parametric Tests
- Analysis of Variance
- Logistic Regression
- Additional Exercises
Teachers
- Julien Reygner
- Gabriel Stoltz
- Guillaume Perrin
- Quentin Duchemin
The source for this course webpage is on GitHub.