The module will provide a foundation in data science principles and techniques. Topics covered will include:
1. The role of code in data exploration and analysis.
2. Loading, selecting and visualizing data sets from different sources.
3. Basic programming for data analysis.
4. Data structures including arrays and data frames.
5. Principles of inference using simulation and resampling.
6. Straight-line relationships with correlation and regression.
7. Numerical optimization.
8. Regression for prediction and inference.
9. Bootstrap methods.
10. Machine learning methods for prediction.
Finally, students will complete an independent group data analysis project in which they find, load, clean, explore and analyze a data set of their choice, using inference or prediction methods as necessary. This will be the primary basis of the course assessment.
8 coursework assignments = 25% in total (the first four together = 20%; the last four together = 5%)
1 structured data analysis assignment = 30%
1 independent group project = 45%
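As a rough illustration of how these weights combine, the sketch below computes a final module mark from component marks. It assumes that the weight of each block of four coursework assignments is split equally among its assignments, which the breakdown above does not state. The names `WEIGHTS`, `per_assignment_weight` and `final_mark` are hypothetical.

```python
# Assessment weights as percentages of the final module mark,
# taken from the breakdown above.
WEIGHTS = {
    "coursework_first_four": 20.0,
    "coursework_last_four": 5.0,
    "structured_assignment": 30.0,
    "group_project": 45.0,
}


def per_assignment_weight(block_weight, n_assignments=4):
    """Weight of one assignment in a block.

    Assumption: the block's weight is shared equally among its assignments.
    """
    return block_weight / n_assignments


def final_mark(component_marks):
    """Weighted final mark on a 0-100 scale.

    component_marks maps each key in WEIGHTS to a mark between 0 and 100.
    """
    return sum(WEIGHTS[name] * component_marks[name] / 100 for name in WEIGHTS)


# The four components should account for the whole module mark.
assert sum(WEIGHTS.values()) == 100.0
```

Under the equal-split assumption, each of the first four coursework assignments would be worth 5% of the module mark and each of the last four 1.25%.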