Curated Content
Below we provide links to supplementary online material. Hopefully, some of the items will inspire you to view the module material in a broader context and lead to further investigations.
Investigation 1
What is Statistics?
- Cambridge Ideas - Professor Risk
-
https://www.youtube.com/watch?v=a1PtQ67urG4
Prof David Spiegelhalter (Cambridge University) discusses public understanding of risk. You may also be interested in reading (Spiegelhalter 2020).
- The Joy of Statistics
-
https://www.youtube.com/watch?v=jbkSRLYSojo
Prof Hans Rosling (Karolinska Institute and Gapminder Foundation) analyses data from 200 countries over 200 years in 4 minutes, in a clip from The Joy of Stats (BBC Four).
- Teach statistics before calculus!
-
https://www.ted.com/talks/arthur_benjamin_teach_statistics_before_calculus
Prof Arthur Benjamin (Harvey Mudd College) argues that the pinnacle of math education is probability and statistics — not calculus.
- Kaggle
-
https://www.kaggle.com/
Towards data science. -
https://www.youtube.com/watch?v=TNzDMOg_zsw
What’s Kaggle?
Investigation 2
Defence against the dark arts.
- Three ways to spot bad statistics
-
https://www.ted.com/talks/mona_chalabi_3_ways_to_spot_a_bad_statistic
Mona Chalabi (Data Journalist) discusses three ways to spot bad statistics.
- Statistics Done Wrong
-
https://www.statisticsdonewrong.com/
A book by Dr Alex Reinhart (Carnegie Mellon University).
- How to defend yourself against misleading statistics in the news
-
https://www.youtube.com/watch?v=mJ63-bQc9Xg
Sanne Blauw (Journalist) discusses how the presentation of statistics can mislead.
Investigation 3
Data analysis and visualisation.
- The Grammar of Graphics
-
https://www.youtube.com/watch?v=h-62NwWUI5c
What Makes A Good Visualisation? Rhys Jackson from RocketMill, a UK digital marketing agency, discusses data visualisation from a marketing perspective. -
https://www.youtube.com/watch?v=kepKM7Z2O54
David Keyes (RStudio) discusses how the grammar of graphics underpins the ggplot2 data visualization package in R.
- Same Stats, Different Graphs
-
https://www.autodeskresearch.com/publications/samestats
Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing (ACM SIGCHI Conference on Human Factors in Computing Systems) by Justin Matejka, George Fitzmaurice.
- Why do we so often use 0.05 for hypothesis testing?
-
https://www.openintro.org/book/stat/why05/
In this online exercise, you will gain an improved understanding of what a significance level is, and why a value in the neighbourhood of 0.05 is reasonable as a default.
- Data visualisations
-
https://flowingdata.com/
FlowingData blog by Nathan Yau. -
https://fivethirtyeight.com/
FiveThirtyEight blog by Nate Silver.
- Storytelling with data
-
http://www.storytellingwithdata.com/blog
Blog with nice hints and tips for how to present data in tables, graphics, and visualisations. -
https://community.storytellingwithdata.com/challenges
Monthly challenge.
Investigation 4
Statistical paradoxes.
- How statistics can be misleading (TED-Ed)
-
https://www.ted.com/talks/mark_liddell_how_statistics_can_be_misleading
Mark Liddell (Educator) discusses Simpson’s Paradox in this TED-Ed animation.
- Low birth-weight paradox
-
https://www.wikiwand.com/en/Low_birth-weight_paradox
- Gambler’s Fallacy
-
https://www.youtube.com/watch?v=4eVluL-idkM
Prof Kelly Shue (Chicago Booth) discusses the gambler’s fallacy.
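The gambler's fallacy can also be checked by simulation. The sketch below is an illustration only, not taken from the linked video: it flips a simulated fair coin many times and looks at the flips that come immediately after a run of at least five heads. The function name `next_flip_after_streak`, the seed, and the run length are hypothetical choices made for this example.

```python
import random


def next_flip_after_streak(n_flips, streak_length, seed=0):
    """Simulate fair coin flips and return (share of heads, count) for the
    flips that immediately follow a run of at least `streak_length` heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    current_streak = 0         # number of consecutive heads just seen
    followers = 0              # flips that came right after a long run
    heads_after_streak = 0     # how many of those flips were heads
    for _ in range(n_flips):
        is_head = rng.random() < 0.5
        if current_streak >= streak_length:
            followers += 1
            heads_after_streak += is_head
        current_streak = current_streak + 1 if is_head else 0
    return heads_after_streak / followers, followers


share, count = next_flip_after_streak(n_flips=200_000, streak_length=5)
print(f"{count} flips followed a run of 5+ heads; {share:.3f} of them were heads")
```

If the coin were "due" a tail after a run of heads, the printed share would sit well below one half. For independent flips it should stay close to 0.5, whatever the length of the preceding run.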
Investigation 5
The law and interpreting statistics.
- How stats fool juries.
-
https://youtu.be/kLmzxmRcUTo
Prof Peter Donnelly (Oxford University) discusses common mistakes in interpreting statistics.
- Measurement Uncertainty Calculator (MUCalc)
-
https://discovery.dundee.ac.uk/en/publications/measurement-uncertainty-calculator-mucalc
The Leverhulme Research Centre for Forensic Science Measurement Uncertainty Calculator (MUCalc) is an application for calculating measurement uncertainty in accordance with the ISO/IEC 17025 standard of the International Organization for Standardization.
- Prosecutor’s fallacy
-
https://www.wikiwand.com/en/Prosecutor%27s_fallacy
A fallacy of statistical reasoning, typically used by a prosecutor to exaggerate the likelihood of guilt. It is a fallacy because, in general, $P(\text{hypothesis} \mid \text{evidence}) \neq P(\text{evidence} \mid \text{hypothesis})$.
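To see how far apart the two conditional probabilities can be, the following sketch applies Bayes' theorem to a forensic match. All numbers are hypothetical and chosen only for illustration: a pool of one million possible sources, a random-match probability of 1 in 10,000, and no other evidence. The helper `posterior_probability` is likewise an illustrative name, not part of any linked resource.

```python
def posterior_probability(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' theorem: return P(H | E) from P(H), P(E | H) and P(E | not H)."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical numbers, for illustration only.
# H = "the suspect is the source of the sample"
# E = "the sample matches the suspect"
population = 1_000_000                  # people who could plausibly be the source
prior = 1 / population                  # no other evidence: everyone equally likely
p_match_given_source = 1.0              # the true source always matches
p_match_given_not_source = 1 / 10_000   # assumed random-match probability

posterior = posterior_probability(
    prior, p_match_given_source, p_match_given_not_source
)
print(f"P(match | not source) = {p_match_given_not_source:.4%}")
print(f"P(source | match)     = {posterior:.4%}")
```

Under these assumptions, roughly 100 people who are not the source would also be expected to match, so the match on its own leaves only about a 1% probability that the suspect is the source. Quoting the 1 in 10,000 figure as if it were the probability of innocence is the fallacy.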
Investigation 6
Data-driven decision making in epidemiology.
- Project Tycho
-
https://www.tycho.pitt.edu/
Digitized archival epidemiological data for the United States and the world. -
https://www.youtube.com/watch?v=Kn9OJy1BPDo
An overview of the origins of project Tycho.
- Our World in Data
-
https://ourworldindata.org/
A project of the Oxford Martin School to make public health data, including progress in UN Sustainable Development Goals, available and accessible.
- Demographic Party Trick
-
https://www.youtube.com/watch?v=2nDh8MQuS-Y
Prof Hans Rosling (Karolinska Institute and Gapminder Foundation) and Bill Gates seek to shed light on the true statistics of childhood vaccinations.
Investigation 7
Spurious correlations!
- The danger of mixing up causality and correlation
-
https://www.youtube.com/watch?v=8B271L3NtAw
Prof Ionica Smeets (University of Leiden) discusses causality and correlation.
- Spurious correlations
-
https://tylervigen.com/spurious-correlations
Tyler Vigen’s site dedicated to spurious correlations.
- Cause & Effect
-
https://www.youtube.com/watch?v=lbODqslc4Tg
Correlation vs. causality in a clip from the 2010 documentary “Freakonomics: The Movie”.
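One reason spurious correlations are easy to find is that series which drift over time often correlate strongly with each other by chance. The sketch below is an illustration under simple assumptions, not a method from the linked material: it compares pairs of independent random walks with pairs of independent white-noise series and counts how often the absolute Pearson correlation exceeds 0.5. All function names, the threshold, and the series length are hypothetical choices.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def random_walk(rng, n):
    """Cumulative sum of independent standard normal steps (a drifting series)."""
    position, path = 0.0, []
    for _ in range(n):
        position += rng.gauss(0, 1)
        path.append(position)
    return path


def white_noise(rng, n):
    """Independent standard normal values (no drift)."""
    return [rng.gauss(0, 1) for _ in range(n)]


def share_strongly_correlated(make_series, trials=500, length=100,
                              threshold=0.5, seed=1):
    """Share of independent pairs whose |correlation| exceeds `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        r = pearson(make_series(rng, length), make_series(rng, length))
        hits += abs(r) > threshold
    return hits / trials


walk_share = share_strongly_correlated(random_walk)
noise_share = share_strongly_correlated(white_noise)
print(f"random walks: {walk_share:.1%} of pairs have |r| > 0.5")
print(f"white noise:  {noise_share:.1%} of pairs have |r| > 0.5")
```

Every pair here is unrelated by construction, so any strong correlation among the random walks is spurious. Many real-world series that are plotted against time behave more like the drifting walks than like the white noise.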
Investigation 8
Data and Society: can data-driven and predictive modelling lead to a better world? What are the ethics of mass data collection?
- Science behind the news: Predictive Policing
-
https://www.youtube.com/watch?v=74_jreara3w
The Los Angeles Police Department is using a new tactic in their fight against crime called “predictive policing.” It’s a computer program originally developed by a team at UCLA, including mathematician Andrea Bertozzi and anthropologist Jeff Brantingham. “Science Behind the News” is produced in partnership with NBC Learn. (Provided by the National Science Foundation & NBC Learn)
- You should get paid for your data
-
https://www.nytimes.com/video/opinion/100000006678020/data-privacy-jaron-lanier-2.html
Jaron Lanier (Computer Scientist and Author) discusses a compensation plan and data dignity. -
https://www.ted.com/talks/jennifer_zhu_scott_why_you_should_get_paid_for_your_data
Jennifer Zhu Scott (Computer Scientist) also thinks you should get paid for your data.
- How tech companies deceive you into giving up your data and privacy
-
https://www.ted.com/talks/finn_lutzow_holm_myrstad_how_tech_companies_deceive_you_into_giving_up_your_data_and_privacy
Finn Lützow-Holm Myrstad (Norwegian Consumer Council) discusses consumer protections and data collection.
- Your company’s data could help end world hunger
-
https://www.ted.com/talks/mallory_freeman_your_company_s_data_could_help_end_world_hunger
Mallory Freeman (Data Scientist) discusses how to do the most good with data.
Investigation 9
Machine learning / big data.
- What is Machine Learning?
-
https://www.youtube.com/watch?v=f_uwKZIAeM0
Oxford Sparks discusses supervised learning algorithms and how machine learning is used all around us.
- Big Data (TED-Ed)
-
https://www.youtube.com/watch?v=j-0cUmUyb-Y
Tim Smith (educator) discusses the historical arc of big data in this TED-Ed animation.
- The human insights missing from big data
-
https://www.ted.com/talks/tricia_wang_the_human_insights_missing_from_big_data
Tricia Wang (Ethnographer) discusses the human insights missing from big data.
- How we can find ourselves in data
-
https://www.ted.com/talks/giorgia_lupi_how_we_can_find_ourselves_in_data
Giorgia Lupi (Designer) discusses a humanistic approach to data and data visualization.