Statistics
Courses
- Introduction to statistics with Python (notebooks) by Thomas Haslwanter - https://github.com/thomas-haslwanter/statsintro_python/tree/master/ipynb
- Model based machine learning: http://www.mbmlbook.com/toc.html
- A free online companion course to the Second Edition of An Introduction to Statistical Learning is available through edX - https://www.statlearning.com/online-course
- Michael Betancourt series on probability, modeling, etc. - https://betanalpha.github.io/writing/
Books
- Wasserman’s All of Statistics
- Shalizi’s Advanced Data Analysis from an Elementary Point of View
- Computer Age Statistical Inference: Algorithms, Evidence and Data Science
- Model Based Machine Learning by John Winn and others (including Chris Bishop)
- Intro to Probability for Data Science
- The Effect: An Introduction to Research Design and Causality
- Bayesian models of perception and action
Probability theory
- https://betanalpha.github.io/assets/case_studies/probability_theory.html
- https://betanalpha.github.io/assets/case_studies/conditional_probability_theory.html
Probabilistic programming
- Probabilistic programming workflow using PyMC3 - https://github.com/springcoil/pydataprobprog
Bayesian Data Analysis
- Lecture notes and videos by Aki Vehtari - https://github.com/avehtari/BDA_course_Aalto
Links
- Connection between statistical tests and linear models - https://lindeloev.github.io/tests-as-linear/
- Understanding ANOVA - http://www.stat.columbia.edu/~gelman/research/unpublished/econanova2.pdf
- Intro to probabilistic programming - https://arxiv.org/abs/1809.10756
- Imitation in Animals: Evidence, Function, and Mechanisms - http://pigeon.psy.tufts.edu/avc/zentall/default.htm
- Simpson's paradox - https://roamanalytics.com/2017/09/08/simpsons-paradox-and-causal-inference-with-observational-data/
- A Survey on Data Collection for Machine Learning: a Big Data - AI Integration Perspective - https://arxiv.org/abs/1811.03402
- Data science is science's second chance to get causal inference right: A classification of data science tasks - https://arxiv.org/abs/1804.10846
- High-Confidence Predictions under Adversarial Uncertainty - https://arxiv.org/abs/1101.4446
- Cheat Sheet: Subgradient Descent, Mirror Descent, and Online Learning - http://www.pokutta.com/blog/research/2019/02/27/cheatsheet-nonsmooth.html
- Toward a principled Bayesian workflow: A tutorial for cognitive science - https://osf.io/b2vx9/
- Description of different types of MCMC algorithms - https://m-clark.github.io/docs/ld_mcmc
- Lord's paradox, Simpson's paradox explanations - https://m-clark.github.io/docs/lord/index.html
- Probability cheatsheet - https://github.com/wzchen/probability_cheatsheet
- Importance Sampling - https://statweb.stanford.edu/~owen/mc/Ch-var-is.pdf
- Monte Carlo Strategies in Scientific Computing - https://github.com/szcf-weiya/MonteCarlo/blob/master/References/Monte-Carlo-Strategies-in-Scientific-Computing.pdf
- The ABC of ABC (Approximate Bayesian Computation) - https://xianblog.wordpress.com/2019/07/28/introductory-overview-lecture-the-abc-of-abc/amp/
- Falling (In Love With Principled Modeling) - https://betanalpha.github.io/assets/case_studies/falling.html
- Handy statistical lexicon from Gelman - https://statmodeling.stat.columbia.edu/2009/05/24/handy_statistic/
Uncertainty
- The hacker's guide to uncertainty estimates - https://erikbern.com/2018/10/08/the-hackers-guide-to-uncertainty-estimates.html
Extreme value theory (EVT)
- Novelty detection via EVT: https://pdfs.semanticscholar.org/b81a/a3046e7f9213949fb37e0c59cacdca4572c4.pdf
- Fitting extreme value distribution (Weibull): https://stackoverflow.com/questions/17481672/fitting-a-weibull-distribution-using-scipy
- Extreme value analysis, an introduction: https://hal-enac.archives-ouvertes.fr/hal-00917995/document
- EVT in context of discrete choice modeling: https://eml.berkeley.edu/books/choice2.html
Datasets
Economics
- The Economy - https://www.core-econ.org/the-economy/
- QuantEcon: Open source code for economic modeling - https://quantecon.org/
- Nashpy: Python library used for the computation of equilibria in 2 player strategic form games - https://nashpy.readthedocs.io/en/stable/index.html
Time Series
- Stumpy - https://github.com/TDAmeritrade/stumpy/
- Python Toolkit of Statistics for Pairwise Interactions (pyspi) - https://github.com/olivercliff/pyspi
- Time Series datasets for spatiotemporal data modeling from transportation data imputation project - https://github.com/xinychen/transdim
- Dynamic Mode Decomposition, with application to traffic patterns - http://www.pyrunner.com/weblog/2016/07/25/dmd-python/