This repository contains solutions to all applied exercises from the book:
G. James, D. Witten, T. Hastie, R. Tibshirani, and J. Taylor – An Introduction to Statistical Learning with Applications in Python – Springer (2023)
The book and all associated resources can be downloaded for free from the website linked above. A complete set of 108 video lectures by the authors, covering all of the chapters and Python labs, is also available on YouTube.
Each solution is a separate interactive Jupyter notebook. Solutions are organized by chapter into corresponding folders, which contain all the datasets and auxiliary files necessary to run the code.
To download the entire repository, click the "Code" button at the top of this page and select "Download ZIP". If you only want to read a specific solution, use the links below for direct access.
To run the code in the notebooks, you will need the following packages installed (in addition to Python 3.x):
- Jupyter
- NumPy
- pandas
- scikit-learn
- PyTorch
- statsmodels
- Matplotlib
- Seaborn
- SciPy
- SymPy
- ISLP (a package by Prof. Jonathan Taylor, one of the co-authors)
The easiest way to satisfy all requirements is to follow the installation instructions that can be found on the book's website.
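If you would like to check which of the packages listed above are already available in your environment, the following is a minimal sketch using only the standard library. The `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced for illustration, and the import names (for example `jupyter_core` for Jupyter, `sklearn` for scikit-learn, `torch` for PyTorch) are assumptions about how each package is usually imported.

```python
from importlib.util import find_spec

# Display name -> assumed top-level import name (not taken from the book).
REQUIRED = {
    "Jupyter": "jupyter_core",
    "NumPy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "PyTorch": "torch",
    "statsmodels": "statsmodels",
    "Matplotlib": "matplotlib",
    "Seaborn": "seaborn",
    "SciPy": "scipy",
    "SymPy": "sympy",
    "ISLP": "ISLP",
}


def missing_packages(modules=REQUIRED):
    """Return the display names whose import name cannot be located."""
    # find_spec returns None when a top-level module is not installed;
    # it does not import the module itself.
    return [name for name, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only reports whether each package can be located, not whether its version matches the one used in the book.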
I have tried to avoid relying on the ISLP package as much as possible, since it is not as standard as the others. However, at a few points its use is unavoidable (e.g., when an exercise explicitly asks you to use it or when it is needed to load a dataset).
If you find any mistakes or have suggestions for improvements, please open an issue on this repository. Any feedback is appreciated!