Introduction to Statistical Learning with Python

pzuehlke/ISLP

ISLP – Solutions to Applied Exercises

Overview

This repository contains solutions to all applied exercises from the book:

G. James, D. Witten, T. Hastie, R. Tibshirani, and J. Taylor – An Introduction to Statistical Learning with Applications in Python – Springer (2023)

The book and all associated resources can be downloaded for free from the book's website. A complete set of 108 video lectures by the authors, covering all chapters and Python labs, is also available on YouTube.

Structure

Each solution consists of a separate interactive Jupyter notebook. Solutions are organized by chapter into corresponding folders which contain all the datasets and auxiliary files necessary to run the code.

To download the entire repository, click the Code button at the top of this page and select "Download ZIP". If you only want to read a specific solution, use the links below for direct access.
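Once the ZIP is extracted, the chapter-by-folder layout described above can be inspected with a short script. This is only a minimal sketch: it assumes that notebooks sit one level below the repository root, the function name `notebooks_by_folder` is hypothetical, and the actual folder names are not assumed.

```python
from pathlib import Path


def notebooks_by_folder(repo_root):
    """Map each immediate subfolder name to the sorted notebook file names in it."""
    root = Path(repo_root)
    result = {}
    for nb in sorted(root.glob("*/*.ipynb")):
        # Skip hidden folders such as .ipynb_checkpoints.
        if nb.parent.name.startswith("."):
            continue
        result.setdefault(nb.parent.name, []).append(nb.name)
    return result


if __name__ == "__main__":
    # "." assumes the script is run from the root of the extracted repository.
    for folder, notebooks in notebooks_by_folder(".").items():
        print(f"{folder}: {len(notebooks)} notebook(s)")
```

Run from the root of the extracted repository, this prints one line per folder with the number of notebooks it contains.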

Solutions

  2. Statistical Learning

  3. Linear Regression

  4. Classification

  5. Resampling Methods

  6. Linear Model Selection and Regularization

  7. Moving Beyond Linearity

  8. Tree-Based Methods

  9. Support Vector Machines

  10. Deep Learning

  11. Survival Analysis and Censored Data

  12. Unsupervised Learning

  13. Multiple Testing

Dependencies

To run the code in the notebooks, you need Python 3.x and the following packages:

  • Jupyter
  • NumPy
  • pandas
  • scikit-learn
  • PyTorch
  • statsmodels
  • Matplotlib
  • Seaborn
  • SciPy
  • SymPy
  • ISLP (a package by Prof. Jonathan Taylor, one of the co-authors)

The easiest way to satisfy all requirements is to follow the installation instructions that can be found on the book's website.
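After installation, a quick check like the one below can confirm that each package is visible to the current Python environment. It is a sketch rather than part of the repository: the distribution names are assumptions based on how the packages are published on PyPI (PyTorch installs as `torch`, for example, and Jupyter may be present as `notebook` or `jupyterlab` rather than the `jupyter` metapackage), and `check_packages` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names for the packages listed above.
REQUIRED = [
    "jupyter", "numpy", "pandas", "scikit-learn", "torch", "statsmodels",
    "matplotlib", "seaborn", "scipy", "sympy", "ISLP",
]


def check_packages(names):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in check_packages(REQUIRED).items():
        print(f"{name:<15} {ver or 'MISSING'}")
```

Any package reported as `MISSING` either is not installed in this environment or is installed under a different distribution name.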

I have tried as much as possible not to rely on the ISLP package, since it is less widely used than the others. However, at a few points its use is unavoidable, for example when an exercise explicitly asks for it or when it is needed to load a dataset.

Contributing

If you find any mistakes or have suggestions for improvements, please open an issue on this repository. Any feedback is appreciated!