bips-hb/JSS_innsight

Interpreting Deep Neural Networks with the Package innsight

This repository reproduces the results and figures from the paper "Interpreting Deep Neural Networks with the Package innsight", submitted to the Journal of Statistical Software (JSS). All results can be reproduced with the R script reproduction_material.R. The script is organized by the sections of the paper, and each section can be run on its own once the script's preamble has been executed successfully. It covers:

  • Section 4.1: The R code for the example with the penguin dataset (numerical input variables only). It reproduces Figures 7 (a) and (b).

  • Section 4.2: The R code for the example with the melanoma dataset (image and tabular inputs). It reproduces Figures 9 (a) and (b). This code uses the weights of an already trained model, which are stored in the folder additional_files/. The folder Melanoma_model_training/ explains how the model was trained; repeating the training requires considerable computing power because of the size of the model.

  • Section 5.1: The simulation study on the correctness of the implemented feature attribution methods compared with the reference implementations captum, zennit, innvestigate, deeplift, and shap. It reproduces Figures 10 (a) - (c).

  • Section 5.2: The simulation study on the runtime of the implemented feature attribution methods compared with the reference implementations captum, zennit, innvestigate, deeplift, and shap. It reproduces Figures 11 (a) and (b), 14, 15, 16, 17, and 18.

  • Appendix B: The code to reproduce the differences between innsight and innvestigate explained in Appendix B for the LRP α-β-rule when a bias vector occurs in the model.

After one of these (sub-)scripts has been executed, the corresponding figures are saved in the folder figures/.

Because each reference implementation places different constraints on the deep learning library and the available packages, the computations run in separate conda environments with the required packages and package versions. These conda environments are created in the preamble of the R script using the R script utils/create_condaenvs.R and are essential for reproducing the results. The first time you run the code, you will be asked whether you want to install them.

Reproduction of the results

Before executing any subsection of the R script reproduction_material.R, make sure that the requirement section has already been executed and that the required packages and conda environments are present.
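As a rough illustration of what "environments are present" could mean in practice, the following sketch compares a list of required environment names with the conda environments found on the machine. The environment names and the helper missing_envs() are hypothetical and chosen only for this example; the actual names are defined in utils/create_condaenvs.R. The lookup assumes the reticulate package and falls back to an empty list when reticulate or conda is unavailable.

```r
# Return the required environment names that are not yet available.
# (Hypothetical helper, not part of the repository.)
missing_envs <- function(required, available) {
  setdiff(required, available)
}

# HYPOTHETICAL environment names, for illustration only; the real names are
# defined in utils/create_condaenvs.R.
required <- c("env_captum", "env_zennit", "env_innvestigate")

# List existing conda environments via reticulate. If reticulate is not
# installed or no conda binary is found, fall back to an empty list so the
# sketch still runs.
available <- tryCatch(
  reticulate::conda_list()$name,
  error = function(e) character(0)
)

not_found <- missing_envs(required, available)
if (length(not_found) > 0) {
  message("Missing conda environments: ", paste(not_found, collapse = ", "))
}
```

If the message lists any environments, rerunning the preamble of reproduction_material.R should offer to install them.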

4.1. Example 1: Penguin dataset

In the first example, the penguin dataset provided by the palmerpenguins package is used and a neural network consisting of a dense layer is trained using the neuralnet package. To reproduce the results from the paper, run the corresponding section in the R script reproduction_material.R.

Figures 7 (a) and (b) from the paper are then saved in the folder figures/.

Figure 7

4.2. Example 2: Melanoma dataset

The second example examines the melanoma dataset from the 2020 Kaggle challenge issued by the Society for Imaging Informatics in Medicine (SIIM). It is based on the International Skin Imaging Collaboration (ISIC) archive, the most extensive publicly available collection of quality-controlled dermoscopic images of skin lesions. The dataset consists of 33,126 labeled images with associated patient-level contextual information, such as the age, gender, and location of the skin lesion or mole. Because the neural network is large and the data are complicated and high-dimensional, the definition and training of the model are explained in more detail in the folder Melanoma_model_training/. To reproduce the results from the paper, run the corresponding section in the R script reproduction_material.R, which loads the model weights stored at additional_files/melanoma_model.h5.

This creates the images for Figure 9 from the paper and places them in the folder figures/.

Figure 9

5. Validation and runtime

In the paper, our package innsight is compared with the reference implementations zennit, captum, innvestigate, shap, and deeplift in terms of correctness of results and runtime in a simulation study with shallow untrained models. The exact details of this simulation can be found in the paper. Each of these simulations takes quite a bit of time. To reproduce the correctness results, run the corresponding section in the R script reproduction_material.R.

Note: This simulation takes a lot of time. The runtime can be reduced significantly by creating fewer than $50$ models per architecture, which is controlled by the num_models value (e.g., num_models <- 1) in the R script named above. By default, the script uses the least time-consuming setting.

This creates the images for Figure 10 from the paper and places them in the folder figures/.

Figure 10

To start the simulation for the runtime measurement, run the corresponding section in the R script reproduction_material.R.

Note: This simulation takes a lot of time. The runtime can be reduced significantly by creating fewer than $20$ models per architecture, which is controlled by the num_models value in the R script named above. In addition, the grid of the varied parameter can be made coarser, e.g., for the number of hidden layers, set c(2, 25, 50) instead of c(2, seq(5, 50, by = 5)). By default, the script uses the least time-consuming setting.
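To make the effect of these two settings concrete, the sketch below expands both grids for the number of hidden layers and counts the models to be timed for this one parameter. It assumes, purely for illustration, that the work scales with the number of grid points times the number of models per architecture; the variable names other than num_models are not taken from the script.

```r
# Grid of hidden-layer counts quoted in the note (full setting)
full_grid <- c(2, seq(5, 50, by = 5))
# Coarser grid suggested in the note
coarse_grid <- c(2, 25, 50)

# Models per architecture: 20 in the full setting; 1 is an illustrative
# reduced value
num_models_full <- 20
num_models_reduced <- 1

# Rough count of models to time for this one parameter, ASSUMING the work
# scales with (grid points) x (models per architecture)
runs_full <- length(full_grid) * num_models_full
runs_reduced <- length(coarse_grid) * num_models_reduced

print(full_grid)
print(c(full = runs_full, reduced = runs_reduced))
```

Under this assumption the full grid has 11 points and 220 models, while the reduced setting has 3 points and 3 models.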

This creates the images for Figures 11 (a) and (b), 14, 15, 16, 17, and 18 from the paper and places them in the folder figures/.

Figure 11 (a)

Figure 11 (b)

Figures from Appendix A

Figure 14

Figure 15

Figure 16

Figure 17

Figure 18

Appendix B

The R script demonstrates the differences between innsight and innvestigate for the LRP α-β-rule, which are explained in more detail in Appendix B of the paper. To reproduce them, run the corresponding section in the R script reproduction_material.R.

This produces the following output:

── Results ─────────────────────────────────────────────────────────────────────

── iNNvestigate 
  epsilon_0.001 alpha_beta_1 alpha_beta_2  out
1     0.9868421            1            2 0.75

── innsight 
  epsilon_0.001 alpha_beta_1 alpha_beta_2  out
1     0.9868421    0.7499993     1.499999 0.75
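The printed values can also be compared numerically. The following sketch copies the numbers from the output above into named vectors and reports where the two packages agree; the tolerance tol is an arbitrary choice for this illustration and is not taken from the paper or the script.

```r
# Values copied from the printed output above
innvestigate <- c(epsilon_0.001 = 0.9868421, alpha_beta_1 = 1, alpha_beta_2 = 2)
innsight <- c(epsilon_0.001 = 0.9868421, alpha_beta_1 = 0.7499993,
              alpha_beta_2 = 1.499999)

# Absolute difference between the two packages for each rule
abs_diff <- abs(innvestigate - innsight)
print(abs_diff)

# Rules on which both packages agree up to an ARBITRARY illustrative tolerance
tol <- 1e-6
agree <- abs_diff < tol
print(agree)
```

Only the ε-rule column agrees; both α-β columns differ, which is the discrepancy discussed in Appendix B.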
