This research investigates the locality characteristics of different XAI (eXplainable AI) methods on tabular data. We focus on comparing LIME and Integrated Gradients across various models and datasets.
- XAI Methods: LIME, Integrated Gradients
- Models: Deep learning and gradient boosting approaches
- Datasets: 3 standard + 3 synthetic datasets
- Distance Measures:
  - LIME: euclidean, manhattan, canberra, cosine
  - Gradient-based methods: all of the above + infinity
- Kernel Widths: half, double, default (LIME only)
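
For reference, the five distance measures listed above can be sketched in plain Python. This is only an illustration of the standard definitions, not the code used in the experiments; it assumes that "infinity" means the L-infinity (Chebyshev) distance and that "cosine" means cosine distance (one minus cosine similarity). The function and constant names are made up for this example.

```python
import math


def euclidean(a, b):
    """L2 distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """L1 distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def canberra(a, b):
    """Sum of |x - y| / (|x| + |y|); terms where both values are 0 count as 0."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total


def cosine(a, b):
    """Cosine distance: 1 - cosine similarity (undefined for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def infinity(a, b):
    """L-infinity (Chebyshev) distance: largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


DISTANCES = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "canberra": canberra,
    "cosine": cosine,
    "infinity": infinity,
}

# Per the list above: LIME uses four measures, gradient-based methods add infinity.
LIME_DISTANCES = ("euclidean", "manhattan", "canberra", "cosine")
GRADIENT_DISTANCES = LIME_DISTANCES + ("infinity",)
```

For larger arrays, an established library implementation such as `scipy.spatial.distance` is likely the better choice.
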
XAI Method | Completed Models | Status |
---|---|---|
LIME | ✅ All models (GBT, DL) | ✅ All LIME-compatible distances & kernel widths |
Integrated Gradients | ✅ All DL models | ✅ All distances |
XAI Method (Model Type) | Configuration | Total | Status |
---|---|---|---|
LIME (GBT) | 3 models × 6 datasets × 4 distances × 3 kernel widths | 216 experiments | ✅ Done |
LIME (Deep) | 6 models × 6 datasets × 4 distances × 3 kernel widths | 432 experiments | ✅ Done |
IG (Deep) | 6 models × 6 datasets × 5 distances | 180 experiments | ✅ Done |
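
The totals in the table follow from a full cross product of models, datasets, distances, and (for LIME) kernel widths. A small sketch that reproduces the arithmetic is shown below; the `grid` helper and the use of integer indices in place of real model and dataset names are assumptions made for illustration only.

```python
from itertools import product

LIME_DISTANCES = ["euclidean", "manhattan", "canberra", "cosine"]
IG_DISTANCES = LIME_DISTANCES + ["infinity"]
KERNEL_WIDTHS = ["half", "default", "double"]


def grid(n_models, n_datasets, distances, kernel_widths=(None,)):
    """Enumerate every (model, dataset, distance, kernel width) combination.

    Models and datasets are represented by indices here; kernel_widths
    defaults to a single placeholder for methods without a kernel.
    """
    return list(product(range(n_models), range(n_datasets), distances, kernel_widths))


counts = {
    "LIME (GBT)": len(grid(3, 6, LIME_DISTANCES, KERNEL_WIDTHS)),
    "LIME (Deep)": len(grid(6, 6, LIME_DISTANCES, KERNEL_WIDTHS)),
    "IG (Deep)": len(grid(6, 6, IG_DISTANCES)),
}

for name, n in counts.items():
    print(f"{name}: {n} experiments")
```
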
XAI Method | Pending Models |
---|---|
Anchor | All models (GBT, DL) |
SmoothGrad | All DL models |
..?.. | All models |
Dataset | Features | Samples | Description |
---|---|---|---|
Higgs | 28 | 940,160 | Binary classification of Higgs boson signals |
Jannis | 54 | 57,580 | Binary classification benchmark dataset |
MiniBooNE | 50 | 72,998 | Particle identification |
To be extended to all datasets integrated into PyTorch Frame; see description here:
Generated with scikit-learn's sklearn.datasets.make_classification:
Link to dataset
Complexity | Features | Informative Features | Clusters per Class | Description |
---|---|---|---|---|
Simple | 50 | 2 | 2 | Low complexity |
Medium | 50 | 10 | 3 | Medium complexity |
Complex | 100 | 50 | 3 | High complexity |
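
A minimal sketch of how the three complexity levels could be generated with `sklearn.datasets.make_classification`. Only `n_features`, `n_informative`, and `n_clusters_per_class` come from the table above; the sample count, the binary target, the random seed, and the names `SYNTHETIC_CONFIGS` and `make_synthetic` are assumptions for illustration, and all other arguments are left at scikit-learn's defaults.

```python
# Values taken from the table above.
SYNTHETIC_CONFIGS = {
    "simple": dict(n_features=50, n_informative=2, n_clusters_per_class=2),
    "medium": dict(n_features=50, n_informative=10, n_clusters_per_class=3),
    "complex": dict(n_features=100, n_informative=50, n_clusters_per_class=3),
}


def make_synthetic(name, n_samples=1000, random_state=0):
    """Generate one synthetic dataset.

    n_samples, n_classes=2, and random_state are illustrative assumptions,
    not settings confirmed by this repository.
    """
    # Imported here so the configs can be inspected without scikit-learn installed.
    from sklearn.datasets import make_classification

    return make_classification(
        n_samples=n_samples,
        n_classes=2,
        random_state=random_state,
        **SYNTHETIC_CONFIGS[name],
    )
```

Calling `X, y = make_synthetic("simple")` returns a feature matrix with 50 columns and a label vector. Note that `make_classification` requires `n_classes * n_clusters_per_class <= 2 ** n_informative`, which the "Simple" setting satisfies exactly (2 × 2 = 2²).
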
- ResNet (Gorishniy et al., 2021)
- ExcelFormer (Chen et al., 2023a)
- Trompt (Chen et al., 2023)
- FTTransformer (Gorishniy et al., 2021)
- TabNet (Arik & Pfister, 2021)
- TabTransformer (Huang et al., 2020)
- Simple MLP
- XGBoost
- LightGBM
- Distance Measures: euclidean, manhattan, canberra, cosine, infinity
- Kernel Widths (LIME only): half the default, the default, double the default
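
To make the three kernel-width settings concrete, here is a small sketch. It assumes that "default" refers to the default of the `lime` package for tabular data, which is `0.75 * sqrt(n_features)` combined with an exponential kernel; the helper names are hypothetical and this is not the experiment code itself.

```python
import math


def default_kernel_width(n_features):
    """Default width used by lime's tabular explainer: 0.75 * sqrt(n_features)."""
    return 0.75 * math.sqrt(n_features)


def kernel_widths(n_features):
    """The three settings compared in this study, derived from the default."""
    default = default_kernel_width(n_features)
    return {"half": 0.5 * default, "default": default, "double": 2.0 * default}


def lime_kernel(distance, kernel_width):
    """Exponential kernel that turns a distance into a sample weight in (0, 1]."""
    return math.sqrt(math.exp(-(distance ** 2) / kernel_width ** 2))


widths = kernel_widths(n_features=16)
for name, width in widths.items():
    # A wider kernel assigns more weight to a neighbour at the same distance.
    print(name, width, round(lime_kernel(2.0, width), 4))
```

In the `lime` package, a custom width can be passed via the `kernel_width` argument of `LimeTabularExplainer`.
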
This repository contains code adapted from the Python package PyTorch Frame (PyG team).
- Original source: GitHub link to original script
- License: MIT (link)
Modifications include dataset adaptation for our specific use case.