🚧 Still Under Construction - please open issues if you find any bugs or have suggestions, thank you!
AERCA: Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery (ICLR 2025 Oral)
The AERCA algorithm performs robust root cause analysis in multivariate time series data by leveraging Granger causal discovery methods. This implementation in PyTorch facilitates experimentation on both synthetic and real-world datasets, allowing researchers to:
- Identify causality relationships among variables.
- Detect anomalies and trace them to their root causes.
- Compare performance across various types of datasets.
For further details, refer to the paper AERCA.
The code has been developed and tested using the following system setup:
- Operating System: Ubuntu 20.04
- GPU Driver: NVIDIA driver 535.216.03
- CUDA Version: 12.2
- Python Version: 3.11.11
- PyTorch Version: 2.6.0+cu118
- Python 3.11: Ensure that Python 3.11 is installed.
- Virtual Environment (Recommended): It is advisable to use a virtual environment to manage dependencies.
-
Install
virtualenv
(if not already installed):python3 -m pip install --user virtualenv
-
Create a virtual environment:
python3 -m venv venv
-
Activate the virtual environment:
source venv/bin/activate
-
Install the required packages:
pip install -r requirements.txt
-
Deactivate the virtual environment (when done):
deactivate
For more details on setting up a virtual environment, refer to the Python documentation.
Clone the repository from GitHub and navigate into the project directory. Replace my-project
with your preferred folder name:
git clone https://github.com/hanxiao0607/AERCA.git my-project
cd my-project
Execute the main script main.py to run the AERCA algorithm. You need to specify the dataset name using the --dataset_name
argument. The supported dataset names are:
linear
nonlinear
lorenz96
lotka_volterra
swat
msds
For example, to run the algorithm on the linear
dataset:
python main.py --dataset_name linear
Additional command-line options are available to customize the run-time behavior. To see all available options, execute:
python main.py --help
This will display details on parameters such as hyperparameter settings, logging options, and more.
The repository includes support for multiple datasets, each designed to evaluate the algorithm under different conditions:
- Linear Dataset: Ideal for evaluating causal relationships in linear systems.
- Nonlinear Dataset: Tests the model’s performance in systems with nonlinear interactions.
- Lorenz96: A synthetic dataset based on the Lorenz96 model, often used for studying chaotic dynamics.
- Lotka-Volterra: Derived from predator-prey dynamics, useful for ecological and biological time series analysis.
- MSDS: A real-world dataset from the MSDS collection.
- SWaT: A dataset for anomaly detection in water treatment systems.
Ensure that the dataset you choose is formatted as expected by the code. Additional preprocessing scripts or instructions may be provided within the repository as needed.
@inproceedings{
han2025root,
title={Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery},
author={Xiao Han and Saima Absar and Lu Zhang and Shuhan Yuan},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=k38Th3x4d9}
}
For any questions, issues, or contributions, please open an issue on GitHub or contact the repository maintainer.