This repository contains the Python implementation of Robust ATE identification from Multiple ENvironments (RAMEN), introduced in the paper "Doubly Robust Identification of Treatment Effects from Multiple Environments".
RAMEN addresses treatment effect identification by leveraging the heterogeneity of multiple data sources without requiring complete knowledge of the causal graph. In this context, identification refers to the discovery of a valid adjustment set—a set of covariates that can be used to correctly identify the treatment effect, enabling unbiased estimation in subsequent steps. In particular, RAMEN achieves doubly robust identification, meaning the treatment effect can be identified if either the causal parents of the treatment or those of the outcome are observed.
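To make the notion of a valid adjustment set concrete, here is a minimal sketch on made-up data. It is not part of the RAMEN package: the dataset, the function names (`naive_difference`, `adjusted_ate`), and the single binary covariate are assumptions chosen purely for illustration. It shows that once a valid adjustment set is known, averaging within-stratum contrasts recovers the treatment effect, while the unadjusted contrast does not.

```python
# Toy illustration (made-up data): X is a binary confounder, T a binary
# treatment, and Y = 3*T + 2*X, so the true ATE is 3.0 by construction.
from collections import defaultdict

# (x, t, number of units): treated units are over-represented when x == 1.
cells = [(0, 0, 40), (0, 1, 10), (1, 0, 10), (1, 1, 40)]
rows = [(x, t, 3.0 * t + 2.0 * x) for x, t, n in cells for _ in range(n)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Difference in mean outcomes between treated and control, ignoring X."""
    treated = mean(y for _, t, y in rows if t == 1)
    control = mean(y for _, t, y in rows if t == 0)
    return treated - control


def adjusted_ate(rows):
    """Backdoor adjustment: average the within-stratum contrasts over P(X)."""
    strata = defaultdict(list)
    for x, t, y in rows:
        strata[x].append((t, y))
    total = 0.0
    for units in strata.values():
        treated = mean(y for t, y in units if t == 1)
        control = mean(y for t, y in units if t == 0)
        total += (len(units) / len(rows)) * (treated - control)
    return total


naive = naive_difference(rows)  # biased by confounding through X
adjusted = adjusted_ate(rows)   # adjusts for the valid set {X}
print(f"naive={naive:.2f}, adjusted={adjusted:.2f}")
```

In this toy case the naive contrast is 4.2 while the adjusted estimate equals the true effect of 3.0. RAMEN's contribution is the step before this one: finding which covariates form such a set when the causal graph is not fully known.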
- Doubly Robust Identification: We introduce a novel double robustness property targeting identification rather than estimation, providing guarantees even with post-treatment and unobserved variables.
- Algorithms:
  - RAMEN: Uses a combinatorial search over subsets of covariates to find valid adjustment sets.
  - Insta-RAMEN: A scalable optimization procedure leveraging the Gumbel trick.
- Empirical Validation: Our algorithms outperform existing methods on synthetic and semi-synthetic datasets and show strong results on real-world data, aligning with epidemiological findings.
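The combinatorial search described for RAMEN can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in this repository: `search_adjustment_sets`, the `score` callable, and the `threshold` parameter are hypothetical names, and `toy_score` is a stand-in for whatever invariance criterion the method actually evaluates across environments.

```python
from itertools import combinations


def search_adjustment_sets(covariates, score, threshold):
    """Enumerate every subset of `covariates`; keep those scoring at or below `threshold`.

    `score` is a user-supplied callable (hypothetical here) mapping a tuple of
    covariate names to a non-negative violation value; lower is better.
    """
    accepted = []
    for size in range(len(covariates) + 1):
        for subset in combinations(covariates, size):
            value = score(subset)
            if value <= threshold:
                accepted.append((subset, value))
    # Stable sort, so the least-violating (and then smallest) subsets come first.
    return sorted(accepted, key=lambda pair: pair[1])


# Toy score for demonstration only: pretend the violation vanishes exactly
# when the subset contains "x1" and excludes the post-treatment variable "z".
def toy_score(subset):
    return float("x1" not in subset) + float("z" in subset)


candidates = search_adjustment_sets(["x1", "x2", "z"], toy_score, threshold=0.0)
print([subset for subset, _ in candidates])
```

Exhaustive enumeration visits 2^d subsets for d covariates, which is the scaling limitation that a continuous relaxation such as Insta-RAMEN is meant to address.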
The RAMEN folder contains the core package:
RAMEN/
│── data/ # Data loading and preprocessing
│ ├── datasets/ # Dataset files (e.g., ihdp_obs.csv)
│ ├── data.py # Data processing functions
│
│── evaluations/ # Evaluation scripts and utilities
│ ├── evaluate_semi_synthetic.py # Evaluation on semi-synthetic datasets
│ ├── evaluate_synthetic.py # Evaluation on synthetic datasets
│ ├── evaluations_utils.py # Evaluation helper functions
│
│── models/ # Model implementations
│ ├── insta_ramen.py # Insta-RAMEN model
│ ├── IRM.py # Invariant Risk Minimization (IRM) model
│ ├── ramen.py # RAMEN model
│ ├── models_utils.py # Model utility functions
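As background for the Gumbel trick mentioned in connection with Insta-RAMEN, the following is a minimal standard-library sketch of a relaxed binary mask over covariates. It is not taken from `models/insta_ramen.py`; the function names (`sample_gumbel`, `relaxed_mask`), the `logits`, and the `temperature` value are illustrative assumptions, and a practical version would presumably use differentiable tensors (for example in torch) rather than plain floats.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) sample via inverse transform sampling."""
    # Clamp away from 0 and 1 so that both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relaxed_mask(logits, temperature, rng):
    """Sample a relaxed binary mask with one value in [0, 1] per covariate.

    Each entry is sigmoid((logit + g1 - g0) / temperature), which is the
    two-class Gumbel-softmax (binary concrete) relaxation of a Bernoulli draw.
    """
    return [
        sigmoid((logit + sample_gumbel(rng) - sample_gumbel(rng)) / temperature)
        for logit in logits
    ]


rng = random.Random(0)
# Hypothetical logits: the first covariate is likely selected, the last unlikely.
mask = relaxed_mask([4.0, 0.0, -4.0], temperature=0.5, rng=rng)
print([round(m, 3) for m in mask])
```

Lower temperatures push the mask entries toward 0 or 1, so the relaxed mask approaches a discrete subset selection while remaining a smooth function of the logits.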
This package was developed with Python 3.13.2 and the libraries listed below. The pinned versions describe one working configuration rather than strict requirements:
numpy==2.2.2
pandas==2.2.3
torch==2.6.0
scikit-learn==1.6.1
scipy==1.15.1
tqdm==4.67.1
xgboost==2.1.4
To create and activate a dedicated conda environment:
conda create -n ramen_env python -y
conda activate ramen_env
Start by cloning the repository from GitHub. Then, upgrade pip to its latest version and use the local setup files to install the package.
git clone https://github.com/jaabmar/RAMEN.git
cd RAMEN
pip install --upgrade pip
pip install -e .
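After installation, it can be useful to compare the installed library versions against the ones listed in this README. The script below is a small sketch for that purpose and is not shipped with the repository; `REFERENCE_VERSIONS` and `compare_versions` are hypothetical names, and only the standard-library `importlib.metadata` module is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README, treated as a reference configuration.
REFERENCE_VERSIONS = {
    "numpy": "2.2.2",
    "pandas": "2.2.3",
    "torch": "2.6.0",
    "scikit-learn": "1.6.1",
    "scipy": "1.15.1",
    "tqdm": "4.67.1",
    "xgboost": "2.1.4",
}


def compare_versions(reference):
    """Return {package: (reference_version, installed_version_or_None)}."""
    report = {}
    for name, expected in reference.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed)
    return report


for name, (expected, installed) in compare_versions(REFERENCE_VERSIONS).items():
    if installed is None:
        status = "missing"
    elif installed == expected:
        status = "ok"
    else:
        status = "differs"
    print(f"{name:<14} reference={expected:<8} installed={installed} [{status}]")
```

A "differs" status does not necessarily indicate a problem; it only flags a deviation from the configuration listed above.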
To run synthetic experiments, navigate to the RAMEN/evaluations folder and execute:
python evaluate_synthetic.py --n_env 5 --n 1000 --n_features 2 --invariance Y --post_treatment collider --n_post 2 --ate 3.0 --seed 1
- --n_env: Number of environments
- --n: Number of samples for each environment
- --n_features: Number of pre-treatment features
- --invariance: Invariance setting (Y, T, TY)
- --post_treatment: Type of post-treatment features (collider, descendant, noise)
- --n_post: Number of post-treatment features
- --ate: Average treatment effect
- --seed: Random seed for reproducibility
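To run several configurations in a row, the documented flags can be combined programmatically. The helper below is a hypothetical convenience sketch, not part of the repository: `build_commands` and its defaults simply mirror the example invocation above, and the script only prints the commands rather than executing them.

```python
import itertools
import shlex

# Values documented in this README for the two categorical flags.
INVARIANCE_SETTINGS = ["Y", "T", "TY"]
POST_TREATMENT_TYPES = ["collider", "descendant", "noise"]


def build_commands(seeds, n_env=5, n=1000, n_features=2, n_post=2, ate=3.0):
    """Return one argument list per (invariance, post_treatment, seed) combination."""
    commands = []
    for invariance, post, seed in itertools.product(
        INVARIANCE_SETTINGS, POST_TREATMENT_TYPES, seeds
    ):
        commands.append([
            "python", "evaluate_synthetic.py",
            "--n_env", str(n_env),
            "--n", str(n),
            "--n_features", str(n_features),
            "--invariance", invariance,
            "--post_treatment", post,
            "--n_post", str(n_post),
            "--ate", str(ate),
            "--seed", str(seed),
        ])
    return commands


commands = build_commands(seeds=range(1, 4))
for cmd in commands:
    print(shlex.join(cmd))
# To actually launch the runs from RAMEN/evaluations, each list could be
# passed to subprocess.run(cmd, check=True).
```

With three seeds this prints 27 commands, one per combination of invariance setting, post-treatment type, and seed.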
We welcome contributions to improve this project. Here's how you can contribute:
- Fork the repository
- Create a new branch (git checkout -b feature-branch)
- Make your changes and commit (git commit -m "Description of change")
- Push to your branch (git push origin feature-branch)
- Open a Pull Request
For questions or collaborations, feel free to reach out:
- Javier Abad Martinez - [email protected]
- Piersilvio de Bartolomeis - [email protected]
- Julia Kostin - [email protected]
If you find this code useful, please consider citing our paper:
@article{debartolomeis2025doubly,
title={Doubly robust identification of treatment effects from multiple environments},
author={Piersilvio De Bartolomeis and Julia Kostin and Javier Abad and Yixin Wang and Fanny Yang},
journal={International Conference on Learning Representations},
year={2025},
}