Investigate DAG idempotency #4648
Labels
💻 aspect: code
Concerns the software code in the repository
🧰 goal: internal improvement
Improvement that benefits maintainers, not users
🟩 priority: low
Low priority and doesn't need to be rushed
🧱 stack: catalog
Related to the catalog and Airflow DAGs
🔧 tech: airflow
Involves Apache Airflow
Context
In a recent conversation about unplanned EC2 instance restarts and how they might affect our infrastructure, we began discussing how our current DAGs might handle a random dropout. Here's that conversation for posterity:
From @sarayourfriend:
From @AetherUnbound:
Description
We should audit our current set of DAGs to confirm that their behavior is consistent with what's described above, and that every DAG (critical or not) can be restarted easily if its worker is terminated partway through a run.
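As a rough illustration of the property the audit would look for, here is a minimal sketch of an idempotent write step in plain Python. It is not taken from our DAGs: `write_partition`, `output_dir`, and `run_date` are hypothetical names, and the sketch assumes a task whose output can be keyed by its run date. Re-running the step for the same date overwrites the earlier output rather than appending to it, and the write goes through a temporary file so that a worker killed mid-write does not leave a partial result behind.

```python
import json
import os
import tempfile
from pathlib import Path


def write_partition(output_dir: Path, run_date: str, records: list[dict]) -> Path:
    """Write the records for one run to a file named after the run date.

    Hypothetical helper for illustration only. Re-running it for the same
    run_date replaces the file instead of appending to it, so a retry after
    a terminated worker ends in the same final state as a clean run.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"{run_date}.json"

    # Write to a temporary file in the same directory first, so an
    # interrupted write never leaves a half-written target file.
    fd, tmp_name = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(records, tmp)
        # os.replace overwrites any output left by an earlier attempt and
        # is atomic when source and destination share a filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target
```

A step shaped like this can be retried any number of times for the same run date without duplicating data. Steps that append to a table or increment a counter are the ones the audit would most likely need to flag.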