Change Highlights
Renamed to zeus!
Until now we used zeus-ml because the name zeus was taken on PyPI, but now we're finally able to move to zeus:
pip install zeus
Prometheus Metrics
Zeus power and energy measurements can now be exported as Prometheus metrics! We currently support three metrics:
- Energy consumption of a fixed code range (Histogram)
- Power draw over time (Gauge)
- Cumulative energy consumption over time (Counter)
We wrote up a detailed metric monitoring guide and integration examples.
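To make the difference between the three metric types concrete, here is a minimal plain-Python sketch of what each one would report for the same run. It does not use Zeus or a Prometheus client. The sample values, bucket bounds, and function names are all hypothetical, and the trapezoidal integration is an assumption made for illustration, not a description of how Zeus computes energy.

```python
# Hypothetical power samples: (timestamp in seconds, power draw in watts).
samples = [(0.0, 100.0), (1.0, 200.0), (2.0, 200.0), (3.0, 100.0)]


def latest_power(samples):
    """Gauge-like view: the most recent power reading, which can go up or down."""
    return samples[-1][1]


def cumulative_energy(samples):
    """Counter-like view: total joules so far, which only ever increases.

    Assumes trapezoidal integration of power over time (illustrative only).
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def histogram_buckets(observations, upper_bounds):
    """Histogram-like view: Prometheus-style cumulative buckets.

    Each bucket counts the observations less than or equal to its upper bound,
    and a final +Inf bucket counts every observation.
    """
    bounds = [*upper_bounds, float("inf")]
    return {le: sum(1 for v in observations if v <= le) for le in bounds}


# Hypothetical energy (in joules) of three runs of the same fixed code range.
range_energies = [120.0, 480.0, 950.0]

print("gauge:", latest_power(samples))
print("counter:", cumulative_energy(samples))
print("histogram:", histogram_buckets(range_energies, [100.0, 500.0, 1000.0]))
```

The gauge answers "what is the power draw right now", the counter answers "how much energy has been used in total", and the histogram answers "how is the energy of one repeated code range distributed".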
AMD GPU enhancements
We created ROCm AMDSMI Python bindings (GitHub, PyPI) and integrated them with Zeus. Before this, users had to cd into their ROCm installation's AMDSMI distribution directory and run pip install, which isn't very convenient.
Our bindings are unofficial and community-maintained, but AMDSMI maintainers did take a look (ROCm/amdsmi#8).
Carbon Emission Estimations
The new zeus.monitor.carbon.CarbonEmissionMonitor takes in a carbon intensity provider (e.g., from ElectricityMaps) and provides an estimate for operational carbon emissions. The window-based API is essentially the same as ZeusMonitor.
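As a rough illustration of the arithmetic behind such an estimate, the sketch below multiplies measured energy by a carbon intensity. It assumes the common definition of operational emissions (energy in kWh times grid intensity in gCO2eq/kWh) with a single constant intensity value. The function name and the numbers are hypothetical, and this is not necessarily how CarbonEmissionMonitor computes its result (for example, a real provider may report intensity that varies over time).

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def estimate_emissions_g(energy_joules, intensity_g_per_kwh):
    """Hypothetical helper: operational emissions in grams of CO2-equivalent.

    Assumes one constant carbon intensity for the whole measurement window.
    """
    energy_kwh = energy_joules / JOULES_PER_KWH
    return energy_kwh * intensity_g_per_kwh


# Hypothetical window: 7.2 MJ of measured energy on a 400 gCO2eq/kWh grid.
emissions = estimate_emissions_g(7.2e6, 400.0)
print(f"{emissions:.1f} gCO2eq")
```

With these made-up inputs, 7.2 MJ is 2 kWh, so the estimate is 2 × 400 = 800 gCO2eq.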
Full Changelog
- [Misc] Reorganize Zeus NSDI 23 paper artifacts by @jaywonchung in #126
- [Docs] Add BUILD_SOCIAL_CARD env, skip social card build by default by @jaywonchung in #130
- [Feat] CarbonIntensityProvider and ElectricityMaps implementation by @danielhou0515 in #129
- [Misc] Fix link in PLO example README by @jaywonchung in #136
- Fix typo in profiler script by @dkopczyk in #138
- [Feat] amdsmi bindings integration by @parthraut in #132
- Make sure to assign EmptyCPUs to cpus if there is a permission error by @wbjin in #139
- [Feat] Implement CPU and DRAM monitoring for zeusd by @wbjin in #137
- [Fix] Fix tests failing due to deprecated app argument in httpx client by @jaywonchung in #140
- Out of Bounds Power Limit in GlobalPowerLimitOptimizer by @parthraut in #143
- [CI] Upgrade actions/cache to V4 by @jaywonchung in #144
- [Misc] Update Perseus paper link by @jaywonchung in #145
- [feat] CarbonEmissionMonitor by @danielhou0515 in #148
- Update zeusd dependencies following dependabot suggestions by @jaywonchung in #149
- [Feat] Prometheus metric export by @sharonsyh in #134
- Pytorch Fully Sharded Data Parallel (FSDP) Integration by @parthraut in #147
- Rename package from zeus-ml to zeus by @jaywonchung in #151
- Incorporate Zeusd for CPU and DRAM monitoring in ZeusMonitor by @michahn01 in #150
- Trace GPU ID in Zeusd GPU routes by @jaywonchung in #152
New Contributors
- @dkopczyk made their first contribution in #138
- @michahn01 made their first contribution in #150