update case readme
zhouyu committed Jul 18, 2023
1 parent dd0a7af commit 1395ffd
Showing 1 changed file with 24 additions and 7 deletions.
31 changes: 24 additions & 7 deletions training/nvidia/mobilenetv2-pytorch/README.md
@@ -16,11 +16,28 @@


### Run results
| Training resources | Config file | Run time (s) | Target accuracy | Converged accuracy | Steps | Performance (samples/s) |
| -------- | --------------- | ----------- | -------- | -------- | ------- | ---------------- |
| 1 node x 1 GPU | config_A100x1x1 | | | | | |
| 1 node x 2 GPUs | config_A100x1x2 | | | | | |
| 1 node x 4 GPUs | config_A100x1x4 | | | | | |
| 1 node x 8 GPUs | config_A100x1x8 | 94208.62 | 70.634 | 70.634 | 1501500 | 4081.72 |
| 2 nodes x 8 GPUs | config_A100x2x8 | | | | | |
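* General metrics

| Metric | Value | Notes |
| -------------- | ----------------------- | ------------------------------------- |
| Task type | Image classification | |
| Model | MobilenetV2 | |
| Dataset | ImageNet2012 | |
| Data precision | precision, see "Performance metrics" | One of fp32/amp/fp16 |
| Hyperparameter changes | fix_hp, see "Performance metrics" | Special hyperparameters needed to saturate the hardware when measuring throughput |
| Hardware (short name) | nvidia A100 | |
| Device memory usage | mem, see "Performance metrics" | Commonly called "GPU memory"; unit is GiB |
| End-to-end time | e2e_time, see "Performance metrics" | Total time plus Perf initialization and similar overhead |
| Overall throughput | p_whole, see "Performance metrics" | Number of images actually trained divided by total time (performance_whole) |
| Training throughput | p_train, see "Performance metrics" | Excludes the evaluation time at the end of each epoch |
| **Compute throughput** | **p_core, see "Performance metrics"** | Excludes data I/O time (p3>p2>p1, presumably p_core > p_train > p_whole) |
| Training result | acc, see "Performance metrics" | Top-1 classification accuracy (acc1) |
| Additional changes | | |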

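The three throughput definitions in the table above can be sketched as follows. This is a minimal illustration, not the benchmark's own implementation: the function name `throughputs` and its parameters are hypothetical, and it assumes that p_core excludes both the evaluation time and the data I/O time, which is one reading consistent with the note p3>p2>p1.

```python
def throughputs(num_samples, total_time_s, eval_time_s, data_io_time_s):
    """Return (p_whole, p_train, p_core) in samples/s.

    Hypothetical helper; the argument names are assumptions for illustration.
    """
    # p_train drops the per-epoch evaluation time from the denominator.
    train_time_s = total_time_s - eval_time_s
    # Assumption: p_core additionally drops the data I/O time.
    core_time_s = train_time_s - data_io_time_s
    if core_time_s <= 0:
        raise ValueError("eval and data I/O time must be smaller than total time")

    p_whole = num_samples / total_time_s
    p_train = num_samples / train_time_s
    p_core = num_samples / core_time_s
    return p_whole, p_train, p_core


if __name__ == "__main__":
    # Made-up timings, only to show the ordering p_core > p_train > p_whole.
    print(throughputs(1000, 10.0, 2.0, 3.0))
```

Because each step removes time from the denominator while the sample count stays fixed, the three values can only increase from p_whole to p_core.
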
* Performance metrics

| Configuration | precision | fix_hp | e2e_time | p_whole | p_train | p_core | acc | mem |
| ------------------ | --------- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| A100, 1 node x 8 GPUs (1x8) | fp32 | / | 68451 | 5617 | 5784 | 5902 | 68.58% | 4.9/40.0 |
| A100, 1 node x 8 GPUs (1x8) | fp32 | bs=256,lr=0.36 | 43888 | 8763 | 9059 | 9274 | 66.7% | 21.7/40.0 |

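To read the two rows side by side, the short sketch below computes the relative change between them. The numbers are copied from the table; the labels `baseline` and `tuned` and the helper `compare` are hypothetical names introduced only for this illustration.

```python
# Values copied from the two 1x8 fp32 rows of the performance table.
runs = {
    "baseline": {"e2e_time": 68451, "p_whole": 5617, "acc": 68.58},  # fix_hp: /
    "tuned": {"e2e_time": 43888, "p_whole": 8763, "acc": 66.7},      # fix_hp: bs=256,lr=0.36
}


def compare(base, other):
    """Return ratios and the accuracy difference of `other` relative to `base`."""
    return {
        "p_whole_speedup": other["p_whole"] / base["p_whole"],
        "e2e_time_ratio": other["e2e_time"] / base["e2e_time"],
        "acc_delta_points": other["acc"] - base["acc"],
    }


if __name__ == "__main__":
    for name, value in compare(runs["baseline"], runs["tuned"]).items():
        print(f"{name}: {value:.3f}")
```

With these figures the tuned run reaches roughly 1.56x the overall throughput in about 64% of the end-to-end time, at a cost of about 1.9 points of top-1 accuracy.
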