Added Benchmarking CI with --hard-fail flag #7915
Conversation
Force-pushed from fc9f15a to ff99d04: "…INO and Tensorflowjs"
@JWLee89 thanks for the PR. We already have most export verification functionality included in utils/benchmarks.py, i.e. see #6613. This file runs all possible exports and checks export success, mAP, and speed if the exports succeed. We want to use this existing code as the basis for any CI updates, making just a few small changes to create hard rather than soft failures, i.e. `python utils/benchmarks.py --hard-fail`, and to also assert output mAPs above a threshold.

Colab Pro+ High-RAM CPU Results:
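As a rough illustration of the flow described above, here is a condensed sketch. The real loop lives in utils/benchmarks.py; `export_model` and `validate` below are hypothetical stand-ins for the actual export and validation calls, stubbed so the sketch runs:

```python
# Condensed sketch of the benchmark flow: try every export format, validate
# each successful export, and record soft failures instead of aborting.
EXPORT_FORMATS = ["TorchScript", "ONNX", "OpenVINO", "CoreML", "TFLite"]

def export_model(weights, fmt):  # hypothetical stand-in for the real export call
    return f"{weights}.{fmt.lower()}"

def validate(model_file):  # hypothetical stand-in for the real validation call
    return 0.45, 100.0  # (mAP@0.5:0.95, inference time in ms)

def benchmark(weights):
    results = []  # rows: (format, export_ok, mAP, inference_time)
    for fmt in EXPORT_FORMATS:
        try:
            model_file = export_model(weights, fmt)
            map50_95, speed = validate(model_file)
            results.append((fmt, True, map50_95, speed))
        except Exception:
            results.append((fmt, False, None, None))  # soft failure by default
    return results
```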
About deleting exported models, I'm not clear on the reason you want to delete them, but if you want to, you can simply point utils/benchmarks.py --weights to a new directory. The PyTorch model automatically downloads to this directory and all exports are located alongside it. After CI you can simply delete the directory, i.e.:

```bash
python utils/benchmarks.py --weights weights/yolov5n.pt
rm -rf weights/
```
Thank you for the clarification. I will update this PR to match the specifications by modifying utils/benchmarks.py. I just ran the test on my local device and it seems that this does take some time to run, so I am planning on creating a separate workflow. I will ping you when the updated code is ready for review.
I wanted to delete exported models so that, in case we run the CI workflow on a self-hosted server, we don't leave dangling files on the server's local filesystem. Some questions:
```yaml
# Download script/URL (optional)
download: https://ultralytics.com/assets/coco128.zip

# Benchmark values
benchmarks:
  # Metrics, e.g. mAP
  mAP:
    # Img size 640
    yolov5n: 0.45
    yolov5s: 0.45
    yolov5m: 0.45
    yolov5l: 0.45
    yolov5x: 0.45
    # Img size 1280
    yolov5n6: 0.5
    yolov5s6: 0.5
    yolov5m6: 0.6
    yolov5l6: 0.6
    yolov5x6: 0.6
```

For now, if the `benchmarks` property does not exist in the YAML file, we will skip the assertions.
@glenn-jocher I have added the assertions to check whether mAP lies above a certain threshold. The test runs in a CI workflow that invokes `python utils/benchmarks.py --weights ${{ matrix.model }}.pt --hard-fail`.

Summary of implementation details:
The mAP threshold check can be seen below. I applied a threshold value of 0.8 to show that it raises the appropriate error when a model does not meet the threshold requirements.
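A minimal sketch of the hard-fail behavior described here, reusing the `results` rows from the earlier export-loop sketch; `hard_fail` mirrors the --hard-fail flag (illustrative, not the PR's exact implementation):

```python
def assert_benchmarks(results, threshold, hard_fail=False):
    """Raise on export failure or sub-threshold mAP when hard_fail is set."""
    for fmt, export_ok, map50_95, _speed in results:
        if not export_ok:
            if hard_fail:
                raise RuntimeError(f"{fmt}: export or validation failed")
            continue  # soft failure: keep checking the remaining formats
        if threshold is not None and map50_95 < threshold:
            msg = f"{fmt}: mAP {map50_95:.3f} below threshold {threshold}"
            if hard_fail:
                raise RuntimeError(msg)  # non-zero exit fails the CI job
            print(f"WARNING: {msg}")
```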
Force-pushed from 53c6f08 to ca3d4aa
Force-pushed from d1145dd to ea77cc2
Force-pushed from 1c854b9 to 3fc7703: "…s and other human error"
@glenn-jocher When you have some time to spare, could you please take a look at the PR? I noticed that in #6613 the benchmarks are performed using yolov5s. Right now, the automation for model export failure and the mAP threshold check is being done for both. For additional information, please feel free to ask; details are also recorded in my previous comments. Thank you for taking time out of your schedule to look at the PR.
@JWLee89 yes, I've got this as a TODO, will get to it soon!
@glenn-jocher Thank you for letting me know!
@JWLee89 I've added benchmarking by default to all CI runs now in #7996 (yolov5/.github/workflows/ci-testing.yml, lines 40 to 43 at 09ba6f6).
This runs YOLOv5n at 320 and should produce these results:
@glenn-jocher Thank you for the follow-up! I noticed that you removed the …, and right now the CI benchmark testing is failing, because the mAP 0.5:0.95 for …

Is there any specific threshold value that you have in mind for …? If the mAP threshold check feature is not needed, I can also remove or comment out the code from this PR. Let me know. Thanks!
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions YOLOv5 🚀 and Vision AI ⭐.
👋 Hello there! We wanted to let you know that we've decided to close this pull request due to inactivity. We appreciate the effort you put into contributing to our project, but unfortunately, not all contributions are suitable or aligned with our product roadmap. We hope you understand our decision, and please don't let it discourage you from contributing to open source projects in the future. We value all of our community members and their contributions, and we encourage you to keep exploring new projects and ways to get involved. For additional resources and information, please see the links below:
Thank you for your contributions to YOLO 🚀 and Vision AI ⭐
@JWLee89 The …

Regarding the mAP threshold issue for …

Your efforts to improve the benchmarking process are valuable, and I appreciate your attention to detail.
@glenn-jocher Thank you for the update, and not a problem. If there are other high-priority or useful items that need to be worked on, I am more than happy to work on some of them. Since I have not been actively working on YOLO for a while, I might need a day or two to read through the project and source code again to get myself re-acquainted.
@JWLee89 Your willingness to contribute is greatly appreciated! Re-familiarizing yourself with the project sounds like a solid plan. Feel free to reach out if you have any questions or need assistance with anything. Your valuable insights and contributions are always welcome. Thank you for your support!
This PR is motivated by a quote in the following issue (#7870).
For now, this PR does not incorporate separate actions and benchmarking, to keep the PR compact, since adding all of these tests in a single PR would result in major changes, making it difficult to review.
Question regarding separate CI workflows for exports and benchmarking:
Should the CI workflows be run on all three OSes (Windows, Linux, macOS)? The reason I ask is that, as far as I know, TensorRT does not play well with macOS. Tests may be easier if we target a platform such as Ubuntu 18.04 or 20.04 and run the CI workflow inside a Docker container.
Changes

- Modified `ci-testing.yml`: replaced `python export.py --weights ${{ matrix.model }}.pt --img 64 --include torchscript onnx  # export` with pytest.
- Added `test_export.py`.
- Added `pytest` to `requirements.txt`.
- Added `conftest.py` to accept the pytest command line argument `--weights` (the path to the PyTorch `.pt` file); a sketch follows after this list.
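A minimal sketch of the `conftest.py` hook described in the last item above (the option name follows the PR description; the fixture body is illustrative):

```python
import pytest

def pytest_addoption(parser):
    # Register a custom CLI option, e.g.:
    #   pytest test_export.py --weights yolov5n.pt
    parser.addoption("--weights", action="store", default="yolov5s.pt",
                     help="path to the PyTorch .pt weights file")

@pytest.fixture
def weights(request):
    # Tests that declare a `weights` argument receive the path.
    return request.config.getoption("--weights")
```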
Future Works
Discovered issues

- `tensorflowjs` and OpenVINO do not play well together, due to OpenVINO supporting `numpy` ver < 1.20. `tensorflowjs`, on the other hand, does not work with `numpy` ver < 1.20 (a quick version check is sketched after this list).
- `check_requirements(('tensorflowjs',))` sometimes fails when installing `tensorflowjs`. The root cause is currently unknown, but I believe that we can reproduce it by following the GitHub workflow inside a Docker container.
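A quick, illustrative check of why the two constraints above are mutually exclusive (assumes the `packaging` package is installed):

```python
import numpy as np
from packaging.version import Version

v = Version(np.__version__)
openvino_ok = v < Version("1.20")   # OpenVINO constraint: numpy < 1.20
tfjs_ok = v >= Version("1.20")      # tensorflowjs constraint: numpy >= 1.20
print(f"numpy {v}: OpenVINO-compatible={openvino_ok}, tfjs-compatible={tfjs_ok}")
# Both can never be True in the same environment, hence the conflict.
```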
🛠️ PR Summary

Made with ❤️ by Ultralytics Actions
🌟 Summary
Enhanced benchmark integrity and added benchmark validation for model exports in YOLOv5.
📊 Key Changes
- Added a `--hard-fail` option to the benchmarking script to enforce stricter validation.
- Added benchmark (mAP) values to the `coco128.yaml` file for different models and image sizes.

🎯 Purpose & Impact
The changes primarily impact developers conducting benchmarks for YOLOv5 models, but they also benefit users by ensuring high-quality model performance standards.