Slow inference of bottom-up models #138

Closed · Witek- opened this issue Sep 19, 2020 · 10 comments · Fixed by #142

Witek- commented Sep 19, 2020
I am getting very slow performance on a single-person video on an RTX 2080 Ti with the following command:

python demo/bottom_up_video_demo.py configs/bottom_up/resnet/coco/res152_coco_512x512.py res152_coco_512x512-364eb38d_20200822.pth --video-path demo/tabata1_640x360.mp4 --show --device cuda:0

FPS ~2.5. It was calculated as the number of video frames divided by the total processing time, so it is not super precise...

Any idea what might be wrong?

innerlee (Contributor) commented

Inference speedup is on our list #9 (comment)
To gather more info, could you tell us what the expected speed is in this case?

jin-s13 (Collaborator) commented Sep 21, 2020

For higher inference speed, you may try to set adjust=False, refine=False, flip_test=False in the config file.
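For reference, a minimal sketch of what that change looks like; placing these flags in test_cfg is an assumption based on how bottom-up configs were typically structured at the time, not an excerpt copied from res152_coco_512x512.py:

```python
# Hypothetical excerpt of configs/bottom_up/resnet/coco/res152_coco_512x512.py.
# Only the three flag names come from this thread; the test_cfg placement
# and comments are assumptions.
test_cfg = dict(
    flip_test=False,  # skip the extra forward pass on the flipped image
    adjust=False,     # skip sub-pixel adjustment of keypoint locations
    refine=False,     # skip the keypoint refinement step
)
```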

innerlee (Contributor) commented

Maybe we should add a section to the docs explaining how to set these flags for a fast demo, and how to set them for an accurate one.

Witek- (Author) commented Sep 21, 2020

> For higher inference speed, you may try to set adjust=False, refine=False, flip_test=False in the config file.

I divided the number of frames by the time of the entire while loop, so the FPS is now calculated more accurately:

standard config - 2.4 fps
adjust=False - 2.5 fps - marginal gain
refine=False - 10.3 fps
adjust=False and refine=False - 10.4 fps
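The measurement loop looks roughly like this (a sketch assuming OpenCV capture, not the demo script's actual code):

```python
# Minimal sketch of the FPS measurement described above: time the whole
# while loop, then divide the frame count by the elapsed time.
import time
import cv2

cap = cv2.VideoCapture('demo/tabata1_640x360.mp4')
num_frames = 0
start = time.time()
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # ... run inference_bottom_up_pose_model(pose_model, frame) here ...
    num_frames += 1
elapsed = time.time() - start
print(f'{num_frames / elapsed:.1f} fps over {num_frames} frames')
```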

However, when I use flip_test=False I get the following error:

Traceback (most recent call last):
  File "demo/bottom_up_video_demo.py", line 104, in <module>
    main()
  File "demo/bottom_up_video_demo.py", line 71, in main
    pose_results = inference_bottom_up_pose_model(pose_model, img)
  File "/home/witek/Desktop/open-mmlab/mmpose/mmpose/apis/inference.py", line 322, in inference_bottom_up_pose_model
    return_loss=False, img=data['img'], img_metas=data['img_metas'])
  File "/home/witek/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/witek/Desktop/open-mmlab/mmpose/mmpose/models/detectors/bottom_up.py", line 112, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/home/witek/Desktop/open-mmlab/mmpose/mmpose/models/detectors/bottom_up.py", line 241, in forward_test
    aggregated_heatmaps.size(2)])
  File "/home/witek/Desktop/open-mmlab/mmpose/mmpose/core/evaluation/bottom_up_eval.py", line 190, in get_group_preds
    joints = transform_preds(person, center, scale, heatmap_size)
  File "/home/witek/Desktop/open-mmlab/mmpose/mmpose/core/post_processing/post_transforms.py", line 108, in transform_preds
    assert coords.shape[1] == 2 or coords.shape[1] == 5
AssertionError

jin-s13 (Collaborator) commented Sep 21, 2020

Aha, sorry for that, this is indeed a bug.
Please just ignore the assertion and comment out line 108 (sketched below).
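A sketch of the suggested workaround; the function signature is paraphrased from the traceback above and may not match the file exactly:

```python
# mmpose/core/post_processing/post_transforms.py (around line 108)
def transform_preds(coords, center, scale, output_size):
    # With flip_test=False the joints array reaches this point with a
    # column count other than 2 or 5 (hence the AssertionError above).
    # Workaround from this thread: comment the assertion out.
    # assert coords.shape[1] == 2 or coords.shape[1] == 5
    ...
```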

Witek- (Author) commented Sep 21, 2020

OK, now it works, thank you. I was able to get 17.5 fps, and even over 22 fps with visualization and imshow turned off. That is a huge difference in speed. Any idea how much the precision changes?

innerlee linked a pull request Sep 21, 2020 that will close this issue
jin-s13 (Collaborator) commented Sep 21, 2020

Yes, speed and accuracy are a trade-off. Generally speaking, disabling flip-testing leads to a drop of about 5 mAP on the COCO dataset. You may follow the guidelines and evaluate the models on COCO yourself.
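To see the effect without a full COCO run, one could compare the two settings directly through the API. A sketch: init_pose_model and inference_bottom_up_pose_model match the traceback above, but mutating model.test_cfg at runtime is an assumption and may differ across versions:

```python
# Hedged sketch: toggle the flip_test flag at runtime and time one frame.
import time
import cv2
from mmpose.apis import init_pose_model, inference_bottom_up_pose_model

model = init_pose_model(
    'configs/bottom_up/resnet/coco/res152_coco_512x512.py',
    'res152_coco_512x512-364eb38d_20200822.pth',
    device='cuda:0')

img = cv2.imread('demo/frame.jpg')  # hypothetical test frame
for flip in (True, False):
    model.test_cfg['flip_test'] = flip  # assumed attribute path
    tic = time.time()
    # single-return signature as used in this thread; newer versions differ
    pose_results = inference_bottom_up_pose_model(model, img)
    print(f'flip_test={flip}: {time.time() - tic:.3f}s per frame')
```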

Witek- (Author) commented Sep 23, 2020

> Inference speedup is on our list #9 (comment)
> To gather more info, could you tell us what the expected speed is in this case?

Well, I just compiled OpenPose and it runs at around 46 fps (no hand and face detection). The thing is, it uses close to 100% of my GPU, while mmpose uses just a fraction of it (as reported by nvidia-smi).

adriansteidle commented

> Well, I just compiled OpenPose and it runs at around 46 fps (no hand and face detection). The thing is, it uses close to 100% of my GPU, while mmpose uses just a fraction of it (as reported by nvidia-smi).

Hey @Witek-,
I came across the same issue. It would be great if you could let me know whether you found another solution with MMPose, or something even faster than OpenPose.
Maybe one possible cause is that OpenPose uses cuDNN for GPU acceleration?

@jin-s13 @innerlee
Is there a way to increase the GPU utilization?
Thanks for your great work so far!
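
Not from this thread, but one commonly suggested knob related to the cuDNN point above is PyTorch's built-in autotuner; whether it actually helps in this demo is untested:

```python
# PyTorch already uses cuDNN; its autotuner benchmarks convolution
# algorithms for the observed input sizes and can raise GPU utilization
# when the frame size is fixed, as in this video demo.
import torch
torch.backends.cudnn.benchmark = True
```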

sm00110011 commented

> For higher inference speed, you may try to set adjust=False, refine=False, flip_test=False in the config file.

Where is the config file?
