
Underutilization of GPU #37

Closed
ducalpha opened this issue Aug 2, 2018 · 6 comments
Labels: enhancement (Improving of an existing feature)

Comments

ducalpha commented Aug 2, 2018

When running an NER experiment (run_ner.py), I found that the model does not fully utilize my GPU. I have two Nvidia Titan Xp GPUs on my machine, but only one of them is used, with utilization varying from 10-60%.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.45                 Driver Version: 396.45                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:03:00.0 Off |                  N/A |
| 40%   61C    P2    65W / 250W |   5367MiB / 12196MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:04:00.0 Off |                  N/A |
| 23%   35C    P8    10W / 250W |     10MiB / 12188MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     30384      C   python                                      5357MiB |
+-----------------------------------------------------------------------------+

Training on my data with NeuroNLP2 yields utilization of both GPUs at more than 90%.
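
For anyone trying to reproduce this, here is a minimal sketch to confirm which GPUs the process can see and which single device the model actually runs on. This is plain PyTorch; the flair.device attribute is an assumption and may not exist in older flair releases.

import torch

# List every CUDA device visible to this process.
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"  cuda:{i} -> {torch.cuda.get_device_name(i)}")

# flair pins all tensors to a single device; newer releases expose it as
# flair.device (assumption: the attribute may be missing in older versions).
try:
    import flair
    print("flair is using:", getattr(flair, "device", "unknown"))
except ImportError:
    pass

# To pin the run to a specific card, restrict visibility via the environment:
#   CUDA_VISIBLE_DEVICES=1 python run_ner.py
# GPU 1 then appears as cuda:0 inside the process.

This only explains why the second card stays idle (training runs on a single device); the low utilization of the busy card is a separate question.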

alanakbik (Collaborator)

Very interesting! Multi-GPU support is definitely on our list, possibly for release 0.3. With regard to GPU usage, that is strange - we will have to run some tests to see what happens here.
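
Until that lands, the standard PyTorch way to spread a batch over several cards is torch.nn.DataParallel. The sketch below is generic PyTorch, not flair's API: TinyTagger is an illustrative stand-in, and flair's own taggers would need internal changes because their forward pass takes Sentence objects rather than tensors.

import torch
import torch.nn as nn

class TinyTagger(nn.Module):
    # Illustrative stand-in for a BiLSTM tagger, not flair's SequenceTagger.
    def __init__(self, embedding_dim=128, num_tags=10):
        super().__init__()
        self.lstm = nn.LSTM(embedding_dim, 64, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(128, num_tags)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.proj(out)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyTagger()
if torch.cuda.device_count() > 1:
    # Replicates the module on every visible GPU and splits each
    # mini-batch along dimension 0 across the replicas.
    model = nn.DataParallel(model)
model = model.to(device)

batch = torch.randn(32, 50, 128, device=device)  # 32 sequences of length 50
scores = model(batch)                            # each GPU processes a slice of the batch
print(scores.shape)                              # torch.Size([32, 50, 10])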

alanakbik (Collaborator)

@kashif

alanakbik pushed a commit that referenced this issue Sep 24, 2018
alanakbik pushed a commit that referenced this issue Sep 27, 2018
tabergma added the enhancement label Oct 4, 2018
alipetiwala commented Feb 1, 2019

@alanakbik

Same issue here with the NER model.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   37C    P0    38W / 300W |   2353MiB / 16130MiB |     16%      Default |
+-------------------------------+----------------------+----------------------+

I used the NER model.

mattilyra (Contributor)

Just to chime in: I'm experiencing the same issue using the BiLSTM+CRF model. GPU utilisation typically hovers around 30%, occasionally jumping up to 70-80%.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   52C    P0   330W / 300W |  10909MiB / 16130MiB |     65%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2402      C   python                                     10899MiB |
+-----------------------------------------------------------------------------+
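
One knob that often raises utilisation is the mini-batch size: with small batches the GPU sits idle between kernels while the CPU prepares the next batch. Below is a hedged sketch of a larger-batch training run with flair's ModelTrainer, using the API roughly as it stood in the flair versions discussed in this thread (newer releases renamed some of these calls, and parameter names such as embeddings_storage_mode vary by version). The data/ path and column format are placeholders for your own NER corpus.

from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Placeholder corpus: CoNLL-style column files (train/dev/test) in ./data
columns = {0: "text", 1: "ner"}
corpus = ColumnCorpus("data/", columns)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=WordEmbeddings("glove"),
    tag_dictionary=corpus.make_tag_dictionary(tag_type="ner"),
    tag_type="ner",
)

trainer = ModelTrainer(tagger, corpus)
trainer.train(
    "resources/taggers/ner-batch-test",
    learning_rate=0.1,
    mini_batch_size=64,              # larger batches keep the GPU busier
    max_epochs=10,
    embeddings_storage_mode="gpu",   # keep computed embeddings on the GPU (name varies by version)
)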

stale bot commented Apr 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Apr 30, 2020
alanakbik removed the wontfix label Apr 30, 2020
alanakbik (Collaborator)

A lot of improvements have been added over time, so GPU usage should be much higher now.
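
A quick way to check this on your own run is to poll nvidia-smi from a second terminal while training is in progress; the query flags below are standard nvidia-smi options.

import subprocess
import time

# Sample per-GPU utilisation and memory once per second for ~30 seconds.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]

for _ in range(30):
    out = subprocess.run(QUERY, capture_output=True, text=True).stdout
    for line in out.strip().splitlines():
        idx, util, mem = [field.strip() for field in line.split(",")]
        print(f"GPU {idx}: {util}% utilisation, {mem} MiB used")
    time.sleep(1)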
