
Underutilization of GPU #37

Closed
ducalpha opened this issue Aug 2, 2018 · 6 comments
Labels: enhancement (Improving of an existing feature)

Comments

ducalpha commented Aug 2, 2018

When running an NER experiment (run_ner.py), I found that the model does not fully utilize my GPU. I have two Nvidia Titan Xp GPUs on my machine, but only one of them is used, with utilization varying from 10-60%.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.45                 Driver Version: 396.45                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN Xp            Off  | 00000000:03:00.0 Off |                  N/A |
| 40%   61C    P2    65W / 250W |   5367MiB / 12196MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:04:00.0 Off |                  N/A |
| 23%   35C    P8    10W / 250W |     10MiB / 12188MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     30384      C   python                                      5357MiB |
+-----------------------------------------------------------------------------+

Training on my data with NeuroNLP2 yields utilization of both GPUs at more than 90%.
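
For anyone trying to reproduce this, here is a minimal sketch to confirm which GPUs the process can see and which single device the model actually runs on. This is plain PyTorch; the flair.device attribute is an assumption and may not exist in older flair releases.

import torch

# List every CUDA device visible to this process.
print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    print(f"  cuda:{i} -> {torch.cuda.get_device_name(i)}")

# flair pins all tensors to a single device; newer releases expose it as
# flair.device (assumption: the attribute may be missing in older versions).
try:
    import flair
    print("flair is using:", getattr(flair, "device", "unknown"))
except ImportError:
    pass

# To pin the run to a specific card, restrict visibility via the environment:
#   CUDA_VISIBLE_DEVICES=1 python run_ner.py
# GPU 1 then appears as cuda:0 inside the process.

This only explains why the second card stays idle (training runs on a single device); the low utilization of the busy card is a separate question.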

alanakbik (Collaborator)

Very interesting! Multi-GPU support is definitely on our list, possibly for release 0.3. With regard to GPU usage, that is strange - we will have to run some tests to see what happens here.
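
Until that lands, the standard PyTorch way to spread a batch over several cards is torch.nn.DataParallel. The sketch below is generic PyTorch, not flair's API: TinyTagger is an illustrative stand-in, and flair's own taggers would need internal changes because their forward pass takes Sentence objects rather than tensors.

import torch
import torch.nn as nn

class TinyTagger(nn.Module):
    # Illustrative stand-in for a BiLSTM tagger, not flair's SequenceTagger.
    def __init__(self, embedding_dim=128, num_tags=10):
        super().__init__()
        self.lstm = nn.LSTM(embedding_dim, 64, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(128, num_tags)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.proj(out)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyTagger()
if torch.cuda.device_count() > 1:
    # Replicates the module on every visible GPU and splits each
    # mini-batch along dimension 0 across the replicas.
    model = nn.DataParallel(model)
model = model.to(device)

batch = torch.randn(32, 50, 128, device=device)  # 32 sequences of length 50
scores = model(batch)                            # each GPU processes a slice of the batch
print(scores.shape)                              # torch.Size([32, 50, 10])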

alanakbik (Collaborator)

@kashif

alanakbik pushed a commit that referenced this issue Sep 24, 2018
alanakbik pushed a commit that referenced this issue Sep 27, 2018
tabergma added the enhancement label Oct 4, 2018
alipetiwala commented Feb 1, 2019

@alanakbik

Same issue here with the NER model.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   37C    P0    38W / 300W |   2353MiB / 16130MiB |     16%      Default |
+-------------------------------+----------------------+----------------------+

I used the NER model.

mattilyra (Contributor)

Just to chime in: I'm experiencing the same issue using the BiLSTM+CRF model. GPU utilisation typically hovers around 30%, occasionally jumping up to 70-80%.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                    0 |
| N/A   52C    P0   330W / 300W |  10909MiB / 16130MiB |     65%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2402      C   python                                     10899MiB |
+-----------------------------------------------------------------------------+
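
One knob that often raises utilisation is the mini-batch size: with small batches the GPU sits idle between kernels while the CPU prepares the next batch. Below is a hedged sketch of a larger-batch training run with flair's ModelTrainer, using the API roughly as it stood in the flair versions discussed in this thread (newer releases renamed some of these calls, and parameter names such as embeddings_storage_mode vary by version). The data/ path and column format are placeholders for your own NER corpus.

from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Placeholder corpus: CoNLL-style column files (train/dev/test) in ./data
columns = {0: "text", 1: "ner"}
corpus = ColumnCorpus("data/", columns)

tagger = SequenceTagger(
    hidden_size=256,
    embeddings=WordEmbeddings("glove"),
    tag_dictionary=corpus.make_tag_dictionary(tag_type="ner"),
    tag_type="ner",
)

trainer = ModelTrainer(tagger, corpus)
trainer.train(
    "resources/taggers/ner-batch-test",
    learning_rate=0.1,
    mini_batch_size=64,              # larger batches keep the GPU busier
    max_epochs=10,
    embeddings_storage_mode="gpu",   # keep computed embeddings on the GPU (name varies by version)
)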

stale bot commented Apr 30, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label Apr 30, 2020
alanakbik removed the wontfix label Apr 30, 2020
alanakbik (Collaborator)

A lot of improvements have been added over time, so GPU usage should be much higher now.
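
A quick way to check this on your own run is to poll nvidia-smi from a second terminal while training is in progress; the query flags below are standard nvidia-smi options.

import subprocess
import time

# Sample per-GPU utilisation and memory once per second for ~30 seconds.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]

for _ in range(30):
    out = subprocess.run(QUERY, capture_output=True, text=True).stdout
    for line in out.strip().splitlines():
        idx, util, mem = [field.strip() for field in line.split(",")]
        print(f"GPU {idx}: {util}% utilisation, {mem} MiB used")
    time.sleep(1)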
