MIOpen(HIP): Warning [SQLiteBase] Unable to read system database - Performance may degrade #1254

Closed
Bengt opened this issue Oct 31, 2021 · 7 comments

Comments

@Bengt

Bengt commented Oct 31, 2021

Hello,

When running some PyTorch code, I see a warning that MIOpen cannot read a system database:

MIOpen(HIP): Warning [SQLiteBase] Unable to read system database file:/opt/rocm/miopen/share/miopen/db/gfx900_64.kdb Performance may degrade

I checked that the database file exists and is readable by a regular user:

$ ls -lh /opt/rocm/miopen/share/miopen/db/gfx900_64.kdb
-rw-r--r-- 1 root root 555M Okt 19 12:02 /opt/rocm/miopen/share/miopen/db/gfx900_64.kdb
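
The same check can be scripted. The following is a minimal sketch, not part of MIOpen: `check_kdb` is a hypothetical helper name, and it assumes (based only on the `[SQLiteBase]` tag in the warning) that the `.kdb` file is an SQLite database. It reports whether the file exists, whether the current user may read it, and whether SQLite can open it read-only.

```python
import contextlib
import os
import pathlib
import sqlite3


def check_kdb(path):
    """Report whether `path` exists, is readable, and opens as an SQLite database.

    Hypothetical helper; assumes the .kdb file is SQLite, which the
    "[SQLiteBase]" tag in the warning suggests but the thread does not state.
    """
    result = {"exists": os.path.isfile(path), "readable": False, "sqlite_ok": False}
    if not result["exists"]:
        return result

    result["readable"] = os.access(path, os.R_OK)
    if not result["readable"]:
        return result

    # mode=ro opens the file read-only, so the check cannot modify it.
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    try:
        with contextlib.closing(sqlite3.connect(uri, uri=True)) as conn:
            # Reading the schema table forces SQLite to validate the file header.
            conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        result["sqlite_ok"] = True
    except sqlite3.Error:
        pass
    return result
```

Calling `check_kdb("/opt/rocm/miopen/share/miopen/db/gfx900_64.kdb")` would distinguish a permissions problem from a file that SQLite itself rejects. It cannot show why MIOpen reports the warning.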

This seems somewhat similar to #306, where the database file cannot be deleted.

Just after this warning, I get errors about the device identifier being messed up, but these are already covered in ROCm/ROCm#1572. Anyhow, here is the full output of my application:

FAILED [100%][INFO] Found GPU. Using device: cuda
Processing 1/1
[INFO] Found GPU. Using device: cuda
[INFO] Found GPU. Using device: cuda
MIOpen(HIP): Warning [SQLiteBase] Unable to read system database file:/opt/rocm/miopen/share/miopen/db/gfx900_64.kdb Performance may degrade
MIOpen(HIP): Error [SetIsaName] 'amd_comgr_action_info_set_isa_name(handle, isa.c_str())' amdgcn-amd-amdhsa--gfx900:sramecc-:xnack-: INVALID_ARGUMENT (2)
MIOpen(HIP): Error [BuildOcl] comgr status = INVALID_ARGUMENT (2)
MIOpen(HIP): Warning [BuildOcl] amdgcn-amd-amdhsa--gfx900:sramecc-:xnack-
MIOpen Error: /MIOpen/src/hipoc/hipoc_program.cpp:286: Code object build failed. Source: MIOpenConvFwd_LxL_11.cl

test_train_alexnet_from_scratch.py:25 (test_train_alexnet_from_scratch)
def test_train_alexnet_from_scratch():
        # Mock
        num_epochs = 1
    
        # Test
>       train_alexnet_from_scratch(num_epochs=num_epochs)

test_train_alexnet_from_scratch.py:31: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../vr_backend/pytorch/alexnet/train_alexnet.py:83: in train_alexnet_from_scratch
    outputs = alexnet(images)
../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/module.py:1102: in _call_impl
    return forward_call(*input, **kwargs)
../vr_backend/pytorch/alexnet/alexnet_module.py:38: in forward
    x = self.features(x)
../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/module.py:1102: in _call_impl
    return forward_call(*input, **kwargs)
../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/container.py:141: in forward
    input = module(input)
../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/module.py:1102: in _call_impl
    return forward_call(*input, **kwargs)
../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/conv.py:446: in forward
    return self._conv_forward(input, self.weight, self.bias)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
input = tensor([[[[-2.1035, -2.1035, -2.1034,  ..., -2.1051, -2.1051, -2.1050],
          [-2.1036, -2.1036, -2.1036,  ..., -2..., -1.8043, -1.8044],
          [-1.8034, -1.8034, -1.8034,  ..., -1.8041, -1.8042, -1.8044]]]],
       device='cuda:0')
weight = Parameter containing:
tensor([[[[ 4.4890e-04,  3.3360e-02, -2.1240e-03,  ..., -5.4662e-03,
           -4.8673e-04, -4....3187e-03, -4.7364e-02,  ...,  3.9884e-02,
           -4.0196e-02,  3.5859e-02]]]], device='cuda:0', requires_grad=True)
bias = Parameter containing:
tensor([-0.0020, -0.0466, -0.0401,  0.0008, -0.0212, -0.0030, -0.0184,  0.0100,
         0.0160,...    0.0405, -0.0028,  0.0331,  0.0474, -0.0086,  0.0185,  0.0332,  0.0045],
       device='cuda:0', requires_grad=True)

    def _conv_forward(self, input: Tensor, weight: Tensor, bias: Optional[Tensor]):
        if self.padding_mode != 'zeros':
            return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
                            weight, bias, self.stride,
                            _pair(0), self.dilation, self.groups)
>       return F.conv2d(input, weight, bias, self.stride,
                        self.padding, self.dilation, self.groups)
E       RuntimeError: miopenStatusUnknownError

../.tox/py38-rebuild/lib/python3.8/site-packages/torch/nn/modules/conv.py:442: RuntimeError
@atamazov
Contributor

Hi @Bengt. Most likely, the 1st problem (the warning) can be safely ignored. Just in case, let's check the GPU: please paste the output of /opt/rocm/opencl/bin/clinfo | grep -E "gfx|units" here.

For the 2nd problem we have a workaround; see #1231.

@Bengt
Author

Bengt commented Oct 31, 2021

Hi @atamazov,

here is the requested output about my GPU (it's a Vega 64):

$ /opt/rocm/opencl/bin/clinfo | grep -E "gfx|units"
  Max compute units:				 64
  Name:						 gfx900:xnack-
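
For comparison, clinfo reports the device as `gfx900:xnack-`, while the build error above names `amdgcn-amd-amdhsa--gfx900:sramecc-:xnack-`. The sketch below splits both strings into an architecture name and feature flags so the difference is easy to see. `parse_target_id` is a hypothetical helper written for this illustration, and the parsing rules are inferred only from the two strings in this thread. It does not establish which string the compiler library actually expects.

```python
def parse_target_id(target):
    """Split a target string into (arch, {feature: enabled}).

    Hypothetical helper; handles strings shaped like the two seen in this
    thread, with or without the 'amdgcn-amd-amdhsa--' prefix.
    """
    # Keep only the part after the last "--" (the whole string if there is none).
    _, _, tail = target.rpartition("--")
    arch, *features = tail.split(":")
    parsed = {}
    for feat in features:
        # A trailing "+" or "-" marks the feature as on or off.
        if feat and feat[-1] in "+-":
            parsed[feat[:-1]] = feat[-1] == "+"
    return arch, parsed


clinfo_arch, clinfo_feats = parse_target_id("gfx900:xnack-")
error_arch, error_feats = parse_target_id("amdgcn-amd-amdhsa--gfx900:sramecc-:xnack-")

# Features named in the failing build string but not in the clinfo name.
only_in_error = sorted(set(error_feats) - set(clinfo_feats))
print(clinfo_arch, error_arch, only_in_error)
```

Both strings name the same architecture; the build string additionally carries an `sramecc` flag that the clinfo name does not.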

@atamazov
Contributor

atamazov commented Nov 1, 2021

@Bengt Yes, the 1st problem can be ignored.

@Bengt
Copy link
Author

Bengt commented Nov 1, 2021

Alright then. Since this issue is primarily about the first warning about not being able to read the database, I consider ignoring it a workaround. Hence, I will close this issue.

Bengt closed this as completed Nov 1, 2021
@Bengt
Author

Bengt commented Nov 1, 2021

I hope this issue still helps somebody wondering about that warning message in the future.

@atamazov
Contributor

atamazov commented Nov 1, 2021

@Bengt The 1st issue (the warning) and the 2nd (the build error) are unrelated. The workaround is required to resolve the 2nd issue. Its root cause is a ROCm problem with gfx900 identification.
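
To keep the two problems apart when reading such output, the MIOpen lines can be grouped by severity. This is a small sketch that assumes the line format `MIOpen(HIP): <Severity> [<Tag>] <message>` seen in the output quoted in this issue; `group_miopen_messages` is a hypothetical helper, not an MIOpen facility.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this issue (an assumption,
# not a documented MIOpen log format).
LOG_LINE = re.compile(r"^MIOpen\(HIP\): (Warning|Error) \[(\w+)\] (.*)$")


def group_miopen_messages(lines):
    """Group matching log lines as {severity: [(tag, message), ...]}."""
    groups = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            severity, tag, message = match.groups()
            groups[severity].append((tag, message))
    return dict(groups)


sample = [
    "MIOpen(HIP): Warning [SQLiteBase] Unable to read system database file:/opt/rocm/miopen/share/miopen/db/gfx900_64.kdb Performance may degrade",
    "MIOpen(HIP): Error [BuildOcl] comgr status = INVALID_ARGUMENT (2)",
]
for severity, entries in group_miopen_messages(sample).items():
    print(severity, [tag for tag, _ in entries])
```

On the output quoted in this issue, the `SQLiteBase` warning falls under `Warning`, while the `SetIsaName` and `BuildOcl` failures fall under `Error`.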

@Bengt
Author

Bengt commented Nov 1, 2021

Yes, thanks for clarifying.

2 participants