Error in prediction step #64

Open
sun647 opened this issue Jan 3, 2025 · 1 comment

Comments

sun647 commented Jan 3, 2025

Hello,

I ran into the following error when running the final prediction command: cryoCARE_predict.py --conf predict_config.json

2025-01-03 11:19:44.871539: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2025-01-03 11:19:44.872627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 45266 MB memory) -> physical GPU (device: 3, name: NVIDIA RTX A6000, pci bus id: 0000:61:00.0, compute capability: 8.6)
Loading network weights from 'weights_best.h5'.
(208, 1264, 928, 1)
2025-01-03 11:19:53.903428: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2025-01-03 11:19:53.921961: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3593075000 Hz
2025-01-03 11:19:55.220327: W tensorflow/core/kernels/gpu_utils.cc:49] Failed to allocate memory for convolution redzone checking; skipping this check. This is benign and only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
2025-01-03 11:19:55.220463: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-01-03 11:19:56.573105: F tensorflow/stream_executor/cuda/cuda_dnn.cc:88] Check failed: narrow == wide (-391249920 vs. 3903717376)checked narrowing failed; values not equal post-conversion
Aborted (core dumped)

Below are my TensorFlow and nvcc versions:
pip show tensorflow
Name: tensorflow
Version: 2.4.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: [email protected]
License: Apache 2.0
Location: /home/csun/miniconda3/envs/cryocare_11/lib/python3.8/site-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras-preprocessing, numpy, opt-einsum, protobuf, six, tensorboard, tensorflow-estimator, termcolor, typing-extensions, wheel, wrapt
Required-by:

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:34:21_PDT_2024
Cuda compilation tools, release 12.6, V12.6.20
Build cuda_12.6.r12.6/compiler.34431801_0

Has anyone seen the same error? How should I work around it?

Best,
Chen

sun647 commented Jan 3, 2025

Never mind. I got it to work after reinstalling with CUDA 12.
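
For anyone reading the log later, the numbers in the "Check failed: narrow == wide" line are consistent with a signed 32-bit overflow: 3903717376 is larger than 2^31 - 1 and wraps to -391249920 when narrowed to an int32. The short sketch below only reproduces that arithmetic. The factor of 16 is an assumption, not something the log states; it is simply the multiplier that makes the voxel count of the printed shape (208, 1264, 928, 1) equal the wide value, and it may or may not correspond to a channel count in the network.

```python
from math import prod

INT32_MAX = 2**31 - 1


def narrow_to_int32(wide: int) -> int:
    """Reinterpret the low 32 bits of `wide` as a signed 32-bit integer."""
    return (wide + 2**31) % 2**32 - 2**31


wide = 3903717376                 # "wide" value from the Check failed line
narrow = narrow_to_int32(wide)    # what an int32 would hold after narrowing

shape = (208, 1264, 928, 1)       # tensor shape printed in the log
voxels = prod(shape)

# ASSUMPTION: 16 is not stated anywhere in the log; it is the factor that
# happens to make voxels * factor equal the wide value exactly.
assumed_factor = 16

print("narrowed value:", narrow)
print("exceeds int32 range:", wide > INT32_MAX)
print("voxels:", voxels)
print("voxels * assumed_factor == wide:", voxels * assumed_factor == wide)
```

This only explains where the two numbers in the message come from. It does not show why reinstalling against a different CUDA version made the check pass.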
