[Bug] Segmentation Fault in libMKLDNNPlugin.so when using blobs with pre-allocated buffers #4742
Comments
Hi @tanmayv25 Thanks for reporting and for providing the necessary files and steps to reproduce! I was able to reproduce the segmentation fault with niter set to 3 and above. Interestingly, your application works fine when using a GPU device. We have opened a bug under the CPU plugin for the development team to investigate. Regards, Ref. 51087
Hi @tanmayv25 Thanks for the reproducer. However, the attached model is a degenerate case that triggers an unusual call sequence with little relation to real-world use cases; it should nevertheless work and will be fixed. I suspect that this reproducer does not reveal the real problem. I suppose that your original model is somewhat more complex, and the problem with it may have a different source. Am I right? If so, could you please provide the original model that the problem initially occurred with? Regards,
Hi @maxnick, That being said, I have tried running valgrind on Triton+OV for the resnet50_int8 model and saw the following reports:
This doesn't cause seg faults like the model I shared before, and I don't find any inconsistencies in the results. The output is 1001 FP32 values, with a probability for each class. I don't understand why libMKLDNNPlugin.so would try to read 32 bytes at the margin. It looked like an artifact from AVX instructions. Unfortunately, I don't have a simple reproducer for the latter case, but I can share one if it is of interest.
@maxnick Has there been any progress on the resolution of this issue? An identity model is the simplest possible case, and it serves various utility purposes in framework use cases.
@tanmayv25 Take a look at PR #5000 for updates. Regards,
System information (version)
Detailed description
For my use case, the input data was already in memory, so I created InferenceEngine::Blob objects from a pointer to the pre-allocated buffer and its size. I had to free the buffer once the inference results were retrieved, and then create/set another blob for the next request. The first several runs succeed with the expected results; however, I see a segmentation fault after some iterations.
When running under valgrind, it looks like, in the Infer() call, libMKLDNNPlugin.so attempts to access the memory address of one of the previous requests, which was already freed when the corresponding request completed. I expect that once SetBlob is called on the InferRequest, it should completely override the previous SetBlob call, and the Infer call should use the latest blob for inference.
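To make the expected contract concrete, here is a minimal self-contained sketch of the buffer lifecycle described above. Note this uses a hypothetical `MockInferRequest` class, not the real InferenceEngine API: each iteration sets a freshly allocated buffer, runs inference, and frees the buffer, relying on SetBlob fully replacing the previously set blob so that Infer never dereferences a pointer from an earlier, already-freed iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical mock of the SetBlob/Infer contract the report relies on:
// Infer must read only from the most recently set buffer, never from a
// cached pointer belonging to an earlier (freed) request.
class MockInferRequest {
public:
    void SetBlob(const float* data, std::size_t size) {
        data_ = data;   // must completely override any previously set blob
        size_ = size;
    }
    std::vector<float> Infer() const {
        // Identity model: the output is a copy of the latest input blob.
        return std::vector<float>(data_, data_ + size_);
    }
private:
    const float* data_ = nullptr;
    std::size_t size_ = 0;
};

// Repeats the allocate -> SetBlob -> Infer -> free cycle, as the bug
// report's application does, returning the output of the last iteration.
inline std::vector<float> runIterations(int niter) {
    MockInferRequest request;
    std::vector<float> lastOutput;
    for (int i = 0; i < niter; ++i) {
        float* buffer = new float[4];                 // pre-allocated input
        for (int j = 0; j < 4; ++j)
            buffer[j] = static_cast<float>(i + j);
        request.SetBlob(buffer, 4);                   // replaces prior blob
        lastOutput = request.Infer();                 // uses latest buffer only
        delete[] buffer;                              // freed after results read
    }
    return lastOutput;
}
```

Under this contract, any number of iterations is safe; the reported crash suggests the CPU plugin holds on to a stale pointer across SetBlob calls.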
See the valgrind log here:
I hacked the `hello_classification` sample code a little bit to repeatedly issue Infer requests to an IR model converted from an ONNX Identity model. The main section looks like:

Steps to reproduce
ov_segf_repro.zip

The attached zip file contains the `repeated_wraps` folder and the `model.onnx` file. Follow the steps below to reproduce the segmentation fault:

1. Start the `openvino/ubuntu18_dev` container image.
2. Convert `model.onnx` to IR using the model optimizer in the container.
3. Copy the `repeated_wraps` folder to `/opt/intel/openvino_2021.2.185/inference_engine/samples/cpp`.
4. Run `apt install build-essential`, then change to the above directory and run the `./build_samples.sh` script.
5. Run the built sample from `~/inference_engine_cpp_samples_build/intel64/Release`. A segmentation fault will occur.
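The steps above can be consolidated roughly as follows. This is a sketch, not a verified script: the model optimizer path is an assumption based on the OpenVINO 2021.2 container layout, and the exact sample invocation depends on the modified `hello_classification` code in the zip.

```shell
# Sketch of the reproduction steps; paths assume the 2021.2 container layout.
docker run -it openvino/ubuntu18_dev

# Inside the container:
python3 /opt/intel/openvino_2021.2.185/deployment_tools/model_optimizer/mo.py \
    --input_model model.onnx
cp -r repeated_wraps /opt/intel/openvino_2021.2.185/inference_engine/samples/cpp/
apt install build-essential
cd /opt/intel/openvino_2021.2.185/inference_engine/samples/cpp
./build_samples.sh
cd ~/inference_engine_cpp_samples_build/intel64/Release
# Run the built (modified) hello_classification binary against the converted IR.
```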
Some example runs:
Issue submission checklist