-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update versions for sqlite and protobuf #2347
Conversation
b48134a
to
87c1ddd
Compare
requirements.txt
Outdated
@@ -21,12 +21,12 @@ | |||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | |||
# THE SOFTWARE. | |||
##################################################################################### | |||
google/protobuf@v3.11.0 -DCMAKE_POSITION_INDEPENDENT_CODE=On -X subdir -Dprotobuf_BUILD_TESTS=Off | |||
google/protobuf@v3.20.2 -DCMAKE_POSITION_INDEPENDENT_CODE=On -X subdir -Dprotobuf_BUILD_TESTS=Off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if .pb
files or .onnx
needs to be re-generated.
Did all tests pass ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think they might need to be regenerated. There seems to be an issue on the develop branch preventing it from building:
/code/AMDMIGraphX/AMDMIGraphX/src/targets/gpu/include/migraphx/gpu/ck.hpp:34:10:
fatal error: 'ck/host/device_batched_gemm_softmax_gemm.hpp' file not found
#include "ck/host/device_batched_gemm_softmax_gemm.hpp"
So I checked out an earlier version of develop and made the version changes for protobuf and sqlite and was able to get build/tests to pass. I put that up as PR but realized there were conflicts with the requirements.txt
on develop. So I rebased it to resolve conflicts, but I was not sure about the commits ROCmSoftwarePlatform/composable_kernel@70eefcf4f263aa5c25f3c9ff0db8f6f199ef0fb9
that may need to be updated. I was planning to ask this in the afternoon meeting along with some other questions on this. In the meantime, I think its better to convert it to draft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that error is from CK.
You need to rebuild your dependencies using rbuild develop
or rbuild prepare
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried rbuild develop
but I still saw the same error. The I tried rbuild prepare
with the command rbuild prepare --deps-dir ../AMDMIGraphX/
. But I still see the same error. I am doing this in the build directory in a docker container. Do you know what I might be missing? :)
Edit: May be the deps-dir
is incorrect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, I think I got it fixed. Doing rbuild prepare -d depend
and using -DCMAKE_PREFIX_PATH=depend
seems to have done the trick.
Edit: No it did not fix it :). Lakhinder mentioned spinning a new docker container is another way. I'll try that.
Second Edit: I was able to get this resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if
.pb
files or.onnx
needs to be re-generated. Did all tests pass ?
@umangyadav All tests except the test_verify ones I mentioned in the other message pass.
Codecov Report
@@ Coverage Diff @@
## develop #2347 +/- ##
========================================
Coverage 91.36% 91.36%
========================================
Files 440 440
Lines 16530 16530
========================================
Hits 15101 15101
Misses 1429 1429 |
Seeing this error now:
Not related to this PR. rocm versions: Machine running this on: |
This build is OK for merge ✅ |
🔴torchvision-inceptionv3_1: FAILED: MIGraphX is not within tolerance - check verbose output🔴slim-inceptionv4_1: FAILED: MIGraphX is not within tolerance - check verbose output🔴bert_base_cased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output🔴distilgpt2_fp16: FAILED: MIGraphX is not within tolerance - check verbose output |
87c1ddd
to
3e1ca8e
Compare
The get_available_gpu_memory error seem to be machine dependant. I was trying this on another machine and I do not see these errors.
|
They don't seem to be failing on CI with MLIR. Can you check if your MLIR dependency is up-to-date ? |
(Ping noted, no idea what I can do to help here) |
I suspect its dev environment issue ? as 1) I cant reproduce it and 2) CI cant reproduce it.. So might worth checking the MLIR commit hash being used in the dev env. |
Thanks for checking it out. I'll investigate it more at my end. I also suspect this to be an env issue at my end. |
The failures seem to be happening on Navi32, the tests run successfully on an MI100 system. I have opened an issue for this with the details: #2365 FYI @manupak @umangyadav |
3e1ca8e
to
0166274
Compare
@ahsan-ca once CI passes set this to ready-to-merge unless anyone has any other unresolved comments |
@TedThemistokleous There are a few failures, but I am unable to sign in to Jenkins to see the failures. How does one get access to Jenkins to see the failures? |
/opt/rocm/llvm/bin/clang++ -Werror -g -O2 -fno-omit-frame-pointer -fsanitize=undefined,address -fno-sanitize-recover=undefined,address CMakeFiles/test_tf.dir/tf/tf_test.cpp.o -o ../bin/test_tf -Wl,-rpath,/var/jenkins/workspace/AMDMIGraphX_PR-2347/build/lib ../lib/libmigraphx_tf.so.2008000.0 ../lib/libmigraphx.so.2008000.0 -lpthread
|
I suspect this to be caused by an issue in Protobuf : ABI may depend on NDEBUG. This issue has been fixed in Protobuf v21.3. I have updated the Protobuf version to v21.3 to see if it fixes the error and how this version interacts with other packages. Need to wait for Jenkins tests to see how this plays out. |
8bb716d
to
6c502ac
Compare
Ah, this did not work out. Looking into it. |
e544df2
to
59516c0
Compare
The |
59516c0
to
3313614
Compare
The original version of Protobuf |
All the CI tests pass so far, just waiting on Jenkins tests to pass before this is ready to be merged. |
|
Ah, the same error persists. |
3313614
to
92defb8
Compare
Seems like 3.18.0 worked, I am now trying 3.19.0 in a bid to get the most recent version of Protobuf that does not have the error we saw with later versions of Protobuf. |
For Posterity; 3.19.0 worked so updated Protobuf to it. |
Updated versions for sqlite and protobuf are needed to fix
security vulnerabilities.