Fix undefined behavior in LDLT #222
Thank you very much for the fix. @mjacobse, would you be able to review this one?
Thanks, this should definitely be fixed, and as far as I can see the proposed change would do it. I was just a bit bothered by needing a somewhat magic-looking conditional just for this, when it feels like there should be a more natural way to handle this special case. So I thought about it a bit, and I believe this extra call for copying the blocks that contain both diagonal and rectangular parts can be written more succinctly and intuitively:

    copy_failed_rect(
        get_nrow(nblk-1, m, block_size) - get_nrow(nblk-1, n, block_size),
        get_ncol(jblk, n, block_size), 0, cdata[jblk],
        failed_rect.data() + jfail*(m-n), m-n,
        &a[jblk*block_size*lda+n], lda
    );

This should be exactly equivalent since
When m==n, failed_rect.data() is nullptr, but we still subtract a small integer from it. Doing arithmetic on a null pointer is undefined behavior. Clang's UndefinedBehaviorSanitizer reports:

    ldlt_app.cxx:2420:38: runtime error: applying non-zero offset 18446744073709551536 to null pointer

The copy_failed_rect call ends up being a no-op because m==rfrom, but it is still UB to do arithmetic on nullptr, even if the result is never dereferenced.
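To make the report above easier to read, here is a minimal C++ sketch, not code from the library. The helper names `signed_offset`, `rect_column_offset`, and `advance_or_null` are hypothetical and exist only for illustration. It shows that the large unsigned offset in the sanitizer message is a small negative byte count, that the offset expression `jfail*(m-n)` from the suggested call is zero when m==n (adding zero to a null pointer is well defined in C++), and one way to guard pointer arithmetic in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reinterpret the unsigned byte offset printed by UBSan as a signed value,
// without relying on implementation-defined narrowing conversions.
constexpr std::int64_t signed_offset(std::uint64_t raw) {
    return raw > static_cast<std::uint64_t>(INT64_MAX)
        ? -static_cast<std::int64_t>(~raw) - 1   // raw - 2^64
        : static_cast<std::int64_t>(raw);
}

// Element offset of column jfail in an (m-n)-row rectangular buffer,
// mirroring the expression jfail*(m-n) in the suggested call.
constexpr std::ptrdiff_t rect_column_offset(int jfail, int m, int n) {
    return static_cast<std::ptrdiff_t>(jfail) * (m - n);
}

// Hypothetical guard: only offset a pointer that actually points somewhere.
template <typename T>
T* advance_or_null(T* base, std::ptrdiff_t count) {
    if (base == nullptr) return nullptr;  // never do arithmetic on nullptr
    return base + count;
}

// Small self-check tying the pieces together.
inline void check_examples() {
    // The offset in the sanitizer message is -80 bytes.
    assert(signed_offset(18446744073709551536ULL) == -80);
    // With m == n there is no rectangular part, so the offset is zero.
    assert(rect_column_offset(3, 8, 8) == 0);
    // The guard leaves a null pointer untouched.
    assert(advance_or_null<double>(nullptr, -10) == nullptr);
}
```

Read this way, the message describes subtracting 80 bytes from a null pointer, which would be ten elements if the entries are 8-byte doubles (an assumption here, since the element type is not stated in this thread).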
51eb2b0 to 2728b8d
Yes, confirmed. I ran that new version through our UBSan test case that caught this, as well as our full Drake test suite, and everything passed. I've pushed that version now.
Great, thanks! This seems good to me, but I would be happy if you could also take a look at the new version, @jfowkes.
Thank you, all. The mods have been incorporated in galahad's multiprecision CPU version, and the tests succeed.