-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String Literals Domain #1048
String Literals Domain #1048
Conversation
…zer into base-string-analyses
…zer into base-string-analyses
…zer into base-string-analyses
…zer into base-string-analyses
…zer into base-string-analyses
The CI summary page has a |
…zer into base-string-analyses
src/cdomains/lval.ml
Outdated
let to_string_length = function | ||
| StrPtr (Some x) -> Some (String.length x) | ||
| _ -> None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this might not be so easy because C strings and OCaml strings are different. In particular a string literal "hello\0world"
would have strlen
5, but the OCaml string has length 11.
The same probably applies to other length-related functionality as well.
Maybe CIL has some helper functions for manipulating strings with C semantics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's absolutely right, again thank you for thinking about this case! I couldn't find a CIL function that would help me in this case in a quick search, but I'll have another look. Otherwise, String.split_on_char '\x00' x
and taking the first element of the resulting list would do the trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is now fixed for strlen
, but I think it still needs to be fixed for some of the other functions. I think it matters for strstr
and strcmp
as well. Right now they're just performing the operations on OCaml strings, but all of those operations need to be performed with C semantics, so only up to the NUL byte considered when checking for inclusion etc.
I'm wondering if a better solution is to maybe just do the stripping initially, so it doesn't have to be done for every operation individually. Or maybe CIL is already doing that as well?
I haven't checked, but it would be good to have tests for these things. If sv-benchmarks doesn't contain such cases, it would be nice to also contribute those because this detail is easy to get wrong and become unsound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first assumption was that CIL would cut the string at the first NULL byte, that's why I didn't do the stripping. After your comment, I tested this and found out that's not the case.
As you're right that this also concerns strstr
and str(n)cmp
, I introduced a new function to_c_string
performing the stripping which I now use accross all my other functions. I didn't change to_string
as some parts of the base analysis not related to my PR rely on it.
I added a new 02-string_literals_with_null.c test to verify this edge case is now correctly covered.
…zer into base-string-analyses
@sim642 do you want to do a final pass or are you good with merging? |
CHANGES: * Add `setjmp`/`longjmp` analysis (goblint/analyzer#887, goblint/analyzer#970, goblint/analyzer#1015, goblint/analyzer#1019). * Refactor race analysis to lazy distribution (goblint/analyzer#1084, goblint/analyzer#1089, goblint/analyzer#1136, goblint/analyzer#1016). * Add thread-unsafe library function call analysis (goblint/analyzer#723, goblint/analyzer#1082). * Add mutex type analysis and mutex API analysis (goblint/analyzer#800, goblint/analyzer#839, goblint/analyzer#1073). * Add interval set domain and string literals domain (goblint/analyzer#901, goblint/analyzer#966, goblint/analyzer#994, goblint/analyzer#1048). * Add affine equalities analysis (goblint/analyzer#592). * Add use-after-free analysis (goblint/analyzer#1050, goblint/analyzer#1114). * Add dead code elimination transformation (goblint/analyzer#850, goblint/analyzer#979). * Add taint analysis for partial contexts (goblint/analyzer#553, goblint/analyzer#952). * Add YAML witness validation via unassume (goblint/analyzer#796, goblint/analyzer#977, goblint/analyzer#1044, goblint/analyzer#1045, goblint/analyzer#1124). * Add incremental analysis rename detection (goblint/analyzer#774, goblint/analyzer#777). * Fix address sets unsoundness (goblint/analyzer#822, goblint/analyzer#967, goblint/analyzer#564, goblint/analyzer#1032, goblint/analyzer#998, goblint/analyzer#1031). * Fix thread escape analysis unsoundness (goblint/analyzer#939, goblint/analyzer#984, goblint/analyzer#1074, goblint/analyzer#1078). * Fix many incremental analysis issues (goblint/analyzer#627, goblint/analyzer#836, goblint/analyzer#835, goblint/analyzer#841, goblint/analyzer#932, goblint/analyzer#678, goblint/analyzer#942, goblint/analyzer#949, goblint/analyzer#950, goblint/analyzer#957, goblint/analyzer#955, goblint/analyzer#954, goblint/analyzer#960, goblint/analyzer#959, goblint/analyzer#1004, goblint/analyzer#558, goblint/analyzer#1010, goblint/analyzer#1091). * Fix server mode for abstract debugging (goblint/analyzer#983, goblint/analyzer#990, goblint/analyzer#997, goblint/analyzer#1000, goblint/analyzer#1001, goblint/analyzer#1013, goblint/analyzer#1018, goblint/analyzer#1017, goblint/analyzer#1026, goblint/analyzer#1027). * Add documentation for configuration JSON schema and OCaml API (goblint/analyzer#999, goblint/analyzer#1054, goblint/analyzer#1055, goblint/analyzer#1053). * Add many library function specifications (goblint/analyzer#962, goblint/analyzer#996, goblint/analyzer#1028, goblint/analyzer#1079, goblint/analyzer#1121, goblint/analyzer#1135, goblint/analyzer#1138). * Add OCaml 5.0 support (goblint/analyzer#1003, goblint/analyzer#945, goblint/analyzer#1162).
This PR extends the base analysis with cases for
strcpy
(adapted),strncpy
,strcat
,strncat
,strlen
,strstr
,strcmp
andstrncmp
. The following scenarios can now be analyzed:dest
ofstrcpy
,strncpy
,strcat
orstrncat
potentially is a string literal, Goblint will now warn about a possible undefined behavior. For this purpose, the previous implementation ofstrcpy
had to be adapted.strlen
,strstr
,strcmp
andstrncmp
are string literals, Goblint can now compute the exact return value of these functions.I also added two regression tests, both with assertions that can now be tracked and assertions for which the String Literals Domain can still not make any assertion.
Note: The regression tests fail on MacOS while they succeed on Linux. I couldn't yet figure out why that is the case.