Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Cliff and Excessive Memory Usage at --unwind 6 #8600

Open
zhoulaifu opened this issue Feb 23, 2025 · 4 comments
Open

Performance Cliff and Excessive Memory Usage at --unwind 6 #8600

zhoulaifu opened this issue Feb 23, 2025 · 4 comments

Comments

@zhoulaifu
Copy link

zhoulaifu commented Feb 23, 2025

CBMC Bug Report: Performance Cliff and Excessive Memory Usage at --unwind 6

Performance Degradation and Memory Explosion in CBMC 6.4.1 on macOS ARM64 with --unwind 6 When Verifying cJSON.c.

Please advise if this is a known bug, and suggest patches or workarounds. I’m happy to provide more logs, test cases, or assist in debugging.

Description

I am experiencing a severe performance issue with CBMC 6.4.1 when increasing the --unwind value from 5 to 6 while verifying a C program that uses the cJSON library (cJSON.c). For --unwind ≤ 5, CBMC completes execution within seconds with reasonable memory usage. However, at --unwind 6, CBMC runs for over 10 hours without producing a result, and memory usage exceeds 100GB, which is unexpected and appears to indicate a state space explosion or bug.

Steps to Reproduce

  1. Environment:

    • OS: macOS (Apple M3)
    • CBMC Version: 6.4.1
  2. Project Structure:

    • Source files:
      • main.c: Contains a harness function provided below.
      • lib/cJSON/cJSON.c: The cJSON JSON parsing library (version 1.7.99, linked from https://github.com/DaveGamble/cJSON/blob/master/cJSON.c). The file also contains manually inserted assertions.
      • lib/cJSON/cJSON_Utils.c: Utilities for cJSON.
      • lib/cJSON/cJSON.h and lib/cJSON/cJSON_Utils.h: Corresponding headers (assumed to be included).
  3. Harness and Command:

    • Harness Code (in main.c):

      #include <assert.h>
      #include <string.h>
      
      #define MAX_INPUT_SIZE 512
      
      // Forward declaration
      char *process_command(const char *command);
      
      void harness() {
          char buffer[3][MAX_INPUT_SIZE]; // Process up to 3 commands
          for (int i = 0; i < 3; i++) {
              __CPROVER_assume(buffer[i][MAX_INPUT_SIZE - 1] == '\0');
              __CPROVER_assume(strncmp(buffer[i], "ADD", 3) == 0 || strncmp(buffer[i], "MOD", 3) == 0 || 
                               strncmp(buffer[i], "GET id", 6) == 0 || strncmp(buffer[i], "EXIT", 4) == 0);
              char *response = process_command(buffer[i]);
              if (response) {
                  assert(response[strlen(response)] == '\0');
                  free(response);
              }
          }
      }
    • Command for Low Unwind (Works Fine):

      cbmc -Ilib/cJSON main.c lib/cJSON/cJSON.c lib/cJSON/cJSON_Utils.c --function harness --trace --no-standard-checks --no-built-in-assertions --unwind 5
      • Completes in seconds with normal memory usage.
    • Command for High Unwind (Fails):

      cbmc -Ilib/cJSON app/main.c lib/cJSON/cJSON.c lib/cJSON/cJSON_Utils.c --function harness --trace --no-standard-checks --no-built-in-assertions --unwind 6
      • Runs for >10 hours without completing and consumes >100GB of memory.
  4. Verbose Output (With --verbosity 9):

    • When running with --unwind 6 and --verbosity 9, the following lines seem to repeat indefinitely, suggesting a potentially infinite loop or excessive unrolling:
      Unwinding loop cJSON_Delete.0 iteration 5 file lib/cJSON/cJSON.c line 260 function cJSON_Delete thread 0
      Unwinding recursion cJSON_Delete iteration 5
      Unwinding recursion cJSON_Delete iteration 6
      Not unwinding recursion cJSON_Delete iteration 7
      Unwinding loop cJSON_Delete.0 iteration 1 file lib/cJSON/cJSON.c line 260 function cJSON_Delete thread 0
      Not unwinding recursion cJSON_Delete iteration 7
      Unwinding loop cJSON_Delete.0 iteration 2 file lib/cJSON/cJSON.c line 260 function cJSON_Delete thread 0
      Not unwinding recursion cJSON_Delete iteration 7
      [And so on...]
      
    • For a single-command harness (processing one process_command call), the output is concise and fast:
      Unwinding recursion cJSON_Delete iteration 1
      Unwinding loop cJSON_Delete.0 iteration 1 file lib/cJSON/cJSON.c line 260 function cJSON_Delete thread 0
      Unwinding loop cJSON_Delete.0 iteration 1 file lib/cJSON/cJSON.c line 260 function cJSON_Delete thread 0
      ....
      
      This completes quickly with low memory usage.

Expected Behavior

  • I expected runtime and memory usage to increase gradually with each increment of --unwind (e.g., from 5 to 6), not exponentially. For --unwind 6, CBMC should complete within minutes (or at most hours) with reasonable memory usage (e.g., <10GB), not >10 hours and >100GB.

Actual Behavior

  • At --unwind 6, CBMC hangs for >10 hours, consumes >100GB of memory, and appears stuck in excessive unrolling of cJSON_Delete’s loop and recursion in lib/cJSON/cJSON.c (line 260).

Additional Context

  • Assertions in cJSON.c:

    • I have manually injected three assertions in cJSON.c to test for vulnerabilities.
  • Workarounds Tried:

    • Constrained buffer[i] to specific commands (ADD, MOD, GET id, EXIT) using __CPROVER_assume, but the issue persists.
    • Used a single-command harness, which works fast but doesn’t test the multi-command sequence needed for potential bugs.

Impact

This performance cliff prevents effective verification of multi-command sequences.

Possible Cause

  • The cJSON_Delete function (line 260 in cJSON.c) contains a while loop and recursive calls (cJSON_Delete(item->child)), which CBMC unrolls excessively at --unwind 6. This likely causes a state space explosion.
@zhoulaifu zhoulaifu changed the title CBMC Unwind Issue Leading to Extremely Long Execution Time CBMC Bug Report: Performance Cliff and Excessive Memory Usage at --unwind 6 Feb 23, 2025
@zhoulaifu zhoulaifu changed the title CBMC Bug Report: Performance Cliff and Excessive Memory Usage at --unwind 6 Performance Cliff and Excessive Memory Usage at --unwind 6 Feb 23, 2025
@tautschnig
Copy link
Collaborator

Trying to Explain Why You Might See this Behaviour

It seems that increasing the unwinding by 1 triggers execution of a whole new set of program statements. Most likely you will have CBMC running on a very wide call tree, caused by excessive branching. This, in turn, is often caused by CBMC resolving function pointers to some over-approximate set of functions over the ones that can actually be called at any such point. A possible scenario is that you are iterating beyond the bounds of an array of function pointers such that the sixth element isn't initialised anymore and, therefore, CBMC's over-approximate function pointer resolution behaviour becomes apparent.

Remark: The above is some speculation as I couldn't find version 1.7.99 of cJSON, and neither was I able to find the function process_command in the cJSON repository.

A Simple Remediation

To tell whether you are likely facing this scenario or not you may wish to review the early stages of the log output with --verbosity 9. You will find a line saying "Removal of function pointers and virtual functions" followed by lines that give a source location and then "replacing function pointer by possible targets". (If no such line is present then my guess is wrong and you don't actually have any function pointers.) If you do have such lines, consider using goto-instrument's --restrict-function-pointer capability as documented here: https://diffblue.github.io/cbmc/man/goto-instrument.html.

The Better Remediation

Instead of trying to work around the limitations of bounded model checking you might instead opt for unbounded verification, using code contracts and/or loop invariants. See our documentation at https://diffblue.github.io/cbmc/contracts-user.html. For an practical example see coreJSON where we insert loop invariants described in this patch.

@zhoulaifu
Copy link
Author

Thank you! We'll look into these remediations.

@kroening
Copy link
Member

Also consider --depth -- this limits the number of instructions, as opposed to the number of loop iterations, and often has less of a performance cliff.

@zhoulaifu
Copy link
Author

Daniel, thank you! I’ll definitely give that a try. CBMC is truly an awesome tool, and I’m excited about the prospect of using it for real-world projects.

I’ve had some success by simplifying a loop-heavy function in CJSON, although I’m a bit worried that these simplified functions or using the --depth flag might affect CBMC’s soundness. Still, I’m exploring that direction.

Also, could you possibly take a look at another question I posted? Here’s the link: #8602. It’s about CBMC’s modeling of library functions like sscanf and strcmp, and the related challenge of generating a counterexample for a char*.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants