Integrate BugsInPy #184

t-sorger · 2024-11-24T16:57:09Z

WIP: Issue #178

andre15silva · 2025-01-14T09:17:30Z

Any update here?

FYI, it would probably be a good idea to rebase with master, I made some updates recently.

t-sorger · 2025-01-14T11:36:44Z

I am still encountering issues with BugsInPy and how it works.
I will send you an email so we can arrange a meeting to discuss the problems I’m facing.

Thanks for the hint, I will rebase with master as soon as possible!

andre15silva · 2025-02-25T11:50:44Z

/gemini review

gemini-code-assist

Code Review

This pull request integrates the BugsInPy benchmark into the elleelleaime framework. The changes include adding the BugsInPy submodule, creating classes to represent the BugsInPy benchmark and bugs, and adding a new prompting strategy for Python bugs. Overall, the integration seems well-structured, but there are a few areas that could be improved for clarity and maintainability.

Summary of Findings

Error Handling in Bug ID Parsing: The code includes error handling for invalid bug IDs, which is good. However, consider logging the specific error message for better debugging.
Duplicated Code: The code to checkout both the buggy and fixed versions of the code is duplicated in TestBugsInPy.checkout_bug. Consider refactoring this into a single function.
Hardcoded Paths: There are several hardcoded paths in the code, such as in extract_single_function and TestBugsInPy. Consider using os.path.join or pathlib to make the code more portable.

Assessment

The pull request introduces the BugsInPy benchmark, which is a valuable addition to the project. The code is generally well-structured, but there are some areas that could be improved for clarity, maintainability, and robustness. I recommend addressing the review comments before merging. Please ensure that others review and approve this code before merging.

gemini-code-assist · 2025-02-25T11:51:04Z

elleelleaime/core/benchmarks/BugsInPy/BugsInPy.py

+                except ValueError:
+                    logging.warning(
+                        f"Skipping invalid bug ID: {bug_id.decode('utf-8')}"
+                    )


Consider logging the specific ValueError exception message for better debugging. This will help identify the cause of the invalid bug ID.

For example, you can log str(e) to capture the error message.

Suggested change

-                except ValueError:
-                    logging.warning(
-                        f"Skipping invalid bug ID: {bug_id.decode('utf-8')}"
-                    )
+                except ValueError as e:
+                    logging.warning(
+                        f"Skipping invalid bug ID: {bug_id.decode('utf-8')}. Error: {str(e)}"
+                    )
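
As a minimal, self-contained sketch of the same idea: the function name `parse_bug_ids` and the assumption that IDs arrive as UTF-8 bytes of the form `project-number` are illustrative and not taken from the PR. It shows the exception text being passed to the logger so the reason for skipping an ID is visible.

```python
import logging

logger = logging.getLogger(__name__)


def parse_bug_ids(raw_ids):
    """Return (project, number) pairs, skipping IDs that fail to parse.

    Hypothetical helper: assumes each ID is UTF-8 bytes like b"project-12".
    """
    parsed = []
    for raw in raw_ids:
        text = raw.decode("utf-8")
        try:
            # Unpacking raises ValueError when the ID contains no "-".
            project, number = text.rsplit("-", 1)
            # int() raises ValueError when the suffix is not numeric.
            parsed.append((project, int(number)))
        except ValueError as e:
            # Logging the exception text shows why the ID was rejected.
            logger.warning("Skipping invalid bug ID: %s. Error: %s", text, e)
    return parsed
```

Passing the values as logger arguments instead of building an f-string defers formatting until the record is actually emitted; either form carries the same information.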

gemini-code-assist · 2025-02-25T11:51:05Z

elleelleaime/core/utils/python/python.py

+        Optional[Tuple[str, str]]: None if the bug is not single-function, otherwise a tuple of the form (buggy_code, fixed_code)
+    """
+    project_name, _ = bug.get_identifier().rsplit("-", 1)
+    path = f"./benchmarks/BugsInPy/projects/{project_name}"


This path is hardcoded. Consider using os.path.join or pathlib to construct the path, making the code more portable and robust to directory structure changes.

Suggested change

-    path = f"./benchmarks/BugsInPy/projects/{project_name}"
+    path = Path("./benchmarks/BugsInPy/projects", project_name)
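
Note that wrapping the literal in `Path` still resolves relative to the current working directory. One possible way to make the location explicit is sketched below. The names `BUGSINPY_PROJECTS` and `project_dir` are hypothetical, and the directory layout is assumed from the snippet above.

```python
from pathlib import Path

# Assumed layout, copied from the reviewed snippet; adjust if the submodule moves.
BUGSINPY_PROJECTS = Path("benchmarks") / "BugsInPy" / "projects"


def project_dir(project_name: str, root: Path = Path(".")) -> Path:
    """Build the directory of a BugsInPy project under `root` (hypothetical helper)."""
    return root / BUGSINPY_PROJECTS / project_name
```

Callers that know the repository root can pass it as `root`, so the result no longer depends on where the process was started.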

gemini-code-assist · 2025-02-25T11:51:05Z

tests/core/benchmarks/BugInPy/test_BugsInPy.py

+        try:
+            # Checkout buggy version
+            bug.checkout(bug_identifier, fixed=False)
+
+            project_name, _ = bug_identifier.rsplit("-", 1)
+            path = f"./benchmarks/BugsInPy/framework/bin/temp/{project_name}"
+
+            # Assert that there are files in the directories
+            if len(list(Path(path).glob("**/*"))) == 0:
+                return False
+            # Assert that we can reach some Python files
+            buggy_python_files = list(Path(path).glob("**/*.py"))
+            if len(buggy_python_files) == 0:
+                return False
+
+            # Checkout fixed version
+            bug.checkout(bug_identifier, fixed=True)
+            # Assert that there are files in the directories
+            if len(list(Path(path).glob("**/*"))) == 0:
+                return False
+            # Assert that we can reach some Python files
+            buggy_python_files = list(Path(path).glob("**/*.py"))
+            if len(buggy_python_files) == 0:
+                return False
+
+            return True


The code to checkout both the buggy and fixed versions is duplicated. Consider refactoring this into a single function to reduce code duplication and improve maintainability.

You could create a helper function that takes the bug identifier and a boolean indicating whether to checkout the fixed version. This function would handle the checkout process and the assertions.

def checkout_bug_version(self, bug: Bug, fixed: bool) -> bool:
    bug_identifier = bug.get_identifier()
    project_name, _ = bug_identifier.rsplit("-", 1)
    path = f"./benchmarks/BugsInPy/framework/bin/temp/{project_name}"
    try:
        # Checkout specified version
        bug.checkout(bug_identifier, fixed=fixed)
        # Assert that there are files in the directories
        if len(list(Path(path).glob("**/*"))) == 0:
            return False
        # Assert that we can reach some Python files
        buggy_python_files = list(Path(path).glob("**/*.py"))
        if len(buggy_python_files) == 0:
            return False
        return True
    finally:
        shutil.rmtree(path, ignore_errors=True)

def test_checkout_bugs(self):
    bugs_in_py = get_benchmark("BugsInPy")
    assert bugs_in_py is not None
    bugs_in_py.initialize()
    # Run only the first 3 bugs to not take too long
    bugs = list(bugs_in_py.get_bugs())[:3]
    assert bugs is not None
    for bug in bugs:
        assert self.checkout_bug_version(bug, fixed=False), f"Failed checkout for {bug.get_identifier()} (buggy version)"
        assert self.checkout_bug_version(bug, fixed=True), f"Failed checkout for {bug.get_identifier()} (fixed version)"
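
The two directory checks in the helper could probably be collapsed further: a directory that contains a Python file necessarily contains files, so the first check is implied by the second. A small sketch follows, with `has_python_files` as a hypothetical name that does not appear in the PR.

```python
from pathlib import Path


def has_python_files(path) -> bool:
    """Return True if `path` contains at least one .py file at any depth."""
    # rglob("*.py") matches the same files as glob("**/*.py");
    # any() stops at the first match instead of building the full list.
    return any(Path(path).rglob("*.py"))
```

With such a helper, each checkout check would reduce to calling `bug.checkout(...)` and returning `has_python_files(path)`.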

gemini-code-assist · 2025-02-25T11:51:05Z

tests/core/benchmarks/BugInPy/test_BugsInPy.py

+            bug.checkout(bug_identifier, fixed=False)
+
+            project_name, _ = bug_identifier.rsplit("-", 1)
+            path = f"./benchmarks/BugsInPy/framework/bin/temp/{project_name}"


This path is hardcoded. Consider using os.path.join or pathlib to construct the path, making the code more portable and robust to directory structure changes.

path = Path("./benchmarks/BugsInPy/framework/bin/temp", project_name)

gemini-code-assist · 2025-02-25T11:51:05Z

tests/sample/instruct/test_instruct.py

        )

        # Assert we are dealing with the correct bug and strategy
-        assert sample["identifier"] == "Closure-115"
+        assert sample["identifier"] == "youtube-dl-1"
        assert sample["prompt_strategy"] == "instruct"


The prompt strategy is hardcoded as "instruct". However, the class variable PROMPT_STRATEGY is set to instruct_python. Please ensure that these are consistent, and that the correct prompt strategy is being used.

Suggested change

-        assert sample["prompt_strategy"] == "instruct"
+        assert sample["prompt_strategy"] == "instruct_python"
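
One way to keep the two values from drifting apart is to compare against the constant instead of repeating the literal. The sketch below is illustrative only: the class `InstructPythonChecks` and the method `check_sample` are hypothetical stand-ins for the real test class, which is not shown in this thread.

```python
class InstructPythonChecks:
    # Hypothetical stand-in for the test class; only the constant matters here.
    PROMPT_STRATEGY = "instruct_python"

    def check_sample(self, sample: dict) -> None:
        assert sample["identifier"] == "youtube-dl-1"
        # Compare against the class constant so the strategy name lives in one place.
        assert sample["prompt_strategy"] == self.PROMPT_STRATEGY
```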

andre15silva · 2025-02-25T11:52:24Z

@t-sorger just testing the gemini code assistant, don't mind the comments

gemini-code-assist · 2025-02-25T11:52:34Z

Thanks for testing the Gemini code assistant, @andre15silva! I've reviewed the changes and have provided some feedback in the review comments. Let me know if you have any questions.

t-sorger · 2025-02-25T11:56:24Z

@andre15silva still interesting to see what feedback Gemini gives!

gemini-code-assist · 2025-02-25T11:56:31Z

Hi @t-sorger,

It's great to see the BugsInPy benchmark being integrated! It's definitely interesting to see the feedback from the Gemini code assistant. I hope the review was helpful!

monperrus · 2025-03-21T06:58:04Z

hi all, how to complete this important task?

andre15silva · 2025-03-21T09:47:32Z

> hi all, how to complete this important task?

The major blocker right now is the execution of samples since executing them locally requires installing a lot of dependencies.

@t-sorger is working on dockerizing BugsInPy so the samples execute in an isolated environment with their own set of dependencies

t-sorger added 5 commits November 24, 2024 17:54

add BugsInPy submodule

a09695d

add initial BugsInPybug.py

c9384d5

add initial BugsInPy.py to benchmark

ce48490

add BugsInPy to core utils

865975b

add initial tests for BugsInPy; fix typo

e8976c5

t-sorger added 23 commits January 14, 2025 13:37

add BugsInPy submodule

9a3325d

add initial BugsInPybug.py

96d79c5

add initial BugsInPy.py to benchmark

83b35cd

add BugsInPy to core utils

0cf0179

add initial tests for BugsInPy; fix typo

e09839c

add test implementation for BugsInPybug

f335bdf

fix bin path issues

2bc479a

lint code

bd08ec1

rework tests for BugsInPy

11600a3

update submodules

1cc7bc6

Update submodules when rebasing with master

Merge branch 'BugsInPy' of github.com:ASSERT-KTH/repairbench-framewor…

0d28f9d

…k into BugsInPy

add BugsInPy submodule

d3de871

add initial BugsInPybug.py

56f4502

add initial BugsInPy.py to benchmark

8274a8d

add BugsInPy to core utils

63f5834

add initial tests for BugsInPy; fix typo

8e761a6

add test implementation for BugsInPybug

41821d4

fix bin path issues

28e4c9a

lint code

21420fd

rework tests for BugsInPy

5962796

update submodules

ea287fa

Update submodules when rebasing with master

Merge branch 'BugsInPy' of github.com:ASSERT-KTH/repairbench-framewor…

17c438d

…k into BugsInPy

adds RichBug and fixes process calls

7177e86

t-sorger added 6 commits January 26, 2025 16:35

add checks and fix path issues

7a195e0

fix code and first tests

1c2f662

fix error in tests

1845b6d

lint code

f0cfa76

start adding instruct test and new python utils

1c1ea5e

update python.py

1e0ffd0

gemini-code-assist bot reviewed Feb 25, 2025

t-sorger added 9 commits February 25, 2025 15:45

update Python utils and comment other test cases

edd053f

add InfillingPromptingPython

c74c397

update utils for Python

b679250

add test infilling for BugsInPy codellama

994e21e

lint files

4d3561c

uncomment other infilling tests

c583a39

add initial files for language_utils

779340a

add get_language_utils method

76272cf

add usage of LanguageUtils for infilling

b1e684f

andre15silva mentioned this pull request Mar 11, 2025

add support for run-bug-run runbugrun #39 WIP #166

Merged
