Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema refactoring #261

Merged
merged 22 commits into from
Feb 5, 2025
Merged

Schema refactoring #261

merged 22 commits into from
Feb 5, 2025

Conversation

alexthomas93
Copy link
Contributor

@alexthomas93 alexthomas93 commented Jan 27, 2025

Description

  • Refactors the schema creation code to reduce code duplication, improve maintainability, and ensure the code can be seamlessly imported into the LangChain Neo4j integration.
  • Introduces support for creating enhanced schemas.
  • Migrates and updates all relevant unit and integration tests to cover the enhanced schema functionality.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Complexity: Medium

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@alexthomas93 alexthomas93 changed the title Added format_schema function Schema refactoring Jan 28, 2025
@alexthomas93 alexthomas93 marked this pull request as ready for review January 28, 2025 16:59
@alexthomas93 alexthomas93 requested a review from a team as a code owner January 28, 2025 16:59
@alexthomas93
Copy link
Contributor Author

@CodiumAI-Agent /update_changelog

@CodiumAI-Agent
Copy link

Changelog updates: 🔄

2025-01-28

Added

  • Support for creating enhanced schemas with detailed property statistics.
  • New utility functions for schema formatting and value sanitization.
  • Updated unit and integration tests to cover enhanced schema functionality.

Changed

  • Refactored schema creation code to reduce duplication and improve maintainability.

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

src/neo4j_graphrag/schema.py Show resolved Hide resolved
src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
src/neo4j_graphrag/schema.py Show resolved Hide resolved
src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
src/neo4j_graphrag/schema.py Show resolved Hide resolved
src/neo4j_graphrag/schema.py Show resolved Hide resolved
): # Check if the sanitized value is not None
new_dict[key] = sanitized_value
elif isinstance(value, list):
if len(value) < LIST_LIMIT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is performed in value_sanitize in the isinstance(value, list) case, do we have to redo it here? Maybe it's the same in the end, because we will have to test for None return value if we skip this list length test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean. This is value_sanitize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant I think the test on line 113 is redundant, and I have the impression it should work if we remove it because then we enter the _value_sanitize function with a list as input, and the test on the list length is performed on line 124.

But it's just my impression looking at the code, and it's a minor thing.

src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
src/neo4j_graphrag/schema.py Show resolved Hide resolved
src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
src/neo4j_graphrag/schema.py Outdated Show resolved Hide resolved
Copy link
Contributor

@stellasia stellasia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for answering all my questions! Nice work! ⭐

@alexthomas93 alexthomas93 merged commit 0b93788 into neo4j:main Feb 5, 2025
7 checks passed
@alexthomas93 alexthomas93 deleted the refactor/schema branch February 5, 2025 11:21
@CodiumAI-Agent
Copy link

Changelog updates: 🔄

2025-02-05 *

Added

  • Support for creating enhanced schemas with detailed property statistics.
  • New utility functions for schema formatting and value sanitization.
  • Updated unit and integration tests to cover enhanced schema functionality.

Changed

  • Refactored schema creation code to reduce duplication and improve maintainability.

to commit the new content to the CHANGELOG.md file, please type:
'/update_changelog --pr_update_changelog.push_changelog_changes=true'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants