Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature(SqlLoader): transformations in SqlLoader #1569

Merged
merged 1 commit into from
Jan 31, 2025

Conversation

scaliseraoul
Copy link
Contributor

@scaliseraoul scaliseraoul commented Jan 31, 2025


Important

Enhance SQLDatasetLoader to apply transformations using TransformationManager and add corresponding tests.

  • Behavior:
    • SQLDatasetLoader in sql_loader.py now applies transformations using _apply_transformations() after executing SQL queries.
    • Removes redundant _apply_transformations() from local_loader.py.
  • Transformations:
    • TransformationManager in transformation_manager.py now directly uses Transformation objects for applying transformations.
  • Tests:
    • Adds test_sql_loader.py to test SQL loading and transformation application.
    • Removes SQL-related tests from test_loader.py.
  • Misc:
    • Updates VirtualDataFrame in virtual_dataframe.py to use SQLDatasetLoader for data loading.

This description was created by Ellipsis for 314ba22. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to 314ba22 in 50 seconds

More details
  • Looked at 460 lines of code in 7 files
  • Skipped 0 files when reviewing.
  • Skipped posting 7 drafted comments based on config settings.
1. pandasai/data_loader/transformation_manager.py:3
  • Draft comment:
    The import statement for numpy was removed, but it is not used anywhere in the file. This change is correct and does not affect functionality.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The import statement for numpy was removed, but it is not used anywhere in the file. This change is correct and does not affect functionality.
2. pandasai/data_loader/transformation_manager.py:886
  • Draft comment:
    The apply_transformations method now uses a Transformation object instead of a dictionary for transformations. This change improves type safety and clarity.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The TransformationManager class now uses a Transformation object instead of a dictionary for transformations. This change improves type safety and clarity.
3. pandasai/data_loader/transformation_manager.py:310
  • Draft comment:
    The format parameter was renamed to _format to avoid conflict with the built-in format function. This is a good practice to prevent potential issues.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The format parameter in the to_datetime method was renamed to _format to avoid conflict with the built-in format function. This is a good practice to prevent potential issues.
4. pandasai/data_loader/sql_loader.py:26
  • Draft comment:
    The load method no longer reinitializes self.query_builder. This is a good change as it avoids unnecessary reinitialization.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The load method in SQLDatasetLoader no longer reinitializes self.query_builder. This is a good change as it avoids unnecessary reinitialization.
5. tests/unit_tests/data_loader/test_sql_loader.py:16
  • Draft comment:
    The new functionality in SQLDatasetLoader is well-covered by the added unit tests in this file. Good job on ensuring the new code is tested!
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The code changes in the PR include modifications to the SQLDatasetLoader class, which is part of the library code. The PR also includes new unit tests for the SQLDatasetLoader in a separate test file, ensuring that the new functionality is tested. This aligns with the requirement that new code should have corresponding unit tests.
6. tests/unit_tests/data_loader/test_loader.py:110
  • Draft comment:
    The new functionality in SQLDatasetLoader is well-covered by the added unit tests in test_sql_loader.py. Good job on ensuring the new code is tested!
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The PR includes changes to the SQLDatasetLoader class, which is part of the library code. The PR also includes new unit tests for the SQLDatasetLoader in a separate test file, ensuring that the new functionality is tested. This aligns with the requirement that new code should have corresponding unit tests.
7. pandasai/data_loader/sql_loader.py:36
  • Draft comment:
    Ensure that the SQL queries are parameterized to prevent SQL injection vulnerabilities. Consider using parameterized queries or ORM features to handle this.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_LmNIF0YzCzP0RI9w


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@gventuri gventuri merged commit 847757a into sinaptik-ai:main Jan 31, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants