Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: encrypted database #5

Merged
merged 4 commits into from
Dec 24, 2024
Merged

feat: encrypted database #5

merged 4 commits into from
Dec 24, 2024

Conversation

JarbasAl
Copy link
Contributor

@JarbasAl JarbasAl commented Dec 23, 2024

Summary by CodeRabbit

  • New Features

    • Introduced secure storage options for JSON data with AES encryption.
    • Added classes for encrypted JSON storage compliant with XDG specifications.
    • Implemented functions for encryption and decryption of data, including JSON handling.
    • Automated execution of unit tests through a new GitHub Actions workflow.
  • Bug Fixes

    • Added robust error management for encryption and decryption processes.
  • Documentation

    • Updated exception handling with new classes for encryption-related errors.
  • Tests

    • Added a comprehensive test suite for the EncryptedJsonStorage class to ensure data integrity and security.

Copy link
Contributor

coderabbitai bot commented Dec 23, 2024

Warning

Rate limit exceeded

@JarbasAl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 21 minutes and 10 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 69e36cc and f3a2bc9.

📒 Files selected for processing (1)
  • .github/workflows/unit_tests.yml (1 hunks)

Walkthrough

The pull request introduces encryption capabilities to the JSON database library by adding two new classes: EncryptedJsonStorage and EncryptedJsonStorageXDG. These classes extend the existing storage mechanisms with AES encryption functionality. A new cryptography module crypto.py provides encryption and decryption methods using AES-GCM mode, supporting JSON data compression and secure storage. Additionally, two new exception classes are introduced to handle encryption-related errors, and a new GitHub Actions workflow is implemented to automate unit testing.

Changes

File Changes
json_database/__init__.py Added EncryptedJsonStorage and EncryptedJsonStorageXDG classes for encrypted JSON storage
json_database/crypto.py New module with encryption/decryption functions: encrypt, decrypt, encrypt_as_json, decrypt_from_json, compress_payload, decompress_payload
json_database/exceptions.py Added DecryptionKeyError and EncryptionKeyError exception classes
.github/workflows/unit_tests.yml New workflow for running unit tests on pull requests and pushes, with coverage reporting
test/requirements.txt Added dependencies for testing: coveralls, flake8, pytest, pytest-cov, cov-core, pycryptodomex
test/test_crypto.py New test suite for EncryptedJsonStorage class with multiple test cases for functionality validation

Poem

🔐 Encryption's Rabbit Hop! 🐰
In bytes of code, secrets now dance,
With AES, we take our stance,
Encrypted data, safe and tight,
A rabbit's magic, pure delight!
Secure and swift, our payload leaps! 🚀


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (6)
json_database/crypto.py (3)

18-28: Validate key length or usage for AES.
By default, AES keys are often 16, 24, or 32 bytes long, depending on the desired AES variant (AES-128, AES-192, AES-256). Consider ensuring the provided key meets these requirements to avoid ambiguity in encryption strength.


30-39: Ensure consistent error handling when decryption fails.
If decryption fails due to an invalid key, the user might benefit from a specialized message or logging context.


99-102: Use a ternary operator to simplify your compression logic.
This is a minor code readability improvement recommended by static analysis.

Example code:

-if isinstance(text, str):
-    decompressed = text.encode("utf-8")
-else:
-    decompressed = text
+decompressed = text.encode("utf-8") if isinstance(text, str) else text
🧰 Tools
🪛 Ruff (0.8.2)

99-102: Use ternary operator decompressed = text.encode("utf-8") if isinstance(text, str) else text instead of if-else-block

Replace if-else-block with decompressed = text.encode("utf-8") if isinstance(text, str) else text

(SIM108)

json_database/__init__.py (3)

108-109: Ensure docstring matches usage.
The docstring indicates a persistent dict stored with AES, which is accurate. Keep it updated if new features (like compression) are added over time.


111-113: Consider storing the encryption key in a secure manner.
Storing a plaintext encryption key in memory might be acceptable for many apps, but more sensitive data may require a secure or ephemeral approach.


332-346: Avoid calling xdg_data_home() in default arguments.
See the static analysis hint B008. It can cause unexpected behavior if xdg_data_home() changes at runtime.

Proposed fix:

-class EncryptedJsonStorageXDG(EncryptedJsonStorage):
-    def __init__(self,
-                 encrypt_key: str,
-                 name: str,
-                 xdg_folder=xdg_data_home(),
-                 ...
+class EncryptedJsonStorageXDG(EncryptedJsonStorage):
+    def __init__(self,
+                 encrypt_key: str,
+                 name: str,
+                 xdg_folder=None,
+                 ...
     if xdg_folder is None:
         xdg_folder = xdg_data_home()
🧰 Tools
🪛 Ruff (0.8.2)

338-338: Do not perform function call xdg_data_home in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c955eb9 and f9d7cdd.

📒 Files selected for processing (3)
  • json_database/__init__.py (3 hunks)
  • json_database/crypto.py (1 hunks)
  • json_database/exceptions.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
json_database/__init__.py

338-338: Do not perform function call xdg_data_home in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

json_database/crypto.py

14-14: Do not use bare except

(E722)


66-66: Do not use bare except

(E722)


67-67: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


76-76: Do not use bare except

(E722)


77-77: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


93-93: Do not use bare except

(E722)


94-94: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


99-102: Use ternary operator decompressed = text.encode("utf-8") if isinstance(text, str) else text instead of if-else-block

Replace if-else-block with decompressed = text.encode("utf-8") if isinstance(text, str) else text

(SIM108)

🔇 Additional comments (5)
json_database/crypto.py (1)

41-51: Avoid truncating keys without explanation.
Lines 44-45 truncate the key to 16 bytes. This silently enforces AES-128. If that is intentional, add documentation. Otherwise, consider supporting longer keys.

json_database/__init__.py (3)

16-16: Import usage looks good.
No issues here. The import from “crypto” is consistent with the usage below.


115-127: Decrypt after super().load_local for guaranteed dictionary.
This approach is consistent and avoids partial load states, but be mindful that if super() populates partial or malformed data, decryption might still fail.


128-139: Confirm intention to keep the unencrypted data in memory.
After storing, the code reverts to a decrypted state. This is convenient but might be a security concern if the memory is readable by other processes.

json_database/exceptions.py (1)

17-22: Exceptions look fine.
Defining custom exceptions clarifies the meaning of encryption or decryption failures, enhancing maintainability.

json_database/crypto.py Outdated Show resolved Hide resolved
json_database/crypto.py Outdated Show resolved Hide resolved
json_database/crypto.py Outdated Show resolved Hide resolved
json_database/crypto.py Show resolved Hide resolved
@github-actions github-actions bot added feature and removed feature labels Dec 24, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (2)
json_database/__init__.py (2)

338-344: Avoid function call in default argument.

Move the xdg_data_home() call inside the function to prevent potential issues with mutable default arguments.

     def __init__(self,
                  encrypt_key: str,
                  name: str,
-                 xdg_folder=xdg_data_home(),
+                 xdg_folder=None,
                  disable_lock=False,
                  subfolder="json_database",
                  extension="ejson"):
+        if xdg_folder is None:
+            xdg_folder = xdg_data_home()
🧰 Tools
🪛 Ruff (0.8.2)

341-341: Do not perform function call xdg_data_home in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


370-387: Move test code to proper test files.

The test block should be moved to a dedicated test file with proper test cases and assertions.

Consider creating tests/test_encrypted_storage.py with proper test cases:

import pytest
from json_database import EncryptedJsonStorage
from tempfile import NamedTemporaryFile

def test_encrypted_storage():
    with NamedTemporaryFile() as tf:
        # Test with different key sizes
        for key_size in (16, 24, 32):
            key = "S" * key_size
            db = EncryptedJsonStorage(key, tf.name)
            db["test"] = "value"
            db.store()
            
            # Verify data is encrypted on disk
            raw_db = JsonStorage(tf.name)
            assert "ciphertext" in raw_db
            
            # Verify data can be decrypted
            db2 = EncryptedJsonStorage(key, tf.name)
            assert db2["test"] == "value"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9d7cdd and 9a191e3.

📒 Files selected for processing (2)
  • json_database/__init__.py (4 hunks)
  • json_database/crypto.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
json_database/__init__.py

341-341: Do not perform function call xdg_data_home in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

json_database/crypto.py

13-13: Do not use bare except

(E722)


68-71: Use ternary operator decompressed = text.encode("utf-8") if isinstance(text, str) else text instead of if-else-block

Replace if-else-block with decompressed = text.encode("utf-8") if isinstance(text, str) else text

(SIM108)

🔇 Additional comments (1)
json_database/crypto.py (1)

6-14: Improve error handling in AES import.

The bare except can mask important errors. Use specific exception handling.

 try:
     # pycryptodomex
     from Cryptodome.Cipher import AES
 except ImportError:
     # pycrypto + pycryptodome
     try:
         from Crypto.Cipher import AES
-    except:
+    except ImportError:
         AES = None
🧰 Tools
🪛 Ruff (0.8.2)

13-13: Do not use bare except

(E722)

json_database/crypto.py Show resolved Hide resolved
json_database/crypto.py Show resolved Hide resolved
json_database/crypto.py Show resolved Hide resolved
json_database/crypto.py Show resolved Hide resolved
json_database/__init__.py Show resolved Hide resolved
json_database/__init__.py Show resolved Hide resolved
@github-actions github-actions bot added feature and removed feature labels Dec 24, 2024
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/test_crypto.py (1)

9-9: Consider using a randomized or environment-derived encryption key for production scenarios.
Currently, for testing purposes, the key is a fixed string. While acceptable for most tests, you might want to highlight in documentation or comments that actual usage should involve a secure key generation.

test/requirements.txt (1)

1-6: Well-structured inclusion of relevant testing dependencies.
All essential packages for coverage, linting, pytest, and encryption appear properly declared. If you need stricter control, you might consider pinning “pycryptodomex” to a specific version.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a191e3 and 69e36cc.

📒 Files selected for processing (3)
  • .github/workflows/unit_tests.yml (1 hunks)
  • test/requirements.txt (1 hunks)
  • test/test_crypto.py (1 hunks)
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/unit_tests.yml

41-41: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


43-43: the runner of "actions/setup-python@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


63-63: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🔇 Additional comments (5)
test/test_crypto.py (5)

10-10: Potential concurrency issue with a static file path.
Writing and deleting “/tmp/test.json” may lead to conflicts if tests are run in parallel. Consider using a unique file per test or using a temporary directory fixture.


20-25: Good coverage of data storage.
This test thoroughly checks that data is stored both in memory and on disk. It might be beneficial to add a quick assertion confirming that the encrypted content isn't plain text within the same test to reinforce coverage.


27-34: Thorough plaintext absence validation.
Ensuring that “42” is not present post-write is a direct way to test encryption. Good job!


35-41: Solid decryption check after reload.
Verifying successful decryption covers a critical usage scenario.


42-49: Smart check against reading encrypted data with JsonStorage.
Verifying the absence of the plaintext key is an effective approach.

runs-on: ubuntu-latest
timeout-minutes: 35
steps:
- uses: actions/checkout@v2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Upgrade to the latest action versions for better compatibility.
Static analysis indicates that these versions might be outdated on newer GitHub runners. For longevity and security, consider updating them as follows:

-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3

-        uses: actions/setup-python@v2
+        uses: actions/setup-python@v4

-        uses: codecov/codecov-action@v3
+        uses: codecov/codecov-action@v4

Also applies to: 43-43, 63-63

🧰 Tools
🪛 actionlint (1.7.4)

41-41: the runner of "actions/checkout@v2" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

@JarbasAl JarbasAl merged commit 0a7cd49 into dev Dec 24, 2024
2 checks passed
JarbasAl added a commit that referenced this pull request Dec 24, 2024
* feat:semver

* feat:semver

* add gitignore

* feat: encrypted database (#5)

* feat: encrypted database

* feat: encrypted database

* tests

* tests

* Increment Version to 0.8.0a1

* Update Changelog

---------

Co-authored-by: miro <[email protected]>
Co-authored-by: JarbasAI <[email protected]>
Co-authored-by: JarbasAl <[email protected]>
@JarbasAl JarbasAl deleted the feat/encrypted_db branch December 24, 2024 02:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant