Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat:shared_utils #324

Open
wants to merge 2 commits into
base: dev
Choose a base branch
from
Open

feat:shared_utils #324

wants to merge 2 commits into from

Conversation

JarbasAl
Copy link
Member

@JarbasAl JarbasAl commented Feb 1, 2025

gather some basic util methods from across our repos that were flagged with a TODO

reorganize functions dumped at the top level module into their own files by topic for easier maintenance

Summary by CodeRabbit

Release Notes

  • Refactor

    • Consolidated utility functions across multiple modules
    • Reorganized imports and function locations
  • New Features

    • Added new utility functions for list manipulation, including rotate_list, flatten_list, and deduplicate_list
    • Enhanced text processing capabilities with functions like camel_case_split, collapse_whitespaces, and remove_accents_and_punct
    • Improved threading capabilities with new functions for timeout handling and daemon thread creation
  • Deprecation

    • Removed several existing utility functions
    • Added deprecation warnings for certain methods
  • Performance

    • Introduced advanced caching and property management decorators, including classproperty and timed_lru_cache

gather some basic util methods from across our repos that were flagged with a TODO

reorganize functions dumped at the top level module into their own files by topic for easier maintenance
Copy link
Contributor

coderabbitai bot commented Feb 1, 2025

Walkthrough

This pull request introduces significant refactoring of the ovos_utils package by reorganizing utility functions across different modules. The changes involve removing several functions from the __init__.py file and redistributing them into more specialized modules like decorators.py, list_utils.py, text_utils.py, and thread_utils.py. The modifications include the addition of new decorators, list manipulation functions, text processing utilities, and threading helpers, along with deprecation warnings for certain functions.

Changes

File Changes
ovos_utils/__init__.py - Removed multiple utility functions
- Added deprecation warnings
- Updated imports
ovos_utils/decorators.py - Added classproperty decorator
- Added timed_lru_cache decorator
ovos_utils/list_utils.py - Added rotate_list function
- Added flatten_list function
- Added deduplicate_list function
ovos_utils/text_utils.py - Added camel_case_split function
- Added collapse_whitespaces function
- Added rm_parentheses function
- Added remove_accents_and_punct function
ovos_utils/thread_utils.py - Added threaded_timeout decorator
- Added create_killable_daemon function
- Added create_daemon function
- Added wait_for_exit_signal function

Possibly related PRs

Suggested reviewers

  • NeonDaniel

Poem

🐰 Hop, hop, refactor we go!
Functions dancing to and fro
Modules clean, code so bright
Utility helpers take their flight
A rabbit's dance of clean design! 🌟

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added feature and removed feature labels Feb 1, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
ovos_utils/list_utils.py (2)

4-4: Improve parameter naming.

The parameter name l is ambiguous and could be confused with the number 1. Consider using a more descriptive name.

-def rotate_list(l: List[Any], n: int = 1) -> List[Any]:
+def rotate_list(lst: List[Any], n: int = 1) -> List[Any]:
🧰 Tools
🪛 Ruff (0.8.2)

4-4: Ambiguous variable name: l

(E741)


35-35: Replace lambda with a regular function definition.

Using lambda assignments is against Python style guidelines. Convert it to a regular function for better readability.

-    _flatten = lambda l: [item for sublist in l for item in sublist]
+    def _flatten(lst):
+        return [item for sublist in lst for item in sublist]
🧰 Tools
🪛 Ruff (0.8.2)

35-35: Do not assign a lambda expression, use a def

Rewrite _flatten as a def

(E731)


35-35: Ambiguous variable name: l

(E741)

ovos_utils/text_utils.py (1)

31-42: Consider handling nested parentheses.

The current regex pattern doesn't handle nested parentheses correctly. Consider using a more robust approach.

-    return re.sub(r"\((.*?)\)", "", text).replace("  ", " ")
+    def remove_parentheses(text):
+        stack = []
+        start = -1
+        result = list(text)
+        for i, char in enumerate(text):
+            if char == '(':
+                if not stack:
+                    start = i
+                stack.append(i)
+            elif char == ')' and stack:
+                stack.pop()
+                if not stack:
+                    for j in range(start, i + 1):
+                        result[j] = ''
+        return collapse_whitespaces(''.join(result))
+    return remove_parentheses(text)
ovos_utils/__init__.py (1)

29-35: Consider adding migration guidance in deprecation warnings.

The deprecation warning could be more helpful by suggesting alternative approaches or explaining why the functionality is being removed.

     warnings.warn(
-        "deprecated without replacement and will be removed in a future release.",
+        "The create_loop function is deprecated and will be removed in version 2.0.0. "
+        "Consider using asyncio.create_task or threading.Timer for similar functionality.",
         DeprecationWarning,
         stacklevel=2,
     )
ovos_utils/thread_utils.py (4)

9-51: Improve thread cleanup and resource management.

While the implementation is functional, consider these improvements for better resource management:

  1. The thread continues running in the background even after timeout, which could lead to resource leaks.
  2. No cleanup is performed for thread resources.

Consider this enhanced implementation:

 def threaded_timeout(timeout: int = 5) -> Callable:
     def deco(func: Callable) -> Callable:
         @wraps(func)
         def wrapper(*args: Any, **kwargs: Any) -> Any:
             res = [Exception(f'function [{func.__name__}] timeout '
                              f'[{timeout}] exceeded!')]
+            stop_event = Event()
 
             def func_wrapped():
                 try:
                     res[0] = func(*args, **kwargs)
                 except Exception as e:
                     res[0] = e
+                finally:
+                    stop_event.set()
 
             t = Thread(target=func_wrapped)
             t.daemon = True
             try:
                 t.start()
-                t.join(timeout)
+                stop_event.wait(timeout)
+                if not stop_event.is_set():
+                    LOG.warning(f"Function {func.__name__} timed out after {timeout}s")
             except Exception as je:
                 raise je
             ret = res[0]
             if isinstance(ret, BaseException):
                 raise ret
             return ret
 
         return wrapper
 
     return deco

This implementation:

  1. Uses an Event to properly signal completion
  2. Logs a warning when timeout occurs
  3. Provides better cleanup of thread resources

54-74: Add input validation for better error handling.

The function would benefit from input validation to prevent runtime errors.

Consider this enhanced implementation:

 def create_killable_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None, autostart: bool = True) -> kthread.KThread:
+    if not callable(target):
+        raise TypeError("target must be a callable")
+    if not isinstance(args, tuple):
+        raise TypeError("args must be a tuple")
+    kwargs = kwargs or {}
     t = kthread.KThread(target=target, args=args, kwargs=kwargs)
     t.daemon = True
     if autostart:
         t.start()
     return t

This implementation:

  1. Validates that target is callable
  2. Ensures args is a tuple
  3. Safely handles None kwargs

77-97: Reduce code duplication with create_killable_daemon.

The function shares significant code with create_killable_daemon. Consider extracting common logic to reduce duplication.

Consider this refactored implementation:

+def _validate_thread_args(target: Callable, args: tuple, kwargs: Optional[dict]) -> dict:
+    if not callable(target):
+        raise TypeError("target must be a callable")
+    if not isinstance(args, tuple):
+        raise TypeError("args must be a tuple")
+    return kwargs or {}
+
 def create_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None, autostart: bool = True) -> Thread:
+    kwargs = _validate_thread_args(target, args, kwargs)
     t = Thread(target=target, args=args, kwargs=kwargs)
     t.daemon = True
     if autostart:
         t.start()
     return t

And update create_killable_daemon:

 def create_killable_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None, autostart: bool = True) -> kthread.KThread:
+    kwargs = _validate_thread_args(target, args, kwargs)
     t = kthread.KThread(target=target, args=args, kwargs=kwargs)
     t.daemon = True
     if autostart:
         t.start()
     return t

This refactoring:

  1. Extracts common validation logic
  2. Reduces code duplication
  3. Ensures consistent behavior across both functions

100-114: Enhance signal handling and fix f-string.

The function could benefit from handling additional termination signals, and there's an unnecessary f-string in the log message.

Consider this enhanced implementation:

+import signal
+
 def wait_for_exit_signal() -> None:
+    exit_event = Event()
+    
+    def signal_handler(signum, frame):
+        LOG.debug("Exiting on signal %s", signal.Signals(signum).name)
+        exit_event.set()
+    
+    signal.signal(signal.SIGTERM, signal_handler)
+    signal.signal(signal.SIGINT, signal_handler)
+    
     try:
-        Event().wait()
+        exit_event.wait()
     except KeyboardInterrupt:
-        LOG.debug(f"Exiting on KeyboardInterrupt")
+        LOG.debug("Exiting on KeyboardInterrupt")

This implementation:

  1. Handles both SIGTERM and SIGINT signals
  2. Uses a shared Event for coordinated exit
  3. Removes unnecessary f-string
  4. Uses proper string formatting in log messages
🧰 Tools
🪛 Ruff (0.8.2)

114-114: f-string without any placeholders

Remove extraneous f prefix

(F541)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5ba67f and 90db3f1.

📒 Files selected for processing (5)
  • ovos_utils/__init__.py (2 hunks)
  • ovos_utils/decorators.py (1 hunks)
  • ovos_utils/list_utils.py (1 hunks)
  • ovos_utils/text_utils.py (1 hunks)
  • ovos_utils/thread_utils.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
ovos_utils/list_utils.py

4-4: Ambiguous variable name: l

(E741)


35-35: Do not assign a lambda expression, use a def

Rewrite _flatten as a def

(E731)


35-35: Ambiguous variable name: l

(E741)

ovos_utils/thread_utils.py

114-114: f-string without any placeholders

Remove extraneous f prefix

(F541)

ovos_utils/__init__.py

17-17: ovos_utils.decorators.classproperty imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


17-17: ovos_utils.decorators.timed_lru_cache imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


18-18: ovos_utils.list_utils.flatten_list imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


18-18: ovos_utils.list_utils.rotate_list imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


19-19: ovos_utils.log.LOG imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


20-20: ovos_utils.text_utils.camel_case_split imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.wait_for_exit_signal imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.threaded_timeout imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.create_killable_daemon imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: license_tests
🔇 Additional comments (5)
ovos_utils/list_utils.py (1)

45-62: LGTM! Well-implemented deduplication with order preservation option.

The implementation is clean and efficient, offering flexibility with the keep_order parameter.

ovos_utils/text_utils.py (1)

43-58: LGTM! Well-implemented Unicode normalization.

The function correctly handles Unicode normalization and preserves necessary punctuation marks.

ovos_utils/__init__.py (1)

17-21: LGTM! Re-exporting symbols from submodules.

The imports are used for re-exporting symbols, making them available at the package level.

🧰 Tools
🪛 Ruff (0.8.2)

17-17: ovos_utils.decorators.classproperty imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


17-17: ovos_utils.decorators.timed_lru_cache imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


18-18: ovos_utils.list_utils.flatten_list imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


18-18: ovos_utils.list_utils.rotate_list imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


19-19: ovos_utils.log.LOG imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


20-20: ovos_utils.text_utils.camel_case_split imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.wait_for_exit_signal imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.threaded_timeout imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)


21-21: ovos_utils.thread_utils.create_killable_daemon imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

ovos_utils/decorators.py (2)

6-20: LGTM! Well-implemented classproperty decorator.

The implementation correctly handles class-level properties with proper type hints and documentation.


23-71: LGTM! Robust implementation of timed LRU cache.

The implementation properly handles:

  • Optional decorator arguments
  • Cache expiration using monotonic time
  • Preservation of original cache management methods

@github-actions github-actions bot added feature and removed feature labels Feb 1, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
ovos_utils/thread_utils.py (4)

11-55: Review: threaded_timeout Decorator Implementation
The implementation uses a mutable list (res) to capture the result or exception from the function call, which is an effective pattern for inner-function communication. However, note that the docstring states that "the thread is terminated" if the timeout is exceeded. In reality, when using a standard Thread, there is no built-in mechanism to kill the thread—the thread will continue running in the background even after a timeout. Consider updating the docstring to clarify this behavior or switching to a killable thread (e.g., using kthread) if termination is required.

Example diff to update the docstring:

-    If the timeout is exceeded, an exception is raised and the thread is terminated.
+    If the timeout is exceeded, an exception is raised. Note: The underlying thread is not forcefully terminated and may continue running in the background.

57-78: Review: create_killable_daemon Function
This helper function correctly leverages kthread.KThread to create a daemon thread with an option to autostart. One improvement suggestion is to ensure that the kwargs parameter is always a dictionary. If kwargs is passed as None, it may be preferable to default it to an empty dictionary before passing it to the thread constructor.

Example diff:

-def create_killable_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None,
-                           autostart: bool = True) -> kthread.KThread:
-    t = kthread.KThread(target=target, args=args, kwargs=kwargs)
+def create_killable_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None,
+                           autostart: bool = True) -> kthread.KThread:
+    t = kthread.KThread(target=target, args=args, kwargs=kwargs or {})

81-101: Review: create_daemon Function
This function mirrors the logic of create_killable_daemon for standard daemon threads. Similar to the previous function, consider handling the kwargs parameter to default to an empty dictionary if None is provided. This can help avoid potential issues if the underlying Thread constructor does not handle a None value gracefully.

Example diff:

-def create_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None, autostart: bool = True) -> Thread:
-    t = Thread(target=target, args=args, kwargs=kwargs)
+def create_daemon(target: Callable, args: tuple = (), kwargs: Optional[dict] = None, autostart: bool = True) -> Thread:
+    t = Thread(target=target, args=args, kwargs=kwargs or {})

104-127: Review: wait_for_exit_signal Function
The wait_for_exit_signal function is well implemented for gracefully blocking execution until a termination signal is received. The use of an Event and a custom signal handler is appropriate. One minor suggestion is to consider restoring the original signal handlers after the event is set if other parts of the application rely on them, though this might be unnecessary depending on the overall program design.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90db3f1 and d02f592.

📒 Files selected for processing (1)
  • ovos_utils/thread_utils.py (1 hunks)
🔇 Additional comments (1)
ovos_utils/thread_utils.py (1)

1-10: Review: Imports and Module Dependencies
The import statements are clear and include all necessary modules for the threading utilities. One minor suggestion is to ensure that the external dependency kthread is available in the environment or to document its installation as it is vital for the killable daemon functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant