Tracks progress for package creation, upload and kickoff #2935

Merged
merged 17 commits into from
Feb 4, 2025

Conversation

Contributor

@kumare3 kumare3 commented Nov 16, 2024

Adds progress visualization so users can see when large packages are being uploaded or the wrong files are being packaged.

Summary by Bito

Comprehensive Flytekit improvements including progress tracking with updated environment variables (LOGGING_RICH_FMT_ENV_VAR), configurable chunk sizes for S3/GCS uploads, and enhanced Ray task configuration with pod template support. Features include package creation/compression progress visualization, improved error handling, and secret management functionality. Status messages and variable names in fast registration module have been made more descriptive.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

codecov bot commented Nov 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.26%. Comparing base (3f0ab84) to head (59b6742).
Report is 7 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2935       +/-   ##
===========================================
+ Coverage   76.33%   93.26%   +16.93%     
===========================================
  Files         199       48      -151     
  Lines       20840     1842    -18998     
  Branches     2681        0     -2681     
===========================================
- Hits        15908     1718    -14190     
+ Misses       4214      124     -4090     
+ Partials      718        0      -718     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Contributor

@Mecoli1219 Mecoli1219 left a comment

I think we could consider implementing a global Progress to:

  1. Simplify the codebase
  2. Enable consistent progress reporting behavior across modules
  3. Make it easier to add progress tracking to new modules
  4. Allow for a global flag to enable/disable all progress logging (especially if we want something like --verbose or --silence)
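A minimal sketch of what such a global Progress could look like, using only the standard library. `ProgressReporter`, `track`, and `PROGRESS` are hypothetical names for illustration, not flytekit APIs; the PR itself renders progress with rich.

```python
import sys
from typing import Callable, Optional, TextIO


class ProgressReporter:
    """Hypothetical process-wide progress reporter (not part of flytekit)."""

    def __init__(self, enabled: bool = True, stream: Optional[TextIO] = None) -> None:
        self.enabled = enabled  # single switch, e.g. driven by --verbose / --silence
        self.stream = stream if stream is not None else sys.stderr

    def track(self, description: str, total: int) -> Callable[..., None]:
        """Return an advance(n) callback for one named step."""
        done = 0

        def advance(n: int = 1) -> None:
            nonlocal done
            done = min(total, done + n)
            if self.enabled:
                print(f"{description}: {done}/{total}", file=self.stream)

        return advance


# One shared instance that every module would import instead of
# constructing its own progress bar.
PROGRESS = ProgressReporter()

if __name__ == "__main__":
    advance = PROGRESS.track("Uploading package", total=3)
    for _ in range(3):
        advance()
```

Because every module goes through the same instance, flipping `PROGRESS.enabled` once silences or enables all progress output.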

kumare3 and others added 7 commits November 18, 2024 17:45
@fiedlerNr9 fiedlerNr9 force-pushed the progress-tracking-pyflyte-run branch from a6cc2ed to ae55489 Compare January 23, 2025 04:26
Contributor

fiedlerNr9 commented Jan 23, 2025

fast-register.mov

Changes I made:

  • introduced FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR to control the progress display
  • updated the rich progress task descriptions and appearance for the tarball-creation and tarball-compression steps
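The two steps named above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in fast_registration.py: `package`, `print_step`, and the environment variable string `EXAMPLE_DISPLAY_PROGRESS` are hypothetical (the PR only names the constant FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR, not its value), and plain print calls stand in for the rich progress bar.

```python
import gzip
import os
import shutil
import tarfile
from pathlib import Path
from typing import Callable, Optional

# Assumption: placeholder for the string behind FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR.
DISPLAY_PROGRESS_ENV = "EXAMPLE_DISPLAY_PROGRESS"

Reporter = Callable[[str, int, int], None]


def progress_enabled() -> bool:
    """True only for explicit opt-in values of the environment variable."""
    return os.getenv(DISPLAY_PROGRESS_ENV, "").strip().lower() in {"1", "true", "yes"}


def print_step(step: str, done: int, total: int) -> None:
    print(f"{step}: {done}/{total}")


def package(source_dir: Path, out_dir: Path, report: Optional[Reporter] = None) -> Path:
    """Create a tarball of source_dir, then gzip it, reporting each step."""
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    total_files = len(files)

    tar_path = out_dir / "package.tar"
    with tarfile.open(tar_path, "w") as tar:
        for files_processed, path in enumerate(files, start=1):
            tar.add(path, arcname=path.relative_to(source_dir).as_posix())
            if report:
                report("Creating tarball", files_processed, total_files)

    gz_path = out_dir / "package.tar.gz"
    with open(tar_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    tar_path.unlink()  # keep only the compressed archive
    if report:
        report("Compressing tarball", 1, 1)
    return gz_path
```

A caller would pass `print_step if progress_enabled() else None` as `report`, so the packaging code itself never reads the environment. Listing each file as it is added is also what makes an unexpectedly large or misplaced file visible.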

Contributor

flyte-bot commented Jan 23, 2025

Code Review Agent Run #941fb7

Actionable Suggestions - 5
  • flytekit/remote/remote.py - 3
    • Consider encapsulating task identifier parameters · Line 446-452
    • Consider using dataclass for parameters · Line 613-617
    • Consider consolidating progress bar initialization logic · Line 1252-1258
  • flytekit/tools/fast_registration.py - 1
  • flytekit/loggers.py - 1
Additional Suggestions - 2
  • flytekit/remote/remote.py - 2
    • Consider combining optional string parameters · Line 613-617
    • Consider combining optional parameters into line · Line 479-483
Review Details
  • Files reviewed - 3 · Commit Range: 59b6742..5b05419
    • flytekit/loggers.py
    • flytekit/remote/remote.py
    • flytekit/tools/fast_registration.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Contributor

flyte-bot commented Jan 23, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change: Feature Improvement - Progress Tracking Implementation

Files Impacted:
  • loggers.py - Added function to check if progress display is enabled
  • remote.py - Implemented progress tracking for package uploads
  • fast_registration.py - Added visual progress tracking for package creation and compression

Comment on lines 446 to 452
    def fetch_task(
        self,
        project: str = None,
        domain: str = None,
        name: str = None,
        version: str = None,
    ) -> FlyteTask:
Contributor

Consider encapsulating task identifier parameters

Consider using a dataclass or named tuple for the parameters since they are used together in multiple places in the codebase (e.g., fetch_task_lazy, execute_local_task, sync_execution). This could improve code maintainability and reduce parameter duplication.

Code suggestion
Check the AI-generated fix before applying
 # Add TaskIdentifier class
 @dataclass
 class TaskIdentifier:
     project: Optional[str] = None
     domain: Optional[str] = None 
     name: Optional[str] = None
     version: Optional[str] = None

 # Update method signature
 def fetch_task(self, task_id: TaskIdentifier = None) -> FlyteTask:

Code Review Run #941fb7


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines 613 to 617
        self,
        project: str = None,
        domain: str = None,
        name: str = None,
        version: str = None,
Contributor

Consider using dataclass for parameters

Consider using a dataclass or named tuple for the parameters since they are all optional string parameters that appear to be used together frequently.

Code suggestion
Check the AI-generated fix before applying
Suggested change
-        self,
-        project: str = None,
-        domain: str = None,
-        name: str = None,
-        version: str = None,
+        self, entity_id: EntityIdentifier = None,

Code Review Run #941fb7


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

flytekit/loggers.py (outdated, resolved)
Collaborator

@eapolinario eapolinario left a comment

A few comments, nothing major though.

flytekit/remote/remote.py (outdated, resolved)
flytekit/loggers.py (outdated, resolved)
@@ -186,5 +187,9 @@ def get_level_from_cli_verbosity(verbosity: int) -> int:
return logging.DEBUG


def is_display_progress_enabled() -> bool:
    return os.getenv(FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR, False)
Collaborator

@eapolinario eapolinario Feb 3, 2025

use str2bool instead:

Suggested change
-    return os.getenv(FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR, False)
+    return str2bool(os.getenv(FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR))

Contributor

I tried using str2bool here, but we end up with a circular import. Do you have a preference for how to fix this?

flytekit/tools/fast_registration.py (outdated, resolved)
Contributor

flyte-bot commented Feb 4, 2025

Code Review Agent Run #fd1704

Actionable Suggestions - 0
Additional Suggestions - 10
  • flytekit/models/security.py - 2
    • Consider adding env_var validation check · Line 45-45
    • Consider validating env_var before usage · Line 63-63
  • flytekit/core/resources.py - 1
    • Add parameter validation for container name · Line 106-106
  • tests/flytekit/integration/remote/workflows/basic/get_secret.py - 1
    • Consider explicit error for missing env · Line 19-20
  • flytekit/core/data_persistence.py - 2
    • Consider more descriptive environment variable name · Line 57-57
    • Consider configurable chunksize for protocols · Line 130-131
  • flytekit/remote/remote.py - 2
  • plugins/flytekit-ray/flytekitplugins/ray/task.py - 2
    • Consider improving error message clarity · Line 50-57
    • Consider consolidating pod template creation logic · Line 107-134
Review Details
  • Files reviewed - 29 · Commit Range: 5b05419..29b77f8
    • .pre-commit-config.yaml
    • flytekit/core/data_persistence.py
    • flytekit/core/resources.py
    • flytekit/core/type_engine.py
    • flytekit/image_spec/default_builder.py
    • flytekit/image_spec/image_spec.py
    • flytekit/interaction/parse_stdin.py
    • flytekit/models/security.py
    • flytekit/models/task.py
    • flytekit/remote/remote.py
    • flytekit/remote/remote_fs.py
    • flytekit/tools/fast_registration.py
    • flytekit/types/structured/structured_dataset.py
    • plugins/flytekit-ray/flytekitplugins/ray/task.py
    • plugins/flytekit-ray/setup.py
    • plugins/flytekit-ray/tests/test_ray.py
    • pydoclint-errors-baseline.txt
    • pyproject.toml
    • tests/flytekit/integration/remote/test_remote.py
    • tests/flytekit/integration/remote/workflows/basic/get_secret.py
    • tests/flytekit/integration/remote/workflows/basic/sd_attr.py
    • tests/flytekit/unit/core/image_spec/test_default_builder.py
    • tests/flytekit/unit/core/test_data_persistence.py
    • tests/flytekit/unit/core/test_dataclass.py
    • tests/flytekit/unit/core/test_list.py
    • tests/flytekit/unit/core/test_resources.py
    • tests/flytekit/unit/core/test_unions.py
    • tests/flytekit/unit/extras/pydantic_transformer/test_pydantic_basemodel_transformer.py
    • tests/flytekit/unit/types/structured_dataset/test_structured_dataset.py
  • Files skipped - 1
    • .github/workflows/pythonbuild.yml - Reason: Filter setting
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Contributor

flyte-bot commented Feb 4, 2025

Code Review Agent Run #57e89b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 29b77f8..1d32481
    • flytekit/loggers.py
    • flytekit/tools/fast_registration.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Collaborator

This is pretty cool. Thank you!

@eapolinario eapolinario merged commit 5b1833d into master Feb 4, 2025
108 of 110 checks passed
UmerAhmad pushed a commit to UmerAhmad/flytekit that referenced this pull request Feb 8, 2025
* Tracks progress for package creation, upload and kickoff

Signed-off-by: Ketan Umare <[email protected]>

* updated

Signed-off-by: Ketan Umare <[email protected]>

* introduce FLYTEKIT_DISPLAY_PROGRESS_ENV_VAR to control progress

Signed-off-by: Jan Fiedler <[email protected]>

* update remote.py

Signed-off-by: Jan Fiedler <[email protected]>

* update fast_registration.py

Signed-off-by: Jan Fiedler <[email protected]>

* ruff format

Signed-off-by: Jan Fiedler <[email protected]>

* ruff check fix

Signed-off-by: Jan Fiedler <[email protected]>

* remove show_progress attribute from remote & fast_registration

Signed-off-by: Jan Fiedler <[email protected]>

* make lint

Signed-off-by: Jan Fiedler <[email protected]>

* Revert "make lint"

This reverts commit 5b05419.

Signed-off-by: Eduardo Apolinario <[email protected]>

* run make lint from repo this time

Signed-off-by: Eduardo Apolinario <[email protected]>

* reformat remote.py

Signed-off-by: Eduardo Apolinario <[email protected]>

* reformat fast_registration.py

Signed-off-by: Eduardo Apolinario <[email protected]>

* reuse LOGGING_RICH_FMT_ENV_VAR for is_display_progress_enabled()

Signed-off-by: Jan Fiedler <[email protected]>

* replace l & t variable names with total files & files_processed

Signed-off-by: Jan Fiedler <[email protected]>

---------

Signed-off-by: Ketan Umare <[email protected]>
Signed-off-by: Jan Fiedler <[email protected]>
Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Ketan Umare <[email protected]>
Co-authored-by: Jan Fiedler <[email protected]>
Co-authored-by: Jan Fiedler <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>
Signed-off-by: Umer Ahmad <[email protected]>
5 participants