feat(models): add output_dir option to model_download #178

Draft
wants to merge 2 commits into
base: main

fix(pr): lint + extra equality checks

0997d40

Google Cloud Build / kagglehub-branch-3-12-0 (kaggle-cicd) failed Nov 4, 2024 in 54s

Summary

Build Information

Trigger   kagglehub-branch-3-12-0
Build     4884531b-aacc-4f33-a208-668db9db4b7a
Start     2024-11-04T13:46:09-08:00
Duration  52.335s
Status    FAILURE

Steps

Step                 Status     Duration
check_substitutions  SUCCESS    26.859s
build-hatch-image    SUCCESS    3.63s
tests                CANCELLED  19.082s
lint                 FAILURE    18.386s

Details

starting build "4884531b-aacc-4f33-a208-668db9db4b7a"

FETCHSOURCE
hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: 	git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: 	git branch -m <name>
Initialized empty Git repository in /workspace/.git/
From https://github.com/Kaggle/kagglehub
 * branch            0997d406b84c90d0ebad80ac10d44c99358cfc6f -> FETCH_HEAD
HEAD is now at 0997d40 fix(pr): lint + extra equality checks
BUILD
Starting Step #0 - "check_substitutions"
Step #0 - "check_substitutions": Already have image (with digest): gcr.io/cloud-builders/docker
Step #0 - "check_substitutions": 3.12.0: Pulling from kaggle-cicd/tools/hatch
Step #0 - "check_substitutions": 90e5e7d8b87a: Pulling fs layer
Step #0 - "check_substitutions": 27e1a8ca91d3: Pulling fs layer
Step #0 - "check_substitutions": d3a767d1d12e: Pulling fs layer
Step #0 - "check_substitutions": 711be5dc5044: Pulling fs layer
Step #0 - "check_substitutions": 48b2d58a56e9: Pulling fs layer
Step #0 - "check_substitutions": b61fb8c5b702: Pulling fs layer
Step #0 - "check_substitutions": 67ddeb5b15df: Pulling fs layer
Step #0 - "check_substitutions": 7da1b82bcb72: Pulling fs layer
Step #0 - "check_substitutions": 09eea12e3c05: Pulling fs layer
Step #0 - "check_substitutions": 711be5dc5044: Waiting
Step #0 - "check_substitutions": 48b2d58a56e9: Waiting
Step #0 - "check_substitutions": b61fb8c5b702: Waiting
Step #0 - "check_substitutions": 67ddeb5b15df: Waiting
Step #0 - "check_substitutions": 7da1b82bcb72: Waiting
Step #0 - "check_substitutions": 09eea12e3c05: Waiting
Step #0 - "check_substitutions": 27e1a8ca91d3: Verifying Checksum
Step #0 - "check_substitutions": 27e1a8ca91d3: Download complete
Step #0 - "check_substitutions": 90e5e7d8b87a: Verifying Checksum
Step #0 - "check_substitutions": 90e5e7d8b87a: Download complete
Step #0 - "check_substitutions": d3a767d1d12e: Verifying Checksum
Step #0 - "check_substitutions": d3a767d1d12e: Download complete
Step #0 - "check_substitutions": 48b2d58a56e9: Verifying Checksum
Step #0 - "check_substitutions": 48b2d58a56e9: Download complete
Step #0 - "check_substitutions": 67ddeb5b15df: Verifying Checksum
Step #0 - "check_substitutions": 67ddeb5b15df: Download complete
Step #0 - "check_substitutions": b61fb8c5b702: Verifying Checksum
Step #0 - "check_substitutions": b61fb8c5b702: Download complete
Step #0 - "check_substitutions": 7da1b82bcb72: Verifying Checksum
Step #0 - "check_substitutions": 7da1b82bcb72: Download complete
Step #0 - "check_substitutions": 711be5dc5044: Download complete
Step #0 - "check_substitutions": 09eea12e3c05: Verifying Checksum
Step #0 - "check_substitutions": 09eea12e3c05: Download complete
Step #0 - "check_substitutions": 90e5e7d8b87a: Pull complete
Step #0 - "check_substitutions": 27e1a8ca91d3: Pull complete
Step #0 - "check_substitutions": d3a767d1d12e: Pull complete
Step #0 - "check_substitutions": 711be5dc5044: Pull complete
Step #0 - "check_substitutions": 48b2d58a56e9: Pull complete
Step #0 - "check_substitutions": b61fb8c5b702: Pull complete
Step #0 - "check_substitutions": 67ddeb5b15df: Pull complete
Step #0 - "check_substitutions": 7da1b82bcb72: Pull complete
Step #0 - "check_substitutions": 09eea12e3c05: Pull complete
Step #0 - "check_substitutions": Digest: sha256:61ecf0132b7dcda2abec2390f34fb9f363f4344f62e5dd0eb8433d2139c774dc
Step #0 - "check_substitutions": Status: Downloaded newer image for us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0
Step #0 - "check_substitutions": us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0
Finished Step #0 - "check_substitutions"
Starting Step #1 - "build-hatch-image"
Step #1 - "build-hatch-image": Already have image (with digest): gcr.io/cloud-builders/docker
Step #1 - "build-hatch-image": Sending build context to Docker daemon  6.656kB

Step #1 - "build-hatch-image": Step 1/4 : ARG PYTHON_VERSION
Step #1 - "build-hatch-image": Step 2/4 : FROM python:${PYTHON_VERSION}
Step #1 - "build-hatch-image": 3.12.0: Pulling from library/python
Step #1 - "build-hatch-image": 90e5e7d8b87a: Already exists
Step #1 - "build-hatch-image": 27e1a8ca91d3: Already exists
Step #1 - "build-hatch-image": d3a767d1d12e: Already exists
Step #1 - "build-hatch-image": 711be5dc5044: Already exists
Step #1 - "build-hatch-image": 48b2d58a56e9: Already exists
Step #1 - "build-hatch-image": b61fb8c5b702: Already exists
Step #1 - "build-hatch-image": 67ddeb5b15df: Already exists
Step #1 - "build-hatch-image": 7da1b82bcb72: Already exists
Step #1 - "build-hatch-image": Digest: sha256:1987c4ae3b5afaa3a7c5e247e9eaab7348082ba167986ca90d4d6a197fb364e8
Step #1 - "build-hatch-image": Status: Downloaded newer image for python:3.12.0
Step #1 - "build-hatch-image":  ---> afb69f3af77f
Step #1 - "build-hatch-image": Step 3/4 : RUN python -m pip install hatch twine
Step #1 - "build-hatch-image":  ---> Using cache
Step #1 - "build-hatch-image":  ---> 007e133cc715
Step #1 - "build-hatch-image": Step 4/4 : ENTRYPOINT ["hatch"]
Step #1 - "build-hatch-image":  ---> Using cache
Step #1 - "build-hatch-image":  ---> 1a20dac74835
Step #1 - "build-hatch-image": Successfully built 1a20dac74835
Step #1 - "build-hatch-image": Successfully tagged us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0
Finished Step #1 - "build-hatch-image"
Starting Step #2 - "tests"
Starting Step #3 - "lint"
Step #3 - "lint": Already have image (with digest): us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0
Step #2 - "tests": Already have image (with digest): us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0
Step #3 - "lint": Creating environment: lint
Step #2 - "tests": Creating environment: hatch-test.py3.12
Step #2 - "tests": Installing project in development mode
Step #3 - "lint": Checking dependencies
Step #3 - "lint": Syncing dependencies
Step #2 - "tests": Checking dependencies
Step #2 - "tests": Syncing dependencies
Step #2 - "tests": ============================= test session starts ==============================
Step #2 - "tests": platform linux -- Python 3.12.0, pytest-8.3.3, pluggy-1.5.0
Step #2 - "tests": rootdir: /workspace
Step #2 - "tests": configfile: pyproject.toml
Step #2 - "tests": collected 187 items
Step #2 - "tests": 
Step #2 - "tests": tests/test_auth.py .........                                             [  4%]
Step #2 - "tests": tests/test_cache.py ..................                                   [ 14%]
Step #2 - "tests": tests/test_colab_cache_dataset_download.py .........                     [ 19%]
Step #2 - "tests": tests/test_colab_cache_model_download.py .........                       [ 24%]
Step #2 - "tests": tests/test_config.py ....................                                [ 34%]
Step #2 - "tests": tests/test_dataset_upload.py .....                                       [ 37%]
Step #2 - "tests": tests/test_gcs_upload.py ..                                              [ 38%]
Step #2 - "tests": tests/test_handle.py .........                                           [ 43%]
Step #2 - "tests": tests/test_http_competition_download.py ...........                      [ 49%]
Step #2 - "tests": tests/test_http_dataset_download.py .........                            [ 54%]
Step #3 - "lint": cmd [1] | ruff check .
Step #3 - "lint": All checks passed!
Step #3 - "lint": cmd [2] | black --check --diff .
Step #3 - "lint": --- /workspace/src/kagglehub/models.py	2024-11-04 21:46:11.327569+00:00
Step #3 - "lint": +++ /workspace/src/kagglehub/models.py	2024-11-04 21:46:59.347461+00:00
Step #3 - "lint": @@ -13,15 +13,12 @@
Step #3 - "lint":  # Patterns that are always ignored for model uploading.
Step #3 - "lint":  DEFAULT_IGNORE_PATTERNS = [".git/", "*/.git/", ".cache/", ".huggingface/"]
Step #3 - "lint":  
Step #3 - "lint":  
Step #3 - "lint":  def model_download(
Step #3 - "lint": -    handle: str,
Step #3 - "lint": -    path: Optional[str] = None,
Step #3 - "lint": -    *,
Step #3 - "lint": -    force_download: Optional[bool] = False,
Step #3 - "lint": -    output_dir: Optional[str] = None) -> str:
Step #3 - "lint": +    handle: str, path: Optional[str] = None, *, force_download: Optional[bool] = False, output_dir: Optional[str] = None
Step #3 - "lint": +) -> str:
Step #3 - "lint":      """Download model files.
Step #3 - "lint":  
Step #3 - "lint":      Args:
Step #3 - "lint":          handle: (string) the model handle.
Step #3 - "lint":          path: (string) Optional path to a file within the model bundle.
Step #3 - "lint": @@ -38,22 +35,20 @@
Step #3 - "lint":      if output_dir is None or output_dir == cached_dir:
Step #3 - "lint":          return cached_dir
Step #3 - "lint":  
Step #3 - "lint":      try:
Step #3 - "lint":          # only copying so that we can maintain the cached files
Step #3 - "lint": -        logger.info(
Step #3 - "lint": -            f"Copying model files to requested directory: {output_dir} ...",
Step #3 - "lint": -            extra={**EXTRA_CONSOLE_BLOCK}
Step #3 - "lint": -        )
Step #3 - "lint": +        logger.info(f"Copying model files to requested directory: {output_dir} ...", extra={**EXTRA_CONSOLE_BLOCK})
Step #3 - "lint":          true_output_dir = copytree(cached_dir, output_dir, dirs_exist_ok=True)
Step #3 - "lint":          return true_output_dir
Step #3 - "lint":      except Exception as e:
Step #3 - "lint":          logger.warn(
Step #3 - "lint":              f"Successfully downloaded {handle}, but failed to copy from {cached_dir} "
Step #3 - "lint":              f"to requested output directory {output_dir}. Encountered error: {e}"
Step #3 - "lint":          )
Step #3 - "lint":          return cached_dir
Step #3 - "lint": +
Step #3 - "lint":  
Step #3 - "lint":  def model_upload(
Step #3 - "lint":      handle: str,
Step #3 - "lint":      local_model_dir: str,
Step #3 - "lint":      license_name: Optional[str] = None,
Step #3 - "lint": would reformat /workspace/src/kagglehub/models.py
Step #3 - "lint": --- /workspace/tests/test_http_model_download.py	2024-11-04 21:46:11.331569+00:00
Step #3 - "lint": +++ /workspace/tests/test_http_model_download.py	2024-11-04 21:47:00.140581+00:00
Step #3 - "lint": @@ -151,30 +151,27 @@
Step #3 - "lint":  
Step #3 - "lint":      def test_versioned_model_download_with_output_dir(self) -> None:
Step #3 - "lint":          with create_test_cache() as d:
Step #3 - "lint":              with TemporaryDirectory() as expected_output_dir:
Step #3 - "lint":                  self._download_model_and_assert_downloaded(
Step #3 - "lint": -                    d,
Step #3 - "lint": -                    VERSIONED_MODEL_HANDLE,
Step #3 - "lint": -                    expected_output_dir,
Step #3 - "lint": -                    output_dir=expected_output_dir
Step #3 - "lint": +                    d, VERSIONED_MODEL_HANDLE, expected_output_dir, output_dir=expected_output_dir
Step #3 - "lint":                  )
Step #3 - "lint":  
Step #3 - "lint":      def test_versioned_model_download_with_bad_output_dir(self) -> None:
Step #3 - "lint":          with (
Step #3 - "lint":              create_test_cache() as d,
Step #3 - "lint":              TemporaryDirectory() as placeholder_dir,
Step #3 - "lint": -            mock.patch("kagglehub.models.copytree") as mock_copytree
Step #3 - "lint": +            mock.patch("kagglehub.models.copytree") as mock_copytree,
Step #3 - "lint":          ):
Step #3 - "lint":              mock_copytree.side_effect = Exception("Mock exception")
Step #3 - "lint": -            expected_output_dir = EXPECTED_MODEL_SUBDIR # falls back to default
Step #3 - "lint": +            expected_output_dir = EXPECTED_MODEL_SUBDIR  # falls back to default
Step #3 - "lint":              self._download_model_and_assert_downloaded(
Step #3 - "lint":                  d,
Step #3 - "lint":                  VERSIONED_MODEL_HANDLE,
Step #3 - "lint":                  expected_output_dir,
Step #3 - "lint":                  # note: placeholder name is irrelevant since copytree is mocked to throw
Step #3 - "lint": -                output_dir=placeholder_dir
Step #3 - "lint": +                output_dir=placeholder_dir,
Step #3 - "lint":              )
Step #3 - "lint":  
Step #3 - "lint":      def test_unversioned_model_download_with_path_with_force_download(self) -> None:
Step #3 - "lint":          with create_test_cache() as d:
Step #3 - "lint":              self._download_test_file_and_assert_downloaded(d, UNVERSIONED_MODEL_HANDLE, force_download=True)
Step #3 - "lint": @@ -213,10 +210,11 @@
Step #3 - "lint":  
Step #3 - "lint":              # Not force downloaded, cache hit.
Step #3 - "lint":              model_path = kagglehub.model_download(VERSIONED_MODEL_HANDLE, path=TEST_FILEPATH, force_download=False)
Step #3 - "lint":  
Step #3 - "lint":              self.assertEqual(os.path.join(d, EXPECTED_MODEL_SUBPATH), model_path)
Step #3 - "lint": +
Step #3 - "lint":  
Step #3 - "lint":  class TestHttpNoInternet(BaseTestCase):
Step #3 - "lint":      def test_versioned_model_download_already_cached_with_force_download(self) -> None:
Step #3 - "lint":          with create_test_cache():
Step #3 - "lint":              server = serv.start_server(stub.app)
Step #3 - "lint": would reformat /workspace/tests/test_http_model_download.py
Step #3 - "lint": 
Step #3 - "lint": Oh no! 💥 💔 💥
Step #3 - "lint": 2 files would be reformatted, 62 files would be left unchanged.
Step #2 - "tests": tests/test_http_model_download.py .......................                [ 66%]
Step #2 - "tests": tests/test_integrity.py ....                                             [ 68%]
Finished Step #3 - "lint"
ERROR
ERROR: build step 3 "us-docker.pkg.dev/kaggle-cicd/tools/hatch:3.12.0" failed: step exited with non-zero status: 1
Step #2 - "tests": tests/test_kaggle_api_client.py .........

Build Log: https://console.cloud.google.com/cloud-build/builds/4884531b-aacc-4f33-a208-668db9db4b7a?project=464139560241
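
The lint step fails only on formatting: `ruff check .` passes, while `black --check --diff .` reports two files it would reformat. Running `black .` locally (without `--check --diff`) should apply the same changes. The sketch below shows the copy-with-fallback logic from the diff laid out the way black's output suggests. It is an illustration, not the kagglehub implementation: `copy_or_fallback` is a hypothetical name, the log messages are shortened, and the long-line layout assumes the project configures a line length above black's default of 88, which is inferred from the diff rather than confirmed.

```python
import logging
from shutil import copytree
from typing import Optional

logger = logging.getLogger(__name__)


def copy_or_fallback(cached_dir: str, output_dir: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the copy step shown in the diff."""
    # Nothing to copy when no separate output directory was requested.
    if output_dir is None or output_dir == cached_dir:
        return cached_dir

    try:
        # Copy rather than move so the cached files stay in place.
        logger.info(f"Copying model files to requested directory: {output_dir} ...")
        return copytree(cached_dir, output_dir, dirs_exist_ok=True)
    except Exception as e:
        # Fall back to the cache location if the copy fails for any reason.
        logger.warning(f"Failed to copy from {cached_dir} to {output_dir}. Encountered error: {e}")
        return cached_dir
```

Note the two blank lines black expects after a top-level function before the next definition; the missing blank line after `model_download` is one of the changes shown in the diff.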