various fixes for diff issues #393

Merged: 32 commits, Nov 13, 2020
Conversation

jperez999
Contributor

This refactors how the workflow ingests operators: the API is simplified, and multiple operators of the same kind are now allowed. Operator ordering takes priority, which lets chaining follow a more user-friendly convention; reduction to phases is still performed before the application phase. It also fixes the TensorFlow 2 API GPU memory-usage utility and adds a band-aid for the torch tensor convergence issue.
#383 #377 #372
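
For context, here is a minimal sketch of the ingestion pattern this refactor targets, reusing the Workflow calls that appear in the test logs below. The ordering semantics in the comments are assumptions drawn from the description above, not from the diff itself.

import nvtabular as nvt
from nvtabular import ops

# Hypothetical illustration of the simplified API: operators are applied in
# the order they are added, and multiple operators of the same kind are
# allowed within a phase.
proc = nvt.Workflow(
    cat_names=["name-string"], cont_names=["x", "y", "id"], label_name=["label"]
)
proc.add_feature([ops.FillMedian()])   # feature ops run first
proc.add_preprocess(ops.Normalize())   # preprocess ops follow, in add order
proc.add_preprocess(ops.Categorify())
proc.finalize()                        # phases are reduced before application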

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit b35d2e726de4cedf291c52a4b6e32dde07eebece, no merge conflicts.
Running as SYSTEM
Setting status of b35d2e726de4cedf291c52a4b6e32dde07eebece to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1084/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse b35d2e726de4cedf291c52a4b6e32dde07eebece^{commit} # timeout=10
Checking out Revision b35d2e726de4cedf291c52a4b6e32dde07eebece (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b35d2e726de4cedf291c52a4b6e32dde07eebece # timeout=10
Commit message: "various fixes for diff issues"
 > git rev-list --no-walk 96f7c9dd110e34d7c9843b314eb8cca1b1103e52 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2675308116383736041.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
4 files would be reformatted, 69 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1979842867078301388.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit e09086045628965a8e22447e8d72d7bbcc6127df, has merge conflicts.
Running as SYSTEM
Setting status of e09086045628965a8e22447e8d72d7bbcc6127df to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1136/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse e09086045628965a8e22447e8d72d7bbcc6127df^{commit} # timeout=10
Checking out Revision e09086045628965a8e22447e8d72d7bbcc6127df (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e09086045628965a8e22447e8d72d7bbcc6127df # timeout=10
Commit message: "forward progress!"
 > git rev-list --no-walk dcd3428ad39e8aec93bec9663889307853e3a4d4 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3898417691150242608.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
6 files would be reformatted, 67 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins5643162843928145624.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec, has merge conflicts.
Running as SYSTEM
Setting status of 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1143/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec^{commit} # timeout=10
Checking out Revision 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec # timeout=10
Commit message: "generating correct phases"
 > git rev-list --no-walk d9c6fd2c2cd88700b5847a656b608e5278471ac3 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1766930714392648553.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
6 files would be reformatted, 67 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7742480476069775538.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 5fa294c4cb0fd658eb90db28705f56e087f939ac, has merge conflicts.
Running as SYSTEM
Setting status of 5fa294c4cb0fd658eb90db28705f56e087f939ac to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1148/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 5fa294c4cb0fd658eb90db28705f56e087f939ac^{commit} # timeout=10
Checking out Revision 5fa294c4cb0fd658eb90db28705f56e087f939ac (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5fa294c4cb0fd658eb90db28705f56e087f939ac # timeout=10
Commit message: "working all the way except some dataloader fails..."
 > git rev-list --no-walk c9f1d3034198ce753cc0d1daea38b61894929075 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins5352699732092640827.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
8 files would be reformatted, 65 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins5224801063729872014.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit f408d99d3a78b2541c191865cb4b23519eef4e1c, no merge conflicts.
Running as SYSTEM
Setting status of f408d99d3a78b2541c191865cb4b23519eef4e1c to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1149/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse f408d99d3a78b2541c191865cb4b23519eef4e1c^{commit} # timeout=10
Checking out Revision f408d99d3a78b2541c191865cb4b23519eef4e1c (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f408d99d3a78b2541c191865cb4b23519eef4e1c # timeout=10
Commit message: "merging in"
 > git rev-list --no-walk 5fa294c4cb0fd658eb90db28705f56e087f939ac # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7330622643038130716.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
error: cannot format /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py: Cannot parse: 826:0: <<<<<<< HEAD
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
7 files would be reformatted, 68 files would be left unchanged, 1 file would fail to reformat.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2721762884375901867.sh
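
The parse failure in this run ("Cannot parse: 826:0: <<<<<<< HEAD") means workflow.py still contained unresolved git merge-conflict markers, which are not valid Python, so black gave up on the file. A small hypothetical helper (not part of this repo) that catches such markers before a formatting run:

import re
from pathlib import Path

# Unresolved git conflict markers, like the one black hit at line 826.
CONFLICT_RE = re.compile(r"^(<<<<<<< |=======$|>>>>>>> )", re.MULTILINE)

def has_conflict_markers(path: str) -> bool:
    """Return True if the file still contains merge-conflict markers."""
    return bool(CONFLICT_RE.search(Path(path).read_text()))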

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 318679aa15695bc3ca8ff421db0f04bdaa33e217, no merge conflicts.
Running as SYSTEM
Setting status of 318679aa15695bc3ca8ff421db0f04bdaa33e217 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1150/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 318679aa15695bc3ca8ff421db0f04bdaa33e217^{commit} # timeout=10
Checking out Revision 318679aa15695bc3ca8ff421db0f04bdaa33e217 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 318679aa15695bc3ca8ff421db0f04bdaa33e217 # timeout=10
Commit message: "code reformat"
 > git rev-list --no-walk f408d99d3a78b2541c191865cb4b23519eef4e1c # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2750347368199920233.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py ......FF..FF..FFFFFFFFFFFFFF.. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3597ab090>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa358d11f80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
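
Every failure in this run reduces to the dtype-safety check shown above: FillMedian computes a float median (-0.00777…) and tries to fill an int64 continuous column with it, and cuDF refuses the lossy cast. Below is a minimal NumPy-only sketch of that check and of the obvious workaround, casting the stat to the column dtype first; whether the PR's band-aid takes this exact form is an assumption.

import numpy as np

col_dtype = np.dtype("int64")
fill_value = -0.007777519058436155  # the float median from the traceback

# cuDF's check, paraphrased: the fill value must survive a round trip
# through the column dtype, otherwise fillna raises TypeError.
casted = col_dtype.type(fill_value)
print(casted != fill_value)  # True: -0.0077... truncates to 0, so cuDF raises

# Workaround sketch: coerce the stat to the column dtype before fillna,
# e.g. stat_val = gdf[col].dtype.type(stat_val) for integer columns.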
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa33c121e90>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b192ddd0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3584a76d0>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b5213290>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b51bfc90>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b18f4e60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b5304690>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b5328050>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b26f2ed0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b1b4e170>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa32071b8d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b180ddd0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa33c1cf2d0>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa38c18ce60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa320741e10>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b18f4050>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3200dfed0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b52bfd40>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa320364690>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa36071b680>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3204e6310>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3600ebf80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc649ad0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc44de60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co1')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280693c90>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc3f3710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa38c1a89d0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d8084c20>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co1')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc1c7810>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc3c9710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
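The `test_gpu_dl` failures below reach the same `fillna` through `processor.apply` and Dask's synchronous scheduler rather than the data loader. One way to sidestep the cast, sketched as a hypothetical `fill_with_median` helper (not the patch this PR actually lands), is to truncate the recorded median to the column dtype before filling:

```python
import numpy as np

def fill_with_median(gdf, col, stat_val):
    # Hypothetical helper, not nvtabular API: coerce a float median to the
    # column's dtype so cudf's safe-cast check passes for integer columns.
    dtype = gdf[col].dtype
    if np.issubdtype(dtype, np.integer):
        stat_val = dtype.type(stat_val)  # e.g. int64(-0.0077...) -> 0
    return gdf[col].fillna(stat_val)
```

The trade-off is precision: an integer column filled this way gets a truncated median, which is usually tolerable for an id-like count column.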
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2806fd950>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d86d63b0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa318202510>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc55e8c0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c666150>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa31831e4d0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa31850eb50>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa35924fd40>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa318278d50>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d83729e0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
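One detail of the cudf guard quoted in these frames: the cast-equality check is gated on `not np.isnan(fill_value)` because NaN never compares equal to anything, itself included, so a NaN fill on a float column would otherwise always hit the non-equivalent branch. A quick illustration:

```python
import numpy as np

fill = float("nan")
print(fill != fill)    # True: NaN is never equal to itself
print(np.isnan(fill))  # True: hence the explicit isnan exemption
```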
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc622b10>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc5cf200>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3584a95d0>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc5cff80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa28026ded0>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc448710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc56de50>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c517440>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c7f4e50>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc511830>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280797910>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc432c20>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c579ed0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c733ef0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________________________ test_kill_dl[parquet-1e-06] __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_1e_06_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280316fd0>
part_mem_fraction = 1e-06, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:184:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa280226710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
__________________________ test_kill_dl[parquet-0.1] ___________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_0_1_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc721190>
part_mem_fraction = 0.1, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:184:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c67def0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
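
The guard in the cudf excerpt above is a simple round-trip check: cast the scalar to the column dtype and compare it with the original. A small worked example of that check using numpy alone (dtype and value taken from the traceback):

    import numpy as np

    dtype = np.dtype("int64")
    fill_value = -0.007777519058436155

    casted = dtype.type(fill_value)     # truncates to 0
    print(casted != fill_value)         # True  -> cudf raises TypeError

    # An integral float survives the round trip and would be accepted:
    print(dtype.type(-1.0) != -1.0)     # False -> safe to fill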
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41425 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42525 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34371 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa280693390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa3207359d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa3207359d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa2bc1c75d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa2802bdd10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa2802bdd10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41301 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 3 40 3 96% 54->55, 55-59, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 9 112 7 96% 71->72, 72, 77-78, 123->124, 124, 212->214, 214, 220->222, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 87 336 26 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 514 1568 173 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.35%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06] - Typ...
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-0.1] - TypeE...
===== 30 failed, 557 passed, 8 skipped, 266 warnings in 381.36s (0:06:21) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
(The "--- Logging error ---" block above repeats verbatim five more times as the remaining Dask workers restart.)
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8671505101989059999.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 986c38d8f5e1f8f981f636438683dd9b1d1599eb, no merge conflicts.
Running as SYSTEM
Setting status of 986c38d8f5e1f8f981f636438683dd9b1d1599eb to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1155/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 986c38d8f5e1f8f981f636438683dd9b1d1599eb^{commit} # timeout=10
Checking out Revision 986c38d8f5e1f8f981f636438683dd9b1d1599eb (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 986c38d8f5e1f8f981f636438683dd9b1d1599eb # timeout=10
Commit message: "fixes for test fails in dataloader"
 > git rev-list --no-walk 8184506ba87fbfef7fbee544145236a4360adbae # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2450794695755062914.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_torch_dataloader.py
Oh no! 💥 💔 💥
1 file would be reformatted, 75 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins643749491149907924.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 0e89af3472997c90fb06aa9883effa990e0eae16, no merge conflicts.
Running as SYSTEM
Setting status of 0e89af3472997c90fb06aa9883effa990e0eae16 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1156/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 0e89af3472997c90fb06aa9883effa990e0eae16^{commit} # timeout=10
Checking out Revision 0e89af3472997c90fb06aa9883effa990e0eae16 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0e89af3472997c90fb06aa9883effa990e0eae16 # timeout=10
Commit message: "reformat codes"
 > git rev-list --no-walk 986c38d8f5e1f8f981f636438683dd9b1d1599eb # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6914325229575234121.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py ..FF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py ..............FFFFFFFFFFFF.... [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
>       _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f0e30501050>
stdout = b'Train for 1 steps\n\r1/1 [==============================] - 6s 6s/step - loss: 0.5433 - rmspe_tf: 0.4862\n'
stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
>           raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
2020-11-09 05:31:22.352842: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 05:31:22.352970: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 05:31:22.352985: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-11-09 05:31:23.166543: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-09 05:31:23.167476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.168508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.169530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.170528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.170799: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.170849: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 05:31:23.170883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 05:31:23.170914: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 05:31:23.170945: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 05:31:23.170975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 05:31:23.171006: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 05:31:23.178313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 05:31:23.371600: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-09 05:31:23.395251: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-09 05:31:23.396319: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557dafa6b0e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-09 05:31:23.396380: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-09 05:31:23.745961: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557daefc8010 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-09 05:31:23.746031: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746042: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746050: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746057: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.747951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.749138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.750154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.751192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.751302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.751340: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 05:31:23.751360: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 05:31:23.751379: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 05:31:23.751396: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 05:31:23.751413: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 05:31:23.751431: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 05:31:23.758752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 05:31:23.758817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.763972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 05:31:23.763999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-09 05:31:23.764009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-09 05:31:23.764017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-09 05:31:23.764024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-09 05:31:23.764031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-09 05:31:23.768748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8139 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-09 05:31:23.770236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 14864 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-09 05:31:23.771660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 14864 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-09 05:31:23.773098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 14864 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
2020-11-09 05:31:23.780216: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 7.95G (8534360064 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
2020-11-09 05:31:30.548029: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py", line 210, in
).to('cuda')
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
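
The Rossmann failure is a device-memory collision rather than a modeling bug: TensorFlow pre-allocates nearly the whole GPU (note the failed 7.95G allocation above), so the notebook's later PyTorch .to('cuda') finds nothing left. A minimal sketch of the kind of guard the TF memory-usage fix in this PR targets, assuming TensorFlow 2.x and that it runs before any GPU work:

import tensorflow as tf

# Illustrative only -- not necessarily the exact change in this PR.
# Configure each GPU before TensorFlow touches it, so other frameworks in
# the same process (here PyTorch) can still allocate device memory.
for gpu in tf.config.experimental.list_physical_devices("GPU"):
    # Grow memory on demand instead of grabbing the whole device up front.
    tf.config.experimental.set_memory_growth(gpu, True)
    # Alternatively, pin TensorFlow to a fixed slice (memory_limit is in MB):
    # tf.config.experimental.set_virtual_device_configuration(
    #     gpu,
    #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=4096)],
    # )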
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
>       _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f0ded3f9310>
stdout = b'', stderr = None, retcode = 1

    (subprocess.run source omitted; identical to the run() listing in the test_rossman_example traceback above)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
distributed.worker - WARNING - Run Failed
Function: _rmm_pool
args: ()
kwargs: {}
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/worker.py", line 3546, in run
result = function(*args, **kwargs)
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 84, in
client.run(_rmm_pool)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2512, in run
return self.sync(self._run, function, *args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 833, in sync
self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
result[0] = yield future
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2449, in _run
raise exc.with_traceback(tb)
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
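
test_multigpu_dask_example hits the same wall from the RMM side: _rmm_pool asks each worker for RMM's default pool size on a GPU that is already mostly allocated, and constructing the pool itself raises std::bad_alloc. A hedged sketch of a more forgiving per-worker setup (the explicit size is illustrative, not taken from the notebook):

import rmm

def _rmm_pool_capped():
    # Same shape as the notebook's _rmm_pool, but with an explicit, modest
    # pool rather than the default, which claims most of the free memory.
    rmm.reinitialize(
        pool_allocator=True,
        initial_pool_size=2 << 30,  # 2 GiB; must fit in currently free memory
    )

# run on every worker, as the notebook does: client.run(_rmm_pool_capped)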
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da7f868d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0dc8536c50>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
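
All twelve test_tf_gpu_dl failures trace back to this one spot: the dask-computed median for column y comes back null, so float() receives None. A defensive rewrite of the loop in Median.finalize, shown as a standalone sketch (the skip-on-null policy is illustrative, not necessarily the fix this PR landed):

import math

def safe_medians(dask_stats):
    # Build {column: median}, skipping columns whose median came back null
    # instead of crashing on float(None). values_host is the host-side view
    # of the cudf index, exactly as in the traceback above.
    medians = {}
    for col in dask_stats.index.values_host:
        val = dask_stats[col]
        if val is None or (isinstance(val, float) and math.isnan(val)):
            continue  # column "y" above would be skipped (or reported) here
        medians[col] = float(val)
    return medians

Whether to skip, substitute a default, or raise a clearer error for a null median is a policy choice; the point is that finalize should not assume every quantile is populated.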
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da4182c50>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0da4132410>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da7f43050>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0da7ab67d0>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0dc86b8050>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0dc84e9810>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0ded4145d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0ded400a10>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0e206a4890>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0d8c5a3d10>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
______ test_tf_gpu_dl: six further parametrizations fail with the same traceback ______

test_tf_gpu_dl[False-1-parquet-0.01], test_tf_gpu_dl[False-1-parquet-0.06],
test_tf_gpu_dl[False-10-parquet-0.01], test_tf_gpu_dl[False-10-parquet-0.06],
test_tf_gpu_dl[False-100-parquet-0.01] and test_tf_gpu_dl[False-100-parquet-0.06]
all fail in Median.finalize with the identical traceback, ending in:

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30763050>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0dc86ae5f0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
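test_gpu_dl fails differently: FillMedian computes a float median (-0.000922759878449142) and then tries to fillna an int64 column with it, which cudf rejects because the cast would be lossy. A dtype-aware fill is sketched below; fill_median_safely and its float64-promotion policy are illustrative assumptions, not the band-aid shipped in this PR:

import numpy as np

def fill_median_safely(gdf, col, median):
    # gdf: a cudf.DataFrame (pandas works too; the API used here is the same).
    # cudf raises "Cannot safely cast non-equivalent float to int64" when an
    # integer column is filled with a non-integral float, so reconcile the
    # fill value with the column dtype, promoting to float64 when needed.
    dtype = gdf[col].dtype
    if np.issubdtype(dtype, np.integer) and not float(median).is_integer():
        gdf[col] = gdf[col].astype("float64")
        dtype = gdf[col].dtype
    gdf[col] = gdf[col].fillna(dtype.type(median))
    return gdf

Rounding the median for integer columns would be an alternative policy; either way the fill value's dtype has to be reconcilable with the column's before fillna is called.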
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d301d0410>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0dc866f050>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d301d0710>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d0c7b9a70>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d3068f290>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d84230b90>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec16c290>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c749c20>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d8c05e250>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d0c4bf9e0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec5d7ad0>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d842c39e0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d3073c850>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c5be050>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30597410>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d30621440>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
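For reference, the cast-safety check quoted above fires whenever the fill value does not round-trip through the column's dtype. A minimal sketch of the failure mode (assuming only a cudf install; the literal fill value is copied from the traceback, not from the test data):

import cudf

ints = cudf.Series([1, 2, None, 4], dtype="int64")

ints.fillna(2.0)  # fine: int64(2.0) compares equal to 2.0, so the cast is safe
try:
    # a non-integral median, as FillMedian produces for this int64 column
    ints.fillna(-0.000922759878449142)
except TypeError as err:
    print(err)  # Cannot safely cast non-equivalent float to int64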
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0dec78c6d0>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d84341a70>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30684750>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c6e7830>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec3a2ed0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c7a2680>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34315 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36571 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42533 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d302bb890>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30453b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30453b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30111dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d4c10dad0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d4c10dad0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112320 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41903 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 4 90% 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 16 48 9 82% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 13 112 8 95% 71->72, 72, 77-78, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 87 336 26 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 533 1568 180 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 81.70%
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
===== 26 failed, 561 passed, 8 skipped, 267 warnings in 384.30s (0:06:24) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
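This shutdown-time failure is benign: a distributed Nanny thread logs "Restarting worker" after pytest has already closed the captured stream, so logging's internal error handler prints the traceback above instead of the message. A minimal standard-library sketch of the same mechanism:

import io
import logging

stream = io.StringIO()
logging.basicConfig(stream=stream)
stream.close()
# emitting after the handler's stream is closed produces
# "--- Logging error ---" with ValueError: I/O operation on closed file
logging.getLogger("distributed.nanny").warning("Restarting worker")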
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4596099623436440276.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1158/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 5f34242ae04ff2e3b918f9c111975bacd12304ea # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1047770170142313710.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py FFFF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_criteo_notebook _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0')

def test_criteo_notebook(tmpdir):
    # create a toy dataset in tmpdir, and point environment variables so the notebook
    # will read from it
    for i in range(24):
        df = _get_random_criteo_data(1000)
        df.to_parquet(os.path.join(tmpdir, f"day_{i}.parquet"))
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    _run_notebook(
        tmpdir,
        os.path.join(dirname(TEST_PATH), "examples", "criteo-example.ipynb"),
        # disable rmm.reinitialize, seems to be causing issues
>       transform=lambda line: line.replace("rmm.reinitialize(", "# rmm.reinitialize("),
    )

tests/unit/test_notebooks.py:45:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d49b94850>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py", line 24, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
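The root cause is the child interpreter rather than the notebook itself: _run_notebook executes the converted script with sys.executable, so nvtabular must be importable in that interpreter, and the editable install above evidently was not visible to it. A small sketch of how check_output surfaces this, per the subprocess.run source quoted below:

import subprocess
import sys

try:
    subprocess.check_output([sys.executable, "-c", "import nvtabular"])
except subprocess.CalledProcessError as err:
    # check_output runs with check=True, so a non-zero exit in the child
    # (here, the ModuleNotFoundError) is re-raised in the parent
    print(err.returncode)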
_____________________________ test_optimize_criteo _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0')

def test_optimize_criteo(tmpdir):
    _get_random_criteo_data(1000).to_csv(os.path.join(tmpdir, "day_0"), sep="\t", header=False)
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(dirname(TEST_PATH), "examples", "optimize_criteo.ipynb")
>   _run_notebook(tmpdir, notebook_path)

tests/unit/test_notebooks.py:55:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25e8d690>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py", line 60, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
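
All four notebook failures share a root cause visible above: the test converts the
notebook to a script, runs it with subprocess.check_output([sys.executable, script_path]),
and check_output raises CalledProcessError for any non-zero exit (it delegates to
run(..., check=True), the function listed in the traceback). Here the child process died
on "import nvtabular" because the freshly built package was not importable from the
subprocess. A minimal sketch of the pattern; the PYTHONPATH workaround is an
illustrative assumption, not what the test suite actually does:

import os
import subprocess
import sys

script_path = "/tmp/notebook.py"  # hypothetical converted-notebook script

# check_output == run(..., stdout=PIPE, check=True): a non-zero exit status
# surfaces as subprocess.CalledProcessError, exactly as in the traceback above.
env = dict(os.environ)
# Hypothetical workaround: point the child at the source checkout so that
# "import nvtabular" resolves even if the package install did not complete.
env["PYTHONPATH"] = "/var/jenkins_home/workspace/nvtabular_tests/nvtabular"
subprocess.check_output([sys.executable, script_path], env=env)
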
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
  _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25f949d0>
stdout = b'', stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py", line 17, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
      _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25f3ce50>
stdout = b'', stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py", line 31, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
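
The _nb_modify callback above is the harness' convention for shrinking notebooks: the
notebook is flattened to a script and every source line is passed through the callable
before execution. A minimal sketch of that transform step; the helper name and the
nbconvert invocation are assumptions for illustration, not the repository's exact code:

import subprocess

def apply_line_transform(script_path, transform):
    # Rewrite the converted notebook in place, one line at a time, so a test
    # can swap in cluster addresses, epoch counts, and smaller dataset sizes.
    with open(script_path) as f:
        lines = f.readlines()
    with open(script_path, "w") as f:
        f.writelines(transform(line) for line in lines)

# Hypothetical conversion step: nbconvert flattens the .ipynb to notebook.py,
# after which apply_line_transform(script_path, _nb_modify) rewrites it.
subprocess.check_output(
    ["jupyter", "nbconvert", "--to", "script", "--output", "notebook", "multi-gpu_dask.ipynb"]
)
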
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44903 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44479 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34601 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c502baf90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8c5036cf50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8c5036cf50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c5033dcd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8d4f44b450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8d4f44b450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59784 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42891 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 7 85% 55->57, 58->60, 63->65, 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 17 48 9 80% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 333->334, 334-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 13 112 11 94% 71->72, 72, 90->91, 91, 95->87, 115->116, 116, 123->124, 124, 131-132, 220->222, 240->241, 241, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 76->77, 77, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 26 20 5 45% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73, 85-90, 100-113
nvtabular/loader/torch.py 37 11 6 0 60% 25-27, 30-36, 118
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 42 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 313->317, 318->320, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 97 336 26 79% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 601->600, 609->612, 612, 633-648, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 562 1568 188 81%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.59%
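
The 70% gate above is a pytest-cov fail-under threshold. A hypothetical local
reproduction of this report shape; the real flags live in the repo's setup.cfg, so
treat these as illustrative:

import pytest

# Coverage is collected for the nvtabular package, written to coverage.xml,
# and the run fails if total coverage drops below 70%, matching the log above.
pytest.main([
    "tests/unit",
    "--cov=nvtabular",
    "--cov-report=xml",
    "--cov-fail-under=70",
])
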
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_criteo_notebook - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_optimize_criteo - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
====== 4 failed, 583 passed, 8 skipped, 273 warnings in 480.94s (0:08:00) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[The identical "Restarting worker" logging error repeats six more times.]
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5948351989957738444.sh

@jperez999
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1160/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 599afb8d8a160cd55336d830d9ded3f21a4046c0 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3609457319750059927.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py ..FF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
  _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f80acfab050>
stdout = b'Train for 1 steps\n\r1/1 [==============================] - 6s 6s/step - loss: 0.5078 - rmspe_tf: 0.4699\n'
stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
2020-11-09 06:15:02.167514: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 06:15:02.167643: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 06:15:02.167657: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-11-09 06:15:02.968579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-09 06:15:02.969581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.970683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.971781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.972879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.973160: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:02.973214: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 06:15:02.973252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 06:15:02.973286: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 06:15:02.973320: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 06:15:02.973353: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 06:15:02.973388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:15:02.981211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 06:15:03.175182: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-09 06:15:03.199318: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-09 06:15:03.200279: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56333e839550 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-09 06:15:03.200304: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-09 06:15:03.540481: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56333e14e8f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-09 06:15:03.540552: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540565: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540572: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540580: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.542431: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.543516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.544577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.545650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.545742: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:03.545768: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 06:15:03.545789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 06:15:03.545809: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 06:15:03.545827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 06:15:03.545845: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 06:15:03.545864: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:15:03.552786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 06:15:03.552847: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:03.557607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:15:03.557631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-09 06:15:03.557641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-09 06:15:03.557648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-09 06:15:03.557656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-09 06:15:03.557662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-09 06:15:03.561931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8139 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-09 06:15:03.563277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 14864 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-09 06:15:03.564654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 14864 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-09 06:15:03.566006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 14864 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
2020-11-09 06:15:03.573095: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 7.95G (8534360064 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
2020-11-09 06:15:10.391896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py", line 210, in
).to('cuda')
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
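This is a device-memory collision: TensorFlow's allocator has already claimed most of GPU 0 (note the 8139 MB device created above, followed by the failed 7.95G allocation), so the later torch .to('cuda') has nothing left to allocate. A minimal sketch of the usual mitigation, assuming TF 2.x's tf.config.experimental API (not necessarily what this PR's memory util does):

    import tensorflow as tf

    # Hypothetical mitigation: make TensorFlow grow allocations on demand
    # instead of grabbing the whole card, so a later torch .to('cuda')
    # still finds free device memory.
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)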
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
      _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f80ac08e250>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
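For context, _run_notebook fails here because subprocess.check_output runs the notebook with check=True, so any non-zero exit status is re-raised as CalledProcessError. A minimal reproduction of that behavior:

    import subprocess
    import sys

    try:
        # check=True converts a non-zero exit status into an exception,
        # which is exactly how the notebook failure surfaces in pytest.
        subprocess.run([sys.executable, "-c", "raise SystemExit(1)"], check=True)
    except subprocess.CalledProcessError as exc:
        print(exc.returncode)  # 1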
----------------------------- Captured stderr call -----------------------------
distributed.worker - WARNING - Run Failed
Function: _rmm_pool
args: ()
kwargs: {}
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/worker.py", line 3546, in run
result = function(*args, **kwargs)
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 84, in
client.run(_rmm_pool)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2512, in run
return self.sync(self._run, function, *args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 833, in sync
self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
result[0] = yield future
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2449, in _run
raise exc.with_traceback(tb)
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
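The dask workers die while building the RMM pool because initial_pool_size=None asks for the default slice of device memory on GPUs that are already partly occupied. A sketch of requesting an explicitly smaller pool instead, assuming the same rmm.reinitialize entry point the notebook uses:

    import rmm

    # Hypothetical smaller pool: reserve 1 GiB up front instead of the
    # default (half of total device memory), leaving headroom on a busy GPU.
    rmm.reinitialize(pool_allocator=True, initial_pool_size=2 ** 30)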
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46477 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44077 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41669 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80045cdc10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044fd690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044fd690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80044fe390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044b2810>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044b2810>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58084 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54496 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37385 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 4 90% 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 16 48 9 82% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 530 1568 179 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 81.77%
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
====== 2 failed, 585 passed, 8 skipped, 273 warnings in 534.17s (0:08:54) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[... the same "--- Logging error ---" block and "Restarting worker" traceback repeat six more times as the remaining dask nanny workers shut down ...]
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4997896031306181373.sh

@jperez999
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1161/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1502471784733242026.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeae16150>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbaeadc6dd0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
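The repeated failure is Median.finalize calling float() on a missing statistic (the x entry of dask_stats comes back null). A defensive variant of the method quoted above, as a sketch rather than the actual fix merged here:

    import math

    def finalize(self, dask_stats):
        # Sketch: skip float() on a missing median and record NaN
        # so downstream code can detect the gap instead of crashing.
        for col in dask_stats.index.values_host:
            val = dask_stats[col]
            self.medians[col] = float(val) if val is not None else math.nan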
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbb0425c090>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb0408b0d0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c4190>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae8149390>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c9990>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb04036cd0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb781f90>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbaea4ca650>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbad47f35d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae80c9650>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbb65b71310>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb65d29610>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c9f50>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae815b710>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb7fbd50>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb4b589550>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae8194250>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae8194190>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbad47d94d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbad47d9f50>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb3be390>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb04217410>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35279 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42665 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44063 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba900a1210>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c66f550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c66f550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c6757d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c407550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c407550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52856 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54688 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41017 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 8 95% 71->72, 72, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 515 1568 174 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.31%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
===== 12 failed, 575 passed, 8 skipped, 273 warnings in 493.04s (0:08:13) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
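
This "--- Logging error ---" block is teardown noise rather than a test failure: Dask's nanny logs "Restarting worker" after the captured stderr stream has already been closed, and the logging module reports (rather than raises) the resulting ValueError. A minimal, self-contained repro:

    # Repro: emitting a record to a handler whose stream is already closed
    # makes logging print a "--- Logging error ---" diagnostic like the above.
    import io
    import logging

    stream = io.StringIO()
    logger = logging.getLogger("nanny-demo")
    logger.addHandler(logging.StreamHandler(stream))

    stream.close()
    logger.warning("Restarting worker")  # ValueError: I/O operation on closed file.
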
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins7537479321991169570.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 292c97338e01fb6240f9fadb41e159f94156598f, no merge conflicts.
Running as SYSTEM
Setting status of 292c97338e01fb6240f9fadb41e159f94156598f to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1163/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 292c97338e01fb6240f9fadb41e159f94156598f^{commit} # timeout=10
Checking out Revision 292c97338e01fb6240f9fadb41e159f94156598f (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 292c97338e01fb6240f9fadb41e159f94156598f # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk 8f43564392bd047e5ffd38d3acf8e26437a8ae6a # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7026001201112661322.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46261 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46667 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44797 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff11416a350>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff1140f8c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff1140f8c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff1140cebd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff114103dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff114103dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52856 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56872 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55104 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54428 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43885 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 513 1568 172 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.39%
=========== 587 passed, 8 skipped, 273 warnings in 506.65s (0:08:26) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
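
The traceback above (and its repeats below) has a single cause: dask's nanny thread logs "Restarting worker" after pytest has already closed the captured stream, and the logging module's default error handling prints this report rather than raising. A minimal reproduction sketch, independent of dask:

```python
import io
import logging

# a handler whose stream has been closed raises ValueError inside emit();
# logging then prints the "--- Logging error ---" report to stderr instead
# of propagating the exception (logging.raiseExceptions is True by default)
logger = logging.getLogger("repro")
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
stream.close()
logger.warning("Restarting worker")
```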
[the identical "--- Logging error ---" traceback repeats five more times]
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8460091688957331325.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit a4378fb6811698e943fcfaebdb33d27397052889, no merge conflicts.
Running as SYSTEM
Setting status of a4378fb6811698e943fcfaebdb33d27397052889 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1165/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse a4378fb6811698e943fcfaebdb33d27397052889^{commit} # timeout=10
Checking out Revision a4378fb6811698e943fcfaebdb33d27397052889 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a4378fb6811698e943fcfaebdb33d27397052889 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk fc94cf4e09359f2b357978e59540799efc46d777 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins6137235349559612032.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)
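
For reference, the replacement import the deprecation points to:

```python
from cupyx.scipy import sparse  # replaces the deprecated cupy.sparse
```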

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)
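
The fix the warning asks for is an explicit dtype wherever an empty Series can be built; a one-line sketch:

```python
import pandas as pd

mask = pd.Series([], dtype="bool")  # explicit dtype silences the warning
```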

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44861 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44979 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46417 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f094c113bd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c2cb690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c2cb690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f092c729d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c110b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c110b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56680 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112320 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43991 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 513 1568 172 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.39%
=========== 587 passed, 8 skipped, 273 warnings in 509.33s (0:08:29) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[the identical "--- Logging error ---" traceback repeats five more times]
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1397314997299959976.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 605266b87c49f852ee0f63ad642b42a6975c4c90, no merge conflicts.
Running as SYSTEM
Setting status of 605266b87c49f852ee0f63ad642b42a6975c4c90 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1195/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 605266b87c49f852ee0f63ad642b42a6975c4c90^{commit} # timeout=10
Checking out Revision 605266b87c49f852ee0f63ad642b42a6975c4c90 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 605266b87c49f852ee0f63ad642b42a6975c4c90 # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk 001ead29294cb7090fe5c7278142bfc93cd20f7d # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1418035229950289504.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35211 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35277 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 45941 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c03c390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f47687d0b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f47687d0b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c0b17d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f47687b4390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f47687b4390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59784 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58276 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36817 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 50 6 16 3 83% 37->exit, 54-58, 80->81, 81-83, 100->101, 101
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 565 46 300 26 89% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 304->310, 310, 316->317, 317-321, 351->exit, 366->exit, 381->exit, 396->exit, 449->451, 467->466, 526->529, 529, 554->555, 555, 561->564, 564, 661->660, 715->720, 720, 723->724, 724, 769->770, 770, 828-856, 973->979, 979->exit, 1021->1022, 1022, 1031->1037, 1073->1074, 1074-1076, 1080->1081, 1081, 1116->1117, 1117
setup.py 2 2 0 0 0% 18-20

TOTAL 3601 489 1542 170 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.43%
=========== 588 passed, 8 skipped, 273 warnings in 508.48s (0:08:28) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[... the same "--- Logging error ---" traceback repeats five more times as workers restart during shutdown ...]
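The `ValueError: I/O operation on closed file` blocks are shutdown noise: the dask nanny tries to log "Restarting worker" after pytest has already closed the captured streams. Tearing the cluster down explicitly before exit avoids it; a sketch:

```python
from distributed import Client, LocalCluster

cluster = LocalCluster(dashboard_address=":0")
client = Client(cluster)
try:
    pass  # run the workflow / tests against this client
finally:
    # Close workers while stdout/stderr are still open, so nanny
    # restart messages have somewhere to go.
    client.close()
    cluster.close()
```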
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins9183663288096164358.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 6892912d7042a381361fce1b6fe3c1ce3ce3c772, no merge conflicts.
Running as SYSTEM
Setting status of 6892912d7042a381361fce1b6fe3c1ce3ce3c772 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1200/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 6892912d7042a381361fce1b6fe3c1ce3ce3c772^{commit} # timeout=10
Checking out Revision 6892912d7042a381361fce1b6fe3c1ce3ce3c772 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6892912d7042a381361fce1b6fe3c1ce3ce3c772 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk 8cb254ba35024c7baff6b9b7bafc40dcac1c266f # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7853190238873591743.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4076483050>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400cc0fa10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
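The failing assertion shows `dask_stats['y']` coming back empty (None), so `float()` blows up. Purely as an illustration (not the fix this PR ships), a defensive guard in `Median.finalize` could look like:

```python
import math

def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
        val = dask_stats[col]
        # Hypothetical guard: skip medians that came back as None/NaN
        # rather than crashing; the real bug is upstream in the stats
        # computation that produced the empty value.
        if val is None or (isinstance(val, float) and math.isnan(val)):
            continue
        self.medians[col] = float(val)
```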
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff77b0ad0>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff77b07d0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff766f210>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff766f390>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff7659bd0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff7659ed0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f400c0ea290>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c0ea450>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4064307190>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f423bc37910>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f40643dde90>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff77c9b90>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f40643dd210>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c7c0290>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4243d5e510>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f40643d3cd0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f423b952e90>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f40643d5c10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff77f1810>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c71af10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f407d8a5290>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f423b8cb090>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 40821 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34563 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43359 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c3b5790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c0d1b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c0d1b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f5469ac10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c143290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c143290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59592 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55832 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37697 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 12 112 8 95% 71->72, 72, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 565 46 300 26 89% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 304->310, 310, 316->317, 317-321, 351->exit, 366->exit, 381->exit, 396->exit, 449->451, 467->466, 526->529, 529, 554->555, 555, 561->564, 564, 661->660, 715->720, 720, 723->724, 724, 769->770, 770, 828-856, 973->979, 979->exit, 1021->1022, 1022, 1031->1037, 1073->1074, 1074-1076, 1080->1081, 1081, 1116->1117, 1117
setup.py 2 2 0 0 0% 18-20

TOTAL 3598 491 1540 171 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.36%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
===== 12 failed, 576 passed, 8 skipped, 273 warnings in 500.79s (0:08:20) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
(--- Logging error --- repeated: the same 'Restarting worker' traceback occurs five more times)
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1269299399639666014.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e, no merge conflicts.
Running as SYSTEM
Setting status of fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1201/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e^{commit} # timeout=10
Checking out Revision fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e # timeout=10
Commit message: "code format changes"
 > git rev-list --no-walk 6892912d7042a381361fce1b6fe3c1ce3ce3c772 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3212521516033073898.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
./nvtabular/workflow.py:125:13: F841 local variable 'found' is assigned to but never used
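F841 flags a local that is bound but never read; the pattern and its fix in miniature (hypothetical code, not the actual workflow.py:125 body):

    def has_op(ops, name):
        found = any(op.name == name for op in ops)  # F841: never used below
        return True

    def has_op_fixed(ops, name):
        return any(op.name == name for op in ops)   # use (or drop) the binding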
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1999713744125321167.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit d14ebb4a72d0912c632c6488c3d9c9848284e83b, no merge conflicts.
Running as SYSTEM
Setting status of d14ebb4a72d0912c632c6488c3d9c9848284e83b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1202/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse d14ebb4a72d0912c632c6488c3d9c9848284e83b^{commit} # timeout=10
Checking out Revision d14ebb4a72d0912c632c6488c3d9c9848284e83b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d14ebb4a72d0912c632c6488c3d9c9848284e83b # timeout=10
Commit message: "better exception message with actionable feedback"
 > git rev-list --no-walk fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7370348144888778361.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
./nvtabular/workflow.py:125:13: F841 local variable 'found' is assigned to but never used
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8635576175260255048.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 93f34e4e506030059ea66a262605bb0ad4e5b5f2, no merge conflicts.
Running as SYSTEM
Setting status of 93f34e4e506030059ea66a262605bb0ad4e5b5f2 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1203/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 93f34e4e506030059ea66a262605bb0ad4e5b5f2^{commit} # timeout=10
Checking out Revision 93f34e4e506030059ea66a262605bb0ad4e5b5f2 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 93f34e4e506030059ea66a262605bb0ad4e5b5f2 # timeout=10
Commit message: "flake error correction"
 > git rev-list --no-walk d14ebb4a72d0912c632c6488c3d9c9848284e83b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2189674226408279097.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4286647271522822677.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 5173f22ae1367b38554473ca612d1af016b88bc0, no merge conflicts.
Running as SYSTEM
Setting status of 5173f22ae1367b38554473ca612d1af016b88bc0 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1205/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 5173f22ae1367b38554473ca612d1af016b88bc0^{commit} # timeout=10
Checking out Revision 5173f22ae1367b38554473ca612d1af016b88bc0 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5173f22ae1367b38554473ca612d1af016b88bc0 # timeout=10
Commit message: "code reformat"
 > git rev-list --no-walk f48281cc74ac4020d221da225b5c6f6ff36845e0 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1476836800848207370.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py .....................................Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5364765974075243811.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8, no merge conflicts.
Running as SYSTEM
Setting status of 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1206/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8^{commit} # timeout=10
Checking out Revision 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 # timeout=10
Commit message: "added more fixes based on comments"
 > git rev-list --no-walk 5173f22ae1367b38554473ca612d1af016b88bc0 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2157146446715089014.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py /tmp/jenkins2157146446715089014.sh: line 10: 24449 Terminated py.test --cov-config tests/unit/.coveragerc --cov-report term-missing --cov-report xml --cov-fail-under 70 --cov=. tests/unit/
Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5908777722741407840.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit cb2c5817a1290f6b2a472c2c10df88503bd22516, no merge conflicts.
Running as SYSTEM
Setting status of cb2c5817a1290f6b2a472c2c10df88503bd22516 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1207/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse cb2c5817a1290f6b2a472c2c10df88503bd22516^{commit} # timeout=10
Checking out Revision cb2c5817a1290f6b2a472c2c10df88503bd22516 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f cb2c5817a1290f6b2a472c2c10df88503bd22516 # timeout=10
Commit message: "remove copy on list call... redundant"
 > git rev-list --no-walk 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2808874403306199635.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py .........Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins2356138026806348218.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit d0d5590d288eb2453417dd7953695ac63d876992, no merge conflicts.
Running as SYSTEM
Setting status of d0d5590d288eb2453417dd7953695ac63d876992 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1208/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse d0d5590d288eb2453417dd7953695ac63d876992^{commit} # timeout=10
Checking out Revision d0d5590d288eb2453417dd7953695ac63d876992 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d0d5590d288eb2453417dd7953695ac63d876992 # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk cb2c5817a1290f6b2a472c2c10df88503bd22516 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins215466437062872589.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43461 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 33963 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35671 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07642f6dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07642c0850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07642c0850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07641d7290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0764613d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0764613d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54428 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54496 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36619 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 28 1 12 1 95% 41->44, 44
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 528 19 270 26 94% 80->81, 81, 125->126, 126, 127->128, 128, 139->161, 161, 287->293, 293, 299->300, 300-304, 334->exit, 349->exit, 364->exit, 379->exit, 432->434, 450->449, 509->512, 512, 537->538, 538, 544->547, 547, 644->643, 698->703, 703, 706->707, 707, 752->753, 753, 914->920, 920->exit, 958->959, 959, 968->974, 1010->1011, 1011-1013, 1017->1018, 1018, 1053->1054, 1054
setup.py 2 2 0 0 0% 18-20

TOTAL 3563 463 1510 169 84%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.21%
=========== 588 passed, 8 skipped, 273 warnings in 506.77s (0:08:26) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins659260479090043931.sh

@jperez999 jperez999 requested a review from benfred November 13, 2020 03:59
current = self.create_full_col_ctx_entry(op, target_cols, extra_cols, parent=parent)
self.reduce(self.columns_ctx["full"])

def reduce(self, full_dict):
Member

Can we make it clear that these new methods on the workflow object are private, so users don't need to care about them?

Methods like reduce/transform/analyze_placement/detect_cols_collision etc. should also be prefixed with a '_', right?
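As a minimal sketch of that convention (the method names are taken from the comment above; the bodies are placeholders, not NVTabular's actual implementation), the public entry point keeps its plain name while the internal helpers gain a leading underscore:

class Workflow:
    def apply(self, dataset):
        # Public API: delegates to the underscore-prefixed internals.
        phases = self._analyze_placement(dataset)
        return self._transform(phases)

    def _analyze_placement(self, dataset):
        # Internal helper (note leading underscore); placeholder that
        # treats the whole input as a single phase.
        return [dataset]

    def _transform(self, phases):
        # Internal helper; placeholder pass-through over the phases.
        return list(phases)

Python does not enforce privacy; the leading underscore is a convention that tells users and tooling these methods are internal and may change without notice.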

@benfred benfred (Member) left a comment

+1 otherwise

@nvidia-merlin-bot (Contributor)

Click to view CI Results
GitHub pull request #393 of commit 209a58a4ea6f9d18c7ee59b199228d5d079d5798, no merge conflicts.
Running as SYSTEM
Setting status of 209a58a4ea6f9d18c7ee59b199228d5d079d5798 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1209/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 209a58a4ea6f9d18c7ee59b199228d5d079d5798^{commit} # timeout=10
Checking out Revision 209a58a4ea6f9d18c7ee59b199228d5d079d5798 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 209a58a4ea6f9d18c7ee59b199228d5d079d5798 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk d0d5590d288eb2453417dd7953695ac63d876992 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins737640288114269815.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37361 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36793 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 39521 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f92083ad3d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f920840fe50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f920840fe50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f9208355d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f92083c7b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f92083c7b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55832 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57600 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58276 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43951 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 28 1 12 1 95% 41->44, 44
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 528 19 270 26 94% 80->81, 81, 125->126, 126, 127->128, 128, 139->161, 161, 286->292, 292, 298->299, 299-303, 333->exit, 348->exit, 363->exit, 378->exit, 431->433, 449->448, 508->511, 511, 536->537, 537, 543->546, 546, 643->642, 697->702, 702, 705->706, 706, 751->752, 752, 913->919, 919->exit, 957->958, 958, 967->973, 1009->1010, 1010-1012, 1016->1017, 1017, 1052->1053, 1053
setup.py 2 2 0 0 0% 18-20

TOTAL 3563 463 1510 169 84%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.21%
=========== 588 passed, 8 skipped, 273 warnings in 512.32s (0:08:32) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4709658086127700440.sh

@jperez999 jperez999 merged commit 99fc9e5 into NVIDIA-Merlin:main Nov 13, 2020
mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022
* various fixes for diff issues

* forward progress!

* generating correct phases

* working all the way except some dataloader fails...

* code reformat

* fixes for test fails in dataloader

* reformat codes

* update to pass tests

* add checks for workflow phases

* fixes and test case for issue 401

* code reformat

* remove extra code that was refactored out

* comments fixed

* more fixes

* code format changes

* better exception message with actionable feedback

* flake error correction

* add warning on retry set

* code reformat

* added more fixes based on comments

* remove copy on list call... redundant

* privatize new functions

Co-authored-by: Ben Frederickson <[email protected]>

Successfully merging this pull request may close these issues.

[BUG] Target Encoding Key Error
[BUG] LambdaOp + Categorify: keyerror