various fixes for diff issues #393

Merged: 32 commits, Nov 13, 2020
Conversation

jperez999
Contributor

This refactors how the workflow ingests operators: the API is simplified, and multiple operators of the same kind are now allowed. Operator ordering takes priority, which lets chaining follow a more user-friendly convention; reduction to phases is still performed before the application phase. It also fixes the TensorFlow 2 API GPU memory-usage utility and adds a band-aid for the torch tensor convergence issue.
#383 #377 #372
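
For context, here is a minimal sketch of the ingestion pattern this refactor targets, reusing the Workflow calls that appear in the test logs below. The ordering semantics in the comments are assumptions drawn from the description above, not from the diff itself.

import nvtabular as nvt
from nvtabular import ops

# Hypothetical illustration of the simplified API: operators are applied in
# the order they are added, and multiple operators of the same kind are
# allowed within a phase.
proc = nvt.Workflow(
    cat_names=["name-string"], cont_names=["x", "y", "id"], label_name=["label"]
)
proc.add_feature([ops.FillMedian()])   # feature ops run first
proc.add_preprocess(ops.Normalize())   # preprocess ops follow, in add order
proc.add_preprocess(ops.Categorify())
proc.finalize()                        # phases are reduced before application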

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit b35d2e726de4cedf291c52a4b6e32dde07eebece, no merge conflicts.
Running as SYSTEM
Setting status of b35d2e726de4cedf291c52a4b6e32dde07eebece to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1084/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse b35d2e726de4cedf291c52a4b6e32dde07eebece^{commit} # timeout=10
Checking out Revision b35d2e726de4cedf291c52a4b6e32dde07eebece (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b35d2e726de4cedf291c52a4b6e32dde07eebece # timeout=10
Commit message: "various fixes for diff issues"
 > git rev-list --no-walk 96f7c9dd110e34d7c9843b314eb8cca1b1103e52 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2675308116383736041.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
4 files would be reformatted, 69 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1979842867078301388.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit e09086045628965a8e22447e8d72d7bbcc6127df, has merge conflicts.
Running as SYSTEM
Setting status of e09086045628965a8e22447e8d72d7bbcc6127df to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1136/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse e09086045628965a8e22447e8d72d7bbcc6127df^{commit} # timeout=10
Checking out Revision e09086045628965a8e22447e8d72d7bbcc6127df (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f e09086045628965a8e22447e8d72d7bbcc6127df # timeout=10
Commit message: "forward progress!"
 > git rev-list --no-walk dcd3428ad39e8aec93bec9663889307853e3a4d4 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3898417691150242608.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
6 files would be reformatted, 67 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins5643162843928145624.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec, has merge conflicts.
Running as SYSTEM
Setting status of 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1143/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec^{commit} # timeout=10
Checking out Revision 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4e5f41e1f7dbf82b7e1eba87d7f8995563ab47ec # timeout=10
Commit message: "generating correct phases"
 > git rev-list --no-walk d9c6fd2c2cd88700b5847a656b608e5278471ac3 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1766930714392648553.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
Oh no! 💥 💔 💥
6 files would be reformatted, 67 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins7742480476069775538.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 5fa294c4cb0fd658eb90db28705f56e087f939ac, has merge conflicts.
Running as SYSTEM
Setting status of 5fa294c4cb0fd658eb90db28705f56e087f939ac to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1148/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 5fa294c4cb0fd658eb90db28705f56e087f939ac^{commit} # timeout=10
Checking out Revision 5fa294c4cb0fd658eb90db28705f56e087f939ac (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5fa294c4cb0fd658eb90db28705f56e087f939ac # timeout=10
Commit message: "working all the way except some dataloader fails..."
 > git rev-list --no-walk c9f1d3034198ce753cc0d1daea38b61894929075 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins5352699732092640827.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
8 files would be reformatted, 65 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins5224801063729872014.sh

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit f408d99d3a78b2541c191865cb4b23519eef4e1c, no merge conflicts.
Running as SYSTEM
Setting status of f408d99d3a78b2541c191865cb4b23519eef4e1c to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1149/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse f408d99d3a78b2541c191865cb4b23519eef4e1c^{commit} # timeout=10
Checking out Revision f408d99d3a78b2541c191865cb4b23519eef4e1c (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f408d99d3a78b2541c191865cb4b23519eef4e1c # timeout=10
Commit message: "merging in"
 > git rev-list --no-walk 5fa294c4cb0fd658eb90db28705f56e087f939ac # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7330622643038130716.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/torch.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/loader/tf_utils.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/lambdaop.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/operator.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/transform_operator.py
error: cannot format /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py: Cannot parse: 826:0: <<<<<<< HEAD
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_workflow.py
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_ops.py
Oh no! 💥 💔 💥
7 files would be reformatted, 68 files would be left unchanged, 1 file would fail to reformat.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2721762884375901867.sh
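
The parse failure in this run ("Cannot parse: 826:0: <<<<<<< HEAD") means workflow.py still contained unresolved git merge-conflict markers, which are not valid Python, so black gave up on the file. A small hypothetical helper (not part of this repo) that catches such markers before a formatting run:

import re
from pathlib import Path

# Unresolved git conflict markers, like the one black hit at line 826.
CONFLICT_RE = re.compile(r"^(<<<<<<< |=======$|>>>>>>> )", re.MULTILINE)

def has_conflict_markers(path: str) -> bool:
    """Return True if the file still contains merge-conflict markers."""
    return bool(CONFLICT_RE.search(Path(path).read_text()))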

@nvidia-merlin-bot
Contributor

CI Results
GitHub pull request #393 of commit 318679aa15695bc3ca8ff421db0f04bdaa33e217, no merge conflicts.
Running as SYSTEM
Setting status of 318679aa15695bc3ca8ff421db0f04bdaa33e217 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1150/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 318679aa15695bc3ca8ff421db0f04bdaa33e217^{commit} # timeout=10
Checking out Revision 318679aa15695bc3ca8ff421db0f04bdaa33e217 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 318679aa15695bc3ca8ff421db0f04bdaa33e217 # timeout=10
Commit message: "code reformat"
 > git rev-list --no-walk f408d99d3a78b2541c191865cb4b23519eef4e1c # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2750347368199920233.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py ......FF..FF..FFFFFFFFFFFFFF.. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3597ab090>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa358d11f80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
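
Every failure in this run reduces to the dtype-safety check shown above: FillMedian computes a float median (-0.00777…) and tries to fill an int64 continuous column with it, and cuDF refuses the lossy cast. Below is a minimal NumPy-only sketch of that check and of the obvious workaround, casting the stat to the column dtype first; whether the PR's band-aid takes this exact form is an assumption.

import numpy as np

col_dtype = np.dtype("int64")
fill_value = -0.007777519058436155  # the float median from the traceback

# cuDF's check, paraphrased: the fill value must survive a round trip
# through the column dtype, otherwise fillna raises TypeError.
casted = col_dtype.type(fill_value)
print(casted != fill_value)  # True: -0.0077... truncates to 0, so cuDF raises

# Workaround sketch: coerce the stat to the column dtype before fillna,
# e.g. stat_val = gdf[col].dtype.type(stat_val) for integer columns.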
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa33c121e90>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b192ddd0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3584a76d0>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b5213290>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b51bfc90>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b18f4e60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b5304690>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b5328050>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3b26f2ed0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b1b4e170>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa32071b8d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b180ddd0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa33c1cf2d0>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa38c18ce60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa320741e10>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in next
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b18f4050>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3200dfed0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3b52bfd40>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa320364690>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa36071b680>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3204e6310>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
    processor.update_stats(dataset)
    data_itr.map(processor)

    rows = 0
    for idx in range(len(data_itr)):
      X, y = next(data_itr)

tests/unit/test_tf_dataloader.py:95:


nvtabular/loader/backend.py:254: in __next__
return self._get_next_batch()
nvtabular/loader/backend.py:281: in _get_next_batch
self._fetch_chunk()
nvtabular/loader/backend.py:260: in _fetch_chunk
raise chunks
nvtabular/loader/backend.py:119: in load_chunks
chunks = dataloader.make_tensors(chunks, dataloader._use_nnz)
nvtabular/loader/backend.py:313: in make_tensors
gdf = workflow.apply_ops(gdf)
nvtabular/workflow.py:729: in apply_ops
gdf = self._run_trans_ops_for_phase(gdf, self.phases[phase_index])
nvtabular/workflow.py:709: in _run_trans_ops_for_phase
gdf = op.apply_op(gdf, self.columns_ctx, cols_grp, target_cols, self.stats)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa3600ebf80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc649ad0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc44de60>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name0-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name0_co1')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280693c90>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = ['label']

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc3f3710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names0-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa38c1a89d0>
engine = 'parquet', cat_names = ['name-cat', 'name-string']
cont_names = ['x', 'y', 'id'], label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d8084c20>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________ test_empty_cols[label_name1-cont_names0-cat_names1-parquet] __________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_empty_cols_label_name1_co1')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc1c7810>
engine = 'parquet', cat_names = [], cont_names = ['x', 'y', 'id']
label_name = []

@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("cat_names", [["name-cat", "name-string"], []])
@pytest.mark.parametrize("cont_names", [["x", "y", "id"], []])
@pytest.mark.parametrize("label_name", [["label"], []])
def test_empty_cols(tmpdir, df, dataset, engine, cat_names, cont_names, label_name):
    # test out https://github.com/NVIDIA/NVTabular/issues/149 making sure we can iterate over
    # empty cats/conts
    # first with no continuous columns
    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_format=None,
    )
  df_out = processor.get_ddf().compute(scheduler="synchronous")

tests/unit/test_torch_dataloader.py:73:


/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:167: in compute
(result,) = compute(self, traverse=False, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:471: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc3c9710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
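The `test_gpu_dl` failures below reach the same `fillna` through `processor.apply` and Dask's synchronous scheduler rather than the data loader. One way to sidestep the cast, sketched as a hypothetical `fill_with_median` helper (not the patch this PR actually lands), is to truncate the recorded median to the column dtype before filling:

```python
import numpy as np

def fill_with_median(gdf, col, stat_val):
    # Hypothetical helper, not nvtabular API: coerce a float median to the
    # column's dtype so cudf's safe-cast check passes for integer columns.
    dtype = gdf[col].dtype
    if np.issubdtype(dtype, np.integer):
        stat_val = dtype.type(stat_val)  # e.g. int64(-0.0077...) -> 0
    return gdf[col].fillna(stat_val)
```

The trade-off is precision: an integer column filled this way gets a truncated median, which is usually tolerable for an id-like count column.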
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2806fd950>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d86d63b0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa318202510>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc55e8c0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c666150>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa31831e4d0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa31850eb50>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa35924fd40>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa318278d50>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2d83729e0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
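One detail of the cudf guard quoted in these frames: the cast-equality check is gated on `not np.isnan(fill_value)` because NaN never compares equal to anything, itself included, so a NaN fill on a float column would otherwise always hit the non-equivalent branch. A quick illustration:

```python
import numpy as np

fill = float("nan")
print(fill != fill)    # True: NaN is never equal to itself
print(np.isnan(fill))  # True: hence the explicit isnan exemption
```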
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc622b10>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc5cf200>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa3584a95d0>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc5cff80>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa28026ded0>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc448710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc56de50>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c517440>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c7f4e50>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc511830>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280797910>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa2bc432c20>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa29c579ed0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
        out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c733ef0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_________________________ test_kill_dl[parquet-1e-06] __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_1e_06_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa280316fd0>
part_mem_fraction = 1e-06, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:184:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa280226710>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
__________________________ test_kill_dl[parquet-0.1] ___________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_kill_dl_parquet_0_1_0')
df = name-cat name-string id label x y
0 Yvonne Charlie 972 955 -0.699630 -0.63284...y 1017 981 0.645883 0.779787
2160 Ingrid Patricia 962 946 0.176731 -0.735924

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7fa2bc721190>
part_mem_fraction = 0.1, engine = 'parquet'

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.1])
@pytest.mark.parametrize("engine", ["parquet"])
def test_kill_dl(tmpdir, df, dataset, part_mem_fraction, engine):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
    )

tests/unit/test_torch_dataloader.py:184:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7fa29c67def0>
fill_value = -0.007777519058436155

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
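
The guard in the cudf excerpt above is a simple round-trip check: cast the scalar to the column dtype and compare it with the original. A small worked example of that check using numpy alone (dtype and value taken from the traceback):

    import numpy as np

    dtype = np.dtype("int64")
    fill_value = -0.007777519058436155

    casted = dtype.type(fill_value)     # truncates to 0
    print(casted != fill_value)         # True  -> cudf raises TypeError

    # An integral float survives the round trip and would be accepted:
    print(dtype.type(-1.0) != -1.0)     # False -> safe to fill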
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41425 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42525 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34371 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa280693390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa3207359d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa3207359d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa2bc1aa790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa2bc1c75d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa2802bdd10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa2802bdd10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fa3182233d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41301 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 3 40 3 96% 54->55, 55-59, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 9 112 7 96% 71->72, 72, 77-78, 123->124, 124, 212->214, 214, 220->222, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 87 336 26 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 514 1568 173 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.35%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names0-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06] - Typ...
FAILED tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-0.1] - TypeE...
===== 30 failed, 557 passed, 8 skipped, 266 warnings in 381.36s (0:06:21) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
(The "--- Logging error ---" block above repeats verbatim five more times as the remaining Dask workers restart.)
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8671505101989059999.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 986c38d8f5e1f8f981f636438683dd9b1d1599eb, no merge conflicts.
Running as SYSTEM
Setting status of 986c38d8f5e1f8f981f636438683dd9b1d1599eb to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1155/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 986c38d8f5e1f8f981f636438683dd9b1d1599eb^{commit} # timeout=10
Checking out Revision 986c38d8f5e1f8f981f636438683dd9b1d1599eb (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 986c38d8f5e1f8f981f636438683dd9b1d1599eb # timeout=10
Commit message: "fixes for test fails in dataloader"
 > git rev-list --no-walk 8184506ba87fbfef7fbee544145236a4360adbae # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins2450794695755062914.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Running setup.py develop for nvtabular
Successfully installed nvtabular
would reformat /var/jenkins_home/workspace/nvtabular_tests/nvtabular/tests/unit/test_torch_dataloader.py
Oh no! 💥 💔 💥
1 file would be reformatted, 75 files would be left unchanged.
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins643749491149907924.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 0e89af3472997c90fb06aa9883effa990e0eae16, no merge conflicts.
Running as SYSTEM
Setting status of 0e89af3472997c90fb06aa9883effa990e0eae16 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1156/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 0e89af3472997c90fb06aa9883effa990e0eae16^{commit} # timeout=10
Checking out Revision 0e89af3472997c90fb06aa9883effa990e0eae16 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0e89af3472997c90fb06aa9883effa990e0eae16 # timeout=10
Commit message: "reformat codes"
 > git rev-list --no-walk 986c38d8f5e1f8f981f636438683dd9b1d1599eb # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins6914325229575234121.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py ..FF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py ..............FFFFFFFFFFFF.... [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
>       _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f0e30501050>
stdout = b'Train for 1 steps\n\r1/1 [==============================] - 6s 6s/step - loss: 0.5433 - rmspe_tf: 0.4862\n'
stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
>           raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
2020-11-09 05:31:22.352842: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 05:31:22.352970: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 05:31:22.352985: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-11-09 05:31:23.166543: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-09 05:31:23.167476: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.168508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.169530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.170528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.170799: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.170849: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 05:31:23.170883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 05:31:23.170914: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 05:31:23.170945: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 05:31:23.170975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 05:31:23.171006: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 05:31:23.178313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 05:31:23.371600: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-09 05:31:23.395251: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-09 05:31:23.396319: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557dafa6b0e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-09 05:31:23.396380: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-09 05:31:23.745961: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x557daefc8010 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-09 05:31:23.746031: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746042: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746050: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.746057: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 05:31:23.747951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.749138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.750154: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.751192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 05:31:23.751302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.751340: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 05:31:23.751360: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 05:31:23.751379: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 05:31:23.751396: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 05:31:23.751413: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 05:31:23.751431: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 05:31:23.758752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 05:31:23.758817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 05:31:23.763972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 05:31:23.763999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-09 05:31:23.764009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-09 05:31:23.764017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-09 05:31:23.764024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-09 05:31:23.764031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-09 05:31:23.768748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8139 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-09 05:31:23.770236: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 14864 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-09 05:31:23.771660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 14864 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-09 05:31:23.773098: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 14864 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
2020-11-09 05:31:23.780216: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 7.95G (8534360064 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
2020-11-09 05:31:30.548029: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-2/test_rossman_example0/notebook.py", line 210, in
).to('cuda')
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
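
The Rossmann failure is a device-memory collision rather than a modeling bug: TensorFlow pre-allocates nearly the whole GPU (note the failed 7.95G allocation above), so the notebook's later PyTorch .to('cuda') finds nothing left. A minimal sketch of the kind of guard the TF memory-usage fix in this PR targets, assuming TensorFlow 2.x and that it runs before any GPU work:

import tensorflow as tf

# Illustrative only -- not necessarily the exact change in this PR.
# Configure each GPU before TensorFlow touches it, so other frameworks in
# the same process (here PyTorch) can still allocate device memory.
for gpu in tf.config.experimental.list_physical_devices("GPU"):
    # Grow memory on demand instead of grabbing the whole device up front.
    tf.config.experimental.set_memory_growth(gpu, True)
    # Alternatively, pin TensorFlow to a fixed slice (memory_limit is in MB):
    # tf.config.experimental.set_virtual_device_configuration(
    #     gpu,
    #     [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=4096)],
    # )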
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
>       _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f0ded3f9310>
stdout = b'', stderr = None, retcode = 1

    (subprocess.run source omitted; identical to the run() listing in the test_rossman_example traceback above)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
distributed.worker - WARNING - Run Failed
Function: _rmm_pool
args: ()
kwargs: {}
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/worker.py", line 3546, in run
result = function(*args, **kwargs)
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 84, in
client.run(_rmm_pool)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2512, in run
return self.sync(self._run, function, *args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 833, in sync
self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
result[0] = yield future
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2449, in _run
raise exc.with_traceback(tb)
File "/tmp/pytest-of-jenkins/pytest-2/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
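
test_multigpu_dask_example hits the same wall from the RMM side: _rmm_pool asks each worker for RMM's default pool size on a GPU that is already mostly allocated, and constructing the pool itself raises std::bad_alloc. A hedged sketch of a more forgiving per-worker setup (the explicit size is illustrative, not taken from the notebook):

import rmm

def _rmm_pool_capped():
    # Same shape as the notebook's _rmm_pool, but with an explicit, modest
    # pool rather than the default, which claims most of the free memory.
    rmm.reinitialize(
        pool_allocator=True,
        initial_pool_size=2 << 30,  # 2 GiB; must fit in currently free memory
    )

# run on every worker, as the notebook does: client.run(_rmm_pool_capped)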
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da7f868d0>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0dc8536c50>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
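
All twelve test_tf_gpu_dl failures trace back to this one spot: the dask-computed median for column y comes back null, so float() receives None. A defensive rewrite of the loop in Median.finalize, shown as a standalone sketch (the skip-on-null policy is illustrative, not necessarily the fix this PR landed):

import math

def safe_medians(dask_stats):
    # Build {column: median}, skipping columns whose median came back null
    # instead of crashing on float(None). values_host is the host-side view
    # of the cudf index, exactly as in the traceback above.
    medians = {}
    for col in dask_stats.index.values_host:
        val = dask_stats[col]
        if val is None or (isinstance(val, float) and math.isnan(val)):
            continue  # column "y" above would be skipped (or reported) here
        medians[col] = float(val)
    return medians

Whether to skip, substitute a default, or raise a clearer error for a null median is a policy choice; the point is that finalize should not assume every quantile is populated.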
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da4182c50>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0da4132410>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0da7f43050>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0da7ab67d0>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0dc86b8050>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0dc84e9810>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0ded4145d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0ded400a10>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-2/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0e206a4890>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
>       processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f0d8c5a3d10>
dask_stats = x 0.009410043
y <NA>
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
>         self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
______ test_tf_gpu_dl: six further parametrizations fail with the same traceback ______

test_tf_gpu_dl[False-1-parquet-0.01], test_tf_gpu_dl[False-1-parquet-0.06],
test_tf_gpu_dl[False-10-parquet-0.01], test_tf_gpu_dl[False-10-parquet-0.06],
test_tf_gpu_dl[False-100-parquet-0.01] and test_tf_gpu_dl[False-100-parquet-0.06]
all fail in Median.finalize with the identical traceback, ending in:

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
______________________ test_gpu_dl[None-parquet-1-1e-06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_1_1e_0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30763050>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0dc86ae5f0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                    type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
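test_gpu_dl fails differently: FillMedian computes a float median (-0.000922759878449142) and then tries to fillna an int64 column with it, which cudf rejects because the cast would be lossy. A dtype-aware fill is sketched below; fill_median_safely and its float64-promotion policy are illustrative assumptions, not the band-aid shipped in this PR:

import numpy as np

def fill_median_safely(gdf, col, median):
    # gdf: a cudf.DataFrame (pandas works too; the API used here is the same).
    # cudf raises "Cannot safely cast non-equivalent float to int64" when an
    # integer column is filled with a non-integral float, so reconcile the
    # fill value with the column dtype, promoting to float64 when needed.
    dtype = gdf[col].dtype
    if np.issubdtype(dtype, np.integer) and not float(median).is_integer():
        gdf[col] = gdf[col].astype("float64")
        dtype = gdf[col].dtype
    gdf[col] = gdf[col].fillna(dtype.type(median))
    return gdf

Rounding the median for integer columns would be an alternative policy; either way the fill value's dtype has to be reconcilable with the column's before fillna is called.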
_______________________ test_gpu_dl[None-parquet-1-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_1_0_00')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d301d0410>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0dc866f050>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_10_1e0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d301d0710>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d0c7b9a70>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-10-0.06] _______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_10_0_0')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d3068f290>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d84230b90>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[None-parquet-100-1e-06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_100_10')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec16c290>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c749c20>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
______________________ test_gpu_dl[None-parquet-100-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_None_parquet_100_00')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d8c05e250>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = None

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d0c4bf9e0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-1-1e-06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_10')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec5d7ad0>
batch_size = 1, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d842c39e0>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
_____________________ test_gpu_dl[devices1-parquet-1-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_11')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d3073c850>
batch_size = 1, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c5be050>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # castsafely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-10-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_12')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30597410>
batch_size = 10, part_mem_fraction = 1e-06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
      out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(
(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in call
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d30621440>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
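For reference, the cast-safety check quoted above fires whenever the fill value does not round-trip through the column's dtype. A minimal sketch of the failure mode (assuming only a cudf install; the literal fill value is copied from the traceback, not from the test data):

import cudf

ints = cudf.Series([1, 2, None, 4], dtype="int64")

ints.fillna(2.0)  # fine: int64(2.0) compares equal to 2.0, so the cast is safe
try:
    # a non-integral median, as FillMedian produces for this int64 column
    ints.fillna(-0.000922759878449142)
except TypeError as err:
    print(err)  # Cannot safely cast non-equivalent float to int64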
____________________ test_gpu_dl[devices1-parquet-10-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_13')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0dec78c6d0>
batch_size = 10, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d84341a70>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
___________________ test_gpu_dl[devices1-parquet-100-1e-06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_14')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0d30684750>
batch_size = 100, part_mem_fraction = 1e-06, engine = 'parquet'
devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c6e7830>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
____________________ test_gpu_dl[devices1-parquet-100-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-2/test_gpu_dl_devices1_parquet_15')
df = name-cat name-string id label x y
0 Xavier Edith 1047 1020 0.945586 0.02596...a 985 994 0.375274 -0.989079
2160 Kevin Kevin 1018 956 -0.376415 -0.533307

[4321 rows x 6 columns]
dataset = <nvtabular.io.dataset.Dataset object at 0x7f0cec3a2ed0>
batch_size = 100, part_mem_fraction = 0.06, engine = 'parquet', devices = [0, 1]

@pytest.mark.parametrize("part_mem_fraction", [0.000001, 0.06])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("devices", [None, GPU_DEVICE_IDS[:2]])
def test_gpu_dl(tmpdir, df, dataset, batch_size, part_mem_fraction, engine, devices):
    cat_names = ["name-cat", "name-string"]
    cont_names = ["x", "y", "id"]
    label_name = ["label"]

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)

    processor.add_feature([ops.FillMedian()])
    processor.add_preprocess(ops.Normalize())
    processor.add_preprocess(ops.Categorify())

    output_train = os.path.join(tmpdir, "train/")
    os.mkdir(output_train)

    processor.apply(
        dataset,
        apply_offline=True,
        record_stats=True,
        shuffle=nvt.io.Shuffle.PER_PARTITION,
        output_path=output_train,
>       out_files_per_proc=2,
    )

tests/unit/test_torch_dataloader.py:113:


nvtabular/workflow.py:1025: in apply
dtypes=dtypes,
nvtabular/workflow.py:1144: in build_and_process_graph
num_threads=num_io_threads,
nvtabular/workflow.py:1233: in ddf_to_dataset
num_threads,
nvtabular/io/dask.py:112: in _ddf_to_dataset
out = dask.compute(out, scheduler="synchronous")[0]
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/base.py:452: in compute
results = schedule(dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:527: in get_sync
return get_async(apply_sync, 1, dsk, keys, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:494: in get_async
fire_task()
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:466: in fire_task
callback=queue.put,
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:516: in apply_sync
res = func(*args, **kwds)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:227: in execute_task
result = pack_exception(e, dumps)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/local.py:222: in execute_task
result = _execute_task(task, data)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/optimization.py:961: in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:151: in get
result = _execute_task(task, cache)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/core.py:121: in _execute_task
return func(*(_execute_task(a, cache) for a in args))
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/utils.py:29: in apply
return func(*args, **kwargs)
/opt/conda/envs/rapids/lib/python3.7/site-packages/dask/dataframe/core.py:5298: in apply_and_enforce
df = func(*args, **kwargs)
nvtabular/workflow.py:830: in _aggregated_op
gdf = logic(gdf, columns_ctx, cols_grp, target_cols, stats_context)
nvtabular/ops/transform_operator.py:100: in apply_op
new_gdf = self.op_logic(gdf, target_columns, stats_context=stats_context)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)
nvtabular/ops/fill.py:113: in op_logic
new_gdf[col] = gdf[col].fillna(stat_val)
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/series.py:1812: in fillna
value=value, method=method, axis=axis, inplace=inplace, limit=limit
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/frame.py:1325: in fillna
copy_data[name] = copy_data[name].fillna(value[name],)


self = <cudf.core.column.numerical.NumericalColumn object at 0x7f0d4c7a2680>
fill_value = -0.000922759878449142

def fillna(self, fill_value):
    """
    Fill null values with *fill_value*
    """
    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = self.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):
            raise TypeError(
                "Cannot safely cast non-equivalent {} to {}".format(
                  type(fill_value).__name__, self.dtype.name
                )
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/column/numerical.py:432: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34315 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36571 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42533 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d302bb890>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30453b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30453b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30277d10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30111dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d4c10dad0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d4c10dad0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f0d30498190>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112320 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41903 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 4 90% 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 16 48 9 82% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 13 112 8 95% 71->72, 72, 77-78, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 87 336 26 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 533 1568 180 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 81.70%
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-0.06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
FAILED tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-0.06]
===== 26 failed, 561 passed, 8 skipped, 267 warnings in 384.30s (0:06:24) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
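This shutdown-time failure is benign: a distributed Nanny thread logs "Restarting worker" after pytest has already closed the captured stream, so logging's internal error handler prints the traceback above instead of the message. A minimal standard-library sketch of the same mechanism:

import io
import logging

stream = io.StringIO()
logging.basicConfig(stream=stream)
stream.close()
# emitting after the handler's stream is closed produces
# "--- Logging error ---" with ValueError: I/O operation on closed file
logging.getLogger("distributed.nanny").warning("Restarting worker")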
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4596099623436440276.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1158/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 5f34242ae04ff2e3b918f9c111975bacd12304ea # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1047770170142313710.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py FFFF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_criteo_notebook _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0')

def test_criteo_notebook(tmpdir):
    # create a toy dataset in tmpdir, and point environment variables so the notebook
    # will read from it
    for i in range(24):
        df = _get_random_criteo_data(1000)
        df.to_parquet(os.path.join(tmpdir, f"day_{i}.parquet"))
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    _run_notebook(
        tmpdir,
        os.path.join(dirname(TEST_PATH), "examples", "criteo-example.ipynb"),
        # disable rmm.reinitialize, seems to be causing issues
>       transform=lambda line: line.replace("rmm.reinitialize(", "# rmm.reinitialize("),
    )

tests/unit/test_notebooks.py:45:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d49b94850>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_criteo_notebook0/notebook.py", line 24, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
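The root cause is the child interpreter rather than the notebook itself: _run_notebook executes the converted script with sys.executable, so nvtabular must be importable in that interpreter, and the editable install above evidently was not visible to it. A small sketch of how check_output surfaces this, per the subprocess.run source quoted below:

import subprocess
import sys

try:
    subprocess.check_output([sys.executable, "-c", "import nvtabular"])
except subprocess.CalledProcessError as err:
    # check_output runs with check=True, so a non-zero exit in the child
    # (here, the ModuleNotFoundError) is re-raised in the parent
    print(err.returncode)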
_____________________________ test_optimize_criteo _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0')

def test_optimize_criteo(tmpdir):
    _get_random_criteo_data(1000).to_csv(os.path.join(tmpdir, "day_0"), sep="\t", header=False)
    os.environ["INPUT_DATA_DIR"] = str(tmpdir)
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(dirname(TEST_PATH), "examples", "optimize_criteo.ipynb")
>   _run_notebook(tmpdir, notebook_path)

tests/unit/test_notebooks.py:55:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25e8d690>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_optimize_criteo0/notebook.py", line 60, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
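
All four notebook failures share a root cause visible above: the test converts the
notebook to a script, runs it with subprocess.check_output([sys.executable, script_path]),
and check_output raises CalledProcessError for any non-zero exit (it delegates to
run(..., check=True), the function listed in the traceback). Here the child process died
on "import nvtabular" because the freshly built package was not importable from the
subprocess. A minimal sketch of the pattern; the PYTHONPATH workaround is an
illustrative assumption, not what the test suite actually does:

import os
import subprocess
import sys

script_path = "/tmp/notebook.py"  # hypothetical converted-notebook script

# check_output == run(..., stdout=PIPE, check=True): a non-zero exit status
# surfaces as subprocess.CalledProcessError, exactly as in the traceback above.
env = dict(os.environ)
# Hypothetical workaround: point the child at the source checkout so that
# "import nvtabular" resolves even if the package install did not complete.
env["PYTHONPATH"] = "/var/jenkins_home/workspace/nvtabular_tests/nvtabular"
subprocess.check_output([sys.executable, script_path], env=env)
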
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
  _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25f949d0>
stdout = b'', stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_rossman_example0/notebook.py", line 17, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
      _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f8d25f3ce50>
stdout = b'', stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-4/test_multigpu_dask_example0/notebook.py", line 31, in
import nvtabular as nvt
ModuleNotFoundError: No module named 'nvtabular'
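
The _nb_modify callback above is the harness' convention for shrinking notebooks: the
notebook is flattened to a script and every source line is passed through the callable
before execution. A minimal sketch of that transform step; the helper name and the
nbconvert invocation are assumptions for illustration, not the repository's exact code:

import subprocess

def apply_line_transform(script_path, transform):
    # Rewrite the converted notebook in place, one line at a time, so a test
    # can swap in cluster addresses, epoch counts, and smaller dataset sizes.
    with open(script_path) as f:
        lines = f.readlines()
    with open(script_path, "w") as f:
        f.writelines(transform(line) for line in lines)

# Hypothetical conversion step: nbconvert flattens the .ipynb to notebook.py,
# after which apply_line_transform(script_path, _nb_modify) rewrites it.
subprocess.check_output(
    ["jupyter", "nbconvert", "--to", "script", "--output", "notebook", "multi-gpu_dask.ipynb"]
)
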
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44903 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44479 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34601 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c502baf90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8c5036cf50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8c5036cf50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c50326e10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8c5033dcd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8d4f44b450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8d4f44b450>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f8d0c3d1550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59784 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42891 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 7 85% 55->57, 58->60, 63->65, 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 17 48 9 80% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 333->334, 334-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 13 112 11 94% 71->72, 72, 90->91, 91, 95->87, 115->116, 116, 123->124, 124, 131-132, 220->222, 240->241, 241, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 76->77, 77, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 26 20 5 45% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73, 85-90, 100-113
nvtabular/loader/torch.py 37 11 6 0 60% 25-27, 30-36, 118
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 42 82% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 313->317, 318->320, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 97 336 26 79% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 601->600, 609->612, 612, 633-648, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 562 1568 188 81%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 80.59%
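
The 70% gate above is a pytest-cov fail-under threshold. A hypothetical local
reproduction of this report shape; the real flags live in the repo's setup.cfg, so
treat these as illustrative:

import pytest

# Coverage is collected for the nvtabular package, written to coverage.xml,
# and the run fails if total coverage drops below 70%, matching the log above.
pytest.main([
    "tests/unit",
    "--cov=nvtabular",
    "--cov-report=xml",
    "--cov-fail-under=70",
])
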
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_criteo_notebook - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_optimize_criteo - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
====== 4 failed, 583 passed, 8 skipped, 273 warnings in 480.94s (0:08:00) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[The identical "Restarting worker" logging error repeats six more times.]
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5948351989957738444.sh

@jperez999
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1160/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 599afb8d8a160cd55336d830d9ded3f21a4046c0 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins3609457319750059927.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py ..FF [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________________ test_rossman_example _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0')

def test_rossman_example(tmpdir):
    pytest.importorskip("nvtabular.loader.tensorflow")
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "train.csv"))
    _get_random_rossmann_data(1000).to_csv(os.path.join(tmpdir, "valid.csv"))
    os.environ["OUTPUT_DATA_DIR"] = str(tmpdir)

    notebook_path = os.path.join(
        dirname(TEST_PATH), "examples", "rossmann-store-sales-example.ipynb"
    )
  _run_notebook(tmpdir, notebook_path, lambda line: line.replace("EPOCHS = 25", "EPOCHS = 1"))

tests/unit/test_notebooks.py:67:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f80acfab050>
stdout = b'Train for 1 steps\n\r1/1 [==============================] - 6s 6s/step - loss: 0.5078 - rmspe_tf: 0.4699\n'
stderr = None, retcode = 1

    [Identical subprocess.run() source listing elided; see the first failure above.]

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
----------------------------- Captured stderr call -----------------------------
2020-11-09 06:15:02.167514: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 06:15:02.167643: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/lib:/opt/conda/envs/rapids/lib
2020-11-09 06:15:02.167657: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-11-09 06:15:02.968579: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-09 06:15:02.969581: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.970683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.971781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.972879: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:02.973160: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:02.973214: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 06:15:02.973252: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 06:15:02.973286: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 06:15:02.973320: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 06:15:02.973353: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 06:15:02.973388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:15:02.981211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 06:15:03.175182: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-11-09 06:15:03.199318: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3198080000 Hz
2020-11-09 06:15:03.200279: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56333e839550 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-09 06:15:03.200304: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-11-09 06:15:03.540481: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56333e14e8f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-09 06:15:03.540552: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540565: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (1): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540572: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (2): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.540580: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (3): Tesla P100-DGXS-16GB, Compute Capability 6.0
2020-11-09 06:15:03.542431: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:07:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.543516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 1 with properties:
pciBusID: 0000:08:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.544577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 2 with properties:
pciBusID: 0000:0e:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.545650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 3 with properties:
pciBusID: 0000:0f:00.0 name: Tesla P100-DGXS-16GB computeCapability: 6.0
coreClock: 1.4805GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-09 06:15:03.545742: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:03.545768: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-11-09 06:15:03.545789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-11-09 06:15:03.545809: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-11-09 06:15:03.545827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-11-09 06:15:03.545845: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-11-09 06:15:03.545864: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-09 06:15:03.552786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0, 1, 2, 3
2020-11-09 06:15:03.552847: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-11-09 06:15:03.557607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-09 06:15:03.557631: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 1 2 3
2020-11-09 06:15:03.557641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N Y Y Y
2020-11-09 06:15:03.557648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 1: Y N Y Y
2020-11-09 06:15:03.557656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 2: Y Y N Y
2020-11-09 06:15:03.557662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 3: Y Y Y N
2020-11-09 06:15:03.561931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8139 MB memory) -> physical GPU (device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0)
2020-11-09 06:15:03.563277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 14864 MB memory) -> physical GPU (device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0)
2020-11-09 06:15:03.564654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 14864 MB memory) -> physical GPU (device: 2, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0e:00.0, compute capability: 6.0)
2020-11-09 06:15:03.566006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 14864 MB memory) -> physical GPU (device: 3, name: Tesla P100-DGXS-16GB, pci bus id: 0000:0f:00.0, compute capability: 6.0)
2020-11-09 06:15:03.573095: I tensorflow/stream_executor/cuda/cuda_driver.cc:801] failed to allocate 7.95G (8534360064 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
2020-11-09 06:15:10.391896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-6/test_rossman_example0/notebook.py", line 210, in
).to('cuda')
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 612, in to
return self._apply(convert)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
module._apply(fn)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 381, in _apply
param_applied = fn(param)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/torch/nn/modules/module.py", line 610, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory
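This is a device-memory collision: TensorFlow's allocator has already claimed most of GPU 0 (note the 8139 MB device created above, followed by the failed 7.95G allocation), so the later torch .to('cuda') has nothing left to allocate. A minimal sketch of the usual mitigation, assuming TF 2.x's tf.config.experimental API (not necessarily what this PR's memory util does):

    import tensorflow as tf

    # Hypothetical mitigation: make TensorFlow grow allocations on demand
    # instead of grabbing the whole card, so a later torch .to('cuda')
    # still finds free device memory.
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)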
__________________________ test_multigpu_dask_example __________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0')

def test_multigpu_dask_example(tmpdir):
    with get_cuda_cluster() as cuda_cluster:
        os.environ["BASE_DIR"] = str(tmpdir)
        scheduler_port = cuda_cluster.scheduler_address

        def _nb_modify(line):
            # Use cuda_cluster "fixture" port rather than allowing notebook
            # to deploy a LocalCUDACluster within the subprocess
            line = line.replace("cluster = None", f"cluster = '{scheduler_port}'")
            # Use a much smaller "toy" dataset
            line = line.replace("write_count = 25", "write_count = 4")
            line = line.replace('freq = "1s"', 'freq = "1h"')
            # Use smaller partitions for smaller dataset
            line = line.replace("part_mem_fraction=0.1", "part_size=1_000_000")
            line = line.replace("out_files_per_proc=8", "out_files_per_proc=1")
            return line

        notebook_path = os.path.join(dirname(TEST_PATH), "examples", "multi-gpu_dask.ipynb")
      _run_notebook(tmpdir, notebook_path, _nb_modify)

tests/unit/test_notebooks.py:88:


tests/unit/test_notebooks.py:108: in _run_notebook
subprocess.check_output([sys.executable, script_path])
/opt/conda/envs/rapids/lib/python3.7/subprocess.py:411: in check_output
**kwargs).stdout


input = None, capture_output = False, timeout = None, check = True
popenargs = (['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py'],)
kwargs = {'stdout': -1}, process = <subprocess.Popen object at 0x7f80ac08e250>
stdout = b'', stderr = None, retcode = 1

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    """Run command with arguments and return a CompletedProcess instance.

    The returned instance will have attributes args, returncode, stdout and
    stderr. By default, stdout and stderr are not captured, and those attributes
    will be None. Pass stdout=PIPE and/or stderr=PIPE in order to capture them.

    If check is True and the exit code was non-zero, it raises a
    CalledProcessError. The CalledProcessError object will have the return code
    in the returncode attribute, and output & stderr attributes if those streams
    were captured.

    If timeout is given, and the process takes too long, a TimeoutExpired
    exception will be raised.

    There is an optional argument "input", allowing you to
    pass bytes or a string to the subprocess's stdin.  If you use this argument
    you may not also use the Popen constructor's "stdin" argument, as
    it will be used internally.

    By default, all communication is in bytes, and therefore any "input" should
    be bytes, and the stdout and stderr will be bytes. If in text mode, any
    "input" should be a string, and stdout and stderr will be strings decoded
    according to locale encoding, or by "encoding" if set. Text mode is
    triggered by setting any of text, encoding, errors or universal_newlines.

    The other arguments are the same as for the Popen constructor.
    """
    if input is not None:
        if kwargs.get('stdin') is not None:
            raise ValueError('stdin and input arguments may not both be used.')
        kwargs['stdin'] = PIPE

    if capture_output:
        if kwargs.get('stdout') is not None or kwargs.get('stderr') is not None:
            raise ValueError('stdout and stderr arguments may not be used '
                             'with capture_output.')
        kwargs['stdout'] = PIPE
        kwargs['stderr'] = PIPE

    with Popen(*popenargs, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            if _mswindows:
                # Windows accumulates the output in a single blocking
                # read() call run on child threads, with the timeout
                # being done in a join() on those threads.  communicate()
                # _after_ kill() is required to collect that and add it
                # to the exception.
                exc.stdout, exc.stderr = process.communicate()
            else:
                # POSIX _communicate already populated the output so
                # far into the TimeoutExpired exception.
                process.wait()
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                   output=stdout, stderr=stderr)

E subprocess.CalledProcessError: Command '['/opt/conda/envs/rapids/bin/python', '/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py']' returned non-zero exit status 1.

/opt/conda/envs/rapids/lib/python3.7/subprocess.py:512: CalledProcessError
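For context, _run_notebook fails here because subprocess.check_output runs the notebook with check=True, so any non-zero exit status is re-raised as CalledProcessError. A minimal reproduction of that behavior:

    import subprocess
    import sys

    try:
        # check=True converts a non-zero exit status into an exception,
        # which is exactly how the notebook failure surfaces in pytest.
        subprocess.run([sys.executable, "-c", "raise SystemExit(1)"], check=True)
    except subprocess.CalledProcessError as exc:
        print(exc.returncode)  # 1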
----------------------------- Captured stderr call -----------------------------
distributed.worker - WARNING - Run Failed
Function: _rmm_pool
args: ()
kwargs: {}
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/worker.py", line 3546, in run
result = function(*args, **kwargs)
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
Traceback (most recent call last):
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 84, in
client.run(_rmm_pool)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2512, in run
return self.sync(self._run, function, *args, **kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 833, in sync
self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 340, in sync
raise exc.with_traceback(tb)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 324, in f
result[0] = yield future
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/gen.py", line 735, in run
value = future.result()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/client.py", line 2449, in _run
raise exc.with_traceback(tb)
File "/tmp/pytest-of-jenkins/pytest-6/test_multigpu_dask_example0/notebook.py", line 81, in _rmm_pool
initial_pool_size=None, # Use default size
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/rmm/rmm.py", line 77, in reinitialize
log_file_name=log_file_name,
File "rmm/_lib/memory_resource.pyx", line 305, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 365, in rmm._lib.memory_resource._initialize
File "rmm/_lib/memory_resource.pyx", line 64, in rmm._lib.memory_resource.PoolMemoryResource.cinit
MemoryError: std::bad_alloc: CUDA error at: ../include/rmm/mr/device/cuda_memory_resource.hpp:68: cudaErrorMemoryAllocation out of memory
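The dask workers die while building the RMM pool because initial_pool_size=None asks for the default slice of device memory on GPUs that are already partly occupied. A sketch of requesting an explicitly smaller pool instead, assuming the same rmm.reinitialize entry point the notebook uses:

    import rmm

    # Hypothetical smaller pool: reserve 1 GiB up front instead of the
    # default (half of total device memory), leaving headroom on a busy GPU.
    rmm.reinitialize(pool_allocator=True, initial_pool_size=2 ** 30)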
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46477 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44077 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41669 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80045cdc10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044fd690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044fd690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80044842d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80044fe390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80044b2810>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80044b2810>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f80045962d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58084 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54496 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37385 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 2 22 4 90% 83->85, 85->87, 89->92, 92, 96->97, 97
nvtabular/framework_utils/torch/utils.py 31 8 10 4 71% 51->52, 52, 55->56, 56-58, 61->62, 62-65, 70->50
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 16 48 9 82% 190->191, 191, 196->198, 198, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 2 40 3 97% 73->75, 75, 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 10 30 1 80% 53-57, 80->92, 117-122
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 530 1568 179 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 81.77%
=========================== short test summary info ============================
FAILED tests/unit/test_notebooks.py::test_rossman_example - subprocess.Called...
FAILED tests/unit/test_notebooks.py::test_multigpu_dask_example - subprocess....
====== 2 failed, 585 passed, 8 skipped, 273 warnings in 534.17s (0:08:54) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[... the same "--- Logging error ---" block and "Restarting worker" traceback repeat six more times as the remaining dask nanny workers shut down ...]
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4997896031306181373.sh

@jperez999
Contributor Author

rerun tests

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502, no merge conflicts.
Running as SYSTEM
Setting status of 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1161/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502^{commit} # timeout=10
Checking out Revision 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
Commit message: "udpate to pass tests"
 > git rev-list --no-walk 74a9fb0fa52e2580498c7fa5e2b7b70f271f9502 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins1502471784733242026.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.2.0
    Uninstalling nvtabular-0.2.0:
      Successfully uninstalled nvtabular-0.2.0
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeae16150>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbaeadc6dd0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
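The repeated failure is Median.finalize calling float() on a missing statistic (the x entry of dask_stats comes back null). A defensive variant of the method quoted above, as a sketch rather than the actual fix merged here:

    import math

    def finalize(self, dask_stats):
        # Sketch: skip float() on a missing median and record NaN
        # so downstream code can detect the gap instead of crashing.
        for col in dask_stats.index.values_host:
            val = dask_stats[col]
            self.medians[col] = float(val) if val is not None else math.nan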
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbb0425c090>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb0408b0d0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c4190>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae8149390>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c9990>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb04036cd0>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb781f90>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbaea4ca650>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbad47f35d0>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae80c9650>
dask_stats = x              <NA>
y      -0.005164921
id            999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbb65b71310>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb65d29610>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae80c9f50>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae815b710>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb7fbd50>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb4b589550>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbae8194250>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbae8194190>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbad47d94d0>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbad47d9f50>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7fbaeb3be390>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:90:


nvtabular/workflow.py:1087: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1128: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:896: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7fbb04217410>
dask_stats = x
y -0.005164921
id 999.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35279 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42665 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44063 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba900a1210>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c66f550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c66f550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c6570d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c6757d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c407550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c407550>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7fba6c661c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52856 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54688 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 41017 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 8 95% 71->72, 72, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 515 1568 174 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.31%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
===== 12 failed, 575 passed, 8 skipped, 273 warnings in 493.04s (0:08:13) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
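
This "--- Logging error ---" block is teardown noise rather than a test failure: Dask's nanny logs "Restarting worker" after the captured stderr stream has already been closed, and the logging module reports (rather than raises) the resulting ValueError. A minimal, self-contained repro:

    # Repro: emitting a record to a handler whose stream is already closed
    # makes logging print a "--- Logging error ---" diagnostic like the above.
    import io
    import logging

    stream = io.StringIO()
    logger = logging.getLogger("nanny-demo")
    logger.addHandler(logging.StreamHandler(stream))

    stream.close()
    logger.warning("Restarting worker")  # ValueError: I/O operation on closed file.
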
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins7537479321991169570.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 292c97338e01fb6240f9fadb41e159f94156598f, no merge conflicts.
Running as SYSTEM
Setting status of 292c97338e01fb6240f9fadb41e159f94156598f to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1163/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 292c97338e01fb6240f9fadb41e159f94156598f^{commit} # timeout=10
Checking out Revision 292c97338e01fb6240f9fadb41e159f94156598f (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 292c97338e01fb6240f9fadb41e159f94156598f # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk 8f43564392bd047e5ffd38d3acf8e26437a8ae6a # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7026001201112661322.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46261 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46667 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44797 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff11416a350>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff1140f8c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff1140f8c50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff114173fd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff1140cebd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff114103dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff114103dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7ff1143bf250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52856 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56872 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55104 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54428 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43885 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 513 1568 172 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.39%
=========== 587 passed, 8 skipped, 273 warnings in 506.65s (0:08:26) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
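
The traceback above (and its repeats below) has a single cause: dask's nanny thread logs "Restarting worker" after pytest has already closed the captured stream, and the logging module's default error handling prints this report rather than raising. A minimal reproduction sketch, independent of dask:

```python
import io
import logging

# a handler whose stream has been closed raises ValueError inside emit();
# logging then prints the "--- Logging error ---" report to stderr instead
# of propagating the exception (logging.raiseExceptions is True by default)
logger = logging.getLogger("repro")
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
stream.close()
logger.warning("Restarting worker")
```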
[the identical "--- Logging error ---" traceback repeats five more times]
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins8460091688957331325.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit a4378fb6811698e943fcfaebdb33d27397052889, no merge conflicts.
Running as SYSTEM
Setting status of a4378fb6811698e943fcfaebdb33d27397052889 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1165/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse a4378fb6811698e943fcfaebdb33d27397052889^{commit} # timeout=10
Checking out Revision a4378fb6811698e943fcfaebdb33d27397052889 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a4378fb6811698e943fcfaebdb33d27397052889 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk fc94cf4e09359f2b357978e59540799efc46d777 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins6137235349559612032.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
76 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 595 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 63%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 81%]
tests/unit/test_workflow.py ............................................ [ 88%]
..................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)
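
For reference, the replacement import the deprecation points to:

```python
from cupyx.scipy import sparse  # replaces the deprecated cupy.sparse
```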

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)
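
The fix the warning asks for is an explicit dtype wherever an empty Series can be built; a one-line sketch:

```python
import pandas as pd

mask = pd.Series([], dtype="bool")  # explicit dtype silences the warning
```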

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44861 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 44979 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 46417 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f094c113bd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c2cb690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c2cb690>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f094c141710>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f092c729d90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f094c110b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f094c110b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:310: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f092c732290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56680 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56160 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112320 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43991 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 3 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 10 112 7 96% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 37 10 6 0 63% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 3 32 3 95% 149->150, 150, 154->179, 186->187, 187, 211
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 1 2 0 96% 52
nvtabular/ops/minmax.py 30 1 2 0 97% 56
nvtabular/ops/moments.py 91 1 20 0 99% 65
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 4 96% 144->146, 173->174, 174, 178->179, 179, 240->243
nvtabular/ops/transform_operator.py 54 9 18 3 78% 37->exit, 41-44, 60-64, 86->87, 87-89, 106->107, 107
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 606 88 336 27 81% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 215->216, 216, 312->318, 318, 324->325, 325-329, 360->exit, 376->exit, 392->exit, 408->exit, 461->463, 485->484, 501-520, 526-537, 549-556, 609->612, 612, 637->638, 638, 644->647, 647, 700-703, 755->754, 809->814, 814, 817->818, 818, 863->864, 864, 922-950, 1067->1073, 1073->exit, 1115->1116, 1116, 1125->1131, 1167->1168, 1168-1170, 1174->1175, 1175, 1210->1211, 1211
setup.py 2 2 0 0 0% 18-20

TOTAL 3611 513 1568 172 82%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 82.39%
=========== 587 passed, 8 skipped, 273 warnings in 509.33s (0:08:29) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[the identical "--- Logging error ---" traceback repeats five more times]
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1397314997299959976.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 605266b87c49f852ee0f63ad642b42a6975c4c90, no merge conflicts.
Running as SYSTEM
Setting status of 605266b87c49f852ee0f63ad642b42a6975c4c90 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1195/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 605266b87c49f852ee0f63ad642b42a6975c4c90^{commit} # timeout=10
Checking out Revision 605266b87c49f852ee0f63ad642b42a6975c4c90 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 605266b87c49f852ee0f63ad642b42a6975c4c90 # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk 001ead29294cb7090fe5c7278142bfc93cd20f7d # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1418035229950289504.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35211 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35277 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 45941 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c03c390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f47687d0b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f47687d0b10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c254f10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c0b17d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f47687b4390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f47687b4390>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f478c03b250>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59784 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58276 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36817 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 50 6 16 3 83% 37->exit, 54-58, 80->81, 81-83, 100->101, 101
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 565 46 300 26 89% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 304->310, 310, 316->317, 317-321, 351->exit, 366->exit, 381->exit, 396->exit, 449->451, 467->466, 526->529, 529, 554->555, 555, 561->564, 564, 661->660, 715->720, 720, 723->724, 724, 769->770, 770, 828-856, 973->979, 979->exit, 1021->1022, 1022, 1031->1037, 1073->1074, 1074-1076, 1080->1081, 1081, 1116->1117, 1117
setup.py 2 2 0 0 0% 18-20

TOTAL 3601 489 1542 170 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.43%
=========== 588 passed, 8 skipped, 273 warnings in 508.48s (0:08:28) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
[... the same "--- Logging error ---" traceback repeats five more times as workers restart during shutdown ...]
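The `ValueError: I/O operation on closed file` blocks are shutdown noise: the dask nanny tries to log "Restarting worker" after pytest has already closed the captured streams. Tearing the cluster down explicitly before exit avoids it; a sketch:

```python
from distributed import Client, LocalCluster

cluster = LocalCluster(dashboard_address=":0")
client = Client(cluster)
try:
    pass  # run the workflow / tests against this client
finally:
    # Close workers while stdout/stderr are still open, so nanny
    # restart messages have somewhere to go.
    client.close()
    cluster.close()
```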
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins9183663288096164358.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 6892912d7042a381361fce1b6fe3c1ce3ce3c772, no merge conflicts.
Running as SYSTEM
Setting status of 6892912d7042a381361fce1b6fe3c1ce3ce3c772 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1200/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 6892912d7042a381361fce1b6fe3c1ce3ce3c772^{commit} # timeout=10
Checking out Revision 6892912d7042a381361fce1b6fe3c1ce3ce3c772 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6892912d7042a381361fce1b6fe3c1ce3ce3c772 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk 8cb254ba35024c7baff6b9b7bafc40dcac1c266f # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins7853190238873591743.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py .FFFFFFFFFFFF...... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=================================== FAILURES ===================================
_____________________ test_tf_gpu_dl[True-1-parquet-0.01] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4076483050>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400cc0fa10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
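The failing assertion shows `dask_stats['y']` coming back empty (None), so `float()` blows up. Purely as an illustration (not the fix this PR ships), a defensive guard in `Median.finalize` could look like:

```python
import math

def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
        val = dask_stats[col]
        # Hypothetical guard: skip medians that came back as None/NaN
        # rather than crashing; the real bug is upstream in the stats
        # computation that produced the empty value.
        if val is None or (isinstance(val, float) and math.isnan(val)):
            continue
        self.medians[col] = float(val)
```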
_____________________ test_tf_gpu_dl[True-1-parquet-0.06] ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_1_parquet_1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff77b0ad0>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff77b07d0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff766f210>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff766f390>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[True-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_10_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff7659bd0>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff7659ed0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f400c0ea290>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c0ea450>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[True-100-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_True_100_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = True
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4064307190>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f423bc37910>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f40643dde90>
batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f3ff77c9b90>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
_____________________ test_tf_gpu_dl[False-1-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_1_parquet1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f40643dd210>
batch_size = 1, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c7c0290>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.01] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f4243d5e510>
batch_size = 10, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f40643d3cd0>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-10-parquet-0.06] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_10_parque1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f423b952e90>
batch_size = 10, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f40643d5c10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.01] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu0')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f3ff77f1810>
batch_size = 100, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f400c71af10>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
____________________ test_tf_gpu_dl[False-100-parquet-0.06] ____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_tf_gpu_dl_False_100_parqu1')
paths = ['/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-7/parquet0/dataset-1.parquet']
use_paths = False
dataset = <nvtabular.io.dataset.Dataset object at 0x7f407d8a5290>
batch_size = 100, gpu_memory_frac = 0.06, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
def test_tf_gpu_dl(tmpdir, paths, use_paths, dataset, batch_size, gpu_memory_frac, engine):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    processor = nvt.Workflow(cat_names=cat_names, cont_names=cont_names, label_name=label_name)
    processor.add_feature([ops.FillMedian()])
    processor.add_feature(ops.Normalize())
    processor.add_preprocess(ops.Categorify())
    processor.finalize()

    data_itr = tf_dataloader.KerasSequenceLoader(
        paths if use_paths else dataset,
        cat_names=cat_names,
        cont_names=cont_names,
        batch_size=batch_size,
        buffer_size=gpu_memory_frac,
        label_names=label_name,
        engine=engine,
        shuffle=False,
    )
    _ = tf.random.uniform((1,))
  processor.update_stats(dataset)

tests/unit/test_tf_dataloader.py:101:


nvtabular/workflow.py:993: in update_stats
self.build_and_process_graph(dataset, end_phase=end_phase, record_stats=True)
nvtabular/workflow.py:1034: in build_and_process_graph
self.exec_phase(idx, record_stats=record_stats, update_ddf=(idx == (end - 1)))
nvtabular/workflow.py:802: in exec_phase
op.finalize(computed_stats)
/opt/conda/envs/rapids/lib/python3.7/contextlib.py:74: in inner
return func(*args, **kwds)


self = <nvtabular.ops.median.Median object at 0x7f423b8cb090>
dask_stats = x -0.009865141
y
id 1000.0
Name: 0.5, dtype: float64

@annotate("Median_finalize", color="green", domain="nvt_python")
def finalize(self, dask_stats):
    for col in dask_stats.index.values_host:
      self.medians[col] = float(dask_stats[col])

E TypeError: float() argument must be a string or a number, not 'NoneType'

nvtabular/ops/median.py:49: TypeError
=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 40821 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 34563 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43359 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c3b5790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c0d1b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c0d1b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c132850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f5469ac10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c143290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c143290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:302: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f3f8c1b7410>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 59592 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55832 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54912 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54236 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57824 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37697 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 12 112 8 95% 71->72, 72, 123->124, 124, 131-132, 212->214, 214, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 14 52 9 85% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 293->294, 294, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 26 0 12 1 97% 39->exit
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 565 46 300 26 89% 80->81, 81, 129->130, 130, 131->132, 132, 143->exit, 206-209, 304->310, 310, 316->317, 317-321, 351->exit, 366->exit, 381->exit, 396->exit, 449->451, 467->466, 526->529, 529, 554->555, 555, 561->564, 564, 661->660, 715->720, 720, 723->724, 724, 769->770, 770, 828-856, 973->979, 979->exit, 1021->1022, 1022, 1031->1037, 1073->1074, 1074-1076, 1080->1081, 1081, 1116->1117, 1117
setup.py 2 2 0 0 0% 18-20

TOTAL 3598 491 1540 171 83%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 83.36%
=========================== short test summary info ============================
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[True-100-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-1-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-10-parquet-0.06]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.01]
FAILED tests/unit/test_tf_dataloader.py::test_tf_gpu_dl[False-100-parquet-0.06]
===== 12 failed, 576 passed, 8 skipped, 273 warnings in 500.79s (0:08:20) ======
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
(--- Logging error --- repeated: the same 'Restarting worker' traceback occurs five more times)
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins1269299399639666014.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e, no merge conflicts.
Running as SYSTEM
Setting status of fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1201/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e^{commit} # timeout=10
Checking out Revision fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e # timeout=10
Commit message: "code format changes"
 > git rev-list --no-walk 6892912d7042a381361fce1b6fe3c1ce3ce3c772 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins3212521516033073898.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
./nvtabular/workflow.py:125:13: F841 local variable 'found' is assigned to but never used
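F841 flags a local that is bound but never read; the pattern and its fix in miniature (hypothetical code, not the actual workflow.py:125 body):

    def has_op(ops, name):
        found = any(op.name == name for op in ops)  # F841: never used below
        return True

    def has_op_fixed(ops, name):
        return any(op.name == name for op in ops)   # use (or drop) the binding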
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1999713744125321167.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit d14ebb4a72d0912c632c6488c3d9c9848284e83b, no merge conflicts.
Running as SYSTEM
Setting status of d14ebb4a72d0912c632c6488c3d9c9848284e83b to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1202/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse d14ebb4a72d0912c632c6488c3d9c9848284e83b^{commit} # timeout=10
Checking out Revision d14ebb4a72d0912c632c6488c3d9c9848284e83b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d14ebb4a72d0912c632c6488c3d9c9848284e83b # timeout=10
Commit message: "better exception message with actionable feedback"
 > git rev-list --no-walk fe55514cd2d351f5d6c6fbd5d68dd87aa40ace0e # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins7370348144888778361.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
./nvtabular/workflow.py:125:13: F841 local variable 'found' is assigned to but never used
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" 
[nvtabular_tests] $ /bin/bash /tmp/jenkins8635576175260255048.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 93f34e4e506030059ea66a262605bb0ad4e5b5f2, no merge conflicts.
Running as SYSTEM
Setting status of 93f34e4e506030059ea66a262605bb0ad4e5b5f2 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1203/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 93f34e4e506030059ea66a262605bb0ad4e5b5f2^{commit} # timeout=10
Checking out Revision 93f34e4e506030059ea66a262605bb0ad4e5b5f2 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 93f34e4e506030059ea66a262605bb0ad4e5b5f2 # timeout=10
Commit message: "flake error correction"
 > git rev-list --no-walk d14ebb4a72d0912c632c6488c3d9c9848284e83b # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2189674226408279097.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4286647271522822677.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 5173f22ae1367b38554473ca612d1af016b88bc0, no merge conflicts.
Running as SYSTEM
Setting status of 5173f22ae1367b38554473ca612d1af016b88bc0 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1205/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 5173f22ae1367b38554473ca612d1af016b88bc0^{commit} # timeout=10
Checking out Revision 5173f22ae1367b38554473ca612d1af016b88bc0 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5173f22ae1367b38554473ca612d1af016b88bc0 # timeout=10
Commit message: "code reformat"
 > git rev-list --no-walk f48281cc74ac4020d221da225b5c6f6ff36845e0 # timeout=10
First time build. Skipping changelog.
[nvtabular_tests] $ /bin/bash /tmp/jenkins1476836800848207370.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py .....................................Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5364765974075243811.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8, no merge conflicts.
Running as SYSTEM
Setting status of 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1206/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8^{commit} # timeout=10
Checking out Revision 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 # timeout=10
Commit message: "added more fixes based on comments"
 > git rev-list --no-walk 5173f22ae1367b38554473ca612d1af016b88bc0 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2157146446715089014.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py /tmp/jenkins2157146446715089014.sh: line 10: 24449 Terminated py.test --cov-config tests/unit/.coveragerc --cov-report term-missing --cov-report xml --cov-fail-under 70 --cov=. tests/unit/
Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins5908777722741407840.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit cb2c5817a1290f6b2a472c2c10df88503bd22516, no merge conflicts.
Running as SYSTEM
Setting status of cb2c5817a1290f6b2a472c2c10df88503bd22516 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1207/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse cb2c5817a1290f6b2a472c2c10df88503bd22516^{commit} # timeout=10
Checking out Revision cb2c5817a1290f6b2a472c2c10df88503bd22516 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f cb2c5817a1290f6b2a472c2c10df88503bd22516 # timeout=10
Commit message: "remove copy on list call... redundant"
 > git rev-list --no-walk 31fc0d53be25f5e5fad5b11426ad9bdcbf1e15a8 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins2808874403306199635.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py .........Build was aborted
Aborted by admin
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins2356138026806348218.sh

@nvidia-merlin-bot
Contributor

Click to view CI Results
GitHub pull request #393 of commit d0d5590d288eb2453417dd7953695ac63d876992, no merge conflicts.
Running as SYSTEM
Setting status of d0d5590d288eb2453417dd7953695ac63d876992 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1208/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse d0d5590d288eb2453417dd7953695ac63d876992^{commit} # timeout=10
Checking out Revision d0d5590d288eb2453417dd7953695ac63d876992 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d0d5590d288eb2453417dd7953695ac63d876992 # timeout=10
Commit message: "Merge branch 'main' into fixes"
 > git rev-list --no-walk cb2c5817a1290f6b2a472c2c10df88503bd22516 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins215466437062872589.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43461 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 33963 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 35671 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07642f6dd0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07642c0850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07642c0850>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07642c0110>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07641d7290>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f0764613d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f0764613d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:285: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f07842dda10>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57408 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54428 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 54496 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36619 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 28 1 12 1 95% 41->44, 44
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 528 19 270 26 94% 80->81, 81, 125->126, 126, 127->128, 128, 139->161, 161, 287->293, 293, 299->300, 300-304, 334->exit, 349->exit, 364->exit, 379->exit, 432->434, 450->449, 509->512, 512, 537->538, 538, 544->547, 547, 644->643, 698->703, 703, 706->707, 707, 752->753, 753, 914->920, 920->exit, 958->959, 959, 968->974, 1010->1011, 1011-1013, 1017->1018, 1018, 1053->1054, 1054
setup.py 2 2 0 0 0% 18-20

TOTAL 3563 463 1510 169 84%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.21%
=========== 588 passed, 8 skipped, 273 warnings in 506.77s (0:08:26) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins659260479090043931.sh

@jperez999 jperez999 requested a review from benfred November 13, 2020 03:59
current = self.create_full_col_ctx_entry(op, target_cols, extra_cols, parent=parent)
self.reduce(self.columns_ctx["full"])

def reduce(self, full_dict):
Member

Can we make it clear that these new methods on the workflow object are private, so users don't need to care about them?

Methods like reduce/transform/analyze_placement/detect_cols_collision etc. should also be prefixed with a '_', right?
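As a minimal sketch of that convention (the method names are taken from the comment above; the bodies are placeholders, not NVTabular's actual implementation), the public entry point keeps its plain name while the internal helpers gain a leading underscore:

class Workflow:
    def apply(self, dataset):
        # Public API: delegates to the underscore-prefixed internals.
        phases = self._analyze_placement(dataset)
        return self._transform(phases)

    def _analyze_placement(self, dataset):
        # Internal helper (note leading underscore); placeholder that
        # treats the whole input as a single phase.
        return [dataset]

    def _transform(self, phases):
        # Internal helper; placeholder pass-through over the phases.
        return list(phases)

Python does not enforce privacy; the leading underscore is a convention that tells users and tooling these methods are internal and may change without notice.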

@benfred benfred (Member) left a comment

+1 otherwise

@nvidia-merlin-bot (Contributor)

Click to view CI Results
GitHub pull request #393 of commit 209a58a4ea6f9d18c7ee59b199228d5d079d5798, no merge conflicts.
Running as SYSTEM
Setting status of 209a58a4ea6f9d18c7ee59b199228d5d079d5798 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/1209/ and message: 'Pending'
Using context: Jenkins Unit Test Run
Building in workspace /var/jenkins_home/workspace/nvtabular_tests
using credential nvidia-merlin-bot
Cloning the remote Git repository
Cloning repository https://github.com/NVIDIA/NVTabular.git
 > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
 > git --version # timeout=10
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git config remote.origin.url https://github.com/NVIDIA/NVTabular.git # timeout=10
Fetching upstream changes from https://github.com/NVIDIA/NVTabular.git
using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD
 > git fetch --tags --progress -- https://github.com/NVIDIA/NVTabular.git +refs/pull/393/*:refs/remotes/origin/pr/393/* # timeout=10
 > git rev-parse 209a58a4ea6f9d18c7ee59b199228d5d079d5798^{commit} # timeout=10
Checking out Revision 209a58a4ea6f9d18c7ee59b199228d5d079d5798 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 209a58a4ea6f9d18c7ee59b199228d5d079d5798 # timeout=10
Commit message: "Merge branch 'fixes' of https://github.com/jperez999/NVTabular into fixes"
 > git rev-list --no-walk d0d5590d288eb2453417dd7953695ac63d876992 # timeout=10
[nvtabular_tests] $ /bin/bash /tmp/jenkins737640288114269815.sh
Obtaining file:///var/jenkins_home/workspace/nvtabular_tests/nvtabular
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'done'
Installing collected packages: nvtabular
  Attempting uninstall: nvtabular
    Found existing installation: nvtabular 0.3.0a1
    Uninstalling nvtabular-0.3.0a1:
      Successfully uninstalled nvtabular-0.3.0a1
  Running setup.py develop for nvtabular
Successfully installed nvtabular
All done! ✨ 🍰 ✨
77 files would be left unchanged.
/var/jenkins_home/.local/lib/python3.7/site-packages/isort/main.py:125: UserWarning: Likely recursive symlink detected to /var/jenkins_home/workspace/nvtabular_tests/nvtabular/images
  warn(f"Likely recursive symlink detected to {resolved_path}")
Skipped 1 files
============================= test session starts ==============================
platform linux -- Python 3.7.8, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: setup.cfg
plugins: benchmark-3.2.3, asyncio-0.12.0, hypothesis-5.37.4, timeout-1.4.2, cov-2.10.1, forked-1.3.0, xdist-2.1.0
collected 596 items

tests/unit/test_column_similarity.py ...... [ 1%]
tests/unit/test_dask_nvt.py ............................................ [ 8%]
.......... [ 10%]
tests/unit/test_io.py .................................................. [ 18%]
........................................ssssssss [ 26%]
tests/unit/test_notebooks.py .... [ 27%]
tests/unit/test_ops.py ................................................. [ 35%]
........................................................................ [ 47%]
....................................................................... [ 59%]
tests/unit/test_s3.py .. [ 59%]
tests/unit/test_tf_dataloader.py ................... [ 62%]
tests/unit/test_tf_layers.py ........................................... [ 70%]
.................................. [ 75%]
tests/unit/test_torch_dataloader.py .............................. [ 80%]
tests/unit/test_workflow.py ............................................ [ 88%]
...................................................................... [100%]

=============================== warnings summary ===============================
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
../../../../../opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219
/opt/conda/envs/rapids/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)

tests/unit/test_column_similarity.py: 12 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cupy/sparse/__init__.py:17: DeprecationWarning: cupy.sparse is deprecated. Use cupyx.scipy.sparse instead.
warnings.warn(msg, DeprecationWarning)

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py::test_column_similarity[tfidf-True]
/opt/conda/envs/rapids/lib/python3.7/site-packages/numba/cuda/envvars.py:17: NumbaWarning:
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice/.

For more information about alternatives visit: ('https://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))

tests/unit/test_column_similarity.py: 12 warnings
tests/unit/test_dask_nvt.py: 2 warnings
tests/unit/test_io.py: 5 warnings
tests/unit/test_torch_dataloader.py: 1 warning
tests/unit/test_workflow.py: 5 warnings
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/dataframe.py:672: DeprecationWarning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.
mask = pd.Series(mask)

tests/unit/test_io.py::test_hugectr[True-0-op_columns0-parquet-hugectr]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 37361 instead
http_address["port"], self.http_server.port

tests/unit/test_io.py::test_mulifile_parquet[True-0-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-0-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-1-2-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-0-csv]
tests/unit/test_io.py::test_mulifile_parquet[True-2-2-csv]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/shuffle.py:42: DeprecationWarning: shuffle=True is deprecated. Using PER_WORKER.
warnings.warn("shuffle=True is deprecated. Using PER_WORKER.", DeprecationWarning)

tests/unit/test_notebooks.py::test_multigpu_dask_example
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 36793 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_minmax[op_columns0-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 39521 instead
http_address["port"], self.http_server.port

tests/unit/test_ops.py::test_categorify_lists[0]
tests/unit/test_ops.py::test_categorify_lists[1]
tests/unit/test_ops.py::test_categorify_lists[2]
tests/unit/test_torch_dataloader.py::test_mh_model_support
/opt/conda/envs/rapids/lib/python3.7/site-packages/cudf/core/join/join.py:368: UserWarning: can't safely cast column from right with type float64 to object, upcasting to None
"right", dtype_r, dtype_l, libcudf_join_type

tests/unit/test_tf_dataloader.py: 72 warnings
tests/unit/test_tf_layers.py: 125 warnings
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/framework/tensor_util.py:523: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
tensor_proto.tensor_content = nparray.tostring()

tests/unit/test_tf_layers.py::test_dot_product_interaction_layer[True-None-1-1]
/var/jenkins_home/.local/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:544: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
if isinstance(inputs, collections.Sequence):

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f92083ad3d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f920840fe50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f920840fe50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name0-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f92087c42d0>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names0-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f9208355d50>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f92083c7b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names0-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f92083c7b90>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.fill.FillMedian object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.normalize.Normalize object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_empty_cols[label_name1-cont_names1-cat_names1-parquet]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow.py:284: UserWarning: Did not add operators: [<nvtabular.ops.categorify.Categorify object at 0x7f9208449790>], target columns is empty.
warnings.warn(f"Did not add operators: {operators}, target columns is empty.")

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 52728 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 55832 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[None-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 57600 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-1-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58276 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-10-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 58016 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_gpu_dl[devices1-parquet-100-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 56352 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_torch_dataloader.py::test_kill_dl[parquet-1e-06]
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/parquet.py:56: UserWarning: Row group size 112640 is bigger than requested part_size 17069
f"Row group size {rg_byte_size_0} is bigger than requested part_size "

tests/unit/test_workflow.py::test_gpu_workflow_api[True-op_columns0-True-parquet-0.01]
/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/node.py:155: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 43951 instead
http_address["port"], self.http_server.port

tests/unit/test_workflow.py::test_chaining_3
/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:193: UserWarning: part_mem_fraction is ignored for DataFrame input.
warnings.warn("part_mem_fraction is ignored for DataFrame input.")

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.7.8-final-0 -----------
Name Stmts Miss Branch BrPart Cover Missing

nvtabular/__init__.py 8 0 0 0 100%
nvtabular/framework_utils/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/tensorflow/__init__.py 1 0 0 0 100%
nvtabular/framework_utils/tensorflow/feature_column_utils.py 125 117 81 0 4% 12-16, 53-251
nvtabular/framework_utils/tensorflow/layers/__init__.py 4 0 0 0 100%
nvtabular/framework_utils/tensorflow/layers/embedding.py 153 14 89 7 87% 47->56, 56, 64->45, 99->100, 100, 107->108, 108, 185->186, 186, 238-246, 249, 342->350, 364->367, 370-371, 374
nvtabular/framework_utils/tensorflow/layers/interaction.py 47 2 20 1 96% 47->48, 48, 112
nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 10 0 15% 22-23, 26-45, 56-69, 72
nvtabular/framework_utils/torch/__init__.py 0 0 0 0 100%
nvtabular/framework_utils/torch/layers/__init__.py 2 0 0 0 100%
nvtabular/framework_utils/torch/layers/embeddings.py 27 1 12 1 95% 46->47, 47
nvtabular/framework_utils/torch/models.py 38 0 22 0 100%
nvtabular/framework_utils/torch/utils.py 31 4 10 2 85% 51->52, 52, 55->56, 56-58
nvtabular/io/__init__.py 4 0 0 0 100%
nvtabular/io/avro.py 78 78 26 0 0% 16-175
nvtabular/io/csv.py 14 1 4 1 89% 35->36, 36
nvtabular/io/dask.py 80 3 32 6 92% 154->157, 164->165, 165, 169->171, 171->167, 175->176, 176, 177->178, 178
nvtabular/io/dataframe_engine.py 12 1 4 1 88% 31->32, 32
nvtabular/io/dataset.py 105 15 48 8 84% 190->191, 191, 203->204, 204, 212->213, 213, 221->244, 226->230, 230-244, 319->320, 320, 334->335, 335-336, 354->355, 355
nvtabular/io/dataset_engine.py 13 0 0 0 100%
nvtabular/io/hugectr.py 42 1 18 1 97% 64->87, 91
nvtabular/io/parquet.py 124 1 40 2 98% 87->89, 89, 183->185
nvtabular/io/shuffle.py 25 2 10 2 89% 38->39, 39, 43->46, 46
nvtabular/io/writer.py 123 9 45 2 92% 30, 47, 71->72, 72, 110, 113, 181->182, 182, 203-205
nvtabular/io/writer_factory.py 16 2 6 2 82% 31->32, 32, 49->52, 52
nvtabular/loader/__init__.py 0 0 0 0 100%
nvtabular/loader/backend.py 271 11 112 7 95% 71->72, 72, 123->124, 124, 131-132, 220->222, 258->259, 259-260, 381->382, 382, 383->386, 386-387, 480->481, 481, 488
nvtabular/loader/tensorflow.py 117 13 52 8 86% 39->40, 40-41, 51->52, 52, 59->60, 60-63, 72->73, 73, 78->83, 83, 286->287, 287, 302-304, 314->318, 346->347, 347
nvtabular/loader/tf_utils.py 55 9 20 5 81% 29->32, 32->34, 39->41, 42->43, 43, 50-51, 58-60, 65->73, 68-73
nvtabular/loader/torch.py 41 10 8 0 67% 25-27, 30-36
nvtabular/ops/__init__.py 22 0 0 0 100%
nvtabular/ops/bucketize.py 37 4 25 4 81% 33->34, 34, 35->44, 36->42, 42-44, 54->55, 55
nvtabular/ops/categorify.py 397 59 218 40 83% 160->161, 161, 169->174, 174, 184->185, 185, 200->201, 201, 235->236, 236, 285->286, 286, 373->374, 374-376, 378->379, 379, 380->381, 381, 403->406, 406, 416->417, 417, 422->426, 426, 450->451, 451-452, 454->455, 455-456, 458->459, 459-475, 477->481, 481, 485->486, 486, 487->488, 488, 495->496, 496, 497->498, 498, 503->504, 504, 513->520, 520-521, 525->526, 526, 538->539, 539, 540->544, 544, 547->565, 565-568, 591->592, 592, 595->596, 596, 597->598, 598, 605->606, 606, 607->610, 610, 717->718, 718, 719->720, 720, 751->766, 789->790, 790, 806->811, 809->810, 810, 820->817, 825->817, 832->833, 833
nvtabular/ops/clip.py 25 3 10 4 80% 52->53, 53, 61->62, 62, 66->68, 68->69, 69
nvtabular/ops/column_similarity.py 89 21 28 4 70% 171-172, 181-183, 191-207, 222->232, 224->227, 227->228, 228, 237->238, 238
nvtabular/ops/difference_lag.py 22 1 6 1 93% 75->76, 76
nvtabular/ops/dropna.py 14 0 0 0 100%
nvtabular/ops/fill.py 36 2 10 2 91% 66->67, 67, 107->108, 108
nvtabular/ops/filter.py 22 1 6 1 93% 44->45, 45
nvtabular/ops/groupby_statistics.py 83 2 32 3 96% 149->150, 150, 154->179, 186->187, 187
nvtabular/ops/hash_bucket.py 35 4 18 2 85% 98->99, 99-101, 102->105, 105
nvtabular/ops/hashed_cross.py 32 1 16 1 96% 35->36, 36
nvtabular/ops/join_external.py 66 4 26 5 90% 105->106, 106, 107->108, 108, 122->125, 125, 138->142, 178->179, 179
nvtabular/ops/join_groupby.py 56 0 18 0 100%
nvtabular/ops/lambdaop.py 27 2 10 2 89% 82->83, 83, 84->85, 85
nvtabular/ops/logop.py 17 1 4 1 90% 57->58, 58
nvtabular/ops/median.py 24 0 2 0 100%
nvtabular/ops/minmax.py 30 0 2 0 100%
nvtabular/ops/moments.py 91 0 20 0 100%
nvtabular/ops/normalize.py 49 4 14 4 84% 65->66, 66, 73->72, 122->123, 123, 132->134, 134-135
nvtabular/ops/operator.py 28 1 12 1 95% 41->44, 44
nvtabular/ops/stat_operator.py 10 0 0 0 100%
nvtabular/ops/target_encoding.py 98 2 40 3 96% 144->146, 173->174, 174, 178->179, 179
nvtabular/ops/transform_operator.py 47 6 14 2 84% 49-53, 75->76, 76-78, 95->96, 96
nvtabular/utils.py 25 5 10 5 71% 26->27, 27, 28->31, 31, 37->38, 38, 40->41, 41, 45->47, 47
nvtabular/worker.py 65 1 30 2 97% 80->92, 118->121, 121
nvtabular/workflow.py 528 19 270 26 94% 80->81, 81, 125->126, 126, 127->128, 128, 139->161, 161, 286->292, 292, 298->299, 299-303, 333->exit, 348->exit, 363->exit, 378->exit, 431->433, 449->448, 508->511, 511, 536->537, 537, 543->546, 546, 643->642, 697->702, 702, 705->706, 706, 751->752, 752, 913->919, 919->exit, 957->958, 958, 967->973, 1009->1010, 1010-1012, 1016->1017, 1017, 1052->1053, 1053
setup.py 2 2 0 0 0% 18-20

TOTAL 3563 463 1510 169 84%
Coverage XML written to file coverage.xml

Required test coverage of 70% reached. Total coverage: 84.21%
=========== 588 passed, 8 skipped, 273 warnings in 512.32s (0:08:32) ===========
--- Logging error ---
Traceback (most recent call last):
File "/opt/conda/envs/rapids/lib/python3.7/logging/init.py", line 1028, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/rapids/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/utils.py", line 417, in run_loop
loop.start()
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 149, in start
self.asyncio_loop.run_forever()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/opt/conda/envs/rapids/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/envs/rapids/lib/python3.7/site-packages/distributed/nanny.py", line 456, in _on_exit
logger.warning("Restarting worker")
Message: 'Restarting worker'
Arguments: ()
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
source activate rapids
cd /var/jenkins_home/
python test_res_push.py "https://api.github.com/repos/NVIDIA/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[nvtabular_tests] $ /bin/bash /tmp/jenkins4709658086127700440.sh

@jperez999 jperez999 merged commit 99fc9e5 into NVIDIA-Merlin:main Nov 13, 2020
mikemckiernan pushed a commit that referenced this pull request Nov 24, 2022
* various fixes for diff issues

* forward progress!

* generating correct phases

* working all the way except some dataloader fails...

* code reformat

* fixes for test fails in dataloader

* reformat codes

* update to pass tests

* add checks for workflow phases

* fixes and test case for issue 401

* code reformat

* remove extra code that was refactored out

* comments fixed

* more fixes

* code format changes

* better exception message with actionable feedback

* flake error correction

* add warning on retry set

* code reformat

* added more fixes based on comments

* remove copy on list call... redundant

* privatize new functions

Co-authored-by: Ben Frederickson <[email protected]>

Successfully merging this pull request may close these issues.

[BUG] Target Encoding Key Error
[BUG] LambdaOp + Categorify: keyerror