Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Nightly Tests For Clojure #13624

Merged
merged 1 commit into from
Jan 22, 2019
Merged

Conversation

hellonico
Copy link
Contributor

Description

This is about running the inrtegration tests during nightly CI.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

ci/docker/runtime_functions.sh Outdated Show resolved Hide resolved
ci/jenkins/Jenkins_steps.groovy Outdated Show resolved Hide resolved
@hellonico
Copy link
Contributor Author

updated !

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Dec 12, 2018
@hellonico
Copy link
Contributor Author

  • Moved 'Clojure Nightly: CPU' to JenkinsfileForBinaries
  • Renamed main function to: unittest_ubuntu_cpu_clojure_integration_tests

@gigasquid gigasquid added Clojure pr-work-in-progress PR is still work in progress labels Dec 13, 2018
@hellonico
Copy link
Contributor Author

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-13624/4/pipeline

Let me guess ... its fails because the workspace is different in JenkinsfileForBinaries ?

@gigasquid
Copy link
Member

@hellonico - I'm still a little bit unsure of how all the CI/ Nightly tests fit together. I started another PR (temporary) to test my theories of how it all works so we can get a better understanding of it. Anything we learn from that, we can apply here. #13642

@hellonico hellonico requested a review from gigasquid as a code owner December 14, 2018 01:42
hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Dec 14, 2018
hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Dec 14, 2018
@gigasquid
Copy link
Member

@marcoabreu I think we are at a good point where we understand that the integration tests can run on a node in the regular CI. We just need to move that node over to the nightly tests. Do you think that you could make a copy of the job to point to this branch so that we can test that?

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Dec 19, 2018
@sandeep-krishnamurthy
Copy link
Contributor

@marcoabreu - can you please help here? Thanks.

@gigasquid
Copy link
Member

@marcoabreu So I've been thinking about this a bit more and have a couple of questions.

  1. It seems like other languages have integration tests that run as part of regular CI https://github.com/apache/incubator-mxnet/blob/master/ci/jenkins/Jenkins_steps.groovy (do a search for integration). Since the Clojure one does not take too long (15 min for the suite), could we have it as part of the regular CI instead of nightly?

  2. If we can't and have to do nightly, how do we get notified of breaking tests?

@stu1130
Copy link
Contributor

stu1130 commented Jan 16, 2019

@marcoabreu ping again for the update! Thanks

@gigasquid
Copy link
Member

started a dev discussion about this - https://lists.apache.org/thread.html/a22e40685d543cce4e2f4420046e8142126275888c1c15cb83a7a103@%3Cdev.mxnet.apache.org%3E

I think we can move forward on this by just adding it to CI and not nightlies right now until the notifications get implemented.

@gigasquid
Copy link
Member

@hellonico if you have some bandwidth to move this forward I think we need:

  • Merge with master
  • Remove any changes from the nightlies directory
  • Keep our CI integration tests (the new node we added and tested)
  • Tweak the integration-tests.sh to have some sort of exclude list so we have a way to easily take out tests that become problematical until we can fix them.

Thanks for your patience in figuring this out :)

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Jan 19, 2019
@hellonico
Copy link
Contributor Author

hellonico commented Jan 19, 2019

Bonjour !

Working on it ...

  • Merge with master
  • Remove any changes from the nightlies directory
  • Keep our CI integration tests (the new node we added and tested)
  • Tweak the integration-tests.sh to have some sort of exclude list so we have a way to easily take out tests that become problematical until we can fix them.
  • Update infer test cases to run in the integration phase (simply with lein test).
    - imageclassifier and predictor are ok
    - My change to draw contour boxes on the objectdetection broke the tests. Nothing is returned in detect-images-in-dir and detect-single-image, so nothing to test against. 🥇

node(NODE_LINUX_GPU) {
ws('workspace/clojure') {
utils.unpack_and_init('gpu', mx_lib)
utils.docker_run('ubuntu_nightly_gpu', 'unittest_ubuntu_cpu_clojure_integration', false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unittest_ubuntu_cpu_clojure_integration Are these CPU or GPU? maybe rename the function to XPU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larroy Merci ! Thanks for the comment ! Right now and as it is, only CPU but it's almost ready to run a few GPU tests
I don't see any reference to XPU in master, is another language already using the notation ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In c++ we use the notion for operator definitions that are applicable for CPU as well as GPU, thus called XPU. I can look for a reference if you're interested

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Jan 21, 2019
@hellonico
Copy link
Contributor Author

Hmmm ... locally, I get

Ran 2 tests containing 11 assertions.
0 failures, 0 errors.

And CI shows:

lein test :only infer.objectdetector-example-test/test-batch-detection
FAIL in (test-batch-detection) (objectdetector_example_test.clj:68)
expected: (= #{"dog" "person" "bicycle" "car"} (set (mapv :class predictions)))
  actual: (not (= #{"dog" "person" "bicycle" "car"} #{"dog" "cow" "person" "horse"}))

Will do some digging later.

@marcoabreu
Copy link
Contributor

Poor person, getting misclassified as cow 🐄 :(

Jokes aside. There's a guide in our confluence wiki under continuous integration named "reproducing test results" that allows you to locally reproduce the results. I'm on mobile right now, but let me know if you can't find it and I'll be happy to provide a link.

@hellonico
Copy link
Contributor Author

@marcoabreu poor cow as well ;)

Thanks, will dig into the wiki !

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Jan 21, 2019
@hellonico
Copy link
Contributor Author

hellonico commented Jan 21, 2019

Update infer test cases to run in the integration phase (simply with lein test).

  • imageclassifier and predictor is ok
  • fix objectdetection on local/CI
  • imageclassifier-example-test failing on CI

hellonico added a commit to hellonico/incubator-mxnet that referenced this pull request Jan 21, 2019
@hellonico
Copy link
Contributor Author

hellonico commented Jan 21, 2019

@gigasquid @marcoabreu : Thank you so much for all the help and support. I promise no cows were harmed during the making of this PR.

screen shot 2019-01-21 at 4 39 21 pm

Let's get this in !!!!!!!!!!!!!!!!!!!!!!!!!!

@gigasquid gigasquid changed the title WIP: Nightly Tests For Clojure Nightly Tests For Clojure Jan 22, 2019
@gigasquid
Copy link
Member

Thanks @hellonico - We are soo close!
There is a conflict that needs to be resolved and it looks like there is still a file change in tests/nightly/JenkinsfileForBinaries - For this PR, let's remove any changes to that file so that we keep our integration tests in the regular CI. We can revisit the nighties later after notifications are implemented.

cd ${test_dir} && lein test
done
# use AWK pattern for blacklisting
TEST_CASES=`find ${EXAMPLES_HOME} -name test | awk '!/dontselect1|dontselect2/'`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thanks :)

Copy link
Member

@gigasquid gigasquid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 💯

@gigasquid gigasquid merged commit eebdd5f into apache:master Jan 22, 2019
piyushghai added a commit to piyushghai/incubator-mxnet that referenced this pull request Jan 22, 2019
@marcoabreu
Copy link
Contributor

Thanks a lot for your great efforts @hellonico and @gigasquid!

What do you think about sending an announcement to dev@ that we are now testing Clojure examples? That's a big win!

unittest_ubuntu_cpu_clojure_integration() {
set -ex
make scalapkg USE_OPENCV=1 USE_BLAS=openblas USE_DIST_KVSTORE=1 ENABLE_TESTCOVERAGE=1
make scalainstall USE_OPENCV=1 USE_BLAS=openblas USE_DIST_KVSTORE=1 ENABLE_TESTCOVERAGE=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to inform that we plan to deprecate the make command for Scala package. Please use mvn install instead to cover the Scala portion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I'd like to keep the native build, the scala build and the scala packaging separate. Right now, this job is done on demand and this leads to GPU instances compiling code without caching. So this sounds like a move in the right direction!

stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Clojure pr-work-in-progress PR is still work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants