Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TorchScript to Knossos test with preview PyTorch on GitHub Actions #669

Merged
merged 34 commits into from
Mar 29, 2021

Conversation

cgravill
Copy link
Contributor

Prototype of porting DevOps pipeline to GitHub Action. This allows us to easily schedule the task and demos we could run more on GitHub Actions.

We're using the preview version of PyTorch due to #659

Unfortunately the PyTorch extension creation mechanism seems to leak memory - my guess is that it doesn't release modules so currently if you run all five tests the C++ build runs out of heap:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Tools\MSVC\14.28.29910\include\memory(3337): fatal error C1060: compiler is out of heap space

(running the same test by itself completes)

I've reordered the dependency install, it looks like Cabal is stopping early (before getting to Ninja). Need to investigate this.

@cgravill
Copy link
Contributor Author

@toelli-msft any thoughts if this is a good direction to keep working on?

@cgravill cgravill marked this pull request as draft March 24, 2021 10:54
@cgravill cgravill requested a review from toelli-msft March 24, 2021 10:54
@cgravill
Copy link
Contributor Author

As some extra context the virtual machines do have Visual Studio installed but not available on the path. The easiest way to do it would be to use:
https://github.com/ilammy/msvc-dev-cmd (which I've confirmed elsewhere works)
however, the security folks tend not to like 3rd party build extensions. That's why I've used the approach from:
https://github.com/microsoft/vswhere/wiki/Find-VC

@toelli-msft
Copy link
Contributor

it looks like Cabal is stopping early (before getting to Ninja)

It's refreshenv. It actually aborts the script. It should probably be replaced with call RefreshEnv.cmd. chocolatey/choco#1461 (comment)

@toelli-msft
Copy link
Contributor

Seems like a fine idea in general, although I don't understand what functionality it gives us over Azure DevOps. Is it just so that we can have scheduled jobs, or something else?

@cgravill
Copy link
Contributor Author

Seems like a fine idea in general, although I don't understand what functionality it gives us over Azure DevOps. Is it just so that we can have scheduled jobs, or something else?

It's the ease of connection from the source to the build. The connection between GitHub and DevOps via the marketplace app always seems opaque. If we have the build on GitHub it's clearer to trigger manually or on a schedule (and I suspect easier to get artifacts back out on e.g. PRs).

This was a specific build, useful by itself, but also demonstrates we could port the other builds over - I was going to get some experience and see how it goes before proposing more though.

Alternatively I can try to see if we can get the DevOps schedules https://docs.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml working without it becoming a compulsory check for PRs.

@cgravill cgravill mentioned this pull request Mar 26, 2021
8 tasks
@cgravill
Copy link
Contributor Author

Discussed this more, think it makes sense to experiment with this initial nightly build then consider porting more over to GitHub Actions later. Something to investigate is how much build time and parallelism we have on Github Actions.

avoid early exit
@cgravill
Copy link
Contributor Author

cgravill commented Mar 26, 2021

For information, the early Cabal exit was caused by the refreshenv (which itself is a .cmd) not having a call. Fixed now. Left Ninja earlier.

@cgravill cgravill changed the title RFC: TorchScript to Knossos tests with preview PyTorch on GitHub Actions TorchScript to Knossos test with preview PyTorch on GitHub Actions Mar 26, 2021
@cgravill cgravill requested a review from awf March 26, 2021 10:08
@cgravill cgravill marked this pull request as ready for review March 26, 2021 10:08
@cgravill
Copy link
Contributor Author

It's a bit messy with the out heap on multiple tests #679. In spite of that I propose to merge this with a single test being run. That gives us a a working CI build with Windows against the preview of PyTorch. If something gets broken before the maintenance 1.8.1 release we want to know ASAP as we need a working stable release. I'll continue to investigate the memory issue.

Copy link
Contributor

@awf awf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For windows-only scripts, I would be inclined to prefer PowerShell. It's less opaque to folk coming from real programming languages.

@cgravill cgravill merged commit 5f32322 into master Mar 29, 2021
@toelli-msft toelli-msft deleted the cogravil/pytorch_preview_ci branch March 29, 2021 11:12
@cgravill
Copy link
Contributor Author

cgravill commented Mar 29, 2021

For windows-only scripts, I would be inclined to prefer PowerShell. It's less opaque to folk coming from real programming languages.

Agreed, I think that would be a good move as a separate change #686

Part of me would like to go further and it being a good real programming language configured for builds e.g. https://fake.build or Shake as @toelli-msft mentions here:
#622 (comment)

As an anecdotal experience from other projects then moving from a mix of YAML and shell scripts to F# via fake.build made things much clearer and referenceable (reducing duplication).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants