CSUB-1345: Migrate away from Azure in favor of Linode #5

Merged

@atodorov atodorov commented Nov 7, 2024

NOTE: opening against release-polkadot-v1.1.0-patch (instead of release-polkadot-v1.1.1-patch) b/c this is what gluwa/creditcoin3 & gluwa/creditcoin3-next are compiled against.

@atodorov atodorov force-pushed the ci/CSUB-1345-migrate-away-from-azure branch from 341e765 to 0cae581 Compare November 7, 2024 13:39
which triggers MD039/no-space-in-links error
Running on Ubuntu 24.04 - same as gluwa/creditcoin3
the original VM was

$0.56/hr in West US 3; 4 vCPU, 64 GiB RAM, Memory optimized
with 512 GB disk space
@atodorov atodorov force-pushed the ci/CSUB-1345-migrate-away-from-azure branch from 0cae581 to f45894d Compare November 7, 2024 15:35

atodorov commented Nov 7, 2024

FTR I think that the failure reported by gluwa / cargo-test is rustc running out of memory. I can also reproduce this locally.

@atodorov atodorov force-pushed the ci/CSUB-1345-migrate-away-from-azure branch 2 times, most recently from f21e224 to cbee2e5 Compare November 8, 2024 15:20

atodorov commented Nov 8, 2024

FTR at https://github.com/gluwa/polkadot-sdk/actions/runs/11741349059/job/32710035675?pr=5 I see 2 unit tests failing, which, however, pass over at https://github.com/gluwa/polkadot-sdk/actions/runs/11738728011/job/32701907195?pr=3.

From what I can tell, the starting branch/commit for both PR#3 and PR#5 is the same and there are no changes related to the polkadot-sdk code base itself. IDK what contributes to the different outcomes here, nor am I sure what we want to do with that information.

@atodorov atodorov marked this pull request as ready for review November 8, 2024 15:21

beqaabu commented Nov 13, 2024

> FTR at https://github.com/gluwa/polkadot-sdk/actions/runs/11741349059/job/32710035675?pr=5 I see 2 unit tests failing, which, however, pass over at https://github.com/gluwa/polkadot-sdk/actions/runs/11738728011/job/32701907195?pr=3.
>
> From what I can tell, the starting branch/commit for both PR#3 and PR#5 is the same and there are no changes related to the polkadot-sdk code base itself. IDK what contributes to the different outcomes here, nor am I sure what we want to do with that information.

The difference between the two runs is the --release flag.
https://github.com/gluwa/polkadot-sdk/actions/runs/11741349059/workflow?pr=5#L143 - fails
https://github.com/gluwa/polkadot-sdk/actions/runs/11738728011/workflow?pr=3#L163 - passes

I have been able to reproduce the same thing locally. Compiling without the `--release` flag makes the tests green.

.github/workflows/gluwa.yml (outdated, resolved)
what I am seeing with the move to Linode VMs is essentially this:
rust-lang/cargo#9157
rust-lang/cargo#12912

B/c these new VMs have more CPU cores, 16 (new) vs 4 (old),
compilation is faster. However, this makes cargo overzealous: it
spawns too many linker processes, which consume all of the available
memory (on a 64 GB VM) and trigger an OOM error, forcing the kernel to
kill the linker process and causing cargo to fail!

Another alternative, which works, is using `--jobs 8`; however, that is
less optimal b/c it leaves CPU capacity unused and also affects the
number of parallel threads when executing the test suite!
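
The trade-off behind a fixed `--jobs` value can be sketched as a small calculation: cap the number of parallel jobs so that their combined peak memory fits on the VM. This is only an illustration. The function name `capped_jobs` and the 8 GiB per-job figure are hypothetical, chosen so that the example lands on the `--jobs 8` value mentioned above; the real peak memory of a linker process was not measured here.

```rust
/// Caps the number of parallel jobs so that `jobs * per_job_gib` fits
/// into `total_mem_gib`. Never returns less than 1 and never more than
/// the number of available cores.
///
/// NOTE: `per_job_gib` is an assumed peak-memory figure per job,
/// not a value taken from cargo or measured on the CI machines.
pub fn capped_jobs(cores: u32, total_mem_gib: u32, per_job_gib: u32) -> u32 {
    // Guard against division by zero when the caller passes 0.
    let by_memory = total_mem_gib / per_job_gib.max(1);
    cores.min(by_memory).max(1)
}
```

With 16 cores, 64 GiB of RAM and the assumed 8 GiB per job, the cap is 8, which leaves half of the cores idle. That is the unused CPU capacity described above.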

WARNING: using `--release` is not an option because it breaks tests. The
polkadot-sdk code uses the `defensive!` macro, which is designed to panic
when running in debug mode, and multiple test scenarios rely on this
behavior via `#[should_panic]`!

WARNING: we still need the 64 GB memory!
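
To illustrate why `--release` turns such tests red, here is a minimal sketch of the pattern. It is not the polkadot-sdk implementation: `defensive_sketch!` and `checked_sub_or_zero` are hypothetical stand-ins, and the only assumption about cargo is the default profile behavior, where `debug_assertions` is enabled for debug builds and disabled for release builds. The snippet is meant to live in a library crate and be run with `cargo test`.

```rust
// Hypothetical stand-in for a debug-only "defensive" macro: it panics
// when debug assertions are enabled and only reports the problem
// otherwise. This is NOT the real polkadot-sdk macro.
macro_rules! defensive_sketch {
    ($msg:expr) => {
        if cfg!(debug_assertions) {
            panic!("defensive failure: {}", $msg);
        } else {
            eprintln!("defensive failure (continuing): {}", $msg);
        }
    };
}

/// Returns `a - b`, or 0 on underflow after reporting it defensively.
pub fn checked_sub_or_zero(a: u32, b: u32) -> u32 {
    match a.checked_sub(b) {
        Some(v) => v,
        None => {
            defensive_sketch!("subtraction underflow");
            0
        }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Passes under plain `cargo test` (debug assertions on). Under
    // `cargo test --release` the function returns 0 instead of
    // panicking, so this `#[should_panic]` test fails.
    #[test]
    #[should_panic(expected = "defensive failure")]
    fn underflow_panics_in_debug() {
        checked_sub_or_zero(1, 2);
    }
}
```

The test encodes the debug-mode panic as its expected outcome, so the same source is green without `--release` and red with it, which matches the two CI runs compared earlier in this thread.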
@atodorov atodorov force-pushed the ci/CSUB-1345-migrate-away-from-azure branch from cbee2e5 to e335c88 Compare November 14, 2024 17:33
@atodorov atodorov requested a review from beqaabu November 15, 2024 08:40
@atodorov atodorov merged commit afaa29d into release-polkadot-v1.1.0-patch Nov 15, 2024
9 of 10 checks passed
@atodorov atodorov deleted the ci/CSUB-1345-migrate-away-from-azure branch November 15, 2024 14:19