Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the option to specify a machine_type in software for gcloud build #229

Merged
merged 3 commits into from
Jul 5, 2022

Conversation

KevinCLydon
Copy link
Member

Google Cloud Build allows specifying a machine type to use for running the build. It has become clear that it is necessary to allow the user to specify this for their software builds because the recent gatk builds have failed because of OOM errors.

This is implemented by added a machine_type column to software, which corresponds to the different options for machine types allowed by google cloud build. If a value other than standard is specified when creating/updating a software, builds for that software will use the specified machine type.

…machine_type param when triggering google cloud build
@KevinCLydon KevinCLydon requested a review from jamesemery June 16, 2022 17:21
Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KevinCLydon I have a few comments/ questions.

"--machine_type",
default="",
help="Optional machine type for Google Cloud Build to use for building this software.",
type=click.Choice(["standard", "n1-highcpu-8", "n1-highcpu-32", ''], case_sensitive=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What machine is "standard"?

What's the deal with the e2 machines?
https://cloud.google.com/build/pricing

Copy link
Member Author

@KevinCLydon KevinCLydon Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default machine is "e2-medium". Maybe I should refer to it as "default" instead of "standard"?

Also, it turns out I was using an old version of the gcloud docker, so that's why I couldn't get the e2 machines to work. I'm working on a commit that'll add those.

It might be good to have some indication that the default is e2-medium, but you can't actually choose e2-medium from the command line, so allowing the user to choose e2-medium in carrot would just translate to whatever the default is in google cloud, which might change. Thoughts?

gcloud builds submit --tag ~{registry_host}/~{software_name}:~{commit_hash} --timeout=24h
if [ -z ~{machine_type} ]
then
gcloud builds submit --tag ~{registry_host}/~{software_name}:~{commit_hash} --timeout=24h
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this switch here, why not add a default option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A default option for machine_type? It seems like you can only get the default machine by not using the --machine-type flag, so I think I need the switch.

///
/// Google's Cloud Build docs explain the machine-type argument here:
/// https://cloud.google.com/sdk/gcloud/reference/builds/submit#--machine-type
/// Notably, it mentions two other machine types: e2-highcpu-8 and e2-highcpu-32, but, when I
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, wierd.

#[derive(Debug, PartialEq, DbEnum, Serialize, Deserialize, Clone, Copy)]
#[DieselType = "Machine_type_enum"]
pub enum MachineTypeEnum {
#[serde(rename = "n1-highcpu-8")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this store the value in the DB with - instead of as an identifier? Why is that desireable/necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version with dashes is the format google cloud build actually uses, so I wanted what's in the database to match that, so using the value from the database is easier and users see the actual value that is submitted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes a lot of sense.

@@ -12,7 +12,7 @@ use crate::models::software_version::{
use crate::requests::cromwell_requests::{CromwellClient, CromwellRequestError};
use crate::util::temp_storage;
use diesel::PgConnection;
use serde_json::json;
use serde_json::{json, Value};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this change about?

"--machine_type",
default="",
help="Optional machine type for Google Cloud Build to use for building this software.",
type=click.Choice(["standard", "n1-highcpu-8", "n1-highcpu-32", ''], case_sensitive=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably add the actual machine type for standard as well if that's possible so people can choose between "Give me the default" and "Give me this specific small machine" in case the default changes in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I can make that change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out you cannot actually supply the default machine type to the gcloud builds --machine-type build, so I'm not sure if letting the user choose it here would actually be meaningful?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bizarre, lets leave it as is then

@KevinCLydon
Copy link
Member Author

KevinCLydon commented Jun 24, 2022

@lbergelson @jamesemery @droazen I'm curious what y'all have for opinions on this, if any: I limited the values for machine-type to only the values I've actually gotten to work: standard (which is allegedly e2-medium), n1-highcpu-8, and n1-highcpu-32. According to Google's documentation, there should be 2 more machines (e2-highcpu-8 and e2-highcpu-32), but I got errors when I tried to use them.

There is also the option in google cloud build to set up a private pool of machines (maybe not a bad option for us if we mainly have need of high memory machines, not high cpu) and that provides a longer list of options.

Should I modify this to allow specifying all of the options in the full private pool list? Maybe then I can include messages in the CLI or somewhere to indicate the differences in requirements for the different options?

It maybe should also be noted that this was originally intended as a quick fix for carrot tests failing in gatk because the builds were using too much memory.

@lbergelson
Copy link
Member

@KevinCLydon I'm in favor of allowing the additional options, or even any potential input to future proof. I think it also makes sense to add a warning message when using ones you can't vouch for. Since this is meant to be a quick fix I don't think either change is necessary at this time but might be worth doing in the future.

…ker build wdl to support the up-to-date list of machine types
@KevinCLydon KevinCLydon assigned lbergelson and unassigned jamesemery Jul 5, 2022
Copy link
Member

@lbergelson lbergelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KevinCLydon Looks Good to Me

@KevinCLydon KevinCLydon merged commit 3959b68 into master Jul 5, 2022
@KevinCLydon KevinCLydon deleted the kl_software_machine_type branch July 5, 2022 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants