Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds tagging for sagemaker endpoints and sagemaker config. Issue #9159 #9310

Merged
merged 15 commits into from
Jan 2, 2024

Conversation

clarkh-ncino
Copy link
Contributor

@clarkh-ncino clarkh-ncino commented Aug 14, 2023

Related Issues/PRs

Resolve #9159

What changes are proposed in this pull request?

Fixing a bug where tags were not being added to sagemaker endpoints during deployment. Also added logic to add custom tags to sagemaker config resources as well.

How is this patch tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests (describe details, including test results, below)

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly in the documentation preview.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

Users of the SageMaker deployment functionality can now configure custom tags for both endpoints and endpoint configuration resources from both the CLI and Python SDK.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/gateway: AI Gateway service, Gateway client APIs, third-party Gateway integrations
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

@clarkh-ncino clarkh-ncino marked this pull request as ready for review August 14, 2023 18:10
@mlflow-automation
Copy link
Collaborator

mlflow-automation commented Aug 14, 2023

Documentation preview for 7acc172 will be available here when this CircleCI job completes successfully.

More info

@github-actions github-actions bot added area/scoring MLflow Model server, model deployment tools, Spark UDFs integrations/sagemaker Sagemaker integrations rn/bug-fix Mention under Bug Fixes in Changelogs. labels Aug 15, 2023
Copy link
Collaborator

@chenmoneygithub chenmoneygithub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Overall looks good to me! only dropped some comments on the style.

Btw, could you help edit the title to include the context of this PR? It would be easier to track in the history (after squash and merge).

def _get_sagemaker_config_tags(endpoint_name):
return [{"Key": "app_name", "Value": endpoint_name}]

def _add_sagemaker_tags(tags, config_tags):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The args are little bit confusing, it's hard to tell the diff between tags and config_tags. We can either add a docstring to explain or use a more explicit name, e.g., tags and app_name_tag.


def _add_sagemaker_tags(tags, config_tags):
"""
Convert dict of tags to list for SageMaker and adds to config tags list
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: period at the end

model = next(model for model in self.models.values() if model.arn == resource_arn)
return model.resource.tags
resource_values = getattr(self, resource_type).values()
sagemaker_resource = next(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually if we need to break inline code to multiple lines, it indicates we should replace the inline style with the normal for loop for readability. could you help refactor? thanks!

@clarkh-ncino clarkh-ncino changed the title Issue-9159 Adds tagging for sagemaker endpoints and sagemaker config. Issue #9159 Aug 16, 2023
"""
if sagemaker_tags:
sagemaker_tags = [{"Key": key, "Value": str(value)} for key, value in sagemaker_tags.items()]
config_tags.extend(sagemaker_tags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this line do?
The return fo the private function is sagemaker_tags, which is handled entirely by lines 1349 and 1350.
Was the intention to return config_tags ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds any custom tags that are passed in the initial create_deployment function (example here ) to also be added to the SageMaker configuration tags. With the current functionality, the only tags being added to the SageMaker config are manually set here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_prepare_sagemaker_tags({"a_config_tag": "a_value", "my_custom_tag": "my_custom_value"}, [{"Key": "app_name", "Value": "my_app"}])
[{'Key': 'a_config_tag', 'Value': 'a_value'},
 {'Key': 'my_custom_tag', 'Value': 'my_custom_value'}]

If this is the intended output, then there is no need to have config_tags (which is the structure [{"Key": "app_name", "Value": <endpoint_name>}] defined and returned within _get_sagemaker_config_tags)

return [{"Key": "app_name", "Value": endpoint_name}]

def _prepare_sagemaker_tags(
sagemaker_tags: Dict[str, str],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the dict sagemaker_tags contains the key "app_name"? What does AWS do with multiple conflicting tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, looks like the request would be rejected according to this documentation. I'll add a check for that.

sagemaker_tags = [{"Key": key, "Value": str(value)} for key, value in sagemaker_tags.items()]
config_tags.extend(sagemaker_tags)

return sagemaker_tags
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would something like this perhaps be a little bit more robust?

from mlflow import MlflowException
from typing import List, Dict, Optional

SAGEMAKER_APP_NAME_TAG_KEY = "app_name"

def _get_sagemaker_config_tags(endpoint_name):
    return [{"Key": SAGEMAKER_APP_NAME_TAG_KEY, "Value": endpoint_name}]

def _prepare_sagemaker_tags(
    config_tags: List[Dict[str, str]],
    sagemaker_tags: Optional[Dict[str, str]]=None,
):
    if not sagemaker_tags:
        return config_tags
    
    if SAGEMAKER_APP_NAME_TAG_KEY in sagemaker_tags:
        raise MlflowException.invalid_parameter_value(f"Duplicate tag provided for '{SAGEMAKER_APP_NAME_TAG_KEY}'")
    parsed = [{"Key": key, "Value": str(value)} for key, value in sagemaker_tags.items()]

    return config_tags + parsed

Validating the behavior:

_prepare_sagemaker_tags([{"Key": "app_name", "Value": "my_app"}], {"a": "1", "b": "2", "c": "3"})
[{'Key': 'app_name', 'Value': 'my_app'},
 {'Key': 'a', 'Value': '1'},
 {'Key': 'b', 'Value': '2'},
 {'Key': 'c', 'Value': '3'}]
_prepare_sagemaker_tags([{"Key": "app_name", "Value": "my_app"}], {"app_name": "my_better_app", "b": "2", "c": "3"})
---------------------------------------------------------------------------
MlflowException                           Traceback (most recent call last)
/var/folders/cd/n8n0rm2x53l_s0xv_j_xklb00000gp/T/ipykernel_12446/3588563336.py in <cell line: 1>()
----> 1 _prepare_sagemaker_tags([{"Key": "app_name", "Value": "my_app"}], {"app_name": "my_better_app", "b": "2", "c": "3"})

/var/folders/cd/n8n0rm2x53l_s0xv_j_xklb00000gp/T/ipykernel_12446/3590447905.py in _prepare_sagemaker_tags(config_tags, sagemaker_tags)
     14 
     15     if SAGEMAKER_APP_NAME_TAG_KEY in sagemaker_tags:
---> 16         raise MlflowException.invalid_parameter_value(f"Duplicate tag provided for '{SAGEMAKER_APP_NAME_TAG_KEY}'")
     17     parsed = [{"Key": key, "Value": str(value)} for key, value in sagemaker_tags.items()]
     18 

MlflowException: Duplicate tag provided for 'app_name'
_prepare_sagemaker_tags([{"Key": "app_name", "Value": "my_app"}], None)
_prepare_sagemaker_tags([{"Key": "app_name", "Value": "my_app"}], {})
[{'Key': 'app_name', 'Value': 'my_app'}]

Does this reflect the intended functionality for the _prepare_sagemaker_tags() function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this looks great.

Comment on lines 193 to 196
if "model" in arn:
sagemaker_resource = "models"
elif "endpoint" in arn:
sagemaker_resource = "endpoints"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be modified to a ternary operator.

model = next(model for model in self.models.values() if model.arn == resource_arn)
return model.resource.tags
resource_values = getattr(self, resource_type).values()
for sagemaker_resource in resource_values:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this being modified to a for loop? is self.resource_type.values() not accessible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not accessible because resource_type is not a valid attribute name. It's a variable containing a string value which is why line 526 was added. The original inline code was broken out to a for loop based on a previous review to improve readability.

Copy link
Member

@BenWilson2 BenWilson2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests that validate the correctness of supplied tags and that parsing is performed correctly for the conversion from dict -> AWS-specified List[Dict[str, str]] format. The current implementation seems like it is dropping data.

@BenWilson2
Copy link
Member

Hi @clarkh-ncino could you rebase to master to address the merge conflicts (and then I can initiate the CI suite for you)

clarkh-ncino and others added 9 commits August 23, 2023 13:57
Signed-off-by: Clark Hollar <[email protected]>
Signed-off-by: Clark Hollar <[email protected]>
Signed-off-by: Clark Hollar <[email protected]>
This reverts commit 3225a11.

Signed-off-by: Clark Hollar <[email protected]>
Signed-off-by: Clark Hollar <[email protected]>
@clarkh-ncino
Copy link
Contributor Author

clarkh-ncino commented Aug 24, 2023

Hi @clarkh-ncino could you rebase to master to address the merge conflicts (and then I can initiate the CI suite for you)

Should be good to go whenever you have a moment. 👍 @BenWilson2

@clarkh-ncino
Copy link
Contributor Author

@BenWilson2 @chenmoneygithub anything I can do here to help move this forward? Think we just need to kick of the CI build.

@BenWilson2
Copy link
Member

@clarkh-ncino could you rebase to master again? The issues that you're encountering should be resolved (you'll also need to fix those lint failures, though with import ordering and the black formatting issues). If you run the pre-commit hook (pre-commit run --all-files, as mentioned in https://github.com/mlflow/mlflow/blob/master/CONTRIBUTING.md#python )

@clarkh-ncino
Copy link
Contributor Author

@BenWilson2 Should be good to go now 🤞

Copy link
Member

@BenWilson2 BenWilson2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@BenWilson2 BenWilson2 merged commit e4e23c3 into mlflow:master Jan 2, 2024
annzhang-db pushed a commit to annzhang-db/mlflow that referenced this pull request Jan 3, 2024
B-Step62 pushed a commit to B-Step62/mlflow that referenced this pull request Jan 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/scoring MLflow Model server, model deployment tools, Spark UDFs integrations/sagemaker Sagemaker integrations rn/bug-fix Mention under Bug Fixes in Changelogs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Tags are not added to sagemaker endpoints during deployment.
4 participants