AWS Batch Operational Dashboard is a code sample that deploys a solution for visualizing the Amazon EC2 and container resources used by AWS Batch jobs.
The solution relies on a serverless architecture to create a Grafana dashboard that visualizes compute and memory usage by AWS Batch jobs. It gives job-level insight into how Amazon EC2 resources are used.
This application is designed to be scalable: it collects data from events and API calls using Amazon EventBridge and does not make API calls to describe your resources. Data collected through events and API calls is partially aggregated in DynamoDB to cross-reference information and generate Amazon CloudWatch metrics with the Embedded Metric Format. The application also deploys several dashboards displaying the job states, the Amazon EC2 instances belonging to your Amazon ECS clusters (AWS Batch compute environments), and the Auto Scaling groups (ASGs) across Availability Zones.
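To illustrate the Embedded Metric Format mentioned above, here is a minimal sketch of a single EMF log line built in the shell. When such a line reaches CloudWatch Logs, CloudWatch extracts the declared metric from it. The namespace `ExampleBatchDashboard`, the dimension `JobQueue`, and the metric `CpuUtilization` are hypothetical placeholders, not necessarily the names this application uses.

```shell
# Print one Embedded Metric Format (EMF) log line to stdout.
# Namespace, dimension, and metric names are hypothetical placeholders.
# $1: job queue name (dimension value); $2: numeric metric value.
emit_emf_metric() {
  # EMF expects the timestamp in milliseconds since the Unix epoch.
  timestamp_ms="$(( $(date +%s) * 1000 ))"

  printf '{"_aws":{"Timestamp":%s,"CloudWatchMetrics":[{"Namespace":"ExampleBatchDashboard","Dimensions":[["JobQueue"]],"Metrics":[{"Name":"CpuUtilization","Unit":"Percent"}]}]},"JobQueue":"%s","CpuUtilization":%s}\n' \
    "${timestamp_ms}" "$1" "$2"
}

# Example call with made-up values.
emit_emf_metric "example-queue" 42.5
```

The `_aws` object carries the metric metadata, while the dimension and metric values sit at the top level of the same JSON document under the names declared in the metadata.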
Install AWS Serverless Application Model Command Line Interface (AWS SAM CLI) version >=1.72.0 by following the instructions
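One way to confirm that the installed AWS SAM CLI meets the minimum version is sketched below. `version_at_least` is a hypothetical helper, and the sketch assumes that `sort -V` (version sort) is available and that `sam --version` prints the version number as its last word.

```shell
# Return 0 when version $1 is greater than or equal to version $2.
# Hypothetical helper; relies on `sort -V` to order version strings.
version_at_least() {
  # If the minimum version sorts first (or both are equal), $1 is recent enough.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check when the sam command is installed.
if command -v sam >/dev/null 2>&1; then
  installed="$(sam --version | awk '{print $NF}')"
  if version_at_least "${installed}" "1.72.0"; then
    echo "AWS SAM CLI ${installed} is recent enough"
  else
    echo "AWS SAM CLI ${installed} is older than 1.72.0" >&2
  fi
fi
```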
The dashboard lets you visualize each AWS Batch job's status, start and stop times, job queue, instance type, instance ID, Availability Zone, and associated Amazon CloudWatch logs. In addition, you can navigate through time and focus on a specific AWS Batch job to observe the Amazon EC2 CPU and memory usage, the container CPU and memory requested and used, and the EBS operations related to the job.
The architecture tracks AWS Batch job events through Amazon EventBridge. The events are routed to a Step Functions state machine that stores the job states, Availability Zones, instance type, instance ID, instance pricing model, and log stream in a DynamoDB table. You can visualize the results using Amazon Managed Grafana through Amazon Athena.
The deployment of the dashboard is composed of four steps.
Amazon Managed Grafana relies on single sign-on using your organization’s identity provider to authenticate users. The following steps guide you through setting up AWS Organizations and AWS IAM Identity Center.
NOTE: If you already have AWS Organizations and AWS IAM Identity Center set up, you can skip these steps.
- Open AWS Organizations.
- Choose Create an Organization. By default, the organization is created with all features enabled.
- The organization is created and the AWS accounts page appears. The only account present is your management account, and it's currently under the root organizational unit (OU).
- Open AWS IAM Identity Center.
- Choose Enable.
To deploy the serverless application, run the following in your shell:
sam build
sam deploy --stack-name ${BATCH_DASHBOARD_NAME} \
--guided \
Follow the instructions and fill the parameters.
Once the application is deployed, retrieve the Amazon Managed Grafana workspace ID.
GRAFANA_ID=`sam list stack-outputs --stack-name ${BATCH_DASHBOARD_NAME} \
--output json | \
jq -r '.[] | select(.OutputKey=="GrafanaWorkspaceId") | .OutputValue'`
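When the stack output key does not exist, the `jq` filter above prints nothing, and later commands would then fail with a less obvious error. A small guard can catch this early. The sketch below uses a hypothetical helper named `require_value`; it is not part of the SAM CLI.

```shell
# Fail when a value extracted by jq is empty or the literal string "null".
# Hypothetical helper. $1: output key name (for the message); $2: extracted value.
require_value() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "Stack output $1 not found; check the stack name and deployment status" >&2
    return 1
  fi
}
```

For example, `require_value GrafanaWorkspaceId "${GRAFANA_ID}"` returns a non-zero status and prints a message when `GRAFANA_ID` is empty.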
Amazon Managed Grafana integrates with AWS IAM Identity Center to provide identity federation. Federation provides the users and groups that are granted access to Amazon Managed Grafana as Viewer, Editor, or Admin. The following steps guide you through creating a viewer group and an admin group.
- Open the AWS IAM Identity Center settings.
- Copy the Identity store ID from the Identity store tab. You will use it in the next step.
Set the identity store ID to the value copied in the previous step.
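A minimal sketch of this step, assuming the ID is kept in the `IDENTITY_STORE` variable used by the commands that follow. The value `d-1234567890` is a hypothetical placeholder, and the format check assumes the common form of `d-` followed by ten lowercase hexadecimal characters; your ID may use a different form.

```shell
# Hypothetical placeholder: replace with the Identity store ID copied from the console.
IDENTITY_STORE="d-1234567890"

# Sanity check, assuming the "d-" plus ten hexadecimal characters form.
is_identity_store_id() {
  printf '%s\n' "$1" | grep -Eq '^d-[0-9a-f]{10}$'
}

if ! is_identity_store_id "${IDENTITY_STORE}"; then
  echo "IDENTITY_STORE does not look like an Identity store ID: ${IDENTITY_STORE}" >&2
fi
```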
Create the Grafana admin and viewer groups.
ADMIN_GROUP=`aws identitystore create-group --identity-store-id ${IDENTITY_STORE} \
--display-name 'grafana-batch-op-dashboard-admin' \
--query GroupId \
--output text`
VIEWER_GROUP=`aws identitystore create-group --identity-store-id ${IDENTITY_STORE} \
--display-name 'grafana-batch-op-dashboard-viewer' \
--query GroupId \
--output text`
Create a user. Replace YOUR_EMAIL_ADDRESS with an address you can receive mail at.
USER_ID=`aws identitystore create-user --identity-store-id ${IDENTITY_STORE} \
--user-name 'johndoe' \
--display-name 'John' \
--name Formatted=string,FamilyName=Doe,GivenName=John \
--emails Value=YOUR_EMAIL_ADDRESS,Type=string,Primary=True \
--query UserId \
--output text`
Add the user to a group, here the admin group.
aws identitystore create-group-membership --identity-store-id ${IDENTITY_STORE} \
--group-id ${ADMIN_GROUP} \
--member-id UserId=${USER_ID}
First, add the groups created previously to the Amazon Managed Grafana workspace.
aws grafana update-permissions --workspace-id ${GRAFANA_ID} \
--update-instruction-batch \
aws grafana update-permissions --workspace-id ${GRAFANA_ID} \
--update-instruction-batch \
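For reference, `--update-instruction-batch` takes a list of instructions, each with an `action`, a `role`, and a list of `users` identified by `id` and `type`. The sketch below builds such a list as JSON for one IAM Identity Center group. `grafana_add_group_instruction` is a hypothetical helper and the group ID in the example call is made up; treat this as an illustration of the argument shape rather than the exact commands of this repository.

```shell
# Build a one-element instruction list as JSON for --update-instruction-batch.
# Hypothetical helper. $1: role (ADMIN, EDITOR, or VIEWER); $2: IAM Identity Center group ID.
grafana_add_group_instruction() {
  printf '[{"action":"ADD","role":"%s","users":[{"id":"%s","type":"SSO_GROUP"}]}]\n' "$1" "$2"
}

# Example with a made-up group ID. The output can be passed to the CLI, for instance:
#   aws grafana update-permissions --workspace-id "${GRAFANA_ID}" \
#     --update-instruction-batch "$(grafana_add_group_instruction ADMIN "${ADMIN_GROUP}")"
grafana_add_group_instruction ADMIN "11111111-2222-3333-4444-555555555555"
```

Building the argument as JSON keeps the quoting in one place; the same helper can be called with `VIEWER` and `${VIEWER_GROUP}` for the viewer group.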
Now let's get the URL to access the dashboard.
aws grafana describe-workspace --workspace-id ${GRAFANA_ID} \
--query workspace.endpoint \
--output text
The command returns a URL that you paste into your web browser.
You will be prompted to log in with the user name created earlier. At user creation, each user receives an initial password by email.
Use the password from the email associated with your user name to log in.
Once connected as administrator, start by setting up the data sources.
Since version 9.4, Amazon Managed Grafana comes with the Prometheus and CloudWatch plugins enabled by default. Other plugins, such as Amazon Athena, need to be installed with the following procedure:
- Select the hamburger menu on the left pane.
- Expand Administration.
- Choose Plugins.
- Search Amazon Athena.
- Choose Install.
- Select the hamburger menu on the left pane.
- Expand Administration.
- Choose Data sources.
- Choose Add new data source.
- Select CloudWatch.
- On the Default Region menu, choose your AWS Region.
- Choose Save & test.
Before starting, you will retrieve the S3 bucket name created to store the AWS Batch jobs data through Amazon Athena.
In a terminal:
sam list stack-outputs --stack-name ${BATCH_DASHBOARD_NAME} \
--output json | \
jq -r '.[] | select(.OutputKey=="AthenaSpillBucket") | .OutputValue'
Copy the output; you will use it when setting up the Amazon Athena data source in Amazon Managed Grafana.
In the Amazon Managed Grafana dashboard:
- Select the hamburger menu on the left pane.
- Expand Administration.
- Choose Data sources.
- Choose Add new data source.
- Choose Amazon Athena.
- On the Default Region menu, choose your AWS Region.
- On the Data source menu, choose aws-batch-jobs-data.
- On the Database menu, choose default.
- On the Workgroup menu, choose batch-wg.
- On the Output Location menu, paste the bucket value in the form s3://DOC-EXAMPLE-BUCKET.
- Choose Save & test.
To create a dashboard in Amazon Managed Grafana for AWS Batch, start from the template provided in this repository to generate a dashboard for your environment.
In a terminal, run the following commands:
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
TABLE=`sam list stack-outputs --stack-name ${BATCH_DASHBOARD_NAME} \
--output json | \
jq -r '.[] | select(.OutputKey=="AthenaDataSource") | .OutputValue'`
python3 ./ --table ${TABLE}
Once you have created your dashboard in JSON format, import it into Amazon Managed Grafana:
- Select the hamburger menu on the left pane.
- Choose Dashboards.
- Choose New on the right side.
- Choose Import.
- Choose Upload dashboard JSON file, then select the generated JSON file.
- Choose Load.
- Select the Athena and CloudWatch data sources you created previously.
- Choose Import.
You will be redirected to the dashboard you imported. Once your first AWS Batch jobs are running, you will see their data in the dashboard.
To delete the SAM application deployment, you can use the terminal and enter:
sam delete
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.