Stop EC2 instances and RDS/Aurora databases overnight by tagging them with cron schedules, to cut AWS costs. Trigger CloudFormation stack updates and AWS Backup, too.

sqlxpert/lights-off-aws

Lights Off!

Ever forget to turn the lights off? Now you can:

  • Stop EC2 instances and RDS/Aurora databases overnight by tagging them with cron schedules, to cut AWS costs.

  • Trigger AWS Backup with cron schedules in resource tags.

  • Delete expensive infrastructure overnight by tagging your own CloudFormation stacks with cron schedules.

  • Easily deploy this solution to multiple AWS accounts and regions.

Most of all, this solution is lightweight. Not counting blanks, comments, or tests, AWS's Instance Scheduler has over 9,500 lines of Python! At under 600 lines of Python, Lights Off is easy to understand, maintain, and extend.

Jump to: Quick Start • Tags • Schedules • Multi-Account, Multi-Region • Security

Quick Start

  1. Log in to the AWS Console as an administrator.

  2. Tag a running, non-essential EC2 instance with:

    • sched-stop : d=_ H:M=11:30, replacing 11:30 with the current UTC time + 20 minutes, rounded upward to :00, :10, :20, :30, :40, or :50.
  3. Create a CloudFormation stack. Select Upload a template file, then select Choose file and navigate to a locally-saved copy of lights_off_aws.yaml [right-click to save as...]. On the next page, set:

    • Stack name: LightsOff

    If stack creation fails with an UnreservedConcurrentExecution error...

    Request that Service Quotas → AWS services → AWS Lambda → Concurrent executions be increased. The default value is 1000.

    Lights Off needs 1 unit for a time-critical function. New AWS accounts start with a quota of 10 units, but AWS always holds back 10, which leaves 0 available!

  4. After about 20 minutes, check whether the EC2 instance is stopped. Restart it and delete the sched-stop tag.
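
The tag value from Step 2 can also be computed rather than worked out by hand. This is a minimal sketch, not part of Lights Off; the function name `quick_start_stop_tag` is hypothetical, and it assumes a timezone-aware `datetime` as input.

```python
from datetime import datetime, timedelta, timezone


def quick_start_stop_tag(now: datetime, delay_minutes: int = 20) -> str:
    """Return a sched-stop value for now + delay, rounded up to a 10-minute mark.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    target = now.astimezone(timezone.utc) + timedelta(minutes=delay_minutes)
    # Round down to :00, :10, ... :50, then step up if anything was cut off.
    floored = target.replace(
        minute=target.minute - target.minute % 10, second=0, microsecond=0
    )
    if floored < target:
        floored += timedelta(minutes=10)
    return f"d=_ H:M={floored:%H:%M}"


if __name__ == "__main__":
    print(quick_start_stop_tag(datetime.now(timezone.utc)))
```

Paste the printed value into the sched-stop tag.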

Jump to: Extra Setup • Multi-Account, Multi-Region

Tag Keys (Operations)

Resource | sched-stop, sched-start | sched-hibernate | sched-backup
EC2: Instance | ✓ | ✓ | → Image (AMI)
EC2: EBS Volume | | | → Snapshot
RDS and Aurora: Database Cluster | ✓ | | → Snapshot
RDS and Aurora: Database Instance | ✓ | | → Snapshot

Tag Values (Schedules)

Work Week Examples

These cover Monday to Friday daytime work hours, 07:30 to 19:30, year-round (see time zone converter).

Locations | Hours Saved | sched-start | sched-stop
USA Mainland | 52% | u=1 u=2 u=3 u=4 u=5 H:M=11:30 | u=2 u=3 u=4 u=5 u=6 H:M=03:30
North America (Hawaii to Newfoundland) | 42% | u=1 u=2 u=3 u=4 u=5 H:M=10:00 | u=2 u=3 u=4 u=5 u=6 H:M=05:30
Europe | 55% | u=1 u=2 u=3 u=4 u=5 H:M=04:30 | u=1 u=2 u=3 u=4 u=5 H:M=19:30
India | 64% | u=1 u=2 u=3 u=4 u=5 H:M=02:00 | u=1 u=2 u=3 u=4 u=5 H:M=14:00
North America, Europe | 28% | u=1 H:M=04:30 | u=6 H:M=05:30
North America, Europe, India | 26% | u=1 H:M=02:00 | u=6 H:M=05:30
Europe, India | 48% | u=1 u=2 u=3 u=4 u=5 H:M=02:00 | u=1 u=2 u=3 u=4 u=5 H:M=19:30
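
The Hours Saved column is consistent with taking the share of the 168-hour week during which a resource is stopped. The sketch below shows that arithmetic. The function names are hypothetical, and it assumes each start is paired with the stop that follows it within the same Monday-to-Sunday week.

```python
MINUTES_PER_WEEK = 7 * 24 * 60


def minute_of_week(iso_weekday: int, hhmm: str) -> int:
    """Minutes since Monday 00:00 UTC for an ISO weekday (1-7) and "HH:MM"."""
    hours, minutes = map(int, hhmm.split(":"))
    return ((iso_weekday - 1) * 24 + hours) * 60 + minutes


def hours_saved_percent(starts, stops) -> float:
    """Percentage of the week spent stopped.

    starts and stops are equal-length lists of (iso_weekday, "HH:MM") pairs;
    the n-th stop must come after the n-th start in the same week.
    """
    running = sum(
        minute_of_week(*stop) - minute_of_week(*start)
        for start, stop in zip(starts, stops)
    )
    return 100 * (1 - running / MINUTES_PER_WEEK)


# USA Mainland row: start Monday-Friday 11:30, stop Tuesday-Saturday 03:30
usa_starts = [(day, "11:30") for day in range(1, 6)]
usa_stops = [(day + 1, "03:30") for day in range(1, 6)]
print(round(hours_saved_percent(usa_starts, usa_stops)))  # 52
```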

Rules

  • Coordinated Universal Time (UTC)
  • 24-hour clock
  • Days before times, hours before minutes
  • The day, the hour and the minute must all be resolved
  • Multiple operations on the same resource at the same time are all canceled

Space was chosen as the separator, and underscore as the wildcard, because RDS does not allow commas or asterisks in tag values.

Single Terms

Type | Literal Values (strftime) | Wildcard
Day of month | d=01 ... d=31 | d=_
Day of week (ISO 8601) | u=1 (Monday) ... u=7 (Sunday) |
Hour | H=00 ... H=23 | H=_
Minute (multiple of 10) | M=00, M=10, M=20, M=30, M=40, M=50 |

Compound Terms

Type | Note | Literal Values
Once a day | d=_ or d=NN or u=N first! | H:M=00:00 ... H:M=23:50
Once a week | | uTH:M=1T00:00 ... uTH:M=7T23:50
Once a month | | dTH:M=01T00:00 ... dTH:M=31T23:50
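
A quick syntax check can catch malformed terms before a tag is applied. The sketch below is an independent illustration based only on the term tables in this section, not the parser that Lights Off uses; `TERM_RE` and `invalid_terms` are hypothetical names. It checks each term in isolation and does not verify term order or that day, hour and minute are all resolved.

```python
import re

DAY = r"(?:0[1-9]|[12]\d|3[01])"  # 01 ... 31
HOUR = r"(?:[01]\d|2[0-3])"       # 00 ... 23
MINUTE = r"[0-5]0"                # 00, 10, ... 50

TERM_RE = re.compile(
    rf"d=(?:_|{DAY})"                 # day of month, or wildcard
    rf"|u=[1-7]"                      # ISO 8601 day of week
    rf"|H=(?:_|{HOUR})"               # hour, or wildcard
    rf"|M={MINUTE}"                   # minute (multiple of 10)
    rf"|H:M={HOUR}:{MINUTE}"          # once a day
    rf"|uTH:M=[1-7]T{HOUR}:{MINUTE}"  # once a week
    rf"|dTH:M={DAY}T{HOUR}:{MINUTE}"  # once a month
)


def invalid_terms(tag_value: str) -> list[str]:
    """Return the space-separated terms that match none of the term forms."""
    return [term for term in tag_value.split() if not TERM_RE.fullmatch(term)]


print(invalid_terms("d=_ H:M=11:30"))     # []
print(invalid_terms("d=32 H=24 M=15"))    # ['d=32', 'H=24', 'M=15']
```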

Backup Examples

sched-backup | Description
d=01 d=15 H=03 H=19 M=00 | Traditional cron: 1st and 15th days of the month, at 03:00 and 19:00
d=_ H:M=03:00 H=_ M=10 M=40 | Every day, at 03:00 plus every hour at 10 and 40 minutes after the hour
dTH:M=01T00:00 | Start of month (instead of end of month)
dTH:M=01T03:00 uTH:M=5T19:00 d=_ H=11 M=10 | 1st day of the month at 03:00, plus Friday at 19:00, plus every day at 11:10
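
To check when a schedule would fire, the rules above can be expressed as a small matcher. This is a sketch of one possible reading of the rules (days before times, hours before minutes), not the matching logic in Lights Off. The name `schedule_matches` is hypothetical, and `when` is assumed to be a UTC time on a 10-minute boundary.

```python
from datetime import datetime, timezone


def schedule_matches(tag_value: str, when: datetime) -> bool:
    """Return True if tag_value schedules an operation at `when` (UTC)."""
    weekday = when.isoweekday()  # ISO 8601: Monday is 1, Sunday is 7
    day_terms = {"d=_", f"d={when:%d}", f"u={weekday}"}
    hour_terms = {"H=_", f"H={when:%H}"}
    full_terms = {f"dTH:M={when:%dT%H:%M}", f"uTH:M={weekday}T{when:%H:%M}"}

    day_ok = hour_ok = False
    for term in tag_value.split():
        if term in full_terms:
            return True  # once-a-week or once-a-month compound term
        if term in day_terms:
            day_ok = True
        elif day_ok and term == f"H:M={when:%H:%M}":
            return True  # once-a-day compound term, after a matching day
        elif day_ok and term in hour_terms:
            hour_ok = True
        elif hour_ok and term == f"M={when:%M}":
            return True  # day, hour and minute are all resolved
    return False


when = datetime(2025, 1, 15, 19, 0, tzinfo=timezone.utc)
print(schedule_matches("d=01 d=15 H=03 H=19 M=00", when))  # True
```

Terms that appear out of order, such as a minute before its hour, never match in this sketch.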

Extra Setup

Starting EC2 Instances with Encrypted EBS Volumes

In most cases, you can use the sched-start tag without setup.

If you use custom KMS encryption keys from a different AWS account...

The sched-start tag works for EC2 instances with EBS volumes if:

  • Your EBS volumes are unencrypted, or
  • You use the default, AWS-managed aws/ebs encryption key, or
  • You use custom keys in the same AWS account as each EC2 instance, the key policies contain the default "Enable IAM User Permissions" statement, and they do not contain "Deny" statements.

Because your custom keys are in a different AWS account than your EC2 instances, you must add a statement like the following to the key policies:

    {
      "Sid": "LightsOffEc2StartInstancesWithEncryptedEbsVolumes",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "kms:CreateGrant",
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringLike": {
          "aws:PrincipalOrgPaths": "o-ORG_ID/r-ROOT_ID/ou-PARENT_ORG_UNIT_ID/*"
        },
        "ArnLike": {
          "aws:PrincipalArn": "arn:aws:iam::ACCOUNT:role/*LightsOff*-DoLambdaFnRole-*"
        },
        "StringLike": {
          "kms:ViaService": "ec2.*.amazonaws.com"
        },
        "Bool": {
          "kms:GrantIsForAWSResource": "true"
        }
      }
    }
  • One account: Delete the entire "ForAnyValue:StringLike" section and replace ACCOUNT with the account number of the AWS account in which you have installed Lights Off.

  • AWS Organizations: Replace ACCOUNT with *, and replace o-ORG_ID, r-ROOT_ID, and ou-PARENT_ORG_UNIT_ID with the identifiers of your organization, your organization root, and the organizational unit in which you have installed Lights Off. /* at the end of this organization path stands for child OUs, if any. Do not use a path less specific than "o-ORG_ID/*".
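
The two variants described above can be generated from one template, which avoids hand-editing mistakes. This is a minimal sketch: the function name `lights_off_kms_statement` is hypothetical, the statement body is copied from the example above, and adding the result to a real key policy is left to you.

```python
import copy
import json

BASE_STATEMENT = {
    "Sid": "LightsOffEc2StartInstancesWithEncryptedEbsVolumes",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "kms:CreateGrant",
    "Resource": "*",
    "Condition": {
        "ForAnyValue:StringLike": {"aws:PrincipalOrgPaths": "PLACEHOLDER"},
        "ArnLike": {"aws:PrincipalArn": "PLACEHOLDER"},
        "StringLike": {"kms:ViaService": "ec2.*.amazonaws.com"},
        "Bool": {"kms:GrantIsForAWSResource": "true"},
    },
}


def lights_off_kms_statement(account_id=None, org_path=None) -> dict:
    """Build the key policy statement for one account OR an organization path.

    Pass exactly one of account_id (e.g. "111122223333") or org_path
    (e.g. "o-ORG_ID/r-ROOT_ID/ou-PARENT_ORG_UNIT_ID/*").
    """
    if (account_id is None) == (org_path is None):
        raise ValueError("Pass exactly one of account_id or org_path")
    statement = copy.deepcopy(BASE_STATEMENT)
    condition = statement["Condition"]
    if account_id is not None:
        # One account: no organization path condition
        del condition["ForAnyValue:StringLike"]
        account = account_id
    else:
        # AWS Organizations: any account, restricted by organization path
        condition["ForAnyValue:StringLike"]["aws:PrincipalOrgPaths"] = org_path
        account = "*"
    condition["ArnLike"]["aws:PrincipalArn"] = (
        f"arn:aws:iam::{account}:role/*LightsOff*-DoLambdaFnRole-*"
    )
    return statement


print(json.dumps(lights_off_kms_statement(account_id="111122223333"), indent=2))
```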

If an EC2 instance does not start as scheduled, a KMS key permissions error is possible.

Making Backups

You can use the sched-backup tag with minimal setup if you work in a small number of regions and/or AWS accounts. Use the AWS Console to view the list of AWS Backup vaults one time in each AWS account and region. Make one backup in each AWS account (AWS Backup → My account → Dashboard → On-demand backup). If you use custom KMS keys, they must be in the same AWS account as the disks and databases encrypted with them.

If you work across many regions and/or AWS accounts...

Because you want to use the sched-backup tag in a complex AWS environment, you must address the following AWS Backup requirements:

  1. Vault

    AWS Backup creates the Default vault the first time you open the list of vaults in a given AWS account and region, using the AWS Console. Otherwise, see Backup vault creation and AWS::Backup::BackupVault or aws_backup_vault. Update the BackupVaultName CloudFormation stack parameter if necessary.

  2. Vault policy

    If you have added "Deny" statements, be sure that DoLambdaFnRole still has access.

  3. Backup role

    AWS Backup creates AWSBackupDefaultServiceRole the first time you make a backup in a given AWS account using the AWS Console (AWS Backup → My account → Dashboard → On-demand backup). Otherwise, see Default service role for AWS Backup. Update BackupRoleName in CloudFormation if necessary.

  4. KMS key policies

    AWSBackupDefaultServiceRole works if:

    • Your EBS volumes and RDS/Aurora databases are unencrypted, or
    • You use the default, AWS-managed aws/ebs and aws/rds encryption keys, or
    • You use custom keys in the same AWS account as each disk and database, the key policies contain the default "Enable IAM User Permissions" statement, and they do not contain "Deny" statements.

    If your custom keys are in a different AWS account than your disks and databases, you must modify the key policies. See Encryption for backups in AWS Backup, How EBS uses KMS, Overview of encrypting RDS resources, and Key policies in KMS.

If no backup jobs appear in AWS Backup, or if jobs do not start, a permissions problem is likely.
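
When many accounts and regions are involved, the vault requirement can be checked with a script instead of the console. The helper below is a hedged sketch: `vault_exists` is a hypothetical name, and it expects an AWS Backup client such as the one returned by `boto3.client("backup")`, whose `list_backup_vaults` call returns `BackupVaultList` and an optional `NextToken`.

```python
def vault_exists(backup_client, vault_name: str = "Default") -> bool:
    """Return True if a vault named vault_name is listed for this client.

    backup_client is assumed to be an AWS Backup client for one AWS account
    and region, for example boto3.client("backup", region_name="us-east-1").
    """
    kwargs = {}
    while True:
        page = backup_client.list_backup_vaults(**kwargs)
        vaults = page.get("BackupVaultList", [])
        if any(vault["BackupVaultName"] == vault_name for vault in vaults):
            return True
        next_token = page.get("NextToken")
        if not next_token:
            return False
        kwargs["NextToken"] = next_token  # fetch the next page of vaults
```

Run it once per account and region, with credentials for that account, and pass the value of BackupVaultName if you changed it.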

Hidden Policies

Service and resource control policies (SCPs and RCPs), permissions boundaries, and session policies can interfere with the installation or usage of Lights Off. Check with your AWS administrator!

Accessing Backups

Goal | Services
List backups | AWS Backup
View underlying images and/or snapshots | EC2 and RDS
Restore (create new resources from) backups | EC2 and RDS, or AWS Backup
Delete backups | AWS Backup

AWS Backup copies resource tags to backups. Lights Off adds sched-time to indicate when the backup was scheduled to occur, in ISO 8601 form (example: 2024-12-31T14:00Z).
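
If you process backups in scripts, the sched-time value can be written and read with a single format string. This sketch is inferred from the example value above; the constant and function names are hypothetical.

```python
from datetime import datetime, timezone

SCHED_TIME_FORMAT = "%Y-%m-%dT%H:%MZ"  # matches the example 2024-12-31T14:00Z


def format_sched_time(when: datetime) -> str:
    """Format a timezone-aware datetime as a sched-time tag value, in UTC."""
    return when.astimezone(timezone.utc).strftime(SCHED_TIME_FORMAT)


def parse_sched_time(value: str) -> datetime:
    """Parse a sched-time tag value into a timezone-aware UTC datetime."""
    return datetime.strptime(value, SCHED_TIME_FORMAT).replace(tzinfo=timezone.utc)


print(format_sched_time(datetime(2024, 12, 31, 14, 0, tzinfo=timezone.utc)))
# 2024-12-31T14:00Z
```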

On/Off Switch

  • You can toggle the Enable parameter of your Lights Off CloudFormation stack.
  • While Enable is false, scheduled operations do not happen; they are skipped permanently.

Logging

  • Check the LightsOff CloudWatch log groups.
    • Log entries are JSON objects. Entries from Lights Off include "level", "type" and "value" keys.
    • For more data, change the LogLevel in CloudFormation.
  • Check the ErrorQueue SQS queue for undeliverable "Find" and "Do" events.
  • Check CloudTrail for the final stages of sched-start and sched-backup operations.
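
Because log entries are JSON objects, exported log lines are easy to filter. The sketch below keeps only entries that carry the three keys named above. The sample lines and their values are invented for illustration and do not reproduce real Lights Off output.

```python
import json

REQUIRED_KEYS = {"level", "type", "value"}


def lights_off_entries(lines, level=None):
    """Yield parsed entries that have "level", "type" and "value" keys.

    Lines that are not JSON objects are skipped. If level is given,
    only entries with exactly that "level" value are yielded.
    """
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict) or not REQUIRED_KEYS <= entry.keys():
            continue
        if level is None or entry["level"] == level:
            yield entry


# Hypothetical sample lines, for illustration only
sample = [
    '{"level": "INFO", "type": "EXAMPLE", "value": {"id": "i-0123"}}',
    '{"level": "ERROR", "type": "EXAMPLE", "value": "something failed"}',
    "START RequestId: not-json",
]
for entry in lights_off_entries(sample, level="ERROR"):
    print(entry["type"], entry["value"])
```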

Advanced Installation

Multi-Account, Multi-Region (CloudFormation StackSet)

For reliability, Lights Off works completely independently in each AWS account+region combination. To deploy to multiple regions and/or AWS accounts:

  1. Delete any standalone Lights Off CloudFormation stacks in the target AWS accounts and regions.

  2. Complete the prerequisites for creating a StackSet with service-managed permissions.

  3. Make sure that every target AWS account has a sufficient AWS Lambda Concurrent executions quota. See the note at the end of Quick Start Step 3.

  4. In the management AWS account (or a delegated administrator account), create a CloudFormation StackSet. Select Upload a template file, then select Choose file and upload a locally-saved copy of lights_off_aws.yaml [right-click to save as...]. On the next page, set:

    • StackSet name: LightsOff
  5. Two pages later, under Deployment targets, select Deploy to Organizational Units (OUs). Enter the AWS OU ID of the target Organizational Unit. Lights Off will be deployed to all AWS accounts within this Organizational Unit. Toward the bottom of the page, specify the target regions.

Least-Privilege Installation

You can use a CloudFormation service role to delegate only the privileges needed to create the Lights Off stack. First, create the LightsOffPrereq stack from lights_off_aws_prereq.yaml. Next, when you create the LightsOff stack from lights_off_aws.yaml, set IAM role - optional to LightsOffPrereq-DeploymentRole. If your own privileges are limited, you might need permission to pass the deployment role to CloudFormation. See the LightsOffPrereq-SampleDeploymentRolePassRolePol IAM policy for an example.

For a CloudFormation StackSet, you can use self-managed permissions by copying the inline IAM policy of LightsOffPrereq-DeploymentRole to a customer-managed IAM policy, attaching your policy to AWSCloudFormationStackSetExecutionRole and propagating the policy and the role policy attachment to all target AWS accounts.

Installation with Terraform

Terraform users often wrap a CloudFormation stack in HashiCorp Configuration Language, because AWS and other vendors supply software as CloudFormation templates. See aws_cloudformation_stack.

Wrapping a CloudFormation StackSet in HCL is a relatively easy way to deploy software to multiple AWS accounts and/or regions. See aws_cloudformation_stack_set.

Security

In accordance with the software license, nothing in this section creates a warranty, an indemnification, an assumption of liability, etc. Use this software at your own risk. You are encouraged to evaluate the source code.

Security Design Goals

  • Least-privilege roles for the AWS Lambda functions that find resources and do scheduled operations. The "Do" function is authorized to perform a small set of operations, and at that, only when a resource has the correct tag key. (AWS Backup creates backups, using a role that you can configure.)

  • A least-privilege queue policy. The operation queue can only consume messages from the "Find" function and produce messages for the "Do" function (or an error queue, if an operation fails). Encryption in transit is required.

  • Readable IAM policies, formatted as CloudFormation YAML rather than JSON, and broken down into discrete statements by service, resource or principal.

  • Optional encryption at rest with the AWS Key Management Service (KMS), for queue message bodies (may contain resource identifiers) and for logs (may contain resource metadata).

  • No data storage other than in queues and logs, with short or configurable retention periods.

  • Tolerance for clock drift in a distributed system. The "Find" function starts 1 minute into the 10-minute cycle and operation queue entries expire 9 minutes in.

  • An optional CloudFormation service role for least-privilege deployment.
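
The clock-drift allowance in the list above amounts to simple arithmetic on the 10-minute cycle. The sketch below only illustrates the stated timing (1 minute in, 9 minutes in); it is not taken from the Lights Off source, and `cycle_times` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def cycle_times(now: datetime) -> dict:
    """Return key moments of the 10-minute cycle that contains `now`."""
    cycle_start = now.replace(
        minute=now.minute - now.minute % 10, second=0, microsecond=0
    )
    return {
        "cycle_start": cycle_start,
        # The "Find" function starts 1 minute into the cycle
        "find_starts": cycle_start + timedelta(minutes=1),
        # Operation queue entries expire 9 minutes into the cycle
        "queue_entries_expire": cycle_start + timedelta(minutes=9),
    }


times = cycle_times(datetime(2024, 12, 31, 14, 7, 30, tzinfo=timezone.utc))
for name, moment in times.items():
    print(name, moment.isoformat())
```

With these numbers, a clock could be off by up to a minute in either direction without an operation landing in the wrong cycle.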

Security Steps You Can Take

  • Only allow trusted people and services to tag AWS resources. You can deny the right to add, change and delete sched- tags by including the aws:TagKeys condition key in a permissions boundary.

  • Prevent people who can set the sched-backup tag from deleting backups.

  • Prevent people from modifying components, most of which can be identified by LightsOff in ARNs and in the automatic aws:cloudformation:stack-name tag. Limiting permissions so that the deployment role is necessary for stack modifications is ideal.

  • Prevent people from directly invoking the AWS Lambda functions and from passing the function roles to arbitrary functions.

  • Log infrastructure changes using AWS CloudTrail, and set up alerts.

  • Automatically copy backups to an AWS Backup vault in an isolated account.

  • Separate production workloads. You might choose not to deploy Lights Off to AWS accounts used for production, or you might add a custom policy to the "Do" function's role, denying authority to stop production resources (AttachLocalPolicy in CloudFormation).
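
As an illustration of the first step in the list above, a deny statement keyed on aws:TagKeys might look like the output of the sketch below. Treat it as a starting point under stated assumptions: the function name and Sid are hypothetical, the action list covers only EC2 and RDS tagging calls and is not exhaustive, and you should test the statement in your own permissions boundary before relying on it.

```python
import json

# Assumption: tagging actions for EC2 and RDS only; extend for other services
DEFAULT_TAGGING_ACTIONS = [
    "ec2:CreateTags",
    "ec2:DeleteTags",
    "rds:AddTagsToResource",
    "rds:RemoveTagsFromResource",
]


def deny_sched_tag_changes(actions=None) -> dict:
    """Build a statement that denies adding or removing sched- tags."""
    return {
        "Sid": "DenySchedTagChanges",  # hypothetical Sid
        "Effect": "Deny",
        "Action": list(actions or DEFAULT_TAGGING_ACTIONS),
        "Resource": "*",
        "Condition": {
            # Matches if any tag key in the request begins with sched-
            "ForAnyValue:StringLike": {"aws:TagKeys": "sched-*"}
        },
    }


print(json.dumps(deny_sched_tag_changes(), indent=2))
```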

Advice

  • Test Lights Off in your AWS environment. Please report bugs.

  • Test your backups! Are they finishing on-schedule? Can they be restored? AWS Backup restore testing can help.

  • Be aware: of charges for AWS Lambda functions, SQS queues, CloudWatch Logs, KMS, backup storage, and early deletion from cold storage; of the minimum charge when you stop an EC2 instance or RDS database with a commercial license; of the resumption of charges when RDS or Aurora restarts a stopped database after 7 days; and of ongoing storage charges while EC2 instances and RDS/Aurora databases are stopped. Have we missed anything?

Bonus: Delete and Recreate Expensive Resources on a Schedule

Lights Off can delete and recreate many types of expensive AWS infrastructure in your own CloudFormation stacks, based on cron schedules in stack tags.

Deleting AWS Client VPN resources overnight, while developers are asleep, is a sample use case. See 10-minute AWS Client VPN.

To make your own CloudFormation template compatible, see lights_off_aws_bonus_cloudformation_example.yaml.

Not every resource needs to be deleted and recreated; condition the creation of expensive resources on the Enable parameter. In the AWS Client VPN stack, the VPN endpoints and VPC security groups are not deleted, because they do not cost anything. The VPN attachments can be deleted and recreated with no need to reconfigure VPN clients.

Set the sched-set-Enable-true and sched-set-Enable-false tags on your own CloudFormation stack. At the scheduled times, Lights Off will perform a stack update, toggling the value of the Enable parameter to true or false. (Capitalize Enable in the tag keys, to match the parameter name.)
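
To see how a tag key maps to a stack update, the sketch below turns a key such as sched-set-Enable-false into keyword arguments shaped for the CloudFormation UpdateStack API (boto3 `update_stack`). It is an illustration, not the Lights Off implementation; the function name and the stack and parameter names in the example are hypothetical, and a real call might also need a `Capabilities` argument.

```python
import re

SET_TAG_RE = re.compile(r"sched-set-(?P<param>[A-Za-z0-9]+)-(?P<value>true|false)")


def stack_update_kwargs(stack_name: str, tag_key: str, param_keys) -> dict:
    """Build update_stack keyword arguments from a sched-set-* tag key.

    param_keys lists all of the stack's parameter names; every parameter
    except the one named in the tag key keeps its previous value.
    """
    match = SET_TAG_RE.fullmatch(tag_key)
    if not match:
        raise ValueError(f"Not a sched-set tag key: {tag_key}")
    parameters = [
        {"ParameterKey": key, "ParameterValue": match["value"]}
        if key == match["param"]
        else {"ParameterKey": key, "UsePreviousValue": True}
        for key in param_keys
    ]
    return {
        "StackName": stack_name,
        "UsePreviousTemplate": True,  # only the parameter value changes
        "Parameters": parameters,
    }


print(stack_update_kwargs("MyVpnStack", "sched-set-Enable-false", ["Enable", "VpcId"]))
```

The match is case-sensitive, which is why Enable must be capitalized in the tag key.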

Extensibility

Lights Off takes advantage of patterns in boto3, the AWS software development kit (SDK) for Python, and in the underlying AWS API. Adding AWS services, resource types, and operations is easy. For example, supporting RDS database clusters (RDS database instances were already supported) required adding:

    AWSRsrcType(
      "rds",
      ("DB", "Cluster"),
      {
        ("start", ): {},
        ("stop", ): {},
        ("backup", ): {"class": AWSOpBackUp},
      },
      rsrc_id_key_suffix="Identifier",
      tags_key="TagList",
    )

Given the words DB and Cluster in the resource type name, plus the operation verb start, the sched-start tag key and the start_db_cluster method name are derived mechanically.
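
The mechanical derivation can be pictured as simple string assembly. The sketch below is an illustration only and does not reproduce the project's code; `derive_names` is a hypothetical helper. The identifier key is an inference from `rsrc_id_key_suffix="Identifier"`, and it agrees with the `DBClusterIdentifier` argument that boto3's `start_db_cluster` accepts.

```python
def derive_names(words, verb: str, id_key_suffix: str = "Identifier") -> dict:
    """Derive a tag key, boto3 method name and ID key from naming parts."""
    return {
        "tag_key": f"sched-{verb}",
        # Verb plus lower-cased resource words, joined by underscores
        "method_name": "_".join([verb] + [word.lower() for word in words]),
        # Resource words joined as-is, plus the suffix
        "id_key": "".join(words) + id_key_suffix,
    }


print(derive_names(("DB", "Cluster"), "start"))
# {'tag_key': 'sched-start', 'method_name': 'start_db_cluster', 'id_key': 'DBClusterIdentifier'}
```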

If an operation method takes more than just the resource identifier, add a dictionary of static keyword arguments. For complex arguments, sub-class the AWSOp class and override op_kwargs.

The start_backup_job method takes an Amazon Resource Name (ARN), whose format is consistent for all resource types. As long as AWS Backup supports the resource type, there is no extra work to do.

Add statements like the one below to the Identity and Access Management (IAM) policy for the role used by the "Do" AWS Lambda function, to authorize operations. You must of course authorize the role used by the "Find" function to describe (list) resources.

          - Effect: Allow
            Action: rds:StartDBCluster
            Resource: !Sub "arn:${AWS::Partition}:rds:${AWS::Region}:${AWS::AccountId}:cluster:*"
            Condition:
              StringLike: { "aws:ResourceTag/sched-start": "*" }

What capabilities would you like to add? Submit a pull request today!

Progress

Paul wrote TagSchedOps, the first version of this project, before Systems Manager, Data Lifecycle Manager or AWS Backup existed. The project remains a simple alternative to Systems Manager Automation runbooks for stopping EC2 instances, etc. It is now integrated with AWS Backup, leveraging the security and management benefits (including backup retention lifecycle policies) but offering a simple alternative to backup plans. Despite new features, the code has gotten shorter, based on GitHub LOC.

Year | AWS Lambda Python Lines | Core CloudFormation YAML Lines
2017 | ≈ 775 | ≈ 2,140
2022 | 630 | 800 ✓
2025 | 530 ✓ | 890

Dedication

This project is dedicated to ej, Marianne and Régis, and to the wonderful colleagues whom Paul has worked with over the years. Thank you to Corey for sharing the original version with the AWS user community, and to Lee for suggesting the new name.

Licenses

Scope | Link | Included Copy
Source code files, and source code embedded in documentation files | GNU General Public License (GPL) 3.0 | LICENSE-CODE.md
Documentation files (including this readme file) | GNU Free Documentation License (FDL) 1.3 | LICENSE-DOC.md

Copyright Paul Marcelin

Contact: marcelin at cmu.edu (replace "at" with @)
