There are different approaches for scaling a system. Traditionally systems have used, what we call an Infrastructure First approach, where the system focuses on infrastructure metrics such as CPU or Memory usage, and scales up the cluster infrastructure. In this case the application scales up following the metrics derived from the infrastructure.
While you can still use that approach on ECS, ECS follows an Application First scaling approach, where the scaling is based on the number of desired. ECS has two type of scaling activities:
ECS Service / Application Scaling: This refers to the ability to increase or decrease the desired count of tasks in your Amazon ECS service based on dynamic traffic and load patterns in the workload. Amazon ECS publishes CloudWatch metrics with your service’s average CPU and memory usage. You can use these and other CloudWatch metrics to scale out your service (add more tasks) to deal with high demand at peak times, and to scale in your service (run fewer tasks) to reduce costs during periods of low utilization.
ECS Container Instances Scaling: This refers to the ability to increase or decrease the desired count of EC2 instances in your Amazon ECS cluster based on ECS Service / Application scaling. For this kind of scaling, it is typical practice depending upon Auto Scaling group level scaling policies.
To scale the infrastructure using the Application First approach on ECS, we will use Amazon ECS cluster Capacity Providers to determine the infrastructure in use for our tasks and we will use Amazon ECS Cluster Auto Scaling (CAS) to enables to manage the scale of the cluster according to the application needs.
Capacity Providers configuration include:
Each ECS cluster can have one or more capacity providers and an optional default capacity provider strategy. For an ECS Cluster there is a Default capacity provider strategy that can be set for Newly created tasks or services on the cluster that are created without an explicit strategy. Otherwise, for those services or tasks where the default capacity provider strategy does not meet your needs you can define a capacity provider strategy that is specific for that service or task.
You can read more about Capacity Provider Strategies here
When enabling managed scaling Amazon ECS manages the scale-in and scale-out actions of the Auto Scaling group. This is what we call ECS Cluster Auto Scaling (CAS). CAS is a new capability for ECS to manage the scaling of EC2 Auto Scaling groups (ASG). CAS relies on ECS capacity providers.
Amazon ECS creates an AWS Auto Scaling scaling plan with a target tracking scaling policy based on the target capacity value you specify. Amazon ECS then associates this scaling plan with your Auto Scaling group. For each of the capacity providers with managed scaling enabled, an Amazon ECS managed CloudWatch metric with the prefix AWS/ECS/ManagedScaling
is created along with two CloudWatch alarms. The CloudWatch metrics and alarms used to monitor the container instance capacity in your Auto Scaling groups and will trigger the Auto Scaling group to scale in and scale out as needed.
The scaling policy uses a new CloudWatch metric called CapacityProviderReservation that ECS publishes for every ASG capacity provider that has managed scaling enabled. The new CloudWatch metric CapacityProviderReservation is defined as follows.
CapacityProviderReservation = ( M / N ) x 100
Where:
Given this assumption, if N = M, scaling out not required, and scaling in isn’t possible. If N < M, scale out is required because you don’t have enough instances. If N > M, scale in is possible (but not necessarily required) because you have more instances than you need to run all of your ECS tasks.The CapacityProviderReservation metric is a relative proportion of Target capacity value and dictates how much scale-out / scale-in should happen. CAS always tries to ensure CapacityProviderReservation is equal to specified Target capacity value either by increasing or decreasing number of instances in ASG.
The scale-out activity is triggered if CapacityProviderReservation
> Target capacity
for 1 datapoints with 1 minute duration. That means it takes 1 minute to scale out the capacity in the ASG. The scale-in activity is triggered if CapacityProviderReservation < Target capacity for 15 data points with 1 minute duration. We will see all of this demonstrated in this workshop.
You can read more about ECS Cluster Auto Scaling (CAS) and how it works under different scenarios and conditions in this blog post