Autoscaling rules
In this unit, we look at autoscaling rule concepts. Then, in the next exercise, we trigger the rules.
Rules
Autoscaling is based on a set of scale conditions, rules, and limits. A scale condition combines time and a set of scale rules. If the current time falls within the period defined in the scale condition, the condition's scale rules are evaluated. The results of this evaluation determine whether to add or remove instances. The scale condition also defines the limits of scaling for the maximum and minimum number of instances.
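A scale condition can be pictured as a time window plus instance limits. The following Python sketch is only an illustration of that idea and is not the Azure API: the `ScaleCondition` class, its field names, and the `business_hours` values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ScaleCondition:
    """Hypothetical model of a scale condition: a daily time window plus limits."""
    start: time
    end: time
    min_instances: int
    max_instances: int

    def is_active(self, now: time) -> bool:
        # The condition's rules are evaluated only inside its time window.
        return self.start <= now < self.end

    def clamp(self, desired: int) -> int:
        # Whatever the rules ask for, the result stays within the limits.
        return max(self.min_instances, min(self.max_instances, desired))


# Illustrative values only.
business_hours = ScaleCondition(time(8, 0), time(18, 0),
                                min_instances=2, max_instances=10)
```

In this sketch, a request for 15 instances during business hours is capped at 10, and a request for 1 instance is raised to 2.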
Azure Spring Apps autoscaling allows you to scale the number of running instances out or in, based on metrics. The autoscaling rules process these metrics. You can create complex overlapping rules as needed for your situation.
Autoscale conditions
There are two rule types:
- Metric-based rules
- Schedule-based rules
With metric-based rules, your apps and resources are scaled out horizontally as far as needed to handle the load, without exceeding the maximum limits that you establish. Similarly, they are scaled in to the level needed to support your load, without falling below the minimums that you set.
With schedule-based rules, your apps are scaled in and out based on your predefined schedule and limits. This rule type is useful for workloads that follow a predictable pattern, and for establishing a baseline for further metric-based scaling.
You can create multiple autoscale conditions to handle different schedules and metrics. Azure autoscales your service when any of these conditions apply. You can also define a default condition to be used if none of the other conditions are applicable. This condition is always active and doesn't have a schedule.
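The fallback to a default condition can be sketched as follows. This is a simplified illustration, not the service's actual selection logic: the `Condition` type and `pick_condition` function are hypothetical, and the "first matching schedule wins" order is an assumption made for the example.

```python
from datetime import time
from typing import NamedTuple, Optional


class Condition(NamedTuple):
    """Hypothetical autoscale condition; the default has no schedule."""
    name: str
    start: Optional[time]
    end: Optional[time]


def pick_condition(scheduled, default, now):
    # Assumption: the first scheduled condition whose window contains
    # the current time is used.
    for condition in scheduled:
        if condition.start <= now < condition.end:
            return condition
    # No schedule applies, so the always-active default is used.
    return default


# Illustrative values only.
scheduled = [
    Condition("business-hours", time(8), time(18)),
    Condition("nightly-batch", time(22), time(23, 30)),
]
default = Condition("default", None, None)
```

With these values, 09:00 selects `business-hours`, 22:15 selects `nightly-batch`, and 20:00 falls through to `default`.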
Autoscale actions
When an autoscale rule detects that a metric has crossed a threshold, it performs an autoscale action. An autoscale action can be scale-out or scale-in. A scale-out action increases the number of instances, and a scale-in action reduces the instance count. An autoscale action uses an operator (such as less than, greater than, or equal to) to determine how to react to the threshold. Scale-out actions typically use the greater than operator to compare the metric value to the threshold. Scale-in actions tend to use the less than operator for the same comparison. An autoscale action can also set the instance count to a specific level, rather than incrementing or decrementing the number available.
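The relationship between operator, threshold, and action can be sketched in a few lines of Python. The `ScaleRule` class, the operator names, and the action names (`increase`, `decrease`, `set`) are hypothetical labels chosen for this example. They are not taken from the Azure API.

```python
import operator
from dataclasses import dataclass

# Hypothetical operator names mapped to Python comparison functions.
OPERATORS = {
    "GreaterThan": operator.gt,
    "LessThan": operator.lt,
    "Equals": operator.eq,
}


@dataclass
class ScaleRule:
    op: str           # key into OPERATORS
    threshold: float  # metric value that triggers the rule
    action: str       # "increase", "decrease", or "set"
    value: int        # step size, or the exact count for "set"

    def apply(self, metric: float, current: int) -> int:
        """Return the instance count after evaluating this rule."""
        if not OPERATORS[self.op](metric, self.threshold):
            return current  # threshold not crossed, no action
        if self.action == "increase":
            return current + self.value
        if self.action == "decrease":
            return current - self.value
        if self.action == "set":
            return self.value
        raise ValueError(f"unknown action: {self.action}")


# Illustrative rules only.
scale_out = ScaleRule("GreaterThan", 70, "increase", 1)
scale_in = ScaleRule("LessThan", 30, "decrease", 1)
set_exact = ScaleRule("GreaterThan", 90, "set", 8)
```

For example, with three running instances, a metric value of 85 makes `scale_out` return four instances, while a value of 50 leaves the count unchanged.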
An autoscale action has a cool-down period, specified in minutes. During this interval, the scale rule can't be triggered again. The cool-down period allows the system to stabilize between autoscale events. Remember that it takes time to start up or shut down instances, so any metrics gathered might not show significant changes for several minutes.
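A cool-down can be modeled as a gate that remembers when the rule last fired. The sketch below is a minimal illustration under that assumption. The `CooldownGate` class is hypothetical and says nothing about how the service tracks time internally.

```python
from datetime import datetime, timedelta


class CooldownGate:
    """Hypothetical gate that blocks a rule during its cool-down period."""

    def __init__(self, cooldown_minutes: int):
        self.cooldown = timedelta(minutes=cooldown_minutes)
        self.last_fired = None  # the rule has not fired yet

    def try_fire(self, now: datetime) -> bool:
        # Refuse to fire while the previous action is still cooling down.
        if self.last_fired is not None and now - self.last_fired < self.cooldown:
            return False
        self.last_fired = now
        return True
```

With a five-minute cool-down, a rule that fires at 12:00 is blocked at 12:03 and can fire again at 12:05.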
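Before it scales in, autoscale estimates what the metric value would be with fewer instances. If that estimate would immediately trigger a scale-out rule, the scale-in is skipped. This estimation is intended to avoid flapping, where scale-in and scale-out actions continually go back and forth. Keep this behavior in mind when you choose the same thresholds for scale-out and scale-in.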
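To see why identical thresholds can block scale-in, consider a rough model of the estimate. The sketch assumes the metric is a per-instance average and that total load stays constant when instances are removed. The `should_scale_in` function and its parameters are hypothetical, and the exact comparison the service uses may differ.

```python
def should_scale_in(metric_avg, instances, step,
                    scale_in_threshold, scale_out_threshold):
    """Return True if scaling in is not expected to cause an immediate scale-out."""
    if metric_avg >= scale_in_threshold:
        return False  # the scale-in rule is not triggered at all
    remaining = instances - step
    if remaining < 1:
        return False  # never scale in below one instance in this sketch
    # Assumption: the same total load is spread over fewer instances.
    projected = metric_avg * instances / remaining
    # Skip the scale-in if the projected value would trigger a scale-out.
    return projected < scale_out_threshold
```

With both thresholds at 50, an average of 45 across two instances projects to 90 on one instance, so the scale-in is skipped. With a scale-in threshold of 30 and a scale-out threshold of 70, an average of 25 across three instances projects to 37.5 on two instances, so the scale-in proceeds.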