AWS Auto Scaling lets you set simple rules for scaling the number of EC2 instances in your environment up or down, but one possibly unforeseen snag is that the Auto Scaling rules may try to terminate an instance before it has finished all its work.

Ideally the application will have been built with Auto Scaling in mind and will handle the Lifecycle Hooks and heartbeats itself, but with an existing legacy application this may not be an option. Fortunately, it's relatively simple to put together a script using the AWS CLI and run it as a cron job to watch out for this.

First off, the AWS CLI needs to be installed and working with the appropriate keys. If that is not the case, you can find instructions here.

The Auto Scaling Lifecycle Hooks are accessible through the web console under EC2 --> Auto Scaling --> Auto Scaling Groups, on the Lifecycle Hooks tab. Alternatively, they can be created or removed with the AWS CLI.

The new Lifecycle Hook needs a name, and for this case the Lifecycle Transition should be set to Instance Terminate and the Default result to Continue. The Heartbeat Timeout can be adjusted at any point but the default of 1 hour is a good starting point.
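For the CLI route, the settings described above could be applied with something like the following. This is a minimal sketch: the function names and the group and hook names (my-asg, wait-for-work) are placeholders, not anything from an existing setup.

```shell
#!/bin/sh
# Sketch: manage a termination Lifecycle Hook with the AWS CLI.
# The group and hook names passed in are placeholders for your own.

create_terminate_hook() {
    asg_name="$1"
    hook_name="$2"
    # Instance Terminate transition, Default result Continue,
    # Heartbeat Timeout of 1 hour (3600 seconds).
    aws autoscaling put-lifecycle-hook \
        --auto-scaling-group-name "$asg_name" \
        --lifecycle-hook-name "$hook_name" \
        --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
        --default-result CONTINUE \
        --heartbeat-timeout 3600
}

delete_hook() {
    aws autoscaling delete-lifecycle-hook \
        --auto-scaling-group-name "$1" \
        --lifecycle-hook-name "$2"
}

# Example (not run here):
#   create_terminate_hook my-asg wait-for-work
```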

Now, with the Lifecycle Hook in place, the next time Auto Scaling wants to scale down an instance, the instance will enter the Terminating:Wait state. In this state the instance is removed from the Load Balancer and then waits for the Heartbeat Timeout period before terminating. On its own, this would simply delay the termination of each instance by 1 hour.

You can get the state of an Auto Scaling instance using aws autoscaling describe-auto-scaling-instances. The Instance ID can be pulled with curl from an internal resource Amazon provides (there is no route to it from outside AWS) that returns metadata about the instance making the request. Otherwise, the Instance ID can be supplied directly.
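A sketch of that lookup might look like this. The function names are made up for illustration, and the metadata request assumes the instance still accepts unauthenticated (IMDSv1) requests; instances that require IMDSv2 need a session token first.

```shell
#!/bin/sh
# Sketch: look up this instance's Auto Scaling lifecycle state.
# Assumes IMDSv1-style metadata access is allowed on the instance.

get_instance_id() {
    # Instance metadata endpoint, reachable only from inside the instance.
    curl -sf --max-time 2 http://169.254.169.254/latest/meta-data/instance-id
}

get_lifecycle_state() {
    # Prints e.g. "InService" or "Terminating:Wait".
    aws autoscaling describe-auto-scaling-instances \
        --instance-ids "$1" \
        --query 'AutoScalingInstances[0].LifecycleState' \
        --output text
}

# Example (not run here):
#   get_lifecycle_state "$(get_instance_id)"
```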

Once an instance is in the Terminating:Wait state, you can send a heartbeat with aws autoscaling record-lifecycle-action-heartbeat to keep it from being terminated. The instance will then run for the period of time defined in the Lifecycle Hook Heartbeat Timeout, starting over each time it receives another heartbeat. Per Amazon's documentation "The maximum amount of time that you can keep an instance in a wait state is 48 hours or 100 times the heartbeat timeout, whichever is smaller."
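Sending the heartbeat could be wrapped like this. It is only a sketch: the function name and argument order are arbitrary, and the hook and group names are whatever was chosen when the hook was created.

```shell
#!/bin/sh
# Sketch: reset the Heartbeat Timeout for an instance in Terminating:Wait.

send_heartbeat() {
    hook_name="$1"
    asg_name="$2"
    instance_id="$3"
    aws autoscaling record-lifecycle-action-heartbeat \
        --lifecycle-hook-name "$hook_name" \
        --auto-scaling-group-name "$asg_name" \
        --instance-id "$instance_id"
}

# Example (not run here):
#   send_heartbeat wait-for-work my-asg i-0123456789abcdef0
```

With the default 1 hour timeout, 100 times the timeout would be 100 hours, so the 48 hour limit from the quote above is the one that applies.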

To decide whether to keep the instance up or not, the script needs a way of checking whether the application has finished working. One easy way, if the application has a PID file, is to check whether that PID is still running. Other options might include checking whether any instance of a process (possibly an interpreter like Node or Python) is running. Without a good indication of whether the application is done, a guess at a set timeout period might be better than nothing, provided that removing the instance from the Load Balancer prevents it from starting more work.
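Both checks can be sketched in a few lines of shell. The function names are illustrative, and the PID file check assumes the script runs as root or as the same user as the application, since kill -0 fails when it lacks permission to signal the process.

```shell
#!/bin/sh
# Sketch: two ways to decide whether the application is still working.

# Succeeds if the PID recorded in the given file belongs to a live process.
pid_is_running() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 1
    # Signal 0 sends nothing; it only tests whether the process can be signalled.
    kill -0 "$pid" 2>/dev/null
}

# Succeeds if any process with exactly this name is running (e.g. node, python).
process_is_running() {
    pgrep -x "$1" >/dev/null
}

# Example (not run here):
#   pid_is_running /var/run/myapp.pid && echo "still working"
```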

A script that uses a PID file to check whether the application has finished working runs through a simple sequence: it first determines whether Auto Scaling wants to scale down the instance, and then whether the PID is running. If it is, the script sends a heartbeat, keeping AWS from terminating the instance prematurely. If not, it uses /usr/bin/aws autoscaling complete-lifecycle-action with the CONTINUE result to tell AWS to continue terminating the instance.
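Put together, one pass of that logic might look like the sketch below. The group name, hook name, and PID file path are placeholders to replace, the metadata request assumes IMDSv1 is allowed, and each run performs a single check so that cron can provide the repetition.

```shell
#!/bin/sh
# Sketch: one pass of the lifecycle check. All three values below are
# placeholders for your own group, hook, and PID file.
ASG_NAME="my-asg"
HOOK_NAME="wait-for-work"
PID_FILE="/var/run/myapp.pid"
AWS_CLI="${AWS_CLI:-/usr/bin/aws}"

pid_is_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] || return 1
    kill -0 "$pid" 2>/dev/null
}

check_instance() {
    # Ask the metadata service who we are; give up quietly if it is unreachable.
    instance_id=$(curl -sf --max-time 2 \
        http://169.254.169.254/latest/meta-data/instance-id) || return 0
    [ -n "$instance_id" ] || return 0

    state=$($AWS_CLI autoscaling describe-auto-scaling-instances \
        --instance-ids "$instance_id" \
        --query 'AutoScalingInstances[0].LifecycleState' \
        --output text)

    # Nothing to do unless Auto Scaling is waiting on the hook.
    [ "$state" = "Terminating:Wait" ] || return 0

    if pid_is_running "$PID_FILE"; then
        # Still working: reset the Heartbeat Timeout.
        $AWS_CLI autoscaling record-lifecycle-action-heartbeat \
            --lifecycle-hook-name "$HOOK_NAME" \
            --auto-scaling-group-name "$ASG_NAME" \
            --instance-id "$instance_id"
    else
        # Done: let Auto Scaling continue terminating the instance.
        $AWS_CLI autoscaling complete-lifecycle-action \
            --lifecycle-action-result CONTINUE \
            --lifecycle-hook-name "$HOOK_NAME" \
            --auto-scaling-group-name "$ASG_NAME" \
            --instance-id "$instance_id"
    fi
}

check_instance
```

Keeping the check in a function makes it easy to swap pid_is_running for a different test of whether the application is busy.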

Scheduling this script with cron, with a Heartbeat Timeout period set to suit the cron interval, allows the script to determine whether an instance is still working and needs more time to finish before it is scaled down, or whether it is done and ready for termination.
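One way to think about the pairing is that the cron interval has to be shorter than the Heartbeat Timeout, otherwise the instance can time out before the next heartbeat is sent. The helper below is a hypothetical sketch that builds a user crontab line and refuses intervals that are too long; the script path is a placeholder.

```shell
#!/bin/sh
# Sketch: build a crontab line for the watcher script, checking that the
# interval leaves room inside the Heartbeat Timeout.

cron_entry() {
    interval_min="$1"   # how often to run, in minutes (1-59)
    timeout_sec="$2"    # the hook's Heartbeat Timeout, in seconds
    script_path="$3"    # placeholder path to the watcher script

    if [ "$interval_min" -lt 1 ] || [ "$interval_min" -gt 59 ]; then
        echo "interval must be between 1 and 59 minutes" >&2
        return 1
    fi
    if [ $((interval_min * 60)) -ge "$timeout_sec" ]; then
        echo "interval must be shorter than the heartbeat timeout" >&2
        return 1
    fi
    printf '*/%s * * * * %s\n' "$interval_min" "$script_path"
}

# Example:
#   cron_entry 15 3600 /usr/local/bin/lifecycle-watch.sh
#   prints: */15 * * * * /usr/local/bin/lifecycle-watch.sh
```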

This won't be as precise as if the application made these calls to AWS's API directly, but still lets us bolt on some amount of control if that isn't an option.