In this post I’ll explain how I created an AWS EC2 Spot Instance Termination Simulator to run on any EC2 instance.
It can signal typical monitoring scripts, applications, interrupt handlers, and more, just as a real Spot Termination signal would.
Back story
Looking around, there aren’t really any straightforward ways of testing this behaviour.
The obvious way is to change the price you bid for spot instances to the threshold of the current market price, so that a slight increase in demand will cause your spot instance(s) to be terminated.
The problem with that approach is of course that you can’t easily predict when the price will move, or by how much.
How EC2 Spot Instance Termination warnings work
When your spot instance bid price is surpassed by the current market price, a Termination Notice becomes available on one of the instance’s metadata endpoints, for example http://169.254.169.254/latest/meta-data/spot/termination-time.
There is one other, newer endpoint at http://169.254.169.254/latest/meta-data/spot/instance-action that could also be used.
At this point, the termination-time endpoint returns an HTTP 200 response along with a timestamp of when a shutdown signal will be sent to your instance’s OS. The newer endpoint returns a JSON string with an action and a time. Outside of a pending termination, the endpoints simply return a 404 Not Found response.
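For example, polling each endpoint with curl while a termination is pending would return something along these lines (the timestamps below are illustrative):

curl http://169.254.169.254/latest/meta-data/spot/termination-time
# 2019-09-22T10:30:00Z

curl http://169.254.169.254/latest/meta-data/spot/instance-action
# {"action": "terminate", "time": "2019-09-22T10:30:00Z"}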
The tools out there tend to monitor this endpoint so they can warn and take action when a Spot Termination notice is received.
The two minute warning feature was added by Amazon back in 2015, announced in this blog post.
Kubernetes Spot Termination / Interrupt handling
I’ve been looking at kube-aws/kube-spot-termination-notice-handler.
It runs as a DaemonSet (one container/pod on each node) in host networking mode, polling the EC2 metadata endpoint every x seconds and watching for termination notices.
If one is found during polling, it issues a kubectl drain command against its host node, and pods move off to other nodes in the cluster.
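Roughly speaking, its reaction boils down to something like this (the node name is illustrative, and the exact flags depend on the handler version):

kubectl drain ip-10-0-1-23.ec2.internal --ignore-daemonsets --force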
Simulating Spot Instance Termination with a fake web service and some proxying
I created a simple web service / API that returns a 200 HTTP response on the same http://169.254.169.254/latest/meta-data/spot/termination-time endpoint.
This is the legacy endpoint that spot instance termination signals go to. However, there is another, newer one that AWS now recommend using instead.
The other endpoint is http://169.254.169.254/latest/meta-data/spot/instance-action. The web service also responds there with a JSON string containing an action and a time field.
All that needs to be done is to forward traffic from the metadata endpoint 169.254.169.254:80 to the custom web service.
Have a look at the code and Docker image at the following locations:
- The code for the simple NodeJS web service / API is on GitHub here
- You can find the Docker image over here.
Run the EC2 spot instance termination simulator endpoints
Identify a candidate EC2 instance that you don’t mind messing with.
Warning: the following steps will override the entire EC2 instance metadata service. No other metadata endpoints will work, because they’re not implemented in this web service.
Kubernetes
Deploy the Docker container (a Kubernetes deployment / service manifest is in the git repository).
Ideally, set up a nodeSelector and matching node label in the deployment manifest before you kubectl apply it, so that the pod runs on the specific node you want to test.
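For instance, you could label the chosen node and reference it from the pod spec (the label key and value here are just examples):

kubectl label node <your-node-name> spot-sim=target

Then add a matching nodeSelector (spot-sim: target) under the deployment’s pod spec in the manifest.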
kubectl -n namespace apply -f https://raw.githubusercontent.com/Shogan/ec2-spot-termination-simulator/master/simple-k8s-deployment.yaml
The service is a NodePort service, exposing the port on the host it runs on.
List the service and the pod (and find the Kubernetes node the pod is running on):
kubectl get svc ec2-spot-termination-simulator
kubectl get pod spot-term-simulator-xxxxxx -o wide
Take note of the NodePort the service is listening on for the Kubernetes node, e.g. port 30626.
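If you want just the port value, a jsonpath query does the trick:

kubectl get svc ec2-spot-termination-simulator -o jsonpath='{.spec.ports[0].nodePort}'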
Docker
Alternatively, if you’re not running Kubernetes, run the container directly on the instance with Docker:
docker run -p 30626:80 -e PORT=80 shoganator/ec2-spot-termination-simulator
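You can sanity-check the container before any proxying, assuming it serves the same metadata paths locally:

curl -i http://localhost:30626/latest/meta-data/spot/termination-time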
Proxying the EC2 metadata service
Some network trickery is needed to proxy traffic destined for 169.254.169.254 on port 80 to localhost (where the container runs), so that the fake service can take the place of the real one.
SSH onto the Kubernetes node and run:
sudo ifconfig lo:0 169.254.169.254 up
sudo socat TCP4-LISTEN:80,fork TCP4:127.0.0.1:30626
- Create an alias for the localhost interface at 169.254.169.254, effectively taking over the EC2 metadata service address and sending that traffic to 127.0.0.1 instead.
- Forward TCP port 80 traffic (usually destined for the EC2 metadata service) to 127.0.0.1 on NodePort 30626, which is where the ec2-spot-termination-simulator pod is exposed on this host. Substitute the correct port in your case.
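When you’re finished testing, tear this down again to restore the real metadata service:

sudo ifconfig lo:0 down
sudo pkill socat

(The pkill assumes socat isn’t being used for anything else on this node.)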
Test the new, faked endpoint:
curl -i http://169.254.169.254/latest/meta-data/spot/termination-time
It should return a 200 OK response with a timestamp from the fake service, just as the real endpoint would.
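The output should look something along these lines (the headers and timestamp will vary):

HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8

2019-09-22T10:30:00Z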
Example
Looking at how the kube-spot-termination-notice-handler service works specifically:
This service runs as a DaemonSet, meaning one instance per host. The handler on the node you set the simulator up on should immediately drain that node. The EC2 instance won’t actually be terminated, as this was of course just a simulated termination.
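You can verify this from any machine with cluster access:

kubectl get nodes
kubectl get pods -o wide

The target node should show a SchedulingDisabled status, and the pods that were running on it should be rescheduled onto other nodes.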
Other scenarios
If you’re not running on Kubernetes and are using a different spot termination handler, don’t worry. The system you’re using to monitor the EC2 instance metadata endpoint should still take action at this point.
The proxied web service is now returning a legitimate looking termination time notice on http://169.254.169.254/latest/meta-data/spot/termination-time.
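As a rough illustration, even a minimal polling loop like this sketch would now fire (adjust the interval and the cleanup logic to suit):

while true; do
  if curl -sf http://169.254.169.254/latest/meta-data/spot/termination-time > /dev/null; then
    echo "Spot termination notice received, starting cleanup..."
    # your drain / cleanup logic goes here
    break
  fi
  sleep 5
done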