There's one thing we get asked a lot when talking with people about containers and Kontena.

Can Kontena automatically scale up my services?

The answer is one of the most classic ones in the IT industry: it depends.

So the answer is yes and no, depending a bit on the context. I'll walk through both sides of the answer.


As you may know, Kontena collects lots of statistics on the running services. So in theory, Kontena has the data it would need to decide to spin up new containers for your service. But we've decided not to do it automatically. Why?

If your services are burning up, in most cases the infrastructure underneath is burning up as well. When your infrastructure is already at its capacity limits, spinning up more containers would be exactly the wrong decision: more containers on resource-exhausted infrastructure just make things much worse.


The yes side of the story is more interesting. So we just learned that Kontena will not automatically scale up your services. But then how can we say yes?

Kontena reacts to infrastructure changes and can thus scale up services along with the infrastructure. There are actually two distinct ways in which Kontena reacts to changes in the infrastructure, in this case scaling up.

Automatic re-scheduling

As the infrastructure scales up, new nodes will join the platform grid. The Kontena Platform Master sees these new nodes and their capacity "offered". There's a constant scheduling "calculation" ongoing on the master. The scheduling calculation takes into account the capacity of the nodes and the desired capacity of the services and tries to always balance out the load across the nodes.

As the scheduling loop now sees the new nodes, it re-schedules some services from busy nodes to the new, less busy ones. It thus balances the load across the grid and does indeed react to the fact that the available infrastructure has scaled up.

Daemon services

There's a special deployment strategy called daemon that you can use for services. With the daemon deploy strategy, Kontena Service Instances are deployed to all Kontena Nodes. If used together with the service instances configuration option, the given number of instances is deployed to every Kontena Node.

So now when the infrastructure scales up and new nodes join the platform grid, these daemon services are automatically deployed to the new nodes. With daemon, the reaction at service instance level is closer to auto-scaling as Kontena does actually spin up new containers on these new nodes and not just move the service instances around to balance the load.
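
To make the arithmetic concrete, here is a tiny sketch of how the container count of a daemon service grows with the grid. The function name daemon_containers is made up for illustration and is not a Kontena command; it simply multiplies the node count by the instances option.

```shell
#!/bin/sh
# Toy arithmetic only (hypothetical helper, not part of the Kontena CLI):
# total containers for one service using the daemon strategy.
# $1 = number of nodes in the grid, $2 = value of the "instances" option.
daemon_containers() {
  echo $(( $1 * $2 ))
}

daemon_containers 3 2   # three nodes, instances: 2 -> prints 6
daemon_containers 4 2   # a fourth node joins       -> prints 8
```

So with instances set to 2, every node that the infrastructure adds brings two more containers of that service with it.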

Setting up autoscaling for your Kontena Platform

I'm going to walk you through how to create a Kontena Platform where we use some autoscaling capacity on AWS to react to an increased load automatically. Naturally, the process is pretty similar for other cloud providers too, although the details will of course change a bit.

I'm running the setup "manually" to illustrate the needed configurations and to make all the concepts clearer. Naturally, for a more controlled setup one could use tools like Terraform or CloudFormation to name a few.


Architecturally, I'm splitting my platform into two parts, static and dynamic. The dynamic part is now being set up using an auto-scaling group.

And as always, I want to spread everything over many availability zones. Luckily auto-scaling groups nowadays are able to spread the new instances across different zones in round-robin fashion.

Why static vs. dynamic?

It is not really advisable to have the so-called initial nodes as part of the autoscaling infrastructure. The reason is that the initial nodes host etcd, which in turn holds some critical data needed for the platform to operate. While Kontena can automatically replace initial nodes, autoscaling might make the changes a bit too dynamic. And when the autoscaling group is removing a node, it should also talk to the Kontena Master API to actually remove the node information from the Kontena side as well.

Thus, things are easier to set up and manage when we make a distinction between a static part of the platform and a part that is dynamically allocated.


To provision nodes using AWS ASG I'll be using Container Linux as the host OS. As ASG needs to pre-configure everything on the nodes, I'll be using cloud-config as the configuration "tool". In this case the node configurations are really static for their lifetime, as we "just" need to spin up Kontena Agent with the correct config.

Static part

As mentioned, it's better to have the static part, mainly the initial nodes, set up outside of the ASG. So to get started, I'll provision the platform master using Kontena Cloud:

$ kontena cloud platform create --type standard --initial-size 3 --region eu-west-1 --org jussi asg-demo

This will automatically spin up the Kontena Platform Master for me in a highly available setup.

To provision the initial nodes, I'll use the Kontena AWS plugin.

kontena aws node create --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --type t2.small \
    --zone a \
    --region eu-central-1

Just rinse and repeat the command for different zones. It is advisable to spread the initial nodes across many AZs for maximum availability.

Launch configuration

The setup starts with setting up the Launch configuration. Launch config is the template that the ASG uses to launch the new node instances.

As I'm planning to use CoreOS, eh Container Linux by CoreOS :), I need to first specify the AMI for stable CoreOS:

At the time of writing, ami-90c152ff is the latest stable release AMI in the eu-central-1 region. You can check the AMI IDs for your own region from here.

In the next step, you need to select the instance type. For the demo purposes I'm using t2.small instances. For your use case, adjust the size and type based on your needs.

The most important part of the launch configuration is configuring the new instances that will be deployed. One of the benefits of using containers is that your host configuration is always pretty much static. In this case it means I can easily use a static cloud-init configuration to spin up the nodes, as they all have exactly the same configuration. We "just" need to get Kontena Agent onto the node and instruct it to call home to the platform master running on Kontena Cloud.

You can find the complete cloud-init yaml example at my related GitHub repo here.

Some important bits in the cloud-config:

Labels: Proper labeling is done for the Docker engine. Kontena will pick up these labels automatically and use those for both service scheduling and overlay network setup purposes. One of the most important labels for the ASG type of "ephemeral" nodes is ephemeral=true. With that, Kontena will automatically remove the node from the platform when it's been offline for more than 20 minutes. So as your ASG automatically scales down, Kontena will do an automatic cleanup. Nice.

KONTENA_URL & KONTENA_TOKEN: These are needed to tell the agent which master to connect to and to authorize itself to join a platform. You'll get these for your current platform by using the kontena cloud platform env command on your local CLI.

Step 4 of the launch config asks for storage settings. For the demo purposes I'll use only small disks without any provisioned capacity, but again, you can adjust it based on your needs. In particular, increase the storage size if the container images you are using tend to be on the larger side.

The last bit to configure for the launch configuration is the security group settings. This is actually pretty simple, as the Kontena AWS plugin has already created a basic security group for the static part of the platform. In this case, we want to attach the ASG nodes to that same security group to allow easy connectivity between the nodes.

After this, just review your settings and create the launch configuration.
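
If you'd rather script this step than click through the console, the same launch configuration could look roughly like this with the AWS CLI. Treat it as a sketch: the launch configuration name, the security group ID and the cloud-config file name are placeholders I made up, and the AWS_CLI variable is just a dry-run switch so that the command is printed instead of executed unless you set AWS_CLI=aws.

```shell
#!/bin/sh
# Dry run by default: prints the AWS CLI command instead of running it.
# Export AWS_CLI=aws to actually call AWS.
AWS_CLI="${AWS_CLI:-echo aws}"

# $1 = launch configuration name (placeholder)
# $2 = ID of the security group created by the Kontena AWS plugin (placeholder)
# $3 = path to the cloud-config file used as user data (placeholder)
create_launch_config() {
  # Check how your AWS CLI version expects user data to be encoded.
  $AWS_CLI autoscaling create-launch-configuration \
    --region eu-central-1 \
    --launch-configuration-name "$1" \
    --image-id ami-90c152ff \
    --instance-type t2.small \
    --security-groups "$2" \
    --user-data "file://$3"
}

create_launch_config asg-demo-nodes sg-0123456789abcdef0 cloud-config.yml
```

The AMI, instance type and region are the ones used in this walkthrough; swap them for your own.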

Autoscaling group

Now that we have the launch config in place, we can create the actual autoscaling group.

In the initial step, we need to select the launch configuration to be used as the base for new instances.

Next we need to configure the basics of the ASG: name, networks, etc.
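
As a rough command-line equivalent of these two steps, the group itself could be created like this. Again a sketch under assumptions: the group name, launch configuration name, subnet IDs and the maximum size of 3 are placeholders of mine, and the command is only printed unless AWS_CLI=aws is set. Listing one subnet per availability zone is what lets the ASG spread instances across zones.

```shell
#!/bin/sh
# Dry run by default: prints the AWS CLI command instead of running it.
AWS_CLI="${AWS_CLI:-echo aws}"

# $1 = ASG name, $2 = launch configuration name,
# $3 = maximum size, $4 = comma-separated subnet IDs (all placeholders)
create_asg() {
  $AWS_CLI autoscaling create-auto-scaling-group \
    --region eu-central-1 \
    --auto-scaling-group-name "$1" \
    --launch-configuration-name "$2" \
    --min-size 1 \
    --max-size "$3" \
    --vpc-zone-identifier "$4"
}

create_asg asg-demo asg-demo-nodes 3 "subnet-aaaa1111,subnet-bbbb2222"
```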

After that we need to define how the ASG scale out is triggered. In this sample I'm using a simple average CPU metric as the basis of my scale out trigger. So whenever the average CPU usage across this ASG exceeds 40% over a 300-second period, AWS will start to scale out. Of course, your triggering depends on your application's needs and requirements.

Note: I'm using low CPU utilization as an example so that it's fairly easy to test that the scale out is actually happening and working properly.
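
The trigger described above is really two pieces: a scaling policy on the ASG and a CloudWatch alarm that fires it. A hedged sketch with the AWS CLI could look like this. The policy and alarm names, the step of one instance per trigger and the single evaluation period are my assumptions, not values taken from the console walkthrough, and everything is only printed unless AWS_CLI=aws is set.

```shell
#!/bin/sh
# Dry run by default: prints the AWS CLI commands instead of running them.
AWS_CLI="${AWS_CLI:-echo aws}"

# $1 = ASG name, $2 = policy name (placeholders).
# Adds one instance each time the policy fires (assumed step size).
put_scale_out_policy() {
  $AWS_CLI autoscaling put-scaling-policy \
    --region eu-central-1 \
    --auto-scaling-group-name "$1" \
    --policy-name "$2" \
    --adjustment-type ChangeInCapacity \
    --scaling-adjustment 1
}

# $1 = ASG name, $2 = ARN of the policy created above.
# Fires when average CPU across the group is above 40% for one 300 s period.
put_cpu_alarm() {
  $AWS_CLI cloudwatch put-metric-alarm \
    --region eu-central-1 \
    --alarm-name "${1}-cpu-high" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --statistic Average \
    --dimensions "Name=AutoScalingGroupName,Value=${1}" \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 40 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$2"
}

put_scale_out_policy asg-demo scale-out-on-cpu
put_cpu_alarm asg-demo "arn:aws:autoscaling:placeholder-policy-arn"
```

When run for real, put-scaling-policy returns the policy ARN, which is what you would pass as the second argument to put_cpu_alarm.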

Testing out

Once the ASG is created, it will automatically spin up the first instance, since I set the minimum capacity to 1.

As you see, I could set a few more labels on the cloud-init to correspond better to the labels that are set by the AWS CLI plugin.

Let's create a load to see that the nodes actually get created. For that I'll use a custom stack called stress. It does what it says on the tin, just creates a load on the nodes. :) I'll pin the service only to ASG nodes using the affinity filter.

stack: jussi/stress
description: Runs progrium/stress on multiple instances
version: 0.1.1
services:
  stress:
    image: progrium/stress
    command: -q --cpu 1 --vm 1 --vm-bytes 128M --timeout 30m
    deploy:
      strategy: daemon
    instances: 1
    affinity:
      - label==ephemeral=true
    cpu_shares: 100

Deploy it using:

$ kontena stack install stress.yml 
 [done] Creating stack stress      
 [done] Triggering deployment of stack stress     
 [done] Waiting for deployment to start     
 [done] Deploying service stress    

Sit back, relax and watch the magic happen. :)

First you'll see the first ASG node burning up:

Remember we set a 300 sec threshold period, so we'll need to wait a few mins to see the actual scale out happening.

After a while, you'll see the scale-out happening:

In our purposefully crafted demo case here, the scale out doesn't actually help as the new nodes will also get the progrium/stress container deployed thanks to the daemon strategy. :D

The end result is that we've scaled the platform up automatically.

Now go and remove the stress stack to reduce the load.

$ kontena stack rm --force stress
 [done] Removing stack stress      

You'll see loads dropping on the nodes and after a few minutes the scale-in happening.


Using an auto-scalable infrastructure enables Kontena to react to infrastructure changes. Reactions happen in two different ways that help with scaling your application's serving capacity automatically:

  1. Through automatic load balancing, stateless services are moved between nodes within the platform. Essentially, this also gives your application more processing capacity.
  2. Daemon services are automatically "scaled" onto new nodes joining the platform grid. In practice, this is a kind of autoscaling for your containers.

One thing still missing from the tutorial is how to label the node properly with the AZ it is launched into. I know one can access the metadata from the node, but I wasn't able to find an easy way to inject that data into the systemd units created with cloud-config. To find out the zone, we could use curl and some regexp to isolate the plain zone letter. But how to get it into those units? All hints welcome. :)
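
For the lookup half of that question, a minimal sketch could look like this. It only covers extracting the zone letter, not wiring it into the systemd units. The helper name zone_letter is made up, and the commented-out curl call assumes the instance metadata service is reachable without a session token, which may not hold on your instances.

```shell
#!/bin/sh
# Hypothetical helper: strips everything up to and including the last digit,
# so a zone name like "eu-central-1a" is reduced to its letter.
zone_letter() {
  printf '%s\n' "$1" | sed 's/^.*[0-9]//'
}

# On an EC2 instance the zone name would come from the metadata service:
#   az="$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)"
#   echo "az=$(zone_letter "$az")"

zone_letter "eu-central-1a"   # prints a
```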

Image Credits: Container Port Loading Stacked by Markus Distelrath.