Host your Private VSTS Linux Agent on Azure Kubernetes Service

After introducing the Azure Kubernetes Service, today we are going to take a technical example to play with Docker, Kubernetes and Helm.
For that, we need a purpose. The exercise I would like to do is configuring a private VSTS agent. You know how I love playing with VSTS! ;)
VSTS provides 4 Hosted Agents by default: Windows (VS2017), Windows (VS2013/2015), Linux (Ubuntu) and MacOS for your Build and Release definitions. And the first 240 minutes per month are free. That's awesome, isn't it!?
So, why would I need a private VSTS agent? There are different reasons why you may need one:
  • To avoid the limitations of the Hosted agents explained here
  • To decrease cost: you don't pay for the minutes you consume
  • To interact between your agent and your resources in Azure, on-premises, etc. (for security, performance, etc. reasons)
  • To have specific versions of the OS/tools installed on the agent
  • To avoid waiting in a hosted agent queue
  • To do incremental builds (source control files, Docker images) by leveraging cache
  • If you are using the Linux container, to run deployments and tests from within a Kubernetes cluster, for example (for security, performance, etc. reasons)
You could set up a private agent on a physical or virtual machine by following the general documentation for Windows, for Linux or for MacOS. Another alternative, which we will follow through this blog article, is setting up the official Linux container.
Note: by default, you have 1 Private Agent for free; see the details of the VSTS pricing if you need more than 1 Private Agent.

So based on the VSTS Agent Docker Hub page, let's get started!

Here is some information to note:
  • Ubuntu 14.04 and 16.04 are the currently supported OSes. There are plans for Windows support in the future (work in progress here for now)
  • The latest tag points at a standard Docker image based on the best supported OS; as we speak, it's targeting ubuntu-16.04-docker-17.12.0-ce-standard
  • There are images for both VSTS and TFS
  • There are 3 kinds of images: standard, docker and docker-standard
In our case, we will target the latest tag, because we would like the capabilities of the Ubuntu 16.04 + docker-standard setup for VSTS.

At the end of the description, we need to pay attention to this mention:
These images do not run "Docker in Docker", but rather re-use the host instance of Docker. To ensure this works correctly, volume map the host's Docker socket into the container: -v /var/run/docker.sock:/var/run/docker.sock
Important note here: if you upgrade your AKS cluster, current nodes will be removed and fresh new nodes will be created. With this Docker socket binding, your VSTS agent won't work anymore and you will have to redeploy it.

So the associated Docker command looks like this so far:
docker run \
  -e VSTS_ACCOUNT=$VSTS_ACCOUNT \
  -e VSTS_TOKEN=$VSTS_TOKEN \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -it microsoft/vsts-agent:latest

You need to assign the appropriate values to the variables VSTS_ACCOUNT and VSTS_TOKEN. To learn how to retrieve the VSTS_TOKEN value, please check out this documentation.
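As a hedged sketch (the account name and token below are placeholders, not real values), setting these two variables in your shell could look like this:

```shell
# Placeholders only: replace with your own VSTS account name and PAT.
# VSTS_ACCOUNT is the {account} part of https://{account}.visualstudio.com;
# VSTS_TOKEN is a Personal Access Token with the Agent Pools (read, manage) scope.
export VSTS_ACCOUNT=myaccount
export VSTS_TOKEN=my-personal-access-token
echo "The agent will register against https://$VSTS_ACCOUNT.visualstudio.com"
```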

Now you will see the VSTS Agent Queues in your VSTS Account:

You could now use it in your Build and Release definitions:

Ok, but how to deploy this in a Kubernetes/AKS cluster?

You need an existing Kubernetes cluster; if you don't have one yet and would like to create one with Azure Kubernetes Service (AKS), check out this tutorial.

From there, let's create and run the associated Kubernetes config as a YAML file to describe our deployment, changing your values appropriately.
First, let's add a new Kubernetes Secret:
kubectl create secret generic vsts-pat \
  --from-literal=VSTS_TOKEN=$VSTS_TOKEN

And now let's apply the Kubernetes Deployment:
kubectl apply \
  -f aks-vsts-agent-without-pv.yml
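The aks-vsts-agent-without-pv.yml file isn't shown here, but based on the pieces described so far (the vsts-pat Secret, replicas: 1 and the Docker socket mapping), a minimal sketch of it could look like this (the metadata names and the VSTS_ACCOUNT value are assumptions to replace with your own):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vsts-agent            # hypothetical name
spec:
  replicas: 1                 # guarantees 1 Pod is always running
  selector:
    matchLabels:
      app: vsts-agent
  template:
    metadata:
      labels:
        app: vsts-agent
    spec:
      containers:
      - name: vsts-agent
        image: microsoft/vsts-agent:latest
        env:
        - name: VSTS_ACCOUNT
          value: myaccount    # placeholder: your VSTS account name
        - name: VSTS_TOKEN
          valueFrom:          # read the PAT from the vsts-pat Secret created above
            secretKeyRef:
              name: vsts-pat
              key: VSTS_TOKEN
        volumeMounts:
        - name: docker-socket # re-use the host's Docker instance
          mountPath: /var/run/docker.sock
      volumes:
      - name: docker-socket
        hostPath:
          path: /var/run/docker.sock
```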

You should now see your new Pod created with STATUS=Running; you could check this by running the kubectl get pods command.
If you go to your Agent Queues tab in VSTS, you should now see your new Agent as well. Congrats! ;)
You could now run/queue your Build definition which uses this Agent and you will see how fast it runs. In my case I have a very simple VSTS Build definition which is just pulling the source code from GitHub and then building a Docker image based on this Dockerfile.
The first run takes about ~93s, and guess how long the second run takes? ~11s! What! Why!? Yep, that's one of the advantages of having your own private agent. The files pulled from GitHub and the Docker images pulled during the previous build run are still there, and subsequent build runs will just grab the deltas.

Important note: there is a limitation to keep in mind when leveraging Kubernetes' Docker host with this setup. This config is OK for "basic" Docker features (Docker 1.13 and lower) on your agent, but if you are using "advanced" Docker features, for example multi-stage Docker builds, you could get error messages like "Error parsing reference: "microsoft/aspnetcore-build:1.1 AS build-env" is not a valid repository/tag: invalid reference format". That's a constraint with AKS, and we don't know when it will be supported... You could up vote here.

Ok, but what happens if I update my Container/Pod, or if the Pod crashes or is deleted, etc.?

Good point... let's try this out!
You could run this command to delete your pod:
kubectl delete pod/<name-of-your-pod>

You will see that your previous Pod will be Terminating, then removed, and that a new Pod will be created. Kubernetes does this because we set replicas: 1 in our YAML file to deploy this Pod, which guarantees to always have 1 Pod.
If we queue a third Build in VSTS, we will see that the duration is now ~13s. So it's higher than the second one but lower than the first one. Why? By looking at the VSTS Build logs we could see that we are again doing a git remote add origin and pulling all the files from GitHub, like the first run and not like the second one. Remark: my GitHub repository in this case is very small, but if you have a bigger one you could expect more differences in this step's duration.
On the other hand, we see that the docker build command is using the cache, like the second run and not like the first one. That's because we volume mapped the host's Docker socket into the container in our YAML file.
So what about the GitHub files? Is there any way to persist them in storage? Yes indeed!
For that we will use the concept of Volumes. Here are 4 documented ways to do that in Azure: AzureDisk dynamic | AzureDisk static | AzureFile dynamic | AzureFile static.

To clean up your Kubernetes cluster from the previous step, you could run these commands:
kubectl delete \
  -f aks-vsts-agent-without-pv.yml

In our case we will use AzureFile dynamic and will run these commands:
kubectl apply \
  -f aks-vsts-agent-sc.yml
kubectl apply \
  -f aks-vsts-agent-pvc.yml
kubectl apply \
  -f aks-vsts-agent-with-pv.yml
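Again hedged: these files aren't shown in full here, but an AzureFile dynamic setup typically boils down to a StorageClass plus a PersistentVolumeClaim along these lines (the names and the storage size are assumptions):

```yaml
# aks-vsts-agent-sc.yml (sketch): StorageClass for dynamic AzureFile provisioning
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
---
# aks-vsts-agent-pvc.yml (sketch): claim bound to that StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsts-agent-pvc       # hypothetical name
spec:
  accessModes:
  - ReadWriteMany            # AzureFile shares support multiple readers/writers
  storageClassName: azurefile
  resources:
    requests:
      storage: 5Gi           # arbitrary size for the agent's working directory
```

The deployment in aks-vsts-agent-with-pv.yml would then mount this claim at the agent's work folder instead of relying only on the node's local disk.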

Here we are: you could now redo the test multiple times and see that the files from your source control are persisted even if your pod is removed/updated. Congrats!

Ok, but now how to automate and repeat the YAML deployment file for different Agents, different projects, environments, etc.?

Yes, good point: when we look at the YAML file we could indeed see hard-coded values like the deployment name, image name, image tag, etc.
Helm is the tool to automate Kubernetes deployments by bringing the concept of charts, templates, parameters and values.
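To make the idea concrete, here is a hedged sketch (file excerpts and names are illustrative, not the actual chart's) of how a hard-coded image reference in a Kubernetes manifest moves into a Helm template plus values.yaml:

```yaml
# templates/deployment.yaml (excerpt): values are injected at install time
    containers:
    - name: {{ .Chart.Name }}
      image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"

# values.yaml: defaults, overridable with `helm install --set image.tag=...`
image:
  repository: microsoft/vsts-agent
  tag: latest
```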

To clean up your Kubernetes cluster from the previous step, you could run these commands:
kubectl delete \
  -f aks-vsts-agent-sc.yml
kubectl delete \
  -f aks-vsts-agent-pvc.yml
kubectl delete \
  -f aks-vsts-agent-with-pv.yml

To create our own Helm charts we could get inspiration from these 2 repositories: Kubernetes or Azure.
Or we could use an existing one to avoid reinventing the wheel, and proceed with these steps:

helm init
git clone <url-of-the-helm-vsts-agent-chart-repository>
VSTS_TOKEN=$(echo -n $VSTS_TOKEN | base64)
helm install \
  --name vsts-agent \
  ./helm-vsts-agent \
  --set vstsToken=$VSTS_TOKEN \
  --set vstsAccount=$VSTS_ACCOUNT \
  --set vstsPool=Default \
  --set replicas=1 \
  --set resources.limits.cpu=0 \
  --set resources.requests.cpu=0

Note: if you have Helm version 2.9.0, you could get an error; see the associated workaround or upgrade to Helm 2.9.1.

Here we are: after some time, we have a new agent provisioned by this Helm chart and we could now repeat the process across our different projects, etc. Congrats!

Optionally, you could remove this deployment by running this command:
helm delete \
  --purge \
  vsts-agent
To complete our Kubernetes learning, here are some differences we could notice between our initial YAML files (with or without PersistentVolume) and the template provided by this Helm chart:
  • Deployment versus StatefulSet
    • Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These Pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling. So for example a StatefulSet will guarantee the duplication of your PersistentVolume while scaling your Pods; a Deployment won't.
  • Secrets in YAML instead of running the kubectl create secret command
    • Those are two ways to register your Secrets
  • resources.limits + resources.requests information
  • Service and Namespace
    • You could leverage the concepts of Services and Namespaces for isolation, security, etc. reasons.
  • AzureDisk instead of AzureFile for PersistentVolume
    • As explained a few lines above, you have different options on Azure for PersistentVolumes
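For the first difference in the list above, the key piece is the StatefulSet's volumeClaimTemplates section, which stamps out one PersistentVolumeClaim per replica. A hedged excerpt (the names, paths and size are assumptions, not the actual chart's values):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vsts-agent            # hypothetical name
spec:
  replicas: 2
  serviceName: vsts-agent
  selector:
    matchLabels:
      app: vsts-agent
  template:
    metadata:
      labels:
        app: vsts-agent
    spec:
      containers:
      - name: vsts-agent
        image: microsoft/vsts-agent:latest
        volumeMounts:
        - name: agent-data
          mountPath: /vsts/agent/_work   # assumed agent work folder
  volumeClaimTemplates:       # one PVC per Pod: agent-data-vsts-agent-0, -1, ...
  - metadata:
      name: agent-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
```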
Note: it was the opportunity for me to submit a quick fix with this PR#2, already merged into master, and also to propose this Improvement#1.

In summary, what did we do and learn?

  • Setup a VSTS Private Agent (Docker Linux)
  • Deploy your private agent in a Kubernetes cluster on AKS
  • Deploy and play with different Kubernetes objects such as: Deployment, StatefulSet, Secret, PersistentVolume and PersistentVolumeClaim.
  • Find out that the Docker version on Kubernetes for now is just Docker 1.13, with some limitations to build your containers from within the cluster
  • See the interest of Helm for more repeatable and automated deployments
Resources I used to build this:
Further steps and alternatives (and some ideas for future blog articles ;)):
Hope you enjoyed this blog article and appreciated the walk-through process to understand different concepts of Docker/Kubernetes/Helm. And hopefully you could leverage it for your own needs and context!

