Terminating

This week I had to compute some stats on how long pods take to shut down for a few specific workloads, and that required knowing what “Terminating” means for Kubernetes pods. If you have used kubectl to interact with a Kubernetes cluster, you have surely noticed that immediately after a pod is deleted, while it is shutting down, it is shown as “Terminating”.

What not everybody knows is that “Terminating” is not a pod status at all, but only something that is shown (for convenience?) by kubectl. So what is kubectl doing? This1. In other words, the definition of a “terminating” pod is a pod in Running status with its DeletionTimestamp set.

This is not extremely surprising: a Terminating pod is in fact still “running” while it is shutting down. If you are talking to Kubernetes directly and need to determine the state of pods, for example to figure out which pods are terminating, you have to keep this in mind, as there is no such state as Terminating. Similarly, while you get events for several things that happen during the startup of a pod, no event is emitted that tracks the time a pod needed to shut down, which leaves a relatively important observability metric uncovered.
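
A minimal sketch of that check in Go, using the k8s.io/api types (the helper and its package name are mine, not kubectl’s actual code):

package podutil

import corev1 "k8s.io/api/core/v1"

// IsTerminating mirrors what kubectl displays as "Terminating": a pod that is
// still in the Running phase but already has a DeletionTimestamp.
func IsTerminating(pod *corev1.Pod) bool {
	return pod.DeletionTimestamp != nil && pod.Status.Phase == corev1.PodRunning
}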

To come back to the original task, I had to measure the time pods take to shut down. To do that, I used a Kubernetes PodInformer to get notified of the deletion of a pod and computed the total shutdown time as (time of the event - DeletionTimestamp). Kubernetes remains a great system in my opinion: no matter how often I bump into a “magic” behavior in kubectl or other things that are not great, the API-driven approach always allows me to work around them and to extract the information that I’m looking for relatively easily.
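
Here is roughly what that looks like with client-go. Treat it as a sketch: the out-of-cluster kubeconfig setup and the use of the current time as the “time of the event” are assumptions, not my exact program.

package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumption: running outside the cluster, against the local kubeconfig.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	factory := informers.NewSharedInformerFactory(client, 0)
	podInformer := factory.Core().V1().Pods().Informer()

	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		DeleteFunc: func(obj interface{}) {
			pod, ok := obj.(*corev1.Pod)
			if !ok {
				// The informer may hand us a tombstone instead of the pod itself.
				tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
				if !ok {
					return
				}
				if pod, ok = tombstone.Obj.(*corev1.Pod); !ok {
					return
				}
			}
			if pod.DeletionTimestamp == nil {
				return
			}
			// Approximate the time of the delete event with the current time.
			shutdown := time.Since(pod.DeletionTimestamp.Time)
			fmt.Printf("pod %s/%s took ~%s to shut down\n", pod.Namespace, pod.Name, shutdown)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {}
}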

  1. Courtesy of DirectXMan12 in https://github.com/kubernetes-sigs/kubebuilder/issues/648#issuecomment-481039177. 

Keeping ExternalDNS Secure

Navigating to this blog, I realized that I haven’t written a blogpost in a year. It seems like quite a long time, but 2022 has been a huge mess in my personal life. I had various health issues that affected my life and my mood, and I didn’t really find the time to write anything tech related. I’m gonna fix that now: my topic for today is “keeping ExternalDNS secure”.

The status quo during summer 2021

During the summer of 2021, I realized that ExternalDNS’s official Docker image had several vulnerabilities. That sucked: I was not aware of it, the project had no scanning configured and I only noticed it by chance. I couldn’t just sit and wait, so I immediately decided to inform the community of the findings and to commit to shipping improvements.

What I did to fix the problem

The strategy was relatively simple: add automation to scan the image, do it periodically and have more automation to keep the dependencies up to date.

To do so, I added trivy as an image scanner. This gave us a pretty good view of what was happening inside the image, so that we could bump all the dependencies that showed up as vulnerable. That brought us to zero vulnerabilities relatively quickly, but the challenge was keeping the image up to date given ExternalDNS’ somewhat infrequent, unscheduled releases. We needed a process and more automation.

Dependabot to the rescue

Keeping dependencies up to date is a time intensive task. You have to run things locally, make sure the project still compiles, the tests pass… It’s a lot of work and I didn’t want to do it. The obvious solution on GitHub is to use Dependabot: with a simple configuration you can keep your dependencies up to date, including the ones in GitHub Actions. It looks like we have already closed 200+ Dependabot PRs, which shows how much it is helping.

More automation

Dependabot has a problem though: you end up with a lot of open PRs. To automate the process of merging them, I wrote a little script to “pack” PRs together. The script replicates the go get operation for each Dependabot PR and builds the project with the changed dependency. If everything builds correctly, the script continues to the next PR; if not, it stops so that I can investigate what changed in the dependency and work on a fix. More often than I would wish, things do break, but the fixes needed are usually minimal. The process is relatively quick and, whenever everything passes, I create a grouped PR like this one.
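
The actual script is not public, but the core loop is small. Here is a rough Go sketch of the idea, assuming the bumps are passed as module@version arguments (the input format is my assumption):

package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run executes a command in the current checkout and streams its output.
func run(name string, args ...string) error {
	cmd := exec.Command(name, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// Hypothetical input: one module@version per argument, extracted from the
	// open Dependabot PRs, e.g. "github.com/aws/aws-sdk-go@v1.44.100".
	bumps := os.Args[1:]

	for _, bump := range bumps {
		fmt.Println("applying", bump)
		if err := run("go", "get", bump); err != nil {
			fmt.Println("go get failed, stopping here to investigate:", bump)
			os.Exit(1)
		}
		if err := run("go", "build", "./..."); err != nil {
			fmt.Println("build broke after bumping, stopping here to investigate:", bump)
			os.Exit(1)
		}
	}
	fmt.Println("all bumps applied; ready to open a single grouped PR")
}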

Continuous scanning

Another problem I regularly face is that an image built without vulnerabilities means nothing over time. A perfect image with zero vulnerabilities today will definitely have a few vulnerabilities in a year. This means that we need to continuously scan at least the latest release of the project, so that we know when we have vulnerabilities, how critical they are and when to act. It’s also a given that staying at zero vulnerabilities all the time is hard and often doesn’t justify the amount of work it requires. For this reason, we need to know what the vulnerabilities actually are and how they affect ExternalDNS. To do so, I configured a separate GitHub repository that uses GitHub Actions to continuously scan the latest Docker image for vulnerabilities and build a report that I periodically check. When there are vulnerabilities that can’t be ignored or that are above a certain criticality, we start the release process.

Conclusion

If you found all of this boring, chances are that you are already doing well with your project. If not, I hope this inspired you to take a look and to follow similar steps to keep your project and images (if you have any) up to date.

I can now say that at any point in time, I know how many vulnerabilities there are in ExternalDNS’ Docker image and how we need to adjust our release schedule to address them. Users who stay up to date with the latest development can make sure that they always run an image that is relatively free from vulnerabilities and that, of course, is not executed as root by default. While ExternalDNS, by nature, is unlikely to be the most sensitive workload you will run on your Kubernetes cluster, I am proud to play even a small part in keeping everyone’s Kubernetes infrastructure a bit more secure.

My weird setup for end to end testing ExternalDNS

I have wanted to write this blogpost for a long time, but I always procrastinated to work on more important things. Now I have found a bit of time, so I decided to just do it, and to keep it reasonably short.

The problem

As you might know, I am the maintainer of ExternalDNS. The work requires, among other things, taking care of the project’s release process. ExternalDNS is not continuously delivered: we review PRs periodically, they need to pass unit tests, and when everything is green we merge them to the default branch. Periodically, and somewhat frequently, we create new releases. To make sure that those releases work end to end, we run a sort of smoke test on the Docker image built for them. The real goal is to find out if the image works, by running it in a real cluster and creating real DNS records with it.

We would ideally want to test all providers and all sources, but the reality is that we are resource constrained both in terms of infrastructure and access to providers. We have literally no budget to finance the development of such tests and the infrastructure they need. At the time of writing, only AWS has provided an account to use, and thus we only run end to end tests on AWS. If you work at a major cloud provider, are reading this and want to have end to end tests for ExternalDNS, please reach out!

This post explains how those tests are architected. To better understand the design choices behind our end to end tests, it’s important to first describe the release process.

ExternalDNS’ release process

Our release process is structured as follows:

  • We create a new GitHub release. In the past this was a source of quite some friction, but first with cli/cli and later with the automatic generation of release notes, it became relatively easy to do.
  • We wait for prow to finish creating images. This can take many minutes.
  • We end to end test those images.
  • We “promote” the image, which is done with a PR in k8s.io.
  • We update the release with the right image tag.
  • We create a PR to update kustomize and/or the helm chart to reference the pushed image.

This whole process takes ~1 day because there are many points at which we need to wait for CI or approvals. There are still only two of us maintaining ExternalDNS, and we have lives, jobs, families.

Leaving most of the release process aside, I would love to talk a bit about the end to end testing part: how it works, the design decisions behind it and why they are a bit unconventional.

What do we test

As I said, the end to end tests are similar to smoke tests. We only prove that basic functionality works, on a very restricted set of features. What we want to prove is that:

  • The image starts correctly.
  • The image can create DNS records correctly based on one source.

This is extremely basic, but it gives a good enough understanding of whether we broke something major. The project is relatively stable and doesn’t change too much, so this test already gives a decent indication that a release is good. All other problems will need to be addressed via GitHub issues and fixed in the next releases. Things that we are particularly interested in verifying for specific sources or providers are tested manually by the maintainers or by the people submitting the PRs.

The end to end setup

In the past sections I covered how I use end to end tests to get a sense of whether an image is good to go. Let’s now figure out how those tests are set up.

As said above, I built those tests alone, and I happen to have a full time job that doesn’t pay me to work on ExternalDNS and doesn’t budget any hours for me to maintain the project. I also have a bunch of problems in life, like most of us out there. All of this means that I don’t have many hours to dedicate to the project, so I needed to design the end to end tests to be extremely low effort to maintain. I decided to use technologies and processes that would make maintaining the setup a no brainer while keeping everything real… and I think I achieved that. Let’s go through the decisions I made one by one.

Decision 1: use a private repo

I thought about having the end to end tests live in the ExternalDNS repo. That way I could have more people help me maintain them, I could make them open source, share the knowledge and so on. But then I decided to build them on top of a library that I developed for myself and that I have no intention of open sourcing, because it is experimental and hacky. So I asked myself: what would make me go quicker? A private repo! The repo needs to run some CI jobs (more on this later) and that requires secrets, which are always hard to manage for open source repositories. Private repo it is.

Decision 2: use GitHub Actions in the private repo

It’s no secret that I work at GitHub, but that is not the reason I decided to use GitHub Actions. I wanted something easy enough to use, but more importantly I wanted the “Run workflow” button. I seriously want to manually run the workflow for different images whenever I want, without thinking too much, and that button makes it perfect. No commit based workflow is better than a big green button.

end to end workflow

Decision 3: Kubernetes cluster as cattle

I run an EKS cluster to run the tests. I don’t want to shock the readers, but I created the cluster by clicking in the UI. I could go as far as saying that infrastructure as YAML is a bad idea, but that is definitely a topic for another time. Anyway, this time I had no interest in writing YAML or a script to create an EKS cluster; instead, I created it manually in the UI and built all the scripts around one simple convention: the cluster must have a given static name. Now, how do I manage cluster upgrades? Simple: I don’t! I delete the cluster and recreate it with the same name and it all beautifully works. There is no reason to maintain something that can be gone even for multiple days. Not everything needs to be continuously deployed or continuously updated. And boy, I don’t want to write another YAML file.

Decision 4: I’m not going to use a real domain

Oh DNS, my dear DNS, you really need a domain to work, don’t you? Well, yes, and I could have bought and delegated a domain to the AWS account I’m using. But I’m cheap and don’t trust anyone, so no thanks. I ended up building the tests around private hosted zones in AWS, because they don’t require a real domain to be available. I did the same in Azure, but I don’t run that test often because I don’t have dedicated credits to use for ExternalDNS.

Decision 5: Hack it till you make it

This is the part that I am most proud of, but also the most hacky. If I’m using a private hosted zone, how do I test that the DNS records are correctly created, given that the zone is not resolvable from outside the cluster?

There are a few solutions:

  1. Use a GitHub Actions runner in the cluster.
  2. Use a public DNS zone.
  3. Find another solution.

The problem with (1) is that it requires maintaining the runners and I really don’t want to maintain anything. (2) requires money and trust and I don’t have them. So it was time to find another solution.

What are the things I have? Kubectl. And so I wrote a dirty hack…

I created an application that exposes one endpoint to the internet. If you make an HTTP POST request to that endpoint, passing a JSON payload with the record you want to resolve, the service tries to resolve the record from inside the cluster and returns 200 if it finds it, 404 if not. Simple, easy to write, easy to maintain and stable. And this is how I roll.
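
For illustration, here is a minimal Go sketch of such a service. The endpoint path, port, payload shape and example record are assumptions, not the real implementation:

package main

import (
	"encoding/json"
	"log"
	"net"
	"net/http"
)

// resolveRequest is the hypothetical payload: {"record": "kuard.e2e.internal"}.
type resolveRequest struct {
	Record string `json:"record"`
}

func main() {
	http.HandleFunc("/resolve", func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		var req resolveRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// The lookup happens inside the cluster, which can see the records
		// in the private hosted zone.
		if _, err := net.LookupHost(req.Record); err != nil {
			http.Error(w, "record not found", http.StatusNotFound)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}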

Putting it together: how it works

So here’s how I can test things:

  • I update a Kustomize YAML containing the tag of the ExternalDNS image.
  • I manually run a GitHub action with the “Run workflow” functionality off of the main branch.
  • The workflow runs, connects to the EKS cluster, deletes everything and deploys ExternalDNS.
  • The workflow deploys Kuard with a record in the private zone.
  • The workflow checks via the HTTP endpoint that the record has been created in the private zone.

This setup is absolutely zero maintenance, it doesn’t require me to do anything, I can trigger it with a button, and it has helped spot issues in ExternalDNS for more than a year.

Conclusion

I hope this post was interesting to read. While the setup is not fancy and does not cover many of the things that we would really need to test, it shows a way to test things with no maintenance burden, using a few creative approaches that sit outside of the “best practices”.

One reason why YAML is bad for configuration

YAML is a data serialization language that is widely used for application configuration. It is relatively readable, flexible and, compared to JSON, it allows comments.

I don’t think that YAML is generally terrible for configuration, but the abuse of YAML when dealing with complex systems like Kubernetes makes all of its problems more evident: wrong indentation, the fact that you can cut a YAML file in two and both halves are likely still valid YAML, that problem with Norway, and so on.

But today I’d like to talk about a more specific example that can seem surprising and that I found in a codebase I’m working on these days.

A simple Kubernetes deployment

I found myself facing a file that, for the sake of this blogpost, is equivalent to the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
  annotations:
    foo: "bar"
    foo: "bar"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

Is this valid YAML? If you are unsure, you can use your preferred way of validating YAML; I will use Ruby’s irb:

% irb
2.6.0 :001 > require 'yaml'
 => true
2.6.0 :002 > a = YAML.load_file("deployment.yaml")
 => {"apiVersion"=>"apps/v1", "kind"=>"Deployment", "metadata"=>{"name"=>"nginx-deployment", "labels"=>{"app"=>"nginx"}, "annotations"=>{"foo"=>"bar"}}, "spec"=>{"replicas"=>3, "selector"=>{"matchLabels"=>{"app"=>"nginx"}}, "template"=>{"metadata"=>{"labels"=>{"app"=>"nginx"}}, "spec"=>{"containers"=>[{"name"=>"nginx", "image"=>"nginx", "ports"=>[{"containerPort"=>80}]}]}}}}

Valid, cool. Now look at the annotations.

2.6.0 :003 > a["metadata"]["annotations"]
 => {"foo"=>"bar"}
2.6.0 :004 >

The original deployment.yaml file had a duplicate annotation, which most YAML parsers happily accept. You are basically saying “the key foo has value bar” and repeating it twice. Not too bad, except that the duplicate disappears once the file is parsed.

Things can be a little bit more fun though:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
  annotations:
    foo: "bar"
    foo: "baz"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

Now we have the same key with two different values. Let’s apply this to Kubernetes with kubectl and see what we have in the cluster:

kubectl get deployments nginx-deployment -oyaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    foo: baz

[CUT]

Fun: there’s no sign of the value “bar”. This means that if for any reason you have duplicate keys, you will not get an invalid-YAML error; the last value silently overwrites the previous ones.

This case seems rare and again not too terrible, but there are more similar cases:

% cat deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
  annotations:
    foo: "bar"
    foo: "baz"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        env:
          - name: foo
            value: bar
        env:
          - name: foo
            value: baz

The YAML file above has a duplicate env. Let’s kubectl apply it and look at the env:

 spec:
      containers:
      - env:
        - name: foo
          value: baz

Fun, isn’t it? Now imagine this “problem” over thousands of templated lines…

Please validate your YAML a lot

What is allowed by your YAML parser is not always what you want to do. Config changes are still a major source of outages, production issues and generally unexpected behaviors. I’m not going to say that YAML was a bad idea for Kubernetes resources, because that would require a much more complicated and detailed discussion, but for sure, if you want to use YAML files to configure your applications and infrastructure, there is a lot that you should be doing.

Validate your files. If you render them with a tool, validate the rendered files. If YAML is your source of truth, take care of it. Don’t generate and apply on the fly. And maybe try to not abuse YAML too much… I’m liking cue these days, but that’s a topic for another time.
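
For example, as far as I know gopkg.in/yaml.v3 follows the spec and rejects duplicate mapping keys, so even a tiny Go program can work as a cheap pre-apply check (a sketch, assuming the file is called deployment.yaml):

package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

func main() {
	data, err := os.ReadFile("deployment.yaml")
	if err != nil {
		panic(err)
	}
	var doc map[string]interface{}
	// yaml.v3 should refuse duplicate mapping keys, failing with something like:
	//   yaml: mapping key "foo" already defined at line 8
	if err := yaml.Unmarshal(data, &doc); err != nil {
		fmt.Fprintln(os.Stderr, "invalid YAML:", err)
		os.Exit(1)
	}
	fmt.Println("no duplicate keys found")
}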

Staging

“Death to staging”. “Staging lies”. “Staging is worse than ‘it works on my machine’”. “WTF!”

I’ve heard those a lot. And, in a way, they are all true. A snowflake staging environment is bad. The reason? Production has its own unique characteristics in terms of topology, data and a million other details, from the uptime of the machines to pretty much anything else. Put like that, no environment other than production makes sense. But I believe this way of seeing things misses a key point.

On staging, the word

The word “staging” triggers bad feelings, and for that reason it should not be used lightly in a discussion if we know that not all participants are on the same page. Staging triggers all the feelings described in the introduction of this blogpost. Its definition, however, is not as bad:

2 a stage or set of stages or temporary platforms arranged as a support for performers or between different levels of scaffolding

And the definition of “staging area” is even better:

a stopping place or assembly point en route to a destination

Staging, the word, is like “legacy”: supposed to have a relatively positive meaning, but left with a lot of negative ones, mostly because of the scars we all carry from operating systems in production. But it’s really just a stopping place en route to a destination.

A stopping place en route to a destination

Staging is not the final destination. Staging is a stop in the journey to build confidence in the quality of a change. That’s the key here: confidence in a change is built incrementally. We start when we write unit tests: we learn whether a few units, in isolation, work on our machine. Then we push those and we learn whether they work in CI. Then we write integration tests, we do manual end to end testing and so on. We keep moving changes forward until we reach production. How we reach production is completely up to us: slowly over the course of days, or all at once. By testing changes in a different environment first, or not. By feature flagging, or not. There are a lot of combinations and different needs, but confidence in a change is an incremental process that tends to 100% only when the change is 100% enabled in production and serving traffic. How we combine those strategies depends on the needs of the application and the cost of testing, and it’s ultimately a matter of tradeoffs that vary case by case.

Staging is dead…

The “staging environment” as the one environment alternative to production, in its own snowflake configuration, is almost useless and should be avoided. It will likely give a false sense of confidence in changes, especially if not approached with the right care. As a time and money investment it is probably the wrong one: I’d rather invest in getting changes safely to production and testing actively in production with things like feature flags. But it can’t be denied that with non-snowflake pre-production environments, we gain the possibility to test changes in an integrated environment before hitting production. And that’s good, as long as we understand that testing in staging will not catch all possible bugs. Note that there are environments so complex that it is close to impossible to build a comprehensive staging environment for them. In those cases, maybe a staging environment is not the right thing. And maybe it’s also worth questioning whether we really need all of that complexity.

… long live dynamic pre-production environments.

Stigmatizing words is something that happens pretty frequently in the tech world. When it happens, we adapt our language so that we are not misunderstood and so that we get the best out of the conversations we have. Assuming we are not misunderstood, what we want is clear: to gradually increase our confidence in changes, removing all the possible unknowns. Testing and conducting experiments in production is key. Using non-production environments to test those changes is important as well, and those tests are unlikely to backfire if the environments are not snowflakes and if we approach them with the assumption that no environment other than production is identical to production.