Writing modern CI/CD Pipelines with Azure DevOps
Azure DevOps is Microsoft's solution for managing the full software development life cycle (SDLC). It contains everything a team needs to build outstanding products: an issue tracker, dashboards and reporting, source control with an advanced editor, artifacts, test management, and (crucial for this article) a system to manage builds and deployments. This system for managing software builds and releases is called "Azure Pipelines".
Areas
Before we dive into Azure Pipelines as a product we should have a look at what makes Azure DevOps so appealing. Surely, there are many products out there that cover the same spectrum, such as GitHub Enterprise, GitLab, and the Atlassian stack. What's the unique selling point of Azure DevOps? Arguably, the main advantage lies in the great integration with Azure. While Azure DevOps works with any kind of deployment target (even the mobile app stores), the integration with Azure services is where it works best. We will see that later with some examples such as secret management.
While Azure DevOps is a compelling product "out of the box", i.e., even in the simplest configuration with a free plan, there are multiple additions that can be made. Some of them require a paid subscription. An example of such an addition is test management. Besides these official additions the whole SaaS product can be enriched via extensions. They make it possible to get, e.g., a better dashboard experience by adding useful widgets to Azure Boards. They also include things like improved pull request management in Azure Repos or overview pages for Azure Pipelines.
Artifacts
One of the areas that stands out is Azure Artifacts. This is a single place to manage packages such as published npm or NuGet packages. While there are official and other third-party services that offer private package feeds, the integration in Azure DevOps is very appealing. Here, we do not have to worry much about expiring keys for publishing or other integration topics. Since Artifacts is part of Azure DevOps we can directly publish from our Azure Pipelines. And the best part is that we can create as many feeds as we want to. We have up to one feed that can be shared with the whole organization and then at least one feed per project.
Talking about organizations and projects may raise a common question: How is Azure DevOps actually set up? The following diagram illustrates that briefly. We can create a subscription by creating a new organization. In advanced plans, this organization may be coupled to an Active Directory, too. The organization then has members, which can be given rights on the organization level, such as the ability to create projects or install extensions. A project is the host of all the Azure DevOps functionality. This includes the areas we've discussed earlier: Azure Boards, Azure Repos, Azure Pipelines, and others. A project can be public or private. Public access only grants visibility and read-only access. In any case the internal access can be determined, too. Members of the organization can be given roles and permissions.
One of the best things about Azure DevOps is the great permission model, which has some rich functionality for creating reusable groups or really fine-grained access across whole teams. This alone could fill a whole article. But let's come back to the topic: Azure Pipelines!
Pipelines
Azure Pipelines is not a new product. Actually, it goes back to the on-premises Microsoft Team Foundation Server (TFS) product. With Visual Studio Online it saw an enhanced integration with Azure, where the (managed or integrated) build agents are running. Still, build agents can be customized and placed at any location. Usually, though, we are fine with the provided agents. Over the years the performance and capabilities of Pipelines have been enhanced significantly. As we will see, the system that originally followed Microsoft's approach of making everything accessible via some GUI has been replaced by a system that follows open standards and is now more flexible than ever.
One of the secrets that makes Azure Pipelines so successful is that it can truly handle a lot of load. Even better, without any additional cost we can use Azure Pipelines - even for proprietary software. The only limits are the available agents (there is a priority for paid subscriptions) and the number of currently running concurrent jobs. Even with free agents we might need to queue up if we run too many jobs in parallel.
Build vs Releases
In recent years Azure Pipelines has seen many very fruitful changes. Originally, Azure Pipelines was divided into the following concepts:
- Build jobs
- Release definitions
- Task groups
- Libraries
As we will see, only libraries and build jobs are necessary to define truly reusable and flexible pipelines these days. Nevertheless, and not only for historical reasons, it makes sense to look into the full spectrum first.
Build jobs are a way of defining how software is produced. A typical build job may contain everything to create and verify an artifact. It should not deploy anything directly, but could (and should) publish an artifact. The artifact may then be used in a release definition, which contains different triggers.
This allows reusing the build job without any changes for, e.g., pull request validation. In contrast, the release definition would be only used when certain branches changed. Here, an actual release will be triggered.
Below we see how a release definition with multiple stages may look in the classic approach.
The split of builds and releases is also useful for things besides pull request validation. This allows a quick and easy rollback mechanism. Since the build already happened we have retained build artifacts that can be released independently of any long running build jobs.
Reusability
What are the other bullet points good for? Well, so far we only covered what we want to do - but not exactly how. In order to bring in some reusability we need to have proper mechanisms for that. Grouping certain behavior that is used inside a build job or release definition was classically done via a task group. Task groups allow composing tasks for use in build jobs and release definitions.
Task groups are composed using the graphical editor, too. There is an underlying JSON format, which is often very useful. However, there is no easy way to apply changes from a manipulated JSON file. The easiest way would be using the API. In the web app we would need to create a new task group (using a new name) to upload the changed task group's JSON content.
Likewise, if we want to have reusable configurations we use libraries. One great aspect of the library feature is that it can seamlessly work against an Azure Key Vault. This enables a central place for sharing environment secrets that may be used for deploying or pre-configuration of created artifacts.
The following image shows how a connected library links the secrets from Azure Key Vault. The Azure Subscription and the Azure Key Vault name can be set directly in the UI.
An example of a classic project can be found at florianrappl/typescript-fullstack-sample.
Modern Pipelines
Modern pipelines are no longer created in graphical editors even though this can still be done. Instead, a special file called azure-pipelines.yml is added to the root of a repository. If this file is found then a pipeline is automatically created (or updated). The format of this file is YAML.
What is YAML
YAML is a recursive acronym that means "YAML Ain't Markup Language". The official spec is available at yaml.org. It describes itself as a
[...] human friendly data serialization standard for all programming languages.
YAML is another simplification in the data format, where we could naturally progress XML to JSON to YAML. Though this conclusion has some flaws it can be helpful to place YAML somewhere on our mind map.
To make the format human friendly the significant parts of the language have been designed to be very readable. Consequently, YAML is a language that relies on significant whitespace. While this comes with drawbacks, it has a visual appeal that makes it ideal for declarative coding. For creating instructions in a declarative way we can leverage YAML really well. This is hardly news: most CI/CD services, for instance, let us define their automated pipelines in YAML.
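As a small illustration of the step from JSON to YAML, the sketch below writes the same made-up data twice in one YAML stream: first in flow style, which is plain JSON syntax, and then in block style, where indentation carries the structure. The keys and values are purely illustrative and not taken from any real pipeline.

```yaml
# Document 1: flow style. This line is valid JSON, which YAML 1.2 parsers
# are designed to accept as well.
{"name": "my-service", "ports": [80, 443], "debug": false}
---
# Document 2: block style. Same data, structure expressed via indentation.
name: my-service
ports:
- 80
- 443
debug: false
```

Both documents should load to the same mapping in a YAML 1.2 parser; the block style is the one we will meet in pipeline definitions.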
YAML in Azure Pipelines
An example for a YAML declaration in Azure DevOps looks as follows:
    pool:
      vmImage: 'ubuntu-16.04'

    steps:
    - task: NodeTool@0
      inputs:
        versionSpec: $(node_version)
    - script: npm install
As with JSON, there is also a schema for YAML that defines which keys are expected or allowed, and what kind of values they may contain.
Before we dive into the technical side of the Azure Pipelines YAML schema we should have a look at the hierarchy defined by a pipeline YAML:
- Pipeline
  - Stage A
    - Job 1
      - Step 1.1
      - Step 1.2
      - ...
    - Job 2
      - ...
  - Stage B
    - ...
According to the official specification the following terms are defined:
- A pipeline is one or more stages that describe a CI/CD process. Stages are the major divisions in a pipeline. The stages "Build this app," "Run these tests," and "Deploy to pre production" are good examples.
- A stage is one or more jobs, which are units of work assignable to the same machine. You can arrange both stages and jobs into dependency graphs. Examples include "Run this stage before that one" and "This job depends on the output of that job."
- A job is a linear series of steps. Steps can be tasks, scripts, or references to external templates.
A pipeline does not need to make use of the full hierarchy. Instead, if only a series of steps is required, then there is no need to define jobs or stages.
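To make the hierarchy concrete, here is a minimal sketch of a pipeline that spells out all three levels. The stage and job names, the scripts, and the `ubuntu-latest` image are assumptions chosen for illustration; they do not come from a real project.

```yaml
# Sketch: a pipeline using the full hierarchy (all names are illustrative).
stages:
- stage: Build
  jobs:
  - job: BuildApp
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    - script: npm install
      displayName: 'Install dependencies'
    - script: npm test
      displayName: 'Run tests'

- stage: Deploy
  dependsOn: Build          # start only after the Build stage has completed
  jobs:
  - job: DeployApp
    pool:
      vmImage: 'ubuntu-latest'
    steps:
    - script: echo "Deploying..."
      displayName: 'Placeholder deployment step'
```

The earlier example with only pool and steps is the short form of the same idea: the single implicit stage and job are simply left out.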
Coming back to the topic of the Azure Pipelines YAML schema we see the following key regions:
- trigger to set if the pipeline should be running when branches changed
- pr to set if the pipeline should be running for pull requests
- schedules to set if the pipeline should be running at fixed times
- pool to define where the build agent should be running
- resources to define what other resources your pipeline needs to consume
- variables to define variables and their values
- parameters to define input parameters for the pipeline template
- stages to define the stages (consists of jobs)
- jobs to define the different jobs (consists of steps)
- steps to define what steps to take
- template to allow referencing other (YAML) files
There are a couple more, but these give us the most important parts to get everything done.
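The following sketch combines several of these keys in one file. Branch names, the cron expression, and the variable value are made up for illustration, so treat it as a starting point rather than a recommended configuration.

```yaml
# Sketch of the top-level keys (branch names, schedule, and values are illustrative).
trigger:
  branches:
    include:
    - master
    - release

pr:
  branches:
    include:
    - master

schedules:
- cron: '0 3 * * *'          # every day at 03:00 (cron schedules use UTC)
  displayName: Nightly build
  branches:
    include:
    - master
  always: false              # only run if the source changed since the last successful scheduled run

pool:
  vmImage: 'ubuntu-latest'

variables:
  node_version: '12.x'

steps:
- task: NodeTool@0
  inputs:
    versionSpec: $(node_version)
- script: npm install
```

One caveat worth keeping in mind: for repositories hosted in Azure Repos, pull request validation is configured through branch policies rather than the pr key, which applies to externally hosted repositories such as GitHub.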
Maintainable Pipelines
The key for writing maintainable Azure Pipelines YAML files lies in the resources specifier. This one allows us to also consider other repositories as input artifacts. That way we can "just obtain" another repository, which brings in additional files.
In practice this all starts with a file like the following:
    # azure-pipelines.yml
    resources:
      repositories:
      - repository: pipeline
        type: git
        name: my-pipelines

    trigger:
    - develop
    - release
    - master

    stages:
    - template: deploy-service.yml@pipeline
      parameters:
        serviceName: my-test-service
Here we already refer to a repository, later on called pipeline, which is named my-pipelines. Because the name carries no project prefix, this repository must be available in the same project as the pipeline on Azure DevOps; a repository in another project of the same organization would be referenced as project-name/my-pipelines.
Instead of specifying the different stages directly we refer to a template. This allows us to define the full content of the stages in another file. We pick the deploy-service.yml file coming from the pipeline resource. As we defined it, the pipeline resource refers to our my-pipelines repository.
Looking at the file we see no big surprises here. A standard YAML file, right? But wait - we define the parameters in here, too.
    # my-pipelines/deploy-service.yml
    parameters:
      pool:
        name: my-pool
        demands:
        - TYPE -equals node
        - ID -equals 0
      serviceName: ''

    stages:
    - stage: Build
      pool: ${{ parameters.pool }}
      jobs:
      - template: /build/npm-docker-build.yml

    - template: /deploy/deploy-stages.yml
      parameters:
        pool: ${{ parameters.pool }}
        serviceName: ${{ parameters.serviceName }}
The parameters give us a convenient way to define what needs to be known (but flexible) inside the YAML definition. Using the ${{ }} replacement syntax we can refer to the given value of these parameters later.
The definition of the pool parameter is interesting. To avoid any misuse here we restrict the matching agents using a demands section. The default pool is given via name.
Other than that we see that template keys are used again to fill some other subsections. While the build stage is fully defined (except the contained jobs), the additional publishing stages are all defined in a different file located at /deploy/deploy-stages.yml. Notice that we did not define a resource here, nor did we reference the file from a resource.
The reason is simple: This file is still in the same repository as the referrer deploy-service.yml. Let's take a look at another referenced file here, the definition of the build job via npm-docker-build.yml.
The file is pretty straightforward - it just contains a number of steps. For reusability reasons, every step comes from a different YAML file.
    # my-pipelines/build/npm-docker-build.yml
    jobs:
    # Job names may only contain letters, digits, and underscores
    - job: NPM_Docker_Build
      displayName: NPM Docker Build
      variables:
      - group: my-keyvault-vars
      - template: /vars/common.yml
      steps:
      - template: /build/tasks/npm-build.yml
      - template: /build/tasks/npm-test.yml
      - template: /build/tasks/keyvault-verify.yml
      - template: /build/tasks/docker-build.yml
      - template: /build/tasks/docker-push.yml
At this point the composition pattern is quite clear. The given pipeline-as-code description is highly efficient and - thanks to git as a version control system - very resilient against things like switching Azure DevOps organizations. It should also let people who are afraid of data loss sleep well at night.
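The individual step templates are not shown in this article, so the following is only a guess at what the smallest of them, /build/tasks/npm-build.yml, might contain. It assumes the service repository has a package.json with a lockfile and a "build" script; both are assumptions, not facts from the original project.

```yaml
# my-pipelines/build/tasks/npm-build.yml
# Hypothetical content; the original file is not shown in this article.
steps:
- script: npm ci
  displayName: 'NPM: Install dependencies'
- script: npm run build
  displayName: 'NPM: Build'
```

Keeping each template this small is what makes the composition work: a job file only lists which building blocks it needs, and a fix in one block reaches every pipeline that references it.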
Defining variables also does not need to be done via libraries. Instead, we can leverage features such as defining all the variables via YAML, too.
In the code above we referred to /vars/common.yml for the whole variables section. Let's see how this looks in practice:
    # my-pipelines/vars/common.yml
    parameters:
      additionalVariables: {}

    variables:
      # Build trigger for forced builds
      ${{ if and(eq(variables['Build.Reason'], 'Manual'), notIn(variables['Build.SourceBranch'], 'refs/heads/develop', 'refs/heads/release', 'refs/heads/master')) }}:
        VERSION: '-manual'
        ENV_SUFFIX: '-test'
        ENV_NAME: 'test'

      # Build trigger for pull request
      ${{ if eq(variables['Build.Reason'], 'PullRequest') }}:
        VERSION: '-pr'
        ENV_SUFFIX: '-test'
        ENV_NAME: 'test'

      # Build trigger when branch "release" is merged
      ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/release') }}:
        VERSION: '-beta'
        ENV_SUFFIX: '-stage'
        ENV_NAME: 'stage'

      # Define some standard variables
      IMAGE_NAME: '$(Build.Repository.Name)'
      IMAGE_TAG: '$(Build.BuildId)$(VERSION)'

      # Insert additional variables
      ${{ insert }}: ${{ parameters.additionalVariables }}
The additionalVariables parameter allows us to bring in additional variables, if wanted.
Beyond that, the code above demonstrates how to
- include some fixed variables in case of a specific (manual) build
- include some fixed variables for a pull request
- include some fixed variables for a predefined branch
- include some variables based on existing variables (or, more generally, on parameters, too)
- include the additional variables via a custom spread operation using ${{ insert }}.
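To show the caller's side of the additionalVariables parameter, here is a minimal sketch of a job that passes two extra variables into the variables template. The job name and the variables CHART_VERSION and LOG_LEVEL are invented for illustration, and the absolute template path assumes this job file lives in the same repository as /vars/common.yml.

```yaml
# Sketch: a job that passes extra variables to the variables template.
# CHART_VERSION and LOG_LEVEL are made-up names for illustration.
jobs:
- job: Example
  variables:
  - template: /vars/common.yml
    parameters:
      additionalVariables:
        CHART_VERSION: '1.0.0'
        LOG_LEVEL: 'debug'
  steps:
  - script: echo "Image $(IMAGE_NAME):$(IMAGE_TAG), chart $(CHART_VERSION)"
    displayName: 'Show resolved variables'
```

Since the template spreads the mapping into its own variables section, the extra entries should be usable with the same $(NAME) syntax as the standard ones.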
The great thing is that this brings builds and releases full circle.
Where the view was formerly split (for good reason, as mentioned), it is now unified.
We see directly why the release is where it is and what has been done to create the artifact that was then pushed to the different release stages.
Approval Gates
Finally, in our stages we want to be able to define environments and approval steps, too. We start by defining an environment. Ideally, this is backed by a virtual machine or an AKS cluster. Otherwise, we can still use just an "empty" environment, where we do not add any resources.
Now we only need to reference the environment. This can be done via the environment's name.
The following YAML shows what such a fragment might look like for deploying a Helm chart to AKS.
    parameters:
      serviceName: ''
      namespace: 'test'
      envName: 'test'

    jobs:
    - deployment: 'HelmDeploy${{ parameters.envName }}'
      displayName: 'Helm Deploy'
      variables:
      - template: /vars/common.yml
        parameters:
          serviceName: ${{ parameters.serviceName }}
      environment: ${{ parameters.envName }}
      strategy:
        runOnce:
          deploy:
            steps:
            - checkout: self
            - script: helm repo update
              displayName: 'Helm: Update Helm Charts'
            - task: HelmDeploy@0
              displayName: 'Helm: Deployment'
              inputs:
                namespace: ${{ parameters.namespace }}
                command: upgrade
                chartName: 'test/my-service-chart'
                releaseName: ${{ parameters.serviceName }}
Through the environment settings the gate approvals are determined without any fiddling in the intrinsic build steps. This is quite nice as it also allows general administrators to adjust the approval conditions without touching the code. The separation is clean and straightforward.
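The stage template that consumes such a deployment fragment is not shown in this article. A possible shape for /deploy/deploy-stages.yml is sketched below; the fragment path /deploy/jobs/helm-deploy.yml, the stage names, and the branch condition are all assumptions made for illustration.

```yaml
# my-pipelines/deploy/deploy-stages.yml
# Hypothetical sketch; the real file is not shown in this article.
parameters:
  pool: {}
  serviceName: ''

stages:
- stage: DeployTest
  dependsOn: Build
  pool: ${{ parameters.pool }}
  jobs:
  - template: /deploy/jobs/helm-deploy.yml   # assumed path of the fragment above
    parameters:
      serviceName: ${{ parameters.serviceName }}
      namespace: 'test'
      envName: 'test'

- stage: DeployStage
  dependsOn: DeployTest
  # Only continue for the release branch.
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/release'))
  pool: ${{ parameters.pool }}
  jobs:
  - template: /deploy/jobs/helm-deploy.yml
    parameters:
      serviceName: ${{ parameters.serviceName }}
      namespace: 'stage'
      envName: 'stage'
```

If an approval check is configured on the "stage" environment, the second stage would wait for that approval before its deployment job starts, without any approval logic appearing in the YAML itself.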
Next Steps
Now that we know everything for building modern pipelines as code with Azure Pipelines, the key question is: what's next? There are two points I'd love to leave you with. The first one is that the new way allows defining reusable environments. Earlier on, only release definitions could define "stages", which would be used similarly to environments. Reusability was ensured via templates, but for a number of reasons that was an insufficient mechanism.
Environments
Environments solve this and more. They allow coupling to actual resources, e.g., AKS, which is a managed control plane for Kubernetes (K8s). Using environments the Azure hosted K8s resources can be seen and inspected directly. This makes it possible to get on a single screen all the running pods, their status, and even log file output.
The image below shows how environments allow inspection of running resources in an AKS. The coupling to the previous build job delivers useful insights. The ability to retrieve logs from a single place is amazing.
The second thing I'd like to share is the ability to automatically create projects using these pipelines. Having the whole pipeline definition in code makes such scaffolding quite simple. Earlier on we needed to create all resources (e.g., build jobs, release definitions, ...) either
- via some custom tooling,
- via the given command line tooling, or
- via a dedicated pipeline that accepts input parameters.
All ways would use the Azure DevOps API.
Still, even though build jobs and release definitions have now been comfortably moved out of the equation, we may still want to use these tools to bring in some consistency. This way we could create things like branches, their policies, and a given coding boilerplate without much trouble.
Focusing on the branch policies we could use the official Azure CLI extension for Azure DevOps to bring this functionality in.
The command can be as simple as:
    az repos policy create --config ./my-policy.json
The unfortunate part here is that the whole payload of the body must be configured via a file (here my-policy.json). Let's see an example where we add a (manual) build job to the branch policies for the main branch.
    {
      "isBlocking": true,
      "isDeleted": false,
      "isEnabled": true,
      "revision": 1,
      "settings": {
        "buildDefinitionId": 22,
        "displayName": "Manual Queue Policy",
        "manualQueueOnly": true,
        "queueOnSourceUpdateOnly": false,
        "scope": [
          {
            "matchKind": "Exact",
            "refName": "refs/heads/main",
            "repositoryId": "the-repo-id"
          }
        ],
        "validDuration": 0
      },
      "type": {
        "displayName": "Build",
        "id": "0609b952-1397-4640-95ec-e00a01b2f659"
      }
    }
The build policy type id and most of the settings should remain as-is. The build definition id must be updated to refer to the right build definition.
In this post we've seen how easy it is to create reusable pipelines with Azure Pipelines. The whole experience of Azure DevOps is potentially not the best in every subarea, but combined together it truly excels. The best part is that integration of all the different tools (e.g., CI/CD with private package feeds and issue tracking) is already done and just works.