GitHub Workflows Intro¶
- GitHub workflows are for CI/CD in GitHub, similar to GitLab pipelines.
- The name for the platform as a whole is GitHub Actions (although the two terms are often used interchangeably).
- There is similar external CI/CD software (e.g. CircleCI), but having workflows close to the code is helpful for better integration.
- At HOT we also use GitHub Container Registry to store container images.
Concepts¶
There is comprehensive official documentation of GitHub Actions / Workflows.
Important Info:
- There are 3 levels of hierarchy: steps --> jobs --> workflows.
  - A step is the lowest level, typically running the actual code in a bash shell.
  - A job encapsulates many steps to achieve an end outcome.
  - A workflow encapsulates many jobs.
- Each job in a workflow runs on a separate machine (runner), with different packages, containers, cache, etc.
- Typically workflows run on the ubuntu-latest image. Unless the job is something simple, it is generally better practice to run it inside a specific container (for consistency).
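As a rough sketch of running a job inside a specific container rather than directly on the runner image: the job name test and the python:3.12-slim image below are illustrative assumptions, not something this repo prescribes.

```yaml
jobs:
  test:
    # The runner (virtual machine) that hosts the container
    runs-on: ubuntu-latest
    # Steps below execute inside this container instead of directly on the runner
    container:
      image: docker.io/library/python:3.12-slim
    steps:
      - name: Check the Python version provided by the container
        run: python --version
```

Pinning the image tag means the job sees the same toolchain on every run, regardless of what ubuntu-latest currently ships.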
Types¶
Local Repo Workflows¶
- Located under .github/workflows/workflow_name.yml.
- Specific to your repo. Run based on set criteria.
Reusable Local Workflows¶
- Also located under .github/workflows/workflow_name.yml.
- Use the workflow_call trigger, meaning they can only be called from another parent workflow.
- Useful if you repeat the same job multiple times, i.e. DRY code.
Reusable Remote Workflows¶
- Located under .github/workflows/workflow_name.yml, but within another Git repository.
- A centralised location can be used to store workflows to be called from other repos.
- Exactly what this repo is!
GitHub Actions¶
- A single unit of work packaged up for use / calling from within another workflow (an Action is called as a step inside a job).
- There is a lot of overlap here with reusable workflows - they operate in a very similar way.
- The file structure is defined slightly differently (an action.yml file rather than a workflow file), but the important thing to note is that typically a single Action is defined per published repo.
- HOT does not maintain any Actions as of now, but we use plenty of official and unofficial Actions within our workflows.
Triggers¶
There are quite a few triggers. The main ones to note are below.
push¶
- A standard push to the repo.
- Useful for deployment & publishing workflows.
- Workflow runs on the branch you merged into (e.g. main, development).
- Can specify which branch, or which files to trigger on.
- Also covers when a PR is merged (this is essentially a push too).
- Includes tag pushes.
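A minimal sketch of a push trigger restricted by branch and by changed files. The branch names come from the example above; the src/frontend path is a hypothetical filter chosen for illustration.

```yaml
on:
  push:
    # Only run for pushes (including merged PRs) to these branches
    branches:
      - main
      - development
    # Only run when at least one changed file matches these patterns
    paths:
      - "src/frontend/**"
```

When both branches and paths are given, both filters have to match for the workflow to run.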
pull_request¶
- Runs when a PR is made.
- Useful for testing workflows.
- Runs on the PR source branch (i.e. the one you want to merge).
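A minimal sketch of a testing workflow triggered by pull requests. The job name pytest and the placeholder test command are assumptions; substitute the real test suite.

```yaml
on:
  pull_request:
    # For pull_request, this filter matches the branch the PR targets
    branches:
      - main

jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      # Check out the PR code so it can be tested
      - uses: actions/checkout@v4
      - name: Run tests (placeholder command)
        run: echo "run the test suite here"
```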
pull_request_target¶
- Runs when a PR is made, BUT runs on the target branch instead.
- For feat/some-feat --> main, this would be the main branch.
- Gives some extra permissions to the workflow, so use with caution.
- We only use this for the PR Label workflow, as it allows for labelling of the PR, even if the PR author does not have repo permissions.
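A rough sketch of what a labelling workflow using this trigger can look like. It is not necessarily how the PR Label workflow in this repo is written; it assumes the official actions/labeler Action and a .github/labeler.yml config file in the calling repo.

```yaml
on:
  pull_request_target:

jobs:
  label:
    runs-on: ubuntu-latest
    # Grant only the permissions labelling needs
    permissions:
      contents: read
      pull-requests: write
    steps:
      # Applies labels based on the rules in .github/labeler.yml (assumed to exist)
      - uses: actions/labeler@v5
```

Because the workflow runs with the target branch's permissions, it is safest to avoid checking out or executing code from the PR branch in a job like this.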
workflow_call¶
- Runs when called from another workflow.
- What we use mostly throughout this repo, to make 'reusable' workflows.
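A minimal sketch of a workflow made reusable through workflow_call. The file name say_hello.yml, the name input and the hello job are hypothetical, chosen only to show the shape of the definition.

```yaml
# .github/workflows/say_hello.yml (hypothetical reusable workflow)
on:
  workflow_call:
    inputs:
      name:
        description: "Who to greet."
        required: true
        type: string

jobs:
  hello:
    runs-on: ubuntu-latest
    steps:
      - name: Greet
        run: echo "Hello, ${{ inputs.name }}"
```

Each input declares a type (string, number or boolean), and callers supply the values through the with key.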
Using Reusable Workflows¶
You need two keys to run a reusable workflow: uses and with.
Example:
frontend-build:
  uses: hotosm/gh-workflows/.github/workflows/image_build.yml@main
  needs: [frontend-tests]
  with:
    context: src/frontend
    dockerfile: prod.dockerfile
    build_target: prod
    image_name: ghcr.io/${{ github.repository }}/frontend
    extra_build_args: |
      APP_VERSION=${{ github.ref_name }}
      COMMIT_REF=${{ github.sha }}
      VITE_API_URL=${{ vars.URL_SCHEME }}://${{ vars.API_URL }}
As you can see, uses defines the workflow you want to run.
The version of the workflow can be specified after the @ symbol. It is bad practice to use @main. Generally you should release tagged versions of the reusable workflows, then specify the tag, e.g. @0.1.2.
The with key is used to specify all of the inputs to pass to the workflow (these are predefined by the creator of the reusable workflow).
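A short sketch of the same call pinned to a tag. The tag 0.1.2 is only the illustrative value from the text above; check the repo's releases for a tag that actually exists.

```yaml
jobs:
  frontend-build:
    # Pinned to a tagged release instead of the moving main branch
    uses: hotosm/gh-workflows/.github/workflows/image_build.yml@0.1.2
    with:
      context: src/frontend
      dockerfile: prod.dockerfile
      build_target: prod
      image_name: ghcr.io/${{ github.repository }}/frontend
```

Pinning means a change to the reusable workflow only reaches your repo when you deliberately bump the tag.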
Using Secrets in Reusable Workflows¶
By default, reusable workflows will not have access to secrets unless they are passed in by the calling workflow.
There are two ways to do this:
- Specify secrets individually:
  jobs:
    pytest:
      uses: hotosm/gh-workflows/.github/workflows/some_workflow.yml@main
      secrets:
        SECRET_VAR_1: ${{ secrets.SECRET_VAR_1 }}
        SECRET_VAR_2: ${{ secrets.SECRET_VAR_2 }}
- Inherit all secrets (recommended):
  jobs:
    pytest:
      uses: hotosm/gh-workflows/.github/workflows/some_workflow.yml@main
      secrets: inherit
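For completeness, a sketch of the other side of the call: how a reusable workflow might declare and consume a secret passed in individually. The job and step contents are hypothetical; only the secret name is taken from the example above.

```yaml
# some_workflow.yml (reusable workflow side; job and step contents are hypothetical)
on:
  workflow_call:
    secrets:
      SECRET_VAR_1:
        description: "A secret passed in by the caller."
        required: true

jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      - name: Use the secret via an environment variable
        env:
          SECRET_VAR_1: ${{ secrets.SECRET_VAR_1 }}
        run: |
          # Never echo the secret itself; only check that it is set
          test -n "$SECRET_VAR_1" && echo "SECRET_VAR_1 is set"
```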
Passing Info Between Workflows¶
As noted above, each reusable workflow runs on a different machine (runner).
There are various ways to pass information between different workflow jobs.
Environment Variables¶
Any variable written to $GITHUB_ENV will be available to all subsequent steps in the same job. It is not shared with other jobs, which run on separate runners.
Set a variable:
echo "README_PATH=./README.md" >> $GITHUB_ENV
Read a variable:
# Read the content of the file
cat $README_PATH
# Alternative syntax
cat ${{ env.README_PATH }}
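Put together as a minimal sketch, setting the variable in one step and reading it in a later step of the same job. The job name readme is hypothetical, and a README.md is assumed to exist at the repo root.

```yaml
jobs:
  readme:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set the variable
        run: echo "README_PATH=./README.md" >> "$GITHUB_ENV"
      - name: Read the variable in a later step
        run: cat "$README_PATH"
```

The variable is not visible in the step that writes it, only in the steps that follow.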
Outputs¶
Outputs can be defined on a step level, job level, and workflow level.
Step Output¶
Typically the output is defined by code within a step:
# ... some code to determine var_content
echo "var1=${var_content}" >> $GITHUB_OUTPUT
This can be read in another step (using the step id):
steps:
  - id: first-task
    name: Set the var
    run: |
      echo "var1=${var_content}" >> $GITHUB_OUTPUT
  - name: Read the var
    run: |
      echo ${{ steps.first-task.outputs.var1 }}
Job Output¶
Outputs defined in a step can be set as job outputs, making them available to other jobs in the workflow.
jobs:
  build-image:
    runs-on: ubuntu-latest
    outputs:
      image_name: ${{ steps.get_image_name.outputs.image_name }}
      image_tag: ${{ steps.get_image_name.outputs.image_tag }}
This can be read by another job in the workflow. To use the output from a job, we have to use the needs keyword to create a link between the two jobs:
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      var1: ${{ steps.first-task.outputs.var1 }}
    steps:
      - id: first-task
        name: Set the var
        run: |
          echo "var1=${var_content}" >> $GITHUB_OUTPUT
  job2:
    runs-on: ubuntu-latest
    needs: [job1]
    steps:
      - name: Get the var
        run: |
          echo "${{ needs.job1.outputs.var1 }}"
Workflow Output¶
- Outputs can also be passed up to the workflow level.
- This is especially useful for reusable workflows.
- We can define a standard output for the workflow, then pass the value on to our next called workflow.
As with passing data between jobs, we also have to use the needs keyword to pass data between workflows.
# some_workflow.yml
on:
  workflow_call:
    outputs:
      var1:
        description: "The test var."
        value: ${{ jobs.job1.outputs.var1 }}
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      var1: ${{ steps.first-task.outputs.var1 }}
    steps:
      - id: first-task
        name: Set the var
        run: |
          echo "var1=${var_content}" >> $GITHUB_OUTPUT
We can then use this in our parent workflow:
do-something:
  uses: hotosm/gh-workflows/.github/workflows/some_workflow.yml@main
do-something-else:
  uses: hotosm/gh-workflows/.github/workflows/some_workflow2.yml@main
  needs: [do-something]
  with:
    # Run workflow with a variable output from previous workflow
    some_variable: ${{ needs.do-something.outputs.var1 }}
Cache¶
Caching is used for persisting data across multiple workflow runs.
Example flow:
- Workflow pulls a container image to run a task.
- This workflow runs the first time on a pull request.
- The container image is cached.
- The developer makes some edits and pushes new commits to the PR.
- The cache is hit for the PR, so the container image does not download again.
As you can see, this is a useful efficiency gain for workflows running many times consecutively.
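Caches are scoped to a branch: a cache created within a PR is not available to main or to other PRs, although a PR run can restore caches created on its base branch or the default branch.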
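A minimal sketch of explicit caching with the official actions/cache Action. It caches pip downloads rather than a container image, purely for illustration; the job name install and the presence of a requirements.txt file are assumptions.

```yaml
jobs:
  install:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cache pip downloads
        uses: actions/cache@v4
        with:
          # Directory to save at the end of the job and restore on later runs
          path: ~/.cache/pip
          # The key changes whenever requirements.txt changes, invalidating the cache
          key: pip-${{ runner.os }}-${{ hashFiles('requirements.txt') }}
          # Fall back to the most recent cache with this prefix on a key miss
          restore-keys: |
            pip-${{ runner.os }}-
      - name: Install dependencies
        run: pip install -r requirements.txt
```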
Artifacts¶
Artifacts are used for persisting files within a single workflow run.
Example flow:
- Workflow 1 creates a file.
- Uploads it as an artifact.
- Workflow 2 downloads the artifact.
- The file can be used in workflow 2.
Artifacts also make file outputs available via the GitHub CLI (e.g. for making releases that the user can download).
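The flow above, sketched with two jobs in one workflow run using the official upload and download Actions. The job names, the artifact name my-output and the file output.txt are hypothetical.

```yaml
jobs:
  create-file:
    runs-on: ubuntu-latest
    steps:
      - name: Create a file
        run: echo "hello" > output.txt
      - name: Upload the file as an artifact
        uses: actions/upload-artifact@v4
        with:
          name: my-output
          path: output.txt

  use-file:
    runs-on: ubuntu-latest
    # Wait for the upload to finish before downloading
    needs: [create-file]
    steps:
      - name: Download the artifact into the working directory
        uses: actions/download-artifact@v4
        with:
          name: my-output
      - name: Use the file
        run: cat output.txt
```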