How we brought down GitHub Actions workflow execution time from 20 minutes to 5 minutes

GitHub Actions has been around for a while now and it’s been a great experience so far, although there is plenty of room for improvement.

When it comes to CI/CD, we have quite a few choices for a JavaScript codebase. We have used Travis and Circle CI, where things are too abstracted to understand what happens under the hood, and we find GitHub Actions a much better approach to CI.

We at Event Espresso manage our entire TypeScript/React codebase in a monorepo. The repository contains all our internal packages as well as our domains (use cases). We use an ejected CRA (Create React App) setup along with Yarn workspaces for development and builds. So the codebase is quite large, with 25-odd large packages and many domains, along with 1000+ Jest unit test cases.

Like any other JavaScript project, we run lint, build and unit tests for every Pull Request (PR). The whole process took a lot of time when we initially set up the GitHub Actions workflow for PR checks. That was mainly because of the biggest drawback (IMHO) of GitHub Actions, i.e. it doesn’t support reusing workflow logic or actions (at the time of writing) – more context here.

For every JavaScript codebase, the basic steps needed to carry out lint/build/test are these:

  • Checking out the repo/codebase
  • Setting up the Node environment
  • Maybe using some caching
  • Installing NPM dependencies

This means you have to repeat all of those steps for each task you need to run: lint, build and test. To avoid that, we ran all three tasks in series in a single job, i.e. after the basic steps above, we ran lint first, followed by build and then test, which made the whole process pretty long.

Here is what our workflow looked like:

name: Pull Request checks

on: [pull_request]

jobs:
    build:
        runs-on: ubuntu-latest
        name: Lint/Build/Test
        steps:
            - name: Get Yarn cache path
              id: yarn-cache
              run: echo "::set-output name=dir::$(yarn cache dir)"

            - name: Checkout the commit
              uses: actions/checkout@v2

            - name: Set up Node
              uses: dcodeIO/setup-node-nvm@master
              with:
                  node-version: lts/*

            - name: Load Yarn cache
              uses: actions/cache@v2
              with:
                  path: ${{ steps.yarn-cache.outputs.dir }}
                  key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
                  restore-keys: |
                      ${{ runner.os }}-yarn-

            - name: Install dependencies
              run: yarn install --frozen-lockfile

            - name: Check for lints
              run: yarn lint:ci

            - name: Build all packages
              run: yarn build:ci

            - name: Run all unit tests
              run: yarn test-unit:ci

The total execution time for that workflow was nearly 20 minutes – 17m 18s here and 20m 19s here, which was quite frustrating. It meant that even after a small change, you had to wait that long for the whole process to finish before you could merge.

The actual reason for that long execution time was that all the tasks used to run in series (one after the other). We could save much of that time by running all the tasks in parallel but that would mean duplication of logic as mentioned above.

Welcome to build matrix!

GitHub Actions has an excellent feature called build matrix, which lets you create multiple jobs by performing variable substitution in a single job definition. Since much of our workflow logic was the same for all the tasks we run, we used the build matrix to reuse that logic and run each task as a separate job in parallel. That brought the execution time down to around 10 minutes. 🙂

Here is the updated workflow using build matrix:

name: Pull Request checks

on: [pull_request]

jobs:
    build:
        runs-on: ubuntu-latest
        strategy:
            matrix:
                task: [lint, build, test]
        name: ${{ matrix.task }}
        steps:
            - name: Get Yarn cache path
              id: yarn-cache
              run: echo "::set-output name=dir::$(yarn cache dir)"

            - name: Checkout the commit
              uses: actions/checkout@v2

            - name: Set up Node
              uses: dcodeIO/setup-node-nvm@master
              with:
                  node-version: lts/*

            - name: Load Yarn cache
              uses: actions/cache@v2
              with:
                  path: ${{ steps.yarn-cache.outputs.dir }}
                  key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
                  restore-keys: |
                      ${{ runner.os }}-yarn-

            - name: Install dependencies
              run: yarn install --frozen-lockfile

            # Run the task based on matrix variable
            - name: Run ${{ matrix.task }}
              run: yarn ${{ matrix.task }}:ci

Notice the task being run based on the matrix variable.
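
To make the variable substitution concrete, here is a small TypeScript sketch of how a matrix can be thought of as expanding into separate jobs. It is only a conceptual model, not how GitHub implements it; the names `expandMatrix`, `Matrix` and `JobConfig` are made up for illustration.

```typescript
// Conceptual model only: a matrix with one or more variables expands
// into one job configuration per combination of values.
type Matrix = Record<string, string[]>;
type JobConfig = Record<string, string>;

function expandMatrix(matrix: Matrix): JobConfig[] {
    let jobs: JobConfig[] = [{}];
    for (const [variable, values] of Object.entries(matrix)) {
        const next: JobConfig[] = [];
        for (const job of jobs) {
            for (const value of values) {
                // Copy the partial job and add this variable's value.
                next.push({ ...job, [variable]: value });
            }
        }
        jobs = next;
    }
    return jobs;
}

// The matrix from the workflow above: one variable, three values.
const jobs = expandMatrix({ task: ['lint', 'build', 'test'] });

// Each job substitutes its own value into the shared step template,
// just like `yarn ${{ matrix.task }}:ci` in the workflow.
const commands = jobs.map((job) => `yarn ${job.task}:ci`);
console.log(commands.join('\n'));
```

Each resulting job repeats the shared setup steps on its own runner, which is the price paid for running the three tasks side by side.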

We are not done yet 😄

Improve caching!

Previously, we only reused the Yarn cache, so the install step still ran on every job and took around 2 minutes. Why not cache all the installed NPM dependencies (node_modules) instead? 🤔

We changed our caching step to this:

- name: Cache dependencies
  id: cache-deps
  uses: actions/cache@v2
  with:
      path: '**/node_modules'
      key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

And we installed dependencies only when there was no exact cache hit, i.e. when the lockfile had changed:

- name: Install dependencies
  # install deps only if lockfile has changed
  if: steps.cache-deps.outputs.cache-hit != 'true'
  run: yarn install --frozen-lockfile

It reduced the execution time further by around 2 minutes. 😍
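
The decision these two steps make can be sketched in TypeScript as follows. This is a simplified model, assuming the cache behaves like a set of saved keys; `cacheKey`, `needsInstall` and the hash strings are hypothetical stand-ins, not part of actions/cache.

```typescript
// Simplified model of the "Cache dependencies" + "Install dependencies" steps.
// `lockfileHash` stands in for hashFiles('**/yarn.lock') and the Set stands in
// for the cache storage. Both are illustrative, not the real implementation.
function cacheKey(os: string, lockfileHash: string): string {
    return `${os}-modules-${lockfileHash}`;
}

function needsInstall(savedKeys: Set<string>, key: string): boolean {
    // Mirrors `if: steps.cache-deps.outputs.cache-hit != 'true'`:
    // with no restore-keys, only an exact key match restores node_modules.
    return !savedKeys.has(key);
}

const savedKeys = new Set<string>();

// First run: nothing is cached yet, so dependencies are installed and saved.
const firstKey = cacheKey('Linux', 'abc123'); // 'abc123' is a made-up hash
const firstRunInstalls = needsInstall(savedKeys, firstKey);
savedKeys.add(firstKey);

// Second run with the same yarn.lock: exact hit, the install step is skipped.
const secondRunInstalls = needsInstall(savedKeys, cacheKey('Linux', 'abc123'));

// Third run after yarn.lock changed: new hash, new key, install runs again.
const thirdRunInstalls = needsInstall(savedKeys, cacheKey('Linux', 'def456'));

console.log({ firstRunInstalls, secondRunInstalls, thirdRunInstalls });
```

Because the key contains the lockfile hash, any change to yarn.lock produces a new key and therefore a fresh install.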

Are we done yet? “No” is the answer 😄

Intelligently run unit tests!

Why should we run unit tests that are not affected by the change? Skipping them is the strategy we used to reduce the execution time further. As already mentioned, we use Jest for unit tests. The Jest CLI provides an option called --changedSince, which runs only the tests related to changes made since the given branch or commit SHA. So we decided to pass the SHA of the PR’s base branch to --changedSince, so that only the tests affected by the PR are run. All we had to do was add this to our CLI arguments:

--changedSince ${{ github.event.pull_request.base.sha }}

It also meant that if no tests are affected by the PR, unit tests will not be executed. So, if there is a minor change which does not affect any tests, the workflow would finish in less than 5 minutes. 😍 For example in this run 😎.
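
As a rough mental model of what "affected" means here, the TypeScript sketch below selects the tests that directly or transitively import a changed file. It is a deliberate simplification and not Jest's actual implementation; the file names and the `ImportGraph` shape are hypothetical.

```typescript
// Conceptual model only, not Jest's actual implementation.
// `imports` maps each file to the files it imports directly.
type ImportGraph = Record<string, string[]>;

// True if `file` itself changed or (transitively) imports a changed file.
function isAffected(
    file: string,
    imports: ImportGraph,
    changed: Set<string>,
    seen: Set<string> = new Set()
): boolean {
    if (changed.has(file)) return true;
    if (seen.has(file)) return false; // guard against import cycles
    seen.add(file);
    return (imports[file] ?? []).some((dep) => isAffected(dep, imports, changed, seen));
}

function affectedTests(tests: string[], imports: ImportGraph, changed: string[]): string[] {
    const changedSet = new Set(changed);
    return tests.filter((test) => isAffected(test, imports, changedSet));
}

// Hypothetical project: two test files with separate dependency chains.
const imports: ImportGraph = {
    'button.test.ts': ['button.ts'],
    'button.ts': ['theme.ts'],
    'date.test.ts': ['date.ts'],
};
const tests = ['button.test.ts', 'date.test.ts'];

// A change deep in one chain selects only the test that depends on it.
const afterThemeChange = affectedTests(tests, imports, ['theme.ts']);

// A change that no test depends on selects nothing, so no tests run.
const afterDocsChange = affectedTests(tests, imports, ['README.md']);

console.log(afterThemeChange, afterDocsChange);
```

The second case corresponds to the situation described above, where a PR touches nothing that any test depends on and the test job finishes almost immediately.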

Here is the final workflow:

name: Pull Request checks

on: [pull_request]

jobs:
    build:
        runs-on: ubuntu-latest
        strategy:
            matrix:
                task: [lint, build, test]
        name: ${{ matrix.task }}
        steps:
            - name: Checkout the commit
              uses: actions/checkout@v2
              with:
                  # Make sure all history is fetched so that jest --changedSince works as expected
                  fetch-depth: ${{ ( matrix.task != 'test' && 1 ) || 0 }} # 0 for test, 1 otherwise

            - name: Set up Node
              uses: dcodeIO/setup-node-nvm@master
              with:
                  node-version: lts/*

            - name: Cache dependencies
              id: cache-deps
              uses: actions/cache@v2
              with:
                  path: '**/node_modules'
                  key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

            - name: Install dependencies
              # install deps only if lockfile has changed
              if: steps.cache-deps.outputs.cache-hit != 'true'
              run: yarn install --frozen-lockfile

            - name: Set test CLI args
              if: ${{ matrix.task == 'test' }}
              run: echo "CLI_ARGS=--changedSince ${{ github.event.pull_request.base.sha }}" >> $GITHUB_ENV

            - name: Run ${{ matrix.task }}
              run: yarn ${{ matrix.task }}:ci ${{ env.CLI_ARGS }}
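
The fetch-depth expression in the checkout step uses the `condition && a || b` idiom as a ternary substitute. The TypeScript sketch below illustrates why the workflow is written with `!= 'test' && 1` rather than the more obvious `== 'test' && 0`. It assumes that workflow expressions treat 0 as falsy in the same way JavaScript does; the function names are made up for illustration.

```typescript
// Mirrors the `condition && a || b` idiom used for fetch-depth.
function pick(condition: boolean, whenTrue: number, whenFalse: number): number {
    return (condition && whenTrue) || whenFalse;
}

function fetchDepth(task: string): number {
    // Same shape as the workflow: 1 for non-test tasks, otherwise 0.
    return pick(task !== 'test', 1, 0);
}

// The reversed form looks equivalent but is not: 0 is falsy, so the
// `||` fallback wins even when the condition is true.
function brokenFetchDepth(task: string): number {
    return pick(task === 'test', 0, 1);
}

for (const task of ['lint', 'build', 'test']) {
    console.log(task, fetchDepth(task), brokenFetchDepth(task));
}
```

With the reversed form, the test job would get a shallow clone of depth 1 and the base commit passed to --changedSince might not be available locally.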

We may further improve the execution time of our build step by building only the affected packages/domains and the ones that depend on them 😊.
