GitHub Actions has been around for a while now and it’s been a great experience so far, although there is plenty of room for improvement.
When it comes to CI/CD, we have quite a few choices for a JavaScript codebase. Having used Travis and CircleCI, where things are too abstracted to understand what happens under the hood, we find GitHub Actions a much better approach to CI.
At Event Espresso, we manage our entire TypeScript/React codebase in a monorepo. The repository contains all our internal packages as well as our domains (use cases). We use an ejected CRA (Create React App) setup along with Yarn workspaces for the development and build process. So the codebase is quite large, with 25-odd huge packages and many domains, along with 1000+ Jest unit test cases.
Like any other JavaScript project, we run lint, build and unit tests for every Pull Request (PR). The whole process took a lot of time when we initially set up the GitHub Actions workflow for PR checks. This was mainly because of the biggest drawback (IMHO) of GitHub Actions: it doesn’t support reusing workflow logic or actions (at the time of writing); more context here.
For every JavaScript codebase, the basic steps needed to carry out lint/build/test are these:
- Checking out the repo/codebase
- Setting up the Node environment
- Maybe using some caching?
- Installing NPM dependencies
This means you have to repeat all of that for each of the tasks you need to run: lint, build and test. To avoid that, we ran all three tasks in a single job in series, i.e. after the basic steps above, we ran lint first, followed by build and test, which made the whole process pretty long.
Here is what our workflow looked like:
name: Pull Request checks
on: [pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    name: Lint/Build/Test
    steps:
      - name: Get Yarn cache path
        id: yarn-cache
        run: echo "::set-output name=dir::$(yarn cache dir)"
      - name: Checkout the commit
        uses: actions/checkout@v2
      - name: Set up Node
        uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - name: Load Yarn cache
        uses: actions/cache@v2
        with:
          path: ${{ steps.yarn-cache.outputs.dir }}
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          restore-keys: |
            ${{ runner.os }}-yarn-
      - name: Install dependencies
        run: yarn install --frozen-lockfile
      - name: Check for lints
        run: yarn lint:ci
      - name: Build all packages
        run: yarn build:ci
      - name: Run all unit tests
        run: yarn test-unit:ci
The total execution time for that workflow was nearly 20 minutes (17m 18s here and 20m 19s here), which was quite frustrating. It meant that even for a small change, you had to wait that long for the whole process to finish before you could merge.
The actual reason for that long execution time was that all the tasks used to run in series (one after the other). We could save much of that time by running all the tasks in parallel but that would mean duplication of logic as mentioned above.
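To make that duplication concrete, here is a rough sketch of the same checks split into three parallel jobs with no reuse. The job ids (lint, build, test) are made up for illustration and the caching steps are left out for brevity; the actions and scripts are the ones from the workflow above.

```yaml
# Sketch only: three parallel jobs that each repeat the same setup steps.
name: Pull Request checks
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - run: yarn install --frozen-lockfile
      - run: yarn lint:ci
  build:
    runs-on: ubuntu-latest
    steps:
      # Same checkout, Node setup and install as in the lint job
      - uses: actions/checkout@v2
      - uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - run: yarn install --frozen-lockfile
      - run: yarn build:ci
  test:
    runs-on: ubuntu-latest
    steps:
      # Same checkout, Node setup and install once more
      - uses: actions/checkout@v2
      - uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - run: yarn install --frozen-lockfile
      - run: yarn test-unit:ci
```

Jobs in a workflow run in parallel by default, so this would be faster, but every change to the setup steps would have to be made in three places.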
Welcome to build matrix!
GitHub Actions has an excellent feature called build matrix, which lets you create multiple jobs by performing variable substitution in a single job definition. Since much of our workflow logic was the same for all the tasks we run, we used the build matrix to reuse that logic and run each task as a separate job in parallel. This brought the execution time down to around 10 minutes.
Here is the updated workflow using build matrix:
name: Pull Request checks
on: [pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        task: [lint, build, test]
    name: ${{ matrix.task }}
    steps:
      - name: Get Yarn cache path
        id: yarn-cache
        run: echo "::set-output name=dir::$(yarn cache dir)"
      - name: Checkout the commit
        uses: actions/checkout@v2
      - name: Set up Node
        uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - name: Load Yarn cache
        uses: actions/cache@v2
        with:
          path: ${{ steps.yarn-cache.outputs.dir }}
          key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
          restore-keys: |
            ${{ runner.os }}-yarn-
      - name: Install dependencies
        run: yarn install --frozen-lockfile
      # Run the task based on matrix variable
      - name: Run ${{ matrix.task }}
        run: yarn ${{ matrix.task }}:ci
Notice the task being run based on the matrix variable.
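The run line above assumes that every task has a script named <task>:ci. If the script names do not all follow one pattern (the first workflow ran test-unit:ci for the tests, for instance), one option is to list the combinations explicitly with the matrix include key. A minimal sketch, where the script key is a name chosen here for illustration:

```yaml
# Fragment of the job definition; runs-on and the setup steps stay the same.
strategy:
  matrix:
    include:
      # Each entry becomes one job; "script" is an illustrative key name.
      - task: lint
        script: lint:ci
      - task: build
        script: build:ci
      - task: test
        script: test-unit:ci
name: ${{ matrix.task }}
steps:
  # ... checkout, Node setup, cache and install as before ...
  - name: Run ${{ matrix.task }}
    run: yarn ${{ matrix.script }}
```

This keeps the job names short while letting each task point at whatever script it needs.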
We are not done yet.
Improve caching!
Previously, we only reused the Yarn cache, so the dependencies were still installed on every run, which took around 2 minutes. Why not cache all the installed NPM dependencies instead?
We changed our caching step to this:
- name: Cache dependencies
  id: cache-deps
  uses: actions/cache@v2
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}
And we only installed dependencies if the lockfile had changed:
- name: Install dependencies
  # install deps only if lockfile has changed
  if: steps.cache-deps.outputs.cache-hit != 'true'
  run: yarn install --frozen-lockfile
It reduced the execution time further by around 2 minutes.
Are we done yet? “No” is the answer.
Intelligently run unit tests!
Why should we run unit tests that are not affected by the change? That’s the strategy we used to reduce the execution time further. As already mentioned, we use Jest for unit tests. Jest provides a CLI option called --changedSince, which runs only the tests related to files changed since the given branch or commit. So we decided to pass the SHA of the PR’s base branch to --changedSince to run only the tests affected by the PR. All we had to do was add this to our CLI arguments:
--changedSince ${{ github.event.pull_request.base.sha }}
It also meant that if no tests are affected by the PR, no unit tests are executed. So for a minor change that does not affect any tests, the workflow finishes in less than 5 minutes, for example in this run.
Here is the final workflow:
name: Pull Request checks
on: [pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        task: [lint, build, test]
    name: ${{ matrix.task }}
    steps:
      - name: Checkout the commit
        uses: actions/checkout@v2
        with:
          # To make sure whole history is fetched for jest --changedSince to work as expected
          fetch-depth: ${{ ( matrix.task != 'test' && 1 ) || 0 }} # 0 for test, 1 otherwise
      - name: Set up Node
        uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - name: Cache dependencies
        id: cache-deps
        uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}
      - name: Install dependencies
        # install deps only if lockfile has changed
        if: steps.cache-deps.outputs.cache-hit != 'true'
        run: yarn install --frozen-lockfile
      - name: Set test CLI args
        if: ${{ matrix.task == 'test' }}
        run: echo "CLI_ARGS=--changedSince ${{ github.event.pull_request.base.sha }}" >> $GITHUB_ENV
      - name: Run ${{ matrix.task }}
        run: yarn ${{ matrix.task }}:ci ${{ env.CLI_ARGS }}
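The fetch-depth line in that workflow is worth a closer look, because it emulates a ternary with the && and || operators of GitHub Actions expressions. The fragment below repeats the checkout step, with comments walking through how the expression evaluates for each matrix value; it is an annotated copy for explanation, not a change to the workflow.

```yaml
# How the fetch-depth expression evaluates for each matrix value:
#   task = 'lint'  -> ( true  && 1 ) || 0  ->  1  (shallow clone)
#   task = 'build' -> ( true  && 1 ) || 0  ->  1  (shallow clone)
#   task = 'test'  -> ( false && 1 ) || 0  ->  0  (full history)
# The pattern "cond && a || b" only works when "a" is truthy. Writing it as
# ( matrix.task == 'test' && 0 ) || 1 would always give 1, because 0 is falsy.
- name: Checkout the commit
  uses: actions/checkout@v2
  with:
    fetch-depth: ${{ ( matrix.task != 'test' && 1 ) || 0 }}
```

The last two steps rely on a similar detail: for the lint and build jobs the "Set test CLI args" step is skipped, so env.CLI_ARGS expands to an empty string and the script runs without extra arguments.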
We may further improve the execution time of our build step by running it only for the affected packages/domains and their dependents.