GitHub Actions has been around for a while now and it’s been a great experience so far, although there is plenty of room for improvement.
When it comes to CI/CD, we have quite a few choices for a JavaScript codebase. Having used Travis and Circle CI, where things are too abstracted to understand what happens under the hood, we find GitHub Actions a much better approach to CI.
At Event Espresso, we manage our entire TypeScript/React codebase in a monorepo. The repository contains all our internal packages as well as our domains (use cases). We use an ejected CRA (Create React App) setup along with Yarn workspaces for the development and build process. So the codebase is quite large, with 25-odd large packages and many domains, along with 1000+ Jest unit test cases.
Like any other JavaScript project, we run lint, build and unit tests for every Pull Request (PR). The whole process took a lot of time when we initially set up the GitHub Actions workflow for PR checks. That was mainly because of the biggest drawback (IMHO) of GitHub Actions: it doesn’t support reusing workflow logic or actions (at the time of writing).
For every JavaScript codebase, the basic steps needed to carry out lint/build/test are these:
- Checking out the repo/codebase
- Setting up the Node environment
- Maybe using some caching
- Installing NPM dependencies
That means you have to repeat all of that for each task you need to run: lint, build and test. To avoid that, we ran all three tasks in a single job in series, i.e. after the basic steps above, we first ran lint, followed by build and test, which made the whole process pretty long.
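To make the duplication concrete, here is a rough sketch of what running two of the tasks as separate parallel jobs would look like without any reuse mechanism. It is an illustration only, not something we ran: the job ids `lint` and `build` are hypothetical, while the actions and script names are the ones from our workflow. A `test` job would repeat the same block a third time.

```yaml
name: Pull Request checks (duplicated setup, illustration only)

on: [pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - run: yarn install --frozen-lockfile
      - run: yarn lint:ci

  build:
    runs-on: ubuntu-latest
    steps:
      # Same three setup steps again; only the last command differs
      - uses: actions/checkout@v2
      - uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*
      - run: yarn install --frozen-lockfile
      - run: yarn build:ci
```

Every change to the setup (a new cache step, a different Node version) would have to be made in each job separately.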
Here is what our workflow looked like:
    name: Pull Request checks

    on: [pull_request]

    jobs:
      build:
        runs-on: ubuntu-latest
        name: Lint/Build/Test
        steps:
          - name: Get Yarn cache path
            id: yarn-cache
            run: echo "::set-output name=dir::$(yarn cache dir)"

          - name: Checkout the commit
            uses: actions/checkout@v2

          - name: Set up Node
            uses: dcodeIO/setup-node-nvm@master
            with:
              node-version: lts/*

          - name: Load Yarn cache
            uses: actions/cache@v2
            with:
              path: ${{ steps.yarn-cache.outputs.dir }}
              key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
              restore-keys: |
                ${{ runner.os }}-yarn-

          - name: Install dependencies
            run: yarn install --frozen-lockfile

          - name: Check for lints
            run: yarn lint:ci

          - name: Build all packages
            run: yarn build:ci

          - name: Run all unit tests
            run: yarn test-unit:ci
The total execution time for that workflow was nearly 20 minutes (17m 18s in one run and 20m 19s in another), which was quite frustrating. It meant that even for a small change, you had to wait that long for the whole process to finish before you could merge.
The actual reason for that long execution time was that all the tasks ran in series (one after the other). We could save much of that time by running the tasks in parallel, but that would mean duplicating the logic, as mentioned above.
Welcome to build matrix!
GitHub Actions has an excellent feature called build matrix, where you can create multiple jobs by performing variable substitution in a single job definition. Since much of our workflow logic was the same for all the tasks we run, we used the build matrix to reuse that logic and run each task as a separate job in parallel. It brought the execution time down to around 10 minutes.
Here is the updated workflow using build matrix:
    name: Pull Request checks

    on: [pull_request]

    jobs:
      build:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            task: [lint, build, test]
        name: ${{ matrix.task }}
        steps:
          - name: Get Yarn cache path
            id: yarn-cache
            run: echo "::set-output name=dir::$(yarn cache dir)"

          - name: Checkout the commit
            uses: actions/checkout@v2

          - name: Set up Node
            uses: dcodeIO/setup-node-nvm@master
            with:
              node-version: lts/*

          - name: Load Yarn cache
            uses: actions/cache@v2
            with:
              path: ${{ steps.yarn-cache.outputs.dir }}
              key: ${{ runner.os }}-yarn-${{ hashFiles('**/yarn.lock') }}
              restore-keys: |
                ${{ runner.os }}-yarn-

          - name: Install dependencies
            run: yarn install --frozen-lockfile

          # Run the task based on matrix variable
          - name: Run ${{ matrix.task }}
            run: yarn ${{ matrix.task }}:ci
Notice the task being run based on the matrix variable.
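If the script names do not line up with the matrix values, the `include` form of the matrix can carry an extra variable per job. The sketch below is a variation under assumptions, not what we run: the `script` key is a hypothetical name, the script names are the ones from our first workflow, and `fail-fast: false` is added so that one failing job does not cancel the others that are still in progress (which is what happens by default).

```yaml
name: Pull Request checks

on: [pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      # Let lint, build and test each run to completion and report on their own
      fail-fast: false
      matrix:
        # Each entry becomes one job; `script` is a hypothetical extra variable
        include:
          - task: lint
            script: lint:ci
          - task: build
            script: build:ci
          - task: test
            script: test-unit:ci
    name: ${{ matrix.task }}
    steps:
      - name: Checkout the commit
        uses: actions/checkout@v2

      - name: Set up Node
        uses: dcodeIO/setup-node-nvm@master
        with:
          node-version: lts/*

      - name: Install dependencies
        run: yarn install --frozen-lockfile

      - name: Run ${{ matrix.task }}
        run: yarn ${{ matrix.script }}
```

The job name still comes from `matrix.task`, so the PR checks list stays readable while the command can be anything.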
We are not done yet.
Improve caching!
Previously, we only reused the Yarn cache, so the dependencies were still installed on every run, which took around 2 minutes. Why not cache all the NPM dependencies?
We changed our caching step to this:
    - name: Cache dependencies
      id: cache-deps
      uses: actions/cache@v2
      with:
        path: '**/node_modules'
        key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}
And we only installed dependencies if the lockfile had changed:
    - name: Install dependencies
      # install deps only if lockfile has changed
      if: steps.cache-deps.outputs.cache-hit != 'true'
      run: yarn install --frozen-lockfile
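One possible variation, which we have not measured: adding `restore-keys` lets a run with a changed lockfile start from the most recent `node_modules` cache instead of from nothing. The `cache-hit` output of `actions/cache` is only `'true'` on an exact match of `key`, so the install step in this sketch would still run after such a partial restore and bring the modules in line with the new lockfile.

```yaml
- name: Cache dependencies
  id: cache-deps
  uses: actions/cache@v2
  with:
    path: '**/node_modules'
    key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}
    # Fallback prefix: used only when the exact key above has no cache
    restore-keys: |
      ${{ runner.os }}-modules-

- name: Install dependencies
  # Skipped only on an exact key match; a restore-keys match still installs
  if: steps.cache-deps.outputs.cache-hit != 'true'
  run: yarn install --frozen-lockfile
```

Whether this is worth it depends on how often the lockfile changes; with a stable lockfile the exact key already hits on almost every run.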
It reduced the execution time further by around 2 minutes.
Are we done yet? “No” is the answer.
Intelligently run unit tests!
Why should we run unit tests that are not affected by the change? That’s the strategy we used to further reduce the execution time. As already mentioned, we use Jest for unit tests. Jest provides a CLI option called --changedSince, which runs only the tests related to the changes since the given branch or commit SHA. So we decided to pass the SHA of the PR’s base branch to --changedSince to run only the tests affected by the PR. All we had to do was add this to our CLI arguments:
--changedSince ${{ github.event.pull_request.base.sha }}
It also meant that if no tests are affected by the PR, no unit tests are executed. So if there is a minor change that does not affect any tests, the workflow finishes in less than 5 minutes.
Here is the final workflow:
    name: Pull Request checks

    on: [pull_request]

    jobs:
      build:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            task: [lint, build, test]
        name: ${{ matrix.task }}
        steps:
          - name: Checkout the commit
            uses: actions/checkout@v2
            with:
              # To make sure whole history is fetched for jest --changedSince to work as expected
              fetch-depth: ${{ ( matrix.task != 'test' && 1 ) || 0 }} # 0 for test, 1 otherwise

          - name: Set up Node
            uses: dcodeIO/setup-node-nvm@master
            with:
              node-version: lts/*

          - name: Cache dependencies
            id: cache-deps
            uses: actions/cache@v2
            with:
              path: '**/node_modules'
              key: ${{ runner.os }}-modules-${{ hashFiles('**/yarn.lock') }}

          - name: Install dependencies
            # install deps only if lockfile has changed
            if: steps.cache-deps.outputs.cache-hit != 'true'
            run: yarn install --frozen-lockfile

          - name: Set test CLI args
            if: ${{ matrix.task == 'test' }}
            run: echo "CLI_ARGS=--changedSince ${{ github.event.pull_request.base.sha }}" >> $GITHUB_ENV

          - name: Run ${{ matrix.task }}
            run: yarn ${{ matrix.task }}:ci ${{ env.CLI_ARGS }}
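A note on the `fetch-depth` expression: the `&&`/`||` pair stands in for a ternary, and that only works when the value after `&&` is truthy. Since `0` is falsy in workflow expressions, the condition is written as `!= 'test'` so that the truthy `1` sits after `&&`. The fragment below sketches the difference; the commented-out line is the form to avoid.

```yaml
- name: Checkout the commit
  uses: actions/checkout@v2
  with:
    # Avoid: 0 is falsy, so this falls through to 1 for every task, including test
    # fetch-depth: ${{ ( matrix.task == 'test' && 0 ) || 1 }}

    # Works: 1 is truthy, so non-test tasks get 1 and test falls through to 0
    fetch-depth: ${{ ( matrix.task != 'test' && 1 ) || 0 }}
```

A `fetch-depth` of `0` fetches the full history, which only the test job needs for --changedSince; lint and build keep the shallow single-commit checkout.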
We may further improve the execution time of our build step by running it only for the affected packages/domains and their dependents.