The story of reducing a GitHub Actions workflow by ~7 minutes

I was working on a nasty flaky test: it was passing locally but failing on GitHub Actions CI. But there was a deeper problem: the debugging process to find out why was f*cking slow. I would change something in the test, push, and wait for our GitHub Actions workflow to eventually run the test so I could see what was happening. The workflow was taking almost 10 minutes… So I decided to speed it up, because that seemed ridiculous.

There are 3 main checks that we want to be green before merging:

Passing Tests
Linting
Respecting TypeScript types

Run jobs in parallel

Our GitHub Actions workflow initially looked like this:

jobs:
  testing_and_linting:
    name: Testing & Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
      - name: Run linting
        run: npm run lint
      - name: Run tests
        run: npm run test

GitHub runs the steps of a job sequentially, so the type check runs before linting, which runs before the tests. In reality there is no reason for these tasks to run sequentially; we can split the job into three and let them run in parallel. So my first attempt was to do exactly that:

jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck

  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run linting
        run: npm run lint

  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run testing
        run: npm run test
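
As a local analogy to the split above, the same three checks can be started concurrently from a shell. This is only a sketch: run_parallel is a hypothetical helper, not something npm or GitHub Actions provides, and it assumes the typecheck, lint and test npm scripts from the workflow exist.

```shell
#!/usr/bin/env bash
# Run several commands concurrently and fail if any of them fails.
# run_parallel is a hypothetical helper written for this illustration.
run_parallel() {
  local pids=() status=0 cmd pid
  for cmd in "$@"; do
    bash -c "$cmd" &            # start each command in the background
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || status=1     # remember whether any command failed
  done
  return "$status"
}

# Usage (assumes these npm scripts exist in package.json):
# run_parallel "npm run typecheck" "npm run lint" "npm run test"
```

The total time is then roughly that of the slowest command instead of the sum of all three, which is the same effect the separate jobs have on CI.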

This improved things a bit, but not as much as I expected. I noticed that npm install was taking quite some time: every job was spending about 1 minute installing node_modules, and that happened every time we pushed.

Cache /node_modules

GitHub provides an action (actions/cache) that can cache files based on a key. In our case the key can include the hash of the package-lock.json file. When the project's dependencies change, the file changes, so the key no longer matches and npm install has to install the dependencies; otherwise the cached node_modules are restored. My workflow file looked like this after that change:

jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
  

  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run linting
        run: npm run lint

  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run testing
        run: npm run test
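
To make the idea behind the cache key concrete, here is a minimal sketch that builds a similar key locally. cache_key is a hypothetical helper; it assumes GNU sha256sum is available, and the hash GitHub computes with hashFiles for the same file is not guaranteed to be identical to this one.

```shell
#!/usr/bin/env bash
# Build a cache key from the OS name and the SHA-256 of a lockfile.
# cache_key only imitates the idea behind
# ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}.
cache_key() {
  local lockfile="$1"
  local digest
  digest=$(sha256sum "$lockfile" | cut -d' ' -f1)
  echo "$(uname -s)-node-${digest}"
}

# Usage: the key stays the same until the lockfile content changes.
# cache_key package-lock.json
```

As long as package-lock.json is untouched, every push produces the same key and the cache is reused; any dependency change produces a new key and therefore a cache miss.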

This resulted in better npm install times, but I was still not satisfied. Why waste 17 seconds just to check that you don't have to install anything?

Thankfully, GitHub lets us check whether the cache was hit, and based on that we can skip the step completely. The cache step needs an id (npm-cache here) so that the condition can refer to its output:

      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install

With that, the npm install step is skipped completely when the cache is hit.

At that stage my workflow file looked like this:

jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
  

  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run linting
        run: npm run lint

  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run testing
        run: npm run test

(Ab)Use Matrix Strategy to run tests in parallel chunks

GitHub Actions has a great feature called matrix strategy. It lets you define a set of possible configurations and run the same job once for each of them. For example:

runs-on: ${{ matrix.os }}
strategy:
  matrix:
    os: [ubuntu-16.04, ubuntu-18.04]
    node: [6, 8, 10]
steps:
  - uses: actions/setup-node@v1
    with:
      node-version: ${{ matrix.node }}

In their words:

You can define a matrix of different job configurations. A matrix allows you to create multiple jobs by performing variable substitution in a single job definition. For example, you can use a matrix to create jobs for more than one supported version of a programming language, operating system, or tool. A matrix reuses the job’s configuration and creates a job for each matrix you configure.

I don’t need to run the tests with different configurations, but I thought that I could split the tests into chunks and dynamically spawn jobs, each of which runs a separate chunk of the tests. Since these jobs run in parallel, I can save execution time. In my case I decided to create 10 chunks, so my configuration file looked like this:

  testing:
    name: Testing
    runs-on: ubuntu-latest
    strategy:
      matrix:
        chunk: [1,2,3,4,5,6,7,8,9,10]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install

Now the question is: how do I assign test files to each chunk?

Jest provides an option that lists all your test files:

jest --listTests

If you run this, you get a list of your test files, one path per line.

So we are pretty close: we will split this list into chunks and run each chunk independently. For that I wrote a bash script (bear with me, I am not a bash expert) that looks like this:

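#!/usr/bin/env bash
set -euo pipefail

echo "Config: Current chunk: $1, Number of chunks: $2"

# List the test files once, sorted so that every job sees the same order
ALL_TESTS=$(./node_modules/.bin/jest --listTests | sort)
TOTAL=$(echo "$ALL_TESTS" | wc -l)

# Calculate how many test files each chunk should have, rounding up so that no file is left out
CHUNK_SIZE=$(( (TOTAL + $2 - 1) / $2 ))

echo "Chunk size: $CHUNK_SIZE"

# Get the list of the tests that will run in this chunk
TEST_START_INDEX=$(( CHUNK_SIZE * ($1 - 1) + 1 ))
TEST_END_INDEX=$(( CHUNK_SIZE * $1 ))
TEST_FILES=$(echo "$ALL_TESTS" | sed -n "${TEST_START_INDEX},${TEST_END_INDEX}p")

echo "======= TEST FILES ======="
echo "$TEST_FILES"

# With few test files the last chunks can be empty; skip them instead of running the whole suite
if [ -z "$TEST_FILES" ]; then
  echo "No test files for this chunk"
  exit 0
fi

# Run the tests of this chunk (unquoted on purpose: one argument per file)
npm run test -- $TEST_FILES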

$1 and $2 are the positional arguments that the GitHub Actions step passes to the script. The first is the current chunk from the strategy matrix, and the second is the total number of chunks.
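
A quick way to check that a chunking scheme covers every file is to run it against a fake list. The sketch below uses a hypothetical helper, select_chunk, which rounds the chunk size up (an assumption of this sketch), and feeds it 23 numbered lines standing in for test file paths.

```shell
#!/usr/bin/env bash
# Print the lines of stdin that belong to chunk $1 out of $2 chunks.
# select_chunk is a hypothetical helper used only for this illustration.
select_chunk() {
  local chunk="$1" chunks="$2" lines total size start end
  lines=$(cat)
  total=$(printf '%s\n' "$lines" | wc -l)
  size=$(( (total + chunks - 1) / chunks ))   # round up
  start=$(( size * (chunk - 1) + 1 ))
  end=$(( size * chunk ))
  printf '%s\n' "$lines" | sed -n "${start},${end}p"
}

# 23 fake "test files" split into 10 chunks, then counted after deduplication.
for chunk in $(seq 1 10); do
  seq 1 23 | select_chunk "$chunk" 10
done | sort -n | uniq | wc -l
```

The count printed at the end should be 23, meaning every line landed in some chunk. With 23 files and 10 chunks the chunk size is 3, so the last two chunks are empty, which is why a script built this way should handle an empty chunk explicitly.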

So my testing job finally looked like:

  testing:
    name: Testing
    runs-on: ubuntu-latest
    strategy:
      matrix:
        chunk: [1,2,3,4,5,6,7,8,9,10]
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: jest
        run: ./scripts/setupTestMatrix.sh ${{ matrix.chunk }} 10

This generated one job per chunk in the workflow. All the jobs run in parallel, and all the test jobs (for example "Testing: 3" and "Testing: 4") run roughly the same number of test suites.

Finally, all these changes cut about 7 minutes from the GitHub Actions workflow, which had been taking almost 10 minutes.

Note

One thing worth mentioning is that you may have chunks that are really fast (e.g. 20 sec) while others take 1 minute. We assign a few test files to each chunk, but we don’t know how many tests each test file has, so you may end up with some chunks that run many more tests than others. An interesting project would be to balance the chunks, so that each gets the same number of tests (instead of the same number of test suites).
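
One possible starting point for that balancing is a greedy assignment: take the heaviest files first and always give the next file to the chunk with the smallest load so far. This is a sketch under assumptions, not something the workflow above does. balance_chunks is a hypothetical helper, the weight of a file has to come from somewhere else (for example a count of tests per file or a previously measured duration), and paths are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Assign weighted files to chunks so that the total weight per chunk is similar.
# Input on stdin:  "<weight> <path>" per line.
# Output:          "<chunk> <path>" per line, heaviest files first.
# balance_chunks is a hypothetical helper written for this illustration.
balance_chunks() {
  local chunks="$1"
  sort -rn | awk -v n="$chunks" '
    {
      # pick the chunk with the smallest load so far
      best = 1
      for (i = 2; i <= n; i++) if (load[i] < load[best]) best = i
      load[best] += $1
      print best, $2
    }'
}

# Usage: four files with weights 5, 4, 3 and 2 spread over 2 chunks.
printf '5 a.test.js\n4 b.test.js\n3 c.test.js\n2 d.test.js\n' | balance_chunks 2
```

In this example both chunks end up with a total weight of 7 (5 + 2 and 4 + 3), whereas splitting by file count in list order would give 9 and 5.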

Published 20 Nov 2020

Clean Code

Engineering Manager. Opinions are my own and not necessarily the views of my employer.
Avraam Mavridis on Twitter