I was working on a nasty flaky test: it was passing locally but failing on GitHub Actions CI. But there was a deeper problem: the debugging process to find out why that was happening was f*cking slow. I would change something in the test, push, and wait for our GitHub Actions workflow to eventually run the test so I could see what was happening there. The workflow was taking almost 10 minutes… So I decided to speed this up, because it seemed ridiculous.
There are 3 main tasks that we want to be green before merging: type checking, linting, and tests.
Our GitHub Actions workflow initially looked like this:
jobs:
  testing_and_linting:
    name: Testing & Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
      - name: Run linting
        run: npm run lint
      - name: Run tests
        run: npm run test
GitHub runs the steps sequentially, so type checking runs before linting, which runs before the tests. In reality there is no reason for these tasks to run sequentially; we can split the job and run them in parallel. So my first attempt was to do exactly that:
jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run linting
        run: npm run lint
  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - name: Install npm
        run: npm install
      - name: Run testing
        run: npm run test
This improved things a bit, but not as much as I expected. I noticed that npm install was taking quite some time:
So every job was wasting 1 minute installing node_modules, and that happened every time we pushed.
GitHub provides an action that caches files based on a key. In our case the key can be the hash of the package-lock.json file. When the project's dependencies change, the file changes, the key no longer matches, and the job installs the dependencies again; otherwise it uses the cache. My workflow file looked like this after that change:
jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run linting
        run: npm run lint
  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        run: npm install
      - name: Run testing
        run: npm run test
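To make the cache key idea a bit more concrete, here is a rough shell sketch of a key that depends on the lockfile content. It is only an illustration: the function name cache_key is made up, sha256sum is assumed to be available, and the value will not match what hashFiles actually produces. The point is just that the same lockfile gives the same key, and any change to it gives a new one.

```shell
# Build an illustrative cache key from an OS label and a lockfile.
# $1: OS label (a stand-in for runner.os), $2: path to the lockfile.
# Hypothetical helper; not the exact value that hashFiles() computes.
cache_key() {
  echo "$1-node-$(sha256sum "$2" | cut -d' ' -f1)"
}

# Usage (assumes a package-lock.json in the current directory):
#   cache_key Linux package-lock.json
```

As long as package-lock.json stays byte-for-byte the same, the key stays the same and the cache can be reused.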
With the cache in place, npm install times got better:
But I was still not satisfied. Why waste 17 sec just to check that you don't have to install anything?
Thankfully, GitHub lets us check whether we hit the cache. The cache step needs an id (npm-cache here) so that we can refer to its cache-hit output, and based on that we can skip the install step completely:
- name: Install npm
  if: steps.npm-cache.outputs.cache-hit != 'true'
  run: npm install
That resulted in the npm install step being skipped completely:
At that stage my workflow file looked like this:
jobs:
  typecheck:
    name: Typechecking
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run typecheck
        run: npm run typecheck
  linting:
    name: Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run linting
        run: npm run lint
  testing:
    name: Testing
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - uses: actions/cache@v2
        id: npm-cache
        with:
          path: '**/node_modules'
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      - name: Install npm
        if: steps.npm-cache.outputs.cache-hit != 'true'
        run: npm install
      - name: Run testing
        run: npm run test
GitHub Actions has a great feature called matrix strategy. It lets you define a set of possible configurations and run the same job with each of them. For example:
runs-on: ${{ matrix.os }}
strategy:
  matrix:
    os: [ubuntu-16.04, ubuntu-18.04]
    node: [6, 8, 10]
steps:
  - uses: actions/setup-node@v1
    with:
      node-version: ${{ matrix.node }}
In their words:
You can define a matrix of different job configurations. A matrix allows you to create multiple jobs by performing variable substitution in a single job definition. For example, you can use a matrix to create jobs for more than one supported version of a programming language, operating system, or tool. A matrix reuses the job’s configuration and creates a job for each matrix you configure.
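To get a feel for how many jobs the example matrix above turns into, here is a tiny shell sketch that lists every combination of the two variables. The function name expand_matrix is made up for illustration; GitHub does this expansion for you, this just mimics the result.

```shell
# Print one line per job that the example matrix would create:
# every os value is combined with every node value.
expand_matrix() {
  for os in ubuntu-16.04 ubuntu-18.04; do
    for node in 6 8 10; do
      echo "os=$os node=$node"
    done
  done
}

expand_matrix
```

Two operating systems times three Node versions gives six jobs, all created from a single job definition.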
I don’t need to run the tests with different configurations, but I thought I could create chunks of tests and dynamically spawn jobs, each of which runs a separate chunk of these tests. Since these jobs run in parallel, I can save execution time. In my case I decided to create 10 chunks, so my configuration file looked like this:
testing:
  name: Testing
  runs-on: ubuntu-latest
  strategy:
    matrix:
      chunk: [1,2,3,4,5,6,7,8,9,10]
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-node@v1
    - uses: actions/cache@v2
      id: npm-cache
      with:
        path: '**/node_modules'
        key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    - name: Install npm
      if: steps.npm-cache.outputs.cache-hit != 'true'
      run: npm install
Now the question is: how do I assign test files to each chunk?
Jest provides an option that lists all your test files:
jest --listTests
And if you run this you get a list of your test files:
So we are pretty close: we will split this list into chunks and run each chunk independently. For that I wrote a bash script (bear with me, I am not a bash expert) that looks like this:
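#!/usr/bin/env bash
# Usage: setupTestMatrix.sh <current chunk, starting at 1> <number of chunks>
echo "Config: Current Chunk $1, Number of chunks: $2"
# Calculate how many test files each chunk should have
# (rounded up, so that no test file is left without a chunk)
TOTAL=$(./node_modules/.bin/jest --listTests | wc -l)
CHUNK_SIZE=$(( (TOTAL + $2 - 1) / $2 ))
echo "Chunk size: $CHUNK_SIZE"
# Get the list of the tests that will run in this chunk
START=$(( CHUNK_SIZE * ($1 - 1) + 1 ))
END=$(( CHUNK_SIZE * $1 ))
TEST_FILES=$(./node_modules/.bin/jest --listTests | sed -n "${START},${END}p")
echo "======= TEST FILES ======="
echo "$TEST_FILES"
# With no files, Jest would run the whole suite, so stop on an empty chunk
if [ -z "$TEST_FILES" ]; then
  echo "No test files in this chunk"
  exit 0
fi
# Run this chunk (unquoted on purpose: one argument per test file)
npm run test $TEST_FILES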
$1 and $2 are the arguments passed to the script by the GitHub Actions workflow. The first is the current chunk from the strategy matrix and the second is the number of chunks we have.
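To see the slicing arithmetic on a small input, here is a rough, self-contained sketch. It is not the script from the project: it reads the list from stdin instead of calling Jest, and the function name slice_chunk is made up for illustration. It rounds the chunk size up so that every line lands in exactly one chunk.

```shell
# Print the lines of stdin that belong to one chunk.
# $1: current chunk (starting at 1), $2: number of chunks.
slice_chunk() {
  chunk=$1
  chunks=$2
  input=$(cat)
  total=$(( $(printf '%s\n' "$input" | wc -l) ))
  # Chunk size rounded up, so the last lines are not dropped
  size=$(( (total + chunks - 1) / chunks ))
  start=$(( size * (chunk - 1) + 1 ))
  end=$(( size * chunk ))
  printf '%s\n' "$input" | sed -n "${start},${end}p"
}

# Usage: 7 "files" split into 3 chunks, asking for the second chunk
seq 1 7 | slice_chunk 2 3
```

With 7 lines and 3 chunks the size is 3, so the second chunk gets lines 4 to 6 and the third chunk gets only line 7.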
So my testing job finally looked like this:
testing:
  name: Testing
  runs-on: ubuntu-latest
  strategy:
    matrix:
      chunk: [1,2,3,4,5,6,7,8,9,10]
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-node@v1
    - uses: actions/cache@v2
      id: npm-cache
      with:
        path: '**/node_modules'
        key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    - name: Install npm
      if: steps.npm-cache.outputs.cache-hit != 'true'
      run: npm install
    - name: jest
      run: ./scripts/setupTestMatrix.sh ${{ matrix.chunk }} 10
This generated the following jobs in the workflow:
All the jobs run in parallel, and all the test jobs run about the same number of test suites:
Testing: 3
Finally, all of this reduced the GitHub Actions workflow time
From:
To:
One thing worth mentioning is that you may have chunks that are really fast (e.g. 20 sec) while others take 1 minute. We assign a few test files to each of our chunks, but we don’t know how many tests each test file has, so you may end up with some chunks that run many more tests than others. An interesting project would be to balance the chunks, so that each one gets the same number of tests (instead of the same number of test suites).
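As a starting point for that idea, here is a rough sketch of a greedy balancer in shell. It is not something I run in the workflow; the function name balance_chunks is made up, and it assumes you already have a weight per test file (for example a count of test cases, or the duration of a previous run) and that file paths contain no whitespace.

```shell
# Read "<weight> <file>" lines from stdin and print "<chunk> <file>" lines.
# Files are handled from heaviest to lightest, and each one goes to the
# chunk with the smallest total weight so far.
# $1: number of chunks. Assumes file paths contain no whitespace.
balance_chunks() {
  sort -k1,1nr | awk -v n="$1" '
    {
      best = 1
      for (i = 2; i <= n; i++) {
        if (load[i] + 0 < load[best] + 0) best = i
      }
      load[best] += $1
      print best, $2
    }'
}

# Usage: four files with weights 5, 4, 3 and 2, split into 2 chunks
printf '%s\n' "5 a.test.js" "4 b.test.js" "3 c.test.js" "2 d.test.js" | balance_chunks 2
```

In this example both chunks end up with a total weight of 7, even though the individual files are very different in size.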