Power Tester

Playwright Tests: Yes, you can finish ALL your playwright tests in 2 to 5 mins!

3k-tests-in-90-seconds

Problem Statement

Playwright provides a option to distribute tests on multiple machines with its Test Sharding option. However there are two main problems with it:

    Sharding is not aware of how much time each test takes.

  1. Even though playwright sharding tries to evenly distributes tests, as explained here in the article about Balancing Shards, in terms of execution time, not all tests take equal time.

    And since playwright sharding does not keep a track of time each test takes, this means, there will be some runners that will finish early and some that will finish late. Eventually your total test run time will be bottlenecked due to the slowest running tests.

    Often the only way team tries to bring down total exeuction time is by increasing the number of runners, which increases both the cost but also the waste per runner since not all runners are utilized to their full potential.

  2. uneven-test-run

    No examples of dynamic scaling of runners

  3. Playwright provides a Github Actions Example to show how to define runners for sharding. However, since often the same reusable workflow is used to run all tests, and test fixes, there are times, when the number of tests touched are very few, but the reusable workflow will still spin up all the runners. This results into a lot of wasted resources and money.

    NOTE: As mentioned in previous point, since team often tries to increase the runners to bring down execution time, it means, the waste is very high, if the team does not have a way to scale runners down, when only fixing a few broken tests.

  4. too-many-runners

Desired Solution

A desired solution would involve creating a dynamic test sharding mechanism that is aware of how much time each test takes and depending on the load of tests and desired total execution time, creates required runners at the run time.

This means, if we have a lot of tests to run, then based on the desired total execution time, we can create bundles of tests based on their run time, and create as many runners as required to cover all tests in the given time. In this model, one runner may run 20 small tests where another may run only 2 big tests. Where each runner, does not surpass the total execution time limit (of say 2 mins)

dynamic-runners-for-dynamic-load

It also means, that if we only touch a few tests and want to finish them in 1 or 2 mins, then we only need to spin up the bare minimum runners to finish all tests in 1 or 2 mins.

less-runnes-for-less-load

Solution Design

Below is a explain of how the above solution works: The key components of this design are:

STATE.JSON

  1. This file should contain a map of each test case and the time it took to run on the local machine.
  2. The updated tests should run automatically when a developer tries to commit these tests (using pre-commit hooks).
  3. File should be automatially created after each test run (using a custom runner) that runs automatically after each run.
  4. And by committing this file automatically (using a post-commit hook).

A Custom GitHub action

Actual Solution

Here is the reference to the actual solution RunWright (a custom github action and caller workflow code example), that shows how different pieces come together to make it all work).

Reference

Reference spreadseheet: Spreadsheet