Name

This string specifies the name of this test run configuration.

Output directory

Directory path to use as the root folder for all of the runner output (logs, reports, etc.).
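
A minimal sketch of how these two options might look in a YAML Marathonfile (the key names, name and outputDir, and all values are assumptions for illustration):

    # hypothetical Marathonfile snippet
    name: "sample-app tests"              # name of this test run configuration
    outputDir: "build/reports/marathon"   # root folder for logs, reports, etc.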

Analytics configuration

Configuration of the analytics backend used for storing and retrieving test metrics. This plays a major part in optimising performance and mitigating flakiness.

Disabled analytics

By default, no analytics backend is expected, which means that each test will be treated as a completely new test.

InfluxDB

Assuming you’ve already set up InfluxDB, you need to provide:

  • url
  • username
  • password
  • database name
  • retention policy

A separate database name is quite useful when you have multiple configurations of tests/devices and you don’t want metrics from one configuration to affect another, e.g. regular and end-to-end tests.
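
As a sketch, the InfluxDB section of a YAML Marathonfile might look like the following; the key names (analyticsConfiguration, dbName, retentionPolicyConfiguration and friends) are assumptions about the schema, and every value is a placeholder:

    analyticsConfiguration:
      type: "influxdb"                        # select the InfluxDB backend
      url: "http://influx.example.com:8086"   # placeholder URL
      user: "root"                            # placeholder credentials
      password: "root"
      dbName: "marathon"                      # a database per configuration keeps metrics separate
      retentionPolicyConfiguration:           # how long the metrics are kept
        name: "rpMarathon"
        duration: "90d"
        shardDuration: "1h"
        replicationFactor: 1
        isDefault: true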

Pooling strategy

The pooling strategy defines how devices are grouped into pools.

Omni

All connected devices are merged into one group. This is the default mode.

Abi

Devices are grouped by their ABI, e.g. x86 and mips.

Manufacturer

Devices are grouped by manufacturer, e.g. Samsung and Yota.

Model

Devices are grouped by model name, e.g. LG-D855 and SM-N950F.

OS version

Devices are grouped by OS version, e.g. 24 and 25.
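
For illustration, selecting a pooling strategy in a YAML Marathonfile might look like this; the type identifiers (omni, abi, manufacturer, device-model, os-version) are assumed names, not verified against your marathon version:

    poolingStrategy:
      - type: "device-model"   # group devices by model; "omni" is the default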

Sharding strategy

Sharding is a mechanism that allows marathon to control which tests are scheduled for execution inside each pool.

Parallel sharding

Executes each test once, spreading the tests in parallel across all of the available devices in the pool. This is the default behaviour.

Count sharding

Executes each test count times inside each pool. For example, suppose you want to measure the flakiness of a specific test, so you need to execute it many times. Instead of running the whole build X times, just use this sharding strategy and each test will be executed X times.
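
For example, to execute every test five times, a sketch of the YAML configuration (assuming a count type with a count parameter) could be:

    shardingStrategy:
      type: "count"
      count: 5   # each test is executed 5 times inside each pool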

Sorting strategy

In order to optimise the performance of test execution, tests need to be sorted. This requires an analytics backend to be enabled, since historical data is needed to anticipate test behaviour such as duration and success/failure rate.

No sorting

No sorting of tests is done at all. This is the default behaviour.

Success rate sorting

For each test, the analytics storage provides the success rate for a time window specified by the timeLimit parameter. All the tests are then sorted by success rate in increasing order, that is failing tests go first and successful tests go last.
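
A sketch of this in YAML, assuming the strategy type is called success-rate and timeLimit takes an ISO-8601 instant:

    sortingStrategy:
      type: "success-rate"
      timeLimit: "2024-01-01T00:00:00Z"   # placeholder: start of the metrics time window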

Execution time sorting

For each test, the analytics storage provides the Xth-percentile duration for a time window specified by the timeLimit parameter; the percentile is configurable via the percentile parameter. All the tests are then sorted so that long tests go first and short tests are executed last. This allows marathon to minimise the balancing error at the end of the execution.
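
A sketch of this in YAML, under the same assumptions as above plus an execution-time type name:

    sortingStrategy:
      type: "execution-time"
      percentile: 80.0                    # sort by the 80th percentile of historical durations
      timeLimit: "2024-01-01T00:00:00Z"   # placeholder: start of the metrics time window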

Batching strategy

The batching mechanism allows you to trade off stability for performance. A group of tests executed using one single run is called a batch. Most of the time this means that tests in the same batch share device state, so there is no clean-up between them. On the other hand, you gain some performance, since issuing the execution command is usually quite slow (up to 10 seconds on some platforms).

Isolate batching

No batching is done at all; each test is executed using a separate command execution, that is performance is sacrificed in favour of stability. This is the default mode.

Fixed size batching

Each batch is created based on the size parameter, which is required: when a new batch of tests is needed, at most size tests are dequeued from the queue.

Optionally, if you want to limit the duration of a batch, you have to specify the timeLimit for the test metrics time window and the durationMillis. For each test, the analytics backend is queried for a percentile of its duration. If the sum of durations becomes larger than durationMillis, no more tests are added to the batch.

This is useful if you have very long tests and you use batching, e.g. you batch by size 10 and your test run duration is roughly 10 minutes, but you have tests that are expected to run for 2 minutes each. If you batch all of them together, then at least one device will be finishing its execution in 20 minutes while all the other devices might already have finished. To mitigate this, just specify a time limit for the batch using durationMillis.

Another optional parameter for this strategy is lastMileLength. At the end of the execution, batching tests actually hurts performance, so for the last tests it’s much better to execute them in parallel in separate batches. This works only if you execute on multiple devices. You can specify when this optimisation kicks in using the lastMileLength parameter: the last lastMileLength tests will use it.
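
Putting the required and optional parameters together, a sketch of this strategy in YAML (the fixed-size type name and the percentile key are assumptions) might be:

    batchingStrategy:
      type: "fixed-size"
      size: 10                            # required: dequeue at most 10 tests per batch
      durationMillis: 600000              # optional: stop filling a batch once its expected duration exceeds 10 minutes
      percentile: 80.0                    # optional: duration percentile queried from analytics
      timeLimit: "2024-01-01T00:00:00Z"   # optional: placeholder metrics time window
      lastMileLength: 10                  # optional: run the last 10 tests in separate batches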

Flakiness strategy

This is the main anticipation logic for marathon. Using the analytics backend, we can estimate each test’s success rate and hence queue preventive retries to mitigate the flakiness of the tests and the environment.

Ignore flakiness

Nothing is done in this mode. This is the default behaviour.

Probability based flakiness strategy

The main idea is that the flakiness strategy anticipates the flakiness of a test based on its probability of passing, and tries to maximise the probability of at least one pass when the test is executed multiple times. For example, if the probability of test A passing is 0.5 and the configuration requests a probability of 0.8, then the flakiness strategy schedules test A to be executed 3 times (0.5 × 0.5 × 0.5 = 0.125 is the probability of all three executions failing, so with probability 0.875 > 0.8 at least one of them will pass).

The minimal probability that you want is specified using minSuccessRate, over the time window controlled by timeLimit. If you specify too high a minSuccessRate you’ll get too many retries, so the upper bound is controlled by the maxCount parameter: the strategy calculates the number of retries required to reach the minSuccessRate, but if that number is higher than maxCount, then maxCount is used instead.
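
A sketch of this strategy in YAML, assuming it is selected with a probability type name:

    flakinessStrategy:
      type: "probability"
      minSuccessRate: 0.8                 # desired probability that at least one execution passes
      maxCount: 3                         # hard cap on the number of scheduled executions per test
      timeLimit: "2024-01-01T00:00:00Z"   # placeholder: start of the metrics time window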

Retry strategy

This is the logic that kicks in if the preventive logic above failed to anticipate the required number of retries. It works after the tests have actually been executed.

No retries

As the name implies, no retries are done. This is the default mode.

Fixed quota retry strategy

The totalAllowedRetryQuota parameter specifies how many retries are allowed in total, across all the tests. retryPerTestQuota controls how many retries can be done for each test individually.
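
A sketch in YAML, assuming the strategy is selected with a fixed-quota type name:

    retryStrategy:
      type: "fixed-quota"
      totalAllowedRetryQuota: 100   # at most 100 retries across all tests
      retryPerTestQuota: 3          # at most 3 retries for any individual test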

Filtering configuration

Filtering of tests is important since usually we as developers have the same codebase for all the different types of tests we want to execute. In order to indicate to marathon which tests you want to execute, you can use the whitelist and blacklist parameters. The whitelist is applied first, then the blacklist. Each accepts TestFilters based on the class name, fully qualified class name, package, annotation or method, and each filter expects a regular expression as its value.

In order to filter using multiple filters at the same time, a composition filter is also available. It accepts a list of base filters as well as an operation such as UNION, INTERSECTION or SUBTRACT. This allows you to create complex filters, e.g. take all the tests whose names start with E2E, but keep only the methods ending with Test.

An important thing to mention is that by default platform-specific ignore options are not taken into account. This is because a cross-platform test runner cannot account for all the possible test frameworks out there. However, each framework’s ignore option can still be “explained” to marathon, e.g. JUnit’s org.junit.Ignore annotation can be specified in the filtering configuration.
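
A sketch of the E2E example above plus a blacklist entry for JUnit’s ignore annotation; the filter type names, the op values and the regex field are assumptions about the YAML schema:

    filteringConfiguration:
      whitelist:
        - type: "composition"
          op: "INTERSECTION"               # assumed operations: UNION, INTERSECTION, SUBTRACT
          filters:
            - type: "simple-class-name"
              regex: "^E2E.*"              # classes whose names start with E2E...
            - type: "method"
              regex: ".*Test$"             # ...but only methods ending with Test
      blacklist:
        - type: "annotation"
          regex: "org\\.junit\\.Ignore"    # "explain" JUnit's @Ignore to marathon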

Ignore failures

By default the build fails if some tests failed. If you want the build to succeed even when some tests fail, set this option to true.

Test class regular expression

By default, test classes are searched for using the "^((?!Abstract).)*Test$" regular expression. You can override this if you need to.

Test output timeout

This parameter specifies how long the underlying test executor should wait without any test output before timing out. By default this is set to 60 seconds.

Debug mode

Enables very verbose logging to stdout from all the marathon components. Very useful for debugging.
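
For reference, the four options above (ignore failures, test class regular expression, test output timeout and debug mode) might appear together in a Marathonfile like this; the key names are assumptions, and the values show the documented defaults where applicable:

    ignoreFailures: false            # default: the build fails if some tests failed
    testClassRegexes:
      - "^((?!Abstract).)*Test$"     # default test class regular expression
    testOutputTimeoutMillis: 60000   # default: time out after 60 seconds without output
    debug: true                      # very verbose logging to stdout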

Vendor configuration

See the relevant vendor module page, e.g. Android or iOS.

Analytics tracking

To better understand the use-cases that marathon serves, we ask you to provide us with anonymised information about your usage. By default this is disabled. Set this option to true to enable it.

Uncompleted test retry quota

By default, tests that don’t have any status reported after execution (for example, because a device disconnected during the execution) are retried indefinitely. You can limit the total number of executions for such cases using this option.

Strict mode

By default, if one of the test retries succeeds, then the test is considered successfully executed. If you require a success status only when all retries were executed successfully, you can enable strict mode. This may be useful, for example, to verify that the flakiness of a test has been fixed.
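
Finally, a sketch of these three flags as top-level Marathonfile keys (the names analyticsTracking, uncompletedTestRetryQuota and strictMode are assumptions):

    analyticsTracking: false       # anonymised usage reporting, disabled by default
    uncompletedTestRetryQuota: 3   # cap the total executions of tests that never report a status
    strictMode: true               # a test passes only if every retry passed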