Sample Ratio Mismatch

2022/01/19

Summary

SRM is a mismatch between the expected sample ratio and the observed sample ratio

SRMs are a common data quality issue

Approximately 6% of experiments at Microsoft exhibit an SRM

SRMs cause a selection bias that invalidates any causal inference

We can use a standard t-test or chi-squared test to compute the p-value

There are different ways to detect SRMs

What is SRM

The Sample Ratio Mismatch (SRM) metric looks at the ratio of users (or other units) between two variants. If the experiment design assigns users to the two variants in a specific ratio, then the observed ratio should closely match the design.

Scenario 1

Suppose we run an experiment with two variants, control and treatment, each assigned 50% of users. We expect to see an approximately equal number of users in each, but our results are:

Control: 821,588 users
Treatment: 815,482 users

The observed ratio of treatment to control is 815,482 / 821,588 = 0.993, whereas the design calls for 1.0.

The p-value for a sample ratio of 0.993 under the 50/50 design is 1.8E-6.

A p-value this small means the imbalance is very unlikely to be due to chance, so there is an SRM. The more likely explanation is a bug in the implementation of the experiment.
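
The Scenario 1 numbers can be checked with a chi-squared goodness-of-fit test. The sketch below is one way to do it using only the Python standard library; the function name `srm_p_value` and its parameters are my own, not from any particular experimentation platform. It relies on the fact that, with one degree of freedom, the chi-squared tail probability equals `erfc(sqrt(chi2 / 2))`.

```python
import math

def srm_p_value(control: int, treatment: int,
                expected_control_share: float = 0.5) -> float:
    """Chi-squared goodness-of-fit p-value (1 degree of freedom)
    for a two-variant split. Name and signature are illustrative."""
    total = control + treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (treatment - expected_treatment) ** 2 / expected_treatment)
    # With 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

# Counts from Scenario 1 (50/50 design).
control, treatment = 821_588, 815_482
ratio = treatment / control
p_value = srm_p_value(control, treatment)
print(f"ratio = {ratio:.3f}, p-value = {p_value:.1e}")
```

The result should land close to the 1.8E-6 quoted above. For a design that is not 50/50, pass the designed control share, for example `expected_control_share=0.9` for a 90/10 split.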

Causes

Buggy randomization of users
Data pipeline issues
Residual effects (a deployment, bug fix, or new feature released around the experiment can interfere with it)

Debugging SRMs

Validate that there is no difference upstream of the randomization point
Validate that variant assignment is correct
Follow the stages of the data processing pipeline
- Bot filtering, User ID filtering
Exclude the initial period
- Sometimes the variants did not start at the same time. For example, caches take time to prime
Look at the Sample Ratio for segments
- Look at each day separately
- Does a browser, app, or platform segment stand out (e.g., Android, iOS)?
- Do new users and returning users show different ratios?
Look at the intersection with other experiments
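
The "look at the Sample Ratio for segments" step can be sketched as a loop that runs the same chi-squared check per segment. Everything here is illustrative: the segment counts are made up, the helper `srm_p_value` is my own name, and the 0.001 threshold is an assumed value chosen to be strict because checking many segments raises the chance of a false alarm.

```python
import math

def srm_p_value(control, treatment, expected_control_share=0.5):
    """Chi-squared goodness-of-fit p-value (1 degree of freedom)."""
    total = control + treatment
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control
    chi2 = ((control - expected_control) ** 2 / expected_control
            + (treatment - expected_treatment) ** 2 / expected_treatment)
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical (control, treatment) user counts per platform segment.
segments = {
    "Android": (40_000, 40_210),
    "iOS": (35_000, 33_600),
    "Desktop": (25_000, 24_900),
}

ALPHA = 0.001  # assumed threshold, not a universal standard

flagged = []
for name, (control, treatment) in segments.items():
    p = srm_p_value(control, treatment)
    if p < ALPHA:
        flagged.append(name)
    print(f"{name:8s} ratio = {treatment / control:.3f}  p-value = {p:.2e}")

print("Segments with a likely SRM:", flagged)
```

The same loop works for days, browsers, or new versus returning users: swap the dictionary keys for whichever segment is being examined. A segment that is flagged on its own narrows down where to look for the bug.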

Some exciting reads

#ab-testing #sample-ratio-mismatch

lukkiddd. 2022, powered by Jekyll Garden
