Bangda Sun

Practice makes perfect

AB Testing (Udacity) Learning Notes (3)

Designing Experiments.

1. Basic Components

1.1 Unit of Diversion

Unit of diversion is the subject to test, this is like “sample” in statistical analysis. We need to decide how to assign events to either the control or experiment group. The assignment will be based on unit of diversion, it needs to be user consistency, e.g. user, cookie (identified as same person), etc.

1.2 Unit of Analysis

Unit of analysis is usually the “denominator” of the metric. When empirical variability is higher than analytical variability for the metric, the it could be the unit of analysis is different from the unit of diversion.

1.3 Population

This is determined from multiple perspectives.

1.4 Sizing

The size depends on both significance level (both statistical and practical) and sensitivity. The choice of metrics and unit of diversion are also needed to take into account since they will affect the variability of the metrics.

1.5 Duration and Exposure

Duration relates to the proportion of the traffic. The reason why not run the test on all traffic:

  • safety, if something goes wrong
  • effect of holiday / week variation (get data too fast might not be good)
  • need traffic for other test

1.6 Cohort

People who enter the experiment at same time will be the test subject. Cohorts are harder to analyze since they need more data. Typically you only want to use them when you are looking for user stability. For example, you have a learning effect or you want to measure something like increasing usage of the site or device, these are the case you want to see if changes will have a real effect on their behavior relative to their history.

1.7 Summary

In previous examples, all unit of diversions are really looking at proxies for users, the real users are hard to identified, for user login, one user can login multiple accounts; for cookie, one user can change to another device. Incorrect settings can end up with the mix of the same people on both control and experiment side, which makes the testing unreliable.

AB testing is inter-user experiment, which means same people cannot be in both control and experiment sides. There are other methods called intra-user experiment.