Tag: statistics

A/B testing, zero-inflated (truncated) distributions and power
Naive A/B testing just uses t-tests or proportion tests, on the assumption that at large sample sizes the choice of statistical test does not matter much. I explore the case of a zero-inflated, upper-bounded Poisson distribution and find that using the wrong test can require 3x the sample size to achieve the same statistical power, a difference large enough to matter in a real business setting.

Gaussian Processes: a versatile data science method that packs infinite dimensions
Last semester, I learned about Gaussian Processes. They seemed really intriguing at first glance, and they turned out to be even more intriguing as you dig deeper. This post is an application-oriented intro to Gaussian Processes. I’ll cover GP regression, time series forecasting, and the use of GPs in Bayesian optimization, among other things.

Interpretation of log transformations in linear models: just how accurate is it?
Log transformations and their interpretation as percentage impact are taught in every introductory regression class. But are most people aware that there is a hidden approximation behind the percentage-based intuition? One that may not be appropriate in some cases?
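
As a rough illustration of the approximation in question (a minimal sketch, not taken from the post itself): in a model with a log-transformed outcome, a coefficient b on a one-unit change is usually read as a 100*b percent change, while the exact change is 100*(exp(b) - 1) percent. The function names below are hypothetical and chosen only for this example.

```python
import math

def approx_pct_change(beta):
    # Textbook reading: coefficient times 100 is taken as the percent change.
    return 100.0 * beta

def exact_pct_change(beta):
    # Exact percent change in the outcome implied by a log-outcome coefficient.
    # expm1(beta) computes exp(beta) - 1 accurately for small beta.
    return 100.0 * math.expm1(beta)

# Compare the two readings across a range of coefficient sizes.
for beta in (0.01, 0.05, 0.1, 0.5, 1.0):
    approx = approx_pct_change(beta)
    exact = exact_pct_change(beta)
    print(f"beta={beta:.2f}  approx={approx:6.2f}%  exact={exact:6.2f}%  gap={exact - approx:6.2f} pts")
```

For small coefficients the two readings agree closely; the gap widens as the coefficient grows.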