An interactive introduction to wavelets and discrete wavelet transformation for data scientists

A few weeks ago, our team won Georgia Tech’s MSc Analytics case competition week, sponsored by Cox Communications. We were asked to build a solution that detects impaired cable transmission signal frequencies, clusters those signals by similarity of impairment and, leveraging information about the physical network structure, identifies potential upstream sources of impairment.

One of the reasons we got to a winning solution is that I happened to learn about this magical thing called the discrete wavelet transformation in one of my fall classes (ISYE6404 Nonparametric Statistics, in case you’re wondering). Why magical? Well, I don’t know how else to describe an approach that lets you take a signal roughly 8,700 data points long and approximate it with a small set of coefficients (perhaps as few as 100-1000) while keeping the approximation error below ~5%.

Original signal
Reconstructed signal that achieves a 95% compression rate with an error rate of only 3.5%.
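To make the idea of "keep a few coefficients, reconstruct the rest" concrete, here is a minimal sketch in plain Python. It is not the code we used in the competition: it assumes the Haar wavelet (the simplest one), a synthetic signal whose length is a power of two, and relative L2 error as the error measure. The function names (haar_dwt, haar_idwt, compress) are made up for this illustration.

```python
import math

def haar_dwt(signal):
    """Full multi-level orthonormal Haar transform.
    Assumes len(signal) is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (approximation) and differences (detail).
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = approx + detail
        n = half  # recurse on the approximation part only
    return coeffs

def haar_idwt(coeffs):
    """Inverse of haar_dwt."""
    out = list(coeffs)
    s = math.sqrt(2.0)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / s)
            merged.append((a - d) / s)
        out[:2 * n] = merged
        n *= 2
    return out

def compress(signal, keep):
    """Keep the `keep` largest-magnitude coefficients, zero the rest,
    and return (reconstruction, relative L2 error)."""
    coeffs = haar_dwt(signal)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    sparse = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    recon = haar_idwt(sparse)
    num = math.sqrt(sum((x - y) ** 2 for x, y in zip(signal, recon)))
    den = math.sqrt(sum(x * x for x in signal))
    return recon, num / den

# Synthetic example (an assumption, not the competition data):
# a sine wave with a step in the middle, 1024 samples long.
n = 1024
signal = [math.sin(2 * math.pi * i / n) + (0.5 if i >= n // 2 else 0.0)
          for i in range(n)]

_, err_5pct = compress(signal, keep=n // 20)   # keep ~5% of coefficients
_, err_25pct = compress(signal, keep=n // 4)   # keep 25% of coefficients
_, err_all = compress(signal, keep=n)          # keep everything
print(f"relative error keeping 5%: {err_5pct:.4f}")
print(f"relative error keeping 25%: {err_25pct:.4f}")
```

Because the Haar transform used here is orthonormal, the squared reconstruction error equals the sum of the squared coefficients that were dropped, which is why keeping the largest ones is the natural choice. In practice you would probably reach for a library such as PyWavelets and a smoother wavelet rather than rolling your own.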

When I first learned about wavelets, I looked for resources online, but most of them were too advanced for me to fully grasp. One exception was the blog post “A Guide to Using the Wavelet Transform in Machine Learning”, which I highly recommend reading in full.

Either way, I figured I would create something myself, too. The case competition provided an excellent dataset to build on, and The Practical Guide to Discrete Wavelet Transformation was born. I hope you find it useful!