Simulating Context-Free Bandits

July 22, 2019

In this post I describe a framework and experiment in simulating context-free bandits. The explore-exploit dilemma can be found in many aspects of everyday life. To put this in concrete terms, imagine a person receives a free 30-meal gift card from a new breakfast restaurant that just opened up in their city. The restaurant may be well known for having good breakfast options, and as a breakfast lover, the person wants to find the best breakfast option on the menu. Note that best here means personally favored, not categorically best as defined by a food critic or social media popularity. ... Read more

Concept Drift: Notes for the practitioner

October 20, 2018

In this article, I share notes on handling concept drift for machine learning models. Introduction: Concept drift occurs in an online supervised learning setting when the relationship between the input data X and output data y is altered to the extent that a model mapping X to y can no longer do so with the same efficacy. In online supervised learning, there are three types of drift that can occur: (1) feature drift, i. ... Read more

The Monty Hall Problem

August 18, 2018

In this post I explain how I came to reason about the Monty Hall problem and provide a tool for you to run experiments to see the outcome of different strategies for playing the game. The Monty Hall problem is an interesting probability teaser. The premise is this: suppose you are at a game show with three doors, one of which has a prize. You, as a guest, have two chances to choose a door to win the prize. ... Read more
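As a rough illustration of the kind of experiment the post describes (a minimal sketch, not the author's tool), the Python simulation below compares the "stay" and "switch" strategies. It assumes the standard rules of the puzzle, which the excerpt above does not spell out: after the guest's first pick, the host opens a different door that does not hide the prize, and the guest may then switch to the remaining closed door. The names `play` and `win_rate` are hypothetical.

```python
import random


def play(switch, rng):
    """Play one round; return True if the guest wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    first_pick = rng.choice(doors)
    # Assumed rule: the host opens a door that is neither the guest's
    # pick nor the prize door.
    opened = rng.choice([d for d in doors if d != first_pick and d != prize])
    if switch:
        # Switch to the only door that is still closed and not yet picked.
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == prize


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning for a given strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumed rules, the estimate for switching should approach 2/3 and the estimate for staying should approach 1/3 as the number of trials grows.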

Understanding Data

April 3, 2018

Rich Metadata on Data: I have been learning many things about dealing with data at a large scale. At first, I kept using the term quality to describe the state of data. However, it quickly became clear that the term had various dimensions to it, and it could not summarise the issues one can observe. I have come to use the expression understanding data instead because (1) it captures the state I wish to describe and (2) it speaks to the scientific and functional purposes of that state. ... Read more

Property Based Testing: Describing expected behaviour in terms of properties

March 19, 2018

In my daily work, I often need to write data pipelines to produce metrics, create datasets for machine learning models, or just clean up logs. I had usually done testing the traditional way, verifying my code does what I expect by checking for normal and edge cases I could think of. But over the past couple of months, I started using property based testing, and I feel like my code quality has improved dramatically. ... Read more

Language Style Guide

March 18, 2018

The first programming language I learned was Pascal. I was in my math class in secondary school, and my teacher told us about a programming competition that was held every year between schools. At that point, everything I knew about programming was a pure construct of my imagination. I knew that you were supposed to type something into a machine and it would do things. But I was into computer hardware at the time, so I figured why not. ... Read more

Task Paralysis

February 26, 2018

What do you do when you have several things you think you should be doing, but aren’t sure which of them you should do next, either right now or at some other point in the near future? This is what is called task paralysis. And it happens to all of us, at one point or another. You look up from your screen and you suddenly realize there are sticky notes filled with things you need or want to do on your desk, from checking out that presentation about market places to reviewing a pull request. ... Read more

Relatively Painless Technical Excellence

December 20, 2017

I have been thinking about excellence for the past couple of months, since I started working full time again. Excellence in writing code, designing systems, and formulating problems. Last week I had the opportunity to attend a talk titled Relatively Painless Technical Excellence by J.B. Rainsberger. It was one of the best talks I've heard on agile and software engineering. I have to start by saying I didn't have many expectations about it, and I was positively surprised by the content. ... Read more

fsql: search through your file system like a database

May 20, 2017

fsql is a tool I came across recently, while searching for Go projects on GitHub. It's a command line tool that lets you run SQL-like queries on your file system. You can search through your files based on their name, size, mode, and time. It'll take you 10 min to get set up and start using it like a (semi-)pro. It's definitely one of my favourite command line tools now. You'll find everything you need to know about it on its GitHub page. ... Read more

These Past Weeks in Science & Tech - 002

May 2, 2017

In this May's edition of “These Past Weeks in Science & Tech”, I'll be discussing: Biological Data Stores. Our bodies are a walking library of information. From chemical processes that regulate our biological functions, to experiences imprinted in our minds, our cells somehow manage to store, access, and use information. So, it is no surprise that we should turn to biology to find the next generation of storage. ... Read more