March 15, 2020

We are in the middle of a viral outbreak. There are many things that we're learning as we go along about this new coronavirus. In the meantime, health organizations around the world are scrambling to learn as much as they can from the known cases. In Stockholm, where I live, the number of cases has been growing over the past two weeks. During this time, I have found many tools and dashboards that have been created to track the total number of infections. ... Read more

Fashion, Trees, and Convolutions: Part III - Convolutions

March 10, 2020

In this mini-series of posts, I will describe a hyper-parameter tuning experiment on the fashion-mnist dataset. I wanted to test out a guided and easy way to run hyper-parameter tuning. In Part II, I described setting up the end-to-end pipeline with a baseline, and running hyper-parameter tuning with the hyperopt package. In this third and final chapter, I describe my target models, a convolutional neural network trained from scratch and a transfer learning model. ... Read more

Fashion, Trees, and Convolutions: Part II - Baseline

March 8, 2020

In this mini-series of posts, I will describe a hyper-parameter tuning experiment on the fashion-mnist dataset. In Part I, I described the workflow to create the data for my experiments. In this post, I describe creating the baseline and a guided hyper-parameter tuning method. The Baseline: For any modeling task, I always like to create a baseline model as a starting point. Typically, this will be a relatively basic model. ... Read more

Fashion, Trees, and Convolutions: Part I - Data Crunch

March 6, 2020

In this mini-series of posts, I will describe a hyper-parameter tuning experiment on the fashion-mnist dataset. Hyper-parameter tuning is a very important aspect of training machine learning models, particularly neural networks, where the architecture, optimizer and data can be subject to different parameters. When developing machine learning solutions, there is an iterative cycle that can be adopted to enable fast iteration, continuous targeted improvements, and testing of the solution, as with any other software systems problem. ... Read more

Cognitive Dissonance: Type hinting and linting in Python

February 12, 2020

At work, the adoption of Python 3 was finally moving at warp speed - the end of Python 2 support might have had something to do with it. As a result, there was a lot of code to migrate over. One of the things I did during this migration was add type hinting as well as linter checks to the codebase. And I... was not... ready for that! When I first read about type hinting I thought it would be a neat thing to help people new to the language and existing users navigate through code. ... Read more

Revising Documents

December 14, 2019

In this post, I share my notes on doing document revision. As an engineer working in a company with more than 1000 people, I am often in a position to read and write documents. Documents, when used in moderation, can be useful for getting input on a design spec, describing a problem and context, or describing how a project was carried out, e.g. this is how we built our first ever lunch place automatic rotation assignment system to avoid the same debate we have every week. ... Read more

Encoding in Python 2 and 3

November 21, 2019

In this post I describe an encoding behavior in Python 2 and how it differs in Python 3. If you're still using Python 2, which many folks are, you may run into encoding issues when processing data. Let's say we have a file called translations.txt that contains translations between English and Mandarin (from the Oxford dictionary): "The book has 500 pages of text.","这本书正文有500页。" "I'll send you a text as soon as I have any news. ... Read more

Simulating Context-Free Bandits

July 22, 2019

In this post I describe a framework and experiment for simulating context-free bandits. The explore-exploit dilemma can be found in many aspects of everyday life. To put this in concrete terms, imagine a person receives a free 30-meal gift card from a new breakfast restaurant that just opened up in their city. The restaurant may be well known for having good breakfast options, and as a breakfast lover, the person wants to find the best breakfast option on the menu (note that best here means personally favored, not categorically best as defined by a food critic or social media popularity). ... Read more

Concept Drift: Notes for the practitioner

October 20, 2018

In this article, I share notes on handling concept drift for machine learning models. Introduction: Concept drift occurs in an online supervised learning setting, when the relationship between the input data X and output data y is altered to the extent that a model mapping X to y can no longer do so with the same efficacy. In online supervised learning, there are three types of drift that can occur: (1) feature drift, i. ... Read more

The Monty Hall Problem

August 18, 2018

In this post I explain the way I came to reason about the Monty Hall problem and provide a tool for you to run experiments to see the outcome of different strategies for playing the game. The Monty Hall problem is an interesting probability teaser. The premise is this: suppose you are at a game show with three doors, one of which has a prize. You, as a guest, have two chances to choose a door to win the prize. ... Read more
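
For readers who want to try such an experiment right away, here is a minimal sketch in Python. It is not the tool from the post. It assumes the standard rules, which the excerpt above does not spell out: the host knows where the prize is and always opens an empty door that the guest did not pick. The names `play` and `win_rate` are made up for this illustration.

```python
import random


def play(switch, rng, n_doors=3):
    """Play one round; return True if the guest wins the prize.

    Assumed rules (not taken from the post): the host always opens
    one empty door that the guest did not pick.
    """
    prize = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)

    # Host opens a door that is neither the guest's pick nor the prize.
    opened = rng.choice(
        [d for d in range(n_doors) if d != pick and d != prize]
    )

    if switch:
        # With three doors, exactly one other closed door remains.
        pick = next(
            d for d in range(n_doors) if d != pick and d != opened
        )

    return pick == prize


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win probability of a strategy by simulation."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Under these assumed rules, the estimate for always switching should land near 2/3 and the estimate for always staying near 1/3.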