Revising Documents

December 14, 2019

In this post, I share my notes on doing document revision As an engineer working in a company with more than 1000 people, I am often in a position to read and write documents. Documents, when used in moderation, can be useful for getting input on a design spec, describing a problem and context, or describing how a project was carried out, e.g. this is how we built our first ever lunch place automatic rotation assignment system to avoid the same debate we have every week. ... Read more

Concept Drift: Notes for the practicioner

October 20, 2018

In this article, I share notes on handling concept drift for machine learning models. Introduction Concept drift occurs in an online supervised learning setting, when the relationship between the input data X and output data y is altered to the extent that a model mapping X to y can no longer do so with the same efficacy. In online supervised learning, there are three types of drift that can occur: (1) feature drift, i. ... Read more

Understanding Data

April 3, 2018

Rich Metadata on Data I have been learning many things about dealing with data at a large scale. At first, I kept using the term quality to describe the state of data. However, it quickly became clear that the term had various dimensions to it, and it could not summarise the issues one can observe. I have come to use the expression understanding data instead because (1) it captures the state I wish to describe and (2) speaks the scientific and functional purposes of that state. ... Read more