Computing Variance Online

September 5, 2020

In my previous post, A Fable of an Aggregator, I dedcribed the properties of an abstract data type (ADT) that enables concurrent execution of aggregations, such as sum, mean, max. For example, if we want the mean of a collection of values, it sufficies for us to accumluate its sum and count - dividing the former by the latter gives us the answer. More importantly, this accumulation can be done concurrently - and hence, it’s parallelizable. ... Read more

A fable of an Aggregator

June 13, 2020

Many parallel data computing tasks can be solved with one abstract data type (ADT). We will describe how an Aggregator does that by walking through a problem we want to solve with parallelism and uncovering the ideal properties of an ADT that enable us to do so. Relevance of Aggregations: The Desirable ADT In the world of analytics and machine learning, data processing makes up a significant chunk of the plumbing required to do both. ... Read more