Cognitive Dissonance: Type hinting and linting in Python

February 12, 2020 by guidj

At work, the adoption of python 3 was finally moving at warp speed - the end of its support might have had something to do with it. As a result, there was a lot of code to migrate over. One of the things I did during this migration was add type hinting as well as linter checks to the codebase. And I... was not... ready for that!

When I first read about type hinting I thought it would be a neat thing to help people new to the language and existing users navigate through code. After all, the language wasn't becoming statically typed. I figured the hints were, as its name suggested, slight indications. However, combined with static analysis tools, they can actually help you identify bugs and other issues in your code, and I'm all in for that.

The trouble started when said issues started getting surfaced, to the point that my entire way of thinking when writing python was, for lack of a better term, rudely disturbed.

Let me give you an example. There was a function that converted datetime.datetime and datetime.date values into the first of the month.

def first_of_the_month(date):
    # reset value to first of the month
    return date.replace(day=1)

This function was used to transform both datetime.datetime and datetime.date objects. So what should its input be? Simple: a union of both types. Unions represent a set of possible types that a value can take.

from typing import Union

def first_of_the_month(date: Union[datetime.datetime, datetime.date]):
    # reset value to first of the month
    return date.replace(day=1)

Now, how about the output type? Equally trivial: a union of both as well. Whatever value type we get, we return a new one of the same type.

def first_of_the_month(date: Union[datetime.datetime, datetime.date]) -> Union[datetime.datetime, datetime.date]:
    # reset value to first of the month
    return date.replace(day=1)

And now all seems right with the world. Except, it's not, because somewhere down the road, there is a function that applies this transformation to two distinct values and tries to subtract them from one another after:

def logic_with_dates(first, second):
    new_first = first_of_the_month(first)
    new_second = first_of_the_month(second)
    return new_first - new_second

This function, logic_with_dates, has something to say about us returning a Union from first_of_the_month. It complains, rightly so, that the operation we apply to the returned values can't be applied between two different value types. Meaning, we can subtract one datetime.date value from another; we can subtract one datetime.datetime value from another; but we cannot subtract a datetime.date value from a datetime.datetime value, or vice-versa.

Now, in no place was this actually being done. Anywhere in the code where this function was used, it was dealing with either one or the other type of values. So, technically, in our very dynamic and “everything is defined at runtime” world, everything was perfectly fine. But in this new “you verify once you write” world, things were very much broken. And rightly so.

What we're dealing with here is the usage of different types while assuming them to be compatible. Inside first_of_the_month, they were. But outside, with the arithmetic operations, they are not. Obviously, there is a solution to this particular problem. In fact, there are several solutions to it. What struck me is how something that was completely fine suddenly became broken due to type hints.
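One of those solutions (my sketch here, not necessarily what the codebase ended up with) is a constrained TypeVar: it tells the checker that whatever type goes in is exactly the type that comes out, so a caller that passes two values of the same type gets two values of the same type back:

```python
import datetime
from typing import TypeVar

# DateT can be exactly datetime.date or exactly datetime.datetime,
# and the return type matches whichever one the caller passed in.
DateT = TypeVar("DateT", datetime.date, datetime.datetime)

def first_of_the_month(date: DateT) -> DateT:
    # reset value to first of the month
    return date.replace(day=1)

print(first_of_the_month(datetime.date(2020, 2, 12)))        # 2020-02-01
print(first_of_the_month(datetime.datetime(2020, 2, 12, 8))) # 2020-02-01 08:00:00
```

typing.overload would be another way to express the same input-to-output contract.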

It's important for me to note that I also spend a fair amount of time writing code in Scala and sometimes Java. I personally enjoy taking advantage of the type system in Scala, for instance, through constructs such as type classes. Some people believe Option or Optional types are something to get rid of. I think they beautifully describe, semantically, the idea that we can be dealing with a potentially undefined value, and that our code should factor that into its design. When it comes to data processing, I have yet to experience anything better. But all of this is within the kingdom of strongly typed languages.

The reason I learned python is because, truth be told, I read somewhere that it was a very easy language to learn. At the time, I was still learning C++ and Java, and I wanted to learn something different. I wanted a language that would give me access to tools like web scrapers and text processing libraries. I hated writing Java and I pretty much still do. It's an extremely practical language, given the plethora of libraries that exist for it, its (somewhat) strongly typed system, compilation speed, runtime speed, etc. I just didn't enjoy writing it. Its idioms have evolved, and it has tried to embody concepts from other languages. It's a great choice for many problems, but to me a programming language is no different from a verbal language. Everyone has preferences over how they enjoy expressing themselves.

The reason I stayed with python is because it always felt easy going. Don't get me wrong, writing good code in any language is hard. And typically, when writing in python, you probably do yourself a favour by having more tests than you think you need. So while I do welcome type hinting, I must confess that I just wasn't expecting it to change the way I viewed and used the language in the way that it has.

I got warnings for re-assigning values to variables, something I would never do in Scala. Warnings for assigning None to a variable first, then giving it its intended value after some conditional evaluation, while declaring its type as the value's type rather than an Optional - again, something I would never do in Scala.
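For illustration (the function and names are hypothetical, not from the actual codebase), this is the shape of that second pattern, written the way the checker wants it: if a variable starts out as None, its annotation should admit None:

```python
from typing import Optional

def describe(count: int) -> Optional[str]:
    # Annotating label as Optional[str] is honest about the None;
    # annotating it as plain str is what draws the warning.
    label: Optional[str] = None
    if count > 0:
        label = "positive"
    elif count < 0:
        label = "negative"
    return label

print(describe(3))  # positive
print(describe(0))  # None
```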

And that's when I realised that the reason I did those things is because, well... I was allowed to. And the reason I didn't do those things in other languages, like Scala, is because it's strictly forbidden, as in, your code just won't compile. I was aware of this. But I was not prepared to face this. For a moment, it was as if a veil of cognitive dissonance had been lifted — I cannot tell you how hard, and politely, I have pressed on code reviews about the proper use of immutable values and Option types. Yet in python, I barely even thought about it. In fairness, some of the tasks I perform using python don't quite lend themselves to these rules, e.g. usage or implementation of iterative computing algorithms. But a great deal of them do.

Type hinting isn't just a cosmetic change to python. It changes the way a person reasons about their code. By combining type hinting with static analysis tools, what was more permissible has become less so. And I do see this change as for the better.

I suspect, still, that I will crave a playground with less discipline from time to time. A place to write something and know that it works because... reasons.

For now, I bid goodnight to the wild west, and welcome the guardrails of safety. Safety from statically bad code.