Static duck typing

Welcome to chapter 4, we're already half-way through the course! It is now time to really dive into the deep end with typing, and start learning how to add type hints to the more uncommon Python syntax and code styles.

And we're starting with probably the most common Python code pattern, called "duck typing".

Duck typing is essentially a way that Python uses to make its functions really flexible. Instead of expecting an object of a specific class, these functions will be fine with an object of any type, as long as it satisfies some conditions. These conditions are usually specific properties or methods being present on the object. As long as they're available, the code works.

We've already seen an example of this in the course, although a much more simplified one:

from typing import Any

def do_quack(duck: Any) -> None:
    if hasattr(duck, "quack"):
        quack_function = duck.quack
        quack_function()
    else:
        print("Expected an object with a 'quack' method.")

class Goose:
    def quack(self) -> None:
        print("Goose goes 'quack'.")

goose = Goose()
do_quack(goose)

do_quack only cares about the quack method being present, and nothing else. If it is there, it calls the method. But it's not that great -- You can still pass anything to do_quack, and if it doesn't, the function simply returns without doing anything. That's not very good.

Let's go the other way around, by asking for a concrete type:

class Duck:
    def quack(self) -> None:
        print("Duck goes 'quack'.")

def do_quack(duck: Duck) -> None:
    duck.quack()

class Goose:
    def quack(self) -> None:
        print("Goose goes 'quack'.")

goose = Goose()
do_quack(goose)

This code 'works', but mypy complains that it expected an object of type Duck. You could just extend Goose from Duck:

class Duck:
    def quack(self) -> None:
        print("Duck goes 'quack'.")

def do_quack(duck: Duck) -> None:
    duck.quack()

class Goose(Duck):
    def quack(self) -> None:
        print("Goose goes 'quack'.")

goose = Goose()
do_quack(goose)

Now mypy is happy, and the code works! And traditionally, this has been the solutions in statically typed languages: just use inheritance. But this comes with the big problem that your code is now coupled to Duck: If Duck adds new behaviour, you add new behaviour too. In general, it's not as flexible.

What we really need, is a way to make Duck a type that can accept any type that implements quack. This concept is called a Protocol in Python:

from typing import Protocol

class Duck(Protocol):
    def quack(self) -> None:
        pass

def do_quack(duck: Duck) -> None:
    duck.quack()

class Goose:
    def quack(self) -> None:
        print("Goose goes 'quack'.")

goose = Goose()
do_quack(goose)

It works! Now mypy knows what we expect from a Duck. So as long as Goose satisfies the Duck protocol, mypy is fine with it. If you remove the quack method from Goose, or if you change its return type or signature, the type checking will start to fail. It's as simple as that.

Note that we completely got rid of the hasattr(duck, "quack") part of the code from the first example. That's because mypy will be checking if the code is fine or not anyway, we don't need to confirm it anymore at runtime. As long as we use mypy, we can rest assured.

Also note that we didn't provide any method body to Duck.quack(). That's because we're only interested in defining the method signature in a protocol class, not the actual contents inside it. So, mypy simply ignores the content. Traditionally, you're supposed to use ... for the body of a protocol method. We'll do that in the next example.

There's a lot more that can be done using protocols. A nice example would be the "callable" protocol:

def func() -> int:
    return 42

class FuncGenerator:
    def __call__(self) -> int:
        return 42

func2 = FuncGenerator()


print(func())
print(func2())

In Python, you can create your own objects that are "callable", i.e. you can run them like a function, if you define a __call__ method on that object's class. This function is run when you do obj(). This is an excellent place for using a protocol:

from typing import Protocol

class Callable(Protocol):
    def __call__(self) -> None: ...


def call_twice(function: Callable) -> None:
    """Calls the given function twice."""
    function()
    function()


def fortytwo() -> None:
    print(42)

class Counter:
    def __init__(self) -> None:
        self.count = 1

    def __call__(self) -> None:
        print("Call count:", self.count)
        self.count += 1

counter = Counter()

call_twice(fortytwo)
call_twice(counter)

We were able to create our own "callable" counter object, and pass it to call_twice, all while mypy being able to ensure our code will run fine.

Another example would be the "iterable" protocol. It refers to any object that can be iterated over using a for loop, like this for example:

def count_unique(items: list[str]) -> int:
    """Returns the number of unique items present"""
    unique_count = 0
    seen: set[str] = set()
    for item in items:
        if item not in seen:
            # We've seen a new unique item!
            unique_count += 1
            seen.add(item)

    return unique_count

count = count_unique(['10', '20', '20', '10'])
print(count)

greetings = {1: 'hello', 2: 'hi', 3: 'hello'}
unique_greetings = count_unique(greetings.values())
print(unique_greetings)

While the code works, we get an error telling us that we passed a dict_values[int, str], where list was expected. Basically it's telling us that greetings.values() does not return a list.

But does that really matter to us? I don't think so. We do know that greetings.values() returns us something that resembles a list, and we can definitely iterate over it:

greetings = {1: 'hello', 2: 'hi', 3: 'hello'}
for greeting in greetings.values():
    print(greeting)

And all we care about is the item being iterable and having strings inside it, what we really want is a protocol:

from typing import Protocol

class StringIterator(Protocol):
    def __next__(self) -> str: ...

class StringIterable(Protocol):
    def __iter__(self) -> StringIterator: ...

def count_unique(items: StringIterable) -> int:
    """Returns the number of unique items present"""
    unique_count = 0
    seen: set[str] = set()
    for item in items:
        if item not in seen:
            # We've seen a new unique item!
            unique_count += 1
            seen.add(item)

    return unique_count

count = count_unique(['10', '20', '20', '10'])
print(count)

greetings = {1: 'hello', 2: 'hi', 3: 'hello'}
unique_greetings = count_unique(greetings.values())
print(unique_greetings)

It might be a bit confusing seeing two protocols here, but that's just how Python does iteration. When you do a for loop on any object in Python, it does two things:

  • It calls iter(obj) on that object, which returns an "iterator" object.
  • Python will then repeatedly call next(iterator) until it no longer contains a value.

To support this, we need to define both an "iterable" type and an "iterator" type. The StringIterable expects to find a __iter__ method on the object, and StringIterator expects to get a string back on every iteration.

A list of strings and greetings.values() both satisfy this definition, as both can be passed to a for loop, and both return strings. So mypy is happy with the protocol. In fact, you can now pass a tuple of strings, a set of strings, and a lot of other kinds of types into the function, and it will all work.

I hope that you're thinking "this seems like it'd be a very common type to use", and you're right. The typing module already has an Iterator and an Iterable type defined in it. Though they are slightly different. We will take a look at them in the next section.