some background
When we decided to wake everyone up. Poland, 2019.

Using dunder methods to refine your data model

Introduction

Practically everyone who has ever used Python has come across at least one of the so-called Python magic methods. Dunder methods, as they are also called, are Python’s special methods that allow users to hook into specific actions being performed. Probably the most frequently encountered one is the __init__ method. It is called when instantiating a new object from a class, and by overriding it we gain control over that process.

However, this post is not going to take you through a full list of these.

Instead, we will show how you can effectively use this great Python feature by telling a short story. We will use quaternions as an example to explain the process of creating a data model that is easy to handle for other developers, especially those less enthusiastic about advanced algebra. Most importantly, we will explain the decision process and argue why it makes sense to even bother.

Simple object

A quaternion is an algebraic concept often used for describing rotations and widely applied in 3D modeling and gaming. Conceptually, quaternions can be thought of as an extension of the complex numbers, having not one but three imaginary parts. Depending on the application, they can also be understood as quotients of three-dimensional vectors, as four-dimensional objects, or as scalar-vector pairs.

OK, but how do we code this thing?

Instantiation

From the programming point of view, we do not need to go that deep into the math. At this stage, all we need to know is that a quaternion is defined by four real numbers.

__init__

class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

We model our mathematical “being” as an object, and we have our first dunder method. All this code does is tell Python: “look, when you create a new object of class Quaternion, I will need four numbers from you to instantiate it”. Since every quaternion is different, it makes sense to define w, x, y and z as instance attributes rather than class attributes.

Representation

Let’s create our first quaternion.

>>> q1 = Quaternion(1, 2, 3, 4)
>>> q1
<__main__.Quaternion at 0x7f4210f483c8>

Our quaternion is an object, but it looks pretty ugly. By default, we see an address of where that object lives in memory, but that description tells us nothing about the qualities we are interested in.

__repr__, __str__

def __repr__(self):
    return "Quaternion({}, {}, {}, {})".format(
        self.w, self.x, self.y, self.z)

def __str__(self):
    return "Q = {:.2f} + {:.2f}i + {:.2f}j + {:.2f}k".format(
        self.w, self.x, self.y, self.z)

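Here, we have defined two more methods. The __repr__ method returns the “official” representation of the object. Here it is written so that the returned string is a valid Python expression: eval(repr(obj)) recreates an object with the same components, and eval(repr(obj)) == obj holds once equality is defined for quaternions (see the Equality section further down).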
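
The round trip through eval can be checked with a small sketch like the one below. It repeats the class so that it runs on its own, and it compares the components by hand because no __eq__ has been defined at this point of the post.

```python
class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return "Quaternion({}, {}, {}, {})".format(
            self.w, self.x, self.y, self.z)


q1 = Quaternion(1, 2, 3, 4)

# repr() yields a valid Python expression, so eval() can rebuild the object.
clone = eval(repr(q1))

# Without a custom __eq__, == compares identities, so we compare
# the components explicitly.
same_components = (clone.w, clone.x, clone.y, clone.z) == (q1.w, q1.x, q1.y, q1.z)
print(repr(clone))
print(same_components)
```

The clone is a distinct object that carries the same four numbers.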

Good. The __repr__ method returns a string that is descriptive enough. However, we can further enhance our representation with __str__. The output will be as follows:

>>> q1          # calls q1.__repr__
Quaternion(1, 2, 3, 4)

>>> print(q1)   # calls q1.__str__
Q = 1.00 + 2.00i + 3.00j + 4.00k

Performing algebraic operations

You may wonder, at this point, why not use a list or a dictionary? It is certainly less code and we can easily see the elements.

Well, we indeed need something more than just a “bag of numbers”. There are two main arguments against it:

  1. We don’t want to rely on convention. Is w always going to be named “w” and used as the first argument? What if someone breaks it?
  2. We define this object to reflect the mathematical properties it is designed to represent.

Pretty tough, right? Apart from point 1, quaternions are additive. Try adding dictionaries or lists together: adding two dictionaries raises a TypeError, while adding two lists concatenates them, extending the number of elements and thus breaking our definition. There is another way.
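
To make the point concrete, here is a quick sketch of what the built-in containers do with the + operator. The variable names are only illustrative.

```python
a = [1, 2, 3, 4]
b = [0, 1, 3, 5]

# Lists concatenate instead of adding element by element.
combined = a + b
print(combined)
print(len(combined))  # eight elements, no longer a quaternion

# Dictionaries do not support + at all.
d1 = {"w": 1, "x": 2, "y": 3, "z": 4}
d2 = {"w": 0, "x": 1, "y": 3, "z": 5}
try:
    d1 + d2
    dict_add_failed = False
except TypeError:
    dict_add_failed = True
print(dict_add_failed)
```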

Addition

__add__

def __add__(self, other):
    w = self.w + other.w
    x = self.x + other.x
    y = self.y + other.y
    z = self.z + other.z
    return Quaternion(w, x, y, z)

There we have it. We have just overridden the + operator, making the addition of quaternions defined.

>>> q1 = Quaternion(1, 2, 3, 4)
>>> q2 = Quaternion(0, 1, 3, 5)
>>> q1 + q2
Quaternion(1, 3, 6, 9)

Subtraction

__sub__

We can do the same with subtraction. This time we will be fancy and do it in just one line of code.

def __sub__(self, other):
    return Quaternion(*list(map(lambda i, j: i - j, self.__dict__.values(), other.__dict__.values())))

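Although that was unnecessary, it also shows another convenient dunder. The __dict__ attribute holds all the attributes of an object as a dictionary. Note that the one-liner relies on the attributes being stored in the order w, x, y, z, which is the order in which __init__ assigns them.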
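
As a small illustration, the sketch below prints what __dict__ holds and shows a more explicit variant of __sub__ that does not depend on attribute order. The explicit variant is our own alternative, not the version used in the post.

```python
class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return "Quaternion({}, {}, {}, {})".format(
            self.w, self.x, self.y, self.z)

    def __sub__(self, other):
        # Explicit attribute access: longer, but independent of the
        # order in which attributes happen to be stored.
        return Quaternion(self.w - other.w, self.x - other.x,
                          self.y - other.y, self.z - other.z)


q1 = Quaternion(1, 2, 3, 4)
q2 = Quaternion(0, 1, 3, 5)

print(q1.__dict__)         # the instance attributes as a dictionary
print(list(q1.__dict__))   # keys appear in assignment order
print(q1 - q2)
```

The one-liner is shorter, but it silently breaks as soon as someone adds another attribute to the object.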

Multiplication

If you still think that overriding of operations is boring, now it is time for fun.

__matmul__

The easiest of all is the dot product (see this gist for more methods).

Since Python 3.5, it is represented with the @ operator, which invokes the __matmul__ method. For quaternions, it can be defined as an element-wise multiplication of the components followed by summing the results.
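
Since the post does not list the code for it, here is a minimal sketch of how __matmul__ could look. It assumes the dot product is the sum of the products of matching components and is not necessarily identical to the version in the gist.

```python
class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __matmul__(self, other):
        # Assumption: the dot product only makes sense between quaternions.
        if not isinstance(other, Quaternion):
            return NotImplemented
        # Multiply matching components, then sum the results.
        return (self.w * other.w + self.x * other.x
                + self.y * other.y + self.z * other.z)


q1 = Quaternion(1, 2, 3, 4)
q2 = Quaternion(0, 1, 3, 5)

dot = q1 @ q2
print(dot)  # 1*0 + 2*1 + 3*3 + 4*5 = 31
```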

The “normal” multiplication is harder though. First, the algebra distinguishes between quaternion-by-quaternion multiplication and quaternion-by-scalar multiplication. Secondly, quaternion-by-quaternion multiplication is not commutative, meaning that in general q1 * q2 ≠ q2 * q1.

__mul__

def __mul__(self, other):
    if isinstance(other, Quaternion):
        w = self.w * other.w - self.x * other.x - self.y * other.y - self.z * other.z
        x = self.w * other.x + self.x * other.w + self.y * other.z - self.z * other.y
        y = self.w * other.y + self.y * other.w + self.z * other.x - self.x * other.z
        z = self.w * other.z + self.z * other.w + self.x * other.y - self.y * other.x
        return Quaternion(w, x, y, z)
    elif isinstance(other, (int, float)):
        return Quaternion(*[other * i for i in self.__dict__.values()])
    else:
        raise TypeError("Operation undefined.")

Here, if the other is a quaternion, we compute the so-called Hamilton product and return a new object. If the other is a scalar (a number), we multiply each of the quaternion’s coordinates with that number. Finally, anything else raises an exception.
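
The lack of commutativity is easy to see on the basis elements. The sketch below repeats the class with the __mul__ from above and multiplies the units i and j in both orders.

```python
class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return "Quaternion({}, {}, {}, {})".format(
            self.w, self.x, self.y, self.z)

    def __mul__(self, other):
        if isinstance(other, Quaternion):
            # Hamilton product
            w = self.w * other.w - self.x * other.x - self.y * other.y - self.z * other.z
            x = self.w * other.x + self.x * other.w + self.y * other.z - self.z * other.y
            y = self.w * other.y + self.y * other.w + self.z * other.x - self.x * other.z
            z = self.w * other.z + self.z * other.w + self.x * other.y - self.y * other.x
            return Quaternion(w, x, y, z)
        elif isinstance(other, (int, float)):
            return Quaternion(*[other * i for i in self.__dict__.values()])
        else:
            raise TypeError("Operation undefined.")


i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)

print(i * j)  # Quaternion(0, 0, 0, 1), that is k
print(j * i)  # Quaternion(0, 0, 0, -1), that is -k
print(i * i)  # Quaternion(-1, 0, 0, 0), that is -1
```

Swapping the operands flips the sign of the result, which is exactly why the order of the factors matters.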

As mentioned earlier, the multiplication of quaternions is not commutative. However, that is only when multiplying quaternions by one another. With the current definition, if we execute 2 * q1, we will get an error. To fix it, we can use __rmul__ which covers our case:

__rmul__

def __rmul__(self, other):
    if isinstance(other, (int, float)):
        return self.__mul__(other)
    else:
        raise TypeError("Operation undefined.")

Now, we can multiply a quaternion by a scalar on either side, while a quaternion multiplies another quaternion in a strictly defined order.
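
A common alternative to raising TypeError ourselves is to return NotImplemented and let Python produce the error once no operand can handle the operation. The sketch below is a variation on the methods above, not the version used in this post, and shows that scalar multiplication then works from both sides.

```python
class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return "Quaternion({}, {}, {}, {})".format(
            self.w, self.x, self.y, self.z)

    def __mul__(self, other):
        if isinstance(other, Quaternion):
            w = self.w * other.w - self.x * other.x - self.y * other.y - self.z * other.z
            x = self.w * other.x + self.x * other.w + self.y * other.z - self.z * other.y
            y = self.w * other.y + self.y * other.w + self.z * other.x - self.x * other.z
            z = self.w * other.z + self.z * other.w + self.x * other.y - self.y * other.x
            return Quaternion(w, x, y, z)
        if isinstance(other, (int, float)):
            return Quaternion(other * self.w, other * self.x,
                              other * self.y, other * self.z)
        # Tell Python that we do not know how to handle this operand.
        return NotImplemented

    def __rmul__(self, other):
        # Called for "other * self" when other does not know what to do.
        if isinstance(other, (int, float)):
            return self.__mul__(other)
        return NotImplemented


q1 = Quaternion(1, 2, 3, 4)
left = 2 * q1
right = q1 * 2
print(left, right)

try:
    "a" * q1
    raised = False
except TypeError:
    raised = True
print(raised)
```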

Equality

We will skip division, as it follows the same pattern. Instead, let’s look at one more curiosity: equality.

What does it mean that two quaternions are actually equal? Is it when all components are pair-wise equal or perhaps when two objects represent the same truth?

We can go for any of these definitions… however, the very fact that we asked ourselves this question justifies overriding one more method.

__eq__

def __eq__(self, other):
    return all(abs(i) == abs(j) for i, j in zip(
        self.__dict__.values(), other.__dict__.values()))

Here we defined == to hold when the absolute values of all corresponding coordinates match.
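
The other definition mentioned above, pair-wise equality of the components, could be sketched as follows. The use of math.isclose and the tolerance value are our own assumptions, chosen so that floating-point rounding does not break the comparison.

```python
from math import isclose


class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __eq__(self, other):
        if not isinstance(other, Quaternion):
            return NotImplemented
        mine = (self.w, self.x, self.y, self.z)
        theirs = (other.w, other.x, other.y, other.z)
        # Component-wise comparison with an (assumed) absolute tolerance.
        return all(isclose(a, b, abs_tol=1e-9) for a, b in zip(mine, theirs))


a = Quaternion(0.1 + 0.2, 0, 0, 0)
b = Quaternion(0.3, 0, 0, 0)

print(a == b)  # the components differ only by rounding
print(Quaternion(1, 2, 3, 4) == Quaternion(1, -2, 3, 4))
```

Keep in mind that a class which defines __eq__ without defining __hash__ becomes unhashable, so such objects can no longer be used as dictionary keys or set members.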

Other operations

Python defines a list of operators that can be overridden. However, not every mathematical operation is represented in the dunder methods. In these cases, it is better to stick to “normal” methods, since the usage of other symbols would be counterintuitive.

For example:

from math import sqrt


def norm(self):
    return sqrt(sum(i ** 2 for i in self.__dict__.values()))

def conjugate(self):
    x, y, z = -self.x, -self.y, -self.z
    return Quaternion(self.w, x, y, z)

def normalize(self):
    norm = self.norm()
    return Quaternion(*[i / norm for i in self.__dict__.values()])

def inverse(self):
    # The inverse is the conjugate divided by the squared norm.
    qconj = self.conjugate()
    norm_sq = self.norm() ** 2
    return Quaternion(*[i / norm_sq for i in qconj.__dict__.values()])
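
A quick way to sanity-check these methods is to multiply a quaternion by its inverse, which should give the identity (1, 0, 0, 0) up to rounding. The sketch below assumes the inverse is the conjugate divided by the squared norm and includes only the methods needed for the check.

```python
from math import sqrt


class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    def __mul__(self, other):
        # Hamilton product only; scalars are left out of this sketch.
        w = self.w * other.w - self.x * other.x - self.y * other.y - self.z * other.z
        x = self.w * other.x + self.x * other.w + self.y * other.z - self.z * other.y
        y = self.w * other.y + self.y * other.w + self.z * other.x - self.x * other.z
        z = self.w * other.z + self.z * other.w + self.x * other.y - self.y * other.x
        return Quaternion(w, x, y, z)

    def norm(self):
        return sqrt(self.w ** 2 + self.x ** 2 + self.y ** 2 + self.z ** 2)

    def conjugate(self):
        return Quaternion(self.w, -self.x, -self.y, -self.z)

    def inverse(self):
        c = self.conjugate()
        n2 = self.norm() ** 2
        return Quaternion(c.w / n2, c.x / n2, c.y / n2, c.z / n2)


q = Quaternion(1, 2, 3, 4)
p = q * q.inverse()
components = (p.w, p.x, p.y, p.z)
print(components)  # approximately (1, 0, 0, 0)
```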

Overriding or overloading?

Throughout this post, we have been careful with our wording. We “wrote over” some of the dunder methods for good reason. However, we did not overload anything. Overloading in the strict sense, meaning several definitions of the same method that differ only in their signatures, does not exist in Python. One method can only have one interface to it, although Python allows a variable number of arguments.

Do you remember how we instantiated our objects? We used four numbers w, x, y, and z as arguments. When dealing with quaternions, however, it is common to derive them from yaw, pitch and roll angles, which are closely related to Euler angles.

The question arises, how do we go about them programmatically? Do we extend our __init__ method’s interface to accept seven numbers? Is it better to make some of them optional? If yes, then how do we ensure the integrity of our object? What price do we need to pay in terms of the code quality?

Speaking of quaternions, we do have an opportunity to implement something close to overloading, making our code even cleaner.

Pythonic “overloading”

Since all operations, as we saw, involve the w, x, y, z variables, there is no point in adding any more attributes to our class. What we must do, however, is provide a way to bypass the constructor’s interface with something that takes yaw, pitch, and roll, converts them to (w, x, y, z), and instantiates a new object.

First, let’s create the re-calculation method:

from math import sin, cos


@staticmethod
def _ypr_to_coords(yaw, pitch, roll):
    # Half angles get their own names so that they are not overwritten
    # by the x, y, z results computed below.
    hy = 0.5 * yaw
    hp = 0.5 * pitch
    hr = 0.5 * roll

    w = cos(hy) * cos(hp) * cos(hr) + sin(hy) * sin(hp) * sin(hr)
    x = cos(hy) * cos(hp) * sin(hr) - sin(hy) * sin(hp) * cos(hr)
    y = sin(hy) * cos(hp) * sin(hr) + cos(hy) * sin(hp) * cos(hr)
    z = sin(hy) * cos(hp) * cos(hr) - cos(hy) * sin(hp) * sin(hr)
    return w, x, y, z

The leading underscore marks the method as “internal” to the class. It does not operate on the object at all, which is why it takes neither self nor cls and is best declared as a static method. It only converts the angles and returns the coordinates.
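
The conversion can be tried out in isolation. The sketch below is a standalone version of the same formulas (the function name ypr_to_coords and the variable names are ours) and checks two properties: zero angles give the identity quaternion, and the result always has unit length.

```python
from math import sin, cos, pi, sqrt


def ypr_to_coords(yaw, pitch, roll):
    # Half angles, kept under separate names from the results.
    hy, hp, hr = 0.5 * yaw, 0.5 * pitch, 0.5 * roll

    w = cos(hy) * cos(hp) * cos(hr) + sin(hy) * sin(hp) * sin(hr)
    x = cos(hy) * cos(hp) * sin(hr) - sin(hy) * sin(hp) * cos(hr)
    y = sin(hy) * cos(hp) * sin(hr) + cos(hy) * sin(hp) * cos(hr)
    z = sin(hy) * cos(hp) * cos(hr) - cos(hy) * sin(hp) * sin(hr)
    return w, x, y, z


identity = ypr_to_coords(0, 0, 0)
half_turn = ypr_to_coords(pi, 0, 0)      # yaw of 180 degrees
arbitrary = ypr_to_coords(0.3, -1.1, 2.0)
length = sqrt(sum(c ** 2 for c in arbitrary))

print(identity)   # (1.0, 0.0, 0.0, 0.0)
print(half_turn)  # approximately (0, 0, 0, 1)
print(length)     # approximately 1
```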

Next, we use it as a part of our __init__’s second face.

class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    @classmethod
    def create_from_ypr(cls, yaw, pitch, roll):
        r = cls._ypr_to_coords(yaw, pitch, roll)
        return cls(*r)

Without affecting the __init__ or the attributes, we now have another way to instantiate our quaternion. With the @classmethod decorator, we make create_from_ypr(...) a class method rather than an instance method. When invoked on the class, it converts the angles to coordinates and returns a new instance of that class, created through the unchanged __init__ with the necessary arguments fed in.
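
One benefit of going through cls rather than naming Quaternion directly is that the alternative constructor keeps working for subclasses. The sketch below puts the pieces together. UnitQuaternion is a hypothetical subclass introduced only for this illustration, and the helper is declared with @staticmethod so that it can be called on the class.

```python
from math import sin, cos


class Quaternion:
    def __init__(self, w, x, y, z):
        self.w = w
        self.x = x
        self.y = y
        self.z = z

    @staticmethod
    def _ypr_to_coords(yaw, pitch, roll):
        hy, hp, hr = 0.5 * yaw, 0.5 * pitch, 0.5 * roll
        w = cos(hy) * cos(hp) * cos(hr) + sin(hy) * sin(hp) * sin(hr)
        x = cos(hy) * cos(hp) * sin(hr) - sin(hy) * sin(hp) * cos(hr)
        y = sin(hy) * cos(hp) * sin(hr) + cos(hy) * sin(hp) * cos(hr)
        z = sin(hy) * cos(hp) * cos(hr) - cos(hy) * sin(hp) * sin(hr)
        return w, x, y, z

    @classmethod
    def create_from_ypr(cls, yaw, pitch, roll):
        r = cls._ypr_to_coords(yaw, pitch, roll)
        return cls(*r)


class UnitQuaternion(Quaternion):
    """Hypothetical subclass, used only to show what cls refers to."""


q = Quaternion.create_from_ypr(0, 0, 0)
u = UnitQuaternion.create_from_ypr(0, 0, 0)

print(type(q).__name__)  # Quaternion
print(type(u).__name__)  # UnitQuaternion
print((q.w, q.x, q.y, q.z))
```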

This trick allows us to stay true to our definition, but adds more flexibility. We can even use this approach to define special kinds of objects:

class Quaternion:
    ...

    @classmethod
    def create_identity(cls):
        return cls(1, 0, 0, 0)

>>> q0 = Quaternion.create_identity()
>>> print(q0)
Q = 1.00 + 0.00i + 0.00j + 0.00k

Conclusions

In this post, we have presented a pattern behind using some of Python’s special features known as dunder methods. We have given an example of how these methods can be harnessed to model an abstract algebraic object, namely a quaternion. We have also made a clear distinction between overriding and overloading and shown how the latter can be implemented to facilitate working with our objects.

To see more methods, take a look at this gist. If there is anything to improve, please give the feedback in the comments below! Thanks ;)