Alex C. Viana's blog

Posted Wed 15 March 2017

Special Methods in Python

This post is intended to accompany a presentation I gave to the Baltimore Python Meetup on Wednesday, March 15th 2017.

This post assumes some familiarity with object-oriented programming as it relates to classes and inheritance in Python.

The purpose of this post is to explore Python's special methods, also called "magic", "built-in", "double underscore", or "dunder" methods. A common example is the __init__ method. These methods are considered special because they are referenced by Python to determine class behavior. Note that the double underscore notation is just a naming convention for indicating which methods Python considers special. Adding a double underscore to any other method (e.g. __foo__) doesn't give it any special properties.

Special Methods

To explore how these methods work we're going to pretend we're building a Scrabble-type game. A natural object for this type of project would be an object that encapsulates a word and it's score. Let's start with a super basic class:

In [1]: class ScrabbleWord():
    ...:     pass
    ...:
In [2]: my_word = ScrabbleWord()
In [3]: my_word.word = 'cat'
In [4]: my_word.score = 5

We start with an "empty" class, and manually assigned it two attributes, a word, and a score which I calculated by consulting a table of scrabble letter values. Using Python's dir() function we can inspect the attributes and methods on the my_word object.

In [1]: dir(my_word)
Out[1]: ['__class__', '__delattr__', '__dict__', '__doc__',
 '__format__', '__getattribute__', '__hash__', '__init__',
 '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
 '__setattr__', '__sizeof__', '__str__', '__subclasshook__',
 '__weakref__', 'score', 'word']

At the very end, we can see the score and word attributed I added as well as about a dozen other methods and attributes which all Python classes have. These other attributes, which all happen to be special attributes and methods, came from the Python base class. These are the minimum set of methods that Python will need to call to be able to use an object so it injects them into all its classes. Let's start looking at what we can do with these special methods.

The __init__ Method

One of the first things we can change about our ScrabbleWord class is that we have to manually add the the word and the score. Since we will always want to do this for every class instance we should just make it part of the object creation.

To define the way an object is created (also referred to as "instantialized") we can create an __init__ method. If we look at the output of the dir() function though we can see that ScrabbleWord already has an __init__ method. Because we are replacing an existing method inherited from the parent class, this is called operator overloading. In this case the parent, implicit in our class definition, is the Python base class for all objects.

As it turns out, all classes have an __init__ method. This is because Python always uses the __init__ method to create class instances. This is an example of what we mean when we say __init__ is a "special" method; these methods are referenced by Python to determine class behavior. Our replacement looks like this:

class ScrabbleWords(object):

    letter_values = {
        'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 'h': 4,
        'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 'o': 1, 'p': 3,
        'q': 10, 'r': 1, 's': 1, 't': 1, 'u': 1, 'v': 4, 'q': 10,
        'w': 4, 'x': 8, 'y': 4, 'z': 10
    }

    def __init__(self, word):
        self.word = word
        self.score = sum([
            ScrabbleWords.letter_values[letter] for letter in self.word]
        )

Our __init__ method now takes an argument called word which it assigns as an attribute to the class. I added a class attribute called letter_values that is accessed as an attribute on the class, i.e. ScrabbleWords.letter_values. This attribute is a dictionary that maps letters to their Scrabble score. __init__ uses this dictionary in a list comprehension to turn the input word into a list of letter scores, sum those up, and then assign the sum to the score attribute. Calling the class now looks like this:

In [1]: my_word = ScrabbleWords('cat')

In [2]: my_word.word
Out[2]: 'cat'

In [3]: my_word.score
Out[3]: 5

Not bad, we now provide the word at the same time as we create the object and both the word the calculated score are saved on the object. Let's see what else we can do.

The __repr__ and __str__ and Methods

Let's take a look at what happens when we try to inspect or print our class instance 1.

In [1]: my_word
Out[1]: <__main__.ScrabbleWord instance at 0x1084c80e0>

In [2]: print(my_word)
Out[2]: <__main__.ScrabbleWord instance at 0x1084c80e0>

This tells us that the my_word object is an instance of the ScrabbleWord class from the __main__ namespace and is located at memory address 0x1084c80e0. While this is all correct it's the default representation of any Python object and that information is not as useful as it could be for building our Scrabble game

We can make the results more descriptive by overloading the __repr__ and __str__ methods to fit our use case. This could look like this.

def __repr__(self):
    return '{}: {} points'.format(self.word, self.score)

def __str__(self):
    return '{}: {} points'.format(self.word, self.score)

Now interacting with our object will look like this:

In [1]: my_word
Out[1]: cat: 5 points

In [2]: print(my_word)
cat: 5 points

You can read more about the difference between the __repr__ and __str__ methods in this Stack Overflow Question.

The __eq__ and Other Operator Methods

At some point we're probably going to want some type of score optimization algorithm for our game, maybe for a simple computer opponent. To do that it would be helpful to have a way of comparing the value of two ScrabbleWord objects. Right now we can do that manually by explicitly comparing the score attribute.

In [1]: cat_word  = ScrabbleWords('cat')

In [2]: dog_word = ScrabbleWords('dog')

In [3]: cat_word
Out[3]: cat: 5 points

In [4]: dog_word
Out[4]: dog: 5 points

In [5]: cat_word.score == dog_word.score
Out[5]: True

In [6]: cat_word.score > dog_word.score
Out[6]: False

But if we start to implicitly compare the words Python doesn't know we mean to compare the score and we get unexpected behavior.

In [5]: cat_word == dog_word
Out[5]: False

In [8]: cat_word > dog_word
Out[8]: True

It turns out we can define the behavior of these comparison operators ==, !=, <, >, <=, and >= using the __eq__, __ne__, __lt__, __gt__, __le__, and __ge__ respectively. If we want to make our comparisons implicitly use the score attribute to prevent us from having to write it out every time our definitions could look like this:

def __eq__(self, y):
    if self.score == y.score:
        return True
    else:
        return False

def __ne__(self, y):
    if self.score != y.score:
        return True
    else:
        return False

def __lt__(self, y):
    if self.score < y.score:
        return True
    else:
        return False

def __gt__(self, y):
    if self.score > y.score:
        return True
    else:
        return False

def __le__(self, y):
    if self.score <= y.score:
        return True
    else:
        return False

def __ge__(self, y):
    if self.score >= y.score:
        return True
    else:
        return False

Now using our object would go like this:

In [1]: cat_word == dog_word
Out[1]: True

In [2]: cat_word != dog_word
Out[2]: False

In [3]: cat_word < dog_word
Out[3]: False

In [4]: cat_word > dog_word
Out[4]: False

In [5]: cat_word <= dog_word
Out[5]: True

In [6]: cat_word >= dog_word
Out[6]: True

It's worth pointing out what's happening here. Taking the __eq__ method as an example. Where Python sees the == operator between two instance of our ScrabbleWords class it'll evaluate the __eq__ method to determine the result.

But looking at our code and you can see the comparison being performed is self.score == y.score which is something like <int> == <int>. So what determines the answer of that statement? Because everything in python is an object, including integers, Python uses the integer classes __eq__ method to evaluate this.

The __iter__ and __len__ and Methods

When we play a word on the board in Scrabble we might care about how long a word is (to see if it'll fit on the board) and we might need to look at a word letter-by-letter (to evaluate things like bonus letter scores). If we try to do either of these things with Our class in it's current state we'll see neither are not supported.

In [1]: len(cat_word)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-48-7ee5d4e3efb8> in <module>()
----> 1 len(cat_word)

AttributeError: ScrabbleWords instance has no attribute '__len__'

In [2]: [letter for letter in cat_word]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-b747de29668a> in <module>()
----> 1 [letter for letter in cat_word]

TypeError: iteration over non-sequence

Again we can get around this by directly accessing the attribute we what to work with, in this case the word attribute.

In [1]: len(cat_word.word)
Out[1]: 3

In [2]: [letter for letter in cat_word.word]
Out[2]: ['c', 'a', 't']

But, following the theme of this post, we can do this implicitly with the special methods, in this case the __len__ and __iter__ methods.

def __iter__(self):
    for letter in self.word:
        yield letter

def __len__(self):
    return len(self.word)

Now our class works like this:

In [43]: len(cat_word)
Out[43]: 3

In [44]: [letter for letter in cat_word]
Out[44]: ['c', 'a', 't']

Conclusion

Python special methods become increasingly important as you start working on more involved projects. Either you will be writing your own libraries, in which case these are common "pythonic" programming patterns. Or even if you are leveraging existing libraries, at some point you will need to understand exactly what some special methods are doing in order to understanding some bug or quirk in the library.

Just doing some quick poking around on the github repos for some of my favorite open source projects turns up a couple of examples of classes with lots of special methods in the wild: BaseRowProxy in SQLAlchemy, SiteRegistry in Astropy.


  1. Note I'm using the Python 3 style print statement. To use this in Python 2 run from __future__ import print_function 

Category: Python
Tags: python presentations