geeknam's blog

A young inspired developer.

Python Caching Using Defaultdict

Today I learnt about the defaultdict container from my colleague, who uses it for caching. defaultdict is part of the collections module, which provides high-performance container datatypes. You can find more about it here. Note: defaultdict is new in Python 2.5.
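For anyone who has not used defaultdict before, here is a minimal sketch of its everyday use: the factory passed to the constructor supplies a value for any key that is missing. The sample pairs and the name swag_by_pony are made up for illustration and are not part of the import script in this post.

```python
from collections import defaultdict

# Hypothetical sample data: (pony name, swag name) pairs.
pairs = [("Twilight", "crown"), ("Rarity", "gem"), ("Twilight", "book")]

# list is the default factory: a missing key starts out as an empty list.
swag_by_pony = defaultdict(list)
for pony_name, swag_name in pairs:
    swag_by_pony[pony_name].append(swag_name)
```

Note that the factory is called with no arguments, so it cannot see which key was requested. That is fine for grouping, but it is not enough when the default value has to be built from the key itself.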

Let’s look at the code without a caching mechanism:

import csv
from models import Pony, PonySwag

# Getting some data from csv file
# ('rU' is the Python 2 universal-newline mode; on Python 3 use newline='' instead)
swags = csv.DictReader(open('pony_swag.csv', 'rU'), dialect='excel')

for swag in swags:
    # Get or create a pony
    pony = Pony.objects.get_or_create(
        name=swag['pony_name']
    )[0]

    # Create pony swag which belongs to a pony
    PonySwag.objects.get_or_create(
        name=swag['swag_name'],
        power=swag['swag_power'],
        pony=pony
    )[0]

In this scenario, for every row in the CSV file, the code hits the database to get a pony (based on its name) so that it can be assigned to the corresponding swag. If we cache the ponies as we get or create them, we only have to query the database for a pony once per distinct pony name.

import csv
from collections import defaultdict
from models import Pony, PonySwag

class PoniesCache(defaultdict):
    # If dict lookup fails this will be called
    def __missing__(self, key):
        # Insert a pony in cache dict and return it
        pony = Pony.objects.get_or_create(name=key)[0]
        self[key] = pony
        return pony

ponies_cache = PoniesCache()
# Getting some data from csv file
# ('rU' is the Python 2 universal-newline mode; on Python 3 use newline='' instead)
swags = csv.DictReader(open('pony_swag.csv', 'rU'), dialect='excel')

for swag in swags:
    pony = ponies_cache[swag['pony_name']]

    # Create pony swag which belongs to a pony
    PonySwag.objects.get_or_create(
        name=swag['swag_name'],
        power=swag['swag_power'],
        pony=pony
    )[0]

As you might have read, defaultdict is a subclass of dict that implements a method called __missing__.
__missing__ is called by the __getitem__() method of the dict class when the requested key is not found (here, when the value of swag['pony_name'] is not yet in ponies_cache). In our case, we override __missing__ to get or create the pony, store it in the cache under that key, and return it. The pony ends up in the dictionary with the key swag['pony_name'], so every later lookup of the same name is served from the dictionary without touching the database.
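To see the effect without a database, here is a small self-contained sketch of the same pattern. LookupCache, fake_db_lookup and the misses counter are hypothetical names I made up for this demo; fake_db_lookup only stands in for the Pony.objects.get_or_create(...) call, and the counter is there purely to show how often the expensive lookup runs.

```python
from collections import defaultdict


class LookupCache(defaultdict):
    """Caches the result of an expensive per-key lookup (illustrative only)."""

    def __init__(self, loader):
        # No default_factory is set; __missing__ below does all the work.
        defaultdict.__init__(self)
        self.loader = loader
        self.misses = 0

    def __missing__(self, key):
        # Called by dict.__getitem__ only when key is not in the dict yet.
        self.misses += 1
        value = self.loader(key)
        self[key] = value
        return value


def fake_db_lookup(name):
    # Hypothetical stand-in for Pony.objects.get_or_create(name=name)[0].
    return {"name": name}


cache = LookupCache(fake_db_lookup)
names = ["Applejack", "Rarity", "Applejack", "Applejack", "Rarity"]
ponies = [cache[name] for name in names]
```

Five lookups go through the cache, but the loader runs only twice, once per distinct name, and repeated names get back the very same object. Keep in mind that only the square-bracket lookup triggers __missing__: cache.get(name) and name in cache do not call it.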
