<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: orm</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/orm.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2021-08-24T23:16:42+00:00</updated><author><name>Simon Willison</name></author><entry><title>SQLModel</title><link href="https://simonwillison.net/2021/Aug/24/sqlmodel/#atom-tag" rel="alternate"/><published>2021-08-24T23:16:42+00:00</published><updated>2021-08-24T23:16:42+00:00</updated><id>https://simonwillison.net/2021/Aug/24/sqlmodel/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/tiangolo/sqlmodel"&gt;SQLModel&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A new project by FastAPI creator Sebastián Ramírez: SQLModel builds on top of both SQLAlchemy and the Pydantic validation library to provide a new ORM that’s designed around Python 3’s optional typing. The real brilliance here is that a SQLModel subclass is simultaneously a valid SQLAlchemy ORM model AND a valid Pydantic validation model, saving on duplicate code by allowing the same class to be used both for form/API validation and for interacting with the database.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlalchemy"&gt;sqlalchemy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pydantic"&gt;pydantic&lt;/a&gt;&lt;/p&gt;



</summary><category term="orm"/><category term="python"/><category term="sql"/><category term="sqlalchemy"/><category term="pydantic"/></entry><entry><title>PugSQL</title><link href="https://simonwillison.net/2019/Jul/3/pugsql/#atom-tag" rel="alternate"/><published>2019-07-03T18:19:38+00:00</published><updated>2019-07-03T18:19:38+00:00</updated><id>https://simonwillison.net/2019/Jul/3/pugsql/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://pugsql.org/"&gt;PugSQL&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Interesting new twist on a definitely-not-an-ORM library for Python. With PugSQL you define SQL queries in files, give them names and then load them into a module which allows you to execute them as Python methods with keyword arguments. You can mark statements as only returning a single row (or a single scalar value) with a comment at the top of their file.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dan-mckinley"&gt;dan-mckinley&lt;/a&gt;&lt;/p&gt;



</summary><category term="orm"/><category term="python"/><category term="sql"/><category term="dan-mckinley"/></entry><entry><title>Sqorn</title><link href="https://simonwillison.net/2018/Sep/19/sql/#atom-tag" rel="alternate"/><published>2018-09-19T18:34:16+00:00</published><updated>2018-09-19T18:34:16+00:00</updated><id>https://simonwillison.net/2018/Sep/19/sql/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://sqorn.org/"&gt;Sqorn&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
JavaScript library for building SQL queries that makes really smart usage of ES6 tagged template literals. The magic of tagged template literals is that they let you intercept and process interpolated values, making them ideally suited to escaping parameters in SQL queries. Sqorn takes that basic ability and layers on some really interesting API design to allow you to further compose queries.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://michelenasti.com/2018/09/19/Javascript-chiamare-funzioni-senza-usare-parentesi-(what!).html"&gt;Michele Nasti&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="orm"/><category term="sql"/></entry><entry><title>Describing events in code</title><link href="https://simonwillison.net/2018/Mar/28/describing-events-in-code/#atom-tag" rel="alternate"/><published>2018-03-28T15:41:59+00:00</published><updated>2018-03-28T15:41:59+00:00</updated><id>https://simonwillison.net/2018/Mar/28/describing-events-in-code/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.gyford.com/phil/writing/2018/03/28/events-part-2/"&gt;Describing events in code&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Phil Gyford built an online directory of every play, movie, gig and exhibition he has been to in the past 38 years using a combination of digital archaeology and saved ticket stubs. He built it using Django and published this piece extensively describing the process he went through to design the data model.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/philgyford/status/979008628148592642"&gt;@philgyford&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/phil-gyford"&gt;phil-gyford&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="orm"/><category term="phil-gyford"/></entry><entry><title>Building a combined stream of recent additions using the Django ORM</title><link href="https://simonwillison.net/2018/Mar/25/combined-recent-additions/#atom-tag" rel="alternate"/><published>2018-03-25T00:47:54+00:00</published><updated>2018-03-25T00:47:54+00:00</updated><id>https://simonwillison.net/2018/Mar/25/combined-recent-additions/#atom-tag</id><summary type="html">
    &lt;p&gt;I’m a big believer in the importance of a “recent additions” feed. Any time you’re building an application that involves users adding and editing records it’s useful to have a page somewhere that shows the most recent objects that have been created across multiple different types of data.&lt;/p&gt;
&lt;p&gt;I’ve used a number of techniques to build these in the past - from an extra database table (e.g. the Django Admin’s &lt;a href="https://github.com/django/django/blob/623117d1f1d7866b7321f0e73a6c497bb3b3cb01/django/contrib/admin/models.py#L33"&gt;LogEntry model&lt;/a&gt;) to a Solr or Elasticsearch index that exists just to serve recent additions.&lt;/p&gt;
&lt;p&gt;For a recent small project I found myself needing a recent additions feed and realized that there’s a new, simple way to build one thanks to the &lt;code&gt;QuerySet.union()&lt;/code&gt; method introduced in Django 1.11 &lt;a href="https://docs.djangoproject.com/en/2.0/releases/1.11/"&gt;back in April 2017&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Consider a number of different ORM models that can be added by users, each with a &lt;code&gt;created&lt;/code&gt; timestamp field.&lt;/p&gt;
&lt;p&gt;Prior to &lt;code&gt;QuerySet.union()&lt;/code&gt;, building a combined recent additions feed across multiple models was difficult: it’s easy to show recent additions for a single model, but how can we intersperse and paginate additions made to models stored across more than one table?&lt;/p&gt;
&lt;h3&gt;&lt;a id="Using_union_to_combine_records_from_different_models_12"&gt;&lt;/a&gt;Using .union() to combine records from different models&lt;/h3&gt;
&lt;p&gt;Consider the following three models:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;class Project(models.Model):
    name = models.CharField(max_length=128)
    description = models.TextField()
    created = models.DateTimeField(auto_now_add=True)

class Image(models.Model):
    project = models.ForeignKey(
        Project, related_name='images', on_delete=models.CASCADE
    )
    image = models.ImageField()
    created = models.DateTimeField(auto_now_add=True)

class Comment(models.Model):
    project = models.ForeignKey(
        Project, related_name='comments', on_delete=models.CASCADE
    )
    comment = models.TextField()
    created = models.DateTimeField(auto_now_add=True)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s build a single QuerySet that returns objects from all three models ordered by their created dates, most recent first.&lt;/p&gt;
&lt;p&gt;Using &lt;code&gt;.values()&lt;/code&gt; we can reduce these different models to a common subset of fields, which we can then &lt;code&gt;.union()&lt;/code&gt; together like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;recent = Project.objects.values(
    'pk', 'created'
).union(
    Image.objects.values('pk', 'created'),
    Comment.objects.values('pk', 'created'),
).order_by('-created')[:4]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now if we print out &lt;code&gt;list(recent)&lt;/code&gt; it will look something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[{'created': datetime.datetime(2018, 3, 24, 1, 27, 23, 625195, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 28},
 {'created': datetime.datetime(2018, 3, 24, 15, 51, 29, 116511, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 15},
 {'created': datetime.datetime(2018, 3, 23, 20, 14, 3, 31648, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 5},
 {'created': datetime.datetime(2018, 3, 23, 18, 57, 36, 585376, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 11}]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’ve successfully combined recent additions from three different tables! Here’s what the SQL for that looks like:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; from django.db import connection
&amp;gt;&amp;gt;&amp;gt; print(connection.queries[-1]['sql'])
(SELECT &amp;quot;myapp_project&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_project&amp;quot;.&amp;quot;created&amp;quot; FROM &amp;quot;myapp_project&amp;quot;)
 UNION (SELECT &amp;quot;myapp_image&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_image&amp;quot;.&amp;quot;created&amp;quot; FROM &amp;quot;myapp_image&amp;quot;)
 UNION (SELECT &amp;quot;myapp_comment&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_comment&amp;quot;.&amp;quot;created&amp;quot; FROM &amp;quot;myapp_comment&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There’s just one problem: we got back a bunch of &lt;code&gt;pk&lt;/code&gt; and &lt;code&gt;created&lt;/code&gt; records, but we don’t know &lt;em&gt;which&lt;/em&gt; model each of those rows represents.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Using_annotate_to_add_a_type_constant_to_the_rows_67"&gt;&lt;/a&gt;Using .annotate() to add a type constant to the rows&lt;/h3&gt;
&lt;p&gt;We can fix this by using Django’s &lt;code&gt;annotate()&lt;/code&gt; method combined with a &lt;code&gt;Value()&lt;/code&gt; object to attach a constant string to each record specifying the type of the row it represents. Here’s how to do that for a single model:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt;&amp;gt;&amp;gt; from django.db.models import Value, CharField
&amp;gt;&amp;gt;&amp;gt; list(Image.objects.annotate(
...     type=Value('image', output_field=CharField()
... )).values('pk','type', 'created')[:2])
[{'created': datetime.datetime(2018, 3, 22, 17, 16, 33, 964900, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 3,
  'type': 'image'},
 {'created': datetime.datetime(2018, 3, 22, 17, 49, 47, 527907, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 4,
  'type': 'image'}]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’ve added the key/value pair &lt;code&gt;'type': 'image'&lt;/code&gt; to every record returned from the queryset. Now let’s do that to all three of our models and combine the results using &lt;code&gt;.union()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;recent = Project.objects.annotate(
    type=Value('project', output_field=CharField())
).values(
    'pk', 'created', 'type'
).union(
    Image.objects.annotate(
        type=Value('image', output_field=CharField())
    ).values('pk', 'created', 'type'),
    Comment.objects.annotate(
        type=Value('comment', output_field=CharField())
    ).values('pk', 'created', 'type'),
).order_by('-created')[:4]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If we run &lt;code&gt;list(recent)&lt;/code&gt; we get this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[{'created': datetime.datetime(2018, 3, 24, 15, 51, 29, 116511, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 15,
  'type': 'comment'},
 {'created': datetime.datetime(2018, 3, 24, 15, 50, 3, 901320, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 29,
  'type': 'image'},
 {'created': datetime.datetime(2018, 3, 24, 15, 46, 35, 42123, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 15,
  'type': 'project'},
 {'created': datetime.datetime(2018, 3, 24, 7, 53, 15, 222029, tzinfo=&amp;lt;UTC&amp;gt;),
  'pk': 14,
  'type': 'comment'}]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is looking pretty good! We’ve successfully run a single SQL UNION query across three different tables and returned the combined results in reverse chronological order. Thanks to the &lt;code&gt;type&lt;/code&gt; column we know which model each record corresponds to.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Inflating_the_full_referenced_objects_114"&gt;&lt;/a&gt;Inflating the full referenced objects&lt;/h3&gt;
&lt;p&gt;Now we need to &lt;em&gt;inflate&lt;/em&gt; those primary key references into full ORM objects from each corresponding table.&lt;/p&gt;
&lt;p&gt;The most efficient way to do this is to collect together the IDs for each type and then run a single SQL query per type to load the full objects.&lt;/p&gt;
&lt;p&gt;Here’s code that does exactly that: it first collects the list of primary keys that need to be loaded for each type, then executes an efficient SQL IN query against each type to fetch the underlying objects:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;records = list(recent)

type_to_queryset = {
    'image': Image.objects.all(),
    'comment': Comment.objects.all(),
    'project': Project.objects.all(),
}

# Collect the pks we need to load for each type:
to_load = {}
for record in records:
    to_load.setdefault(record['type'], []).append(record['pk'])

# Fetch them 
fetched = {}
for type, pks in to_load.items():
    for object in type_to_queryset[type].filter(pk__in=pks):
        fetched[(type, object.pk)] = object

# Annotate 'records' with loaded objects
for record in records:
    key = (record['type'], record['pk'])
    record['object'] = fetched[key]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After running the above code, &lt;code&gt;records&lt;/code&gt; looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[{'created': datetime.datetime(2018, 3, 24, 15, 51, 29, 116511, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Comment: a comment&amp;gt;,
  'pk': 15,
  'type': 'comment'},
 {'created': datetime.datetime(2018, 3, 24, 15, 50, 3, 901320, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Image: Image object (29)&amp;gt;,
  'pk': 29,
  'type': 'image'},
 {'created': datetime.datetime(2018, 3, 24, 15, 46, 35, 42123, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Project: Recent changes demo&amp;gt;,
  'pk': 15,
  'type': 'project'},
 {'created': datetime.datetime(2018, 3, 24, 7, 53, 15, 222029, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Comment: Here is another comment&amp;gt;,
  'pk': 14,
  'type': 'comment'}]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can now feed this to a template and use it to render our recent additions page.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Wrapping_it_in_a_reusable_function_167"&gt;&lt;/a&gt;Wrapping it in a re-usable function&lt;/h3&gt;
&lt;p&gt;Here’s a function that implements the above in a re-usable way:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from django.db.models import CharField, Value

def combined_recent(limit, **kwargs):
    datetime_field = kwargs.pop('datetime_field', 'created')
    querysets = []
    for key, queryset in kwargs.items():
        querysets.append(
            queryset.annotate(
                recent_changes_type=Value(
                    key, output_field=CharField()
                )
            ).values('pk', 'recent_changes_type', datetime_field)
        )
    union_qs = querysets[0].union(*querysets[1:])
    records = []
    for row in union_qs.order_by('-{}'.format(datetime_field))[:limit]:
        records.append({
            'type': row['recent_changes_type'],
            'when': row[datetime_field],
            'pk': row['pk']
        })
    # Now we bulk-load each object type in turn
    to_load = {}
    for record in records:
        to_load.setdefault(record['type'], []).append(record['pk'])
    fetched = {}
    for key, pks in to_load.items():
        for item in kwargs[key].filter(pk__in=pks):
            fetched[(key, item.pk)] = item
    # Annotate 'records' with loaded objects
    for record in records:
        record['object'] = fetched[(record['type'], record['pk'])]
    return records
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is &lt;a href="https://gist.github.com/simonw/dd0da256716c0b0ec4efe12a81caec45"&gt;also available as a gist&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I can now use that function to combine arbitrary querysets (provided they share a &lt;code&gt;created&lt;/code&gt; datestamp field) like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;recent = combined_recent(
    20,
    project=Project.objects.all(),
    image=Image.objects.all(),
    comment=Comment.objects.all(),
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will return the most recent 20 records across all three types, with the results looking like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[{'when': datetime.datetime(2018, 3, 24, 15, 51, 29, 116511, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Comment: a comment&amp;gt;,
  'pk': 15,
  'type': 'comment'},
 {'when': datetime.datetime(2018, 3, 24, 15, 50, 3, 901320, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Image: Image object (29)&amp;gt;,
  'pk': 29,
  'type': 'image'},
 {'when': datetime.datetime(2018, 3, 24, 15, 46, 35, 42123, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Project: Recent changes demo&amp;gt;,
  'pk': 15,
  'type': 'project'},
 {'when': datetime.datetime(2018, 3, 24, 7, 53, 15, 222029, tzinfo=&amp;lt;UTC&amp;gt;),
  'object': &amp;lt;Comment: Here is another comment&amp;gt;,
  'pk': 14,
  'type': 'comment'}]
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;a id="Efficient_object_loading_with_selectprefetch_related_231"&gt;&lt;/a&gt;Efficient object loading with select/prefetch_related&lt;/h3&gt;
&lt;p&gt;If you’re going to render these objects on a page, it’s pretty likely you’ll need to load additional data about them. My example models above are deliberately simplified, but in any serious Django project it’s likely they will have additional references to other tables.&lt;/p&gt;
&lt;p&gt;We can apply Django’s magic &lt;code&gt;select_related()&lt;/code&gt; and &lt;code&gt;prefetch_related()&lt;/code&gt; methods directly to the querysets we pass to the function, like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;recent = combined_recent(
    20,
    project=Project.objects.all().prefetch_related('tags'),
    image=Image.objects.all().select_related('uploaded_by'),
    comment=Comment.objects.all().select_related('author'),
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Django’s query optimizer is smart enough to ignore those calls entirely when building the initial union queries, so even with the above extras the initial union query will still look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(SELECT &amp;quot;myapp_project&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_project&amp;quot;.&amp;quot;created&amp;quot;, 'project' AS &amp;quot;recent_changes_type&amp;quot; FROM &amp;quot;myapp_project&amp;quot;)
 UNION (SELECT &amp;quot;myapp_image&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_image&amp;quot;.&amp;quot;created&amp;quot;, 'image' AS &amp;quot;recent_changes_type&amp;quot; FROM &amp;quot;myapp_image&amp;quot;)
 UNION (SELECT &amp;quot;myapp_comment&amp;quot;.&amp;quot;id&amp;quot;, &amp;quot;myapp_comment&amp;quot;.&amp;quot;created&amp;quot;, 'comment' AS &amp;quot;recent_changes_type&amp;quot; FROM &amp;quot;myapp_comment&amp;quot;)
ORDER BY (2) DESC LIMIT 20
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;select_related()&lt;/code&gt; and &lt;code&gt;prefetch_related()&lt;/code&gt; clauses will then be incorporated into the subsequent SQL queries that are used to efficiently inflate the full objects from the database.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Taking_it_further_253"&gt;&lt;/a&gt;Taking it further&lt;/h3&gt;
&lt;p&gt;There are a bunch of interesting extensions that can be made to this pattern.&lt;/p&gt;
&lt;p&gt;Want pagination? The initial unioned queryset can be paginated using offset/limit by slicing the queryset, or using the &lt;a href="https://docs.djangoproject.com/en/2.0/topics/pagination/"&gt;Django Paginator class&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Want more efficient pagination (since offset/limit tends to get slow after the first few thousand rows)? We’re ordering by &lt;code&gt;created&lt;/code&gt; already which means it’s not difficult to build efficient range-based pagination, requesting all records where the &lt;code&gt;created&lt;/code&gt; date is less than the earliest date seen on the previous page.&lt;/p&gt;
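&lt;p&gt;The range-based approach can be sketched in plain Python, without the ORM. This is a minimal illustration rather than code from the project: the &lt;code&gt;page_before&lt;/code&gt; helper and the sample records are hypothetical, and in a Django version the same condition would presumably become a &lt;code&gt;created__lt&lt;/code&gt; filter applied to each queryset before the union.&lt;/p&gt;

```python
from datetime import datetime, timedelta


def page_before(records, before=None, limit=3):
    """Return up to `limit` records, newest first, created strictly before `before`.

    Hypothetical helper: it stands in for a filtered, ordered, sliced query.
    """
    ordered = sorted(records, key=lambda r: r["created"], reverse=True)
    if before is not None:
        # Keep only records older than the cursor from the previous page.
        ordered = [r for r in ordered if before > r["created"]]
    return ordered[:limit]


# Made-up sample data: seven records created one hour apart.
start = datetime(2018, 3, 24, 12, 0)
records = [
    {"type": "comment", "pk": n, "created": start - timedelta(hours=n)}
    for n in range(7)
]

first = page_before(records)
# The cursor for the next page is the earliest timestamp seen on this page.
cursor = first[-1]["created"]
second = page_before(records, before=cursor)
```

&lt;p&gt;One caveat with this sketch: records that share exactly the same timestamp as the cursor would be skipped, so a real implementation would likely need a tie-breaker such as the type and primary key.&lt;/p&gt;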
&lt;p&gt;Since everything is based on regular Django querysets, it’s possible to build all kinds of variants of the recent additions feed. So far we’ve just built one showing all changes across an entire application, but it’s not hard to apply additional filters to only show changes made by a specific user, or changes made relating to a specific foreign key relationship. If you can represent it as a collection of querysets that each expose a &lt;code&gt;created&lt;/code&gt; column you can combine them into a single feed.&lt;/p&gt;
&lt;p&gt;You don’t even need to use records that share a &lt;code&gt;created&lt;/code&gt; column: if you have objects with columns of differing names you can use an annotation to alias those columns, like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;recent = combined_recent(
    20,
    project=Project.objects.annotate(
        when=models.F('updated')
    ).prefetch_related('tags'),
    image=Image.objects.annotate(
        when=models.F('uploaded_at')
    ).select_related('uploaded_by'),
    comment=Comment.objects.annotate(
        when=models.F('commented_at')
    ).select_related('created_by'),
    datetime_field='when'
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I haven’t extensively load-tested this pattern, but I expect it will work fine for databases with tens of thousands of records but may start running into trouble if you have millions of records (though an index on the &lt;code&gt;created&lt;/code&gt; column should help a lot). If you need a recent additions feed at a larger scale than that you should probably look at a separate logging table or an external index in something like Elasticsearch instead.&lt;/p&gt;
&lt;p&gt;For another interesting thing you can do with &lt;code&gt;.union()&lt;/code&gt; check out my article on &lt;a href="https://simonwillison.net/2017/Oct/5/django-postgresql-faceted-search/"&gt;Implementing faceted search with Django and PostgreSQL&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="django"/><category term="orm"/></entry><entry><title>Implementing faceted search with Django and PostgreSQL</title><link href="https://simonwillison.net/2017/Oct/5/django-postgresql-faceted-search/#atom-tag" rel="alternate"/><published>2017-10-05T14:12:27+00:00</published><updated>2017-10-05T14:12:27+00:00</updated><id>https://simonwillison.net/2017/Oct/5/django-postgresql-faceted-search/#atom-tag</id><summary type="html">
    &lt;p&gt;I’ve added &lt;a href="https://simonwillison.net/search/"&gt;a faceted search engine&lt;/a&gt; to this blog, powered by PostgreSQL. It supports regular text search (proper search, not just SQL "like" queries), filter by tag, filter by date, filter by content type (entries vs blogmarks vs quotations) and any combination of the above. Some example searches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/search/?q=postgresql"&gt;All content matching “postgresql”&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/search/?q=django&amp;amp;type=quotation"&gt;Just quotations matching “django”&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/search/?q=python+javascript&amp;amp;tag=mozilla&amp;amp;year=2007"&gt;All content matching “python” and “javascript” with the tag “mozilla” posted in 2007&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It also provides facet counts, so you can tell how many results you will get back before you apply one of these filters - and get a general feeling for the shape of the corpus as you navigate it.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2017/faceted-search.png" alt="Screenshot of my faceted search interface" style="width: 100%"/&gt;&lt;/p&gt;

&lt;p&gt;I love this kind of search interface, because the counts tell you so much more about the underlying data. Turns out I was &lt;a href="https://simonwillison.net/search/?q=javascript&amp;amp;type=quotation"&gt;most active in quoting people talking about JavaScript back in 2007&lt;/a&gt;, for example.&lt;/p&gt;
&lt;p&gt;I usually build faceted search engines using either &lt;a href="https://simonwillison.net/tags/solr/"&gt;Solr&lt;/a&gt; or &lt;a href="https://simonwillison.net/tags/solr/"&gt;Elasticsearch&lt;/a&gt; (though the first version of search on this blog was actually powered by &lt;a href="http://fallabs.com/hyperestraier/intro-en.html"&gt;Hyper Estraier&lt;/a&gt;) - but I’m hosting this blog as simply and inexpensively as possible on Heroku and I don’t want to shell out for a SaaS search solution or run an Elasticsearch instance somewhere myself. I thought I’d have to go back to using &lt;a href="https://developers.google.com/custom-search/"&gt;Google Custom Search&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then I read &lt;a href="http://rachbelaid.com/postgres-full-text-search-is-good-enough/"&gt;Postgres full-text search is Good Enough!&lt;/a&gt; by Rachid Belaid - closely followed by &lt;a href="http://blog.lotech.org/postgres-full-text-search-with-django.html"&gt;Postgres Full-Text Search With Django&lt;/a&gt; by Nathan Shafer - and I decided to have a play with the new PostgreSQL search functionality that was &lt;a href="https://docs.djangoproject.com/en/1.11/releases/1.10/#full-text-search-for-postgresql"&gt;introduced in Django 1.10&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;… and wow! Full-text search is yet another example of a feature that’s been in PostgreSQL for &lt;a href="https://www.postgresql.org/docs/8.3/static/release-8-3.html"&gt;nearly a decade now&lt;/a&gt;, incrementally improving with every release to the point where it’s now really, &lt;em&gt;really&lt;/em&gt; good.&lt;/p&gt;
&lt;p&gt;At its most basic level a search system needs to handle four things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It needs to take user input and find matching documents.&lt;/li&gt;
&lt;li&gt;It needs to understand and ignore stopwords (common words like “the” and “and”) and apply stemming - knowing that “ridicule” and “ridiculous” should be treated as the same root, for example. Both of these features need to be language-aware.&lt;/li&gt;
&lt;li&gt;It needs to be able to apply relevance ranking, calculating which documents are the best match for a search query.&lt;/li&gt;
&lt;li&gt;It needs to be &lt;em&gt;fast&lt;/em&gt; - working against some kind of index rather than scanning every available document in full.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Modern PostgreSQL &lt;a href="https://www.postgresql.org/docs/9.5/static/textsearch.html"&gt;ticks all of those boxes&lt;/a&gt;. Let’s put it to work.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Simple_search_without_an_index_29"&gt;&lt;/a&gt;Simple search without an index&lt;/h3&gt;
&lt;p&gt;Here’s how to execute a full-text search query against a simple text column:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; blog.models &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Entry
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.contrib.postgres.search &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; SearchVector

results = Entry.objects.annotate(
    searchable=SearchVector(&lt;span class="hljs-string"&gt;'body'&lt;/span&gt;)
).filter(searchable=&lt;span class="hljs-string"&gt;'django'&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The generated SQL looks something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-sql"&gt;&lt;span class="hljs-operator"&gt;&lt;span class="hljs-keyword"&gt;SELECT&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"id"&lt;/span&gt;, ...,
to_tsvector(&lt;span class="hljs-keyword"&gt;COALESCE&lt;/span&gt;(&lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"body"&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;)) &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"searchable"&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;FROM&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;WHERE&lt;/span&gt; to_tsvector(&lt;span class="hljs-keyword"&gt;COALESCE&lt;/span&gt;(&lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"body"&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;))
    @@ (plainto_tsquery(&lt;span class="hljs-string"&gt;'django'&lt;/span&gt;)) = &lt;span class="hljs-literal"&gt;true&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;ORDER&lt;/span&gt; &lt;span class="hljs-keyword"&gt;BY&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"created"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;DESC&lt;/span&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;SearchVector&lt;/code&gt; class constructs a stemmed, stopword-removed representation of the &lt;code&gt;body&lt;/code&gt; column ready to be searched. The resulting queryset contains entries that are a match for “django”.&lt;/p&gt;
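&lt;p&gt;To get a feel for what a stemmed, stopword-removed representation contains, here is a toy sketch in plain Python. It is not PostgreSQL’s algorithm: the stopword list, the suffix rules in &lt;code&gt;toy_stem&lt;/code&gt; and all of the function names are made up for illustration, and real stemmers are language-aware and far more careful.&lt;/p&gt;

```python
import re

# Tiny hypothetical stopword list, just enough for the example sentence.
STOPWORDS = {"the", "and", "a", "to", "what", "this"}


def toy_stem(word):
    # Extremely naive suffix stripping; only an illustration of the idea.
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def toy_search_vector(text):
    """Map each stemmed, non-stopword term to the word positions where it occurs."""
    vector = {}
    words = re.findall(r"[a-z]+", text.lower())
    for position, word in enumerate(words, start=1):
        if word in STOPWORDS:
            continue  # stopwords still consume a position but are not stored
        vector.setdefault(toy_stem(word), []).append(position)
    return vector


def matches(vector, query):
    # A document matches when every stemmed, non-stopword query term is present.
    words = re.findall(r"[a-z]+", query.lower())
    terms = [toy_stem(w) for w in words if w not in STOPWORDS]
    return all(term in vector for term in terms)


vector = toy_search_vector("Hey look what happens to this tag")
```

&lt;p&gt;With this toy version a query for “happening” matches the sentence because both sides reduce to the same stem, while a query for “django” does not.&lt;/p&gt;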
&lt;p&gt;My blog entries are stored as HTML, but I don’t want search to include those HTML tags. One (extremely un-performant) solution is to use Django’s &lt;code&gt;Func&lt;/code&gt; helper to apply a regular expression inside PostgreSQL to strip tags before they are considered for search:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db.models &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Value, F, Func

results = Entry.objects.annotate(
    searchable=SearchVector(
        Func(
            F(&lt;span class="hljs-string"&gt;'body'&lt;/span&gt;), Value(&lt;span class="hljs-string"&gt;'&amp;lt;.*?&amp;gt;'&lt;/span&gt;), Value(&lt;span class="hljs-string"&gt;''&lt;/span&gt;), Value(&lt;span class="hljs-string"&gt;'g'&lt;/span&gt;),
            function=&lt;span class="hljs-string"&gt;'regexp_replace'&lt;/span&gt;
        )
    )
).filter(searchable=&lt;span class="hljs-string"&gt;'http'&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 6th October 8:23pm UTC&lt;/strong&gt; - it turns out this step is entirely unnecessary. &lt;a href="https://github.com/simonw/simonwillisonblog/issues/1#issuecomment-334770577"&gt;Paolo Melchiorre points out&lt;/a&gt; that the PostgreSQL to_tsvector() function already handles tag removal. Sure enough, executing &lt;samp&gt;SELECT to_tsvector('&amp;lt;div&amp;gt;Hey look what happens to &amp;lt;blockquote&amp;gt;this tag&amp;lt;/blockquote&amp;gt;&amp;lt;/div&amp;gt;')&lt;/samp&gt; &lt;a href="http://sqlfiddle.com/#!17/9eecb/4552"&gt;using SQL Fiddle&lt;/a&gt; returns &lt;samp&gt;'happen':4 'hey':1 'look':2 'tag':7&lt;/samp&gt;, with the tags already stripped.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This works, but performance isn’t great. PostgreSQL ends up having to scan every row and construct a list of search vectors for each one every time you execute a query.&lt;/p&gt;
&lt;p&gt;If you want it to go fast, you need to add a special search vector column to your table and then create the appropriate index on it. As of Django 1.11 this is trivial:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; models
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.contrib.postgres.search &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; SearchVectorField
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.contrib.postgres.indexes &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; GinIndex

&lt;span class="hljs-class"&gt;&lt;span class="hljs-keyword"&gt;class&lt;/span&gt; &lt;span class="hljs-title"&gt;Entry&lt;/span&gt;&lt;span class="hljs-params"&gt;(models.Model)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class="hljs-comment"&gt;# ...&lt;/span&gt;
    search_document = SearchVectorField(null=&lt;span class="hljs-keyword"&gt;True&lt;/span&gt;)

    &lt;span class="hljs-class"&gt;&lt;span class="hljs-keyword"&gt;class&lt;/span&gt; &lt;span class="hljs-title"&gt;Meta&lt;/span&gt;:&lt;/span&gt;
        indexes = [
            GinIndex(fields=[&lt;span class="hljs-string"&gt;'search_document'&lt;/span&gt;])
        ]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Django’s migration system will automatically add both the field and the special &lt;a href="https://www.postgresql.org/docs/9.5/static/gin-intro.html"&gt;GIN index&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What’s trickier is populating that &lt;code&gt;search_document&lt;/code&gt; field. Django does not yet support an easy method to populate it directly in your initial INSERT call, instead recommending that you populate it with a SQL UPDATE statement after the fact. Here is a short snippet that will populate the field for everything in that table (and strip tags at the same time):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;strip_tags_func&lt;/span&gt;&lt;span class="hljs-params"&gt;(field)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; Func(
        F(field), Value(&lt;span class="hljs-string"&gt;'&amp;lt;.*?&amp;gt;'&lt;/span&gt;), Value(&lt;span class="hljs-string"&gt;''&lt;/span&gt;), Value(&lt;span class="hljs-string"&gt;'g'&lt;/span&gt;),
        function=&lt;span class="hljs-string"&gt;'regexp_replace'&lt;/span&gt;
    )
 
Entry.objects.update(
    search_document=(
        SearchVector(&lt;span class="hljs-string"&gt;'title'&lt;/span&gt;, weight=&lt;span class="hljs-string"&gt;'A'&lt;/span&gt;) +
        SearchVector(strip_tags_func(&lt;span class="hljs-string"&gt;'body'&lt;/span&gt;), weight=&lt;span class="hljs-string"&gt;'C'&lt;/span&gt;)
    )
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I’m using a neat feature of the &lt;code&gt;SearchVector&lt;/code&gt; class here: it can be concatenated together using the &lt;code&gt;+&lt;/code&gt; operator, and each component can be assigned a weight of &lt;code&gt;A&lt;/code&gt;, &lt;code&gt;B&lt;/code&gt;, &lt;code&gt;C&lt;/code&gt; or &lt;code&gt;D&lt;/code&gt;. These weights affect ranking calculations later on.&lt;/p&gt;
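&lt;p&gt;To build some intuition for how those weights feed into ranking, here is a toy sketch in plain Python. It is not PostgreSQL’s actual &lt;code&gt;ts_rank&lt;/code&gt; algorithm: it simply adds up a per-weight multiplier for every matching word, using the default multipliers documented by PostgreSQL (0.1, 0.2, 0.4 and 1.0 for D, C, B and A). The &lt;code&gt;toy_rank&lt;/code&gt; function and its input format are made up for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Default ts_rank weight multipliers, as documented by PostgreSQL
DEFAULT_WEIGHTS = {'A': 1.0, 'B': 0.4, 'C': 0.2, 'D': 0.1}


def toy_rank(components, term):
    """
    components maps a weight label ('A' to 'D') to a text fragment.
    Returns the sum of the weight multiplier for every exact
    (lowercased) occurrence of term. Hypothetical helper: a toy
    stand-in for ts_rank, with no stemming or normalisation.
    """
    term = term.lower()
    score = 0.0
    for weight, text in components.items():
        hits = text.lower().split().count(term)
        score += DEFAULT_WEIGHTS[weight] * hits
    return score


title_match = {'A': 'Django search', 'C': 'Notes on PostgreSQL'}
body_match = {'A': 'PostgreSQL notes', 'C': 'Search with Django'}
print(toy_rank(title_match, 'django'))  # 1.0
print(toy_rank(body_match, 'django'))   # 0.2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A match in the A-weighted title scores higher than the same word in the C-weighted body, which is the effect the weights are meant to have on the real ranking.&lt;/p&gt;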
&lt;h3&gt;&lt;a id="Updates_using_signals"&gt;&lt;/a&gt;Updates using signals&lt;/h3&gt;
&lt;p&gt;We could just set this up to run periodically (as I did in my &lt;a href="https://github.com/simonw/simonwillisonblog/commit/7e3a02178e3ca71c464ae68a3b68d70e5fa66692#diff-1cbcc518bc02f9495bba963e698143e0"&gt;initial implementation&lt;/a&gt;), but we can get better real-time results by ensuring this field gets updated automatically when the rest of the model is modified. Some people solve this with PostgreSQL triggers, but I’m still more comfortable handling this kind of thing in Python code - so I opted to use Django’s &lt;a href="https://docs.djangoproject.com/en/1.11/topics/signals/"&gt;signals mechanism&lt;/a&gt; instead.&lt;/p&gt;
&lt;p&gt;Since I need to run search queries across three different types of blog content - Entries, Blogmarks and Quotations - I added a method to each model that returns the text fragments corresponding to each of the weight values. Here’s that method for my Quotation model:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-class"&gt;&lt;span class="hljs-keyword"&gt;class&lt;/span&gt; &lt;span class="hljs-title"&gt;Quotation&lt;/span&gt;&lt;span class="hljs-params"&gt;(models.Model)&lt;/span&gt;:&lt;/span&gt;
    quotation = models.TextField()
    source = models.CharField(max_length=&lt;span class="hljs-number"&gt;255&lt;/span&gt;)
    tags = models.ManyToManyField(Tag, blank=&lt;span class="hljs-keyword"&gt;True&lt;/span&gt;)

    &lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;index_components&lt;/span&gt;&lt;span class="hljs-params"&gt;(self)&lt;/span&gt;:&lt;/span&gt;
        &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; {
            &lt;span class="hljs-string"&gt;'A'&lt;/span&gt;: self.quotation,
            &lt;span class="hljs-string"&gt;'B'&lt;/span&gt;: &lt;span class="hljs-string"&gt;' '&lt;/span&gt;.join(self.tags.values_list(&lt;span class="hljs-string"&gt;'tag'&lt;/span&gt;, flat=&lt;span class="hljs-keyword"&gt;True&lt;/span&gt;)),
            &lt;span class="hljs-string"&gt;'C'&lt;/span&gt;: self.source,
        }
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you can see, I’m including the tags that have been assigned to the quotation in the searchable document.&lt;/p&gt;
&lt;p&gt;Here are my signals - loaded once via an import statement in my blog application’s &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/apps.py"&gt;&lt;code&gt;AppConfig.ready()&lt;/code&gt; method&lt;/a&gt;:&lt;/p&gt;
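&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;import&lt;/span&gt; operator
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; functools &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; reduce

&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.contrib.postgres.search &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; SearchVector
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; models, transaction
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db.models &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; Value
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db.models.signals &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; m2m_changed, post_save
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.dispatch &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; receiver

&lt;span class="hljs-comment"&gt;# BaseModel and Tag are the blog's own models&lt;/span&gt;

&lt;span class="hljs-decorator"&gt;@receiver(post_save)&lt;/span&gt;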
&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;on_save&lt;/span&gt;&lt;span class="hljs-params"&gt;(sender, **kwargs)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; &lt;span class="hljs-keyword"&gt;not&lt;/span&gt; issubclass(sender, BaseModel):
        &lt;span class="hljs-keyword"&gt;return&lt;/span&gt;
    transaction.on_commit(make_updater(kwargs[&lt;span class="hljs-string"&gt;'instance'&lt;/span&gt;]))

&lt;span class="hljs-decorator"&gt;@receiver(m2m_changed)&lt;/span&gt;
&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;on_m2m_changed&lt;/span&gt;&lt;span class="hljs-params"&gt;(sender, **kwargs)&lt;/span&gt;:&lt;/span&gt;
    instance = kwargs[&lt;span class="hljs-string"&gt;'instance'&lt;/span&gt;]
    model = kwargs[&lt;span class="hljs-string"&gt;'model'&lt;/span&gt;]
    &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; model &lt;span class="hljs-keyword"&gt;is&lt;/span&gt; Tag:
        transaction.on_commit(make_updater(instance))
    &lt;span class="hljs-keyword"&gt;elif&lt;/span&gt; isinstance(instance, Tag):
        &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; obj &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; model.objects.filter(pk__in=kwargs[&lt;span class="hljs-string"&gt;'pk_set'&lt;/span&gt;]):
            transaction.on_commit(make_updater(obj))

&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;make_updater&lt;/span&gt;&lt;span class="hljs-params"&gt;(instance)&lt;/span&gt;:&lt;/span&gt;
    components = instance.index_components()
    pk = instance.pk

    &lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;on_commit&lt;/span&gt;&lt;span class="hljs-params"&gt;()&lt;/span&gt;:&lt;/span&gt;
        search_vectors = []
        &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; weight, text &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; components.items():
            search_vectors.append(
                SearchVector(Value(text, output_field=models.TextField()), weight=weight)
            )
        instance.__class__.objects.filter(pk=pk).update(
            search_document=reduce(operator.add, search_vectors)
        )
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; on_commit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The full code can be &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/signals.py"&gt;found here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;on_save&lt;/code&gt; handler is pretty straightforward - it checks if the model that was just saved has my &lt;code&gt;BaseModel&lt;/code&gt; as a base class, then it calls &lt;code&gt;make_updater&lt;/code&gt; to get a function to be executed by the &lt;code&gt;transaction.on_commit&lt;/code&gt; hook.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;on_m2m_changed&lt;/code&gt; handler is &lt;a href="https://docs.djangoproject.com/en/1.11/ref/signals/#m2m-changed"&gt;significantly more complicated&lt;/a&gt;. There are a number of scenarios in which this will be called - I’m reasonably confident that the idiom I use here will capture all of the modifications that should trigger a re-indexing operation.&lt;/p&gt;
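&lt;p&gt;The closure returned by &lt;code&gt;make_updater&lt;/code&gt; is worth a closer look: the text fragments are read when &lt;code&gt;make_updater&lt;/code&gt; is called, but nothing is written until the callback runs. Here is a minimal sketch of that pattern with no Django or database involved. &lt;code&gt;fake_on_commit&lt;/code&gt;, &lt;code&gt;pending&lt;/code&gt;, &lt;code&gt;saved&lt;/code&gt; and &lt;code&gt;FakeQuotation&lt;/code&gt; are all hypothetical stand-ins, and lists of tuples take the place of &lt;code&gt;SearchVector&lt;/code&gt; objects.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;from functools import reduce
import operator

pending = []  # stand-in for the callbacks Django holds until commit
saved = {}    # maps pk to the combined "search document"


def fake_on_commit(func):
    # Hypothetical substitute for transaction.on_commit
    pending.append(func)


class FakeQuotation:
    # Hypothetical model stand-in: no database involved
    def __init__(self, pk, quotation, source):
        self.pk = pk
        self.quotation = quotation
        self.source = source

    def index_components(self):
        return {'A': self.quotation, 'C': self.source}


def make_updater(instance):
    # The text is captured here, when the updater is created
    components = instance.index_components()
    pk = instance.pk

    def on_commit():
        # Tag each fragment with its weight, then fold them with +
        parts = [[(weight, text)] for weight, text in components.items()]
        saved[pk] = reduce(operator.add, parts)
    return on_commit


q = FakeQuotation(1, 'Simple is better than complex', 'Tim Peters')
fake_on_commit(make_updater(q))
print(saved)  # still empty: the callback has not run yet
for callback in pending:  # simulate the transaction committing
    callback()
print(saved[1])
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;reduce(operator.add, ...)&lt;/code&gt; call is just a compact way of writing &lt;code&gt;a + b + c&lt;/code&gt; for a list of unknown length, which is how the real code combines its weighted &lt;code&gt;SearchVector&lt;/code&gt; objects.&lt;/p&gt;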
&lt;p&gt;Running a search now looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;results = Entry.objects.filter(
    search_document=SearchQuery(&lt;span class="hljs-string"&gt;'django'&lt;/span&gt;)
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We need one more thing though: we need to sort our search results by relevance. PostgreSQL has pretty good relevance ranking built in, and sorting by the relevance score can be done by applying a Django ORM annotation:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;query = SearchQuery(&lt;span class="hljs-string"&gt;'ibm'&lt;/span&gt;)

results = Entry.objects.filter(
    search_document=query
).annotate(
    rank=SearchRank(F(&lt;span class="hljs-string"&gt;'search_document'&lt;/span&gt;), query)
).order_by(&lt;span class="hljs-string"&gt;'-rank'&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We now have basic full text search implemented against a single Django model, making use of a GIN index. This is lightning fast.&lt;/p&gt;
&lt;h3&gt;&lt;a id="Searching_multiple_tables_using_querysetunion_192"&gt;&lt;/a&gt;Searching multiple tables using queryset.union()&lt;/h3&gt;
&lt;p&gt;My site has three types of content, represented in three different models and hence three different underlying database tables.&lt;/p&gt;
&lt;p&gt;I’m using &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/models.py#L78-L107"&gt;an abstract base model&lt;/a&gt; to define common fields shared by all three: the created date, the slug (used to construct permalink urls) and the &lt;code&gt;search_document&lt;/code&gt; field populated above.&lt;/p&gt;
&lt;p&gt;As of Django 1.11 it’s possible to combine queries across different tables &lt;a href="https://docs.djangoproject.com/en/1.11/releases/1.11/#models"&gt;using the SQL union operator&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here’s what that looks like for running a search across three tables, all with the same &lt;code&gt;search_document&lt;/code&gt; search vector field. I need to use &lt;code&gt;.values()&lt;/code&gt; to restrict the querysets I am unioning to the same subset of fields:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;query = SearchQuery(&lt;span class="hljs-string"&gt;'django'&lt;/span&gt;)
rank_annotation = SearchRank(F(&lt;span class="hljs-string"&gt;'search_document'&lt;/span&gt;), query)
qs = Blogmark.objects.annotate(
    rank=rank_annotation,
).filter(
    search_document=query
).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;).union(
    Entry.objects.annotate(
        rank=rank_annotation,
    ).filter(
        search_document=query
    ).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;),
    Quotation.objects.annotate(
        rank=rank_annotation,
    ).filter(
        search_document=query
    ).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;),
).order_by(&lt;span class="hljs-string"&gt;'-rank'&lt;/span&gt;)[:&lt;span class="hljs-number"&gt;5&lt;/span&gt;]

&lt;span class="hljs-comment"&gt;# Output&lt;/span&gt;
&amp;lt;QuerySet [
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;186&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.875179&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;: datetime.datetime(&lt;span class="hljs-number"&gt;2008&lt;/span&gt;, &lt;span class="hljs-number"&gt;4&lt;/span&gt;, &lt;span class="hljs-number"&gt;8&lt;/span&gt;, &lt;span class="hljs-number"&gt;13&lt;/span&gt;, &lt;span class="hljs-number"&gt;48&lt;/span&gt;, &lt;span class="hljs-number"&gt;18&lt;/span&gt;, tzinfo=&amp;lt;UTC&amp;gt;)},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;134&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.842655&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;: datetime.datetime(&lt;span class="hljs-number"&gt;2007&lt;/span&gt;, &lt;span class="hljs-number"&gt;10&lt;/span&gt;, &lt;span class="hljs-number"&gt;20&lt;/span&gt;, &lt;span class="hljs-number"&gt;13&lt;/span&gt;, &lt;span class="hljs-number"&gt;46&lt;/span&gt;, &lt;span class="hljs-number"&gt;56&lt;/span&gt;, tzinfo=&amp;lt;UTC&amp;gt;)},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;1591&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.804502&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;: datetime.datetime(&lt;span class="hljs-number"&gt;2009&lt;/span&gt;, &lt;span class="hljs-number"&gt;9&lt;/span&gt;, &lt;span class="hljs-number"&gt;28&lt;/span&gt;, &lt;span class="hljs-number"&gt;23&lt;/span&gt;, &lt;span class="hljs-number"&gt;32&lt;/span&gt;, &lt;span class="hljs-number"&gt;4&lt;/span&gt;, tzinfo=&amp;lt;UTC&amp;gt;)},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;5093&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.788616&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;: datetime.datetime(&lt;span class="hljs-number"&gt;2010&lt;/span&gt;, &lt;span class="hljs-number"&gt;2&lt;/span&gt;, &lt;span class="hljs-number"&gt;26&lt;/span&gt;, &lt;span class="hljs-number"&gt;19&lt;/span&gt;, &lt;span class="hljs-number"&gt;22&lt;/span&gt;, &lt;span class="hljs-number"&gt;47&lt;/span&gt;, tzinfo=&amp;lt;UTC&amp;gt;)},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;2598&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.786928&lt;/span&gt;, &lt;span class="hljs-string"&gt;'created'&lt;/span&gt;: datetime.datetime(&lt;span class="hljs-number"&gt;2007&lt;/span&gt;, &lt;span class="hljs-number"&gt;1&lt;/span&gt;, &lt;span class="hljs-number"&gt;26&lt;/span&gt;, &lt;span class="hljs-number"&gt;12&lt;/span&gt;, &lt;span class="hljs-number"&gt;38&lt;/span&gt;, &lt;span class="hljs-number"&gt;46&lt;/span&gt;, tzinfo=&amp;lt;UTC&amp;gt;)}
]&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is not enough information though - I have the primary keys, but I don’t know which type of model they belong to. In order to retrieve the actual resulting objects from the database I need to know which type of content is represented by each of those results.&lt;/p&gt;
&lt;p&gt;I can achieve that using another annotation:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;qs = Blogmark.objects.annotate(
    rank=rank_annotation,
    type=models.Value(&lt;span class="hljs-string"&gt;'blogmark'&lt;/span&gt;, output_field=models.CharField())
).filter(
    search_document=query
).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;).union(
    Entry.objects.annotate(
        rank=rank_annotation,
        type=models.Value(&lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;, output_field=models.CharField())
    ).filter(
        search_document=query
    ).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;),
    Quotation.objects.annotate(
        rank=rank_annotation,
        type=models.Value(&lt;span class="hljs-string"&gt;'quotation'&lt;/span&gt;, output_field=models.CharField())
    ).filter(
        search_document=query
    ).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;),
).order_by(&lt;span class="hljs-string"&gt;'-rank'&lt;/span&gt;)[:&lt;span class="hljs-number"&gt;5&lt;/span&gt;]

&lt;span class="hljs-comment"&gt;# Output:&lt;/span&gt;
&amp;lt;QuerySet [
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;186&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;: &lt;span class="hljs-string"&gt;u'quotation'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.875179&lt;/span&gt;},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;134&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;: &lt;span class="hljs-string"&gt;u'quotation'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.842655&lt;/span&gt;},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;1591&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;: &lt;span class="hljs-string"&gt;u'entry'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.804502&lt;/span&gt;},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;5093&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;: &lt;span class="hljs-string"&gt;u'blogmark'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.788616&lt;/span&gt;},
    {&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;: &lt;span class="hljs-number"&gt;2598&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;: &lt;span class="hljs-string"&gt;u'blogmark'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;: &lt;span class="hljs-number"&gt;0.786928&lt;/span&gt;}
]&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now I just need to write a function which can take a list of types and primary keys and return the full objects from the database:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-function"&gt;&lt;span class="hljs-keyword"&gt;def&lt;/span&gt; &lt;span class="hljs-title"&gt;load_mixed_objects&lt;/span&gt;&lt;span class="hljs-params"&gt;(dicts)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class="hljs-string"&gt;"""
    Takes a list of dictionaries, each of which must at least have a 'type'
    and a 'pk' key. Returns a list of ORM objects of those various types.
    Each returned ORM object has a .original_dict attribute populated.
    """&lt;/span&gt;
    to_fetch = {}
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; d &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; dicts:
        to_fetch.setdefault(d[&lt;span class="hljs-string"&gt;'type'&lt;/span&gt;], set()).add(d[&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;])
    fetched = {}
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; key, model &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; (
        (&lt;span class="hljs-string"&gt;'blogmark'&lt;/span&gt;, Blogmark),
        (&lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;, Entry),
        (&lt;span class="hljs-string"&gt;'quotation'&lt;/span&gt;, Quotation),
    ):
        ids = to_fetch.get(key) &lt;span class="hljs-keyword"&gt;or&lt;/span&gt; []
        objects = model.objects.prefetch_related(&lt;span class="hljs-string"&gt;'tags'&lt;/span&gt;).filter(pk__in=ids)
        &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; obj &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; objects:
            fetched[(key, obj.pk)] = obj
    &lt;span class="hljs-comment"&gt;# Build list in same order as dicts argument&lt;/span&gt;
    to_return = []
    &lt;span class="hljs-keyword"&gt;for&lt;/span&gt; d &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; dicts:
        item = fetched.get((d[&lt;span class="hljs-string"&gt;'type'&lt;/span&gt;], d[&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;])) &lt;span class="hljs-keyword"&gt;or&lt;/span&gt; &lt;span class="hljs-keyword"&gt;None&lt;/span&gt;
        &lt;span class="hljs-keyword"&gt;if&lt;/span&gt; item:
            item.original_dict = d
        to_return.append(item)
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; to_return
&lt;/code&gt;&lt;/pre&gt;
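&lt;p&gt;The same three steps (group the keys by type, fetch each type in one go, then rebuild the original order) can be tried out without a database. This is only a sketch: &lt;code&gt;TABLES&lt;/code&gt; and &lt;code&gt;load_mixed&lt;/code&gt; are hypothetical, with plain dictionaries and strings standing in for the models and their rows.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Hypothetical in-memory "tables": type name to {pk: object}
TABLES = {
    'blogmark': {5093: 'Blogmark 5093', 2598: 'Blogmark 2598'},
    'entry': {1591: 'Entry 1591'},
    'quotation': {186: 'Quotation 186', 134: 'Quotation 134'},
}


def load_mixed(dicts):
    # Step 1: group the primary keys by type
    to_fetch = {}
    for d in dicts:
        to_fetch.setdefault(d['type'], set()).add(d['pk'])
    # Step 2: one lookup per type (one query per model in the real code)
    fetched = {}
    for type_name, pks in to_fetch.items():
        table = TABLES.get(type_name, {})
        for pk in pks:
            if pk in table:
                fetched[(type_name, pk)] = table[pk]
    # Step 3: rebuild the list in the original, rank-sorted order
    return [fetched.get((d['type'], d['pk'])) for d in dicts]


results = [
    {'pk': 186, 'type': 'quotation', 'rank': 0.875179},
    {'pk': 1591, 'type': 'entry', 'rank': 0.804502},
    {'pk': 5093, 'type': 'blogmark', 'rank': 0.788616},
    {'pk': 999, 'type': 'entry', 'rank': 0.5},  # no matching row
]
print(load_mixed(results))
# ['Quotation 186', 'Entry 1591', 'Blogmark 5093', None]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As in &lt;code&gt;load_mixed_objects&lt;/code&gt;, a key with no matching row comes back as &lt;code&gt;None&lt;/code&gt;, so calling code may want to skip those entries.&lt;/p&gt;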
&lt;p&gt;One last challenge: when I add filtering by type, I’m going to want to selectively union together only a subset of these querysets. I need a queryset to start unions against, but I don’t yet know which queryset I will be using. I can abuse Django’s &lt;code&gt;queryset.none()&lt;/code&gt; method to create an empty &lt;code&gt;ValuesQuerySet&lt;/code&gt; in the correct shape like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;qs = Entry.objects.annotate(
    type=models.Value(&lt;span class="hljs-string"&gt;'empty'&lt;/span&gt;, output_field=models.CharField()),
    rank=rank_annotation
).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;).none()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now I can progressively build up my union in a loop like this:&lt;/p&gt;
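&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; type_name, klass &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; (
    (&lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;, Entry),
    (&lt;span class="hljs-string"&gt;'blogmark'&lt;/span&gt;, Blogmark),
    (&lt;span class="hljs-string"&gt;'quotation'&lt;/span&gt;, Quotation),
):
    &lt;span class="hljs-comment"&gt;# Label each row with its own type, not a fixed string&lt;/span&gt;
    qs = qs.union(klass.objects.annotate(
        rank=rank_annotation,
        type=models.Value(type_name, output_field=models.CharField())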
    ).filter(
        search_document=query
    ).values(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'type'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'rank'&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The Django ORM is smart enough to compile away the empty queryset when it constructs the SQL, which ends up looking something like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-sql"&gt;(((&lt;span class="hljs-operator"&gt;&lt;span class="hljs-keyword"&gt;SELECT&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"id"&lt;/span&gt;,
            &lt;span class="hljs-string"&gt;"entry"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"type"&lt;/span&gt;,
            ts_rank(&lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt;, plainto_tsquery(%s)) &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"rank"&lt;/span&gt;
     &lt;span class="hljs-keyword"&gt;FROM&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;
     &lt;span class="hljs-keyword"&gt;WHERE&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt; @@ (plainto_tsquery(%s)) = &lt;span class="hljs-literal"&gt;TRUE&lt;/span&gt;
     &lt;span class="hljs-keyword"&gt;ORDER&lt;/span&gt; &lt;span class="hljs-keyword"&gt;BY&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_entry"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"created"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;DESC&lt;/span&gt;))
 &lt;span class="hljs-keyword"&gt;UNION&lt;/span&gt;
   (&lt;span class="hljs-keyword"&gt;SELECT&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_blogmark"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"id"&lt;/span&gt;,
           &lt;span class="hljs-string"&gt;"blogmark"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"type"&lt;/span&gt;,
           ts_rank(&lt;span class="hljs-string"&gt;"blog_blogmark"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt;, plainto_tsquery(%s)) &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"rank"&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;FROM&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_blogmark"&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;WHERE&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_blogmark"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt; @@ (plainto_tsquery(%s)) = &lt;span class="hljs-literal"&gt;TRUE&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;ORDER&lt;/span&gt; &lt;span class="hljs-keyword"&gt;BY&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_blogmark"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"created"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;DESC&lt;/span&gt;))
&lt;span class="hljs-keyword"&gt;UNION&lt;/span&gt;
  (&lt;span class="hljs-keyword"&gt;SELECT&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_quotation"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"id"&lt;/span&gt;,
          &lt;span class="hljs-string"&gt;"quotation"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"type"&lt;/span&gt;,
          ts_rank(&lt;span class="hljs-string"&gt;"blog_quotation"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt;, plainto_tsquery(%s)) &lt;span class="hljs-keyword"&gt;AS&lt;/span&gt; &lt;span class="hljs-string"&gt;"rank"&lt;/span&gt;
   &lt;span class="hljs-keyword"&gt;FROM&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_quotation"&lt;/span&gt;
   &lt;span class="hljs-keyword"&gt;WHERE&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_quotation"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"search_document"&lt;/span&gt; @@ (plainto_tsquery(%s)) = &lt;span class="hljs-literal"&gt;TRUE&lt;/span&gt;
   &lt;span class="hljs-keyword"&gt;ORDER&lt;/span&gt; &lt;span class="hljs-keyword"&gt;BY&lt;/span&gt; &lt;span class="hljs-string"&gt;"blog_quotation"&lt;/span&gt;.&lt;span class="hljs-string"&gt;"created"&lt;/span&gt; &lt;span class="hljs-keyword"&gt;DESC&lt;/span&gt;)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;a id="Applying_filters_345"&gt;&lt;/a&gt;Applying filters&lt;/h3&gt;
&lt;p&gt;So far, our search engine can only handle user-entered query strings. If I am going to build a faceted search interface I need to be able to handle filtering as well. I want the ability to filter by year, tag and type.&lt;/p&gt;
&lt;p&gt;The key difference between filtering and querying (borrowing &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html"&gt;these definitions from Elasticsearch&lt;/a&gt;) is that querying is loose - it involves stemming and stopwords - while filtering is exact. Additionally, querying affects the calculated relevance score while filtering does not - a document either matches the filter or it doesn’t.&lt;/p&gt;
&lt;p&gt;Since PostgreSQL is a relational database, filtering can be handled by simply constructing extra SQL where clauses using the Django ORM.&lt;/p&gt;
&lt;p&gt;Each of the filters I need requires a slightly different approach. Filtering by type is easy - I just selectively include or exclude that model from my union queryset.&lt;/p&gt;
&lt;p&gt;Year and month work like this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;selected_year = request.GET.get(&lt;span class="hljs-string"&gt;'year'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;)
selected_month = request.GET.get(&lt;span class="hljs-string"&gt;'month'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;)
&lt;span class="hljs-keyword"&gt;if&lt;/span&gt; selected_year:
    qs = qs.filter(created__year=int(selected_year))
&lt;span class="hljs-keyword"&gt;if&lt;/span&gt; selected_month:
    qs = qs.filter(created__month=int(selected_month))
&lt;/code&gt;&lt;/pre&gt;
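&lt;p&gt;One thing to watch with the snippet above: &lt;code&gt;int()&lt;/code&gt; raises a &lt;code&gt;ValueError&lt;/code&gt; for a value such as &lt;code&gt;?year=abc&lt;/code&gt;. A small guard along these lines is one way to handle that. &lt;code&gt;parse_int_param&lt;/code&gt; is a hypothetical helper, not part of the blog’s code, and a plain dictionary stands in for &lt;code&gt;request.GET&lt;/code&gt; here.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;def parse_int_param(params, name):
    """
    Hypothetical helper: return params[name] as an int, or None if
    the value is missing, empty or not a whole number.
    """
    raw = params.get(name, '')
    try:
        return int(raw)
    except ValueError:
        return None


print(parse_int_param({'year': '2017'}, 'year'))  # 2017
print(parse_int_param({'year': 'abc'}, 'year'))   # None
print(parse_int_param({}, 'month'))               # None
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The filter would then only be applied when the helper returns something other than &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;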
&lt;p&gt;Tags involve a join through a many-to-many relationship against the Tags table. We want to be able to apply more than one tag, for example this search for &lt;a href="https://simonwillison.net/search/?tag=python&amp;amp;tag=javascript"&gt;all items tagged both python and javascript&lt;/a&gt;. Django’s ORM makes this easy:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;selected_tags = request.GET.getlist(&lt;span class="hljs-string"&gt;'tag'&lt;/span&gt;)
&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; tag &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; selected_tags:
    qs = qs.filter(tags__tag=tag)
&lt;/code&gt;&lt;/pre&gt;
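&lt;p&gt;Each pass through that loop narrows the previous queryset, so an item has to carry every selected tag to survive. The same idea in plain Python, using made-up items and sets of tags rather than the ORM:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Hypothetical items: each has a name and a set of tags
items = [
    {'name': 'a', 'tags': {'python', 'javascript'}},
    {'name': 'b', 'tags': {'python'}},
    {'name': 'c', 'tags': {'javascript', 'django', 'python'}},
]

selected_tags = ['python', 'javascript']

matches = items
for tag in selected_tags:
    # Each pass narrows the previous result, like chaining .filter()
    matches = [item for item in matches if tag in item['tags']]

print([item['name'] for item in matches])  # ['a', 'c']
&lt;/code&gt;&lt;/pre&gt;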
&lt;h3&gt;&lt;a id="Adding_facet_counts_374"&gt;&lt;/a&gt;Adding facet counts&lt;/h3&gt;
&lt;p&gt;There is just one more ingredient needed to complete our faceted search: facet counts!&lt;/p&gt;
&lt;p&gt;Again, the way we calculate these is different for each of our filters. For types, we need to call &lt;code&gt;.count()&lt;/code&gt; on a separate queryset for each of the types we are searching:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;queryset = make_queryset(Entry, &lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;)
type_counts[&lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;] = queryset.count()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(the make_queryset function is &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/views.py#L408-L423"&gt;defined here&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;For years we can do this:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; django.db.models.functions &lt;span class="hljs-keyword"&gt;import&lt;/span&gt; TruncYear

&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; row &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; queryset.order_by().annotate(
    year=TruncYear(&lt;span class="hljs-string"&gt;'created'&lt;/span&gt;)
).values(&lt;span class="hljs-string"&gt;'year'&lt;/span&gt;).annotate(n=models.Count(&lt;span class="hljs-string"&gt;'pk'&lt;/span&gt;)):
    year_counts[row[&lt;span class="hljs-string"&gt;'year'&lt;/span&gt;]] = year_counts.get(
        row[&lt;span class="hljs-string"&gt;'year'&lt;/span&gt;], &lt;span class="hljs-number"&gt;0&lt;/span&gt;
    ) + row[&lt;span class="hljs-string"&gt;'n'&lt;/span&gt;]
&lt;/code&gt;&lt;/pre&gt;
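&lt;p&gt;The &lt;code&gt;year_counts.get(..., 0) + row['n']&lt;/code&gt; idiom adds to any count already recorded for that year, which matters if the same loop runs once per content type. Here is a sketch of that accumulation using &lt;code&gt;collections.Counter&lt;/code&gt;. The rows are invented, shaped like the output of the query above, and &lt;code&gt;rows_by_type&lt;/code&gt; is a hypothetical name.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;from collections import Counter
from datetime import datetime

# Hypothetical per-type rows, shaped like the .values('year') output
rows_by_type = {
    'entry': [
        {'year': datetime(2007, 1, 1), 'n': 3},
        {'year': datetime(2008, 1, 1), 'n': 1},
    ],
    'blogmark': [
        {'year': datetime(2007, 1, 1), 'n': 2},
    ],
}

year_counts = Counter()
for rows in rows_by_type.values():
    for row in rows:
        # Counter treats missing keys as 0, so no .get() is needed
        year_counts[row['year'].year] += row['n']

print(sorted(year_counts.items()))  # [(2007, 5), (2008, 1)]
&lt;/code&gt;&lt;/pre&gt;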
&lt;p&gt;Tags are trickiest. Let’s take advantage of the fact that Django’s ORM knows how to construct sub-selects if you pass another queryset to the &lt;code&gt;__in&lt;/code&gt; operator.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;tag_counts = {}
type_name = &lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;
queryset = make_queryset(Entry, &lt;span class="hljs-string"&gt;'entry'&lt;/span&gt;)
&lt;span class="hljs-keyword"&gt;for&lt;/span&gt; tag, count &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; Tag.objects.filter(**{
    &lt;span class="hljs-string"&gt;'%s__in'&lt;/span&gt; % type_name: queryset
}).annotate(
    n=models.Count(&lt;span class="hljs-string"&gt;'tag'&lt;/span&gt;)
).values_list(&lt;span class="hljs-string"&gt;'tag'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'n'&lt;/span&gt;):
    tag_counts[tag] = tag_counts.get(tag, &lt;span class="hljs-number"&gt;0&lt;/span&gt;) + count
&lt;/code&gt;&lt;/pre&gt;
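&lt;p&gt;The &lt;code&gt;tag_counts.get(tag, 0) + count&lt;/code&gt; idiom is there so that counts add up when the same loop runs for more than one content type. Here is a minimal standalone sketch of that accumulation step, assuming made-up &lt;code&gt;(tag, n)&lt;/code&gt; rows in place of real &lt;code&gt;values_list()&lt;/code&gt; results (the &lt;code&gt;rows_by_type&lt;/code&gt; dictionary and its contents are hypothetical, and &lt;code&gt;collections.Counter&lt;/code&gt; stands in for the plain dictionary used above):&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;from collections import Counter

# Hypothetical (tag, n) rows, standing in for what
# .values_list('tag', 'n') might return for two content types.
rows_by_type = {
    'entry': [('django', 3), ('postgresql', 2)],
    'blogmark': [('django', 5), ('python', 1)],
}

# Counter returns 0 for missing keys, so no .get(tag, 0) is needed.
tag_counts = Counter()
for type_name, rows in rows_by_type.items():
    for tag, count in rows:
        tag_counts[tag] += count

print(tag_counts['django'])  # 8
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A &lt;code&gt;Counter&lt;/code&gt; also provides &lt;code&gt;most_common()&lt;/code&gt;, which could be handy for showing only the most popular tags in the sidebar.&lt;/p&gt;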
&lt;h3&gt;&lt;a id="Rendering_it_all_in_a_template_414"&gt;&lt;/a&gt;Rendering it all in a template&lt;/h3&gt;
&lt;p&gt;Having constructed the various facet counts in the view function, the template is really simple:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-html"&gt;{% if type_counts %}
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;Types&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% for t in type_counts %}
            &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;a&lt;/span&gt; &lt;span class="hljs-attribute"&gt;href&lt;/span&gt;=&lt;span class="hljs-value"&gt;"{% add_qsarg "&lt;/span&gt;&lt;span class="hljs-value"&gt;type"&lt;/span&gt; &lt;span class="hljs-attribute"&gt;t.type&lt;/span&gt; %}"&amp;gt;&lt;/span&gt;{{ t.type }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt; {{ t.n }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% endfor %}
    &lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
{% endif %}
{% if year_counts %}
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;Years&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% for t in year_counts %}
            &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;a&lt;/span&gt; &lt;span class="hljs-attribute"&gt;href&lt;/span&gt;=&lt;span class="hljs-value"&gt;"{% add_qsarg "&lt;/span&gt;&lt;span class="hljs-value"&gt;year"&lt;/span&gt; &lt;span class="hljs-attribute"&gt;t.year&lt;/span&gt;|&lt;span class="hljs-attribute"&gt;date:&lt;/span&gt;"&lt;span class="hljs-attribute"&gt;Y&lt;/span&gt;" %}"&amp;gt;&lt;/span&gt;{{ t.year|date:"Y" }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt; {{ t.n }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% endfor %}
    &lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
{% endif %}
{% if tag_counts %}
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;Tags&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;h3&lt;/span&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% for t in tag_counts %}
            &lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;&lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-title"&gt;a&lt;/span&gt; &lt;span class="hljs-attribute"&gt;href&lt;/span&gt;=&lt;span class="hljs-value"&gt;"{% add_qsarg "&lt;/span&gt;&lt;span class="hljs-value"&gt;tag"&lt;/span&gt; &lt;span class="hljs-attribute"&gt;t.tag&lt;/span&gt; %}"&amp;gt;&lt;/span&gt;{{ t.tag }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt; {{ t.n }}&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;li&lt;/span&gt;&amp;gt;&lt;/span&gt;
        {% endfor %}
    &lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-title"&gt;ul&lt;/span&gt;&amp;gt;&lt;/span&gt;
{% endif %}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I am using &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/templatetags/blog_tags.py#L54-L68"&gt;custom template tags&lt;/a&gt; here to add arguments to the current URL. I’ve built systems like this in the past where the URLs are instead generated in the view logic, which I think I prefer. As always, perfect is the enemy of shipped.&lt;/p&gt;
&lt;p&gt;And because the results are just a Django queryset, we can use Django’s pagination helpers for the pagination links.&lt;/p&gt;
&lt;h3&gt;&lt;a id="The_final_implementation_449"&gt;&lt;/a&gt;The final implementation&lt;/h3&gt;
&lt;p&gt;The full current version of the code at time of writing &lt;a href="https://github.com/simonw/simonwillisonblog/blob/3f5ca05248e409a946b53593f7d11b6f9551044f/blog/views.py#L388-L552"&gt;can be seen here&lt;/a&gt;. You can follow my initial implementation of this feature through the following commits: &lt;a href="https://github.com/simonw/simonwillisonblog/commit/7e3a0217"&gt;7e3a0217&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/c7e7b30c"&gt;c7e7b30c&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/7f6b524c"&gt;7f6b524c&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/a16ddb5e"&gt;a16ddb5e&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/7055c7e1"&gt;7055c7e1&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/74c194d9"&gt;74c194d9&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/f3ffc100"&gt;f3ffc100&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/6c24d9fd"&gt;6c24d9fd&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/cb88c2d4"&gt;cb88c2d4&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/2c262c75"&gt;2c262c75&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/776a562a"&gt;776a562a&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/b8484c50"&gt;b8484c50&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/0b361c78"&gt;0b361c78&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/1322ada2"&gt;1322ada2&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/79b1b13d"&gt;79b1b13d&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/3955f41b"&gt;3955f41b&lt;/a&gt; &lt;a href="https://github.com/simonw/simonwillisonblog/commit/3f5ca052"&gt;3f5ca052&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;And that’s how I built faceted search on top of PostgreSQL and Django! I don’t have my blog comments up and running yet, so please post any thoughts or feedback over on &lt;a href="https://github.com/simonw/simonwillisonblog/issues/1"&gt;this GitHub issue&lt;/a&gt; or &lt;a href="https://news.ycombinator.com/item?id=15409733"&gt;over on this thread on Hacker News&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 9th September 2021&lt;/strong&gt;: A few years after implementing this I started to notice performance issues with my blog, which turned out to be caused by search engine crawlers hitting every possible combination of facets, triggering a ton of expensive SQL queries. I excluded &lt;code&gt;/search&lt;/code&gt; from being crawled using &lt;code&gt;robots.txt&lt;/code&gt; which fixed the problem.&lt;/em&gt;&lt;/p&gt;
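&lt;p&gt;A rule along the following lines is enough to keep well-behaved crawlers away from the facet URLs. This is an illustrative sketch rather than a copy of the real file: the &lt;code&gt;ROBOTS_TXT&lt;/code&gt; contents and the &lt;code&gt;example.com&lt;/code&gt; URLs are assumptions, and Python’s standard &lt;code&gt;urllib.robotparser&lt;/code&gt; module is used only to check how the rule is interpreted:&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block everything under /search
# for every crawler that honours robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Faceted search URLs are off limits...
print(parser.can_fetch("*", "https://example.com/search/?tag=django"))  # False
# ...while other pages remain crawlable.
print(parser.can_fetch("*", "https://example.com/2021/"))  # True
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that &lt;code&gt;robots.txt&lt;/code&gt; is advisory: it only helps with crawlers that choose to respect it.&lt;/p&gt;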
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/full-text-search"&gt;full-text-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/facetedsearch"&gt;facetedsearch&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="django"/><category term="full-text-search"/><category term="orm"/><category term="postgresql"/><category term="projects"/><category term="search"/><category term="facetedsearch"/></entry><entry><title>Easier custom Model Manager Chaining</title><link href="https://simonwillison.net/2010/Jul/20/chaining/#atom-tag" rel="alternate"/><published>2010-07-20T18:21:00+00:00</published><updated>2010-07-20T18:21:00+00:00</updated><id>https://simonwillison.net/2010/Jul/20/chaining/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://djangosnippets.org/snippets/2117/"&gt;Easier custom Model Manager Chaining&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A neat solution to the problem of wanting to write a custom QuerySet method (.published() for example) which is also available on that model’s objects manager, without having to write much boilerplate.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/queryset"&gt;queryset&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="orm"/><category term="queryset"/><category term="recovered"/></entry><entry><title>On Django And Migrations</title><link href="https://simonwillison.net/2010/Jun/2/migrations/#atom-tag" rel="alternate"/><published>2010-06-02T16:27:00+00:00</published><updated>2010-06-02T16:27:00+00:00</updated><id>https://simonwillison.net/2010/Jun/2/migrations/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.aeracode.org/2010/6/2/django-and-migrations/"&gt;On Django And Migrations&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
South author Andrew Godwin on the plans for migrations in Django. His excellent South migration library will be split in to two parts—one handling database abstraction, dependency resolution and history tracking and the other providing autodetection and the South user interface. The former will go in to Django proper, encouraging other migration libraries to share the same core abstractions.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/andrew-godwin"&gt;andrew-godwin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/migrations"&gt;migrations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/south"&gt;south&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="andrew-godwin"/><category term="django"/><category term="migrations"/><category term="orm"/><category term="south"/><category term="recovered"/></entry><entry><title>Appending the request URL to SQL statements in Django</title><link href="https://simonwillison.net/2010/Jun/2/framewalking/#atom-tag" rel="alternate"/><published>2010-06-02T09:09:00+00:00</published><updated>2010-06-02T09:09:00+00:00</updated><id>https://simonwillison.net/2010/Jun/2/framewalking/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://chris-lamb.co.uk/2010/06/01/appending-request-url-sql-statements-django/"&gt;Appending the request URL to SQL statements in Django&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A clever frame-walking monkey-patch which pulls the most recent HttpRequest object out of the Python stack and adds the current request.path to each SQL query as an SQL comment, so you can see it in debugging tools such as slow query logs and the PostgreSQL “select * from pg_stat_activity” query.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/chris-lamb"&gt;chris-lamb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/debugging"&gt;debugging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/monkeypatch"&gt;monkeypatch&lt;/a&gt;&lt;/p&gt;



</summary><category term="chris-lamb"/><category term="debugging"/><category term="django"/><category term="orm"/><category term="postgresql"/><category term="python"/><category term="sql"/><category term="recovered"/><category term="monkeypatch"/></entry><entry><title>Cache Machine: Automatic caching for your Django models</title><link href="https://simonwillison.net/2010/Mar/11/cachemachine/#atom-tag" rel="alternate"/><published>2010-03-11T19:35:32+00:00</published><updated>2010-03-11T19:35:32+00:00</updated><id>https://simonwillison.net/2010/Mar/11/cachemachine/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://jbalogh.me/2010/02/09/cache-machine/"&gt;Cache Machine: Automatic caching for your Django models&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cachemachine"&gt;cachemachine&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;&lt;/p&gt;



</summary><category term="cachemachine"/><category term="caching"/><category term="django"/><category term="memcached"/><category term="mozilla"/><category term="orm"/><category term="ormcaching"/><category term="python"/><category term="redis"/></entry><entry><title>Announcing django-cachebot</title><link href="https://simonwillison.net/2010/Mar/6/david/#atom-tag" rel="alternate"/><published>2010-03-06T12:48:39+00:00</published><updated>2010-03-06T12:48:39+00:00</updated><id>https://simonwillison.net/2010/Mar/6/david/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.davidziegler.net/post/429237463/announcing-django-cachebot"&gt;Announcing django-cachebot&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The ORM caching space around Django is heating up. django-cachebot is used in production at mingle.com and takes a more low-level approach to cache invalidation than Johnny Cache, enabling you to specifically mark the querysets you wish to cache and providing some advanced options for cache invalidation. Unfortunately it currently relies on a patch to Django core to enable its own manager.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cachebot"&gt;cachebot&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mingle"&gt;mingle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;&lt;/p&gt;



</summary><category term="cachebot"/><category term="caching"/><category term="django"/><category term="mingle"/><category term="orm"/><category term="ormcaching"/></entry><entry><title>Johnny Cache</title><link href="https://simonwillison.net/2010/Feb/28/johnny/#atom-tag" rel="alternate"/><published>2010-02-28T22:55:15+00:00</published><updated>2010-02-28T22:55:15+00:00</updated><id>https://simonwillison.net/2010/Feb/28/johnny/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://packages.python.org/johnny-cache/"&gt;Johnny Cache&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="databases"/><category term="django"/><category term="memcached"/><category term="orm"/><category term="ormcaching"/><category term="performance"/><category term="python"/></entry><entry><title>django-batch-select</title><link href="https://simonwillison.net/2009/Nov/23/batchselect/#atom-tag" rel="alternate"/><published>2009-11-23T16:19:52+00:00</published><updated>2009-11-23T16:19:52+00:00</updated><id>https://simonwillison.net/2009/Nov/23/batchselect/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/lilspikey/django-batch-select"&gt;django-batch-select&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A smart attempt at solving select_related for many-to-many relationships in Django. Add a custom manager to your model and call e.g. &lt;code&gt;Entry.objects.all()[:10].batch_select("tags")&lt;/code&gt; to execute two queries - one pulling back the first ten entries and another using an "IN" query against the tags table to pull back all of the tags for those entries in one go.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://www.psychicorigami.com/2009/11/23/django-batch-select/"&gt;Psychic Origami&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/batchselect"&gt;batchselect&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-montgomery"&gt;john-montgomery&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/manytomany"&gt;manytomany&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/selectrelated"&gt;selectrelated&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="batchselect"/><category term="django"/><category term="john-montgomery"/><category term="manytomany"/><category term="orm"/><category term="python"/><category term="selectrelated"/><category term="sql"/></entry><entry><title>Django 1.2 planned features</title><link href="https://simonwillison.net/2009/Oct/26/django/#atom-tag" rel="alternate"/><published>2009-10-26T10:38:06+00:00</published><updated>2009-10-26T10:38:06+00:00</updated><id>https://simonwillison.net/2009/Oct/26/django/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.djangoproject.com/wiki/Version1.2Features"&gt;Django 1.2 planned features&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The votes are in and the plan for Django 1.2 has taken shape - features are split in to high, medium and low priority. There's some really exciting stuff in there - outside of the things I've already talked about, I'm particularly excited about multidb, &lt;code&gt;Model.objects.raw(SQL)&lt;/code&gt;, the smarter &lt;code&gt;{% if %}&lt;/code&gt; tag and class-based generic views.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/classbasedviews"&gt;classbasedviews&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/multidb"&gt;multidb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="classbasedviews"/><category term="django"/><category term="multidb"/><category term="orm"/><category term="python"/></entry><entry><title>Django 1.1 release notes</title><link href="https://simonwillison.net/2009/Jul/29/django/#atom-tag" rel="alternate"/><published>2009-07-29T09:34:04+00:00</published><updated>2009-07-29T09:34:04+00:00</updated><id>https://simonwillison.net/2009/Jul/29/django/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://docs.djangoproject.com/en/dev/releases/1.1/"&gt;Django 1.1 release notes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Django 1.1 is out! Congratulations everyone who worked on this, it’s a fantastic release. New features include aggregate support in the ORM, proxy models, deferred fields and some really nice admin improvements. Oh, and the testing framework is now up to 10 times faster thanks to smart use of transactions.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://www.djangoproject.com/weblog/2009/jul/29/1-point-1/"&gt;Django | Weblog | Django 1.1 released&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/aggregates"&gt;aggregates&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django-admin"&gt;django-admin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/releases"&gt;releases&lt;/a&gt;&lt;/p&gt;



</summary><category term="aggregates"/><category term="django"/><category term="django-admin"/><category term="open-source"/><category term="orm"/><category term="python"/><category term="releases"/></entry><entry><title>South's Design</title><link href="https://simonwillison.net/2009/May/13/south/#atom-tag" rel="alternate"/><published>2009-05-13T12:30:45+00:00</published><updated>2009-05-13T12:30:45+00:00</updated><id>https://simonwillison.net/2009/May/13/south/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.aeracode.org/2009/5/9/souths-design/"&gt;South&amp;#x27;s Design&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Andrew Godwin explains why South resorts to parsing your models.py file in order to construct the information it needs for creating automatic migrations.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/andrew-godwin"&gt;andrew-godwin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parsing"&gt;parsing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/south"&gt;south&lt;/a&gt;&lt;/p&gt;



</summary><category term="andrew-godwin"/><category term="django"/><category term="models"/><category term="orm"/><category term="parsing"/><category term="python"/><category term="south"/></entry><entry><title>Haystack</title><link href="https://simonwillison.net/2009/Apr/17/haystack/#atom-tag" rel="alternate"/><published>2009-04-17T21:53:49+00:00</published><updated>2009-04-17T21:53:49+00:00</updated><id>https://simonwillison.net/2009/Apr/17/haystack/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://haystacksearch.org/"&gt;Haystack&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A brand new modular search plugin for Django, by Daniel Lindsley. The interface is modelled after the Django ORM (complete with declarative classes for defining your search schema) and it ships with backends for both Solr and pure-python Whoosh, with more on the way. Excellent documentation.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://toastdriven.com/fresh/announcing-haystack-modular-search-django/"&gt;Toast Driven&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/daniel-lindsley"&gt;daniel-lindsley&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/haystack"&gt;haystack&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/solr"&gt;solr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/whoosh"&gt;whoosh&lt;/a&gt;&lt;/p&gt;



</summary><category term="daniel-lindsley"/><category term="django"/><category term="haystack"/><category term="orm"/><category term="python"/><category term="search"/><category term="solr"/><category term="whoosh"/></entry><entry><title>Southerly Breezes</title><link href="https://simonwillison.net/2009/Mar/15/aeracode/#atom-tag" rel="alternate"/><published>2009-03-15T13:17:20+00:00</published><updated>2009-03-15T13:17:20+00:00</updated><id>https://simonwillison.net/2009/Mar/15/aeracode/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.aeracode.org/2009/3/10/southerly-breezes/"&gt;Southerly Breezes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Andrew Godwin is slowly assimilating the best ideas from other Django migration systems in to South—the latest additions include ORM Freezing from Migratory and automatic change detection. Exciting stuff.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/andrew-godwin"&gt;andrew-godwin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/migrations"&gt;migrations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/south"&gt;south&lt;/a&gt;&lt;/p&gt;



</summary><category term="andrew-godwin"/><category term="databases"/><category term="django"/><category term="migrations"/><category term="orm"/><category term="south"/></entry><entry><title>DB2 support for Django is coming</title><link href="https://simonwillison.net/2009/Feb/18/db2/#atom-tag" rel="alternate"/><published>2009-02-18T22:58:50+00:00</published><updated>2009-02-18T22:58:50+00:00</updated><id>https://simonwillison.net/2009/Feb/18/db2/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://antoniocangiano.com/2009/02/18/db2-support-for-django-is-coming/"&gt;DB2 support for Django is coming&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
From IBM, under the Apache 2.0 License. I’m not sure if this makes it hard to bundle it with the rest of Django, which uses the BSD license.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/antonio-cangiano"&gt;antonio-cangiano&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bsd"&gt;bsd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/db2"&gt;db2&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ibm"&gt;ibm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/licensing"&gt;licensing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="antonio-cangiano"/><category term="bsd"/><category term="databases"/><category term="db2"/><category term="django"/><category term="ibm"/><category term="licensing"/><category term="open-source"/><category term="orm"/><category term="python"/></entry><entry><title>Secrets of the Django ORM</title><link href="https://simonwillison.net/2008/Nov/8/secrets/#atom-tag" rel="alternate"/><published>2008-11-08T23:49:15+00:00</published><updated>2008-11-08T23:49:15+00:00</updated><id>https://simonwillison.net/2008/Nov/8/secrets/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.eflorenzano.com/blog/post/secrets-django-orm/"&gt;Secrets of the Django ORM&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
An undocumented (and unsupported) method of poking a Django QuerySet’s internal query to add group_by and having clauses to a SQL query.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/groupby"&gt;groupby&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/having"&gt;having&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/queryset"&gt;queryset&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="groupby"/><category term="having"/><category term="orm"/><category term="python"/><category term="queryset"/><category term="sql"/></entry><entry><title>Django 1.0 alpha release notes</title><link href="https://simonwillison.net/2008/Jul/22/alpha/#atom-tag" rel="alternate"/><published>2008-07-22T06:04:29+00:00</published><updated>2008-07-22T06:04:29+00:00</updated><id>https://simonwillison.net/2008/Jul/22/alpha/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.djangoproject.com/documentation/release_notes_1.0_alpha/"&gt;Django 1.0 alpha release notes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The big features are newforms-admin, unicode everywhere, the queryset-refactor ORM improvements and auto-escaping in templates.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/alpha"&gt;alpha&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/autoescaping"&gt;autoescaping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django-admin"&gt;django-admin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/newformsadmin"&gt;newformsadmin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/querysetrefactor"&gt;querysetrefactor&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/unicode"&gt;unicode&lt;/a&gt;&lt;/p&gt;



</summary><category term="alpha"/><category term="autoescaping"/><category term="django"/><category term="django-admin"/><category term="newformsadmin"/><category term="orm"/><category term="python"/><category term="querysetrefactor"/><category term="unicode"/></entry><entry><title>jQuery style chaining with the Django ORM</title><link href="https://simonwillison.net/2008/May/1/orm/#atom-tag" rel="alternate"/><published>2008-05-01T12:31:17+00:00</published><updated>2008-05-01T12:31:17+00:00</updated><id>https://simonwillison.net/2008/May/1/orm/#atom-tag</id><summary type="html">
    &lt;p&gt;Django's ORM is, in my opinion, the unsung gem of the framework. For the subset of SQL that's used in most web applications it's very hard to beat. It's a beautiful piece of API design, and I tip my hat to the people who designed and built it.&lt;/p&gt;

&lt;h4&gt;Lazy evaluation&lt;/h4&gt;

&lt;p&gt;If you haven't spent much time with the ORM, two key features are lazy evaluation and chaining. Consider the following statement:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;entries = Entry.objects.all()&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Assuming you have created an Entry model of some sort, the above statement will create a Django QuerySet object representing all of the entries in the database. It will &lt;em&gt;not&lt;/em&gt; result in the execution of any SQL - QuerySets are lazily evaluated, and are only executed at the last possible moment. The most common situation in which SQL will be executed is when the object is used for iteration:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;for entry in entries:
    print(entry.title)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This usually happens in a template:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;&amp;lt;ul&amp;gt;
{% for entry in entries %}
  &amp;lt;li&amp;gt;{{ entry.title }}&amp;lt;/li&amp;gt;
{% endfor %}
&amp;lt;/ul&amp;gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Lazy evaluation works nicely with &lt;a href="http://www.djangoproject.com/documentation/cache/#template-fragment-caching"&gt;template fragment caching&lt;/a&gt; - even if you pass a QuerySet to a template it won't be executed if the fragment it is used in can be served from the cache.&lt;/p&gt;

&lt;p&gt;You can modify QuerySets as many times as you like before they are executed:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;import datetime

entries = Entry.objects.all()
today = datetime.date.today()
entries_this_year = entries.filter(
    posted__year = today.year
)
entries_last_year = entries.filter(
    posted__year = today.year - 1
)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Again, no SQL has been executed, but we now have two QuerySets which, when iterated, will produce the desired result.&lt;/p&gt;
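
&lt;p&gt;To make the laziness concrete, here is a minimal pure-Python sketch of the idea. The &lt;code class="python"&gt;LazyQuery&lt;/code&gt; class, its &lt;code class="python"&gt;executions&lt;/code&gt; counter and the sample rows are all hypothetical stand-ins invented for illustration. This is not how Django implements QuerySets, just the same pattern in miniature: &lt;code class="python"&gt;filter()&lt;/code&gt; only records a condition and returns a new object, and the real work is deferred until iteration.&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;class LazyQuery(object):
    """Toy stand-in for a QuerySet: records filters, runs them on iteration."""

    def __init__(self, rows, predicates=()):
        self.rows = rows
        self.predicates = tuple(predicates)
        self.executions = 0  # how many times this "query" has actually run

    def filter(self, **conditions):
        # Build a new object; the original is unchanged and nothing runs yet.
        def predicate(row):
            return all(row[key] == value for key, value in conditions.items())
        return LazyQuery(self.rows, self.predicates + (predicate,))

    def __iter__(self):
        # Iteration is the point at which the work is finally done.
        self.executions += 1
        for row in self.rows:
            if all(check(row) for check in self.predicates):
                yield row


rows = [
    {"title": "First", "year": 2007},
    {"title": "Second", "year": 2008},
]
entries = LazyQuery(rows)
this_year = entries.filter(year=2008)  # nothing has run yet
titles = [row["title"] for row in this_year]  # iteration triggers the work&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because &lt;code class="python"&gt;filter()&lt;/code&gt; returns a new object instead of modifying the original, &lt;code class="python"&gt;entries&lt;/code&gt; can be reused as the starting point for any number of further queries.&lt;/p&gt;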

&lt;h4&gt;Chaining&lt;/h4&gt;

&lt;p&gt;Chaining comes in when you want to apply multiple modifications to a QuerySet. Here are blog entries from 2006 that weren't posted in January:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;Entry.objects.filter(
    posted__year = 2006
).exclude(posted__month = 1)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And here are entries from that year posted to the category named "Personal", ordered by title:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;Entry.objects.filter(
    posted__year = 2006
).filter(
    category__name = "Personal"
).order_by('title')&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The above can also be expressed like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;Entry.objects.filter(
    posted__year = 2006,
    category__name = "Personal"
).order_by('title')&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;Chaining in jQuery&lt;/h4&gt;

&lt;p&gt;The parallels to &lt;a href="http://jquery.com/"&gt;jQuery&lt;/a&gt; are pretty clear. The jQuery API is built around chaining, and the jQuery &lt;a href="http://docs.jquery.com/Effects"&gt;animation library&lt;/a&gt; even uses a form of lazy evaluation to automatically queue up effects to run in sequence:&lt;/p&gt;

&lt;pre&gt;&lt;code class="javascript"&gt;jQuery('div#message').addClass(
    'borderfade'
).animate({
    'borderWidth': '+10px'
}, 1000).fadeOut();&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One of the neatest things about jQuery is the plugin model, which takes advantage of JavaScript's prototype inheritance and makes it trivially easy to add new chainable methods. If we wanted to package the above dumb effect up as a plugin, we could do so like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class="javascript"&gt;jQuery.fn.dumbBorderFade = function() {
    return this.addClass(
        'borderfade'
    ).animate({
        'borderWidth': '+10px'
    }, 1000).fadeOut();
};&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Now we can apply it to an element like so:&lt;/p&gt;

&lt;pre&gt;&lt;code class="javascript"&gt;jQuery('div#message').dumbBorderFade();&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;Custom QuerySet methods in Django&lt;/h4&gt;

&lt;p&gt;Django supports adding custom methods for accessing the ORM through the ability to implement a custom &lt;a href="http://www.djangoproject.com/documentation/model-api/#managers"&gt;Manager&lt;/a&gt;. In the above examples, &lt;code class="python"&gt;Entry.objects&lt;/code&gt; is the Manager. The downside of this approach is that methods added to a manager can only be used at the beginning of the chain.&lt;/p&gt;

&lt;p&gt;Luckily, Managers also provide a hook for returning a custom QuerySet. This means we can create our own QuerySet subclass and add new methods to it, in a way that's reminiscent of jQuery:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;from django.db import models
from django.db.models.query import QuerySet
import datetime

class EntryQuerySet(QuerySet):
    def on_date(self, date):
        # Half-open range: from midnight on `date` up to, but not
        # including, midnight on the following day.
        next_day = date + datetime.timedelta(days = 1)
        return self.filter(
            posted__gte = date,
            posted__lt = next_day
        )

class EntryManager(models.Manager):
    # Renamed to get_queryset() in Django 1.6 and later.
    def get_query_set(self):
        return EntryQuerySet(self.model)

class Entry(models.Model):
    ...
    objects = EntryManager()&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The above gives us a new method on the QuerySets returned by Entry.objects called on_date(), which lets us filter entries down to those posted on a specific date. Now we can run queries like the following:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;Entry.objects.filter(
    category__name = 'Personal'
).on_date(datetime.date(2008, 5, 1))&lt;/code&gt;&lt;/pre&gt;

&lt;h4&gt;Reducing the boilerplate&lt;/h4&gt;

&lt;p&gt;This method works fine, but it requires quite a bit of boilerplate code - a QuerySet subclass and a Manager subclass plus the wiring to pull them all together. Wouldn't it be neat if you could declare the extra QuerySet methods inside the model definition itself?&lt;/p&gt;

&lt;p&gt;It turns out you can, and it's surprisingly easy. Here's the syntax I came up with:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;from django.db import models
from django.db.models.query import QuerySet

class Entry(models.Model):
    ...
    objects = QuerySetManager()
    ...
    class QuerySet(QuerySet):
        def on_date(self, date):
            return self.filter(
                ...
            )&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here I've made the custom QuerySet class an inner class of the model definition. I've also replaced the default manager with a QuerySetManager. All this class does is return the QuerySet inner class for the current model from get_query_set. The implementation looks like this:&lt;/p&gt;

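&lt;pre&gt;&lt;code class="python"&gt;class QuerySetManager(models.Manager):
    # Renamed to get_queryset() in Django 1.6 and later.
    def get_query_set(self):
        # Look up the QuerySet inner class declared on the model.
        return self.model.QuerySet(self.model)&lt;/code&gt;&lt;/pre&gt;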
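
&lt;p&gt;The mechanism is easier to see with Django taken out of the picture. Here is a rough pure-Python sketch of the same pattern. Every name in it (&lt;code class="python"&gt;ToyQuerySet&lt;/code&gt;, &lt;code class="python"&gt;ToyQuerySetManager&lt;/code&gt;, &lt;code class="python"&gt;in_year&lt;/code&gt;, the &lt;code class="python"&gt;rows&lt;/code&gt; list) is a hypothetical stand-in, and a descriptor is used here in place of Django's own mechanism for telling a manager which model it belongs to:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;class ToyQuerySet(object):
    """Hypothetical stand-in for Django's QuerySet: wraps a list of rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        # type(self) keeps the subclass, so custom methods survive chaining.
        matches = [
            row for row in self.rows
            if all(row[key] == value for key, value in conditions.items())
        ]
        return type(self)(matches)


class ToyQuerySetManager(object):
    """Descriptor that hands back the owning class's inner QuerySet."""

    def __get__(self, instance, owner):
        return owner.QuerySet(owner.rows)


class Entry(object):
    rows = [
        {"title": "Hello", "category": "Personal", "year": 2008},
        {"title": "Django", "category": "Tech", "year": 2008},
        {"title": "Older", "category": "Personal", "year": 2006},
    ]
    objects = ToyQuerySetManager()

    class QuerySet(ToyQuerySet):
        def in_year(self, year):
            return self.filter(year=year)


personal_2008 = Entry.objects.filter(category="Personal").in_year(2008)
titles = [row["title"] for row in personal_2008.rows]&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The custom method is available after &lt;code class="python"&gt;filter()&lt;/code&gt; because each step in the chain returns an instance of the inner class rather than the base class.&lt;/p&gt;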

&lt;p&gt;I'm pretty happy with this; it makes it trivial to add custom QuerySet methods and does so without any monkeypatching or deep reliance on Django ORM internals. I think the ease with which this can be achieved is a testament to the quality of the ORM API.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/api"&gt;api&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chaining"&gt;chaining&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jquery"&gt;jquery&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/queryset"&gt;queryset&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="api"/><category term="chaining"/><category term="django"/><category term="jquery"/><category term="orm"/><category term="python"/><category term="queryset"/></entry><entry><title>Queryset-refactor branch has been merged into trunk</title><link href="https://simonwillison.net/2008/Apr/27/qsrf/#atom-tag" rel="alternate"/><published>2008-04-27T07:21:13+00:00</published><updated>2008-04-27T07:21:13+00:00</updated><id>https://simonwillison.net/2008/Apr/27/qsrf/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://groups.google.com/group/django-users/browse_thread/thread/f4cd02d8d9389669"&gt;Queryset-refactor branch has been merged into trunk&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Malcolm’s latest Django masterpiece is complete.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/branch"&gt;branch&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/malcolm-tredinnick"&gt;malcolm-tredinnick&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qsrf"&gt;qsrf&lt;/a&gt;&lt;/p&gt;



</summary><category term="branch"/><category term="django"/><category term="malcolm-tredinnick"/><category term="orm"/><category term="python"/><category term="qsrf"/></entry><entry><title>mysql_cluster</title><link href="https://simonwillison.net/2008/Mar/21/mysqlcluster/#atom-tag" rel="alternate"/><published>2008-03-21T08:45:57+00:00</published><updated>2008-03-21T08:45:57+00:00</updated><id>https://simonwillison.net/2008/Mar/21/mysqlcluster/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://softwaremaniacs.org/soft/mysql_cluster/"&gt;mysql_cluster&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
My Russian isn’t all that good, but this looks like a neat way of getting Django to talk to a master/slave setup, written by Ivan Sagalaev. UPDATE: English docs are linked from the comments.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://antoniocangiano.com/2008/03/20/djangos-tipping-point/#comment-2765"&gt;A comment on &amp;quot;Django&amp;#x27;s tipping point&amp;quot;&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ivansagalaev"&gt;ivansagalaev&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/masterslave"&gt;masterslave&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysqlcluster"&gt;mysqlcluster&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replication"&gt;replication&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="ivansagalaev"/><category term="masterslave"/><category term="mysql"/><category term="mysqlcluster"/><category term="orm"/><category term="python"/><category term="replication"/></entry><entry><title>Queryset Implementation</title><link href="https://simonwillison.net/2008/Mar/19/defying/#atom-tag" rel="alternate"/><published>2008-03-19T09:43:08+00:00</published><updated>2008-03-19T09:43:08+00:00</updated><id>https://simonwillison.net/2008/Mar/19/defying/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.pointy-stick.com/blog/2008/03/11/queryset-implementation/"&gt;Queryset Implementation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Malcolm explains the work that has gone into the queryset-refactor branch. Executive summary: Django’s ORM is probably a lot better at SQL than you are.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/malcolm-tredinnick"&gt;malcolm-tredinnick&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/querysetrefactor"&gt;querysetrefactor&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="malcolm-tredinnick"/><category term="orm"/><category term="python"/><category term="querysetrefactor"/><category term="sql"/></entry><entry><title>Caching Layer for Django ORM</title><link href="https://simonwillison.net/2008/Jan/23/caching/#atom-tag" rel="alternate"/><published>2008-01-23T15:18:19+00:00</published><updated>2008-01-23T15:18:19+00:00</updated><id>https://simonwillison.net/2008/Jan/23/caching/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.davidcramer.net/code/73/caching-layer-for-django-orm.html"&gt;Caching Layer for Django ORM&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Interesting extension to Django’s ORM that adds automatic caching of querysets and smart cache invalidation.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/david-cramer"&gt;david-cramer&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="david-cramer"/><category term="django"/><category term="orm"/><category term="ormcaching"/><category term="python"/></entry><entry><title>Announcing StaticGenerator for Django</title><link href="https://simonwillison.net/2008/Jan/7/announcing/#atom-tag" rel="alternate"/><published>2008-01-07T21:26:41+00:00</published><updated>2008-01-07T21:26:41+00:00</updated><id>https://simonwillison.net/2008/Jan/7/announcing/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://superjared.com/entry/announcing-staticgenerator-django/"&gt;Announcing StaticGenerator for Django&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Simple but powerful static file generator for Django applications—just tell it about your model instances and it will create an entire static site based on calling get_absolute_url() on each one. Uses signals to repopulate the cache when a model changes.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jared-kuolt"&gt;jared-kuolt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/static"&gt;static&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/static-generator"&gt;static-generator&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="django"/><category term="jared-kuolt"/><category term="orm"/><category term="performance"/><category term="static"/><category term="static-generator"/></entry><entry><title>Django Evolution</title><link href="https://simonwillison.net/2007/Nov/23/evolution/#atom-tag" rel="alternate"/><published>2007-11-23T23:49:10+00:00</published><updated>2007-11-23T23:49:10+00:00</updated><id>https://simonwillison.net/2007/Nov/23/evolution/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/django-evolution/"&gt;Django Evolution&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Really smart take on the problem of updating database tables to reflect changes to Django models. Code that automatically modifies your database tables can be pretty scary, but Evolution seems to hit the right balance.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/djangoevolution"&gt;djangoevolution&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/migration"&gt;migration&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/schema"&gt;schema&lt;/a&gt;&lt;/p&gt;



</summary><category term="databases"/><category term="django"/><category term="djangoevolution"/><category term="migration"/><category term="orm"/><category term="schema"/></entry><entry><title>Using the extra() QuerySet modifier in Django for WeGoEat</title><link href="https://simonwillison.net/2007/Oct/24/ryan/#atom-tag" rel="alternate"/><published>2007-10-24T19:28:20+00:00</published><updated>2007-10-24T19:28:20+00:00</updated><id>https://simonwillison.net/2007/Oct/24/ryan/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.localkinegrinds.com/2007/10/24/using-the-extra-queryset-modifier-in-django-for-wegoeat/"&gt;Using the extra() QuerySet modifier in Django for WeGoEat&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
You can use the select argument to extra() on a QuerySet to obtain extra values using subqueries.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/queryset"&gt;queryset&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ryan-kanno"&gt;ryan-kanno&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/subqueries"&gt;subqueries&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="orm"/><category term="python"/><category term="queryset"/><category term="ryan-kanno"/><category term="subqueries"/></entry><entry><title>tranquil</title><link href="https://simonwillison.net/2007/Oct/9/tranquil/#atom-tag" rel="alternate"/><published>2007-10-09T02:30:29+00:00</published><updated>2007-10-09T02:30:29+00:00</updated><id>https://simonwillison.net/2007/Oct/9/tranquil/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/tranquil/"&gt;tranquil&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Inspired take on the Django ORM to SQLAlchemy problem: lets you define your models with the Django ORM but use SQLAlchemy to run queries against them.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/djangoorm"&gt;djangoorm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlalchemy"&gt;sqlalchemy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tranquil"&gt;tranquil&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="djangoorm"/><category term="models"/><category term="orm"/><category term="python"/><category term="sqlalchemy"/><category term="tranquil"/></entry></feed>