<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: memcached</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/memcached.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-07-21T23:58:53+00:00</updated><author><name>Simon Willison</name></author><entry><title>tidwall/pogocache</title><link href="https://simonwillison.net/2025/Jul/21/pogocache/#atom-tag" rel="alternate"/><published>2025-07-21T23:58:53+00:00</published><updated>2025-07-21T23:58:53+00:00</updated><id>https://simonwillison.net/2025/Jul/21/pogocache/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/tidwall/pogocache"&gt;tidwall/pogocache&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New project from Josh Baker, author of the excellent &lt;code&gt;tg&lt;/code&gt; C geospatial library (&lt;a href="https://simonwillison.net/2023/Sep/23/tg-polygon-indexing/"&gt;covered previously&lt;/a&gt;) and various other &lt;a href="https://github.com/tidwall"&gt;interesting projects&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Pogocache is fast caching software built from scratch with a focus on low latency and cpu efficiency.&lt;/p&gt;
&lt;p&gt;Faster: Pogocache is faster than Memcache, Valkey, Redis, Dragonfly, and Garnet. It has the lowest latency per request, providing the quickest response times. It's optimized to scale from one to many cores, giving you the best single-threaded and multithreaded performance.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Faster than Memcache and Redis is a big claim! The README includes a &lt;a href="https://github.com/tidwall/pogocache/blob/main/README.md#design-details"&gt;design details&lt;/a&gt; section that explains how the system achieves that performance, using a sharded hashmap inspired by Josh's &lt;a href="https://github.com/tidwall/shardmap"&gt;shardmap&lt;/a&gt; project and clever application of threads.&lt;/p&gt;
&lt;p&gt;Performance aside, the most interesting thing about Pogocache is the server interface it provides: it emulates the APIs for Redis and Memcached, provides a simple HTTP API &lt;em&gt;and&lt;/em&gt; lets you talk to it over the PostgreSQL wire protocol as well!&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;psql -h localhost -p 9401
=&amp;gt; SET first Tom;
=&amp;gt; SET last Anderson;
=&amp;gt; SET age 37;

$ curl http://localhost:9401/last
Anderson
&lt;/code&gt;&lt;/pre&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44638076"&gt;Show HN&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="caching"/><category term="http"/><category term="memcached"/><category term="postgresql"/><category term="redis"/></entry><entry><title>What are the best books/tutorials to begin learning about memcached?</title><link href="https://simonwillison.net/2013/Apr/8/what-are-the-best/#atom-tag" rel="alternate"/><published>2013-04-08T14:51:00+00:00</published><updated>2013-04-08T14:51:00+00:00</updated><id>https://simonwillison.net/2013/Apr/8/what-are-the-best/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-are-the-best-books-tutorials-to-begin-learning-about-memcached/answer/Simon-Willison"&gt;What are the best books/tutorials to begin learning about memcached?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There isn't really enough of memcached to justify a whole book - it's a pretty straightforward API.&lt;/p&gt;

&lt;p&gt;It's always interesting hearing about advanced usage patterns for it though. Again, these don't necessarily justify a book but they are frequently presented at conferences.&lt;/p&gt;

&lt;p&gt;Here's one video that may be relevant: &lt;span&gt;&lt;a href="http://lanyrd.com/2012/goruco/swfqp/"&gt;High Performance Caching with Rails - a session at GoRuCo 2012 by Matt Duncan&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Here's our full collection of 36 slides and video from talks about memcached: &lt;span&gt;&lt;a href="http://lanyrd.com/topics/memcached/coverage/"&gt;Conference coverage about Memcached on Lanyrd&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/books"&gt;books&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming"&gt;programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tutorials"&gt;tutorials&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="books"/><category term="memcached"/><category term="programming"/><category term="tutorials"/><category term="web-development"/><category term="quora"/></entry><entry><title>What are people's experiences using Memcached?</title><link href="https://simonwillison.net/2010/Oct/29/what-are-peoples-experiences/#atom-tag" rel="alternate"/><published>2010-10-29T14:03:00+00:00</published><updated>2010-10-29T14:03:00+00:00</updated><id>https://simonwillison.net/2010/Oct/29/what-are-peoples-experiences/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-are-peoples-experiences-using-Memcached/answer/Simon-Willison"&gt;What are people&amp;#39;s experiences using Memcached?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That it's so obviously a good idea (and works so well) that you'd be crazy not to use it. As far as I'm concerned, it's part of the default stack for any web application.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/web-development"&gt;web-development&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="databases"/><category term="memcached"/><category term="web-development"/><category term="quora"/></entry><entry><title>What is the best way to list every key stored in memcached?</title><link href="https://simonwillison.net/2010/Oct/11/what-is-the-best/#atom-tag" rel="alternate"/><published>2010-10-11T12:30:00+00:00</published><updated>2010-10-11T12:30:00+00:00</updated><id>https://simonwillison.net/2010/Oct/11/what-is-the-best/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;My answer to &lt;a href="https://www.quora.com/What-is-the-best-way-to-list-every-key-stored-in-memcached/answer/Simon-Willison"&gt;What is the best way to list every key stored in memcached?&lt;/a&gt; on Quora&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Redis might be a better bet for this - it has a "KEYS *" command which can return every key in the dataset, and its GET and SET performance are comparable to memcached.&lt;/p&gt;

&lt;p&gt;You can also set expiring keys within Redis, at which point it behaves very much like memcached. I've seen quite a few people use it as a drop-in memcached replacement, though you need to already be using an abstraction in your code (such as Django's built-in caching layer) if you want to make the switch without any widespread code changes.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/quora"&gt;quora&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="memcached"/><category term="quora"/></entry><entry><title>ElasticSearch memcached module</title><link href="https://simonwillison.net/2010/May/15/elasticsearch/#atom-tag" rel="alternate"/><published>2010-05-15T10:17:00+00:00</published><updated>2010-05-15T10:17:00+00:00</updated><id>https://simonwillison.net/2010/May/15/elasticsearch/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.elasticsearch.com/docs/elasticsearch/modules/memcached/"&gt;ElasticSearch memcached module&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Fascinating idea: the ElasticSearch search server provides an optional memcached protocol plugin for added performance which maps simple HTTP to memcached. GET is mapped to memcached get commands, POST is mapped to set commands. This means you can use any memcached client to communicate with the search server.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/elasticsearch"&gt;elasticsearch&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/protocol"&gt;protocol&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="elasticsearch"/><category term="http"/><category term="memcached"/><category term="protocol"/><category term="recovered"/></entry><entry><title>Introduction to nginx.conf scripting</title><link href="https://simonwillison.net/2010/Apr/21/nginx/#atom-tag" rel="alternate"/><published>2010-04-21T23:40:46+00:00</published><updated>2010-04-21T23:40:46+00:00</updated><id>https://simonwillison.net/2010/Apr/21/nginx/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://agentzh.org/misc/slides/nginx-conf-scripting/nginx-conf-scripting.html#1"&gt;Introduction to nginx.conf scripting&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Slideshow—hit left arrow to navigate through the slides. The nginx community is officially nuts. Starts out with a simple “Hello world” using the echo module, then rapidly descends down the rabbit hole in to array operations, sub-requests, memcached connection pooling and eventually non-blocking Drizzle SQL execution against a sharded cluster—all implemented in the nginx.conf configuration file.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/drizzle"&gt;drizzle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;&lt;/p&gt;



</summary><category term="drizzle"/><category term="http"/><category term="memcached"/><category term="nginx"/></entry><entry><title>Cache Machine: Automatic caching for your Django models</title><link href="https://simonwillison.net/2010/Mar/11/cachemachine/#atom-tag" rel="alternate"/><published>2010-03-11T19:35:32+00:00</published><updated>2010-03-11T19:35:32+00:00</updated><id>https://simonwillison.net/2010/Mar/11/cachemachine/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://jbalogh.me/2010/02/09/cache-machine/"&gt;Cache Machine: Automatic caching for your Django models&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cachemachine"&gt;cachemachine&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;&lt;/p&gt;



</summary><category term="cachemachine"/><category term="caching"/><category term="django"/><category term="memcached"/><category term="mozilla"/><category term="orm"/><category term="ormcaching"/><category term="python"/><category term="redis"/></entry><entry><title>Johnny Cache</title><link href="https://simonwillison.net/2010/Feb/28/johnny/#atom-tag" rel="alternate"/><published>2010-02-28T22:55:15+00:00</published><updated>2010-02-28T22:55:15+00:00</updated><id>https://simonwillison.net/2010/Feb/28/johnny/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://packages.python.org/johnny-cache/"&gt;Johnny Cache&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully.
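&lt;p&gt;Here is a minimal sketch of the generation-key idea described above. It is not Johnny Cache's actual code: a plain dict stands in for memcached, and the names &lt;code&gt;GenerationCache&lt;/code&gt;, &lt;code&gt;cached_select&lt;/code&gt; and &lt;code&gt;invalidate&lt;/code&gt; are hypothetical, chosen only for illustration.&lt;/p&gt;

```python
import hashlib


class GenerationCache:
    """Toy model of generation-based invalidation; a dict stands in for memcached."""

    def __init__(self):
        self.store = {}        # cache key -> cached rows
        self.generations = {}  # table name -> current generation number

    def _key(self, sql, tables):
        # The key embeds the current generation of every table the query reads.
        gens = ",".join(
            "{}:{}".format(t, self.generations.get(t, 0)) for t in sorted(tables)
        )
        return hashlib.sha1("{}|{}".format(gens, sql).encode("utf-8")).hexdigest()

    def cached_select(self, sql, tables, run_query):
        key = self._key(sql, tables)
        if key not in self.store:
            self.store[key] = run_query(sql)  # miss: go to the database
        return self.store[key]

    def invalidate(self, table):
        # A write bumps the generation, so keys built from the old one are never read again.
        self.generations[table] = self.generations.get(table, 0) + 1


calls = []


def fake_db(sql):
    """Hypothetical database call that records how often it is hit."""
    calls.append(sql)
    return ["row for " + sql]


cache = GenerationCache()
cache.cached_select("SELECT * FROM blog_entry", ["blog_entry"], fake_db)
cache.cached_select("SELECT * FROM blog_entry", ["blog_entry"], fake_db)  # served from cache
cache.invalidate("blog_entry")
cache.cached_select("SELECT * FROM blog_entry", ["blog_entry"], fake_db)  # new generation: miss
```

&lt;p&gt;Nothing is ever deleted explicitly in this sketch. Entries from an old generation just stop being requested, which is why a table that is written constantly would leave the cache close to useless.&lt;/p&gt;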


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ormcaching"&gt;ormcaching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="databases"/><category term="django"/><category term="memcached"/><category term="orm"/><category term="ormcaching"/><category term="performance"/><category term="python"/></entry><entry><title>Distributed lock on top of memcached</title><link href="https://simonwillison.net/2010/Feb/1/distributed/#atom-tag" rel="alternate"/><published>2010-02-01T10:15:02+00:00</published><updated>2010-02-01T10:15:02+00:00</updated><id>https://simonwillison.net/2010/Feb/1/distributed/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://amix.dk/blog/post/19386"&gt;Distributed lock on top of memcached&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A simple Python context manager (taking advantage of the with statement) that implements a distributed lock using memcached to store lock state: “memcached_lock can be used to ensure that some global data is only updated by one server”. Redis would work well for this kind of thing as well.
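&lt;p&gt;A rough sketch of how such a lock can work, assuming it relies on memcached's &lt;code&gt;add&lt;/code&gt; operation (which only succeeds if the key does not already exist). This is not the linked implementation: &lt;code&gt;FakeMemcached&lt;/code&gt; and &lt;code&gt;memcached_lock_sketch&lt;/code&gt; are made-up names, and an in-memory dict stands in for a real server.&lt;/p&gt;

```python
import time
from contextlib import contextmanager


class FakeMemcached:
    """In-memory stand-in exposing the two operations the lock needs."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        # Like memcached's add: store only if the key is absent.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def delete(self, key):
        self.data.pop(key, None)


@contextmanager
def memcached_lock_sketch(client, name, attempts=50, delay=0.01):
    key = "lock:" + name
    for _ in range(attempts):
        if client.add(key, "1"):
            break
        time.sleep(delay)  # someone else holds the lock; wait and retry
    else:
        raise RuntimeError("could not acquire lock " + name)
    try:
        yield
    finally:
        client.delete(key)  # always release, even if the body raised


client = FakeMemcached()
with memcached_lock_sketch(client, "update-global-data"):
    held_inside = "lock:update-global-data" in client.data
    second_attempt = client.add("lock:update-global-data", "1")
released = "lock:update-global-data" not in client.data
```

&lt;p&gt;Against a real server you would probably also want an expiry time on the lock key, so that a process that crashes while holding the lock does not block everyone else forever.&lt;/p&gt;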


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/concurrency"&gt;concurrency&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/contextmanager"&gt;contextmanager&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/locking"&gt;locking&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plurk"&gt;plurk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/with"&gt;with&lt;/a&gt;&lt;/p&gt;



</summary><category term="concurrency"/><category term="contextmanager"/><category term="locking"/><category term="memcached"/><category term="plurk"/><category term="python"/><category term="redis"/><category term="with"/></entry><entry><title>Crowdsourced document analysis and MP expenses</title><link href="https://simonwillison.net/2009/Dec/20/crowdsourcing/#atom-tag" rel="alternate"/><published>2009-12-20T12:07:53+00:00</published><updated>2009-12-20T12:07:53+00:00</updated><id>https://simonwillison.net/2009/Dec/20/crowdsourcing/#atom-tag</id><summary type="html">
    &lt;p&gt;As &lt;a href="https://web.archive.org/web/20091204154825/https://www.guardian.co.uk/politics/mps-expenses"&gt;you may have heard&lt;/a&gt;, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here's what we built:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://web.archive.org/web/20091213164102/http://mps-expenses2.guardian.co.uk/"&gt;http://mps-expenses2.guardian.co.uk/&lt;/a&gt; &lt;em&gt;Updated March 2021: all links now go to the Internet Archive&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2009/mp-expenses-2-cropped.png" alt="Screenshot of the homepage from December 2009" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;It's a crowdsourcing application that asks the public to help us dig through and categorise the enormous stack of documents - around 30,000 pages of claim forms, scanned receipts and hand-written letters, all scanned and published as PDFs.&lt;/p&gt;

&lt;p&gt;This is the second time we've tried this - the first was back in June, and can be seen at &lt;a href="https://web.archive.org/web/20090802094829/http://mps-expenses.guardian.co.uk/"&gt;mps-expenses.guardian.co.uk&lt;/a&gt;. Last week's attempt was an opportunity to apply the lessons we learnt the first time round.&lt;/p&gt;

&lt;p&gt;Writing crowdsourcing applications in a newspaper environment is a fascinating challenge. Projects have very little notice - I heard about the new document release the Thursday before, giving us less than a week to put everything together. In addition to the fast turnaround for the application itself, the 48 hours following the release are crucial. The news cycle moves fast, so if the application launches but we don't manage to get useful data out of it quickly, the story will move on before we can impact it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://web.archive.org/web/20091124150940/http://www.scalecamp.org.uk/"&gt;ScaleCamp&lt;/a&gt; on the Friday meant that development work didn't properly kick off until Monday morning. The bulk of the work was performed by two server-side developers, one client-side developer, one designer and one QA on Monday, Tuesday and Wednesday. The Guardian operations team deftly handled our EC2 configuration and deployment, and we had some extra help on the day from other members of the technology department. After launch we also had a number of journalists helping highlight discoveries and dig through submissions.&lt;/p&gt;

&lt;p&gt;The system was written using Django, MySQL (InnoDB), Redis and memcached.&lt;/p&gt;

&lt;h4 id="asking-the-right-question"&gt;Asking the right question&lt;/h4&gt;

&lt;p&gt;The biggest mistake we made the first time round was that we asked the wrong question. We tried to get our audience to categorise documents as either "claims" or "receipts" and to rank them as "not interesting", "a bit interesting", "interesting but already known" and "someone should investigate this". We also asked users to optionally enter any numbers they saw on the page as categorised "line items", with the intention of adding these up later.&lt;/p&gt;

&lt;p&gt;The line items, with hindsight, were a mistake. 400,000 documents makes for a huge amount of data entry and for the figures to be useful we would need to confirm their accuracy. This would mean yet more rounds of crowdsourcing, and the job was so large that the chance of getting even one person to enter line items for each page rapidly diminished as the news story grew less prominent.&lt;/p&gt;

&lt;p&gt;The categorisations worked reasonably well but weren't particularly interesting - knowing if a document is a claim or receipt is useful only if you're going to collect line items. The "investigate this" button worked very well though.&lt;/p&gt;

&lt;p&gt;We completely changed our approach for the new system. We dropped the line item task and instead asked our users to categorise each page by applying one or more tags, from a small set that our editors could control. This gave us a lot more flexibility - we changed the tags shortly before launch based on the characteristics of the documents - and had the potential to be a lot more fun as well. I'm particularly fond of the "hand-written" tag, which has highlighted some &lt;a href="https://web.archive.org/web/20091223091650/http://mps-expenses2.guardian.co.uk/page/1062/"&gt;lovely examples&lt;/a&gt; of correspondence between MPs and the expenses office.&lt;/p&gt;

&lt;p&gt;Sticking to an editorially assigned set of tags provided a powerful tool for directing people's investigations, and also ensured our users didn't start creating potentially libelous tags of their own.&lt;/p&gt;

&lt;h4 id="breaking-up-assignments"&gt;Breaking it up in to assignments&lt;/h4&gt;

&lt;p&gt;For the first project, everyone worked together on the same task to review all of the documents. This worked fine while the document set was small, but once we had loaded in 400,000+ pages the progress bar became quite depressing.&lt;/p&gt;

&lt;p&gt;This time round, we added a new concept of "&lt;a href="https://web.archive.org/web/20091215224727/http://mps-expenses2.guardian.co.uk/assignment/"&gt;assignments&lt;/a&gt;". Each assignment consisted of the set of pages belonging to a specified list of MPs, documents or political parties. Assignments had a threshold, so we could specify that a page must be reviewed by at least X people before it was considered reviewed. An editorial tool let us feature one "main" assignment and several alternative assignments right on the homepage.&lt;/p&gt;

&lt;p&gt;Clicking "start reviewing" on an assignment sets a cookie for that assignment, and adds the assignment's progress bar to the top of the review interface. New pages are selected at random from the set of unreviewed pages in that assignment.&lt;/p&gt;

&lt;p&gt;The assignments system proved extremely effective. We could use it to direct people to the highest value documents (our top hit list of interesting MPs, or members of the shadow cabinet) while still allowing people with specific interests to pick an alternative task.&lt;/p&gt;
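&lt;p&gt;To make the threshold idea concrete, here is a small sketch of how an assignment's remaining pages and progress bar could be computed. It is an illustration under my own assumptions rather than the code we ran: the function names and the in-memory &lt;code&gt;Counter&lt;/code&gt; of review counts are hypothetical.&lt;/p&gt;

```python
import random
from collections import Counter


def unreviewed_pages(assignment_pages, review_counts, threshold):
    """Pages in the assignment that have fewer than `threshold` reviews so far."""
    return [p for p in assignment_pages if threshold > review_counts[p]]


def progress(assignment_pages, review_counts, threshold):
    """Fraction of the assignment that has reached the threshold."""
    remaining = unreviewed_pages(assignment_pages, review_counts, threshold)
    done = len(assignment_pages) - len(remaining)
    return done / len(assignment_pages)


pages = [10, 11, 12, 13]
counts = Counter({10: 2, 11: 1, 12: 0, 13: 3})  # page ID -> reviews received

todo = unreviewed_pages(pages, counts, threshold=2)
fraction_done = progress(pages, counts, threshold=2)
next_page = random.choice(todo)  # new pages are picked at random from what is left
```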

&lt;h4 id="get-the-button-right"&gt;Get the button right!&lt;/h4&gt;

&lt;p&gt;Having run two crowdsourcing projects I can tell you this: the single most important piece of code you will write is the code that gives someone something new to review. Both of our projects had big "start reviewing" buttons. Both were broken in different ways.&lt;/p&gt;

&lt;p&gt;The first time round, the mistakes were around scalability. I used a SQL "ORDER BY RAND()" statement to return the next page to review. I knew this was an inefficient operation, but I assumed that it wouldn't matter since the button would only be clicked occasionally.&lt;/p&gt;

&lt;p&gt;Something like 90% of our database load turned out to be caused by that one SQL statement, and it only got worse as we loaded more pages in to the system. This caused multiple site slowdowns and crashes until we threw together a cron job that pushed 1,000 unreviewed page IDs in to memcached and made the button pick one of those at random.&lt;/p&gt;

&lt;p&gt;This solved the performance problem, but meant that our user activity wasn't nearly as well targeted. For optimum efficiency you really want everyone to be looking at a different page - and a random distribution is almost certainly the easiest way to achieve that.&lt;/p&gt;
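&lt;p&gt;The cron job workaround can be sketched roughly like this. A dict stands in for memcached, and the function names and the &lt;code&gt;unreviewed_pool&lt;/code&gt; key are invented for the example, so treat it as an outline of the idea rather than the original code.&lt;/p&gt;

```python
import random

cache = {}  # stands in for memcached


def refresh_pool(unreviewed_ids, size=1000):
    """What the cron job does: cache a sample of unreviewed page IDs."""
    ids = list(unreviewed_ids)
    cache["unreviewed_pool"] = random.sample(ids, min(size, len(ids)))


def next_page_id():
    """What the button does: pick from the cached pool without touching SQL."""
    pool = cache.get("unreviewed_pool") or []
    return random.choice(pool) if pool else None


refresh_pool(range(1, 5001))  # pretend 5,000 pages are still unreviewed
picked = next_page_id()
```

&lt;p&gt;The trade-off is visible in the sketch: between cron runs every visitor draws from the same 1,000 IDs, so reviews cluster on that pool instead of spreading across the whole document set.&lt;/p&gt;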

&lt;p&gt;The second time round I turned to my new favourite in-memory data structure server, &lt;a href="http://code.google.com/p/redis/"&gt;redis&lt;/a&gt;, and its &lt;a href="http://code.google.com/p/redis/wiki/SrandmemberCommand"&gt;SRANDMEMBER&lt;/a&gt; command (a feature I &lt;a href="http://twitter.com/simonw/status/5027987857"&gt;requested&lt;/a&gt; a while ago with this exact kind of project in mind). The system maintains a redis set of all IDs that need to be reviewed for an assignment to be complete, and a separate set of IDs of all pages that have been reviewed. It then uses redis set difference (the &lt;a href="http://code.google.com/p/redis/wiki/SdiffstoreCommand"&gt;SDIFFSTORE&lt;/a&gt; command) to create a set of unreviewed pages for the current assignment and then SRANDMEMBER to pick one of those pages.&lt;/p&gt;
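&lt;p&gt;Here is a small model of that selection logic using plain Python sets in place of a Redis server. The two helper functions only mimic what SDIFFSTORE and SRANDMEMBER do, and the key names are made up for illustration.&lt;/p&gt;

```python
import random

# Python sets stand in for the Redis sets; the key names are hypothetical.
sets = {
    "assignment:1:pages": {101, 102, 103, 104, 105},
    "reviewed": {102, 104},
}


def sdiffstore(dest, first, *others):
    """Mimics SDIFFSTORE: dest = first minus every other set; returns the size."""
    result = set(sets.get(first, set()))
    for other in others:
        result -= sets.get(other, set())
    sets[dest] = result
    return len(result)


def srandmember(key):
    """Mimics SRANDMEMBER: a random member, or None for an empty set."""
    members = sorted(sets.get(key, set()))
    return random.choice(members) if members else None


remaining = sdiffstore("assignment:1:unreviewed", "assignment:1:pages", "reviewed")
next_page = srandmember("assignment:1:unreviewed")
```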

&lt;p&gt;This is where the bug crept in. Redis was just being used as an optimisation - the single point of truth for whether a page had been reviewed or not stayed as MySQL. I wrote a couple of Django management commands to repopulate the denormalised Redis sets should we need to manually modify the database. Unfortunately I missed some - the sets that tracked what pages were available in each document. The assignment generation code used an intersection of these sets to create the overall set of documents for that assignment. When we deleted some pages that had accidentally been imported twice I failed to update those sets.&lt;/p&gt;

&lt;p&gt;This meant the "next page" button would occasionally turn up a page that didn't exist. I had some very poorly considered fallback logic for that - if the random page didn't exist, the system would return the first page in that assignment instead. Unfortunately, this meant that when the assignment was down to the last four non-existent pages every single user was directed to the same page - which subsequently attracted well over a thousand individual reviews.&lt;/p&gt;

&lt;p&gt;Next time, I'm going to try and make the "next" button completely bulletproof! I'm also going to maintain a "denormalisation dictionary" documenting every denormalisation in the system in detail - such a thing would have saved me several hours of confused debugging.&lt;/p&gt;
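&lt;p&gt;One possible shape for a more defensive "next" button, as a sketch only: check each random pick against the source of truth, drop stale IDs from the denormalised set, and return nothing rather than falling back to a fixed page. The function name and the &lt;code&gt;page_exists&lt;/code&gt; callback are assumptions made for the example.&lt;/p&gt;

```python
import random


def pick_next_page(unreviewed_ids, page_exists, max_attempts=20):
    """Return a random unreviewed page that really exists, or None.

    unreviewed_ids: mutable set of candidate IDs (the denormalised copy).
    page_exists: callable that checks the source of truth, e.g. the database.
    """
    for _ in range(max_attempts):
        if not unreviewed_ids:
            return None
        candidate = random.choice(sorted(unreviewed_ids))
        if page_exists(candidate):
            return candidate
        # Stale entry: drop it so it cannot be picked again, then retry.
        unreviewed_ids.discard(candidate)
    return None


real_pages = {1, 2, 3}
candidates = {1, 2, 3, 98, 99}  # 98 and 99 were deleted from the database
picks = {pick_next_page(candidates, real_pages.__contains__) for _ in range(200)}

only_stale = {98, 99}
nothing_left = pick_next_page(only_stale, real_pages.__contains__)
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; when nothing valid remains lets the caller show an "assignment complete" message instead of sending every user to the same page.&lt;/p&gt;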

&lt;h4 id="exposing-the-results"&gt;Exposing the results&lt;/h4&gt;

&lt;p&gt;The biggest mistake I made last time was not getting the data back out again fast enough for our reporters to effectively use it. It took 24 hours from the launch of the application to the moment the first reporting feature was added - mainly because we spent much of the intervening time figuring out the scaling issues.&lt;/p&gt;

&lt;p&gt;This time we handled this a lot better. We provided private pages exposing all recent activity on the site. We also provided public pages for each of the tags, as well as combination pages for party + tag, MP + tag, document + tag, assignment + tag and user + tag. Most of these pages were ordered by most-tagged, with the hope that the most interesting pages would quickly bubble to the top.&lt;/p&gt;

&lt;p&gt;This worked pretty well, but we made one key mistake. The way we were ordering pages meant that it was almost impossible to paginate through them and be sure that you had seen everything under a specific tag. If you're trying to keep track of everything going on in the site, reliable pagination is essential. The only way to get reliable pagination on a fast moving site is to order by the date something was first added to a set in ascending order. That way you can work through all of the pages, wait a bit, hit "refresh" and be able to continue paginating where you left off. Any other order results in the content of each page changing as new content comes in.&lt;/p&gt;

&lt;p&gt;We eventually added an undocumented /in-order/ URL prefix to address this issue. Next time I'll pay a lot more attention to getting the pagination options right from the start.&lt;/p&gt;
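&lt;p&gt;The stable ordering argument can be shown with a short sketch. It assumes each page gets an ever-increasing position when it is first added to a tag; &lt;code&gt;paginate&lt;/code&gt; and its cursor argument are hypothetical and are not how the /in-order/ URLs were actually implemented.&lt;/p&gt;

```python
def paginate(items, after_position=0, page_size=3):
    """items: list of (position, page_id); position grows as pages are added.

    Returns one page of results plus the cursor to pass in for the next page.
    """
    ordered = sorted(items)  # ascending by the order each page joined the set
    page = [item for item in ordered if item[0] > after_position][:page_size]
    cursor = page[-1][0] if page else after_position
    return [page_id for _, page_id in page], cursor


tagged = [(1, "page-a"), (2, "page-b"), (3, "page-c"), (4, "page-d")]
first, cursor = paginate(tagged)

tagged.append((5, "page-e"))  # new content arrives while the reader is away
second, cursor = paginate(tagged, after_position=cursor)
```

&lt;p&gt;Because new items only ever appear after the cursor, the reader can come back later and carry on from where they stopped without missing or re-reading anything.&lt;/p&gt;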

&lt;h4 id="rewarding-our-contributors"&gt;Rewarding our contributors&lt;/h4&gt;

&lt;p&gt;The reviewing experience the first time round was actually quite lonely. We deliberately avoided showing people how others had marked each page because we didn't want to bias the results. Unfortunately this meant the site felt like a bit of a ghost town, even when hundreds of other people were actively reviewing things at the same time.&lt;/p&gt;

&lt;p&gt;For the new version, we tried to provide a much better feeling of activity around the site. We added "top reviewer" tables to every assignment, MP and political party as well as a "most active reviewers in the past 48 hours" table on the homepage (this feature was added to the first project several days too late). User profile pages got a lot more attention, with more of a feel that users were collecting their favourite pages in to tag buckets within their profile.&lt;/p&gt;

&lt;p&gt;Most importantly, we added a concept of &lt;a href="https://web.archive.org/web/20091223091046/http://mps-expenses2.guardian.co.uk/discoveries/"&gt;discoveries&lt;/a&gt; - editorially highlighted pages that were shown on the homepage and credited to the user that had first highlighted them. These discoveries also added valuable editorial interest to the site, showing up on the homepage and also the index pages for &lt;a href="https://web.archive.org/web/20091215191906/http://mps-expenses2.guardian.co.uk/labour/"&gt;political parties&lt;/a&gt; and &lt;a href="https://web.archive.org/web/20091215050919/http://mps-expenses2.guardian.co.uk/conservative/gerald-howarth/"&gt;individual MPs&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id="light-weight-registration"&gt;Light-weight registration&lt;/h4&gt;

&lt;p&gt;For both projects, we implemented an extremely light-weight form of registration. Users can start reviewing pages without going through any signup mechanism, and instead are assigned a cookie and an anon-454 style username the first time they review a document. They are then encouraged to assign themselves a proper username and password so they can log in later and take credit for their discoveries.&lt;/p&gt;

&lt;p&gt;It's difficult to tell how effective this approach really is. I have a strong hunch that it dramatically increases the number of people who review at least one document, but without a formal A/B test it's hard to tell how true that is. The UI for this process in the first project was quite confusing - we gave it a solid makeover the second time round, which seems to have resulted in a higher number of conversions.&lt;/p&gt;

&lt;h4 id="overall-lessons"&gt;Overall lessons&lt;/h4&gt;

&lt;p&gt;News-based crowdsourcing projects of this nature are both challenging and an enormous amount of fun. For the best chances of success, be sure to ask the right question, ensure user contributions are rewarded, expose as much data as possible and make the "next thing to review" behaviour rock solid. I'm looking forward to the next opportunity to apply these lessons, although at this point I &lt;em&gt;really&lt;/em&gt; hope it involves something other than MPs' expenses.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/crowdsourcing"&gt;crowdsourcing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/guardian"&gt;guardian&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/innodb"&gt;innodb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mpsexpenses"&gt;mpsexpenses&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nosql"&gt;nosql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/politics"&gt;politics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="crowdsourcing"/><category term="django"/><category term="guardian"/><category term="innodb"/><category term="memcached"/><category term="mpsexpenses"/><category term="mysql"/><category term="nosql"/><category term="politics"/><category term="projects"/><category term="python"/><category term="redis"/></entry><entry><title>dustin's gomemcached</title><link href="https://simonwillison.net/2009/Nov/13/gomemcached/#atom-tag" rel="alternate"/><published>2009-11-13T15:13:45+00:00</published><updated>2009-11-13T15:13:45+00:00</updated><id>https://simonwillison.net/2009/Nov/13/gomemcached/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/dustin/gomemcached"&gt;dustin&amp;#x27;s gomemcached&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A memcached server written in Go, an experiment by memcached maintainer Dustin Sallings.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://dustin.github.com/2009/11/12/gomemcached.html"&gt;Hello World in Go&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/concurrency"&gt;concurrency&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dustin-sallings"&gt;dustin-sallings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/programming"&gt;programming&lt;/a&gt;&lt;/p&gt;



</summary><category term="concurrency"/><category term="dustin-sallings"/><category term="go"/><category term="memcached"/><category term="programming"/></entry><entry><title>memcache-top</title><link href="https://simonwillison.net/2009/Oct/29/memcachetop/#atom-tag" rel="alternate"/><published>2009-10-29T08:32:18+00:00</published><updated>2009-10-29T08:32:18+00:00</updated><id>https://simonwillison.net/2009/Oct/29/memcachetop/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/memcache-top/"&gt;memcache-top&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful self-contained perl script for interactively monitoring a group of memcached servers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/monitoring"&gt;monitoring&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/perl"&gt;perl&lt;/a&gt;&lt;/p&gt;



</summary><category term="memcached"/><category term="monitoring"/><category term="perl"/></entry><entry><title>How We Made GitHub Fast</title><link href="https://simonwillison.net/2009/Oct/21/github/#atom-tag" rel="alternate"/><published>2009-10-21T21:14:38+00:00</published><updated>2009-10-21T21:14:38+00:00</updated><id>https://simonwillison.net/2009/Oct/21/github/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/blog/530-how-we-made-github-fast"&gt;How We Made GitHub Fast&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/drbd"&gt;drbd&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/erlang"&gt;erlang&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ernie"&gt;ernie&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git"&gt;git&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/haproxy"&gt;haproxy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/proxymachine"&gt;proxymachine&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rails"&gt;rails&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replication"&gt;replication&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ruby"&gt;ruby&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ssh"&gt;ssh&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/unicorn"&gt;unicorn&lt;/a&gt;&lt;/p&gt;



</summary><category term="drbd"/><category term="erlang"/><category term="ernie"/><category term="git"/><category term="github"/><category term="haproxy"/><category term="memcached"/><category term="nginx"/><category term="proxymachine"/><category term="rails"/><category term="redis"/><category term="replication"/><category term="ruby"/><category term="scaling"/><category term="ssh"/><category term="unicorn"/></entry><entry><title>Ravelry</title><link href="https://simonwillison.net/2009/Sep/3/ravelry/#atom-tag" rel="alternate"/><published>2009-09-03T18:50:20+00:00</published><updated>2009-09-03T18:50:20+00:00</updated><id>https://simonwillison.net/2009/Sep/3/ravelry/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.tbray.org/ongoing/When/200x/2009/09/02/Ravelry"&gt;Ravelry&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caseyforbes"&gt;caseyforbes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/haproxy"&gt;haproxy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/passenger"&gt;passenger&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rails"&gt;rails&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ravelry"&gt;ravelry&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sphinx-search"&gt;sphinx-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tim-bray"&gt;tim-bray&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokyocabinet"&gt;tokyocabinet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokyotyrant"&gt;tokyotyrant&lt;/a&gt;&lt;/p&gt;



</summary><category term="caseyforbes"/><category term="haproxy"/><category term="memcached"/><category term="mysql"/><category term="nginx"/><category term="passenger"/><category term="rails"/><category term="ravelry"/><category term="scaling"/><category term="sphinx-search"/><category term="tim-bray"/><category term="tokyocabinet"/><category term="tokyotyrant"/></entry><entry><title>Memcached 1.4.0 released</title><link href="https://simonwillison.net/2009/Jul/17/memcached/#atom-tag" rel="alternate"/><published>2009-07-17T22:26:48+00:00</published><updated>2009-07-17T22:26:48+00:00</updated><id>https://simonwillison.net/2009/Jul/17/memcached/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://dustin.github.com/2009/07/16/memcached-1.4.html"&gt;Memcached 1.4.0 released&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful-sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ami"&gt;ami&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/binary"&gt;binary&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cas"&gt;cas&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dustin-sallings"&gt;dustin-sallings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ec2"&gt;ec2&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;&lt;/p&gt;



</summary><category term="ami"/><category term="binary"/><category term="caching"/><category term="cas"/><category term="dustin-sallings"/><category term="ec2"/><category term="memcached"/><category term="performance"/><category term="scaling"/></entry><entry><title>cache-money</title><link href="https://simonwillison.net/2009/Jun/28/cachemoney/#atom-tag" rel="alternate"/><published>2009-06-28T15:17:30+00:00</published><updated>2009-06-28T15:17:30+00:00</updated><id>https://simonwillison.net/2009/Jun/28/cachemoney/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://github.com/nkallen/cache-money/tree/master"&gt;cache-money&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A “write-through caching library for ActiveRecord”, maintained by Nick Kallen from Twitter. Queries hit memcached first, and caches are automatically kept up-to-date when objects are created, updated and deleted. Only some queries are supported—joins and comparisons won’t hit the cache, for example.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/activerecord"&gt;activerecord&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cachemoney"&gt;cachemoney&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rails"&gt;rails&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;&lt;/p&gt;



</summary><category term="activerecord"/><category term="cachemoney"/><category term="caching"/><category term="memcached"/><category term="rails"/><category term="twitter"/></entry><entry><title>Twitter, an Evolving Architecture</title><link href="https://simonwillison.net/2009/Jun/28/twitter/#atom-tag" rel="alternate"/><published>2009-06-28T15:09:44+00:00</published><updated>2009-06-28T15:09:44+00:00</updated><id>https://simonwillison.net/2009/Jun/28/twitter/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.infoq.com/news/2009/06/Twitter-Architecture"&gt;Twitter, an Evolving Architecture&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The most detailed write-up of Twitter’s current architecture I’ve seen, explaining the four layers of cache (all memcached) used by the Twitter API.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/software-architecture"&gt;software-architecture&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="memcached"/><category term="twitter"/><category term="software-architecture"/></entry><entry><title>hash_ring 1.2</title><link href="https://simonwillison.net/2009/May/5/hashring/#atom-tag" rel="alternate"/><published>2009-05-05T13:45:08+00:00</published><updated>2009-05-05T13:45:08+00:00</updated><id>https://simonwillison.net/2009/May/5/hashring/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://pypi.python.org/pypi/hash_ring/1.2"&gt;hash_ring 1.2&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A Python library for consistent hashing with memcached, using MD5 and the same algorithm as libketama. Exposes an interface that is identical to regular memcache making this a drop-in replacement.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/amir-salihefendic"&gt;amir-salihefendic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/consistenthashing"&gt;consistenthashing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hashring"&gt;hashring&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/libketama"&gt;libketama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/md5"&gt;md5&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="amir-salihefendic"/><category term="caching"/><category term="consistenthashing"/><category term="django"/><category term="hashring"/><category term="libketama"/><category term="md5"/><category term="memcached"/><category term="python"/></entry><entry><title>peeping into memcached</title><link href="https://simonwillison.net/2009/Apr/20/peeping/#atom-tag" rel="alternate"/><published>2009-04-20T18:35:00+00:00</published><updated>2009-04-20T18:35:00+00:00</updated><id>https://simonwillison.net/2009/Apr/20/peeping/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.evanweaver.com/articles/2009/04/20/peeping-into-memcached/"&gt;peeping into memcached&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state”. You can then load the resulting data into MySQL using LOAD DATA LOCAL INFILE and analyse it using standard SQL queries.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/evanweaver"&gt;evanweaver&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/introspection"&gt;introspection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/peep"&gt;peep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;&lt;/p&gt;



</summary><category term="evanweaver"/><category term="introspection"/><category term="memcached"/><category term="mysql"/><category term="peep"/><category term="performance"/><category term="scaling"/><category term="sql"/><category term="twitter"/></entry><entry><title>Tokyo Cabinet: Beyond Key-Value Store</title><link href="https://simonwillison.net/2009/Feb/14/tokyo/#atom-tag" rel="alternate"/><published>2009-02-14T11:17:47+00:00</published><updated>2009-02-14T11:17:47+00:00</updated><id>https://simonwillison.net/2009/Feb/14/tokyo/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/"&gt;Tokyo Cabinet: Beyond Key-Value Store&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful overview of Yet Another Scalable Key Value Store. Interesting points: multiple backends (hash table, B-Tree, in memory, on disk), a “table” engine which enables more advanced queries, a network server that supports HTTP, memcached or its own binary protocol and the ability to extend the engine with Lua scripts.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hash"&gt;hash&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/keyvaluepairs"&gt;keyvaluepairs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokyocabinet"&gt;tokyocabinet&lt;/a&gt;&lt;/p&gt;



</summary><category term="databases"/><category term="hash"/><category term="http"/><category term="keyvaluepairs"/><category term="lua"/><category term="memcached"/><category term="tokyocabinet"/></entry><entry><title>Rate limiting with memcached</title><link href="https://simonwillison.net/2009/Jan/7/ratelimitcache/#atom-tag" rel="alternate"/><published>2009-01-07T22:27:08+00:00</published><updated>2009-01-07T22:27:08+00:00</updated><id>https://simonwillison.net/2009/Jan/7/ratelimitcache/#atom-tag</id><summary type="html">
    &lt;p&gt;On Monday, several high profile "celebrity" Twitter accounts &lt;a href="http://www.techcrunch.com/2009/01/05/twitter-gets-hacked-badly/" title="Twitter Gets Hacked, Badly"&gt;started spouting nonsense&lt;/a&gt;, the victims of stolen passwords. Wired &lt;a href="http://blog.wired.com/27bstroke6/2009/01/professed-twitt.html" title="Weak Password Brings 'Happiness' to Twitter Hacker"&gt;has the full story&lt;/a&gt; - someone ran a dictionary attack against a Twitter staff member, discovered their password and used Twitter's admin tools to reset the passwords on the accounts they wanted to steal.&lt;/p&gt;

&lt;p&gt;The Twitter incident got me thinking about rate limiting again. I've been wanting a good general solution to this problem for quite a while, for API projects as well as security. Django Snippets has &lt;a href="http://www.djangosnippets.org/snippets/1083/" title="Decorator to limit request rates to individual views"&gt;an answer&lt;/a&gt;, but it works by storing access information in the database and requires you to run a periodic purge command to clean up the old records.&lt;/p&gt;

&lt;p&gt;I'm strongly averse to writing to the database for every hit. For most web applications reads scale easily, but writes don't. I also want to avoid filling my database with administrative gunk (I dislike database backed sessions for the same reason). But rate limiting relies on storing state, so there has to be some kind of persistence.&lt;/p&gt;

&lt;h4&gt;Using memcached counters&lt;/h4&gt;

&lt;p&gt;I think I've found a solution, thanks to memcached and in particular the &lt;samp&gt;incr&lt;/samp&gt; command. &lt;samp&gt;incr&lt;/samp&gt; lets you atomically increment an already existing counter, simply by specifying its key. &lt;samp&gt;add&lt;/samp&gt; can be used to create that counter - it will fail silently if the provided key already exists.&lt;/p&gt;

&lt;p&gt;Let's say we want to limit a user to 10 hits every minute. A naive implementation would be to create a memcached counter for hits from that user's IP address in a specific minute. The counter key might look like this:&lt;/p&gt;

&lt;pre&gt;&lt;samp&gt;ratelimit_72.26.203.98_2009-01-07-21:45&lt;/samp&gt;&lt;/pre&gt;

&lt;p&gt;Increment that counter for every hit, and if it exceeds 10 block the request.&lt;/p&gt;
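&lt;p&gt;As a rough sketch of that naive per-minute counter (this is not the ratelimitcache code): &lt;code&gt;FakeCache&lt;/code&gt; is a hypothetical in-memory stand-in for a memcached client that only mimics &lt;samp&gt;add&lt;/samp&gt; and &lt;samp&gt;incr&lt;/samp&gt;, and the names &lt;code&gt;minute_key&lt;/code&gt; and &lt;code&gt;allow_request&lt;/code&gt; are made up for illustration. Key expiry is not modelled here.&lt;/p&gt;

```python
import time


class FakeCache(object):
    """Hypothetical in-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        # Like memcached add: store the value only if the key does not exist yet.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def incr(self, key):
        # Like memcached incr: bump an existing counter and return the new value.
        self.data[key] += 1
        return self.data[key]


def minute_key(ip, now):
    # Same key shape as the example above: ratelimit_IP_YYYY-MM-DD-HH:MM (UTC).
    stamp = time.strftime("%Y-%m-%d-%H:%M", time.gmtime(now))
    return "ratelimit_%s_%s" % (ip, stamp)


def allow_request(cache, ip, now, limit=10):
    key = minute_key(ip, now)
    cache.add(key, 0)  # creates the counter; does nothing if it already exists
    # Count this hit, then allow it only while the counter is within the limit.
    return limit >= cache.incr(key)
```

&lt;p&gt;Calling &lt;code&gt;allow_request(cache, ip, time.time())&lt;/code&gt; on every hit counts the hit and reports whether it should be served. With a real memcached client you would also give the counter an expiry time when creating it, so that old minutes clean themselves up.&lt;/p&gt;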

&lt;p&gt;What if the user makes ten requests all in the last second of the minute, then another ten a second later? The rate limiter will let them off. For many cases this is probably acceptable, but we can improve things with a slightly more complex strategy. Let's say we want to allow up to 30 requests every five minutes. Instead of maintaining one counter, we can maintain five - one for each of the past five minutes (older counters than that are allowed to expire). After a few minutes we might end up with counters that look like this:&lt;/p&gt;

&lt;pre&gt;&lt;samp&gt;ratelimit_72.26.203.98_2009-01-07-21:45 = 13
ratelimit_72.26.203.98_2009-01-07-21:46 = 7
ratelimit_72.26.203.98_2009-01-07-21:47 = 11&lt;/samp&gt;&lt;/pre&gt;

&lt;p&gt;Now, on every request we work out the keys for the past five minutes and use &lt;samp&gt;get_multi&lt;/samp&gt; to retrieve them. If the sum of those counters exceeds the maximum allowed for that time period, we block the request.&lt;/p&gt;
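&lt;p&gt;The check described above might be sketched like this. It is an illustration under assumptions, not the code from the repository: &lt;code&gt;FakeCache&lt;/code&gt; is a hypothetical stand-in whose &lt;code&gt;get_multi&lt;/code&gt; returns a dict containing only the keys that exist, and &lt;code&gt;recent_keys&lt;/code&gt; and &lt;code&gt;over_limit&lt;/code&gt; are names invented for the example.&lt;/p&gt;

```python
import time


class FakeCache(object):
    """Hypothetical in-memory stand-in for a memcached client (illustration only)."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def get_multi(self, keys):
        # Return only the keys that are present, as a dict.
        return dict((k, self.data[k]) for k in keys if k in self.data)


def recent_keys(ip, now, minutes):
    # One counter key per minute in the window, newest first (UTC).
    return [
        "ratelimit_%s_%s" % (
            ip, time.strftime("%Y-%m-%d-%H:%M", time.gmtime(now - 60 * i)))
        for i in range(minutes)
    ]


def over_limit(cache, ip, now, minutes=5, requests=30):
    # Fetch every counter in the window with a single get_multi call.
    counters = cache.get_multi(recent_keys(ip, now, minutes))
    # Minutes with no counter (expired or never created) simply contribute 0.
    return sum(counters.values()) > requests
```

&lt;p&gt;Fed the three example counters above (13, 7 and 11), &lt;code&gt;over_limit&lt;/code&gt; reports that the client is over a limit of 30 while all three minutes are still inside the five-minute window, and under it again once the 21:45 counter drops out.&lt;/p&gt;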

&lt;p&gt;Are there any obvious flaws to this approach? I'm pretty happy with it - it cleans up after itself (old counters quietly expire from the cache), it shouldn't use many resources (just five active cache keys per unique IP address at any one time) and if the cache is lost the only snag is that a few clients might go slightly over their rate limit. I don't &lt;em&gt;think&lt;/em&gt; it's possible for an attacker to force the counters to expire early.&lt;/p&gt;

&lt;h4&gt;An implementation for Django&lt;/h4&gt;

&lt;p&gt;I've put together an &lt;a href="http://github.com/simonw/ratelimitcache/tree/master/ratelimitcache.py"&gt;example implementation of this algorithm&lt;/a&gt; using Django, hosted on GitHub. The &lt;a href="http://github.com/simonw/ratelimitcache/tree/master/readme.txt"&gt;readme.txt&lt;/a&gt; file shows how it works - basic usage is via a simple decorator:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;from django.http import HttpResponse
from ratelimitcache import ratelimit

@ratelimit(minutes = 3, requests = 20)
def myview(request):
    # ...
    return HttpResponse('...')&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Python decorators are typically functions, but &lt;code&gt;ratelimit&lt;/code&gt; is actually a class. This means it can be customised by subclassing it, and the class provides a number of methods designed to be overridden. I've provided an example of this in the module itself - &lt;code&gt;ratelimit_post&lt;/code&gt;, a decorator which only limits on POST requests and can optionally couple the rate limiting to an individual POST field. Here's the complete implementation:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;import hashlib

class ratelimit_post(ratelimit):
    "Rate limit POSTs - can be used to protect a login form"
    key_field = None # If provided, this POST var will affect the rate limit
    
    def should_ratelimit(self, request):
        return request.method == 'POST'
    
    def key_extra(self, request):
        # IP address and key_field (if it is set)
        extra = super(ratelimit_post, self).key_extra(request)
        if self.key_field:
            # Hash the field so arbitrary user input is safe to use in a cache key
            value = request.POST.get(self.key_field, '')
            extra += '-' + hashlib.sha1(value.encode('utf-8')).hexdigest()
        return extra&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And here's how you would use it to limit the number of times a specific IP address can attempt to log in as a particular user:&lt;/p&gt;

&lt;pre&gt;&lt;code class="python"&gt;@ratelimit_post(minutes = 3, requests = 10, key_field = 'username')
def login(request):
    # ...
    return HttpResponse('...')&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;should_ratelimit()&lt;/code&gt; method is called before any other rate limiting logic. The default implementation returns True, but here we only want to apply rate limits to POST requests. The &lt;code&gt;key_extra()&lt;/code&gt; method is used to compose the keys used for the counter - by default this just includes the request's IP address, but in &lt;code&gt;ratelimit_post&lt;/code&gt; we can optionally include the value of a POST field (for example the username). We could include things like the request path here to apply different rate limit counters to different URLs.&lt;/p&gt;
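&lt;p&gt;To illustrate the request path idea, here is a minimal sketch of such a subclass. The &lt;code&gt;ratelimit&lt;/code&gt; class below is only a hypothetical stand-in for the real base class (it assumes the default key is just the IP address from &lt;code&gt;REMOTE_ADDR&lt;/code&gt;), and &lt;code&gt;ratelimit_per_path&lt;/code&gt; and &lt;code&gt;FakeRequest&lt;/code&gt; are invented names, included so the example runs without Django.&lt;/p&gt;

```python
class ratelimit(object):
    """Hypothetical stand-in for ratelimitcache.ratelimit (illustration only)."""

    def key_extra(self, request):
        # Assumed default behaviour: the key is based on the client's IP address.
        return request.META.get("REMOTE_ADDR", "")


class ratelimit_per_path(ratelimit):
    "Keep a separate set of counters for each URL path"

    def key_extra(self, request):
        # Start from the default (IP address) and append the request path.
        extra = super(ratelimit_per_path, self).key_extra(request)
        return extra + "-" + request.path


class FakeRequest(object):
    """Bare-bones request object with just the attributes used above."""

    def __init__(self, ip, path):
        self.META = {"REMOTE_ADDR": ip}
        self.path = path
```

&lt;p&gt;With this in place, &lt;code&gt;/login/&lt;/code&gt; and &lt;code&gt;/search/&lt;/code&gt; get independent counters for the same IP address. Memcached keys are limited to 250 bytes and can't contain whitespace, so for long or unusual paths it would make sense to hash the path first, as &lt;code&gt;ratelimit_post&lt;/code&gt; does with its POST field.&lt;/p&gt;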

&lt;p&gt;Finally, the readme.txt includes &lt;code&gt;ratelimit_with_logging&lt;/code&gt;, an example that overrides the &lt;code&gt;disallowed()&lt;/code&gt; view returned when a rate limiting condition fails and writes an audit note to a database (less overhead than writing for every request).&lt;/p&gt;

&lt;p&gt;I've been a fan of customisation via subclassing ever since I got to know the new Django admin system, and I've been using it in a bunch of projects. It's a great way to create reusable pieces of code.&lt;/p&gt;

    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/counters"&gt;counters&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ratelimitcache"&gt;ratelimitcache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rate-limiting"&gt;rate-limiting&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="counters"/><category term="django"/><category term="github"/><category term="memcached"/><category term="projects"/><category term="python"/><category term="ratelimitcache"/><category term="rate-limiting"/><category term="security"/><category term="twitter"/></entry><entry><title>Scaling memcached at Facebook</title><link href="https://simonwillison.net/2008/Dec/13/engineering/#atom-tag" rel="alternate"/><published>2008-12-13T10:08:33+00:00</published><updated>2008-12-13T10:08:33+00:00</updated><id>https://simonwillison.net/2008/Dec/13/engineering/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.facebook.com/note.php?note_id=39391378919"&gt;Scaling memcached at Facebook&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Fascinating techie details on how Facebook forked memcache to use UDP and increase performance from 50,000 requests a second to 200,000. Now running on 800 servers with 28 TB of memory, and their code is on GitHub. (They may scale like crazy, but they can’t put their blog entry title in the title element?)


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/udp"&gt;udp&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="memcached"/><category term="scaling"/><category term="udp"/></entry><entry><title>Facebook engineering notes on Scaling Out</title><link href="https://simonwillison.net/2008/Aug/20/engineering/#atom-tag" rel="alternate"/><published>2008-08-20T23:51:31+00:00</published><updated>2008-08-20T23:51:31+00:00</updated><id>https://simonwillison.net/2008/Aug/20/engineering/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.facebook.com/notes.php?id=9445547199"&gt;Facebook engineering notes on Scaling Out&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Jason Sobel explains a couple of tricks Facebook use to deal with consistency between their California and Virginia data centres. The first is to hijack the MySQL replication stream to include information about memcached records to invalidate; the second is to use Layer 7 load balancers which inspect a “last modification time” cookie and send users to the masters in California if they have updated their profile in the past 20 seconds.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/facebook"&gt;facebook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jason-sobel"&gt;jason-sobel&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/replication"&gt;replication&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;&lt;/p&gt;



</summary><category term="facebook"/><category term="jason-sobel"/><category term="memcached"/><category term="mysql"/><category term="replication"/><category term="scaling"/></entry><entry><title>Velocity: A Distributed In-Memory Cache from Microsoft</title><link href="https://simonwillison.net/2008/Jun/6/dare/#atom-tag" rel="alternate"/><published>2008-06-06T21:52:04+00:00</published><updated>2008-06-06T21:52:04+00:00</updated><id>https://simonwillison.net/2008/Jun/6/dare/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.25hoursaday.com/weblog/2008/06/06/VelocityADistributedInMemoryCacheFromMicrosoft.aspx"&gt;Velocity: A Distributed In-Memory Cache from Microsoft&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I’d been wondering what Microsoft ecosystem developers were using in the absence of memcached. Is Velocity the first Windows platform implementation of this idea?


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dare-obasanjo"&gt;dare-obasanjo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microsoft"&gt;microsoft&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/velocity"&gt;velocity&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/windows"&gt;windows&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="dare-obasanjo"/><category term="memcached"/><category term="microsoft"/><category term="velocity"/><category term="windows"/></entry><entry><title>App Engine Fan: Efficient Global Counters</title><link href="https://simonwillison.net/2008/Jun/3/app/#atom-tag" rel="alternate"/><published>2008-06-03T00:56:54+00:00</published><updated>2008-06-03T00:56:54+00:00</updated><id>https://simonwillison.net/2008/Jun/3/app/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.appenginefan.com/2008/06/efficient-global-counters.html"&gt;App Engine Fan: Efficient Global Counters&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Implementing efficient counters in Google App Engine, using shards and/or memcached.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/counters"&gt;counters&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google-app-engine"&gt;google-app-engine&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;&lt;/p&gt;



</summary><category term="counters"/><category term="google-app-engine"/><category term="memcached"/></entry><entry><title>so-you-wanna-see-an-image</title><link href="https://simonwillison.net/2008/May/1/codeword/#atom-tag" rel="alternate"/><published>2008-05-01T10:13:09+00:00</published><updated>2008-05-01T10:13:09+00:00</updated><id>https://simonwillison.net/2008/May/1/codeword/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://blog.apokalyptik.com/2007/10/10/so-you-wanna-see-an-image/"&gt;so-you-wanna-see-an-image&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
WordPress.com use Amazon S3 to store images (presumably to save having to create a massive scalable redundant filesystem themselves) but the images are served via a load balanced memcached / varnishd caching system that they control.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://barry.wordpress.com/2007/11/01/static-hostname-hashing-in-pound/"&gt;Static hostname hashing in Pound&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/amazon-s3"&gt;amazon-s3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/s3"&gt;s3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/varnish"&gt;varnish&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wordpresscom"&gt;wordpresscom&lt;/a&gt;&lt;/p&gt;



</summary><category term="amazon-s3"/><category term="caching"/><category term="memcached"/><category term="s3"/><category term="varnish"/><category term="wordpresscom"/></entry><entry><title>Nginx and Memcached, a 400% boost!</title><link href="https://simonwillison.net/2008/Feb/11/nginx/#atom-tag" rel="alternate"/><published>2008-02-11T22:05:11+00:00</published><updated>2008-02-11T22:05:11+00:00</updated><id>https://simonwillison.net/2008/Feb/11/nginx/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.igvita.com/2008/02/11/nginx-and-memcached-a-400-boost/"&gt;Nginx and Memcached, a 400% boost!&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Ilya Grigorik wrote up my current favourite nginx trick: you set nginx to check memcached for a cache entry matching the current URL on every hit, then invalidate your cache by pushing a new cache record straight into memcached from your application server.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ilyagrigorik"&gt;ilyagrigorik&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/performance"&gt;performance&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="ilyagrigorik"/><category term="memcached"/><category term="nginx"/><category term="performance"/></entry><entry><title>RubyForge: Starling</title><link href="https://simonwillison.net/2008/Jan/11/rubyforge/#atom-tag" rel="alternate"/><published>2008-01-11T21:47:26+00:00</published><updated>2008-01-11T21:47:26+00:00</updated><id>https://simonwillison.net/2008/Jan/11/rubyforge/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://rubyforge.org/projects/starling/"&gt;RubyForge: Starling&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“Starling is a light-weight persistent queue server that speaks the MemCache protocol. It was built to drive Twitter’s backend, and is in production across Twitter’s cluster.”


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blaine-cook"&gt;blaine-cook&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/message-queues"&gt;message-queues&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/messaging"&gt;messaging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/queue"&gt;queue&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ruby"&gt;ruby&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rubyforge"&gt;rubyforge&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/starling"&gt;starling&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/twitter"&gt;twitter&lt;/a&gt;&lt;/p&gt;



</summary><category term="blaine-cook"/><category term="memcached"/><category term="message-queues"/><category term="messaging"/><category term="queue"/><category term="ruby"/><category term="rubyforge"/><category term="starling"/><category term="twitter"/></entry><entry><title>NginxMemcachedModule</title><link href="https://simonwillison.net/2007/Dec/15/nginxmemcachedmodule/#atom-tag" rel="alternate"/><published>2007-12-15T01:59:23+00:00</published><updated>2007-12-15T01:59:23+00:00</updated><id>https://simonwillison.net/2007/Dec/15/nginxmemcachedmodule/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://wiki.codemongers.com/NginxMemcachedModule"&gt;NginxMemcachedModule&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
nginx can be set up to directly serve a URL from memcache if the corresponding cache key is set, and fall back to a backend application server otherwise. Application servers can then write directly to memcache when content needs to be cached or goes stale.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/caching"&gt;caching&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcache"&gt;memcache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nginx"&gt;nginx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scaling"&gt;scaling&lt;/a&gt;&lt;/p&gt;



</summary><category term="caching"/><category term="memcache"/><category term="memcached"/><category term="nginx"/><category term="scaling"/></entry><entry><title>A Django Cache Status</title><link href="https://simonwillison.net/2007/Aug/25/django/#atom-tag" rel="alternate"/><published>2007-08-25T14:08:56+00:00</published><updated>2007-08-25T14:08:56+00:00</updated><id>https://simonwillison.net/2007/Aug/25/django/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://effbot.org/zone/django-memcached-view.htm"&gt;A Django Cache Status&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Django view to display stats pulled from your memcached server.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fredrik-lundh"&gt;fredrik-lundh&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/memcached"&gt;memcached&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="fredrik-lundh"/><category term="memcached"/></entry></feed>