<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: scrapy</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/scrapy.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2010-01-27T12:27:03+00:00</updated><author><name>Simon Willison</name></author><entry><title>World Government Data</title><link href="https://simonwillison.net/2010/Jan/27/world/#atom-tag" rel="alternate"/><published>2010-01-27T12:27:03+00:00</published><updated>2010-01-27T12:27:03+00:00</updated><id>https://simonwillison.net/2010/Jan/27/world/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.guardian.co.uk/world-government-data"&gt;World Government Data&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Launched last week, this is the Guardian’s meta-search engine for searching and browsing data from four government data sites (with more planned). Under the hood it’s Django, Solr, Haystack and the Scrapy crawling library. Ben Firshman built the application during an internship over Christmas.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ben-firshman"&gt;ben-firshman&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/data"&gt;data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datagovuk"&gt;datagovuk&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/guardian"&gt;guardian&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/haystack"&gt;haystack&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/scrapy"&gt;scrapy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/solr"&gt;solr&lt;/a&gt;&lt;/p&gt;



</summary><category term="ben-firshman"/><category term="data"/><category term="datagovuk"/><category term="django"/><category term="guardian"/><category term="haystack"/><category term="projects"/><category term="python"/><category term="scrapy"/><category term="solr"/></entry></feed>