<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: shapefiles</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/shapefiles.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-03-23T03:44:22+00:00</updated><author><name>Simon Willison</name></author><entry><title>mapshaper.org</title><link href="https://simonwillison.net/2024/Mar/23/mapshaperorg/#atom-tag" rel="alternate"/><published>2024-03-23T03:44:22+00:00</published><updated>2024-03-23T03:44:22+00:00</updated><id>https://simonwillison.net/2024/Mar/23/mapshaperorg/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mapshaper.org/"&gt;mapshaper.org&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
It turns out the mapshaper CLI tool for manipulating geospatial data—including converting shapefiles to GeoJSON and back again—also has a web UI that runs the conversions entirely in your browser. If you need to convert between those (and other) formats it’s hard to imagine a more convenient option.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/adamrpearce/status/1771378836128854097"&gt;@adamrpearce&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="geospatial"/><category term="javascript"/><category term="shapefiles"/><category term="geojson"/></entry><entry><title>Claude and ChatGPT for ad-hoc sidequests</title><link href="https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/#atom-tag" rel="alternate"/><published>2024-03-22T19:44:12+00:00</published><updated>2024-03-22T19:44:12+00:00</updated><id>https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/#atom-tag</id><summary type="html">
    &lt;p&gt;Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis.&lt;/p&gt;
&lt;p&gt;I recently learned that the &lt;a href="https://en.wikipedia.org/wiki/Adirondack_Park"&gt;Adirondack Park&lt;/a&gt; is the single largest park in the contiguous United States, taking up a fifth of the state of New York.&lt;/p&gt;
&lt;p&gt;Naturally, my first thought was that it would be neat to have a GeoJSON file representing the boundary of the park.&lt;/p&gt;
&lt;p&gt;A quick search landed me on the &lt;a href="https://apa.ny.gov/gis/ApaData.html"&gt;Adirondack Park Agency GIS data page&lt;/a&gt;, which offered me a shapefile of the "Outer boundary of the New York State Adirondack Park as described in Section 9-0101 of the New York Environmental Conservation Law". Sounds good!&lt;/p&gt;
&lt;p&gt;I knew there were tools for converting shapefiles to GeoJSON, but I couldn't remember what they were. Since I had a terminal window open already, I typed the following:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre style="white-space: pre-wrap"&gt;llm -m opus -c &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;give me options on macOS for CLI tools to turn a shapefile into GeoJSON&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here I am using my &lt;a href="https://llm.datasette.io/"&gt;LLM tool&lt;/a&gt; (and &lt;a href="https://github.com/simonw/llm-claude-3"&gt;llm-claude-3&lt;/a&gt; plugin) to run a prompt through the new &lt;a href="https://www.anthropic.com/news/claude-3-family"&gt;Claude 3 Opus&lt;/a&gt;, my current favorite language model.&lt;/p&gt;
&lt;p&gt;It &lt;a href="https://gist.github.com/simonw/331918e46f33e27e997afb4e7c62fc74"&gt;replied with a couple of options&lt;/a&gt;, but the first was this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;ogr2ogr -f GeoJSON output.geojson input.shp&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So I ran that against the shapefile, and then pasted &lt;a href="https://gist.github.com/simonw/c941f3454cdec7e10f500dc5a752b614"&gt;the resulting GeoJSON&lt;/a&gt; into &lt;a href="https://geojson.io/"&gt;geojson.io&lt;/a&gt; to check if it worked... and nothing displayed. Then I looked at the GeoJSON and spotted this:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;"coordinates": [ [ -8358911.527799999341369, 5379193.197800002992153 ] ...&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;That didn't look right. Those coordinates aren't the correct scale for latitude and longitude values.&lt;/p&gt;
&lt;p&gt;So I sent a follow-up prompt to the model (the &lt;code&gt;-c&lt;/code&gt; option means "continue previous conversation"):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre style="white-space: pre-wrap"&gt;llm -c &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;i tried using ogr2ogr but it gave me back GeoJSON with a weird coordinate system that was not lat/lon that i am used to&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It suggested this new command:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;ogr2ogr -f GeoJSON -t_srs EPSG:4326 output.geojson input.shp&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This time &lt;a href="https://gist.github.com/simonw/6c4cf102a8ea532dc365c2773f0eb6ea"&gt;it worked&lt;/a&gt;! The shapefile has now been converted to GeoJSON.&lt;/p&gt;
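&lt;p&gt;For a rough idea of what that reprojection does: the numbers in the first output are about the size you would expect from Web Mercator (EPSG:3857) metres. That is an assumption here, since neither the shapefile description nor the transcript names the source projection. Under that assumption, converting to EPSG:4326 comes down to the inverse spherical Mercator formula. A minimal Python sketch, with a made-up function name:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import math

# Assumption: the source data is Web Mercator (EPSG:3857), which uses this sphere radius.
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lon_lat(x, y):
    """Convert Web Mercator metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# First coordinate pair from the GeoJSON that failed to display
lon, lat = web_mercator_to_lon_lat(-8358911.5278, 5379193.1978)
print(round(lon, 2), round(lat, 2))
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This prints a point near 75°W, 43.4°N, which is in upstate New York and so at least consistent with the guess about the projection.&lt;/p&gt;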
&lt;p&gt;Time elapsed so far: 2.5 minutes (I can tell from &lt;a href="https://llm.datasette.io/en/stable/logging.html"&gt;my LLM logs&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;I pasted it into &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; (with &lt;a href="https://github.com/datasette/datasette-paste"&gt;datasette-paste&lt;/a&gt; and &lt;a href="https://datasette.io/plugins/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt;) to take a look at it more closely, and got this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/datasette-adirondack-boundaries.jpg" alt="A Datasette table with 106 rows. The first two are shown - both have properties and a geometry, and the geometry is a single line on a map. The first one has a ECL_Text of thence southerly along the westerly line of lots 223, 241, 259, 276, 293, 309, 325 and 340 to the southwesterly corner of lot number 340 in the Brantingham Tract and the second has thence westerly along the northern line of lots 204 and 203 to the midpoint of the northern line of lot 203" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;That's not a single polygon! That's 106 line segments... and they are fascinating. Look at those descriptions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;thence westerly along the northern line of lots 204 and 203 to the midpoint of the northern line of lot 203&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is utterly delightful. The shapefile description did say "as described in Section 9-0101 of the New York Environmental Conservation Law", so I guess this is how you write geographic boundaries into law!&lt;/p&gt;
&lt;p&gt;But it's not what I wanted. I want a single polygon of the whole park, not 106 separate lines.&lt;/p&gt;
&lt;p&gt;I decided to switch models. ChatGPT has access to Code Interpreter, and I happen to know that Code Interpreter is quite effective at processing GeoJSON.&lt;/p&gt;
&lt;p&gt;I opened a new ChatGPT (with GPT-4) browser tab, uploaded my GeoJSON file and prompted it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This GeoJSON file is full of line segments. Use them to create me a single shape that is a Polygon&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-1.jpg" alt="ChatGPT screenshot - it shows some Python code with a result of &amp;lt;shapely.geometry.polygon.Polygon at 0x7eba83f9fca0 /&amp;gt;, then says: I've created a polygon from the line segments in the GeoJSON file. You can now use this polygon for further analysis or visualization. If you have specific requirements for the polygon or need it in a particular format, please let me know!" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;OK, so it wrote some Python code and ran it. But did it work?&lt;/p&gt;
&lt;p&gt;I happen to know that Code Interpreter can save files to disk and provide links to download them, so I told it to do that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Save it to a GeoJSON file for me to download&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-2.jpg" alt="ChatGPT screenshot - this time it writes more Python code to define a GeoJSON polygon, then saves that to a file called /mnt/data/polygon.geojson and gives me a link to download it." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I pasted &lt;a href="https://gist.github.com/simonw/c1002dbf5249de7addd0b65cb774d3e9"&gt;that&lt;/a&gt; into &lt;a href="https://geojson.io/"&gt;geojson.io&lt;/a&gt;, and it was clearly wrong:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-bad-map.jpg" alt="geojson.io screenshot - a triangle shape sits on top of an area of upstate New York, clearly not in the shape of the park" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;So I told it to try again. I didn't think very hard about this prompt, I basically went with a version of "do better":&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;that doesn't look right to me, check that it has all of the lines in it&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-3.jpg" alt="ChatGPT screenshot - it writes more Python code and outputs a link to complete_polygon.geojson" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;It gave me a new file, optimistically named &lt;code&gt;complete_polygon.geojson&lt;/code&gt;. Here's what that one looked like:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-almost.jpg" alt="Map screenshot - the right hand boundary of the park looks correct, but the rest of the shape is scrambled" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This is getting a lot closer! Note how the right hand boundary of the park looks correct, but the rest of the image is scrambled.&lt;/p&gt;
&lt;p&gt;I had a hunch about the fix. I pasted in a screenshot of where we were so far and added my hunch about the solution:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;That almost works but you need to sort the line segments first, it looked like this:&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Honestly, pasting in the screenshot probably wasn't necessary here, but it amused me.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-4.jpg" alt="That almost works but you need to sort the line segments first, it looked like this: and a screenshot of a map" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;... and ChatGPT churned away again ...&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-chatgpt-5.jpg" alt="More Python code - link to the full transcript is below" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://gist.github.com/simonw/b9e4325b76e4a3813ff5482aa278c342"&gt;sorted_polygon.geojson&lt;/a&gt; is spot on! Here's what it looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/shapely-good-map.jpg" alt="A shaded polygon showing the exact shape of the boundary of Adirondack Park, overlayed on a map of the area" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Total time spent in ChatGPT: 3 minutes and 35 seconds. Plus 2.5 minutes with Claude 3 earlier, so an overall total of just over 6 minutes.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/0343cdd3568bbe28cad15d1097b1b1c7"&gt;the full Claude transcript&lt;/a&gt; and the &lt;a href="https://gist.github.com/simonw/3eb845823c5ad4c48d2b4eb7586f1533"&gt;full transcript from ChatGPT&lt;/a&gt;.&lt;/p&gt;
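&lt;p&gt;The "sort the line segments first" fix amounts to chaining the segments end to end before closing the ring. The following is a minimal pure-Python sketch of that idea, not the code ChatGPT wrote (that is in the transcript). It assumes each segment's endpoint exactly equals the start or end of another segment, which real survey data may not guarantee, and the function name and sample square are made up for illustration:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import json


def chain_segments(segments):
    """Order line segments end to end and return one closed ring of points."""
    remaining = [list(segment) for segment in segments]
    ring = remaining.pop(0)
    while remaining:
        tail = ring[-1]
        for index, segment in enumerate(remaining):
            if segment[0] == tail:
                # Segment already runs in the right direction
                ring.extend(segment[1:])
            elif segment[-1] == tail:
                # Segment runs backwards, so walk it in reverse
                ring.extend(reversed(segment[:-1]))
            else:
                continue
            remaining.pop(index)
            break
        else:
            raise ValueError("no segment continues from %r" % (tail,))
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # GeoJSON rings must repeat the first point
    return ring


# Hypothetical input: four sides of a unit square, out of order, one reversed
segments = [
    [(0, 0), (1, 0)],
    [(1, 1), (0, 1)],
    [(1, 1), (1, 0)],
    [(0, 1), (0, 0)],
]
polygon = {"type": "Polygon", "coordinates": [chain_segments(segments)]}
print(json.dumps(polygon))
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Simply concatenating the segments in file order, without this chaining step, is one plausible way to end up with a scrambled shape like the one in the earlier attempt.&lt;/p&gt;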
&lt;h4 id="not-notable"&gt;This isn't notable&lt;/h4&gt;
&lt;p&gt;The most notable thing about this example is how completely &lt;em&gt;not&lt;/em&gt; notable it is.&lt;/p&gt;
&lt;p&gt;I get results like this from these tools several times a day. I'm not at all surprised that this worked; in fact, I would've been mildly surprised if it had not.&lt;/p&gt;
&lt;p&gt;Could I have done this without LLM assistance? Yes, but not nearly as quickly. And this was not a task on my critical path for the day - it was a sidequest at best and honestly more of a distraction.&lt;/p&gt;
&lt;p&gt;So, without LLM tools, I would likely have given this one up at the first hurdle.&lt;/p&gt;
&lt;p&gt;A year ago I wrote about how &lt;a href="https://simonwillison.net/2023/Mar/27/ai-enhanced-development/"&gt;AI-enhanced development makes me more ambitious with my projects&lt;/a&gt;. These tools are now so firmly baked into my daily work that they influence not just side projects but tiny sidequests like this one as well.&lt;/p&gt;
&lt;h4 id="not-simple"&gt;This certainly wasn't simple&lt;/h4&gt;
&lt;p&gt;Something else I like about this example is that it illustrates quite how much depth there is to getting great results out of these systems.&lt;/p&gt;
&lt;p&gt;In those few minutes I used two different interfaces to call two different models. I sent multiple follow-up prompts. I triggered Code Interpreter, took advantage of GPT-4 Vision and mixed in external tools like &lt;a href="https://geojson.io/"&gt;geojson.io&lt;/a&gt; and Datasette as well.&lt;/p&gt;
&lt;p&gt;I leaned a lot on my existing knowledge and experience:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I knew that tools existed for command-line processing of shapefiles and GeoJSON&lt;/li&gt;
&lt;li&gt;I instinctively knew that Claude 3 Opus was likely to correctly answer my initial prompt&lt;/li&gt;
&lt;li&gt;I knew the capabilities of Code Interpreter, including that it has libraries that can process geometries, what to say to get it to kick into action and how to get it to give me files to download&lt;/li&gt;
&lt;li&gt;My limited GIS knowledge was strong enough to spot a likely coordinate system problem, and I guessed the fix for the jumbled lines&lt;/li&gt;
&lt;li&gt;My prompting intuition is developed to the point that I didn't have to think very hard about what to say to get the best results&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you have the right combination of domain knowledge and hard-won experience driving LLMs, you can &lt;em&gt;fly&lt;/em&gt; with these things.&lt;/p&gt;
&lt;h4 id="a-bit-trivial"&gt;Isn't this a bit trivial?&lt;/h4&gt;
&lt;p&gt;Yes it is, and that's the point. This was a five minute sidequest. Writing about it here took ten times longer than the exercise itself.&lt;/p&gt;
&lt;p&gt;I take on LLM-assisted sidequests like this one dozens of times a week. Many of them are substantially larger and more useful. They are having a very material impact on my work: I can get more done and solve much more interesting problems, because I'm not wasting valuable cycles figuring out &lt;code&gt;ogr2ogr&lt;/code&gt; invocations or mucking around with polygon libraries.&lt;/p&gt;
&lt;p&gt;Not to mention that I find working this way &lt;em&gt;fun&lt;/em&gt;! It feels like science fiction every time I do it. Our AI-assisted future is here right now and I'm still finding it weird, fascinating and deeply entertaining.&lt;/p&gt;
&lt;h4 id="llms-are-useful"&gt;LLMs are useful&lt;/h4&gt;
&lt;p&gt;There are many legitimate criticisms of LLMs. The copyright issues involved in their training, their enormous power consumption and the risks of people trusting them when they shouldn't (considering both accuracy and bias) are three that I think about a lot.&lt;/p&gt;
&lt;p&gt;The one criticism I won't accept is that they aren't &lt;em&gt;useful&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;One of the greatest misconceptions concerning LLMs is the idea that they are easy to use. They really aren't: getting great results out of them requires a great deal of experience and hard-fought intuition, combined with deep domain knowledge of the problem you are applying them to.&lt;/p&gt;
&lt;p&gt;I use these things every day. They help me take on much more interesting and ambitious problems than I could otherwise. I would miss them terribly if they were no longer available to me.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/code-interpreter"&gt;code-interpreter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="shapefiles"/><category term="geojson"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="claude"/><category term="code-interpreter"/><category term="coding-agents"/><category term="leaflet"/></entry><entry><title>geoBoundaries</title><link href="https://simonwillison.net/2022/Mar/24/geoboundaries/#atom-tag" rel="alternate"/><published>2022-03-24T14:03:56+00:00</published><updated>2022-03-24T14:03:56+00:00</updated><id>https://simonwillison.net/2022/Mar/24/geoboundaries/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.geoboundaries.org/"&gt;geoBoundaries&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This looks useful: “The world’s largest open, free and research-ready database of political administrative boundaries.” Founded by the geoLab at William &amp;amp; Mary university, and released under a Creative Commons Attribution license that includes a requirement for a citation. File formats offered include shapefiles, GeoJSON and TopoJSON.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="shapefiles"/><category term="geojson"/></entry><entry><title>Weeknotes: datasette-export-notebook, PyInstaller packaged Datasette, CBSAs</title><link href="https://simonwillison.net/2021/Jan/10/weeknotes/#atom-tag" rel="alternate"/><published>2021-01-10T00:26:13+00:00</published><updated>2021-01-10T00:26:13+00:00</updated><id>https://simonwillison.net/2021/Jan/10/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;What a &lt;a href="https://en.wikipedia.org/wiki/2021_storming_of_the_United_States_Capitol"&gt;terrible week&lt;/a&gt;. I've found it hard to concentrate on anything substantial. In a mostly futile attempt to distract myself from doomscrolling I've mainly been building some experimental output plugins, fiddling with PyInstaller and messing around with shapefiles.&lt;/p&gt;
&lt;h4&gt;Packaged Datasette with PyInstaller&lt;/h4&gt;
&lt;p&gt;A long-running goal for Datasette has been to make it as easy to install as possible - something that's not particularly straightforward for applications written in Python, at least in comparison to toolchains like Rust, Go or &lt;a href="https://deno.land/manual@v1.6.3/tools/compiler"&gt;Deno&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Back in November 2017 Raffaele Messuti &lt;a href="https://github.com/simonw/datasette/issues/93"&gt;suggested using PyInstaller&lt;/a&gt; for this. I revisited that issue while looking through open Datasette issues ordered by &lt;a href="https://github.com/simonw/datasette/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-asc"&gt;least recently updated&lt;/a&gt; and decided to try it out again - and it worked! Here's &lt;a href="https://til.simonwillison.net/python/packaging-pyinstaller"&gt;the resulting TIL&lt;/a&gt;, and I've attached a bundled &lt;code&gt;datasette&lt;/code&gt; macOS binary file to the &lt;a href="https://github.com/simonw/datasette/releases/tag/0.53"&gt;0.53 release&lt;/a&gt; on GitHub.&lt;/p&gt;
&lt;p&gt;There's one catch: the binary isn't signed, which means it shows security warnings that have to be worked around if you try to run it on macOS. I've started looking into the signing process - I'm going to need an Apple Developer account and to jump through a bunch of different hoops, but it looks like I should be able to get that working. Here's my issue for that: &lt;a href="https://github.com/simonw/datasette/issues/1171"&gt;GitHub Actions workflow to build and sign macOS binary executables&lt;/a&gt; - it looks like &lt;a href="https://github.com/mitchellh/gon"&gt;gon&lt;/a&gt; is the missing automation piece I need.&lt;/p&gt;
&lt;p&gt;One thing that &lt;em&gt;really&lt;/em&gt; impressed me about &lt;a href="https://www.pyinstaller.org/"&gt;PyInstaller&lt;/a&gt; is the size of the resulting file. On both macOS and Linux it was able to create a roughly 8MB file containing Datasette, all of its dependencies AND a working Python environment. It's pretty magic!&lt;/p&gt;
&lt;h4&gt;datasette-css-properties&lt;/h4&gt;
&lt;p&gt;I wrote this up in detail &lt;a href="https://simonwillison.net/2021/Jan/7/css-apis-no-javascript/"&gt;a couple of days ago&lt;/a&gt;: &lt;a href="https://datasette.io/plugins/datasette-css-properties"&gt;datasette-css-properties&lt;/a&gt; is an amusingly weird output plugin that turns the results of a SQL query into CSS custom property definitions which can then be used to style or insert content into the current page.&lt;/p&gt;
&lt;h4&gt;sqlite-utils 3.2&lt;/h4&gt;
&lt;p&gt;The big new feature in this release is &lt;a href="https://sqlite-utils.datasette.io/en/stable/python-api.html#cached-table-counts-using-triggers"&gt;cached table counts using triggers&lt;/a&gt;, which I &lt;a href="https://simonwillison.net/2021/Jan/3/weeknotes/#cached-table-counts"&gt;described last week&lt;/a&gt;. Full &lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-2"&gt;release notes here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've opened &lt;a href="https://github.com/simonw/datasette/issues/1183"&gt;an issue&lt;/a&gt; to take advantage of this optimization in Datasette itself.&lt;/p&gt;
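&lt;p&gt;To illustrate the general idea of trigger-maintained counts, here is a small sketch using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. It is not the sqlite-utils implementation: the &lt;code&gt;cached_counts&lt;/code&gt; table, the trigger names and both helper functions are hypothetical, chosen only for this example.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import sqlite3


def enable_cached_counts(conn, table):
    """Record the current row count for a table and keep it updated with triggers."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS cached_counts (
            name TEXT PRIMARY KEY,
            total INTEGER NOT NULL
        );
        INSERT OR REPLACE INTO cached_counts
            SELECT '{t}', count(*) FROM [{t}];
        CREATE TRIGGER IF NOT EXISTS [{t}_count_insert] AFTER INSERT ON [{t}]
        BEGIN
            UPDATE cached_counts SET total = total + 1 WHERE name = '{t}';
        END;
        CREATE TRIGGER IF NOT EXISTS [{t}_count_delete] AFTER DELETE ON [{t}]
        BEGIN
            UPDATE cached_counts SET total = total - 1 WHERE name = '{t}';
        END;
        """.format(t=table)
    )


def cached_count(conn, table):
    """Read the stored count without scanning the table itself."""
    row = conn.execute(
        "SELECT total FROM cached_counts WHERE name = ?", (table,)
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parks (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO parks (name) VALUES (?)", [("a",), ("b",), ("c",)])
enable_cached_counts(conn, "parks")
conn.execute("INSERT INTO parks (name) VALUES ('d')")
conn.execute("DELETE FROM parks WHERE name = 'a'")
print(cached_count(conn, "parks"))
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The trade-off is a little extra work on every insert and delete in exchange for a count that can be read from a single small row.&lt;/p&gt;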
&lt;h4&gt;datasette-export-notebook&lt;/h4&gt;
&lt;p&gt;This is an idea I've been bouncing around for a while, and during a bout of attempted-coup-induced insomnia I decided to sketch out an initial version.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://datasette.io/plugins/datasette-export-notebook"&gt;datasette-export-notebook&lt;/a&gt; is a plugin that adds an export-to-notebook option to any table or query.&lt;/p&gt;
&lt;p&gt;This provides a page of documentation with copy-and-paste examples for loading data from that table or query into a &lt;a href="https://jupyter.org/"&gt;Jupyter&lt;/a&gt; or &lt;a href="https://observablehq.com/"&gt;Observable&lt;/a&gt; notebook.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://latest-with-plugins.datasette.io/github/commits.Notebook?repo=107914493"&gt;a live demo&lt;/a&gt;. The interface currently looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of the interface, showing copy-and-paste examples for Jupyter and Observable imports" src="https://static.simonwillison.net/static/2021/Export_to_a_notebook.png" style="max-width:100%;" /&gt;&lt;/p&gt;
&lt;p&gt;As often happens when building even simple plugins like this I identified some &lt;a href="https://github.com/simonw/datasette/issues/1179"&gt;small&lt;/a&gt; &lt;a href="https://github.com/simonw/datasette/issues/1177"&gt;improvements&lt;/a&gt; I can make to Datasette.&lt;/p&gt;
&lt;h4 id="cbsa-datasette"&gt;cbsa-datasette&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Core-based_statistical_area"&gt;Core-based statistical areas&lt;/a&gt; are a US government concept used for various statistical purposes. They are essentially metropolitan areas, based on central cities and the commuting area that they sit inside.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://cbsa.datasettes.com/"&gt;cbsa.datasettes.com&lt;/a&gt; this week to provide an API for looking up a CBSA based on a latitude and longitude point. Here's a location within San Francisco for example:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://cbsa.datasettes.com/core/by_lat_lon?longitude=-122.51&amp;amp;latitude=37.78"&gt;https://cbsa.datasettes.com/core/by_lat_lon?longitude=-122.51&amp;amp;latitude=37.78&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This returns &lt;code&gt;San Francisco-Oakland-Berkeley, CA&lt;/code&gt;. Add &lt;code&gt;.json&lt;/code&gt; and &lt;code&gt;&amp;amp;_shape=array&lt;/code&gt; to the above URL to get &lt;a href="https://cbsa.datasettes.com/core/by_lat_lon.json?longitude=-122.51&amp;amp;latitude=37.78&amp;amp;_shape=array"&gt;a JSON API version&lt;/a&gt;.&lt;/p&gt;
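&lt;p&gt;Calling that JSON endpoint from Python could look something like the sketch below. The URL and query string parameters are the ones shown above; the function names are invented for illustration, and the shape of the returned rows is not assumed beyond "whatever JSON the server sends back". The fetch step needs network access and depends on the demo server still being online.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://cbsa.datasettes.com/core/by_lat_lon.json"


def cbsa_lookup_url(latitude, longitude):
    """Build the JSON API URL for a latitude/longitude lookup."""
    query = urlencode(
        {"longitude": longitude, "latitude": latitude, "_shape": "array"}
    )
    return BASE_URL + "?" + query


def fetch_cbsa(latitude, longitude):
    """Fetch and parse the matching rows (requires network access)."""
    with urlopen(cbsa_lookup_url(latitude, longitude)) as response:
        return json.load(response)


# Only builds and prints the URL, so this runs without a network connection
print(cbsa_lookup_url(37.78, -122.51))
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using &lt;code&gt;urlencode&lt;/code&gt; avoids having to assemble the query string by hand.&lt;/p&gt;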
&lt;p&gt;The data comes from a shapefile &lt;a href="https://data-usdot.opendata.arcgis.com/datasets/b0d0e777e2ad4b53803dbc0527c73d88_0"&gt;published by the Bureau of Transportation Statistics&lt;/a&gt;. I'm using &lt;a href="https://datasette.io/tools/shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/a&gt; to import it into a SpatiaLite database, then publishing it to Cloud Run using &lt;a href="https://github.com/simonw/cbsa-datasette/blob/main/.github/workflows/deploy.yml"&gt;this GitHub Actions workflow&lt;/a&gt;. Full details in &lt;a href="https://github.com/simonw/cbsa-datasette"&gt;the README&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I built this mainly to act as a simple updated example of how to use Datasette and SpatiaLite to provide an API against data from a shapefile. I published a tutorial about &lt;a href="https://simonwillison.net/2017/Dec/12/location-time-zone-api/"&gt;doing this for timezones&lt;/a&gt; three years ago, but &lt;code&gt;shapefile-to-sqlite&lt;/code&gt; makes it much easier.&lt;/p&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-export-notebook"&gt;datasette-export-notebook&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-export-notebook/releases/tag/0.1.1"&gt;0.1.1&lt;/a&gt; - 2021-01-09
&lt;br /&gt;Datasette plugin providing instructions for exporting data to Jupyter or Observable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-css-properties"&gt;datasette-css-properties&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/datasette-css-properties/releases/tag/0.2"&gt;0.2&lt;/a&gt; - 2021-01-07
&lt;br /&gt;Experimental Datasette output plugin using CSS properties&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/strong&gt;: &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.2"&gt;3.2&lt;/a&gt; - 2021-01-03
&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://til.simonwillison.net/til/til/python_packaging-pyinstaller.md"&gt;Packaging a Python app as a standalone binary with PyInstaller&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="projects"/><category term="python"/><category term="shapefiles"/><category term="datasette"/><category term="weeknotes"/><category term="sqlite-utils"/></entry><entry><title>California Protected Areas Database in Datasette</title><link href="https://simonwillison.net/2020/Aug/21/california-protected-areas-database-datasette/#atom-tag" rel="alternate"/><published>2020-08-21T23:15:58+00:00</published><updated>2020-08-21T23:15:58+00:00</updated><id>https://simonwillison.net/2020/Aug/21/california-protected-areas-database-datasette/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://calands.datasettes.com/"&gt;California Protected Areas Database in Datasette&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I built this yesterday: it’s a Datasette interface on top of the CPAD 2020 GIS database of protected areas in California maintained by GreenInfo Network. This was a useful excuse to build a GitHub Actions flow that builds a SpatiaLite database using my shapefile-to-sqlite tool, and I fixed a few bugs in my datasette-leaflet-geojson plugin as well.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://github.com/simonw/calands-datasette"&gt;calands-datasette on GitHub&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/spatialite"&gt;spatialite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/california"&gt;california&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="projects"/><category term="shapefiles"/><category term="spatialite"/><category term="california"/><category term="datasette"/><category term="github-actions"/><category term="leaflet"/></entry><entry><title>Things I learned about shapefiles building shapefile-to-sqlite</title><link href="https://simonwillison.net/2020/Feb/19/shapefile-to-sqlite/#atom-tag" rel="alternate"/><published>2020-02-19T05:25:58+00:00</published><updated>2020-02-19T05:25:58+00:00</updated><id>https://simonwillison.net/2020/Feb/19/shapefile-to-sqlite/#atom-tag</id><summary type="html">
    &lt;p&gt;The latest in my series of &lt;a href="https://datasette.readthedocs.io/en/latest/ecosystem.html#tools-for-creating-sqlite-databases"&gt;x-to-sqlite tools&lt;/a&gt; is &lt;a href="https://github.com/simonw/shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/a&gt;. I learned a whole bunch of things about the ESRI shapefile format while building it.&lt;/p&gt;
&lt;p&gt;Governments really love ESRI shapefiles. There is a huge amount of interesting geospatial data made available in the format - &lt;a href="https://catalog.data.gov/dataset?res_format=SHP"&gt;4,614 on Data.gov&lt;/a&gt;!&lt;/p&gt;
&lt;h3 id="shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/h3&gt;
&lt;p&gt;&lt;code&gt;shapefile-to-sqlite&lt;/code&gt; loads the data from these files into a SQLite database, turning geometry properties into database columns and the geometry itself into a blob of GeoJSON. Let&amp;#39;s try it out on a shapefile containing the &lt;a href="https://catalog.data.gov/dataset/national-parks"&gt;boundaries of US national parks&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ wget http:&lt;span class="hljs-comment"&gt;//nrdata.nps.gov/programs/lands/nps_boundary.zip&lt;/span&gt;
...
Saving to: ‘nps_boundary.zip’
nps_boundary.zip                           &lt;span class="hljs-number"&gt;100&lt;/span&gt;%[=====================================================================================&amp;gt;]  &lt;span class="hljs-number"&gt;12.61&lt;/span&gt;M   &lt;span class="hljs-number"&gt;705&lt;/span&gt;KB/s    &lt;span class="hljs-keyword"&gt;in&lt;/span&gt; &lt;span class="hljs-number"&gt;22&lt;/span&gt;s     
&lt;span class="hljs-number"&gt;2020&lt;/span&gt;&lt;span class="hljs-number"&gt;-02&lt;/span&gt;&lt;span class="hljs-number"&gt;-18&lt;/span&gt; &lt;span class="hljs-number"&gt;19&lt;/span&gt;:&lt;span class="hljs-number"&gt;59&lt;/span&gt;:&lt;span class="hljs-number"&gt;22&lt;/span&gt; (&lt;span class="hljs-number"&gt;597&lt;/span&gt; KB/s) - ‘nps_boundary.zip’ saved [&lt;span class="hljs-number"&gt;13227561&lt;/span&gt;/&lt;span class="hljs-number"&gt;13227561&lt;/span&gt;]

$ unzip nps_boundary.zip 
Archive:  nps_boundary.zip
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.xml  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.dbf  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.prj  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp  
inflating: temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shx

$ shapefile-to-sqlite nps.db temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
[####################################]  &lt;span class="hljs-number"&gt;100&lt;/span&gt;%

$ datasette nps.db
Serve! files=(&lt;span class="hljs-string"&gt;'nps.db'&lt;/span&gt;,) (immutables=()) on port &lt;span class="hljs-number"&gt;8001&lt;/span&gt;
INFO:     Started server process [&lt;span class="hljs-number"&gt;33534&lt;/span&gt;]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http:&lt;span class="hljs-comment"&gt;//127.0.0.1:8001 (Press CTRL+C to quit)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I recommend installing the &lt;a href="https://github.com/simonw/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt; plugin, which will turn any column containing GeoJSON into a Leaflet map.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/nps-boundaries.jpg" alt="Screenshot of National Parks in Datasette" style="max-width: 100%" /&gt;&lt;/p&gt;
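&lt;p&gt;To make the default storage model concrete, here is a minimal sketch using only the Python standard library: each property becomes a column and the geometry is stored as a GeoJSON string. The &lt;code&gt;insert_feature&lt;/code&gt; helper, the &lt;code&gt;parks&lt;/code&gt; table and the sample record are all made up for illustration, and this is not how &lt;code&gt;shapefile-to-sqlite&lt;/code&gt; is actually implemented.&lt;/p&gt;

```python
import json
import sqlite3


def insert_feature(conn, table, properties, geometry):
    """Store one feature: properties become columns, geometry becomes GeoJSON text."""
    columns = list(properties) + ["geometry"]
    column_sql = ", ".join("[{}]".format(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # Hypothetical schema: untyped columns, created on first insert
    conn.execute("create table if not exists [{}] ({})".format(table, column_sql))
    conn.execute(
        "insert into [{}] ({}) values ({})".format(table, column_sql, placeholders),
        list(properties.values()) + [json.dumps(geometry)],
    )


conn = sqlite3.connect(":memory:")
insert_feature(
    conn,
    "parks",
    {"UNIT_NAME": "Example Park", "UNIT_TYPE": "Park"},
    {"type": "Point", "coordinates": [-77.04, 38.88]},
)
row = conn.execute("select UNIT_NAME, geometry from parks").fetchone()
```

&lt;p&gt;Storing the geometry as plain GeoJSON text means the table can be read without any SQLite extension loaded.&lt;/p&gt;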
&lt;p&gt;If you&amp;#39;ve installed SpatiaLite (&lt;a href="https://datasette.readthedocs.io/en/latest/spatialite.html#installation"&gt;installation instructions here&lt;/a&gt;) you can use the &lt;code&gt;--spatialite&lt;/code&gt; option to instead store the geometry in a SpatiaLite column, unlocking &lt;a href="http://www.gaia-gis.it/gaia-sins/spatialite-sql-latest.html"&gt;a bewildering array&lt;/a&gt; of SQL geometry functions.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ shapefile-to-sqlite nps.db temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp --spatialite --table=nps-spatialite
temp/Current_Shapes/Data_Store/&lt;span class="hljs-number"&gt;06&lt;/span&gt;&lt;span class="hljs-number"&gt;-06&lt;/span&gt;&lt;span class="hljs-number"&gt;-12&lt;/span&gt;_Posting/nps_boundary.shp
[##################################--]   &lt;span class="hljs-number"&gt;94&lt;/span&gt;%  &lt;span class="hljs-number"&gt;00&lt;/span&gt;:&lt;span class="hljs-number"&gt;00&lt;/span&gt;:&lt;span class="hljs-number"&gt;00&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I deployed a copy of the resulting database using Cloud Run:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ datasette publish cloudrun nps.db \
    -&lt;span class="ruby"&gt;-service national-parks \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-title &lt;span class="hljs-string"&gt;"National Parks"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-source_url=&lt;span class="hljs-string"&gt;"https://catalog.data.gov/dataset/national-parks"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-source=&lt;span class="hljs-string"&gt;"data.gov"&lt;/span&gt; \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-spatialite \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-install=datasette-leaflet-geojson \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-install=datasette-render-binary \
&lt;/span&gt;    -&lt;span class="ruby"&gt;-extra-options=&lt;span class="hljs-string"&gt;"--config max_returned_rows:5"&lt;/span&gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;I used &lt;code&gt;max_returned_rows:5&lt;/code&gt; there because these geometries are pretty big - without it a page with 100 rows on it can return over 90MB of HTML!&lt;/p&gt;
&lt;p&gt;You can browse the GeoJSON version of the table &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps/nps_boundary"&gt;here&lt;/a&gt; and the SpatiaLite version &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps/nps-spatialite"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The SpatiaLite version defaults to rendering each geometry as an ugly binary blob. You can convert them to GeoJSON for compatibility with &lt;code&gt;datasette-leaflet-geojson&lt;/code&gt; using the SpatiaLite &lt;code&gt;AsGeoJSON()&lt;/code&gt; function:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;select&lt;/span&gt; &lt;span class="hljs-keyword"&gt;id&lt;/span&gt;, UNIT_NAME, AsGeoJSON(geometry)
&lt;span class="hljs-keyword"&gt;from&lt;/span&gt; [nps-spatialite]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here&amp;#39;s &lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+id%2C+UNIT_NAME%2C+AsGeoJSON%28geometry%29+from+%5Bnps-spatialite%5D"&gt;the result&lt;/a&gt; of that query running against the demo.&lt;/p&gt;
&lt;h3 id="understanding-shapefiles"&gt;Understanding shapefiles&lt;/h3&gt;
&lt;p&gt;The most confusing thing about shapefiles is that they aren&amp;#39;t a single file. A shapefile comes as a minimum of three files: &lt;code&gt;foo.shp&lt;/code&gt; containing geometries, &lt;code&gt;foo.shx&lt;/code&gt; containing an index into those geometries (really more of an implementation detail) and &lt;code&gt;foo.dbf&lt;/code&gt; containing key/value properties for each geometry.&lt;/p&gt;
&lt;p&gt;They often come bundled with other files too. &lt;code&gt;foo.prj&lt;/code&gt; is a WKT projection for the data for example. Wikipedia lists &lt;a href="https://en.wikipedia.org/wiki/Shapefile#Overview"&gt;a whole bunch&lt;/a&gt; of other possibilities.&lt;/p&gt;
&lt;p&gt;As a result, shapefiles are usually distributed as a zip file. Some shapefile libraries can even read directly from a zip.&lt;/p&gt;
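&lt;p&gt;As a rough sketch of what working with such a zip involves, the following standard-library Python groups the members of an archive by base name and keeps only those that have all three mandatory components. The &lt;code&gt;find_shapefiles&lt;/code&gt; helper and the file names are hypothetical, and the zip is built in memory from empty placeholder members purely for demonstration.&lt;/p&gt;

```python
import io
import zipfile
from pathlib import PurePosixPath

REQUIRED = {".shp", ".shx", ".dbf"}


def find_shapefiles(zip_file):
    """Map each shapefile base name in a zip to the set of extensions present."""
    found = {}
    for name in zip_file.namelist():
        path = PurePosixPath(name)
        found.setdefault(str(path.with_suffix("")), set()).add(path.suffix.lower())
    # Keep only base names that have all three mandatory components
    return {base: exts for base, exts in found.items() if REQUIRED.issubset(exts)}


# Build a tiny in-memory zip with empty placeholder members for demonstration
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for ext in (".shp", ".shx", ".dbf", ".prj"):
        zf.writestr("data/parks" + ext, b"")
    zf.writestr("data/readme.txt", b"")

with zipfile.ZipFile(buffer) as zf:
    shapefiles = find_shapefiles(zf)
```

&lt;p&gt;Here &lt;code&gt;shapefiles&lt;/code&gt; contains a single entry for &lt;code&gt;data/parks&lt;/code&gt;; the stray text file is ignored because it has no matching &lt;code&gt;.shp&lt;/code&gt;, &lt;code&gt;.shx&lt;/code&gt; and &lt;code&gt;.dbf&lt;/code&gt; siblings.&lt;/p&gt;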
&lt;p&gt;The &lt;a href="https://tools.ietf.org/html/rfc7946"&gt;GeoJSON format&lt;/a&gt; was designed as a modern alternative to shapefiles, so understanding GeoJSON really helps in understanding shapefiles. In particular the GeoJSON geometry types: Point, LineString, MultiLineString, Polygon and MultiPolygon match how shapefile geometries work.&lt;/p&gt;
&lt;p&gt;An important detail in shapefiles is that data in the &lt;code&gt;.shp&lt;/code&gt; and &lt;code&gt;.dbf&lt;/code&gt; files is matched by array index - so the first geometry can be considered as having ID=0, the second ID=1 and so on.&lt;/p&gt;
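&lt;p&gt;The index-based matching can be sketched in plain Python. The two lists below are made-up stand-ins for the contents of a &lt;code&gt;.shp&lt;/code&gt; and a &lt;code&gt;.dbf&lt;/code&gt; file, and &lt;code&gt;pair_by_index&lt;/code&gt; is a hypothetical helper rather than part of any shapefile library.&lt;/p&gt;

```python
def pair_by_index(geometries, records):
    """Yield (id, geometry, properties) tuples matched by position."""
    geometries = list(geometries)
    records = list(records)
    # The two files must describe the same number of features
    if len(geometries) != len(records):
        raise ValueError("geometry and record counts differ")
    for index, (geometry, properties) in enumerate(zip(geometries, records)):
        yield index, geometry, properties


# Made-up stand-ins for the contents of a .shp and a .dbf file
geometries = [
    {"type": "Point", "coordinates": [0.0, 0.0]},
    {"type": "Point", "coordinates": [1.0, 1.0]},
]
records = [{"UNIT_NAME": "First"}, {"UNIT_NAME": "Second"}]

features = list(pair_by_index(geometries, records))
```

&lt;p&gt;Checking the lengths up front is a deliberate choice here: a bare &lt;code&gt;zip()&lt;/code&gt; would silently drop trailing items if the two files disagreed.&lt;/p&gt;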
&lt;p&gt;You can read the properties from the &lt;code&gt;.dbf&lt;/code&gt; file using the &lt;a href="https://dbfread.readthedocs.io/en/latest/"&gt;dbfread&lt;/a&gt; Python module like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ ipython
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;1&lt;/span&gt;]: import dbfread
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;2&lt;/span&gt;]: db = dbfread.DBF(&lt;span class="hljs-string"&gt;"temp/Current_Shapes/Data_Store/06-06-12_Posting/nps_boundary.dbf"&lt;/span&gt;)
&lt;span class="hljs-keyword"&gt;In&lt;/span&gt; [&lt;span class="hljs-number"&gt;3&lt;/span&gt;]: next(iter(db))
&lt;span class="hljs-keyword"&gt;Out&lt;/span&gt;[&lt;span class="hljs-number"&gt;3&lt;/span&gt;]: 
OrderedDict([(&lt;span class="hljs-string"&gt;'UNIT_TYPE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'Park'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'STATE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'REGION'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'NC'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'UNIT_CODE'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'NACC'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'UNIT_NAME'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'West Potomac Park'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'DATE_EDIT'&lt;/span&gt;, &lt;span class="hljs-keyword"&gt;None&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'GIS_NOTES'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'CREATED_BY'&lt;/span&gt;, &lt;span class="hljs-string"&gt;'Legacy'&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'METADATA'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;),
            (&lt;span class="hljs-string"&gt;'PARKNAME'&lt;/span&gt;, &lt;span class="hljs-string"&gt;''&lt;/span&gt;)])
&lt;/code&gt;&lt;/pre&gt;&lt;h3 id="reading-shapefiles-in-python"&gt;Reading shapefiles in Python&lt;/h3&gt;
&lt;p&gt;I&amp;#39;m a big fan of the &lt;a href="https://shapely.readthedocs.io/"&gt;Shapely&lt;/a&gt; Python library, so I was delighted to see that Sean Gillies, creator of Shapely, also created a library for reading and writing shapefiles: &lt;a href="https://fiona.readthedocs.io/"&gt;Fiona&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://macwright.org/2012/10/31/gis-with-python-shapely-fiona.html"&gt;GIS with Python, Shapely, and Fiona&lt;/a&gt; by Tom MacWright was particularly useful for figuring this out. I like how he wrote that post in 2012 but added a note in 2017 that it&amp;#39;s still his recommended way of getting started with GIS in Python.&lt;/p&gt;
&lt;h3 id="projections"&gt;Projections&lt;/h3&gt;
&lt;p&gt;The trickiest part of working with any GIS data is always figuring out how to deal with &lt;a href="https://xkcd.com/977/"&gt;projections&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;GeoJSON &lt;a href="https://tools.ietf.org/html/rfc7946#section-4"&gt;attempts to standardize&lt;/a&gt; on WGS 84, otherwise known as the latitude/longitude model used by GPS. But... shapefiles frequently use something else. The &lt;a href="https://www.sccgov.org/sites/parks/Parks-Maps/Maps-Data/Pages/home.aspx"&gt;Santa Clara county parks&lt;/a&gt; shapefiles for example use &lt;a href="https://epsg.io/2227"&gt;EPSG:2227&lt;/a&gt;, also known as California zone 3.&lt;/p&gt;
&lt;p&gt;(Fun fact: EPSG stands for European Petroleum Survey Group, a now defunct oil industry group that today lives on only as a database of projected coordinate systems.)&lt;/p&gt;
&lt;p&gt;I spent &lt;a href="https://github.com/simonw/shapefile-to-sqlite/issues/6"&gt;quite a while&lt;/a&gt; thinking about how to best handle projections. In the end I decided that I&amp;#39;d follow GeoJSON&amp;#39;s lead and attempt to convert everything to WGS 84, but allow users to skip that behaviour using &lt;code&gt;--crs=keep&lt;/code&gt; or to specify an alternative projection to convert to with &lt;code&gt;--crs=epsg:2227&lt;/code&gt; or similar.&lt;/p&gt;
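&lt;p&gt;The decision logic described above can be sketched roughly as follows. This is an illustration of the three cases (default, &lt;code&gt;keep&lt;/code&gt;, explicit target), not the tool&amp;#39;s real code: the &lt;code&gt;resolve_target_crs&lt;/code&gt; function, its arguments and the string-based CRS comparison are all assumptions.&lt;/p&gt;

```python
DEFAULT_CRS = "epsg:4326"  # WGS 84


def resolve_target_crs(crs_option, source_crs):
    """Return the CRS to convert to, or None when no conversion is needed."""
    # "keep" means leave the data in whatever projection it arrived in
    if crs_option == "keep":
        return None
    # No option given: fall back to WGS 84, following the GeoJSON convention
    target = (crs_option or DEFAULT_CRS).lower()
    # Skip the conversion when the data is already in the target projection
    if source_crs is not None and source_crs.lower() == target:
        return None
    return target
```

&lt;p&gt;For example, &lt;code&gt;resolve_target_crs(None, "epsg:2227")&lt;/code&gt; returns &lt;code&gt;"epsg:4326"&lt;/code&gt;, while &lt;code&gt;resolve_target_crs("keep", "epsg:2227")&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt;.&lt;/p&gt;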
&lt;p&gt;SpatiaLite creates its geometry columns with a baked in SRID (a code which usually maps to the EPSG identifier). You can see which SRID was used for a specific geometry using the &lt;code&gt;srid()&lt;/code&gt; function:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+srid%28geometry%29+from+%22nps-spatialite%22+limit+1"&gt;select srid(geometry) from &amp;quot;nps-spatialite&amp;quot; limit 1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;SpatiaLite can also convert to another projection using the &lt;code&gt;Transform()&lt;/code&gt; function:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://national-parks-j7hipcg4aq-uc.a.run.app/nps?sql=select+%27%3A%27+%7C%7C+AsGeoJSON%28Transform%28geometry%2C+2227%29%29+from+%22nps-spatialite%22+limit+1"&gt;select &amp;#39;:&amp;#39; || AsGeoJSON(Transform(geometry, 2227)) from &amp;quot;nps-spatialite&amp;quot; limit 1&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;(I&amp;#39;m using &lt;code&gt;&amp;#39;:&amp;#39; || AsGeoJSON(...)&lt;/code&gt; here to disable the &lt;code&gt;datasette-leaflet-geojson&lt;/code&gt; plugin, since it can&amp;#39;t correctly render data that has been transformed to a non-WGS-84 projection.)&lt;/p&gt;
&lt;h3 id="pulling-it-all-together"&gt;Pulling it all together&lt;/h3&gt;
&lt;p&gt;I now have two tools for importing geospatial data into SQLite (or SpatiaLite) databases: &lt;a href="https://github.com/simonw/shapefile-to-sqlite"&gt;shapefile-to-sqlite&lt;/a&gt; and &lt;a href="https://github.com/simonw/geojson-to-sqlite"&gt;geojson-to-sqlite&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I&amp;#39;m excited about Datasette&amp;#39;s potential as a tool for GIS. I started exploring this back in 2017 when I used it to &lt;a href="https://simonwillison.net/2017/Dec/12/location-time-zone-api/"&gt;build a location to timezone API&lt;/a&gt; - but adding easy shapefile imports to the toolchain should unlock all kinds of interesting new geospatial projects.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/spatialite"&gt;spatialite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tom-macwright"&gt;tom-macwright&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="projects"/><category term="shapefiles"/><category term="spatialite"/><category term="sqlite"/><category term="geojson"/><category term="weeknotes"/><category term="tom-macwright"/><category term="leaflet"/></entry><entry><title>Flickr Shapefiles Public Dataset 1.0</title><link href="https://simonwillison.net/2009/May/22/shapefiles/#atom-tag" rel="alternate"/><published>2009-05-22T18:12:10+00:00</published><updated>2009-05-22T18:12:10+00:00</updated><id>https://simonwillison.net/2009/May/22/shapefiles/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.flickr.com/blog/2009/05/21/flickr-shapefiles-public-dataset-10/"&gt;Flickr Shapefiles Public Dataset 1.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Another awesome Geo dataset from the Yahoo! stable—this time it’s Flickr releasing shapefiles (geometrical shapes) for hundreds of thousands of places around the world, under the CC0 license which makes them essentially public domain. The shapes themselves have been crowdsourced from geocoded photos uploaded to Flickr, where users can “correct” the textual location assigned to each photo. Combine this with the GeoPlanet WOE data and you get a huge, free dataset describing the human geography of the world.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/creativecommons"&gt;creativecommons&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/crowdsourcing"&gt;crowdsourcing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/flickr"&gt;flickr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geoplanet"&gt;geoplanet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/maps"&gt;maps&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shapefiles"&gt;shapefiles&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/yahoo"&gt;yahoo&lt;/a&gt;&lt;/p&gt;



</summary><category term="creativecommons"/><category term="crowdsourcing"/><category term="flickr"/><category term="geoplanet"/><category term="geospatial"/><category term="maps"/><category term="shapefiles"/><category term="yahoo"/></entry></feed>