<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: alex-garcia</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/alex-garcia.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-01-22T02:09:54+00:00</updated><author><name>Simon Willison</name></author><entry><title>Six short video demos of LLM and Datasette projects</title><link href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#atom-tag" rel="alternate"/><published>2025-01-22T02:09:54+00:00</published><updated>2025-01-22T02:09:54+00:00</updated><id>https://simonwillison.net/2025/Jan/22/office-hours-demos/#atom-tag</id><summary type="html">
    &lt;p&gt;Last Friday Alex Garcia and I hosted a new kind of Datasette Public Office Hours session, inviting members of the Datasette community to share short demos of projects that they had built. The session lasted just over an hour and featured demos from six different people.&lt;/p&gt;
&lt;p&gt;We broadcast live on YouTube, but I've now edited the session into separate videos. These are listed below, along with project summaries and show notes for each presentation.&lt;/p&gt;
&lt;p&gt;You can also watch all six videos in &lt;a href="https://www.youtube.com/playlist?list=PLSocEbMlNGotyeonEbgFP1_uf9gk1z7zm"&gt;this YouTube playlist&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#llm-logs-feedback-by-matthias-l-bken"&gt;llm-logs-feedback by Matthias Lübken&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#llm-model-gateway-and-llm-consortium-by-thomas-hughes"&gt;llm-model-gateway and llm-consortium by Thomas Hughes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#congressional-travel-explorer-with-derek-willis"&gt;Congressional Travel Explorer with Derek Willis&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#llm-questioncache-with-nat-knight"&gt;llm-questioncache with Nat Knight&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#improvements-to-datasette-enrichments-with-simon-willison"&gt;Improvements to Datasette Enrichments with Simon Willison&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/22/office-hours-demos/#datasette-comments-pins-and-write-ui-with-alex-garcia"&gt;Datasette comments, pins and write UI with Alex Garcia&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="llm-logs-feedback-by-matthias-l-bken"&gt;llm-logs-feedback by Matthias Lübken&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="9pEP6auZmvg"
  title="llm-logs-feedback by Matthias Lübken"
  playlabel="Play: llm-logs-feedback by Matthias Lübken"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/luebken/llm-logs-feedback"&gt;llm-logs-feedback&lt;/a&gt; is a plugin by Matthias Lübken for &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; which adds the ability to store feedback on prompt responses, using new &lt;code&gt;llm feedback+1&lt;/code&gt; and &lt;code&gt;llm feedback-1&lt;/code&gt; commands. These also accept an optional comment, and the feedback is stored in a &lt;code&gt;feedback&lt;/code&gt; table in SQLite.&lt;/p&gt;
&lt;p&gt;You can install the plugin from PyPI like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-logs-feedback&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The full plugin implementation is in the &lt;a href="https://github.com/luebken/llm-logs-feedback/blob/main/llm_logs_feedback.py"&gt;llm_logs_feedback.py file&lt;/a&gt; in Matthias' GitHub repository.&lt;/p&gt;
&lt;h4 id="llm-model-gateway-and-llm-consortium-by-thomas-hughes"&gt;llm-model-gateway and llm-consortium by Thomas Hughes&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="Th5WOyjuRdk"
  title="llm-model-gateway and llm-consortium by Thomas Hughes"
  playlabel="Play: llm-model-gateway and llm-consortium by Thomas Hughes"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;Tommy Hughes has been developing a whole array of LLM plugins, including his &lt;a href="https://github.com/irthomasthomas/llm-plugin-generator"&gt;llm-plugin-generator&lt;/a&gt; which is a plugin that can help write new plugins!&lt;/p&gt;
&lt;p&gt;He started by demonstrating &lt;a href="https://github.com/irthomasthomas/llm-model-gateway"&gt;llm-model-gateway&lt;/a&gt;, a plugin that adds a &lt;code&gt;llm serve&lt;/code&gt; command which starts a localhost server running an imitation of the OpenAI API against LLM models:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-model-gateway
llm serve&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Tommy's main demo was of &lt;a href="https://github.com/irthomasthomas/llm-consortium"&gt;llm-consortium&lt;/a&gt;, a plugin which can use a consortium of collaborating LLM models working together to solve problems.&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm install llm-consortium
llm consortium &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Best way to prepare avocado toast&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  --models claude-3.5-sonnet \
  --models gpt-4o \
  --arbiter gegemini-1.5-pro-latest \
  --confidence-threshold 0.8 \
  --max-iterations 3 \
  --output results.json&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I ran this and got back:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The best way to prepare avocado toast involves selecting quality ingredients and careful preparation. Start with a thick slice of good quality bread, such as sourdough or whole grain. Toast it to your preferred level of crispness.  While the bread is toasting, prepare a ripe avocado. Cut it in half, remove the pit, and scoop the flesh into a bowl. Mash the avocado with a fork, leaving some chunks for texture. Season with salt, black pepper, and a squeeze of fresh lemon juice to prevent browning.  Optional additions include a touch of red pepper flakes.&lt;/p&gt;
&lt;p&gt;Once the toast is ready, let it cool slightly before spreading the seasoned avocado evenly over it. Consider lightly rubbing a cut garlic clove on the warm toast for an extra layer of flavor (optional).&lt;/p&gt;
&lt;p&gt;Enhance your avocado toast with your favorite toppings. Popular choices include: everything bagel seasoning, sliced tomatoes, radishes, a poached or fried egg (for added protein), microgreens, smoked salmon (for a more savory option), feta cheese crumbles, or a drizzle of hot sauce.  For a finishing touch, drizzle with high-quality olive oil and sprinkle with sesame or chia seeds for added texture.&lt;/p&gt;
&lt;p&gt;Consider dietary needs when choosing toppings. For example, those following a low-carb diet might skip the tomatoes and opt for more protein and healthy fats.&lt;/p&gt;
&lt;p&gt;Finally, pay attention to presentation. Arrange the toppings neatly for a visually appealing toast. Serve immediately to enjoy the fresh flavors and crispy toast.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But the really interesting thing is the full log of the prompts and responses sent to Claude 3.5 Sonnet and GPT-4o, followed by a combined prompt to Gemini 1.5 Pro to have it arbitrate between the two responses. You can see &lt;a href="https://gist.github.com/simonw/425f42f8ec1a963ae13c5b57ba580f56"&gt;the full logged prompts and responses here&lt;/a&gt;. Here's that &lt;a href="https://gist.github.com/simonw/e82370f0e5986a15823c82200c1b77f8"&gt;results.json&lt;/a&gt; output file.&lt;/p&gt;
&lt;h4 id="congressional-travel-explorer-with-derek-willis"&gt;Congressional Travel Explorer with Derek Willis&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="CDilLbFP1DY"
  title="Congressional Travel Explorer with Derek Willis"
  playlabel="Play: Congressional Travel Explorer with Derek Willis"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;Derek Willis teaches data journalism at the Philip Merrill College of Journalism at the University of Maryland. For a recent project his students built a &lt;a href="https://cnsmaryland.org/interactives/fall-2024/congressional_travel_explorer/index.html"&gt;Congressional Travel Explorer&lt;/a&gt; interactive using Datasette, AWS Extract and Claude 3.5 Sonnet to analyze travel disclosures from members of Congress.&lt;/p&gt;
&lt;p&gt;One of the outcomes from the project was this story in Politico: &lt;a href="https://www.politico.com/news/2024/10/30/israel-aipac-funded-congress-travel-00185167"&gt;Members of Congress have taken hundreds of AIPAC-funded trips to Israel in the past decade&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="llm-questioncache-with-nat-knight"&gt;llm-questioncache with Nat Knight&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="lXwfEYXjsak"
  title="llm-questioncache with Nat Knight"
  playlabel="Play: llm-questioncache with Nat Knight"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/nathanielknight/llm-questioncache"&gt;llm-questioncache&lt;/a&gt; builds on top of &lt;a href="https://llm.datasette.io/"&gt;https://llm.datasette.io/&lt;/a&gt; to cache answers to questions, using embeddings to return similar answers if they have already been stored.&lt;/p&gt;
&lt;p&gt;Using embeddings for de-duplication of similar questions is an interesting way to apply LLM's &lt;a href="https://llm.datasette.io/en/stable/embeddings/python-api.html"&gt;embeddings feature&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="improvements-to-datasette-enrichments-with-simon-willison"&gt;Improvements to Datasette Enrichments with Simon Willison&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="GumAgaYpda0"
  title="Improvements to Datasette Enrichments with Simon Willison"
  playlabel="Play: Improvements to Datasette Enrichments with Simon Willison"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;I've demonstrated improvements I've been making to Datasette's &lt;a href="https://enrichments.datasette.io/"&gt;Enrichments&lt;/a&gt; system over the past few weeks.&lt;/p&gt;
&lt;p&gt;Enrichments allow you to apply an operation - such as geocoding, a QuickJS JavaScript transformation or an LLM prompt - against selected rows within a table.&lt;/p&gt;
&lt;p&gt;The latest release of &lt;a href="https://github.com/datasette/datasette-enrichments/releases/tag/0.5"&gt;datasette-enrichments&lt;/a&gt; adds visible progress bars and the ability to pause, resume and cancel an enrichment job that is running against a table.&lt;/p&gt;
&lt;h4 id="datasette-comments-pins-and-write-ui-with-alex-garcia"&gt;Datasette comments, pins and write UI with Alex Garcia&lt;/h4&gt;
&lt;p&gt;&lt;lite-youtube videoid="i0u4N6g15Zg"
  title="Datasette comments, pins and write UI with Alex Garcia"
  playlabel="Play: Datasette comments, pins and write UI with Alex Garcia"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;
&lt;p&gt;We finished with three plugin demos from Alex, showcasing collaborative features we have been developing for &lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/datasette/datasette-write-ui"&gt;datasette-write-ui&lt;/a&gt; provides tools for editing and adding data to Datasette tables. A new feature here is the ability to shift-click a row to open the editing interface for that row.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/datasette/datasette-pins"&gt;datasette-pins&lt;/a&gt; allows users to pin tables and databases to their Datasette home page, making them easier to find.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/datasette/datasette-comments"&gt;datasette-comments&lt;/a&gt; adds a commenting interface to Datasette, allowing users to leave comments on individual rows in a table.&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/community"&gt;community&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/demos"&gt;demos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/derek-willis"&gt;derek-willis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/enrichments"&gt;enrichments&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-public-office-hours"&gt;datasette-public-office-hours&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="community"/><category term="data-journalism"/><category term="demos"/><category term="derek-willis"/><category term="youtube"/><category term="ai"/><category term="datasette"/><category term="alex-garcia"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="enrichments"/><category term="datasette-public-office-hours"/></entry><entry><title>Visualizing local election results with Datasette, Observable and MapLibre GL</title><link href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#atom-tag" rel="alternate"/><published>2024-11-09T23:32:06+00:00</published><updated>2024-11-09T23:32:06+00:00</updated><id>https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#atom-tag</id><summary type="html">
    &lt;p&gt;Alex Garcia and myself hosted the first &lt;a href="https://simonwillison.net/2024/Nov/7/datasette-public-office-hours/"&gt;Datasette Open Office Hours&lt;/a&gt; on Friday - a live-streamed video session where we hacked on a project together and took questions and tips from community members on Discord.&lt;/p&gt;
&lt;p&gt;We didn't record this one (surprisingly not a feature that Discord offers) but we hope to do more of these and record them in the future.&lt;/p&gt;
&lt;p&gt;This post is a detailed write-up of what we built during the session.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#san-mateo-county-election-results"&gt;San Mateo County election results&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#importing-csv-data-into-datasette"&gt;Importing CSV data into Datasette&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#modifying-the-schema"&gt;Modifying the schema&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#faceting-and-filtering-the-table"&gt;Faceting and filtering the table&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#importing-geospatial-precinct-shapes"&gt;Importing geospatial precinct shapes&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#enriching-that-data-to-extract-the-precinct-ids"&gt;Enriching that data to extract the precinct IDs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#running-a-join"&gt;Running a join&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#creating-an-api-token-to-access-the-data"&gt;Creating an API token to access the data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#getting-cors-working"&gt;Getting CORS working&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#working-with-datasette-in-observable"&gt;Working with Datasette in Observable&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#visualizing-those-with-maplibre-gl"&gt;Visualizing those with MapLibre GL&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#observable-plot"&gt;Observable Plot&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#bringing-it-all-together"&gt;Bringing it all together&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2024/Nov/9/visualizing-local-election-results/#we-ll-be-doing-this-again"&gt;We'll be doing this again&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h4 id="san-mateo-county-election-results"&gt;San Mateo County election results&lt;/h4&gt;
&lt;p&gt;I live in El Granada, a tiny town just north of Half Moon Bay in San Mateo County, California.&lt;/p&gt;
&lt;p&gt;Every county appears to handle counting and publishing election results differently. For San Mateo County the results are published &lt;a href="https://smcacre.gov/elections/november-5-2024-election-results"&gt;on this page&lt;/a&gt;, and detailed per-precinct and per-candidate breakdowns are made available as a CSV file.&lt;/p&gt;
&lt;p&gt;(I optimistically set up a &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraper&lt;/a&gt; for these results in &lt;a href="https://github.com/simonw/scrape-san-mateo-county-election-results-2024"&gt;simonw/scrape-san-mateo-county-election-results-2024&lt;/a&gt; only to learn that the CSV is updated just once a day, not continually as the ballots are counted.)&lt;/p&gt;
&lt;p&gt;I'm particularly invested in the results of the &lt;a href="http://granada.ca.gov/"&gt;Granada Community Services District&lt;/a&gt; board member elections. Our little town of El Granada is in "unincorporated San Mateo County" which means we don't have a mayor or any local officials, so the closest we get to hyper-local government is the officials that run our local sewage and parks organization! My partner Natalie ran &lt;a href="https://til.simonwillison.net/youtube/livestreaming"&gt;the candidate forum event&lt;/a&gt; (effectively the debate) featuring three of the four candidates running for the two open places on the board.&lt;/p&gt;
&lt;p&gt;Let's explore the data for that race using Datasette.&lt;/p&gt;
&lt;h4 id="importing-csv-data-into-datasette"&gt;Importing CSV data into Datasette&lt;/h4&gt;
&lt;p&gt;I ran my part of the demo using &lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt;, the beta of my new hosted Datasette service.&lt;/p&gt;
&lt;p&gt;I started by using the pre-configured &lt;a href="https://github.com/datasette/datasette-import"&gt;datasette-import&lt;/a&gt; plugin to import the data from the CSV file into a fresh table:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/datasette-import-loop.gif" alt="Paste data to create a table - I drag and drop on a CSV file, which produces a preview of the first 100 of 15,589 rows. I click to Upload and a progress bar runs before redirecting me to the resulting table." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="modifying-the-schema"&gt;Modifying the schema&lt;/h4&gt;
&lt;p&gt;The table imported cleanly, but all of the columns from the CSV were still being treated as text. I used the &lt;a href=""&gt;datasette-edit-schema&lt;/a&gt; plugin to switch the relevant columns to integers so that we could run sums and sorts against them.&lt;/p&gt;
&lt;p&gt;(I also noted that I really should add a "detect column types" feature to that plugin!)&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/datasette-edit-schema.jpg" alt="Edit table data/san_mateo_election_results - an option to rename table and then one to change existing columns, where each column is listed in turn and some have their type select box set to integer instead of the default of text" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The resulting 15,589 rows represent counts from individual precincts around the county for each of the races and measures on the ballot, with a row per precinct per candidate/choice per race.&lt;/p&gt;
&lt;h4 id="faceting-and-filtering-the-table"&gt;Faceting and filtering the table&lt;/h4&gt;
&lt;p&gt;Since I'm interested in the Granada Community Services District election, I applied a facet on "Contest_title" and then used that to select that specific race.&lt;/p&gt;
&lt;p&gt;I applied additional facets on "candidate_name" and "Precinct name".&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/facet-candidates.jpg" alt="28 rows where Contest_title = Granada Community Services District Members, Board of Directors. Facets are precinct name (7 choices), candidate name (IRIS GRANT, JANET BRAYER, NANCY MARSH, WANDA BOWLES) and Contest_title" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This looks right to me: we have 7 precincts and 4 candidates for 28 rows in total.&lt;/p&gt;
&lt;h4 id="importing-geospatial-precinct-shapes"&gt;Importing geospatial precinct shapes&lt;/h4&gt;
&lt;p&gt;Those precinct names are pretty non-descriptive! What does 33001 mean?&lt;/p&gt;
&lt;p&gt;To answer that question, I added a new table.&lt;/p&gt;
&lt;p&gt;San Mateo County offers &lt;a href="https://smcacre.gov/elections/precinct-maps-pdf"&gt;precinct maps&lt;/a&gt; in the form of 23 PDF files. Our precincts are in the "Unincorporated Coastside" file:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/precinct-map-from-pdf.jpg" alt="Screenshot from a PDF - label is Unincorporated Coastside, it shows the area north of Half Moon Bay with a bunch of polygons with numeric identifiers." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Thankfully the county &lt;em&gt;also&lt;/em&gt; makes that data available as &lt;a href="https://data.smcgov.org/Government/Election-Precincts/g5sj-6zp8/about_data"&gt;geospatial data&lt;/a&gt;, hosted using Socrata with an option to export as GeoJSON.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/precincts-socrata.jpg" alt="Socrata interface, Election Precincts updated March 7 2022 - 533 views, 72 downloads, and export dataset modal shows a GeoJSON option to export 783 rows." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;datasette-import&lt;/code&gt; plugin can handle JSON files... and if a JSON file contains a top-level object with a key that is an array of objects, it will import those objects as a table.&lt;/p&gt;
&lt;p&gt;Dragging that file into Datasette is enough to import it as a table with a &lt;code&gt;properties&lt;/code&gt; JSON column containing properties and a &lt;code&gt;geometry&lt;/code&gt; JSON columnn with the GeoJSON geometry.&lt;/p&gt;
&lt;p&gt;Here's where another plugin kicks in: &lt;a href="https://datasette.io/plugins/datasette-leaflet-geojson"&gt;datasette-leaflet-geojson&lt;/a&gt; looks for columns that contain valid GeoJSON geometries and... draws them on a map!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/precincts-in-datasette.jpg" alt="Datasette precincts table with 783 rows. The properties column contains JSON keys lastupdate, creationda, prencitid, notes and active - the geometry column renders maps with polygons showing the shape of the precinct." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;So now we can see the shape of the individual geometries.&lt;/p&gt;
&lt;h4 id="enriching-that-data-to-extract-the-precinct-ids"&gt;Enriching that data to extract the precinct IDs&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;precinctid&lt;/code&gt; is present in the data, but it's tucked away in a JSON object in that &lt;code&gt;properties&lt;/code&gt; JSON blob. It would be more convenient if it was a top-level column.&lt;/p&gt;
&lt;p&gt;Datasette's &lt;a href="https://simonwillison.net/2023/Dec/1/datasette-enrichments/"&gt;enrichments feature&lt;/a&gt; provides tools for running operations against every row in a table and adding new columns based on the results.&lt;/p&gt;
&lt;p&gt;My Datasette Cloud instance was missing the &lt;a href="https://github.com/datasette/datasette-enrichments-quickjs"&gt;datasette-enrichments-quickjs plugin&lt;/a&gt; that would let me run JavaScript code against the data. I used my privileged access on Datasette Cloud to add that plugin to my requirements and restarted the instance to install it.&lt;/p&gt;
&lt;p&gt;I used that to run this JavaScript code against every row in the table and saved the output in a new &lt;code&gt;precinct_id&lt;/code&gt; column:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;enrich&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;row&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parse&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;row&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;properties&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;precinctid&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/enrich-precincts.jpg" alt="Enrich data in precincts. 783 rows selected. JavaScript. Enrich data with a custom JavaScript function. JavaScript function: function enrich(row) { return JSON.stringify(row) + &amp;quot; enriched&amp;quot;; } - Define an enrich(row) JavaScript function taking an object and returning a value. Row keys: properties, geometry. Output mode: store the function result in a single column. Output clumn name: precinct_id. The column to store the output in - will be created if it does not exist. Output column type: text." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This took less than a second to run, adding and populating a new &lt;code&gt;precinct_id&lt;/code&gt; column for the table.&lt;/p&gt;
&lt;h4 id="running-a-join"&gt;Running a join&lt;/h4&gt;
&lt;p&gt;I demonstrated how to run a join between the election results and the precincts table using the Datasette SQL query editor.&lt;/p&gt;
&lt;p&gt;I tried a few different things, but the most interesting query was this one:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;select&lt;/span&gt;
  Precinct_name,
  &lt;span class="pl-c1"&gt;precincts&lt;/span&gt;.&lt;span class="pl-c1"&gt;geometry&lt;/span&gt;,
  total_ballots,
  json_group_object(
    candidate_name,
    total_votes
  ) &lt;span class="pl-k"&gt;as&lt;/span&gt; votes_by_candidate
&lt;span class="pl-k"&gt;from&lt;/span&gt;
  election_results 
  &lt;span class="pl-k"&gt;join&lt;/span&gt; precincts &lt;span class="pl-k"&gt;on&lt;/span&gt; &lt;span class="pl-c1"&gt;election_results&lt;/span&gt;.&lt;span class="pl-c1"&gt;Precinct_name&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;precincts&lt;/span&gt;.&lt;span class="pl-c1"&gt;precinct_id&lt;/span&gt;
&lt;span class="pl-k"&gt;where&lt;/span&gt; 
  Contest_title &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Granada Community Services District Members, Board of Directors&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;group by&lt;/span&gt; 
  Precinct_name,
  &lt;span class="pl-c1"&gt;precincts&lt;/span&gt;.&lt;span class="pl-c1"&gt;geometry&lt;/span&gt;,
  total_ballots;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/joined-precincts.jpg" alt="The SQL query returned four columns: Precinct_name, geometry with a map of the precinct, total_ballots with a number and votes_by_candidate with a JSON object mapping each candidate name to their number of votes." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="creating-an-api-token-to-access-the-data"&gt;Creating an API token to access the data&lt;/h4&gt;
&lt;p&gt;I was nearly ready to hand over to Alex for the second half of our demo, where he would use Observable Notebooks to build some custom visualizations on top of the data.&lt;/p&gt;
&lt;p&gt;A great pattern for this is to host the data in Datasette and then fetch it into Observable via the Datasette JSON API.&lt;/p&gt;
&lt;p&gt;Since Datasette Cloud instances are private by default we would need to create an API token that could do this.&lt;/p&gt;
&lt;p&gt;I used this interface (from the &lt;a href="https://github.com/simonw/datasette-auth-tokens"&gt;datasette-auth-tokens plugin&lt;/a&gt;) to create a new token with read-only access to all databases and tables in the instance:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/create-api-token.jpg" alt="Create an API token interface. This token will allow API access with the same abilities as your current user, swillison .Token will be restricted to: all databases and tables: view-database, all databases and tables: view-table, all databases and tables: execute-sql - token is set to read-only and never expires, a list of possible permissions with checkboxes is listed below." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Since we're running a dedicated instance just for Datasette Public Office Hours there's no reason not to distribute that read-only token in publically accessible code.&lt;/p&gt;
&lt;h4 id="getting-cors-working"&gt;Getting CORS working&lt;/h4&gt;
&lt;p&gt;Embarrassingly, I had forgotten that we would need CORS headers in order to access the data from an Observable notebook. Thankfully we have another plugin for that: &lt;a href="https://datasette.io/plugins/datasette-cors"&gt;datasette-cors&lt;/a&gt;. I installed that quickly and we confirmed that it granted access to the API from Observable as intended.&lt;/p&gt;
&lt;p&gt;I handed over to Alex for the next section of the demo.&lt;/p&gt;
&lt;h4 id="working-with-datasette-in-observable"&gt;Working with Datasette in Observable&lt;/h4&gt;
&lt;p&gt;Alex started by running a SQL query from client-side JavaScript to pull in the joined data for our specific El Granada race:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;sql&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`&lt;/span&gt;
&lt;span class="pl-s"&gt;select&lt;/span&gt;
&lt;span class="pl-s"&gt;  Precinct_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  precincts.geometry,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Split_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Reporting_flag,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Update_count,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Pct_Id,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Pct_seq_nbr,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Reg_voters,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Turn_Out,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Contest_Id,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Contest_seq_nbr,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Contest_title,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Contest_party_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Selectable_Options,&lt;/span&gt;
&lt;span class="pl-s"&gt;  candidate_id,&lt;/span&gt;
&lt;span class="pl-s"&gt;  candidate_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Candidate_Type,&lt;/span&gt;
&lt;span class="pl-s"&gt;  cand_seq_nbr,&lt;/span&gt;
&lt;span class="pl-s"&gt;  Party_Code,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_ballots,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_votes,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_under_votes,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_over_votes,&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote Centers_ballots],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote Centers_votes],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote Centers_under_votes],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote Centers_over_votes],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote by Mail_ballots],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote by Mail_votes],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote by Mail_under_votes],&lt;/span&gt;
&lt;span class="pl-s"&gt;  [Vote by Mail_over_votes]&lt;/span&gt;
&lt;span class="pl-s"&gt;from&lt;/span&gt;
&lt;span class="pl-s"&gt;  election_results join precincts on election_results.Precinct_name = precincts.precinct_id&lt;/span&gt;
&lt;span class="pl-s"&gt;where "Contest_title" = "Granada Community Services District Members, Board of Directors"&lt;/span&gt;
&lt;span class="pl-s"&gt;limit 101;`&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And in the next cell:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;raw_data&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-s"&gt;`https://datasette-public-office-hours.datasette.cloud/data/-/query.json?_shape=array&amp;amp;sql=&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-en"&gt;encodeURIComponent&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-s1"&gt;    &lt;span class="pl-s1"&gt;sql&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-s1"&gt;  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;secret&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note the &lt;code&gt;?_shape=array&lt;/code&gt; parameter there, which causes Datasette to output the results directly as a JSON array of objects.&lt;/p&gt;
&lt;p&gt;That's all it takes to get the data into Observable. Adding another cell like this confirms that the data is now available:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-v"&gt;Inputs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;table&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;raw_data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/inputs-table-raw-data.jpg" alt="An Observable cell running Inputs.table(raw_data) and displaying a table of Precinct_name and geometry columns, with GeoJSON" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="visualizing-those-with-maplibre-gl"&gt;Visualizing those with MapLibre GL&lt;/h4&gt;
&lt;p&gt;There are plenty of good options for visualizing GeoJSON data using JavaScript in an Observable notebook.&lt;/p&gt;
&lt;p&gt;Alex started with &lt;a href="https://maplibre.org/maplibre-gl-js/docs/"&gt;MapLibre GL&lt;/a&gt;, using the excellent &lt;a href="https://simonwillison.net/2024/Sep/28/openfreemap/"&gt;OpenFreeMap 3D tiles&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-s1"&gt;viewof&lt;/span&gt; &lt;span class="pl-s1"&gt;map&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;const&lt;/span&gt; container &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;html&lt;/span&gt;&lt;span class="pl-s"&gt;`&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;div&lt;/span&gt; &lt;span class="pl-c1"&gt;style&lt;/span&gt;="&lt;span class="pl-s"&gt;height:800px;&lt;/span&gt;"&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;yield&lt;/span&gt; &lt;span class="pl-s1"&gt;container&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;map&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;container&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;value&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-s1"&gt;maplibregl&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;Map&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    container&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;zoom&lt;/span&gt;: &lt;span class="pl-c1"&gt;2&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c"&gt;//style: "https://basemaps.cartocdn.com/gl/voyager-gl-style/style.json",&lt;/span&gt;
    &lt;span class="pl-c1"&gt;style&lt;/span&gt;: &lt;span class="pl-s"&gt;"https://tiles.openfreemap.org/styles/liberty"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;scrollZoom&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-k"&gt;yield&lt;/span&gt; &lt;span class="pl-s1"&gt;container&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

  &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;on&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"load"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;fitBounds&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d3&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;geoBounds&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;duration&lt;/span&gt;: &lt;span class="pl-c1"&gt;0&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addSource&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"precincts"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"geojson"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;data&lt;/span&gt;: &lt;span class="pl-s1"&gt;data&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addLayer&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;id&lt;/span&gt;: &lt;span class="pl-s"&gt;"precincts"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"fill"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;source&lt;/span&gt;: &lt;span class="pl-s"&gt;"precincts"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;paint&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-s"&gt;"fill-opacity"&lt;/span&gt;: &lt;span class="pl-c1"&gt;0.4&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-s"&gt;"fill-color"&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
          &lt;span class="pl-s"&gt;"case"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
          &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"=="&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"get"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"ratio"&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;null&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"#000000"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
          &lt;span class="pl-kos"&gt;[&lt;/span&gt;
            &lt;span class="pl-s"&gt;"interpolate"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
            &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"linear"&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
            &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s"&gt;"get"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"ratio"&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
            &lt;span class="pl-c1"&gt;0.0&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"#0000ff"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
            &lt;span class="pl-c1"&gt;0.5&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"#d3d3d3"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
            &lt;span class="pl-c1"&gt;1.0&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"#ff0000"&lt;/span&gt;
          &lt;span class="pl-kos"&gt;]&lt;/span&gt;
        &lt;span class="pl-kos"&gt;]&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;on&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"click"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s"&gt;"precincts"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; precinct&lt;span class="pl-kos"&gt;,&lt;/span&gt; ratio &lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;features&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;properties&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;description&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

      &lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-s1"&gt;maplibregl&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;Popup&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;setLngLat&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;e&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;lngLat&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;setHTML&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;description&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
        &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addTo&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;invalidation&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;remove&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/maplibre-gl.jpg" alt="An Observable cell showing a map of El Granada - a black shape shows the outlines of the precincts." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;(This is just one of several iterations, I didn't capture detailed notes of every change Alex made to the code.)&lt;/p&gt;
&lt;h4 id="observable-plot"&gt;Observable Plot&lt;/h4&gt;
&lt;p&gt;Observable notebooks come pre-loaded with the excellent Observable Plot charting library - Mike Bostock's high-level charting tool built on top of D3.&lt;/p&gt;
&lt;p&gt;Alex used that to first render the shapes of the precincts directly, without even needing a tiled basemap:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;plot&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  width&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;height&lt;/span&gt;: &lt;span class="pl-c1"&gt;600&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;legend&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;projection&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"conic-conformal"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;parallels&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;37&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;4&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-c1"&gt;60&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;38&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;26&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-c1"&gt;60&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;rotate&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;120&lt;/span&gt; &lt;span class="pl-c1"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;30&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-c1"&gt;60&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;domain&lt;/span&gt;: &lt;span class="pl-s1"&gt;data&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;marks&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
    &lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;geo&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;strokeOpacity&lt;/span&gt;: &lt;span class="pl-c1"&gt;0.1&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;fill&lt;/span&gt;: &lt;span class="pl-s"&gt;"total_votes"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-en"&gt;title&lt;/span&gt;: &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;properties&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;tip&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;]&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;parallels&lt;/code&gt; and &lt;code&gt;rotate&lt;/code&gt; options there come from the handy &lt;a href="https://github.com/veltman/d3-stateplane?tab=readme-ov-file#nad83--california-zone-3-epsg26943"&gt;veltman/d3-stateplane&lt;/a&gt; repo, which lists recommended settings for the &lt;a href="https://en.wikipedia.org/wiki/State_Plane_Coordinate_System"&gt;State Plane Coordinate System&lt;/a&gt; used with projections in D3. Those values are for &lt;a href="https://www.conservation.ca.gov/cgs/rgm/state-plane-coordinate-system"&gt;California Zone 3&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/color-precincts.jpg" alt="An Observable cell shows six five distinct colored polygons, each for a different precinct. The shape of El Granada is clearly visible despite no other map tiles or labels." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="bringing-it-all-together"&gt;Bringing it all together&lt;/h4&gt;
&lt;p&gt;For the grand finale, Alex combined everything learned so far to build an interactive map allowing a user to select any of the 110 races on the ballot and see a heatmap of results for any selected candidate and option:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/select-map.gif" alt="Animated demo. Choose a contest select - picking different contests updates the map at the bottom. For each contest the candidates or options are shown as radio buttons, and selecting those updates the map to show a heatmap of votes for that candidate in different precincts." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;You can try this out in &lt;a href="https://observablehq.com/d/2ed2ad2443d7bbb5"&gt;Alex's notebook&lt;/a&gt;. Here's the relevant code (Observable cells are divided by &lt;code&gt;// ---&lt;/code&gt; comments). Note that Observable notebooks are reactive and allow variables to be referenced out of order.&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;// Select the contest&lt;/span&gt;
&lt;span class="pl-s1"&gt;viewof&lt;/span&gt; &lt;span class="pl-s1"&gt;contest&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Inputs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;select&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;contests&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;label&lt;/span&gt;: &lt;span class="pl-s"&gt;"Choose a contest"&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-c"&gt;// And the candidate&lt;/span&gt;
&lt;span class="pl-s1"&gt;viewof&lt;/span&gt;&lt;span class="pl-kos"&gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;candidate&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Inputs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;radio&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-s1"&gt;candidates&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;label&lt;/span&gt;: &lt;span class="pl-s"&gt;"Choose a candidate"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;value&lt;/span&gt;: &lt;span class="pl-s1"&gt;candidates&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-c"&gt;// Show the map itself&lt;/span&gt;
&lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;plot&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  width&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;height&lt;/span&gt;: &lt;span class="pl-c1"&gt;600&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;legend&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;color&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt; &lt;span class="pl-c1"&gt;scheme&lt;/span&gt;: &lt;span class="pl-s"&gt;"blues"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;legend&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;projection&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"mercator"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;domain&lt;/span&gt;: &lt;span class="pl-s1"&gt;data2&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;marks&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
    &lt;span class="pl-v"&gt;Plot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;geo&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data2&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;strokeOpacity&lt;/span&gt;: &lt;span class="pl-c1"&gt;0.1&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;fill&lt;/span&gt;: &lt;span class="pl-s"&gt;"ratio"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;tip&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;]&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
# &lt;span class="pl-c1"&gt;--&lt;/span&gt;&lt;span class="pl-c1"&gt;-&lt;/span&gt;
&lt;span class="pl-s1"&gt;data2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"FeatureCollection"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;features&lt;/span&gt;: &lt;span class="pl-s1"&gt;raw_data2&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"Feature"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;properties&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;precinct&lt;/span&gt;: &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;Precinct_name&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;total_ballots&lt;/span&gt;: &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;total_ballots&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;ratio&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parse&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;votes_by_candidate&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-s1"&gt;candidate&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt; &lt;span class="pl-c1"&gt;/&lt;/span&gt; &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;total_ballots&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;geometry&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parse&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;geometry&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-s1"&gt;raw_data2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;query&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-s"&gt;`select&lt;/span&gt;
&lt;span class="pl-s"&gt;  Precinct_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  precincts.geometry,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_ballots,&lt;/span&gt;
&lt;span class="pl-s"&gt;  json_grop_object(&lt;/span&gt;
&lt;span class="pl-s"&gt;    candidate_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;    total_votes&lt;/span&gt;
&lt;span class="pl-s"&gt;  ) as votes_by_candidate&lt;/span&gt;
&lt;span class="pl-s"&gt;from&lt;/span&gt;
&lt;span class="pl-s"&gt;  election_results &lt;/span&gt;
&lt;span class="pl-s"&gt;  join precincts on election_results.Precinct_name = precincts.precinct_id&lt;/span&gt;
&lt;span class="pl-s"&gt;where Contest_title = :contest&lt;/span&gt;
&lt;span class="pl-s"&gt;group by &lt;/span&gt;
&lt;span class="pl-s"&gt;  Precinct_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  precincts.geometry,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_ballots;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;{&lt;/span&gt; contest &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-s1"&gt;raw_data2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;query&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-s"&gt;`select&lt;/span&gt;
&lt;span class="pl-s"&gt;  Precinct_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  precincts.geometry,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_ballots,&lt;/span&gt;
&lt;span class="pl-s"&gt;  json_group_object(&lt;/span&gt;
&lt;span class="pl-s"&gt;    candidate_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;    total_votes&lt;/span&gt;
&lt;span class="pl-s"&gt;  ) as votes_by_candidate&lt;/span&gt;
&lt;span class="pl-s"&gt;from&lt;/span&gt;
&lt;span class="pl-s"&gt;  election_results &lt;/span&gt;
&lt;span class="pl-s"&gt;  join precincts on election_results.Precinct_name = precincts.precinct_id&lt;/span&gt;
&lt;span class="pl-s"&gt;where Contest_title = :contest&lt;/span&gt;
&lt;span class="pl-s"&gt;group by &lt;/span&gt;
&lt;span class="pl-s"&gt;  Precinct_name,&lt;/span&gt;
&lt;span class="pl-s"&gt;  precincts.geometry,&lt;/span&gt;
&lt;span class="pl-s"&gt;  total_ballots;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;{&lt;/span&gt; contest &lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-c"&gt;// Fetch the available contests&lt;/span&gt;
&lt;span class="pl-s1"&gt;contests&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;query&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"select distinct Contest_title from election_results"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;map&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;d&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;Contest_title&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-c"&gt;// Extract available candidates for selected contest&lt;/span&gt;

&lt;span class="pl-s1"&gt;candidates&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Object&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;keys&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
  &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;parse&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;raw_data2&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;votes_by_candidate&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
&lt;span class="pl-kos"&gt;)&lt;/span&gt;

&lt;span class="pl-c"&gt;// ---&lt;/span&gt;

&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-en"&gt;query&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;sql&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;params&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-s"&gt;`https://datasette-public-office-hours.datasette.cloud/data/-/query.json?&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;URLSearchParams&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-s1"&gt;      &lt;span class="pl-kos"&gt;{&lt;/span&gt; sql&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;_shape&lt;/span&gt;: &lt;span class="pl-s"&gt;"array"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; ...&lt;span class="pl-s1"&gt;params&lt;/span&gt; &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-s1"&gt;    &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;toString&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;secret&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;h4 id="we-ll-be-doing-this-again"&gt;We'll be doing this again&lt;/h4&gt;
&lt;p&gt;This was our first time trying something like this and I think it worked &lt;em&gt;really&lt;/em&gt; well. We're already thinking about ways to improve it next time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;I want to record these sessions and make them available on YouTube for people who couldn't be there live&lt;/li&gt;
&lt;li&gt;It would be fun to mix up the format. I'm particularly keen on getting more people involved giving demos - maybe having 5-10 minute lightning demo slots so we can see what other people are working on&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Keep an eye on this blog or on the &lt;a href="https://datasette.io/discord"&gt;Datasette Discord&lt;/a&gt; for news about future sessions.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mapping"&gt;mapping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/politics"&gt;politics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-public-office-hours"&gt;datasette-public-office-hours&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="geospatial"/><category term="mapping"/><category term="politics"/><category term="projects"/><category term="datasette"/><category term="datasette-cloud"/><category term="alex-garcia"/><category term="datasette-public-office-hours"/><category term="leaflet"/></entry><entry><title>Datasette Public Office Hours, Friday Nov 8th at 2pm PT</title><link href="https://simonwillison.net/2024/Nov/7/datasette-public-office-hours/#atom-tag" rel="alternate"/><published>2024-11-07T19:10:10+00:00</published><updated>2024-11-07T19:10:10+00:00</updated><id>https://simonwillison.net/2024/Nov/7/datasette-public-office-hours/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://discord.gg/udUyEnv3?event=1304134449453072435"&gt;Datasette Public Office Hours, Friday Nov 8th at 2pm PT&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Tomorrow afternoon (Friday 8th November) at 2pm PT we'll be hosting the first &lt;strong&gt;Datasette Public Office Hours&lt;/strong&gt; - a livestream video session on Discord where Alex Garcia and myself will live code on some &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; projects and hang out to chat about the project.&lt;/p&gt;
&lt;p&gt;This is our first time trying this format. If it works out well I plan to turn it into a series.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Discord event card promoting Datasette Public Office Hours" src="https://static.simonwillison.net/static/2024/datasette-public-office-hours.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/discord"&gt;discord&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-public-office-hours"&gt;datasette-public-office-hours&lt;/a&gt;&lt;/p&gt;



</summary><category term="open-source"/><category term="datasette"/><category term="discord"/><category term="alex-garcia"/><category term="datasette-public-office-hours"/></entry><entry><title>Hybrid full-text search and vector search with SQLite</title><link href="https://simonwillison.net/2024/Oct/4/hybrid-full-text-search-and-vector-search-with-sqlite/#atom-tag" rel="alternate"/><published>2024-10-04T16:22:09+00:00</published><updated>2024-10-04T16:22:09+00:00</updated><id>https://simonwillison.net/2024/Oct/4/hybrid-full-text-search-and-vector-search-with-sqlite/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://alexgarcia.xyz/blog/2024/sqlite-vec-hybrid-search/index.html"&gt;Hybrid full-text search and vector search with SQLite&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
As part of Alex’s work on his &lt;a href="https://github.com/asg017/sqlite-vec"&gt;sqlite-vec&lt;/a&gt; SQLite extension - adding fast vector lookups to SQLite - he’s been investigating hybrid search, where search results from both vector similarity and traditional full-text search are combined together.&lt;/p&gt;
&lt;p&gt;The most promising approach looks to be &lt;a href="https://learn.microsoft.com/en-us/azure/search/hybrid-search-ranking"&gt;Reciprocal Rank Fusion&lt;/a&gt;, which combines the top ranked items from both approaches. Here’s Alex’s SQL query:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;--&lt;/span&gt; the sqlite-vec KNN vector search results&lt;/span&gt;
with vec_matches &lt;span class="pl-k"&gt;as&lt;/span&gt; (
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    article_id,
    row_number() over (&lt;span class="pl-k"&gt;order by&lt;/span&gt; distance) &lt;span class="pl-k"&gt;as&lt;/span&gt; rank_number,
    distance
  &lt;span class="pl-k"&gt;from&lt;/span&gt; vec_articles
  &lt;span class="pl-k"&gt;where&lt;/span&gt;
    headline_embedding match lembed(:query)
    &lt;span class="pl-k"&gt;and&lt;/span&gt; k &lt;span class="pl-k"&gt;=&lt;/span&gt; :k
),
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;--&lt;/span&gt; the FTS5 search results&lt;/span&gt;
fts_matches &lt;span class="pl-k"&gt;as&lt;/span&gt; (
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    rowid,
    row_number() over (&lt;span class="pl-k"&gt;order by&lt;/span&gt; rank) &lt;span class="pl-k"&gt;as&lt;/span&gt; rank_number,
    rank &lt;span class="pl-k"&gt;as&lt;/span&gt; score
  &lt;span class="pl-k"&gt;from&lt;/span&gt; fts_articles
  &lt;span class="pl-k"&gt;where&lt;/span&gt; headline match :query
  &lt;span class="pl-k"&gt;limit&lt;/span&gt; :k
),
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;--&lt;/span&gt; combine FTS5 + vector search results with RRF&lt;/span&gt;
final &lt;span class="pl-k"&gt;as&lt;/span&gt; (
  &lt;span class="pl-k"&gt;select&lt;/span&gt;
    &lt;span class="pl-c1"&gt;articles&lt;/span&gt;.&lt;span class="pl-c1"&gt;id&lt;/span&gt;,
    &lt;span class="pl-c1"&gt;articles&lt;/span&gt;.&lt;span class="pl-c1"&gt;headline&lt;/span&gt;,
    &lt;span class="pl-c1"&gt;vec_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rank_number&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; vec_rank,
    &lt;span class="pl-c1"&gt;fts_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rank_number&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; fts_rank,
    &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;--&lt;/span&gt; RRF algorithm&lt;/span&gt;
    (
      coalesce(&lt;span class="pl-c1"&gt;1&lt;/span&gt;.&lt;span class="pl-c1"&gt;0&lt;/span&gt; &lt;span class="pl-k"&gt;/&lt;/span&gt; (:rrf_k &lt;span class="pl-k"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;fts_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rank_number&lt;/span&gt;), &lt;span class="pl-c1"&gt;0&lt;/span&gt;.&lt;span class="pl-c1"&gt;0&lt;/span&gt;) &lt;span class="pl-k"&gt;*&lt;/span&gt; :weight_fts &lt;span class="pl-k"&gt;+&lt;/span&gt;
      coalesce(&lt;span class="pl-c1"&gt;1&lt;/span&gt;.&lt;span class="pl-c1"&gt;0&lt;/span&gt; &lt;span class="pl-k"&gt;/&lt;/span&gt; (:rrf_k &lt;span class="pl-k"&gt;+&lt;/span&gt; &lt;span class="pl-c1"&gt;vec_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rank_number&lt;/span&gt;), &lt;span class="pl-c1"&gt;0&lt;/span&gt;.&lt;span class="pl-c1"&gt;0&lt;/span&gt;) &lt;span class="pl-k"&gt;*&lt;/span&gt; :weight_vec
    ) &lt;span class="pl-k"&gt;as&lt;/span&gt; combined_rank,
    &lt;span class="pl-c1"&gt;vec_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;distance&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; vec_distance,
    &lt;span class="pl-c1"&gt;fts_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;score&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; fts_score
  &lt;span class="pl-k"&gt;from&lt;/span&gt; fts_matches
  full outer &lt;span class="pl-k"&gt;join&lt;/span&gt; vec_matches &lt;span class="pl-k"&gt;on&lt;/span&gt; &lt;span class="pl-c1"&gt;vec_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;article_id&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; &lt;span class="pl-c1"&gt;fts_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rowid&lt;/span&gt;
  &lt;span class="pl-k"&gt;join&lt;/span&gt; articles &lt;span class="pl-k"&gt;on&lt;/span&gt; &lt;span class="pl-c1"&gt;articles&lt;/span&gt;.&lt;span class="pl-c1"&gt;rowid&lt;/span&gt; &lt;span class="pl-k"&gt;=&lt;/span&gt; coalesce(&lt;span class="pl-c1"&gt;fts_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;rowid&lt;/span&gt;, &lt;span class="pl-c1"&gt;vec_matches&lt;/span&gt;.&lt;span class="pl-c1"&gt;article_id&lt;/span&gt;)
  &lt;span class="pl-k"&gt;order by&lt;/span&gt; combined_rank &lt;span class="pl-k"&gt;desc&lt;/span&gt;
)
&lt;span class="pl-k"&gt;select&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-k"&gt;from&lt;/span&gt; final;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I’ve been puzzled in the past over how to best do that because the distance scores from vector similarity and the relevance scores from FTS are meaningless in comparison to each other. RRF doesn’t even attempt to compare them - it uses them purely for &lt;code&gt;row_number()&lt;/code&gt; ranking within each set and combines the results based on that.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/full-text-search"&gt;full-text-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vector-search"&gt;vector-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rag"&gt;rag&lt;/a&gt;&lt;/p&gt;



</summary><category term="full-text-search"/><category term="search"/><category term="sql"/><category term="sqlite"/><category term="alex-garcia"/><category term="vector-search"/><category term="embeddings"/><category term="rag"/></entry><entry><title>Using sqlite-vec with embeddings in sqlite-utils and Datasette</title><link href="https://simonwillison.net/2024/Aug/11/sqlite-vec/#atom-tag" rel="alternate"/><published>2024-08-11T23:37:42+00:00</published><updated>2024-08-11T23:37:42+00:00</updated><id>https://simonwillison.net/2024/Aug/11/sqlite-vec/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/sqlite/sqlite-vec"&gt;Using sqlite-vec with embeddings in sqlite-utils and Datasette&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
My notes on trying out Alex Garcia's newly released &lt;a href="https://github.com/asg017/sqlite-vec"&gt;sqlite-vec&lt;/a&gt; SQLite extension, including how to use it with OpenAI embeddings in both &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; and &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="datasette"/><category term="sqlite-utils"/><category term="openai"/><category term="alex-garcia"/><category term="embeddings"/></entry><entry><title>Introducing sqlite-lembed: A SQLite extension for generating text embeddings locally</title><link href="https://simonwillison.net/2024/Jul/25/sqlite-lembed-rembed/#atom-tag" rel="alternate"/><published>2024-07-25T20:30:01+00:00</published><updated>2024-07-25T20:30:01+00:00</updated><id>https://simonwillison.net/2024/Jul/25/sqlite-lembed-rembed/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://alexgarcia.xyz/blog/2024/sqlite-lembed-init/index.html"&gt;Introducing sqlite-lembed: A SQLite extension for generating text embeddings locally&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia's latest SQLite extension is a C wrapper around the &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt; that exposes just its embedding support, allowing you to register a GGUF file containing an embedding model:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;INSERT INTO temp.lembed_models(name, model)
  select 'all-MiniLM-L6-v2',
  lembed_model_from_file('all-MiniLM-L6-v2.e4ce9877.q8_0.gguf');
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And then use it to calculate embeddings as part of a SQL query:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select lembed(
  'all-MiniLM-L6-v2',
  'The United States Postal Service is an independent agency...'
); -- X'A402...09C3' (1536 bytes)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;all-MiniLM-L6-v2.e4ce9877.q8_0.gguf&lt;/code&gt; here is a 24MB file, so this should run quite happily even on machines without much available RAM.&lt;/p&gt;
&lt;p&gt;What if you don't want to run the models locally at all? Alex has another new extension for that, described in &lt;strong&gt;&lt;a href="https://alexgarcia.xyz/blog/2024/sqlite-rembed-init/index.html"&gt;Introducing sqlite-rembed: A SQLite extension for generating text embeddings from remote APIs&lt;/a&gt;&lt;/strong&gt;. The &lt;code&gt;rembed&lt;/code&gt; is for remote embeddings, and this extension uses Rust to call multiple remotely-hosted embeddings APIs, registered like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;INSERT INTO temp.rembed_clients(name, options)
  VALUES ('text-embedding-3-small', 'openai');
select rembed(
  'text-embedding-3-small',
  'The United States Postal Service is an independent agency...'
); -- X'A452...01FC', Blob&amp;lt;6144 bytes&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/asg017/sqlite-rembed/blob/v0.0.1-alpha.9/src/clients.rs"&gt;the Rust code&lt;/a&gt; that implements Rust wrapper functions for HTTP JSON APIs from OpenAI, Nomic, Cohere, Jina, Mixedbread and localhost servers provided by Ollama and Llamafile.&lt;/p&gt;
&lt;p&gt;Both of these extensions are designed to complement Alex's &lt;a href="https://github.com/asg017/sqlite-vec"&gt;sqlite-vec&lt;/a&gt; extension, which is nearing a first stable release.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://data-folks.masto.host/@alexgarciaxyz/112848900983450306"&gt;@alexgarciaxyz&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/><category term="embeddings"/><category term="llama-cpp"/></entry><entry><title>sqlite-jiff</title><link href="https://simonwillison.net/2024/Jul/23/sqlite-jiff/#atom-tag" rel="alternate"/><published>2024-07-23T03:53:52+00:00</published><updated>2024-07-23T03:53:52+00:00</updated><id>https://simonwillison.net/2024/Jul/23/sqlite-jiff/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/asg017/sqlite-jiff"&gt;sqlite-jiff&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I linked to the brand new Jiff datetime library &lt;a href="https://simonwillison.net/2024/Jul/22/jiff/"&gt;yesterday&lt;/a&gt;. Alex Garcia has already used it for an experimental SQLite extension providing a timezone-aware &lt;code&gt;jiff_duration()&lt;/code&gt; function - a useful new capability since SQLite's built in date functions don't handle timezones at all.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;select jiff_duration(
  '2024-11-02T01:59:59[America/Los_Angeles]',
  '2024-11-02T02:00:01[America/New_York]',
  'minutes'
) as result; -- returns 179.966
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The implementation is &lt;a href="https://github.com/asg017/sqlite-jiff/blob/e02d625757105a68f5a64954262bd1ef8683212e/src/lib.rs"&gt;65 lines of Rust&lt;/a&gt;.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/agarcia_me/status/1815517168366485619"&gt;@agarcia_me&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/timezones"&gt;timezones&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="timezones"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>datasette-pins — a new Datasette plugin for pinning tables and queries</title><link href="https://simonwillison.net/2024/May/9/datasette-pins/#atom-tag" rel="alternate"/><published>2024-05-09T18:29:03+00:00</published><updated>2024-05-09T18:29:03+00:00</updated><id>https://simonwillison.net/2024/May/9/datasette-pins/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2024/datasette-pins/"&gt;datasette-pins — a new Datasette plugin for pinning tables and queries&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia built this plugin for Datasette Cloud, and as with almost every Datasette Cloud features we're releasing it as &lt;a href="https://github.com/datasette/datasette-pins"&gt;an open source package&lt;/a&gt; as well.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;datasette-pins&lt;/code&gt; allows users with the right permission to "pin" tables, databases and queries to their homepage. It's a lightweight way to customize that homepage, especially useful as your Datasette instance grows to host dozens or even hundreds of tables.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="datasette"/><category term="datasette-cloud"/><category term="alex-garcia"/></entry><entry><title>I'm writing a new vector search SQLite Extension</title><link href="https://simonwillison.net/2024/May/3/sqlite-vec/#atom-tag" rel="alternate"/><published>2024-05-03T03:16:39+00:00</published><updated>2024-05-03T03:16:39+00:00</updated><id>https://simonwillison.net/2024/May/3/sqlite-vec/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://alexgarcia.xyz/blog/2024/building-new-vector-search-sqlite/index.html"&gt;I&amp;#x27;m writing a new vector search SQLite Extension&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia is working on &lt;code&gt;sqlite-vec&lt;/code&gt;, a spiritual successor to his &lt;code&gt;sqlite-vss&lt;/code&gt; project. The new SQLite C extension will have zero other dependencies (&lt;code&gt;sqlite-vss&lt;/code&gt; used some tricky C++ libraries) and will work using virtual tables, storing chunks of vectors in shadow tables to avoid needing to load everything into memory at once.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vectors"&gt;vectors&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vector-search"&gt;vector-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="sqlite"/><category term="vectors"/><category term="alex-garcia"/><category term="vector-search"/><category term="embeddings"/></entry><entry><title>datasette-plot - a new Datasette Plugin for building data visualizations</title><link href="https://simonwillison.net/2023/Dec/31/datasette-plot/#atom-tag" rel="alternate"/><published>2023-12-31T05:04:19+00:00</published><updated>2023-12-31T05:04:19+00:00</updated><id>https://simonwillison.net/2023/Dec/31/datasette-plot/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2023/datasette-plot/"&gt;datasette-plot - a new Datasette Plugin for building data visualizations&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I forgot to link to this here last week: Alex Garcia released the first version of datasette-plot, a brand new Datasette visualization plugin built on top of the Observable Plot charting library. We plan to use this as the new, updated alternative to my older datasette-vega plugin.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/visualization"&gt;visualization&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable-plot"&gt;observable-plot&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="visualization"/><category term="datasette"/><category term="observable"/><category term="alex-garcia"/><category term="observable-plot"/></entry><entry><title>Annotate and explore your data with datasette-comments</title><link href="https://simonwillison.net/2023/Nov/30/datasette-comments/#atom-tag" rel="alternate"/><published>2023-11-30T21:59:54+00:00</published><updated>2023-11-30T21:59:54+00:00</updated><id>https://simonwillison.net/2023/Nov/30/datasette-comments/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2023/datasette-comments/"&gt;Annotate and explore your data with datasette-comments&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
New plugin for Datasette and Datasette Cloud: datasette-comments, providing tools for collaborating on data exploration with a team through posting comments on individual rows of data.&lt;/p&gt;

&lt;p&gt;Alex Garcia built this for Datasette Cloud but as with almost all of our work there it’s also available as an open source Python package.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/collaboration"&gt;collaboration&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="collaboration"/><category term="projects"/><category term="datasette"/><category term="datasette-cloud"/><category term="alex-garcia"/></entry><entry><title>Weeknotes: the Datasette Cloud API, a podcast appearance and more</title><link href="https://simonwillison.net/2023/Oct/1/datasette-cloud-api/#atom-tag" rel="alternate"/><published>2023-10-01T00:03:53+00:00</published><updated>2023-10-01T00:03:53+00:00</updated><id>https://simonwillison.net/2023/Oct/1/datasette-cloud-api/#atom-tag</id><summary type="html">
    &lt;p&gt;Datasette Cloud now has a documented API, plus a podcast appearance, some LLM plugins work and some geospatial excitement.&lt;/p&gt;
&lt;h4 id="the-datasette-cloud-api"&gt;The Datasette Cloud API&lt;/h4&gt;
&lt;p&gt;My biggest achievement this week is that I documented and announced the API for &lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I wrote about this at length in &lt;a href="https://www.datasette.cloud/blog/2023/datasette-cloud-api/"&gt;Getting started with the Datasette Cloud API&lt;/a&gt; on the Datasette Cloud blog. I also used this as an opportunity to start a documentation site for the service, now available at &lt;a href="https://www.datasette.cloud/docs/"&gt;datasette.cloud/docs&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The API is effectively the Datasette 1.0 alpha write API, &lt;a href="https://simonwillison.net/2022/Dec/2/datasette-write-api/"&gt;described here previously&lt;/a&gt;. You can use the API to both read and write data to a Datasette Cloud space, with finely-grained permissions (powered by the &lt;a href="https://datasette.io/plugins/datasette-auth-tokens"&gt;datasette-auth-tokens&lt;/a&gt; plugin) so you can create tokens that are restricted to actions just against specified tables.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://www.datasette.cloud/blog/2023/datasette-cloud-api/"&gt;blog entry&lt;/a&gt; about it doubles as a tutorial, describing how I wrote code to import the latest documents from the US Government &lt;a href="https://www.federalregister.gov/"&gt;Federal Register&lt;/a&gt; into a Datasette Cloud space, using a dependency-free Python script and GitHub Actions.&lt;/p&gt;
&lt;p&gt;You can see that code in the new &lt;a href="https://github.com/simonw/federal-register-to-datasette"&gt;federal-register-to-datasette&lt;/a&gt; GitHub repository. It's pretty small - just 70 lines of Python and 22 of YAML.&lt;/p&gt;
&lt;p&gt;The more time I spend writing code against the Datasette API the more confident I get that it's shaped in the right way. I'm happy to consider it stable for the 1.0 release now.&lt;/p&gt;
&lt;h4 id="talking-large-language-models-with-rooftop-ruby"&gt;Talking Large Language Models with Rooftop Ruby&lt;/h4&gt;
&lt;p&gt;I recorded a podcast episode this week for &lt;a href="https://www.rooftopruby.com/2108545/13676934-26-large-language-models-with-simon-willison"&gt;Rooftop Ruby&lt;/a&gt; with Collin Donnell and Joel Drapper. It was a &lt;em&gt;really&lt;/em&gt; high quality conversation - we went for about an hour and 20 minutes and covered a huge amount of ground.&lt;/p&gt;
&lt;p&gt;After the podcast came out I took the MP3, ran it through &lt;a href="https://goodsnooze.gumroad.com/l/macwhisper"&gt;MacWhisper&lt;/a&gt; and then spent several hours marking up speakers and editing the resulting text. I also added headings corresponding to the different topics we covered, along with inline links to other relevant material.&lt;/p&gt;
&lt;p&gt;I'm really pleased with the resulting document, which you can find at &lt;a href="https://simonwillison.net/2023/Sep/29/llms-podcast/"&gt;Talking Large Language Models with Rooftop Ruby&lt;/a&gt;. It was quite a bit of work but I think it was worthwhile - I've since been able to answer some questions about LLMs &lt;a href="https://fedi.simonwillison.net/@simon/111154892998909354"&gt;on Mastodon&lt;/a&gt; and Twitter by linking directly to the point within the transcript that discussed those points.&lt;/p&gt;
&lt;p&gt;I also dropped in my own audio player, &lt;a href="https://chat.openai.com/share/4ea13846-6292-4412-97e5-57400279c6c7"&gt;developed with GPT-4 assistance&lt;/a&gt;, and provided links from the different transcript sessions that would jump the audio to that point in the conversation.&lt;/p&gt;
&lt;p&gt;Also this week: while closing a bunch of VS Code tabs I stumbled across a partially written blog entry about &lt;a href="https://simonwillison.net/2023/Sep/30/cli-tools-python/"&gt;Things I've learned about building CLI tools in Python&lt;/a&gt;, so I finished that up and published it.&lt;/p&gt;
&lt;p&gt;I'm trying to leave less unfinished projects lying around on my computer, so if something is 90% finished I'll try to wrap it up and put it out there to get it off my ever-expanding plate.&lt;/p&gt;
&lt;h4 id="llm-llama-cpp"&gt;llm-llama-cpp&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; has started to collect a small but healthy community on Discord, which is really exciting.&lt;/p&gt;
&lt;p&gt;My absolute favourite community project so far is Drew Breunig's Facet Finder, which he described in &lt;a href="https://www.dbreunig.com/2023/09/26/faucet-finder.html"&gt;Finding Bathroom Faucets with Embeddings&lt;/a&gt;. He used &lt;a href="https://github.com/simonw/llm-clip"&gt;llm-clip&lt;/a&gt; to calculate embeddings for 20,000 pictures of faucets, then ran both similarity and text search against them to help renovate his bathroom. It's really fun!&lt;/p&gt;
&lt;p&gt;I shipped a new version of the &lt;a href="https://github.com/simonw/llm-llama-cpp"&gt;llm-llama-cpp&lt;/a&gt; plugin this week which was mostly written by other people: &lt;a href="https://github.com/simonw/llm-llama-cpp/releases/tag/0.2b1"&gt;llm-llama-cpp 0.2b1&lt;/a&gt;. Alexis Métaireau and LoopControl submitted fixes to extend the default max token limit (fixing a frustrating issue with truncated responses) and to allow for increasing the number of GPU layers used to run the models.&lt;/p&gt;
&lt;p&gt;I also shipped &lt;a href="https://github.com/simonw/llm/releases/tag/0.11"&gt;LLM 0.11&lt;/a&gt;, the main feature of which was support for the new OpenAI &lt;code&gt;gpt-3.5-turbo-instruct&lt;/code&gt; model. I really need to split the OpenAI support out into a separate plugin so I can ship fixes to that without having to release the core LLM package.&lt;/p&gt;
&lt;p&gt;And I put together an &lt;a href="https://github.com/simonw/llm-plugin"&gt;llm-plugin&lt;/a&gt; cookiecutter template, which I plan to use for all of my plugins going forward.&lt;/p&gt;
&lt;h4 id="getting-excited-about-tg-and-sqlite-tg"&gt;Getting excited about TG and sqlite-tg&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/tidwall/tg"&gt;TG&lt;/a&gt; is a brand new C library from &lt;a href="https://github.com/tidwall/tile38"&gt;Tile38&lt;/a&gt; creator Josh Baker. It's &lt;em&gt;really&lt;/em&gt; exciting: it provides a set of fast geospatial operations - the exact subset I usually find myself needing, based around polygon intersections, GeoJSON, WKT, WKB and geospatial indexes - implemented with zero external dependencies. It's shipped as a single C file, reminiscent of the SQLite amalgamation.&lt;/p&gt;
&lt;p&gt;I noted in a few places that it could make a great SQLite extension... and Alex Garcia fell victim to my blatant &lt;a href="https://xkcd.com/356/"&gt;nerd-sniping&lt;/a&gt; and built the first version of &lt;a href="https://github.com/asg017/sqlite-tg"&gt;sqlite-tg&lt;/a&gt; within 24 hours!&lt;/p&gt;
&lt;p&gt;I wrote about my own explorations of Alex's work in &lt;a href="https://til.simonwillison.net/sqlite/sqlite-tg"&gt;Geospatial SQL queries in SQLite using TG, sqlite-tg and datasette-sqlite-tg&lt;/a&gt;. I'm thrilled at the idea of having a tiny, lightweight alternative to SpatiaLite as an addition to the Datasette ecosystem, and the SQLite world in general.&lt;/p&gt;
&lt;h4 id="two-tiny-datasette-releases"&gt;Two tiny Datasette releases&lt;/h4&gt;
&lt;p&gt;I released dot-releases for Datasette:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.datasette.io/en/1.0a7/changelog.html#a7-2023-09-21"&gt;datasette 1.0a7&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.datasette.io/en/stable/changelog.html#v0-64-4"&gt;datasette 0.64.4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Both of these feature the same fix, described in &lt;a href="https://github.com/simonw/datasette/issues/2189"&gt;Issue 2189: Server hang on parallel execution of queries to named in-memory databases&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Short version: it turns out the experimental work I did a while ago to try running SQL queries in parallel was causing threading deadlock issues against in-memory named SQLite databases. No-one had noticed because those are only available within Datasette plugins, but I'd started to experience them as I started writing my own plugins that used that feature.&lt;/p&gt;
&lt;h4 id="chatgpt-in-the-newsroom"&gt;ChatGPT in the newsroom&lt;/h4&gt;
&lt;p&gt;I signed up for a MOOC (Massive Open Online Courses) about journalism and ChatGPT!&lt;/p&gt;
&lt;p&gt;&lt;a href="https://journalismcourses.org/course/how-to-use-chatgpt-and-other-generative-ai-tools-in-your-newsrooms/"&gt;How to use ChatGPT and other generative AI tools in your newsrooms
&lt;/a&gt; is being taught by Aimee Rinehart and Sil Hamilton for the Knight Center.&lt;/p&gt;
&lt;p&gt;I actually found out about it because people were being snarky about it on Twitter. That's not a big surprise - there are many obvious problems with applying generative AI to journalism.&lt;/p&gt;
&lt;p&gt;As you would hope, this course is not a hype-filled pitch for writing AI-generated news stories. It's a conversation between literally thousands of journalists around the world about the ethical and practical implications of this technology.&lt;/p&gt;
&lt;p&gt;I'm really enjoying it. I'm learning a huge amount about how people experience AI tools, the kinds of questions they have about them and the kinds of journalism problems that make sense for them to solve.&lt;/p&gt;
&lt;h4 id="releases-this-week"&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-remote-actors/releases/tag/0.1a2"&gt;datasette-remote-actors 0.1a2&lt;/a&gt;&lt;/strong&gt; - 2023-09-28&lt;br /&gt;Datasette plugin for fetching details of actors from a remote endpoint&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-llama-cpp/releases/tag/0.2b1"&gt;llm-llama-cpp 0.2b1&lt;/a&gt;&lt;/strong&gt; - 2023-09-28&lt;br /&gt;LLM plugin for running models using llama.cpp&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-tokens/releases/tag/0.4a4"&gt;datasette-auth-tokens 0.4a4&lt;/a&gt;&lt;/strong&gt; - 2023-09-26&lt;br /&gt;Datasette plugin for authenticating access using API tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette/releases/tag/1.0a7"&gt;datasette 1.0a7&lt;/a&gt;&lt;/strong&gt; - 2023-09-21&lt;br /&gt;An open source multi-tool for exploring and publishing data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-upload-dbs/releases/tag/0.3.1"&gt;datasette-upload-dbs 0.3.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-20&lt;br /&gt;Upload SQLite database files to Datasette&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-mask-columns/releases/tag/0.2.2"&gt;datasette-mask-columns 0.2.2&lt;/a&gt;&lt;/strong&gt; - 2023-09-20&lt;br /&gt;Datasette plugin that masks specified database columns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm/releases/tag/0.11"&gt;llm 0.11&lt;/a&gt;&lt;/strong&gt; - 2023-09-19&lt;br /&gt;Access large language models from the command-line&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="til-this-week"&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/css/resizing-textarea"&gt;Understanding the CSS auto-resizing textarea trick&lt;/a&gt; - 2023-09-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/pytest/syrupy"&gt;Snapshot testing with Syrupy&lt;/a&gt; - 2023-09-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sqlite/sqlite-tg"&gt;Geospatial SQL queries in SQLite using TG, sqlite-tg and datasette-sqlite-tg&lt;/a&gt; - 2023-09-25&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/machinelearning/musicgen"&gt;Trying out the facebook/musicgen-small sound generation model&lt;/a&gt; - 2023-09-23&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/journalism"&gt;journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="journalism"/><category term="projects"/><category term="sqlite"/><category term="ai"/><category term="datasette"/><category term="weeknotes"/><category term="datasette-cloud"/><category term="alex-garcia"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>Geospatial SQL queries in SQLite using TG, sqlite-tg and datasette-sqlite-tg</title><link href="https://simonwillison.net/2023/Sep/25/tg-sqlite/#atom-tag" rel="alternate"/><published>2023-09-25T19:45:03+00:00</published><updated>2023-09-25T19:45:03+00:00</updated><id>https://simonwillison.net/2023/Sep/25/tg-sqlite/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/sqlite/sqlite-tg"&gt;Geospatial SQL queries in SQLite using TG, sqlite-tg and datasette-sqlite-tg&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia built sqlite-tg—a SQLite extension that uses the brand new TG geospatial library to provide a whole suite of custom SQL functions for working with geospatial data.&lt;/p&gt;

&lt;p&gt;Here are my notes on trying out his initial alpha releases. The extension already provides tools for converting between GeoJSON, WKT and WKB, plus the all important tg_intersects() function for testing if a polygon or point overlap each other.&lt;/p&gt;

&lt;p&gt;It’s pretty useful already. Without any geospatial indexing at all I was still able to get 700ms replies to a brute-force point-in-polygon query against 150MB of GeoJSON timezone boundaries stored as JSON text in a table.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geojson"&gt;geojson&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tg"&gt;tg&lt;/a&gt;&lt;/p&gt;



</summary><category term="geospatial"/><category term="sqlite"/><category term="geojson"/><category term="datasette"/><category term="alex-garcia"/><category term="tg"/></entry><entry><title>Weeknotes: Embeddings, more embeddings and Datasette Cloud</title><link href="https://simonwillison.net/2023/Sep/17/weeknotes-embeddings/#atom-tag" rel="alternate"/><published>2023-09-17T05:10:13+00:00</published><updated>2023-09-17T05:10:13+00:00</updated><id>https://simonwillison.net/2023/Sep/17/weeknotes-embeddings/#atom-tag</id><summary type="html">
    &lt;p&gt;Since my &lt;a href="https://simonwillison.net/2023/Aug/30/datasette-plus-weeknotes/"&gt;last weeknotes&lt;/a&gt;, a flurry of activity. LLM has embeddings support now, and Datasette Cloud has driven some major improvements to the wider Datasette ecosystem.&lt;/p&gt;
&lt;h4 id="embeddings-in-llm"&gt;Embeddings in LLM&lt;/h4&gt;
&lt;p&gt;LLM gained embedding support in version 0.9, and then got binary embedding support (for CLIP) in version 0.10. I wrote about those releases in detail in:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2023/Sep/4/llm-embeddings/"&gt;LLM now provides tools for working with embeddings&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonwillison.net/2023/Sep/12/llm-clip-and-chat/"&gt;Build an image search engine with llm-clip, chat with models with llm chat&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Embeddings are a fascinating tool. If you haven't got your head around them yet the &lt;a href="https://simonwillison.net/2023/Sep/4/llm-embeddings/"&gt;first of my blog entries&lt;/a&gt; tries to explain why they are so interesting.&lt;/p&gt;
&lt;p&gt;There's a lot more I want to built on top of embeddings - most notably, LLM (or Datasette, or likely a combination of the two) will be growing support for Retrieval Augmented Generation on top of the LLM embedding mechanism.&lt;/p&gt;
&lt;h4 id="annotated-releases"&gt;Annotated releases&lt;/h4&gt;
&lt;p&gt;I always include a list of new releases in my weeknotes. This time I'm going to use those to illustrate the themes I've been working on.&lt;/p&gt;
&lt;p&gt;The first group of release relates to LLM and its embedding support. LLM 0.10 extended that support:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm/releases/tag/0.10"&gt;llm 0.10&lt;/a&gt;&lt;/strong&gt; - 2023-09-12&lt;br /&gt;Access large language models from the command-line&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Embedding models can now be &lt;a href="https://llm.datasette.io/en/stable/embeddings/writing-plugins.html"&gt;built as LLM plugins&lt;/a&gt;. I've released two of those so far:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-sentence-transformers/releases/tag/0.1.2"&gt;llm-sentence-transformers 0.1.2&lt;/a&gt;&lt;/strong&gt; - 2023-09-13&lt;br /&gt;LLM plugin for embeddings using sentence-transformers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-clip/releases/tag/0.1"&gt;llm-clip 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-12&lt;br /&gt;Generate embeddings for images and text using CLIP with LLM&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The CLIP one is particularly fun, because it genuinely allows you to build a sophisticated image search engine that runs entirely on your own computer!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/symbex/releases/tag/1.4"&gt;symbex 1.4&lt;/a&gt;&lt;/strong&gt; - 2023-09-05&lt;br /&gt;Find the Python code for specified symbols&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Symbex is my tool for extracting symbols - functions, methods and classes - from Python code. I introduced that in &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;Symbex: search Python code for functions and classes, then pipe them into a LLM&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Symbex 1.4 adds a tiny but impactful feature: it can now output a list of symbols as JSON, CSV or TSV. These output formats are designed to be compatible with the new &lt;a href="https://llm.datasette.io/en/stable/embeddings/cli.html#embedding-data-from-a-csv-tsv-or-json-file"&gt;llm embed-multi&lt;/a&gt; command, which means you can easily create embeddings for all of your functions:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;symbex &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;*&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;*:*&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --nl &lt;span class="pl-k"&gt;|&lt;/span&gt; \
  llm embed-multi symbols - \
  --format nl --database embeddings.db --store&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I haven't fully explored what this enables yet, but it should mean that both related functions and semantic function search ("Find my a function that downloads a CSV") are now easy to build.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-cluster/releases/tag/0.2"&gt;llm-cluster 0.2&lt;/a&gt;&lt;/strong&gt; - 2023-09-04&lt;br /&gt;LLM plugin for clustering embeddings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Yet another thing you can do with embeddings is use them to find clusters of related items.&lt;/p&gt;
&lt;p&gt;The neatest feature of &lt;code&gt;llm-cluster&lt;/code&gt; is that you can ask it to generate names for these clusters by sending the names of the items in each cluster through another language model, something like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm cluster issues 10 \
  -d issues.db \
  --summary \
  --prompt &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Short, concise title for this cluster of related documents&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One last embedding related project: &lt;code&gt;datasette-llm-embed&lt;/code&gt; is a tiny plugin that adds a &lt;code&gt;select llm_embed('sentence-transformers/all-mpnet-base-v2', 'This is some text')&lt;/code&gt; SQL function. I built it to support quickly prototyping embedding-related ideas in Datasette.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-llm-embed/releases/tag/0.1a0"&gt;datasette-llm-embed 0.1a0&lt;/a&gt;&lt;/strong&gt; - 2023-09-08&lt;br /&gt;Datasette plugin adding a llm_embed(model_id, text) SQL function&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Spending time with embedding models has lead me to spend more time with Hugging Face. I realized last week that the Hugging Face &lt;a href="https://huggingface.co/models?sort=downloads"&gt;all models sorted by downloads&lt;/a&gt; page doubles as a list of the models that are most likely to be easy to use.&lt;/p&gt;
&lt;p&gt;One of the models I tried out was &lt;a href="https://huggingface.co/Salesforce/blip-image-captioning-base"&gt;Salesforce BLIP&lt;/a&gt;, an astonishing model that can genuinely produce usable captions for images.&lt;/p&gt;
&lt;p&gt;It's really easy to work with. I ended up building this tiny little CLI tool that wraps the model:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/blip-caption/releases/tag/0.1"&gt;blip-caption 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-10&lt;br /&gt;Generate captions for images with Salesforce BLIP&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="releases-datasette-cloud"&gt;Releases driven by Datasette Cloud&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; continues to drive improvements to the wider Datasette ecosystem as a whole.&lt;/p&gt;
&lt;p&gt;It runs on the latest Datasette 1.0 alpha series, taking advantage of &lt;a href="https://simonwillison.net/2022/Dec/2/datasette-write-api/"&gt;the JSON write API&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This also means that it's been highlighting breaking changes in 1.0 that have caused old plugins to break, either subtly or completely.&lt;/p&gt;
&lt;p&gt;This has driven a bunch of new plugin releases. Some of these are compatible with both 0.x and 1.x - the ones that only work with the 1.x alphas are themselves marked as alpha releases.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-export-notebook/releases/tag/1.0.1"&gt;datasette-export-notebook 1.0.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-15&lt;br /&gt;Datasette plugin providing instructions for exporting data to Jupyter or Observable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-cluster-map/releases/tag/0.18a0"&gt;datasette-cluster-map 0.18a0&lt;/a&gt;&lt;/strong&gt; - 2023-09-11&lt;br /&gt;Datasette plugin that shows a map for any data with latitude/longitude columns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-graphql/releases/tag/3.0a0"&gt;datasette-graphql 3.0a0&lt;/a&gt;&lt;/strong&gt; - 2023-09-07&lt;br /&gt;Datasette plugin providing an automatic GraphQL API for your SQLite databases&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Datasette Cloud's API works using database-backed access tokens, to ensure users can revoke tokens if they need to (something that's not easily done with purely signed tokens) and that each token can record when it was most recently used.&lt;/p&gt;
&lt;p&gt;I've been building that into the existing &lt;code&gt;datasette-auth-tokens&lt;/code&gt; plugin:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-tokens/releases/tag/0.4a3"&gt;datasette-auth-tokens 0.4a3&lt;/a&gt;&lt;/strong&gt; - 2023-08-31&lt;br /&gt;Datasette plugin for authenticating access using API tokens&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://alexgarcia.xyz/"&gt;Alex Garcia&lt;/a&gt; has been working with me building out features for Datasette Cloud, generously sponsored by &lt;a href="https://fly.io/"&gt;Fly.io&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We're beginning to build out social features for Datasette Cloud - feature that will help teams privately collaborate on data investigations together.&lt;/p&gt;
&lt;p&gt;Alex has been building &lt;a href="https://github.com/datasette/datasette-short-links"&gt;datasette-short-links&lt;/a&gt; as an experimental link shortener. In building that, we realized that we needed a mechanism for resolving actor IDs displayed in a list (e.g. this link created by X) to their actual names.&lt;/p&gt;
&lt;p&gt;Datasette doesn't dictate the shape of &lt;a href="https://docs.datasette.io/en/stable/authentication.html#actors"&gt;actor&lt;/a&gt; representations, and there's no guarantee that actors would be represented in a predictable table.&lt;/p&gt;
&lt;p&gt;So... we needed a new plugin hook. I released Datasette 1.06a with a new hook, &lt;a href="https://docs.datasette.io/en/1.0a6/plugin_hooks.html#actors-from-ids-datasette-actor-ids"&gt;actors_from_ids(actor_ids)&lt;/a&gt;, which can be used to answer the question "who are the actors represented by these IDs".&lt;/p&gt;
&lt;p&gt;Alex is using this in &lt;code&gt;datasette-short-links&lt;/code&gt;, and I built two plugins to work with the new hook as well:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette/releases/tag/1.0a6"&gt;datasette 1.0a6&lt;/a&gt;&lt;/strong&gt; - 2023-09-08&lt;br /&gt;An open source multi-tool for exploring and publishing data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-debug-actors-from-ids/releases/tag/0.1a1"&gt;datasette-debug-actors-from-ids 0.1a1&lt;/a&gt;&lt;/strong&gt; - 2023-09-08&lt;br /&gt;Datasette plugin for trying out the actors_from_ids hook&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/datasette/datasette-remote-actors/releases/tag/0.1a1"&gt;datasette-remote-actors 0.1a1&lt;/a&gt;&lt;/strong&gt; - 2023-09-08&lt;br /&gt;Datasette plugin for fetching details of actors from a remote endpoint&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Datasette Cloud lets users insert, edit and delete rows from their tables, using the plugin Alex built called &lt;a href="https://github.com/datasette/datasette-write-ui"&gt;datasette-write-ui&lt;/a&gt; which he &lt;a href="https://www.datasette.cloud/blog/2023/datasette-write-ui/"&gt;introduced on the Datasette Cloud blog&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This inspired me to finally put out a fresh release of &lt;a href="https://github.com/simonw/datasette-edit-schema"&gt;datasette-edit-schema&lt;/a&gt; - the plugin which provides the ability to edit table schemas - adding and removing columns, changing column types, even altering the order columns are stored in the table.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/datasette-edit-schema/releases/tag/0.6"&gt;datasette-edit-schema 0.6&lt;/a&gt; is a major release, with three significant new features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You can now create a brand new table from scratch!&lt;/li&gt;
&lt;li&gt;You can edit the table's primary key&lt;/li&gt;
&lt;li&gt;You can modify the foreign key constraints on the table&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Those last two became important when I realized that Datasette's API is much more interesting if there are foreign key relationships to follow.&lt;/p&gt;
&lt;p&gt;Combine that with &lt;code&gt;datasette-write-ui&lt;/code&gt; and Datasette Cloud now has a full set of features for building, populating and editing tables - backed by a comprehensive JSON API.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-migrate/releases/tag/0.1a2"&gt;sqlite-migrate 0.1a2&lt;/a&gt;&lt;/strong&gt; - 2023-09-03&lt;br /&gt;A simple database migration system for SQLite, based on sqlite-utils&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/sqlite-migrate"&gt;sqlite-migrate&lt;/a&gt; is still marked as an alpha, but won't be for much longer: it's my attempt at a migration system for SQLite, inspired by &lt;a href="https://docs.djangoproject.com/en/4.2/topics/migrations/"&gt;Django migrations&lt;/a&gt; but with a less sophisticated set of features.&lt;/p&gt;
&lt;p&gt;I'm using it in LLM now to manage the schema used to store embeddings, and it's beginning to show up in some Datasette plugins as well. I'll be promoting this to non-alpha status pretty soon.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.35.1"&gt;sqlite-utils 3.35.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-09&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A tiny fix in this, which with hindsight was less impactful than I thought.&lt;/p&gt;
&lt;p&gt;I spotted a bug on Datasette Cloud when I configured full-text search on a column, then edited the schema and found that searches no longer returned the correct results.&lt;/p&gt;
&lt;p&gt;It turned out the &lt;code&gt;rowid&lt;/code&gt; column in SQLite was being rewritten by calls to the &lt;code&gt;sqlite-utils&lt;/code&gt; &lt;a href="https://sqlite-utils.datasette.io/en/stable/python-api.html#transforming-a-table"&gt;table.transform()&lt;/a&gt; method. FTS records are related to their underlying row by &lt;code&gt;rowid&lt;/code&gt;, so this was breaking search!&lt;/p&gt;
&lt;p&gt;I pushed out &lt;a href="https://github.com/simonw/sqlite-utils/issues/592"&gt;a fix for this&lt;/a&gt; in 3.35.1. But then... I learned that &lt;code&gt;rowid&lt;/code&gt; in SQLite has always been unstable - they are rewritten any time someone VACUUMs a table!&lt;/p&gt;
&lt;p&gt;I've been designing future features for Datasette that assume that &lt;code&gt;rowid&lt;/code&gt; is a useful stable identifier for a row. This clearly isn't going to work! I'm still thinking through the consequences of it, but I think there may be Datasette features (like the ability to comment on a row) that will only work for tables with a proper foreign key.&lt;/p&gt;
&lt;h4 id="sqlite-chronicle"&gt;sqlite-chronicle&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-chronicle/releases/tag/0.1"&gt;sqlite-chronicle 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-11&lt;br /&gt;Use triggers to track when rows in a SQLite table were updated or deleted&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is very early, but I'm excited about the direction it's going in.&lt;/p&gt;
&lt;p&gt;I keep on finding problems where I want to be able to synchronize various processes with the data in a table.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://simonwillison.net/2023/Apr/15/sqlite-history/"&gt;sqlite-history&lt;/a&gt; a few months ago, which uses SQLite triggers to create a full copy of the updated data every time a row in a table is edited.&lt;/p&gt;
&lt;p&gt;That's a pretty heavy-weight solution. What if there was something lighter that could achieve a lot of the same goals?&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sqlite-chronicle&lt;/code&gt; uses triggers to instead create what I'm calling a "chronicle table". This is a shadow table that records, for every row in the main table, four integer values:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;added_ms&lt;/code&gt; - the timestamp in milliseconds when the row was added&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;updated_ms&lt;/code&gt; - the timestamp in milliseconds when the row was last updated&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;version&lt;/code&gt; - a constantly incrementing version number, global across the entire table&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;deleted&lt;/code&gt; - set to &lt;code&gt;1&lt;/code&gt; if the row has been deleted&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Just storing four integers (plus copies of the primary key) makes this a pretty tiny table, and hopefully one that's cheap to update via triggers.&lt;/p&gt;
&lt;p&gt;But... having this table enables some pretty interesting things - because external processes can track the last version number that they saw and use it to see just which rows have been inserted and updated since that point.&lt;/p&gt;
&lt;p&gt;I gave a talk at DjangoCon a few years ago called &lt;a href="https://2017.djangocon.us/talks/the-denormalized-query-engine-design-pattern/"&gt;the denormalized query engine pattern&lt;/a&gt;, describing the challenge of syncing an external search index like Elasticsearch with data held in a relational database.&lt;/p&gt;
&lt;p&gt;These chronicle tables can solve that problem, and can be applied to a whole host of other problems too. So far I'm thinking about the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Publishing SQLite databases up to Datasette, sending only the rows that have changed since the last sync. I &lt;a href="https://github.com/simonw/sqlite-chronicle/issues/2#issuecomment-1721557623"&gt;wrote a prototype that does this&lt;/a&gt; and it seems to work very well.&lt;/li&gt;
&lt;li&gt;Copying a table from Datasette Cloud to other places - a desktop copy, or another instance, or even into an alternative database such as PostgreSQL or MySQL, in a way that only copies and deletes rows that have changed.&lt;/li&gt;
&lt;li&gt;Saved search alerts: run a SQL query against just rows that were modified since the last time that query ran, then send alerts if any rows are matched.&lt;/li&gt;
&lt;li&gt;Showing users a note that "34 rows in this table have changed since your last visit", then displaying those rows.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I'm sure there are many more applications for this. I'm looking forward to finding out what they are!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils-move-tables/releases/tag/0.1"&gt;sqlite-utils-move-tables 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-09-01&lt;br /&gt;sqlite-utils plugin adding a move-tables command&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I needed to fix a bug in Datasette Cloud by moving a table from one database to another... so I built a little plugin for &lt;code&gt;sqlite-utils&lt;/code&gt; that adds a &lt;code&gt;sqlite-utils move-tables origin.db destination.db tablename&lt;/code&gt; command. I love being able to build single-use features &lt;a href="https://simonwillison.net/2023/Jul/24/sqlite-utils-plugins/"&gt;as plugins like this&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="and-some-tils"&gt;And some TILs&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/llms/embed-paragraphs"&gt;Embedding paragraphs from my blog with E5-large-v2&lt;/a&gt; - 2023-09-08&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This was a fun TIL exercising the new embeddings feature in LLM. I used &lt;a href="https://django-sql-dashboard.datasette.io/"&gt;Django SQL Dashboard&lt;/a&gt;to break up my blog entries into paragraphs and exported those as CSV which could then be piped into &lt;code&gt;llm embed-multi&lt;/code&gt;, then used that to build a CLI-driven semantic search engine for my blog.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/llms/llama-cpp-python-grammars"&gt;Using llama-cpp-python grammars to generate JSON&lt;/a&gt; - 2023-09-13&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;llama-cpp&lt;/code&gt; has grammars now, which enable you to control the exact output format of the LLM. I'm optimistic that these could be used to implement an equivalent to &lt;a href="https://openai.com/blog/function-calling-and-other-api-updates"&gt;OpenAI Functions&lt;/a&gt; on top of Llama 2 and similar models. So far I've just got them to output arrays of JSON objects.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/llms/claude-hacker-news-themes"&gt;Summarizing Hacker News discussion themes with Claude and LLM&lt;/a&gt; - 2023-09-09&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I'm using this trick a lot at the moment. I have API access to &lt;a href="https://claude.ai/"&gt;Claude&lt;/a&gt; now, which has a 100,000 token context limit (GPT-4 is just 8,000 by default). That's enough to summarize 100+ comment threads from Hacker News, for which I'm now using this prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Summarize the themes of the opinions expressed here, including quotes (with author attribution) where appropriate.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The quotes part has been working really well - it turns out summaries of themes with illustrative quotes are much more interesting, and so far my spot checks haven't found any that were hallucinated.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sqlite/cr-sqlite-macos"&gt;Trying out cr-sqlite on macOS&lt;/a&gt; - 2023-09-13&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="https://github.com/vlcn-io/cr-sqlite"&gt;cr-sqlite&lt;/a&gt; adds full CRDTs to SQLite, which should enable multiple databases to accept writes independently and then seamlessly merge them together. It's a very exciting capability!&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/datasette/hugging-face-spaces"&gt;Running Datasette on Hugging Face Spaces&lt;/a&gt; - 2023-09-08&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It turns out Hugging Faces offer free scale-to-zero hosting for demos that run in Docker containers on machines with a full 16GB of RAM! I'm used to optimizing Datasette for tiny 256MB containers, so having this much memory available is a real treat.&lt;/p&gt;
&lt;p&gt;And the rest:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/google/json-api-programmable-search-engine"&gt;Limited JSON API for Google searches using Programmable Search Engine&lt;/a&gt; - 2023-09-17&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/github-actions/running-tests-against-multiple-verisons-of-dependencies"&gt;Running tests against multiple versions of a Python dependency in GitHub Actions&lt;/a&gt; - 2023-09-15&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/datasette/remember-to-commit"&gt;Remember to commit when using datasette.execute_write_fn()&lt;/a&gt; - 2023-08-31&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="datasette-cloud"/><category term="sqlite-utils"/><category term="alex-garcia"/><category term="embeddings"/><category term="llm"/></entry><entry><title>Introducing datasette-litestream: easy replication for SQLite databases in Datasette</title><link href="https://simonwillison.net/2023/Sep/13/datasette-litestream/#atom-tag" rel="alternate"/><published>2023-09-13T19:28:37+00:00</published><updated>2023-09-13T19:28:37+00:00</updated><id>https://simonwillison.net/2023/Sep/13/datasette-litestream/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2023/datasette-litestream/"&gt;Introducing datasette-litestream: easy replication for SQLite databases in Datasette&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
We use Litestream on Datasette Cloud for streaming backups of user data to S3. Alex Garcia extracted out our implementation into a standalone Datasette plugin, which bundles the Litestream Go binary (for the relevant platform) in the package you get when you run “datasette install datasette-litestream”—so now Datasette has a very robust answer to questions about SQLite disaster recovery beyond just the Datasette Cloud platform.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/litestream"&gt;litestream&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="sqlite"/><category term="datasette"/><category term="datasette-cloud"/><category term="litestream"/><category term="alex-garcia"/></entry><entry><title>Datasette 1.0a4 and 1.0a5, plus weeknotes</title><link href="https://simonwillison.net/2023/Aug/30/datasette-plus-weeknotes/#atom-tag" rel="alternate"/><published>2023-08-30T14:33:35+00:00</published><updated>2023-08-30T14:33:35+00:00</updated><id>https://simonwillison.net/2023/Aug/30/datasette-plus-weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;Two new alpha releases of Datasette, plus a keynote at WordCamp, a new LLM release, two new LLM plugins and a flurry of TILs.&lt;/p&gt;
&lt;h4&gt;Datasette 1.0a5&lt;/h4&gt;
&lt;p&gt;Released this morning, &lt;a href="https://docs.datasette.io/en/1.0a5/changelog.html"&gt;Datasette 1.0a5&lt;/a&gt; has some exciting new changes driven by Datasette Cloud and the ongoing march towards Datasette 1.0.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://alexgarcia.xyz/"&gt;Alex Garcia&lt;/a&gt; is working with me on Datasette Cloud and Datasette generally, generously sponsored by &lt;a href="https://fly.io/"&gt;Fly&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Two of the changes in 1.0a5 were driven by Alex:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;New &lt;code&gt;datasette.yaml&lt;/code&gt; (or &lt;code&gt;.json&lt;/code&gt;) configuration file, which can be specified using &lt;code&gt;datasette -c path-to-file&lt;/code&gt;. The goal here to consolidate settings, plugin configuration, permissions, canned queries, and other Datasette configuration into a single single file, separate from &lt;code&gt;metadata.yaml&lt;/code&gt;. The legacy &lt;code&gt;settings.json&lt;/code&gt; config file used for &lt;a href="https://docs.datasette.io/en/1.0a5/settings.html#config-dir"&gt;Configuration directory mode&lt;/a&gt; has been removed, and &lt;code&gt;datasette.yaml&lt;/code&gt; has a &lt;code&gt;"settings"&lt;/code&gt; section where the same settings key/value pairs can be included. In the next future alpha release, more configuration such as plugins/permissions/canned queries will be moved to the &lt;code&gt;datasette.yaml&lt;/code&gt; file. See &lt;a href="https://github.com/simonw/datasette/issues/2093"&gt;#2093&lt;/a&gt; for more details.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Right from the very start of the project, Datasette has supported specifying metadata about databases - sources, licenses, etc, as a &lt;code&gt;metadata.json&lt;/code&gt; file that can be passed to Datasette like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;datasette data.db -m metadata.json&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Over time, the purpose and uses of that file has expanded in all kinds of different directions. It can be used &lt;a href="https://docs.datasette.io/en/1.0a5/plugins.html#plugin-configuration"&gt;for plugin settings&lt;/a&gt;, and to set preferences for a table default page size, &lt;a href="https://docs.datasette.io/en/1.0a5/facets.html#facets-in-metadata"&gt;default facets&lt;/a&gt; etc), and even to &lt;a href="https://docs.datasette.io/en/1.0a5/authentication.html#access-permissions-in-metadata"&gt;configure access permissions&lt;/a&gt; for who can view what.&lt;/p&gt;
&lt;p&gt;The name &lt;code&gt;metadata.json&lt;/code&gt; is entirely inappropriate for what the file actually does. It's a mess.&lt;/p&gt;
&lt;p&gt;I've always had a desire to fix this before Datasette 1.0, but it never quite got high up enough the priority list for me to spend time on it.&lt;/p&gt;
&lt;p&gt;Alex &lt;a href="https://github.com/simonw/datasette/issues/2093"&gt;expressed interest in fixing it&lt;/a&gt;, and has started to put a plan into motion for cleaning it up.&lt;/p&gt;
&lt;p&gt;More details &lt;a href="https://github.com/simonw/datasette/issues/2093"&gt;in the issue&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Datasette &lt;code&gt;_internal&lt;/code&gt; database has had some changes. It no longer shows up in the &lt;code&gt;datasette.databases&lt;/code&gt; list by default, and is now instead available to plugins using the &lt;code&gt;datasette.get_internal_database()&lt;/code&gt;. Plugins are invited to use this as a private database to store configuration and settings and secrets that should not be made visible through the default Datasette interface. Users can pass the new &lt;code&gt;--internal internal.db&lt;/code&gt; option to persist that internal database to disk. (&lt;a href="https://github.com/simonw/datasette/issues/2157"&gt;#2157&lt;/a&gt;).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was the other initiative driven by Alex. In working on Datasette Cloud we realized that it's actually quite common for plugins to need somewhere to store data that shouldn't necessarily be visible to regular users of a Datasette instance - things like tokens created by &lt;a href="https://datasette.io/plugins/datasette-auth-tokens"&gt;datasette-auth-tokens&lt;/a&gt;, or the progress bar mechanism used by &lt;a href="https://datasette.io/plugins/datasette-upload-csvs"&gt;datasette-upload-csvs&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Alex pointed out that the existing &lt;code&gt;_internal&lt;/code&gt; database for Datasette could be expanded to cover these use-cases as well. &lt;a href="https://github.com/simonw/datasette/issues/2157"&gt;#2157&lt;/a&gt; has more details on how we agreed this should work.&lt;/p&gt;
&lt;p&gt;The other changes in 1.0a5 were driven by me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When restrictions are applied to &lt;a href="https://docs.datasette.io/en/1.0a5/authentication.html#createtokenview"&gt;API tokens&lt;/a&gt;, those restrictions now behave slightly differently: applying the &lt;code&gt;view-table&lt;/code&gt; restriction will imply the ability to &lt;code&gt;view-database&lt;/code&gt; for the database containing that table, and both &lt;code&gt;view-table&lt;/code&gt; and &lt;code&gt;view-database&lt;/code&gt; will imply &lt;code&gt;view-instance&lt;/code&gt;. Previously you needed to create a token with restrictions that explicitly listed &lt;code&gt;view-instance&lt;/code&gt; and &lt;code&gt;view-database&lt;/code&gt; and &lt;code&gt;view-table&lt;/code&gt; in order to view a table without getting a permission denied error. (&lt;a href="https://github.com/simonw/datasette/issues/2102"&gt;#2102&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I &lt;a href="https://simonwillison.net/2022/Dec/15/datasette-1a2/#finely-grained-permissions"&gt;described finely-grained permissions&lt;/a&gt; for access tokens in my annotated release notes for 1.0a2.&lt;/p&gt;
&lt;p&gt;They provide a mechanism for creating an API token that's only allowed to perform a subset of actions on behalf of the user.&lt;/p&gt;
&lt;p&gt;In trying these out for Datasette Cloud I came across a nasty usability flaw. You could create a token that was restricted to &lt;code&gt;view-table&lt;/code&gt; access for a specific table... and it wouldn't work. Because the access code for that view would check for &lt;code&gt;view-instance&lt;/code&gt; and &lt;code&gt;view-database&lt;/code&gt; permission first.&lt;/p&gt;
&lt;p&gt;1.0a5 fixes that, by adding logic that says that if a token can &lt;code&gt;view-table&lt;/code&gt; that implies it can &lt;code&gt;view-database&lt;/code&gt; for the database containing that table, and &lt;code&gt;view-instance&lt;/code&gt; for the overall instance.&lt;/p&gt;
&lt;p&gt;This change took quite some time to develop, because any time I write code involving permissions I like to also include extremely comprehensive automated tests.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The &lt;code&gt;-s/--setting&lt;/code&gt; option can now take dotted paths to nested settings. These will then be used to set or over-ride the same options as are present in the new configuration file. (&lt;a href="https://github.com/simonw/datasette/issues/2156"&gt;#2156&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a fun little detail inspired by Alex's configuration work.&lt;/p&gt;
&lt;p&gt;I run a lot of different Datasette instances, often on an ad-hoc basis.&lt;/p&gt;
&lt;p&gt;I sometimes find it frustrating that to use certain features I need to create a &lt;code&gt;metadata.json&lt;/code&gt; (soon to be &lt;code&gt;datasette.yml&lt;/code&gt;) configuration file, just to get something to work.&lt;/p&gt;
&lt;p&gt;Wouldn't it be neat if every possible setting for Datasette could be provided both in a configuration file or as command-line options?&lt;/p&gt;
&lt;p&gt;That's what the new &lt;code&gt;--setting&lt;/code&gt; option aims to solve. Anything that can be represented as a JSON or YAML configuration can now also be represented as key/value pairs on the command-line.&lt;/p&gt;
&lt;p&gt;Here's an example &lt;a href="https://github.com/simonw/datasette/issues/2143#issuecomment-1690792514"&gt;from my initial issue comment&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;datasette \
  -s settings.sql_time_limit_ms 1000 \
  -s plugins.datasette-auth-tokens.manage_tokens &lt;span class="pl-c1"&gt;true&lt;/span&gt; \
  -s plugins.datasette-auth-tokens.manage_tokens_database tokens \
  -s plugins.datasette-ripgrep.path &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;/home/simon/code-to-search&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -s databases.mydatabase.tables.example_table.sort created \
  mydatabase.db tokens.db&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Once this feature is complete, the above will behave the same as a &lt;code&gt;datasette.yml&lt;/code&gt; file containing this:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;plugins&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;datasette-auth-tokens&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;manage_tokens&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
    &lt;span class="pl-ent"&gt;manage_tokens_database&lt;/span&gt;: &lt;span class="pl-s"&gt;tokens&lt;/span&gt;
  &lt;span class="pl-ent"&gt;datasette-ripgrep&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;path&lt;/span&gt;: &lt;span class="pl-s"&gt;/home/simon/code-to-search&lt;/span&gt;
&lt;span class="pl-ent"&gt;databases&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;mydatabase&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;tables&lt;/span&gt;:
      &lt;span class="pl-ent"&gt;example_table&lt;/span&gt;:
        &lt;span class="pl-ent"&gt;sort&lt;/span&gt;: &lt;span class="pl-s"&gt;created&lt;/span&gt;
&lt;span class="pl-ent"&gt;settings&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;sql_time_limit_ms&lt;/span&gt;: &lt;span class="pl-c1"&gt;1000&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I've experimented with ways of turning key/value pairs into nested JSON objects before, with my &lt;a href="https://github.com/simonw/json-flatten"&gt;json-flatten&lt;/a&gt; library.&lt;/p&gt;
&lt;p&gt;This time I took a slightly different approach. In particular, if you need to pass a nested JSON object (such as an array) which isn't easily represented using &lt;code&gt;key.nested&lt;/code&gt; notation, you can pass it like this instead:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;datasette data.db \
  -s plugins.datasette-complex-plugin.configs \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{"foo": [1,2,3], "bar": "baz"}&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which would convert to the following equivalent YAML:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;plugins&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;datasette-complex-plugin&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;configs&lt;/span&gt;:
      &lt;span class="pl-ent"&gt;foo&lt;/span&gt;:
        - &lt;span class="pl-c1"&gt;1&lt;/span&gt;
        - &lt;span class="pl-c1"&gt;2&lt;/span&gt;
        - &lt;span class="pl-c1"&gt;3&lt;/span&gt;
      &lt;span class="pl-ent"&gt;bar&lt;/span&gt;: &lt;span class="pl-s"&gt;baz&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;These examples don't quite work yet, because the plugin configuration hasn't migrated to &lt;code&gt;datasette.yml&lt;/code&gt; - but it should work for the next alpha.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;New &lt;code&gt;--actor '{"id": "json-goes-here"}'&lt;/code&gt; option for use with &lt;code&gt;datasette --get&lt;/code&gt; to treat the simulated request as being made by a specific actor, see &lt;a href="https://docs.datasette.io/en/1.0a5/cli-reference.html#cli-datasette-get"&gt;datasette --get&lt;/a&gt;. (&lt;a href="https://github.com/simonw/datasette/issues/2153"&gt;#2153&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a fun little debug helper I built while working on restricted tokens.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;datasette --get /...&lt;/code&gt; option is a neat trick that can be used to simulate an HTTP request through the Datasette instance, without even starting a server running on a port.&lt;/p&gt;
&lt;p&gt;I use it for things like &lt;a href="https://til.simonwillison.net/shot-scraper/social-media-cards"&gt;generating social media card images&lt;/a&gt; for my TILs website.&lt;/p&gt;
&lt;p&gt;The new &lt;code&gt;--actor&lt;/code&gt; option lets you add a simulated &lt;a href="https://docs.datasette.io/en/latest/authentication.html#actors"&gt;actor&lt;/a&gt; to the request, which is useful for testing out things like configured authentication and permissions.&lt;/p&gt;
&lt;h4&gt;A security fix in Datasette 1.0a4&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://docs.datasette.io/en/latest/changelog.html#a4-2023-08-21"&gt;Datasette 1.0a4&lt;/a&gt; has a security fix: I realized that the API explorer I added in the 1.0 alpha series was exposing the names of databases and tables (though not their actual content) to unauthenticated users, even for Datasette instances that were protected by authentication.&lt;/p&gt;
&lt;p&gt;I issued a GitHub security advisory for this: &lt;a href="https://github.com/simonw/datasette/security/advisories/GHSA-7ch3-7pp7-7cpq"&gt;Datasette 1.0 alpha series leaks names of databases and tables to unauthenticated users&lt;/a&gt;, which has since been issued a CVE, &lt;a href="https://nvd.nist.gov/vuln/detail/CVE-2023-40570"&gt;CVE-2023-40570&lt;/a&gt; - GitHub is &lt;a href="https://docs.github.com/en/code-security/security-advisories/repository-security-advisories/about-repository-security-advisories#cve-identification-numbers"&gt;a CVE Numbering Authority&lt;/a&gt; which means their security team are trusted to review such advisories and issue CVEs where necessary.&lt;/p&gt;
&lt;p&gt;I expect the impact of this vulnerability to be very small: outside of &lt;a href="https://www.datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; very few people are running the Datasette 1.0 alphas on the public internet, and it's possible that the set of those users who are also authenticating their instances to provide authenticated access to private data - especially where just the database and table names of that data is considered sensitive - is an empty set.&lt;/p&gt;
&lt;p&gt;Datasette Cloud itself has detailed access logs primarily to help evaluate this kind of threat. I'm pleased to report that those logs showed no instances of an unauthenticated user accessing the pages in question prior to the bug being fixed.&lt;/p&gt;
&lt;h4&gt;A keynote at WordCamp US&lt;/h4&gt;
&lt;p&gt;Last Friday I gave a keynote at &lt;a href="https://us.wordcamp.org/2023/"&gt;WordCamp US&lt;/a&gt; on the subject of Large Language Models.&lt;/p&gt;
&lt;p&gt;I used &lt;a href="https://goodsnooze.gumroad.com/l/macwhisper"&gt;MacWhisper&lt;/a&gt; and my &lt;a href="https://simonwillison.net/2023/Aug/6/annotated-presentations/"&gt;annotated presentation tool&lt;/a&gt; to turn that into a detailed transcript, complete with additional links and context: &lt;a href="https://simonwillison.net/2023/Aug/27/wordcamp-llms/"&gt;Making Large Language Models work for you&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;llm-openrouter and llm-anyscale-endpoints&lt;/h4&gt;
&lt;p&gt;I released two new plugins for &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt;, which lets you run large language models either locally or via APIs, as both a CLI tool and a Python library.&lt;/p&gt;
&lt;p&gt;Both plugins provide access to API-hosted models:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-openrouter"&gt;llm-openrouter&lt;/a&gt;&lt;/strong&gt; provides access to &lt;a href="https://openrouter.ai/docs#models"&gt;models&lt;/a&gt; hosted by &lt;a href="https://openrouter.ai/"&gt;OpenRouter&lt;/a&gt;. Of particular interest here is Claude - I'm still on the waiting list for the official Claude API, but in the meantime I can pay for access to it via OpenRouter and it works just fine. Claude has a 100,000 token context, making it a really great option for working with larger documents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-anyscale-endpoints"&gt;llm-anyscale-endpoints&lt;/a&gt;&lt;/strong&gt; is a similar plugin that instead works with &lt;a href="https://app.endpoints.anyscale.com/"&gt;Anyscale Endpoints&lt;/a&gt;. Anyscale provide Llama 2 and Code Llama at extremely low prices - between $0.25 and $1 per million tokens, depending on the model.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These plugins were very quick to develop.&lt;/p&gt;
&lt;p&gt;Both OpenRouter and Anyscale Endpoints provide API endpoints that emulate the official OpenAI APIs, including the way the handle streaming tokens.&lt;/p&gt;
&lt;p&gt;LLM already has code for talking to those endpoints via the &lt;a href="https://github.com/openai/openai-python"&gt;openai&lt;/a&gt; Python library, which can be re-pointed to another backend using the officially supported &lt;code&gt;api_base&lt;/code&gt; parameter.&lt;/p&gt;
&lt;p&gt;So the core code for the plugins ended up being less than 30 lines each: &lt;a href="https://github.com/simonw/llm-openrouter/blob/main/llm_openrouter.py"&gt;llm_openrouter.py&lt;/a&gt; and &lt;a href="https://github.com/simonw/llm-anyscale-endpoints/blob/main/llm_anyscale_endpoints.py"&gt;llm_anyscale_endpoints.py&lt;/a&gt;.&lt;/p&gt;
&lt;h4&gt;llm 0.8&lt;/h4&gt;
&lt;p&gt;I shipped &lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-8"&gt;LLM 0.8&lt;/a&gt; a week and a half ago, with a bunch of small changes.&lt;/p&gt;
&lt;p&gt;The most significant of these was a change to the default &lt;code&gt;llm logs&lt;/code&gt; output, which shows the logs (recorded in SQLite) of the previous prompts and responses you have sent through the tool.&lt;/p&gt;
&lt;p&gt;This output used to be JSON. It's &lt;a href="https://github.com/simonw/llm/issues/160#issuecomment-1682991314"&gt;now Markdown&lt;/a&gt;, which is both easier to read and can be pasted into GitHub Issue comments or Gists or similar to share the results with other people.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://llm.datasette.io/en/stable/changelog.html#v0-8"&gt;The release notes for 0.8&lt;/a&gt; describe all of the other improvements.&lt;/p&gt;
&lt;h4&gt;sqlite-utils 3.35&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.35"&gt;3.35 release of sqlite-utils&lt;/a&gt; was driven by LLM.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sqlite-utils&lt;/code&gt; has a mechanism for adding foreign keys to an existing table - something that's not supported by SQLite out of the box.&lt;/p&gt;
&lt;p&gt;That implementation used to work using a deeply gnarly hack: it would switch the &lt;code&gt;sqlite_master&lt;/code&gt; table over to being writable (using &lt;code&gt;PRAGMA writable_schema = 1&lt;/code&gt;), update that schema in place to reflect the new foreign keys and then toggle &lt;code&gt;writable_schema = 0&lt;/code&gt; back again.&lt;/p&gt;
&lt;p&gt;It turns out there are Python installations out there - most notably the system Python on macOS - which completely disable the ability to write to that table, no matter what the status of the various pragmas.&lt;/p&gt;
&lt;p&gt;I was getting bug reports from LLM users who were running into this. I realized that I had a solution for this mostly implemented already: the &lt;a href="https://sqlite-utils.datasette.io/en/stable/python-api.html#transforming-a-table"&gt;sqlite-utils transform() method&lt;/a&gt;, which can apply all sorts of complex schema changes by creating a brand new table, copying across the old data and then renaming it to replace the old one.&lt;/p&gt;
&lt;p&gt;So I dropped the old &lt;code&gt;writable_schema&lt;/code&gt; mechanism entirely in favour of &lt;code&gt;.transform()&lt;/code&gt; - it's slower, because it requires copying the entire table, but it doesn't have weird edge-cases where it doesn't work.&lt;/p&gt;
&lt;p&gt;Since &lt;a href="https://simonwillison.net/2023/Jul/24/sqlite-utils-plugins/"&gt;sqlite-utils supports plugins now&lt;/a&gt;, I realized I could set a healthy precedent by making the removed feature available in a new plugin: &lt;a href="https://github.com/simonw/sqlite-utils-fast-fks"&gt;sqlite-utils-fast-fks&lt;/a&gt;, which provides the following command for adding foreign keys the fast, old way (provided your installation supports it):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-fast-fks
sqlite-utils fast-fks my_database.db places country_id country id&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I've always admired how &lt;a href="https://jquery.com/"&gt;jQuery&lt;/a&gt; uses plugins to keep old features working on an opt-in basis after major version upgrades. I'm excited to be able to apply the same pattern for &lt;code&gt;sqlite-utils&lt;/code&gt;.&lt;/p&gt;
&lt;h4&gt;paginate-json 1.0&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/paginate-json"&gt;paginate-json&lt;/a&gt; is a tiny tool I first released a few years ago to solve a very specific problem.&lt;/p&gt;
&lt;p&gt;There's a neat pattern in some JSON APIs where the &lt;a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link"&gt;HTTP link header&lt;/a&gt; is used to indicate subsequent pages of results.&lt;/p&gt;
&lt;p&gt;The best example I know of this is the GitHub API. Run this to see what it looks like here I'm using the &lt;a href="https://docs.github.com/en/rest/activity/events?apiVersion=2022-11-28#list-public-events-for-a-user"&gt;events API&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl -i \
  https://api.github.com/users/simonw/events&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's a truncated example of the output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;HTTP/2 200 
server: GitHub.com
content-type: application/json; charset=utf-8
link: &amp;lt;https://api.github.com/user/9599/events?page=2&amp;gt;; rel="next", &amp;lt;https://api.github.com/user/9599/events?page=9&amp;gt;; rel="last"

[
  {
    "id": "31467177730",
    "type": "PushEvent",
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;link&lt;/code&gt; header there specifies a &lt;code&gt;next&lt;/code&gt; and &lt;code&gt;last&lt;/code&gt; URL that can be used for pagination.&lt;/p&gt;
&lt;p&gt;To fetch all available items, you can follow the &lt;code&gt;next&lt;/code&gt; link repeatedly until it runs out.&lt;/p&gt;
&lt;p&gt;My &lt;code&gt;paginate-json&lt;/code&gt; tool can follow these links for you. If you run it like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;paginate-json \
  https://api.github.com/users/simonw/events&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It will output a single JSON array consisting of the results from every available page.&lt;/p&gt;
&lt;p&gt;The 1.0 release adds &lt;a href="https://github.com/simonw/paginate-json/releases/tag/1.0"&gt;a bunch of small features&lt;/a&gt;, but also marks my confidence in the stability of the design of the tool.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://docs.datasette.io/en/latest/json_api.html"&gt;Datasette JSON API&lt;/a&gt; has supported &lt;a href="https://docs.datasette.io/en/latest/json_api.html#pagination"&gt;link pagination&lt;/a&gt; for a while - you can use &lt;code&gt;paginate-json&lt;/code&gt; with Datasette like this, taking advantage of the new &lt;code&gt;--key&lt;/code&gt; option to paginate over the array of objects returned in the &lt;code&gt;"rows"&lt;/code&gt; key:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;paginate-json \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://datasette.io/content/pypi_releases.json?_labels=on&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  --key rows \
  --nl&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;--nl&lt;/code&gt; option here causes &lt;code&gt;paginate-json&lt;/code&gt; to output the results as newline-delimited JSON, instead of bundling them together into a JSON array.&lt;/p&gt;
&lt;p&gt;Here's how to use &lt;a href="https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-newline-delimited-json"&gt;sqlite-utils insert&lt;/a&gt; to insert that data directly into a fresh SQLite database:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;paginate-json \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://datasette.io/content/pypi_releases.json?_labels=on&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  --key rows \
  --nl &lt;span class="pl-k"&gt;|&lt;/span&gt; \
    sqlite-utils insert data.db releases - \
      --nl --flatten&lt;/pre&gt;&lt;/div&gt;
&lt;h4&gt;Releases this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/paginate-json/releases/tag/1.0"&gt;paginate-json 1.0&lt;/a&gt;&lt;/strong&gt; - 2023-08-30&lt;br /&gt;Command-line tool for fetching JSON from paginated APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-auth-tokens/releases/tag/0.4a2"&gt;datasette-auth-tokens 0.4a2&lt;/a&gt;&lt;/strong&gt; - 2023-08-29&lt;br /&gt;Datasette plugin for authenticating access using API tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette/releases/tag/1.0a5"&gt;datasette 1.0a5&lt;/a&gt;&lt;/strong&gt; - 2023-08-29&lt;br /&gt;An open source multi-tool for exploring and publishing data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-anyscale-endpoints/releases/tag/0.2"&gt;llm-anyscale-endpoints 0.2&lt;/a&gt;&lt;/strong&gt; - 2023-08-25&lt;br /&gt;LLM plugin for models hosted by Anyscale Endpoints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-jellyfish/releases/tag/2.0"&gt;datasette-jellyfish 2.0&lt;/a&gt;&lt;/strong&gt; - 2023-08-24&lt;br /&gt;Datasette plugin adding SQL functions for fuzzy text matching powered by Jellyfish&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-configure-fts/releases/tag/1.1.2"&gt;datasette-configure-fts 1.1.2&lt;/a&gt;&lt;/strong&gt; - 2023-08-23&lt;br /&gt;Datasette plugin for enabling full-text search against selected table columns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-ripgrep/releases/tag/0.8.1"&gt;datasette-ripgrep 0.8.1&lt;/a&gt;&lt;/strong&gt; - 2023-08-21&lt;br /&gt;Web interface for searching your code using ripgrep, built as a Datasette plugin&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-publish-fly/releases/tag/1.3.1"&gt;datasette-publish-fly 1.3.1&lt;/a&gt;&lt;/strong&gt; - 2023-08-21&lt;br /&gt;Datasette plugin for publishing data using Fly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-openrouter/releases/tag/0.1"&gt;llm-openrouter 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-08-21&lt;br /&gt;LLM plugin for models hosted by OpenRouter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/llm/releases/tag/0.8"&gt;llm 0.8&lt;/a&gt;&lt;/strong&gt; - 2023-08-21&lt;br /&gt;Access large language models from the command-line&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils-fast-fks/releases/tag/0.1"&gt;sqlite-utils-fast-fks 0.1&lt;/a&gt;&lt;/strong&gt; - 2023-08-18&lt;br /&gt;Fast foreign key addition for sqlite-utils&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-edit-schema/releases/tag/0.5.3"&gt;datasette-edit-schema 0.5.3&lt;/a&gt;&lt;/strong&gt; - 2023-08-18&lt;br /&gt;Datasette plugin for modifying table schemas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/simonw/sqlite-utils/releases/tag/3.35"&gt;sqlite-utils 3.35&lt;/a&gt;&lt;/strong&gt; - 2023-08-18&lt;br /&gt;Python CLI utility and library for manipulating SQLite databases&lt;/li&gt;
&lt;/ul&gt;
&lt;h4&gt;TIL this week&lt;/h4&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/json/streaming-indented-json-array"&gt;Streaming output of an indented JSON array&lt;/a&gt; - 2023-08-30&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/macos/downloading-partial-youtube-videos"&gt;Downloading partial YouTube videos with ffmpeg&lt;/a&gt; - 2023-08-26&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sqlite/sqlite-version-macos-python"&gt;Compile and run a new SQLite version with the existing sqlite3 Python library on macOS&lt;/a&gt; - 2023-08-22&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/fly/django-sql-dashboard"&gt;Configuring Django SQL Dashboard for Fly PostgreSQL&lt;/a&gt; - 2023-08-22&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sqlite/database-file-size"&gt;Calculating the size of a SQLite database file using SQL&lt;/a&gt; - 2023-08-21&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/readthedocs/stable-docs"&gt;Updating stable docs in ReadTheDocs without pushing a release&lt;/a&gt; - 2023-08-21&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/bash/go-script"&gt;A shell script for running Go one-liners&lt;/a&gt; - 2023-08-20&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/sqlite/python-sqlite-environment"&gt;A one-liner to output details of the current Python's SQLite&lt;/a&gt; - 2023-08-19&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/python/inlining-binary-data"&gt;A simple pattern for inlining binary content in a Python script&lt;/a&gt; - 2023-08-19&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://til.simonwillison.net/bash/multiple-servers"&gt;Running multiple servers in a single Bash script&lt;/a&gt; - 2023-08-17&lt;/li&gt;
&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="datasette"/><category term="weeknotes"/><category term="sqlite-utils"/><category term="annotated-release-notes"/><category term="alex-garcia"/><category term="llm"/></entry><entry><title>Introducing datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows</title><link href="https://simonwillison.net/2023/Aug/16/datasette-write-ui/#atom-tag" rel="alternate"/><published>2023-08-16T01:48:07+00:00</published><updated>2023-08-16T01:48:07+00:00</updated><id>https://simonwillison.net/2023/Aug/16/datasette-write-ui/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.datasette.cloud/blog/2023/datasette-write-ui/"&gt;Introducing datasette-write-ui: a Datasette plugin for editing, inserting, and deleting rows&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex García is working with me on Datasette Cloud for the next few months, graciously sponsored by Fly. We will be working in public, releasing open source code and documenting how to build a multi-tenant SaaS product using Fly Machines.&lt;/p&gt;

&lt;p&gt;Alex’s first project is datasette-write-ui, a plugin that finally lets you directly edit data stored inside Datasette. Alex wrote about the plugin on our new Datasette Cloud blog.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-cloud"&gt;datasette-cloud&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/fly"&gt;fly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="datasette"/><category term="datasette-cloud"/><category term="fly"/><category term="alex-garcia"/></entry><entry><title>sqlite-utils now supports plugins</title><link href="https://simonwillison.net/2023/Jul/24/sqlite-utils-plugins/#atom-tag" rel="alternate"/><published>2023-07-24T17:06:23+00:00</published><updated>2023-07-24T17:06:23+00:00</updated><id>https://simonwillison.net/2023/Jul/24/sqlite-utils-plugins/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;a href="https://sqlite-utils.datasette.io/en/stable/changelog.html#v3-34"&gt;sqlite-utils 3.34&lt;/a&gt; is out with a major new feature: support for &lt;a href="https://sqlite-utils.datasette.io/en/stable/plugins.html"&gt;plugins&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sqlite-utils&lt;/code&gt; is my combination Python library and command-line tool for manipulating SQLite databases. It recently celebrated its fifth birthday, and has had over 100 releases since it first launched back in 2018.&lt;/p&gt;
&lt;p&gt;The new plugin system is inspired by similar mechanisms &lt;a href="https://docs.datasette.io/en/stable/plugins.html"&gt;in Datasette&lt;/a&gt; and &lt;a href="https://llm.datasette.io/en/stable/plugins/index.html"&gt;LLM&lt;/a&gt;. It lets developers add new features to &lt;code&gt;sqlite-utils&lt;/code&gt; without needing to get their changes accepted by the core project.&lt;/p&gt;
&lt;p&gt;I love plugin systems. As an open source maintainer they are by far the best way to encourage people to contribute to my projects - I can genuinely wake up in the morning and my software has new features, and I didn't even need to review a pull request.&lt;/p&gt;
&lt;p&gt;Plugins also offer a fantastic medium for exploration and experimentation. I can try out new ideas without committing to supporting them in core, and without needing to tie improvements to them to the core release cycle.&lt;/p&gt;
&lt;p&gt;Version 3.34 adds &lt;a href="https://sqlite-utils.datasette.io/en/stable/plugins.html#plugin-hooks"&gt;two initial plugin hooks&lt;/a&gt;: &lt;code&gt;register_commands()&lt;/code&gt; and &lt;code&gt;prepare_connection()&lt;/code&gt;. These are both based on the equivalent hooks in Datasette.&lt;/p&gt;
&lt;p&gt;I planned to just ship &lt;code&gt;register_commands()&lt;/code&gt;, but Alex Garcia spotted my activity on the repo and submitted &lt;a href="https://github.com/simonw/sqlite-utils/pull/573"&gt;a PR&lt;/a&gt; adding &lt;code&gt;prepare_connection()&lt;/code&gt; literally minutes before I had intended to ship the release!&lt;/p&gt;
&lt;h4&gt;register_commands()&lt;/h4&gt;
&lt;p&gt;The &lt;code&gt;register_commands()&lt;/code&gt; hook lets you add new commands to the &lt;code&gt;sqlite-utils&lt;/code&gt; command-line tool - so users can run &lt;code&gt;sqlite-utils your-new-command&lt;/code&gt; to access your feature.&lt;/p&gt;
&lt;p&gt;I've learned from past experience that you should never ship a plugin hook without also releasing at least one plugin that uses it. I've built two so far for &lt;code&gt;register_commands()&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/sqlite-utils-shell"&gt;sqlite-utils-shell&lt;/a&gt; adds a simply interactive shell, accessed using &lt;code&gt;sqlite-utils shell&lt;/code&gt; for an in-memory database or &lt;code&gt;sqlite-utils shell data.db&lt;/code&gt; to run it against a specific database file.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/sqlite-migrate"&gt;sqlite-migrate&lt;/a&gt; is my first draft of a database migrations system for SQLite, loosely inspired by Django migrations and previewed by the migration mechanism I &lt;a href="https://github.com/simonw/llm/blob/0.6.1/llm/migrations.py"&gt;added to LLM&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Try out the shell plugin like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-shell
sqlite-utils shell&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The interface looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;In-memory database, content will be lost on exit
Type 'exit' to exit.
sqlite-utils&amp;gt; select 3 + 5;
  3 + 5
-------
      8
sqlite-utils&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;prepare_connection()&lt;/h4&gt;
&lt;p&gt;This hook, contributed by Alex, lets you modify the connection object before it is used to execute any SQL. Most importantly, this lets you register custom SQLite functions.&lt;/p&gt;
&lt;p&gt;I expect this to be the most common category of plugin. I've built one so far: &lt;a href="https://github.com/simonw/sqlite-utils-dateutil"&gt;sqlite-utils-dateutil&lt;/a&gt;, which adds functions for parsing dates and times using the &lt;a href="https://dateutil.readthedocs.io/"&gt;dateutil&lt;/a&gt; library.&lt;/p&gt;
&lt;p&gt;It lets you do things like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-dateutil
sqlite-utils memory &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;select dateutil_parse('3rd october')&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -t&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;dateutil_parse('3rd october')
-------------------------------
2023-10-03T00:00:00
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This works inside &lt;code&gt;sqlite-shell&lt;/code&gt; too.&lt;/p&gt;
&lt;p&gt;Plugins that you install also become available in the Python API interface to &lt;code&gt;sqlite-utils&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-text-python-console"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; sqlite_utils
&amp;gt;&amp;gt;&amp;gt; db &lt;span class="pl-k"&gt;=&lt;/span&gt; sqlite_utils.Database(&lt;span class="pl-v"&gt;memory&lt;/span&gt;&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-c1"&gt;list&lt;/span&gt;(db.query(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;select dateutil_parse('3rd october')&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;))
[{"dateutil_parse('3rd october')": '2023-10-03T00:00:00'}]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You can opt out of executing installed plugins by passing &lt;code&gt;execute_plugins=False&lt;/code&gt; to the &lt;code&gt;Database()&lt;/code&gt; constructor:&lt;/p&gt;
&lt;div class="highlight highlight-text-python-console"&gt;&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; db &lt;span class="pl-k"&gt;=&lt;/span&gt; sqlite_utils.Database(&lt;span class="pl-v"&gt;memory&lt;/span&gt;&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;, &lt;span class="pl-v"&gt;execute_plugins&lt;/span&gt;&lt;span class="pl-k"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;False&lt;/span&gt;)
&amp;gt;&amp;gt;&amp;gt; &lt;span class="pl-c1"&gt;list&lt;/span&gt;(db.query(&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;select dateutil_parse('3rd october')&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;))
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
  File ".../site-packages/sqlite_utils/db.py", line 494, in query
    cursor = self.execute(sql, params or tuple())
  File ".../site-packages/sqlite_utils/db.py", line 512, in execute
    return self.conn.execute(sql, parameters)
sqlite3.OperationalError: no such function: dateutil_parse&lt;/pre&gt;&lt;/div&gt;
&lt;h4&gt;sqlite-ml by Romain Clement&lt;/h4&gt;
&lt;p&gt;I quietly released &lt;code&gt;sqlite-utils 3.34&lt;/code&gt; on Saturday. The community has already released several plugins for it!&lt;/p&gt;
&lt;p&gt;Romain Clement built &lt;a href="https://github.com/rclement/sqlite-utils-ml"&gt;sqlite-utils-ml&lt;/a&gt;, a plugin wrapper for his &lt;a href="https://github.com/rclement/sqlite-ml"&gt;sqlite-ml&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;This adds custom SQL functions for training machine learning models and running predictions, entirely within SQLite, using algorithms from &lt;a href="https://scikit-learn.org"&gt;scikit-learn&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's what that looks like running inside &lt;code&gt;sqlite-utils shell&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-shell sqlite-utils-ml
sqlite-utils shell ml.db&lt;/pre&gt;&lt;/div&gt;
&lt;pre&gt;&lt;code&gt;Attached to ml.db
Type 'exit' to exit.
sqlite-utils&amp;gt; select sqml_load_dataset('iris') as dataset;
dataset
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{"table": "dataset_iris", "feature_names": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"], "target_names": ["setosa", "versicolor", "virginica"], "size": 150}
sqlite-utils&amp;gt; select sqml_train(
         ...&amp;gt;   'Iris prediction',
         ...&amp;gt;   'classification',
         ...&amp;gt;   'logistic_regression',
         ...&amp;gt;   'dataset_iris',
         ...&amp;gt;   'target'
         ...&amp;gt; ) as training;
training
--------------------------------------------------------------------------------------------------------------------------------------------------------------
{"experiment_name": "Iris prediction", "prediction_type": "classification", "algorithm": "logistic_regression", "deployed": true, "score": 0.9736842105263158}
sqlite-utils&amp;gt; select
         ...&amp;gt;   dataset_iris.*,
         ...&amp;gt;   sqml_predict(
         ...&amp;gt;     'Iris prediction',
         ...&amp;gt;     json_object(
         ...&amp;gt;       'sepal length (cm)', [sepal length (cm)],
         ...&amp;gt;       'sepal width (cm)', [sepal width (cm)],
         ...&amp;gt;       'petal length (cm)', [petal length (cm)],
         ...&amp;gt;       'petal width (cm)', [petal width (cm)]
         ...&amp;gt;     )
         ...&amp;gt;   ) as prediction
         ...&amp;gt; from dataset_iris
         ...&amp;gt; limit 1;
  sepal length (cm)    sepal width (cm)    petal length (cm)    petal width (cm)    target    prediction
-------------------  ------------------  -------------------  ------------------  --------  ------------
                5.1                 3.5                  1.4                 0.2         0             0
&lt;/code&gt;&lt;/pre&gt;
&lt;h4&gt;SQLite extensions by Alex Garcia&lt;/h4&gt;
&lt;p&gt;Alex Garcia has &lt;a href="https://github.com/asg017/sqlite-ecosystem"&gt;a growing collection&lt;/a&gt; of SQLite extensions, many of which are written in Rust but are packaged as wheels for ease of installation using Python.&lt;/p&gt;
&lt;p&gt;Alex released five plugins for SQLite corresponding to five of his existing extensions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;sqlite-utils-sqlite-regex&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sqlite-utils-sqlite-path&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sqlite-utils-sqlite-url&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sqlite-utils-sqlite-ulid&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;sqlite-utils-sqlite-lines&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's an example of &lt;code&gt;sqlite-utils-sqlite-ulid&lt;/code&gt; in action:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils install sqlite-utils-sqlite-ulid
sqlite-utils memory &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;select ulid() u1, ulid() u2, ulid() u3&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; jq&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[
  {
    &lt;span class="pl-ent"&gt;"u1"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h64d1ysg1rx63z1gwy7nah4n&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"u2"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h64d1ysgd7vx04sc9pncqh10&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"u3"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;01h64d1ysgz1sy7njkqt86dkq9&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  }
]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I've started a &lt;a href="https://github.com/simonw/sqlite-utils-plugins"&gt;sqlite-utils plugin directory&lt;/a&gt; with a list of all of the plugins so far.&lt;/p&gt;
&lt;h4&gt;Building your own plugin&lt;/h4&gt;
&lt;p&gt;If you want to try building your own plugin, the documentation includes a &lt;a href="https://sqlite-utils.datasette.io/en/stable/plugins.html#building-a-plugin"&gt;simple step-by-step guide&lt;/a&gt;. A plugin can be built with as little as two files: a Python module implementing the hooks, and a &lt;code&gt;pyproject.toml&lt;/code&gt; module with metadata about how it should be installed.&lt;/p&gt;
&lt;p&gt;I've also released a new &lt;a href="https://pypi.org/project/cookiecutter/"&gt;cookiecutter&lt;/a&gt; template: &lt;a href="https://github.com/simonw/sqlite-utils-plugin"&gt;simonw/sqlite-utils-plugin&lt;/a&gt;. Here's how to use that to get started building a plugin:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;cookiecutter gh:simonw/sqlite-utils-plugin&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Answer the form fields like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;plugin_name []: rot13
description []: select rot13('text') as a sqlite-utils plugin
hyphenated [rot13]: 
underscored [rot13]: 
github_username []: your-username
author_name []: your-name
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Change directory into the new folder and use &lt;code&gt;sqlite-utils install -e&lt;/code&gt; to install an editable version of your plugin, so changes you make will be reflected when you run the tool:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;cd&lt;/span&gt; sqlite-utils-rot13
sqlite-utils install -e &lt;span class="pl-c1"&gt;.&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Run this command to confirm the plugin has been installed:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils plugins&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You should see this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[
  {
    &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;sqlite-utils-rot13&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"hooks"&lt;/span&gt;: [
      &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;prepare_connection&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
    ],
    &lt;span class="pl-ent"&gt;"version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  }
]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now drop this code into the &lt;code&gt;sqlite_utils_rot13.py&lt;/code&gt; file:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;sqlite_utils&lt;/span&gt;


&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;rot13&lt;/span&gt;(&lt;span class="pl-s1"&gt;s&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;chars&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; []
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;v&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;s&lt;/span&gt;:
        &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s1"&gt;v&lt;/span&gt;)
        &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"a"&lt;/span&gt;) &lt;span class="pl-c1"&gt;and&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"z"&lt;/span&gt;):
            &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;gt;&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"m"&lt;/span&gt;):
                &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;-=&lt;/span&gt; &lt;span class="pl-c1"&gt;13&lt;/span&gt;
            &lt;span class="pl-k"&gt;else&lt;/span&gt;:
                &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;+=&lt;/span&gt; &lt;span class="pl-c1"&gt;13&lt;/span&gt;
        &lt;span class="pl-k"&gt;elif&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"A"&lt;/span&gt;) &lt;span class="pl-c1"&gt;and&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"Z"&lt;/span&gt;):
            &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;&amp;gt;&lt;/span&gt; &lt;span class="pl-en"&gt;ord&lt;/span&gt;(&lt;span class="pl-s"&gt;"M"&lt;/span&gt;):
                &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;-=&lt;/span&gt; &lt;span class="pl-c1"&gt;13&lt;/span&gt;
            &lt;span class="pl-k"&gt;else&lt;/span&gt;:
                &lt;span class="pl-s1"&gt;c&lt;/span&gt; &lt;span class="pl-c1"&gt;+=&lt;/span&gt; &lt;span class="pl-c1"&gt;13&lt;/span&gt;
        &lt;span class="pl-s1"&gt;chars&lt;/span&gt;.&lt;span class="pl-en"&gt;append&lt;/span&gt;(&lt;span class="pl-en"&gt;chr&lt;/span&gt;(&lt;span class="pl-s1"&gt;c&lt;/span&gt;))

    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s"&gt;""&lt;/span&gt;.&lt;span class="pl-en"&gt;join&lt;/span&gt;(&lt;span class="pl-s1"&gt;chars&lt;/span&gt;)


&lt;span class="pl-en"&gt;@&lt;span class="pl-s1"&gt;sqlite_utils&lt;/span&gt;.&lt;span class="pl-s1"&gt;hookimpl&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;prepare_connection&lt;/span&gt;(&lt;span class="pl-s1"&gt;conn&lt;/span&gt;):
    &lt;span class="pl-s1"&gt;conn&lt;/span&gt;.&lt;span class="pl-en"&gt;create_function&lt;/span&gt;(&lt;span class="pl-s"&gt;"rot13"&lt;/span&gt;, &lt;span class="pl-c1"&gt;1&lt;/span&gt;, &lt;span class="pl-s1"&gt;rot13&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;And try it out like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;sqlite-utils memory &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;select rot13('hello world')&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[{&lt;span class="pl-ent"&gt;"rot13('hello world')"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;uryyb jbeyq&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;}]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And to reverse that:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sqlite-utils memory "select rot13('uryyb jbeyq')"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Output:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;[{&lt;span class="pl-ent"&gt;"rot13('uryyb jbeyq')"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;hello world&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;}]&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As you can see, building plugins can be done with very little code. I'm excited to see what else people build with this new capability!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite-utils"&gt;sqlite-utils&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="plugins"/><category term="projects"/><category term="sqlite"/><category term="sqlite-utils"/><category term="alex-garcia"/></entry><entry><title>sqlite-vss v0.1.1 Annotated Release Notes</title><link href="https://simonwillison.net/2023/Jul/20/sqlite-vss-v011-annotated-release-notes/#atom-tag" rel="alternate"/><published>2023-07-20T17:48:50+00:00</published><updated>2023-07-20T17:48:50+00:00</updated><id>https://simonwillison.net/2023/Jul/20/sqlite-vss-v011-annotated-release-notes/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/sqlite-vss-v0-1-1-annotated-release-notes"&gt;sqlite-vss v0.1.1 Annotated Release Notes&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia’s sqlite-vss adds vector search directly to SQLite through a custom extension. It’s now easily installed for Python, Node.js, Deno, Elixir, Go, Rust and Ruby (“gem install sqlite-vss”), and is being used actively by enough people that Alex is getting actionable feedback, including fixes for memory leaks spotted in production.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/agarcia_me/status/1682055955549417473"&gt;@agarcia_me&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vector-search"&gt;vector-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="annotated-release-notes"/><category term="alex-garcia"/><category term="vector-search"/></entry><entry><title>Making SQLite extensions npm install'able for Node.js, and on deno.land/x for Deno</title><link href="https://simonwillison.net/2023/Mar/29/sqlite-extensions-npm-installable/#atom-tag" rel="alternate"/><published>2023-03-29T22:13:21+00:00</published><updated>2023-03-29T22:13:21+00:00</updated><id>https://simonwillison.net/2023/Mar/29/sqlite-extensions-npm-installable/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/making-sqlite-extensions-npm-installable-and-deno"&gt;Making SQLite extensions npm install&amp;#x27;able for Node.js, and on deno.land/x for Deno&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia figured out how to get his “pip install X” trick for distributing compiled SQLite extensions to work for Node too! Now you can “npm install” 10 of his extensions, including sqlite-regex and sqlite-xsv and sqlite-http and sqlite-html and more, and attach them to a node-sqlite3 or better-sqlite3 connection. He’s bundled them for Deno too!

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://datasette.io/discord"&gt;Datasette Discord&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/npm"&gt;npm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deno"&gt;deno&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="nodejs"/><category term="sqlite"/><category term="npm"/><category term="deno"/><category term="alex-garcia"/></entry><entry><title>Introducing sqlite-vss: A SQLite Extension for Vector Search</title><link href="https://simonwillison.net/2023/Feb/10/sqlite-vss/#atom-tag" rel="alternate"/><published>2023-02-10T22:53:14+00:00</published><updated>2023-02-10T22:53:14+00:00</updated><id>https://simonwillison.net/2023/Feb/10/sqlite-vss/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-vss"&gt;Introducing sqlite-vss: A SQLite Extension for Vector Search&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This latest SQLite extension from Alex Garcia is possibly his best yet: it adds FAISS-powered vector similarity search directly to SQLite, enabling fast KNN similarity lookups against a virtual table that feels a lot like SQLite’s own built-in full text search feature. This write-up includes interactive demos using Datasette called from an Observable notebook, running similarity searches against an index of 200,000 news headlines and summaries in less than 50ms.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://fedi.simonwillison.net/@simon/109842505896608627"&gt;@simon on Mastodon&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vector-search"&gt;vector-search&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="datasette"/><category term="observable"/><category term="alex-garcia"/><category term="vector-search"/></entry><entry><title>Making SQLite extensions pip install-able</title><link href="https://simonwillison.net/2023/Feb/6/making-sqlite-extensions-pip-install-able/#atom-tag" rel="alternate"/><published>2023-02-06T19:44:10+00:00</published><updated>2023-02-06T19:44:10+00:00</updated><id>https://simonwillison.net/2023/Feb/6/making-sqlite-extensions-pip-install-able/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/making-sqlite-extensions-pip-install-able"&gt;Making SQLite extensions pip install-able&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia figured out how to bundle a compiled SQLite extension in a Python wheel (building different wheels for different platforms) and publish them to PyPI. This is a huge leap forward in terms of the usability of SQLite extensions, which have previously been pretty difficult to actually install and run. Alex also created Datasette plugins that depend on his packages, so you can now “datasette install datasette-sqlite-regex” (or datasette-sqlite-ulid, datasette-sqlite-fastrand, datasette-sqlite-jsonschema) to gain access to his custom SQLite extensions in your Datasette instance. It even works with “datasette publish --install” to deploy to Vercel, Fly.io and Cloud Run.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://discordapp.com/channels/823971286308356157/823971286941302908/1072220706366029864"&gt;Datasette Discord&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/pip"&gt;pip&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="pip"/><category term="plugins"/><category term="python"/><category term="sqlite"/><category term="datasette"/><category term="alex-garcia"/></entry><entry><title>sqlite-jsonschema</title><link href="https://simonwillison.net/2023/Jan/28/sqlite-jsonschema/#atom-tag" rel="alternate"/><published>2023-01-28T03:50:46+00:00</published><updated>2023-01-28T03:50:46+00:00</updated><id>https://simonwillison.net/2023/Jan/28/sqlite-jsonschema/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/asg017/sqlite-jsonschema"&gt;sqlite-jsonschema&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“A SQLite extension for validating JSON objects with JSON Schema”, building on the jsonschema Rust crate. SQLite and JSON are already a great combination—Alex suggests using this extension to implement check constraints to validate JSON columns before inserting into a table, or just to run queries finding existing data that doesn’t match a given schema.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonschema"&gt;jsonschema&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="jsonschema"/><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>sqlite-ulid</title><link href="https://simonwillison.net/2023/Jan/28/sqlite-ulid/#atom-tag" rel="alternate"/><published>2023-01-28T03:45:36+00:00</published><updated>2023-01-28T03:45:36+00:00</updated><id>https://simonwillison.net/2023/Jan/28/sqlite-ulid/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/asg017/sqlite-ulid"&gt;sqlite-ulid&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia’s sqlite-ulid adds lightning-fast SQL functions for generating ULIDs—Universally Unique Lexicographically Sortable Identifiers. These work like UUIDs but are smaller and faster to generate, and can be canonically encoded as a URL-safe 26 character string (UUIDs are 36 characters). Again, this builds on a Rust crate—ulid-rs—and can generate 1 million byte-represented ULIDs with the ulid_bytes() function in just 88.4ms.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uuid"&gt;uuid&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="uuid"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>sqlite-fastrand</title><link href="https://simonwillison.net/2023/Jan/28/sqlite-fastrand/#atom-tag" rel="alternate"/><published>2023-01-28T03:41:50+00:00</published><updated>2023-01-28T03:41:50+00:00</updated><id>https://simonwillison.net/2023/Jan/28/sqlite-fastrand/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/asg017/sqlite-fastrand"&gt;sqlite-fastrand&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia just dropped three new SQLite extensions, and I’m going to link to all of them. The first is sqlite-fastrand, which adds new functions for generating random numbers (and alphanumeric characters too). Impressively, these out-perform the default SQLite random() and randomblob() functions by about 1.6-2.6x, thanks to being built on the Rust fastrand crate which builds on wyhash, an extremely fast (though not cryptographically secure) hashing function.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>Introducing sqlite-xsv: The Fastest CSV Parser for SQLite</title><link href="https://simonwillison.net/2023/Jan/14/sqlite-xsv/#atom-tag" rel="alternate"/><published>2023-01-14T21:54:05+00:00</published><updated>2023-01-14T21:54:05+00:00</updated><id>https://simonwillison.net/2023/Jan/14/sqlite-xsv/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-xsv"&gt;Introducing sqlite-xsv: The Fastest CSV Parser for SQLite&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia continues to push the boundaries of SQLite extensions. This new extension in Rust wraps the lightning fast Rust csv crate and provides a new csv_reader() virtual table that can handle regular, gzipped and zstd compressed files.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/csv"&gt;csv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zstd"&gt;zstd&lt;/a&gt;&lt;/p&gt;



</summary><category term="csv"/><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/><category term="zstd"/></entry><entry><title>Introducing sqlite-loadable-rs: A framework for building SQLite Extensions in Rust</title><link href="https://simonwillison.net/2022/Dec/7/sqlite-loadable-rs/#atom-tag" rel="alternate"/><published>2022-12-07T23:08:49+00:00</published><updated>2022-12-07T23:08:49+00:00</updated><id>https://simonwillison.net/2022/Dec/7/sqlite-loadable-rs/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-loadable-rs"&gt;Introducing sqlite-loadable-rs: A framework for building SQLite Extensions in Rust&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia has built a new Rust library for creating SQLite extensions—initially supporting custom scalar functions, virtual tables and table functions and with more types of extension coming soon. This looks very easy to use, partly because the documentation and examples are already delightfully thorough, especially for an initial release.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>Introducing sqlite-http: A SQLite extension for making HTTP requests</title><link href="https://simonwillison.net/2022/Aug/10/sqlite-http/#atom-tag" rel="alternate"/><published>2022-08-10T22:22:42+00:00</published><updated>2022-08-10T22:22:42+00:00</updated><id>https://simonwillison.net/2022/Aug/10/sqlite-http/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-http"&gt;Introducing sqlite-http: A SQLite extension for making HTTP requests&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Characteristically thoughtful SQLite extension from Alex, following his sqlite-html extension from a few days ago. sqlite-http lets you make HTTP requests from SQLite—both as a SQL function that returns a string, and as a table-valued SQL function that lets you independently access the body, headers and even the timing data for the request.&lt;/p&gt;

&lt;p&gt;This write-up is excellent: it provides interactive demos but also shows how additional SQLite extensions such as the new-to-me “define” extension can be combined with sqlite-http to create custom functions for parsing and processing HTML.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/agarcia_me/status/1557437368818249728"&gt;@agarcia_me&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="http"/><category term="sqlite"/><category term="alex-garcia"/></entry><entry><title>Introducing sqlite-html: query, parse, and generate HTML in SQLite</title><link href="https://simonwillison.net/2022/Aug/3/sqlite-html/#atom-tag" rel="alternate"/><published>2022-08-03T17:31:16+00:00</published><updated>2022-08-03T17:31:16+00:00</updated><id>https://simonwillison.net/2022/Aug/3/sqlite-html/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-html"&gt;Introducing sqlite-html: query, parse, and generate HTML in SQLite&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Another brilliant SQLite extension module from Alex Garcia, this time written in Go. sqlite-html adds a whole family of functions to SQLite for parsing and constructing HTML strings, built on the Go goquery and cascadia libraries. Once again, Alex uses an Observable notebook to describe the new features, with embedded interactive examples that are backed by a Datasette instance running in Fly.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://til.simonwillison.net/sqlite/trying-macos-extensions"&gt;My TIL on Trying out SQLite extensions on macOS&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/html"&gt;html&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="go"/><category term="html"/><category term="sqlite"/><category term="datasette"/><category term="alex-garcia"/></entry><entry><title>Introducing sqlite-lines - a SQLite extension for reading files line-by-line</title><link href="https://simonwillison.net/2022/Jul/30/sqlite-lines/#atom-tag" rel="alternate"/><published>2022-07-30T19:18:53+00:00</published><updated>2022-07-30T19:18:53+00:00</updated><id>https://simonwillison.net/2022/Jul/30/sqlite-lines/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@asg017/introducing-sqlite-lines"&gt;Introducing sqlite-lines - a SQLite extension for reading files line-by-line&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Alex Garcia wrote a brilliant C module for SQLIte which adds functions (and a table-valued function) for efficiently reading newline-delimited text into SQLite. When combined with SQLite’s built-in JSON features this means you can read a huge newline-delimited JSON file into SQLite in a streaming fashion so it doesn’t exhaust memory for a large file. Alex also compiled the extension to WebAssembly, and his post here is an Observable notebook post that lets you exercise the code directly.

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/agarcia_me/status/1553266886929920000"&gt;@agarcia_me&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="sqlite"/><category term="observable"/><category term="webassembly"/><category term="alex-garcia"/></entry></feed>