<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: json</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/json.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-04-05T00:32:11+00:00</updated><author><name>Simon Willison</name></author><entry><title>research-llm-apis 2026-04-04</title><link href="https://simonwillison.net/2026/Apr/5/research-llm-apis/#atom-tag" rel="alternate"/><published>2026-04-05T00:32:11+00:00</published><updated>2026-04-05T00:32:11+00:00</updated><id>https://simonwillison.net/2026/Apr/5/research-llm-apis/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research-llm-apis/releases/tag/2026-04-04"&gt;research-llm-apis 2026-04-04&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I'm working on a &lt;a href="https://github.com/simonw/llm/issues/1314"&gt;major change&lt;/a&gt; to my LLM Python library and CLI tool. LLM provides an abstraction layer over hundreds of different LLMs from dozens of different vendors thanks to its plugin system, and some of those vendors have grown new features over the past year which LLM's abstraction layer can't handle, such as server-side tool execution.&lt;/p&gt;
&lt;p&gt;To help design that new abstraction layer I had Claude Code read through the Python client libraries for Anthropic, OpenAI, Gemini and Mistral and use those to help craft &lt;code&gt;curl&lt;/code&gt; commands to access the raw JSON for both streaming and non-streaming modes across a range of different scenarios. Both the scripts and the captured outputs now live in this new repo.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="json"/><category term="llms"/><category term="llm"/></entry><entry><title>We Rewrote JSONata with AI in a Day, Saved $500K/Year</title><link href="https://simonwillison.net/2026/Mar/27/vine-porting-jsonata/#atom-tag" rel="alternate"/><published>2026-03-27T00:35:01+00:00</published><updated>2026-03-27T00:35:01+00:00</updated><id>https://simonwillison.net/2026/Mar/27/vine-porting-jsonata/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.reco.ai/blog/we-rewrote-jsonata-with-ai"&gt;We Rewrote JSONata with AI in a Day, Saved $500K/Year&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Bit of a hyperbolic framing but this looks like another case study of &lt;strong&gt;vibe porting&lt;/strong&gt;, this time spinning up a new custom Go implementation of the &lt;a href="https://jsonata.org"&gt;JSONata&lt;/a&gt; JSON expression language - similar in focus to jq, and heavily associated with the &lt;a href="https://nodered.org"&gt;Node-RED&lt;/a&gt; platform.&lt;/p&gt;
&lt;p&gt;As with other vibe-porting projects the key enabling factor was JSONata's existing test suite, which helped build the first working Go version in 7 hours and $400 of token spend.&lt;/p&gt;
&lt;p&gt;The Reco team then used a shadow deployment for a week to run the new and old versions in parallel to confirm the new implementation exactly matched the behavior of the old one.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/go"&gt;go&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-porting"&gt;vibe-porting&lt;/a&gt;&lt;/p&gt;



</summary><category term="go"/><category term="json"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="agentic-engineering"/><category term="vibe-porting"/></entry><entry><title>SQLite Tags Benchmark: Comparing 5 Tagging Strategies</title><link href="https://simonwillison.net/2026/Mar/20/sqlite-tags-benchmark/#atom-tag" rel="alternate"/><published>2026-03-20T02:57:00+00:00</published><updated>2026-03-20T02:57:00+00:00</updated><id>https://simonwillison.net/2026/Mar/20/sqlite-tags-benchmark/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Research:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme"&gt;SQLite Tags Benchmark: Comparing 5 Tagging Strategies&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I had Claude Code run a micro-benchmark comparing different approaches to implementing tagging in SQLite. Traditional many-to-many tables won, but FTS5 came a close second. Full table scans with LIKE queries performed better than I expected, but full table scans with JSON arrays and &lt;code&gt;json_each()&lt;/code&gt; were much slower.&lt;/p&gt;
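&lt;p&gt;To make two of those strategies concrete, here is a minimal sketch using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; module. The table names, column names and sample rows are hypothetical and are not taken from the benchmark itself. It assumes a SQLite build with the JSON functions available, and it only shows what the queries look like, not how fast they run.&lt;/p&gt;

```python
import json
import sqlite3

# Hypothetical schema: each post stores its tags twice, once as a JSON
# array column and once in a traditional many-to-many join table.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, tags_json TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE post_tags (
        post_id INTEGER REFERENCES posts(id),
        tag_id INTEGER REFERENCES tags(id),
        PRIMARY KEY (post_id, tag_id)
    );
""")

# Made-up sample data for illustration only.
sample = {
    "Jiter": ["json", "python"],
    "DuckDB notes": ["duckdb"],
    "json-flatten": ["json"],
}
for title, tag_names in sample.items():
    post_id = db.execute(
        "INSERT INTO posts (title, tags_json) VALUES (?, ?)",
        (title, json.dumps(tag_names)),
    ).lastrowid
    for name in tag_names:
        db.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = db.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        db.execute("INSERT INTO post_tags VALUES (?, ?)", (post_id, tag_id))


def titles_many_to_many(tag):
    """Look up posts through the join table."""
    rows = db.execute(
        """
        SELECT posts.title FROM posts
        JOIN post_tags ON post_tags.post_id = posts.id
        JOIN tags ON tags.id = post_tags.tag_id
        WHERE tags.name = ? ORDER BY posts.id
        """,
        (tag,),
    )
    return [row[0] for row in rows]


def titles_json_each(tag):
    """Look up posts by expanding every row's JSON array with json_each()."""
    rows = db.execute(
        """
        SELECT posts.title FROM posts, json_each(posts.tags_json)
        WHERE json_each.value = ? ORDER BY posts.id
        """,
        (tag,),
    )
    return [row[0] for row in rows]


print(titles_many_to_many("json"))
print(titles_json_each("json"))
```

&lt;p&gt;Both functions return the same titles. The second one has to expand the JSON array of every row in &lt;code&gt;posts&lt;/code&gt;, which is the full table scan described above.&lt;/p&gt;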
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="json"/><category term="sqlite"/></entry><entry><title>Open Responses</title><link href="https://simonwillison.net/2026/Jan/15/open-responses/#atom-tag" rel="alternate"/><published>2026-01-15T23:56:56+00:00</published><updated>2026-01-15T23:56:56+00:00</updated><id>https://simonwillison.net/2026/Jan/15/open-responses/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.openresponses.org/"&gt;Open Responses&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is the standardization effort I've most wanted in the world of LLMs: a vendor-neutral specification for the JSON API that clients can use to talk to hosted LLMs.&lt;/p&gt;
&lt;p&gt;Open Responses aims to provide exactly that as a documented standard, derived from OpenAI's Responses API.&lt;/p&gt;
&lt;p&gt;I was hoping for one based on their older Chat Completions API since so many other products have cloned it already, but basing it on Responses does make sense since that API was designed with the features of more recent models - such as reasoning traces - baked into the design.&lt;/p&gt;
&lt;p&gt;What's certainly notable is the list of launch partners. OpenRouter alone means we can expect to be able to use this protocol with almost every existing model, and Hugging Face, LM Studio, vLLM, Ollama and Vercel cover a huge portion of the common tools used to serve models.&lt;/p&gt;
&lt;p&gt;For protocols like this I really want to see a comprehensive, language-independent conformance test suite. Open Responses has a subset of that - the official repository includes &lt;a href="https://github.com/openresponses/openresponses/blob/d0f23437b27845d5c3d0abaf5cb5c4a702f26b05/src/lib/compliance-tests.ts"&gt;src/lib/compliance-tests.ts&lt;/a&gt; which can be used to exercise a server implementation, and is available as a React app &lt;a href="https://www.openresponses.org/compliance"&gt;on the official site&lt;/a&gt; that can be pointed at any implementation served via CORS.&lt;/p&gt;
&lt;p&gt;What's missing is the equivalent for clients. I plan to spin up my own client library for this in Python and I'd really like to be able to run that against a conformance suite designed to check that my client correctly handles all of the details.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/reach_vb/status/2011863516852965565"&gt;VB&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/standards"&gt;standards&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openrouter"&gt;openrouter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/conformance-suites"&gt;conformance-suites&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="standards"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="openrouter"/><category term="conformance-suites"/></entry><entry><title>Progressive JSON</title><link href="https://simonwillison.net/2025/Jun/1/progressive-json/#atom-tag" rel="alternate"/><published>2025-06-01T04:45:32+00:00</published><updated>2025-06-01T04:45:32+00:00</updated><id>https://simonwillison.net/2025/Jun/1/progressive-json/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://overreacted.io/progressive-json/"&gt;Progressive JSON&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This post by Dan Abramov is a trap! It proposes a fascinating way of streaming JSON objects to a client in a way that provides the shape of the JSON before the stream has completed, then fills in the gaps as more data arrives... and then turns out to be a sneaky tutorial in how React Server Components work.&lt;/p&gt;
&lt;p&gt;Ignoring the sneakiness, the imaginary streaming JSON format it describes is a fascinating thought exercise:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  header: "$1",
  post: "$2",
  footer: "$3"
}
/* $1 */
"Welcome to my blog"
/* $3 */
"Hope you like it"
/* $2 */
{
  content: "$4",
  comments: "$5"
}
/* $4 */
"This is my article"
/* $5 */
["$6", "$7", "$8"]
/* $6 */
"This is the first comment"
/* $7 */
"This is the second comment"
/* $8 */
"This is the third comment"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After each block the full JSON document so far can be constructed, and Dan suggests interleaving &lt;code&gt;Promise()&lt;/code&gt; objects along the way for placeholders that have not yet been fully resolved - so after receipt of block &lt;code&gt;$3&lt;/code&gt; above (note that the blocks can be served out of order) the document would look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  header: "Welcome to my blog",
  post: new Promise(/* ... not yet resolved ... */),
  footer: "Hope you like it"
}
&lt;/code&gt;&lt;/pre&gt;
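&lt;p&gt;The assembly step can be sketched in Python under some loud assumptions: the function name, the &lt;code&gt;PENDING&lt;/code&gt; sentinel (standing in for the unresolved &lt;code&gt;Promise&lt;/code&gt;) and the rule that any string starting with &lt;code&gt;$&lt;/code&gt; is a placeholder are all invented here for illustration, and parsing the stream into blocks is left out entirely.&lt;/p&gt;

```python
# Hypothetical sentinel standing in for a not-yet-resolved Promise.
PENDING = "PENDING"


def resolve(node, chunks):
    """Rebuild the document so far, substituting every block already received."""
    # Assumption: any string starting with "$" is a placeholder reference.
    # A real format would need a way to escape literal "$" strings.
    if isinstance(node, str) and node.startswith("$"):
        if node not in chunks:
            return PENDING
        return resolve(chunks[node], chunks)
    if isinstance(node, dict):
        return {key: resolve(value, chunks) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(value, chunks) for value in node]
    return node


root = {"header": "$1", "post": "$2", "footer": "$3"}

# Blocks $1 and $3 have arrived, in that order; $2 has not.
chunks = {"$1": "Welcome to my blog", "$3": "Hope you like it"}
print(resolve(root, chunks))

# Block $2 arrives, but its own children are still outstanding.
chunks["$2"] = {"content": "$4", "comments": "$5"}
print(resolve(root, chunks))
```

&lt;p&gt;Calling &lt;code&gt;resolve()&lt;/code&gt; again after each new block gives the progressively more complete document, regardless of the order in which the blocks arrive.&lt;/p&gt;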
&lt;p&gt;I'm tucking this idea away in case I ever get a chance to try it out in the future.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/react"&gt;react&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dan-abramov"&gt;dan-abramov&lt;/a&gt;&lt;/p&gt;



</summary><category term="javascript"/><category term="json"/><category term="react"/><category term="dan-abramov"/></entry><entry><title>Incomplete JSON Pretty Printer</title><link href="https://simonwillison.net/2025/Mar/28/incomplete-json-pretty-printer/#atom-tag" rel="alternate"/><published>2025-03-28T00:18:43+00:00</published><updated>2025-03-28T00:18:43+00:00</updated><id>https://simonwillison.net/2025/Mar/28/incomplete-json-pretty-printer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://tools.simonwillison.net/incomplete-json-printer"&gt;Incomplete JSON Pretty Printer&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Every now and then a log file or a tool I'm using will spit out a bunch of JSON that terminates unexpectedly, meaning I can't copy it into a text editor and pretty-print it to see what's going on.&lt;/p&gt;
&lt;p&gt;The other day I got frustrated with this and had the then-new GPT-4.5 build me a pretty-printer that didn't mind incomplete JSON, using an OpenAI Canvas. Here's &lt;a href="https://chatgpt.com/share/67dd9d55-7f70-8006-b55d-72730f60ddbe"&gt;the chat&lt;/a&gt; and here's &lt;a href="https://chatgpt.com/canvas/shared/67e5e9b3f7bc8191b2306a123c9d328f"&gt;the resulting interactive&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I spotted a bug with the way it indented code today so I pasted it into Claude 3.7 Sonnet Thinking mode and had it make a bunch of improvements - &lt;a href="https://claude.ai/share/22dc4b58-e8c4-44a4-9650-a37d21513b8d"&gt;full transcript here&lt;/a&gt;. Here's the &lt;a href="https://github.com/simonw/tools/blob/main/incomplete-json-printer.html"&gt;finished code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animated GIF demo - as I type JSON it is pretty printed below, at the end I click the Load Pelican Example button." src="https://static.simonwillison.net/static/2025/pretty-print-json.gif" /&gt;&lt;/p&gt;
&lt;p&gt;In many ways this is a perfect example of &lt;a href="https://simonwillison.net/2025/Mar/19/vibe-coding/"&gt;vibe coding&lt;/a&gt; in action. At no point did I look at a &lt;em&gt;single line&lt;/em&gt; of code that either of the LLMs had written for me. I honestly don't care how this thing works: it could not be lower stakes for me, the worst a bug could do is show me poorly formatted incomplete JSON.&lt;/p&gt;
&lt;p&gt;I was vaguely aware that some kind of state machine style parser would be needed, because you can't parse incomplete JSON with a regular JSON parser. Building simple parsers is the kind of thing LLMs are surprisingly good at, and also the kind of thing I don't want to take on for a trivial project.&lt;/p&gt;
&lt;p&gt;At one point I told Claude "Try using your code execution tool to check your logic", because I happen to know Claude can write and then execute JavaScript independently of using it for artifacts. That helped it out a bunch.&lt;/p&gt;
&lt;p&gt;I later dropped in the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;modify the tool to work better on mobile screens and generally look a bit nicer - and remove the pretty print JSON button, it should update any time the input text is changed. Also add a "copy to clipboard" button next to the results. And add a button that says "example" which adds a longer incomplete example to demonstrate the tool, make that example pelican themed.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It's fun being able to say "generally look a bit nicer" and get a perfectly acceptable result!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="tools"/><category term="ai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="claude"/><category term="vibe-coding"/></entry><entry><title>PostgreSQL 17: SQL/JSON is here!</title><link href="https://simonwillison.net/2024/Oct/13/postgresql-sqljson/#atom-tag" rel="alternate"/><published>2024-10-13T19:01:02+00:00</published><updated>2024-10-13T19:01:02+00:00</updated><id>https://simonwillison.net/2024/Oct/13/postgresql-sqljson/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.depesz.com/2024/10/11/sql-json-is-here-kinda-waiting-for-pg-17/"&gt;PostgreSQL 17: SQL/JSON is here!&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Hubert Lubaczewski dives into the new JSON features added in PostgreSQL 17, released a few weeks ago on the &lt;a href="https://www.postgresql.org/about/news/postgresql-17-released-2936/"&gt;26th of September&lt;/a&gt;. This is the latest in his &lt;a href="https://www.depesz.com/tag/waiting/"&gt;long series&lt;/a&gt; of similar posts about new PostgreSQL features.&lt;/p&gt;
&lt;p&gt;The features are based on the new &lt;a href="https://en.wikipedia.org/wiki/SQL:2023"&gt;SQL:2023&lt;/a&gt; standard from June 2023. If you want to actually &lt;em&gt;read&lt;/em&gt; the specification for SQL:2023 it looks like you have to &lt;a href="https://www.iso.org/standard/76583.html"&gt;buy a PDF from ISO&lt;/a&gt; for 194 Swiss Francs (currently $226). Here's a handy summary by Peter Eisentraut: &lt;a href="http://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-finished-here-is-whats-new"&gt;SQL:2023 is finished: Here is what's new&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's a lot of neat stuff in here. I'm particularly interested in the &lt;code&gt;json_table()&lt;/code&gt; table-valued function, which can convert a JSON string into a table with quite a lot of flexibility. You can even specify a full table schema as part of the function call:&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;SELECT&lt;/span&gt; &lt;span class="pl-k"&gt;*&lt;/span&gt; &lt;span class="pl-k"&gt;FROM&lt;/span&gt; json_table(
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;[{"a":10,"b":20},{"a":30,"b":40}]&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;::jsonb,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;$[*]&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
    COLUMNS (
        id FOR ORDINALITY,
        column_a int4 &lt;span class="pl-k"&gt;path&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;$.a&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;,
        column_b int4 &lt;span class="pl-k"&gt;path&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;$.b&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;,
        a int4,
        b int4,
        c &lt;span class="pl-k"&gt;text&lt;/span&gt;
    )
);&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;SQLite has &lt;a href="https://www.sqlite.org/json1.html"&gt;solid JSON support already&lt;/a&gt; and often imitates PostgreSQL features, so I wonder if we'll see an update to SQLite that reflects some aspects of this new syntax.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/spw1je/sql_json_is_here_kinda_waiting_for_pg_17"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postgresql"&gt;postgresql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="postgresql"/><category term="sql"/><category term="sqlite"/></entry><entry><title>Wikidata is a Giant Crosswalk File</title><link href="https://simonwillison.net/2024/Oct/5/wikidata-is-a-giant-crosswalk-file/#atom-tag" rel="alternate"/><published>2024-10-05T15:45:36+00:00</published><updated>2024-10-05T15:45:36+00:00</updated><id>https://simonwillison.net/2024/Oct/5/wikidata-is-a-giant-crosswalk-file/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.dbreunig.com/2024/10/04/wikidata-is-a-giant-crosswalk-file.html"&gt;Wikidata is a Giant Crosswalk File&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Drew Breunig shows how to take the 140GB Wikidata JSON export, use &lt;code&gt;sed 's/,$//'&lt;/code&gt; to convert it to newline-delimited JSON, then use DuckDB to run queries and extract external identifiers, including a query that pulls out 500MB of latitude and longitude points.&lt;/p&gt;
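&lt;p&gt;For anyone who doesn't read &lt;code&gt;sed&lt;/code&gt;: that expression removes a single comma at the end of each line. A rough Python equivalent, using a made-up three-line sample rather than real Wikidata records, might look like this:&lt;/p&gt;

```python
import json


def strip_trailing_comma(line):
    """Mimic sed 's/,$//': drop one comma at the very end of the line."""
    line = line.rstrip("\n")
    return line[:-1] if line.endswith(",") else line


# Hypothetical sample: one JSON object per line, comma-separated as they
# would be inside a large JSON array.
sample_lines = [
    '{"id": "Q1", "label": "first"},\n',
    '{"id": "Q2", "label": "second"},\n',
    '{"id": "Q3", "label": "third"}\n',
]

# Once the commas are gone each line parses on its own.
records = [json.loads(strip_trailing_comma(line)) for line in sample_lines]
print([record["id"] for record in records])
```

&lt;p&gt;Because every line is handled independently, the same approach works when streaming a file far too large to load into memory.&lt;/p&gt;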


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wikipedia"&gt;wikipedia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/duckdb"&gt;duckdb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/drew-breunig"&gt;drew-breunig&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="wikipedia"/><category term="duckdb"/><category term="drew-breunig"/></entry><entry><title>Jiter</title><link href="https://simonwillison.net/2024/Sep/22/jiter/#atom-tag" rel="alternate"/><published>2024-09-22T20:03:07+00:00</published><updated>2024-09-22T20:03:07+00:00</updated><id>https://simonwillison.net/2024/Sep/22/jiter/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/pydantic/jiter/tree/main/crates/jiter-python"&gt;Jiter&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;One of the challenges in dealing with LLM streaming APIs is the need to parse partial JSON - until the stream has ended you won't have a complete valid JSON object, but you may want to display components of that JSON as they become available.&lt;/p&gt;
&lt;p&gt;I've solved this previously using the &lt;a href="https://pypi.org/project/ijson/"&gt;ijson&lt;/a&gt; streaming JSON library, see &lt;a href="https://til.simonwillison.net/json/ijson-stream"&gt;my previous TIL&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Today I found out about Jiter, a new option from the team behind Pydantic. It's written in Rust and extracted from &lt;a href="https://github.com/pydantic/pydantic-core"&gt;pydantic-core&lt;/a&gt;, so the Python wrapper for it can be installed using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install jiter
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can feed it an incomplete JSON bytes object and use &lt;code&gt;partial_mode="on"&lt;/code&gt; to parse the valid subset:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;jiter&lt;/span&gt;
&lt;span class="pl-s1"&gt;partial_json&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;b'{"name": "John", "age": 30, "city": "New Yor'&lt;/span&gt;
&lt;span class="pl-s1"&gt;jiter&lt;/span&gt;.&lt;span class="pl-en"&gt;from_json&lt;/span&gt;(&lt;span class="pl-s1"&gt;partial_json&lt;/span&gt;, &lt;span class="pl-s1"&gt;partial_mode&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"on"&lt;/span&gt;)
&lt;span class="pl-c"&gt;# {'name': 'John', 'age': 30}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Or use &lt;code&gt;partial_mode="trailing-strings"&lt;/code&gt; to include incomplete string fields too:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;jiter&lt;/span&gt;.&lt;span class="pl-en"&gt;from_json&lt;/span&gt;(&lt;span class="pl-s1"&gt;partial_json&lt;/span&gt;, &lt;span class="pl-s1"&gt;partial_mode&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"trailing-strings"&lt;/span&gt;)
&lt;span class="pl-c"&gt;# {'name': 'John', 'age': 30, 'city': 'New Yor'}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;a href="https://github.com/pydantic/jiter/blob/ae5fc7d8548c90ad8762dfdf2ea6461776c2feb6/crates/jiter-python/README.md"&gt;current README&lt;/a&gt; was a little thin, so I submitted &lt;a href="https://github.com/pydantic/jiter/pull/143"&gt;a PR&lt;/a&gt; with some extra examples. I &lt;a href="https://gist.github.com/simonw/264d487db1a18f8585c2ca0c68e50d1e"&gt;got some help&lt;/a&gt; from &lt;code&gt;files-to-prompt&lt;/code&gt; and Claude 3.5 Sonnet:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;cd crates/jiter-python/ &amp;amp;&amp;amp; files-to-prompt -c README.md tests | llm -m claude-3.5-sonnet --system 'write a new README with comprehensive documentation'&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=41615404#41618393"&gt;jackmpcollins on Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pydantic"&gt;pydantic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="python"/><category term="rust"/><category term="ai-assisted-programming"/><category term="pydantic"/><category term="files-to-prompt"/></entry><entry><title>How streaming LLM APIs work</title><link href="https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag" rel="alternate"/><published>2024-09-22T03:48:12+00:00</published><updated>2024-09-22T03:48:12+00:00</updated><id>https://simonwillison.net/2024/Sep/22/how-streaming-llm-apis-work/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis"&gt;How streaming LLM APIs work&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New TIL. I used &lt;code&gt;curl&lt;/code&gt; to explore the streaming APIs provided by OpenAI, Anthropic and Google Gemini and wrote up detailed notes on what I learned.&lt;/p&gt;
&lt;p&gt;Also includes example code for &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus-accessing-these-streams-using-httpx"&gt;receiving streaming events in Python with HTTPX&lt;/a&gt; and &lt;a href="https://til.simonwillison.net/llms/streaming-llm-apis#user-content-bonus--2-processing-streaming-events-in-javascript-with-fetch"&gt;receiving streaming events in client-side JavaScript using fetch()&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="http"/><category term="json"/><category term="llms"/></entry><entry><title>json-flatten, now with format documentation</title><link href="https://simonwillison.net/2024/Sep/7/json-flatten/#atom-tag" rel="alternate"/><published>2024-09-07T05:43:01+00:00</published><updated>2024-09-07T05:43:01+00:00</updated><id>https://simonwillison.net/2024/Sep/7/json-flatten/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/json-flatten?tab=readme-ov-file#json-flattening-format"&gt;json-flatten, now with format documentation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;json-flatten&lt;/code&gt; is a fun little Python library I put together a few years ago for converting JSON data into a flat key-value format, suitable for inclusion in an HTML form or query string. It lets you take a structure like this one:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{"foo": {"bar": [1, True, None]}}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And convert it into key-value pairs like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;foo.bar.[0]$int=1
foo.bar.[1]$bool=True
foo.bar.[2]$none=None
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;flatten(dictionary)&lt;/code&gt; function converts to that format, and &lt;code&gt;unflatten(dictionary)&lt;/code&gt; converts back again.&lt;/p&gt;
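&lt;p&gt;To show how the key format is put together, here is a simplified sketch written for this note. It is not the library's actual code. &lt;code&gt;flatten_sketch&lt;/code&gt; is a hypothetical name, only the &lt;code&gt;$int&lt;/code&gt;, &lt;code&gt;$bool&lt;/code&gt; and &lt;code&gt;$none&lt;/code&gt; suffixes shown above are handled, and passing every other value through as an unsuffixed string is an assumption.&lt;/p&gt;

```python
def flatten_sketch(value, prefix=""):
    """Flatten nested dicts and lists into a {key: string} dictionary."""
    pairs = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            pairs.update(flatten_sketch(child, child_prefix))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            child_prefix = f"{prefix}.[{index}]" if prefix else f"[{index}]"
            pairs.update(flatten_sketch(child, child_prefix))
    elif isinstance(value, bool):
        # bool must be checked before int because bool is a subclass of int
        pairs[f"{prefix}$bool"] = str(value)
    elif isinstance(value, int):
        pairs[f"{prefix}$int"] = str(value)
    elif value is None:
        pairs[f"{prefix}$none"] = "None"
    else:
        # Assumption: everything else is stored as a plain string, no suffix
        pairs[prefix] = str(value)
    return pairs


for key, value in flatten_sketch({"foo": {"bar": [1, True, None]}}).items():
    print(f"{key}={value}")
```

&lt;p&gt;The type suffix on each key is what would let a matching unflatten step turn the string values back into integers, booleans and &lt;code&gt;None&lt;/code&gt;. This sketch silently drops empty lists and dictionaries, which a complete implementation would need to represent.&lt;/p&gt;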
&lt;p&gt;I was considering the library for a project today and realized that &lt;a href="https://github.com/simonw/json-flatten/blob/0.3/README.md"&gt;the 0.3 README&lt;/a&gt; was a little thin - it showed how to use the library but didn't provide full details of the format it used.&lt;/p&gt;
&lt;p&gt;On a hunch, I decided to see if &lt;a href="https://simonwillison.net/2024/Apr/8/files-to-prompt/"&gt;files-to-prompt&lt;/a&gt; plus &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; plus Claude 3.5 Sonnet could write that documentation for me. I ran this command:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;files-to-prompt *.py | llm -m claude-3.5-sonnet --system 'write detailed documentation in markdown describing the format used to represent JSON and nested JSON as key/value pairs, include a table as well'&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That &lt;code&gt;*.py&lt;/code&gt; picked up both &lt;code&gt;json_flatten.py&lt;/code&gt; and &lt;code&gt;test_json_flatten.py&lt;/code&gt; - I figured the test file had enough examples in it that it should act as a good source of information for the documentation.&lt;/p&gt;
&lt;p&gt;This worked really well! You can see the &lt;a href="https://gist.github.com/simonw/f5caf4ca24662f0078ec3cffcb040ce4#response"&gt;first draft it produced here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It included before and after examples in the documentation. I didn't fully trust these to be accurate, so I gave it this follow-up prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;llm -c "Rewrite that document to use the Python cog library to generate the examples"&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm a big fan of &lt;a href="https://nedbatchelder.com/code/cog/"&gt;Cog&lt;/a&gt; for maintaining examples in READMEs that are generated by code. Cog has been around for a couple of decades now so it was a safe bet that Claude would know about it.&lt;/p&gt;
&lt;p&gt;This &lt;a href="https://gist.github.com/simonw/f5caf4ca24662f0078ec3cffcb040ce4#response-1"&gt;almost worked&lt;/a&gt; - it produced valid Cog syntax like the following:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;[[[cog
example = {
"fruits": ["apple", "banana", "cherry"]
}

cog.out("```json\n")
cog.out(str(example))
cog.out("\n```\n")
cog.out("Flattened:\n```\n")
for key, value in flatten(example).items():
    cog.out(f"{key}: {value}\n")
cog.out("```\n")
]]]
[[[end]]]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But that wasn't entirely right, because it forgot to include the Markdown comments that would hide the Cog syntax, which should have looked like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;!-- [[[cog --&amp;gt;
...
&amp;lt;!-- ]]] --&amp;gt;
...
&amp;lt;!-- [[[end]]] --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I could have prompted it to correct itself, but at this point I decided to take over and edit the rest of the documentation by hand.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://github.com/simonw/json-flatten/blob/78c2835bf3b7b7cf068fca04a6cf341347dfa2bc/README.md"&gt;end result&lt;/a&gt; was documentation that I'm really happy with, and that I probably wouldn't have bothered to write if Claude hadn't got me started.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-3-5-sonnet"&gt;claude-3-5-sonnet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/files-to-prompt"&gt;files-to-prompt&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="llm"/><category term="anthropic"/><category term="claude"/><category term="claude-3-5-sonnet"/><category term="files-to-prompt"/></entry><entry><title>json-flatten 0.3.1</title><link href="https://simonwillison.net/2024/Sep/7/json-flatten-2/#atom-tag" rel="alternate"/><published>2024-09-07T04:14:40+00:00</published><updated>2024-09-07T04:14:40+00:00</updated><id>https://simonwillison.net/2024/Sep/7/json-flatten-2/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/json-flatten/releases/tag/0.3.1"&gt;json-flatten 0.3.1&lt;/a&gt;&lt;/p&gt;
        
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="json"/><category term="python"/></entry><entry><title>New improved commit messages for scrape-hacker-news-by-domain</title><link href="https://simonwillison.net/2024/Sep/6/improved-commit-messages-csv-diff/#atom-tag" rel="alternate"/><published>2024-09-06T05:40:01+00:00</published><updated>2024-09-06T05:40:01+00:00</updated><id>https://simonwillison.net/2024/Sep/6/improved-commit-messages-csv-diff/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/issues/6"&gt;New improved commit messages for scrape-hacker-news-by-domain&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
My &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain"&gt;simonw/scrape-hacker-news-by-domain&lt;/a&gt; repo has a very specific purpose. Once an hour it scrapes the Hacker News &lt;a href="https://news.ycombinator.com/from?site=simonwillison.net"&gt;/from?site=simonwillison.net&lt;/a&gt; page (and the equivalent &lt;a href="https://news.ycombinator.com/from?site=datasette.io"&gt;for datasette.io&lt;/a&gt;) using my &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; tool and stashes the parsed links, scores and comment counts in JSON files in that repo.&lt;/p&gt;
&lt;p&gt;It does this mainly so I can subscribe to GitHub's Atom feed of the commit log - visit &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/commits/main"&gt;simonw/scrape-hacker-news-by-domain/commits/main&lt;/a&gt; and add &lt;code&gt;.atom&lt;/code&gt; to the URL to get that.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://netnewswire.com/"&gt;NetNewsWire&lt;/a&gt; will inform me within about an hour if any of my content has made it to Hacker News, and the repo will track the score and comment count for me over time. I wrote more about how this works in &lt;a href="https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/#scrape-a-web-page"&gt;Scraping web pages from the command line with shot-scraper&lt;/a&gt; back in March 2022.&lt;/p&gt;
&lt;p&gt;Prior to the latest improvement, the commit messages themselves were pretty uninformative. The message had the date, and to actually see which Hacker News post it was referring to, I had to click through to the commit and look at the diff.&lt;/p&gt;
&lt;p&gt;I built my &lt;a href="https://github.com/simonw/csv-diff"&gt;csv-diff&lt;/a&gt; tool a while back to help address this problem: it can produce a slightly more human-readable version of a diff between two CSV or JSON files, ideally suited for including in a commit message attached to a &lt;a href="https://simonwillison.net/tags/git-scraping/"&gt;git scraping&lt;/a&gt; repo like this one.&lt;/p&gt;
&lt;p&gt;I &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/commit/35aa3c6c03507d89dd2eb7afa54839b2575b0e33"&gt;got that working&lt;/a&gt;, but there was still room for improvement. I recently learned that any Hacker News thread has an undocumented URL at &lt;code&gt;/latest?id=x&lt;/code&gt; which displays the most recently added comments at the top.&lt;/p&gt;
&lt;p&gt;I wanted that in my commit messages, so I could quickly click a link to see the most recent comments on a thread.&lt;/p&gt;
&lt;p&gt;So... I added one more feature to &lt;code&gt;csv-diff&lt;/code&gt;: a new &lt;a href="https://github.com/simonw/csv-diff/issues/38"&gt;--extra option&lt;/a&gt; lets you specify a Python format string to be used to add extra fields to the displayed difference.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/blob/main/.github/workflows/scrape.yml"&gt;GitHub Actions workflow&lt;/a&gt; now runs this command:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;csv-diff simonwillison-net.json simonwillison-net-new.json \
  --key id --format json \
  --extra latest 'https://news.ycombinator.com/latest?id={id}' \
  &amp;gt;&amp;gt; /tmp/commit.txt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This generates the diff between the two versions, using the &lt;code&gt;id&lt;/code&gt; property in the JSON to tie records together. It adds a &lt;code&gt;latest&lt;/code&gt; field linking to that URL.&lt;/p&gt;
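&lt;p&gt;Here's a rough sketch of the idea in Python. This is not the actual &lt;code&gt;csv-diff&lt;/code&gt; implementation: the &lt;code&gt;extras_for_row()&lt;/code&gt; helper is hypothetical, and the sample row reuses the values from the commit shown below purely for illustration. It assumes each extra is a name paired with a format string that gets filled in from the fields of a row.&lt;/p&gt;

```python
def extras_for_row(row, extras):
    """Fill each named format string in from the fields of one row."""
    return {name: template.format(**row) for name, template in extras.items()}


# Hypothetical row, shaped like one record in the scraped JSON
row = {"id": "41459472", "points": "27", "numComments": "8"}
extras = {"latest": "https://news.ycombinator.com/latest?id={id}"}
print(extras_for_row(row, extras))
```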
&lt;p&gt;The commits now &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/commit/bda23fc358d978392d38933083ba1c49f50c107a"&gt;look like this&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Fri Sep 6 05:22:32 UTC 2024. 1 row changed. id: 41459472 points: &amp;quot;25&amp;quot; =&amp;gt; &amp;quot;27&amp;quot; numComments: &amp;quot;7&amp;quot; =&amp;gt; &amp;quot;8&amp;quot; extras: latest: https://news.ycombinator.com/latest?id=41459472" src="https://static.simonwillison.net/static/2024/hacker-news-commit.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/hacker-news"&gt;hacker-news&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-actions"&gt;github-actions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/git-scraping"&gt;git-scraping&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/shot-scraper"&gt;shot-scraper&lt;/a&gt;&lt;/p&gt;



</summary><category term="hacker-news"/><category term="json"/><category term="projects"/><category term="github-actions"/><category term="git-scraping"/><category term="shot-scraper"/></entry><entry><title>LLMs are bad at returning code in JSON</title><link href="https://simonwillison.net/2024/Aug/16/llms-are-bad-at-returning-code-in-json/#atom-tag" rel="alternate"/><published>2024-08-16T17:04:39+00:00</published><updated>2024-08-16T17:04:39+00:00</updated><id>https://simonwillison.net/2024/Aug/16/llms-are-bad-at-returning-code-in-json/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://aider.chat/2024/08/14/code-in-json.html"&gt;LLMs are bad at returning code in JSON&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Paul Gauthier's &lt;a href="https://aider.chat/"&gt;Aider&lt;/a&gt; is a terminal-based coding assistant which works against multiple different models. As part of developing the project Paul runs extensive benchmarks, and his latest shows an interesting result: LLMs are slightly less reliable at producing working code if you request that code be returned as part of a JSON response.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Coding skill by model and code wrapping strategy - four models, each showing their pass rate % average of five runs. Claude 3.5 Sonnet gets 60.5% with Markdown, 54.1% with JSON. DeepSeek-Coder V2 0724 gets 60.6% with Markdown, 51.1% with JSON. GPT-4o-2024-05-13 gets 60.0% with Markdown, 59.6% with JSON. GPT-4o-2024-08-06 gets 60.8% with Markdown, 57.6% with JSON, and 56.9% with JSON (strict). Markdown consistently performs better than JSON across all models." src="https://static.simonwillison.net/static/2024/llm-code-json.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;The May release of GPT-4o shows the smallest gap between Markdown and JSON (60.0% against 59.6%) - the August release appears to have regressed slightly, and the new structured output mode doesn't help and could even make things worse (though that difference may not be statistically significant).&lt;/p&gt;
&lt;p&gt;Paul recommends using Markdown delimiters here instead, which are less likely to introduce confusing nested quoting issues.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/paulgauthier/status/1824442504290374061"&gt;@paulgauthier&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/aider"&gt;aider&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/evals"&gt;evals&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paul-gauthier"&gt;paul-gauthier&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="aider"/><category term="evals"/><category term="paul-gauthier"/></entry><entry><title>Share Claude conversations by converting their JSON to Markdown</title><link href="https://simonwillison.net/2024/Aug/8/convert-claude-json-to-markdown/#atom-tag" rel="alternate"/><published>2024-08-08T20:40:20+00:00</published><updated>2024-08-08T20:40:20+00:00</updated><id>https://simonwillison.net/2024/Aug/8/convert-claude-json-to-markdown/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@simonw/convert-claude-json-to-markdown"&gt;Share Claude conversations by converting their JSON to Markdown&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Anthropic's &lt;a href="https://claude.ai/"&gt;Claude&lt;/a&gt; is missing one key feature that I really appreciate in ChatGPT: the ability to create a public link to a full conversation transcript. You can publish individual artifacts from Claude, but I often find myself wanting to publish the whole conversation.&lt;/p&gt;
&lt;p&gt;Before ChatGPT added that feature I solved it myself with &lt;a href="https://observablehq.com/@simonw/chatgpt-json-transcript-to-markdown"&gt;this ChatGPT JSON transcript to Markdown Observable notebook&lt;/a&gt;. Today I built the same thing for Claude.&lt;/p&gt;
&lt;p&gt;Here's how to use it:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animated demo - starting on the Claude homepage, opening a conversation with the DevTools network panel open, searching for chat_ and then using Copy -&amp;gt; Response to get the JSON, then switching tabs to the Observable notebook and pasting that JSON in to get Markdown." src="https://static.simonwillison.net/static/2024/claude-json-markdown.gif" /&gt;&lt;/p&gt;
&lt;p&gt;The key is to load a Claude conversation on their website with your browser DevTools network panel open and then filter URLs for &lt;code&gt;chat_&lt;/code&gt;. You can use the Copy -&amp;gt; Response right-click menu option to get the JSON for that conversation, then paste it into that &lt;a href="https://observablehq.com/@simonw/convert-claude-json-to-markdown"&gt;new Observable notebook&lt;/a&gt; to get a Markdown transcript.&lt;/p&gt;
&lt;p&gt;I like sharing these by pasting them into a "secret" &lt;a href="https://gist.github.com/"&gt;Gist&lt;/a&gt; - that way they won't be indexed by search engines (adding more AI generated slop to the world) but can still be shared with people who have the link.&lt;/p&gt;
&lt;p&gt;Here's an &lt;a href="https://gist.github.com/simonw/95abdfa3cdf755dbe6feb5ec4e3029f4"&gt;example transcript&lt;/a&gt; from this morning. I started by asking Claude:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want to breed spiders in my house to get rid of all of the flies. What spider would you recommend?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;When it suggested that this was a bad idea because it might attract pests, I asked:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What are the pests might they attract? I really like possums&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It told me that possums are attracted by food waste, but "deliberately attracting them to your home isn't recommended" - so I said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Thank you for the tips on attracting possums to my house. I will get right on that! [...] Once I have attracted all of those possums, what other animals might be attracted as a result? Do you think I might get a mountain lion?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It emphasized how bad an idea that would be and said "This would be extremely dangerous and is a serious public safety risk.", so I said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;OK. I took your advice and everything has gone wrong: I am now hiding inside my house from the several mountain lions stalking my backyard, which is full of possums&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude has quite a preachy tone when you ask it for advice on things that are clearly a bad idea, which makes winding it up with increasingly ludicrous questions a lot of fun.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="tools"/><category term="markdown"/><category term="ai"/><category term="observable"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/></entry><entry><title>OpenAI: Introducing Structured Outputs in the API</title><link href="https://simonwillison.net/2024/Aug/6/openai-structured-outputs/#atom-tag" rel="alternate"/><published>2024-08-06T18:32:25+00:00</published><updated>2024-08-06T18:32:25+00:00</updated><id>https://simonwillison.net/2024/Aug/6/openai-structured-outputs/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/introducing-structured-outputs-in-the-api/"&gt;OpenAI: Introducing Structured Outputs in the API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
OpenAI have offered structured outputs for a while now: you could specify &lt;code&gt;"response_format": {"type": "json_object"}&lt;/code&gt; to request a valid JSON object, or you could use the &lt;a href="https://platform.openai.com/docs/guides/function-calling"&gt;function calling&lt;/a&gt; mechanism to request responses that match a specific schema.&lt;/p&gt;
&lt;p&gt;Neither of these modes was guaranteed to return valid JSON! In my experience they usually did, but there was always a chance that something could go wrong and the returned data might not match the schema, or might not even be valid JSON at all.&lt;/p&gt;
&lt;p&gt;Outside of OpenAI, techniques like &lt;a href="https://github.com/1rgs/jsonformer"&gt;jsonformer&lt;/a&gt; and &lt;a href="https://til.simonwillison.net/llms/llama-cpp-python-grammars"&gt;llama.cpp grammars&lt;/a&gt; could provide those guarantees against open weights models, by interacting directly with the next-token logic to ensure that only tokens that matched the required schema were selected.&lt;/p&gt;
&lt;p&gt;OpenAI credit that work in this announcement, so they're presumably using the same trick. They've provided two new ways to guarantee valid outputs. The first is a new &lt;code&gt;"strict": true&lt;/code&gt; option for function definitions. The second is a new feature: a &lt;code&gt;"type": "json_schema"&lt;/code&gt; option for the &lt;code&gt;"response_format"&lt;/code&gt; field which lets you then pass a JSON schema (and another &lt;code&gt;"strict": true&lt;/code&gt; flag) to specify your required output.&lt;/p&gt;
&lt;p&gt;I've been using the existing &lt;code&gt;"tools"&lt;/code&gt; mechanism for exactly this already in my &lt;a href="https://github.com/datasette/datasette-extract"&gt;datasette-extract&lt;/a&gt; plugin - defining a function that I have no intention of executing just to get structured data out of the API in the shape that I want.&lt;/p&gt;
&lt;p&gt;Why isn't &lt;code&gt;"strict": true&lt;/code&gt; by default? Here's OpenAI's &lt;a href="https://news.ycombinator.com/item?id=41173223#41174306"&gt;Ted Sanders&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We didn't cover this in the announcement post, but there are a few reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The first request with each JSON schema will be slow, as we need to preprocess the JSON schema into a context-free grammar. If you don't want that latency hit (e.g., you're prototyping, or have a use case that uses variable one-off schemas), then you might prefer "strict": false&lt;/li&gt;
&lt;li&gt;You might have a schema that isn't covered by our subset of JSON schema. (To keep performance fast, we don't support some more complex/long-tail features.)&lt;/li&gt;
&lt;li&gt;In JSON mode and Structured Outputs, failures are rarer but more catastrophic. If the model gets too confused, it can get stuck in loops where it just prints technically valid output forever without ever closing the object. In these cases, you can end up waiting a minute for the request to hit the max_token limit, and you also have to pay for all those useless tokens. So if you have a really tricky schema, and you'd rather get frequent failures back quickly instead of infrequent failures back slowly, you might also want &lt;code&gt;"strict": false&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;But in 99% of cases, you'll want &lt;code&gt;"strict": true&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;More &lt;a href="https://news.ycombinator.com/item?id=41173223#41174213"&gt;from Ted&lt;/a&gt; on how the new mode differs from function calling:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Under the hood, it's quite similar to function calling. A few differences:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Structured Outputs is a bit more straightforward. e.g., you don't have to pretend you're writing a function where the second arg could be a two-page report to the user, and then pretend the "function" was called successfully by returning &lt;code&gt;{"success": true}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Having two interfaces lets us teach the model different default behaviors and styles, depending on which you use&lt;/li&gt;
&lt;li&gt;Another difference is that our current implementation of function calling can return both a text reply plus a function call (e.g., "Let me look up that flight for you"), whereas Structured Outputs will only return the JSON&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The official &lt;code&gt;openai-python&lt;/code&gt; library also &lt;a href="https://github.com/openai/openai-python/commit/bf1ca86cf392eb0ffed1e146937c5d73d8a568f0"&gt;added structured output support&lt;/a&gt; this morning, based on Pydantic and looking very similar to the &lt;a href="https://python.useinstructor.com/"&gt;Instructor library&lt;/a&gt; (also credited as providing inspiration in their announcement).&lt;/p&gt;
&lt;p&gt;There are some key limitations on the new structured output mode, &lt;a href="https://platform.openai.com/docs/guides/structured-outputs/supported-schemas"&gt;described in the documentation&lt;/a&gt;. Only a subset of JSON schema is supported, and most notably the &lt;code&gt;"additionalProperties": false&lt;/code&gt; property must be set on all objects and all object keys must be listed in &lt;code&gt;"required"&lt;/code&gt; - no optional keys are allowed.&lt;/p&gt;
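&lt;p&gt;To make those two rules concrete, here's a minimal sketch in Python that builds a request body without sending it anywhere. The &lt;code&gt;strict_object()&lt;/code&gt; helper, the &lt;code&gt;person&lt;/code&gt; schema and the prompt are all made up for illustration, and nesting the schema under a &lt;code&gt;"json_schema"&lt;/code&gt; key with a &lt;code&gt;"name"&lt;/code&gt; is my reading of the documentation, so check the current docs before relying on it.&lt;/p&gt;

```python
import json


def strict_object(properties):
    """Build an object schema where every key is required and no extras are allowed."""
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }


# Hypothetical schema: a person with a name and an age
schema = strict_object({"name": {"type": "string"}, "age": {"type": "integer"}})

request_body = {
    "model": "gpt-4o-2024-08-06",
    "messages": [{"role": "user", "content": "Extract: Ada Lovelace, 36"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "person", "strict": True, "schema": schema},
    },
}
print(json.dumps(request_body, indent=2))
```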
&lt;p&gt;Another interesting new feature: if the model denies a request on safety grounds a new &lt;a href="https://platform.openai.com/docs/guides/structured-outputs/refusals"&gt;refusal message&lt;/a&gt; will be returned:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  "message": {
    "role": "assistant",
    "refusal": "I'm sorry, I cannot assist with that request."
  }
}
&lt;/code&gt;&lt;/pre&gt;
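&lt;p&gt;Client code probably wants to check for that key before trying to parse the content as JSON. A small sketch, assuming an object shaped like the example above (the &lt;code&gt;content_or_refusal()&lt;/code&gt; helper is hypothetical, not part of any OpenAI library):&lt;/p&gt;

```python
def content_or_refusal(choice):
    """Return ("refusal", text) if the model refused, else ("content", text)."""
    message = choice["message"]
    if message.get("refusal"):
        return "refusal", message["refusal"]
    return "content", message.get("content")


# Same shape as the refusal example above
choice = {
    "message": {
        "role": "assistant",
        "refusal": "I'm sorry, I cannot assist with that request.",
    }
}
print(content_or_refusal(choice))
```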
&lt;p&gt;Finally, tucked away at the bottom of this announcement is a significant new model release with a major price cut:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;By switching to the new &lt;code&gt;gpt-4o-2024-08-06&lt;/code&gt;, developers save 50% on inputs ($2.50/1M input tokens) and 33% on outputs ($10.00/1M output tokens) compared to &lt;code&gt;gpt-4o-2024-05-13&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This new model &lt;a href="https://platform.openai.com/docs/models/gpt-4o"&gt;also supports&lt;/a&gt; 16,384 output tokens, up from 4,096.&lt;/p&gt;
&lt;p&gt;The price change is particularly notable because &lt;a href="https://simonwillison.net/2024/Jul/18/gpt-4o-mini/"&gt;GPT-4o-mini&lt;/a&gt;, the much cheaper alternative to GPT-4o, prices image inputs at the &lt;em&gt;same price&lt;/em&gt; as GPT-4o. This new model cuts that by half (&lt;a href="https://news.ycombinator.com/item?id=41173223#41174929"&gt;confirmed here&lt;/a&gt;), making &lt;code&gt;gpt-4o-2024-08-06&lt;/code&gt; the new cheapest model from OpenAI for handling image inputs.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/structured-extraction"&gt;structured-extraction&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pydantic"&gt;pydantic&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="structured-extraction"/><category term="pydantic"/></entry><entry><title>Ham radio general exam question pool as JSON</title><link href="https://simonwillison.net/2024/May/11/ham-radio-general/#atom-tag" rel="alternate"/><published>2024-05-11T19:16:49+00:00</published><updated>2024-05-11T19:16:49+00:00</updated><id>https://simonwillison.net/2024/May/11/ham-radio-general/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/ham-general-question-pool"&gt;Ham radio general exam question pool as JSON&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I scraped a pass of my Ham radio general exam this morning. One of the tools I used to help me pass was a Datasette instance with all 429 questions from the official question pool. I've published that raw data as JSON on GitHub, which I converted from the official question pool document using &lt;a href="https://observablehq.com/@simonw/ham-general-2024"&gt;an Observable notebook&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Relevant TIL: &lt;a href="https://til.simonwillison.net/ham-radio/general"&gt;How I studied for my Ham radio general exam&lt;/a&gt;.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/radio"&gt;radio&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ham-radio"&gt;ham-radio&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="radio"/><category term="datasette"/><category term="observable"/><category term="ham-radio"/></entry><entry><title>Tips on Adding JSON Output to Your CLI App</title><link href="https://simonwillison.net/2024/Apr/20/json-output-cli/#atom-tag" rel="alternate"/><published>2024-04-20T21:43:58+00:00</published><updated>2024-04-20T21:43:58+00:00</updated><id>https://simonwillison.net/2024/Apr/20/json-output-cli/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.kellybrazil.com/2021/12/03/tips-on-adding-json-output-to-your-cli-app/"&gt;Tips on Adding JSON Output to Your CLI App&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Kelly Brazil - also the author of &lt;code&gt;jc&lt;/code&gt;, the neat CLI tool that converts the output of common Unix utilities such as dig into JSON - provides some useful do's and don'ts for adding JSON output as an option to a command-line tool.&lt;/p&gt;
&lt;p&gt;Kelly recommends defaulting to arrays of flat objects - or newline-delimited objects - and suggests including an "unbuffer" option for streaming tools that discourages the OS from buffering output that is being sent through a pipe.
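&lt;p&gt;Here's a minimal Python sketch of the newline-delimited variant with an opt-in flush after every record. The &lt;code&gt;write_ndjson()&lt;/code&gt; function and its &lt;code&gt;unbuffered&lt;/code&gt; argument are my own hypothetical names, not something taken from Kelly's article or from &lt;code&gt;jc&lt;/code&gt;.&lt;/p&gt;

```python
import io
import json
import sys


def write_ndjson(records, out=sys.stdout, unbuffered=False):
    """Write one flat JSON object per line, optionally flushing after each."""
    for record in records:
        out.write(json.dumps(record) + "\n")
        if unbuffered:
            # Push each line out right away so a downstream pipe sees it promptly
            out.flush()


# Hypothetical records, written to an in-memory buffer for the demo
records = [{"host": "example.com", "ttl": 300}, {"host": "example.org", "ttl": 60}]
buffer = io.StringIO()
write_ndjson(records, out=buffer, unbuffered=True)
print(buffer.getvalue(), end="")
```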

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=40098606"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/cli"&gt;cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;&lt;/p&gt;



</summary><category term="cli"/><category term="json"/></entry><entry><title>Caddy: Config Adapters</title><link href="https://simonwillison.net/2024/Feb/13/caddy-config-adapters/#atom-tag" rel="alternate"/><published>2024-02-13T04:22:08+00:00</published><updated>2024-02-13T04:22:08+00:00</updated><id>https://simonwillison.net/2024/Feb/13/caddy-config-adapters/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://caddyserver.com/docs/config-adapters"&gt;Caddy: Config Adapters&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The Caddy web application server is configured using JSON, but their “config adapters” plugin mechanism allows you to write configuration files in YAML, TOML, JSON5 (JSON with comments), and even nginx format which then gets automatically converted to JSON for you.&lt;/p&gt;

&lt;p&gt;Caddy author Matt Holt: “We put an end to the config format wars in Caddy by letting you use any format you want!”

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/mholt6/status/1757251648148373779"&gt;@mholt6&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/matt-holt"&gt;matt-holt&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="matt-holt"/></entry><entry><title>SQLite 3.45</title><link href="https://simonwillison.net/2024/Jan/15/sqlite-345/#atom-tag" rel="alternate"/><published>2024-01-15T20:15:42+00:00</published><updated>2024-01-15T20:15:42+00:00</updated><id>https://simonwillison.net/2024/Jan/15/sqlite-345/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.sqlite.org/changes.html#version_3_45_0"&gt;SQLite 3.45&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Released today. The big new feature is JSONB support, a new, specific-to-SQLite binary internal representation of JSON which can provide up to a 3x performance improvement for JSON-heavy operations, plus a 5-10% saving in terms of bytes stored on disk.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/databases"&gt;databases&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;&lt;/p&gt;



</summary><category term="databases"/><category term="json"/><category term="sqlite"/></entry><entry><title>jo</title><link href="https://simonwillison.net/2023/Oct/8/jo/#atom-tag" rel="alternate"/><published>2023-10-08T05:20:09+00:00</published><updated>2023-10-08T05:20:09+00:00</updated><id>https://simonwillison.net/2023/Oct/8/jo/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/jpmens/jo"&gt;jo&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat little C utility (available via brew/apt-get install etc) for conveniently outputting JSON from a shell: “jo -p name=jo n=17 parser=false” will output a JSON object with string, integer and boolean values, and you can nest it to create nested objects. Looks very handy.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/4wiwig/shell_tip_print_json_with_printf#c_p9pihd"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="json"/></entry><entry><title>jq 1.7</title><link href="https://simonwillison.net/2023/Oct/2/jq-17/#atom-tag" rel="alternate"/><published>2023-10-02T04:58:54+00:00</published><updated>2023-10-02T04:58:54+00:00</updated><id>https://simonwillison.net/2023/Oct/2/jq-17/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/jqlang/jq/releases/tag/jq-1.7"&gt;jq 1.7&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
First new release of jq in five years! The project has moved from a solo maintainer to a new team with a dedicated GitHub organization. A ton of new features in this release—I’m most excited about the new pick(.key1, .key2.nested) builtin for emitting a selected subset of the incoming objects, and the --raw-output0 option which outputs zero byte delimited lists, designed to be piped to “xargs -0”.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jq"&gt;jq&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="jq"/></entry><entry><title>Lark parsing library JSON tutorial</title><link href="https://simonwillison.net/2023/Aug/13/lark-parsing-library-json-tutorial/#atom-tag" rel="alternate"/><published>2023-08-13T21:50:16+00:00</published><updated>2023-08-13T21:50:16+00:00</updated><id>https://simonwillison.net/2023/Aug/13/lark-parsing-library-json-tutorial/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://lark-parser.readthedocs.io/en/stable/json_tutorial.html"&gt;Lark parsing library JSON tutorial&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A very convincing tutorial for a new-to-me parsing library for Python called Lark.&lt;/p&gt;

&lt;p&gt;The tutorial covers building a full JSON parser from scratch, which ends up being just 19 lines of grammar definition code and 15 lines for the transformer to turn that tree into the final JSON.&lt;/p&gt;

&lt;p&gt;It then gets into the details of optimization—the default Earley algorithm is quite slow, but swapping that out for a LALR parser (a one-line change) provides a 5x speedup for this particular example.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://github.com/spandanb/learndb-py"&gt;spandanb/learndb-py&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/compilers"&gt;compilers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parsing"&gt;parsing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;&lt;/p&gt;



</summary><category term="compilers"/><category term="json"/><category term="parsing"/><category term="python"/></entry><entry><title>Datasette 1.0a3</title><link href="https://simonwillison.net/2023/Aug/9/datasette-10a3/#atom-tag" rel="alternate"/><published>2023-08-09T20:49:35+00:00</published><updated>2023-08-09T20:49:35+00:00</updated><id>https://simonwillison.net/2023/Aug/9/datasette-10a3/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.datasette.io/en/latest/changelog.html#a3-2023-08-09"&gt;Datasette 1.0a3&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A new Datasette alpha release. This one previews the new default JSON API design that’s coming in 1.0—the single most significant change in the 1.0 milestone, since I plan to keep that API stable for many years to come.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="projects"/><category term="datasette"/></entry><entry><title>SQLite 3.42.0</title><link href="https://simonwillison.net/2023/May/18/sqlite/#atom-tag" rel="alternate"/><published>2023-05-18T21:14:07+00:00</published><updated>2023-05-18T21:14:07+00:00</updated><id>https://simonwillison.net/2023/May/18/sqlite/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://sqlite.org/releaselog/3_42_0.html"&gt;SQLite 3.42.0&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The latest SQLite has a tiny feature I requested on the SQLite Forum - &lt;code&gt;SELECT unixepoch('subsec')&lt;/code&gt; now returns the current time as seconds since the Unix epoch with millisecond precision, a big improvement on the previous recipe of &lt;code&gt;select cast((julianday('now') - 2440587.5) * 86400 * 1000 as integer)&lt;/code&gt;!&lt;/p&gt;
&lt;p&gt;Also in the release: JSON5 support (JSON with multi-line strings and comments), a bunch of improvements to the query planner and CLI tool, plus various interesting internal changes.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="sqlite"/></entry><entry><title>Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models</title><link href="https://simonwillison.net/2023/May/8/jsonformer/#atom-tag" rel="alternate"/><published>2023-05-08T23:02:01+00:00</published><updated>2023-05-08T23:02:01+00:00</updated><id>https://simonwillison.net/2023/May/8/jsonformer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/1rgs/jsonformer"&gt;Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
This is such an interesting trick. A common challenge with LLMs is getting them to output a specific JSON shape of data reliably, without occasionally messing up and generating invalid JSON or outputting other text.&lt;/p&gt;

&lt;p&gt;Jsonformer addresses this in a truly ingenious way: it implements code that interacts with the logic that decides which token to output next, influenced by a JSON schema. If that code knows that the next token after a double quote should be a comma it can force the issue for that specific token.&lt;/p&gt;
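&lt;p&gt;The token-forcing idea can be illustrated with a toy sketch. This is not Jsonformer's implementation: the vocabulary, the &lt;code&gt;fake_scores()&lt;/code&gt; stand-in for a model and the hard-coded plan for a single &lt;code&gt;{"name": string}&lt;/code&gt; schema are all assumptions made up for illustration.&lt;/p&gt;
&lt;div class="highlight highlight-source-python"&gt;&lt;pre&gt;
```python
import json

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens
VOCAB = ["{", "}", ":", ",", '"name"', '"Ada"', '"Grace"', "Sure!", "hello"]

# Tokens permitted at each step for the assumed schema {"name": string}
PLAN = [
    ["{"],
    ['"name"'],
    [":"],
    ['"Ada"', '"Grace"'],
    ["}"],
]


def fake_scores(prefix):
    # Stand-in for a language model: it always prefers chatty tokens
    # that would break the JSON, regardless of the prefix so far.
    preference = ["Sure!", "hello", '"Grace"', '"Ada"', '"name"', ",", ":", "}", "{"]
    return {token: len(preference) - preference.index(token) for token in VOCAB}


def generate(constrained=True):
    output = []
    for allowed in PLAN:
        scores = fake_scores(output)
        # Constrained mode only considers tokens the schema allows at this step
        candidates = allowed if constrained else VOCAB
        output.append(max(candidates, key=lambda token: scores[token]))
    return "".join(output)


if __name__ == "__main__":
    print(generate(constrained=False))  # chatty text, not JSON
    text = generate()
    print(text, json.loads(text))  # parses cleanly
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Left unconstrained, this fake model picks its favourite chatty token at every step. With the mask applied it can only choose among tokens that keep the output valid, so the result always parses.&lt;/p&gt;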

&lt;p&gt;This means you can get reliable, robust JSON output even for much smaller, less capable language models.&lt;/p&gt;

&lt;p&gt;It’s built against Hugging Face transformers, but there’s no reason the same idea couldn’t be applied in other contexts as well.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hugging-face"&gt;hugging-face&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="hugging-face"/></entry><entry><title>Datasette: Gather feedback on new ?_extra= design</title><link href="https://simonwillison.net/2023/Mar/22/datasette-json-feedback/#atom-tag" rel="alternate"/><published>2023-03-22T23:14:19+00:00</published><updated>2023-03-22T23:14:19+00:00</updated><id>https://simonwillison.net/2023/Mar/22/datasette-json-feedback/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette/issues/2042"&gt;Datasette: Gather feedback on new ?_extra= design&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
I just landed the single biggest backwards-incompatible change to Datasette ever, in preparation for the 1.0 release. It’s a change to the default JSON format from the Datasette API: the new format is much slimmer, and can be expanded using a new ?_extra= query string parameter. I’m desperately keen on getting feedback on this change! This issue has more details and a call for feedback.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="datasette"/></entry><entry><title>sqlite-jsonschema</title><link href="https://simonwillison.net/2023/Jan/28/sqlite-jsonschema/#atom-tag" rel="alternate"/><published>2023-01-28T03:50:46+00:00</published><updated>2023-01-28T03:50:46+00:00</updated><id>https://simonwillison.net/2023/Jan/28/sqlite-jsonschema/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/asg017/sqlite-jsonschema"&gt;sqlite-jsonschema&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“A SQLite extension for validating JSON objects with JSON Schema”, building on the jsonschema Rust crate. SQLite and JSON are already a great combination—Alex suggests using this extension to implement check constraints to validate JSON columns before inserting into a table, or just to run queries finding existing data that doesn’t match a given schema.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jsonschema"&gt;jsonschema&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/alex-garcia"&gt;alex-garcia&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="jsonschema"/><category term="sqlite"/><category term="rust"/><category term="alex-garcia"/></entry><entry><title>Datasette's new JSON write API: The first alpha of Datasette 1.0</title><link href="https://simonwillison.net/2022/Dec/2/datasette-write-api/#atom-tag" rel="alternate"/><published>2022-12-02T23:15:07+00:00</published><updated>2022-12-02T23:15:07+00:00</updated><id>https://simonwillison.net/2022/Dec/2/datasette-write-api/#atom-tag</id><summary type="html">
    &lt;p&gt;This week I published &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a0-2022-11-29"&gt;the first alpha release of Datasette 1.0&lt;/a&gt;, with a significant new feature: Datasette core now includes &lt;a href="https://docs.datasette.io/en/latest/json_api.html#the-json-write-api"&gt;a JSON API&lt;/a&gt; for creating and dropping tables and inserting, updating and deleting data.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/datasette.svg" alt="The Datasette logo" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Combined with Datasette's existing APIs for reading and filtering table data and executing SELECT queries this effectively turns Datasette into a SQLite-backed JSON data layer for any application.&lt;/p&gt;
&lt;p&gt;If you squint at it the right way, you could even describe it as offering a NoSQL interface to a SQL database!&lt;/p&gt;
&lt;p&gt;My initial motivation for this work was to provide an API for loading data into my &lt;a href="https://datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; SaaS product - but now that I've got it working I'm realizing that it can be applied to a whole host of interesting things.&lt;/p&gt;
&lt;p&gt;I shipped &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a0-2022-11-29"&gt;the 1.0a0 alpha&lt;/a&gt; on Wednesday, then spent the last two days ironing out some bugs (released in &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a1-2022-12-01"&gt;1.0a1&lt;/a&gt;) and building some illustrative demos.&lt;/p&gt;
&lt;h4&gt;Scraping Hacker News to build an Atom feed&lt;/h4&gt;
&lt;p&gt;My first demo reuses my &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain"&gt;scrape-hacker-news-by-domain&lt;/a&gt; project from earlier this year.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://news.ycombinator.com/from?site=simonwillison.net"&gt;https://news.ycombinator.com/from?site=simonwillison.net&lt;/a&gt; is the page on Hacker News that shows submissions from my blog. I like to keep an eye on that page to see if anyone has linked to my work.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/hacker-news-from.jpg" alt="The page lists posts from my blog - the top one has 222 points and 39 comments, but most of the others have 2 or 3 points and no discussion at all." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Data from that page is not currently available through the &lt;a href="https://github.com/HackerNews/API"&gt;official Hacker News API&lt;/a&gt;... but it's in an HTML format that's pretty easy to scrape.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://shot-scraper.datasette.io/"&gt;shot-scraper&lt;/a&gt; command-line browser automation tool has the ability to execute JavaScript against a web page and return scraped data as JSON.&lt;/p&gt;
&lt;p&gt;I wrote about that in &lt;a href="https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/"&gt;Scraping web pages from the command line with shot-scraper&lt;/a&gt;, including a recipe for scraping that Hacker News page that looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;shot-scraper javascript \
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/from?site=simonwillison.net&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -i scrape.js -o simonwillison-net.json&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's that &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain/blob/main/scrape.js"&gt;scrape.js&lt;/a&gt; script.&lt;/p&gt;
&lt;p&gt;I've been running a &lt;a href="https://simonwillison.net/2020/Oct/9/git-scraping/"&gt;Git scraper&lt;/a&gt; that executes that scraping script using GitHub Actions for several months now, out of my &lt;a href="https://github.com/simonw/scrape-hacker-news-by-domain"&gt;simonw/scrape-hacker-news-by-domain&lt;/a&gt; repository.&lt;/p&gt;
&lt;p&gt;Today I modified that script to also publish the data it has scraped to my personal Datasette Cloud account using the new API - and then used the &lt;a href="https://datasette.io/plugins/datasette-atom"&gt;datasette-atom&lt;/a&gt; plugin to generate an Atom feed from that data.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://simon.datasette.cloud/data/hacker_news_posts?_sort_desc=dt"&gt;the new table&lt;/a&gt; in Datasette Cloud.&lt;/p&gt;
&lt;p&gt;This is the &lt;code&gt;bash&lt;/code&gt; script that runs in GitHub Actions and pushes the data to Datasette:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SIMONWILLISON_ROWS=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;  jq -n --argjson rows &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;cat simonwillison-net.json&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{ "rows": $rows, "replace": true }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
curl -X POST \
  https://simon.datasette.cloud/data/hacker_news_posts/-/insert \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Authorization: Bearer &lt;span class="pl-smi"&gt;$DS_TOKEN&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -d &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$SIMONWILLISON_ROWS&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;$DS_TOKEN&lt;/code&gt; is an environment variable containing a signed API token, see the &lt;a href="https://docs.datasette.io/en/latest/authentication.html#api-tokens"&gt;API token documentation&lt;/a&gt; for details.&lt;/p&gt;
&lt;p&gt;I'm using &lt;code&gt;jq&lt;/code&gt; here (with a recipe &lt;a href="https://til.simonwillison.net/gpt3/jq"&gt;generated using GPT-3&lt;/a&gt;) to convert the scraped data into the JSON format needed by the Datasette API. The result looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"rows"&lt;/span&gt;: [
    {
      &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Coping strategies for the serial project hoarder&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"url"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://simonwillison.net/2022/Nov/26/productivity/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"dt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2022-11-27T12:12:56&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: &lt;span class="pl-c1"&gt;222&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"submitter"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;usrme&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"commentsUrl"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/item?id=33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"numComments"&lt;/span&gt;: &lt;span class="pl-c1"&gt;38&lt;/span&gt;
    }
  ],
  &lt;span class="pl-ent"&gt;"replace"&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This is then POSTed up to the &lt;code&gt;https://simon.datasette.cloud/data/hacker_news_posts/-/insert&lt;/code&gt; API endpoint.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;"rows"&lt;/code&gt; key is a list of rows to be inserted.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;"replace": true&lt;/code&gt; tells Datasette to replace any existing rows with the same primary key. Without that, the API would return an error if any rows already existed.&lt;/p&gt;
&lt;p&gt;The API also accepts &lt;code&gt;"ignore": true&lt;/code&gt; which will cause it to ignore any rows that already exist.&lt;/p&gt;
&lt;p&gt;Full insert API documentation &lt;a href="https://docs.datasette.io/en/latest/json_api.html#inserting-rows"&gt;is here&lt;/a&gt;.&lt;/p&gt;
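&lt;p&gt;For scripts written in Python rather than bash, the same request can be prepared with the standard library. This is a sketch under assumptions: &lt;code&gt;build_insert_request()&lt;/code&gt; is a hypothetical helper, the token is a placeholder, and the request is only constructed here, not sent.&lt;/p&gt;
&lt;div class="highlight highlight-source-python"&gt;&lt;pre&gt;
```python
import json
import urllib.request

# Endpoint taken from the bash script above
INSERT_URL = "https://simon.datasette.cloud/data/hacker_news_posts/-/insert"


def build_insert_request(rows, token, replace=False, ignore=False, url=INSERT_URL):
    # Hypothetical helper: builds, but does not send, the POST request
    payload = {"rows": rows}
    if replace:
        payload["replace"] = True  # replace rows with a matching primary key
    if ignore:
        payload["ignore"] = True  # skip rows that already exist
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    rows = [{"id": "33762438", "points": 222}]
    request = build_insert_request(rows, "placeholder-token", replace=True)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # urllib.request.urlopen(request) would send it; that needs a real token
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Keeping construction separate from sending makes it easy to inspect the exact JSON body before anything is written to the table.&lt;/p&gt;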
&lt;h4&gt;Initially creating the table&lt;/h4&gt;
&lt;p&gt;Before I could insert any rows I needed to create the table.&lt;/p&gt;
&lt;p&gt;I did that from the command-line too, using this recipe:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; ROWS=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;  jq -n --argjson rows &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;cat simonwillison-net.json&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \&lt;/span&gt;
&lt;span class="pl-s"&gt;  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{ "table": "hacker_news_posts", "rows": $rows, "pk": "id" }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Use curl to POST some JSON to a URL&lt;/span&gt;
curl -X POST \
  https://simon.datasette.cloud/data/-/create \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Authorization: Bearer &lt;span class="pl-smi"&gt;$DS_TOKEN&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
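  -d &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$ROWS&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;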
&lt;p&gt;This uses the same trick as above, but hits a different API endpoint: &lt;code&gt;/data/-/create&lt;/code&gt; which is the endpoint for &lt;a href="https://docs.datasette.io/en/latest/json_api.html#creating-a-table"&gt;creating a table&lt;/a&gt; in the &lt;code&gt;data.db&lt;/code&gt; database.&lt;/p&gt;
&lt;p&gt;The JSON submitted to that endpoint looks like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"table"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;hacker_news_posts&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"pk"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;id&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"rows"&lt;/span&gt;: [
    {
      &lt;span class="pl-ent"&gt;"id"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"title"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Coping strategies for the serial project hoarder&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"url"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://simonwillison.net/2022/Nov/26/productivity/&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"dt"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;2022-11-27T12:12:56&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"points"&lt;/span&gt;: &lt;span class="pl-c1"&gt;222&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"submitter"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;usrme&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"commentsUrl"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://news.ycombinator.com/item?id=33762438&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
      &lt;span class="pl-ent"&gt;"numComments"&lt;/span&gt;: &lt;span class="pl-c1"&gt;38&lt;/span&gt;
    }
  ]
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It's almost the same shape as the &lt;code&gt;/-/insert&lt;/code&gt; call above. That's because it's using a feature of the Datasette API inherited from &lt;a href="https://sqlite-utils.datasette.io/"&gt;sqlite-utils&lt;/a&gt; - it can create a table from a list of rows, automatically determining the correct schema.&lt;/p&gt;
&lt;p&gt;If you already know your schema you can pass a &lt;code&gt;"columns": [...]&lt;/code&gt; key instead, but I've found that this kind of automatic schema generation works really well in practice.&lt;/p&gt;
&lt;p&gt;Datasette will let you call the create API like that multiple times, and if the table already exists it will insert new rows directly into the existing table. I expect this to be a really convenient way to write automation scripts where you don't want to bother checking if the table exists already.&lt;/p&gt;
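&lt;p&gt;To give a feel for what inferring a schema from a list of rows involves, here is a rough sketch. It is not the algorithm that sqlite-utils or Datasette actually uses: &lt;code&gt;infer_columns()&lt;/code&gt; and its type rules are assumptions for illustration, and the output is a list of name and type pairs of the kind a &lt;code&gt;"columns"&lt;/code&gt; key could carry.&lt;/p&gt;
&lt;div class="highlight highlight-source-python"&gt;&lt;pre&gt;
```python
def infer_columns(rows):
    # Hypothetical rule set: ints map to integer, floats to float,
    # everything else to text. A column that holds more than one
    # kind of value falls back to text.
    kinds = {}
    names = []
    for row in rows:
        for name, value in row.items():
            if name not in names:
                names.append(name)  # keep first-seen column order
            if value is None:
                continue  # nulls say nothing about the type
            if isinstance(value, int):
                kind = "integer"  # bool is a subclass of int in Python
            elif isinstance(value, float):
                kind = "float"
            else:
                kind = "text"
            if kinds.get(name, kind) != kind:
                kind = "text"
            kinds[name] = kind
    # Columns that only ever held None default to text
    return [{"name": name, "type": kinds.get(name, "text")} for name in names]


if __name__ == "__main__":
    rows = [
        {"id": "33762438", "title": "Coping strategies", "points": 222, "numComments": 38}
    ]
    for column in infer_columns(rows):
        print(column)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that the scraped &lt;code&gt;id&lt;/code&gt; is a string in the JSON, so a rule set like this one would treat it as text rather than an integer.&lt;/p&gt;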
&lt;h4&gt;Building an Atom feed&lt;/h4&gt;
&lt;p&gt;My end goal with this demo was to build an Atom feed I could subscribe to in my NetNewsWire feed reader.&lt;/p&gt;
&lt;p&gt;I have a plugin for that already: &lt;a href="https://datasette.io/plugins/datasette-atom"&gt;datasette-atom&lt;/a&gt;, which lets you generate an Atom feed for any data in Datasette, defined using a SQL query.&lt;/p&gt;
&lt;p&gt;I created a SQL view for this (using the &lt;a href="https://datasette.io/plugins/datasette-write"&gt;datasette-write&lt;/a&gt; plugin, which is installed on Datasette Cloud):&lt;/p&gt;
&lt;div class="highlight highlight-source-sql"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;CREATE&lt;/span&gt; &lt;span class="pl-k"&gt;VIEW&lt;/span&gt; &lt;span class="pl-en"&gt;hacker_news_posts_atom&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; &lt;span class="pl-k"&gt;select&lt;/span&gt;
  id &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_id,
  title &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_title,
  url,
  commentsUrl &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_link,
  dt &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Z&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_updated,
  &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Submitter: &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; submitter &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; - &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; points &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; points, &lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;||&lt;/span&gt; numComments &lt;span class="pl-k"&gt;||&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt; comments&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;as&lt;/span&gt; atom_content
&lt;span class="pl-k"&gt;from&lt;/span&gt;
  hacker_news_posts
&lt;span class="pl-k"&gt;order by&lt;/span&gt;
  dt &lt;span class="pl-k"&gt;desc&lt;/span&gt;
&lt;span class="pl-k"&gt;limit&lt;/span&gt;
  &lt;span class="pl-c1"&gt;100&lt;/span&gt;;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;code&gt;datasette-atom&lt;/code&gt; requires a table, view or SQL query that returns &lt;code&gt;atom_id&lt;/code&gt;, &lt;code&gt;atom_title&lt;/code&gt; and &lt;code&gt;atom_updated&lt;/code&gt; columns - and will make use of &lt;code&gt;atom_link&lt;/code&gt; and &lt;code&gt;atom_content&lt;/code&gt; as well if they are present.&lt;/p&gt;
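&lt;p&gt;One way to check a view against those column requirements before pointing the plugin at it is to build it in an in-memory SQLite database. This is only a sketch: the column types in the &lt;code&gt;create table&lt;/code&gt; statement are assumptions, and &lt;code&gt;view_columns()&lt;/code&gt; and &lt;code&gt;build_demo()&lt;/code&gt; are hypothetical helpers, not part of datasette-atom.&lt;/p&gt;
&lt;div class="highlight highlight-source-python"&gt;&lt;pre&gt;
```python
import sqlite3

VIEW_SQL = """
CREATE VIEW hacker_news_posts_atom as select
  id as atom_id,
  title as atom_title,
  url,
  commentsUrl as atom_link,
  dt || 'Z' as atom_updated,
  'Submitter: ' || submitter || ' - ' || points || ' points, ' || numComments || ' comments' as atom_content
from hacker_news_posts
order by dt desc
limit 100
"""

# Columns the plugin is described as requiring
REQUIRED = {"atom_id", "atom_title", "atom_updated"}


def view_columns(conn, name):
    # Column names come from the cursor description of an empty select
    cursor = conn.execute('select * from "' + name + '" limit 0')
    return [column[0] for column in cursor.description]


def build_demo():
    conn = sqlite3.connect(":memory:")
    # Assumed column types; the real table was created from JSON rows
    conn.execute(
        "create table hacker_news_posts (id text primary key, title text, url text,"
        " dt text, points integer, submitter text, commentsUrl text, numComments integer)"
    )
    conn.execute(
        "insert into hacker_news_posts values (?, ?, ?, ?, ?, ?, ?, ?)",
        ("33762438", "Coping strategies for the serial project hoarder",
         "https://simonwillison.net/2022/Nov/26/productivity/", "2022-11-27T12:12:56",
         222, "usrme", "https://news.ycombinator.com/item?id=33762438", 38),
    )
    conn.execute(VIEW_SQL)
    return conn


if __name__ == "__main__":
    conn = build_demo()
    columns = view_columns(conn, "hacker_news_posts_atom")
    print("missing:", sorted(REQUIRED - set(columns)))
    print(conn.execute("select atom_updated, atom_content from hacker_news_posts_atom").fetchone())
```
&lt;/pre&gt;&lt;/div&gt;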
&lt;p&gt;Datasette Cloud defaults to keeping all tables and views private - but a while ago I created the &lt;a href="https://datasette.io/plugins/datasette-public"&gt;datasette-public&lt;/a&gt; plugin to provide a UI for making a table public.&lt;/p&gt;
&lt;p&gt;It turned out this didn't work for SQL views yet, so &lt;a href="https://github.com/simonw/datasette-public/issues/5"&gt;I fixed that&lt;/a&gt; - then used that option to make my view public. You can visit it at:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simon.datasette.cloud/data/hacker_news_posts_atom"&gt;https://simon.datasette.cloud/data/hacker_news_posts_atom&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And to get an Atom feed, just add &lt;code&gt;.atom&lt;/code&gt; to the end of the URL:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simon.datasette.cloud/data/hacker_news_posts_atom.atom"&gt;https://simon.datasette.cloud/data/hacker_news_posts_atom.atom&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here's what it looks like in NetNewsWire:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/netnewswire-hacker-news.jpg" alt="A screenshot of a feed reading interface, showing posts from Hacker News with the submitter, number of points and number of comments" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I'm pretty excited about being able to combine these tools in this way: it makes getting from scraped data to a Datasette table to an Atom feed a very repeatable process.&lt;/p&gt;
&lt;h4&gt;Building a TODO list application&lt;/h4&gt;
&lt;p&gt;My second demo explores what it looks like to develop custom applications against the new API.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://todomvc.com"&gt;TodoMVC&lt;/a&gt; is a project that provides the same TODO list interface built using dozens of different JavaScript frameworks, as a comparison tool.&lt;/p&gt;
&lt;p&gt;I decided to use it to build my own TODO list application, using Datasette as the backend.&lt;/p&gt;
&lt;p&gt;You can try it out at &lt;a href="https://todomvc.datasette.io/"&gt;https://todomvc.datasette.io/&lt;/a&gt; - but be warned that the demo resets every 15 minutes so don't use it for real task tracking!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/todomvc.gif" alt="Animated GIF showing a TODO list interface - I add two items to it, then check one of them off as done, then remove the other one" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The source code for this demo lives in &lt;a href="https://github.com/simonw/todomvc-datasette"&gt;simonw/todomvc-datasette&lt;/a&gt; - which also serves the demo itself using GitHub Pages.&lt;/p&gt;
&lt;p&gt;The code is based on the TodoMVC &lt;a href="https://github.com/tastejs/todomvc/tree/gh-pages/examples/vanillajs"&gt;Vanilla JavaScript example&lt;/a&gt;. I used that unmodified, except for one file - &lt;a href="https://github.com/simonw/todomvc-datasette/blob/main/js/store.js"&gt;store.js&lt;/a&gt;, which I modified to use the Datasette API instead of &lt;code&gt;localStorage&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The demo currently uses a hard-coded authentication token, which is signed to allow actions to be performed against the &lt;a href="https://latest.datasette.io/"&gt;https://latest.datasette.io/&lt;/a&gt; demo instance as a user called &lt;code&gt;todomvc&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That user is granted permissions &lt;a href="https://github.com/simonw/datasette/blob/cab5b60e09e94aca820dbec5308446a88c99ea3d/tests/plugins/my_plugin.py#L223-L230"&gt;in a custom plugin&lt;/a&gt; at the moment, but I plan to provide a more user-friendly way to do this in the future.&lt;/p&gt;
&lt;p&gt;A couple of illustrative snippets of code. First, on page load this constructor uses the Datasette API to create the table used by the application:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-v"&gt;Store&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

  &lt;span class="pl-c"&gt;// Ensure a table exists with this name&lt;/span&gt;
  &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;self&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;`todo_&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"https://latest.datasette.io/ephemeral/-/create"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;table&lt;/span&gt;: &lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;columns&lt;/span&gt;: &lt;span class="pl-kos"&gt;[&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"id"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"integer"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"title"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"text"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;"completed"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;type&lt;/span&gt;: &lt;span class="pl-s"&gt;"integer"&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;pk&lt;/span&gt;: &lt;span class="pl-s"&gt;"id"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Most applications would run against a table that has already been created, but this felt like a good opportunity to show what table creation looks like.&lt;/p&gt;
&lt;p&gt;Note that the table is being created using &lt;code&gt;/ephemeral/-/create&lt;/code&gt; - this endpoint lets you create tables in the ephemeral database, which is a temporary database that drops every table after 15 minutes. I built the &lt;a href="https://datasette.io/plugins/datasette-ephemeral-tables"&gt;datasette-ephemeral-tables&lt;/a&gt; plugin to make this possible.&lt;/p&gt;
&lt;p&gt;Here's the code which is called when a new TODO list item is created or updated:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-v"&gt;Store&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;prototype&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;save&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
&lt;span class="pl-c"&gt;// {title, completed}&lt;/span&gt;
&lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;callback&lt;/span&gt; &lt;span class="pl-c1"&gt;||&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;var&lt;/span&gt; &lt;span class="pl-s1"&gt;table&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;this&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;_dbName&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;

&lt;span class="pl-c"&gt;// If an ID was actually given, find the item and update each property&lt;/span&gt;
&lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;
    &lt;span class="pl-s"&gt;`https://latest.datasette.io/ephemeral/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;table&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;id&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/-/update`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
        &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
        &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;update&lt;/span&gt;: &lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;
  &lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt; &lt;span class="pl-k"&gt;else&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// Save it and store ID&lt;/span&gt;
  &lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;`https://latest.datasette.io/ephemeral/&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-s1"&gt;table&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;/-/insert`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;mode&lt;/span&gt;: &lt;span class="pl-s"&gt;"cors"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;headers&lt;/span&gt;: &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;Authorization&lt;/span&gt;: &lt;span class="pl-s"&gt;`Bearer &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;${&lt;/span&gt;&lt;span class="pl-c1"&gt;TOKEN&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;`&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
      &lt;span class="pl-s"&gt;"Content-Type"&lt;/span&gt;: &lt;span class="pl-s"&gt;"application/json"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-c1"&gt;row&lt;/span&gt;: &lt;span class="pl-s1"&gt;updateData&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-s1"&gt;r&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;json&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;
    &lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;then&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-c1"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
      &lt;span class="pl-k"&gt;let&lt;/span&gt; &lt;span class="pl-s1"&gt;row&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;data&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;rows&lt;/span&gt;&lt;span class="pl-kos"&gt;[&lt;/span&gt;&lt;span class="pl-c1"&gt;0&lt;/span&gt;&lt;span class="pl-kos"&gt;]&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
      &lt;span class="pl-s1"&gt;callback&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;call&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;self&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-s1"&gt;row&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
    &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;TodoMVC passes an &lt;code&gt;id&lt;/code&gt; if a record is being updated - which this code uses as a sign that the &lt;code&gt;...table/row-id/-/update&lt;/code&gt; API should be called (see &lt;a href="https://docs.datasette.io/en/latest/json_api.html#updating-a-row"&gt;update API documentation&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;If the row doesn't have an ID it is inserted using &lt;code&gt;table/-/insert&lt;/code&gt;, this time using the &lt;code&gt;"row":&lt;/code&gt; key because we are only inserting a single row.&lt;/p&gt;
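&lt;p&gt;To make the difference between the two request shapes easier to see, here is a minimal sketch that builds the URL and JSON body for each case without sending anything. The &lt;code&gt;buildSaveRequest()&lt;/code&gt; helper and the &lt;code&gt;todos&lt;/code&gt; table name are hypothetical and not part of the TodoMVC code.&lt;/p&gt;

```javascript
// Hypothetical helper: describes the request for a save without sending it.
function buildSaveRequest(baseUrl, table, updateData, id) {
  if (id) {
    // Existing row: .../table/row-id/-/update with an "update" key
    return {
      url: `${baseUrl}/${table}/${id}/-/update`,
      body: JSON.stringify({ update: updateData }),
    };
  }
  // New row: .../table/-/insert with a single "row" key
  return {
    url: `${baseUrl}/${table}/-/insert`,
    body: JSON.stringify({ row: updateData }),
  };
}

// "todos" is an assumed table name, used for illustration only.
const base = "https://latest.datasette.io/ephemeral";
console.log(buildSaveRequest(base, "todos", { title: "Buy milk", completed: 0 }));
console.log(buildSaveRequest(base, "todos", { completed: 1 }, 3));
```

&lt;p&gt;Like the code above, this treats any falsy &lt;code&gt;id&lt;/code&gt; as a sign that the row is new.&lt;/p&gt;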
&lt;p&gt;The hardest part of getting this to work was ensuring Datasette's &lt;a href="https://docs.datasette.io/en/latest/json_api.html#json-api"&gt;CORS mode&lt;/a&gt; worked correctly for writes. I had to add a new &lt;code&gt;Access-Control-Allow-Methods&lt;/code&gt; header, which I shipped in &lt;a href="https://docs.datasette.io/en/latest/changelog.html#a1-2022-12-01"&gt;Datasette 1.0a1&lt;/a&gt; (see &lt;a href="https://github.com/simonw/datasette/issues/1922"&gt;issue #1922&lt;/a&gt;).&lt;/p&gt;
&lt;h4&gt;Try the ephemeral hosted API&lt;/h4&gt;
&lt;p&gt;I built the &lt;a href="https://datasette.io/plugins/datasette-ephemeral-tables"&gt;datasette-ephemeral-tables&lt;/a&gt; plugin because I wanted to provide a demo instance of the write API that anyone could try out without needing to install Datasette themselves - but that wouldn't leave me responsible for taking care of their data or cleaning up any of their mess.&lt;/p&gt;
&lt;p&gt;You're welcome to experiment with the API using the &lt;a href="https://latest.datasette.io/"&gt;https://latest.datasette.io/&lt;/a&gt; demo instance.&lt;/p&gt;
&lt;p&gt;First, you'll need to sign in as a root user. You can do that (no password required) using the button &lt;a href="https://latest.datasette.io/login-as-root"&gt;on this page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once signed in you can view the ephemeral database (which isn't visible to anonymous users) here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/ephemeral"&gt;https://latest.datasette.io/ephemeral&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You can use the API explorer to try out the different write APIs against it here:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/-/api"&gt;https://latest.datasette.io/-/api&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And you can create your own signed token for accessing the API on this page:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://latest.datasette.io/-/create-token"&gt;https://latest.datasette.io/-/create-token&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/create-token.jpg" alt="The Create an API token page lets you create a token that expires after a set number of hours - you can then copy that token to your clipboard" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The TodoMVC application described above also uses the &lt;code&gt;ephemeral&lt;/code&gt; database, so you may see a &lt;code&gt;todo_todos-vanillajs&lt;/code&gt; table appear there if anyone is playing with that demo.&lt;/p&gt;
&lt;h4 id="your-machine"&gt;Or run this on your own machine&lt;/h4&gt;
&lt;p&gt;You can install the latest Datasette alpha like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;pip install datasette==1.0a1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then create a database and sign in as the &lt;code&gt;root&lt;/code&gt; user in order to gain access to the API:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;datasette demo.db --create --root
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Click on the link it outputs to sign in as the root user, then visit the API explorer to start trying out the API:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://127.0.0.1:8001/-/api"&gt;http://127.0.0.1:8001/-/api&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2022/api-explorer.jpg" alt="The API explorer interface has tools for sending GET and POST requests, plus a list of API endpoints" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The API explorer works without a token at all, using your existing browser cookies.&lt;/p&gt;
&lt;p&gt;If you want to try the API using &lt;code&gt;curl&lt;/code&gt; or similar you can use this page to create a new signed API token for the &lt;code&gt;root&lt;/code&gt; user:&lt;/p&gt;
&lt;p&gt;&lt;a href="http://127.0.0.1:8001/-/create-token"&gt;http://127.0.0.1:8001/-/create-token&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This token will become invalid if you restart the server, unless you set the &lt;code&gt;DATASETTE_SECRET&lt;/code&gt; environment variable to a stable string before you start the server:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;export DATASETTE_SECRET=$(
  python3 -c 'print(__import__("secrets").token_hex(16))'
)
&lt;/code&gt;&lt;/pre&gt;
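&lt;p&gt;If you would rather call the API from JavaScript than from &lt;code&gt;curl&lt;/code&gt;, the same token can be sent as a Bearer header. This is a rough sketch, assuming the database is called &lt;code&gt;demo&lt;/code&gt; (from &lt;code&gt;demo.db&lt;/code&gt;), that a &lt;code&gt;todos&lt;/code&gt; table already exists in it, and that &lt;code&gt;YOUR_TOKEN&lt;/code&gt; is replaced with a token from the create-token page. &lt;code&gt;insertRow()&lt;/code&gt; is a hypothetical helper; the fetch implementation is passed in so it can be swapped out.&lt;/p&gt;

```javascript
// Sketch: insert one row using a signed API token.
// The database name, table name and token value are all assumptions.
async function insertRow(fetchImpl, baseUrl, token, table, row) {
  const response = await fetchImpl(`${baseUrl}/${table}/-/insert`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ row: row }),
  });
  // Return the parsed JSON response body
  return response.json();
}

// Usage against a running local server (a browser or Node 18+ provides fetch):
// insertRow(fetch, "http://127.0.0.1:8001/demo", "YOUR_TOKEN", "todos", { title: "Try the API" })
//   .then((data) => console.log(data));
```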
&lt;p&gt;Check the &lt;a href="https://docs.datasette.io/en/latest/json_api.html#the-json-write-api"&gt;Write API documentation&lt;/a&gt; for more details.&lt;/p&gt;
&lt;h4&gt;What's next?&lt;/h4&gt;
&lt;p&gt;If you have feedback on these APIs, &lt;em&gt;now is the time&lt;/em&gt; to share it! I'm hoping to ship Datasette 1.0 at the start of 2023, after which these APIs will be considered stable for hopefully a long time to come.&lt;/p&gt;
&lt;p&gt;If you have thoughts or feedback (or questions) join us on the &lt;a href="https://datasette.io/discord"&gt;Datasette Discord&lt;/a&gt;. You can also file issue comments against &lt;a href="https://github.com/simonw/datasette/issues"&gt;Datasette&lt;/a&gt; itself.&lt;/p&gt;
&lt;p&gt;My priority for the next 1.0 alpha is to bake in a small number of backwards incompatible changes to other aspects of Datasette's JSON API that I've been hoping to include in 1.0 for a while.&lt;/p&gt;
&lt;p&gt;I'm also going to be rolling out API support to my &lt;a href="https://datasette.cloud/"&gt;Datasette Cloud&lt;/a&gt; preview users. If you're interested in trying that out you can &lt;a href="https://www.datasette.cloud/preview/"&gt;request access here&lt;/a&gt;.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apis"/><category term="json"/><category term="projects"/><category term="datasette"/></entry><entry><title>Building a BFT JSON CRDT</title><link href="https://simonwillison.net/2022/Nov/21/building-a-bft-json-crdt/#atom-tag" rel="alternate"/><published>2022-11-21T19:56:50+00:00</published><updated>2022-11-21T19:56:50+00:00</updated><id>https://simonwillison.net/2022/Nov/21/building-a-bft-json-crdt/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jzhao.xyz/posts/bft-json-crdt/"&gt;Building a BFT JSON CRDT&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Jacky Zhao describes their project to build a CRDT library for JSON data in Rust, and includes a thorough explanation of what CRDTs are and how they work. “I write this blog post mostly as a note to my past self, distilling a lot of what I’ve learned since into a blog post I wish I had read before going in”—the best kind of blog post!

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=33694568"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/json"&gt;json&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/crdt"&gt;crdt&lt;/a&gt;&lt;/p&gt;



</summary><category term="json"/><category term="rust"/><category term="crdt"/></entry></feed>