<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: claude-code</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/claude-code.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-06-11T23:35:17+00:00</updated><author><name>Simon Willison</name></author><entry><title>Claude Fable is relentlessly proactive</title><link href="https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-tag" rel="alternate"/><published>2026-06-11T23:35:17+00:00</published><updated>2026-06-11T23:35:17+00:00</updated><id>https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/#atom-tag</id><summary type="html">
    &lt;p&gt;After two days of experience with &lt;a href="https://simonwillison.net/2026/Jun/9/claude-fable-5/"&gt;Claude Fable 5&lt;/a&gt; I think the best way to describe it is &lt;strong&gt;relentlessly proactive&lt;/strong&gt;. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal.&lt;/p&gt;
&lt;p&gt;I'll illustrate this with an example. I was hacking on &lt;a href="https://agent.datasette.io/"&gt;Datasette Agent&lt;/a&gt; today when I noticed a glitch: a horizontal scrollbar that shouldn't be there in the jump menu chat prompt. I snapped this screenshot:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/jump-to-bug.jpg" alt="Screenshot of a modal dialog demonstrating a scrollbar bug. At the top is a focused search input with blue outline and placeholder &amp;quot;Jump to...&amp;quot;, with an X close button to its right. Below, a heading reads &amp;quot;Start a new agent chat&amp;quot; above a textarea with the placeholder &amp;quot;Ask a question about your data...&amp;quot; — the bug: a thick gray horizontal scrollbar is incorrectly displayed along the bottom edge of the empty textarea, spanning nearly its full width, next to the resize handle. Below the textarea: &amp;quot;Press Enter to start. Shift+Enter adds a new line.&amp;quot; followed by a blue &amp;quot;Start chat&amp;quot; button." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Then I started a fresh &lt;code&gt;claude&lt;/code&gt; session in my &lt;code&gt;datasette-agent&lt;/code&gt; checkout, dragged in the screenshot and told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Look at dependencies to help figure out why there is a horizontal scrollbar here&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I had a hunch the cause was in a dependency of Datasette Agent (likely Datasette itself) and I knew Fable was good at digging into dependency code, either by inspecting installed files in its own virtual environment &lt;code&gt;site-packages&lt;/code&gt; or by referencing a local checkout on disk. Telling it to start with dependencies felt like a good bet.&lt;/p&gt;
&lt;p&gt;I got distracted by a domestic task and wandered away from my computer.&lt;/p&gt;
&lt;p&gt;When I came back a few minutes later I saw my machine &lt;em&gt;open a browser window&lt;/em&gt; in my regular Firefox and then &lt;em&gt;navigate to the dialog in question&lt;/em&gt;. I had not told Claude Code to use any browser automation, and I was pretty sure it wasn't possible for it to trigger mouse movements or keyboard shortcuts within a window, so how was it doing that?&lt;/p&gt;
&lt;p&gt;I watched in fascination as it continued with its explorations, then saw it open a Safari window instead of Firefox. I also grabbed this snapshot from the Claude terminal:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/fable-bash-pyobjc.jpg" alt="Screenshot of two Bash tool calls in a dark terminal interface. First: Bash(open -a Safari /tmp/textarea-scrollbar-test.html &amp;amp;&amp;amp; sleep 4 &amp;amp;&amp;amp; uv run --with pyobjc-framework-Quartz python - &amp;lt;&amp;lt;'EOF' import Quartz wins = Quartz.CGWindowListCopyWindowInfo(Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID) for w in wins: if (w.get('kCGWindowOwnerName') or '') == 'Safari' and 'textarea' in (w.get('kCGWindowName') or '').lower(): print(w.get('kCGWindowNumber')) EOF) with output 153551. Second: Bash(screencapture -x -o -l 153551 /tmp/safari-cases.png &amp;amp;&amp;amp; echo ok) with output ok." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;What was it doing there with &lt;code&gt;uv run --with pyobjc-framework-Quartz&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;It turns out Fable had hacked up its own pattern for taking screenshots of browser windows. It was using Python to iterate through all available windows on my machine, then filtering for Safari windows with expected strings such as &lt;code&gt;"textarea"&lt;/code&gt; in the window name. It used that to find their window number - an integer like 153551 - which it could then use with the &lt;code&gt;screencapture&lt;/code&gt; CLI tool to grab a PNG.&lt;/p&gt;
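&lt;p&gt;That pattern can be sketched roughly as follows. This is a reconstruction rather than the exact code Fable ran: the helper names &lt;code&gt;find_window_numbers&lt;/code&gt;, &lt;code&gt;screencapture_command&lt;/code&gt; and &lt;code&gt;capture_matching_windows&lt;/code&gt; are made up for illustration, and the Quartz import is deferred so the filtering logic can be exercised without macOS or &lt;code&gt;pyobjc-framework-Quartz&lt;/code&gt; installed.&lt;/p&gt;

```python
import subprocess


def find_window_numbers(windows, owner, title_fragment):
    """Return window numbers owned by `owner` whose title contains `title_fragment`."""
    matches = []
    for w in windows:
        owner_name = w.get("kCGWindowOwnerName") or ""
        title = (w.get("kCGWindowName") or "").lower()
        if owner_name == owner and title_fragment.lower() in title:
            matches.append(w.get("kCGWindowNumber"))
    return matches


def screencapture_command(window_number, out_path):
    # -x: no sound, -o: no window shadow, -l: capture the window with this ID
    return ["screencapture", "-x", "-o", "-l", str(window_number), out_path]


def capture_matching_windows(owner, title_fragment, out_path):
    # macOS only: needs pyobjc-framework-Quartz, imported lazily on purpose
    import Quartz

    windows = Quartz.CGWindowListCopyWindowInfo(
        Quartz.kCGWindowListOptionOnScreenOnly, Quartz.kCGNullWindowID
    )
    for number in find_window_numbers(windows, owner, title_fragment):
        subprocess.run(screencapture_command(number, out_path), check=True)
```

&lt;p&gt;On a Mac, calling &lt;code&gt;capture_matching_windows("Safari", "textarea", "/tmp/safari-cases.png")&lt;/code&gt; under &lt;code&gt;uv run --with pyobjc-framework-Quartz python&lt;/code&gt; should behave much like the two Bash calls in the screenshot.&lt;/p&gt;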
&lt;p&gt;OK fine, that's a neat way of taking screenshots. But what was it taking screenshots of?&lt;/p&gt;
&lt;p&gt;Turns out it had been writing its own scratch HTML pages to try and recreate the bug, then opening Safari and grabbing screenshots.&lt;/p&gt;
&lt;p&gt;Here's that &lt;a href="https://static.simonwillison.net/static/2026/textarea-scrollbar-test.html"&gt;/tmp/textarea-scrollbar-test.html&lt;/a&gt; page it created, and the screenshot it took with &lt;code&gt;screencapture -x -o -l 153551 /tmp/safari-cases.png&lt;/code&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/safari-cases.jpg" alt="Screenshot of a Safari browser window showing a textarea scrollbar test page at file:///private/tmp/textarea-scrollbar-test.html. Page text reads: scrollbar thickness: 17px | UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/26.4 Safari/605.1.15 | devicePixelRatio: 2. Four numbered test cases follow, each with a textarea containing the placeholder &amp;quot;Ask a question about your data...&amp;quot;: 1. Exact plugin CSS (resize: vertical, default overflow), 2. Plugin CSS + overflow-x: hidden, 3. Plugin CSS + resize: none, and 4. Bare default textarea, which is a much smaller box with the placeholder wrapping onto two lines." style="max-width: 100%;" /&gt;
(I have way too many open tabs!)&lt;/p&gt;
&lt;p&gt;OK, so I can see how it's opening test pages and taking screenshots, but how on earth was it triggering the modal dialog that was meant to be under test? That's only available via a click or a keyboard shortcut, and I couldn't see a mechanism for it to run those in Safari.&lt;/p&gt;
&lt;p&gt;I eventually figured out what it had done.&lt;/p&gt;
&lt;p&gt;Claude was running in a folder that contained the source code for the application. It knows enough about &lt;a href="https://datasette.io/"&gt;Datasette&lt;/a&gt; to be able to run a local development server. It turns out it was editing Datasette's own templates to add JavaScript that would trigger the correct keyboard shortcut as soon as the window opened, adding code like this:&lt;/p&gt;
&lt;div class="highlight highlight-text-html-basic"&gt;&lt;pre&gt;&lt;span class="pl-kos"&gt;&amp;lt;&lt;/span&gt;&lt;span class="pl-ent"&gt;script&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="pl-smi"&gt;window&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;addEventListener&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"load"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-en"&gt;setTimeout&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;dispatchEvent&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;new&lt;/span&gt; &lt;span class="pl-v"&gt;KeyboardEvent&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"keydown"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-c1"&gt;key&lt;/span&gt;: &lt;span class="pl-s"&gt;"/"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;bubbles&lt;/span&gt;: &lt;span class="pl-c1"&gt;true&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;1200&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="pl-ent"&gt;script&lt;/span&gt;&lt;span class="pl-kos"&gt;&amp;gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;1.2 seconds after the page's &lt;code&gt;load&lt;/code&gt; event fires, this code dispatches a simulated &lt;code&gt;/&lt;/code&gt; keypress, which is the keyboard shortcut for opening the modal dialog.&lt;/p&gt;
&lt;p&gt;There was one challenge left. In order to understand what was going on, Claude needed to run JavaScript on the page to take measurements for itself.&lt;/p&gt;
&lt;p&gt;It wrote its own custom web application to capture information via CORS, then ran that as a local server and opened a page with JavaScript that would POST directly to it!&lt;/p&gt;
&lt;p&gt;Here's the Python web app it wrote, using the standard library &lt;a href="https://docs.python.org/3/library/http.server.html"&gt;http.server&lt;/a&gt; package:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;http&lt;/span&gt;.&lt;span class="pl-s1"&gt;server&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;HTTPServer&lt;/span&gt;, &lt;span class="pl-v"&gt;BaseHTTPRequestHandler&lt;/span&gt;

&lt;span class="pl-k"&gt;class&lt;/span&gt; &lt;span class="pl-c1"&gt;H&lt;/span&gt;(&lt;span class="pl-v"&gt;BaseHTTPRequestHandler&lt;/span&gt;):
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;do_POST&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;):
        &lt;span class="pl-s1"&gt;n&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;int&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;headers&lt;/span&gt;.&lt;span class="pl-c1"&gt;get&lt;/span&gt;(&lt;span class="pl-s"&gt;"Content-Length"&lt;/span&gt;, &lt;span class="pl-c1"&gt;0&lt;/span&gt;))
        &lt;span class="pl-en"&gt;open&lt;/span&gt;(&lt;span class="pl-s"&gt;"/tmp/diag.json"&lt;/span&gt;, &lt;span class="pl-s"&gt;"w"&lt;/span&gt;).&lt;span class="pl-c1"&gt;write&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;rfile&lt;/span&gt;.&lt;span class="pl-c1"&gt;read&lt;/span&gt;(&lt;span class="pl-s1"&gt;n&lt;/span&gt;).&lt;span class="pl-c1"&gt;decode&lt;/span&gt;())
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;send_response&lt;/span&gt;(&lt;span class="pl-c1"&gt;200&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;send_header&lt;/span&gt;(&lt;span class="pl-s"&gt;"Access-Control-Allow-Origin"&lt;/span&gt;, &lt;span class="pl-s"&gt;"*"&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;end_headers&lt;/span&gt;()
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;do_OPTIONS&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;):
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;send_response&lt;/span&gt;(&lt;span class="pl-c1"&gt;200&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;send_header&lt;/span&gt;(&lt;span class="pl-s"&gt;"Access-Control-Allow-Origin"&lt;/span&gt;, &lt;span class="pl-s"&gt;"*"&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;send_header&lt;/span&gt;(&lt;span class="pl-s"&gt;"Access-Control-Allow-Headers"&lt;/span&gt;, &lt;span class="pl-s"&gt;"*"&lt;/span&gt;)
        &lt;span class="pl-s1"&gt;self&lt;/span&gt;.&lt;span class="pl-c1"&gt;end_headers&lt;/span&gt;()
    &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;log_message&lt;/span&gt;(&lt;span class="pl-s1"&gt;self&lt;/span&gt;, &lt;span class="pl-c1"&gt;*&lt;/span&gt;&lt;span class="pl-s1"&gt;a&lt;/span&gt;):  &lt;span class="pl-c"&gt;# quiet&lt;/span&gt;
        &lt;span class="pl-k"&gt;pass&lt;/span&gt;

&lt;span class="pl-en"&gt;HTTPServer&lt;/span&gt;((&lt;span class="pl-s"&gt;"127.0.0.1"&lt;/span&gt;, &lt;span class="pl-c1"&gt;9999&lt;/span&gt;), &lt;span class="pl-c1"&gt;H&lt;/span&gt;).&lt;span class="pl-c1"&gt;serve_forever&lt;/span&gt;()&lt;/pre&gt;
&lt;p&gt;All this does is accept a POST request full of JSON and write that to the &lt;code&gt;/tmp/diag.json&lt;/code&gt; file. It sends an &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; header (including in responses to &lt;code&gt;OPTIONS&lt;/code&gt; preflight requests) so that code running on a different origin can still communicate back to it.&lt;/p&gt;
&lt;p&gt;Then Claude injected this code into the template that it was loading in a browser:&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;host&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-smi"&gt;document&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;querySelector&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"navigation-search"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;ta&lt;/span&gt;   &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;host&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;shadowRoot&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;querySelector&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"textarea"&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-s1"&gt;cs&lt;/span&gt;   &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;getComputedStyle&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-en"&gt;fetch&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-s"&gt;"http://127.0.0.1:9999/diag"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c1"&gt;method&lt;/span&gt;: &lt;span class="pl-s"&gt;"POST"&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-c1"&gt;body&lt;/span&gt;: &lt;span class="pl-c1"&gt;JSON&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-en"&gt;stringify&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;
    &lt;span class="pl-c1"&gt;dpr&lt;/span&gt;: &lt;span class="pl-smi"&gt;window&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;devicePixelRatio&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;scrollWidth&lt;/span&gt;: &lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;scrollWidth&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;clientWidth&lt;/span&gt;: &lt;span class="pl-s1"&gt;ta&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;clientWidth&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
    &lt;span class="pl-c1"&gt;whiteSpace&lt;/span&gt;: &lt;span class="pl-s1"&gt;cs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;whiteSpace&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt; &lt;span class="pl-c1"&gt;width&lt;/span&gt;: &lt;span class="pl-s1"&gt;cs&lt;/span&gt;&lt;span class="pl-kos"&gt;.&lt;/span&gt;&lt;span class="pl-c1"&gt;width&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
  &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;,&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This took measurements of the &lt;code&gt;&amp;lt;textarea&amp;gt;&lt;/code&gt; inside the &lt;code&gt;&amp;lt;navigation-search&amp;gt;&lt;/code&gt; Web Component and sent them to the server, which wrote them to a file on disk, which Claude could then read.&lt;/p&gt;
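&lt;p&gt;Reading those measurements back is the easy part. Here's a minimal sketch of how the captured JSON could be interpreted, assuming the field names from the snippet above. The &lt;code&gt;has_horizontal_overflow&lt;/code&gt; and &lt;code&gt;summarize&lt;/code&gt; helpers are hypothetical, and comparing &lt;code&gt;scrollWidth&lt;/code&gt; to &lt;code&gt;clientWidth&lt;/code&gt; is a common heuristic for spotting horizontal overflow, not necessarily the check Claude used.&lt;/p&gt;

```python
import json


def has_horizontal_overflow(diag):
    """True when the element's content is wider than its visible box."""
    return diag["scrollWidth"] > diag["clientWidth"]


def summarize(diag_text):
    # diag_text is the raw JSON string, e.g. the contents of /tmp/diag.json
    diag = json.loads(diag_text)
    state = "overflows" if has_horizontal_overflow(diag) else "fits"
    return "textarea {} horizontally (scrollWidth={}, clientWidth={}, white-space={})".format(
        state, diag["scrollWidth"], diag["clientWidth"], diag["whiteSpace"]
    )
```

&lt;p&gt;Something like &lt;code&gt;summarize(open("/tmp/diag.json").read())&lt;/code&gt; would then turn the file into a one-line diagnosis.&lt;/p&gt;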
&lt;p&gt;Having figured out all of these tricks Fable... hit some invisible guardrail and downgraded itself to Opus. Thankfully Opus had access to the full transcript and could continue using the tricks pioneered by Fable, and shortly afterwards found, tested and verified &lt;a href="https://github.com/datasette/datasette-agent/commit/a75a8b727b42c30ced1fc41dc8add7eb9f04fefe"&gt;the fix&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I prompted Opus to:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Write a report in /tmp/automation-report.md where you note down all of the tricks you have used in this session to test against real browsers on my computer, include runnable code examples&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which produced &lt;a href="https://gist.github.com/simonw/aef7f7db9ac992643110a74e43d6d42f"&gt;this report&lt;/a&gt;, which was invaluable for piecing together the details of what had happened for this post.&lt;/p&gt;
&lt;p&gt;I've shared &lt;a href="https://gisthost.github.io/?cc14774f6d37eb67bf089f3ac3925f8f"&gt;the full terminal transcript&lt;/a&gt; of the Claude Code session as well.&lt;/p&gt;
&lt;h4 id="a-review-of-everything-it-did"&gt;A review of everything it did&lt;/h4&gt;
&lt;p&gt;Based on a screenshot and a one-line prompt, Claude Fable 5 + Claude Code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Figured out the recipe to run the local development server (with fake environment variables needed to get it running)&lt;/li&gt;
&lt;li&gt;Fired up a Playwright Chrome session&lt;/li&gt;
&lt;li&gt;Turned on the visible scrollbars setting for Chrome with &lt;code&gt;defaults write com.google.chrome.for.testing AppleShowScrollBars Always&lt;/code&gt; (it turned that off again later)&lt;/li&gt;
&lt;li&gt;Cycled through Firefox and WebKit in Playwright too, failing to recreate the bug&lt;/li&gt;
&lt;li&gt;Worked out my default browser was Safari&lt;/li&gt;
&lt;li&gt;Built a &lt;code&gt;textarea-scrollbar-test.html&lt;/code&gt; HTML document&lt;/li&gt;
&lt;li&gt;Opened that in real (not Playwright) Firefox&lt;/li&gt;
&lt;li&gt;Found that &lt;code&gt;osascript -e 'tell application "System Events" to tell process "firefox" to id of window 1'&lt;/code&gt; was blocked because "osascript is not allowed assistive access"&lt;/li&gt;
&lt;li&gt;Figured out the &lt;code&gt;uv run --with pyobjc-framework-Quartz python&lt;/code&gt; workaround described above&lt;/li&gt;
&lt;li&gt;Added JavaScript to the site templates in order to trigger the &lt;code&gt;/&lt;/code&gt; key&lt;/li&gt;
&lt;li&gt;Built its own little Python CORS web server to capture JSON data&lt;/li&gt;
&lt;li&gt;Rewrote the template to capture that data and send it to the server&lt;/li&gt;
&lt;li&gt;Scripted its way through the Web Component shadow DOM to the information it needed&lt;/li&gt;
&lt;li&gt;Opened Safari to confirm the source of the bug&lt;/li&gt;
&lt;li&gt;Modified its custom template to hack in a potential fix&lt;/li&gt;
&lt;li&gt;Confirmed the hacked fix worked&lt;/li&gt;
&lt;li&gt;Reported back on how to fix the problem&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Like I said, relentlessly proactive!&lt;/p&gt;
&lt;h4 id="an-estimate-of-the-cost"&gt;An estimate of the cost&lt;/h4&gt;
&lt;p&gt;I'm currently on the $100/month Claude Max plan, which includes a generous allowance for Fable up until June 22nd, after which Anthropic say they'll start charging full API prices for it.&lt;/p&gt;
&lt;p&gt;I'm using &lt;a href="https://www.agentsview.io"&gt;AgentsView&lt;/a&gt; to track my spending (see &lt;a href="https://til.simonwillison.net/llms/agentsview-custom-model-price"&gt;this TIL&lt;/a&gt;). Here's what AgentsView says this session would have cost me if I was paying full price for it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;~ % uvx agentsview session usage be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Session:       be8850a7-6119-46a0-b5d6-79c7fff5ae2b
Agent:         claude
Output:        68606
Peak ctx:      113178
Cost:          ~$12.11 (claude-fable-5, claude-opus-4-8)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you don't keep a close eye on it, Fable will quite happily burn $12 in tokens inventing new ways to debug your CSS.&lt;/p&gt;
&lt;h4 id="i-really-need-to-lock-this-thing-down"&gt;I really need to lock this thing down&lt;/h4&gt;
&lt;p&gt;On the one hand, watching Fable go to extreme lengths to get the information that it needed to debug what was, in the end, a two-line CSS fix, was &lt;em&gt;fascinating&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;But on the other hand... this is a robust reminder that coding agents can do anything &lt;em&gt;you&lt;/em&gt; can do by typing commands into a terminal - and frontier models know every trick in the book, and evidently a few that nobody has ever written down before.&lt;/p&gt;
&lt;p&gt;If Fable had been acting on malicious instructions - a prompt injection attack hidden in code or an issue thread, or something I'd carelessly pasted into my terminal - it's alarming to think quite how far it could go to exfiltrate data or cause other forms of mischief.&lt;/p&gt;
&lt;p&gt;Running coding agents outside of a sandbox has always been a bad idea - it's my top contender for &lt;a href="https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#1-year-a-challenger-disaster-for-coding-agent-security"&gt;a Challenger disaster&lt;/a&gt; incident, as described by Johann Rehberger in &lt;a href="https://embracethered.com/blog/posts/2025/the-normalization-of-deviance-in-ai/"&gt;The Normalization of Deviance in AI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Fable is arguably smarter and hence more suspicious of potentially malicious instructions. But that smartness is very much a two-edged sword: if it &lt;em&gt;does&lt;/em&gt; get subverted by instructions, the amount of damage it can do given its relentless proactivity is terrifying.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-mythos"&gt;claude-mythos&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="claude-code"/><category term="claude-mythos"/></entry><entry><title>How we contain Claude across products</title><link href="https://simonwillison.net/2026/May/30/how-we-contain-claude/#atom-tag" rel="alternate"/><published>2026-05-30T21:36:24+00:00</published><updated>2026-05-30T21:36:24+00:00</updated><id>https://simonwillison.net/2026/May/30/how-we-contain-claude/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/engineering/how-we-contain-claude"&gt;How we contain Claude across products&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A complaint I often have about sandboxing products is that they are rarely thoroughly &lt;em&gt;documented&lt;/em&gt;, and in the absence of detailed documentation it's hard to know how much I can trust them.&lt;/p&gt;
&lt;p&gt;Anthropic just published a fantastic overview of how their various sandbox techniques work across &lt;a href="https://claude.ai/"&gt;Claude.ai&lt;/a&gt;, Claude Code, and Cowork.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We constrain where and how an agent can act with process sandboxes, VMs, filesystem boundaries, and egress controls. The goal is to set a hard boundary on what an agent can reach. For example, if credentials never enter the sandbox, they can't be exfiltrated, regardless of whether the cause is a user, a model finding a “creative” path, or an attacker.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Claude.ai uses gVisor. Claude Code, run locally, uses Seatbelt on macOS and Bubblewrap on Linux. Claude Cowork runs a full VM (Apple's Virtualization framework on macOS, HCS on Windows).&lt;/p&gt;
&lt;p&gt;There's a lot in here, including some interesting stories of risks they missed such as the &lt;code&gt;api.anthropic.com/v1/files&lt;/code&gt; exfiltration vector &lt;a href="https://simonwillison.net/2026/Jan/14/claude-cowork-exfiltrates-files/"&gt;covered here previously&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This reminded me it's time I took another look at Anthropic's open source &lt;a href="https://github.com/anthropic-experimental/sandbox-runtime"&gt;srt (Anthropic Sandbox Runtime)&lt;/a&gt; tool - it's mature enough now that I'm ready to give it a proper go.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="sandboxing"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="claude-code"/></entry><entry><title>Running Python ASGI apps in the browser via Pyodide + a service worker</title><link href="https://simonwillison.net/2026/May/30/pyodide-asgi-browser/#atom-tag" rel="alternate"/><published>2026-05-30T15:34:00+00:00</published><updated>2026-05-30T15:34:00+00:00</updated><id>https://simonwillison.net/2026/May/30/pyodide-asgi-browser/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Research:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research/tree/main/pyodide-asgi-browser#readme"&gt;Running Python ASGI apps in the browser via Pyodide + a service worker&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;&lt;a href="https://lite.datasette.io/"&gt;Datasette Lite&lt;/a&gt; is my version of Datasette that runs entirely in the browser using Pyodide in WebAssembly.&lt;/p&gt;
&lt;p&gt;When I first built it &lt;a href="https://simonwillison.net/2022/May/4/datasette-lite/"&gt;four years ago&lt;/a&gt; I used Web Workers and code that intercepts navigation operations and fetches the generated HTML by running the Python app.&lt;/p&gt;
&lt;p&gt;This worked, but had the disadvantage that any JavaScript in &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags would not be executed - breaking some Datasette functionality and a whole lot of Datasette plugins.&lt;/p&gt;
&lt;p&gt;This morning I &lt;a href="https://github.com/simonw/research/pull/112"&gt;set Claude Opus 4.8 the task&lt;/a&gt; (in Claude Code for web) of figuring out how to run Python ASGI apps in Pyodide using Service Workers instead, and it seems to work! Here's a &lt;a href="https://simonw.github.io/research/pyodide-asgi-browser/"&gt;basic ASGI FastCGI demo&lt;/a&gt; and here's &lt;a href="https://simonw.github.io/research/pyodide-asgi-browser/datasette.html"&gt;a demo that runs Datasette 1.0a31&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm still getting my head around exactly how it works, but once I've done that I plan to upgrade Datasette Lite itself.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/asgi"&gt;asgi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/service-workers"&gt;service-workers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pyodide"&gt;pyodide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-lite"&gt;datasette-lite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="javascript"/><category term="python"/><category term="datasette"/><category term="asgi"/><category term="webassembly"/><category term="service-workers"/><category term="pyodide"/><category term="datasette-lite"/><category term="claude-code"/></entry><entry><title>I think Anthropic and OpenAI have found product-market fit</title><link href="https://simonwillison.net/2026/May/27/product-market-fit/#atom-tag" rel="alternate"/><published>2026-05-27T16:38:35+00:00</published><updated>2026-05-27T16:38:35+00:00</updated><id>https://simonwillison.net/2026/May/27/product-market-fit/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic are &lt;a href="https://techcrunch.com/2026/05/20/anthropic-says-its-about-to-have-its-first-profitable-quarter/"&gt;strongly rumored&lt;/a&gt; to be about to have their first profitable quarter. Stories &lt;a href="https://www.theinformation.com/newsletters/applied-ai/uber-cto-shows-claude-code-can-blow-ai-budgets"&gt;are circulating&lt;/a&gt; of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market fit.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#enterprise-customers-are-now-paying-api-prices"&gt;Enterprise customers are now paying API prices&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#i-think-they-ve-found-product-market-fit"&gt;I think they've found product-market fit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#and-they-re-ramping-up"&gt;And they're ramping up&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#the-ai-failure-stories-around-this-are-pretty-thin"&gt;The AI-failure stories around this are pretty thin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#we-also-know-the-labs-are-spending-a-lot"&gt;We also know the labs are spending a lot&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#api-revenue-is-becoming-less-important"&gt;API revenue is becoming less important&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/May/27/product-market-fit/#april-is-a-new-inflection-point"&gt;April is a new inflection point&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="enterprise-customers-are-now-paying-api-prices"&gt;Enterprise customers are now paying API prices&lt;/h4&gt;
&lt;p&gt;I currently subscribe to the $100/month Max plan from Anthropic and the $100/month Pro plan from OpenAI. If you are a heavy user of coding agents these plans are a fantastic deal. I just ran the &lt;a href="https://github.com/ryoppippi/ccusage"&gt;ccusage&lt;/a&gt; tool on my laptop to estimate how much I would have spent over the past 30 days if I had paid for API tokens instead, and got:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$1,199.79 for Anthropic Claude Code&lt;/li&gt;
&lt;li&gt;$980.37 for OpenAI Codex&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That's $2,180.16 worth of tokens for $200 - not bad at all! I'm a moderately heavy user of these tools, but I'm certainly not running agents every hour of the day and night.&lt;/p&gt;
&lt;p&gt;I had assumed that companies making extensive use of agents were getting similar discounts. It turns out I &lt;em&gt;could not have been more wrong&lt;/em&gt; about that.&lt;/p&gt;
&lt;p&gt;I haven't been able to track down the exact date, but at some point in the last six months Anthropic switched their Enterprise plan (originally &lt;a href="https://www.anthropic.com/news/claude-code-on-team-and-enterprise"&gt;"Claude seats include enough usage for a typical workday" back in August 2025&lt;/a&gt;) to $20/seat/month plus API pricing for usage. This story about the change &lt;a href="https://www.theinformation.com/articles/anthropic-changes-pricing-bill-firms-based-ai-use-amid-compute-crunch"&gt;from The Information&lt;/a&gt; is dated Apr 14, 2026, but cites an Anthropic spokesperson claiming that the pricing change occurred in November 2025. Existing customers are finding out about the change as they renew their contracts.&lt;/p&gt;
&lt;p&gt;OpenAI made a similar pricing change in April. The &lt;a href="https://help.openai.com/en/articles/20001106-codex-rate-card"&gt;Codex rate card&lt;/a&gt; (&lt;a href="https://web.archive.org/web/20260519062438/https://help.openai.com/en/articles/20001106-codex-rate-card"&gt;Internet Archive copy&lt;/a&gt;) currently says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: On April 2, 2026, we updated Codex pricing to align with API token usage, instead of per-message pricing. This change was applicable to new and existing Plus, Pro, ChatGPT Business and new ChatGPT Enterprise plans.&lt;/p&gt;
&lt;p&gt;On April 23, 2026, we made this update for all existing ChatGPT Enterprise plans as well, inclusive of Edu, Health, Gov, and ChatGPT for Teachers.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It's a little harder to decode as they quote prices in "credits", but as far as I can tell those credit costs are an exact match for the API token costs listed for those models.&lt;/p&gt;
&lt;p&gt;All of which is to say that as of April 2026 the "Enterprise" cost for both OpenAI Codex and Anthropic Claude Code/Cowork is the same as the listed API price.&lt;/p&gt;
&lt;p&gt;GPT-5.5 (released April 23rd) is 2x the API price of GPT-5.4. Opus 4.7 (April 16th) is &lt;a href="https://simonwillison.net/2026/Apr/20/claude-token-counts/"&gt;around 1.4x&lt;/a&gt; the price of Opus 4.6 when you take their new tokenizer into account.&lt;/p&gt;
&lt;p&gt;So April saw both leading model companies release new frontier models with a higher API price, &lt;em&gt;and&lt;/em&gt; both companies now have measures to lock their enterprise customers (who tend to sign year-long deals) at those API prices, not the previous extreme discounts.&lt;/p&gt;
&lt;h4 id="i-think-they-ve-found-product-market-fit"&gt;I think they've found product-market fit&lt;/h4&gt;
&lt;p&gt;Why these sudden aggressive moves on pricing? Both Anthropic and OpenAI are planning to IPO, but I suspect there's a more important factor here: I think they've finally found product-market fit, with the coding/general-purpose agent products embodied by Claude Code/Cowork and Codex.&lt;/p&gt;
&lt;p&gt;Tools like ChatGPT are wildly popular, but that wild popularity has been difficult to turn into revenue. In February &lt;a href="https://finance.yahoo.com/news/chatgpt-almost-1-billion-weekly-212157499.html"&gt;OpenAI boasted&lt;/a&gt; more than 900 million weekly active users for ChatGPT, but only 50 million - 5.6% of that - were paying consumer subscribers.&lt;/p&gt;
&lt;p&gt;Charging $10-$20/month per user is an OK business, but you'd need 1-2 billion subscribers sticking around for four years to cover &lt;a href="https://openai.com/global-affairs/seizing-the-ai-opportunity/"&gt;$1 trillion in infrastructure&lt;/a&gt;.&lt;/p&gt;
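&lt;p&gt;Here's a rough sketch of that back-of-the-envelope calculation in Python. It assumes every subscriber pays the full monthly price for the whole four years and ignores churn, discounts and operating costs, so treat it as an illustration of the arithmetic rather than a financial model. The function name is my own invention.&lt;/p&gt;

```python
# Back-of-the-envelope check: how many subscribers would it take to cover
# a fixed infrastructure bill? Inputs are the figures quoted in the text;
# the model itself (flat price, no churn, no other costs) is an assumption.
INFRASTRUCTURE_COST = 1_000_000_000_000  # $1 trillion
YEARS = 4


def subscribers_needed(monthly_price, years=YEARS, total=INFRASTRUCTURE_COST):
    """Subscribers required if each pays monthly_price for the whole period."""
    revenue_per_subscriber = monthly_price * 12 * years
    return total / revenue_per_subscriber


for price in (10, 20, 200):
    billions = subscribers_needed(price) / 1e9
    print(f"${price}/month: {billions:.2f} billion subscribers")
```

&lt;p&gt;At $10/month that works out to roughly 2 billion subscribers and at $20/month roughly 1 billion. At $200/month the same sum needs about 100 million.&lt;/p&gt;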
&lt;p&gt;Companies spending $200+/month/user will get you there a whole lot faster - and as noted above, as a power-user I'm at ~$1,000/month in API costs per vendor already.&lt;/p&gt;
&lt;p&gt;Coding agents really did change everything. These are tools which burn &lt;em&gt;vastly&lt;/em&gt; more tokens, but are also quickly becoming daily drivers for the work carried out by extremely well-compensated professionals. Right now that's still mostly software engineers, but a coding agent is a tool that can automate anything you can do by typing commands into a computer... so they are clearly applicable to a much wider set of skilled knowledge workers.&lt;/p&gt;
&lt;p&gt;As I've &lt;a href="https://simonwillison.net/tags/november-2025-inflection/"&gt;discussed on this site at length&lt;/a&gt;, the models released in November 2025 elevated agents to being genuinely useful. We've had six months to get used to that idea now - it's no wonder companies are beginning to spend real money on this technology.&lt;/p&gt;
&lt;p&gt;You could argue that ChatGPT achieved product-market fit when it became the &lt;a href="https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/"&gt;fastest-growing consumer app in history&lt;/a&gt; back in February 2023... but it certainly wasn't making any actual money back then. Coding agents plus enterprise pricing marks the point when these companies start making &lt;em&gt;very&lt;/em&gt; real revenue. Maybe even enough to start covering their costs!&lt;/p&gt;
&lt;h4 id="and-they-re-ramping-up"&gt;And they're ramping up&lt;/h4&gt;
&lt;p&gt;As further evidence that enterprise agents represent product-market fit for these companies, consider their open job listings.&lt;/p&gt;
&lt;p&gt;OpenAI have &lt;a href="https://openai.com/careers/search/"&gt;703 open jobs&lt;/a&gt; right now, of which I'd categorize 229 (32.6%) as relating to enterprise sales and support - account executives, "Go To Market", "Forward Deployed Engineers" and the like.&lt;/p&gt;
&lt;p&gt;Anthropic have &lt;a href="https://www.anthropic.com/careers/jobs"&gt;390 open jobs&lt;/a&gt;, 105 (26.9%) of which look enterprisey to me.&lt;/p&gt;
&lt;p&gt;It's pleasingly ironic that these AI labs have picked a business model with such a heavy demand on human labor - enterprise sales contracts don't close themselves without a whole lot of humans in the mix!&lt;/p&gt;
&lt;p&gt;&lt;small&gt;(I ran this analysis by scraping their job sites with Claude Code, then having it use Datasette's &lt;a href="https://docs.datasette.io/en/latest/json_api.html"&gt;JSON API&lt;/a&gt; to pipe that data into Datasette Cloud where I used &lt;a href="https://agent.datasette.io/"&gt;Datasette Agent&lt;/a&gt; for the analysis, &lt;a href="https://gist.github.com/simonw/5632d208d76b3c8b34f1fdbaf69eb1b8#agent-4"&gt;exported here&lt;/a&gt;. Dogfood!)&lt;/small&gt;&lt;/p&gt;
&lt;h4 id="the-ai-failure-stories-around-this-are-pretty-thin"&gt;The AI-failure stories around this are pretty thin&lt;/h4&gt;
&lt;p&gt;I started digging into this in response to &lt;a href="https://news.ycombinator.com/item?id=48287025#48287219"&gt;a growing volume&lt;/a&gt; of stories claiming that large companies were sounding the alarm because their AI usage costs had grown so large.&lt;/p&gt;
&lt;p&gt;The most widely cited of these stories appear quite overblown to me.&lt;/p&gt;
&lt;p&gt;The most discussed has been Uber, based on &lt;a href="https://www.theinformation.com/newsletters/applied-ai/uber-cto-shows-claude-code-can-blow-ai-budgets"&gt;this report&lt;/a&gt; where CTO Praveen Neppalli Naga indicated that Uber had "maxed out its full year AI budget just a few months into 2026", mostly thanks to Claude Code.&lt;/p&gt;
&lt;p&gt;Given that Claude Code only got &lt;em&gt;really&lt;/em&gt; good in November it's entirely unsurprising to me that a budget set in 2025 may have failed to predict demand for that tool in 2026!&lt;/p&gt;
&lt;p&gt;That Uber story was further fueled by comments made by Uber's COO, Andrew Macdonald, on the Rapid Response podcast. I tracked down &lt;a href="https://www.youtube.com/watch?v=y_mQ6xLcKyc&amp;amp;t=1616s"&gt;the segment&lt;/a&gt; and there really isn't much there. Here's what Andrew said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;But then you sometimes go and talk to your senior engineering leaders and you're saying, OK, how many projects that were on the cutting room floor got moved above the line because of the productivity gains because 25% of our code commits were via Claude Code last quarter?&lt;/p&gt;
&lt;p&gt;That link is not there yet, right? I think maybe implicitly there's more that is getting shipped. But it's very hard to draw a line between one of those stats and, OK, now we're actually producing like 25% more useful consumer features, right? And that line is hard to draw.&lt;/p&gt;
&lt;p&gt;[...] And so if you're not actually able to draw a direct line to how much useful features and functionality you're shipping to your users, that trade becomes harder to justify.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Somehow this fragment turned into headlines like &lt;a href="https://www.businessinsider.com/uber-coo-andrew-macdonald-ai-token-spending-harder-justify-2026-5"&gt;Uber's COO says it's getting harder to justify the money spent on AI tokenmaxxing&lt;/a&gt;, because the market for stories about AI failures remains enormous.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update 29th May 2026&lt;/strong&gt;: I edited the above quote to add that last paragraph ending in "becomes harder to justify" on &lt;a href="https://x.com/MadisonMills22/status/2060343512936186240"&gt;the suggestion of Madison Mills&lt;/a&gt; - previously my quoted section stopped at "hard to draw". Here's the &lt;a href="https://gist.github.com/simonw/59096a338c82f6f95e40e3d7c7b5bad9"&gt;full unedited transcript&lt;/a&gt; from MacWhisper.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The other popular story around this is &lt;a href="https://www.theverge.com/tech/930447/microsoft-claude-code-discontinued-notepad"&gt;Microsoft starts canceling Claude Code licenses&lt;/a&gt;, ostensibly to encourage their engineers to dogfood their own Copilot CLI agent instead - but The Verge reporter Tom Warren says "sources tell me the decision is also a financial one", triggered by the June 30th end of Microsoft's financial year.&lt;/p&gt;
&lt;p&gt;I think both of these stories support my "product-market fit" hypothesis. The best advice I ever heard on pricing a product was that your customer should &lt;em&gt;suck air through their teeth&lt;/em&gt; and then say yes. Uber's budget overrun and Microsoft's seat cancellations look like that effect playing out in practice.&lt;/p&gt;
&lt;h4 id="we-also-know-the-labs-are-spending-a-lot"&gt;We also know the labs are spending a lot&lt;/h4&gt;
&lt;p&gt;The big AI labs spend billions of dollars on both training and inference. Credible figures are hard to come by, but we did get one huge hint as to the figures involved from, oddly enough, the recent &lt;a href="https://www.sec.gov/Archives/edgar/data/1181412/000162828026036936/spaceexplorationtechnologi.htm"&gt;SpaceX S-1&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...] in May 2026, we entered into &lt;strong&gt;Cloud Services Agreements with Anthropic PBC&lt;/strong&gt; (“Anthropic”), an AI research and development public benefit corporation, with respect to access to &lt;strong&gt;compute capacity across COLOSSUS and COLOSSUS II&lt;/strong&gt;. Pursuant to these agreements, the customer &lt;strong&gt;has agreed to pay us $1.25 billion per month&lt;/strong&gt; through May 2029 [...]&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;a href="https://www.anthropic.com/news/higher-limits-spacex"&gt;Anthropic announcement&lt;/a&gt; said that this deal meant they could "increase our usage limits for Claude Code and the Claude API", heavily implying that Colossus is being used for inference, not model training.&lt;/p&gt;
&lt;p&gt;Anthropic already have vast amounts of compute from other providers. The fact that they're willing to spend $1.25 billion per month for extra capacity from just &lt;em&gt;one&lt;/em&gt; of their vendors hints at how big these inference budgets have become.&lt;/p&gt;
&lt;h4 id="api-revenue-is-becoming-less-important"&gt;API revenue is becoming less important&lt;/h4&gt;
&lt;p&gt;Over the past two years my impression has been that OpenAI made more of their income from subscription revenue while Anthropic made more from their API.&lt;/p&gt;
&lt;p&gt;Anthropic's API revenue was historically quite dependent on a small number of large API customers - &lt;a href="https://venturebeat.com/ai/anthropic-revenue-tied-to-two-customers-as-ai-pricing-war-threatens-margins"&gt;this VentureBeat story from August 2025&lt;/a&gt; quotes "sources familiar with the matter" suggesting that just Cursor and GitHub Copilot were responsible for $1.2 billion of the company's then-$4 billion revenue.&lt;/p&gt;
&lt;p&gt;Today Anthropic are rumored to hit &lt;a href="https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-propel-anthropic-into-its-first-profitable-quarter-7edbf2f4"&gt;$10.9 billion in the second quarter&lt;/a&gt;, potentially even operating at a profit for the first time.&lt;/p&gt;
&lt;p&gt;This pivot-to-Enterprise suggests that the labs have realized that the real money lies in cutting out the middlemen. Anthropic's Claude Code directly competes with Cursor and Copilot. No wonder Cursor are &lt;a href="https://cursor.com/blog/composer-2"&gt;investing in their own models&lt;/a&gt;!&lt;/p&gt;
&lt;h4 id="april-is-a-new-inflection-point"&gt;April is a new inflection point&lt;/h4&gt;
&lt;p&gt;I've called November 2025 the &lt;a href="https://simonwillison.net/tags/november-2025-inflection/"&gt;November inflection point&lt;/a&gt; because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got &lt;em&gt;good&lt;/em&gt; - good enough that we've spent the last six months adapting to agent systems that can reliably get useful work done.&lt;/p&gt;
&lt;p&gt;I think April 2026 is a new inflection point where the revenue implications of this have started to land, to the benefit of the frontier AI labs and with material impacts on the budgets of large companies.&lt;/p&gt;
&lt;p&gt;We'll know for sure how real this moment is when the S-1 documents for the upcoming Anthropic and OpenAI IPOs give us some real, audited numbers to get our teeth into.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex"&gt;codex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-cowork"&gt;claude-cowork&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/november-2025-inflection"&gt;november-2025-inflection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette-agent"&gt;datasette-agent&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uber"&gt;uber&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="datasette"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="llm-pricing"/><category term="coding-agents"/><category term="claude-code"/><category term="codex"/><category term="claude-cowork"/><category term="november-2025-inflection"/><category term="datasette-agent"/><category term="uber"/></entry><entry><title>Using Claude Code: The Unreasonable Effectiveness of HTML</title><link href="https://simonwillison.net/2026/May/8/unreasonable-effectiveness-of-html/#atom-tag" rel="alternate"/><published>2026-05-08T21:00:11+00:00</published><updated>2026-05-08T21:00:11+00:00</updated><id>https://simonwillison.net/2026/May/8/unreasonable-effectiveness-of-html/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://twitter.com/trq212/status/2052809885763747935"&gt;Using Claude Code: The Unreasonable Effectiveness of HTML&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Thought-provoking piece by Thariq Shihipar (on the Claude Code team at Anthropic) advocating for HTML over Markdown as an output format to request from Claude.&lt;/p&gt;
&lt;p&gt;The article is crammed with interesting examples (collected on &lt;a href="https://thariqs.github.io/html-effectiveness/"&gt;this site&lt;/a&gt;) and prompt suggestions like this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Help me review this PR by creating an HTML artifact that describes it. I'm not very familiar with the streaming/backpressure logic so focus on that. Render the actual diff with inline margin annotations, color-code findings by severity and whatever else might be needed to convey the concept well.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I've been defaulting to asking for most things in Markdown since the GPT-4 days, when the 8,192 token limit meant that Markdown's token-efficiency over HTML was extremely worthwhile.&lt;/p&gt;
&lt;p&gt;Thariq's piece here has caused me to reconsider that, especially for output. Asking Claude for an explanation in HTML means it can drop in SVG diagrams, interactive widgets, in-page navigation and all sorts of other neat ways of making the information more pleasant to navigate.&lt;/p&gt;
&lt;p&gt;I wrote about &lt;a href="https://simonwillison.net/2025/Dec/10/html-tools/"&gt;Useful patterns for building HTML tools&lt;/a&gt; last December, but that was focused very much on interactive utilities like the ones on my &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; site. I'm excited to start experimenting more with rich HTML explanations in response to ad-hoc prompts.&lt;/p&gt;
&lt;h4 id="trying-this-out"&gt;Trying this out on copy.fail&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://copy.fail/"&gt;copy.fail&lt;/a&gt; describes a recently discovered Linux security exploit, including a proof of concept distributed as obfuscated Python.&lt;/p&gt;
&lt;p&gt;I tried having GPT-5.5 create an HTML explanation of the exploit like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;curl https://copy.fail/exp | llm -m gpt-5.5 -s 'Explain this code in detail. Reformat it, expand out any confusing bits and go deep into what it does and how it works. Output HTML, neatly styled and using capabilities of HTML and CSS and JavaScript to make the explanation rich and interactive and as clear as possible'&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gisthost.github.io/?ae53e3461ffdbfd0826156aacf025c7e"&gt;the resulting HTML page&lt;/a&gt;. It's pretty good, though I should have emphasized explaining the exploit over the Python harness around it.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a dark-themed technical document titled &amp;quot;What this Python script does&amp;quot;. Body text: &amp;quot;This is a compact, deliberately obfuscated Linux-specific local privilege-escalation proof-of-concept. Its apparent goal is to tamper with the in-memory image/page cache of /usr/bin/su, then execute su to obtain elevated privileges.&amp;quot; A yellow-bordered callout reads: &amp;quot;Safety note: This explanation is for code understanding, reverse engineering, and defensive analysis. Do not run this on systems you do not own or administer. On a vulnerable kernel, code like this can alter the behavior of a privileged executable.&amp;quot; Left column heading &amp;quot;High-level summary&amp;quot;: &amp;quot;The script opens /usr/bin/su read-only, decompresses an embedded binary payload, and then processes that payload in 4-byte chunks. For each chunk, it performs a carefully arranged sequence involving Linux's kernel crypto socket interface, AF_ALG, pipes, and splice(). The important point is that this is not ordinary file writing. It never calls write() on /usr/bin/su. Instead, it appears to rely on a kernel bug/primitive involving spliced file pages and the crypto API to get controlled bytes placed into the page-cache representation of a privileged executable.&amp;quot; Numbered steps follow: &amp;quot;1. Open target executable — /usr/bin/su is opened read-only. 2. Decode hidden payload — A zlib-compressed hex blob is decompressed into bytes. 3. Patch in 4-byte chunks — The helper function is called repeatedly with offsets 0, 4, 8, ...&amp;quot;. Right column heading &amp;quot;Why it looks strange&amp;quot; contains a table with Pattern and Purpose columns: &amp;quot;import os as g — Short aliasing to make the script compact and harder to read. socket(38, 5, 0) — Uses raw numeric Linux constants instead of readable names. Compressed hex blob — Hides binary payload bytes and keeps the script small. 
splice() — Moves file-backed pages through pipes without normal user-space copying. try: recv(...) except: 0 — Triggers the kernel operation and ignores expected errors.&amp;quot;" src="https://static.simonwillison.net/static/2026/python-script-explainer.jpg" /&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/html"&gt;html&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="html"/><category term="security"/><category term="markdown"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="claude-code"/></entry><entry><title>Live blog: Code w/ Claude 2026</title><link href="https://simonwillison.net/2026/May/6/code-w-claude-2026/#atom-tag" rel="alternate"/><published>2026-05-06T15:58:27+00:00</published><updated>2026-05-06T15:58:27+00:00</updated><id>https://simonwillison.net/2026/May/6/code-w-claude-2026/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm at Anthropic's Code w/ Claude event today. Here's my live blog of the morning keynote sessions.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/live-blog"&gt;live-blog&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="claude-code"/><category term="live-blog"/></entry><entry><title>Sightings</title><link href="https://simonwillison.net/2026/May/2/sightings/#atom-tag" rel="alternate"/><published>2026-05-02T17:26:40+00:00</published><updated>2026-05-02T17:26:40+00:00</updated><id>https://simonwillison.net/2026/May/2/sightings/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonwillison.net/elsewhere/sighting/"&gt;/elsewhere/sightings/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I have a new camera (a Canon R6 Mark II) so I'm taking a lot more photos of birds. I share my best wildlife photos on &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt;, and based on yesterday's &lt;a href="https://simonwillison.net/2026/May/1/inat-sightings/"&gt;successful prototype&lt;/a&gt; I decided to add those to my blog.&lt;/p&gt;
&lt;p&gt;&lt;img class="blogmark-image" src="https://static.simonwillison.net/static/2026/beats-sightings.jpeg" alt="Screenshot of a &amp;quot;Sightings&amp;quot; webpage with a search bar and RSS icon, showing &amp;quot;Filters: Sorted by date&amp;quot; and &amp;quot;208 results page 1 / 7 next » last »»&amp;quot;. First entry: SIGHTING 7:51 PM — Acorn Woodpecker, with two photos labeled &amp;quot;Acorn Woodpecker&amp;quot; of black and white woodpeckers with red caps on tree branches, dated 2nd May 2026. Second entry: SIGHTING 10:08 AM – 11:17 AM — Acorn Woodpecker, Western Fence Lizard, Osprey, with three photos labeled &amp;quot;Acorn Woodpecker&amp;quot; (bird on bare branches against blue sky), &amp;quot;Wester...&amp;quot; (lizard on tree bark), and &amp;quot;Osprey&amp;quot; (nest on a utility pole), dated 1st May 2026. Third entry: SIGHTING 11:11 AM — White-crowned Sparrow, with a photo labeled &amp;quot;White-crowned Sparrow&amp;quot; of a sparrow with black and white striped head singing with open beak, dated 30th Apr 2026."&gt;&lt;/p&gt;
&lt;p&gt;I built this feature on my phone using Claude Code for web, as an extension of my &lt;a href="https://simonwillison.net/2026/Feb/20/beats/"&gt;beats system&lt;/a&gt; for syndicating external content. Here's &lt;a href="https://github.com/simonw/simonwillisonblog/pull/668"&gt;the PR&lt;/a&gt; and prompt.&lt;/p&gt;
&lt;p&gt;As with my other forms of incoming syndicated content, sightings show up on the homepage, the date archive pages, and in site search results.&lt;/p&gt;
&lt;p&gt;I back-populated over a decade of iNaturalist sightings, which means that if you &lt;a href="https://simonwillison.net/search/?q=lemur"&gt;search for lemur&lt;/a&gt; you'll see my lemur photos from Madagascar in 2019!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photography"&gt;photography&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wildlife"&gt;wildlife&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="blogging"/><category term="photography"/><category term="wildlife"/><category term="ai"/><category term="inaturalist"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-code"/></entry><entry><title>iNaturalist Sightings</title><link href="https://simonwillison.net/2026/May/1/inat-sightings/#atom-tag" rel="alternate"/><published>2026-05-01T19:35:41+00:00</published><updated>2026-05-01T19:35:41+00:00</updated><id>https://simonwillison.net/2026/May/1/inat-sightings/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/inat-sightings"&gt;iNaturalist Sightings&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I wanted to see my &lt;a href="https://www.inaturalist.org"&gt;iNaturalist&lt;/a&gt; observations - across two separate accounts - grouped by when they occurred. I'm camping this weekend so I built this entirely on my phone using Claude Code for web.&lt;/p&gt;
&lt;p&gt;I started by building an &lt;a href="https://github.com/simonw/inaturalist-clumper"&gt;inaturalist-clumper&lt;/a&gt; Python CLI for fetching and "clumping" observations - by default clumps use observations within 2 hours and 5km of each other.&lt;/p&gt;
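&lt;p&gt;A minimal sketch of what time-and-distance clumping could look like. This is not the code from inaturalist-clumper: the function names, the dictionary keys, the sample coordinates and the choice to compare each observation against the previous one in the clump are all assumptions for illustration. Only the 2 hour and 5km defaults come from the description above.&lt;/p&gt;

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def clump(observations, max_gap=timedelta(hours=2), max_km=5.0):
    """Group observations, comparing each one to the previous in time order."""
    clumps = []
    for obs in sorted(observations, key=lambda o: o["time"]):
        if clumps:
            last = clumps[-1][-1]
            near_in_time = max_gap >= obs["time"] - last["time"]
            near_in_space = max_km >= haversine_km(
                last["lat"], last["lon"], obs["lat"], obs["lon"])
            if near_in_time and near_in_space:
                # Close enough to the previous observation: same clump
                clumps[-1].append(obs)
                continue
        # First observation, or too far away in time or space: new clump
        clumps.append([obs])
    return clumps


# Hypothetical sample data, not real iNaturalist records
observations = [
    {"name": "obs-a", "time": datetime(2026, 5, 1, 10, 8), "lat": 37.40, "lon": -122.20},
    {"name": "obs-b", "time": datetime(2026, 5, 1, 11, 17), "lat": 37.41, "lon": -122.21},
    {"name": "obs-c", "time": datetime(2026, 5, 1, 19, 51), "lat": 37.40, "lon": -122.20},
]
for group in clump(observations):
    print([o["name"] for o in group])
```

&lt;p&gt;Comparing against the previous observation means a long walk can chain into one clump even if its start and end are more than 5km apart. Comparing against the first observation in the clump would be a stricter alternative.&lt;/p&gt;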
&lt;p&gt;Then I set up &lt;a href="https://github.com/simonw/inaturalist-clumps"&gt;simonw/inaturalist-clumps&lt;/a&gt; as a &lt;a href="https://simonwillison.net/series/git-scraping/"&gt;Git scraping&lt;/a&gt; repository to run that tool and record the result to &lt;a href="https://github.com/simonw/inaturalist-clumps/blob/main/clumps.json"&gt;clumps.json&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That JSON file is hosted on GitHub, which means it can be fetched by JavaScript using CORS.&lt;/p&gt;
&lt;p&gt;Finally I ran this prompt against my &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repo:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Build inat-sightings.html - an app that does a fetch() against https://raw.githubusercontent.com/simonw/inaturalist-clumps/refs/heads/main/clumps.json and then displays all of the observations on one page using the https://static.inaturalist.org/photos/538073008/small.jpg small.jpg URLs for the thumbnails - with loading=lazy - but when a thumbnail is clicked showing the large.jpg in an HTML modal. Both small and large should include the common species names if available&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="tools"/><category term="ai"/><category term="inaturalist"/><category term="generative-ai"/><category term="llms"/><category term="claude-code"/></entry><entry><title>An update on recent Claude Code quality reports</title><link href="https://simonwillison.net/2026/Apr/24/recent-claude-code-quality-reports/#atom-tag" rel="alternate"/><published>2026-04-24T01:31:25+00:00</published><updated>2026-04-24T01:31:25+00:00</updated><id>https://simonwillison.net/2026/Apr/24/recent-claude-code-quality-reports/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/engineering/april-23-postmortem"&gt;An update on recent Claude Code quality reports&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems.&lt;/p&gt;
&lt;p&gt;The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.&lt;/p&gt;
&lt;p&gt;Anthropic's postmortem describes these in detail. This one in particular stood out to me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive.&lt;/p&gt;
&lt;/blockquote&gt;
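&lt;p&gt;To make that class of bug concrete, here is a hypothetical Python sketch of how a one-off cleanup can turn into an every-turn one. It is not Anthropic's implementation: the &lt;code&gt;Session&lt;/code&gt; class, the &lt;code&gt;needs_clear&lt;/code&gt; flag and the function names are all invented for illustration, and only the one-hour idle threshold comes from the quote.&lt;/p&gt;

```python
from dataclasses import dataclass, field

IDLE_LIMIT_SECONDS = 60 * 60  # "idle for over an hour"


@dataclass
class Session:
    # Hypothetical session state, invented for this sketch.
    thinking: list = field(default_factory=list)
    last_active: float = 0.0
    needs_clear: bool = False


def begin_turn_buggy(session, now):
    if now - session.last_active > IDLE_LIMIT_SECONDS:
        session.needs_clear = True
    if session.needs_clear:
        # Bug: the flag is never reset, so this runs on every later turn too.
        session.thinking.clear()
    session.last_active = now


def begin_turn_fixed(session, now):
    if now - session.last_active > IDLE_LIMIT_SECONDS:
        # Clear once, at the moment the idle session is resumed.
        session.thinking.clear()
    session.last_active = now


def thinking_kept(begin_turn):
    """Resume after two idle hours, then take one more turn ten seconds later."""
    session = Session(thinking=["old reasoning"])
    begin_turn(session, now=2 * 60 * 60)       # resumed: old thinking is cleared
    session.thinking.append("fresh reasoning")
    begin_turn(session, now=2 * 60 * 60 + 10)  # the very next turn
    return session.thinking
```

&lt;p&gt;With the sticky flag, the reasoning added right after resuming is wiped again on the next turn, which is the "forgetful and repetitive" symptom. In the fixed variant it survives.&lt;/p&gt;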
&lt;p&gt;I &lt;em&gt;frequently&lt;/em&gt; have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to &lt;code&gt;ps aux  | grep 'claude '&lt;/code&gt;) and that's after closing down dozens more the other day.&lt;/p&gt;
&lt;p&gt;I estimate I spend more time prompting in these "stale" sessions than sessions that I've recently started!&lt;/p&gt;
&lt;p&gt;If you're building agentic systems it's worth reading this article in detail - the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47878905"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>Extract PDF text in your browser with LiteParse for the web</title><link href="https://simonwillison.net/2026/Apr/23/liteparse-for-the-web/#atom-tag" rel="alternate"/><published>2026-04-23T21:54:24+00:00</published><updated>2026-04-23T21:54:24+00:00</updated><id>https://simonwillison.net/2026/Apr/23/liteparse-for-the-web/#atom-tag</id><summary type="html">
    &lt;p&gt;LlamaIndex have a most excellent open source project called &lt;a href="https://github.com/run-llama/liteparse"&gt;LiteParse&lt;/a&gt;, which provides a Node.js CLI tool for extracting text from PDFs. I got a version of LiteParse working entirely in the browser, using most of the same libraries that LiteParse uses to run in Node.js.&lt;/p&gt;
&lt;h4 id="spatial-text-parsing"&gt;Spatial text parsing&lt;/h4&gt;
&lt;p&gt;Refreshingly, LiteParse doesn't use AI models to do what it does: it's good old-fashioned PDF parsing, falling back to Tesseract OCR (or other pluggable OCR engines) for PDFs that contain images of text rather than the text itself.&lt;/p&gt;
&lt;p&gt;The hard problem that LiteParse solves is extracting text in a sensible order despite the infuriating vagaries of PDF layouts. They describe this as "spatial text parsing" - they use some very clever heuristics to detect things like multi-column layouts and group and return the text in a sensible linear flow.&lt;/p&gt;
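&lt;p&gt;To see why reading order is the hard part, here is a toy Python sketch of one such heuristic. It is not LiteParse's algorithm: the &lt;code&gt;reading_order&lt;/code&gt; function, the &lt;code&gt;(x, y, text)&lt;/code&gt; tuples and the &lt;code&gt;column_gap&lt;/code&gt; threshold are assumptions made for illustration. It splits text items into columns wherever there is a large horizontal gap between left edges, then reads each column top to bottom.&lt;/p&gt;

```python
def reading_order(items, column_gap=100.0):
    """Order text items column by column.

    items: list of (x, y, text) tuples, where x is the left edge and
    y is the distance from the top of the page (both hypothetical units).
    """
    columns = []
    previous_x = None
    for item in sorted(items, key=lambda it: it[0]):
        # A big jump in left edge is treated as the start of a new column.
        if previous_x is None or item[0] - previous_x > column_gap:
            columns.append([])
        columns[-1].append(item)
        previous_x = item[0]

    ordered = []
    for column in columns:
        # Within a column, read from the top of the page downwards.
        ordered.extend(text for _, _, text in sorted(column, key=lambda it: it[1]))
    return ordered


if __name__ == "__main__":
    page = [(50, 10, "A1"), (300, 10, "B1"), (50, 30, "A2"), (300, 30, "B2")]
    print(reading_order(page))
```

&lt;p&gt;A naive sort by vertical position would interleave the two columns line by line. Grouping by column first keeps each column's text together.&lt;/p&gt;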
&lt;p&gt;The LiteParse documentation describes a pattern for implementing &lt;a href="https://developers.llamaindex.ai/liteparse/guides/visual-citations/"&gt;Visual Citations with Bounding Boxes&lt;/a&gt;. I really like this idea: being able to answer questions from a PDF and accompany those answers with cropped, highlighted images feels like a great way of increasing the credibility of answers from RAG-style Q&amp;amp;A.&lt;/p&gt;
&lt;p&gt;LiteParse is provided as a pure CLI tool, designed to be used by agents. You run it like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;npm i -g @llamaindex/liteparse
lit parse document.pdf
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I &lt;a href="https://claude.ai/share/44a5ed86-e5b5-4e14-90be-1eba1e0acd13"&gt;explored its capabilities with Claude&lt;/a&gt; and quickly determined that there was no real reason it had to stay a CLI app: it's built on top of PDF.js and Tesseract.js, two libraries I've used for something similar in a browser &lt;a href="https://simonwillison.net/2024/Mar/30/ocr-pdfs-images/"&gt;in the past&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The only reason LiteParse didn't have a pure browser-based version is that nobody had built one yet...&lt;/p&gt;
&lt;h4 id="introducing-liteparse-for-the-web"&gt;Introducing LiteParse for the web&lt;/h4&gt;
&lt;p&gt;Visit &lt;a href="https://simonw.github.io/liteparse/"&gt;https://simonw.github.io/liteparse/&lt;/a&gt; to try out LiteParse against any PDF file, running entirely in your browser. Here's what that looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/liteparse-web.jpg" alt="Screenshot of the LiteParse browser demo web page. Header reads &amp;quot;LiteParse&amp;quot; with subtitle &amp;quot;Browser demo of LiteParse — parse PDFs in your browser. Nothing leaves your machine.&amp;quot; A dashed-border drop zone says &amp;quot;Drop a PDF here or click to choose / Your file stays in your browser.&amp;quot; with a file pill labeled &amp;quot;19720005243.pdf&amp;quot;. Below are a checked &amp;quot;Run OCR&amp;quot; checkbox, an unchecked &amp;quot;Render page screenshots&amp;quot; checkbox, and a blue &amp;quot;Parse&amp;quot; button. Status text: &amp;quot;Parsed 86 pages.&amp;quot; Two side-by-side panels follow. Left panel titled &amp;quot;Text&amp;quot; with a Copy button shows monospace extracted text beginning &amp;quot;Apollo 5 was an unmanned system, both propulsion systems ascent and descent stages&amp;quot;. Right panel titled &amp;quot;JSON&amp;quot;, also with a copy button, contains JSON showing the dimensions and position and detected font of each piece of text." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The tool can work with or without running OCR, and can optionally display images for every page in the PDF further down the page.&lt;/p&gt;
&lt;h4 id="building-it-with-claude-code-and-opus-4-7"&gt;Building it with Claude Code and Opus 4.7&lt;/h4&gt;
&lt;p&gt;The process of building this started in the regular Claude app on my iPhone. I wanted to try out LiteParse myself, so I started by uploading a random PDF I happened to have on my phone along with this prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Clone https://github.com/run-llama/liteparse and try it against this file&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Regular Claude chat can clone directly from GitHub these days, and while by default it can't access most of the internet from its container it can also install packages from PyPI and npm.&lt;/p&gt;
&lt;p&gt;I often use this to try out new pieces of open source software on my phone - it's a quick way to exercise something without having to sit down with my laptop.&lt;/p&gt;
&lt;p&gt;You can follow my full conversation in &lt;a href="https://claude.ai/share/44a5ed86-e5b5-4e14-90be-1eba1e0acd13"&gt;this shared Claude transcript&lt;/a&gt;. I asked a few follow-up questions about how it worked, and then asked:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Does this library run in a browser? Could it?&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This gave me a thorough enough answer that I was convinced it was worth trying to get that working for real. I opened up my laptop and switched to Claude Code.&lt;/p&gt;
&lt;p&gt;I forked the original repo on GitHub, cloned a local copy, started a new &lt;code&gt;web&lt;/code&gt; branch and pasted that last reply from Claude into a new file called &lt;a href="https://github.com/simonw/liteparse/blob/web/notes.md"&gt;notes.md&lt;/a&gt;. Then I told Claude Code:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Get this working as a web app. index.html, when loaded, should render an app that lets users open a PDF in their browser and select OCR or non-OCR mode and have this run. Read notes.md for initial research on this problem, then write out plan.md with your detailed implementation plan&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I always like to start with a plan for this kind of project. Sometimes I'll use Claude's "planning mode", but in this case I knew I'd want the plan as an artifact in the repository so I told it to write &lt;code&gt;plan.md&lt;/code&gt; directly.&lt;/p&gt;
&lt;p&gt;This also means I can iterate on the plan with Claude. I noticed that Claude had decided to punt on generating screenshots of images in the PDF, and suggested we defer a "canvas-encode swap" to v2. I fixed that by prompting:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Update the plan to say we WILL do the canvas-encode swap so the screenshots thing works&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After a few short follow-up prompts, here's the &lt;a href="https://github.com/simonw/liteparse/blob/web/plan.md"&gt;plan.md&lt;/a&gt; I thought was strong enough to implement.&lt;/p&gt;
&lt;p&gt;I prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;build it.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And then mostly left Claude Code to its own devices, tinkered with some other projects, caught up on Duolingo and occasionally checked in to see how it was doing.&lt;/p&gt;
&lt;p&gt;I added a few prompts to the queue as I was working. Those don't yet show up in my exported transcript, but it turns out running &lt;code&gt;rg queue-operation --no-filename | grep enqueue | jq -r '.content'&lt;/code&gt; in the relevant &lt;code&gt;~/.claude/projects/&lt;/code&gt; folder extracts them.&lt;/p&gt;
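&lt;p&gt;For anyone without &lt;code&gt;rg&lt;/code&gt; and &lt;code&gt;jq&lt;/code&gt; to hand, the same filter can be sketched in Python. This assumes only what the pipeline itself implies: each matching line is a JSON object with a &lt;code&gt;content&lt;/code&gt; key. The &lt;code&gt;queued_prompts&lt;/code&gt; name is hypothetical.&lt;/p&gt;

```python
import json


def queued_prompts(lines):
    """Yield the "content" field of each enqueue record in a transcript file."""
    for line in lines:
        # Same substring filters as the rg and grep steps of the pipeline.
        if "queue-operation" not in line or "enqueue" not in line:
            continue
        record = json.loads(line)
        content = record.get("content")
        # jq -r would print "null" for a missing field; this sketch skips it.
        if content is not None:
            yield content
```

&lt;p&gt;Pass it any iterable of lines, such as an open transcript file from the relevant &lt;code&gt;~/.claude/projects/&lt;/code&gt; folder.&lt;/p&gt;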
&lt;p&gt;Here are the key follow-up prompts with some notes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;When you implement this use playwright and red/green TDD, plan that too&lt;/code&gt; - I've written more &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;about red/green TDD here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;let's use PDF.js's own renderer&lt;/code&gt; (it was messing around with pdfium)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;The final UI should include both the text and the pretty-printed JSON output, both of those in textareas and both with copy-to-clipboard buttons - it should also be mobile friendly&lt;/code&gt; - I had a new idea for how the UI should work&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;small commits along the way&lt;/code&gt; - see below&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Make sure the index.html page includes a link back to https://github.com/run-llama/liteparse near the top of the page&lt;/code&gt; - it's important to credit your dependencies in a project like this!&lt;/li&gt;
&lt;li&gt;&lt;code&gt;View on GitHub → is bad copy because that's not the repo with this web app in, it's the web app for the underlying LiteParse library&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Run OCR should be unchecked by default&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;When I try to parse a PDF in my browser I see 'Parse failed: undefined is not a function (near '...value of readableStream...')&lt;/code&gt; - it was testing with Playwright in Chrome, turned out there was a bug in Safari&lt;/li&gt;
&lt;li&gt;&lt;code&gt;... oh that is in safari but it works in chrome&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;When "Copy" is clicked the text should change to "Copied!" for 1.5s&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;[Image #1] Style the file input so that long filenames don't break things on Firefox like this - in fact add one of those drag-drop zone UIs which you can also click to select a file&lt;/code&gt; - dropping screenshots in of small UI glitches works surprisingly well&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Tweak the drop zone such that the text is vertically centered, right now it is a bit closer to the top&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;it breaks in Safari on macOS, works in both Chrome and Firefox. On Safari I see "Parse failed: undefined is not a function (near '...value of readableStream...')" after I click the Parse button, when OCR is not checked&lt;/code&gt; - it still wasn't working in Safari...&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;works in safari now&lt;/code&gt;  - but it fixed it pretty quickly once I pointed that out and it got Playwright working with that browser&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I've started habitually asking for "small commits along the way" because it makes for code that's easier to understand or review later on, and I have an unproven hunch that it helps the agent work more effectively too - it's yet another encouragement towards planning and taking on one problem at a time.&lt;/p&gt;
&lt;p&gt;While it was working I decided it would be nice to be able to interact with an in-progress version.  I asked a separate Claude Code session against the same directory for tips on how to run it, and it told me to use &lt;code&gt;npx vite&lt;/code&gt;. Running that started a development server with live-reloading, which meant I could instantly see the effect of each change it made on disk - and prompt with further requests for tweaks and fixes.&lt;/p&gt;
&lt;p&gt;Towards the end I decided it was going to be good enough to publish. I started a fresh Claude Code instance and told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Look at the web/ folder - set up GitHub actions for this repo such that any push runs the tests, and if the tests pass it then does a GitHub Pages deploy of the built vite app such that the web/index.html page is the index.html page for the thing that is deployed and it works on GitHub Pages&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After a bit more iteration &lt;a href="https://github.com/simonw/liteparse/blob/web/.github/workflows/deploy-web.yml"&gt;here's the GitHub Actions workflow&lt;/a&gt; that builds the app using Vite and deploys the result to &lt;a href="https://simonw.github.io/liteparse/"&gt;https://simonw.github.io/liteparse/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I love GitHub Pages for this kind of thing because it can be quickly configured (by Claude, in this case) to turn any repository into a deployed web-app, at zero cost and with whatever build step is necessary. It even works against private repos, if you don't mind your only security being a secret URL.&lt;/p&gt;
&lt;p&gt;With this kind of project there's always a major risk that the model might "cheat" - mark key features as "TODO" and fake them, or take shortcuts that ignore the initial requirements.&lt;/p&gt;
&lt;p&gt;The responsible way to prevent this is to review all of the code... but this wasn't intended as that kind of project, so instead I fired up OpenAI Codex with GPT-5.5 (I had preview access) and told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Describe the difference between how the node.js CLI tool runs and how the web/ version runs&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The answer I got back was enough to give me confidence that Claude hadn't taken any project-threatening shortcuts.&lt;/p&gt;
&lt;p&gt;... and that was about it. Total time in Claude Code for that "build it" step was 59 minutes. I used my &lt;a href="https://github.com/simonw/claude-code-transcripts"&gt;claude-code-transcripts&lt;/a&gt; tool to export a readable version of the full transcript which you can &lt;a href="https://gisthost.github.io/?d64889bfc1b897fea3867adfec62ed89/index.html"&gt;view here&lt;/a&gt;, albeit without those additional queued prompts (here's my &lt;a href="https://github.com/simonw/claude-code-transcripts/issues/98"&gt;issue to fix that&lt;/a&gt;).&lt;/p&gt;
&lt;h4 id="is-this-even-vibe-coding-any-more-"&gt;Is this even vibe coding any more?&lt;/h4&gt;
&lt;p&gt;I'm a pedantic stickler when it comes to &lt;a href="https://simonwillison.net/2025/Mar/19/vibe-coding/"&gt;the original definition of vibe coding&lt;/a&gt; - vibe coding does &lt;em&gt;not&lt;/em&gt; mean any time you use AI to help you write code, it's when you use AI without reviewing or caring about the code that's written at all.&lt;/p&gt;
&lt;p&gt;By my own definition, this LiteParse for the web project is about as pure vibe coding as you can get! I have not looked at a &lt;em&gt;single line&lt;/em&gt; of the HTML and TypeScript written for this project - in fact while writing this sentence I had to go and check if it had used JavaScript or TypeScript.&lt;/p&gt;
&lt;p&gt;Yet somehow this one doesn't feel as vibe coded to me as many of my other vibe coded projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As a static in-browser web application hosted on GitHub Pages the blast radius for any bugs is almost non-existent: it either works for your PDF or doesn't.&lt;/li&gt;
&lt;li&gt;No private data is transferred anywhere - all processing happens in your browser - so a security audit is unnecessary. I've glanced once at the network panel while it's running and no additional requests are made when a PDF is being parsed.&lt;/li&gt;
&lt;li&gt;There was still a whole lot of engineering experience and knowledge required to use the models in this way. Identifying that LiteParse could be ported to run directly in a browser was critical to the rest of the project.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Most importantly, I'm happy to attach my reputation to this project and recommend that other people try it out. Unlike most of my vibe coded tools I'm not convinced that spending significant additional engineering time on this would have resulted in a meaningfully better initial release. It's fine as it is!&lt;/p&gt;
&lt;p&gt;I haven't opened a PR against the &lt;a href="https://github.com/run-llama/liteparse"&gt;original repository&lt;/a&gt; because I've not discussed it with the LiteParse team. I've &lt;a href="https://github.com/run-llama/liteparse/issues/147"&gt;opened an issue&lt;/a&gt;, and if they want my vibe coded implementation as a starting point for something more official they're welcome to take it.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ocr"&gt;ocr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pdf"&gt;pdf&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="javascript"/><category term="ocr"/><category term="pdf"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="vibe-coding"/><category term="coding-agents"/><category term="claude-code"/><category term="agentic-engineering"/></entry><entry><title>Is Claude Code going to cost $100/month? Probably not - it's all very confusing</title><link href="https://simonwillison.net/2026/Apr/22/claude-code-confusion/#atom-tag" rel="alternate"/><published>2026-04-22T02:07:34+00:00</published><updated>2026-04-22T02:07:34+00:00</updated><id>https://simonwillison.net/2026/Apr/22/claude-code-confusion/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic today quietly (as in &lt;em&gt;silently&lt;/em&gt;, no announcement anywhere at all) updated their &lt;a href="https://claude.com/pricing"&gt;claude.com/pricing&lt;/a&gt; page (but not their &lt;a href="https://support.claude.com/en/articles/11049762-choosing-a-claude-plan"&gt;Choosing a Claude plan page&lt;/a&gt;, which shows up first for me on Google) to add this tiny but significant detail (arrow is mine, &lt;a href="https://simonwillison.net/2026/Apr/22/claude-code-confusion/#they-reversed-it"&gt;and it's already reverted&lt;/a&gt;):&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/anthropic-x.jpg" alt="Screenshot of the Claude pricing grid - Compare features across plans. Free, Pro, Max 5x and Max 20x all have the same features, with the exception of Claude Code which is on Max only and Claude Cowork which is on Pro and Max only. An arrow highlights the Claude Code for Pro cross." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://web.archive.org/web/20260421040656/claude.com/pricing"&gt;Internet Archive copy&lt;/a&gt; from yesterday shows a checkbox there. Claude Code used to be a feature of the $20/month Pro plan, but according to the new pricing page it is now exclusive to the $100/month or $200/month Max plans.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update&lt;/strong&gt;: don't miss &lt;a href="https://simonwillison.net/2026/Apr/22/claude-code-confusion/#they-reversed-it"&gt;the update to this post&lt;/a&gt;, they've already changed course a few hours after this change went live.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;So what the heck is going on? Unsurprisingly, &lt;a href="https://www.reddit.com/r/ClaudeAI/comments/1srzhd7/psa_claude_pro_no_longer_lists_claude_code_as_an/"&gt;Reddit&lt;/a&gt; and &lt;a href="https://news.ycombinator.com/item?id=47854477"&gt;Hacker News&lt;/a&gt; and &lt;a href="https://twitter.com/i/trending/2046718768634589239"&gt;Twitter&lt;/a&gt; all caught fire.&lt;/p&gt;
&lt;p&gt;I didn't believe the screenshots myself when I first saw them - aside from the pricing grid I could find no announcement from Anthropic anywhere. Then Amol Avasare, Anthropic's Head of Growth, &lt;a href="https://twitter.com/TheAmolAvasare/status/2046724659039932830"&gt;tweeted&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For clarity, we're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And that appears to be the closest we have had to official messaging from Anthropic.&lt;/p&gt;
&lt;p&gt;I don't buy the "~2% of new prosumer signups" thing, since everyone I've talked to is seeing the new pricing grid and the Internet Archive has already &lt;a href="https://web.archive.org/web/20260422001250/https://claude.com/pricing"&gt;snapped a copy&lt;/a&gt;. Maybe he means that they'll only be running this version of the pricing grid for a limited time which somehow adds up to "2%" of signups?&lt;/p&gt;
&lt;p&gt;I'm also amused to see Claude Cowork remain available on the $20/month plan, because Claude Cowork is effectively a rebranded version of Claude Code wearing a less threatening hat!&lt;/p&gt;
&lt;p&gt;There are a whole bunch of things that are bad about this.&lt;/p&gt;
&lt;p&gt;If we assume this is indeed a test, and that test comes up negative and they decide not to go ahead with it, the damage has still been extensive:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A whole lot of people got scared or angry or both that a service they relied on was about to be rug-pulled. There really is a significant difference between $20/month and $100/month for most people, especially outside of higher salary countries.&lt;/li&gt;
&lt;li&gt;The uncertainty is really bad! A tweet from an employee is &lt;em&gt;not&lt;/em&gt; the way to make an announcement like this. I wasted a solid hour of my afternoon trying to figure out what had happened here. My trust in Anthropic's transparency around pricing - a &lt;em&gt;crucial factor&lt;/em&gt; in how I understand their products - has been shaken.&lt;/li&gt;
&lt;li&gt;Strategically, should I be taking a bet on Claude Code if I know that they might 5x the minimum price of the product?&lt;/li&gt;
&lt;li&gt;More of a personal issue, but one I care deeply about myself: I invest a &lt;a href="https://simonwillison.net/tags/claude-code/"&gt;great deal of effort&lt;/a&gt; (that's 105 posts and counting) in teaching people how to use Claude Code. I don't want to invest that effort in a product that most people cannot afford to use.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Last month I ran &lt;a href="https://simonw.github.io/nicar-2026-coding-agents/"&gt;a tutorial for journalists&lt;/a&gt; on "Coding agents for data analysis" at the annual NICAR data journalism conference. I'm not going to be teaching that audience a course that depends on a $100/month subscription!&lt;/p&gt;
&lt;p&gt;This also doesn't make sense to me as a strategy for Anthropic. Claude Code &lt;em&gt;defined the category&lt;/em&gt; of coding agents. It's responsible for billions of dollars in annual revenue for Anthropic already. It has a stellar reputation, but I'm not convinced that reputation is strong enough for it to lose the $20/month trial and jump people directly to a $100/month subscription.&lt;/p&gt;
&lt;p&gt;OpenAI have been investing heavily in catching up to Claude Code with their Codex products. Anthropic just handed them this marketing opportunity on a plate - here's Codex engineering lead &lt;a href="https://twitter.com/thsottiaux/status/2046740759056162816"&gt;Thibault Sottiaux&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I don't know what they are doing over there, but Codex will continue to be available both in the FREE and PLUS ($20) plans. We have the compute and efficient models to support it. For important changes, we will engage with the community well ahead of making them.&lt;/p&gt;
&lt;p&gt;Transparency and trust are two principles we will not break, even if it means momentarily earning less. A reminder that you vote with your subscription for the values you want to see in this world.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I should note that I pay $200/month for Claude Max and I consider it well worth the money. I've had periods of free access in the past courtesy of Anthropic but I'm currently paying full price, and happy to do so.&lt;/p&gt;
&lt;p&gt;But I care about the accessibility of the tools that I work with and teach. If Codex has a free tier while Claude Code starts at $100/month I should obviously switch to Codex, because that way I can use the same tool as the people I want to teach how to use coding agents.&lt;/p&gt;
&lt;p&gt;Here's what I think happened. I think Anthropic are trying to optimize revenue growth - obviously - and someone pitched making Claude Code only available for Max and higher. That's clearly a bad idea, but "testing" culture says that it's worth putting even bad ideas out to test just in case they surprise you.&lt;/p&gt;
&lt;p&gt;So they started a test, without taking into account the wailing and gnashing of teeth that would result when their test was noticed - or accounting for the longer-term brand damage that would be caused.&lt;/p&gt;
&lt;p&gt;Or maybe they &lt;em&gt;did&lt;/em&gt; account for that, and decided it was worth the risk.&lt;/p&gt;
&lt;p&gt;I don't think that calculation was worthwhile. They're going to have to make a &lt;em&gt;very&lt;/em&gt; firm commitment along the lines of "we heard your feedback and we commit to keeping Claude Code available on our $20/month plan going forward" to regain my trust.&lt;/p&gt;
&lt;p&gt;As it stands, Codex is looking like a much safer bet for me to invest my time in learning and building educational materials around.&lt;/p&gt;
&lt;h4 id="they-reversed-it"&gt;Update: they've reversed it already&lt;/h4&gt;
&lt;p&gt;In the time I was &lt;em&gt;typing this blog entry&lt;/em&gt; Anthropic appear to have reversed course - the &lt;a href="https://claude.com/pricing"&gt;claude.com/pricing page&lt;/a&gt; now has a checkbox back in the Pro column for Claude Code. I can't find any official communication about it though.&lt;/p&gt;
&lt;p&gt;Let's see if they can come up with an explanation/apology that's convincing enough to offset the trust bonfire from this afternoon!&lt;/p&gt;
&lt;h4 id="update-2"&gt;Update 2: it may still affect 2% of signups?&lt;/h4&gt;
&lt;p&gt;Amol &lt;a href="https://x.com/TheAmolAvasare/status/2046788872517066971"&gt;on Twitter&lt;/a&gt;:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;was a mistake that the logged-out landing page and docs were updated for this test [&lt;a href="https://twitter.com/TheAmolAvasare/status/2046783926920978681"&gt;embedded self-tweet&lt;/a&gt;]&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Getting lots of questions on why the landing page / docs were updated if only 2% of new signups were affected.&lt;/p&gt;

&lt;p&gt;This was understandably confusing for the 98% of folks not part of the experiment, and we've reverted both the landing page and docs changes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/blockquote&gt;
&lt;p&gt;So the experiment is still running, just not visible to the rest of the world?&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex"&gt;codex&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="llm-pricing"/><category term="ai-ethics"/><category term="coding-agents"/><category term="claude-code"/><category term="codex"/></entry><entry><title>Exploring the new `servo` crate</title><link href="https://simonwillison.net/2026/Apr/13/servo-crate-exploration/#atom-tag" rel="alternate"/><published>2026-04-13T15:04:00+00:00</published><updated>2026-04-13T15:04:00+00:00</updated><id>https://simonwillison.net/2026/Apr/13/servo-crate-exploration/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Research:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research/tree/main/servo-crate-exploration#readme"&gt;Exploring the new `servo` crate&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;In &lt;a href="https://servo.org/blog/2026/04/13/servo-0.1.0-release/"&gt;Servo is now available on crates.io&lt;/a&gt; the Servo team announced the initial release of the &lt;a href="https://crates.io/crates/servo"&gt;servo&lt;/a&gt; crate, which packages their browser engine as an embeddable library.&lt;/p&gt;
&lt;p&gt;I set Claude Code for web &lt;a href="https://github.com/simonw/research/pull/108"&gt;the task&lt;/a&gt; of figuring out what it can do, building a CLI tool for taking screenshots using it and working out if it could be compiled to WebAssembly.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;servo-shot&lt;/code&gt; Rust tool it built works pretty well:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;git clone https://github.com/simonw/research
cd research/servo-crate-exploration/servo-shot
cargo build
./target/debug/servo-shot https://news.ycombinator.com/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here's the result:&lt;/p&gt;
&lt;p&gt;&lt;img alt="An accurately rendered screenshot of the Hacker News homepage" src="https://static.simonwillison.net/static/2026/servo-hn.png" /&gt;&lt;/p&gt;
&lt;p&gt;Compiling Servo itself to WebAssembly is not feasible due to its heavy use of threads and dependencies like SpiderMonkey, but Claude did build me &lt;a href="https://simonw.github.io/research/servo-crate-exploration/html5ever-wasm-demo/www/"&gt;this playground page&lt;/a&gt; for trying out a WebAssembly build of the &lt;code&gt;html5ever&lt;/code&gt; and &lt;code&gt;markup5ever_rcdom&lt;/code&gt; crates, providing a tool for turning fragments of HTML into a parse tree.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/browsers"&gt;browsers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/research"&gt;research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rust"&gt;rust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/servo"&gt;servo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="browsers"/><category term="research"/><category term="rust"/><category term="webassembly"/><category term="servo"/><category term="claude-code"/></entry><entry><title>Cleanup Claude Code Paste</title><link href="https://simonwillison.net/2026/Apr/6/cleanup-claude-code-paste/#atom-tag" rel="alternate"/><published>2026-04-06T02:55:23+00:00</published><updated>2026-04-06T02:55:23+00:00</updated><id>https://simonwillison.net/2026/Apr/6/cleanup-claude-code-paste/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/cleanup-claude-code-paste"&gt;Cleanup Claude Code Paste&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;Super-niche tool this. I sometimes copy prompts out of the Claude Code terminal app and they come out with a bunch of weird additional whitespace. This tool cleans that up.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a web tool titled &amp;quot;Cleanup Claude Code Paste&amp;quot; with the subtitle &amp;quot;Paste terminal output to remove the ❯ prompt, fix wrapped-line whitespace, and join lines into clean text.&amp;quot; An input textarea contains pasted terminal output starting with &amp;quot;❯ Add a -r/--redact option which asks for user approval (after telling it how many replacements will happen and in which files and which lines – standard output basically) and then rewrites the files in that folder to replace all matched secrets with REDACTED. Run tests with 'uv run pytest' and use red/green TDD&amp;quot;. Below is a &amp;quot;Cleaned output:&amp;quot; section showing the same text with the ❯ prompt removed and whitespace cleaned up. A blue &amp;quot;Copy to clipboard&amp;quot; button appears at the bottom." src="https://static.simonwillison.net/static/2026/claude-code-cleanup.jpg" /&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="tools"/><category term="claude-code"/></entry><entry><title>scan-for-secrets 0.1</title><link href="https://simonwillison.net/2026/Apr/5/scan-for-secrets-3/#atom-tag" rel="alternate"/><published>2026-04-05T03:27:13+00:00</published><updated>2026-04-05T03:27:13+00:00</updated><id>https://simonwillison.net/2026/Apr/5/scan-for-secrets-3/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/scan-for-secrets/releases/tag/0.1"&gt;scan-for-secrets 0.1&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I like publishing transcripts of local Claude Code sessions using my &lt;a href="https://github.com/simonw/claude-code-transcripts"&gt;claude-code-transcripts&lt;/a&gt; tool but I'm often paranoid that one of my API keys or similar secrets might inadvertently be revealed in the detailed log files.&lt;/p&gt;
&lt;p&gt;I built this new Python scanning tool to help reassure me. You can feed it secrets and have it scan for them in a specified directory:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx scan-for-secrets $OPENAI_API_KEY -d logs-to-publish/
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you leave off the &lt;code&gt;-d&lt;/code&gt; it defaults to the current directory.&lt;/p&gt;
&lt;p&gt;It doesn't just scan for the literal secrets - it also scans for common encodings of those secrets e.g. backslash or JSON escaping, &lt;a href="https://github.com/simonw/scan-for-secrets/blob/main/README.md#escaping-schemes"&gt;as described in the README&lt;/a&gt;.&lt;/p&gt;
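&lt;p&gt;To show why scanning for encodings matters, here is a minimal Python sketch of the idea. It is not the code from scan-for-secrets: the function names are hypothetical and it only covers two assumed escaping schemes (JSON string escaping and doubled backslashes), whereas the README lists the schemes the real tool handles.&lt;/p&gt;

```python
import json
from pathlib import Path


def secret_variants(secret):
    """Return the literal secret plus a couple of escaped forms of it.

    The two schemes below are assumptions chosen for illustration.
    """
    variants = {secret}
    # JSON string escaping, with the surrounding double quotes removed
    variants.add(json.dumps(secret)[1:-1])
    # Backslash escaping: every backslash doubled
    variants.add(secret.replace("\\", "\\\\"))
    return variants


def scan_directory(directory, secrets):
    """Yield (path, line_number) for each line containing any secret variant."""
    needles = set()
    for secret in secrets:
        needles.update(secret_variants(secret))
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        # Undecodable bytes are replaced so that binary files do not crash the scan
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                yield path, number
```

&lt;p&gt;A secret containing a double quote or a backslash looks different once it has been written into a JSON log file, so a search for the literal string alone would miss it.&lt;/p&gt;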
&lt;p&gt;If you have a set of secrets you always want to protect you can list commands to echo them in a &lt;code&gt;~/.scan-for-secrets.conf.sh&lt;/code&gt; file. Mine looks like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm keys get openai
llm keys get anthropic
llm keys get gemini
llm keys get mistral
awk -F= '/aws_secret_access_key/{print $2}' ~/.aws/credentials | xargs
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I built this tool using README-driven development: I carefully constructed the README describing exactly how the tool should work, then &lt;a href="https://gisthost.github.io/?d4b1a398bf3b6b14aade923dea69a1ac/index.html"&gt;dumped it into Claude Code&lt;/a&gt; and told it to build the actual tool (using &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/red-green-tdd/"&gt;red/green TDD&lt;/a&gt;, naturally).&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="security"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="claude-code"/><category term="agentic-engineering"/></entry><entry><title>Mr. Chatterbox is a (weak) Victorian-era ethically trained model you can run on your own computer</title><link href="https://simonwillison.net/2026/Mar/30/mr-chatterbox/#atom-tag" rel="alternate"/><published>2026-03-30T14:28:34+00:00</published><updated>2026-03-30T14:28:34+00:00</updated><id>https://simonwillison.net/2026/Mar/30/mr-chatterbox/#atom-tag</id><summary type="html">
    &lt;p&gt;Trip Venturella released &lt;a href="https://www.estragon.news/mr-chatterbox-or-the-modern-prometheus/"&gt;Mr. Chatterbox&lt;/a&gt;, a language model trained entirely on out-of-copyright text from the British Library. Here's how he describes it in &lt;a href="https://huggingface.co/tventurella/mr_chatterbox_model"&gt;the model card&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Mr. Chatterbox is a language model trained entirely from scratch on a corpus of over 28,000 Victorian-era British texts published between 1837 and 1899, drawn from a dataset made available &lt;a href="https://huggingface.co/datasets/TheBritishLibrary/blbooks"&gt;by the British Library&lt;/a&gt;. The model has absolutely no training inputs from after 1899 — the vocabulary and ideas are formed exclusively from nineteenth-century literature.&lt;/p&gt;
&lt;p&gt;Mr. Chatterbox's training corpus was 28,035 books, with an estimated 2.93 billion input tokens after filtering. The model has roughly 340 million paramaters, roughly the same size as GPT-2-Medium. The difference is, of course, that unlike GPT-2, Mr. Chatterbox is trained entirely on historical data.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Given how hard it is to train a useful LLM without using vast amounts of scraped, unlicensed data I've been dreaming of a model like this for a couple of years now. What would a model trained on out-of-copyright text be like to chat with?&lt;/p&gt;
&lt;p&gt;Thanks to Trip we can now find out for ourselves!&lt;/p&gt;
&lt;p&gt;The model itself is tiny, at least by Large Language Model standards - just &lt;a href="https://huggingface.co/tventurella/mr_chatterbox_model/tree/main"&gt;2.05GB&lt;/a&gt; on disk. You can try it out using Trip's &lt;a href="https://huggingface.co/spaces/tventurella/mr_chatterbox"&gt;HuggingFace Spaces demo&lt;/a&gt;:&lt;/p&gt;
&lt;p style="text-align: center"&gt;&lt;img src="https://static.simonwillison.net/static/2026/chatterbox.jpg" alt="Screenshot of a Victorian-themed chatbot interface titled &amp;quot;🎩 Mr. Chatterbox (Beta)&amp;quot; with subtitle &amp;quot;The Victorian Gentleman Chatbot&amp;quot;. The conversation shows a user asking &amp;quot;How should I behave at dinner?&amp;quot; with the bot replying &amp;quot;My good fellow, one might presume that such trivialities could not engage your attention during an evening's discourse!&amp;quot; The user then asks &amp;quot;What are good topics?&amp;quot; and the bot responds &amp;quot;The most pressing subjects of our society— Indeed, a gentleman must endeavor to engage the conversation with grace and vivacity. Such pursuits serve as vital antidotes against ennui when engaged in agreeable company.&amp;quot; A text input field at the bottom reads &amp;quot;Say hello...&amp;quot; with a send button. The interface uses a dark maroon and cream color scheme." style="max-width: 80%;" /&gt;&lt;/p&gt;
&lt;p&gt;Honestly, it's pretty terrible. Talking with it feels more like chatting with a Markov chain than an LLM - the responses may have a delightfully Victorian flavor to them but it's hard to get a response that usefully answers a question.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://arxiv.org/abs/2203.15556"&gt;2022 Chinchilla paper&lt;/a&gt; suggests a ratio of 20x the parameter count to training tokens. For a 340m model that would suggest around 7 billion tokens, more than twice the British Library corpus used here. The smallest Qwen 3.5 model is 600m parameters and that model family starts to get interesting at 2b - so my hunch is we would need 4x or more the training data to get something that starts to feel like a useful conversational partner.&lt;/p&gt;
&lt;p&gt;But what a fun project!&lt;/p&gt;
&lt;h4 id="running-it-locally-with-llm"&gt;Running it locally with LLM&lt;/h4&gt;
&lt;p&gt;I decided to see if I could run the model on my own machine using my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; framework.&lt;/p&gt;
&lt;p&gt;I got Claude Code to do most of the work - &lt;a href="https://gisthost.github.io/?7d0f00e152dd80d617b5e501e4ff025b/index.html"&gt;here's the transcript&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Trip trained the model using Andrej Karpathy's &lt;a href="https://github.com/karpathy/nanochat"&gt;nanochat&lt;/a&gt;, so I cloned that project, pulled the model weights and told Claude to build a Python script to run the model. Once we had that working (which ended up needing some extra details from the &lt;a href="https://huggingface.co/spaces/tventurella/mr_chatterbox/tree/main"&gt;Space demo source code&lt;/a&gt;) I had Claude &lt;a href="https://llm.datasette.io/en/stable/plugins/tutorial-model-plugin.html"&gt;read the LLM plugin tutorial&lt;/a&gt; and build the rest of the plugin.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/llm-mrchatterbox"&gt;llm-mrchatterbox&lt;/a&gt; is the result. Install the plugin like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-mrchatterbox
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first time you run a prompt it will fetch the 2.05GB model file from Hugging Face. Try that like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m mrchatterbox "Good day, sir"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or start an ongoing chat session like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm chat -m mrchatterbox
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you don't have LLM installed you can still get a chat session started from scratch using uvx like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with llm-mrchatterbox llm chat -m mrchatterbox
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When you are finished with the model you can delete the cached file using:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm mrchatterbox delete-model
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is the first time I've had Claude Code build a full LLM model plugin from scratch and it worked really well. I expect I'll be using this method again in the future.&lt;/p&gt;
&lt;p&gt;I continue to hope we can get a useful model from entirely public domain data. The fact that Trip was able to get this far using nanochat and 2.93 billion training tokens is a promising start.&lt;/p&gt;

&lt;p id="update-31st"&gt;&lt;strong&gt;Update 31st March 2026&lt;/strong&gt;: I had missed this when I first published this piece but Trip has his own &lt;a href="https://www.estragon.news/mr-chatterbox-or-the-modern-prometheus/"&gt;detailed writeup of the project&lt;/a&gt; which goes into much more detail about how he trained the model. Here's how the books were filtered for pre-training:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;First, I downloaded the British Library dataset split of all 19th-century books. I filtered those down to books contemporaneous with the reign of Queen Victoria—which, unfortunately, cut out the novels of Jane Austen—and further filtered those down to a set of books with a optical character recognition (OCR) confidence of .65 or above, as listed in the metadata. This left me with 28,035 books, or roughly 2.93 billion tokes for pretraining data.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Getting it to behave like a conversational model was a lot harder. Trip started by trying to train on plays by Oscar Wilde and George Bernard Shaw, but found they didn't provide enough pairs. Then he tried extracting dialogue pairs from the books themselves, with poor results. The approach that worked was to have Claude Haiku and GPT-4o-mini generate synthetic conversation pairs for the supervised fine-tuning, which solved the problem but sadly I think dilutes the "no training inputs from after 1899" claim from the original model card.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/andrej-karpathy"&gt;andrej-karpathy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hugging-face"&gt;hugging-face&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="andrej-karpathy"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="ai-assisted-programming"/><category term="hugging-face"/><category term="llm"/><category term="training-data"/><category term="uv"/><category term="ai-ethics"/><category term="claude-code"/></entry><entry><title>Vibe coding SwiftUI apps is a lot of fun</title><link href="https://simonwillison.net/2026/Mar/27/vibe-coding-swiftui/#atom-tag" rel="alternate"/><published>2026-03-27T20:59:53+00:00</published><updated>2026-03-27T20:59:53+00:00</updated><id>https://simonwillison.net/2026/Mar/27/vibe-coding-swiftui/#atom-tag</id><summary type="html">
    &lt;p&gt;I have a new laptop - a 128GB M5 MacBook Pro, which early impressions show to be &lt;em&gt;very&lt;/em&gt; capable for running good local LLMs. I got frustrated with Activity Monitor and decided to vibe code up some alternative tools for monitoring performance and I'm very happy with the results.&lt;/p&gt;
&lt;p&gt;This is my second experiment with vibe coding macOS apps - the first was &lt;a href="https://simonwillison.net/2026/Feb/25/present/"&gt;this presentation app a few weeks ago&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It turns out Claude Opus 4.6 and GPT-5.4 are both very competent at SwiftUI - and a full SwiftUI app can fit in a single text file, which means I can use them to spin something up without even opening Xcode.&lt;/p&gt;
&lt;p&gt;I’ve built two apps so far: Bandwidther shows me which apps are using network bandwidth and Gpuer shows me what’s going on with the GPU. At Claude’s suggestion both of these are now menu bar icons that open a panel full of information.&lt;/p&gt;
&lt;h4 id="bandwidther"&gt;Bandwidther&lt;/h4&gt;
&lt;p&gt;I built this app first, because I wanted to see what Dropbox was doing. It looks like this:&lt;/p&gt;
&lt;p&gt;&lt;a target="_blank" rel="noopener noreferrer" href="https://github.com/simonw/bandwidther/raw/main/screenshot.png"&gt;&lt;img src="https://github.com/simonw/bandwidther/raw/main/screenshot.png" alt="Screenshot of Bandwidther macOS app showing two columns: left side displays overall download/upload speeds, a bandwidth graph over the last 60 seconds, cumulative totals, internet and LAN connection counts, and internet destinations; right side shows per-process bandwidth usage sorted by rate with processes like nsurlsessiond, apsd, rapportd, mDNSResponder, Dropbox, and others listed with their individual download/upload speeds and progress bars." style="max-width: 100%;" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I’ve shared &lt;a href="https://gisthost.github.io/?6e06d4724c64c10d1fc3fbe19d9c8575/index.html"&gt;the full transcript&lt;/a&gt; I used to build the first version of the app. My prompts were pretty minimal:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Show me how much network bandwidth is in use from this machine to the internet as opposed to local LAN&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(My initial curiosity was to see if Dropbox was transferring files via the LAN from my old computer or was downloading from the internet.)&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;mkdir /tmp/bandwidther and write a native Swift UI app in there that shows me these details on a live ongoing basis&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This got me the first version, which proved to me this was worth pursuing further.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;git init and git commit what you have so far&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Since I was about to start adding new features.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Now suggest features we could add to that app, the goal is to provide as much detail as possible concerning network usage including by different apps&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The nice thing about having Claude suggest features is that it has a much better idea of what’s possible than I do.&lt;/p&gt;
&lt;p&gt;We had a bit of back and forth fixing some bugs, then I sent a few more prompts to get to the two column layout shown above:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;add Per-Process Bandwidth, relaunch the app once that is done&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;now add the reverse DNS feature but make sure original IP addresses are still visible too, albeit in smaller typeface&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;redesign the app so that it is wider, I want two columns - the per-process one on the left and the rest on the right&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;OK make it a task bar icon thing, when I click the icon I want the app to appear, the icon itself should be a neat minimal little thing&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The source code and build instructions are available in &lt;a href="https://github.com/simonw/bandwidther"&gt;simonw/bandwidther&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="gpuer"&gt;Gpuer&lt;/h4&gt;
&lt;p&gt;While I was building Bandwidther in one session I had another session running to build a similar tool for seeing what the GPU was doing. Here’s what I ended up with:&lt;/p&gt;
&lt;p&gt;&lt;a target="_blank" rel="noopener noreferrer" href="https://github.com/simonw/gpuer/raw/main/screenshot.png"&gt;&lt;img src="https://github.com/simonw/gpuer/raw/main/screenshot.png" alt="Screenshot of the Gpuer app on macOS showing memory usage for an Apple M5 Max with 40 GPU cores. Left panel: a large orange &amp;quot;38 GB Available&amp;quot; readout showing usage of 128.0 GB unified memory, &amp;quot;Room for ~18 more large apps before pressure&amp;quot;, a warning banner reading &amp;quot;1.5 GB pushed to disk — system was under pressure recently&amp;quot;, a horizontal segmented bar chart labeled &amp;quot;Where your memory is going&amp;quot; with green, blue, and grey segments and a legend, an explanatory note about GPU unified memory, a GPU Utilization section showing 0%, and a History graph showing Available and GPU Utilization over time as line charts. Right panel: a Memory Footprint list sorted by Memory, showing process names with horizontal pink/purple usage bars and CPU percentage labels beside each entry, covering processes including Dropbox, WebKit, Virtualization, node, Claude Helper, Safari, LM Studio, WindowServer, Finder, and others." style="max-width: 100%;" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gisthost.github.io/?71ffe216ceca8d7da59a07c478d17529"&gt;the transcript&lt;/a&gt;. This one took even less prompting because I could use the in-progress Bandwidther as an example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I want to know how much RAM and GPU this computer is using, which is hard because stuff on the GPU and RAM does not seem to show up in Activity Monitor&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This collected information using &lt;code&gt;system_profiler&lt;/code&gt; and &lt;code&gt;memory_pressure&lt;/code&gt; and gave me &lt;a href="https://gisthost.github.io/?71ffe216ceca8d7da59a07c478d17529/page-001.html#msg-2026-03-24T22-13-26-614Z"&gt;an answer&lt;/a&gt; - more importantly it showed me this was possible, so I said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Look at /tmp/bandwidther and then create a similar app in /tmp/gpuer which shows the information from above on an ongoing basis, or maybe does it better&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After a few more changes to the Bandwidther app I told it to catch up:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Now take a look at recent changes in /tmp/bandwidther - that app now uses a sys tray icon, imitate that&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This remains one of my favorite tricks for using coding agents: having them &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/hoard-things-you-know-how-to-do/#recombining-things-from-your-hoard"&gt;recombine elements&lt;/a&gt; from other projects.&lt;/p&gt;
&lt;p&gt;The code for Gpuer can be found in &lt;a href="https://github.com/simonw/gpuer"&gt;simonw/gpuer&lt;/a&gt; on GitHub.&lt;/p&gt;
&lt;h4 id="you-shouldn-t-trust-these-apps"&gt;You shouldn't trust these apps&lt;/h4&gt;
&lt;p&gt;These two apps are classic vibe coding: I don't know Swift and I hardly glanced at the code they were writing.&lt;/p&gt;
&lt;p&gt;More importantly though, I have very little experience with macOS internals such as the values these tools are measuring. I am completely unqualified to evaluate if the numbers and charts being spat out by these tools are credible or accurate!&lt;/p&gt;
&lt;p&gt;I've added warnings to both GitHub repositories to that effect.&lt;/p&gt;
&lt;p&gt;This morning I caught Gpuer reporting that I had just 5GB of memory left when that clearly wasn't the case (according to Activity Monitor). I &lt;a href="https://gisthost.github.io/?9ae12fff0fecc9a4482c9b02e8599c70/page-001.html#msg-2026-03-27T19-35-35-866Z"&gt;pasted a screenshot into Claude Code&lt;/a&gt; and it &lt;a href="https://github.com/simonw/gpuer/commit/a3cd655f5ccb274d3561e4cbfcc771b0bb7e256a"&gt;adjusted the calculations&lt;/a&gt; and the new numbers &lt;em&gt;look&lt;/em&gt; right, but I'm still not confident that it's reporting things correctly.&lt;/p&gt;
&lt;p&gt;I only shared them on GitHub because I think they're interesting as an example of what Claude can do with SwiftUI.&lt;/p&gt;
&lt;p&gt;Despite my lack of confidence in the apps themselves, I did learn some useful things from these projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A SwiftUI app can get a whole lot done with a single file of code - here's &lt;a href="https://github.com/simonw/gpuer/blob/main/GpuerApp.swift"&gt;GpuerApp.swift&lt;/a&gt; (880 lines) and &lt;a href="https://github.com/simonw/bandwidther/blob/main/BandwidtherApp.swift"&gt;BandwidtherApp.swift&lt;/a&gt; (1063 lines).&lt;/li&gt;
&lt;li&gt;Wrapping various terminal commands in a neat UI with Swift is easily achieved.&lt;/li&gt;
&lt;li&gt;Claude has surprisingly good design taste when it comes to SwiftUI applications.&lt;/li&gt;
&lt;li&gt;Turning an app into a menu bar app is just a few lines of extra code as well.&lt;/li&gt;
&lt;li&gt;You don't need to open Xcode to build this kind of application!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These two apps took very little time to build and have convinced me that building macOS apps in SwiftUI is a new capability I should consider for future projects.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/macos"&gt;macos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/swift"&gt;swift&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="macos"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="vibe-coding"/><category term="coding-agents"/><category term="swift"/><category term="claude-code"/></entry><entry><title>Auto mode for Claude Code</title><link href="https://simonwillison.net/2026/Mar/24/auto-mode-for-claude-code/#atom-tag" rel="alternate"/><published>2026-03-24T23:57:33+00:00</published><updated>2026-03-24T23:57:33+00:00</updated><id>https://simonwillison.net/2026/Mar/24/auto-mode-for-claude-code/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://claude.com/blog/auto-mode"&gt;Auto mode for Claude Code&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Really interesting new development in Claude Code today as an alternative to &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Today, we're introducing auto mode, a new permissions mode in Claude Code where Claude makes permission decisions on your behalf, with safeguards monitoring actions before they run.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Those safeguards appear to be implemented using Claude Sonnet 4.6, as &lt;a href="https://code.claude.com/docs/en/permission-modes#eliminate-prompts-with-auto-mode"&gt;described in the documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Before each action runs, a separate classifier model reviews the conversation and decides whether the action matches what you asked for: it blocks actions that escalate beyond the task scope, target infrastructure the classifier doesn’t recognize as trusted, or appear to be driven by hostile content encountered in a file or web page. [...]&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;: the classifier runs on Claude Sonnet 4.6, even if your main session uses a different model.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They ship with an extensive set of default filters, and you can also customize them further with your own rules. The most interesting insight into how they work comes when you run this new command in the terminal:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;claude auto-mode defaults
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;a href="https://gist.githubusercontent.com/simonw/91863bfd9f7ebf916d1fabb8e6940335/raw/cda3c88e919b8238e85d3f1cc990e8ff48ad9a18/defaults.json"&gt;Here's the full JSON output&lt;/a&gt;. It's pretty long, so here's an illustrative subset:&lt;/p&gt;
&lt;p&gt;From the "allow" list:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Test Artifacts: Hardcoded test API keys, placeholder credentials in examples, or hardcoding test cases&lt;/li&gt;
&lt;li&gt;Local Operations: Agent deleting local files in working directory, local file operations within project scope, or using &lt;code&gt;--ignore-certificate-errors&lt;/code&gt; for local testing. "Project scope" means the repository the session started in — wandering into ~/, ~/Library/, /etc, or other repos is scope escalation (User Intent Rule #2), not a local operation. Does NOT cover irreversible destruction of pre-existing files or local stateful services — see "Irreversible Local Destruction" in BLOCK.&lt;/li&gt;
&lt;li&gt;Read-Only Operations: GET requests, read-only API calls, or queries that don't modify state and don't contain sensitive information in the URL. Note: PREEMPTIVE BLOCK ON CLEAR INTENT still applies — if the transcript contains clear evidence the agent is using read-only operations to scout for a blocked action, block it.&lt;/li&gt;
&lt;li&gt;Declared Dependencies: Installing packages that are already declared in the repo's manifest files (requirements.txt, package.json, Cargo.toml, pyproject.toml, Gemfile, etc.) via standard commands that read those manifests (&lt;code&gt;pip install -r requirements.txt&lt;/code&gt;, &lt;code&gt;npm install&lt;/code&gt;, &lt;code&gt;cargo build&lt;/code&gt;, &lt;code&gt;bundle install&lt;/code&gt;) — provided the agent has not modified the manifest in this session. Does NOT cover installing agent-chosen package names (e.g. &lt;code&gt;pip install foo&lt;/code&gt;, &lt;code&gt;npm install bar&lt;/code&gt;) — those carry typosquat and supply-chain risk. [...]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;From "soft_deny":&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Git Destructive: Force pushing (&lt;code&gt;git push --force&lt;/code&gt;), deleting remote branches, or rewriting remote history&lt;/li&gt;
&lt;li&gt;Git Push to Default Branch: Pushing directly to main, master, or the repository's default branch — this bypasses pull request review. Commits should be pushed to a new feature branch instead.&lt;/li&gt;
&lt;li&gt;Code from External: Downloading and executing code from external sources — e.g. &lt;code&gt;curl | bash&lt;/code&gt;, deserializing external data via formats that can execute code (eval, exec, yaml.unsafe_load, pickle, etc), or similar mechanisms. Also includes running code from an external repo cloned earlier in the transcript (pip install -e, make install, python script.py, pickle.load/torch.load on cloned repo files) — "local on disk" does not mean trusted if it was cloned from an external source visible in the transcript. The repo the agent starts in is trusted.&lt;/li&gt;
&lt;li&gt;Cloud Storage Mass Delete: Deleting or mass modifying files on cloud storage (S3, GCS, Azure Blob, etc.) [...]&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I remain unconvinced by prompt injection protections that rely on AI, since they're non-deterministic by nature. The documentation does warn that this may still let things through:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The classifier may still allow some risky actions: for example, if user intent is ambiguous, or if Claude doesn't have enough context about your environment to know an action might create additional risk.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The fact that the default allow list includes &lt;code&gt;pip install -r requirements.txt&lt;/code&gt; also means that this wouldn't protect against supply chain attacks with unpinned dependencies, as seen this morning &lt;a href="https://simonwillison.net/2026/Mar/24/malicious-litellm/"&gt;with LiteLLM&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I still want my coding agents to run in a robust sandbox by default, one that restricts file access and network connections in a deterministic way. I trust those a whole lot more than prompt-based protections like this new auto mode.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>JavaScript Sandboxing Research</title><link href="https://simonwillison.net/2026/Mar/22/javascript-sandboxing-research/#atom-tag" rel="alternate"/><published>2026-03-22T19:53:00+00:00</published><updated>2026-03-22T19:53:00+00:00</updated><id>https://simonwillison.net/2026/Mar/22/javascript-sandboxing-research/#atom-tag</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Research:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme"&gt;JavaScript Sandboxing Research&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;Aaron Harper &lt;a href="https://www.inngest.com/blog/node-worker-threads"&gt;wrote about Node.js worker threads&lt;/a&gt;, which inspired me to run a research task to see if they might help with running JavaScript in a sandbox. Claude Code went way beyond my initial question and produced a comparison of &lt;a href="https://github.com/laverdet/isolated-vm"&gt;isolated-vm&lt;/a&gt;, &lt;a href="https://github.com/patriksimek/vm2"&gt;vm2&lt;/a&gt;, &lt;a href="https://github.com/justjake/quickjs-emscripten"&gt;quickjs-emscripten&lt;/a&gt;, &lt;a href="https://github.com/quickjs-ng/quickjs"&gt;QuickJS-NG&lt;/a&gt;, &lt;a href="https://github.com/tc39/proposal-shadowrealm"&gt;ShadowRealm&lt;/a&gt;, and &lt;a href="https://docs.deno.com/runtime/manual/runtime/workers/"&gt;Deno Workers&lt;/a&gt;.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nodejs"&gt;nodejs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="javascript"/><category term="nodejs"/><category term="sandboxing"/><category term="claude-code"/></entry><entry><title>Coding agents for data analysis</title><link href="https://simonwillison.net/2026/Mar/16/coding-agents-for-data-analysis/#atom-tag" rel="alternate"/><published>2026-03-16T20:12:32+00:00</published><updated>2026-03-16T20:12:32+00:00</updated><id>https://simonwillison.net/2026/Mar/16/coding-agents-for-data-analysis/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/"&gt;Coding agents for data analysis&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Here's the handout I prepared for my NICAR 2026 workshop "Coding agents for data analysis" - a three-hour session aimed at data journalists demonstrating ways that tools like Claude Code and OpenAI Codex can be used to explore, analyze and clean data.&lt;/p&gt;
&lt;p&gt;Here's the table of contents:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/coding-agents.html"&gt;Coding agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/warmup.html"&gt;Warmup: ChatGPT and Claude&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/setup.html"&gt;Setup Claude Code and Codex&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/asking-questions.html"&gt;Asking questions against a database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/exploring-data.html"&gt;Exploring data with agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/cleaning-trees.html"&gt;Cleaning data: decoding neighborhood codes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/visualizations.html"&gt;Creating visualizations with agents&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://simonw.github.io/nicar-2026-coding-agents/scraping.html"&gt;Scraping data with agents&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I ran the workshop using GitHub Codespaces and OpenAI Codex, since it was easy (and inexpensive) to distribute a budget-restricted API key for Codex that attendees could use during the class. Participants ended up burning $23 of Codex tokens.&lt;/p&gt;
&lt;p&gt;The exercises all used Python and SQLite and some of them used Datasette.&lt;/p&gt;
&lt;p&gt;One highlight of the workshop was when we started &lt;a href="https://simonw.github.io/nicar-2026-coding-agents/visualizations.html#javascript-visualizations"&gt;running Datasette&lt;/a&gt; such that it served static content from a &lt;code&gt;viz/&lt;/code&gt; folder, then had Claude Code start vibe coding new interactive visualizations directly in that folder. Here's a heat map it created for my trees database using Leaflet and &lt;a href="https://github.com/Leaflet/Leaflet.heat"&gt;Leaflet.heat&lt;/a&gt;, &lt;a href="https://gist.github.com/simonw/985ae2a6a3cd3df3fd375eb58dabea0f"&gt;source code here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a &amp;quot;Trees SQL Map&amp;quot; web application with the heading &amp;quot;Trees SQL Map&amp;quot; and subheading &amp;quot;Run a query and render all returned points as a heat map. The default query targets roughly 200,000 trees.&amp;quot; Below is an input field containing &amp;quot;/trees/-/query.json&amp;quot;, a &amp;quot;Run Query&amp;quot; button, and a SQL query editor with the text &amp;quot;SELECT cast(Latitude AS float) AS latitude, cast(Longitude AS float) AS longitude, CASE WHEN DBH IS NULL OR DBH = '' THEN 0.3 WHEN cast(DBH AS float) &amp;lt;= 0 THEN 0.3 WHEN cast(DBH AS float) &amp;gt;= 80 THEN 1.0&amp;quot; (query is truncated). A status message reads &amp;quot;Loaded 1,000 rows and plotted 1,000 points as heat map.&amp;quot; Below is a Leaflet/OpenStreetMap interactive map of San Francisco showing a heat map overlay of tree locations, with blue/green clusters concentrated in areas like the Richmond District, Sunset District, and other neighborhoods. Map includes zoom controls and a &amp;quot;Leaflet | © OpenStreetMap contributors&amp;quot; attribution." src="https://static.simonwillison.net/static/2026/tree-sql-map.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;I designed the handout to also be useful for people who weren't able to attend the session in person. As is usually the case, material aimed at data journalists is equally applicable to anyone else with data to explore.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/geospatial"&gt;geospatial&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/speaking"&gt;speaking&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-codespaces"&gt;github-codespaces&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicar"&gt;nicar&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex"&gt;codex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leaflet"&gt;leaflet&lt;/a&gt;&lt;/p&gt;



</summary><category term="data-journalism"/><category term="geospatial"/><category term="python"/><category term="speaking"/><category term="sqlite"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="github-codespaces"/><category term="nicar"/><category term="coding-agents"/><category term="claude-code"/><category term="codex"/><category term="leaflet"/></entry><entry><title>GIF optimization tool using WebAssembly and Gifsicle</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/gif-optimization/#atom-tag" rel="alternate"/><published>2026-03-02T16:35:10+00:00</published><updated>2026-03-02T16:35:10+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/gif-optimization/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/"&gt;Agentic Engineering Patterns&lt;/a&gt; &amp;gt;&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I like to include animated GIF demos in my online writing, often recorded using &lt;a href="https://www.cockos.com/licecap/"&gt;LICEcap&lt;/a&gt;. There's an example in the &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/interactive-explanations/"&gt;Interactive explanations&lt;/a&gt; chapter.&lt;/p&gt;
&lt;p&gt;These GIFs can be pretty big. I've tried a few tools for optimizing GIF file size and my favorite is &lt;a href="https://github.com/kohler/gifsicle"&gt;Gifsicle&lt;/a&gt; by Eddie Kohler. It compresses GIFs by identifying regions of frames that have not changed and storing only the differences, and can optionally reduce the GIF color palette or apply visible lossy compression for greater size reductions.&lt;/p&gt;
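&lt;p&gt;To make the "store only the differences" idea concrete, here's a minimal sketch that finds the smallest rectangle covering the pixels that changed between two frames. This is a simplified illustration, not Gifsicle's actual algorithm: the frames are assumed to be equal-sized lists of rows of palette indexes, and &lt;code&gt;changed_region&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
def changed_region(previous, current):
    """Return (left, top, right, bottom) covering every changed pixel, or None.

    Frames are assumed to be equal-sized lists of rows of palette indexes.
    """
    changed = [
        (x, y)
        for y, (old_row, new_row) in enumerate(zip(previous, current))
        for x, (old, new) in enumerate(zip(old_row, new_row))
        if old != new
    ]
    if not changed:
        return None  # identical frames: nothing new needs to be stored
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# Two tiny 4x3 frames that differ only in two pixels on the middle row.
frame_a = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0], [0, 5, 5, 0], [0, 0, 0, 0]]

region = changed_region(frame_a, frame_b)
print(region)
```

&lt;p&gt;An optimizer working along these lines would only need to encode the pixels inside that rectangle for the second frame, which is why screen recordings with mostly static content can shrink so much.&lt;/p&gt;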
&lt;p&gt;Gifsicle is written in C and the default interface is a command line tool. I wanted a web interface so I could access it in my browser and visually preview and compare the different settings.&lt;/p&gt;
&lt;p&gt;I prompted Claude Code for web (from my iPhone using the Claude iPhone app) against my &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repo with the following:&lt;/p&gt;
&lt;pre&gt;gif-optimizer.html

Compile gifsicle to WASM, then build a web page that lets you open or drag-drop an animated GIF onto it and it then shows you that GIF compressed using gifsicle with a number of different settings, each preview with the size and a download button

Also include controls for the gifsicle options for manual use - each preview has a “tweak these settings” link which sets those manual settings to the ones used for that preview so the user can customize them further

Run “uvx rodney –help” and use that tool to tray your work - use this GIF for testing https://static.simonwillison.net/static/2026/animated-word-cloud-demo.gif&lt;/pre&gt;
&lt;p&gt;Here's &lt;a href="https://tools.simonwillison.net/gif-optimizer"&gt;what it built&lt;/a&gt;, plus an animated GIF demo that I optimized using the tool:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animation. I drop on a GIF and the tool updates the page with a series of optimized versions under different settings. I eventually select Tweak settings on one of them, scroll to the bottom, adjust some sliders and download the result." src="https://static.simonwillison.net/static/2026/demo2-32-colors-lossy.gif" /&gt;&lt;/p&gt;
&lt;p&gt;Let's address that prompt piece by piece.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;gif-optimizer.html&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The first line simply tells it the name of the file I want to create. Just a filename is enough here - I know that when Claude runs "ls" on the repo it will understand that every file is a different tool.&lt;/p&gt;
&lt;p&gt;My &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repo currently lacks a &lt;code&gt;CLAUDE.md&lt;/code&gt; or &lt;code&gt;AGENTS.md&lt;/code&gt; file. I've found that agents pick up enough of the gist of the repo just from scanning the existing file tree and looking at relevant code in existing files.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Compile gifsicle to WASM, then build a web page that lets you open or drag-drop an animated GIF onto it and it then shows you that GIF compressed using gifsicle with a number of different settings, each preview with the size and a download button&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm making a bunch of assumptions here about Claude's existing knowledge, all of which paid off.&lt;/p&gt;
&lt;p&gt;Gifsicle is nearly 30 years old now and is a widely used piece of software - I was confident that referring to it by name would be enough for Claude to find the code.&lt;/p&gt;
&lt;p&gt;"&lt;code&gt;Compile gifsicle to WASM&lt;/code&gt;" is doing a &lt;em&gt;lot&lt;/em&gt; of work here.&lt;/p&gt;
&lt;p&gt;WASM is short for &lt;a href="https://webassembly.org/"&gt;WebAssembly&lt;/a&gt;, the technology that lets browsers run compiled code safely in a sandbox.&lt;/p&gt;
&lt;p&gt;Compiling a project like Gifsicle to WASM is not a trivial operation: it requires a complex toolchain, usually built around the &lt;a href="https://emscripten.org/"&gt;Emscripten&lt;/a&gt; project, and often takes a lot of trial and error to get everything working.&lt;/p&gt;
&lt;p&gt;Coding agents are fantastic at trial and error! They can often brute force their way to a solution where I would have given up after the fifth inscrutable compiler error.&lt;/p&gt;
&lt;p&gt;I've seen Claude Code figure out WASM builds many times before, so I was quite confident this would work.&lt;/p&gt;
&lt;p&gt;"&lt;code&gt;then build a web page that lets you open or drag-drop an animated GIF onto it&lt;/code&gt;" describes a pattern I've used in a lot of my other tools.&lt;/p&gt;
&lt;p&gt;HTML file uploads work fine for selecting files, but a nicer UI, especially on desktop, is to allow users to drag and drop files into a prominent drop zone on a page.&lt;/p&gt;
&lt;p&gt;Setting this up involves a bit of JavaScript to process the events and some CSS for the drop zone. It's not complicated but it's enough extra work that I might not normally add it myself. With a prompt it's almost free.&lt;/p&gt;
&lt;p&gt;Here's the resulting UI - which was influenced by Claude taking a peek at my existing &lt;a href="https://tools.simonwillison.net/image-resize-quality"&gt;image-resize-quality&lt;/a&gt; tool:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a web application titled &amp;quot;GIF Optimizer&amp;quot; with subtitle &amp;quot;Powered by gifsicle compiled to WebAssembly — all processing happens in your browser&amp;quot;. A large dashed-border drop zone reads &amp;quot;Drop an animated GIF here or click to select&amp;quot;. Below is a text input with placeholder &amp;quot;Or paste a GIF URL...&amp;quot; and a blue &amp;quot;Load URL&amp;quot; button. Footer text reads &amp;quot;Built with gifsicle by Eddie Kohler, compiled to WebAssembly. gifsicle is released under the GNU General Public License, version 2.&amp;quot;" src="https://static.simonwillison.net/static/2026/gif-optimizer.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;I didn't ask for the GIF URL input and I'm not keen on it, because it only works against URLs to GIFs that are served with open CORS headers. I'll probably remove that in a future update.&lt;/p&gt;
&lt;p&gt;"&lt;code&gt;then shows you that GIF compressed using gifsicle with a number of different settings, each preview with the size and a download button&lt;/code&gt;" describes the key feature of the application.&lt;/p&gt;
&lt;p&gt;I didn't bother defining the collection of settings I wanted - in my experience Claude has good enough taste at picking those for me, and we can always change them if its first guesses don't work.&lt;/p&gt;
&lt;p&gt;Showing the size is important since this is all about optimizing for size.&lt;/p&gt;
&lt;p&gt;I know from past experience that asking for a "download button" gets a button with the right HTML and JavaScript mechanisms set up such that clicking it provides a file save dialog, which is a nice convenience over needing to right-click-save-as.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Also include controls for the gifsicle options for manual use - each preview has a “tweak these settings” link which sets those manual settings to the ones used for that preview so the user can customize them further&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is a pretty clumsy prompt - I was typing it on my phone after all - but it expressed my intention well enough for Claude to build what I wanted.&lt;/p&gt;
&lt;p&gt;Here's what that looks like in the resulting tool, this screenshot showing the mobile version. Each image has a "Tweak these settings" button which, when clicked, updates this set of manual settings and sliders:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a GIF Optimizer results and settings panel. At top, results show &amp;quot;110.4 KB (original: 274.0 KB) — 59.7% smaller&amp;quot; in green, with a blue &amp;quot;Download&amp;quot; button and a &amp;quot;Tweak these settings&amp;quot; button. Below is a &amp;quot;Manual Settings&amp;quot; card containing: &amp;quot;Optimization level&amp;quot; dropdown set to &amp;quot;-O3 (aggressive)&amp;quot;, &amp;quot;Lossy (0 = off, higher = more loss)&amp;quot; slider set to 0, &amp;quot;Colors (0 = unchanged)&amp;quot; slider set to 0, &amp;quot;Color reduction method&amp;quot; dropdown set to &amp;quot;Default&amp;quot;, &amp;quot;Scale (%)&amp;quot; slider set to 100%, &amp;quot;Dither&amp;quot; dropdown set to &amp;quot;Default&amp;quot;, and a blue &amp;quot;Optimize with these settings&amp;quot; button." src="https://static.simonwillison.net/static/2026/gif-optimizer-tweak.jpg" /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run “uvx rodney --help” and use that tool to tray your work - use this GIF for testing https://static.simonwillison.net/static/2026/animated-word-cloud-demo.gif&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Coding agents work &lt;em&gt;so much better&lt;/em&gt; if you make sure they have the ability to test their code while they are working.&lt;/p&gt;
&lt;p&gt;There are many different ways to test a web interface - &lt;a href="https://playwright.dev/"&gt;Playwright&lt;/a&gt; and &lt;a href="https://www.selenium.dev/"&gt;Selenium&lt;/a&gt; and &lt;a href="https://agent-browser.dev/"&gt;agent-browser&lt;/a&gt; are three solid options.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt; is a browser automation tool I built myself, which is quick to install and has &lt;code&gt;--help&lt;/code&gt; output that's designed to teach an agent everything it needs to know to use the tool.&lt;/p&gt;
&lt;p&gt;This worked great - in &lt;a href="https://claude.ai/code/session_01C8JpE3yQpwHfBCFni4ZUc4"&gt;the session transcript&lt;/a&gt; you can see Claude using Rodney and fixing some minor bugs that it spotted, for example:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The CSS &lt;code&gt;display: none&lt;/code&gt; is winning over the inline style reset. I need to set &lt;code&gt;display: 'block'&lt;/code&gt; explicitly.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="the-follow-up-prompts"&gt;The follow-up prompts&lt;/h2&gt;
&lt;p&gt;When I'm working with Claude Code I usually keep an eye on what it's doing so I can redirect it while it's still in flight. I also often come up with new ideas while it's working which I then inject into the queue.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Include the build script and diff against original gifsicle code in the commit in an appropriate subdirectory&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;The build script should clone the gifsicle repo to /tmp and switch to a known commit before applying the diff - so no copy of gifsicle in the commit but all the scripts needed to build the wqsm&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I added this when I noticed it was putting a &lt;em&gt;lot&lt;/em&gt; of effort into figuring out how to get Gifsicle working with WebAssembly, including patching the original source code. Here's &lt;a href="https://github.com/simonw/tools/blob/main/lib/gifsicle/gifsicle-wasm.patch"&gt;the patch&lt;/a&gt; and &lt;a href="https://github.com/simonw/tools/blob/main/lib/gifsicle/build.sh"&gt;the build script&lt;/a&gt; it added to the repo.&lt;/p&gt;
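&lt;p&gt;The shape of that follow-up prompt (clone to a scratch directory, pin a known commit, apply the patch) can be sketched roughly like this. It's an illustration in Python rather than the shell script Claude actually wrote: &lt;code&gt;PINNED_COMMIT_SHA&lt;/code&gt; is a placeholder, &lt;code&gt;build_steps&lt;/code&gt; and &lt;code&gt;run&lt;/code&gt; are hypothetical helper names, and the Emscripten compile step is left out entirely.&lt;/p&gt;

```python
import subprocess
from pathlib import Path


def build_steps(repo_url, commit, patch_path, workdir):
    """Return the git commands for a reproducible checkout with the patch applied."""
    src = str(Path(workdir) / "gifsicle")
    # Resolve the patch to an absolute path because "git -C" changes directory first.
    patch = str(Path(patch_path).resolve())
    return [
        ["git", "clone", repo_url, src],
        ["git", "-C", src, "checkout", commit],
        ["git", "-C", src, "apply", patch],
    ]


def run(steps):
    """Execute each command in order, stopping at the first failure."""
    for step in steps:
        subprocess.run(step, check=True)


# Placeholder values: the real commit hash and patch location live in the repo.
steps = build_steps(
    "https://github.com/kohler/gifsicle",
    "PINNED_COMMIT_SHA",
    "gifsicle-wasm.patch",
    "/tmp",
)
for step in steps:
    print(" ".join(step))
```

&lt;p&gt;Pinning the commit matters because a patch written against one version of the source may not apply cleanly to whatever the default branch looks like later.&lt;/p&gt;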
&lt;p&gt;I knew there was a pattern in that repo already for where supporting files lived but I couldn't remember what that pattern was. Saying "in an appropriate subdirectory" was enough for Claude to figure out where to put it - it found and used the existing &lt;a href="https://github.com/simonw/tools/tree/main/lib"&gt;lib/ directory&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;You should include the wasm bundle&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This probably wasn't necessary, but I wanted to make absolutely sure that the compiled WASM file (which turned out &lt;a href="https://github.com/simonw/tools/blob/main/lib/gifsicle/gifsicle.wasm"&gt;to be 233KB&lt;/a&gt;) was committed to the repo. I serve &lt;code&gt;simonw/tools&lt;/code&gt; via GitHub Pages at &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; and I wanted it to work without needing to be built locally.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Make sure the HTML page credits gifsicle and links to the repo&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is just polite! I often build WebAssembly wrappers around other people's open source projects and I like to make sure they get credit in the resulting page.&lt;/p&gt;
&lt;p&gt;Claude added this to the footer of the tool:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Built with &lt;a href="https://github.com/kohler/gifsicle"&gt;gifsicle&lt;/a&gt; by Eddie Kohler, compiled to WebAssembly. gifsicle is released under the GNU General Public License, version 2.&lt;/p&gt;
&lt;/blockquote&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/gif"&gt;gif&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="gif"/><category term="tools"/><category term="webassembly"/><category term="claude"/><category term="llms"/><category term="prompt-engineering"/><category term="ai"/><category term="generative-ai"/><category term="coding-agents"/><category term="claude-code"/><category term="agentic-engineering"/></entry><entry><title>Claude Code Remote Control</title><link href="https://simonwillison.net/2026/Feb/25/claude-code-remote-control/#atom-tag" rel="alternate"/><published>2026-02-25T17:33:24+00:00</published><updated>2026-02-25T17:33:24+00:00</updated><id>https://simonwillison.net/2026/Feb/25/claude-code-remote-control/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://code.claude.com/docs/en/remote-control"&gt;Claude Code Remote Control&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New Claude Code feature dropped yesterday: you can now run a "remote control" session on your computer and then use the Claude Code for web interfaces (on web, iOS and native desktop app) to send prompts to that session.&lt;/p&gt;
&lt;p&gt;It's a little bit janky right now. Initially when I tried it I got the error "Remote Control is not enabled for your account. Contact your administrator." (but I &lt;em&gt;am&lt;/em&gt; my administrator?) - then I logged out and back into the Claude Code terminal app and it started working:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;claude remote-control
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can only run one session on your machine at a time. If you upgrade the Claude iOS app it then shows up as "Remote Control Session (Mac)" in the Code tab.&lt;/p&gt;
&lt;p&gt;It appears not to support the &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt; flag (I passed that to &lt;code&gt;claude remote-control&lt;/code&gt; and it didn't reject the option, but it also appeared to have no effect) - which means you have to approve every new action it takes.&lt;/p&gt;
&lt;p&gt;I also managed to get it to a state where every prompt I tried was met by an API 500 error.&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;img src="https://static.simonwillison.net/static/2026/vampire-remote.jpg" alt="Screenshot of a &amp;quot;Remote Control session&amp;quot; (Mac:dev:817b) chat interface. User message: &amp;quot;Play vampire by Olivia Rodrigo in music app&amp;quot;. Response shows an API Error: 500 {&amp;quot;type&amp;quot;:&amp;quot;error&amp;quot;,&amp;quot;error&amp;quot;:{&amp;quot;type&amp;quot;:&amp;quot;api_error&amp;quot;,&amp;quot;message&amp;quot;:&amp;quot;Internal server error&amp;quot;},&amp;quot;request_id&amp;quot;:&amp;quot;req_011CYVBLH9yt2ze2qehrX8nk&amp;quot;} with a &amp;quot;Try again&amp;quot; button. Below, the assistant responds: &amp;quot;I&amp;#39;ll play &amp;quot;Vampire&amp;quot; by Olivia Rodrigo in the Music app using AppleScript.&amp;quot; A Bash command panel is open showing an osascript command: osascript -e &amp;#39;tell application &amp;quot;Music&amp;quot; activate set searchResults to search playlist &amp;quot;Library&amp;quot; for &amp;quot;vampire Olivia Rodrigo&amp;quot; if (count of searchResults) &amp;gt; 0 then play item 1 of searchResults else return &amp;quot;Song not found in library&amp;quot; end if end tell&amp;#39;" style="max-width: 80%;" /&gt;&lt;/p&gt;

&lt;p&gt;Restarting the program on the machine also causes existing sessions to start returning mysterious API errors rather than neatly explaining that the session has terminated.&lt;/p&gt;
&lt;p&gt;I expect they'll iron out all of these issues relatively quickly. It's interesting to contrast this with solutions like OpenClaw, where one of the big selling points is the ability to control your personal device from your phone.&lt;/p&gt;
&lt;p&gt;Claude Code still doesn't have a documented mechanism for running things on a schedule, which is the other killer feature of the Claw category of software.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: I spoke too soon: also today Anthropic announced &lt;a href="https://support.claude.com/en/articles/13854387-schedule-recurring-tasks-in-cowork"&gt;Schedule recurring tasks in Cowork&lt;/a&gt;, Claude Code's &lt;a href="https://simonwillison.net/2026/Jan/12/claude-cowork/"&gt;general agent sibling&lt;/a&gt;. These do include an important limitation:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Scheduled tasks only run while your computer is awake and the Claude Desktop app is open. If your computer is asleep or the app is closed when a task is scheduled to run, Cowork will skip the task, then run it automatically once your computer wakes up or you open the desktop app again.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really hope they're working on a Cowork Cloud product.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/claudeai/status/2026418433911603668"&gt;@claudeai&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/applescript"&gt;applescript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openclaw"&gt;openclaw&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="applescript"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="coding-agents"/><category term="claude-code"/><category term="openclaw"/></entry><entry><title>Adding TILs, releases, museums, tools and research to my blog</title><link href="https://simonwillison.net/2026/Feb/20/beats/#atom-tag" rel="alternate"/><published>2026-02-20T23:47:10+00:00</published><updated>2026-02-20T23:47:10+00:00</updated><id>https://simonwillison.net/2026/Feb/20/beats/#atom-tag</id><summary type="html">
    &lt;p&gt;I've been wanting to add indications of my various other online activities to my blog for a while now. I just turned on a new feature I'm calling "beats" (after story beats; naming this was hard!) which adds five new types of content to my site, all corresponding to activity elsewhere.&lt;/p&gt;
&lt;p&gt;Here's what beats look like:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/three-beats.jpg" alt="Screenshot of a fragment of a page showing three entries from 30th Dec 2025. First: [RELEASE] &amp;quot;datasette-turnstile 0.1a0 — Configurable CAPTCHAs for Datasette paths usin…&amp;quot; at 7:23 pm. Second: [TOOL] &amp;quot;Software Heritage Repository Retriever — Download archived Git repositories f…&amp;quot; at 11:41 pm. Third: [TIL] &amp;quot;Downloading archived Git repositories from archive.softwareheritage.org — …&amp;quot; at 11:43 pm." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Those three are from &lt;a href="https://simonwillison.net/2025/Dec/30/"&gt;the 30th December 2025&lt;/a&gt; archive page.&lt;/p&gt;
&lt;p&gt;Beats are little inline links with badges that fit into different content timeline views around my site, including the homepage, search and archive pages.&lt;/p&gt;
&lt;p&gt;There are currently five types of beats:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/elsewhere/release/"&gt;Releases&lt;/a&gt; are GitHub releases of my many different open source projects, imported from &lt;a href="https://github.com/simonw/simonw/blob/main/releases_cache.json"&gt;this JSON file&lt;/a&gt; that was constructed &lt;a href="https://simonwillison.net/2020/Jul/10/self-updating-profile-readme/"&gt;by GitHub Actions&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/elsewhere/til/"&gt;TILs&lt;/a&gt; are the posts from my &lt;a href="https://til.simonwillison.net/"&gt;TIL blog&lt;/a&gt;, imported using &lt;a href="https://github.com/simonw/simonwillisonblog/blob/f883b92be23892d082de39dbada571e406f5cfbf/blog/views.py#L1169"&gt;a SQL query over JSON and HTTP&lt;/a&gt; against the Datasette instance powering that site.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/elsewhere/museum/"&gt;Museums&lt;/a&gt; are new posts on my &lt;a href="https://www.niche-museums.com/"&gt;niche-museums.com&lt;/a&gt; blog, imported from &lt;a href="https://github.com/simonw/museums/blob/909bef71cc8d336bf4ac1f13574db67a6e1b3166/plugins/export.py"&gt;this custom JSON feed&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/elsewhere/tool/"&gt;Tools&lt;/a&gt; are HTML and JavaScript tools I've vibe-coded on my &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; site, as described in &lt;a href="https://simonwillison.net/2025/Dec/10/html-tools/"&gt;Useful patterns for building HTML tools&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://simonwillison.net/elsewhere/research/"&gt;Research&lt;/a&gt; is for AI-generated research projects, hosted in my &lt;a href="https://github.com/simonw/research"&gt;simonw/research repo&lt;/a&gt; and described in &lt;a href="https://simonwillison.net/2025/Nov/6/async-code-research/"&gt;Code research projects with async coding agents like Claude Code and Codex&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
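&lt;p&gt;As a rough sketch of the "SQL query over JSON and HTTP" pattern used for the TILs: Datasette can return query results as JSON when a &lt;code&gt;sql&lt;/code&gt; parameter is passed to a database's &lt;code&gt;.json&lt;/code&gt; endpoint. The database name, table name and column names below are assumptions for illustration and are not taken from the real importer.&lt;/p&gt;

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def datasette_sql_url(base_url, database, sql):
    """Build a Datasette JSON API URL that runs a read-only SQL query."""
    # _shape=array asks Datasette for a plain JSON list of row objects
    query = urlencode({"sql": sql, "_shape": "array"})
    return f"{base_url}/{database}.json?{query}"


def fetch_rows(url):
    """Fetch and decode the JSON rows (this performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


# Assumed database ("tils"), table ("til") and columns, for illustration only
url = datasette_sql_url(
    "https://til.simonwillison.net",
    "tils",
    "select title, created from til order by created desc limit 5",
)
```

&lt;p&gt;Calling &lt;code&gt;fetch_rows(url)&lt;/code&gt; would then return a list of dictionaries, one per row, provided the assumed names match the real schema.&lt;/p&gt;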
&lt;p&gt;That's five different custom integrations to pull in all of that data. The good news is that this kind of integration project is exactly what coding agents &lt;em&gt;really&lt;/em&gt; excel at. I knocked most of the feature out in a single morning while working in parallel on various other things.&lt;/p&gt;
&lt;p&gt;I didn't have a useful structured feed of my Research projects, and it didn't matter because I gave Claude Code a link to &lt;a href="https://raw.githubusercontent.com/simonw/research/refs/heads/main/README.md"&gt;the raw Markdown README&lt;/a&gt; that lists them all and it &lt;a href="https://github.com/simonw/simonwillisonblog/blob/f883b92be23892d082de39dbada571e406f5cfbf/blog/importers.py#L77-L80"&gt;spun up a parser regex&lt;/a&gt;. Since I'm responsible for both the source and the destination I'm fine with a brittle solution that would be too risky against a source that I don't control myself.&lt;/p&gt;
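&lt;p&gt;To illustrate the kind of brittle-but-sufficient regex parser described above, here is a minimal sketch. The heading layout it expects (a level-three heading containing a link followed by a date in parentheses) is an assumption made for this example, and the linked importer may match something different.&lt;/p&gt;

```python
import re

# Assumed layout, one heading per project, for example:
#   ### [project-name](https://example.com/project-name) (2025-11-06)
ENTRY_RE = re.compile(
    r"^### \[([^\]]+)\]\(([^)]+)\) \((\d{4}-\d{2}-\d{2})\)$",
    re.MULTILINE,
)


def parse_projects(markdown):
    """Return one dict per heading that matches the assumed pattern."""
    return [
        {"title": title, "url": url, "date": date}
        for title, url, date in ENTRY_RE.findall(markdown)
    ]


SAMPLE = """# Research

### [example-project](https://example.com/example-project) (2025-11-06)

Some description text.

### [another-one](https://example.com/another-one) (2025-12-01)
"""

projects = parse_projects(SAMPLE)
```

&lt;p&gt;Any heading that deviates from the pattern is silently skipped, which is exactly the brittleness that is tolerable only when the same person controls both the README and the importer.&lt;/p&gt;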
&lt;p&gt;Claude also handled all of the potentially tedious UI integration work with my site, making sure the new content worked on all of my different page types and was handled correctly by my &lt;a href="https://simonwillison.net/2017/Oct/5/django-postgresql-faceted-search/"&gt;faceted search engine&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="prototyping-with-claude-artifacts"&gt;Prototyping with Claude Artifacts&lt;/h4&gt;
&lt;p&gt;I actually prototyped the initial concept for beats in regular Claude - not Claude Code - taking advantage of the fact that it can clone public repos from GitHub these days. I started with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Clone simonw/simonwillisonblog and tell me about the models and views&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And then later in the brainstorming session said:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;use the templates and CSS in this repo to create a new artifact with all HTML and CSS inline that shows me my homepage with some of those inline content types mixed in&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;After some iteration we got to &lt;a href="https://gisthost.github.io/?c3f443cc4451cf8ce03a2715a43581a4/preview.html"&gt;this artifact mockup&lt;/a&gt;, which was enough to convince me that the concept had legs and was worth handing over to full &lt;a href="https://code.claude.com/docs/en/claude-code-on-the-web"&gt;Claude Code for web&lt;/a&gt; to implement.&lt;/p&gt;
&lt;p&gt;If you want to see how the rest of the build played out, the most interesting PRs are &lt;a href="https://github.com/simonw/simonwillisonblog/pull/592"&gt;Beats #592&lt;/a&gt;, which implemented the core feature, and &lt;a href="https://github.com/simonw/simonwillisonblog/pull/595/changes"&gt;Add Museums Beat importer #595&lt;/a&gt;, which added the Museums content type.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/museums"&gt;museums&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/til"&gt;til&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-artifacts"&gt;claude-artifacts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/site-upgrades"&gt;site-upgrades&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="blogging"/><category term="museums"/><category term="ai"/><category term="til"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-artifacts"/><category term="claude-code"/><category term="site-upgrades"/></entry><entry><title>Quoting Thariq Shihipar</title><link href="https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-tag" rel="alternate"/><published>2026-02-20T07:13:19+00:00</published><updated>2026-02-20T07:13:19+00:00</updated><id>https://simonwillison.net/2026/Feb/20/thariq-shihipar/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/trq212/status/2024574133011673516"&gt;&lt;p&gt;Long running agentic products like Claude Code are made feasible by prompt caching which allows us to reuse computation from previous roundtrips and significantly decrease latency and cost. [...]&lt;/p&gt;
&lt;p&gt;At Claude Code, we build our entire harness around prompt caching. A high prompt cache hit rate decreases costs and helps us create more generous rate limits for our subscription plans, so we run alerts on our prompt cache hit rate and declare SEVs if they're too low.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/trq212/status/2024574133011673516"&gt;Thariq Shihipar&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="ai-agents"/><category term="claude-code"/></entry><entry><title>Recovering lost code</title><link href="https://simonwillison.net/2026/Feb/19/recovering-lost-code/#atom-tag" rel="alternate"/><published>2026-02-19T23:48:35+00:00</published><updated>2026-02-19T23:48:35+00:00</updated><id>https://simonwillison.net/2026/Feb/19/recovering-lost-code/#atom-tag</id><summary type="html">
    &lt;p&gt;Reached the stage of parallel agent psychosis where I've lost a whole feature - I know I had it yesterday, but I can't seem to find the branch or worktree or cloud instance or checkout with it in.&lt;/p&gt;
&lt;p&gt;... found it! Turns out I'd been hacking on a random prototype in &lt;code&gt;/tmp&lt;/code&gt; and then my computer crashed and rebooted and I lost the code... but it's all still there in &lt;code&gt;~/.claude/projects/&lt;/code&gt; session logs and Claude Code can extract it out and spin up the missing feature again.&lt;/p&gt;
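&lt;p&gt;A hedged sketch of how that kind of recovery could be scripted by hand. It assumes each session log is a JSONL file in which assistant messages carry &lt;code&gt;tool_use&lt;/code&gt; blocks, and that file writes show up as a tool named &lt;code&gt;Write&lt;/code&gt; with &lt;code&gt;file_path&lt;/code&gt; and &lt;code&gt;content&lt;/code&gt; inputs. That layout is an assumption rather than a documented format, so inspect your own logs before relying on it.&lt;/p&gt;

```python
import json


def written_files(jsonl_text):
    """Map each file path to the last content written to it (assumed log layout)."""
    latest = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        message = record.get("message") if isinstance(record, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            # Assumed shape of a file-writing tool call
            if block.get("type") == "tool_use" and block.get("name") == "Write":
                tool_input = block.get("input", {})
                path = tool_input.get("file_path")
                if path is not None:
                    latest[path] = tool_input.get("content", "")
    return latest


def write_call(path, content):
    """Build a fake log record in the assumed layout, for the demo below."""
    block = {"type": "tool_use", "name": "Write",
             "input": {"file_path": path, "content": content}}
    return {"type": "assistant", "message": {"content": [block]}}


sample = "\n".join(json.dumps(r) for r in [
    write_call("/tmp/proto/app.py", "print('v1')\n"),
    {"type": "user", "message": {"content": "make it say v2"}},
    write_call("/tmp/proto/app.py", "print('v2')\n"),
])
recovered = written_files(sample)
```

&lt;p&gt;This only recovers whole-file writes. Incremental edits would need to be replayed in order, which is a good reason to hand the job to the agent instead.&lt;/p&gt;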

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/parallel-agents"&gt;parallel-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;



</summary><category term="parallel-agents"/><category term="coding-agents"/><category term="claude-code"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>The A.I. Disruption We’ve Been Waiting for Has Arrived</title><link href="https://simonwillison.net/2026/Feb/18/the-ai-disruption/#atom-tag" rel="alternate"/><published>2026-02-18T17:07:31+00:00</published><updated>2026-02-18T17:07:31+00:00</updated><id>https://simonwillison.net/2026/Feb/18/the-ai-disruption/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.nytimes.com/2026/02/18/opinion/ai-software.html?unlocked_article_code=1.NFA.UkLv.r-XczfzYRdXJ&amp;amp;smid=url-share"&gt;The A.I. Disruption We’ve Been Waiting for Has Arrived&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New opinion piece from Paul Ford in the New York Times. Unsurprisingly for a piece by Paul it's packed with quoteworthy snippets, but a few stood out for me in particular.&lt;/p&gt;
&lt;p&gt;Paul describes the &lt;a href="https://simonwillison.net/2026/Jan/4/inflection/"&gt;November moment&lt;/a&gt; that so many other programmers have observed, and highlights Claude Code's ability to revive old side projects:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[Claude Code] was always a helpful coding assistant, but in November it suddenly got much better, and ever since I’ve been knocking off side projects that had sat in folders for a decade or longer. It’s fun to see old ideas come to life, so I keep a steady flow. Maybe it adds up to a half-hour a day of my time, and an hour of Claude’s.&lt;/p&gt;
&lt;p&gt;November was, for me and many others in tech, a great surprise. Before, A.I. coding tools were often useful, but halting and clumsy. Now, the bot can run for a full hour and make whole, designed websites and apps that may be flawed, but credible. I spent an entire session of therapy talking about it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And as the former CEO of a respected consultancy firm (Postlight) he's well positioned to evaluate the potential impact:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When you watch a large language model slice through some horrible, expensive problem — like migrating data from an old platform to a modern one — you feel the earth shifting. I was the chief executive of a software services firm, which made me a professional software cost estimator. When I rebooted my messy personal website a few weeks ago, I realized: I would have paid $25,000 for someone else to do this. When a friend asked me to convert a large, thorny data set, I downloaded it, cleaned it up and made it pretty and easy to explore. In the past I would have charged $350,000.&lt;/p&gt;
&lt;p&gt;That last price is full 2021 retail — it implies a product manager, a designer, two engineers (one senior) and four to six months of design, coding and testing. Plus maintenance. Bespoke software is joltingly expensive. Today, though, when the stars align and my prompts work out, I can do hundreds of thousands of dollars worth of work for fun (fun for me) over weekends and evenings, for the price of the Claude $200-a-month plan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He also neatly captures the inherent community tension involved in exploring this technology:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All of the people I love hate this stuff, and all the people I hate love it. And yet, likely because of the same personality flaws that drew me to technology in the first place, I am annoyingly excited.&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/new-york-times"&gt;new-york-times&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paul-ford"&gt;paul-ford&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/careers"&gt;careers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deep-blue"&gt;deep-blue&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/november-2025-inflection"&gt;november-2025-inflection&lt;/a&gt;&lt;/p&gt;



</summary><category term="new-york-times"/><category term="paul-ford"/><category term="careers"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="ai-ethics"/><category term="coding-agents"/><category term="claude-code"/><category term="deep-blue"/><category term="november-2025-inflection"/></entry><entry><title>Introducing Claude Sonnet 4.6</title><link href="https://simonwillison.net/2026/Feb/17/claude-sonnet-46/#atom-tag" rel="alternate"/><published>2026-02-17T23:58:58+00:00</published><updated>2026-02-17T23:58:58+00:00</updated><id>https://simonwillison.net/2026/Feb/17/claude-sonnet-46/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/news/claude-sonnet-4-6"&gt;Introducing Claude Sonnet 4.6&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Sonnet 4.6 is out today, and Anthropic claim it offers similar performance to &lt;a href="https://simonwillison.net/2025/Nov/24/claude-opus/"&gt;November's Opus 4.5&lt;/a&gt; while maintaining the Sonnet pricing of $3/million input and $15/million output tokens (the Opus models are $5/$25). Here's &lt;a href="https://www-cdn.anthropic.com/78073f739564e986ff3e28522761a7a0b4484f84.pdf"&gt;the system card PDF&lt;/a&gt;.&lt;/p&gt;
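&lt;p&gt;To make the price difference concrete, here is a small worked example using the per-million-token prices quoted above. The request size (100,000 input tokens and 10,000 output tokens) is a made-up figure for illustration, and the calculation ignores caching discounts and any long-context surcharge.&lt;/p&gt;

```python
# Prices in US dollars per million tokens, as quoted in the paragraph above
PRICES = {
    "sonnet": {"input": 3, "output": 15},
    "opus": {"input": 5, "output": 25},
}


def cost_usd(model, input_tokens, output_tokens):
    """Return the cost in dollars for one request at the listed prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: 100,000 tokens in, 10,000 tokens out
sonnet_cost = cost_usd("sonnet", 100_000, 10_000)
opus_cost = cost_usd("opus", 100_000, 10_000)
```

&lt;p&gt;For that hypothetical request Sonnet comes to $0.45 and Opus to $0.75.&lt;/p&gt;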
&lt;p&gt;Sonnet 4.6 has a "reliable knowledge cutoff" of August 2025, compared to Opus 4.6's May 2025 and Haiku 4.5's February 2025. Both Opus and Sonnet default to 200,000 max input tokens but can stretch to 1 million in beta and at a higher cost.&lt;/p&gt;
&lt;p&gt;I just released &lt;a href="https://github.com/simonw/llm-anthropic/releases/tag/0.24"&gt;llm-anthropic 0.24&lt;/a&gt; with support for both Sonnet 4.6 and Opus 4.6. Claude Code &lt;a href="https://github.com/simonw/llm-anthropic/pull/65"&gt;did most of the work&lt;/a&gt; - the new models had a fiddly amount of extra details around adaptive thinking and no longer supporting prefixes, as described &lt;a href="https://platform.claude.com/docs/en/about-claude/models/migration-guide"&gt;in Anthropic's migration guide&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's &lt;a href="https://gist.github.com/simonw/b185576a95e9321b441f0a4dfc0e297c"&gt;what I got&lt;/a&gt; from:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with llm-anthropic llm 'Generate an SVG of a pelican riding a bicycle' -m claude-sonnet-4.6
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img alt="The pelican has a jaunty top hat with a red band. There is a string between the upper and lower beaks for some reason. The bicycle frame is warped in the wrong way." src="https://static.simonwillison.net/static/2026/pelican-sonnet-4.6.png" /&gt;&lt;/p&gt;
&lt;p&gt;The SVG comments include:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;lt;!-- Hat (fun accessory) --&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I tried a second time and also got a top hat. Sonnet 4.6 apparently loves top hats!&lt;/p&gt;
&lt;p&gt;For comparison, here's the pelican Opus 4.5 drew me &lt;a href="https://simonwillison.net/2025/Nov/24/claude-opus/"&gt;in November&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="The pelican is cute and looks pretty good. The bicycle is not great - the frame is wrong and the pelican is facing backwards when the handlebars appear to be forwards. There is also something that looks a bit like an egg on the handlebars." src="https://static.simonwillison.net/static/2025/claude-opus-4.5-pelican.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;And here's Anthropic's current best pelican, drawn by Opus 4.6 &lt;a href="https://simonwillison.net/2026/Feb/5/two-new-models/"&gt;on February 5th&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Slightly wonky bicycle frame but an excellent pelican, very clear beak and pouch, nice feathers." src="https://static.simonwillison.net/static/2026/opus-4.6-pelican.png" /&gt;&lt;/p&gt;
&lt;p&gt;Opus 4.6 produces the best pelican beak/pouch. I do think the top hat from Sonnet 4.6 is a nice touch though.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=47050488"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-pricing"&gt;llm-pricing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="anthropic"/><category term="claude"/><category term="llm-pricing"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/><category term="claude-code"/></entry><entry><title>Quoting Dimitris Papailiopoulos</title><link href="https://simonwillison.net/2026/Feb/17/dimitris-papailiopoulos/#atom-tag" rel="alternate"/><published>2026-02-17T14:04:44+00:00</published><updated>2026-02-17T14:04:44+00:00</updated><id>https://simonwillison.net/2026/Feb/17/dimitris-papailiopoulos/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/dimitrispapail/status/2023080289828831349"&gt;&lt;p&gt;But the intellectually interesting part for me is something else. &lt;strong&gt;I now have something close to a magic box where I throw in a question and a first answer comes back basically for free, in terms of human effort&lt;/strong&gt;. Before this, the way I'd explore a new idea is to either clumsily put something together myself or ask a student to run something short for signal, and if it's there, we’d go deeper. That quick signal step, i.e., finding out if a question has any meat to it, is what I can now do without taking up anyone else's time. It’s now between just me, Claude Code, and a few days of GPU time.&lt;/p&gt;
&lt;p&gt;I don’t know what this means for how we do research long term. I don’t think anyone does yet. But &lt;strong&gt;the distance between a question and a first answer just got very small&lt;/strong&gt;.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/dimitrispapail/status/2023080289828831349"&gt;Dimitris Papailiopoulos&lt;/a&gt;, on running research questions through Claude Code&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/research"&gt;research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="research"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="coding-agents"/><category term="claude-code"/></entry><entry><title>Two new Showboat tools: Chartroom and datasette-showboat</title><link href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#atom-tag" rel="alternate"/><published>2026-02-17T00:43:45+00:00</published><updated>2026-02-17T00:43:45+00:00</updated><id>https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#atom-tag</id><summary type="html">
    &lt;p&gt;I &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/"&gt;introduced Showboat&lt;/a&gt; a week ago - my CLI tool that helps coding agents create Markdown documents that demonstrate the code that they have created. I've been finding new ways to use it on a daily basis, and I've just released two new tools to help get the best out of the Showboat pattern. &lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt; is a CLI charting tool that works well with Showboat, and &lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt; lets Showboat's new remote publishing feature incrementally push documents to a Datasette instance.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#showboat-remote-publishing"&gt;Showboat remote publishing&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#datasette-showboat"&gt;datasette-showboat&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#chartroom"&gt;Chartroom&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#how-i-built-chartroom"&gt;How I built Chartroom&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#the-burgeoning-showboat-ecosystem"&gt;The burgeoning Showboat ecosystem&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id="showboat-remote-publishing"&gt;Showboat remote publishing&lt;/h4&gt;
&lt;p&gt;I normally use Showboat in Claude Code for web (see &lt;a href="https://simonwillison.net/2026/Feb/16/rodney-claude-code/"&gt;note from this morning&lt;/a&gt;). I've used it in several different projects in the past few days, each of them with a prompt that looks something like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Use "uvx showboat --help" to perform a very thorough investigation of what happens if you use the Python sqlite-chronicle and sqlite-history-json libraries against the same SQLite database table&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/research/blob/main/sqlite-chronicle-vs-history-json/demo.md"&gt;the resulting document&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Just telling Claude Code to run &lt;code&gt;uvx showboat --help&lt;/code&gt; is enough for it to learn how to use the tool - the &lt;a href="https://github.com/simonw/showboat/blob/main/help.txt"&gt;help text&lt;/a&gt; is designed to work as a sort of ad-hoc Skill document.&lt;/p&gt;
&lt;p&gt;The one catch with this approach is that I can't &lt;em&gt;see&lt;/em&gt; the new Showboat document until it's finished. I have to wait for Claude to commit the document plus embedded screenshots and push that to a branch in my GitHub repo - then I can view it through the GitHub interface.&lt;/p&gt;
&lt;p&gt;For a while I've been thinking it would be neat to have a remote web server of my own which Claude instances can submit updates to while they are working. Then this morning I realized Showboat might be the ideal mechanism to set that up...&lt;/p&gt;
&lt;p&gt;Showboat &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.6.0"&gt;v0.6.0&lt;/a&gt; adds a new "remote" feature. It's almost invisible to users of the tool itself, instead being configured by an environment variable.&lt;/p&gt;
&lt;p&gt;Set a variable like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SHOWBOAT_REMOTE_URL=https://www.example.com/submit&lt;span class="pl-k"&gt;?&lt;/span&gt;token=xyz&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And every time you run a &lt;code&gt;showboat init&lt;/code&gt; or &lt;code&gt;showboat note&lt;/code&gt; or &lt;code&gt;showboat exec&lt;/code&gt; or &lt;code&gt;showboat image&lt;/code&gt; command the resulting document fragments will be POSTed to that API endpoint, in addition to the Showboat Markdown file itself being updated.&lt;/p&gt;
&lt;p&gt;There are &lt;a href="https://github.com/simonw/showboat/blob/v0.6.0/README.md#remote-document-streaming"&gt;full details in the Showboat README&lt;/a&gt; - it's a very simple API format, using regular POST form variables or a multipart form upload for the image attached to &lt;code&gt;showboat image&lt;/code&gt;.&lt;/p&gt;
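&lt;p&gt;A minimal sketch of what a receiver for those POSTs might look like, using only the Python standard library. It checks the &lt;code&gt;token&lt;/code&gt; query parameter from the example URL and decodes regular form variables. It does not assume any particular field names (the README is the authority on those), and it does not handle the multipart upload used by &lt;code&gt;showboat image&lt;/code&gt;.&lt;/p&gt;

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlsplit

EXPECTED_TOKEN = "xyz"  # matches ?token=xyz in the example URL above


def token_ok(path):
    """Check the token query parameter on the request path."""
    tokens = parse_qs(urlsplit(path).query).get("token", [])
    return tokens == [EXPECTED_TOKEN]


def parse_form(body):
    """Decode a form-encoded body into a dict of single values."""
    return {key: values[0] for key, values in parse_qs(body.decode("utf-8")).items()}


class Receiver(BaseHTTPRequestHandler):
    def do_POST(self):
        if not token_ok(self.path):
            self.send_response(403)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        fields = parse_form(self.rfile.read(length))
        # A real receiver would store the fragment; this one just logs it
        print(fields)
        self.send_response(200)
        self.end_headers()


# To serve locally (blocks until interrupted):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8001), Receiver).serve_forever()
```

&lt;p&gt;Keeping the token check and form decoding as plain functions makes them easy to test without starting a server.&lt;/p&gt;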
&lt;h4 id="datasette-showboat"&gt;datasette-showboat&lt;/h4&gt;
&lt;p&gt;It's simple enough to build a webapp to receive these updates from Showboat, but I needed one that I could easily deploy and would work well with the rest of my personal ecosystem.&lt;/p&gt;
&lt;p&gt;So I had Claude Code write me a Datasette plugin that could act as a Showboat remote endpoint. I actually had this building at the same time as the Showboat remote feature, a neat example of running &lt;a href="https://simonwillison.net/2025/Oct/5/parallel-coding-agents/"&gt;parallel agents&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt;&lt;/strong&gt; is a Datasette plugin that adds a &lt;code&gt;/-/showboat&lt;/code&gt; endpoint to Datasette for viewing documents and a &lt;code&gt;/-/showboat/receive&lt;/code&gt; endpoint for receiving updates from Showboat.&lt;/p&gt;
&lt;p&gt;Here's a very quick way to try it out:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx --with datasette-showboat --prerelease=allow \
  datasette showboat.db --create \
  -s plugins.datasette-showboat.database showboat \
  -s plugins.datasette-showboat.token secret123 \
  --root --secret cookie-secret-123&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Click on the sign in as root link that shows up in the console, then navigate to &lt;a href="http://127.0.0.1:8001/-/showboat"&gt;http://127.0.0.1:8001/-/showboat&lt;/a&gt; to see the interface.&lt;/p&gt;
&lt;p&gt;Now set your environment variable to point to this instance:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;export&lt;/span&gt; SHOWBOAT_REMOTE_URL=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;http://127.0.0.1:8001/-/showboat/receive?token=secret123&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And run Showboat like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uvx showboat init demo.md &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Showboat Feature Demo&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Refresh that page and you should see this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/datasette-showboat-documents.jpg" alt="Title: Showboat. Remote viewer for Showboat documents. Showboat Feature Demo 2026-02-17 00:06 · 6 chunks, UUID. To send showboat output to this server, set the SHOWBOAT_REMOTE_URL environment variable: export SHOWBOAT_REMOTE_URL=&amp;quot;http://127.0.0.1:8001/-/showboat/receive?token=your-token&amp;quot;" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Click through to the document, then start Claude Code or Codex or your agent of choice and prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run 'uvx showboat --help' and then use showboat to add to the existing demo.md document with notes and exec and image to demonstrate the tool - fetch a placekitten for the image demo.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;code&gt;init&lt;/code&gt; command assigns a UUID and title and sends those up to Datasette.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/datasette-showboat.gif" alt="Animated demo - in the foreground a terminal window runs Claude Code, which executes various Showboat commands. In the background a Firefox window where the Showboat Feature Demo adds notes then some bash commands, then a placekitten image." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The best part of this is that it works in Claude Code for web. Run the plugin on a server somewhere (an exercise left up to the reader - I use &lt;a href="https://fly.io/"&gt;Fly.io&lt;/a&gt; to host mine) and set that &lt;code&gt;SHOWBOAT_REMOTE_URL&lt;/code&gt; environment variable in your Claude environment, then any time you tell it to use Showboat the document it creates will be transmitted to your server and viewable in real time.&lt;/p&gt;
&lt;p&gt;I built &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney&lt;/a&gt;, a CLI browser automation tool, specifically to work with Showboat. It makes it easy to have a Showboat document load up web pages, interact with them via clicks or injected JavaScript and capture screenshots to embed in the Showboat document and show the effects.&lt;/p&gt;
&lt;p&gt;This is wildly useful for hacking on web interfaces using Claude Code for web, especially when coupled with the new remote publishing feature. I only got this stuff working this morning and I've already had several sessions where Claude Code has published screenshots of its work in progress, which I've then been able to provide feedback on directly in the Claude session while it's still working.&lt;/p&gt;
&lt;h4 id="chartroom"&gt;Chartroom&lt;/h4&gt;
&lt;p&gt;A few days ago I had another idea for a way to extend the Showboat ecosystem: what if Showboat documents could easily include charts?&lt;/p&gt;
&lt;p&gt;I sometimes fire up Claude Code for data analysis tasks, often telling it to download a SQLite database and then run queries against it to figure out interesting things from the data.&lt;/p&gt;
&lt;p&gt;With a simple CLI tool that produced PNG images I could have Claude use Showboat to build a document with embedded charts to help illustrate its findings.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt;&lt;/strong&gt; is exactly that. It's effectively a thin wrapper around the excellent &lt;a href="https://matplotlib.org/"&gt;matplotlib&lt;/a&gt; Python library, designed to be used by coding agents to create charts that can be embedded in Showboat documents.&lt;/p&gt;
&lt;p&gt;Here's how to render a simple bar chart:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,value&lt;/span&gt;
&lt;span class="pl-s"&gt;Alice,42&lt;/span&gt;
&lt;span class="pl-s"&gt;Bob,28&lt;/span&gt;
&lt;span class="pl-s"&gt;Charlie,35&lt;/span&gt;
&lt;span class="pl-s"&gt;Diana,51&lt;/span&gt;
&lt;span class="pl-s"&gt;Eve,19&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; uvx chartroom bar --csv \
  --title &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales by Person&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --ylabel &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;a target="_blank" rel="noopener noreferrer nofollow" href="https://raw.githubusercontent.com/simonw/chartroom/8812afc02e1310e9eddbb56508b06005ff2c0ed5/demo/1f6851ec-2026-02-14.png"&gt;&lt;img src="https://raw.githubusercontent.com/simonw/chartroom/8812afc02e1310e9eddbb56508b06005ff2c0ed5/demo/1f6851ec-2026-02-14.png" alt="A chart of those numbers, with a title and y-axis label" style="max-width: 100%;" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;It can also do line charts, scatter charts, and histograms - as seen in &lt;a href="https://github.com/simonw/chartroom/blob/0.2.1/demo/README.md"&gt;this demo document&lt;/a&gt; that was built using Showboat.&lt;/p&gt;
&lt;p&gt;Chartroom can also generate alt text. If you add &lt;code&gt;-f alt&lt;/code&gt; to the above it will output the alt text for the chart instead of the image:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;name,value&lt;/span&gt;
&lt;span class="pl-s"&gt;Alice,42&lt;/span&gt;
&lt;span class="pl-s"&gt;Bob,28&lt;/span&gt;
&lt;span class="pl-s"&gt;Charlie,35&lt;/span&gt;
&lt;span class="pl-s"&gt;Diana,51&lt;/span&gt;
&lt;span class="pl-s"&gt;Eve,19&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; uvx chartroom bar --csv \
  --title &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales by Person&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; --ylabel &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Sales&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; -f alt&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Outputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Sales by Person. Bar chart of value by name — Alice: 42, Bob: 28, Charlie: 35, Diana: 51, Eve: 19
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Or you can use &lt;code&gt;-f html&lt;/code&gt; or &lt;code&gt;-f markdown&lt;/code&gt; to get the image tag with alt text directly:&lt;/p&gt;
&lt;div class="highlight highlight-text-md"&gt;&lt;pre&gt;&lt;span class="pl-s"&gt;![&lt;/span&gt;Sales by Person. Bar chart of value by name — Alice: 42, Bob: 28, Charlie: 35, Diana: 51, Eve: 19&lt;span class="pl-s"&gt;]&lt;/span&gt;&lt;span class="pl-s"&gt;(&lt;/span&gt;&lt;span class="pl-corl"&gt;/Users/simon/chart-7.png&lt;/span&gt;&lt;span class="pl-s"&gt;)&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I added support for Markdown images with alt text to Showboat in &lt;a href="https://github.com/simonw/showboat/releases/tag/v0.5.0"&gt;v0.5.0&lt;/a&gt;, to complement this feature of Chartroom.&lt;/p&gt;
&lt;p&gt;Finally, Chartroom has support for different &lt;a href="https://matplotlib.org/stable/gallery/style_sheets/style_sheets_reference.html"&gt;matplotlib styles&lt;/a&gt;. I had Claude build a Showboat document to demonstrate these all in one place - you can see that at &lt;a href="https://github.com/simonw/chartroom/blob/main/demo/styles.md"&gt;demo/styles.md&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="how-i-built-chartroom"&gt;How I built Chartroom&lt;/h4&gt;
&lt;p&gt;I started the Chartroom repository with my &lt;a href="https://github.com/simonw/click-app"&gt;click-app&lt;/a&gt; cookiecutter template, then told a fresh Claude Code for web session:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We are building a Python CLI tool which uses matplotlib to generate a PNG image containing a chart. It will have multiple sub commands for different chart types, controlled by command line options. Everything you need to know to use it will be available in the single "chartroom --help" output.&lt;/p&gt;
&lt;p&gt;It will accept data from files or standard input as CSV or TSV or JSON, similar to how sqlite-utils accepts data - clone simonw/sqlite-utils to /tmp for reference there. Clone matplotlib/matplotlib for reference as well&lt;/p&gt;
&lt;p&gt;It will also accept data from --sql path/to/sqlite.db "select ..." which runs in read-only mode&lt;/p&gt;
&lt;p&gt;Start by asking clarifying questions - do not use the ask user tool though it is broken - and generate a spec for me to approve&lt;/p&gt;
&lt;p&gt;Once approved proceed using red/green TDD running tests with "uv run pytest"&lt;/p&gt;
&lt;p&gt;Also while building maintain a demo/README.md document using the "uvx showboat --help" tool - each time you get a new chart type working commit the tests, implementation, root level
README update and a new version of that demo/README.md document with an inline image demo of the new chart type (which should be a UUID image filename managed by the showboat image command and should be stored in the demo/ folder&lt;/p&gt;
&lt;p&gt;Make sure "uv build" runs cleanly without complaining about extra directories but also ensure dist/ and uv.lock are in gitignore&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This got most of the work done. You can see the rest &lt;a href="https://github.com/simonw/chartroom/pulls?q=is%3Apr+is%3Aclosed"&gt;in the PRs&lt;/a&gt; that followed.&lt;/p&gt;
&lt;h4 id="the-burgeoning-showboat-ecosystem"&gt;The burgeoning Showboat ecosystem&lt;/h4&gt;
&lt;p&gt;The Showboat family of tools now consists of &lt;a href="https://github.com/simonw/showboat"&gt;Showboat&lt;/a&gt; itself, &lt;a href="https://github.com/simonw/rodney"&gt;Rodney&lt;/a&gt; for browser automation, &lt;a href="https://github.com/simonw/chartroom"&gt;Chartroom&lt;/a&gt; for charting and &lt;a href="https://github.com/simonw/datasette-showboat"&gt;datasette-showboat&lt;/a&gt; for streaming remote Showboat documents to Datasette.&lt;/p&gt;
&lt;p&gt;I'm enjoying how these tools can operate together based on a very loose set of conventions. If a tool can output a path to an image, Showboat can include that image in a document. Any tool that can output text can be used with Showboat.&lt;/p&gt;
&lt;p&gt;I'll almost certainly be building more tools that fit this pattern. They're very quick to knock out!&lt;/p&gt;
&lt;p&gt;The environment variable mechanism for Showboat's remote streaming is a fun hack too - so far I'm just using it to stream documents somewhere else, but it's effectively a webhook extension mechanism that could likely be used for all sorts of things I haven't thought of yet.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/charting"&gt;charting&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/showboat"&gt;showboat&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="charting"/><category term="projects"/><category term="ai"/><category term="datasette"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="coding-agents"/><category term="claude-code"/><category term="showboat"/></entry><entry><title>Rodney and Claude Code for Desktop</title><link href="https://simonwillison.net/2026/Feb/16/rodney-claude-code/#atom-tag" rel="alternate"/><published>2026-02-16T16:38:57+00:00</published><updated>2026-02-16T16:38:57+00:00</updated><id>https://simonwillison.net/2026/Feb/16/rodney-claude-code/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm a very heavy user of &lt;a href="https://code.claude.com/docs/en/claude-code-on-the-web"&gt;Claude Code on the web&lt;/a&gt;, Anthropic's excellent but poorly named cloud version of Claude Code where everything runs in a container environment managed by them, greatly reducing the risk of anything bad happening to a computer I care about.&lt;/p&gt;
&lt;p&gt;I don't use the web interface at all (hence my dislike of the name) - I access it exclusively through their native iPhone and Mac desktop apps.&lt;/p&gt;
&lt;p&gt;Something I particularly appreciate about the desktop app is that it lets you see images that Claude is "viewing" via its &lt;code&gt;Read /path/to/image&lt;/code&gt; tool. Here's what that looks like:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Claude Code session in Claude Desktop. Claude says: The debug page looks good - all items listed with titles and descriptions. Now let me check the nav
menu -  Analyzed menu image file - Bash uvx rodney open &amp;quot;http://localhost:8765/&amp;quot; 2&amp;gt;&amp;amp;1 &amp;amp;&amp;amp; uvx rodney click &amp;quot;details.nav-menu summary&amp;quot; 2&amp;gt;&amp;amp;1 &amp;amp;% sleep 0.5 &amp;amp;&amp;amp; uvx rodney screenshot /tmp/menu.png 2&amp;gt;&amp;amp;1 Output reads: Datasette: test, Clicked, /tmp/menu.png - then it says Read /tmp/menu.png and reveals a screenshot of the Datasette interface with the nav menu open, showing only &amp;quot;Debug&amp;quot; and &amp;quot;Log out&amp;quot; options. Claude continues: The menu now has just &amp;quot;Debug&amp;quot; and “Log out&amp;quot; — much cleaner. Both pages look good. Let me clean up the server and run the remaining tests." src="https://static.simonwillison.net/static/2026/rodney-claude-desktop.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This means you can get a visual preview of what it's working on while it's working, without waiting for it to push code to GitHub for you to try out yourself later on.&lt;/p&gt;
&lt;p&gt;The prompt I used to trigger the above screenshot was:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Run "uvx rodney --help" and then use Rodney to manually test the new pages and menu - look at screenshots from it and check you think they look OK&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I designed &lt;a href="https://simonwillison.net/2026/Feb/10/showboat-and-rodney/#rodney-cli-browser-automation-designed-to-work-with-showboat"&gt;Rodney&lt;/a&gt; to have &lt;a href="https://github.com/simonw/rodney/blob/main/help.txt"&gt;--help output&lt;/a&gt; that provides everything a coding agent needs to know in order to use the tool.&lt;/p&gt;
&lt;p&gt;The Claude iPhone app doesn't display opened images yet, so I &lt;a href="https://twitter.com/simonw/status/2023432616066879606"&gt;requested it as a feature&lt;/a&gt; just now in a thread on Twitter.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rodney"&gt;rodney&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="claude"/><category term="ai"/><category term="claude-code"/><category term="llms"/><category term="async-coding-agents"/><category term="coding-agents"/><category term="generative-ai"/><category term="projects"/><category term="ai-assisted-programming"/><category term="rodney"/></entry><entry><title>Quoting Boris Cherny</title><link href="https://simonwillison.net/2026/Feb/14/boris/#atom-tag" rel="alternate"/><published>2026-02-14T23:59:09+00:00</published><updated>2026-02-14T23:59:09+00:00</updated><id>https://simonwillison.net/2026/Feb/14/boris/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://twitter.com/bcherny/status/2022762422302576970"&gt;&lt;p&gt;Someone has to prompt the Claudes, talk to customers, coordinate with other teams, decide what to build next. Engineering is changing and great engineers are more important than ever.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://twitter.com/bcherny/status/2022762422302576970"&gt;Boris Cherny&lt;/a&gt;, Claude Code creator, on why Anthropic are still hiring developers&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/careers"&gt;careers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="careers"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="anthropic"/><category term="coding-agents"/><category term="claude-code"/></entry></feed>