<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: disclosures</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/disclosures.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-10-28T17:17:44+00:00</updated><author><name>Simon Willison</name></author><entry><title>Hacking the WiFi-enabled color screen GitHub Universe conference badge</title><link href="https://simonwillison.net/2025/Oct/28/github-universe-badge/#atom-tag" rel="alternate"/><published>2025-10-28T17:17:44+00:00</published><updated>2025-10-28T17:17:44+00:00</updated><id>https://simonwillison.net/2025/Oct/28/github-universe-badge/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm at &lt;a href="https://githubuniverse.com/"&gt;GitHub Universe&lt;/a&gt; this week (thanks to a free ticket from Microsoft). Yesterday I picked up my conference badge... which incorporates a &lt;s&gt;full Raspberry Pi&lt;/s&gt; Raspberry Pi Pico microcontroller with a battery, color screen, WiFi and bluetooth.&lt;/p&gt;
&lt;p&gt;GitHub Universe has a tradition of hackable conference badges - the badge last year had an eInk display. This year's is a huge upgrade though - a color screen and WiFi connection make this thing a genuinely useful little computer!&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/gitub-universe-badge.jpg" alt="Photo of the badge - it has a color screen with six app icons" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The only thing it's missing is a keyboard - the device instead provides five buttons total - Up, Down, A, B, C. It might be possible to get a bluetooth keyboard to work though I'll believe that when I see it - there's not a lot of space on this device for a keyboard driver.&lt;/p&gt;
&lt;p&gt;Everything is written using MicroPython, and the device is designed to be hackable: connect it to a laptop with a USB-C cable and you can start modifying the code directly on the device.&lt;/p&gt;
&lt;h4 id="getting-setup-with-the-badge"&gt;Getting set up with the badge&lt;/h4&gt;
&lt;p&gt;Out of the box the badge will play an opening animation (implemented as a sequence of PNG image frames) and then show a home screen with six app icons.&lt;/p&gt;
&lt;p&gt;The default apps are mostly neat Octocat-themed demos: a flappy-bird clone, a tamagotchi-style pet, a drawing app that works like an etch-a-sketch, an IR scavenger hunt for the conference venue itself (this thing has an IR sensor too!), and a gallery app showing some images.&lt;/p&gt;
&lt;p&gt;The sixth app is a badge app. This will show your GitHub profile image and some basic stats, but will only work if you dig out a USB-C cable and make some edits to the files on the badge directly.&lt;/p&gt;
&lt;p&gt;I did this on a Mac. I plugged a USB-C cable into the badge which caused macOS to treat it as an attached drive volume. In that drive are several files including &lt;code&gt;secrets.py&lt;/code&gt;. Open that up, confirm the WiFi details are correct and add your GitHub username. The file should look like this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-c1"&gt;WIFI_SSID&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"..."&lt;/span&gt;
&lt;span class="pl-c1"&gt;WIFI_PASSWORD&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"..."&lt;/span&gt;
&lt;span class="pl-c1"&gt;GITHUB_USERNAME&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"simonw"&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;The badge comes with the SSID and password for the GitHub Universe WiFi network pre-populated.&lt;/p&gt;
&lt;p&gt;That's it! Unmount the disk, hit the reboot button on the back of the badge and when it comes back up again the badge app should look something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-profile.jpg" alt="Badge shows my GitHub avatar, plus 10,947 followers, 4,083 contribs, 893 repos" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="building-your-own-apps"&gt;Building your own apps&lt;/h4&gt;
&lt;p&gt;Here's &lt;a href="https://badger.github.io/"&gt;the official documentation&lt;/a&gt; for building software for the badge.&lt;/p&gt;
&lt;p&gt;When I got mine yesterday the official repo had not yet been updated, so I had to figure this out myself.&lt;/p&gt;
&lt;p&gt;I copied all of the code across to my laptop, added it to a Git repo and then fired up Claude Code and told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Investigate this code and add a detailed README&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/github-universe-2025-badge/blob/15773c7a53275e7836216c3aa9a8a781c06f7859/README.md"&gt;the result&lt;/a&gt;, which was really useful for getting a start on understanding how it all worked.&lt;/p&gt;
&lt;p&gt;Each of the six default apps lives in a &lt;code&gt;apps/&lt;/code&gt; folder, for example &lt;a href="https://github.com/simonw/github-universe-2025-badge/tree/main/apps/sketch"&gt;apps/sketch/&lt;/a&gt; for the sketching app.&lt;/p&gt;
&lt;p&gt;There's also a menu app which powers the home screen. That lives in &lt;a href="https://github.com/simonw/github-universe-2025-badge/tree/main/apps/menu"&gt;apps/menu/&lt;/a&gt;. You can edit code in here to add new apps that you create to that screen.&lt;/p&gt;
&lt;p&gt;I told Claude:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Add a new app to it available from the menu which shows network status and other useful debug info about the machine it is running on&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This was a bit of a long-shot, but it totally worked!&lt;/p&gt;
&lt;p&gt;The first version had an error:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-error.jpg" alt="A stacktrace! file badgeware.py line 510 has a list index out of range error." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;I OCRd that photo (with the Apple Photos app) and pasted the message into Claude Code and it fixed the problem.&lt;/p&gt;
&lt;p&gt;This almost worked... but the addition of a seventh icon to the 2x3 grid meant that you could select the icon but it didn't scroll into view. I had Claude &lt;a href="https://github.com/simonw/github-universe-2025-badge/commit/2a60f75db101dc1dc7568ff466ad5c97dc86b336"&gt;fix that for me too&lt;/a&gt;.&lt;/p&gt;
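&lt;p&gt;As a rough illustration of the scrolling fix described above, here is a minimal Python sketch of a scroll-into-view calculation. It is not the code from the linked commit: the function name &lt;code&gt;scroll_to_show&lt;/code&gt;, the constants, and the assumption that the grid is three icons wide with two rows visible at once are all hypothetical.&lt;/p&gt;

```python
# Hypothetical sketch, not the badge's actual menu code.
COLUMNS = 3        # assumed: three icons per row
VISIBLE_ROWS = 2   # assumed: two rows fit on screen at once


def scroll_to_show(selected, first_visible_row,
                   columns=COLUMNS, visible_rows=VISIBLE_ROWS):
    """Return the first visible row that keeps the selected icon on screen."""
    row = selected // columns
    # If the selection is above the window, scroll up to its row.
    top = min(first_visible_row, row)
    # If the selection is below the window, scroll down just far enough
    # that its row becomes the bottom visible row.
    return max(top, row - visible_rows + 1)


# The seventh icon has index 6 and sits on the third row (row index 2),
# so the window has to move down by one row.
print(scroll_to_show(6, 0))
```

&lt;p&gt;The menu would then draw only the rows from the returned value onwards, so moving the selection back to the first icon scrolls the grid back to the top.&lt;/p&gt;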
&lt;p&gt;Here's the code for &lt;a href="https://github.com/simonw/github-universe-2025-badge/blob/main/apps/debug/__init__.py"&gt;apps/debug/__init__.py&lt;/a&gt;, and &lt;a href="https://gistpreview.github.io/?276d3e0c6566ddbc93adc7020ef6b439"&gt;the full Claude Code transcript&lt;/a&gt; created using my terminal-to-HTML app &lt;a href="https://simonwillison.net/2025/Oct/23/claude-code-for-web-video/"&gt;described here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here are the four screens of the debug app:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-debug-network.jpg" alt="Network info, showing WiFi network details and IP address" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-debug-storage.jpg" alt="Storage screen, it has 1MB total, 72KB used. Usage 7%. CMD is /system/apps/debug" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-debug-system.jpg" alt="System: Platform rp2, Python 1.26.0, CPU freq 200MHz, Uptime 13m46s" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-debug-memory.jpg" alt="Memory info - 100KB used, 241KB total, and a usage bar. Press B to run GC." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="an-icon-editor"&gt;An icon editor&lt;/h4&gt;
&lt;p&gt;The icons used on the app are 24x24 pixels. I decided it would be neat to have a web app that helps build those icons, including the ability to start by creating an icon from an emoji.&lt;/p&gt;
&lt;p&gt;I built this one &lt;a href="https://claude.ai/share/ca05bd58-859e-4ceb-b5c7-7428b348df3c"&gt;using Claude Artifacts&lt;/a&gt;. Here's the result, now available at &lt;a href="https://tools.simonwillison.net/icon-editor"&gt;tools.simonwillison.net/icon-editor&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/icon-editor.jpg" alt="Screenshot of the icon editor web app" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="and-a-repl"&gt;And a REPL&lt;/h4&gt;
&lt;p&gt;I noticed that last year's badge configuration app (which I can't find in &lt;a href="https://github.com/badger/badger.github.io/"&gt;github.com/badger/badger.github.io&lt;/a&gt; any more, I think they reset the history on that repo?) worked by talking to MicroPython over the Web Serial API from Chrome. Here's &lt;a href="https://github.com/simonw/2004-badger.github.io/blob/e3501d631a987bfbc12d93c9e35bf2c64e55d052/public/script.js#L305-L394"&gt;my archived copy of that code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Wouldn't it be useful to have a REPL in a web UI that you could use to interact with the badge directly over USB?&lt;/p&gt;
&lt;p&gt;I pointed Claude Code at a copy of that repo and told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Based on this build a new HTML with inline JavaScript page that uses WebUSB to simply test that the connection to the badge works and then list files on that device using the same mechanism&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It took a bit of poking (here's &lt;a href="https://gistpreview.github.io/?13d93a9e3b0ce1c921cd20303f2f1d84"&gt;the transcript&lt;/a&gt;) but the result is now live at &lt;a href="https://tools.simonwillison.net/badge-repl"&gt;tools.simonwillison.net/badge-repl&lt;/a&gt;. It only works in Chrome - you'll need to plug the badge in with a USB-C cable and then click "Connect to Badge".&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/badge-repl.jpg" alt="Badge Interactive REPL. Note: This tool requires the Web Serial API (Chrome/Edge on desktop). Connect to Badge, Disconnect and Clear Terminal buttons. Then a REPL interface displaying: Ready to connect. Click &amp;quot;Connect to Badge&amp;quot; to start.Traceback (most recent call last):ddae88e91.dirty on 2025-10-20; GitHub Badger with RP2350 Type &amp;quot;help()&amp;quot; for more information.  &amp;gt;&amp;gt;&amp;gt;  MicroPython v1.14-5485.gddae88e91.dirty on 2025-10-20; GitHub Badger with RP2350 Type &amp;quot;help()&amp;quot; for more information. &amp;gt;&amp;gt;&amp;gt; os.listdir() ['icon.py', 'ui.py', 'init.py', '._init.py', '._icon.py'] &amp;gt;&amp;gt;&amp;gt; machine.freq() 200000000 &amp;gt;&amp;gt;&amp;gt; gc.mem_free() 159696 &amp;gt;&amp;gt;&amp;gt; help() Welcome to MicroPython!" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="get-hacking"&gt;Get hacking&lt;/h4&gt;
&lt;p&gt;If you're a GitHub Universe attendee I hope this is useful. The official &lt;a href="https://badger.github.io/"&gt;badger.github.io&lt;/a&gt; site has plenty more details to help you get started.&lt;/p&gt;
&lt;p&gt;There isn't yet a way to get hold of this hardware outside of GitHub Universe - I know they had some supply chain challenges just getting enough badges for the conference attendees!&lt;/p&gt;
&lt;p&gt;It's a very neat device, built for GitHub by &lt;a href="https://www.pimoroni.com/"&gt;Pimoroni&lt;/a&gt; in Sheffield, UK. A version of this should become generally available in the future under the name "Pimoroni Tufty 2350".&lt;/p&gt;

&lt;h4 id="iphone-only"&gt;Update: Setup with iPhone only&lt;/h4&gt;

&lt;p&gt;If you don't have a laptop with you it's still possible to start hacking on the device using just an iPhone and a USB-C cable.&lt;/p&gt;

&lt;p&gt;Plug the badge into the phone, hit the reset button on the back twice to switch it into disk mode and open the iPhone Files app - the badge should appear as a mounted disk called BADGER.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://apps.apple.com/us/app/textastic-code-editor/id1049254261"&gt;Textastic&lt;/a&gt; to edit that &lt;code&gt;secrets.py&lt;/code&gt; and configure a new badge, then hit reset again to restart it.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hardware-hacking"&gt;hardware-hacking&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microsoft"&gt;microsoft&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/raspberry-pi"&gt;raspberry-pi&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/micropython"&gt;micropython&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/><category term="hardware-hacking"/><category term="microsoft"/><category term="ai"/><category term="generative-ai"/><category term="raspberry-pi"/><category term="llms"/><category term="claude-code"/><category term="disclosures"/><category term="micropython"/></entry><entry><title>Claude Code for web - a new asynchronous coding agent from Anthropic</title><link href="https://simonwillison.net/2025/Oct/20/claude-code-for-web/#atom-tag" rel="alternate"/><published>2025-10-20T19:43:15+00:00</published><updated>2025-10-20T19:43:15+00:00</updated><id>https://simonwillison.net/2025/Oct/20/claude-code-for-web/#atom-tag</id><summary type="html">
    &lt;p&gt;Anthropic launched Claude Code for web this morning. It's an &lt;a href="https://simonwillison.net/tags/async-coding-agents/"&gt;asynchronous coding agent&lt;/a&gt; - their answer to OpenAI's &lt;a href="https://simonwillison.net/2025/May/16/openai-codex/"&gt;Codex Cloud&lt;/a&gt; and &lt;a href="https://simonwillison.net/2025/May/19/jules/"&gt;Google's Jules&lt;/a&gt;, and has a very similar shape. I had preview access over the weekend and I've already seen some very promising results from it.&lt;/p&gt;
&lt;p&gt;It's available online at &lt;a href="https://claude.ai"&gt;claude.ai/code&lt;/a&gt; and shows up as a tab in the Claude iPhone app as well:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/claude-code-for-web.jpg" alt="Screenshot of Claude AI interface showing a conversation about updating a README file. The left sidebar shows &amp;quot;Claude&amp;quot; at the top, followed by navigation items: &amp;quot;Chats&amp;quot;, &amp;quot;Projects&amp;quot;, &amp;quot;Artifacts&amp;quot;, and &amp;quot;Code&amp;quot; (highlighted). Below that is &amp;quot;Starred&amp;quot; section listing several items with trash icons: &amp;quot;LLM&amp;quot;, &amp;quot;Python app&amp;quot;, &amp;quot;Check my post&amp;quot;, &amp;quot;Artifacts&amp;quot;, &amp;quot;Summarize&amp;quot;, and &amp;quot;Alt text writer&amp;quot;. The center panel shows a conversation list with items like &amp;quot;In progress&amp;quot;, &amp;quot;Run System C&amp;quot;, &amp;quot;Idle&amp;quot;, &amp;quot;Update Rese&amp;quot;, &amp;quot;Run Matplotl&amp;quot;, &amp;quot;Run Marketin&amp;quot;, &amp;quot;WebAssembl&amp;quot;, &amp;quot;Benchmark M&amp;quot;, &amp;quot;Build URL Qu&amp;quot;, and &amp;quot;Add Read-Or&amp;quot;. The right panel displays the active conversation titled &amp;quot;Update Research Project README&amp;quot; showing a task to update a GitHub README file at https://github.com/simonw/research/blob/main/deepseek-ocr-nvidia-spark/README.md, followed by Claude's response and command outputs showing file listings with timestamps from Oct 20 17:53." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;As far as I can tell it's their latest &lt;a href="https://www.claude.com/product/claude-code"&gt;Claude Code CLI&lt;/a&gt; app wrapped in a container (Anthropic are getting &lt;em&gt;really&lt;/em&gt; &lt;a href="https://simonwillison.net/2025/Sep/9/claude-code-interpreter/"&gt;good at containers&lt;/a&gt; these days) and configured to &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;. It appears to behave exactly the same as the CLI tool, and includes a neat "teleport" feature which can copy both the chat transcript and the edited files down to your local Claude Code CLI tool if you want to take over locally.&lt;/p&gt;
&lt;p&gt;It's very straightforward to use. You point Claude Code for web at a GitHub repository, select an environment (fully locked down, restricted to an allow-list of domains or configured to access domains of your choosing, including "*" for everything) and kick it off with a prompt.&lt;/p&gt;
&lt;p&gt;While it's running you can send it additional prompts which are queued up and executed after it completes its current step.&lt;/p&gt;
&lt;p&gt;Once it's done it opens a branch on your repo with its work and can optionally open a pull request.&lt;/p&gt;
&lt;h4 id="putting-claude-code-for-web-to-work"&gt;Putting Claude Code for web to work&lt;/h4&gt;
&lt;p&gt;Claude Code for web's PRs are indistinguishable from Claude Code CLI's, so Anthropic told me it was OK to submit those against public repos even during the private preview. Here are some examples from this weekend:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/tools/pull/73"&gt;Add query-string-stripper.html tool&lt;/a&gt; against my simonw/tools repo - a &lt;em&gt;very&lt;/em&gt; simple task that created (and deployed via GitHub Pages) this &lt;a href="https://tools.simonwillison.net/query-string-stripper"&gt;query-string-stripper&lt;/a&gt; tool.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/research/tree/main/minijinja-vs-jinja2"&gt;minijinja vs jinja2 Performance Benchmark&lt;/a&gt; - I ran this against a private repo and then copied the results here, so no PR. Here's &lt;a href="https://github.com/simonw/research/blob/main/minijinja-vs-jinja2/README.md#the-prompt"&gt;the prompt&lt;/a&gt; I used.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/simonw/research/pull/1"&gt;Update deepseek-ocr README to reflect successful project completion&lt;/a&gt; - I noticed that the README produced by Claude Code CLI for &lt;a href="https://simonwillison.net/2025/Oct/20/deepseek-ocr-claude-code/"&gt;this project&lt;/a&gt; was misleadingly out of date, so I had Claude Code for web fix the problem.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That second example is the most interesting. I saw &lt;a href="https://x.com/mitsuhiko/status/1980034078297514319"&gt;a tweet from Armin&lt;/a&gt; about his &lt;a href="https://github.com/mitsuhiko/minijinja"&gt;MiniJinja&lt;/a&gt; Rust template language &lt;a href="https://github.com/mitsuhiko/minijinja/pull/841"&gt;adding support&lt;/a&gt; for Python 3.14 free threading. I hadn't realized that project &lt;em&gt;had&lt;/em&gt; Python bindings, so I decided it would be interesting to see a quick performance comparison between MiniJinja and Jinja2.&lt;/p&gt;
&lt;p&gt;I ran Claude Code for web against a private repository with a completely open environment (&lt;code&gt;*&lt;/code&gt; in the allow-list) and prompted:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I’m interested in benchmarking the Python bindings for &lt;a href="https://github.com/mitsuhiko/minijinja"&gt;https://github.com/mitsuhiko/minijinja&lt;/a&gt; against the equivalente template using Python jinja2&lt;/p&gt;
&lt;p&gt;Design and implement a benchmark for this. It should use the latest main checkout of minijinja and the latest stable release of jinja2. The benchmark should use the uv version of Python 3.14 and should test both the regular 3.14 and the 3.14t free threaded version - so four scenarios total&lt;/p&gt;
&lt;p&gt;The benchmark should run against a reasonably complicated example of a template, using template inheritance and loops and such like In the PR include a shell script to run the entire benchmark, plus benchmark implantation, plus markdown file describing the benchmark and the results in detail, plus some illustrative charts created using matplotlib&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I entered this into the Claude iPhone app on my mobile keyboard, hence the typos.&lt;/p&gt;
&lt;p&gt;It churned away for a few minutes and gave me exactly what I asked for. Here's one of the &lt;a href="https://github.com/simonw/research/tree/main/minijinja-vs-jinja2/charts"&gt;four charts&lt;/a&gt; it created:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/minijinja-timeline.jpg" alt="Line chart titled &amp;quot;Rendering Time Across Iterations&amp;quot; showing rendering time in milliseconds (y-axis, ranging from approximately 1.0 to 2.5 ms) versus iteration number (x-axis, ranging from 0 to 200+). Four different lines represent different versions: minijinja (3.14t) shown as a solid blue line, jinja2 (3.14) as a solid orange line, minijinja (3.14) as a solid green line, and jinja2 (3.14t) as a dashed red line. The green line (minijinja 3.14) shows consistently higher rendering times with several prominent spikes reaching 2.5ms around iterations 25, 75, and 150. The other three lines show more stable, lower rendering times between 1.0-1.5ms with occasional fluctuations." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;(I was surprised to see MiniJinja out-performed by Jinja2, but I guess Jinja2 has had a decade of clever performance optimizations and doesn't need to deal with any extra overhead of calling out to Rust.)&lt;/p&gt;
&lt;p&gt;Note that I would likely have got the &lt;em&gt;exact same&lt;/em&gt; result running this prompt against Claude CLI on my laptop. The benefit of Claude Code for web is entirely in its convenience as a way of running these tasks in a hosted container managed by Anthropic, with a pleasant web and mobile UI layered over the top.&lt;/p&gt;
&lt;h4 id="anthropic-are-framing-this-as-part-of-their-sandboxing-strategy"&gt;Anthropic are framing this as part of their sandboxing strategy&lt;/h4&gt;
&lt;p&gt;It's interesting how Anthropic chose to announce this new feature: the product launch is buried half way down their new engineering blog post &lt;a href="https://www.anthropic.com/engineering/claude-code-sandboxing"&gt;Beyond permission prompts: making Claude Code more secure and autonomous&lt;/a&gt;, which starts like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Claude Code's new sandboxing features, a bash tool and Claude Code on the web, reduce permission prompts and increase user safety by enabling two boundaries: filesystem and network isolation.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm &lt;em&gt;very&lt;/em&gt; excited to hear that Claude Code CLI is taking sandboxing more seriously. I've not yet dug into the details of that - it looks like it's using seatbelt on macOS and &lt;a href="https://github.com/containers/bubblewrap"&gt;Bubblewrap&lt;/a&gt; on Linux.&lt;/p&gt;

&lt;p&gt;Anthropic released a new open source (Apache 2) library, &lt;a href="https://github.com/anthropic-experimental/sandbox-runtime"&gt;anthropic-experimental/sandbox-runtime&lt;/a&gt;, with their implementation of this so far.&lt;/p&gt;

&lt;p&gt;Filesystem sandboxing is relatively easy. The harder problem is network isolation, which they describe like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Network isolation&lt;/strong&gt;, by only allowing internet access through a unix domain socket connected to a proxy server running outside the sandbox. This proxy server enforces restrictions on the domains that a process can connect to, and handles user confirmation for newly requested domains. And if you’d like further-increased security, we also support customizing this proxy to enforce arbitrary rules on outgoing traffic.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is &lt;em&gt;crucial&lt;/em&gt; to protecting against both prompt injection and &lt;a href="https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/"&gt;lethal trifecta&lt;/a&gt; attacks. The best way to prevent lethal trifecta attacks is to cut off one of the three legs, and network isolation is how you remove the data exfiltration leg that allows successful attackers to steal your data.&lt;/p&gt;
&lt;p&gt;If you run Claude Code for web in "No network access" mode you have nothing to worry about.&lt;/p&gt;
&lt;p&gt;I'm a little bit nervous about their "Trusted network access" environment. It's intended to only allow access to domains relating to dependency installation, but the &lt;a href="https://docs.claude.com/en/docs/claude-code/claude-code-on-the-web#default-allowed-domains"&gt;default domain list&lt;/a&gt; has dozens of entries which makes me nervous about unintended exfiltration vectors sneaking through.&lt;/p&gt;
&lt;p&gt;You can also configure a custom environment with your own allow-list. I have one called "Everything" which allow-lists "*", because for projects like my MiniJinja/Jinja2 comparison above there are no secrets or source code involved that need protecting.&lt;/p&gt;
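&lt;p&gt;To make the allow-list idea concrete, here is a minimal Python sketch of the kind of check a filtering proxy could apply to each outgoing hostname. This is not Anthropic's implementation: the function name &lt;code&gt;is_allowed&lt;/code&gt; and the glob-style pattern syntax are assumptions for illustration, and the only pattern the text above confirms is "*" for everything.&lt;/p&gt;

```python
from fnmatch import fnmatchcase


def is_allowed(host, allow_list):
    """Return True if host matches any pattern in allow_list.

    Hypothetical sketch: patterns are shell-style globs, so "*" matches
    every host and "*.example.com" matches subdomains of example.com.
    """
    # Hostnames are case-insensitive and may carry a trailing dot.
    host = host.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in allow_list)


everything = ["*"]               # like the "Everything" environment
no_network = []                  # nothing is reachable
custom = ["pypi.org", "*.pypi.org"]  # illustrative custom allow-list

print(is_allowed("example.com", everything))
print(is_allowed("example.com", no_network))
print(is_allowed("files.pypi.org", custom))
print(is_allowed("pypi.org.attacker.example", custom))
```

&lt;p&gt;An empty list rejects every host, and each extra pattern widens what a compromised agent could reach, which is why a long default list deserves scrutiny.&lt;/p&gt;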
&lt;p&gt;I see Anthropic's focus on sandboxes as an acknowledgment that coding agents run in YOLO mode (&lt;code&gt;--dangerously-skip-permissions&lt;/code&gt; and the like) are &lt;em&gt;enormously&lt;/em&gt; more valuable and productive than agents where you have to approve their every step.&lt;/p&gt;
&lt;p&gt;The challenge is making it convenient and easy to run them safely. This kind of sandboxing is the only approach to safety that feels credible to me.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: A note on cost: I'm currently using a Claude "Max" plan that Anthropic gave me in order to test some of their features, so I don't have a good feeling for how much Claude Code would cost for these kinds of projects.&lt;/p&gt;

&lt;p&gt;From running &lt;code&gt;npx ccusage@latest&lt;/code&gt; (an &lt;a href="https://github.com/ryoppippi/ccusage"&gt;unofficial cost estimate tool&lt;/a&gt;) it looks like I'm using between $1 and $5 worth of daily Claude CLI invocations at the moment.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/armin-ronacher"&gt;armin-ronacher&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jinja"&gt;jinja&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lethal-trifecta"&gt;lethal-trifecta&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="armin-ronacher"/><category term="jinja"/><category term="sandboxing"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="coding-agents"/><category term="claude-code"/><category term="lethal-trifecta"/><category term="async-coding-agents"/><category term="disclosures"/></entry><entry><title>NVIDIA DGX Spark: great hardware, early days for the ecosystem</title><link href="https://simonwillison.net/2025/Oct/14/nvidia-dgx-spark/#atom-tag" rel="alternate"/><published>2025-10-14T23:36:21+00:00</published><updated>2025-10-14T23:36:21+00:00</updated><id>https://simonwillison.net/2025/Oct/14/nvidia-dgx-spark/#atom-tag</id><summary type="html">
    &lt;p&gt;NVIDIA sent me a preview unit of their new &lt;a href="https://www.nvidia.com/en-us/products/workstations/dgx-spark/"&gt;DGX Spark&lt;/a&gt; desktop "AI supercomputer". I've never had hardware to review before! You can consider this my first ever sponsored post if you like, but they did not pay me any cash and aside from an embargo date they did not request (nor would I grant) any editorial input into what I write about the device.&lt;/p&gt;
&lt;p&gt;The device retails for around $4,000. They officially go on sale tomorrow.&lt;/p&gt;
&lt;p&gt;First impressions are that this is a snazzy little computer. It's similar in size to a Mac mini, but with an exciting textured surface that feels refreshingly different and a little bit &lt;a href="https://www.indiewire.com/awards/industry/devs-cinematography-rob-hardy-alex-garland-1234583396/"&gt;science fiction&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/nvidia-spark.jpg" alt="A rectangular small computer, sitting horizontally on a box. It is about the width of a Mac Mini. It has a NVIDIA logo on  a reflective handle portion, then textured silver metal front, then another reflective handle at the other end. It's pretty and a bit weird looking. It sits on the box it came in, which has NVIDIA DGX Spark written on it in white text on green." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;There is a &lt;em&gt;very&lt;/em&gt; powerful machine tucked into that little box. Here are the specs, which I had Claude Code figure out for me by &lt;a href="https://gist.github.com/simonw/021651a14e6c5bf9876c9c4244ed6c2d"&gt;poking around on the device itself&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hardware Specifications&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Architecture: aarch64 (ARM64)&lt;/li&gt;
&lt;li&gt;CPU: 20 cores
&lt;ul&gt;
&lt;li&gt;10x Cortex-X925 (performance cores)&lt;/li&gt;
&lt;li&gt;10x Cortex-A725 (efficiency cores)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;RAM: 119 GB total (112 GB available) - &lt;em&gt;I’m not sure why Claude reported it differently here, the machine is listed as 128GB - it looks like a &lt;a href="https://news.ycombinator.com/item?id=45586776#45588329"&gt;128GB == 119GiB thing&lt;/a&gt; because Claude &lt;a href="https://gist.github.com/simonw/021651a14e6c5bf9876c9c4244ed6c2d#file-nvidia-claude-code-txt-L41"&gt;used free -h&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Storage: 3.7 TB (6% used, 3.3 TB available)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;GPU Specifications&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Model: NVIDIA GB10 (Blackwell architecture)&lt;/li&gt;
&lt;li&gt;Compute Capability: sm_121 (12.1)&lt;/li&gt;
&lt;li&gt;Memory: 119.68 GB&lt;/li&gt;
&lt;li&gt;Multi-processor Count: 48 streaming multiprocessors&lt;/li&gt;
&lt;li&gt;Architecture: Blackwell&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Short version: this is an ARM64 device with 128GB of memory that's available to both the GPU and the 20 CPU cores at the same time, strapped onto a 4TB NVMe SSD.&lt;/p&gt;
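&lt;p&gt;The 128GB versus 119GB discrepancy noted in the RAM line above is consistent with a decimal-to-binary unit conversion. Here is a small Python check, assuming the marketed 128GB means 128 x 10^9 bytes and that the reported figure is in GiB (2^30 bytes); the helper name &lt;code&gt;gb_to_gib&lt;/code&gt; is just for illustration.&lt;/p&gt;

```python
def gb_to_gib(gigabytes):
    """Convert decimal gigabytes (10**9 bytes) to gibibytes (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


# 128 GB of marketed memory expressed in binary units.
print(round(gb_to_gib(128), 1))  # 119.2
```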
&lt;p&gt;The Spark is firmly targeted at “AI researchers”. It’s designed for both training and running models.&lt;/p&gt;
&lt;h4 id="the-tricky-bit-cuda-on-arm64"&gt;The tricky bit: CUDA on ARM64&lt;/h4&gt;
&lt;p&gt;Until now almost all of my own model running experiments have taken place on a Mac. This has gotten far less painful over the past year and a half thanks to the amazing work of the &lt;a href="https://simonwillison.net/tags/mlx/"&gt;MLX&lt;/a&gt; team and community, but it's still left me deeply frustrated at my lack of access to the NVIDIA CUDA ecosystem. I've lost count of the number of libraries and tutorials which expect you to be able to use Hugging Face Transformers or PyTorch with CUDA, and leave you high and dry if you don't have an NVIDIA GPU to run things on.&lt;/p&gt;
&lt;p&gt;Armed (ha) with my new NVIDIA GPU I was excited to dive into this world that had long eluded me... only to find that there was another assumption baked into much of this software: x86 architecture for the rest of the machine.&lt;/p&gt;
&lt;p&gt;This resulted in all kinds of unexpected new traps for me to navigate. I eventually managed to get a PyTorch 2.7 wheel for CUDA on ARM, but failed to do so for 2.8. I'm not confident that's because the wheel itself is unavailable, but I'm finding the PyTorch ARM ecosystem pretty confusing to navigate.&lt;/p&gt;
&lt;p&gt;NVIDIA are trying to make this easier, with mixed success. A lot of my initial challenges got easier when I found their &lt;a href="https://docs.nvidia.com/dgx/dgx-spark/nvidia-container-runtime-for-docker.html"&gt;official Docker container&lt;/a&gt;, so now I'm figuring out how best to use Docker with GPUs. Here's the current incantation that's been working for me:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;docker run -it --gpus=all \
  -v /usr/local/cuda:/usr/local/cuda:ro \
  nvcr.io/nvidia/cuda:13.0.1-devel-ubuntu24.04 \
  bash&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I have not yet got my head around the difference between CUDA 12 and 13. 13 appears to be very new, and a lot of the existing tutorials and libraries appear to expect 12.&lt;/p&gt;
&lt;h4 id="the-missing-documentation-isn-t-missing-any-more"&gt;The missing documentation isn't missing any more&lt;/h4&gt;
&lt;p&gt;When I first received this machine around a month ago there was very little in the way of documentation to help get me started. This meant climbing the steep NVIDIA+CUDA learning curve mostly on my own.&lt;/p&gt;
&lt;p&gt;This has changed &lt;em&gt;substantially&lt;/em&gt; in just the last week. NVIDIA now have extensive guides for getting things working on the Spark and they are a huge breath of fresh air - exactly the information I needed when I started exploring this hardware.&lt;/p&gt;
&lt;p&gt;Here's the &lt;a href="https://developer.nvidia.com/topics/ai/dgx-spark"&gt;getting started guide&lt;/a&gt;, details on the &lt;a href="https://build.nvidia.com/spark/dgx-dashboard/instructions"&gt;DGX dashboard web app&lt;/a&gt;, and the essential collection of &lt;a href="https://build.nvidia.com/spark"&gt;playbooks&lt;/a&gt;. There's still a lot I haven't tried yet just in this official set of guides.&lt;/p&gt;
&lt;h4 id="claude-code-for-everything"&gt;Claude Code for everything&lt;/h4&gt;
&lt;p&gt;&lt;a href="https://www.claude.com/product/claude-code"&gt;Claude Code&lt;/a&gt; was an absolute lifesaver for me while I was trying to figure out how best to use this device. My Ubuntu skills were a little rusty, and I also needed to figure out CUDA drivers and Docker incantations and how to install the right versions of PyTorch. Claude 4.5 Sonnet is &lt;em&gt;much better than me&lt;/em&gt; at all of these things.&lt;/p&gt;
&lt;p&gt;Since many of my experiments took place in disposable Docker containers I had no qualms at all about running it in YOLO mode:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;IS_SANDBOX=1 claude --dangerously-skip-permissions&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;IS_SANDBOX=1&lt;/code&gt; environment variable stops Claude from complaining about running as root.&lt;/p&gt;

&lt;details&gt;&lt;summary style="font-style: italic"&gt;Before I found out about IS_SANDBOX&lt;/summary&gt;

&lt;p&gt;&lt;br /&gt;&lt;em&gt;I was &lt;a href="https://twitter.com/lawrencecchen/status/1978255934938886409"&gt;tipped off&lt;/a&gt; about IS_SANDBOX after I published this article. Here's my original workaround:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Claude understandably won't let you do this as root, even in a Docker container, so I found myself using the following incantation in a fresh &lt;code&gt;nvcr.io/nvidia/cuda:13.0.1-devel-ubuntu24.04&lt;/code&gt; instance pretty often:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;apt-get update &lt;span class="pl-k"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get install -y sudo
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; pick the first free UID &amp;gt;=1000&lt;/span&gt;
U=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;for i &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;seq 1000 65000&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;do&lt;/span&gt; &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-k"&gt;!&lt;/span&gt; getent passwd &lt;span class="pl-smi"&gt;$i&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt;/dev/null&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;then&lt;/span&gt; &lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-smi"&gt;$i&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-c1"&gt;break&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;fi&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; done&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Chosen UID: &lt;span class="pl-smi"&gt;$U&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; same for a GID&lt;/span&gt;
G=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;for i &lt;span class="pl-k"&gt;in&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;$(&lt;/span&gt;seq 1000 65000&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;do&lt;/span&gt; &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-k"&gt;!&lt;/span&gt; getent group &lt;span class="pl-smi"&gt;$i&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt;/dev/null&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;then&lt;/span&gt; &lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-smi"&gt;$i&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-c1"&gt;break&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; &lt;span class="pl-k"&gt;fi&lt;/span&gt;&lt;span class="pl-k"&gt;;&lt;/span&gt; done&lt;span class="pl-pds"&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Chosen GID: &lt;span class="pl-smi"&gt;$G&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; create user+group&lt;/span&gt;
groupadd -g &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$G&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; devgrp
useradd -m -u &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$U&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -g &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;span class="pl-smi"&gt;$G&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; -s /bin/bash dev
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; enable password-less sudo:&lt;/span&gt;
&lt;span class="pl-c1"&gt;printf&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;dev ALL=(ALL) NOPASSWD:ALL\n&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; /etc/sudoers.d/90-dev-nopasswd
chmod 0440 /etc/sudoers.d/90-dev-nopasswd
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Install npm&lt;/span&gt;
DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get install -y npm
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Install Claude&lt;/span&gt;
npm install -g @anthropic-ai/claude-code&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then switch to the &lt;code&gt;dev&lt;/code&gt; user and run Claude for the first time:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;su - dev
claude --dangerously-skip-permissions&lt;/pre&gt;&lt;/div&gt;

&lt;/details&gt;&lt;br /&gt;

&lt;p&gt;This will provide a URL which you can visit to authenticate with your Anthropic account, confirming by copying back a token and pasting it into the terminal.&lt;/p&gt;
&lt;p&gt;Docker tip: you can create a snapshot of the current container (with Claude installed) by running &lt;code&gt;docker ps&lt;/code&gt; to get the container ID and then:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;docker commit --pause=false &lt;span class="pl-k"&gt;&amp;lt;&lt;/span&gt;container_id&lt;span class="pl-k"&gt;&amp;gt;&lt;/span&gt; cc:snapshot&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then later you can start a similar container using:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;docker run -it \
  --gpus=all \
  -v /usr/local/cuda:/usr/local/cuda:ro \
  cc:snapshot bash&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's an example of the kinds of prompts I've been running in Claude Code inside the container:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;I want to run https://huggingface.co/unsloth/Qwen3-4B-GGUF using llama.cpp - figure out how to get llama cpp working on this machine  such that it runs with the GPU, then install it in this directory and get that model to work to serve a prompt. Goal is to get this  command to run: llama-cli -hf unsloth/Qwen3-4B-GGUF -p "I believe the meaning of life is" -n 128 -no-cnv&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That one worked flawlessly - Claude checked out the &lt;code&gt;llama.cpp&lt;/code&gt; repo, compiled it for me and iterated on it until it could run that model on the GPU. Here's a &lt;a href="https://gist.github.com/simonw/3e7d28d9ed222d842f729bfca46d6673"&gt;full transcript&lt;/a&gt;, converted from Claude's &lt;code&gt;.jsonl&lt;/code&gt; log format to Markdown using a script I &lt;a href="https://github.com/simonw/tools/blob/main/python/claude_to_markdown.py"&gt;vibe coded just now&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I later told it:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Write out a markdown file with detailed notes on what you did. Start with the shortest form of notes on how to get a successful build, then add a full account of everything you tried, what went wrong and how you fixed it.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which produced &lt;a href="https://gist.github.com/simonw/0942d96f616b9e328568ab27d911c8ed"&gt;this handy set of notes&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="tailscale-was-made-for-this"&gt;Tailscale was made for this&lt;/h4&gt;
&lt;p&gt;Having a machine like this on my local network is neat, but what's even neater is being able to access it from anywhere else in the world, from both my phone and my laptop.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://tailscale.com/"&gt;Tailscale&lt;/a&gt; is &lt;em&gt;perfect&lt;/em&gt; for this. I installed it on the Spark (using the &lt;a href="https://tailscale.com/kb/1031/install-linux"&gt;Ubuntu instructions here&lt;/a&gt;), signed in with my SSO account (via Google)... and the Spark showed up in the "Network Devices" panel on my laptop and phone instantly.&lt;/p&gt;
&lt;p&gt;I can SSH in from my laptop or using the &lt;a href="https://termius.com/free-ssh-client-for-iphone"&gt;Termius iPhone app&lt;/a&gt; on my phone. I've also been running tools like &lt;a href="https://openwebui.com/"&gt;Open WebUI&lt;/a&gt; which give me a mobile-friendly web interface for interacting with LLMs on the Spark.&lt;/p&gt;
&lt;h4 id="here-comes-the-ecosystem"&gt;Here comes the ecosystem&lt;/h4&gt;
&lt;p&gt;The embargo on these devices dropped yesterday afternoon, and it turns out a whole bunch of relevant projects have had similar preview access to mine. This is &lt;em&gt;fantastic news&lt;/em&gt; as many of the things I've been trying to figure out myself suddenly got a whole lot easier.&lt;/p&gt;
&lt;p&gt;Four particularly notable examples:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ollama &lt;a href="https://ollama.com/blog/nvidia-spark"&gt;works out of the box&lt;/a&gt;. They actually had a build that worked a few weeks ago, and were the first success I had running an LLM on the machine.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;llama.cpp&lt;/code&gt; creator Georgi Gerganov just published &lt;a href="https://github.com/ggml-org/llama.cpp/discussions/16578"&gt;extensive benchmark results&lt;/a&gt; from running &lt;code&gt;llama.cpp&lt;/code&gt; on a Spark. With the MXFP4 version of GPT-OSS 20B he's getting ~3,600 tokens/second to read the prompt and ~59 tokens/second to generate a response. For GLM-4.5-Air-GGUF it's ~817 tokens/second to read and ~18 tokens/second to generate.&lt;/li&gt;
&lt;li&gt;LM Studio now have &lt;a href="https://lmstudio.ai/blog/dgx-spark"&gt;a build for the Spark&lt;/a&gt;. I haven't tried this one yet as I'm currently using my machine exclusively via SSH.&lt;/li&gt;
&lt;li&gt;vLLM - one of the most popular engines for serving production LLMs - had &lt;a href="https://x.com/eqhylxx/status/1977928690945360049"&gt;early access&lt;/a&gt; and there's now an official &lt;a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/vllm?version=25.09-py3"&gt;NVIDIA vLLM NGC Container&lt;/a&gt; for running their stack.&lt;/li&gt;
&lt;/ul&gt;
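&lt;p&gt;To get a feel for what those &lt;code&gt;llama.cpp&lt;/code&gt; numbers mean in practice, here's a rough back-of-envelope sketch. The &lt;code&gt;estimate_seconds&lt;/code&gt; helper and the prompt and response sizes are hypothetical, chosen purely for illustration, and the estimate ignores model loading and any other overhead.&lt;/p&gt;

```shell
# Rough end-to-end time: prompt tokens / read rate + output tokens / generate rate.
# Arguments: prompt_tokens output_tokens read_tokens_per_sec generate_tokens_per_sec
# estimate_seconds is a hypothetical helper, not part of llama.cpp.
estimate_seconds() {
  awk -v p="$1" -v o="$2" -v pr="$3" -v gr="$4" \
    'BEGIN { printf "%.1f\n", p / pr + o / gr }'
}

# Hypothetical workload: a 3,600 token prompt and a 590 token response,
# using the ~3,600 and ~59 tokens/second figures quoted for GPT-OSS 20B.
estimate_seconds 3600 590 3600 59
```

&lt;p&gt;That works out to about 11 seconds: roughly one second to read the prompt and ten to generate the response.&lt;/p&gt;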
&lt;p&gt;Here's &lt;a href="https://docs.unsloth.ai/new/fine-tuning-llms-with-nvidia-dgx-spark-and-unsloth"&gt;a tutorial from Unsloth&lt;/a&gt; on fine-tuning gpt-oss-20b on the Spark.&lt;/p&gt;
&lt;h4 id="should-you-get-one-"&gt;Should you get one?&lt;/h4&gt;
&lt;p&gt;It's a bit too early for me to provide a confident recommendation concerning this machine. As indicated above, I've had a tough time figuring out how best to put it to use, largely through my own inexperience with CUDA, ARM64 and Ubuntu GPU machines in general.&lt;/p&gt;
&lt;p&gt;The ecosystem improvements in just the past 24 hours have been very reassuring though. I expect it will be clear within a few weeks how well supported this machine is going to be.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/hardware"&gt;hardware&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/docker"&gt;docker&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tailscale"&gt;tailscale&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nvidia"&gt;nvidia&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ollama"&gt;ollama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lm-studio"&gt;lm-studio&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nvidia-spark"&gt;nvidia-spark&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="hardware"/><category term="ai"/><category term="docker"/><category term="tailscale"/><category term="generative-ai"/><category term="local-llms"/><category term="llms"/><category term="nvidia"/><category term="ollama"/><category term="llama-cpp"/><category term="coding-agents"/><category term="claude-code"/><category term="lm-studio"/><category term="disclosures"/><category term="nvidia-spark"/></entry><entry><title>OpenAI DevDay 2025 live blog</title><link href="https://simonwillison.net/2025/Oct/6/openai-devday-live-blog/#atom-tag" rel="alternate"/><published>2025-10-06T17:03:15+00:00</published><updated>2025-10-06T17:03:15+00:00</updated><id>https://simonwillison.net/2025/Oct/6/openai-devday-live-blog/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm at &lt;a href="https://devday.openai.com/2025"&gt;OpenAI DevDay&lt;/a&gt; in Fort Mason, San Francisco today. As &lt;a href="https://simonwillison.net/2024/Oct/1/openai-devday-2024-live-blog/"&gt;I did last year&lt;/a&gt;, I'm going to be live blogging the announcements from the kenote. Unlike last year, this year &lt;a href="https://www.youtube.com/live/hS1YqcewH0c"&gt;there's a livestream&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Disclosure: OpenAI provided me with a free ticket and reserved me a seat in the press/influencer section for the keynote.&lt;/em&gt;&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/live-blog"&gt;live-blog&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="disclosures"/><category term="live-blog"/></entry><entry><title>GitHub Copilot CLI is now in public preview</title><link href="https://simonwillison.net/2025/Sep/25/github-copilot-cli/#atom-tag" rel="alternate"/><published>2025-09-25T23:58:34+00:00</published><updated>2025-09-25T23:58:34+00:00</updated><id>https://simonwillison.net/2025/Sep/25/github-copilot-cli/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.blog/changelog/2025-09-25-github-copilot-cli-is-now-in-public-preview/"&gt;GitHub Copilot CLI is now in public preview&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;GitHub now have their own entry in the coding terminal CLI agent space: &lt;a href="https://github.com/features/copilot/cli"&gt;Copilot CLI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It's the same basic shape as Claude Code, Codex CLI, Gemini CLI and a growing number of other tools in this space. It's a terminal UI which accepts instructions and can modify files, run commands and integrate with GitHub's MCP server and other MCP servers that you configure.&lt;/p&gt;
&lt;p&gt;Two notable features compared to many of the others:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It works against the &lt;a href="https://docs.github.com/en/github-models"&gt;GitHub Models&lt;/a&gt; backend. It defaults to Claude Sonnet 4 but you can set &lt;code&gt;COPILOT_MODEL=gpt-5&lt;/code&gt; to switch to GPT-5. Presumably other models will become available soon.&lt;/li&gt;
&lt;li&gt;It's billed against your existing GitHub Copilot account. &lt;a href="https://github.com/features/copilot/plans"&gt;Pricing details are here&lt;/a&gt; - they're split into "Agent mode" requests and "Premium" requests. Different plans get different allowances, which are shared with other products in the GitHub Copilot family.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The best available documentation right now is the &lt;code&gt;copilot --help&lt;/code&gt; screen - &lt;a href="https://gist.github.com/simonw/bc739b8c67aa6e7a5f4f519942e66671"&gt;here's a copy of that in a Gist&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;It's a competent entry into the market, though it's missing features like the ability to paste in images which have been introduced to Claude Code and Codex CLI over the past few months.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Disclosure: I got a preview of this at an event at Microsoft's offices in Seattle last week. They did not pay me for my time but they did cover my flight, hotel and some dinners.&lt;/em&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microsoft"&gt;microsoft&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github-copilot"&gt;github-copilot&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex"&gt;codex&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;&lt;/p&gt;



</summary><category term="github"/><category term="microsoft"/><category term="ai"/><category term="generative-ai"/><category term="github-copilot"/><category term="llms"/><category term="ai-assisted-programming"/><category term="ai-agents"/><category term="coding-agents"/><category term="claude-code"/><category term="codex"/><category term="disclosures"/></entry><entry><title>Previewing GPT-5 at OpenAI's office</title><link href="https://simonwillison.net/2025/Aug/7/previewing-gpt-5/#atom-tag" rel="alternate"/><published>2025-08-07T19:11:19+00:00</published><updated>2025-08-07T19:11:19+00:00</updated><id>https://simonwillison.net/2025/Aug/7/previewing-gpt-5/#atom-tag</id><summary type="html">
    &lt;p&gt;A couple of weeks ago I was invited to OpenAI's headquarters for a "preview event", for which I had to sign both an NDA and a video release waiver. I suspected it might relate to either GPT-5 or the OpenAI open weight models... and &lt;a href="https://simonwillison.net/2025/Aug/7/gpt-5/"&gt;GPT-5 it was&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;OpenAI had invited five developers: &lt;a href="https://clairevo.com/"&gt;Claire Vo&lt;/a&gt;, &lt;a href="https://www.youtube.com/@t3dotgg"&gt;Theo Browne&lt;/a&gt;, &lt;a href="https://x.com/benhylak"&gt;Ben Hylak&lt;/a&gt;, &lt;a href="https://www.swyx.io/"&gt;Shawn @swyx Wang&lt;/a&gt;, and myself. We were all given early access to the new models and asked to spend a couple of hours (of paid time, see &lt;a href="https://simonwillison.net/about/#disclosures"&gt;my disclosures&lt;/a&gt;) experimenting with them, while being filmed by a professional camera crew.&lt;/p&gt;
&lt;p&gt;The resulting video is &lt;a href="https://www.youtube.com/watch?v=-gXmWYQtv5o"&gt;now up on YouTube&lt;/a&gt;. Unsurprisingly most of my edits related to &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle/"&gt;SVGs of pelicans&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;lite-youtube videoid="-gXmWYQtv5o" js-api="js-api"
  title=" Surprising developers with GPT-5 "
  playlabel="Play:  Surprising developers with GPT-5 "
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-5"&gt;gpt-5&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/disclosures"&gt;disclosures&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/theo-browne"&gt;theo-browne&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt"&gt;gpt&lt;/a&gt;&lt;/p&gt;



</summary><category term="youtube"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="pelican-riding-a-bicycle"/><category term="gpt-5"/><category term="disclosures"/><category term="theo-browne"/><category term="gpt"/></entry></feed>