Simon Willison's Weblog: python

micropython-wasm 0.1a2

2026-06-06T04:26:06+00:00

I added a CLI to micropython-wasm (issue #7). The idea came while I was writing the first draft of the blog entry, when I realized a CLI would be a great way to illustrate the Try it yourself section.

Tags: python, sandboxing, webassembly, micropython

Running Python code in a sandbox with MicroPython and WASM

2026-06-06T03:53:34+00:00

I've been experimenting with different approaches to running code in a sandbox for several years now, but my latest attempt feels like it might finally have all of the characteristics I've been looking for. I've released it as an alpha package called micropython-wasm, and I'm using it for a code execution sandbox plugin for Datasette Agent called datasette-agent-micropython.

Why do I want a sandbox?

My key open source projects - Datasette, LLM, even sqlite-utils - all support plugins.

I absolutely love plugins as a mechanism for extending software. A carefully designed plugin system reduces the risk involved in trying new things to almost nothing - even the wildest ideas won't leave a lasting influence on the core application itself. My software can grow a new feature overnight and I don't even have to review a pull request!

There's one major drawback: my plugin systems all use Python and Pluggy, and plugin code executes with full privileges within my applications. A buggy or malicious plugin could break everything or leak private data.

I'd love to be able to run plugin-style code in an environment where it is unable to read unapproved files, connect to a network, or generally operate in a way that's risky or harmful to the rest of the application or the user's computer.

My interest covers more than just plugins. For Datasette in particular there are many features I'd like to support where arbitrary code execution would be useful. I've already experimented with this for Datasette Enrichments, where code can be used to transform values stored in a table. I'd love to build a mechanism where you can run code on a schedule that fetches JSON from an approved location, runs a tiny bit of code to reformat it into a list of dictionaries, then inserts those as rows in a SQLite database table.

What I want from a sandbox

My goal is to execute code safely within my own Python applications. Here's what I need:

Dependencies that cleanly install from PyPI, including binary wheels across multiple platforms if necessary. I don't want people using my software to have to take any extra steps beyond directly installing my Python package.
Executed code must be subject to both memory and CPU limits. I don't want while True: s += "longer string" to crash my application or the user's computer.
File access must be strictly controlled. Either no filesystem access at all or I get to define exactly which files can be read and which files can be written to.
Network access is controlled as well. Sandboxed code should not be able to communicate with anything without going through a layer I fully control.
Support for interaction with host functions. A sandbox isn't much use if I can't carefully expose selected platform features to the code that it's running.
It has to be robust, supported, and clearly documented. I've lost count of the number of sandbox projects I've seen in repos with warnings that they aren't actively maintained!

WebAssembly looks really promising here

Web browsers operate in the most hostile environment imaginable when it comes to malicious code. Their job is to download and execute untrusted code from the web on almost every page load.

Given this, JavaScript engines should be excellent candidates for sandboxes. Sadly those engines are also extremely complicated, and are not designed for easy embedding in other projects. Most of the V8-in-Python projects I've seen are infrequently maintained and come with warnings not to use them with completely untrusted code.

WebAssembly is a much better candidate. It was designed from the start to support all of the characteristics I care about and has been tested in browsers for nearly a decade. The wasmtime Python library brings WASM to Python, is actively maintained, and has binary wheels.

MicroPython in WebAssembly

WebAssembly engines like wasmtime run WebAssembly binaries. Some programming languages like Rust are easy to compile directly to WebAssembly. Dynamic languages like JavaScript and Python are harder - they support language primitives like eval(), which means they need a full interpreter available at runtime.

To run Python we need a full Python interpreter compiled to WebAssembly, wired up in a way that makes it easy to feed it code, hook up host functions and access the results.

Pyodide offers an outstanding package for running Python using WebAssembly in the browser, but using Pyodide in server-side Python isn't supported. The most recent advice I could find was from October 2024 stating "Pyodide is built by the Emscripten toolchain and can only run in a browser or Node.js".

The other day I decided to take a look at MicroPython as an option for this. The MicroPython site says:

MicroPython is a lean and efficient implementation of the Python 3 programming language that includes a small subset of the Python standard library and is optimised to run on microcontrollers and in constrained environments.

WebAssembly sure feels like a constrained environment to me!

Building the first version

I had GPT-5.5 Pro do some research for me, which turned up this PR against MicroPython by Yamamoto Takahashi titled "Experimental WASI support for ports/unix".

It then produced this research.md document, so I let Codex Desktop and GPT-5.5 high loose on it to see what would happen:

read the research.md document and build this. You will probably need to write a script that compiles a custom WASM version of MicroPython as part of this project - fetch the MicroPython code to a /tmp directory for this as part of that script.

It worked. I now had a prototype Python library that could execute Python code inside a WebAssembly sandbox!

The trickiest piece to solve was persistent interpreter state. The WASM build we are using here exposes a single entry point which starts the interpreter, runs the code and then stops the interpreter at the end.

This works fine for one-off scripts, but for Datasette Agent I want variables and functions to stay resident in memory so I can reuse them across multiple code execution calls.

A neat thing about working with coding agents is that you can get from an idea to a proof of concept quickly. I prompted:

For keeping variables resident: what if we ran code inside micropython itself which called a host function get_next_python_code() and then passed that to eval() - and that host function blocked until new code was available, maybe by running in a thread with a queue? Could that or a similar idea help here?

After some iteration we got to a version of this that works! In Python code you can now do this:

from micropython_wasm import MicroPythonSession

with MicroPythonSession() as session:
    print(session.run("x = 10\nprint(x)").stdout)
    print(session.run("x += 5\nprint(x)").stdout)
    print(session.run("print(x * 2)").stdout)

Under the hood this starts a thread and sets up a request queue. Each session.run() call sends a message to that queue and then waits on a reply queue for the result of that execution. Inside WASM the MicroPython interpreter blocks waiting for a __session_next__() host function to return the next block of code, which it runs eval() on before calling __session_result__({"id": request_id, "ok": True}) once that block has executed successfully.
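To make the queue hand-off concrete, here is a minimal sketch of the same shape in plain Python. It is not the micropython-wasm implementation: ToySession is a hypothetical name, a worker thread calling exec() stands in for the WASM interpreter, and nothing here is sandboxed.

```python
import contextlib
import io
import queue
import threading


class ToySession:
    """Hypothetical stand-in for a persistent interpreter session."""

    def __init__(self):
        self._requests = queue.Queue()
        self._replies = queue.Queue()
        self._next_id = 0
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def _worker(self):
        namespace = {}  # lives as long as the thread, so variables persist
        while True:
            request = self._requests.get()  # blocks until new code arrives
            if request is None:  # sentinel sent by close()
                break
            buffer = io.StringIO()
            try:
                # redirect_stdout swaps sys.stdout process-wide; that is fine
                # here because the caller is blocked waiting for the reply.
                with contextlib.redirect_stdout(buffer):
                    exec(request["code"], namespace)
                ok = True
            except Exception as ex:
                ok = False
                buffer.write(f"{type(ex).__name__}: {ex}\n")
            self._replies.put(
                {"id": request["id"], "ok": ok, "stdout": buffer.getvalue()}
            )

    def run(self, code):
        self._next_id += 1
        self._requests.put({"id": self._next_id, "code": code})
        return self._replies.get()  # wait for the matching result

    def close(self):
        self._requests.put(None)
        self._thread.join()
```

Because the namespace dictionary belongs to the long-running worker rather than to any single run() call, a variable set in one call is still there for the next one.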

The other piece of complexity was supporting host functions, so my Python library could selectively expose functions that could then be called by code running in MicroPython.

Codex solved this with 78 lines of C, which are compiled into the 362KB WebAssembly blob I'm distributing with the package.

I am by no means a C programmer, but I've read the C and had two different models explain it to me (here's Claude's explanation) and I've subjected it to a barrage of tests.

The great thing about working with WebAssembly is that if the C turns out to be fatally flawed the worst that can happen is the WebAssembly execution will fail with an exception. I can live with that risk.
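The host function idea can be sketched from the Python side as an allow-list plus a dispatcher. This is an illustration only, not the C bridge described above: HostFunctions, expose() and dispatch() are hypothetical names, and the JSON-in, JSON-out convention is an assumption chosen so that only plain data crosses the boundary.

```python
import json


class HostFunctions:
    """Hypothetical allow-list of functions a guest may call by name."""

    def __init__(self):
        self._functions = {}

    def expose(self, func):
        # Usable as a decorator; only functions registered here are callable.
        self._functions[func.__name__] = func
        return func

    def dispatch(self, payload):
        # The guest sends a JSON string and gets a JSON string back.
        try:
            request = json.loads(payload)
            func = self._functions[request["name"]]
        except (ValueError, KeyError, TypeError):
            return json.dumps({"ok": False, "error": "unknown or malformed call"})
        try:
            result = func(*request.get("args", []))
            return json.dumps({"ok": True, "result": result})
        except Exception as ex:
            return json.dumps({"ok": False, "error": f"{type(ex).__name__}: {ex}"})


host = HostFunctions()


@host.expose
def add(a, b):
    return a + b
```

Anything that has not been registered, such as open, is rejected by name before any host code runs.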

Memory limits are directly supported by wasmtime. CPU limits are a little harder: wasmtime offers a "fuel" concept to limit how many operations a WebAssembly call can execute, and that's the correct fit for this problem, but the units are hard to reason about. I'm experimenting with a 20 million default "fuel" setting now but I'm not confident that it's the most appropriate value.
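As a rough analogy for how a fuel budget behaves, here is a toy version in pure Python. It is not how wasmtime meters fuel: wasmtime counts WebAssembly operations inside the engine, while this sketch counts executed Python lines using sys.settrace. The names run_with_fuel and OutOfFuel are hypothetical.

```python
import sys


class OutOfFuel(Exception):
    """Raised when the toy budget below is used up."""


def run_with_fuel(code, fuel):
    """Execute code, spending one unit of fuel per traced line event."""
    remaining = fuel

    def tracer(frame, event, arg):
        nonlocal remaining
        if event == "line":
            remaining -= 1
            if remaining < 0:
                raise OutOfFuel(f"used more than {fuel} units")
        return tracer  # keep tracing inside this frame

    namespace = {}
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, namespace)
    finally:
        sys.settrace(previous)  # always restore whatever was there before
    return namespace
```

Even in this toy the budget is awkward to choose: a line that calls sum() over a huge range costs the same single unit as a line that adds two integers, which mirrors the problem of the units being hard to reason about.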

Try it yourself

The micropython-wasm alpha is now live on PyPI.

You can try it from your own Python code as described in the README. I've also added a simple CLI mode in version 0.1a2 which means you can try it using uvx without first installing it like so:

uvx micropython-wasm -c 'print("Hello world")'
# To see it run out of fuel:
uvx micropython-wasm -c $'s = ""\nwhile True: s += "longer"'
# Outputs: micropython-wasm: guest exited with code 1

You can also try it in Datasette Agent like this:

uvx llm keys set openai
# Paste in an OpenAI key, then:
uvx --with datasette-agent \
  --with datasette-agent-micropython \
  --prerelease allow \
  datasette --internal internal.db \
    -s plugins.datasette-llm.default_model gpt-5.5 \
    --root -o

Then navigate to http://127.0.0.1:8001/-/agent and run the prompt:

show me some micropython

You can try a live demo of that plugin running in Datasette Agent by signing into agent.datasette.io with your GitHub account.

Should you trust my vibe-coded sandbox?

Having complained about immature, loosely-maintained sandboxing libraries, it's deeply ironic that I've now built my own!

I deliberately slapped an alpha release version on it, and I'm not ready to recommend it to anyone who isn't willing to take a significant risk.

I've put it through enough testing that I'm OK using it myself. I've shipped my first plugin that uses it, datasette-agent-micropython. I've also locked GPT-5.5 xhigh in that Datasette Agent plugin and challenged it to break out of the sandbox and so far it has not managed to.

I'm hoping this implementation can convince some companies with professional security teams and high-stakes problems to commit to using Python in WebAssembly as a sandboxing approach and open source their own solutions.

Tags: python, sandboxing, ai, datasette, webassembly, generative-ai, llms, ai-assisted-programming, codex, datasette-agent, micropython

datasette-agent-micropython 0.1a0

2026-06-02T19:28:36+00:00

Release: datasette-agent-micropython 0.1a0

I want Datasette Agent to be able to generate and execute Python code safely. This alpha is looking promising: GPT-5.5 has so far failed to break out of the sandbox!

Tags: python, sandboxing, datasette, webassembly, datasette-agent, micropython

micropython-wasm 0.1a1

2026-06-02T19:20:47+00:00

Release: micropython-wasm 0.1a1

Fixes for some limitations that emerged while I was trying to use this to build datasette-agent-micropython.

Tags: python, sandboxing, webassembly, micropython

micropython-wasm 0.1a0

2026-06-02T03:43:45+00:00

Release: micropython-wasm 0.1a0

My latest sandboxing experiment: This alpha package bundles a lightly customized WASM build of MicroPython with a wrapper to execute code in it via wasmtime.

Tags: python, sandboxing, webassembly, micropython

Running Python ASGI apps in the browser via Pyodide + a service worker

2026-05-30T15:34:00+00:00

Research: Running Python ASGI apps in the browser via Pyodide + a service worker

Datasette Lite is my version of Datasette that runs entirely in the browser using Pyodide in WebAssembly.

When I first built it four years ago I used Web Workers and code that intercepts navigation operations and fetches the generated HTML by running the Python app.

This worked, but had the disadvantage that any JavaScript in <script> tags would not be executed - breaking some Datasette functionality and a whole lot of Datasette plugins.

This morning I set Claude Opus 4.8 the task (in Claude Code for web) of figuring out how to run Python ASGI apps in Pyodide using Service Workers instead, and it seems to work! Here's a basic ASGI FastAPI demo and here's a demo that runs Datasette 1.0a31.

I'm still getting my head around exactly how it works, but once I've done that I plan to upgrade Datasette Lite itself.
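Both approaches rest on the same trick: an ASGI app is just an async callable, so a request can be answered by calling it directly with no network server involved. Here is a minimal sketch of that in plain Python. The app and the fetch() helper are hypothetical and this is not the Datasette Lite code, which does the equivalent inside Pyodide.

```python
import asyncio


async def app(scope, receive, send):
    """A tiny ASGI application that returns one HTML page."""
    assert scope["type"] == "http"
    body = f"<h1>Hello from {scope['path']}</h1>".encode("utf-8")
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/html; charset=utf-8")],
    })
    await send({"type": "http.response.body", "body": body})


async def fetch(asgi_app, path):
    """Call an ASGI app in-process and collect its response."""
    scope = {
        "type": "http",
        "asgi": {"version": "3.0"},
        "http_version": "1.1",
        "method": "GET",
        "path": path,
        "raw_path": path.encode("utf-8"),
        "query_string": b"",
        "headers": [],
    }
    messages = []

    async def receive():
        # A GET request has no body to deliver.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        messages.append(message)

    await asgi_app(scope, receive, send)
    status = messages[0]["status"]
    body = b"".join(
        m.get("body", b"") for m in messages if m["type"] == "http.response.body"
    )
    return status, body.decode("utf-8")
```

A Web Worker or Service Worker plays the role of fetch() here: it turns an intercepted browser request into a scope, runs the app, and hands the collected bytes back as the response.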

Tags: javascript, python, datasette, asgi, webassembly, service-workers, pyodide, datasette-lite, claude-code

pydantic-monty investigation

2026-05-22T22:41:00+00:00

Research: pydantic-monty investigation

It's been a few months since I last poked at Monty, the sandboxed subset of Python implemented in Rust. I had Claude Code look at the most recent release.

Importantly the max_duration_secs, max_memory, max_allocations, and max_recursion_depth settings all appear to work as advertised.

Tags: python, sandboxing, pydantic

TRE Python binding — ReDoS robustness demo

2026-05-04T17:52:00+00:00

Research: TRE Python binding — ReDoS robustness demo

If it's good enough for antirez to add to Redis I figured Ville Laurikari's TRE regular expression engine was worth exploring in a little more detail.

I had Claude Code build an experimental Python binding (it used ctypes) and try some malicious regular expression attacks against the library. TRE handles those much better than Python's standard library implementation, thanks mainly to the lack of support for backtracking.
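For a sense of what such an attack looks like against the standard library, here is a small sketch using a classic pathological pattern. It only exercises Python's re module and is not the TRE binding or the test suite from the research. The pattern and the helper name time_failed_match are illustrative choices.

```python
import re
import time

# Nested quantifiers followed by an anchor: a well-known ReDoS shape.
EVIL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Time how long re takes to reject n 'a' characters followed by '!'."""
    text = "a" * n + "!"
    start = time.perf_counter()
    result = EVIL.match(text)  # can never match because of the trailing '!'
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for n in (14, 16, 18, 20):
        _, elapsed = time_failed_match(n)
        print(n, f"{elapsed:.4f}s")
```

With a backtracking engine the time to reject the input typically grows roughly exponentially with n, which is why a short hostile string can tie up a process.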

Tags: c, ctypes, python, regular-expressions, security

LLM 0.32a0 is a major backwards-compatible refactor

2026-04-29T19:01:47+00:00

I just released LLM 0.32a0, an alpha release of my LLM Python library and CLI tool for accessing LLMs, with some consequential changes that I've been working towards for quite a while.

Previous versions of LLM modeled the world in terms of prompts and responses. Send the model a text prompt, get back a text response.

import llm

model = llm.get_model("gpt-5.5")
response = model.prompt("Capital of France?")
print(response.text())

This made sense when I started working on the library back in April 2023. A lot has changed since then!

LLM provides an abstraction over thousands of different models via its plugin system. The original abstraction - of text input that returns text output - was no longer able to represent everything I needed it to.

Over time LLM itself has grown attachments to handle image, audio, and video input, then schemas for outputting structured JSON, then tools for executing tool calls. Meanwhile LLMs kept evolving, adding reasoning support and the ability to return images and all kinds of other interesting capabilities.

LLM needs to evolve to better handle the diversity of input and output types that can be processed by today's frontier models.

The 0.32a0 alpha has two key changes: model inputs can be represented as a sequence of messages, and model responses can be composed of a stream of differently typed parts.

Prompts as a sequence of messages

LLMs accept input as text, but ever since ChatGPT demonstrated the value of a two-way conversational interface, the most common way to prompt them has been to treat that input as a sequence of conversational turns.

The first turn might look like this:

user: Capital of France?
assistant:

(The model then gets to fill out the reply from the assistant.)

But each subsequent turn needs to replay the entire conversation up to that point, as a sort of screenplay:

user: Capital of France?
assistant: Paris
user: Germany?
assistant:

Most of the JSON APIs from the major vendors follow this pattern. Here's what the above looks like using the OpenAI chat completions API, which has been widely imitated by other providers:

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {
        "role": "user",
        "content": "Capital of France?"
      },
      {
        "role": "assistant",
        "content": "Paris"
      },
      {
        "role": "user",
        "content": "Germany?"
      }
    ]
  }'
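The replay pattern is easy to see when the request body is built in code. This is a minimal sketch that only constructs the JSON shown in the curl example and makes no HTTP call. The helper name build_payload is hypothetical.

```python
import json


def build_payload(model, history, new_user_message):
    """Return a chat-completions-style request body as a dict."""
    messages = [{"role": role, "content": content} for role, content in history]
    messages.append({"role": "user", "content": new_user_message})
    return {"model": model, "messages": messages}


history = []
first = build_payload("gpt-5.5", history, "Capital of France?")

# Once a reply arrives, both turns join the history that gets replayed.
history += [("user", "Capital of France?"), ("assistant", "Paris")]
second = build_payload("gpt-5.5", history, "Germany?")

print(json.dumps(second, indent=2))
```

The second payload carries all three messages, so the request keeps growing as the conversation gets longer.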

Prior to 0.32, LLM modeled these as conversations:

model = llm.get_model("gpt-5.5")

conversation = model.conversation()
r1 = conversation.prompt("Capital of France?")
print(r1.text())
# Outputs "Paris"

r2 = conversation.prompt("Germany?")
print(r2.text())
# Outputs "Berlin"

This worked if you were building a conversation with the model from scratch, but it didn't provide a way to feed in a previous conversation from the start. This made tasks like building an emulation of the OpenAI chat completions API much harder than they should have been.

The llm CLI tool worked around this through a custom mechanism for persisting and inflating conversations using SQLite, but that never became a stable part of the LLM API - and there are many places you might want to use the Python library without committing to SQLite as the storage layer.

The new alpha now supports this:

import llm
from llm import user, assistant

model = llm.get_model("gpt-5.5")

response = model.prompt(messages=[
    user("Capital of France?"),
    assistant("Paris"),
    user("Germany?"),
])
print(response.text())

The llm.user() and llm.assistant() functions are new builder functions designed to be used within that messages=[] array.

The previous prompt= option still works, but LLM upgrades it to a single-item messages array behind the scenes.

You can also now reply to a response, as an alternative to building a conversation:

response2 = response.reply("How about Hungary?")
print(response2) # Default __str__() calls .text()

Streaming parts

The other major new interface in the alpha concerns streaming results back from a prompt.

Previously, LLM supported streaming like this:

response = model.prompt("Generate an SVG of a pelican riding a bicycle")
for chunk in response:
    print(chunk, end="")

Or this async variant:

import asyncio
import llm

model = llm.get_async_model("gpt-5.5")
response = model.prompt("Generate an SVG of a pelican riding a bicycle")

async def run():
    async for chunk in response:
        print(chunk, end="", flush=True)

asyncio.run(run())

Many of today's models return mixed types of content. A prompt run against Claude might return reasoning output, then text, then a JSON request for a tool call, then more text content.

Some models can even execute tools on the server-side, for example OpenAI's code interpreter tool or Anthropic's web search. This means the results from the model can combine text, tool calls, tool outputs and other formats.

Multi-modal output models are starting to emerge too, which can return images or even snippets of audio intermixed into that streaming response.

The new LLM alpha models these as a stream of typed message parts. Here's what that looks like as a Python API consumer:

import asyncio
import llm

model = llm.get_model("gpt-5.5")
prompt = "invent 3 cool dogs, first talk about your motivations"

def describe_dog(name: str, bio: str) -> str:
    """Record the name and biography of a hypothetical dog."""
    return f"{name}: {bio}"

def sync_example():
    response = model.prompt(
        prompt,
        tools=[describe_dog],
    )
    for event in response.stream_events():
        if event.type == "text":
            print(event.chunk, end="", flush=True)
        elif event.type == "tool_call_name":
            print(f"\nTool call: {event.chunk}(", end="", flush=True)
        elif event.type == "tool_call_args":
            print(event.chunk, end="", flush=True)

async def async_example():
    model = llm.get_async_model("gpt-5.5")
    response = model.prompt(
        prompt,
        tools=[describe_dog],
    )
    async for event in response.astream_events():
        if event.type == "text":
            print(event.chunk, end="", flush=True)
        elif event.type == "tool_call_name":
            print(f"\nTool call: {event.chunk}(", end="", flush=True)
        elif event.type == "tool_call_args":
            print(event.chunk, end="", flush=True)

sync_example()
asyncio.run(async_example())

Sample output (from just the first sync example):

My motivation: create three memorable dogs with distinct “cool” styles—one cinematic, one adventurous, and one charmingly chaotic—so each feels like they could star in their own story.
Tool call: describe_dog({"name": "Nova Jetpaw", "bio": "A sleek silver-gray whippet who wears tiny aviator goggles and loves sprinting along moonlit beaches. Nova is fearless, elegant, and rumored to outrun drones just for fun."}
Tool call: describe_dog({"name": "Mochi Thunderbark", "bio": "A fluffy corgi with a dramatic black-and-gold bandana and the confidence of a rock star. Mochi is short, loud, loyal, and leads a neighborhood 'security patrol' made entirely of squirrels."}
Tool call: describe_dog({"name": "Atlas Snowfang", "bio": "A massive white husky with ice-blue eyes and a backpack full of trail snacks. Atlas is calm, heroic, and always knows the way home—even during blizzards, fog, or confusing camping trips."}
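To show what a consumer might do with such a stream beyond printing it, here is a sketch that folds typed chunks into final text plus parsed tool calls, with no model involved. FakeEvent is a hypothetical stand-in that only carries the .type and .chunk attributes used above. It also assumes that argument chunks are JSON fragments belonging to the most recent tool_call_name event, which is what the sample output suggests.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Hypothetical stand-in with the two attributes used above."""
    type: str
    chunk: str


def collect(events):
    """Fold a stream of typed chunks into text plus parsed tool calls."""
    text_parts = []
    calls = []  # each item: {"name": str, "args": [raw JSON fragments]}
    for event in events:
        if event.type == "text":
            text_parts.append(event.chunk)
        elif event.type == "tool_call_name":
            calls.append({"name": event.chunk, "args": []})
        elif event.type == "tool_call_args" and calls:
            calls[-1]["args"].append(event.chunk)
    tool_calls = [
        (call["name"], json.loads("".join(call["args"]) or "{}"))
        for call in calls
    ]
    return "".join(text_parts), tool_calls


events = [
    FakeEvent("text", "My motivation: "),
    FakeEvent("text", "three memorable dogs."),
    FakeEvent("tool_call_name", "describe_dog"),
    FakeEvent("tool_call_args", '{"name": "Nova Jetpaw", '),
    FakeEvent("tool_call_args", '"bio": "A sleek whippet."}'),
]
text, tool_calls = collect(events)
```

The argument fragments are only valid JSON once joined, so they are parsed after the stream has finished.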

At the end of the response you can call response.execute_tool_calls() to actually run the functions that were requested, or send a response.reply() to have those tools called and their return values sent back to the model:

print(response.reply("Tell me about the dogs"))

This new mechanism for streaming different token types means the CLI tool can now display "thinking" text in a different color from the text in the final response. The thinking text goes to stderr so it won't affect results that are piped into other tools.

This example uses Claude Sonnet 4.6 (with an updated streaming event version of the llm-anthropic plugin) as Anthropic's models return their reasoning text as part of the response:

llm -m claude-sonnet-4.6 'Think about 3 cool dogs then describe them' \
  -o thinking_display 1

You can suppress the output of reasoning tokens using the new -R/--no-reasoning flag. Surprisingly that ended up being the only CLI-facing change in this release.

A mechanism for serializing and deserializing responses

As mentioned earlier, LLM has quite inflexible code at the moment for persisting conversations to SQLite. I've added a new mechanism in 0.32a0 that should provide Python API users a way to roll their own alternative:

serializable = response.to_dict()
# serializable is a JSON-style dictionary
# store it anywhere you like, then inflate it:
response = Response.from_dict(serializable)

The dictionary this returns is actually a TypedDict defined in the new llm/serialization.py module.
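The round trip can be illustrated with a toy class. The field names below are invented for the example and are not the actual TypedDict from llm/serialization.py. ToyResponse and ToyResponseDict are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import TypedDict


class ToyResponseDict(TypedDict):
    """Hypothetical shape; the real one lives in llm/serialization.py."""
    model: str
    prompt: str
    text: str


@dataclass
class ToyResponse:
    model: str
    prompt: str
    text: str

    def to_dict(self) -> ToyResponseDict:
        return {"model": self.model, "prompt": self.prompt, "text": self.text}

    @classmethod
    def from_dict(cls, data: ToyResponseDict) -> "ToyResponse":
        return cls(model=data["model"], prompt=data["prompt"], text=data["text"])


original = ToyResponse("gpt-5.5", "Capital of France?", "Paris")
stored = json.dumps(original.to_dict())  # a plain string: store it anywhere
restored = ToyResponse.from_dict(json.loads(stored))
```

Because the intermediate form is a JSON-style dictionary, the storage layer can be a file, a key-value store or a database column without the library needing to know which.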

What's next?

I'm releasing this as an alpha so I can upgrade various plugins and exercise the new design in real world environments for a few days. I expect the stable 0.32 release will be very similar to this alpha, unless alpha testing reveals some design flaw in the way I've put this all together.

There's one remaining large task: I'd like to redesign the SQLite logging system to better capture the more finely grained details that are returned by this new abstraction.

Ideally I'd like to model this as a graph, to best support situations like an OpenAI-style chat completions API where the same conversations are constantly extended and then repeated with every prompt. I want to be able to store those without duplicating them in the database.

I'm undecided as to whether that should be a feature in 0.32 or I should hold it for 0.33.
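One possible shape for that graph, sketched with the standard library, is a table of message nodes where each node points at its parent and its ID is derived from the parent ID plus the message itself. The schema and function names here are hypothetical and this is not LLM's logging design. The point is only that an extended conversation shares the rows of its prefix.

```python
import hashlib
import sqlite3


def node_id(parent_id, role, content):
    """Derive an ID from the parent ID plus this message."""
    raw = "\x1f".join([parent_id or "", role, content])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def store_conversation(db, messages):
    """Insert (role, content) pairs as a chain; return the last node ID."""
    parent_id = None
    for role, content in messages:
        current_id = node_id(parent_id, role, content)
        db.execute(
            "INSERT OR IGNORE INTO nodes (id, parent_id, role, content) "
            "VALUES (?, ?, ?, ?)",
            (current_id, parent_id, role, content),
        )
        parent_id = current_id
    return parent_id


def load_conversation(db, leaf_id):
    """Walk parent pointers from a leaf back to the first message."""
    messages = []
    current = leaf_id
    while current is not None:
        parent_id, role, content = db.execute(
            "SELECT parent_id, role, content FROM nodes WHERE id = ?", (current,)
        ).fetchone()
        messages.append((role, content))
        current = parent_id
    return list(reversed(messages))


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE nodes (id TEXT PRIMARY KEY, parent_id TEXT, role TEXT, content TEXT)"
)

first = [("user", "Capital of France?"), ("assistant", "Paris")]
second = first + [("user", "Germany?"), ("assistant", "Berlin")]
store_conversation(db, first)
leaf = store_conversation(db, second)
node_count = db.execute("SELECT count(*) FROM nodes").fetchone()[0]
```

Storing the two-message conversation and then its four-message extension leaves four rows rather than six, because the shared prefix hashes to the same node IDs.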

Tags: projects, python, ai, annotated-release-notes, generative-ai, llms, llm

What's new in pip 26.1 - lockfiles and dependency cooldowns!

2026-04-28T05:23:05+00:00

Richard Si describes an excellent set of upgrades to Python's default pip tool for installing dependencies.

This version drops support for Python 3.9 - fair enough, since it's been EOL since October. macOS still ships with python3 as a default Python 3.9, so I tried out the new pip version against Python 3.14 like this:

uv python install 3.14
mkdir /tmp/experiment
cd /tmp/experiment
python3.14 -m venv venv
source venv/bin/activate
pip install -U pip
pip --version

This confirmed I had pip 26.1 - then I tried out the new lock files:

pip lock datasette llm

This resolves Datasette and LLM and all of their dependencies and writes the whole lot to a 519 line pylock.toml file - here's the result.

The new release also supports dependency cooldowns, discussed here previously, via the new --uploaded-prior-to PXD option where X is a number of days. The format is P-number-of-days-D, following ISO duration format but only supporting days.

I shipped a new release of LLM, version 0.31, three days ago. Here's how to use the new --uploaded-prior-to P4D option to ask for a version that is at least 4 days old.

pip install llm --uploaded-prior-to P4D
venv/bin/llm --version

This gave me version 0.30.
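The cooldown logic can be sketched in a few lines. This is an illustration of the idea, not pip's resolver: parse_days and newest_allowed are hypothetical names, the 0.30 upload date is made up, and "latest upload" is used as a stand-in for "newest version".

```python
import re
from datetime import datetime, timedelta, timezone


def parse_days(duration):
    """Parse a days-only ISO 8601 duration such as 'P4D'."""
    match = re.fullmatch(r"P(\d+)D", duration)
    if match is None:
        raise ValueError(f"expected P<days>D, got {duration!r}")
    return timedelta(days=int(match.group(1)))


def newest_allowed(releases, duration, now):
    """Pick the most recently uploaded release older than the cooldown."""
    cutoff = now - parse_days(duration)
    eligible = [release for release in releases if release[1] < cutoff]
    return max(eligible, key=lambda release: release[1])[0] if eligible else None


now = datetime(2026, 4, 28, tzinfo=timezone.utc)
releases = [
    ("0.30", now - timedelta(days=40)),  # hypothetical upload date
    ("0.31", now - timedelta(days=3)),   # "three days ago", per the text
]
```

With a P4D cooldown the three-day-old 0.31 is filtered out and 0.30 is chosen. A P2D cooldown would allow 0.31.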

Via Lobste.rs

Tags: packaging, pip, python, security, supply-chain

microsoft/VibeVoice

2026-04-27T23:46:56+00:00

VibeVoice is Microsoft's Whisper-style audio model for speech-to-text, MIT licensed and with speaker diarization built into the model.

Microsoft released it on January 21st, 2026 but I hadn't tried it until today. Here's a one-liner to run it on a Mac with uv, mlx-audio (by Prince Canuma) and the 5.71GB mlx-community/VibeVoice-ASR-4bit MLX conversion of the 17.3GB VibeVoice-ASR model, in this case against a downloaded copy of my recent podcast appearance with Lenny Rachitsky:

uv run --with mlx-audio mlx_audio.stt.generate \
  --model mlx-community/VibeVoice-ASR-4bit \
  --audio lenny.mp3 --output-path lenny \
  --format json --verbose --max-tokens 32768

The tool reported back:

Processing time: 524.79 seconds
Prompt: 26615 tokens, 50.718 tokens-per-sec
Generation: 20248 tokens, 38.585 tokens-per-sec
Peak memory: 30.44 GB

So that's 8 minutes 45 seconds for an hour of audio (running on a 128GB M5 Max MacBook Pro).

I've tested it against .wav and .mp3 files and they both worked fine.

If you omit --max-tokens it defaults to 8192, which is enough for about 25 minutes of audio. I discovered that through trial-and-error and quadrupled it to guarantee I'd get the full hour.
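The arithmetic behind those figures can be checked from the numbers the tool reported. The only assumption added here is that exactly 60 minutes of audio were transcribed.

```python
processing_seconds = 524.79   # "Processing time" reported by the tool
generated_tokens = 20248      # "Generation" tokens reported by the tool
audio_minutes = 60            # assumption: exactly one hour was transcribed
default_max_tokens = 8192

minutes, seconds = divmod(round(processing_seconds), 60)
tokens_per_audio_minute = generated_tokens / audio_minutes
minutes_covered_by_default = default_max_tokens / tokens_per_audio_minute
speed_vs_realtime = audio_minutes * 60 / processing_seconds

print(f"{minutes} min {seconds} s")
print(f"{minutes_covered_by_default:.1f} minutes at the default limit")
print(f"{speed_vs_realtime:.1f}x faster than real time")
```

At roughly 337 generated tokens per minute of audio, the default limit of 8192 tokens covers a little over 24 minutes, which lines up with the "about 25 minutes" found by trial and error.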

That command reported using 30.44GB of RAM at peak, but in Activity Monitor I observed 61.5GB of usage during the prefill stage and around 18GB during the generating phase.

Here's the resulting JSON. The key structure looks like this:

{
  "text": "And an open question for me is how many other knowledge work fields are actually prone to these agent loops?",
  "start": 13.85,
  "end": 19.5,
  "duration": 5.65,
  "speaker_id": 0
},
{
  "text": "Now that we have this power, people almost underestimate what they can do with it.",
  "start": 19.5,
  "end": 22.78,
  "duration": 3.280000000000001,
  "speaker_id": 1
},
{
  "text": "Today, probably 95% of the code that I produce, I didn't type it myself. I write so much of my code on my phone. It's wild.",
  "start": 22.78,
  "end": 30.0,
  "duration": 7.219999999999999,
  "speaker_id": 0
}

Since that's an array of objects we can open it in Datasette Lite, making it easier to browse.

Amusingly that Datasette Lite view shows three speakers - it identified Lenny and me for the conversation, and then a separate Lenny for the voice he used for the additional intro and the sponsor reads!
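Because each segment carries a speaker_id along with its timestamps, simple summaries fall out of the structure directly. Here is a small sketch that totals speaking time per speaker, using only the three segments shown above with the text omitted. The function name seconds_per_speaker is hypothetical.

```python
from collections import defaultdict

segments = [
    {"start": 13.85, "end": 19.5, "speaker_id": 0},
    {"start": 19.5, "end": 22.78, "speaker_id": 1},
    {"start": 22.78, "end": 30.0, "speaker_id": 0},
]


def seconds_per_speaker(segments):
    """Sum end minus start for each speaker_id, rounded to 2 decimals."""
    totals = defaultdict(float)
    for segment in segments:
        totals[segment["speaker_id"]] += segment["end"] - segment["start"]
    return {speaker: round(total, 2) for speaker, total in totals.items()}


print(seconds_per_speaker(segments))
```

Rounding at the end also tidies up floating point noise like the 3.280000000000001 visible in the raw duration values.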

VibeVoice can only handle up to an hour of audio, so running the above command transcribed just the first hour of the podcast. To transcribe more than that you'd need to split the audio, ideally with a minute or so of overlap so you can avoid errors from partially transcribed words at the split point. You'd also need to then line up the identified speaker IDs across the multiple segments.
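The splitting step can be sketched as a function that computes overlapping time spans. This only produces the boundaries. Cutting the audio and matching speaker IDs across chunks are left out. The name chunk_spans and its defaults (one-hour chunks, 60 seconds of overlap) are assumptions taken from the description above.

```python
def chunk_spans(total_seconds, chunk_seconds=3600, overlap_seconds=60):
    """Return (start, end) spans covering the audio, each overlapping the last."""
    if overlap_seconds >= chunk_seconds:
        raise ValueError("overlap must be shorter than the chunk length")
    spans = []
    start = 0
    while True:
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            return spans
        # Step back so the next chunk re-covers the tail of this one.
        start = end - overlap_seconds


# A 2.5 hour recording needs three chunks.
print(chunk_spans(9000))
```

Each chunk after the first starts 60 seconds before the previous one ended, so a word cut off at one boundary appears whole in the next chunk.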

Tags: microsoft, python, datasette-lite, uv, mlx, prince-canuma, speech-to-text

Join us at PyCon US 2026 in Long Beach - we have new AI and security tracks this year

2026-04-17T23:59:03+00:00

This year's PyCon US is coming up next month from May 13th to May 19th, with the core conference talks from Friday 15th to Sunday 17th and tutorial and sprint days either side. It's in Long Beach, California this year, the first time PyCon US has come to the West Coast since Portland, Oregon in 2017 and the first time in California since Santa Clara in 2013.

If you're based in California this is a great opportunity to catch up with the Python community, meet a whole lot of interesting people and learn a ton of interesting things.

In addition to regular PyCon programming we have two new dedicated tracks at the conference this year: an AI track on Friday and a Security track on Saturday.

The AI program was put together by track chairs Silona Bonewald (CitableAI) and Zac Hatfield-Dodds (Anthropic). I'll be an in-the-room chair this year, introducing speakers and helping everything run as smoothly as possible.

Here's the AI track schedule in full:

11:00: AI-Assisted Contributions and Maintainer Load - Paolo Melchiorre
11:45: AI-Powered Python Education: Towards Adaptive and Inclusive Learning - Sonny Mupfuni
12:30: Making African Languages Visible: A Python-Based Guide to Low-Resource Language ID - Gift Ojeabulu
2:00: Running Large Language Models on Laptops: Practical Quantization Techniques in Python - Aayush Kumar JVS
2:45: Distributing AI with Python in the Browser: Edge Inference and Flexibility Without Infrastructure - Fabio Pliger
3:30: Don't Block the Loop: Python Async Patterns for AI Agents - Aditya Mehra
4:30: What Python Developers Need to Know About Hardware: A Practical Guide to GPU Memory, Kernel Scheduling, and Execution Models - Santosh Appachu Devanira Poovaiah
5:15: How to Build Your First Real-Time Voice Agent in Python (Without Losing Your Mind) - Camila Hinojosa Añez, Elizabeth Fuentes

(And here's how I scraped that as a Markdown list from the schedule page using Claude Code and Rodney.)

You should come to PyCon US!

I've been going to PyCon for over twenty years now - I first went back in 2005. It's one of my all-time favourite conference series. Even as it's grown to more than 2,000 attendees PyCon US has remained a heavily community-focused conference - it's the least corporate feeling large event I've ever attended.

The talks are always great, but it's the add-ons around the talks that really make it work for me. The lightning talks slots are some of the most heavily attended sessions. The PyLadies auction is always deeply entertaining. The sprints are an incredible opportunity to contribute directly to projects that you use, coached by their maintainers.

In addition to scheduled talks, the event has open spaces, where anyone can reserve space for a conversation about a topic - effectively PyCon's version of an unconference. I plan to spend a lot of my time in the open spaces this year - I'm hoping to join or instigate sessions about both Datasette and agentic engineering.

I'm on the board of the Python Software Foundation, and PyCon US remains one of our most important responsibilities - in the past it's been a key source of funding for the organization, but it's also core to our mission to "promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers".

If you do come to Long Beach, we'd really appreciate it if you could book accommodation in the official hotel block, for reasons outlined in this post on the PSF blog.

Tags: conferences, open-source, pycon, python, ai, psf

datasette 1.0a27

2026-04-15T23:16:34+00:00

Release: datasette 1.0a27

Two major changes in this new Datasette alpha. I covered the first of those in detail yesterday - Datasette no longer uses Django-style CSRF form tokens, instead using modern browser headers as described by Filippo Valsorda.

The second big change is that Datasette now fires a new RenameTableEvent any time a table is renamed during a SQLite transaction. This is useful because some plugins (like datasette-comments) attach additional data to table records by name, so a renamed table requires them to react in appropriate ways.

Here are the rest of the changes in the alpha:

New actor= parameter for datasette.client methods, allowing internal requests to be made as a specific actor. This is particularly useful for writing automated tests. (#2688)

New Database(is_temp_disk=True) option, used internally for the internal database. This helps resolve intermittent database locked errors caused by the internal database being in-memory as opposed to on-disk. (#2683) (#2684)

The /<database>/<table>/-/upsert API (docs) now rejects rows with null primary key values. (#1936)

Improved example in the API explorer for the /-/upsert endpoint (docs). (#1936)

The /<database>.json endpoint now includes an "ok": true key, for consistency with other JSON API responses.

call_with_supported_arguments() is now documented as a supported public API. (#2678)

Tags: python, datasette, annotated-release-notes

Gemma 4 audio with MLX

2026-04-12T23:57:53+00:00

Thanks to a tip from Rahim Nathwani, here's a uv run recipe for transcribing an audio file on macOS using the 10.28 GB Gemma 4 E2B model with MLX and mlx-vlm:

uv run --python 3.13 --with mlx_vlm --with torchvision --with gradio \
  mlx_vlm.generate \
  --model google/gemma-4-e2b-it \
  --audio file.wav \
  --prompt "Transcribe this audio" \
  --max-tokens 500 \
  --temperature 1.0


I tried it on this 14 second .wav file and it output the following:

This front here is a quick voice memo. I want to try it out with MLX VLM. Just going to see if it can be transcribed by Gemma and how that works.

(That was supposed to be "This right here..." and "... how well that works" but I can hear why it misinterpreted that as "front" and "how that works".)

Tags: python, ai, generative-ai, llms, uv, mlx, gemma, speech-to-text

asgi-gzip 0.3

2026-04-09T03:54:40+00:00

Release: asgi-gzip 0.3

I ran into trouble deploying a new feature using SSE to a production Datasette instance, and it turned out that instance was using datasette-gzip, which uses asgi-gzip, which was incorrectly compressing text/event-stream responses.

asgi-gzip was extracted from Starlette, and has a GitHub Actions scheduled workflow to check Starlette for updates that need to be ported to the library... but that action had stopped running and hence had missed Starlette's own fix for this issue.

I ran the workflow and integrated the new fix, and now datasette-gzip and asgi-gzip both correctly handle text/event-stream in SSE responses.
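Here's a minimal sketch of the kind of check involved, assuming the ASGI convention that response headers arrive as a list of (name, value) byte pairs. This is not the actual asgi-gzip or Starlette code, and should_compress is a hypothetical helper name. It just shows the idea of leaving streaming event responses uncompressed:

```python
def should_compress(headers):
    """Decide whether a response is safe to gzip (illustrative only)."""
    for name, value in headers:
        lowered = name.lower()
        if lowered == b"content-type":
            # Drop parameters such as "; charset=utf-8" before comparing
            media_type = value.split(b";", 1)[0].strip().lower()
            if media_type == b"text/event-stream":
                # Server-sent events should be passed through untouched
                return False
        if lowered == b"content-encoding":
            # Something upstream has already encoded this response
            return False
    return True


html_headers = [(b"content-type", b"text/html; charset=utf-8")]
sse_headers = [(b"Content-Type", b"text/event-stream; charset=utf-8")]
print(should_compress(html_headers), should_compress(sse_headers))
```

A real middleware has more to consider (minimum response size, the client's Accept-Encoding header and so on), so treat this as the shape of the fix and not the fix itself.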

Tags: gzip, python, asgi

llm-all-models-async 0.1

2026-03-31T20:52:02+00:00

Release: llm-all-models-async 0.1

LLM plugins can define new models in both sync and async varieties. The async variants are most common for API-backed models - sync variants tend to be things that run the model directly within the plugin.

My llm-mrchatterbox plugin is sync only. I wanted to try it out with various Datasette LLM features (specifically datasette-enrichments-llm) but Datasette can only use async models.

So... I had Claude spin up this plugin that turns sync models into async models using a thread pool. This ended up needing an extra plugin hook mechanism in LLM itself, which I shipped just now in LLM 0.30.
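The general technique looks something like this. It's a sketch using only the standard library, not the plugin's real code: slow_sync_prompt is a made-up stand-in for a blocking model call, and asyncio.to_thread() hands it to the default thread pool so the event loop stays free.

```python
import asyncio
import time


def slow_sync_prompt(prompt):
    """Hypothetical blocking model call, simulated with a short sleep."""
    time.sleep(0.05)
    return prompt.upper()


async def async_prompt(prompt):
    # Run the blocking function in a worker thread and await its result
    return await asyncio.to_thread(slow_sync_prompt, prompt)


async def main():
    # Both calls run concurrently in separate threads
    return await asyncio.gather(async_prompt("hello"), async_prompt("world"))


results = asyncio.run(main())
print(results)
```

asyncio.gather() returns results in the order the awaitables were passed in, regardless of which thread finishes first.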

Tags: async, python, llm

Python Vulnerability Lookup

2026-03-29T18:46:16+00:00

Tool: Python Vulnerability Lookup

I learned that the OSV.dev open source vulnerability database has an open CORS JSON API, so I had Claude Code build this HTML tool for pasting in a pyproject.toml or requirements.txt file (or name of a GitHub repo containing those) and seeing a list of all reported vulnerabilities from that API.
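The same lookup can be sketched in Python. This assumes OSV's documented POST /v1/query endpoint, which takes a package name, ecosystem and version. The helper names here are hypothetical and the parsing only handles simple name==version pins:

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def parse_pinned(line):
    """Return (name, version) for a 'name==version' line, else None."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if "==" not in line:
        return None
    name, version = line.split("==", 1)
    return name.strip(), version.strip()


def osv_payload(name, version):
    """Build the JSON body for a single-package OSV query."""
    return {"package": {"name": name, "ecosystem": "PyPI"}, "version": version}


def query_osv(name, version):
    """POST the query to OSV and return the list of vulnerabilities.

    Needs network access, so it is defined here but not called.
    """
    body = json.dumps(osv_payload(name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("vulns", [])


print(osv_payload(*parse_pinned("litellm==1.82.8  # pinned")))
```

Real requirements files can include extras, environment markers and version ranges, none of which this toy parser attempts to handle.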

Tags: python, security, tools, supply-chain, vibe-coding

LiteLLM Hack: Were You One of the 47,000?

2026-03-25T17:21:04+00:00

LiteLLM Hack: Were You One of the 47,000?

Daniel Hnyk used the BigQuery PyPI dataset to determine how many downloads there were of the exploited LiteLLM packages during the 46 minute period they were live on PyPI. The answer was 46,996 across the two compromised release versions (1.82.7 and 1.82.8).

They also identified 2,337 packages that depended on LiteLLM - 88% of which did not pin versions in a way that would have avoided the exploited version.

Via @hnykda

Tags: packaging, pypi, python, security, supply-chain

Package Managers Need to Cool Down

2026-03-24T21:11:38+00:00

Package Managers Need to Cool Down

Today's LiteLLM supply chain attack inspired me to revisit the idea of dependency cooldowns, the practice of only installing updated dependencies once they've been out in the wild for a few days to give the community a chance to spot if they've been subverted in some way.

This recent piece (March 4th) by Andrew Nesbitt reviews the current state of dependency cooldown mechanisms across different packaging tools. It's surprisingly well supported! There's been a flurry of activity across major packaging tools, including:

pnpm 10.16 (September 2025) — minimumReleaseAge with minimumReleaseAgeExclude for trusted packages
Yarn 4.10.0 (September 2025) — npmMinimalAgeGate (in minutes) with npmPreapprovedPackages for exemptions
Bun 1.3 (October 2025) — minimumReleaseAge via bunfig.toml
Deno 2.6 (December 2025) — --minimum-dependency-age for deno update and deno outdated
uv 0.9.17 (December 2025) — added relative duration support to existing --exclude-newer, plus per-package overrides via exclude-newer-package
pip 26.0 (January 2026) — --uploaded-prior-to (absolute timestamps only; relative duration support was requested. Update: it was added in pip 26.1 in April)
npm 11.10.0 (February 2026) — min-release-age

pip currently only supports absolute rather than relative dates but Seth Larson has a workaround for that using a scheduled cron to update the absolute date in the pip.conf config file.
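The core of that kind of workaround is turning "N days ago" into an absolute timestamp. Here's a small sketch of that calculation. The cooldown_cutoff name is made up, and the exact timestamp format and config key pip expects are not shown here, so check pip's documentation before wiring this into a scheduled job:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cooldown_cutoff(days: int, now: Optional[datetime] = None) -> str:
    """Return a UTC timestamp string for `days` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    # ISO 8601 with a trailing Z; assumed (not confirmed) to suit pip
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


# A fixed "now" keeps the example reproducible
fixed_now = datetime(2026, 3, 24, 12, 0, 0, tzinfo=timezone.utc)
print(cooldown_cutoff(7, fixed_now))
```

A scheduled job could regenerate this value daily and write it wherever the absolute date needs to live.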

Tags: javascript, packaging, pip, pypi, python, security, npm, deno, supply-chain, uv

Malicious litellm_init.pth in litellm 1.82.8 — credential stealer

2026-03-24T15:07:31+00:00

Malicious litellm_init.pth in litellm 1.82.8 — credential stealer

The LiteLLM v1.82.8 package published to PyPI was compromised with a particularly nasty credential stealer hidden in base64 in a litellm_init.pth file, which means installing the package is enough to trigger it even without running import litellm.

(1.82.7 had the exploit as well but it was in the proxy/proxy_server.py file so the package had to be imported for it to take effect.)
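The reason a .pth file is so dangerous: Python's site module processes .pth files in site-packages at interpreter startup, and any line in them that begins with import gets executed. Here's a harmless demonstration of that mechanism. It uses site.addsitedir() to trigger the same processing against a temporary directory, and the file name and environment variable are invented for the demo:

```python
import os
import site
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    pth_path = os.path.join(tmp, "demo_init.pth")
    with open(pth_path, "w") as f:
        # A line starting with "import" is executed, not treated as a path
        f.write("import os; os.environ['PTH_DEMO'] = 'ran'\n")
    # Processes every .pth file in the directory, as happens for
    # site-packages when the interpreter starts
    site.addsitedir(tmp)

pth_ran = os.environ.get("PTH_DEMO")
print(pth_ran)
```

Nothing in this script imports a module named after the .pth file. Processing the directory was enough to run the code, which is why installation alone could trigger the stealer.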

This issue has a very detailed description of what the credential stealer does. There's more information about the timeline of the exploit over here.

PyPI has already quarantined the litellm package so the window for compromise was just a few hours, but if you DID install the package it would have hoovered up a bewildering array of secrets, including ~/.ssh/, ~/.gitconfig, ~/.git-credentials, ~/.aws/, ~/.kube/, ~/.config/, ~/.azure/, ~/.docker/, ~/.npmrc, ~/.vault-token, ~/.netrc, ~/.lftprc, ~/.msmtprc, ~/.my.cnf, ~/.pgpass, ~/.mongorc.js, ~/.bash_history, ~/.zsh_history, ~/.sh_history, ~/.mysql_history, ~/.psql_history, ~/.rediscli_history, ~/.bitcoin/, ~/.litecoin/, ~/.dogecoin/, ~/.zcash/, ~/.dashcore/, ~/.ripple/, ~/.bitmonero/, ~/.ethereum/, ~/.cardano/.

It looks like this supply chain attack started with the recent exploit against Trivy, ironically a security scanner tool that was used in CI by LiteLLM. The Trivy exploit likely resulted in stolen PyPI credentials which were then used to directly publish the vulnerable packages.

Tags: open-source, pypi, python, security, supply-chain

Experimenting with Starlette 1.0 with Claude skills

2026-03-22T23:57:44+00:00

Starlette 1.0 is out! This is a really big deal. I think Starlette may be the Python framework with the highest ratio of usage to brand recognition: it's the foundation of FastAPI, which has attracted a huge amount of buzz that seems to have overshadowed Starlette itself.

Kim Christie started working on Starlette in 2018 and it quickly became my favorite out of the new breed of Python ASGI frameworks. The only reason I didn't use it as the basis for my own Datasette project was that it didn't yet promise stability, and I was determined to provide a stable API for Datasette's own plugins... albeit I still haven't been brave enough to ship my own 1.0 release (after 26 alphas and counting)!

Then in September 2025 Marcelo Trylesinski announced that Starlette and Uvicorn were transferring to their GitHub account, in recognition of their many years of contributions and to make it easier for them to receive sponsorship against those projects.

The 1.0 version has a few breaking changes compared to the 0.x series, described in the release notes for 1.0.0rc1 that came out in February.

The most notable of these is a change to how code runs on startup and shutdown. Previously that was handled by on_startup and on_shutdown parameters, but the new system uses a neat lifespan mechanism instead based around an async context manager:

import contextlib

from starlette.applications import Starlette

@contextlib.asynccontextmanager
async def lifespan(app):
    async with some_async_resource():
        print("Run at startup!")
        yield
        print("Run on shutdown!")

app = Starlette(
    routes=routes,
    lifespan=lifespan
)
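To see the ordering that pattern gives you without installing anything, here's a standard-library-only sketch. The serve() function is a made-up stand-in for what a framework does around the lifespan context manager. It is not Starlette's implementation:

```python
import asyncio
import contextlib

events = []


@contextlib.asynccontextmanager
async def lifespan(app):
    events.append("startup")  # runs before the app starts serving
    try:
        yield
    finally:
        events.append("shutdown")  # runs when the app is stopping


async def serve(app):
    """Hypothetical stand-in for a framework's serving loop."""
    async with lifespan(app):
        events.append("handling requests")


asyncio.run(serve(app=None))
print(events)
```

Everything before the yield is startup code and everything after it is shutdown code, so resources opened at the top can be cleaned up in the same function.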

If you haven't tried Starlette before it feels to me like an asyncio-native cross between Flask and Django, unsurprising since creator Kim Christie is also responsible for Django REST Framework. Crucially, this means you can write most apps as a single Python file, Flask style.

This makes it really easy for LLMs to spit out a working Starlette app from a single prompt.

There's just one problem there: if 1.0 breaks compatibility with the Starlette code that the models have been trained on, how can we have them generate code that works with 1.0?

I decided to see if I could get this working with a Skill.

Building a Skill with Claude

Regular Claude Chat on claude.ai has skills, and one of those default skills is the skill-creator skill. This means Claude knows how to build its own skills.

So I started a chat session and told it:

Clone Starlette from GitHub - it just had its 1.0 release. Build a skill markdown document for this release which includes code examples of every feature.

I didn't even tell it where to find the repo - Starlette is widely enough known that I expected it could find it on its own.

It ran git clone https://github.com/encode/starlette.git which is actually the old repository name, but GitHub handles redirects automatically so this worked just fine.

The resulting skill document looked very thorough to me... and then I noticed a new button at the top I hadn't seen before labelled "Copy to your skills". So I clicked it:

And now my regular Claude chat has access to that skill!

A task management demo app

I started a new conversation and prompted:

Build a task management app with Starlette, it should have projects and tasks and comments and labels

And Claude did exactly that, producing a simple GitHub Issues clone using Starlette 1.0, a SQLite database (via aiosqlite) and a Jinja2 template.

Claude even tested the app manually like this:

cd /home/claude/taskflow && timeout 5 python -c "
import asyncio
from database import init_db
asyncio.run(init_db())
print('DB initialized successfully')
" 2>&1

pip install httpx --break-system-packages -q \
  && cd /home/claude/taskflow && \
  python -c "
from starlette.testclient import TestClient
from main import app

client = TestClient(app)

r = client.get('/api/stats')
print('Stats:', r.json())

r = client.get('/api/projects')
print('Projects:', len(r.json()), 'found')

r = client.get('/api/tasks')
print('Tasks:', len(r.json()), 'found')

r = client.get('/api/labels')
print('Labels:', len(r.json()), 'found')

r = client.get('/api/tasks/1')
t = r.json()
print(f'Task 1: \"{t[\"title\"]}\" - {len(t[\"comments\"])} comments, {len(t[\"labels\"])} labels')

r = client.post('/api/tasks', json={'title':'Test task','project_id':1,'priority':'high','label_ids':[1,2]})
print('Created task:', r.status_code, r.json()['title'])

r = client.post('/api/comments', json={'task_id':1,'content':'Test comment'})
print('Created comment:', r.status_code)

r = client.get('/')
print('Homepage:', r.status_code, '- length:', len(r.text))

print('\nAll tests passed!')
"

For all of the buzz about Claude Code, it's easy to overlook that Claude itself counts as a coding agent now, fully able to both write and then test the code that it is writing.

Here's what the resulting app looked like. The code is here in my research repository.

Tags: open-source, python, ai, asgi, kim-christie, generative-ai, llms, ai-assisted-programming, claude, coding-agents, skills, agentic-engineering, starlette

Thoughts on OpenAI acquiring Astral and uv/ruff/ty

2026-03-19T16:45:15+00:00

The big news this morning: Astral to join OpenAI (on the Astral blog) and OpenAI to acquire Astral (the OpenAI announcement). Astral are the company behind uv, ruff, and ty - three increasingly load-bearing open source projects in the Python ecosystem. I have thoughts!

The official line from OpenAI and Astral

The Astral team will become part of the Codex team at OpenAI.

Charlie Marsh has this to say:

Open source is at the heart of that impact and the heart of that story; it sits at the center of everything we do. In line with our philosophy and OpenAI's own announcement, OpenAI will continue supporting our open source tools after the deal closes. We'll keep building in the open, alongside our community -- and for the broader Python ecosystem -- just as we have from the start. [...]

After joining the Codex team, we'll continue building our open source tools, explore ways they can work more seamlessly with Codex, and expand our reach to think more broadly about the future of software development.

OpenAI's message has a slightly different focus (highlights mine):

As part of our developer-first philosophy, after closing OpenAI plans to support Astral’s open source products. By bringing Astral’s tooling and engineering expertise to OpenAI, we will accelerate our work on Codex and expand what AI can do across the software development lifecycle.

This is a slightly confusing message. The Codex CLI is a Rust application, and Astral have some of the best Rust engineers in the industry - BurntSushi alone (Rust regex, ripgrep, jiff) may be worth the price of acquisition!

So is this about the talent or about the product? I expect both, but I know from past experience that a product+talent acquisition can turn into a talent-only acquisition later on.

uv is the big one

Of Astral's projects the most impactful is uv. If you're not familiar with it, uv is by far the most convincing solution to Python's environment management problems, best illustrated by this classic XKCD:

Switch from python to uv run and most of these problems go away. I've been using it extensively for the past couple of years and it's become an essential part of my workflow.

I'm not alone in this. According to PyPI Stats uv was downloaded more than 126 million times last month! Since its release in February 2024 - just two years ago - it's become one of the most popular tools for running Python code.

Ruff and ty

Astral's two other big projects are ruff - a Python linter and formatter - and ty - a fast Python type checker.

These are popular tools that provide a great developer experience but they aren't load-bearing in the same way that uv is.

They do however resonate well with coding agent tools like Codex - giving an agent access to fast linting and type checking tools can help improve the quality of the code they generate.

I'm not convinced that integrating them into the coding agent itself as opposed to telling it when to run them will make a meaningful difference, but I may just not be imaginative enough here.

What of pyx?

Ever since uv started to gain traction the Python community has been worrying about the strategic risk of a single VC-backed company owning a key piece of Python infrastructure. I wrote about one of those conversations in detail back in September 2024.

The conversation back then focused on what Astral's business plan could be, which started to take form in August 2025 when they announced pyx, their private PyPI-style package registry for organizations.

I'm less convinced that pyx makes sense within OpenAI, and it's notably absent from both the Astral and OpenAI announcement posts.

Competitive dynamics

An interesting aspect of this deal is how it might impact the competition between Anthropic and OpenAI.

Both companies spent most of 2025 focused on improving the coding ability of their models, resulting in the November 2025 inflection point when coding agents went from often-useful to almost-indispensable tools for software development.

The competition between Anthropic's Claude Code and OpenAI's Codex is fierce. Those $200/month subscriptions add up to billions of dollars a year in revenue, for companies that very much need that money.

Anthropic acquired the Bun JavaScript runtime in December 2025, an acquisition that looks somewhat similar in shape to Astral.

Bun was already a core component of Claude Code and that acquisition looked to mainly be about ensuring that a crucial dependency stayed actively maintained. Claude Code's performance has increased significantly since then thanks to the efforts of Bun's Jarred Sumner.

One bad version of this deal would be if OpenAI start using their ownership of uv as leverage in their competition with Anthropic.

Astral's quiet series A and B

One detail that caught my eye from Astral's announcement, in the section thanking the team, investors, and community:

Second, to our investors, especially Casey Aylward from Accel, who led our Seed and Series A, and Jennifer Li from Andreessen Horowitz, who led our Series B. As a first-time, technical, solo founder, you showed far more belief in me than I ever showed in myself, and I will never forget that.

As far as I can tell neither the Series A nor the Series B were previously announced - I've only been able to find coverage of the original seed round from April 2023.

Those investors presumably now get to exchange their stake in Astral for a piece of OpenAI. I wonder how much influence they had on Astral's decision to sell.

Forking as a credible exit?

Armin Ronacher built Rye, which was later taken over by Astral and effectively merged with uv. In August 2024 he wrote about the risk involved in a VC-backed company owning a key piece of open source infrastructure and said the following (highlight mine):

However having seen the code and what uv is doing, even in the worst possible future this is a very forkable and maintainable thing. I believe that even in case Astral shuts down or were to do something incredibly dodgy licensing wise, the community would be better off than before uv existed.

Astral's own Douglas Creager emphasized this angle on Hacker News today:

All I can say is that right now, we're committed to maintaining our open-source tools with the same level of effort, care, and attention to detail as before. That does not change with this acquisition. No one can guarantee how motives, incentives, and decisions might change years down the line. But that's why we bake optionality into it with the tools being permissively licensed. That makes the worst-case scenarios have the shape of "fork and move on", and not "software disappears forever".

I like and trust the Astral team and I'm optimistic that their projects will be well-maintained in their new home.

OpenAI don't yet have much of a track record with respect to acquiring and maintaining open source projects. They've been on a bit of an acquisition spree over the past three months though, snapping up Promptfoo and OpenClaw (sort-of, they hired creator Peter Steinberger and are spinning OpenClaw off to a foundation), plus closed source LaTeX platform Crixet (now Prism).

If things do go south for uv and the other Astral projects we'll get to see how credible the forking exit strategy turns out to be.

Tags: python, ai, rust, openai, ruff, uv, astral, charlie-marsh, coding-agents, codex, ty

Quoting Ken Jin

2026-03-17T21:48:26+00:00

Great news—we’ve hit our (very modest) performance goals for the CPython JIT over a year early for macOS AArch64, and a few months early for x86_64 Linux. The 3.15 alpha JIT is about 11-12% faster on macOS AArch64 than the tail calling interpreter, and 5-6% faster than the standard interpreter on x86_64 Linux.

— Ken Jin, Python 3.15’s JIT is now back on track

Tags: python

Coding agents for data analysis

2026-03-16T20:12:32+00:00

Coding agents for data analysis

Here's the handout I prepared for my NICAR 2026 workshop "Coding agents for data analysis" - a three hour session aimed at data journalists demonstrating ways that tools like Claude Code and OpenAI Codex can be used to explore, analyze and clean data.

Here's the table of contents:

Coding agents

Warmup: ChatGPT and Claude

Setup Claude Code and Codex

Asking questions against a database

Exploring data with agents

Cleaning data: decoding neighborhood codes

Creating visualizations with agents

Scraping data with agents

I ran the workshop using GitHub Codespaces and OpenAI Codex, since it was easy (and inexpensive) to distribute a budget-restricted API key for Codex that attendees could use during the class. Participants ended up burning $23 of Codex tokens.

The exercises all used Python and SQLite and some of them used Datasette.

One highlight of the workshop was when we started running Datasette such that it served static content from a viz/ folder, then had Claude Code start vibe coding new interactive visualizations directly in that folder. Here's a heat map it created for my trees database using Leaflet and Leaflet.heat, source code here.

I designed the handout to also be useful for people who weren't able to attend the session in person. As is usually the case, material aimed at data journalists is equally applicable to anyone else with data to explore.

Tags: data-journalism, geospatial, python, speaking, sqlite, ai, datasette, generative-ai, llms, github-codespaces, nicar, coding-agents, claude-code, codex, leaflet

Quoting Jannis Leidel

2026-03-14T18:41:25+00:00

GitHub’s slopocalypse – the flood of AI-generated spam PRs and issues – has made Jazzband’s model of open membership and shared push access untenable.

Jazzband was designed for a world where the worst case was someone accidentally merging the wrong PR. In a world where only 1 in 10 AI-generated PRs meets project standards, where curl had to shut down its bug bounty because confirmation rates dropped below 5%, and where GitHub’s own response was a kill switch to disable pull requests entirely – an organization that gives push access to everyone who joins simply can’t operate safely anymore.

— Jannis Leidel, Sunsetting Jazzband

Tags: github, open-source, python, ai, ai-ethics

An AI agent coding skeptic tries AI agent coding, in excessive detail

2026-02-27T20:43:41+00:00

An AI agent coding skeptic tries AI agent coding, in excessive detail

Another in the genre of "OK, coding agents got good in November" posts, this one is by Max Woolf and is very much worth your time. He describes a sequence of coding agent projects, each more ambitious than the last - starting with simple YouTube metadata scrapers and eventually evolving to this:

It would be arrogant to port Python's scikit-learn — the gold standard of data science and machine learning libraries — to Rust with all the features that implies.

But that's unironically a good idea so I decided to try and do it anyways. With the use of agents, I am now developing rustlearn (extreme placeholder name), a Rust crate that implements not only the fast implementations of the standard machine learning algorithms such as logistic regression and k-means clustering, but also includes the fast implementations of the algorithms above: the same three step pipeline I describe above still works even with the more simple algorithms to beat scikit-learn's implementations.

Max also captures the frustration of trying to explain how good the models have got to an existing skeptical audience:

The real annoying thing about Opus 4.6/Codex 5.3 is that it’s impossible to publicly say “Opus 4.5 (and the models that came after it) are an order of magnitude better than coding LLMs released just months before it” without sounding like an AI hype booster clickbaiting, but it’s the counterintuitive truth to my personal frustration. I have been trying to break this damn model by giving it complex tasks that would take me months to do by myself despite my coding pedigree but Opus and Codex keep doing them correctly.

A throwaway remark in this post inspired me to ask Claude Code to build a Rust word cloud CLI tool, which it happily did.

Tags: python, ai, rust, max-woolf, generative-ai, llms, ai-assisted-programming, coding-agents, agentic-engineering, november-2025-inflection

Em dash

2026-02-15T21:40:46+00:00

I'm occasionally accused of using LLMs to write the content on my blog. I don't do that, and I don't think my writing has much of an LLM smell to it... with one notable exception:

    # Finally, do em dashes
    s = s.replace(' - ', u'\u2014')

That code to add em dashes to my posts dates back to at least 2015 when I ported my blog from an older version of Django (in a long-lost Mercurial repository) and started afresh on GitHub.

Tags: blogging, python, typography, ai, generative-ai, llms

cysqlite - a new sqlite driver

2026-02-11T17:34:40+00:00

cysqlite - a new sqlite driver

Charles Leifer has been maintaining pysqlite3 - a fork of the Python standard library's sqlite3 module that makes it much easier to run upgraded SQLite versions - since 2018.

He's been working on a ground-up Cython rewrite called cysqlite for almost as long, but it's finally at a stage where it's ready for people to try out.

The biggest change from the sqlite3 module involves transactions. Charles explains his discomfort with the sqlite3 implementation at length - that library provides two different variants, neither of which exactly matches the autocommit mechanism in SQLite itself.

I'm particularly excited about the support for custom virtual tables, a feature I'd love to see in sqlite3 itself.

cysqlite provides a Python extension compiled from C, which means it normally wouldn't be available in Pyodide. I set Claude Code on it (here's the prompt) and it built me cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl, a 688KB wheel file with a WASM build of the library that can be loaded into Pyodide like this:

import micropip
await micropip.install(
    "https://simonw.github.io/research/cysqlite-wasm-wheel/cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl"
)
import cysqlite
print(cysqlite.connect(":memory:").execute(
    "select sqlite_version()"
).fetchone())

(I also learned that wheels like this have to be built for the emscripten version used by that edition of Pyodide - my experimental wheel loads in Pyodide 0.25.1 but fails in 0.27.5 with a Wheel was built with Emscripten v3.1.46 but Pyodide was built with Emscripten v3.1.58 error.)
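The Emscripten version is visible in the wheel filename itself, as the last dash-separated component (the platform tag). Here's a small sketch that pulls it out. emscripten_version is a hypothetical helper, and it only recognizes the emscripten_X_Y_Z_wasm32 pattern seen in the filenames in this post:

```python
import re


def emscripten_version(wheel_filename):
    """Return (major, minor, patch) from an emscripten platform tag, or None."""
    # Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl
    platform_tag = wheel_filename.removesuffix(".whl").split("-")[-1]
    match = re.fullmatch(r"emscripten_(\d+)_(\d+)_(\d+)_wasm32", platform_tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


wheel = "cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl"
print(emscripten_version(wheel))
```

Comparing that tuple against the Emscripten version of a given Pyodide release would flag a mismatch before the wheel is loaded.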

You can try my wheel in this new Pyodide REPL I had Claude build as a mobile-friendly alternative to Pyodide's own hosted console.

I also had Claude build this demo page that executes the original test suite in the browser and displays the results:

Via lobste.rs

Tags: python, sqlite, charles-leifer, webassembly, pyodide, ai-assisted-programming, claude-code

Running Pydantic's Monty Rust sandboxed Python subset in WebAssembly

2026-02-06T22:31:31+00:00

There's a jargon-filled headline for you! Everyone's building sandboxes for running untrusted code right now, and Pydantic's latest attempt, Monty, provides a custom Python-like language (a subset of Python) in Rust and makes it available as both a Rust library and a Python package. I got it working in WebAssembly, providing a sandbox-in-a-sandbox.

Here's how they describe Monty:

Monty avoids the cost, latency, complexity and general faff of using full container based sandbox for running LLM generated code.

Instead, it lets you safely run Python code written by an LLM embedded in your agent, with startup times measured in single digit microseconds not hundreds of milliseconds.

What Monty can do:

Run a reasonable subset of Python code - enough for your agent to express what it wants to do

Completely block access to the host environment: filesystem, env variables and network access are all implemented via external function calls the developer can control

Call functions on the host - only functions you give it access to [...]

A quick way to try it out is via uv:

uv run --with pydantic-monty python -m asyncio

Then paste this into the Python interactive prompt - the -m asyncio enables top-level await:

import pydantic_monty
code = pydantic_monty.Monty('print("hello " + str(4 * 5))')
await pydantic_monty.run_monty_async(code)

Monty supports a very small subset of Python - it doesn't even support class declarations yet!

But, given its target use-case, that's not actually a problem.

The neat thing about providing tools like this for LLMs is that they're really good at iterating against error messages. A coding agent can run some Python code, get an error message telling it that classes aren't supported and then try again with a different approach.

I wanted to try this in a browser, so I fired up a code research task in Claude Code for web and kicked it off with the following:

Clone https://github.com/pydantic/monty to /tmp and figure out how to compile it into a python WebAssembly wheel that can then be loaded in Pyodide. The wheel file itself should be checked into the repo along with build scripts and passing pytest playwright test scripts that load Pyodide from a CDN and the wheel from a “python -m http.server” localhost and demonstrate it working

Then a little later:

I want an additional WASM file that works independently of Pyodide, which is also usable in a web browser - build that too along with playwright tests that show it working. Also build two HTML files - one called demo.html and one called pyodide-demo.html - these should work similar to https://tools.simonwillison.net/micropython (download that code with curl to inspect it) - one should load the WASM build, the other should load Pyodide and have it use the WASM wheel. These will be served by GitHub Pages so they can load the WASM and wheel from a relative path since the .html files will be served from the same folder as the wheel and WASM file

Here's the transcript, and the final research report it produced.

I now have the Monty Rust code compiled to WebAssembly in two different shapes - as a .wasm bundle you can load and call from JavaScript, and as a monty-wasm-pyodide/pydantic_monty-0.0.3-cp313-cp313-emscripten_4_0_9_wasm32.whl wheel file which can be loaded into Pyodide and then called from Python in Pyodide in WebAssembly in a browser.

Here are those two demos, hosted on GitHub Pages:

Monty WASM demo - a UI over JavaScript that loads the Rust WASM module directly.
Monty Pyodide demo - this one provides an identical interface but here the code is loading Pyodide and then installing the Monty WASM wheel.

As a connoisseur of sandboxes - the more options the better! - this new entry from Pydantic ticks a lot of my boxes. It's small, fast, widely available (thanks to Rust and WebAssembly) and provides strict limits on memory usage, CPU time and access to disk and network.

It was also a great excuse to spin up another demo showing how easy it is these days to turn compiled code like C or Rust into WebAssembly that runs in both a browser and a Pyodide environment.

Tags: javascript, python, sandboxing, ai, rust, webassembly, pyodide, generative-ai, llms, ai-assisted-programming, pydantic, coding-agents, claude-code

Distributing Go binaries like sqlite-scanner through PyPI using go-to-wheel

2026-02-04T14:59:47+00:00

I've been exploring Go for building small, fast and self-contained binary applications recently. I'm enjoying how there's generally one obvious way to do things and the resulting code is boring and readable - and something that LLMs are very competent at writing. The one catch is distribution, but it turns out publishing Go binaries to PyPI means any Go binary can be just a uvx package-name call away.

sqlite-scanner

sqlite-scanner is my new Go CLI tool for scanning a filesystem for SQLite database files.

It works by checking if the first 16 bytes of the file exactly match the SQLite magic number sequence SQLite format 3\x00. It can search one or more folders recursively, spinning up concurrent goroutines to accelerate the scan. It streams out results as it finds them in plain text, JSON or newline-delimited JSON. It can optionally display the file sizes as well.
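To make that check concrete, here is a minimal Python sketch of the same idea. It is not the Go implementation: the names is_sqlite_file and find_sqlite_files are made up for illustration, and it scans sequentially rather than concurrently.

```python
from pathlib import Path

# The 16-byte header described above: "SQLite format 3" plus a NUL byte.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the first 16 bytes of the file match the magic sequence."""
    try:
        with open(path, "rb") as f:
            return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    except OSError:
        # Unreadable files (permissions, broken symlinks) are simply skipped.
        return False


def find_sqlite_files(root):
    """Recursively yield paths under root that look like SQLite databases."""
    for path in Path(root).rglob("*"):
        if path.is_file() and is_sqlite_file(path):
            yield path
```

Calling find_sqlite_files(".") walks the current directory, which mirrors the tool's default behaviour in a much slower, single-threaded form.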

To try it out you can download a release from the GitHub releases - and then jump through macOS hoops to execute an "unsafe" binary. Or you can clone the repo and compile it with Go. Or... you can run the binary like this:

uvx sqlite-scanner

By default this will search your current directory for SQLite databases. You can pass one or more directories as arguments:

uvx sqlite-scanner ~ /tmp

Add --json for JSON output, --size to include file sizes or --jsonl for newline-delimited JSON. Here's a demo:

uvx sqlite-scanner ~ --jsonl --size

If you haven't been uv-pilled yet you can instead install sqlite-scanner using pip install sqlite-scanner and then run sqlite-scanner.

To get a permanent copy with uv use uv tool install sqlite-scanner.

How the Python package works

The reason this is worth doing is that pip, uv and PyPI will work together to identify the correct compiled binary for your operating system and architecture.

This is driven by file names. If you visit the PyPI downloads for sqlite-scanner you'll see the following files:

sqlite_scanner-0.1.1-py3-none-win_arm64.whl
sqlite_scanner-0.1.1-py3-none-win_amd64.whl
sqlite_scanner-0.1.1-py3-none-musllinux_1_2_x86_64.whl
sqlite_scanner-0.1.1-py3-none-musllinux_1_2_aarch64.whl
sqlite_scanner-0.1.1-py3-none-manylinux_2_17_x86_64.whl
sqlite_scanner-0.1.1-py3-none-manylinux_2_17_aarch64.whl
sqlite_scanner-0.1.1-py3-none-macosx_11_0_arm64.whl
sqlite_scanner-0.1.1-py3-none-macosx_10_9_x86_64.whl

When I run pip install sqlite-scanner or uvx sqlite-scanner on my Apple Silicon Mac laptop Python's packaging magic ensures I get that macosx_11_0_arm64.whl variant.
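The information installers act on is all in those file names. As a rough illustration, this sketch splits a wheel file name into its tags. The WheelName type and parse_wheel_filename function are hypothetical names for this example, and real installers go much further, ranking each tag against the list of tags the current interpreter and platform support.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split name-version[-build]-python-abi-platform.whl into its parts."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


name = parse_wheel_filename("sqlite_scanner-0.1.1-py3-none-macosx_11_0_arm64.whl")
print(name.python_tag, name.abi_tag, name.platform_tag)
```

The py3-none part says the wheel works with any Python 3 and has no ABI requirement, so the platform tag is the only thing that differs between the eight files.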

Here's what's in the wheel, which is a zip file with a .whl extension.
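Because a wheel is an ordinary zip archive, you can inspect one yourself with Python's standard library. A small sketch, where list_wheel_contents is just an illustrative helper name:

```python
import zipfile


def list_wheel_contents(wheel_path):
    """Return (filename, size in bytes) for every member of a wheel."""
    with zipfile.ZipFile(wheel_path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]
```

Point it at any .whl file you have downloaded, for example one of the files produced in a dist/ folder.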

In addition to the bin/sqlite-scanner the most important file is sqlite_scanner/__init__.py which includes the following:

import os
import stat
import subprocess
import sys


def get_binary_path():
    """Return the path to the bundled binary."""
    binary = os.path.join(os.path.dirname(__file__), "bin", "sqlite-scanner")

    # Ensure binary is executable on Unix
    if sys.platform != "win32":
        current_mode = os.stat(binary).st_mode
        if not (current_mode & stat.S_IXUSR):
            os.chmod(binary, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    return binary


def main():
    """Execute the bundled binary."""
    binary = get_binary_path()

    if sys.platform == "win32":
        # On Windows, use subprocess to properly handle signals
        sys.exit(subprocess.call([binary] + sys.argv[1:]))
    else:
        # On Unix, exec replaces the process
        os.execvp(binary, [binary] + sys.argv[1:])

That main() function - also called from sqlite_scanner/__main__.py - locates the binary and executes it when the Python package itself is executed, using the sqlite-scanner = sqlite_scanner:main entry point defined in the wheel.

Which means we can use it as a dependency

Using PyPI as a distribution platform for Go binaries feels a tiny bit abusive, although there is plenty of precedent.

I’ll justify it by pointing out that this means we can use Go binaries as dependencies for other Python packages now.

That's genuinely useful! It means that any functionality which is available in a cross-platform Go binary can now be subsumed into a Python package. Python is really good at running subprocesses so this opens up a whole world of useful tricks that we can bake into our Python tools.

To demonstrate this, I built datasette-scan - a new Datasette plugin which depends on sqlite-scanner and then uses that Go binary to scan a folder for SQLite databases and attach them to a Datasette instance.

Here's how to use that (without even installing anything first, thanks uv) to explore any SQLite databases in your Downloads folder:

uv run --with datasette-scan datasette scan ~/Downloads

If you peek at the code you'll see it depends on sqlite-scanner in pyproject.toml and calls it using subprocess.run() against sqlite_scanner.get_binary_path() in its own scan_directories() function.
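The general shape of that pattern looks something like the sketch below. This is not the actual datasette-scan code: the helper names are invented, the binary path is passed in as an argument so the example does not need sqlite-scanner installed, and each line of --jsonl output is treated as an opaque JSON value without assuming which keys it contains.

```python
import json
import subprocess


def build_scan_command(binary, directories):
    """Build the argument list: the binary, the directories, then --jsonl."""
    return [binary, *directories, "--jsonl"]


def parse_jsonl(output):
    """Parse newline-delimited JSON, ignoring blank lines."""
    return [json.loads(line) for line in output.splitlines() if line.strip()]


def scan_directories(binary, directories):
    """Run the scanner binary and return its parsed results."""
    result = subprocess.run(
        build_scan_command(binary, directories),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the binary exits non-zero
    )
    return parse_jsonl(result.stdout)
```

In a package that depends on sqlite-scanner, the binary argument would come from sqlite_scanner.get_binary_path().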

I've been exploring this pattern for other, non-Go binaries recently - here's a recent script that depends on static-ffmpeg to ensure that ffmpeg is available for the script to use.

Building Python wheels from Go packages with go-to-wheel

After trying this pattern myself a couple of times I realized it would be useful to have a tool to automate the process.

I first brainstormed with Claude to check that there was no existing tool to do this. It pointed me to maturin bin which helps distribute Rust projects using Python wheels, and pip-binary-factory which bundles all sorts of other projects, but did not identify anything that addressed the exact problem I was looking to solve.

So I had Claude Code for web build the first version, then refined the code locally on my laptop with the help of more Claude Code and a little bit of OpenAI Codex too, just to mix things up.

The full documentation is in the simonw/go-to-wheel repository. I've published that tool to PyPI so now you can run it using:

uvx go-to-wheel --help

The sqlite-scanner package you can see on PyPI was built using go-to-wheel like this:

uvx go-to-wheel ~/dev/sqlite-scanner \
  --set-version-var main.version \
  --version 0.1.1 \
  --readme README.md \
  --author 'Simon Willison' \
  --url https://github.com/simonw/sqlite-scanner \
  --description 'Scan directories for SQLite databases'

This created a set of wheels in the dist/ folder. I tested one of them like this:

uv run --with dist/sqlite_scanner-0.1.1-py3-none-macosx_11_0_arm64.whl \
  sqlite-scanner --version

When that spat out the correct version number I was confident everything had worked as planned, so I pushed the whole set of wheels to PyPI using twine upload like this:

uvx twine upload dist/*

I had to paste in a PyPI API token I had saved previously.

I expect to use this pattern a lot

sqlite-scanner is very clearly meant as a proof-of-concept for this wider pattern - Python is very much capable of recursively crawling a directory structure looking for files that start with a specific byte prefix on its own!

That said, I think there's a lot to be said for this pattern. Go is a great complement to Python - it's fast, compiles to small self-contained binaries, has excellent concurrency support and a rich ecosystem of libraries.

Go is similar to Python in that it has a strong standard library. Go is particularly good for HTTP tooling - I've built several HTTP proxies in the past using Go's excellent net/http/httputil.ReverseProxy handler.

I've also been experimenting with wazero, Go's robust and mature zero-dependency WebAssembly runtime, as part of my ongoing quest for the ideal sandbox for running untrusted code. Here's my latest experiment with that library.

Being able to seamlessly integrate Go binaries into Python projects without the end user having to think about Go at all - they pip install and everything Just Works - feels like a valuable addition to my toolbox.

Tags: go, packaging, projects, pypi, python, sqlite, datasette, ai-assisted-programming, uv