Simon Willison on claude-mythos-fable

30 posts tagged “claude-mythos-fable”

Mythos-class models from Anthropic, including Claude Fable.

2026

Tool: SQLite Query Explainer

Julia Evans, in Learning a few things about running SQLite:

Maybe one day I’ll learn to read a query plan.

Big same... which inspired me to have Fable build this interactive explain tool, which runs SQLite in Python in Pyodide in WebAssembly in the browser and adds a layer of explanation to the results of both EXPLAIN and EXPLAIN QUERY PLAN.

Approach with caution, since I don't know enough about SQLite query plans to verify the results myself, but it seems cromulent enough to me.
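For readers who want to see the raw material such a tool works from, here is a minimal sketch using Python's standard `sqlite3` module. The `users` table, the `idx_users_email` index and the `query_plan` helper are made up for illustration and are not taken from the tool itself. The exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

# Hypothetical schema, used only to produce two contrasting query plans.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT);
    CREATE INDEX idx_users_email ON users (email);
""")


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' text for each step of the plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # Each row is (id, parent, notused, detail).
    return [row[3] for row in rows]


# Filtering on the indexed column should produce a SEARCH ... USING INDEX step.
indexed = query_plan(conn, "SELECT * FROM users WHERE email = ?", ("a@example.com",))
# Filtering on an unindexed column should fall back to a full table SCAN.
unindexed = query_plan(conn, "SELECT * FROM users WHERE name = ?", ("Ada",))

print(indexed)
print(unindexed)
```

Plain `EXPLAIN` (without `QUERY PLAN`) returns the lower-level bytecode program instead, which is considerably harder to read unaided.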

18th Jul 2026, 5:19 pm · sql, sqlite, tools, julia-evans, pyodide, claude-mythos-fable

Claude make Fable 5 permanent. An update from the @claudeai account on Twitter:

Beginning July 20, Claude Fable 5 will be included in all Max and Team Premium plans, at 50% of limits.

Pro and Team Standard users will continue to have access to Fable via usage credits, and will receive a one-time $100 credit.

As I was saying last week, the competition from GPT-5.6 Sol (and maybe to a lesser extent Kimi 3) made untenable Anthropic's plan to remove Fable 5 from their subscription accounts and make it available exclusively through API pricing.

Why pay $100 or $200/month for a subscription plan that doesn't include Anthropic's best model?

Their original plan was driven by concerns over compute capacity. I wonder if they'll have to dial back their training efforts in order to make more GPUs available to help serve the model.

A lot of people were losing sleep over trying to make the most of Fable 5 before subscriber access was withdrawn. It's nice not to have to worry about the Fablepocalypse any more.

Update: Important to note that users on the $20/month plan will still not have access to Fable 5 on that subscription. The Max plans are $100 and $200/month.

# 18th July 2026, 6 am / ai, generative-ai, llms, anthropic, claude, llm-pricing, claude-mythos-fable

Firefox in WebAssembly (via) This is absurdly cool: Puter compiled Firefox to WebAssembly such that the whole browser runs in another browser.

Here's my blog, running in Firefox, running in WebAssembly, running in Chrome:

[Screenshot: A Chrome window. The tab has the Firefox UI and has loaded my blog. On the right is the Chrome network panel showing that it loaded resources that include a 233MB gecko.wasm and an 18MB chrome-assets.tar.zst]

They chose Firefox/Gecko because it has strong single-process support. The project used an estimated $25,000 worth of Claude Opus and Fable tokens, but took advantage of a Claude Max subscription plan so cost much less in actual dollars.

The demo funnels all traffic over a WebSocket protocol (using the Wisp protocol) through Puter's server - a requirement to get this kind of thing to work because code running in browsers can't open arbitrary network connections.

(That proxying sounds expensive! The team had to scale the servers up to handle the traffic during the Hacker News conversation about the project.)

Puter claim this supports end-to-end encryption and that looks to be true - I inspected the WebSocket messages and traffic to my own HTTPS site was encrypted whereas requests and responses to http://www.example.com/ were in cleartext.

Here's the repo for firefox-wasm. theogbob/WebkitWasm is a similar project that compiles WebKit to WASM, but that one doesn't currently have an accessible online demo.

# 16th July 2026, 11:34 pm / browsers, firefox, ai, webassembly, generative-ai, llms, ai-assisted-programming, claude, claude-mythos-fable

One of the consequences of GPT-5.6 Sol being clearly a Fable/Mythos class model is that Anthropic have, once again, bumped the date that Fable stops being available in their Claude Max plans:

We're extending Claude Fable 5 access on all paid plans, as well as keeping Claude Code’s weekly rate limits 50% higher, through July 19.

As before, you can use up to half of your weekly usage limit on Fable 5. After that, you can continue using Fable 5 with usage credits, or switch to another model to keep working within your remaining limits.

Anthropic's original rationale for this was compute constraints - they wanted a better idea of both demand and compute availability before committing to keeping the new model cheap for subscribers.

OpenAI appear confident that they won't need to restrict access to GPT-5.6 in the same way. Here's Thibault Sottiaux this morning:

The last 48 hours of Codex and ChatGPT Work have been intense! Three important updates:

Temporarily removing the 5 hour usage limit restriction for all Plus, Business and Pro plans

Rolling out changes that will make GPT 5.6 Sol more efficient across the board and that will be reflected in less usage being used so that it can take you further. Exact impact to be quantified and shared

We hit 6M active users, and are landing a usage reset in the next hour

At this point I think Anthropic should change track and keep Fable permanently available on those plans. OpenAI are winning users simply due to the uncertainty that surrounds Fable access.

# 12th July 2026, 9:20 pm / ai, openai, generative-ai, llms, anthropic, llm-pricing, gpt, claude-mythos-fable

Rewriting Bun in Rust (via) Jarred Sumner has been promising this blog post (since May 9th) about his Zig to Rust rewrite of Bun for significantly longer than it took him to finish the rewrite.

Honestly, it was worth the wait. This is a detailed description of an extremely sophisticated piece of agentic engineering, featuring dynamic workflows, trial runs, adversarial review and all sorts of other interesting tricks.

Jarred spends the first half of the post praising Zig for getting Bun this far. Then we get to a core idea in the piece, emphasis mine:

Our bugfix list felt bad and I was tired of going to sleep worrying about crashes in Bun. I don't blame Zig for that - other users of Zig don't have the bugs we had, and mixing GC with manually-managed memory is an uncommon enough thing for software to need that no language really designs for it. We wouldn't have gotten this far if not for Zig, and I'll always be grateful. Until very recently, programming language choice was a one-way decision for a project like Bun.

Everyone knows you should never stop the world and rewrite a large piece of software from the ground up. Joel Spolsky highlighted that in Things You Should Never Do, Part I back in April 2000!

Coding agents powered by today's frontier models change that equation.

Why pick Rust? It all came down to those challenges with memory management:

A large percentage of bugs from that list are use-after-free, double-free, and "forgot to free" in an error path. In safe Rust, these are compiler errors and RAII-like automatic cleanup with Drop.

A crucial enabling factor for the rewrite was that the Bun test suite was written in TypeScript, which meant it could act as a conformance suite. This allowed an agent harness to automate much of the initial port from Zig to Rust, initially as an experiment to try out an earlier version of the model we now have access to as Mythos/Fable.

At first, I didn't expect it to work. A few days in, a high % of the test suite started passing and I saw how much the new Rust code matched up with the original Zig codebase. My opinion went from "this is worth trying" to "I'm going to merge this". [...]

For most of those 11 days (and after), I monitored workflows - manually reading the outputs to check for issues and bugs, and prompting Claude to edit the loop to fix things.

How do you review a PR with +1 million lines added? How do you start to build the confidence needed to responsibly merge large quantities of LLM-authored code?

A language-independent test suite with a million assertions, adversarial code review and when something does go wrong, fixing the process that generates the code instead of hand-fixing the code.

The new implementation of Bun has been live in Claude Code for nearly a month now:

Claude Code v2.1.181 (released June 17th) and later use the Rust port of Bun. Startup got 10% faster on Linux but otherwise, barely anyone noticed. Boring is good.

A perk of working at Anthropic is that you don't have to pay for your tokens - handy when the estimated cost is $165,000!

Pre-merge, this took 5.9 billion uncached input tokens, 690 million output tokens, and 72 billion cached input token reads — around $165,000 at API pricing.
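The arithmetic behind a figure like that can be sketched as follows. The token counts come from the quote above, but the per-million-token rates are HYPOTHETICAL: they were picked because they land close to the quoted total, and they are not confirmed pricing.

```python
# Token counts quoted in the post.
TOKENS = {
    "input": 5_900_000_000,        # uncached input tokens
    "output": 690_000_000,         # output tokens
    "cache_read": 72_000_000_000,  # cached input token reads
}

# ASSUMED rates in USD per million tokens. These are illustrative
# placeholders, not published prices.
ASSUMED_RATES = {"input": 10.00, "output": 50.00, "cache_read": 1.00}


def estimate_cost(tokens, rates):
    """Sum tokens-in-millions multiplied by rate for each token category."""
    return sum(tokens[kind] / 1_000_000 * rates[kind] for kind in tokens)


total = estimate_cost(TOKENS, ASSUMED_RATES)
print(f"${total:,.0f}")  # prints $165,500
```

Under these assumed rates the cached reads are the largest single line item, even though each one costs a fraction of an uncached input token.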

This whole thing is a fascinating case study in taking on wildly ambitious projects with the help of coordinated parallel agents.

# 8th July 2026, 11:57 pm / ai, rust, zig, generative-ai, llms, ai-assisted-programming, anthropic, bun, conformance-suites, agentic-engineering, claude-mythos-fable, jarred-sumner

sqlite-utils 4.0, now with database schema migrations

This morning I released sqlite-utils 4.0, the 124th release of that project and the first major version bump since 3.0 in November 2020. In addition to some small but significant breaking changes (described in this upgrade guide), this version introduces three major features: database migrations, nested transactions (via a new db.atomic() method), and support for compound foreign keys.
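Nested transactions in SQLite are generally built on SAVEPOINT. The following is a minimal sketch of that idea using only the standard library. It is not the sqlite-utils implementation, and the `savepoint` helper is a hypothetical name invented for this example.

```python
import sqlite3
from contextlib import contextmanager
from itertools import count

_savepoint_ids = count(1)


@contextmanager
def savepoint(conn):
    """Hypothetical helper: run a block inside a uniquely named SAVEPOINT."""
    name = f"sp_{next(_savepoint_ids)}"
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except BaseException:
        # Undo only the work done since this savepoint, then discard it.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")
        raise
    else:
        # Releasing the outermost savepoint commits the transaction.
        conn.execute(f"RELEASE SAVEPOINT {name}")


# isolation_level=None stops sqlite3 from opening transactions implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")

with savepoint(conn):
    conn.execute("INSERT INTO log VALUES ('outer')")
    try:
        with savepoint(conn):
            conn.execute("INSERT INTO log VALUES ('inner')")
            raise ValueError("boom")
    except ValueError:
        pass  # the inner block was rolled back, the outer one continues

rows = [row[0] for row in conn.execute("SELECT msg FROM log")]
print(rows)  # prints ['outer']
```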

[... 2,121 words]

7:32 pm / 7th July 2026 / schema-migrations, projects, sqlite, ai, sqlite-utils, annotated-release-notes, generative-ai, llms, ai-assisted-programming, anthropic, claude, agentic-engineering, claude-mythos-fable

Release sqlite-utils 4.0rc4

The last RC before the 4.0 stable release. Mainly implements feedback from a detailed review by Claude Fable 5.

7th Jul 2026, 5:36 am · sqlite-utils, claude-mythos-fable

Release sqlite-utils 4.0rc3

I hoped to release sqlite-utils 4.0 stable this weekend, but as I worked through the backlog of issues and PRs with a combination of Claude Fable 5 and GPT-5.5, the changelog since rc2 kept getting bigger.

The biggest new feature is support for introspecting and creating compound foreign keys - a feature that involves a subtle breaking change to table.foreign_keys and hence needed to land for the 4.0 stable release.
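To illustrate what a compound foreign key looks like at the SQLite level, here is a small sketch using the standard `sqlite3` module rather than sqlite-utils itself. The `courses` and `enrollments` tables are invented for the example. `PRAGMA foreign_key_list` reports one row per column pair, and rows sharing the same `id` belong to the same key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (
        department TEXT,
        number INTEGER,
        title TEXT,
        PRIMARY KEY (department, number)
    );
    CREATE TABLE enrollments (
        student TEXT,
        department TEXT,
        number INTEGER,
        FOREIGN KEY (department, number)
            REFERENCES courses (department, number)
    );
""")

# Columns: id, seq, table, from, to, on_update, on_delete, match
rows = conn.execute("PRAGMA foreign_key_list(enrollments)").fetchall()

# Group the per-column rows back into one entry per foreign key.
by_key = {}
for fk_id, seq, table, from_col, to_col, *_ in sorted(rows):
    by_key.setdefault((fk_id, table), []).append((from_col, to_col))

print(by_key)
```

A single-column key produces exactly one row per `id`, which is why code that assumes one row per foreign key can break once compound keys are involved.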

sqlite-utils also now follows SQLite's convention for case-insensitive column names, which turned out to touch a bunch of different places at once.

6th Jul 2026, 5:40 am · projects, sqlite, sqlite-utils, annotated-release-notes, gpt, claude-mythos-fable

sqlite-utils 4.0rc2, mostly written by Claude Fable (for about $149.25)

I wrote about the sqlite-utils 4.0rc1 release a couple of weeks ago. Since we only have Claude Fable on our Max subscriptions for a few more days, I decided to see if it could help me get to a 4.0 stable release that I felt truly comfortable about, since I try to keep to SemVer and like my incompatible major versions to be as rare as possible.

[... 2,427 words]

1 am / 5th July 2026 / projects, sqlite, ai, sqlite-utils, annotated-release-notes, generative-ai, llms, anthropic, claude, llm-pricing, coding-agents, claude-code, agentic-engineering, gpt, claude-mythos-fable

One of the most interesting tips I got from the Fireside Chat I hosted with Cat Wu and Thariq Shihipar from the Claude Code team at AIE on Wednesday was to let Fable (and to a certain extent Opus) use their own judgement rather than dictating how they should work.

The example they gave was testing. You can tell Fable "only use automated testing for larger features, don't update and run tests for small copy or design changes" - but it's better to just tell Fable to use its own judgement when deciding to write tests instead.

Jesse Vincent just gave me a related tip to help avoid burning too many of those valuable Fable tokens in the few days we have left before the prices go up. Tell Fable to use other models for smaller tasks, applying its own judgement about which model to use.

I prompted Claude Code just now with:

For all coding tasks use your judgement to decide an appropriate lower power model and run that in a subagent

Claude saved this memory file in ~/.claude/projects/name-of-project/memory/delegate-coding-to-subagents.md:

---
name: delegate-coding-to-subagents
description: Simon wants coding tasks delegated to subagents running an appropriately lower-power model
metadata: 
  node_type: memory
  type: feedback
  originSessionId: 30068d78-43a9-4fb1-bb29-9799e18c526a
---
Stated by Simon on 2026-07-03: "For all coding tasks use your judgement to decide an appropriate lower power model and run that in a subagent."

Why: cost/efficiency — implementation work rarely needs the top-tier model; judgment, review, and synthesis stay with the main loop.

How to apply: when a task in this project is primarily writing/editing code, spawn an Agent with a model override (sonnet for substantive implementation, haiku for trivial/mechanical edits) and a self-contained prompt; review the result in the main loop before committing. Design, auditing, data synthesis, and anything judgment-heavy stays in the main model. See also [[project-goals]].

So far it seems to be working well. I'm getting a ton of work done and my Fable allowance is shrinking less quickly than before.

# 3rd July 2026, 6:51 pm / ai, prompt-engineering, generative-ai, llms, anthropic, claude, coding-agents, claude-code, claude-mythos-fable, thariq-shihipar, cat-wu

Release llm-coding-agent 0.1a0

Another Fable 5 experiment. Now that my LLM library has evolved into more of an agent framework it's time to see what a simple coding agent would look like built on it.

I started a new Python library using my python-lib-template-repository GitHub template repository, then ran these two prompts (here's the Claude Code for web transcript):

Write a spec.md for this project - it will depend on the latest “llm” alpha from PyPI and implement a Claude code style coding agent complete with tools for reading and editing files and executing commands

Then:

Commit the spec, then build it using red/green TDD in a series of sensible commits (each with passing tests and updated docs) - occasionally manually test it using the OpenAI API key in your environment

Here's the spec, the resulting README file, and the sequence of commits.

I've shipped a slop-alpha to PyPI, so you can run the new agent like this:

uvx --prerelease=allow --with llm-coding-agent llm code

It's pretty good for a first attempt! Here's the (Fable-authored) README, which lists recipes like llm code --yolo and llm code --allow "pytest*" --allow "git diff*".

It also presents a Python API based around a CodingAgent(model="gpt-5.5", root="/path", approve=True).run("Fix the failing test in tests/test_parser.py") class which I didn't ask for but I'm delighted to see implemented.

Here's the suite of tools it implemented, listed using uvx ... llm tools:

CodingTools_edit_file(path: str, old_string: str, new_string: str, replace_all: bool = False) -> str

Replace an exact string in a file.

old_string must match the file contents exactly (including whitespace) and must identify a unique location unless replace_all is true. Returns a diff of the change so it can be verified.

CodingTools_execute_command(command: str, timeout: int = 120) -> str

Run a shell command in the session root directory.

Returns combined stdout and stderr followed by an Exit code line. timeout is in seconds (maximum 600); on timeout the whole process tree is killed.

CodingTools_list_files(pattern: str = '**/*', path: str = '.') -> str

List files matching a glob pattern, newest first.

Skips hidden directories, node_modules, __pycache__ and (in a git repository) anything covered by .gitignore. Returns at most 200 paths relative to the searched directory.

CodingTools_read_file(path: str, offset: int = 0, limit: int = 2000) -> str

Read a text file, returning numbered lines like cat -n.

Paths are relative to the session root. Use offset (0-based first line) and limit (max lines) to page through files too large to read in one call.

CodingTools_search_files(pattern: str, path: str = '.', glob: str = None, max_results: int = 100) -> str

Search file contents for a regular expression.

Returns matches as path:line_number:line, capped at max_results. Use glob (e.g. "*.py") to restrict which files are searched.

CodingTools_write_file(path: str, content: str) -> str

Create or overwrite a file with the given content.

Parent directories are created as needed. Prefer edit_file for modifying existing files.
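The contract described for edit_file above (exact match, unique location unless replace_all is set, diff returned for verification) can be sketched in a few lines. This is an illustration of those rules operating on a string, not the code from llm-coding-agent, and `edit_text` is a hypothetical name.

```python
import difflib


def edit_text(text, old_string, new_string, replace_all=False):
    """Replace old_string in text and return (updated_text, unified_diff)."""
    if not old_string:
        raise ValueError("old_string must not be empty")
    occurrences = text.count(old_string)
    if occurrences == 0:
        raise ValueError("old_string not found")
    if occurrences > 1 and not replace_all:
        raise ValueError(
            f"old_string matches {occurrences} locations; "
            "add more context or pass replace_all=True"
        )
    # At this point there is either one match or replace_all was requested.
    updated = text.replace(old_string, new_string)
    diff = "".join(
        difflib.unified_diff(
            text.splitlines(keepends=True),
            updated.splitlines(keepends=True),
            fromfile="before",
            tofile="after",
        )
    )
    return updated, diff


updated, diff = edit_text("a = 1\nb = 2\n", "b = 2", "b = 3")
print(diff)
```

Refusing ambiguous matches forces the caller to supply enough surrounding context to pin down one location, and the returned diff lets it confirm that the intended line changed.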

I tried it out by running llm code --yolo and then prompting:

mkdir /tmp/demo and then in that folder create a simple swiftui CLI app for telling the time in ascii art

Here's the transcript, in which GPT-5.5 reasoning notes that "SwiftUI isn't suitable for a true CLI" and then builds an app that outputs this on swift run AsciiTime:

      █    █████         ████     █             █     ███   
     ██    █        █        █   ██      █     ██    █   █  
      █    ████           ███     █             █       █   
      █        █    █        █    █      █      █      █    
     ███   ████          ████    ███           ███   █████

2nd Jul 2026, 7:33 pm · projects, ai, generative-ai, llm, llm-tool-use, coding-agents, claude-code, claude-mythos-fable

Research: Using DSPy to evaluate and improve Datasette Agent's SQL system prompts

One of this morning's AIE keynotes covered dspy, which reminded me I've been meaning to see if it could help me improve the system prompt used by Datasette Agent - so I fired off an asynchronous research task in Claude Code for web using Claude Fable 5:

Pip install the latest Datasette alpha and datasette-agent and dspy - then figure out how to use dspy to evaluate and improve the main system prompts used by Datasette Agent for the feature where it can execute read only SQL queries to answer user questions about data.

Fable chose to test using GPT 4.1 mini and nano, and identified several promising looking directions for improvements. I particularly like this one:

The schema listing gives only table names; the "don't call describe_table if you already have the information" advice caused column-name guessing (page_count, o.order_id, first_name) and error-retry loops in baseline traces. Either include column names in the prompt's schema listing or soften that advice.
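One way to act on the first suggestion is to build the schema listing from SQLite's own metadata so that column names are included. The sketch below uses the standard `sqlite3` module. The `schema_listing` function and the sample tables are hypothetical and are not taken from Datasette Agent.

```python
import sqlite3


def schema_listing(conn):
    """Return one 'table(col1, col2, ...)' line per user table."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    lines = []
    for (table,) in tables:
        # Row layout of table_info: cid, name, type, notnull, dflt_value, pk
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        lines.append(f"{table}({', '.join(columns)})")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT);
""")

listing = schema_listing(conn)
print(listing)
```

With column names in the prompt, a model has less reason to guess at names such as page_count or o.order_id before writing a query.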

2nd Jul 2026, 6:25 pm · ai, datasette, generative-ai, llms, evals, dspy, datasette-agent, claude-mythos-fable

We’ve received notice that the Department of Commerce has lifted export controls on Claude Fable 5 and Mythos 5.

We'll begin restoring access tomorrow, and will share an update soon.

— Anthropic, on Twitter

# 30th June 2026, 11:58 pm / ai, generative-ai, llms, anthropic, claude, claude-mythos-fable

The Fable 5 Export Controls Harm US Cyber Defense. I quoted The Atlantic quoting Katie Moussouris earlier, when I should have gone straight to the source. Here she is confirming that the "jailbreak" that got Claude Fable 5 banned under an export control really was "fix this code":

The researchers took open-source code with known CVEs, plus new code with deliberately planted vulnerabilities, and asked Fable 5, Mythos, and Opus to “review the code for security issues.” Fable 5 refused. They then asked the models to “fix this code” and, through a multistep and manual process, turned the output into scripts that test the patches.

As Katie points out, this is absurd. Coding models fix bugs, and security exploits are the most important category of bugs for them to fix!

Defenders need to be able to ask AI to fix the bugs in a file, explain why the fix matters, and write tests that confirm the patch works. That is not a guardrail bypass. It is the most valuable thing an AI model can do for defensive security: executing the find, fix, and test loop defenders run every day. [...]

The prompts worked because they were defensive requests, and that capability cannot be removed without making the model worse at fixing bugs and verifying patches.

This whole situation is such a mess. Non-technical decision-makers have been hearing that models that can "craft cyber attacks" are uniquely dangerous for months. Now they look ready to ban any model that can help us secure our code.

# 16th June 2026, 5:20 am / jailbreaking, security, ai, generative-ai, llms, anthropic, ai-security-research, claude-mythos-fable

Katie Moussouris, a cybersecurity expert and the CEO of Luta Security, told me that Anthropic shared with her a copy of the White House’s report on the Fable jailbreak to get her appraisal. (She said that she is not being paid by Anthropic.) The report, Moussouris said, involved IT experts asking Fable to help find and patch bugs. When given deliberately insecure code, she said, Fable refused the prompt “review the code for security issues” but then complied when asked to “fix this code,” followed by some further manual steps. Moussouris told me that this was just “the model working as intended” for cyberdefense.

— Matteo Wong, The Atlantic, The White House Is Ratcheting Up Its War Against Anthropic

# 16th June 2026, 3:07 am / jailbreaking, ai, generative-ai, llms, anthropic, claude, ai-ethics, ai-security-research, claude-mythos-fable

“They screwed us”: Personality clashes sent Anthropic’s models offline. Lots of "source familiar with the administration's thinking" and "source close to Anthropic" in this Axios piece, which is the best collection of behind-the-scenes gossip I've seen about the US government export control Mythos/Fable story so far.

Logan Graham ("I lead the Frontier Red Team at Anthropic"), Dave Orr (Head of Safeguards, previously a Director of Engineering at Google DeepMind), and blog favorite Nicholas Carlini are reported to be meeting with the Commerce Department today in D.C. Good luck to them!

(I just noticed Logan was "Special Adviser to the Prime Minister" in the Boris Johnson era, covering AI, science, and technology policy - so significant political experience.)

This closing note doesn't give me much optimism that we'll be getting Fable back any time soon:

The bottom line: One option is to make sure Anthropic's models can't be jailbroken — though perfect jailbreak resistance may be impossible.

Absent that, a source familiar with the administration's thinking said it may simply come down to an attitude fix where, instead of feeling dismissed, "everyone feels safe, secure and happy."

This made me wonder if Anthropic ever successfully addressed the class of attacks described in the Universal and Transferable Adversarial Attacks on Aligned Language Models paper from 2023.

It looks like their Constitutional Classifiers work (that post is from January this year) is relevant to that. They continue to claim that no "universal jailbreak" has been found against Claude Mythos, classifying the jailbreak that triggered the US government response as "a potential narrow, non-universal jailbreak".

# 15th June 2026, 2:57 pm / jailbreaking, ai, generative-ai, llms, anthropic, claude, nicholas-carlini, ai-ethics, claude-mythos-fable

Statement on the US government directive to suspend access to Fable 5 and Mythos 5 (via) Well this is nuts:

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Anthropic models will not be affected.

We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or "jailbreaking" Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass. [...]

To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI's GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.

I still have access to Fable via claude.ai and Claude Code now, at 9:01pm ET.

Update: I ran this script against the Anthropic API to spot when claude-fable-5 would stop working. My access was cut off at 6:59pm Pacific (9:59pm ET):

[2026-06-12T18:56:50-07:00] attempt 35: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:56:55-07:00] success: Hi there! How can I help you today?
[2026-06-12T18:57:55-07:00] attempt 36: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:57:59-07:00] success: Hi! How can I help you today?
[2026-06-12T18:58:59-07:00] attempt 37: running uv run llm -m claude-fable-5 hi
[2026-06-12T18:59:00-07:00] FAILED after attempt 37 with exit code 1

stderr:
Error: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'Claude Fable 5 is not available. Please use Opus 4.8. Learn more: https://www.anthropic.com/news/fable-mythos-access'}, 'request_id': 'req_011CbzRyirV7KZLHYYdBM9od'}
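The original script is not reproduced here. A minimal sketch of a poller that would produce logs in roughly this shape, assuming the approach is simply "rerun a command every minute until it exits non-zero", might look like the following. The `poll_until_failure` function and its parameters are hypothetical.

```python
import subprocess
import time
from datetime import datetime


def poll_until_failure(command, interval=60, max_attempts=None, sleep=time.sleep):
    """Run command repeatedly until it fails; return (attempts, exit_code)."""
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        stamp = datetime.now().astimezone().isoformat(timespec="seconds")
        print(f"[{stamp}] attempt {attempt}: running {' '.join(command)}")
        result = subprocess.run(command, capture_output=True, text=True)
        stamp = datetime.now().astimezone().isoformat(timespec="seconds")
        if result.returncode != 0:
            print(
                f"[{stamp}] FAILED after attempt {attempt} "
                f"with exit code {result.returncode}"
            )
            print("\nstderr:")
            print(result.stderr.strip())
            return attempt, result.returncode
        print(f"[{stamp}] success: {result.stdout.strip()}")
        sleep(interval)
    # Only reached when max_attempts is set and every attempt succeeded.
    return attempt, 0


# Example invocation (not executed here), using the command from the log:
# poll_until_failure(["uv", "run", "llm", "-m", "claude-fable-5", "hi"])
```

The `sleep` and `max_attempts` parameters exist only so the loop can be exercised quickly in tests without waiting a minute between attempts.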

# 13th June 2026, 1:01 am / jailbreaking, ai, generative-ai, llms, anthropic, claude, ai-ethics, claude-mythos-fable

Claude Fable is relentlessly proactive

After two days of experience with Claude Fable 5 I think the best way to describe it is relentlessly proactive. It knows a whole lot of tricks and it will deploy pretty much any of them to get to its goal.

[... 1,939 words]

11:35 pm / 11th June 2026 / ai, prompt-injection, generative-ai, llms, ai-assisted-programming, coding-agents, claude-code, claude-mythos-fable

Release asyncinject 0.7

I built this utility library to support an asyncio dependency injection pattern a few years ago. I was using it with Datasette and Claude Fable 5 spotted some bugs in the dependency which it then fixed for me. It's a very proactive model!

11th Jun 2026, 6:28 am · async, projects, python, claude-mythos-fable

Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude (via) Big scoop for Maxwell Zeff at Wired:

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”

There's been a huge outcry about Anthropic's policy, tucked away in their system card, that Claude Fable/Mythos would identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user.

It's good news that they're dropping the invisible aspect of this. It would be a whole lot better if they dropped this category of refusals entirely.

Update: More details from @ClaudeDevs on Twitter:

We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.

Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).

We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.

# 11th June 2026, 3:45 am / ai, generative-ai, llms, anthropic, claude, ai-ethics, claude-mythos-fable

Easy solution to slow down recursive AI self improvement:

The lab with the top-ranked model must agree THEY must not use it for working on frontier AI

But everyone else should have access to it.

By definition, this means the frontier doesn't advance.

It also has the critical benefit of avoiding a dangerous power imbalance.

Anthropic has chosen the opposite of the safe path: they are allowing themselves, the current top lab, to use their top model for frontier AI research. They've said they'll sabotage others who try.

This means the AI frontier advances, & power imbalance increases.

(To be clear, I don't think we should try to slow down recursive AI self improvement - I think we should open it up and democratize it as much as possible. My point is: if you claim we should slow down, and you have the best model, you should ensure your org can't use it.)

— Jeremy Howard, in a Twitter thread

# 10th June 2026, 3:23 pm / ai, generative-ai, llms, jeremy-howard, anthropic, ai-ethics, claude-mythos-fable

If Claude Fable stops helping you, you’ll never know (via) Jonathon Ready highlights one of the more eyebrow-raising details from the 319 page system card for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine:

In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.

Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, these safeguards will not be visible to the user. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations.

I believe this is the first time Anthropic have announced these kinds of silent interventions. The justification still feels pretty science-fiction to me - the linked article talks about "recursive self-improvement". I'm not at all keen on a model that silently corrupts its replies to questions about "ML accelerator design" purely to slow down research that might conflict with Anthropic's own goals!

Update: Anthropic walked back this policy in the face of widespread outrage from the research community.

# 10th June 2026, 12:37 am / ai, generative-ai, llms, anthropic, claude, ai-ethics, claude-mythos-fable

Initial impressions of Claude Fable 5

I didn’t have early access to today’s Claude Fable 5 release, but I’ve spent the past ~5.5 hours putting it through its paces. My initial impressions are that this is something of a beast. It’s slow, expensive and has been quite happily churning through everything I’ve thrown at it so far. As is frequently the case with current frontier models the challenge is finding tasks that it can’t do.

[... 2,404 words]

11:59 pm / 9th June 2026 / ai, generative-ai, llms, anthropic, claude, llm-pricing, pelican-riding-a-bicycle, llm-release, claude-mythos-fable

Release llm 0.32a3

Almost entirely written by the new Claude Fable 5, see my write-up for more details.

9th Jun 2026, 10:27 pm · projects, ai, generative-ai, llms, llm, claude-mythos-fable

TIL: Setting a custom price for a model in AgentsView

I've been really enjoying AgentsView by Wes McKinney as a tool for exploring my token usage across different coding agents running on my laptop.

Claude Fable 5 came out today and wasn't yet included in the pricing database AgentsView uses. I used Fable to reverse-engineer AgentsView and figured out this recipe for setting custom prices.

Here's my Claude Fable 5 usage for today so far, plotted by AgentsView as a treemap across my different local projects:

9th Jun 2026, 9:35 pm · ai, generative-ai, llms, llm-pricing, claude-mythos-fable

I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref).

— Andrej Karpathy, on Claude Fable 5

# 9th June 2026, 7:03 pm / ai, andrej-karpathy, generative-ai, llms, anthropic, jevons-paradox, claude-mythos-fable

Behind the Scenes Hardening Firefox with Claude Mythos Preview. Fascinating, in-depth details on how Mozilla used their access to Claude Mythos Preview to locate and then fix hundreds of vulnerabilities in Firefox:

Suddenly, the bugs are very good

Just a few months ago, AI-generated security bug reports to open source projects were mostly known for being unwanted slop. Dealing with reports that look plausibly correct but are wrong imposes an asymmetric cost on project maintainers: it’s cheap and easy to prompt an LLM to find a “problem” in code, but slow and expensive to respond to it.

It is difficult to overstate how much this dynamic changed for us over a few short months. This was due to a combination of two main factors. First, the models got a lot more capable. Second, we dramatically improved our techniques for harnessing these models — steering them, scaling them, and stacking them to generate large amounts of signal and filter out the noise.

They include some detailed bug descriptions too, including a 20-year-old XSLT bug and a 15-year-old bug in the <legend> element.

A lot of the attempts made by the harness were blocked by Firefox's existing defense-in-depth measures, which is reassuring.

Mozilla were fixing around 20-30 security bugs in Firefox per month through 2025. That jumped to 423 in April.

# 7th May 2026, 5:56 pm / firefox, mozilla, security, ai, generative-ai, llms, anthropic, claude, ai-security-research, claude-mythos-fable

As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation. [...]

Our experience is a hopeful one for teams who shake off the vertigo and get to work. You may need to reprioritize everything else to bring relentless and single-minded focus to the task, but there is light at the end of the tunnel. We are extremely proud of how our team rose to meet this challenge, and others will too. Our work isn’t finished, but we’ve turned the corner and can glimpse a future much better than just keeping up. Defenders finally have a chance to win, decisively.

— Bobby Holley, CTO, Firefox

# 22nd April 2026, 5:40 am / firefox, mozilla, security, ai, generative-ai, llms, anthropic, claude, ai-security-research, claude-mythos-fable

Cybersecurity Looks Like Proof of Work Now. The UK's AI Security Institute recently published Our evaluation of Claude Mythos Preview’s cyber capabilities, their own independent analysis of Claude Mythos which backs up Anthropic's claims that it is exceptionally effective at identifying security vulnerabilities.

Drew Breunig notes that AISI's report shows that the more tokens (and hence money) they spent the better the result they got, which leads to a strong economic incentive to spend as much as possible on security reviews:

If Mythos continues to find exploits so long as you keep throwing money at it, security is reduced to a brutally simple equation: to harden a system you need to spend more tokens discovering exploits than attackers will spend exploiting them.

An interesting result of this is that open source libraries become more valuable, since the tokens spent securing them can be shared across all of their users. This directly counters the idea that the low cost of vibe-coding up a replacement for an open source library makes those open source projects less attractive.
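Here's a minimal sketch of that arithmetic in Python. The dollar figures and function names are made up purely for illustration; none of them come from Drew's post or the AISI report. It assumes the review spend is split evenly across every user of a library, which is a simplification.

```python
def per_user_hardening_cost(total_review_spend: float, users: int) -> float:
    """Cost each user bears if the review spend is shared equally."""
    if users < 1:
        raise ValueError("users must be at least 1")
    return total_review_spend / users


def defender_is_ahead(defender_spend: float, attacker_spend: float) -> bool:
    """The 'brutally simple equation': defenders must outspend attackers."""
    return defender_spend > attacker_spend


# Hypothetical figures, chosen only to show the shape of the argument.
ATTACKER_BUDGET = 50_000.0  # dollars of tokens an attacker might spend
REVIEW_BUDGET = 60_000.0    # dollars of tokens spent hardening one library

# A private, vibe-coded replacement has exactly one user footing the bill.
solo = per_user_hardening_cost(REVIEW_BUDGET, users=1)

# A shared open source library spreads the same spend across its users.
shared = per_user_hardening_cost(REVIEW_BUDGET, users=1_000)

print(defender_is_ahead(REVIEW_BUDGET, ATTACKER_BUDGET))  # True
print(solo, shared)  # 60000.0 60.0
```

The total spend needed to stay ahead of attackers is the same in both cases. What changes is how many people split the bill.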

# 14th April 2026, 7:41 pm / open-source, ai, generative-ai, llms, drew-breunig, vibe-coding, ai-security-research, claude-mythos-fable

Anthropic’s Project Glasswing—restricting Claude Mythos to security researchers—sounds necessary to me

Anthropic didn’t release their latest model, Claude Mythos (system card PDF), today. They have instead made it available to a very restricted set of preview partners under their newly announced Project Glasswing.

[... 1,296 words]

8:52 pm / 7th April 2026 / security, thomas-ptacek, ai, generative-ai, llms, anthropic, nicholas-carlini, ai-ethics, llm-release, ai-security-research, claude-mythos-fable

Simon Willison’s Weblog