<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: models</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/models.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-05-02T23:41:52+00:00</updated><author><name>Simon Willison</name></author><entry><title>Qwen3-8B</title><link href="https://simonwillison.net/2025/May/2/qwen3-8b/#atom-tag" rel="alternate"/><published>2025-05-02T23:41:52+00:00</published><updated>2025-05-02T23:41:52+00:00</updated><id>https://simonwillison.net/2025/May/2/qwen3-8b/#atom-tag</id><summary type="html">
    &lt;p&gt;Having tried a few of the &lt;a href="https://simonwillison.net/2025/Apr/29/qwen-3/"&gt;Qwen 3 models&lt;/a&gt; now, my favorite is a bit of a surprise to me: I'm really enjoying &lt;a href="https://huggingface.co/Qwen/Qwen3-8B"&gt;Qwen3-8B&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I've been running prompts through the MLX 4bit quantized version, &lt;a href="https://huggingface.co/mlx-community/Qwen3-8B-4bit"&gt;mlx-community/Qwen3-8B-4bit&lt;/a&gt;. I'm using &lt;a href="https://github.com/simonw/llm-mlx"&gt;llm-mlx&lt;/a&gt; like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-mlx
llm mlx download-model mlx-community/Qwen3-8B-4bit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pulls 4.3GB of data and saves it to &lt;code&gt;~/.cache/huggingface/hub/models--mlx-community--Qwen3-8B-4bit&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I assigned it a default alias:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm aliases set q3 mlx-community/Qwen3-8B-4bit
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I also set a default option for that model - &lt;code&gt;unlimited 1&lt;/code&gt; disables the default output token limit, and setting it as a default saves me from adding &lt;code&gt;-o unlimited 1&lt;/code&gt; to every prompt:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm models options set q3 unlimited 1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And now I can run prompts:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm -m q3 'brainstorm questions I can ask my friend who I think is secretly from Atlantis that will not tip her off to my suspicions'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Qwen3 is a "reasoning" model, so it starts each response with a &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt; block containing its chain of thought. Reading these is always &lt;em&gt;really fun&lt;/em&gt;. Here's the full response I got for &lt;a href="https://gist.github.com/simonw/52a883eb4709de66c6bfe9bb3b0f3ee0"&gt;the above question&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I'm finding Qwen3-8B to be surprisingly capable for useful things too. It can &lt;a href="https://gist.github.com/simonw/ab414f01a28e050b8419b4152a4016d1"&gt;summarize short articles&lt;/a&gt;. It can &lt;a href="https://gist.github.com/simonw/db129dddb76e5ba8f97794a794ae626d#response-1"&gt;write simple SQL queries&lt;/a&gt; given a question and a schema. It can &lt;a href="https://gist.github.com/simonw/54f040ae2f2ca3b83cdc1b2e691936ab"&gt;figure out what a simple web app does&lt;/a&gt; by reading the HTML and JavaScript. It can &lt;a href="https://gist.github.com/simonw/ac4082df0dcde87d5845586804fb80c9"&gt;write Python code&lt;/a&gt; to meet a paragraph-long spec - for that one it "reasoned" for an unreasonably long time but it did eventually get to a useful answer.&lt;/p&gt;
&lt;p&gt;All this while consuming between 4 and 5GB of memory, depending on the length of the prompt.&lt;/p&gt;
&lt;p&gt;I think it's pretty extraordinary that a few GBs of floating point numbers can usefully achieve these various tasks, especially using so little memory that it's not an imposition on the rest of the things I want to run on my laptop at the same time.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mlx"&gt;mlx&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-reasoning"&gt;llm-reasoning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;&lt;/p&gt;



</summary><category term="models"/><category term="ai"/><category term="generative-ai"/><category term="local-llms"/><category term="llm"/><category term="qwen"/><category term="mlx"/><category term="llm-reasoning"/><category term="ai-in-china"/></entry><entry><title>South's Design</title><link href="https://simonwillison.net/2009/May/13/south/#atom-tag" rel="alternate"/><published>2009-05-13T12:30:45+00:00</published><updated>2009-05-13T12:30:45+00:00</updated><id>https://simonwillison.net/2009/May/13/south/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.aeracode.org/2009/5/9/souths-design/"&gt;South&amp;#x27;s Design&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Andrew Godwin explains why South resorts to parsing your models.py file in order to construct the information it needs to create automatic migrations.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/andrew-godwin"&gt;andrew-godwin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parsing"&gt;parsing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/south"&gt;south&lt;/a&gt;&lt;/p&gt;



</summary><category term="andrew-godwin"/><category term="django"/><category term="models"/><category term="orm"/><category term="parsing"/><category term="python"/><category term="south"/></entry><entry><title>django-mptt</title><link href="https://simonwillison.net/2007/Dec/29/djangomptt/#atom-tag" rel="alternate"/><published>2007-12-29T11:33:08+00:00</published><updated>2007-12-29T11:33:08+00:00</updated><id>https://simonwillison.net/2007/Dec/29/djangomptt/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/django-mptt/"&gt;django-mptt&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Jonathan Buchanan’s simple utility for performing Modified Preorder Tree Traversal (efficient tree operations in SQL) on Django models.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="http://insin.webfactional.com/weblog/2007/dec/29/django-mptt/"&gt;Jonathan Buchanan&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/djangoorm"&gt;djangoorm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jonathan-buchanan"&gt;jonathan-buchanan&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/modifiedpreordertreetraversal"&gt;modifiedpreordertreetraversal&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mptt"&gt;mptt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="djangoorm"/><category term="jonathan-buchanan"/><category term="models"/><category term="modifiedpreordertreetraversal"/><category term="mptt"/><category term="python"/><category term="sql"/></entry><entry><title>tranquil</title><link href="https://simonwillison.net/2007/Oct/9/tranquil/#atom-tag" rel="alternate"/><published>2007-10-09T02:30:29+00:00</published><updated>2007-10-09T02:30:29+00:00</updated><id>https://simonwillison.net/2007/Oct/9/tranquil/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://code.google.com/p/tranquil/"&gt;tranquil&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Inspired take on the Django ORM to SQLAlchemy problem: lets you define your models with the Django ORM but use SQLAlchemy to run queries against them.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/djangoorm"&gt;djangoorm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/models"&gt;models&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/orm"&gt;orm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlalchemy"&gt;sqlalchemy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tranquil"&gt;tranquil&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="djangoorm"/><category term="models"/><category term="orm"/><category term="python"/><category term="sqlalchemy"/><category term="tranquil"/></entry></feed>