<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: cosmopolitan</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/cosmopolitan.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-02-06T20:39:15+00:00</updated><author><name>Simon Willison</name></author><entry><title>sqlite-page-explorer</title><link href="https://simonwillison.net/2025/Feb/6/sqlite-page-explorer/#atom-tag" rel="alternate"/><published>2025-02-06T20:39:15+00:00</published><updated>2025-02-06T20:39:15+00:00</updated><id>https://simonwillison.net/2025/Feb/6/sqlite-page-explorer/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/QuadrupleA/sqlite-page-explorer"&gt;sqlite-page-explorer&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Outstanding tool by Luke Rissacher for understanding the SQLite file format. Download the application (built using redbean and Cosmopolitan, so the same binary runs on Windows, Mac and Linux) and point it at a SQLite database to get a local web application with an interface for exploring how the file is structured.&lt;/p&gt;
&lt;p&gt;Here it is running against the &lt;a href="https://datasette.io/content"&gt;datasette.io/content&lt;/a&gt; database that powers the official Datasette website:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Animated demo. Starts on a page that shows the SQLite header information, then toggles hex display on and off, then navigates to an index of all the pages and clicks in to the details of one of those pages." src="https://static.simonwillison.net/static/2025/sqlite-page-explorer-content.gif" /&gt;&lt;/p&gt;
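&lt;p&gt;To give a feel for the header information the tool starts on, here is a minimal Python sketch (not part of sqlite-page-explorer) that reads two fields from the 100-byte header at the start of a SQLite file: the page size and the page count. The &lt;code&gt;read_sqlite_header&lt;/code&gt; and &lt;code&gt;make_demo_db&lt;/code&gt; helpers and the throwaway &lt;code&gt;notes&lt;/code&gt; table are hypothetical names invented for this illustration.&lt;/p&gt;

```python
import os
import sqlite3
import struct
import tempfile


def read_sqlite_header(path):
    """Return a few fields from the 100-byte header at the start of a SQLite file."""
    with open(path, "rb") as f:
        header = f.read(100)
    if header[:16] != b"SQLite format 3\x00":
        raise ValueError("not a SQLite 3 database file")
    # Page size is a 2-byte big-endian integer at offset 16; the value 1 means 65536.
    (page_size,) = struct.unpack(">H", header[16:18])
    if page_size == 1:
        page_size = 65536
    # Database size in pages is a 4-byte big-endian integer at offset 28.
    (page_count,) = struct.unpack(">I", header[28:32])
    return {"page_size": page_size, "page_count": page_count}


def make_demo_db():
    """Create a small throwaway database (hypothetical demo data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany("INSERT INTO notes (body) VALUES (?)", [("hello",), ("world",)])
    conn.commit()
    conn.close()
    return path


demo_path = make_demo_db()
info = read_sqlite_header(demo_path)
print(info)
```

&lt;p&gt;For a database written in the default rollback-journal mode, the page size multiplied by the page count should match the size of the file on disk.&lt;/p&gt;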

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=42965198"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redbean"&gt;redbean&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;&lt;/p&gt;



</summary><category term="lua"/><category term="sqlite"/><category term="redbean"/><category term="cosmopolitan"/></entry><entry><title>llamafile is the new best way to run an LLM on your own computer</title><link href="https://simonwillison.net/2023/Nov/29/llamafile/#atom-tag" rel="alternate"/><published>2023-11-29T20:54:16+00:00</published><updated>2023-11-29T20:54:16+00:00</updated><id>https://simonwillison.net/2023/Nov/29/llamafile/#atom-tag</id><summary type="html">
    &lt;p&gt;Mozilla’s innovation group and Justine Tunney &lt;a href="https://hacks.mozilla.org/2023/11/introducing-llamafile/"&gt;just released llamafile&lt;/a&gt;, and I think it's now the single best way to get started running Large Language Models (think your own local copy of ChatGPT) on your own computer.&lt;/p&gt;
&lt;p&gt;A llamafile is a single multi-GB file that contains both the model weights for an LLM and the code needed to run that model - in some cases a full local server with a web UI for interacting with it.&lt;/p&gt;
&lt;p&gt;The executable is compiled using &lt;a href="https://justine.lol/cosmopolitan/index.html"&gt;Cosmopolitan Libc&lt;/a&gt;, Justine's incredible project that supports compiling a single binary that works, unmodified, on multiple different operating systems and hardware architectures.&lt;/p&gt;
&lt;p&gt;Here's how to get started with &lt;a href="https://llava-vl.github.io/"&gt;LLaVA 1.5&lt;/a&gt;, a large multimodal model (which means text and image inputs, like GPT-4 Vision) fine-tuned on top of Llama 2. I've tested this process on an M2 Mac, but it should work on other platforms as well (though be sure to &lt;a href="https://github.com/mozilla-Ocho/llamafile#gotchas"&gt;read the Gotchas&lt;/a&gt; section of the README, and take a look at Justine's &lt;a href="https://news.ycombinator.com/item?id=38465916"&gt;list of supported platforms&lt;/a&gt; in a comment on Hacker News).&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Download the 4.29GB &lt;code&gt;llava-v1.5-7b-q4.llamafile&lt;/code&gt; file &lt;a href="https://huggingface.co/jartine/llava-v1.5-7B-GGUF/tree/main"&gt;from Justine's repository on Hugging Face&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;curl -LO https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4.llamafile&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Make that binary executable, by running this in a terminal:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;chmod 755 llava-v1.5-7b-q4.llamafile&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Run your new executable, which will start a web server on port 8080:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;./llava-v1.5-7b-q4.llamafile&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Navigate to &lt;code&gt;http://127.0.0.1:8080/&lt;/code&gt; to start interacting with the model in your browser.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That's all there is to it. On my M2 Mac it runs at around 55 tokens a second, which is &lt;em&gt;really&lt;/em&gt; fast. And it can analyze images - here's what I got when I uploaded a photograph and asked "Describe this plant":&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/llamafile.jpg" alt="Screenshot. llama.cpp - then a photo I took of a plant

User: Describe this plant

Llama: The image features a large, green plant with numerous thin branches and leaves. Among the many stems of this plant, there is an orange flower visible near its center. This beautifully decorated plant stands out in the scene due to its vibrant colors and intricate structure.

18ms per token, 54.24 tokens per second
Powered by llama.cpp, ggml.ai, and llamafile" style="max-width: 100%" /&gt;&lt;/p&gt;
&lt;h4 id="llamafile-how-this-works"&gt;How this works&lt;/h4&gt;
&lt;p&gt;Several components work together to make this possible.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The LLaVA 1.5 model by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee is &lt;a href="https://arxiv.org/abs/2310.03744"&gt;described in this paper&lt;/a&gt;, with further details on &lt;a href="https://llava-vl.github.io/"&gt;llava-vl.github.io&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The models are executed using &lt;a href="https://github.com/ggerganov/llama.cpp"&gt;llama.cpp&lt;/a&gt;, and in the above demo also use the &lt;code&gt;llama.cpp&lt;/code&gt; server example to provide the UI.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://justine.lol/cosmopolitan/index.html"&gt;Cosmopolitan Libc&lt;/a&gt; is the magic that makes one binary work on multiple platforms. I wrote more about that in a TIL a few months ago, &lt;a href="https://til.simonwillison.net/cosmopolitan/ecosystem"&gt;Catching up with the Cosmopolitan ecosystem&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="llamafile-trying-other-models"&gt;Trying more models&lt;/h4&gt;
&lt;p&gt;The &lt;a href="https://github.com/mozilla-Ocho/llamafile"&gt;llamafile README&lt;/a&gt; currently links to binaries for &lt;code&gt;Mistral-7B-Instruct&lt;/code&gt;, &lt;code&gt;LLaVA 1.5&lt;/code&gt; and &lt;code&gt;WizardCoder-Python-13B&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;You can also download a much smaller &lt;code&gt;llamafile&lt;/code&gt; binary from &lt;a href="https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.1"&gt;their releases&lt;/a&gt;, which can then execute any model that has been converted to GGUF format.&lt;/p&gt;
&lt;p&gt;I grabbed &lt;code&gt;llamafile-server-0.1&lt;/code&gt; (4.45MB) like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl -LO https://github.com/Mozilla-Ocho/llamafile/releases/download/0.1/llamafile-server-0.1
chmod 755 llamafile-server-0.1&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then ran it against a 13GB &lt;code&gt;llama-2-13b.Q8_0.gguf&lt;/code&gt; file I had &lt;a href="https://huggingface.co/TheBloke/Llama-2-13B-GGUF/tree/main"&gt;previously downloaded&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;./llamafile-server-0.1 -m llama-2-13b.Q8_0.gguf&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This gave me the same interface at &lt;code&gt;http://127.0.0.1:8080/&lt;/code&gt; (without the image upload) and let me talk with the model at 24 tokens per second.&lt;/p&gt;
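&lt;p&gt;The same server can also be scripted rather than used through the browser. The sketch below assumes the bundled llama.cpp server example exposes its &lt;code&gt;/completion&lt;/code&gt; JSON endpoint on the port shown above, accepting &lt;code&gt;prompt&lt;/code&gt; and &lt;code&gt;n_predict&lt;/code&gt; and returning the text in a &lt;code&gt;content&lt;/code&gt; field; check the server documentation for the build you are running. The helper names are hypothetical.&lt;/p&gt;

```python
import json
import urllib.request

# Address printed in the post above; change it if your server listens elsewhere.
SERVER_URL = "http://127.0.0.1:8080"


def build_completion_request(prompt, n_predict=64, base_url=SERVER_URL):
    """Build (but do not send) a POST request for the assumed /completion endpoint."""
    payload = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        base_url + "/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text; needs the server running."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        # Assumption: the response JSON carries the generated text under "content".
        return json.loads(resp.read())["content"]


request = build_completion_request("Three facts about pelicans:")
print(request.full_url)
```

&lt;p&gt;With the server running, calling &lt;code&gt;complete("Three facts about pelicans:")&lt;/code&gt; should return the generated text as a string.&lt;/p&gt;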
&lt;h4 id="llamafile-one-file"&gt;One file is all you need&lt;/h4&gt;
&lt;p&gt;I think my favourite thing about llamafile is what it represents. This is a single binary file which you can download and then use, forever, on (almost) any computer.&lt;/p&gt;

&lt;p&gt;You don't need a network connection, and you don't need to keep track of more than one file.&lt;/p&gt;

&lt;p&gt;Stick that file on a USB stick and stash it in a drawer as insurance against a future apocalypse. You'll never be without a language model ever again.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llamafile"&gt;llamafile&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/justine-tunney"&gt;justine-tunney&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama-cpp"&gt;llama-cpp&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="mozilla"/><category term="ai"/><category term="generative-ai"/><category term="cosmopolitan"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="mistral"/><category term="llamafile"/><category term="justine-tunney"/><category term="llama-cpp"/></entry><entry><title>djngo.com: Portable Django</title><link href="https://simonwillison.net/2023/Feb/24/djngocom/#atom-tag" rel="alternate"/><published>2023-02-24T00:52:27+00:00</published><updated>2023-02-24T00:52:27+00:00</updated><id>https://simonwillison.net/2023/Feb/24/djngocom/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://djngo.com/"&gt;djngo.com: Portable Django&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“A 20mb executable zip file with Python 3.6 and Django 2.2. Works on Windows, Linux, MacOSX with x86_64 and aarch64 (yes, Apple M1 and Raspberry Pi).” The latest wizardry from the ecosystem surrounding the Cosmopolitan project, which provides a should-be-impossible mechanism for running the same executable on a bunch of different platforms. This utility by Ariel Núñez bundles Python and Django and SQLite, such that a Django application can become a portable executable ready to run on multiple platforms. It’s currently limited to Python 3.6 and Django 2.2 since those are the versions that run under Cosmopolitan, but I expect we’ll see more recent versions of those dependencies in the future.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=34913333"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redbean"&gt;redbean&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/raspberry-pi"&gt;raspberry-pi&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="python"/><category term="sqlite"/><category term="redbean"/><category term="cosmopolitan"/><category term="raspberry-pi"/></entry><entry><title>Cosmopolitan: Compiling Python</title><link href="https://simonwillison.net/2022/Jul/26/cosmopolitan-python/#atom-tag" rel="alternate"/><published>2022-07-26T20:43:31+00:00</published><updated>2022-07-26T20:43:31+00:00</updated><id>https://simonwillison.net/2022/Jul/26/cosmopolitan-python/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/jart/cosmopolitan/issues/141"&gt;Cosmopolitan: Compiling Python&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Cosmopolitan is Justine Tunney’s “build-once run-anywhere C library”, part of the αcτµαlly pδrταblε εxεcµταblε effort, which produces wildly clever binary executable files that work on multiple different platforms and is the secret sauce behind redbean. I hadn’t realized this was happening, but there’s an active project to get Python to work in this format, producing a new way of running Python applications as standalone executables that have the potential to run unmodified on Windows, Linux and macOS.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redbean"&gt;redbean&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/justine-tunney"&gt;justine-tunney&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="redbean"/><category term="cosmopolitan"/><category term="justine-tunney"/></entry><entry><title>Fullmoon</title><link href="https://simonwillison.net/2022/Feb/18/fullmoon/#atom-tag" rel="alternate"/><published>2022-02-18T18:41:19+00:00</published><updated>2022-02-18T18:41:19+00:00</updated><id>https://simonwillison.net/2022/Feb/18/fullmoon/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/pkulchenko/fullmoon"&gt;Fullmoon&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
A “fast and minimalistic web framework” written in Lua, based on Redbean. The documentation for this is fantastic, and because it uses Redbean the development experience is to download the Redbean executable (which runs on every platform) and then drop your own Lua scripts into it using zip.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=30385759"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redbean"&gt;redbean&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;&lt;/p&gt;



</summary><category term="lua"/><category term="redbean"/><category term="cosmopolitan"/></entry><entry><title>redbean</title><link href="https://simonwillison.net/2022/Feb/17/redbean/#atom-tag" rel="alternate"/><published>2022-02-17T06:01:18+00:00</published><updated>2022-02-17T06:01:18+00:00</updated><id>https://simonwillison.net/2022/Feb/17/redbean/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://redbean.dev/"&gt;redbean&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;“redbean makes it possible to share web applications that run offline as a single-file αcτµαlly pδrταblε εxεcµταblε zip archive which contains your assets. All you need to do is download the redbean.com program below, change the filename to .zip, add your content in a zip editing tool, and then change the extension back to .com”.&lt;/p&gt;

&lt;p&gt;redbean is implemented as a single C file with a dazzling array of clever tricks—most impressively, the single executable works on Linux, macOS, Windows and various BSDs!&lt;/p&gt;

&lt;p&gt;It embeds Lua, and in June 2021 added SQLite too, so self-contained distributable web applications built with redbean can now use Lua and SQLite for dynamic scripting. Performance sounds incredible: “redbean can serve 1 million+ gzip encoded responses per second on a cheap personal computer”.&lt;/p&gt;
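&lt;p&gt;The "add your content in a zip editing tool" step can also be scripted. This is a minimal sketch using Python's standard &lt;code&gt;zipfile&lt;/code&gt; module, whose append mode adds a ZIP archive to the end of an existing file without touching the bytes already there. A placeholder file stands in for the downloaded executable, and the helper names and asset contents are hypothetical.&lt;/p&gt;

```python
import os
import tempfile
import zipfile


def add_asset(archive_path, name, content):
    """Append one file to the ZIP archive stored inside archive_path."""
    # Mode "a" adds to an existing ZIP, or appends a new ZIP archive to a
    # non-ZIP file, leaving the bytes already in the file untouched.
    with zipfile.ZipFile(archive_path, "a") as zf:
        zf.writestr(name, content)


def list_assets(archive_path):
    """Return the names of the files stored in the embedded ZIP archive."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()


# Stand-in for a downloaded redbean.com: placeholder bytes, not a real executable.
fd, stub_path = tempfile.mkstemp(suffix=".com")
os.write(fd, b"PLACEHOLDER-EXECUTABLE-BYTES")
os.close(fd)

add_asset(stub_path, "index.html", "hello from a zip asset")
add_asset(stub_path, "hello.lua", "-- hypothetical Lua script")
print(list_assets(stub_path))
```

&lt;p&gt;Because the archive is appended rather than rewritten, the leading bytes of the file are preserved, which is the property a combined executable and zip file depends on.&lt;/p&gt;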

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/c6ogym/redbean_single_file_distributable_web"&gt;lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/lua"&gt;lua&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redbean"&gt;redbean&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="lua"/><category term="sqlite"/><category term="redbean"/><category term="cosmopolitan"/></entry><entry><title>cosmopolitan libc</title><link href="https://simonwillison.net/2021/Feb/27/cosmopolitan-libc/#atom-tag" rel="alternate"/><published>2021-02-27T06:02:11+00:00</published><updated>2021-02-27T06:02:11+00:00</updated><id>https://simonwillison.net/2021/Feb/27/cosmopolitan-libc/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://justine.lol/cosmopolitan/index.html"&gt;cosmopolitan libc&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“Cosmopolitan makes C a build-once run-anywhere language, similar to Java, except it doesn’t require interpreters or virtual machines be installed beforehand. [...] Instead, it reconfigures stock GCC to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + BIOS with the best possible performance and the tiniest footprint imaginable.” This is a spectacular piece of engineering.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=26271117"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cosmopolitan"&gt;cosmopolitan&lt;/a&gt;&lt;/p&gt;



</summary><category term="c"/><category term="cosmopolitan"/></entry></feed>