<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: nomic</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/nomic.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-03-27T20:03:56+00:00</updated><author><name>Simon Willison</name></author><entry><title>Nomic Embed Code: A State-of-the-Art Code Retriever</title><link href="https://simonwillison.net/2025/Mar/27/nomic-embed-code/#atom-tag" rel="alternate"/><published>2025-03-27T20:03:56+00:00</published><updated>2025-03-27T20:03:56+00:00</updated><id>https://simonwillison.net/2025/Mar/27/nomic-embed-code/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.nomic.ai/blog/posts/introducing-state-of-the-art-nomic-embed-code"&gt;Nomic Embed Code: A State-of-the-Art Code Retriever&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Nomic have released a new embedding model that specializes in code, based on their CoRNStack dataset, a "large-scale high-quality training dataset specifically curated for code retrieval".&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://huggingface.co/nomic-ai/nomic-embed-code"&gt;nomic-embed-code&lt;/a&gt; model is pretty large - 26.35GB - but the announcement also mentioned a much smaller model (released 5 months ago) called &lt;a href="https://huggingface.co/nomic-ai/CodeRankEmbed"&gt;CodeRankEmbed&lt;/a&gt; which is just 521.60MB.&lt;/p&gt;
&lt;p&gt;I missed that when it first came out, so I decided to give it a try using my &lt;a href="https://github.com/simonw/llm-sentence-transformers"&gt;llm-sentence-transformers&lt;/a&gt; plugin for &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-sentence-transformers
llm sentence-transformers register nomic-ai/CodeRankEmbed --trust-remote-code
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now I can run the model like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm embed -m sentence-transformers/nomic-ai/CodeRankEmbed -c 'hello'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This outputs an array of 768 numbers, starting &lt;code&gt;[1.4794224500656128, -0.474479079246521, ...&lt;/code&gt;.&lt;/p&gt;
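&lt;p&gt;Those arrays are useful because you can compare them with cosine similarity. Here's a minimal sketch of that calculation, using toy 4-dimensional vectors in place of the model's 768-dimensional output:&lt;/p&gt;

```python
import math

# Cosine similarity: dot product divided by the product of vector lengths.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings.
print(cosine([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0]))  # 1.0
print(cosine([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]))  # 0.0
```

&lt;p&gt;Identical vectors score 1.0 and orthogonal ones score 0 - similar code and queries end up closer to 1.0.&lt;/p&gt;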
&lt;p&gt;Where this gets fun is combining it with my &lt;a href="https://simonwillison.net/2023/Jun/18/symbex/"&gt;Symbex tool&lt;/a&gt; to create and then search embeddings for functions in a codebase.&lt;/p&gt;
&lt;p&gt;I created an index for my LLM codebase like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cd llm
symbex '*' '*.*' --nl &amp;gt; code.txt
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This creates a newline-separated JSON file of all of the functions (from &lt;code&gt;'*'&lt;/code&gt;) and methods (from &lt;code&gt;'*.*'&lt;/code&gt;) in the current directory - you can &lt;a href="https://gist.github.com/simonw/ac45c6638ea87942383e97c5cf69ae09"&gt;see that here&lt;/a&gt;.&lt;/p&gt;
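&lt;p&gt;Each line of that file is a standalone JSON object, so it's easy to process with a couple of lines of Python. The &lt;code&gt;"id"&lt;/code&gt; and &lt;code&gt;"code"&lt;/code&gt; keys in this sketch are illustrative stand-ins, not a documented symbex schema:&lt;/p&gt;

```python
import json

# Two illustrative lines in the newline-delimited JSON shape;
# the "id" and "code" keys are stand-ins for whatever symbex emits.
sample = (
    '{"id": "llm/cli.py:1776", "code": "def plugins_list(all): ..."}\n'
    '{"id": "llm/cli.py:1791", "code": "def install(packages): ..."}\n'
)
records = [json.loads(line) for line in sample.splitlines() if line.strip()]
for record in records:
    print(record["id"])
```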
&lt;p&gt;Then I fed that into the &lt;a href="https://llm.datasette.io/en/stable/embeddings/cli.html#llm-embed-multi"&gt;llm embed-multi&lt;/a&gt; command like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm embed-multi \
  -d code.db \
  -m sentence-transformers/nomic-ai/CodeRankEmbed \
  code code.txt \
  --format nl \
  --store \
  --batch-size 10
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I found the &lt;code&gt;--batch-size&lt;/code&gt; option was needed to prevent it from crashing with an error.&lt;/p&gt;
&lt;p&gt;The above command creates a collection called &lt;code&gt;code&lt;/code&gt; in a SQLite database called &lt;code&gt;code.db&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Having run this command I can search for functions that match a specific search term in that &lt;code&gt;code&lt;/code&gt; collection like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm similar code -d code.db \
  -c 'Represent this query for searching relevant code: install a plugin' | jq
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That &lt;code&gt;"Represent this query for searching relevant code: "&lt;/code&gt; prefix is required by the model. I pipe it through &lt;code&gt;jq&lt;/code&gt; to make it a little more readable, which gives me &lt;a href="https://gist.github.com/simonw/fdc1b48b20a99714200f5d3970b1dff4"&gt;these results&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This &lt;code&gt;jq&lt;/code&gt; recipe makes for a better output:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm similar code -d code.db \
  -c 'Represent this query for searching relevant code: install a plugin' | \
  jq -r '.id + "\n\n" + .content + "\n--------\n"'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output from that starts like so:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm/cli.py:1776

@cli.command(name="plugins")
@click.option("--all", help="Include built-in default plugins", is_flag=True)
def plugins_list(all):
    "List installed plugins"
    click.echo(json.dumps(get_plugins(all), indent=2))
--------

llm/cli.py:1791

@cli.command()
@click.argument("packages", nargs=-1, required=False)
@click.option(
    "-U", "--upgrade", is_flag=True, help="Upgrade packages to latest version"
)
...
def install(packages, upgrade, editable, force_reinstall, no_cache_dir):
    """Install packages from PyPI into the same environment as LLM"""
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Getting this output was quite inconvenient, so I've &lt;a href="https://github.com/simonw/llm/issues/853"&gt;opened an issue&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jq"&gt;jq&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="jq"/><category term="embeddings"/><category term="llm"/><category term="nomic"/></entry><entry><title>Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model</title><link href="https://simonwillison.net/2025/Feb/12/nomic-embed-text-v2/#atom-tag" rel="alternate"/><published>2025-02-12T22:24:19+00:00</published><updated>2025-02-12T22:24:19+00:00</updated><id>https://simonwillison.net/2025/Feb/12/nomic-embed-text-v2/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.nomic.ai/blog/posts/nomic-embed-text-v2"&gt;Nomic Embed Text V2: An Open Source, Multilingual, Mixture-of-Experts Embedding Model&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Nomic continue to release the most interesting and powerful embedding models. Their latest is Embed Text V2, an Apache 2.0 licensed multilingual 1.9GB model (here it is &lt;a href="https://huggingface.co/nomic-ai/nomic-embed-text-v2-moe"&gt;on Hugging Face&lt;/a&gt;) trained on "1.6 billion high-quality data pairs", which is the first embedding model I've seen to use a Mixture of Experts architecture:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In our experiments, we found that alternating MoE layers with 8 experts and top-2 routing provides the optimal balance between performance and efficiency. This results in 475M total parameters in the model, but only 305M active during training and inference.&lt;/p&gt;
&lt;/blockquote&gt;
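&lt;p&gt;Top-2 routing means a small gating function scores every expert and only the two highest-scoring experts actually run, with their outputs combined using softmax-normalized weights. Here's a toy sketch of the idea - not Nomic's implementation - using simple functions as stand-in experts:&lt;/p&gt;

```python
import math

# Toy top-2 mixture-of-experts routing: score all experts, run only
# the top two, combine their outputs with softmax-normalized weights.
def top2_moe(x, experts, gate_scores):
    top2 = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)[:2]
    exps = [math.exp(gate_scores[i]) for i in top2]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * experts[i](x) for w, i in zip(weights, top2))

experts = [lambda x, k=k: k * x for k in range(8)]  # stand-in expert functions
gate_scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1]
print(top2_moe(10.0, experts, gate_scores))
```

&lt;p&gt;The other six experts never execute, which is how the model can have 475M total parameters but only 305M active at a time.&lt;/p&gt;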
&lt;p&gt;I first tried it out using &lt;code&gt;uv run&lt;/code&gt; like this:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;uv run \
  --with einops \
  --with sentence-transformers \
  --python 3.13 python&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Then:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sentence_transformers&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;SentenceTransformer&lt;/span&gt;
&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;SentenceTransformer&lt;/span&gt;(&lt;span class="pl-s"&gt;"nomic-ai/nomic-embed-text-v2-moe"&lt;/span&gt;, &lt;span class="pl-s1"&gt;trust_remote_code&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
&lt;span class="pl-s1"&gt;sentences&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; [&lt;span class="pl-s"&gt;"Hello!"&lt;/span&gt;, &lt;span class="pl-s"&gt;"¡Hola!"&lt;/span&gt;]
&lt;span class="pl-s1"&gt;embeddings&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;encode&lt;/span&gt;(&lt;span class="pl-s1"&gt;sentences&lt;/span&gt;, &lt;span class="pl-s1"&gt;prompt_name&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"passage"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;embeddings&lt;/span&gt;)&lt;/pre&gt;

&lt;p&gt;Then I got it working on my laptop using the &lt;a href="https://github.com/simonw/llm-sentence-transformers"&gt;llm-sentence-transformers&lt;/a&gt; plugin like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;llm install llm-sentence-transformers
llm install einops # additional necessary package
llm sentence-transformers register nomic-ai/nomic-embed-text-v2-moe --trust-remote-code

llm embed -m sentence-transformers/nomic-ai/nomic-embed-text-v2-moe -c 'string to embed'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This outputs a 768-item JSON array of floating point numbers to the terminal. These are &lt;a href="https://huggingface.co/blog/matryoshka"&gt;Matryoshka embeddings&lt;/a&gt;, which means you can truncate that down to just the first 256 items and get similarity calculations that still work, albeit slightly less well.&lt;/p&gt;
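&lt;p&gt;Truncating a Matryoshka embedding means keeping the first few dimensions and re-normalizing before computing similarity. A toy sketch with 8-dimensional vectors standing in for the real 768-dimensional output:&lt;/p&gt;

```python
import math

# Keep the first k dimensions, then re-normalize to unit length so
# a plain dot product gives cosine similarity.
def truncate_and_normalize(vec, k):
    head = vec[:k]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy vectors standing in for real model output.
a = [0.5, 0.1, -0.3, 0.8, 0.02, -0.01, 0.005, 0.002]
b = [0.4, 0.2, -0.2, 0.7, 0.01, -0.02, 0.004, 0.001]
a4 = truncate_and_normalize(a, 4)
b4 = truncate_and_normalize(b, 4)
sim = sum(x * y for x, y in zip(a4, b4))
print(round(sim, 3))
```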
&lt;p&gt;To use this for RAG you'll need to conform to Nomic's custom prompt format. For documents to be searched:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;search_document: text of document goes here
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And for search queries:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;search_query: term to search for
&lt;/code&gt;&lt;/pre&gt;
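&lt;p&gt;Those prefixes are plain strings prepended to the text before embedding. These tiny helper functions are hypothetical - not part of LLM or Nomic's libraries - but show the shape:&lt;/p&gt;

```python
# Hypothetical helpers for applying Nomic's prompt prefixes;
# just string concatenation, nothing model-specific.
def as_search_document(text):
    return "search_document: " + text

def as_search_query(text):
    return "search_query: " + text

print(as_search_query("term to search for"))
```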
&lt;p&gt;I &lt;a href="https://github.com/simonw/llm/issues/745"&gt;landed a new --prepend option&lt;/a&gt; for the &lt;a href="https://llm.datasette.io/en/stable/embeddings/cli.html#llm-embed-multi"&gt;llm embed-multi&lt;/a&gt; command to help with that, but it's not out in a full release just yet. (&lt;strong&gt;Update&lt;/strong&gt;: it's now out in &lt;a href="https://simonwillison.net/2025/Feb/17/llm/"&gt;LLM 0.22&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;I also released &lt;a href="https://github.com/simonw/llm-sentence-transformers/releases/tag/0.3"&gt;llm-sentence-transformers 0.3&lt;/a&gt; with some minor improvements to make running this model smoother.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/nomic_ai/status/1889721439948820665"&gt;@nomic_ai&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rag"&gt;rag&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/uv"&gt;uv&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="ai"/><category term="embeddings"/><category term="llm"/><category term="nomic"/><category term="rag"/><category term="uv"/></entry><entry><title>llm-gpt4all</title><link href="https://simonwillison.net/2024/Apr/20/llm-gpt4all/#atom-tag" rel="alternate"/><published>2024-04-20T17:58:25+00:00</published><updated>2024-04-20T17:58:25+00:00</updated><id>https://simonwillison.net/2024/Apr/20/llm-gpt4all/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-gpt4all/releases/tag/0.4"&gt;llm-gpt4all&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New release of my LLM plugin which builds on Nomic's excellent gpt4all Python library. I've upgraded to their latest version which adds support for Llama 3 8B Instruct, so after a 4.4GB model download this works:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;llm -m Meta-Llama-3-8B-Instruct "say hi in Spanish"&lt;/code&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llama"&gt;llama&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="ai"/><category term="generative-ai"/><category term="llama"/><category term="local-llms"/><category term="llms"/><category term="llm"/><category term="nomic"/></entry><entry><title>llm-nomic-api-embed</title><link href="https://simonwillison.net/2024/Mar/31/llm-nomic-api-embed/#atom-tag" rel="alternate"/><published>2024-03-31T15:17:12+00:00</published><updated>2024-03-31T15:17:12+00:00</updated><id>https://simonwillison.net/2024/Mar/31/llm-nomic-api-embed/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-nomic-api-embed"&gt;llm-nomic-api-embed&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;My new plugin for LLM which adds API access to the Nomic series of embedding models. Nomic models can be run locally too, which makes them a safe long-term choice: there’s no risk of the models being retired in a way that damages the value of your previously calculated embedding vectors.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="ai"/><category term="embeddings"/><category term="llm"/><category term="nomic"/></entry><entry><title>Adaptive Retrieval with Matryoshka Embeddings</title><link href="https://simonwillison.net/2024/Feb/15/adaptive-retrieval-with-matryoshka-embeddings/#atom-tag" rel="alternate"/><published>2024-02-15T04:19:55+00:00</published><updated>2024-02-15T04:19:55+00:00</updated><id>https://simonwillison.net/2024/Feb/15/adaptive-retrieval-with-matryoshka-embeddings/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/spaces/Xenova/adaptive-retrieval-web"&gt;Adaptive Retrieval with Matryoshka Embeddings&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Nomic Embed v1 only came out two weeks ago, but the same team &lt;a href="https://www.nomic.ai/blog/posts/nomic-embed-matryoshka"&gt;just released Nomic Embed v1.5&lt;/a&gt; trained using a new technique called Matryoshka Representation Learning.&lt;/p&gt;
&lt;p&gt;This means that unlike v1 the v1.5 embeddings are resizable - instead of a fixed 768 dimension embedding vector you can trade size for quality and drop that size all the way down to 64, while still maintaining strong semantically relevant results.&lt;/p&gt;
&lt;p&gt;Joshua Lochner build this interactive demo on top of Transformers.js which illustrates quite how well this works: it lets you embed a query, embed a series of potentially matching text sentences and then adjust the number of dimensions and see what impact it has on the results.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/xenovacom/status/1757798436009599413"&gt;@xenovacom&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/transformers-js"&gt;transformers-js&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="llms"/><category term="embeddings"/><category term="nomic"/><category term="transformers-js"/></entry><entry><title>llm-sentence-transformers 0.2</title><link href="https://simonwillison.net/2024/Feb/4/llm-sentence-transformers-02/#atom-tag" rel="alternate"/><published>2024-02-04T19:39:22+00:00</published><updated>2024-02-04T19:39:22+00:00</updated><id>https://simonwillison.net/2024/Feb/4/llm-sentence-transformers-02/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/simonw/llm-sentence-transformers/releases/tag/0.2"&gt;llm-sentence-transformers 0.2&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I added a new &lt;code&gt;--trust-remote-code&lt;/code&gt; option when registering an embedding model, which means LLM can now run embeddings through the new Nomic AI &lt;code&gt;nomic-embed-text-v1&lt;/code&gt; model.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/transformers"&gt;transformers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="transformers"/><category term="ai"/><category term="embeddings"/><category term="llm"/><category term="nomic"/></entry><entry><title>Introducing Nomic Embed: A Truly Open Embedding Model</title><link href="https://simonwillison.net/2024/Feb/3/introducing-nomic-embed-a-truly-open-embedding-model/#atom-tag" rel="alternate"/><published>2024-02-03T23:13:00+00:00</published><updated>2024-02-03T23:13:00+00:00</updated><id>https://simonwillison.net/2024/Feb/3/introducing-nomic-embed-a-truly-open-embedding-model/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.nomic.ai/posts/nomic-embed-text-v1"&gt;Introducing Nomic Embed: A Truly Open Embedding Model&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A new text embedding model from Nomic AI which supports 8192 length sequences, claims better scores than many other models (including OpenAI’s new text-embedding-3-small) and is available as both a hosted API and a run-yourself model. The model is Apache 2 licensed and Nomic have released the full set of training data and code.&lt;/p&gt;

&lt;p&gt;From the accompanying paper: “Full training of nomic-embed-text-v1 can be conducted in a single week on one 8xH100 node.”&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nomic"&gt;nomic&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="embeddings"/><category term="nomic"/></entry></feed>