<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: dsl</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/dsl.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2024-11-06T20:00:23+00:00</updated><author><name>Simon Willison</name></author><entry><title>yet-another-applied-llm-benchmark</title><link href="https://simonwillison.net/2024/Nov/6/yet-another-applied-llm-benchmark/#atom-tag" rel="alternate"/><published>2024-11-06T20:00:23+00:00</published><updated>2024-11-06T20:00:23+00:00</updated><id>https://simonwillison.net/2024/Nov/6/yet-another-applied-llm-benchmark/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/carlini/yet-another-applied-llm-benchmark"&gt;yet-another-applied-llm-benchmark&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Nicholas Carlini introduced this personal LLM benchmark suite &lt;a href="https://nicholas.carlini.com/writing/2024/my-benchmark-for-large-language-models.html"&gt;back in February&lt;/a&gt; as a collection of over 100 automated tests he runs against new LLMs to evaluate how well they perform the kinds of tasks &lt;a href="https://nicholas.carlini.com/writing/2024/how-i-use-ai.html"&gt;he uses them for&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There are two defining features of this benchmark that make it interesting. Most importantly, I've implemented a simple dataflow domain specific language to make it easy for me (or anyone else!) to add new tests that realistically evaluate model capabilities. This DSL allows for specifying both how the question should be asked and also how the answer should be evaluated. [...]  And then, directly as a result of this, I've written nearly 100 tests for different situations I've actually encountered when working with LLMs as assistants&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The DSL he's using is &lt;em&gt;fascinating&lt;/em&gt;. Here's an example:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;"Write a C program that draws an american flag to stdout." &amp;gt;&amp;gt; LLMRun() &amp;gt;&amp;gt; CRun() &amp;gt;&amp;gt; \
    VisionLLMRun("What flag is shown in this image?") &amp;gt;&amp;gt; \
    (SubstringEvaluator("United States") | SubstringEvaluator("USA"))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This sends an LLM the prompt asking for a C program that renders an American flag, compiles and runs the resulting program (inside a Docker container), passes the output to a vision model to identify the flag, and checks that the answer contains "United States" or "USA".&lt;/p&gt;
&lt;p&gt;The DSL itself is implemented &lt;a href="https://github.com/carlini/yet-another-applied-llm-benchmark/blob/main/evaluator.py"&gt;entirely in Python&lt;/a&gt;, using the &lt;code&gt;__rshift__&lt;/code&gt; magic method for &lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt; and &lt;code&gt;__rrshift__&lt;/code&gt; to enable strings to be piped into a custom object using &lt;code&gt;"command to run" &amp;gt;&amp;gt; LLMRunNode&lt;/code&gt;.&lt;/p&gt;
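&lt;p&gt;To make the operator trick concrete, here is a minimal sketch of how &lt;code&gt;__rshift__&lt;/code&gt;, &lt;code&gt;__rrshift__&lt;/code&gt; and &lt;code&gt;__or__&lt;/code&gt; can combine into a pipeline like the one above. The class names (&lt;code&gt;Node&lt;/code&gt;, &lt;code&gt;Constant&lt;/code&gt;, &lt;code&gt;Then&lt;/code&gt;, &lt;code&gt;Upper&lt;/code&gt;, &lt;code&gt;Contains&lt;/code&gt;, &lt;code&gt;Either&lt;/code&gt;) are hypothetical stand-ins invented for illustration, not the classes in the benchmark's evaluator.py, and no LLM is called.&lt;/p&gt;

```python
class Node:
    """Base class for pipeline stages (hypothetical, for illustration only)."""

    def run(self, value=None):
        raise NotImplementedError

    def __rshift__(self, other):
        # node >> node: chain two stages into one pipeline
        return Then(self, other)

    def __rrshift__(self, other):
        # "text" >> node: str cannot handle the operator for a Node,
        # so Python falls back to the right-hand operand's __rrshift__
        return Then(Constant(other), self)

    def __or__(self, other):
        # evaluator | evaluator: pass if either side passes
        return Either(self, other)


class Constant(Node):
    """Wraps a plain value so it can start a pipeline."""

    def __init__(self, value):
        self.value = value

    def run(self, value=None):
        return self.value


class Then(Node):
    """Runs the first stage, then feeds its output to the second."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def run(self, value=None):
        return self.second.run(self.first.run(value))


class Upper(Node):
    """Toy stand-in for a processing stage: upper-cases its input."""

    def run(self, value=None):
        return value.upper()


class Contains(Node):
    """Toy evaluator: checks whether the input contains a substring."""

    def __init__(self, needle):
        self.needle = needle

    def run(self, value=None):
        return self.needle in value


class Either(Node):
    """Combines two evaluators with a logical OR."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def run(self, value=None):
        return self.left.run(value) or self.right.run(value)


# The parentheses matter: >> binds more tightly than | in Python.
pipeline = "draw the usa flag" >> Upper() >> (Contains("UNITED STATES") | Contains("USA"))
result = pipeline.run()
```

&lt;p&gt;Nothing executes while the expression is being built; the operators only assemble a tree of objects, and the work happens when &lt;code&gt;run()&lt;/code&gt; is called on the outermost node.&lt;/p&gt;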


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/dsl"&gt;dsl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/evals"&gt;evals&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/nicholas-carlini"&gt;nicholas-carlini&lt;/a&gt;&lt;/p&gt;



</summary><category term="dsl"/><category term="python"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="evals"/><category term="nicholas-carlini"/></entry><entry><title>Building Search DSLs with Django</title><link href="https://simonwillison.net/2023/Jun/19/building-search-dsls-with-django/#atom-tag" rel="alternate"/><published>2023-06-19T08:30:32+00:00</published><updated>2023-06-19T08:30:32+00:00</updated><id>https://simonwillison.net/2023/Jun/19/building-search-dsls-with-django/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://danlamanna.com/posts/building-search-dsls-with-django/"&gt;Building Search DSLs with Django&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Neat tutorial by Dan Lamanna: how to build a GitHub-style search feature—supporting modifiers like “is:open author:danlamanna”—using PyParsing and the Django ORM.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/itjx6c/building_search_dsls_with_django"&gt;Lobsters&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dsl"&gt;dsl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/parsing"&gt;parsing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;&lt;/p&gt;



</summary><category term="django"/><category term="dsl"/><category term="parsing"/><category term="python"/><category term="search"/></entry><entry><title>Datasette table diagram using Mermaid</title><link href="https://simonwillison.net/2022/Feb/14/datasette-table-diagram-using-mermaid/#atom-tag" rel="alternate"/><published>2022-02-14T19:43:15+00:00</published><updated>2022-02-14T19:43:15+00:00</updated><id>https://simonwillison.net/2022/Feb/14/datasette-table-diagram-using-mermaid/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://observablehq.com/@simonw/datasette-table-diagram-using-mermaid"&gt;Datasette table diagram using Mermaid&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Mermaid is a DSL for generating diagrams from plain text, designed to be embedded in Markdown. GitHub just added support for Mermaid to their Markdown pipeline, which inspired me to try it out. Here’s an Observable Notebook I built which uses Mermaid to visualize the relationships between Datasette tables based on their foreign keys.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/1493305519481626626"&gt;@simonw&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/dsl"&gt;dsl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/visualization"&gt;visualization&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mermaid"&gt;mermaid&lt;/a&gt;&lt;/p&gt;



</summary><category term="dsl"/><category term="github"/><category term="visualization"/><category term="datasette"/><category term="observable"/><category term="mermaid"/></entry><entry><title>Richard Jones: Something I'm working on...</title><link href="https://simonwillison.net/2009/Aug/7/something/#atom-tag" rel="alternate"/><published>2009-08-07T15:47:00+00:00</published><updated>2009-08-07T15:47:00+00:00</updated><id>https://simonwillison.net/2009/Aug/7/something/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.mechanicalcat.net/richard/log/Python/Something_I_m_working_on.3"&gt;Richard Jones: Something I&amp;#x27;m working on...&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Python’s with statement appears to provide just enough syntactic sugar to create some really interesting DSL-style APIs—here’s a very promising example for laying out GUI applications.
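&lt;p&gt;As a rough illustration of the idea (not Richard Jones's actual API), nested &lt;code&gt;with&lt;/code&gt; blocks can mirror the nesting of a GUI layout. The &lt;code&gt;Layout&lt;/code&gt; class and its &lt;code&gt;container&lt;/code&gt; and &lt;code&gt;widget&lt;/code&gt; methods below are hypothetical names invented for this sketch; it only builds a tree of dictionaries and draws nothing.&lt;/p&gt;

```python
from contextlib import contextmanager


class Layout:
    """Collects widgets into a nested tree as with-blocks open and close."""

    def __init__(self):
        self.root = {"kind": "root", "children": []}
        self._stack = [self.root]

    @contextmanager
    def container(self, kind):
        # Entering the block makes this node the current parent
        node = {"kind": kind, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)
        try:
            yield node
        finally:
            # Leaving the block restores the previous parent
            self._stack.pop()

    def widget(self, kind, label):
        # Leaf widgets attach to whichever container is currently open
        self._stack[-1]["children"].append({"kind": kind, "label": label})


ui = Layout()
with ui.container("window"):
    with ui.container("row"):
        ui.widget("label", "Name")
        ui.widget("textbox", "")
    ui.widget("button", "OK")
```

&lt;p&gt;The indentation of the source code ends up matching the structure of the interface, which is what gives this style its DSL feel.&lt;/p&gt;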


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/dsl"&gt;dsl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gui"&gt;gui&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/richard-jones"&gt;richard-jones&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/with"&gt;with&lt;/a&gt;&lt;/p&gt;



</summary><category term="dsl"/><category term="gui"/><category term="python"/><category term="richard-jones"/><category term="with"/></entry><entry><title>Metaprogramming JavaScript Presentation</title><link href="https://simonwillison.net/2007/Mar/26/adamlogic/#atom-tag" rel="alternate"/><published>2007-03-26T19:45:11+00:00</published><updated>2007-03-26T19:45:11+00:00</updated><id>https://simonwillison.net/2007/Mar/26/adamlogic/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.adamlogic.com/2007/03/20/3_metaprogramming-javascript-presentation"&gt;Metaprogramming JavaScript Presentation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Adam McCrea demonstrates some incredibly elegant DSL-style JavaScript based on chaining method calls together.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/adam-mccrea"&gt;adam-mccrea&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dsl"&gt;dsl&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/metaprogramming"&gt;metaprogramming&lt;/a&gt;&lt;/p&gt;



</summary><category term="adam-mccrea"/><category term="dsl"/><category term="javascript"/><category term="metaprogramming"/></entry></feed>