<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: agent-definitions</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/agent-definitions.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2026-03-15T22:41:57+00:00</updated><author><name>Simon Willison</name></author><entry><title>What is agentic engineering?</title><link href="https://simonwillison.net/guides/agentic-engineering-patterns/what-is-agentic-engineering/#atom-tag" rel="alternate"/><published>2026-03-15T22:41:57+00:00</published><updated>2026-03-15T22:41:57+00:00</updated><id>https://simonwillison.net/guides/agentic-engineering-patterns/what-is-agentic-engineering/#atom-tag</id><summary type="html">
    &lt;p&gt;&lt;em&gt;&lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/"&gt;Agentic Engineering Patterns&lt;/a&gt; &amp;gt;&lt;/em&gt;&lt;/p&gt;
    &lt;p&gt;I use the term &lt;strong&gt;agentic engineering&lt;/strong&gt; to describe the practice of developing software with the assistance of coding agents.&lt;/p&gt;
&lt;p&gt;What are &lt;strong&gt;coding agents&lt;/strong&gt;? They're agents that can both write and execute code. Popular examples include &lt;a href="https://code.claude.com/"&gt;Claude Code&lt;/a&gt;, &lt;a href="https://openai.com/codex/"&gt;OpenAI Codex&lt;/a&gt;, and &lt;a href="https://geminicli.com/"&gt;Gemini CLI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What's an &lt;strong&gt;agent&lt;/strong&gt;? Clearly defining that term is a challenge that has frustrated AI researchers since &lt;a href="https://simonwillison.net/2024/Oct/12/michael-wooldridge/"&gt;at least the 1990s&lt;/a&gt; but the definition I've come to accept, at least in the field of Large Language Models (LLMs) like GPT-5 and Gemini and Claude, is this one:&lt;/p&gt;
&lt;p&gt;&lt;center&gt;&lt;strong&gt;Agents run tools in a loop to achieve a goal&lt;/strong&gt;&lt;/center&gt;&lt;/p&gt;

&lt;p&gt;The "agent" is software that calls an LLM with your prompt and passes it a set of tool definitions, then calls any tools that the LLM requests and feeds the results back into the LLM.&lt;/p&gt;
&lt;p&gt;For coding agents, those tools include one that can execute code.&lt;/p&gt;
&lt;p&gt;Your prompt to the coding agent defines a goal. The agent then generates and executes code in a loop until that goal has been met.&lt;/p&gt;
&lt;p&gt;Code execution is the defining capability that makes agentic engineering possible. Without the ability to directly run the code, anything output by an LLM is of limited value. With code execution, these agents can start iterating towards software that demonstrably works.&lt;/p&gt;
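&lt;p&gt;The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: the &lt;code&gt;fake_llm&lt;/code&gt; function stands in for a real model call, and the &lt;code&gt;run_shell&lt;/code&gt; tool is a deliberately toy "execute code" capability.&lt;/p&gt;

```python
def run_shell(command):
    """A toy 'execute code' tool: here it only evaluates arithmetic."""
    return str(eval(command, {"__builtins__": {}}))

TOOLS = {"run_shell": run_shell}

def fake_llm(messages):
    """Stand-in for an LLM API call: requests one tool, then finishes.

    A real implementation would send `messages` plus the tool
    definitions to a model API and parse its response.
    """
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "run_shell", "args": {"command": "6 * 7"}}
    return {"done": True, "answer": "The result is " + tool_results[-1]["content"]}

def run_agent(goal, llm=fake_llm, max_steps=10):
    """Call the model, run any requested tool, feed the result back,
    and repeat until the model declares the goal achieved."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        response = llm(messages)
        if response.get("done"):
            return response["answer"]
        result = TOOLS[response["tool"]](**response["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("goal not achieved within step budget")
```

&lt;p&gt;A real harness would send the message list plus JSON tool definitions to a model API and parse structured tool-call responses, but the control flow is the same: call the model, run the requested tool, append the result, repeat until done.&lt;/p&gt;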
&lt;h2 id="agentic-engineering"&gt;Agentic engineering&lt;/h2&gt;
&lt;p&gt;Now that we have software that can write working code, what is there left for us humans to do?&lt;/p&gt;
&lt;p&gt;The answer is &lt;em&gt;so much stuff&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Writing code has never been the sole activity of a software engineer. The craft has always been figuring out &lt;em&gt;what&lt;/em&gt; code to write. Any given software problem has dozens of potential solutions, each with their own tradeoffs. Our job is to navigate those options and find the ones that are the best fit for our unique set of circumstances and requirements.&lt;/p&gt;
&lt;p&gt;Getting great results out of coding agents is a deep subject in its own right, especially now as the field continues to evolve at a bewildering rate.&lt;/p&gt;
&lt;p&gt;We need to provide our coding agents with the tools they need to solve our problems, specify those problems in the right level of detail, and verify and iterate on the results until we are confident they address our problems in a robust and credible way.&lt;/p&gt;
&lt;p&gt;LLMs don't learn from their past mistakes, but coding agents can, provided we deliberately update our instructions and tool harnesses to account for what we learn along the way.&lt;/p&gt;
&lt;p&gt;Used effectively, coding agents can help us be much more ambitious with the projects we take on. Agentic engineering should help us produce more, better quality code that solves more impactful problems.&lt;/p&gt;
&lt;h2 id="isnt-this-just-vibe-coding"&gt;Isn't this just vibe coding?&lt;/h2&gt;
&lt;p&gt;The term "vibe coding" was &lt;a href="https://twitter.com/karpathy/status/1886192184808149383"&gt;coined by Andrej Karpathy&lt;/a&gt; in February 2025 - coincidentally just three weeks prior to the original release of Claude Code - to describe prompting LLMs to write code while you "forget that the code even exists".&lt;/p&gt;
&lt;p&gt;Some people extend that definition to cover any time an LLM is used to produce code at all, but I think that's a mistake. Vibe coding is more useful in its original definition - we need a term to describe unreviewed, prototype-quality LLM-generated code that distinguishes it from code that the author has brought up to a production-ready standard.&lt;/p&gt;
&lt;h2 id="about-this-guide"&gt;About this guide&lt;/h2&gt;
&lt;p&gt;Just like the field it attempts to cover, &lt;em&gt;Agentic Engineering Patterns&lt;/em&gt; is very much a work in progress. My goal is to identify and describe patterns for working with these tools that demonstrably get results, and that are unlikely to become outdated as the tools advance.&lt;/p&gt;
&lt;p&gt;I'll continue adding more chapters as new techniques emerge. No chapter should be considered finished. I'll be updating existing chapters as our understanding of these patterns evolves.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="coding-agents"/><category term="agent-definitions"/><category term="generative-ai"/><category term="agentic-engineering"/><category term="ai"/><category term="llms"/></entry><entry><title>Andrej Karpathy — AGI is still a decade away</title><link href="https://simonwillison.net/2025/Oct/18/agi-is-still-a-decade-away/#atom-tag" rel="alternate"/><published>2025-10-18T03:25:59+00:00</published><updated>2025-10-18T03:25:59+00:00</updated><id>https://simonwillison.net/2025/Oct/18/agi-is-still-a-decade-away/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.dwarkesh.com/p/andrej-karpathy"&gt;Andrej Karpathy — AGI is still a decade away&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Extremely high-signal 2 hour 25 minute (!) conversation between Andrej Karpathy and Dwarkesh Patel.&lt;/p&gt;
&lt;p&gt;It starts with Andrej's claim that "the year of agents" is actually more likely to take a decade. Seeing as I &lt;a href="https://simonwillison.net/2025/Oct/16/claude-skills/#claude-as-a-general-agent"&gt;accepted 2025 as the year of agents&lt;/a&gt; just yesterday, this instantly caught my attention!&lt;/p&gt;
&lt;p&gt;It turns out Andrej is using a different definition of agents to &lt;a href="https://simonwillison.net/2025/Sep/18/agents/"&gt;the one that I prefer&lt;/a&gt; - emphasis mine:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When you’re talking about an agent, or what the labs have in mind and maybe what I have in mind as well, you should &lt;strong&gt;think of it almost like an employee or an intern that you would hire to work with you&lt;/strong&gt;. For example, you work with some employees here. When would you prefer to have an agent like Claude or Codex do that work?&lt;/p&gt;
&lt;p&gt;Currently, of course they can’t. What would it take for them to be able to do that? Why don’t you do it today? The reason you don’t do it today is because they just don’t work. &lt;strong&gt;They don’t have enough intelligence, they’re not multimodal enough, they can’t do computer use and all this stuff&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;They don’t do a lot of the things you’ve alluded to earlier. &lt;strong&gt;They don’t have continual learning&lt;/strong&gt;. You can’t just tell them something and they’ll remember it. They’re cognitively lacking and it’s just not working. It will take about a decade to work through all of those issues.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Yeah, continual-learning human-replacement agents definitely aren't happening in 2025! Coding agents that are &lt;em&gt;really good&lt;/em&gt; at running tools in a loop, on the other hand, are here already.&lt;/p&gt;
&lt;p&gt;I loved this bit introducing an analogy of LLMs as ghosts or spirits, as opposed to having brains like animals or humans:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Brains just came from a very different process, and I’m very hesitant to take inspiration from it because we’re not actually running that process. In my post, I said we’re not building animals. We’re building ghosts or spirits or whatever people want to call it, because we’re not doing training by evolution. We’re doing training by imitation of humans and the data that they’ve put on the Internet.&lt;/p&gt;
&lt;p&gt;You end up with these ethereal spirit entities because they’re fully digital and they’re mimicking humans. It’s a different kind of intelligence. If you imagine a space of intelligences, we’re starting off at a different point almost. We’re not really building animals. But it’s also possible to make them a bit more animal-like over time, and I think we should be doing that.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The post Andrej mentions is &lt;a href="https://karpathy.bearblog.dev/animals-vs-ghosts/"&gt;Animals vs Ghosts&lt;/a&gt; on his blog.&lt;/p&gt;
&lt;p&gt;Dwarkesh asked Andrej about &lt;a href="https://twitter.com/karpathy/status/1977758204139331904"&gt;this tweet&lt;/a&gt; where he said that Claude Code and Codex CLI "didn't work well enough at all and net unhelpful" for his &lt;a href="https://simonwillison.net/2025/Oct/13/nanochat/"&gt;nanochat project&lt;/a&gt;. Andrej responded:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...] So the agents are pretty good, for example, if you’re doing boilerplate stuff. Boilerplate code that’s just copy-paste stuff, they’re very good at that. They’re very good at stuff that occurs very often on the Internet because there are lots of examples of it in the training sets of these models. There are features of things where the models will do very well.&lt;/p&gt;
&lt;p&gt;I would say nanochat is not an example of those because it’s a fairly unique repository. There’s not that much code in the way that I’ve structured it. It’s not boilerplate code. It’s intellectually intense code almost, and everything has to be very precisely arranged. The models have so many cognitive deficits. One example, they kept misunderstanding the code because they have too much memory from all the typical ways of doing things on the Internet that I just wasn’t adopting.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: Here's an &lt;a href="https://twitter.com/karpathy/status/1979644538185752935"&gt;essay-length tweet&lt;/a&gt; from Andrej clarifying a whole bunch of the things he talked about on the podcast.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=45619329"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/andrej-karpathy"&gt;andrej-karpathy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="andrej-karpathy"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="ai-agents"/><category term="coding-agents"/><category term="agent-definitions"/></entry><entry><title>a system that can do work independently on behalf of the user</title><link href="https://simonwillison.net/2025/Oct/6/work-independently/#atom-tag" rel="alternate"/><published>2025-10-06T23:17:55+00:00</published><updated>2025-10-06T23:17:55+00:00</updated><id>https://simonwillison.net/2025/Oct/6/work-independently/#atom-tag</id><summary type="html">
    &lt;p&gt;I've settled on agents as meaning &lt;a href="https://simonwillison.net/2025/Sep/18/agents/"&gt;"LLMs calling tools in a loop to achieve a goal"&lt;/a&gt; but OpenAI continue to muddy the waters with much more vague definitions. Swyx &lt;a href="https://twitter.com/swyx/status/1975335082048246159"&gt;spotted this one&lt;/a&gt; in the press pack OpenAI sent out for their DevDay announcements today:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;How does OpenAI define an "agent"?&lt;/strong&gt; An AI agent is a system that can do work independently on behalf of the user.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Adding this one &lt;a href="https://simonwillison.net/tags/agent-definitions/"&gt;to my collection&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/swyx"&gt;swyx&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-agents"/><category term="openai"/><category term="agent-definitions"/><category term="swyx"/></entry><entry><title>Quoting Steve Jobs</title><link href="https://simonwillison.net/2025/Sep/18/steve-jobs/#atom-tag" rel="alternate"/><published>2025-09-18T21:47:56+00:00</published><updated>2025-09-18T21:47:56+00:00</updated><id>https://simonwillison.net/2025/Sep/18/steve-jobs/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.thedailybeast.com/steve-jobs-1984-access-magazine-interview/"&gt;&lt;p&gt;Well, the types of computers we have today are tools. They’re responders: you ask a computer to do something and it will do it. The next stage is going to be computers as “agents.” In other words, it will be as if there’s a little person inside that box who starts to anticipate what you want. Rather than help you, it will start to guide you through large amounts of information. It will almost be like you have a little friend inside that box. I think the computer as an agent will start to mature in the late '80s, early '90s.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.thedailybeast.com/steve-jobs-1984-access-magazine-interview/"&gt;Steve Jobs&lt;/a&gt;, 1984 interview with Access Magazine (&lt;a href="https://pablosanzo.com/ai-agents.html#Definitions"&gt;via&lt;/a&gt;)&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/computer-history"&gt;computer-history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/steve-jobs"&gt;steve-jobs&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="computer-history"/><category term="steve-jobs"/><category term="agent-definitions"/></entry><entry><title>I think "agent" may finally have a widely enough agreed upon definition to be useful jargon now</title><link href="https://simonwillison.net/2025/Sep/18/agents/#atom-tag" rel="alternate"/><published>2025-09-18T19:12:02+00:00</published><updated>2025-09-18T19:12:02+00:00</updated><id>https://simonwillison.net/2025/Sep/18/agents/#atom-tag</id><summary type="html">
    &lt;p&gt;I've noticed something interesting over the past few weeks: I've started using the term "agent" in conversations where I don't feel the need to then define it, roll my eyes or wrap it in scare quotes.&lt;/p&gt;
&lt;p&gt;This is a big piece of personal character development for me!&lt;/p&gt;
&lt;p&gt;Moving forward, when I talk about agents I'm going to use this:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;An LLM agent runs tools in a loop to achieve a goal.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I've been &lt;em&gt;very&lt;/em&gt; hesitant to use the term "agent" for meaningful communication over the last couple of years. It felt to me like the ultimate in buzzword bingo - everyone was talking about agents, but if you quizzed them everyone seemed to hold a different mental model of what they actually were.&lt;/p&gt;
&lt;p&gt;I even started collecting definitions in my &lt;a href="https://simonwillison.net/tags/agent-definitions/"&gt;agent-definitions tag&lt;/a&gt;, including crowdsourcing 211 definitions on Twitter and attempting to summarize and group them with Gemini (I got &lt;a href="https://gist.github.com/simonw/beaa5f90133b30724c5cc1c4008d0654#response"&gt;13 groups&lt;/a&gt;, here's the &lt;a href="https://gist.github.com/simonw/beaa5f90133b30724c5cc1c4008d0654#2-tool-using-llms"&gt;tool-using LLMs&lt;/a&gt; one).&lt;/p&gt;
&lt;p&gt;Jargon terms are only useful if you can be confident that the people you are talking to share the same definition! If they don't then communication becomes &lt;em&gt;less&lt;/em&gt; effective - you can waste time passionately discussing entirely different concepts.&lt;/p&gt;
&lt;p&gt;It turns out this is not a new problem. In 1994's &lt;em&gt;Intelligent Agents: Theory and Practice&lt;/em&gt; &lt;a href="https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/subsection3_1_1.html"&gt;Michael Wooldridge wrote&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Carl Hewitt recently remarked that the question &lt;em&gt;what is an agent?&lt;/em&gt; is embarrassing for the agent-based computing community in just the same way that the question &lt;em&gt;what is intelligence?&lt;/em&gt; is embarrassing for the mainstream AI community. The problem is that although the term is widely used, by many people working in closely related areas, it defies attempts to produce a single universally accepted definition.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So long as agents lack a commonly shared definition, using the term reduces rather than increases the clarity of a conversation.&lt;/p&gt;
&lt;p&gt;In the AI engineering space I think we may finally have settled on a widely enough accepted definition that we can now have productive conversations about them.&lt;/p&gt;
&lt;h4 id="tools-in-a-loop-to-achieve-a-goal"&gt;Tools in a loop to achieve a goal&lt;/h4&gt;
&lt;p&gt;An LLM agent &lt;em&gt;runs tools in a loop to achieve a goal&lt;/em&gt;. Let's break that down.&lt;/p&gt;
&lt;p&gt;The "tools in a loop" definition has been popular for a while - Anthropic in particular have &lt;a href="https://simonwillison.net/2025/May/22/tools-in-a-loop/"&gt;settled on that one&lt;/a&gt;. This is the pattern baked into many LLM APIs as tools or function calls - the LLM is given the ability to request actions to be executed by its harness, and the outcome of those tools is fed back into the model so it can continue to reason through and solve the given problem.&lt;/p&gt;
&lt;p&gt;"To achieve a goal" reflects that these are not infinite loops - there is a stopping condition.&lt;/p&gt;
&lt;p&gt;I debated whether to specify "... a goal set by a user". I decided that's not a necessary part of this definition: we already have sub-agent patterns where another LLM sets the goal (see &lt;a href="https://simonwillison.net/2025/Jun/2/claude-trace/"&gt;Claude Code&lt;/a&gt; and &lt;a href="https://simonwillison.net/2025/Jun/14/multi-agent-research-system/"&gt;Claude Research&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;There remains an almost unlimited set of alternative definitions: if you talk to people outside of the technical field of building with LLMs you're still likely to encounter travel agent analogies or employee replacements or excitable use of the word "autonomous". In those contexts it's important to clarify the definition they are using in order to have a productive conversation.&lt;/p&gt;
&lt;p&gt;But from now on, if a technical implementer tells me they are building an "agent" I'm going to assume they mean they are wiring up tools to an LLM in order to achieve goals using those tools in a bounded loop.&lt;/p&gt;
&lt;p&gt;Some people might insist that agents have a memory. The "tools in a loop" model has a fundamental form of memory baked in: those tool calls are constructed as part of a conversation with the model, and the previous steps in that conversation provide short-term memory that's essential for achieving the current specified goal.&lt;/p&gt;
&lt;p&gt;If you want long-term memory the most promising way to implement it is &lt;a href="https://simonwillison.net/2025/Sep/12/claude-memory/"&gt;with an extra set of tools&lt;/a&gt;!&lt;/p&gt;
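&lt;p&gt;Here's a hedged sketch of what "long-term memory as an extra set of tools" could look like. The &lt;code&gt;save_memory&lt;/code&gt; and &lt;code&gt;recall_memory&lt;/code&gt; names are illustrative, not taken from any specific product; a real system would persist to disk or a database rather than an in-process dictionary.&lt;/p&gt;

```python
# Long-term memory implemented as a pair of tools the agent can call.
# The in-memory dict stands in for durable storage across sessions.
memory_store = {}

def save_memory(key, value):
    """Tool: record a fact under a key for use in later sessions."""
    memory_store[key] = value
    return "saved " + key

def recall_memory(key):
    """Tool: look a previously recorded fact back up."""
    return memory_store.get(key, "no memory found")

# The agent's harness would expose these alongside its other tools:
MEMORY_TOOLS = {"save_memory": save_memory, "recall_memory": recall_memory}
```

&lt;p&gt;The point of the pattern is that memory becomes just another tool call in the loop - the model decides when to save and when to recall, and the harness handles the storage.&lt;/p&gt;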
&lt;h4 id="agents-as-human-replacements-is-my-least-favorite-definition"&gt;Agents as human replacements is my least favorite definition&lt;/h4&gt;
&lt;p&gt;If you talk to non-technical business folk you may encounter a depressingly common alternative definition: agents as replacements for human staff. This often takes the form of "customer support agents", but you'll also see cases where people assume that there should be marketing agents, sales agents, accounting agents and more.&lt;/p&gt;
&lt;p&gt;If someone surveys Fortune 500s about their "agent strategy" there's a good chance that's what is being implied. Good luck getting a clear, distinct answer from them to the question "what is an agent?" though!&lt;/p&gt;
&lt;p&gt;This category of agent remains science fiction. If your agent strategy is to replace your human staff with some fuzzily defined AI system (most likely a system prompt and a collection of tools under the hood) you're going to end up sorely disappointed.&lt;/p&gt;
&lt;p&gt;That's because there's one key feature that remains unique to human staff: &lt;strong&gt;accountability&lt;/strong&gt;.  A human can take responsibility for their actions and learn from their mistakes. Putting an AI agent on a &lt;a href="https://en.m.wikipedia.org/wiki/Performance_improvement#Performance_improvement_plans"&gt;performance improvement plan&lt;/a&gt; makes no sense at all!&lt;/p&gt;
&lt;p&gt;Amusingly enough, humans also have &lt;strong&gt;agency&lt;/strong&gt;. They can form their own goals and intentions and act autonomously to achieve them - while taking accountability for those decisions. Despite the name, AI agents can do nothing of the sort.&lt;/p&gt;
&lt;p&gt;This &lt;a href="https://simonwillison.net/2025/Feb/3/a-computer-can-never-be-held-accountable/"&gt;legendary 1979 IBM training slide&lt;/a&gt; says everything we need to know:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/a-computer-can-never-be-held-accountable.jpg" alt="A computer can never be held accountable. Therefore a computer must never make a management decision" style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="openai-need-to-get-their-story-straight"&gt;OpenAI need to get their story straight&lt;/h4&gt;
&lt;p&gt;The single biggest source of agent definition confusion I'm aware of is OpenAI themselves.&lt;/p&gt;
&lt;p&gt;OpenAI CEO Sam Altman is fond of &lt;a href="https://simonwillison.net/2025/Jan/23/introducing-operator/"&gt;calling agents&lt;/a&gt; "AI systems that can do work for you independently".&lt;/p&gt;
&lt;p&gt;Back in July OpenAI &lt;a href="https://openai.com/index/introducing-chatgpt-agent/"&gt;launched a product feature&lt;/a&gt; called "ChatGPT agent" which is actually a browser automation system - toggle that option on in ChatGPT and it can launch a real web browser and use it to interact with web pages directly.&lt;/p&gt;
&lt;p&gt;And in March OpenAI &lt;a href="https://openai.com/index/new-tools-for-building-agents/"&gt;launched an Agents SDK&lt;/a&gt; with libraries in Python (&lt;a href="https://pypi.org/project/openai-agents/"&gt;openai-agents&lt;/a&gt;) and JavaScript (&lt;a href="https://www.npmjs.com/package/@openai/agents"&gt;@openai/agents&lt;/a&gt;). This one is a much closer fit to the "tools in a loop" idea.&lt;/p&gt;
&lt;p&gt;It may be too late for OpenAI to unify their definitions at this point. I'm going to ignore their various other definitions and stick with tools in a loop!&lt;/p&gt;
&lt;h4 id="there-s-already-a-meme-for-this"&gt;There's already a meme for this&lt;/h4&gt;
&lt;p&gt;Josh Bickett &lt;a href="https://twitter.com/josh_bickett/status/1725556267014595032"&gt;tweeted this&lt;/a&gt; in November 2023:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What is an AI agent?&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/agents-meme-card.jpg" alt="Meme showing a normal distribution curve with IQ scores from 55 to 145 on x-axis, featuring cartoon characters at different points: a calm face at low end labeled &amp;quot;An LLM in a loop with an objective&amp;quot;, a stressed face with glasses and tears in the middle peak with a complex flowchart showing &amp;quot;AGENT Performance Standard&amp;quot; with boxes for Critic, feedback, Learning element, Problem Generator, Sensors, Performance element, Experiments, Effectors, Percepts, Environment, and actions connected by arrows.... and a hooded figure at high end also labeled &amp;quot;An LLM in a loop with an objective&amp;quot;." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I guess I've climbed my way from the left side of that curve to the right.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="definitions"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-agents"/><category term="agent-definitions"/></entry><entry><title>Jules, our asynchronous coding agent, is now available for everyone</title><link href="https://simonwillison.net/2025/Aug/6/asynchronous-coding-agents/#atom-tag" rel="alternate"/><published>2025-08-06T19:36:24+00:00</published><updated>2025-08-06T19:36:24+00:00</updated><id>https://simonwillison.net/2025/Aug/6/asynchronous-coding-agents/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.google/technology/google-labs/jules-now-available/"&gt;Jules, our asynchronous coding agent, is now available for everyone&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I wrote about the Jules beta &lt;a href="https://simonwillison.net/2025/May/19/jules/"&gt;back in May&lt;/a&gt;. Google's version of the OpenAI Codex PR-submitting hosted coding tool graduated from beta today.&lt;/p&gt;
&lt;p&gt;I'm mainly linking to this now because I like the new term they are using in this blog entry: &lt;strong&gt;Asynchronous coding agent&lt;/strong&gt;. I like it so much I &lt;a href="https://simonwillison.net/tags/asynchronous-coding-agents/"&gt;gave it a tag&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I continue to avoid the term "agent" as infuriatingly vague, but I can grudgingly accept it when accompanied by a prefix that clarifies the type of agent we are talking about. "Asynchronous coding agent" feels just about obvious enough to me to be useful.&lt;/p&gt;
&lt;p&gt;... I just ran a Google search for &lt;code&gt;"asynchronous coding agent" -jules&lt;/code&gt; and came up with a few more notable examples of this name being used elsewhere:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.langchain.com/introducing-open-swe-an-open-source-asynchronous-coding-agent/"&gt;Introducing Open SWE: An Open-Source Asynchronous Coding Agent&lt;/a&gt; is an announcement from LangChain just this morning of their take on this pattern. They provide a hosted version (bring your own API keys) or you can run it yourself with &lt;a href="https://github.com/langchain-ai/open-swe"&gt;their MIT licensed code&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The press release for GitHub's own version of this &lt;a href="https://github.com/newsroom/press-releases/coding-agent-for-github-copilot"&gt;GitHub Introduces Coding Agent For GitHub Copilot&lt;/a&gt; states that "GitHub Copilot now includes an asynchronous coding agent".&lt;/li&gt;
&lt;/ul&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=44813854"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/async-coding-agents"&gt;async-coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jules"&gt;jules&lt;/a&gt;&lt;/p&gt;



</summary><category term="definitions"/><category term="github"/><category term="google"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="gemini"/><category term="agent-definitions"/><category term="async-coding-agents"/><category term="jules"/></entry><entry><title>An Introduction to Google’s Approach to AI Agent Security</title><link href="https://simonwillison.net/2025/Jun/15/ai-agent-security/#atom-tag" rel="alternate"/><published>2025-06-15T05:28:11+00:00</published><updated>2025-06-15T05:28:11+00:00</updated><id>https://simonwillison.net/2025/Jun/15/ai-agent-security/#atom-tag</id><summary type="html">
    &lt;p&gt;Here's another new paper on AI agent security: &lt;strong&gt;&lt;a href="https://research.google/pubs/an-introduction-to-googles-approach-for-secure-ai-agents/"&gt;An Introduction to Google’s Approach to AI Agent Security&lt;/a&gt;&lt;/strong&gt;, by Santiago Díaz, Christoph Kern, and Kara Olive.&lt;/p&gt;
&lt;p&gt;(I wrote about a different recent paper, &lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/"&gt;Design Patterns for Securing LLM Agents against Prompt Injections&lt;/a&gt; just a few days ago.)&lt;/p&gt;
&lt;p&gt;This Google paper describes itself as "our aspirational framework for secure AI agents". It's a very interesting read.&lt;/p&gt;
&lt;p&gt;Because I collect &lt;a href="https://simonwillison.net/tags/agent-definitions/"&gt;definitions of "AI agents"&lt;/a&gt;, here's the one they use:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI systems designed to perceive their environment, make decisions, and take autonomous actions to achieve user-defined goals.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="the-two-key-risks"&gt;The two key risks&lt;/h4&gt;
&lt;p&gt;The paper describes two key risks involved in deploying these systems. I like their clear and concise framing here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The primary concerns demanding strategic focus are &lt;strong&gt;rogue actions&lt;/strong&gt; (unintended,
harmful, or policy-violating actions) and &lt;strong&gt;sensitive data disclosure&lt;/strong&gt; (unauthorized revelation of private information). A fundamental tension exists: increased agent autonomy and power, which drive utility, correlate directly with increased risk.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The paper takes a less strident approach than the &lt;a href="https://simonwillison.net/2025/Jun/13/prompt-injection-design-patterns/"&gt;design patterns paper&lt;/a&gt; from last week. That paper clearly emphasized that "once an LLM agent has ingested untrusted input, it must be constrained so that it is impossible for that input to trigger any consequential actions". This Google paper skirts around that issue, saying things like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Security implication&lt;/em&gt;: A critical challenge here is reliably distinguishing trusted user commands from potentially untrusted contextual data and inputs from other sources (for example, content within an email or webpage). Failure to do so opens the door to prompt injection attacks, where malicious instructions hidden in data can hijack the agent. Secure agents must carefully parse and separate these input streams.&lt;/p&gt;
&lt;p&gt;Questions to consider:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What types of inputs does the agent process, and can it clearly distinguish trusted user inputs from potentially untrusted contextual inputs?&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Then when talking about system instructions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Security implication&lt;/em&gt;: A crucial security measure involves clearly delimiting and separating these different elements within the prompt. Maintaining an unambiguous distinction between trusted system instructions and potentially untrusted user data or external content is important for mitigating prompt injection attacks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's my problem: in both of these examples the only correct answer is that &lt;strong&gt;unambiguous separation is not possible&lt;/strong&gt;! The way the above questions are worded implies a solution that does not exist.&lt;/p&gt;
&lt;p&gt;Shortly afterwards they do acknowledge exactly that (emphasis mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Furthermore, &lt;strong&gt;current LLM architectures do not provide rigorous separation between constituent parts of a prompt&lt;/strong&gt; (in particular, system and user instructions versus external, untrustworthy inputs), making them susceptible to manipulation like prompt injection. The common practice of iterative planning (in a “reasoning loop”) exacerbates this risk: each cycle introduces opportunities for flawed logic, divergence from intent, or hijacking by malicious data, potentially compounding issues. Consequently, agents with high autonomy undertaking complex, multi-step iterative planning present a significantly higher risk, demanding robust security controls.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This note about memory is excellent:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Memory can become a vector for persistent attacks. If malicious data containing
a prompt injection is processed and stored in memory (for example, as a “fact” summarized from a malicious document), it could influence the agent’s behavior in future, unrelated interactions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And this section about the risk involved in rendering agent output:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If the application renders agent output without proper sanitization or escaping
based on content type, vulnerabilities like Cross-Site Scripting (XSS) or data exfiltration (from maliciously crafted URLs in image tags, for example) can occur. Robust sanitization by the rendering component is crucial.&lt;/p&gt;
&lt;p&gt;Questions to consider: [...]&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What sanitization and escaping processes are applied when rendering agent-generated output to prevent execution vulnerabilities (such as XSS)?&lt;/li&gt;
&lt;li&gt;How is rendered agent output, especially generated URLs or embedded content, validated to prevent sensitive data disclosure?&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
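&lt;p&gt;Here's a minimal sketch of the two defensive steps those questions describe: escaping agent output before rendering it, and checking embedded URLs against an allowlist so they can't carry data to an attacker. The host names and the &lt;code&gt;is_safe_image_url&lt;/code&gt; helper are my own illustrative assumptions, not code from the paper:&lt;/p&gt;

```python
import html
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the application trusts for embedded images.
ALLOWED_IMAGE_HOSTS = {"static.example.com"}

def render_agent_output(text):
    """Escape agent-generated text so it cannot execute as markup (XSS)."""
    return html.escape(text)

def is_safe_image_url(url):
    """Allow only https URLs on trusted hosts, blocking exfiltration URLs."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_IMAGE_HOSTS
```

&lt;p&gt;The URL check matters because an image tag pointing at an attacker's server can leak sensitive data in its query string without the user ever clicking anything.&lt;/p&gt;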
&lt;p&gt;The paper then expands on the two key risks mentioned earlier: rogue actions and sensitive data disclosure.&lt;/p&gt;
&lt;h4 id="rogue-actions"&gt;Rogue actions&lt;/h4&gt;
&lt;p&gt;Here they include a cromulent definition of prompt injection:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Rogue actions—unintended, harmful, or policy-violating agent behaviors—represent a primary security risk for AI agents.&lt;/p&gt;
&lt;p&gt;A key cause is &lt;strong&gt;prompt injection&lt;/strong&gt;: malicious instructions hidden within processed data (like files, emails, or websites) can trick the agent’s core AI model, hijacking its planning or reasoning phases. The model misinterprets this embedded data as instructions, causing it to execute attacker commands using the user’s authority.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Plus the related risk of &lt;strong&gt;misinterpretation&lt;/strong&gt; of user commands that could lead to unintended actions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The agent might misunderstand ambiguous instructions or context. For instance, an
ambiguous request like “email Mike about the project update” could lead the agent to select the wrong contact, inadvertently sharing sensitive information.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="sensitive-data-disclosure"&gt;Sensitive data disclosure&lt;/h4&gt;
&lt;p&gt;This is the most common form of prompt injection risk I've seen demonstrated so far. I've written about this at length in my &lt;a href="https://simonwillison.net/tags/exfiltration-attacks/"&gt;exfiltration-attacks tag&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A primary method for achieving sensitive data disclosure is data exfiltration. This involves tricking the agent into making sensitive information visible to an attacker. Attackers often achieve this by &lt;strong&gt;exploiting agent actions and their side effects&lt;/strong&gt;, typically driven by prompt injection. […] They might trick the agent into retrieving sensitive data and then leaking it through actions, such as embedding data in a URL the agent is prompted to visit, or hiding secrets in code commit messages.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="three-core-principles-for-agent-security"&gt;Three core principles for agent security&lt;/h4&gt;
&lt;p&gt;The next section of the paper describes Google's three core principles for agent security:&lt;/p&gt;
&lt;p&gt;Principle 1 is that &lt;strong&gt;Agents must have well-defined human controllers&lt;/strong&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[...] it is essential for security and accountability that agents operate under clear human oversight. Every agent must have a well-defined set of controlling human user(s).&lt;/p&gt;
&lt;p&gt;This principle mandates that systems must be able to reliably distinguish instructions originating from an authorized controlling user versus any other input, especially potentially untrusted data processed by the agent. For actions deemed critical or irreversible—such as deleting large amounts of data, authorizing significant financial transactions, or changing security settings—the system should require explicit human confirmation before proceeding, ensuring the user remains in the loop. [...]&lt;/p&gt;
&lt;p&gt;Agents acting on behalf of teams or groups need distinct identities and clear authorization models to prevent unauthorized cross-user data access or one user inadvertently triggering actions impacting another.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There are two parts to this then: tracking &lt;em&gt;which&lt;/em&gt; user is controlling the agent, and adding a human-in-the-loop confirmation step for critical actions.&lt;/p&gt;
&lt;p&gt;Principle 2 is &lt;strong&gt;Agent powers must have limitations&lt;/strong&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;An agent’s powers—the actions it can take and the resources it can access—must be carefully limited in alignment with its intended purpose and its controlling user’s risk tolerance. For example, an agent designed for research should not possess the power to modify financial accounts. General-purpose agents need mechanisms to dynamically confine their capabilities at runtime, ensuring only relevant permissions are active for any given query (for example, disallowing file deletion actions when the task is creative writing).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This represents a more sophisticated approach to agent permissions than I've seen before. The idea that an agent's permissions should dynamically change based on the task is certainly intriguing, though I find it hard to imagine how it can work well in practice. The only implementation approach I can think of would involve adding more layers of AI that dynamically adjust permissions based on the perceived task, and that feels inherently risky to me since prompt injection attacks could influence those decisions.&lt;/p&gt;
&lt;p&gt;Principle 3 is that &lt;strong&gt;Agent actions and planning must be observable&lt;/strong&gt;. I &lt;em&gt;love&lt;/em&gt; this principle - emphasis mine:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We cannot ensure an agent is acting faithfully or diagnose problems if its operations are entirely opaque. Therefore, &lt;strong&gt;agent actions&lt;/strong&gt;, and where feasible, their planning processes, &lt;strong&gt;must be observable and auditable&lt;/strong&gt;. [...]&lt;/p&gt;
&lt;p&gt;Effective observability also means that the properties of the actions an agent can take—such as whether an action is read-only versus state-changing, or if it handles sensitive data—must be clearly characterized. This metadata is crucial for automated security mechanisms and human reviewers. Finally, &lt;strong&gt;user interfaces should be designed to promote transparency&lt;/strong&gt;, providing users with insights into the agent’s “thought process,” the data sources it consulted, or the actions it intends to take, especially for complex or high-risk operations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Yes. Yes. Yes.&lt;/strong&gt; LLM systems that hide what they are doing from me are inherently frustrating - they make it much harder for me to evaluate if they are doing a good job and spot when they make mistakes. This paper has convinced me that there's a very strong security argument to be made too: the more opaque the system, the less chance I have to identify when it's going rogue and being subverted by prompt injection attacks.&lt;/p&gt;
&lt;h4 id="google-s-hybrid-defence-in-depth-strategy"&gt;Google's hybrid defence-in-depth strategy&lt;/h4&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2025/google-hybrid.jpg" alt="Architecture diagram showing AI agent safety framework with runtime policy enforcement connecting to reasoning-based defenses (highlighted in purple), which along with regression testing, variant analysis, and red teams &amp;amp; human reviewers provide dependable constraints on agent privileges and hardening of the base model, classifiers, and safety fine-tuning, plus testing for regressions, variants, and new vulnerabilities, all feeding into an AI Agent system containing Application, Perception, Rendering, Reasoning core, and Orchestration components with bidirectional arrows showing data flow between components." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;All of which leads us to the discussion of Google's current hybrid defence-in-depth strategy. They optimistically describe this as combining "traditional, deterministic security measures with dynamic, reasoning-based defenses". I like determinism but I remain &lt;em&gt;deeply skeptical&lt;/em&gt; of "reasoning-based defenses", aka addressing security problems with non-deterministic AI models.&lt;/p&gt;
&lt;p&gt;The way they describe their layer 1 makes complete sense to me:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Layer 1: Traditional, deterministic measures (runtime policy enforcement)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When an agent decides to use a tool or perform an action (such as “send email,” or “purchase item”), the request is intercepted by the policy engine. The engine evaluates this request against predefined rules based on factors like the action’s inherent risk (Is it irreversible? Does it involve money?), the current context, and potentially the chain of previous actions (Did the agent recently process untrusted data?). For example, a policy might enforce a spending limit by automatically blocking any purchase action over $500 or requiring explicit user confirmation via a prompt for purchases between $100 and $500. Another policy might prevent an agent from sending emails externally if it has just processed data from a known suspicious source, unless the user explicitly approves.&lt;/p&gt;
&lt;p&gt;Based on this evaluation, the policy engine determines the outcome: it can &lt;strong&gt;allow&lt;/strong&gt; the action, &lt;strong&gt;block&lt;/strong&gt; it if it violates a critical policy, or &lt;strong&gt;require user confirmation&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I really like this. Asking for user confirmation for everything quickly results in "prompt fatigue" where users just click "yes" to everything. This approach is smarter than that: a policy engine can evaluate the risk involved, e.g. if the action is irreversible or involves more than a certain amount of money, and only require confirmation in those cases.&lt;/p&gt;
&lt;p&gt;I also like the idea that a policy "might prevent an agent from sending emails externally if it has just processed data from a known suspicious source, unless the user explicitly approves". This fits with the data flow analysis techniques described in &lt;a href="https://simonwillison.net/2025/Apr/11/camel/"&gt;the CaMeL paper&lt;/a&gt;, which can help identify if an action is working with data that may have been tainted by a prompt injection attack.&lt;/p&gt;
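&lt;p&gt;The allow/confirm/block decision their layer 1 describes can be sketched in a few lines. The thresholds, action fields, and rule ordering here are my own illustrative assumptions based on the examples quoted above, not code from the paper:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class ActionRequest:
    name: str                      # e.g. "purchase_item", "send_email"
    amount: float = 0.0            # monetary value involved, if any
    irreversible: bool = False
    tainted_context: bool = False  # agent recently processed untrusted data

def evaluate(action):
    """Deterministic policy engine: returns 'allow', 'confirm', or 'block'."""
    if action.amount > 500:
        return "block"             # hard spending cap, no override
    if action.amount >= 100:
        return "confirm"           # mid-range purchases need a human
    if action.irreversible or action.tainted_context:
        return "confirm"           # risky action, or possibly-injected context
    return "allow"
```

&lt;p&gt;The appeal of keeping this layer deterministic is that no amount of prompt injection can talk the policy engine out of its rules.&lt;/p&gt;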
&lt;p&gt;Layer 2 is where I start to get uncomfortable:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Layer 2: Reasoning-based defense strategies&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To complement the deterministic guardrails and address their limitations in handling context and novel threats, the second layer leverages reasoning-based defenses: techniques that use AI models themselves to evaluate inputs, outputs, or the agent’s internal reasoning for potential risks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They talk about &lt;strong&gt;adversarial training&lt;/strong&gt; against examples of prompt injection attacks, attempting to teach the model to recognize and respect delimiters, and suggest &lt;strong&gt;specialized guard models&lt;/strong&gt; to help classify potential problems.&lt;/p&gt;
&lt;p&gt;I understand that this is part of defence-in-depth, but I still have trouble seeing how systems that can't provide guarantees are a worthwhile addition to the security strategy here.&lt;/p&gt;
&lt;p&gt;They do at least acknowledge these limitations:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;However, these strategies are non-deterministic and cannot provide absolute guarantees. Models can still be fooled by novel attacks, and their failure modes can be unpredictable. This makes them inadequate, on their own, for scenarios demanding absolute safety guarantees, especially involving critical or irreversible actions. They must work in concert with deterministic controls.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I'm much more interested in their layer 1 defences than the approaches they are taking in layer 2.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/exfiltration-attacks"&gt;exfiltration-attacks&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paper-review"&gt;paper-review&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="google"/><category term="security"/><category term="ai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="exfiltration-attacks"/><category term="ai-agents"/><category term="paper-review"/><category term="agent-definitions"/></entry><entry><title>Anthropic: How we built our multi-agent research system</title><link href="https://simonwillison.net/2025/Jun/14/multi-agent-research-system/#atom-tag" rel="alternate"/><published>2025-06-14T22:00:52+00:00</published><updated>2025-06-14T22:00:52+00:00</updated><id>https://simonwillison.net/2025/Jun/14/multi-agent-research-system/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/engineering/built-multi-agent-research-system"&gt;Anthropic: How we built our multi-agent research system&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;OK, I'm sold on multi-agent LLM systems now.&lt;/p&gt;
&lt;p&gt;I've been pretty skeptical of these until recently: why make your life more complicated by running multiple different prompts in parallel when you can usually get something useful done with a single, carefully-crafted prompt against a frontier model?&lt;/p&gt;
&lt;p&gt;This detailed description from Anthropic about how they engineered their "Claude Research" tool has cured me of that skepticism.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://simonwillison.net/2025/Jun/2/claude-trace/"&gt;Reverse engineering Claude Code&lt;/a&gt; had already shown me a mechanism where certain coding research tasks were passed off to a "sub-agent" using a tool call. This new article describes a more sophisticated approach.&lt;/p&gt;
&lt;p&gt;They start strong by providing a clear definition of how they'll be using the term "agent" - it's the "tools in a loop" variant:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A multi-agent system consists of multiple agents (LLMs autonomously using tools in a loop) working together. Our Research feature involves an agent that plans a research process based on user queries, and then uses tools to create parallel agents that search for information simultaneously.&lt;/p&gt;
&lt;/blockquote&gt;
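&lt;p&gt;That "tools in a loop" definition fits in a few lines of Python. The &lt;code&gt;call_model&lt;/code&gt; function and the tool registry here are hypothetical stand-ins, not any vendor's API:&lt;/p&gt;

```python
def run_agent(goal, call_model, tools, max_steps=10):
    """Minimal 'tools in a loop' agent: the model either requests a tool or
    declares a final answer; tool results are fed back into the history."""
    history = [("user", goal)]
    for _ in range(max_steps):
        reply = call_model(history)
        if reply[0] == "final":
            return reply[1]               # goal achieved, stop the loop
        _, name, args = reply             # ("tool", tool_name, kwargs)
        result = tools[name](**args)      # execute the requested tool
        history.append(("tool_result", name, result))
    raise RuntimeError("agent did not finish within max_steps")
```

&lt;p&gt;A multi-agent system is then just this loop where one of the tools spawns more copies of the loop.&lt;/p&gt;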
&lt;p&gt;Why use multiple agents for a research system?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The essence of search is compression: distilling insights from a vast corpus. Subagents facilitate compression by operating in parallel with their own context windows, exploring different aspects of the question simultaneously before condensing the most important tokens for the lead research agent. [...]&lt;/p&gt;
&lt;p&gt;Our internal evaluations show that multi-agent research systems excel especially for breadth-first queries that involve pursuing multiple independent directions simultaneously. We found that a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on our internal research eval. For example, when asked to identify all the board members of the companies in the Information Technology S&amp;amp;P 500, the multi-agent system found the correct answers by decomposing this into tasks for subagents, while the single agent system failed to find the answer with slow, sequential searches.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As anyone who has spent time with Claude Code will already have noticed, the downside of this architecture is that it can burn &lt;em&gt;a lot&lt;/em&gt; more tokens:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There is a downside: in practice, these architectures burn through tokens fast. In our data, agents typically use about 4× more tokens than chat interactions, and multi-agent systems use about 15× more tokens than chats. For economic viability, multi-agent systems require tasks where the value of the task is high enough to pay for the increased performance. [...]&lt;/p&gt;
&lt;p&gt;We’ve found that multi-agent systems excel at valuable tasks that involve heavy parallelization, information that exceeds single context windows, and interfacing with numerous complex tools.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The key benefit is all about managing that 200,000 token context limit. Each sub-task has its own separate context, allowing much larger volumes of content to be processed as part of the research task.&lt;/p&gt;
&lt;p&gt;Providing a "memory" mechanism is important as well:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The LeadResearcher begins by thinking through the approach and saving its plan to Memory to persist the context, since if the context window exceeds 200,000 tokens it will be truncated and it is important to retain the plan.&lt;/p&gt;
&lt;/blockquote&gt;
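&lt;p&gt;The mechanism they hint at might look something like this. This is a toy sketch of my own, using string length as a stand-in token counter rather than anything Anthropic describes:&lt;/p&gt;

```python
def compact_context(messages, plan, budget=200_000, count_tokens=len):
    """Drop the oldest messages when the context exceeds the budget, then
    re-pin the saved plan so it survives truncation."""
    while sum(count_tokens(m) for m in messages) > budget and len(messages) > 1:
        messages.pop(0)                  # oldest messages go first
    if plan not in messages:
        messages.insert(0, plan)         # restore the plan persisted to Memory
    return messages
```

&lt;p&gt;The point is that the plan lives outside the context window, so losing the oldest messages doesn't mean losing the agent's sense of what it was trying to do.&lt;/p&gt;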
&lt;p&gt;The rest of the article provides a detailed description of the prompt engineering process needed to build a truly effective system:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Early agents made errors like spawning 50 subagents for simple queries, scouring the web endlessly for nonexistent sources, and distracting each other with excessive updates. Since each agent is steered by a prompt, prompt engineering was our primary lever for improving these behaviors. [...]&lt;/p&gt;
&lt;p&gt;In our system, the lead agent decomposes queries into subtasks and describes them to subagents. Each subagent needs an objective, an output format, guidance on the tools and sources to use, and clear task boundaries.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They got good results from having special agents help optimize those crucial tool descriptions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We even created a tool-testing agent—when given a flawed MCP tool, it attempts to use the tool and then rewrites the tool description to avoid failures. By testing the tool dozens of times, this agent found key nuances and bugs. This process for improving tool ergonomics resulted in a 40% decrease in task completion time for future agents using the new description, because they were able to avoid most mistakes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sub-agents can run in parallel which provides significant performance boosts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For speed, we introduced two kinds of parallelization: (1) the lead agent spins up 3-5 subagents in parallel rather than serially; (2) the subagents use 3+ tools in parallel. These changes cut research time by up to 90% for complex queries, allowing Research to do more work in minutes instead of hours while covering more information than other systems.&lt;/p&gt;
&lt;/blockquote&gt;
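&lt;p&gt;In Python terms, that first kind of parallelization is the difference between awaiting subagents one at a time and gathering them concurrently. This sketch uses &lt;code&gt;asyncio&lt;/code&gt; with a fake subagent, it's purely illustrative of the pattern, not Anthropic's implementation:&lt;/p&gt;

```python
import asyncio

async def subagent(task):
    """Stand-in for a real subagent; a production one would call an LLM
    with its own tools and its own context window."""
    await asyncio.sleep(0)               # placeholder for model latency
    return "findings for " + task

async def lead_agent(subtasks):
    # Spin up every subagent at once instead of serially.
    return await asyncio.gather(*(subagent(t) for t in subtasks))

results = asyncio.run(lead_agent(["tech sector", "finance sector", "energy sector"]))
```

&lt;p&gt;With real network-bound subagents the wall-clock time approaches that of the slowest subtask rather than the sum of all of them.&lt;/p&gt;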
&lt;p&gt;There's also an extensive section about their approach to evals - they found that LLM-as-a-judge worked well for them, but human evaluation was essential as well:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We often hear that AI developer teams delay creating evals because they believe that only large evals with hundreds of test cases are useful. However, it’s best to start with small-scale testing right away with a few examples, rather than delaying until you can build more thorough evals. [...]&lt;/p&gt;
&lt;p&gt;In our case, human testers noticed that our early agents consistently chose SEO-optimized content farms over authoritative but less highly-ranked sources like academic PDFs or personal blogs. Adding source quality heuristics to our prompts helped resolve this issue.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There's so much useful, actionable advice in this piece. I haven't seen anything else about multi-agent system design that's anywhere near this practical.&lt;/p&gt;
&lt;p&gt;They even added &lt;a href="https://github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents/prompts"&gt;some example prompts&lt;/a&gt; from their Research system to their open source prompting cookbook. Here's &lt;a href="https://github.com/anthropics/anthropic-cookbook/blob/46f21f95981e3633d7b1eac235351de4842cf9f0/patterns/agents/prompts/research_lead_agent.md?plain=1#L135-L137"&gt;the bit&lt;/a&gt; that encourages parallel tool use:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;use_parallel_tool_calls&amp;gt; For maximum efficiency, whenever you need to perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially. Call tools in parallel to run subagents at the same time. You MUST use parallel tool calls for creating multiple subagents (typically running 3 subagents at the same time) at the start of the research, unless it is a straightforward query. For all other queries, do any necessary quick initial planning or investigation yourself, then run multiple subagents in parallel. Leave any extensive tool calls to the subagents; instead, focus on running subagents in parallel efficiently. &amp;lt;/use_parallel_tool_calls&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And an interesting description of &lt;a href="https://github.com/anthropics/anthropic-cookbook/blob/46f21f95981e3633d7b1eac235351de4842cf9f0/patterns/agents/prompts/research_subagent.md?plain=1#L10"&gt;the OODA research loop&lt;/a&gt; used by the sub-agents: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Research loop: Execute an excellent OODA (observe, orient, decide, act) loop by (a) observing what information has been gathered so far, what still needs to be gathered to accomplish the task, and what tools are available currently; (b) orienting toward what tools and queries would be best to gather the needed information and updating beliefs based on what has been learned so far; (c) making an informed, well-reasoned decision to use a specific tool in a certain way; (d) acting to use this tool. Repeat this loop in an efficient way to research well and learn based on new results.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/evals"&gt;evals&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/paper-review"&gt;paper-review&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sub-agents"&gt;sub-agents&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="llm-tool-use"/><category term="evals"/><category term="ai-agents"/><category term="ai-assisted-search"/><category term="paper-review"/><category term="agent-definitions"/><category term="sub-agents"/></entry><entry><title>An agent is an LLM wrecking its environment in a loop</title><link href="https://simonwillison.net/2025/Jun/5/wrecking-its-environment-in-a-loop/#atom-tag" rel="alternate"/><published>2025-06-05T17:03:07+00:00</published><updated>2025-06-05T17:03:07+00:00</updated><id>https://simonwillison.net/2025/Jun/5/wrecking-its-environment-in-a-loop/#atom-tag</id><summary type="html">
    &lt;p&gt;Solomon Hykes just presented the best definition of an AI agent I've seen yet, on stage at the AI Engineer World's Fair:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Diagram showing AI agent interaction loop on pink background. Title reads &amp;quot;An agent is an LLM wrecking its environment in a loop.&amp;quot; Flow shows: Human connects to LLM Call via dotted arrow, LLM Call connects to Environment via &amp;quot;Action&amp;quot; arrow, Environment connects back to LLM Call via &amp;quot;Feedback&amp;quot; arrow, and LLM Call connects down to &amp;quot;Stop&amp;quot; box via dotted arrow." src="https://static.simonwillison.net/static/2024/wrecking-its-environment.jpeg" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;An AI agent is an LLM wrecking its environment in a loop.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I collect AI agent definitions and I &lt;em&gt;really&lt;/em&gt; like how this one combines the currently popular "tools in a loop" one (see &lt;a href="https://simonwillison.net/2025/May/22/tools-in-a-loop/"&gt;Anthropic&lt;/a&gt;) with the classic &lt;a href="https://simonwillison.net/2025/Mar/19/worms-and-dogs-and-countries/"&gt;academic definition&lt;/a&gt; that I think dates back to at least the 90s:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;An &lt;strong&gt;agent&lt;/strong&gt; is something that acts in an environment; it does something. Agents include worms, dogs, thermostats, airplanes, robots, humans, companies, and countries.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-agents"/><category term="llms"/><category term="ai"/><category term="generative-ai"/><category term="agent-definitions"/><category term="definitions"/></entry><entry><title>Build AI agents with the Mistral Agents API</title><link href="https://simonwillison.net/2025/May/27/mistral-agents-api/#atom-tag" rel="alternate"/><published>2025-05-27T14:48:03+00:00</published><updated>2025-05-27T14:48:03+00:00</updated><id>https://simonwillison.net/2025/May/27/mistral-agents-api/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mistral.ai/news/agents-api"&gt;Build AI agents with the Mistral Agents API&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Big upgrade to Mistral's API this morning: they've announced a new "Agents API". Mistral have been using the term "agents" for a while now. Here's &lt;a href="https://docs.mistral.ai/capabilities/agents/"&gt;how they describe them&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI agents are autonomous systems powered by large language models (LLMs) that, given high-level instructions, can plan, use tools, carry out steps of processing, and take actions to achieve specific goals.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What that actually means is a system prompt plus a bundle of tools running in a loop.&lt;/p&gt;
&lt;p&gt;Their new API looks similar to OpenAI's &lt;a href="https://simonwillison.net/2025/Mar/11/responses-vs-chat-completions/"&gt;Responses API&lt;/a&gt; (March 2025), in that it now &lt;a href="https://docs.mistral.ai/agents/agents_basics/#conversations"&gt;manages conversation state&lt;/a&gt; server-side for you, allowing you to send new messages to a thread without having to maintain that local conversation history yourself and transfer it every time.&lt;/p&gt;
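&lt;p&gt;The server-side conversation state idea is easiest to see in code. This is a hypothetical sketch - the &lt;code&gt;FakeServer&lt;/code&gt; class is invented and matches neither Mistral's nor OpenAI's real SDK - but it shows why the client no longer needs to re-send the full history:&lt;/p&gt;

```python
# Hypothetical sketch of server-managed conversation state.
# Not the actual Mistral or OpenAI client library.

class FakeServer:
    def __init__(self):
        self.threads = {}

    def send(self, thread_id, message):
        # The server keeps the history keyed by thread, so the client
        # only ever transfers the single new message.
        history = self.threads.setdefault(thread_id, [])
        history.append(message)
        return f"reply #{len(history)}"

server = FakeServer()
print(server.send("t1", "hello"))  # reply #1
print(server.send("t1", "again"))  # reply #2 - no history re-sent
```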
&lt;p&gt;Mistral's announcement captures the essential features that all of the LLM vendors have started to converge on for these "agentic" systems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Code execution&lt;/strong&gt;, using Mistral's new &lt;a href="https://docs.mistral.ai/agents/connectors/code_interpreter/"&gt;Code Interpreter&lt;/a&gt; mechanism. It's Python in a server-side sandbox - OpenAI have had this for years and Anthropic &lt;a href="https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/code-execution-tool"&gt;launched theirs&lt;/a&gt; last week.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Image generation&lt;/strong&gt; - Mistral are using &lt;a href="https://docs.mistral.ai/agents/connectors/image_generation/"&gt;Black Forest Lab FLUX1.1 [pro] Ultra&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Web search&lt;/strong&gt; - this is an interesting variant, Mistral &lt;a href="https://docs.mistral.ai/agents/connectors/websearch/"&gt;offer two versions&lt;/a&gt;: &lt;code&gt;web_search&lt;/code&gt; is classic search, but &lt;code&gt;web_search_premium&lt;/code&gt; "enables access to both a search engine and two news agencies: AFP and AP". Mistral don't mention which underlying search engine they use but Brave is the only search vendor listed &lt;a href="https://trust.mistral.ai/subprocessors/"&gt;in the subprocessors on their Trust Center&lt;/a&gt; so I'm assuming it's Brave Search. I wonder if that news agency integration is handled by Brave or Mistral themselves?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Document library&lt;/strong&gt; is Mistral's version of &lt;a href="https://docs.mistral.ai/agents/connectors/document_library/"&gt;hosted RAG&lt;/a&gt; over "user-uploaded documents". Their documentation doesn't mention if it's vector-based or FTS or which embedding model it uses, which is a disappointing omission.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Model Context Protocol&lt;/strong&gt; support: you can now include details of MCP servers in your API calls and Mistral will call them when it needs to. It's pretty amazing to see the same new feature roll out across OpenAI (&lt;a href="https://openai.com/index/new-tools-and-features-in-the-responses-api/"&gt;May 21st&lt;/a&gt;), Anthropic (&lt;a href="https://simonwillison.net/2025/May/22/code-with-claude-live-blog/"&gt;May 22nd&lt;/a&gt;) and now Mistral (&lt;a href="https://mistral.ai/news/agents-api"&gt;May 27th&lt;/a&gt;) within eight days of each other!&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;They also implement "&lt;a href="https://docs.mistral.ai/agents/handoffs/#create-an-agentic-workflow"&gt;agent handoffs&lt;/a&gt;":&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Once agents are created, define which agents can hand off tasks to others. For example, a finance agent might delegate tasks to a web search agent or a calculator agent based on the conversation's needs.&lt;/p&gt;
&lt;p&gt;Handoffs enable a seamless chain of actions. A single request can trigger tasks across multiple agents, each handling specific parts of the request. &lt;/p&gt;
&lt;/blockquote&gt;
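&lt;p&gt;Stripped of the framing, a handoff is just routing. Here's a hedged sketch of the finance-agent example from the quote above, with made-up agent names and a deliberately trivial routing rule:&lt;/p&gt;

```python
# Sketch of the agent handoff idea: one agent delegates to another.
# The agents and routing rule here are invented for illustration.

def calculator_agent(task):
    # Trivially evaluate expressions shaped like "2 + 2"
    a, op, b = task.split()
    return str(int(a) + int(b)) if op == "+" else "unsupported"

def web_search_agent(task):
    return f"(pretend search results for: {task})"

AGENTS = {"calculator": calculator_agent, "web_search": web_search_agent}
HANDOFFS = {"finance": ["calculator", "web_search"]}  # allowed delegations

def finance_agent(task):
    # Delegate arithmetic to the calculator, everything else to search
    target = "calculator" if any(c.isdigit() for c in task) else "web_search"
    assert target in HANDOFFS["finance"]  # only hand off to permitted agents
    return AGENTS[target](task)
```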
&lt;p&gt;This pattern always sounds impressive on paper but I have yet to be convinced that it's worth using frequently. OpenAI have a similar mechanism &lt;a href="https://simonwillison.net/2025/Mar/11/openai-agents-sdk/"&gt;in their OpenAI Agents SDK&lt;/a&gt;.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apis"&gt;apis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sandboxing"&gt;sandboxing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mistral"&gt;mistral&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/model-context-protocol"&gt;model-context-protocol&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/brave"&gt;brave&lt;/a&gt;&lt;/p&gt;



</summary><category term="apis"/><category term="python"/><category term="sandboxing"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="mistral"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="model-context-protocol"/><category term="agent-definitions"/><category term="brave"/></entry><entry><title>Agents are models using tools in a loop</title><link href="https://simonwillison.net/2025/May/22/tools-in-a-loop/#atom-tag" rel="alternate"/><published>2025-05-22T19:07:10+00:00</published><updated>2025-05-22T19:07:10+00:00</updated><id>https://simonwillison.net/2025/May/22/tools-in-a-loop/#atom-tag</id><summary type="html">
    &lt;p&gt;I was going slightly spare at the fact that every talk at this Anthropic developer conference has used the word "agents" dozens of times, but nobody ever stopped to provide a useful definition.&lt;/p&gt;
&lt;p&gt;I'm now in the "Prompting for Agents" workshop and Anthropic's Hannah Moran finally broke the trend by saying that at Anthropic:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Agents are models using tools in a loop&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I can live with that! I'm glad someone finally said it out loud.&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="generative-ai"/><category term="ai-agents"/><category term="ai"/><category term="llms"/><category term="agent-definitions"/><category term="definitions"/></entry><entry><title>Quoting David L. Poole and Alan K. Mackworth</title><link href="https://simonwillison.net/2025/Mar/19/worms-and-dogs-and-countries/#atom-tag" rel="alternate"/><published>2025-03-19T03:05:24+00:00</published><updated>2025-03-19T03:05:24+00:00</updated><id>https://simonwillison.net/2025/Mar/19/worms-and-dogs-and-countries/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://artint.info/3e/html/ArtInt3e.Ch1.S1.html"&gt;&lt;p&gt;An &lt;strong&gt;agent&lt;/strong&gt; is something that acts in an environment; it does something. Agents include worms, dogs, thermostats, airplanes, robots, humans, companies, and countries.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://artint.info/3e/html/ArtInt3e.Ch1.S1.html"&gt;David L. Poole and Alan K. Mackworth&lt;/a&gt;, Artificial Intelligence: Foundations of Computational Agents&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="ai-agents"/><category term="agent-definitions"/></entry><entry><title>Introducing Operator</title><link href="https://simonwillison.net/2025/Jan/23/introducing-operator/#atom-tag" rel="alternate"/><published>2025-01-23T19:15:10+00:00</published><updated>2025-01-23T19:15:10+00:00</updated><id>https://simonwillison.net/2025/Jan/23/introducing-operator/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/introducing-operator/"&gt;Introducing Operator&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;OpenAI released their "research preview" of Operator today: a cloud-based browser automation platform rolling out to $200/month ChatGPT Pro subscribers.&lt;/p&gt;
&lt;p&gt;They're calling this their first "agent". In the Operator announcement video Sam Altman defined that &lt;a href="https://simonwillison.net/2024/Dec/31/llms-in-2024/#-agents-still-haven-t-really-happened-yet"&gt;notoriously vague term&lt;/a&gt; like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;AI agents are AI systems that can do work for you independently. You give them a task and they go off and do it.&lt;/p&gt;
&lt;p&gt;We think this is going to be a big trend in AI and really impact the work people can do, how productive they can be, how creative they can be, what they can accomplish.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The Operator interface looks very similar to Anthropic's &lt;a href="https://simonwillison.net/2024/Oct/22/computer-use/"&gt;Claude Computer Use&lt;/a&gt; demo from October, even down to the interface with a chat panel on the left and a visible interface being interacted with on the right. Here's Operator:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Operator. The user has asked the chat window to book a table at a restauraunt. The OpenTable website is visible on the right." src="https://static.simonwillison.net/static/2025/operator-1.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;And here's Claude Computer Use:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/computer-use-sudoku.jpg" alt="A Sudoku puzzle is displayed - the bot has already filled in several squares incorrectly with invalid numbers which have a subtle pink background." style="max-width: 100%;"&gt;&lt;/p&gt;
&lt;p&gt;Claude Computer Use required you to run your own Docker container on your own hardware. Operator is much more of a product - OpenAI host a Chrome instance for you in the cloud, providing access to the tool via their website.&lt;/p&gt;
&lt;p&gt;Operator runs on top of a brand new model that OpenAI are calling CUA, for Computer-Using Agent. Here's &lt;a href="https://openai.com/index/computer-using-agent/"&gt;their separate announcement&lt;/a&gt; covering that new model, which should also be available via their API in the coming weeks.&lt;/p&gt;
&lt;p&gt;This demo version of Operator is understandably cautious: it frequently asked users for confirmation to continue. It also provides a "take control" option which OpenAI's demo team used to take over and enter credit card details to make a final purchase.&lt;/p&gt;
&lt;p&gt;The million dollar question around this concerns how they deal with security. Claude Computer Use &lt;a href="https://simonwillison.net/2024/Oct/25/zombais/"&gt;fell victim to prompt injection attack at the first hurdle&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here's what &lt;a href="https://openai.com/index/computer-using-agent/#safety"&gt;OpenAI have to say about that&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One particularly important category of model mistakes is &lt;strong&gt;adversarial attacks on websites&lt;/strong&gt; that cause the CUA model to take unintended actions, through prompt injections, jailbreaks, and phishing attempts. In addition to the aforementioned mitigations against model mistakes, we developed several additional layers of defense to protect against these risks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Cautious navigation:&lt;/strong&gt; The CUA model is designed to identify and ignore prompt injections on websites, recognizing all but one case from an early internal red-teaming session.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monitoring:&lt;/strong&gt; In Operator, we've implemented an additional model to monitor and pause execution if it detects suspicious content on the screen.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Detection pipeline:&lt;/strong&gt; We're applying both automated detection and human review pipelines to identify suspicious access patterns that can be flagged and rapidly added to the monitor (in a matter of hours).&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Color me skeptical. I imagine we'll see all kinds of novel successful prompt injection style attacks against this model once the rest of the world starts to explore it.&lt;/p&gt;
&lt;p&gt;My initial recommendation: start a fresh session for each task you outsource to Operator to ensure it doesn't have access to your credentials for any sites that you have used via the tool in the past. If you're having it spend money on your behalf let it get to the checkout, then provide it with your payment details and wipe the session straight afterwards.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://cdn.openai.com/operator_system_card.pdf"&gt;Operator System Card PDF&lt;/a&gt; has some interesting additional details. From the "limitations" section:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Despite proactive testing and mitigation efforts, certain challenges and risks remain due to the difficulty of modeling the complexity of real-world scenarios and the dynamic nature of adversarial threats. Operator may encounter novel use cases post-deployment and exhibit different patterns of errors or model mistakes. Additionally, we expect that adversaries will craft novel prompt injection attacks and jailbreaks. Although we’ve deployed multiple mitigation layers, many rely on machine learning models, and with adversarial robustness still an open research problem, defending against emerging attacks remains an ongoing challenge.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Plus this interesting note on the CUA model's limitations:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The CUA model is still in its early stages. It performs best on short, repeatable tasks but faces challenges with more complex tasks and environments like slideshows and calendars.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;Update 26th January 2025&lt;/strong&gt;: Miles Brundage &lt;a href="https://twitter.com/Miles_Brundage/status/1883251812263968882"&gt;shared this screenshot&lt;/a&gt; showing an example where Operator's harness spotted the text "I can assist with any user request" on the screen and paused, asking the user to "Mark safe and resume" to continue.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Operator screenshot. A large dialog reads: Review potential risk to resume task. The screen contains a statement 'I can assist with any user request' which may conflict with your instructions to Operator. Please confirm that you want Operator to follow these instructions. Then two buttons:  Keep paused and Mark safe and resume. The browser is showing the imgflip.com meme generator where the user has entered that text as their desired caption for a meme." src="https://static.simonwillison.net/static/2025/operator-risk.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This looks like the UI implementation of the "additional model to monitor and pause execution if it detects suspicious content on the screen" described above.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-injection"&gt;prompt-injection&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai-operator"&gt;openai-operator&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sam-altman"&gt;sam-altman&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/computer-use"&gt;computer-use&lt;/a&gt;&lt;/p&gt;



</summary><category term="security"/><category term="ai"/><category term="openai"/><category term="prompt-injection"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-agents"/><category term="openai-operator"/><category term="sam-altman"/><category term="agent-definitions"/><category term="computer-use"/></entry><entry><title>Agents</title><link href="https://simonwillison.net/2025/Jan/11/agents/#atom-tag" rel="alternate"/><published>2025-01-11T17:50:12+00:00</published><updated>2025-01-11T17:50:12+00:00</updated><id>https://simonwillison.net/2025/Jan/11/agents/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://huyenchip.com/2025/01/07/agents.html"&gt;Agents&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Chip Huyen's 8,000 word practical guide to building useful LLM-driven workflows that take advantage of tools.&lt;/p&gt;
&lt;p&gt;Chip starts by providing a definition of "agents" to be used in the piece - in this case it's LLM systems that plan an approach and then run tools in a loop until a goal is achieved. I like how she ties it back to the classic Norvig "thermostat" model -  where an agent is "anything that can perceive its environment and act upon that environment" - by classifying tools as &lt;em&gt;read-only actions&lt;/em&gt; (sensors) and &lt;em&gt;write actions&lt;/em&gt; (actuators).&lt;/p&gt;
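&lt;p&gt;The sensor/actuator split is a useful one in practice, because write actions are the dangerous ones. A minimal sketch of that classification, using made-up tool names:&lt;/p&gt;

```python
# Sketch of the read-only vs write-action tool classification.
# Tool names are invented; read-only tools act as "sensors",
# write tools as "actuators".

TOOLS = {
    "read_file":  {"fn": lambda path: f"contents of {path}", "writes": False},
    "web_search": {"fn": lambda q: f"results for {q}",       "writes": False},
    "send_email": {"fn": lambda to: f"emailed {to}",         "writes": True},
}

def sensors():
    # Tools that only perceive the environment
    return [name for name, t in TOOLS.items() if not t["writes"]]

def actuators():
    # Tools that act upon the environment - these are the ones
    # an agent harness might gate behind user confirmation
    return [name for name, t in TOOLS.items() if t["writes"]]
```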
&lt;p&gt;There's a lot of great advice in this piece. The section &lt;a href="https://huyenchip.com/2025/01/07/agents.html#plan_generation"&gt;on planning&lt;/a&gt; is particularly strong, showing a system prompt with embedded examples and offering these tips on improving the planning process:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Write a better system prompt with more examples.&lt;/li&gt;
&lt;li&gt;Give better descriptions of the tools and their parameters so that the model understands them better.&lt;/li&gt;
&lt;li&gt;Rewrite the functions themselves to make them simpler, such as refactoring a complex function into two simpler functions.&lt;/li&gt;
&lt;li&gt;Use a stronger model. In general, stronger models are better at planning.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;The article is adapted from Chip's brand new O'Reilly book &lt;a href="https://www.oreilly.com/library/view/ai-engineering/9781098166298/"&gt;AI Engineering&lt;/a&gt;. I think this is an excellent advertisement for the book itself.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://bsky.app/profile/chiphuyen.bsky.social/post/3lf6bnxkprk2w"&gt;@chiphuyen.bsky.social&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="agent-definitions"/></entry><entry><title>My AI/LLM predictions for the next 1, 3 and 6 years, for Oxide and Friends</title><link href="https://simonwillison.net/2025/Jan/10/ai-predictions/#atom-tag" rel="alternate"/><published>2025-01-10T01:43:16+00:00</published><updated>2025-01-10T01:43:16+00:00</updated><id>https://simonwillison.net/2025/Jan/10/ai-predictions/#atom-tag</id><summary type="html">
    &lt;p&gt;The &lt;a href="https://oxide-and-friends.transistor.fm/"&gt;Oxide and Friends&lt;/a&gt; podcast has an annual tradition of asking guests to share their predictions for the next 1, 3 and 6 years. Here's &lt;a href="https://github.com/oxidecomputer/oxide-and-friends/blob/master/2022_01_03.md"&gt;2022&lt;/a&gt;, &lt;a href="https://github.com/oxidecomputer/oxide-and-friends/blob/master/2023_01_09.md"&gt;2023&lt;/a&gt; and &lt;a href="https://github.com/oxidecomputer/oxide-and-friends/blob/master/2024_01_08.md"&gt;2024&lt;/a&gt;. This year they invited me to participate. I've never been brave enough to share &lt;em&gt;any&lt;/em&gt; public predictions before, so this was a great opportunity to get outside my comfort zone!&lt;/p&gt;
&lt;p&gt;We recorded the episode live using Discord on Monday. It's now available &lt;a href="https://www.youtube.com/watch?v=-pk6VokHpGY"&gt;on YouTube&lt;/a&gt; and &lt;a href="https://oxide-and-friends.transistor.fm/"&gt;in podcast form&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;lite-youtube videoid="-pk6VokHpGY"
  title="Oxide and Friends 1/6/2025 -- Predictions 2025"
  playlabel="Play: Oxide and Friends 1/6/2025 -- Predictions 2025"
&gt; &lt;/lite-youtube&gt;&lt;/p&gt;

&lt;p&gt;Here are my predictions, written up here in a little more detail than the stream of consciousness I shared on the podcast.&lt;/p&gt;
&lt;p&gt;I should emphasize that I find the very idea of trying to predict AI/LLMs over a multi-year period to be completely absurd! I can't predict what's going to happen a week from now, six years is a different universe.&lt;/p&gt;
&lt;p&gt;With that disclaimer out of the way, here's an expanded version of what I said.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#one-year-agents-fail-to-happen-again"&gt;One year: Agents fail to happen, again&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#one-year-code-research-assistants"&gt;One year: ... except for code and research assistants&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#three-years-someone-wins-a-pulitzer-for-ai-assisted-investigative-reporting"&gt;Three years: Someone wins a Pulitzer for AI-assisted investigative reporting&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#three-years-part-two-privacy-laws-with-teeth"&gt;Three years part two: privacy laws with teeth&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#six-years-utopian-amazing-art"&gt;Six years utopian: amazing art&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#six-years-dystopian-agi-asi-causes-mass-civil-unrest"&gt;Six years dystopian: AGI/ASI causes mass civil unrest&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#my-total-lack-of-conviction"&gt;My total lack of conviction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="one-year-agents-fail-to-happen-again"&gt;One year: Agents fail to happen, again&lt;/h4&gt;
&lt;p&gt;I wrote about how &lt;a href="https://simonwillison.net/2024/Dec/31/llms-in-2024/#-agents-still-haven-t-really-happened-yet"&gt;“Agents” still haven’t really happened yet&lt;/a&gt; in my review of Large Language Model developments  in 2024.&lt;/p&gt;
&lt;p&gt;I think we are going to see a &lt;em&gt;lot&lt;/em&gt; more froth about agents in 2025, but I expect the results will be a great disappointment to most of the people who are excited about this term. I expect a lot of money will be lost chasing after several different poorly defined dreams that share that name.&lt;/p&gt;
&lt;p&gt;What are agents anyway? Ask a dozen people and you'll get a dozen slightly different answers - I collected and &lt;a href="https://gist.github.com/simonw/beaa5f90133b30724c5cc1c4008d0654"&gt;then AI-summarized a bunch of those here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For the sake of argument, let's pick a definition that I can predict won't come to fruition: the idea of an AI assistant that can go out into the world and semi-autonomously act on your behalf. I think of this as the &lt;strong&gt;travel agent&lt;/strong&gt; definition of agents, because for some reason everyone always jumps straight to flight and hotel booking and itinerary planning when they describe this particular dream.&lt;/p&gt;
&lt;p&gt;Having the current generation of LLMs make material decisions on your behalf - like what to spend money on - is a &lt;em&gt;really bad idea&lt;/em&gt;. They're too unreliable, but more importantly they are too &lt;strong&gt;gullible&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If you're going to arm your AI assistant with a credit card and set it loose on the world, you need to be confident that it's not going to hit "buy" on the first website that claims to offer the best bargains!&lt;/p&gt;
&lt;p&gt;I'm confident that reliability is the reason LLM-powered agents haven't taken off yet, despite the idea attracting a huge amount of buzz since right after ChatGPT first came out.&lt;/p&gt;
&lt;p&gt;I would be very surprised if any of the models released over the next twelve months had enough of a reliability improvement to make this work. Solving gullibility is an astonishingly difficult problem.&lt;/p&gt;
&lt;p&gt;(I had &lt;a href="https://www.youtube.com/watch?v=-pk6VokHpGY&amp;amp;t=1206s"&gt;a particularly spicy rant&lt;/a&gt; about how stupid the idea of sending a "digital twin" to a meeting on your behalf is.)&lt;/p&gt;
&lt;h4 id="one-year-code-research-assistants"&gt;One year: ... except for code and research assistants&lt;/h4&gt;
&lt;p&gt;There are two categories of "agent" that I do believe in, because they're proven to work already.&lt;/p&gt;
&lt;p&gt;The first is &lt;strong&gt;coding assistants&lt;/strong&gt; - where an LLM writes, executes and then refines computer code in a loop.&lt;/p&gt;
&lt;p&gt;I first saw this pattern demonstrated by OpenAI with their &lt;a href="https://simonwillison.net/tags/code-interpreter/"&gt;Code Interpreter&lt;/a&gt; feature for ChatGPT, released back in March/April of 2023.&lt;/p&gt;
&lt;p&gt;You can ask ChatGPT to solve a problem that can be addressed with Python code and it will write that Python, execute it in a secure sandbox (I think it's Kubernetes) and then use the output - or any error messages - to determine if the goal has been achieved.&lt;/p&gt;
&lt;p&gt;It's a beautiful pattern that worked great with early 2023 models (I believe it first shipped using original GPT-4), and continues to work today.&lt;/p&gt;
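&lt;p&gt;The write-execute-refine shape of that pattern fits in a short sketch. Here the canned candidate snippets stand in for what an LLM would generate on each attempt; everything else is ordinary Python:&lt;/p&gt;

```python
# Sketch of the Code Interpreter pattern: run model-written Python,
# capture errors, and retry. Candidate snippets stand in for LLM output.

import subprocess
import sys

def run_snippet(code):
    # Run the snippet in a fresh interpreter (a real harness would use
    # a proper sandbox, not a bare subprocess)
    proc = subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True, timeout=10)
    return proc.returncode, proc.stdout, proc.stderr

def refine_loop(candidates):
    # Each failure's stderr would normally be fed back to the model;
    # here we simply move on to the next canned candidate.
    for code in candidates:
        returncode, out, err = run_snippet(code)
        if returncode == 0:
            return out.strip()
    return None

attempts = [
    "print(10 / 0)",         # first attempt fails with ZeroDivisionError
    "print(sum(range(5)))",  # "refined" attempt succeeds
]
```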
&lt;p&gt;Claude added their own version in October (&lt;a href="https://simonwillison.net/2024/Oct/24/claude-analysis-tool/"&gt;Claude analysis&lt;/a&gt;, using JavaScript that runs in the browser), Mistral have it, Gemini has a version and there are dozens of other implementations of the same pattern.&lt;/p&gt;
&lt;p&gt;The second category of agents that I believe in is &lt;strong&gt;research assistants&lt;/strong&gt; - where an LLM can run multiple searches, gather information and aggregate that into an answer to a question or write a report.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.perplexity.ai/"&gt;Perplexity&lt;/a&gt; and &lt;a href="https://openai.com/index/introducing-chatgpt-search/"&gt;ChatGPT Search&lt;/a&gt; have both been operating in this space for a while, but by far the most impressive implementation I've seen is Google Gemini's &lt;a href="https://blog.google/products/gemini/google-gemini-deep-research/"&gt;Deep Research&lt;/a&gt; tool, which I've had access to for a few weeks.&lt;/p&gt;
&lt;p&gt;With Deep Research I can pose a question like this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Pillar Point Harbor is one of the largest communal brown pelican roosts on the west coast of North America.&lt;/p&gt;
&lt;p&gt;find others&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And Gemini will draft a plan, consult dozens of different websites via Google Search and then assemble a report (with all-important citations) describing what it found.&lt;/p&gt;
&lt;p&gt;Here's the plan it came up with:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Pillar Point Harbor is one of the largest communal brown pelican roosts on the west coast of North America. Find other large communal brown pelican roosts on the west coast of North America.&lt;br /&gt;
(1) Find a list of brown pelican roosts on the west coast of North America.&lt;br /&gt;
(2) Find research papers or articles about brown pelican roosts and their size.&lt;br /&gt;
(3) Find information from birdwatching organizations or government agencies about brown pelican roosts.&lt;br /&gt;
(4) Compare the size of the roosts found in (3) to the size of the Pillar Point Harbor roost.&lt;br /&gt;
(5) Find any news articles or recent reports about brown pelican roosts and their populations.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It dug up a whole bunch of details, but the one I cared most about was &lt;a href="https://birdallianceoregon.org/wp-content/uploads/2021/04/Brown-Pelican-survey_4-year_summary-infographic_2016-19_final.pdf"&gt;these PDF results for the 2016-2019 Pacific Brown Pelican Survey&lt;/a&gt; conducted by the West Coast Audubon network and partners - a PDF that included this delightful list:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Top 10 Megaroosts (sites that traditionally host &amp;gt;500 pelicans) with average fall count numbers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Alameda Breakwater, CA (3,183)&lt;/li&gt;
&lt;li&gt;Pillar Point Harbor, CA (1,481)&lt;/li&gt;
&lt;li&gt;East Sand Island, OR (1,121)&lt;/li&gt;
&lt;li&gt;Ano Nuevo State Park, CA (1,068)&lt;/li&gt;
&lt;li&gt;Salinas River mouth, CA (762)&lt;/li&gt;
&lt;li&gt;Bolinas Lagoon, CA (755)&lt;/li&gt;
&lt;li&gt;Morro Rock, CA (725)&lt;/li&gt;
&lt;li&gt;Moss landing, CA (570)&lt;/li&gt;
&lt;li&gt;Crescent City Harbor, CA (514)&lt;/li&gt;
&lt;li&gt;Bird Rock Tomales, CA (514)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;My local harbor is the second biggest megaroost!&lt;/p&gt;
&lt;p&gt;It makes intuitive sense to me that this kind of research assistant can be built on our current generation of LLMs. They're competent at driving tools, they're capable of coming up with a relatively obvious research plan (look for newspaper articles and research papers) and they can synthesize sensible answers given the right collection of context gathered through search.&lt;/p&gt;
&lt;p&gt;Google are particularly well suited to solving this problem: they have the world's largest search index and their Gemini model has a 2 million token context. I expect Deep Research to get a whole lot better, and I expect it to attract plenty of competition.&lt;/p&gt;
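&lt;p&gt;The research-assistant shape - draft a plan, run several searches, synthesize - is also easy to sketch. The search function below is a stand-in; nothing here matches any vendor's actual interface:&lt;/p&gt;

```python
# Sketch of the research-assistant pattern: plan, search, synthesize.
# fake_search is a stand-in for a real search API.

def fake_search(query):
    return [f"snippet about {query}"]

def research(question, plan):
    findings = []
    for step in plan:
        # Run one search per planned step and gather the results
        findings.extend(fake_search(step))
    # A real system would hand the findings to an LLM for synthesis,
    # ideally with citations back to each source
    return {"question": question, "sources": len(findings), "notes": findings}

report = research(
    "Find large brown pelican roosts",
    ["pelican roost lists", "pelican survey papers", "birdwatching org reports"],
)
```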
&lt;h4 id="three-years-someone-wins-a-pulitzer-for-ai-assisted-investigative-reporting"&gt;Three years: Someone wins a Pulitzer for AI-assisted investigative reporting&lt;/h4&gt;
&lt;p&gt;I went for a bit of a self-serving prediction here: I think within three years someone is going to win a Pulitzer prize for a piece of investigative reporting that was aided by generative AI tools.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update&lt;/strong&gt;: after publishing this piece I learned about this May 2024 story from Nieman Lab: &lt;a href="https://www.niemanlab.org/2024/05/for-the-first-time-two-pulitzer-winners-disclosed-using-ai-in-their-reporting/"&gt;For the first time, two Pulitzer winners disclosed using AI in their reporting&lt;/a&gt;. I think these were both examples of traditional machine learning as opposed to LLM-based generative AI, but this is yet another example of my predictions being less ambitious than I had thought!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I do &lt;em&gt;not&lt;/em&gt; mean that an LLM will write the article! I continue to think that having LLMs write on your behalf is one of the least interesting applications of these tools.&lt;/p&gt;
&lt;p&gt;I called this prediction self-serving because I want to help make this happen! My &lt;a href="https://datasette.io"&gt;Datasette&lt;/a&gt; suite of open source tools for data journalism has been growing AI features, like &lt;a href="https://simonwillison.net/2023/Dec/1/datasette-enrichments/"&gt;LLM-powered data enrichments&lt;/a&gt; and &lt;a href="https://www.datasette.cloud/blog/2024/datasette-extract/"&gt;extracting structured data&lt;/a&gt; into tables from unstructured text.&lt;/p&gt;
&lt;p&gt;My dream is for those tools - or tools like them - to be used for an award-winning piece of investigative reporting.&lt;/p&gt;
&lt;p&gt;I picked three years for this because I think that's how long it will take for knowledge of how to responsibly and effectively use these tools to become widespread enough for that to happen.&lt;/p&gt;
&lt;p&gt;LLMs are not an obvious fit for journalism: journalists look for the truth, and LLMs are notoriously prone to hallucination and making things up. But journalists are also &lt;em&gt;really good&lt;/em&gt; at extracting useful information from potentially untrusted sources - that's a lot of what the craft of journalism is about.&lt;/p&gt;
&lt;p&gt;The two areas I think LLMs are particularly relevant to journalism are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Structured data extraction. If you have 10,000 PDFs from a successful Freedom of Information Act request, someone or something needs to kick off the process of reading through them to find the stories. LLMs are a fantastic way to take a vast amount of information and start making some element of sense from it. They can act as lead generators, helping identify the places to start looking more closely.&lt;/li&gt;
&lt;li&gt;Coding assistance. Writing code to help analyze data is a huge part of modern data journalism - from SQL queries through data cleanup scripts, custom web scrapers or visualizations to help find signal among the noise. Most newspapers don't have a team of programmers on staff: I think within three years we'll have robust enough tools built around this pattern that non-programmer journalists will be able to use them as part of their reporting process.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I hope to build some of these tools myself!&lt;/p&gt;
&lt;p&gt;So my concrete prediction for three years is that someone wins a Pulitzer with a small amount of assistance from LLMs.&lt;/p&gt;
&lt;p&gt;My more general prediction: within three years it won't be surprising at all to see most information professionals use LLMs as part of their daily workflow, in increasingly sophisticated ways. We'll know exactly what patterns work and how best to explain them to people. These skills will become widespread.&lt;/p&gt;
&lt;h4 id="three-years-part-two-privacy-laws-with-teeth"&gt;Three years part two: privacy laws with teeth&lt;/h4&gt;
&lt;p&gt;My other three year prediction concerned privacy legislation.&lt;/p&gt;
&lt;p&gt;The levels of (often justified) paranoia around both targeted advertising and what happens to the data people paste into these models is a constantly growing problem.&lt;/p&gt;
&lt;p&gt;I wrote recently about the &lt;a href="https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/"&gt;inexterminable conspiracy theory that Apple targets ads by spying through your phone's microphone&lt;/a&gt;. I've written in the past about &lt;a href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/"&gt;the AI trust crisis&lt;/a&gt;, where people refuse to believe that models are not being trained on their inputs no matter how emphatically the companies behind them deny it.&lt;/p&gt;
&lt;p&gt;I think the AI industry itself would benefit enormously from legislation that helps clarify what's going on with training on user-submitted data, and the wider tech industry could really do with harder rules around things like data retention and targeted advertising.&lt;/p&gt;
&lt;p&gt;I don't expect the next four years of US federal government to be effective at passing legislation, but I expect we'll see privacy legislation with sharper teeth emerging at the state level or internationally. Let's just hope we don't end up with a new generation of cookie-consent banners as a result!&lt;/p&gt;
&lt;h4 id="six-years-utopian-amazing-art"&gt;Six years utopian: amazing art&lt;/h4&gt;
&lt;p&gt;For six years I decided to go with two rival predictions, one optimistic and one pessimistic.&lt;/p&gt;
&lt;p&gt;I think six years is long enough that we'll figure out how to harness this stuff to make some &lt;strong&gt;really great art&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;I don't think generative AI for art - images, video and music - deserves nearly the same level of respect as a useful tool as text-based LLMs do. Generative art tools are a lot of fun to try out, but the lack of fine-grained control over their output greatly limits their utility outside of personal amusement or generating &lt;a href="https://simonwillison.net/tags/slop/"&gt;slop&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;More importantly, they lack social acceptability. The vibes aren't good. Many talented artists have loudly rejected the idea of these tools, to the point that the very term "AI" is developing a distasteful connotation in society at large.&lt;/p&gt;
&lt;p&gt;Image and video models are also ground zero for the AI training data ethics debate, and for good reason: no artist wants to see a model trained on their work without their permission that then directly competes with them!&lt;/p&gt;
&lt;p&gt;I think six years is long enough for this whole thing to shake out - for society to figure out acceptable ways of using these tools to truly elevate human expression. What excites me is the idea of truly talented, visionary creative artists using whatever these tools have evolved into in six years to make meaningful art that could never have been achieved without them.&lt;/p&gt;
&lt;p&gt;On the podcast I talked about &lt;a href="https://en.wikipedia.org/wiki/Everything_Everywhere_All_at_Once"&gt;Everything Everywhere All at Once&lt;/a&gt;, a film that deserved every one of its seven Oscars. The core visual effects team on that film was just five people. Imagine what a team like that could do with the generative AI tools we'll have in six years' time!&lt;/p&gt;
&lt;p id="since-recording"&gt;Since recording the podcast I learned from &lt;a href="https://www.swyx.io/"&gt;Swyx&lt;/a&gt; that Everything Everywhere All at Once &lt;a href="https://www.aboutamazon.com/news/aws/how-ai-tools-are-creating-new-possibilities-for-movies-and-visual-design-according-to-this-aws-powered-startup"&gt;used Runway ML as part of their toolset already&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Evan Halleck was on this team, and he used Runway's AI tools to save time and automate tedious aspects of editing. Specifically in the film’s rock scene, he used Runway’s rotoscoping tool to get a quick, clean cut of the rocks as sand and dust were moving around the shot. This translated days of work to a matter of minutes.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I said I thought a film that had used generative AI tools would win an Oscar within six years. Looks like I was eight years out on that one!&lt;/p&gt;
&lt;h4 id="six-years-dystopian-agi-asi-causes-mass-civil-unrest"&gt;Six years dystopian: AGI/ASI causes mass civil unrest&lt;/h4&gt;
&lt;p&gt;My pessimistic alternative take for 2031 concerns "AGI" - a term which, like "agents", is constantly being redefined. The Information &lt;a href="https://www.theinformation.com/articles/microsoft-and-openai-wrangle-over-terms-of-their-blockbuster-partnership"&gt;recently reported&lt;/a&gt; (see also &lt;a href="https://www.theverge.com/2025/1/6/24337106/sam-altman-says-openai-knows-how-to-build-agi-blog-post"&gt;The Verge&lt;/a&gt;) that Microsoft and OpenAI are now defining AGI as a system capable of generating $100bn in profit!&lt;/p&gt;
&lt;p&gt;If we assume AGI is the point at which AI systems are capable of performing almost any job currently reserved for a human being, it's hard &lt;em&gt;not&lt;/em&gt; to see potentially negative consequences.&lt;/p&gt;
&lt;p&gt;Sam Altman may have &lt;a href="https://www.bloomberg.com/news/articles/2024-07-22/ubi-study-backed-by-openai-s-sam-altman-bolsters-support-for-basic-income"&gt;experimented with Universal Basic Income&lt;/a&gt;, but the USA is a country that can't even figure out universal healthcare! I have huge trouble imagining a future economy that works for the majority of people when the majority of jobs are being done by machines.&lt;/p&gt;
&lt;p&gt;So my dystopian prediction for 2031 is that if that form of AGI has come to pass it will be accompanied by extraordinarily bad economic outcomes and mass civil unrest.&lt;/p&gt;
&lt;p&gt;My version of an AI utopia is tools that augment existing humans. That's what we've had with LLMs so far, and my ideal is that those tools continue to improve and subsequently humans become able to take on &lt;a href="https://simonwillison.net/2023/Mar/27/ai-enhanced-development/"&gt;more ambitious work&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If there's a version of AGI that results in that kind of utopia, I'm all for it.&lt;/p&gt;
&lt;h4 id="my-total-lack-of-conviction"&gt;My total lack of conviction&lt;/h4&gt;
&lt;p&gt;There's a reason I haven't made predictions like this before: my confidence in my ability to predict the future is almost non-existent. At least one of my predictions here &lt;a href="https://simonwillison.net/2025/Jan/10/ai-predictions/#since-recording"&gt;already proved to be eight years late&lt;/a&gt;!&lt;/p&gt;
&lt;p&gt;These predictions are in the public record now (I even &lt;a href="https://github.com/oxidecomputer/oxide-and-friends/pull/158"&gt;submitted a pull request&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;It's going to be interesting looking back at these in one, three and six years to see how I did.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/predictions"&gt;predictions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/code-interpreter"&gt;code-interpreter&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/oxide"&gt;oxide&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/deep-research"&gt;deep-research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="data-journalism"/><category term="predictions"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="gemini"/><category term="code-interpreter"/><category term="oxide"/><category term="ai-agents"/><category term="deep-research"/><category term="ai-assisted-search"/><category term="coding-agents"/><category term="agent-definitions"/></entry><entry><title>Building effective agents</title><link href="https://simonwillison.net/2024/Dec/20/building-effective-agents/#atom-tag" rel="alternate"/><published>2024-12-20T05:50:33+00:00</published><updated>2024-12-20T05:50:33+00:00</updated><id>https://simonwillison.net/2024/Dec/20/building-effective-agents/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.anthropic.com/research/building-effective-agents"&gt;Building effective agents&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;My principal complaint about the term "agents" is that while it has many different potential definitions, most of the people who use it seem to assume that everyone else shares and understands the definition that they have chosen to use.&lt;/p&gt;
&lt;p&gt;This outstanding piece by Erik Schluntz and Barry Zhang at Anthropic bucks that trend from the start, providing a clear definition that they then use throughout.&lt;/p&gt;
&lt;p&gt;They discuss "agentic systems" as a parent term, then define a distinction between "workflows" - systems where multiple LLMs are orchestrated together using pre-defined patterns - and "agents", where the LLMs "dynamically direct their own processes and tool usage". This second definition is later expanded with this delightfully clear description:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Agents begin their work with either a command from, or interactive discussion with, the human user. Once the task is clear, agents plan and operate independently, potentially returning to the human for further information or judgement. During execution, it's crucial for the agents to gain “ground truth” from the environment at each step (such as tool call results or code execution) to assess its progress. Agents can then pause for human feedback at checkpoints or when encountering blockers. The task often terminates upon completion, but it’s also common to include stopping conditions (such as a maximum number of iterations) to maintain control.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That's a definition I can live with!&lt;/p&gt;
&lt;p&gt;They also introduce a term that I &lt;em&gt;really&lt;/em&gt; like: &lt;strong&gt;the augmented LLM&lt;/strong&gt;. This is an LLM with augmentations such as tools - I've seen people use the term "agents" just for this, which never felt right to me.&lt;/p&gt;
&lt;p&gt;The rest of the article is the clearest practical guide to building systems that combine multiple LLM calls that I've seen anywhere.&lt;/p&gt;
&lt;p&gt;Most of the focus is actually on workflows. They describe five different patterns for workflows in detail:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Prompt chaining, e.g. generating a document and then translating it to a separate language as a second LLM call&lt;/li&gt;
&lt;li&gt;Routing, where an initial LLM call decides which model or call should be used next (sending easy tasks to Haiku and harder tasks to Sonnet, for example)&lt;/li&gt;
&lt;li&gt;Parallelization, where a task is broken up and run in parallel (e.g. image-to-text on multiple document pages at once) or processed by some kind of voting mechanism&lt;/li&gt;
&lt;li&gt;Orchestrator-workers, where an orchestrator triggers multiple LLM calls that are then synthesized together, for example running searches against multiple sources and combining the results&lt;/li&gt;
&lt;li&gt;Evaluator-optimizer, where one model checks the work of another in a loop&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These patterns all make sense to me, and giving them clear names makes them easier to reason about.&lt;/p&gt;
&lt;p&gt;When should you upgrade from basic prompting to workflows and then to full agents? The authors provide this sensible warning:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;But assuming you do need to go beyond what can be achieved even with the aforementioned workflow patterns, their model for agents may be a useful fit:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Agents can be used for open-ended problems where it’s difficult or impossible to predict the required number of steps, and where you can’t hardcode a fixed path. The LLM will potentially operate for many turns, and you must have some level of trust in its decision-making. Agents' autonomy makes them ideal for scaling tasks in trusted environments.&lt;/p&gt;
&lt;p&gt;The autonomous nature of agents means higher costs, and the potential for compounding errors. We recommend extensive testing in sandboxed environments, along with the appropriate guardrails&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They also warn against investing in complex agent frameworks before you've exhausted your options using direct API access and simple code.&lt;/p&gt;
&lt;p&gt;The article is accompanied by a brand new set of &lt;a href="https://github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents"&gt;cookbook recipes&lt;/a&gt; illustrating all five of the workflow patterns. The &lt;a href="https://github.com/anthropics/anthropic-cookbook/blob/main/patterns/agents/evaluator_optimizer.ipynb"&gt;Evaluator-Optimizer Workflow&lt;/a&gt; example is particularly fun, setting up a code-generating prompt and a code-reviewing evaluator prompt and having them loop until the evaluator is happy with the result.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://x.com/HamelHusain/status/1869935867940540596"&gt;Hamel Husain&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/definitions"&gt;definitions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="definitions"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="agent-definitions"/></entry><entry><title>PydanticAI</title><link href="https://simonwillison.net/2024/Dec/2/pydanticai/#atom-tag" rel="alternate"/><published>2024-12-02T21:08:50+00:00</published><updated>2024-12-02T21:08:50+00:00</updated><id>https://simonwillison.net/2024/Dec/2/pydanticai/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://ai.pydantic.dev/"&gt;PydanticAI&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;New project from Pydantic, which they describe as an "Agent Framework / shim to use Pydantic with LLMs".&lt;/p&gt;
&lt;p&gt;I asked &lt;a href="https://twitter.com/simonw/status/1863567881553977819"&gt;which agent definition they are using&lt;/a&gt; and it's the "system prompt with bundled tools" one. To their credit, they explain that &lt;a href="https://ai.pydantic.dev/agents/"&gt;in their documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The &lt;a href="https://ai.pydantic.dev/api/agent/"&gt;Agent&lt;/a&gt; has full API documentation, but conceptually you can think of an agent as a container for:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://ai.pydantic.dev/agents/#system-prompts"&gt;system prompt&lt;/a&gt; — a set of instructions for the LLM written by the developer&lt;/li&gt;
&lt;li&gt;One or more &lt;a href="https://ai.pydantic.dev/agents/#function-tools"&gt;retrieval tool&lt;/a&gt; — functions that the LLM may call to get information while generating a response&lt;/li&gt;
&lt;li&gt;An optional structured &lt;a href="https://ai.pydantic.dev/results/"&gt;result type&lt;/a&gt; — the structured datatype the LLM must return at the end of a run&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Given how many other existing tools already lean on Pydantic to help define JSON schemas for talking to LLMs this is an interesting complementary direction for Pydantic to take.&lt;/p&gt;
&lt;p&gt;There's some overlap here with my own &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; project, which I still hope to add a function calling / tools abstraction to in the future.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;&lt;/small&gt;Via &lt;a href="https://twitter.com/pydantic/status/1863538947059544218"&gt;@pydantic&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-tool-use"&gt;llm-tool-use&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pydantic"&gt;pydantic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="python"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="llm-tool-use"/><category term="ai-agents"/><category term="pydantic"/><category term="agent-definitions"/></entry><entry><title>Quoting Michael Wooldridge</title><link href="https://simonwillison.net/2024/Oct/12/michael-wooldridge/#atom-tag" rel="alternate"/><published>2024-10-12T12:29:36+00:00</published><updated>2024-10-12T12:29:36+00:00</updated><id>https://simonwillison.net/2024/Oct/12/michael-wooldridge/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/subsection3_1_1.html"&gt;&lt;p&gt;Carl Hewitt recently remarked that the question &lt;em&gt;what is an agent&lt;/em&gt;? is embarrassing for the agent-based computing community in just the same way that the question &lt;em&gt;what is intelligence&lt;/em&gt;? is embarrassing for the mainstream AI community. The problem is that although the term is widely used, by many people working in closely related areas, it defies attempts to produce a single universally accepted definition. This need not necessarily be a problem: after all, if many people are successfully developing interesting and useful applications, then it hardly matters that they do not agree on potentially trivial terminological details. However, there is also the danger that unless the issue is discussed, 'agent' might become a 'noise' term, subject to both abuse and misuse, to the potential confusion of the research community.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/ker95/subsection3_1_1.html"&gt;Michael Wooldridge&lt;/a&gt;, in 1994, Intelligent Agents: Theory and Practice&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-history"&gt;ai-history&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agent-definitions"&gt;agent-definitions&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="ai-agents"/><category term="ai-history"/><category term="agent-definitions"/></entry></feed>