Simon Willison's Weblog: predictions

Notes on Pope Leo XIV's encyclical on AI

2026-05-25T23:58:17+00:00

Dropped this morning by the Vatican: Magnifica Humanitas of His Holiness Pope Leo XIV on Safeguarding the Human Person in the Time of Artificial Intelligence. This is a very interesting document. It's some of the clearest writing I've seen on the ethics of integrating AI into modern society.

Pope Leo XIV chose the name Leo in honor of Pope Leo XIII, who is known for his 1891 Rerum novarum encyclical on "Rights and Duties of Capital and Labor".

This story on Vatican News further clarifies the significance of that decision:

Meeting with the College of Cardinals for their first formal encounter after his election, Pope Leo XIV explained part of the reason for the choice of his papal name. "There are different reasons for this," he said, before going on to explain that he chose the name Leo "mainly because Pope Leo XIII, in his historic encyclical Rerum novarum addressed the social question in the context of the first great industrial revolution."

"In our own day," he continued, "the Church offers to everyone the treasury of her social teaching in response to another industrial revolution and to developments in the field of artificial intelligence that pose new challenges for the defence of human dignity, justice, and labour."

And now we get Pope Leo XIV's own encyclical on the AI revolution. There's a lot in here, but the writing style is very approachable, including to non-Catholics.

A few of my highlights

(I listened to most of the encyclical on a walk with our dog, my first time trying the ElevenReader iPhone app. It worked very well: I pasted in a URL to the document and it read it to me in a very high quality voice, highlighting each paragraph as it went.)

Here are some of my highlights. In each case below emphasis is mine.

Here's a useful description of the interpretability problem for LLMs in section 98:

First, any statement regarding AI risks becoming quickly outdated, given the remarkable pace at which these systems are developing. Second, all of us, including those who design them, possess only a limited understanding of their actual functioning. Indeed, current AI systems are more “cultivated” than “built,” for developers do not directly design every detail, but instead create a framework within which the intelligence “grows.” As a result, fundamental scientific aspects — such as the internal representations and computational processes of these systems — remain, at present, unknown.

I liked section 83's description of the relationship between development and dignity:

For individuals as well as for nations, development is both a duty and a right. Minimum conditions are required for enabling every person and people to flourish in accord with their dignity, without being kept in a state of dependence or excluded from access to necessary goods. Development is truly human when it places people at the center instead of the accumulation of wealth, and when it concerns peoples as well as individuals. Justice demands the recognition of the rights of society and the rights of peoples, and includes a responsibility toward future generations. Development is not truly human if it increases consumption for some while shifting costs and burdens onto others, or relegates entire regions to subordinate roles, preventing them from realizing their full potential.

Baked-in cultural biases and sycophancy get a mention in section 100:

In personal use, three aspects in particular deserve careful consideration: the ease with which results are obtained, the impression of objectivity and the simulation of human communication. The speed and simplicity with which information, complex analyses, media content and practical assistance can be accessed undoubtedly makes life easier. Yet they can also encourage excessive reliance and the search for ready-made answers, and weaken personal creativity and judgment. The apparent objectivity of the responses and suggestions these systems provide can lead us to overlook the fact that they reflect the cultural assumptions of those who designed and trained them, with all their strengths and limitations. The artificial imitation of positive human communication — words of advice, empathy, friendship and even love — can be engaging and at times genuinely helpful. However, for less discerning users, it can also be misleading, creating the illusion of a relationship with a real personal subject. When words are simulated, they do not build genuine relationships, but only their appearance. The artificial imitation of care or support can become particularly risky when it enters contexts where real relationships and emotional bonds are lacking.

101 touches on the environmental impact:

Current AI systems require enormous amounts of energy and water, significantly influencing carbon dioxide emissions, and place heavy demands on natural resources. As their complexity increases, especially in the case of large language models, the need for computing power and storage capacity grows too, which requires an extensive network of machines, cables, data centers and energy-intensive infrastructure. For this reason, it is essential to develop more sustainable technological solutions that reduce environmental impact and help protect our common home.

102 covers the risks of algorithmic systems making decisions that impact people's lives without "compassion, mercy, forgiveness":

The use of AI is never a purely technical matter: when it enters processes that affect people’s lives, it touches on rights, opportunities, status and freedom. Important and sensitive decisions — concerning employment, credit, access to public services or even a person’s reputation — risk being fully delegated to automated systems that do not know “compassion, mercy, forgiveness, and above all, the hope that people are able to change,” and can therefore give rise to new forms of exclusion.

105 emphasizes the need for human accountability in how these systems are applied:

For AI to respect human dignity and truly serve the common good, responsibility must be clearly defined at every stage: from those who design and develop these systems to those who use them and rely on them for concrete decisions. In many cases, however, the internal processes leading to a result remain opaque, making it harder to assign responsibility and correct errors. This is where accountability becomes crucial: the possibility of identifying who must “account” for decisions, justify them, monitor them, and, when necessary, challenge them and remedy any harm caused.

And 108 touches on the way AI amplifies the power of those with resources:

In fact, as with every major technological shift, AI tends to amplify the power of those who already possess economic resources, expertise and access to data. In light of the common good and the universal destination of goods, this raises serious concerns, since small but highly influential groups can shape information and consumption patterns, influence democratic processes and steer economic dynamics to their own advantage, undermining social justice and solidarity among peoples. For this reason, it is essential that the use of AI, especially when it touches on public goods and fundamental rights, be guided by clear criteria and effective oversight, grounded in participation and subsidiarity.

That same section explicitly calls out data as something that should be thought of more as a public good:

[...] Moreover, ownership of data cannot be left solely in private hands but must be appropriately regulated. Data is the product of many contributors and should not be treated as something to be sold off or entrusted to a select few. It is necessary to think creatively in order to manage data as a common or shared good, in a spirit of participation, as Saint John Paul II already suggested regarding collective goods.

Given that Palantir is named after a Lord of the Rings reference, I can't help but wonder if the J.R.R. Tolkien quote from The Return of the King (section 213) was the Pope throwing a little shade at Peter Thiel.

The twentieth-century Catholic author J.R.R. Tolkien, in the words of a protagonist in one of his novels, described our responsibility in this way: “It is not our part to master all the tides of the world, but to do what is in us for the succour of those years wherein we are set, uprooting the evil in the fields that we know, so that those who live after may have clean earth to till.” The civilization of love will not arise from a single or spectacular gesture, but from the sum total of small and steadfast acts of fidelity that serve as a bulwark against dehumanization. For this reason, it is worthwhile pausing to reflect on some aspects of how we, each in our own way, can cooperate in building the civilization of love.

Another 2026 prediction down

On 6th January this year I joined the Oxide and Friends 2026 predictions podcast episode to talk about predictions for 2026, 2029 and 2032. I wrote mine up here. With hindsight they weren't nearly ambitious enough: it's already undeniable that LLMs write good code, we've made huge advances in sandboxing and New Zealand kākāpō have indeed had a truly excellent breeding season.

There's one segment from the episode that I didn't bother to include in my write-up, but that I can't resist providing as a lightly-edited transcript here:

Bryan Cantrill: 37:13

I think that AI has created some real public perception problems for itself. And I think that you are gonna have one of the frontier model companies, this year, have a white paper explaining how the proliferation of AI will mean prosperity for everybody. They will be trying to make some economic argument - because this is gonna be a 2026 election issue, how we think of these things and how they are regulated and it's a big mess. There's more heat than light in this debate.

Simon Willison: 38:05

I'd like to tag something on to that one: I think that only works if they can sort of wash that through existing trusted experts. Sam Altman and Dario are constantly publishing essays about this stuff and nobody believes a word they say. Get Barack Obama's signature on one of these position papers and maybe you've got something people might start to trust a little bit.

Adam Leventhal: 38:27

Otherwise, it's just like "leaded gas is good for you", says Exxon.

Bryan Cantrill: 38:31

I mean, yeah. God. Obama... let's go with that, that's a great one because if it's like Bill Clinton everyone's gonna kind of roll their eyes, so it's gotta be someone who's got real credibility saying that this is gonna be broad-based... I'd say if they get that person to do it, it's gonna be revealed that that's also a bit crooked.

Simon Willison: 38:57

How about the Pope?

Bryan Cantrill: 39:01

The Pope is very into this stuff! That's a great prediction. We've hit pay dirt. The Pope weighing in on LLMs and their economic impact on the world.

Simon, I'm giving you full credit if the Pope weighs in believing that this is gonna be economic devastation.

My prediction here looks a whole lot less insightful given the Leo XIV/Leo XIII relationship, which I was unaware of when we recorded the episode!

Tags: predictions, ai, kakapo, generative-ai, llms, bryan-cantrill, ai-ethics

One Human + One Agent = One Browser From Scratch

2026-01-27T16:58:08+00:00

embedding-shapes was so infuriated by the hype around Cursor's FastRender browser project - thousands of parallel agents producing ~1.6 million lines of Rust - that they were inspired to take a go at building a web browser using coding agents themselves.

The result is one-agent-one-browser and it's really impressive. Over three days they drove a single Codex CLI agent to build 20,000 lines of Rust that successfully renders HTML+CSS with no Rust crate dependencies at all - though it does (reasonably) use Windows, macOS and Linux system frameworks for image and text rendering.

I installed the 1MB macOS binary release and ran it against my blog:

chmod 755 ~/Downloads/one-agent-one-browser-macOS-ARM64 
~/Downloads/one-agent-one-browser-macOS-ARM64 https://simonwillison.net/

Here's the result:

It even rendered my SVG feed subscription icon! A PNG image is missing from the page, which looks like an intermittent bug (there's code to render PNGs).

The code is pretty readable too - here's the flexbox implementation.

I had thought that "build a web browser" was the ideal prompt to really stretch the capabilities of coding agents - and that it would take sophisticated multi-agent harnesses (as seen in the Cursor project) and millions of lines of code to achieve.

Turns out one agent driven by a talented engineer, three days and 20,000 lines of Rust is enough to get a very solid basic renderer working!

I'm going to upgrade my prediction for 2029: I think we're going to get a production-grade web browser built by a small team using AI assistance by then.

Via Show Hacker News

Tags: browsers, predictions, ai, rust, generative-ai, llms, ai-assisted-programming, coding-agents, codex, browser-challenge

LLM predictions for 2026, shared with Oxide and Friends

2026-01-08T19:42:13+00:00

I joined a recording of the Oxide and Friends podcast on Tuesday to talk about 1, 3 and 6 year predictions for the tech industry. This is my second appearance on their annual predictions episode; you can see my predictions from January 2025 here. Here's the page for this year's episode, with options to listen in all of your favorite podcast apps or directly on YouTube.

Bryan Cantrill started the episode by declaring that he's never been so unsure about what's coming in the next year. I share that uncertainty - the significant advances in coding agents just in the last two months have left me certain that things will change significantly, but unclear as to what those changes will be.

Here are the predictions I shared in the episode.

1 year: It will become undeniable that LLMs write good code ▶ 19:27

I think that there are still people out there who are convinced that LLMs cannot write good code. Those people are in for a very nasty shock in 2026. I do not think it will be possible to get to the end of even the next three months while still holding on to that idea that the code they write is all junk and it's likely any decent human programmer will write better code than they will.

In 2023, saying that LLMs write garbage code was entirely correct. For most of 2024 that stayed true. In 2025 that changed, but you could be forgiven for continuing to hold out. In 2026 the quality of LLM-generated code will become impossible to deny.

I base this on my own experience - I've spent more time exploring AI-assisted programming than most.

The key change in 2025 (see my overview for the year) was the introduction of "reasoning models" trained specifically against code using Reinforcement Learning. The major labs spent a full year competing with each other on who could get the best code capabilities from their models, and that problem turns out to be perfectly attuned to RL since code challenges come with built-in verifiable success conditions.

Since Claude Opus 4.5 and GPT-5.2 came out in November and December respectively, the amount of code I've written by hand has dropped to a single-digit percentage of my overall output. The same is true for many other expert programmers I know.

At this point if you continue to argue that LLMs write useless code you're damaging your own credibility.

1 year: We're finally going to solve sandboxing ▶ 20:05

I think this year is the year we're going to solve sandboxing. I want to run code other people have written on my computing devices without it destroying my computing devices if it's malicious or has bugs. [...] It's crazy that it's 2026 and I still pip install random code and then execute it in a way that it can steal all of my data and delete all my files. [...] I don't want to run a piece of code on any of my devices that somebody else wrote outside of a sandbox ever again.

This isn't just about LLMs, but it becomes even more important now there are so many more people writing code often without knowing what they're doing. Sandboxing is also a key part of the battle against prompt injection.

We have a lot of promising technologies in play already for this - containers and WebAssembly being the two I'm most optimistic about. There's real commercial value involved in solving this problem. The pieces are there: what's needed is UX work to reduce the friction in using them productively and securely.
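To make the friction point concrete, here's a minimal sketch of what locking down a single untrusted command with a container looks like today. It only builds the `docker run` command line rather than executing it. The helper name `sandboxed_docker_command`, the default limits and the `python:3.12-slim` image are my own illustrative choices, and this is one possible set of restrictions, not a complete sandbox.

```python
import shlex


def sandboxed_docker_command(
    image: str,
    command: list[str],
    memory: str = "256m",
    pids_limit: int = 64,
) -> list[str]:
    """Build a `docker run` argv that restricts an untrusted command.

    Hypothetical helper for illustration; the limits are arbitrary defaults.
    """
    return [
        "docker", "run",
        "--rm",                           # delete the container when it exits
        "--network", "none",              # no network access from inside
        "--read-only",                    # root filesystem mounted read-only
        "--cap-drop", "ALL",              # drop every Linux capability
        "--memory", memory,               # cap memory use
        "--pids-limit", str(pids_limit),  # cap the number of processes
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = sandboxed_docker_command(
        "python:3.12-slim", ["python", "-c", "print('hello from the sandbox')"]
    )
    # Print the command instead of running it, so this works without Docker.
    print(shlex.join(argv))
```

That's six flags to remember before the untrusted code even starts, which is exactly the kind of UX gap I'd like to see closed.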

1 year: A "Challenger disaster" for coding agent security ▶ 21:21

I think we're due a Challenger disaster with respect to coding agent security[...] I think so many people, myself included, are running these coding agents practically as root, right? We're letting them do all of this stuff. And every time I do it, my computer doesn't get wiped. I'm like, "oh, it's fine". [...] The worst version of this is the worm - a prompt injection worm which infects people's computers and adds itself to the Python or NPM packages that person has access to.

I used this as an opportunity to promote my favourite recent essay about AI security, the Normalization of Deviance in AI by Johann Rehberger.

The Normalization of Deviance describes the phenomenon where people and organizations get used to operating in an unsafe manner because nothing bad has happened to them yet, which can result in enormous problems (like the 1986 Challenger disaster) when their luck runs out.

Every six months I predict that a headline-grabbing prompt injection attack is coming soon, and every six months it doesn't happen. This is my most recent version of that prediction!

1 year: Kākāpō parrots will have an outstanding breeding season ▶ 50:06

(I dropped this one to lighten the mood after a discussion of the deep sense of existential dread that many programmers are feeling right now!)

I think that Kākāpō parrots in New Zealand are going to have an outstanding breeding season. The reason I think this is that the Rimu trees are in fruit right now. There's only 250 of them, and they only breed if the Rimu trees have a good fruiting. The Rimu trees have been terrible since 2019, but this year the Rimu trees were all blooming. There are researchers saying that all 87 females of breeding age might lay an egg. And for a species with only 250 remaining parrots that's great news.

(I just checked Wikipedia and I was right with the parrot numbers but wrong about the last good breeding season, apparently 2022 was a good year too.)

In a year with precious little in the form of good news I am utterly delighted to share this story. Here's more:

Kākāpō breeding season 2026 introduction from the Department of Conservation from June 2025.
Bumper breeding season for kākāpō on the cards - 3rd December 2025, University of Auckland.

I don't often use AI-generated images on this blog, but the Kākāpō image the Oxide team created for this episode is just perfect:

3 years: the coding agents Jevons paradox for software engineering will resolve, one way or the other ▶ 54:37

We will find out if the Jevons paradox saves our careers or not. This is a big question that anyone who's a software engineer has right now: we are driving the cost of actually producing working code down to a fraction of what it used to cost. Does that mean that our careers are completely devalued and we all have to learn to live on a tenth of our incomes, or does it mean that the demand for software, for custom software goes up by a factor of 10 and now our skills are even more valuable because you can hire me and I can build you 10 times the software I used to be able to? I think by three years we will know for sure which way that one went.

The quote says it all. There are two ways this coding agents thing could go: it could turn out software engineering skills are devalued, or it could turn out we're more valuable and effective than ever before.

I'm crossing my fingers for the latter! So far it feels to me like it's working out that way.
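Here's a toy model of the two outcomes, purely as an illustration. It assumes constant-elasticity demand (my assumption, not something from the episode) and treats total spending on software as a rough proxy for what engineers collectively earn. With the cost of producing code falling to a tenth, an elasticity of 0 means demand doesn't move and spending drops to a tenth; an elasticity of 1 means demand rises tenfold and spending holds steady; anything above 1 is the Jevons outcome, where spending grows.

```python
def spend_ratio(cost_ratio: float, elasticity: float) -> float:
    """Total spending after a cost change, relative to spending before.

    Assumes constant-elasticity demand: the quantity demanded scales as
    cost_ratio ** -elasticity. Both parameters are illustrative inputs.
    """
    quantity_ratio = cost_ratio ** -elasticity
    return cost_ratio * quantity_ratio


if __name__ == "__main__":
    cost_ratio = 0.1  # code now costs a tenth of what it used to
    for elasticity in (0.0, 0.5, 1.0, 1.5):
        ratio = spend_ratio(cost_ratio, elasticity)
        print(f"elasticity {elasticity}: spending changes by a factor of {ratio:.2f}")
```

The whole question is which elasticity the real world turns out to have, and a toy formula can't tell us that.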

3 years: Someone will build a new browser using mainly AI-assisted coding and it won't even be a surprise ▶ 65:13

I think somebody will have built a full web browser mostly using AI assistance, and it won't even be surprising. Rolling a new web browser is one of the most complicated software projects I can imagine[...] the cheat code is the conformance suites. If there are existing tests, it'll get so much easier.

A common complaint today from AI coding skeptics is that LLMs are fine for toy projects but can't be used for anything large and serious.

I think within 3 years that will be comprehensively proven incorrect, to the point that it won't even be controversial anymore.

I picked a web browser here because so much of the work building a browser involves writing code that has to conform to an enormous and daunting selection of both formal tests and informal websites-in-the-wild.

Coding agents are really good at tasks where you can define a concrete goal and then set them to work iterating in that direction.

A web browser is the most ambitious project I can think of that leans into those capabilities.
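A rough sketch of why conformance suites make such a good target: they turn "build a browser" into a number that can go up. The harness below is a toy of my own, with three made-up test cases and a deliberately incomplete hex colour parser standing in for a real implementation. None of it comes from an actual conformance suite.

```python
from typing import Any, Callable, Iterable


def pass_rate(
    implementation: Callable[[Any], Any],
    cases: Iterable[tuple[Any, Any]],
) -> tuple[float, list[tuple[Any, Any, Any]]]:
    """Run (input, expected) cases; return the pass rate and the failures."""
    cases = list(cases)
    if not cases:
        raise ValueError("need at least one test case")
    failures = []
    for given, expected in cases:
        try:
            actual = implementation(given)
        except Exception as exc:
            # Record the exception type so the failure is still reportable.
            actual = f"raised {type(exc).__name__}"
        if actual != expected:
            failures.append((given, expected, actual))
    return (len(cases) - len(failures)) / len(cases), failures


def parse_hex_colour(text: str) -> tuple[int, ...]:
    """Deliberately incomplete: only handles the six-digit #rrggbb form."""
    digits = text.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


# Hypothetical cases for illustration only.
CASES = [
    ("#ff0000", (255, 0, 0)),
    ("#00ff00", (0, 255, 0)),
    ("#fff", (255, 255, 255)),  # three-digit shorthand, not yet supported
]

if __name__ == "__main__":
    rate, failures = pass_rate(parse_hex_colour, CASES)
    print(f"pass rate: {rate:.0%}")
    for given, expected, actual in failures:
        print(f"  {given!r}: expected {expected}, got {actual}")
```

The list of failures is the useful part: it's precisely the feedback an agent needs to decide what to fix next.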

6 years: Typing code by hand will go the way of punch cards ▶ 80:39

I think the job of being paid money to type code into a computer will go the same way as punching punch cards [...] in six years time, I do not think anyone will be paid just to do the thing where you type the code. I think software engineering will still be an enormous career. I just think the software engineers won't be spending multiple hours of their day in a text editor typing out syntax.

The more time I spend on AI-assisted programming the less afraid I am for my job, because it turns out building software - especially at the rate it's now possible to build - still requires enormous skill, experience and depth of understanding.

The skills are changing though! Being able to read a detailed specification and transform it into lines of code is the thing that's being automated away. What's left is everything else, and the more time I spend working with coding agents the larger that "everything else" becomes.

Tags: predictions, sandboxing, ai, kakapo, generative-ai, llms, ai-assisted-programming, oxide, bryan-cantrill, coding-agents, jevons-paradox, conformance-suites, browser-challenge, deep-blue, november-2025-inflection

Prediction: AI will make formal verification go mainstream

2025-12-09T03:11:19+00:00

Martin Kleppmann makes the case for formal verification languages (things like Dafny, Nagini, and Verus) to finally start achieving more mainstream usage. Code generated by LLMs can benefit enormously from more robust verification, and LLMs themselves make these notoriously difficult systems easier to work with.

The paper Can LLMs Enable Verification in Mainstream Programming? by JetBrains Research in March 2025 found that Claude 3.5 Sonnet saw promising results for the three languages I listed above.

Via lobste.rs

Tags: predictions, programming-languages, ai, generative-ai, llms, ai-assisted-programming, martin-kleppmann

My AI/LLM predictions for the next 1, 3 and 6 years, for Oxide and Friends

2025-01-10T01:43:16+00:00

The Oxide and Friends podcast has an annual tradition of asking guests to share their predictions for the next 1, 3 and 6 years. Here's 2022, 2023 and 2024. This year they invited me to participate. I've never been brave enough to share any public predictions before, so this was a great opportunity to get outside my comfort zone!

We recorded the episode live using Discord on Monday. It's now available on YouTube and in podcast form.

Here are my predictions, written up here in a little more detail than the stream of consciousness I shared on the podcast.

I should emphasize that I find the very idea of trying to predict AI/LLMs over a multi-year period to be completely absurd! I can't predict what's going to happen a week from now; six years is a different universe.

With that disclaimer out of the way, here's an expanded version of what I said.

One year: Agents fail to happen, again

I wrote about how “Agents” still haven’t really happened yet in my review of Large Language Model developments in 2024.

I think we are going to see a lot more froth about agents in 2025, but I expect the results will be a great disappointment to most of the people who are excited about this term. I expect a lot of money will be lost chasing after several different poorly defined dreams that share that name.

What are agents anyway? Ask a dozen people and you'll get a dozen slightly different answers - I collected and then AI-summarized a bunch of those here.

For the sake of argument, let's pick a definition that I can predict won't come to fruition: the idea of an AI assistant that can go out into the world and semi-autonomously act on your behalf. I think of this as the travel agent definition of agents, because for some reason everyone always jumps straight to flight and hotel booking and itinerary planning when they describe this particular dream.

Having the current generation of LLMs make material decisions on your behalf - like what to spend money on - is a really bad idea. They're too unreliable, but more importantly they are too gullible.

If you're going to arm your AI assistant with a credit card and set it loose on the world, you need to be confident that it's not going to hit "buy" on the first website that claims to offer the best bargains!

I'm confident that reliability is the reason we haven't seen LLM-powered agents that have taken off yet, despite the idea attracting a huge amount of buzz since right after ChatGPT first came out.

I would be very surprised if any of the models released over the next twelve months had enough of a reliability improvement to make this work. Solving gullibility is an astonishingly difficult problem.

(I had a particularly spicy rant about how stupid the idea of sending a "digital twin" to a meeting on your behalf is.)

One year: ... except for code and research assistants

There are two categories of "agent" that I do believe in, because they're proven to work already.

The first is coding assistants - where an LLM writes, executes and then refines computer code in a loop.

I first saw this pattern demonstrated by OpenAI with their Code Interpreter feature for ChatGPT, released back in March/April of 2023.

You can ask ChatGPT to solve a problem that can use Python code and it will write that Python, execute it in a secure sandbox (I think it's Kubernetes) and then use the output - or any error messages - to determine if the goal has been achieved.

It's a beautiful pattern that worked great with early 2023 models (I believe it first shipped using original GPT-4), and continues to work today.
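The shape of that loop is simple enough to sketch. This is my own minimal illustration, not how Code Interpreter is implemented: `generate` is a hypothetical stand-in for a call to a model, and a plain subprocess is used for brevity even though it is not a security sandbox.

```python
import subprocess
import sys
from typing import Callable, Optional


def run_python(source: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Run source in a fresh interpreter; return (succeeded, combined output).

    A subprocess is NOT a sandbox. A real system would isolate this step.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout} seconds"
    return proc.returncode == 0, proc.stdout + proc.stderr


def code_loop(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Write, execute, refine: feed any error output back into the next attempt."""
    feedback = None
    for _ in range(max_attempts):
        source = generate(task, feedback)
        succeeded, output = run_python(source)
        if succeeded:
            return output
        feedback = output  # the traceback becomes context for the retry
    return None


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical model stub: buggy first attempt, fixed once it sees an error."""
    if feedback is None:
        return "print(1 / 0)"
    return "print(sum(range(1, 11)))"


if __name__ == "__main__":
    print(code_loop("Add up the numbers 1 to 10", fake_model))
```

Swap `fake_model` for a real model call and the subprocess for a proper sandbox and the structure stays the same.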

Claude added their own version in October (Claude analysis, using JavaScript that runs in the browser), Mistral have it, Gemini has a version and there are dozens of other implementations of the same pattern.

The second category of agents that I believe in is research assistants - where an LLM can run multiple searches, gather information and aggregate that into an answer to a question or write a report.

Perplexity and ChatGPT Search have both been operating in this space for a while, but by far the most impressive implementation I've seen is Google Gemini's Deep Research tool, which I've had access to for a few weeks.

With Deep Research I can pose a question like this one:

Pillar Point Harbor is one of the largest communal brown pelican roosts on the west coast of North America.

find others

And Gemini will draft a plan, consult dozens of different websites via Google Search and then assemble a report (with all-important citations) describing what it found.

Here's the plan it came up with:

Pillar Point Harbor is one of the largest communal brown pelican roosts on the west coast of North America. Find other large communal brown pelican roosts on the west coast of North America.
(1) Find a list of brown pelican roosts on the west coast of North America.
(2) Find research papers or articles about brown pelican roosts and their size.
(3) Find information from birdwatching organizations or government agencies about brown pelican roosts.
(4) Compare the size of the roosts found in (3) to the size of the Pillar Point Harbor roost.
(5) Find any news articles or recent reports about brown pelican roosts and their populations.

It dug up a whole bunch of details, but the one I cared most about was these PDF results for the 2016-2019 Pacific Brown Pelican Survey conducted by the West Coast Audubon network and partners - a PDF that included this delightful list:

Top 10 Megaroosts (sites that traditionally host >500 pelicans) with average fall count numbers:

1. Alameda Breakwater, CA (3,183)
2. Pillar Point Harbor, CA (1,481)
3. East Sand Island, OR (1,121)
4. Ano Nuevo State Park, CA (1,068)
5. Salinas River mouth, CA (762)
6. Bolinas Lagoon, CA (755)
7. Morro Rock, CA (725)
8. Moss landing, CA (570)
9. Crescent City Harbor, CA (514)
10. Bird Rock Tomales, CA (514)

My local harbor is the second biggest megaroost!

It makes intuitive sense to me that this kind of research assistant can be built on our current generation of LLMs. They're competent at driving tools, they're capable of coming up with a relatively obvious research plan (look for newspaper articles and research papers) and they can synthesize sensible answers given the right collection of context gathered through search.
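Here's a sketch of that plan, search and synthesize structure. It's an illustration under my own assumptions and says nothing about how Deep Research actually works: `plan`, `search` and `synthesize` are hypothetical callables standing in for model and search engine calls, and the demo uses a tiny in-memory index with placeholder example.com URLs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    url: str
    text: str


def research(
    question: str,
    plan: Callable[[str], list[str]],
    search: Callable[[str], list[Source]],
    synthesize: Callable[[str, list[Source]], str],
    max_sources_per_step: int = 3,
) -> str:
    """Plan the research, run a search per step, then write a cited report."""
    collected: dict[str, Source] = {}
    for step in plan(question):
        for source in search(step)[:max_sources_per_step]:
            collected.setdefault(source.url, source)  # de-duplicate by URL
    sources = list(collected.values())
    answer = synthesize(question, sources)
    citations = "\n".join(f"[{i}] {s.url}" for i, s in enumerate(sources, 1))
    return f"{answer}\n\nSources:\n{citations}"


# Placeholder data so the sketch runs without a model or a search engine.
FAKE_INDEX = {
    "find a list of roosts": [
        Source("https://example.com/survey", "placeholder survey text"),
    ],
    "find roost sizes": [
        Source("https://example.com/survey", "placeholder survey text"),
        Source("https://example.com/counts", "placeholder count text"),
    ],
}


def fake_plan(question: str) -> list[str]:
    return list(FAKE_INDEX)


def fake_search(step: str) -> list[Source]:
    return FAKE_INDEX.get(step, [])


def fake_synthesize(question: str, sources: list[Source]) -> str:
    return f"Draft answer to {question!r} based on {len(sources)} sources."


if __name__ == "__main__":
    print(research("find other roosts", fake_plan, fake_search, fake_synthesize))
```

Keeping the citations attached to the sources that were actually fetched, rather than asking the model to recall them, is the design choice that matters most here.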

Google are particularly well suited to solving this problem: they have the world's largest search index and their Gemini model has a 2 million token context. I expect Deep Research to get a whole lot better, and I expect it to attract plenty of competition.

Three years: Someone wins a Pulitzer for AI-assisted investigative reporting

I went for a bit of a self-serving prediction here: I think within three years someone is going to win a Pulitzer prize for a piece of investigative reporting that was aided by generative AI tools.

Update: after publishing this piece I learned about this May 2024 story from Nieman Lab: For the first time, two Pulitzer winners disclosed using AI in their reporting. I think these were both examples of traditional machine learning as opposed to LLM-based generative AI, but this is yet another example of my predictions being less ambitious than I had thought!

I do not mean that an LLM will write the article! I continue to think that having LLMs write on your behalf is one of the least interesting applications of these tools.

I called this prediction self-serving because I want to help make this happen! My Datasette suite of open source tools for data journalism has been growing AI features, like LLM-powered data enrichments and extracting structured data into tables from unstructured text.

My dream is for those tools - or tools like them - to be used for an award winning piece of investigative reporting.

I picked three years for this because I think that's how long it will take for knowledge of how to responsibly and effectively use these tools to become widespread enough for that to happen.

LLMs are not an obvious fit for journalism: journalists look for the truth, and LLMs are notoriously prone to hallucination and making things up. But journalists are also really good at extracting useful information from potentially untrusted sources - that's a lot of what the craft of journalism is about.

The two areas I think LLMs are particularly relevant to journalism are:

Structured data extraction. If you have 10,000 PDFs from a successful Freedom of Information Act request, someone or something needs to kick off the process of reading through them to find the stories. LLMs are a fantastic way to take a vast amount of information and start making some element of sense from it. They can act as lead generators, helping identify the places to start looking more closely.
Coding assistance. Writing code to help analyze data is a huge part of modern data journalism - from SQL queries through data cleanup scripts, custom web scrapers or visualizations to help find signal among the noise. Most newspapers don't have a team of programmers on staff: I think within three years we'll have robust enough tools built around this pattern that non-programmer journalists will be able to use them as part of their reporting process.
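To make the lead-generator idea concrete, here is a minimal sketch of the extraction pattern using only Python's standard library. Everything in it is an assumption for illustration: `fake_extract` is a hypothetical regex stand-in for a real LLM call, the `leads` table and its `agency` and `amount` columns are invented, and the three sample documents are made up. It is not how Datasette or any particular tool implements this.

```python
import re
import sqlite3
from typing import Callable, Iterable


def fake_extract(text: str) -> dict:
    """Hypothetical stand-in for an LLM call.

    A real pipeline would send `text` to a model and parse its structured
    response. A regex keeps this sketch runnable without any API.
    """
    amount = re.search(r"\$([\d,]+)", text)
    agency = re.search(r"Agency:\s*(.+)", text)
    return {
        "agency": agency.group(1).strip() if agency else None,
        "amount": int(amount.group(1).replace(",", "")) if amount else None,
    }


def build_leads(
    docs: Iterable[tuple[str, str]], extract: Callable[[str], dict]
) -> sqlite3.Connection:
    """Extract fields from (doc_id, text) pairs into an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE leads (doc_id TEXT, agency TEXT, amount INTEGER)")
    for doc_id, text in docs:
        row = extract(text)
        db.execute(
            "INSERT INTO leads VALUES (?, ?, ?)",
            (doc_id, row["agency"], row["amount"]),
        )
    return db


def top_leads(db: sqlite3.Connection, limit: int = 5) -> list:
    """Largest amounts first: places to start reading, not findings."""
    return db.execute(
        "SELECT doc_id, agency, amount FROM leads "
        "WHERE amount IS NOT NULL ORDER BY amount DESC LIMIT ?",
        (limit,),
    ).fetchall()


# Made-up sample documents standing in for text pulled out of PDFs.
docs = [
    ("a.pdf", "Agency: Parks\nContract total $12,000 for mowing."),
    ("b.pdf", "Agency: Transit\nEmergency payment of $4,500,000 approved."),
    ("c.pdf", "Meeting minutes, no payments discussed."),
]
db = build_leads(docs, fake_extract)
leads = top_leads(db)
```

The extractor is passed in as a plain function so that a model-backed version could be swapped in later. Because model output can be wrong, the rows are best treated as pointers back to source documents that a journalist then verifies by hand.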

I hope to build some of these tools myself!

So my concrete prediction for three years is that someone wins a Pulitzer with a small amount of assistance from LLMs.

My more general prediction: within three years it won't be surprising at all to see most information professionals use LLMs as part of their daily workflow, in increasingly sophisticated ways. We'll know exactly what patterns work and how best to explain them to people. These skills will become widespread.

Three years part two: privacy laws with teeth

My other three-year prediction concerned privacy legislation.

The level of (often justified) paranoia around both targeted advertising and what happens to the data people paste into these models is a constantly growing problem.

I wrote recently about the inexterminable conspiracy theory that Apple target ads by spying through your phone's microphone. I've written in the past about the AI trust crisis, where people refuse to believe that models are not being trained on their inputs no matter how emphatically the companies behind them deny it.

I think the AI industry itself would benefit enormously from legislation that helps clarify what's going on with training on user-submitted data, and the wider tech industry could really do with harder rules around things like data retention and targeted advertising.

I don't expect the next four years of US federal government to be effective at passing legislation, but I expect we'll see privacy legislation with sharper teeth emerging at the state level or internationally. Let's just hope we don't end up with a new generation of cookie-consent banners as a result!

Six years utopian: amazing art

For six years I decided to go with two rival predictions, one optimistic and one pessimistic.

I think six years is long enough that we'll figure out how to harness this stuff to make some really great art.

I don't think generative AI for art - images, video and music - deserves nearly the same level of respect as a useful tool as text-based LLMs. Generative art tools are a lot of fun to try out but the lack of fine-grained control over the output greatly limits their utility outside of personal amusement or generating slop.

More importantly, they lack social acceptability. The vibes aren't good. Many talented artists have loudly rejected the idea of these tools, to the point that the very term "AI" is developing a distasteful connotation in society at large.

Image and video models are also ground zero for the AI training data ethics debate, and for good reason: no artist wants to see a model trained on their work without their permission that then directly competes with them!

I think six years is long enough for this whole thing to shake out - for society to figure out acceptable ways of using these tools to truly elevate human expression. What excites me is the idea of truly talented, visionary creative artists using whatever these tools have evolved into in six years to make meaningful art that could never have been achieved without them.

On the podcast I talked about Everything Everywhere All at Once, a film that deserved every one of its seven Oscars. The core visual effects team on that film was just five people. Imagine what a team like that could do with the generative AI tools we'll have in six years' time!

Since recording the podcast I learned from Swyx that Everything Everywhere All at Once used Runway ML as part of their toolset already:

Evan Halleck was on this team, and he used Runway's AI tools to save time and automate tedious aspects of editing. Specifically in the film’s rock scene, he used Runway’s rotoscoping tool to get a quick, clean cut of the rocks as sand and dust were moving around the shot. This translated days of work to a matter of minutes.

I said I thought a film that had used generative AI tools would win an Oscar within six years. Looks like I was eight years out on that one!

Six years dystopian: AGI/ASI causes mass civil unrest

My pessimistic alternative take for 2031 concerns "AGI" - a term which, like "agents", is constantly being redefined. The Information recently reported (see also The Verge) that Microsoft and OpenAI are now defining AGI as a system capable of generating $100bn in profit!

If we assume AGI is the point at which AI systems are capable of performing almost any job currently reserved for a human being it's hard not to see potentially negative consequences.

Sam Altman may have experimented with Universal Basic Income, but the USA is a country that can't even figure out universal healthcare! I have huge trouble imagining a future economy that works for the majority of people when the majority of jobs are being done by machines.

So my dystopian prediction for 2031 is that if that form of AGI has come to pass it will be accompanied by extraordinarily bad economic outcomes and mass civil unrest.

My version of an AI utopia is tools that augment existing humans. That's what we've had with LLMs so far, and my ideal is that those tools continue to improve and subsequently humans become able to take on more ambitious work.

If there's a version of AGI that results in that kind of utopia, I'm all for it.

My total lack of conviction

There's a reason I haven't made predictions like this before: my confidence in my ability to predict the future is almost non-existent. At least one of my predictions here already proved to be eight years late!

These predictions are in the public record now (I even submitted a pull request).

It's going to be interesting looking back at these in one, three and six years to see how I did.

Tags: data-journalism, predictions, ai, openai, generative-ai, llms, ai-assisted-programming, gemini, code-interpreter, oxide, ai-agents, deep-research, ai-assisted-search, coding-agents, agent-definitions

Quoting Sam Bowman

2023-04-05T03:32:11+00:00

Scaling laws allow us to precisely predict some coarse-but-useful measures of how capable future models will be as we scale them up along three dimensions: the amount of data they are fed, their size (measured in parameters), and the amount of computation used to train them (measured in FLOPs). [...] Our ability to make this kind of precise prediction is unusual in the history of software and unusual even in the history of modern AI research. It is also a powerful tool for driving investment since it allows R&D teams to propose model-training projects costing many millions of dollars, with reasonable confidence that these projects will succeed at producing economically valuable systems.

— Sam Bowman

Tags: predictions, ai, generative-ai, llms

Quoting Tim Bray

2008-01-03T13:08:59+00:00

The strain due to the fact that most business desktops are locked into the Microsoft platform, at a time when both the Apple and GNU/Linux alternatives are qualitatively safer, better, and cheaper to operate, will start to become impossible to ignore.

— Tim Bray

Tags: apple, linux, macos, microsoft, predictions, tim-bray, windows