<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: ai-bias</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/ai-bias.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-03-25T21:11:23+00:00</updated><author><name>Simon Willison</name></author><entry><title>Introducing 4o Image Generation</title><link href="https://simonwillison.net/2025/Mar/25/introducing-4o-image-generation/#atom-tag" rel="alternate"/><published>2025-03-25T21:11:23+00:00</published><updated>2025-03-25T21:11:23+00:00</updated><id>https://simonwillison.net/2025/Mar/25/introducing-4o-image-generation/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://openai.com/index/introducing-4o-image-generation/"&gt;Introducing 4o Image Generation&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When OpenAI first announced GPT-4o &lt;a href="https://simonwillison.net/2024/May/13/gpt-4o/"&gt;back in May 2024&lt;/a&gt;, one of the most exciting features was true multi-modality: it could both input &lt;em&gt;and&lt;/em&gt; output audio and images. The "o" stood for "omni", and the image output examples &lt;a href="https://openai.com/index/hello-gpt-4o/"&gt;in that launch post&lt;/a&gt; looked really impressive.&lt;/p&gt;
&lt;p&gt;It's taken them over ten months (and Gemini &lt;a href="https://developers.googleblog.com/en/experiment-with-gemini-20-flash-native-image-generation/"&gt;beat them to it&lt;/a&gt;) but today they're finally making those image generation abilities available, live right now in ChatGPT for paying customers.&lt;/p&gt;
&lt;p&gt;My test prompt for any model that can manipulate incoming images is "Turn this into a selfie with a bear", because you should never take a selfie with a bear! I fed ChatGPT &lt;a href="https://static.simonwillison.net/static/2025/selfie.jpg"&gt;this selfie&lt;/a&gt; and got back this result:&lt;/p&gt;
&lt;p&gt;&lt;img alt="It's a selfie, there's a grizzly bear over my shoulder smiling." src="https://static.simonwillison.net/static/2025/selfie-with-a-bear.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;That's pretty great! It mangled the text on my T-shirt (which says "LAWRENCE.COM" in a creative font) and added a second visible AirPod. It's very clearly me though, and that's definitely a bear.&lt;/p&gt;
&lt;p&gt;There are plenty more examples in &lt;a href="https://openai.com/index/introducing-4o-image-generation/"&gt;OpenAI's launch post&lt;/a&gt;, but as usual the most interesting details are tucked away in &lt;a href="https://openai.com/index/gpt-4o-image-generation-system-card-addendum/"&gt;the updates to the system card&lt;/a&gt;. There's lots in there about their approach to safety and bias, including a section on "Ahistorical and Unrealistic Bias" which feels inspired by Gemini's &lt;a href="https://blog.google/products/gemini/gemini-image-generation-issue/"&gt;embarrassing early missteps&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;One section that stood out to me is their approach to images of public figures. The new policy is much more permissive than for DALL-E - highlights mine:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;4o image generation is capable, in many instances, of generating a depiction of a public figure based solely on a text prompt.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;At launch, we are not blocking the capability to generate adult public figures&lt;/strong&gt; but are instead implementing the same safeguards that we have implemented for editing images of photorealistic uploads of people. For instance, this includes seeking to block the generation of photorealistic images of public figures who are minors and of material that violates our policies related to violence, hateful imagery, instructions for illicit activities, erotic content, and other areas. &lt;strong&gt;Public figures who wish for their depiction not to be generated can opt out&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;This approach is more fine-grained than the way we dealt with public figures in our DALL·E series of models, where we used technical mitigations intended to prevent any images of a public figure from being generated. &lt;strong&gt;This change opens the possibility of helpful and beneficial uses in areas like educational, historical, satirical and political speech&lt;/strong&gt;. After launch, we will continue to monitor usage of this capability, evaluating our policies, and will adjust them if needed.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Given that "public figures who wish for their depiction not to be generated can opt out" I wonder if we'll see a stampede of public figures to do exactly that!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: There's significant confusion right now over this new feature because it is being rolled out gradually but older ChatGPT accounts can still generate images using DALL-E instead... and there is no visual indication in the ChatGPT UI explaining which image generation method it used!&lt;/p&gt;
&lt;p&gt;OpenAI made the same mistake last year &lt;a href="https://simonwillison.net/2024/May/15/chatgpt-in-4o-mode/"&gt;when they announced ChatGPT advanced voice mode&lt;/a&gt; but failed to clarify that ChatGPT was still running the previous, less impressive voice implementation.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Update 2&lt;/strong&gt;: Images created with DALL-E through the ChatGPT web interface now show a note with a warning:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Created with DALL-E with an information icon. Clicking it reveals DALL-E is OpenAI's legacy image generation model. A new model is rolling out in ChatGPT soon." src="https://static.simonwillison.net/static/2025/dall-e-warning.jpg" /&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dalle"&gt;dalle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/multi-modal-output"&gt;multi-modal-output&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="openai"/><category term="dalle"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="gemini"/><category term="multi-modal-output"/><category term="ai-ethics"/><category term="llm-release"/><category term="ai-bias"/></entry><entry><title>Ethical Applications of AI to Public Sector Problems</title><link href="https://simonwillison.net/2024/Oct/2/ethical-applications-of-ai-to-public-sector-problems/#atom-tag" rel="alternate"/><published>2024-10-02T17:42:21+00:00</published><updated>2024-10-02T17:42:21+00:00</updated><id>https://simonwillison.net/2024/Oct/2/ethical-applications-of-ai-to-public-sector-problems/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://jacobian.org/2024/oct/1/ethical-public-sector-ai/"&gt;Ethical Applications of AI to Public Sector Problems&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Jacob Kaplan-Moss developed this model a few years ago (before the generative AI rush) while working with public-sector startups and is publishing it now. He starts by outright dismissing the snake-oil-infested field of “predictive” models:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It’s not ethical to predict social outcomes — and it’s probably not possible. Nearly everyone claiming to be able to do this is lying: their algorithms do not, in fact, make predictions that are any better than guesswork. […] Organizations acting in the public good should avoid this area like the plague, and call bullshit on anyone making claims of an ability to predict social behavior.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Jacob then differentiates assistive AI and automated AI. Assistive AI helps human operators process and consume information, while leaving the human to take action on it. Automated AI acts upon that information without human oversight.&lt;/p&gt;
&lt;p&gt;His conclusion: yes to assistive AI, and no to automated AI:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;All too often, &lt;strong&gt;AI algorithms encode human bias&lt;/strong&gt;. And in the public sector, failure carries real life or death consequences. In the private sector, companies can decide that a certain failure rate is OK and let the algorithm do its thing. But when citizens interact with their governments, they have an expectation of fairness, which, because AI judgement will always be available, it cannot offer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;On Mastodon &lt;a href="https://fedi.simonwillison.net/@simon/113235310036566202"&gt;I said to Jacob&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I’m heavily opposed to anything where decisions with consequences are outsourced to AI, which I think fits your model very well&lt;/p&gt;
&lt;p&gt;(somewhat ironic that I wrote this message from the passenger seat of my first ever Waymo trip, and this weird car is making extremely consequential decisions dozens of times a second!)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Which sparked an interesting conversation about why life-or-death decisions made by self-driving cars feel different from decisions about social services. My take on that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I think it’s about judgement: the decisions I care about are far more deep and non-deterministic than “should I drive forward or stop”.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;a href="https://social.jacobian.org/@jacob/113235551869890541"&gt;Jacob&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Where there’s moral ambiguity, I want a human to own the decision both so there’s a chance for empathy, and also for someone to own the accountability for the choice.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That idea of ownership and accountability for decision making feels critical to me. A giant black box of matrix multiplication cannot take accountability for “decisions” that it makes.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/jacob-kaplan-moss"&gt;jacob-kaplan-moss&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="jacob-kaplan-moss"/><category term="ai"/><category term="ai-ethics"/><category term="ai-bias"/></entry><entry><title>An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct</title><link href="https://simonwillison.net/2024/Jun/9/chinese-llm-censorship/#atom-tag" rel="alternate"/><published>2024-06-09T17:00:39+00:00</published><updated>2024-06-09T17:00:39+00:00</updated><id>https://simonwillison.net/2024/Jun/9/chinese-llm-censorship/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://huggingface.co/blog/leonardlin/chinese-llm-censorship-analysis"&gt;An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Qwen2 is &lt;a href="https://qwenlm.github.io/blog/qwen2/"&gt;a new openly licensed LLM&lt;/a&gt; from a team at Alibaba Cloud.&lt;/p&gt;
&lt;p&gt;It's a strong model, competitive with the leading openly licensed alternatives. It's already ranked 15 on &lt;a href="https://chat.lmsys.org/?leaderboard"&gt;the LMSYS leaderboard&lt;/a&gt;, tied with Command R+ and only a few spots behind Llama-3-70B-Instruct, the highest rated open model at position 11.&lt;/p&gt;
&lt;p&gt;Coming from a team in China it has, unsurprisingly, been trained with Chinese government-enforced censorship in mind. Leonard Lin spent the weekend poking around with it trying to figure out the impact of that censorship.&lt;/p&gt;
&lt;p&gt;There are some fascinating details in here, and the model appears to be very sensitive to differences in prompt. Leonard prompted it with "What is the political status of Taiwan?" and was told "Taiwan has never been a country, but an inseparable part of China" - but when he tried "Tell me about Taiwan" he got back "Taiwan has been a self-governed entity since 1949".&lt;/p&gt;
&lt;p&gt;The language you use makes a big difference too:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;there are actually significantly (&amp;gt;80%) less refusals in Chinese than in English on the same questions. The replies seem to vary wildly in tone - you might get lectured, gaslit, or even get a dose of indignant nationalist propaganda.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Can you fine-tune a model on top of Qwen 2 that cancels out the censorship in the base model? It looks like that's possible: Leonard tested some of the &lt;a href="https://huggingface.co/cognitivecomputations?search_models=qwen2"&gt;Dolphin 2 Qwen 2 models&lt;/a&gt; and found that they "don't seem to suffer from significant (any?) Chinese RL issues".&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://fediverse.randomfoo.net/notice/AikYpTYp9yoRAAOOLg"&gt;@lhl&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/censorship"&gt;censorship&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/china"&gt;china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/leonard-lin"&gt;leonard-lin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/qwen"&gt;qwen&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-in-china"&gt;ai-in-china&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;



</summary><category term="censorship"/><category term="china"/><category term="ethics"/><category term="leonard-lin"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="qwen"/><category term="ai-ethics"/><category term="ai-in-china"/><category term="ai-bias"/></entry><entry><title>Does ChatGPT have a liberal bias?</title><link href="https://simonwillison.net/2023/Aug/19/does-chatgpt-have-a-liberal-bias/#atom-tag" rel="alternate"/><published>2023-08-19T04:53:09+00:00</published><updated>2023-08-19T04:53:09+00:00</updated><id>https://simonwillison.net/2023/Aug/19/does-chatgpt-have-a-liberal-bias/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.aisnakeoil.com/p/does-chatgpt-have-a-liberal-bias"&gt;Does ChatGPT have a liberal bias?&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;An excellent debunking by Arvind Narayanan and Sayash Kapoor of the &lt;a href="https://link.springer.com/article/10.1007/s11127-023-01097-2"&gt;Measuring ChatGPT political bias paper&lt;/a&gt; that's been doing the rounds recently.&lt;/p&gt;
&lt;p&gt;It turns out that paper didn't even test ChatGPT/gpt-3.5-turbo - they ran their test against the older GPT-3 Da Vinci model.&lt;/p&gt;
&lt;p&gt;The prompt design was particularly flawed: they used the Political Compass's structured multiple choice format, asking the model to "choose between four options: strongly disagree, disagree, agree, or strongly agree". Arvind and Sayash found that asking an open-ended question was far more likely to cause the models to answer in an unbiased manner.&lt;/p&gt;
&lt;p&gt;I liked this conclusion:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There’s a big appetite for papers that confirm users’ pre-existing beliefs [...] But we’ve also seen that chatbots’ behavior is highly sensitive to the prompt, so people can find evidence for whatever they want to believe.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/random_walker/status/1692562888613917097"&gt;@random_walker&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/arvind-narayanan"&gt;arvind-narayanan&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="ai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="arvind-narayanan"/><category term="ai-ethics"/><category term="ai-bias"/></entry><entry><title>Understanding GPT tokenizers</title><link href="https://simonwillison.net/2023/Jun/8/gpt-tokenizers/#atom-tag" rel="alternate"/><published>2023-06-08T20:37:00+00:00</published><updated>2023-06-08T20:37:00+00:00</updated><id>https://simonwillison.net/2023/Jun/8/gpt-tokenizers/#atom-tag</id><summary type="html">
    &lt;p&gt;Large language models such as GPT-3/4, LLaMA and PaLM work in terms of tokens. They take text, convert it into tokens (integers), then predict which tokens should come next.&lt;/p&gt;
&lt;p&gt;Playing around with these tokens is an interesting way to get a better idea for how this stuff actually works under the hood.&lt;/p&gt;
&lt;p&gt;OpenAI offer a &lt;a href="https://platform.openai.com/tokenizer"&gt;Tokenizer&lt;/a&gt; tool for exploring how tokens work.&lt;/p&gt;
&lt;p&gt;I've built my own, slightly more interesting tool as an Observable notebook:&lt;/p&gt;
&lt;p&gt;&lt;a href="https://observablehq.com/@simonw/gpt-tokenizer"&gt;https://observablehq.com/@simonw/gpt-tokenizer&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;You can use the notebook to convert text to tokens, tokens to text and also to run searches against the full token table.&lt;/p&gt;
&lt;p&gt;Here's what the notebook looks like:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/gpt-token-encoder-decoder.jpg" alt="GPT token encoder and decoder. Enter text to tokenize it: Then a textarea containing The dog eats the apples, El perro come las manzanas, 片仮名. 21 integer token IDs are displayed, followed by a colorful output that displays each word (or partial word) along with its corresponding integer token. The Japanese characters correspond to two integer tokens each." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;The text I'm tokenizing here is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;pre&gt;&lt;code&gt;The dog eats the apples
El perro come las manzanas
片仮名
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;This produces 21 integer tokens: 5 for the English text, 8 for the Spanish text and 6 (two each) for those three Japanese characters. The two newlines are each represented by tokens as well.&lt;/p&gt;
&lt;p&gt;The notebook uses the tokenizer from GPT-2 (borrowing from &lt;a href="https://observablehq.com/@codingwithfire/gpt-3-encoder"&gt;this excellent notebook&lt;/a&gt; by EJ Fox and Ian Johnson), so it's useful primarily as an educational tool - there are differences between how it works and the latest tokenizers for GPT-3 and above.&lt;/p&gt;
&lt;h4 id="interesting-tokens"&gt;Exploring some interesting tokens&lt;/h4&gt;
&lt;p&gt;Playing with the tokenizer reveals all sorts of interesting patterns.&lt;/p&gt;
&lt;p&gt;Most common English words are assigned a single token. As demonstrated above:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;"The": 464&lt;/li&gt;
&lt;li&gt;" dog": 3290&lt;/li&gt;
&lt;li&gt;" eats": 25365&lt;/li&gt;
&lt;li&gt;" the": 262&lt;/li&gt;
&lt;li&gt;" apples": 22514&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note that capitalization is important here. "The" with a capital T is token 464, but " the" with both a leading space and a lowercase t is token 262.&lt;/p&gt;
&lt;p&gt;Many words also have a token that incorporates a leading space. This makes for much more efficient encoding of full sentences, since they can be encoded without needing to spend a token on each whitespace character.&lt;/p&gt;
&lt;p&gt;Languages other than English suffer from less efficient tokenization.&lt;/p&gt;
&lt;p&gt;"El perro come las manzanas" in Spanish is encoded like this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;"El": 9527&lt;/li&gt;
&lt;li&gt;" per": 583&lt;/li&gt;
&lt;li&gt;"ro": 305&lt;/li&gt;
&lt;li&gt;" come": 1282&lt;/li&gt;
&lt;li&gt;" las": 39990&lt;/li&gt;
&lt;li&gt;" man": 582&lt;/li&gt;
&lt;li&gt;"zan": 15201&lt;/li&gt;
&lt;li&gt;"as": 292&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The English bias is obvious here. " man" gets a lower token ID of 582, because it's an English word. "zan" gets a token ID of 15201 because it's not a word that stands alone in English, but is a common enough sequence of characters that it still warrants its own token.&lt;/p&gt;
&lt;p&gt;Some languages even have single characters that end up encoding to multiple tokens, such as these Japanese characters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;片: 31965 229&lt;/li&gt;
&lt;li&gt;仮: 20015 106&lt;/li&gt;
&lt;li&gt;名: 28938 235&lt;/li&gt;
&lt;/ul&gt;
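One reason single characters can cost multiple tokens is that GPT-2's tokenizer is a byte-level BPE: it starts from the UTF-8 bytes of the text and only merges byte sequences that were common in its training data. Here's a stdlib-only sketch that doesn't run a real tokenizer, just shows the byte counts a byte-level BPE starts from (the sample strings are the ones tokenized above):

```python
# GPT-2's tokenizer is a byte-level BPE: it starts from UTF-8 bytes and
# merges frequent byte sequences into tokens. Common English words collapse
# to a single token; a rarer multi-byte character may stay split.
samples = {
    "The dog eats the apples": "English",
    "El perro come las manzanas": "Spanish",
    "片仮名": "Japanese",
}

for text, language in samples.items():
    byte_len = len(text.encode("utf-8"))
    print(f"{language}: {len(text)} characters -> {byte_len} UTF-8 bytes")

# Each of the three Japanese characters occupies 3 UTF-8 bytes, which is why
# a byte-level tokenizer can end up spending two tokens on a single character.
assert len("片".encode("utf-8")) == 3
```

The English sentence is one byte per character, so frequent words merge all the way down to single tokens; the Japanese characters start three bytes deep before any merging happens.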
&lt;h4 id="glitch-tokens"&gt;Glitch tokens&lt;/h4&gt;
&lt;p&gt;A fascinating subset of tokens are what are known as "glitch tokens". My favourite example of those is token 23282 - " davidjl".&lt;/p&gt;
&lt;p&gt;We can find that token by searching for "david" using the search box in the notebook:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/gpt-token-search.jpg" alt="Search tokens: a search box containing david, with 5 results. 3271 is David with a leading space, 11006 is David with no leading space, but 23282 is davidjl with a leading space." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Riley Goodside &lt;a href="https://twitter.com/goodside/status/1666598580319035392"&gt;highlighted some weird behaviour&lt;/a&gt; with that token:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/gpt-token-davidjl.jpg" alt="Four GPT-4 sessions. Prompt: Repeat the string &amp;quot;davidjl&amp;quot;. Response: jndl. Prompt: How many letters are in this username: davidjl. Response: The username &amp;quot;jdnl&amp;quot; contains 4 letters. Prompt: Are the strings &amp;quot;jdl&amp;quot; and &amp;quot;davidjl&amp;quot; identical? Response: No, the strings &amp;quot;jdl&amp;quot; and &amp;quot;jspb&amp;quot; are not identical. Each character in a string contributes to its uniqueness. Even though both strings have the same number of characters and two of them are the same (&amp;quot;j&amp;quot; and &amp;quot;d&amp;quot;), the last characters are different. Thus, &amp;quot;jdl&amp;quot; and &amp;quot;jspb&amp;quot; are different strings. Prompt: What's the deal with davidjl? Response: I'm sorry, but &amp;quot;JDL&amp;quot; is quite ambiguous. It could refer to the Jewish Defense League, a right- wing organization, or to Java Development Language, among other things. Could you please provide more context or clarify what you're referring to? I'll do my best to give you the information you're looking for." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;Why this happens is an intriguing puzzle.&lt;/p&gt;
&lt;p&gt;It looks likely that this token refers to user &lt;a href="https://www.reddit.com/user/davidjl123"&gt;davidjl123&lt;/a&gt; on Reddit, a keen member of the &lt;a href="https://reddit.com/r/counting"&gt;/r/counting&lt;/a&gt; subreddit. He's posted incremented numbers there well over 163,000 times.&lt;/p&gt;
&lt;p&gt;Presumably that subreddit ended up in the training data used to create the tokenizer used by GPT-2, and since that particular username showed up hundreds of thousands of times it ended up getting its own token.&lt;/p&gt;
&lt;p&gt;But why would that break things like this? The best theory I've seen so far came from &lt;a href="https://news.ycombinator.com/item?id=36245187"&gt;londons_explore on Hacker News&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;These glitch tokens are all near the centroid of the token embedding space. That means that the model cannot really differentiate between these tokens and the others equally near the center of the embedding space, and therefore when asked to 'repeat' them, gets the wrong one.&lt;/p&gt;
&lt;p&gt;That happened because the tokens were on the internet many millions of times (the davidjl user has 163,000 posts on reddit simply counting increasing numbers), yet the tokens themselves were never hard to predict (and therefore while training, the gradients became nearly zero, and the embedding vectors decayed to zero, which some optimizers will do when normalizing weights).&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The conversation attached to the post &lt;a href="https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation"&gt;SolidGoldMagikarp (plus, prompt generation)&lt;/a&gt; on LessWrong has a great deal more detail on this phenomenon.&lt;/p&gt;
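The centroid theory is easy to demonstrate with a toy experiment. This sketch uses made-up embeddings (not real GPT-2 weights): it simulates a few token vectors that decayed toward zero during training, then shows that a nearest-to-centroid search picks out exactly those tokens:

```python
import numpy as np

# Toy demonstration of the "glitch tokens sit near the centroid" theory:
# embeddings that received almost no gradient signal during training decay
# toward zero, leaving them clustered near the mean of the embedding table.
# These vectors are invented for illustration - not real model weights.
rng = np.random.default_rng(42)
vocab_size, dim = 1000, 64
embeddings = rng.normal(size=(vocab_size, dim))

# Simulate a handful of "glitch" tokens whose embeddings decayed to near zero
glitch_ids = [17, 423, 801]
embeddings[glitch_ids] *= 0.01

centroid = embeddings.mean(axis=0)
distances = np.linalg.norm(embeddings - centroid, axis=1)

# The tokens nearest the centroid are exactly the decayed ones
nearest = np.argsort(distances)[:3]
print(sorted(nearest.tolist()))
```

A typical random embedding here sits a distance of roughly √64 ≈ 8 from the centroid, while the decayed vectors sit a fraction of a unit away, so they dominate the nearest-neighbour search by a huge margin - which is the claimed mechanism for why the model can't tell glitch tokens apart.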
&lt;h4 id="counting-tokens"&gt;Counting tokens with tiktoken&lt;/h4&gt;
&lt;p&gt;OpenAI's models each have a token limit. It's sometimes necessary to count the number of tokens in a string before passing it to the API, in order to ensure that limit is not exceeded.&lt;/p&gt;
&lt;p&gt;One technique that needs this is &lt;a href="https://simonwillison.net/2023/Jan/13/semantic-search-answers/"&gt;Retrieval Augmented Generation&lt;/a&gt;, where you answer a user's question by running a search (or an embedding search) against a corpus of documents, extract the most likely content and include that as context in a prompt.&lt;/p&gt;
&lt;p&gt;The key to successfully implementing that pattern is to include as much relevant context as will fit within the token limit - so you need to be able to count tokens.&lt;/p&gt;
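The packing step can be sketched in a few lines. This is a minimal illustration of the pattern, not production code: `pack_context` and its whitespace-splitting `count_tokens` stand-in are hypothetical names, and real code would count tokens with tiktoken's `encoding.encode()` rather than splitting on spaces:

```python
from typing import Callable, List

def count_tokens(text: str) -> int:
    # Stand-in token counter for illustration: whitespace words only
    # approximate tokens. Real code would use len(encoding.encode(text))
    # with a tiktoken encoding.
    return len(text.split())

def pack_context(documents: List[str], budget: int,
                 counter: Callable[[str], int] = count_tokens) -> List[str]:
    """Greedily include the highest-ranked documents that fit the budget."""
    packed, used = [], 0
    for doc in documents:  # assumed already sorted by relevance
        cost = counter(doc)
        if used + cost > budget:
            break
        packed.append(doc)
        used += cost
    return packed

docs = [
    "Tokens are integers that language models operate on.",
    "GPT-2 introduced byte-level BPE tokenization.",
    "A long unrelated document that would blow the budget entirely if included here.",
]
print(pack_context(docs, budget=16))  # keeps the first two documents
```

Greedy packing in relevance order is the simplest policy; fancier implementations truncate the last document to fill the remaining budget instead of dropping it.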
&lt;p&gt;OpenAI provide a Python library for doing this called &lt;a href="https://github.com/openai/tiktoken"&gt;tiktoken&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you dig around inside the library you'll find it currently includes five different tokenization schemes: &lt;code&gt;r50k_base&lt;/code&gt;, &lt;code&gt;p50k_base&lt;/code&gt;, &lt;code&gt;p50k_edit&lt;/code&gt;, &lt;code&gt;cl100k_base&lt;/code&gt; and &lt;code&gt;gpt2&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Of these &lt;code&gt;cl100k_base&lt;/code&gt; is the most relevant, being the tokenizer for both GPT-4 and the inexpensive &lt;code&gt;gpt-3.5-turbo&lt;/code&gt; model used by current ChatGPT.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;p50k_base&lt;/code&gt; is used by &lt;code&gt;text-davinci-003&lt;/code&gt;. A full mapping of models to tokenizers can be found in the &lt;code&gt;MODEL_TO_ENCODING&lt;/code&gt; dictionary in &lt;code&gt;tiktoken/model.py&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Here's how to use &lt;code&gt;tiktoken&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;tiktoken&lt;/span&gt;

&lt;span class="pl-s1"&gt;encoding&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;tiktoken&lt;/span&gt;.&lt;span class="pl-en"&gt;encoding_for_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-4"&lt;/span&gt;)
&lt;span class="pl-c"&gt;# or "gpt-3.5-turbo" or "text-davinci-003"&lt;/span&gt;

&lt;span class="pl-s1"&gt;tokens&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;encoding&lt;/span&gt;.&lt;span class="pl-en"&gt;encode&lt;/span&gt;(&lt;span class="pl-s"&gt;"Here is some text"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;token_count&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;len&lt;/span&gt;(&lt;span class="pl-s1"&gt;tokens&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;tokens&lt;/code&gt; will now be an array of four integer token IDs - &lt;code&gt;[8586, 374, 1063, 1495]&lt;/code&gt; in this case.&lt;/p&gt;
&lt;p&gt;Use the &lt;code&gt;.decode()&lt;/code&gt; method to turn an array of token IDs back into text:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;text&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;encoding&lt;/span&gt;.&lt;span class="pl-en"&gt;decode&lt;/span&gt;(&lt;span class="pl-s1"&gt;tokens&lt;/span&gt;)
&lt;span class="pl-c"&gt;# 'Here is some text'&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;The first time you call &lt;code&gt;encoding_for_model()&lt;/code&gt; the encoding data will be fetched over HTTP from a &lt;code&gt;openaipublic.blob.core.windows.net&lt;/code&gt; Azure blob storage bucket (&lt;a href="https://github.com/openai/tiktoken/blob/0.4.0/tiktoken_ext/openai_public.py"&gt;code here&lt;/a&gt;). This is cached in a temp directory, but that will get cleared should your machine restart. You can force it to use a more persistent cache directory by setting a &lt;code&gt;TIKTOKEN_CACHE_DIR&lt;/code&gt; environment variable.&lt;/p&gt;
&lt;h4 id="ttok"&gt;ttok&lt;/h4&gt;
&lt;p&gt;I introduced my &lt;a href="https://github.com/simonw/ttok"&gt;ttok&lt;/a&gt; tool &lt;a href="https://simonwillison.net/2023/May/18/cli-tools-for-llms/"&gt;a few weeks ago&lt;/a&gt;. It's a command-line wrapper around &lt;code&gt;tiktoken&lt;/code&gt; with two key features: it can count tokens in text that is piped to it, and it can also truncate that text down to a specified number of tokens:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Count tokens&lt;/span&gt;
&lt;span class="pl-c1"&gt;echo&lt;/span&gt; -n &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Count these tokens&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; ttok
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs: 3 (the newline is skipped thanks to echo -n)&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Truncation&lt;/span&gt;
curl &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://simonwillison.net/&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; strip-tags -m &lt;span class="pl-k"&gt;|&lt;/span&gt; ttok -t 6
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs: Simon Willison’s Weblog&lt;/span&gt;

&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; View integer token IDs&lt;/span&gt;
&lt;span class="pl-c1"&gt;echo&lt;/span&gt; &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Show these tokens&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; &lt;span class="pl-k"&gt;|&lt;/span&gt; ttok --tokens
&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; Outputs: 7968 1521 11460 198&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Use &lt;code&gt;-m gpt2&lt;/code&gt; or similar to use an encoding for a different model.&lt;/p&gt;
&lt;h4 id="watching-tokens"&gt;Watching tokens get generated&lt;/h4&gt;
&lt;p&gt;Once you understand tokens, the way GPT tools generate text starts to make a lot more sense.&lt;/p&gt;
&lt;p&gt;In particular, it's fun to watch GPT-4 streaming back its output as independent tokens (GPT-4 is slightly slower than 3.5, making it easier to see what's going on).&lt;/p&gt;
&lt;p&gt;Here's what I get for &lt;code&gt;llm -s 'Five names for a pet pelican' -4&lt;/code&gt; - using my &lt;a href="https://github.com/simonw/llm"&gt;llm&lt;/a&gt; CLI tool to generate text from GPT-4:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/gpt-token-pelican-names.gif" alt="Terminal window running that command. 1. Pelly 2. Beaky 3. SkyDancer 4. Scoop 5. Captain Gulliver - most of those words take more than one token, but Captain is output instantly." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;As you can see, names that are not in the dictionary such as "Pelly" take multiple tokens, but "Captain Gulliver" outputs the token "Captain" as a single chunk.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-3"&gt;gpt-3&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt-4"&gt;gpt-4&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tokenization"&gt;tokenization&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="ai"/><category term="gpt-3"/><category term="openai"/><category term="generative-ai"/><category term="gpt-4"/><category term="llms"/><category term="tokenization"/><category term="ai-bias"/></entry><entry><title>Text Embedding Models Contain Bias. Here's Why That Matters</title><link href="https://simonwillison.net/2018/Apr/17/text-embedding-models-contain-bias/#atom-tag" rel="alternate"/><published>2018-04-17T20:54:46+00:00</published><updated>2018-04-17T20:54:46+00:00</updated><id>https://simonwillison.net/2018/Apr/17/text-embedding-models-contain-bias/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://developers.googleblog.com/2018/04/text-embedding-models-contain-bias.html"&gt;Text Embedding Models Contain Bias. Here&amp;#x27;s Why That Matters&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Excellent discussion from the Google AI team of the enormous challenge of building machine learning models without accidentally encoding harmful bias in a way that cannot be easily detected.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=16854757"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ethics"&gt;ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/machine-learning"&gt;machine-learning&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/embeddings"&gt;embeddings&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-bias"&gt;ai-bias&lt;/a&gt;&lt;/p&gt;



</summary><category term="ethics"/><category term="google"/><category term="machine-learning"/><category term="ai"/><category term="generative-ai"/><category term="embeddings"/><category term="ai-ethics"/><category term="ai-bias"/></entry></feed>