<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: digital-literacy</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/digital-literacy.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2025-09-07T21:45:04+00:00</updated><author><name>Simon Willison</name></author><entry><title>Is the LLM response wrong, or have you just failed to iterate it?</title><link href="https://simonwillison.net/2025/Sep/7/is-the-llm-response-wrong-or-have-you-just-failed-to-iterate-it/#atom-tag" rel="alternate"/><published>2025-09-07T21:45:04+00:00</published><updated>2025-09-07T21:45:04+00:00</updated><id>https://simonwillison.net/2025/Sep/7/is-the-llm-response-wrong-or-have-you-just-failed-to-iterate-it/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://mikecaulfield.substack.com/p/is-the-llm-response-wrong-or-have"&gt;Is the LLM response wrong, or have you just failed to iterate it?&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;More from Mike Caulfield (see also &lt;a href="https://simonwillison.net/2025/Sep/7/the-sift-method/"&gt;the SIFT method&lt;/a&gt;). He starts with a &lt;em&gt;fantastic&lt;/em&gt; example of Google's &lt;a href="https://simonwillison.net/2025/Sep/7/ai-mode/"&gt;AI mode&lt;/a&gt; usually correctly handling a common piece of misinformation but occasionally falling for it (the curse of non-deterministic systems), then shows an example of what he calls a "sorting prompt" as a follow-up:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;What is the evidence for and against this being a real photo of Shirley Slade?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The response starts with a non-committal "there is compelling evidence for and against...", then by the end has firmly convinced itself that the photo is indeed a fake. It reads like a fact-checking variant of "think step by step".&lt;/p&gt;
&lt;p&gt;Mike neatly describes a problem I've also observed recently where "hallucination" is frequently mis-applied as meaning any time a model makes a mistake:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The term hallucination has become nearly worthless in the LLM discourse. It initially described a very weird, mostly non-humanlike behavior where LLMs would make up things out of whole cloth that did not seem to exist as claims referenced any known source material or claims inferable from any known source material. Hallucinations as stuff made up out of nothing. Subsequently people began calling any error or imperfect summary a hallucination, rendering the term worthless.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In this example the initial incorrect answers were not hallucinations: they correctly summarized online content that contained misinformation. The trick then is to encourage the model to look further, using "sorting prompts" like these:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Facts and misconceptions and hype about what I posted&lt;/li&gt;
&lt;li&gt;What is the evidence for and against the claim I posted&lt;/li&gt;
&lt;li&gt;Look at the most recent information on this issue, summarize how it shifts the analysis (if at all), and provide link to the latest info&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;I appreciated this closing footnote:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Should platforms have more features to nudge users to this sort of iteration? Yes. They should. Getting people to iterate investigation rather than argue with LLMs would be a good first step out of this mess that the chatbot model has created.&lt;/p&gt;
&lt;/blockquote&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://bsky.app/profile/mikecaulfield.bsky.social/post/3lya2nv7xi226"&gt;@mikecaulfield.bsky.social&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/hallucinations"&gt;hallucinations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-ethics"/><category term="ai-assisted-search"/><category term="hallucinations"/><category term="digital-literacy"/></entry><entry><title>The SIFT method</title><link href="https://simonwillison.net/2025/Sep/7/the-sift-method/#atom-tag" rel="alternate"/><published>2025-09-07T20:51:31+00:00</published><updated>2025-09-07T20:51:31+00:00</updated><id>https://simonwillison.net/2025/Sep/7/the-sift-method/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://guides.lib.uchicago.edu/c.php?g=1241077&amp;amp;p=9082322"&gt;The SIFT method&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The SIFT method is "an evaluation strategy developed by digital literacy expert, Mike Caulfield, to help determine whether online content can be trusted for credible or reliable sources of information."&lt;/p&gt;
&lt;p&gt;This looks &lt;em&gt;extremely&lt;/em&gt; useful as a framework for helping people more effectively consume information online (increasingly gathered with &lt;a href="https://simonwillison.net/tags/ai-assisted-search/"&gt;the help of LLMs&lt;/a&gt;).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stop&lt;/strong&gt;. "Be aware of your emotional response to the headline or information in the article" to protect against clickbait, and don't read further or share until you've applied the other three steps.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Investigate the Source&lt;/strong&gt;. Apply &lt;a href="https://pressbooks.pub/webliteracy/chapter/what-reading-laterally-means/"&gt;lateral reading&lt;/a&gt;, checking what others say about the source rather than just trusting their "about" page.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Find Better Coverage&lt;/strong&gt;. "Use lateral reading to see if you can find other sources corroborating the same information or disputing it" and consult trusted fact checkers if necessary.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Trace Claims, Quotes, and Media to their Original Context&lt;/strong&gt;. Try to find the original report or referenced material to learn more and check it isn't being represented out of context.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This framework really resonates with me: it formally captures and improves on a bunch of informal techniques I've tried to apply in my own work.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://bsky.app/profile/anildash.com/post/3lyavuu6ku22r"&gt;@anildash.com&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/research"&gt;research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="blogging"/><category term="research"/><category term="ai-assisted-search"/><category term="digital-literacy"/></entry><entry><title>Tips on prompting ChatGPT for UK technology secretary Peter Kyle</title><link href="https://simonwillison.net/2025/Jun/3/tips-for-peter-kyle/#atom-tag" rel="alternate"/><published>2025-06-03T19:08:57+00:00</published><updated>2025-06-03T19:08:57+00:00</updated><id>https://simonwillison.net/2025/Jun/3/tips-for-peter-kyle/#atom-tag</id><summary type="html">
    &lt;p&gt;Back in March &lt;a href="https://www.newscientist.com/article/2472068-revealed-how-the-uk-tech-secretary-uses-chatgpt-for-policy-advice/"&gt;New Scientist reported on&lt;/a&gt; a successful Freedom of Information request they had filed requesting UK Secretary of State for Science, Innovation and Technology &lt;a href="https://en.wikipedia.org/wiki/Peter_Kyle"&gt;Peter Kyle's&lt;/a&gt; ChatGPT logs:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;New Scientist has obtained records of Kyle’s ChatGPT use under the Freedom of Information (FOI) Act, in what is believed to be a world-first test of whether chatbot interactions are subject to such laws.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What a fascinating precedent this could set!&lt;/p&gt;
&lt;p&gt;They picked out some highlights they thought were particularly newsworthy. Personally I'd have loved to see that raw data to accompany the story.&lt;/p&gt;
&lt;h4 id="a-good-example-of-a-poorly-considered-prompt"&gt;A good example of a poorly considered prompt&lt;/h4&gt;
&lt;p&gt;Among the questions Kyle asked of ChatGPT was this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Why is AI adoption so slow in the UK small and medium business community?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(I pinged the New Scientist reporter, Chris Stokel-Walker, to confirm the exact wording here.)&lt;/p&gt;
&lt;p&gt;This provides an irresistible example of the "jagged frontier" of LLMs in action. LLMs are great at some things, terrible at others and the difference between the two is often not obvious at all.&lt;/p&gt;
&lt;p&gt;Experienced prompters will no doubt have the same reaction I did: that's not going to give an accurate response! It's worth digging into why those of us with a firmly developed sense of intuition around LLMs would jump straight to that conclusion.&lt;/p&gt;
&lt;p&gt;The problem with this question is that it assumes a level of omniscience that even the very best LLMs do not possess.&lt;/p&gt;
&lt;p&gt;At the very best, I would expect this prompt to spit out the approximate average of what had been published on that subject in time to be hoovered up by the training data for the GPT-4o training cutoff &lt;a href="https://platform.openai.com/docs/models/gpt-4o"&gt;of September 2023&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(Here's &lt;a href="https://chatgpt.com/share/683f3f94-d51c-8006-aea9-7567d08e2f68"&gt;what I got just now&lt;/a&gt; running it against GPT-4o.)&lt;/p&gt;
&lt;p&gt;This illustrates the first lesson of effective LLM usage: &lt;strong&gt;know your training cutoff dates&lt;/strong&gt;. For many queries these are an essential factor in whether or not the LLM is likely to provide you with a useful answer.&lt;/p&gt;
&lt;p&gt;Given the pace of change in the AI landscape, an answer based on September 2023 training data is unlikely to offer useful insights into the state of things in 2025.&lt;/p&gt;
&lt;p&gt;It's worth noting that there &lt;em&gt;are&lt;/em&gt; tools that might do better at this. OpenAI's Deep Research tool for example can run a barrage of searches against the web for recent information, then spend multiple minutes digesting those results, running follow-up searches and crunching that together into an impressive looking report.&lt;/p&gt;
&lt;p&gt;(I still wouldn't trust it for a question this broad though: the report format looks more credible than it is, and can suffer from &lt;a href="https://simonwillison.net/2025/Feb/25/deep-research-system-card/"&gt;misinformation by omission&lt;/a&gt; which is very difficult to spot.)&lt;/p&gt;
&lt;p&gt;Deep Research only rolled out in February this year, so it is unlikely to be the tool Peter Kyle was using given likely delays in receiving the requested FOI data.&lt;/p&gt;
&lt;h4 id="what-i-would-do-instead"&gt;What I would do instead&lt;/h4&gt;
&lt;p&gt;Off the top of my head, here are examples of prompts I would use if I wanted to get ChatGPT's help digging into this particular question:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Brainstorm potential reasons that UK SMBs might be slow to embrace recent advances in AI&lt;/strong&gt;. This would give me a starting point for my own thoughts about the subject, and may highlight some things I hadn't considered that I should look into further.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identify key stakeholders in the UK SMB community who might have insights on this issue&lt;/strong&gt;. I wouldn't expect anything comprehensive here, but it might turn up some initial names I could reach out to for interviews or further research.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I work in UK Government: which departments should I contact that might have relevant information on this topic&lt;/strong&gt;? Given the size and complexity of the UK government even cabinet ministers could be excused from knowing every department.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suggest other approaches I could take to research this issue&lt;/strong&gt;. Another brainstorming prompt. I like prompts like this where "right or wrong" doesn't particularly matter. LLMs are electric bicycles for the mind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use your search tool: find recent credible studies on the subject and identify their authors&lt;/strong&gt;. I've been getting some good results from telling LLMs with good search tools - &lt;a href="https://simonwillison.net/2025/Apr/21/ai-assisted-search/#o3-and-o4-mini-are-really-good-at-search"&gt;like o3 and o4-mini&lt;/a&gt; - to evaluate the "credibility" of sources they find. It's a dumb prompting hack but it appears to work quite well - you can watch their reasoning traces and see how they place more faith in papers from well known publications, or newspapers with strong reputations for fact checking.&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id="prompts-that-do-make-sense"&gt;Prompts that do make sense&lt;/h4&gt;
&lt;p&gt;From the New Scientist article:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As well as seeking this advice, Kyle asked ChatGPT to define various terms relevant to his department: antimatter, quantum and digital inclusion. Two experts &lt;em&gt;New Scientist&lt;/em&gt; spoke to said they were surprised by the quality of the responses when it came to ChatGPT's definitions of quantum. "This is surprisingly good, in my opinion," says &lt;a href="https://profiles.imperial.ac.uk/p.knight"&gt;Peter Knight&lt;/a&gt; at Imperial College London. "I think it's not bad at all," says &lt;a href="https://researchportal.hw.ac.uk/en/persons/cristian-bonato"&gt;Cristian Bonato&lt;/a&gt; at Heriot-Watt University in Edinburgh, UK.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This doesn't surprise me at all. If you ask a good LLM for definitions of terms with strong, well established meanings you're going to get great results almost every time.&lt;/p&gt;
&lt;p&gt;My rule of thumb used to be that if a friend who had just read the Wikipedia page on a subject could answer my question then an LLM will be able to answer it too.&lt;/p&gt;
&lt;p&gt;As the frontier models have grown stronger I've upgraded that rule of thumb. I now expect a good result for any mainstream-enough topic for which there was widespread consensus prior to that all-important training cutoff date.&lt;/p&gt;
&lt;p&gt;Once again, it all comes down to intuition. The only way to get really strong intuition as to what will work with LLMs is to spend a huge amount of time using them, casting a skeptical eye over everything that they produce.&lt;/p&gt;
&lt;p&gt;Treating ChatGPT as an all knowing Oracle for anything outside of a two year stale Wikipedia version of the world's knowledge is almost always a mistake.&lt;/p&gt;
&lt;p&gt;Treating it as a brainstorming companion and electric bicycle for the mind is, I think, a much better strategy.&lt;/p&gt;
&lt;h4 id="should-the-uk-technology-secretary-be-using-chatgpt-"&gt;Should the UK technology secretary be using ChatGPT?&lt;/h4&gt;
&lt;p&gt;Some of the reporting I've seen around this story has seemed to suggest that Peter Kyle's use of ChatGPT is embarrassing.&lt;/p&gt;
&lt;p&gt;Personally, I think that if the UK's Secretary of State for Science, Innovation and Technology was &lt;em&gt;not&lt;/em&gt; exploring this family of technologies it would be a dereliction of duty!&lt;/p&gt;
&lt;p&gt;The thing we can't tell from these ChatGPT logs is how dependent he was on these results.&lt;/p&gt;
&lt;p&gt;Did he idly throw some questions at ChatGPT out of curiosity to see what came back, then ignore that entirely, engage with his policy team and talk to experts in the field to get a detailed understanding of the issues at hand?&lt;/p&gt;
&lt;p&gt;Or did he prompt ChatGPT, take the results as gospel and make policy decisions based on that sloppy interpretation of a two-year stale guess at the state of the world?&lt;/p&gt;
&lt;p&gt;Those are the questions I'd like to see answered.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/politics"&gt;politics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/chatgpt"&gt;chatgpt&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="politics"/><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="chatgpt"/><category term="llms"/><category term="ai-ethics"/><category term="digital-literacy"/></entry><entry><title>I still don't think companies serve you ads based on spying through your microphone</title><link href="https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/#atom-tag" rel="alternate"/><published>2025-01-02T23:43:31+00:00</published><updated>2025-01-02T23:43:31+00:00</updated><id>https://simonwillison.net/2025/Jan/2/they-spy-on-you-but-not-like-that/#atom-tag</id><summary type="html">
    &lt;p&gt;One of my weirder hobbies is trying to convince people that the idea that companies are listening to you through your phone's microphone and serving you targeted ads is a conspiracy theory that isn't true. I wrote about this previously: &lt;a href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#facebook-dont-spy-microphone"&gt;Facebook don’t spy on you through your microphone&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(Convincing people of this is basically impossible. It doesn't matter how good your argument is, if someone has ever seen an ad that relates to their previous voice conversation they are likely convinced and there's nothing you can do to talk them out of it. Gimlet Media did &lt;a href="https://gimletmedia.com/amp/shows/reply-all/z3hlwr"&gt;a great podcast episode&lt;/a&gt; about how impossible this is back in 2017.)&lt;/p&gt;
&lt;p&gt;This is about to get even harder thanks to this proposed settlement: &lt;a href="https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-pay-95m-delete-private-conversations-siri-recorded/"&gt;Siri “unintentionally” recorded private convos; Apple agrees to pay $95M&lt;/a&gt; (Ars Technica).&lt;/p&gt;
&lt;p&gt;Apple are spending $95m (nine hours of profit), agreeing to settle while "denying wrongdoing".&lt;/p&gt;
&lt;p&gt;What actually happened is it turns out Apple were capturing snippets of audio surrounding the "Hey Siri" wake word, sending those back to their servers and occasionally using them for QA, without informing users that they were doing this. This is bad.&lt;/p&gt;
&lt;p&gt;The Reuters 2021 story &lt;a href="https://www.reuters.com/technology/apple-must-face-siri-voice-assistant-privacy-lawsuit-us-judge-2021-09-02/"&gt;Apple must face Siri voice assistant privacy lawsuit -U.S. judge&lt;/a&gt; reported that:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One Siri user said his private discussions with his doctor about a "brand name surgical treatment" caused him to receive targeted ads for that treatment, while two others said their discussions about Air Jordan sneakers, Pit Viper sunglasses and "Olive Garden" caused them to receive ads for those products.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The claim from that story was then repeated in &lt;a href="https://www.reuters.com/legal/apple-pay-95-million-settle-siri-privacy-lawsuit-2025-01-02/"&gt;the 2025 Reuters story&lt;/a&gt; about the settlement.&lt;/p&gt;
&lt;p&gt;The &lt;a href="https://arstechnica.com/tech-policy/2025/01/apple-agrees-to-pay-95m-delete-private-conversations-siri-recorded/"&gt;Ars Technica story&lt;/a&gt; reframes that like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The only clue that users seemingly had of Siri's alleged spying was eerily accurate targeted ads that appeared after they had just been talking about specific items like Air Jordans or brands like Olive Garden, Reuters &lt;a href="https://www.reuters.com/legal/apple-pay-95-million-settle-siri-privacy-lawsuit-2025-01-02/"&gt;noted&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Crucially, this was never &lt;em&gt;proven in court&lt;/em&gt;. And if Apple settle the case it never will be.&lt;/p&gt;
&lt;p&gt;Let’s think this through. For the accusation to be true, Apple would need to be recording those wake word audio snippets and transmitting them back to their servers for additional processing (likely true), but then they would need to be feeding those snippets &lt;em&gt;in almost real time&lt;/em&gt; into a system which forwards them onto advertising partners who then feed that information into targeting networks such that next time you view an ad on your phone the information is available to help select the relevant ad.&lt;/p&gt;
&lt;p&gt;That is &lt;em&gt;so far fetched&lt;/em&gt;. Why would Apple do that? Especially given both their brand and reputation as a privacy-first company combined with the large amounts of product design and engineering work they’ve put into preventing apps from doing exactly this kind of thing by enforcing permission-based capabilities &lt;em&gt;and&lt;/em&gt; ensuring a “microphone active” icon is available at all times when an app is listening in.&lt;/p&gt;
&lt;p&gt;I really don't think this is happening - in particular for Siri wake words!&lt;/p&gt;

&lt;p id="argued-these-points"&gt;I've &lt;a href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#facebook-dont-spy-microphone"&gt;argued these points before&lt;/a&gt;, but I'll do it again here for good measure.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;You don't notice the hundreds of times a day you say something and &lt;em&gt;don't&lt;/em&gt; see a relevant advert a short time later. You see thousands of ads a day, can you remember what &lt;em&gt;any&lt;/em&gt; of them are?&lt;/li&gt;
&lt;li&gt;The tiny fraction of times where you see an ad that's relevant to something you've just said (hence breaking through your filter that prevents you from seeing most ads at all) stick in your head.&lt;/li&gt;
&lt;li&gt;Human beings are pattern matching machines with a huge bias towards personal anecdotes. If we've seen direct evidence of something ourselves, good luck talking us out of it!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I think the truth of the matter here is much more pedestrian: the quality of ad targeting that's possible just through apps sharing data on your regular actions within those apps is shockingly high... combined with the fact that it turns out just knowing "male, 40s, NYC" is often more than enough - we're all pretty basic!&lt;/p&gt;
&lt;p&gt;I fully expect that this Apple story will be used as "proof" by conspiracy theorists effectively forever.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apple"&gt;apple&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/conspiracy"&gt;conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/privacy"&gt;privacy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/misinformation"&gt;misinformation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microphone-ads-conspiracy"&gt;microphone-ads-conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="apple"/><category term="conspiracy"/><category term="privacy"/><category term="misinformation"/><category term="microphone-ads-conspiracy"/><category term="digital-literacy"/></entry><entry><title>Project: VERDAD - tracking misinformation in radio broadcasts using Gemini 1.5</title><link href="https://simonwillison.net/2024/Nov/7/project-verdad/#atom-tag" rel="alternate"/><published>2024-11-07T18:41:51+00:00</published><updated>2024-11-07T18:41:51+00:00</updated><id>https://simonwillison.net/2024/Nov/7/project-verdad/#atom-tag</id><summary type="html">
    &lt;p&gt;I'm starting a new interview series called &lt;strong&gt;Project&lt;/strong&gt;. The idea is to interview people who are building interesting data projects and talk about what they've built, how they built it, and what they learned along the way.&lt;/p&gt;
&lt;p&gt;The first episode is a conversation with Rajiv Sinclair from &lt;a href="https://publicdata.works/"&gt;Public Data Works&lt;/a&gt; about &lt;a href="https://verdad.app/"&gt;VERDAD&lt;/a&gt;, a brand new project in collaboration with journalist &lt;a href="https://twitter.com/mguzman_detroit"&gt;Martina Guzmán&lt;/a&gt; that aims to track misinformation in radio broadcasts around the USA.&lt;/p&gt;
&lt;p&gt;VERDAD hits a whole bunch of my interests at once. It's a beautiful example of scrappy data journalism in action, and it attempts something that simply would not have been possible just a year ago by taking advantage of new LLM tools.&lt;/p&gt;
&lt;p&gt;You can watch &lt;a href="https://www.youtube.com/watch?v=t_S-loWDGE0"&gt;the half hour interview&lt;/a&gt; on YouTube. Read on for the shownotes and some highlights from our conversation.&lt;/p&gt;

&lt;iframe style="margin-top: 1.5em; margin-bottom: 1.5em;" width="560" height="315" src="https://www.youtube-nocookie.com/embed/t_S-loWDGE0" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="allowfullscreen"&gt; &lt;/iframe&gt;

&lt;h4 id="the-verdad-project"&gt;The VERDAD project&lt;/h4&gt;
&lt;p&gt;VERDAD tracks radio broadcasts from 48 different talk radio stations across the USA, primarily in Spanish. Audio from these stations is archived as MP3s, transcribed and then analyzed to identify potential examples of political misinformation.&lt;/p&gt;
&lt;p&gt;The result is "snippets" of audio accompanied by the transcript, an English translation, categories indicating the type of misinformation that may be present and an LLM-generated explanation of why that snippet was selected.&lt;/p&gt;
&lt;p&gt;These are then presented in an interface for human reviewers, who can listen directly to the audio in question, update the categories and add their own comments as well.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/verdad-1.jpg" alt="Screenshot of a content moderation interface titled VERDAD showing three posts with ratings and tags. Main view shows filters on left including Source Language, State, Source, Label, and Political Spectrum slider. Two users visible in left sidebar: Simon Willison and Rajiv Sinclair. Posts discuss claims about Harris, Walz, and election results, with timestamps and political leaning indicators." /&gt;&lt;/p&gt;
&lt;p&gt;VERDAD processes around a thousand hours of audio content a day - &lt;em&gt;way&lt;/em&gt; more than any team of journalists or researchers could attempt to listen to manually.&lt;/p&gt;
&lt;h4 id="the-technology-stack"&gt;The technology stack&lt;/h4&gt;
&lt;p&gt;VERDAD uses &lt;a href="https://github.com/PrefectHQ/prefect"&gt;Prefect&lt;/a&gt; as a workflow orchestration system to run the different parts of their pipeline.&lt;/p&gt;
&lt;p&gt;There are multiple stages, roughly as follows:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;MP3 audio is recorded from radio station websites and stored in Cloudflare R2&lt;/li&gt;
&lt;li&gt;An initial transcription is performed using the extremely inexpensive Gemini 1.5 Flash&lt;/li&gt;
&lt;li&gt;That transcript is fed to the more powerful Gemini 1.5 Pro with a complex prompt to help identify potential misinformation snippets&lt;/li&gt;
&lt;li&gt;Once identified, audio containing snippets is run through the more expensive Whisper model to generate timestamps for the snippets&lt;/li&gt;
&lt;li&gt;Further prompts then generate things like English translations and summaries of the snippets&lt;/li&gt;
&lt;/ol&gt;
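&lt;p&gt;As a rough illustration of how stages like these can chain together, here is a minimal Python sketch. It is not VERDAD's code: every name in it (&lt;code&gt;Recording&lt;/code&gt;, &lt;code&gt;Snippet&lt;/code&gt;, &lt;code&gt;run_pipeline&lt;/code&gt; and the four stage functions) is hypothetical, the recording and storage step is left out, and the model calls are passed in as plain functions rather than real Gemini or Whisper API calls.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Snippet:
    text: str                     # flagged passage from the transcript
    start_seconds: float = 0.0    # filled in by the timestamping stage
    translation: str = ""         # filled in by the final stage


@dataclass
class Recording:
    station: str
    audio_path: str
    transcript: str = ""
    snippets: List[Snippet] = field(default_factory=list)


def run_pipeline(
    recording: Recording,
    transcribe: Callable[[str], str],
    find_snippets: Callable[[str], List[str]],
    timestamp: Callable[[str, str], float],
    translate: Callable[[str], str],
) -> Recording:
    """Hypothetical orchestration of stages 2 to 5; all names are assumptions."""
    # Stage 2: cheap first-pass transcription of the whole recording.
    recording.transcript = transcribe(recording.audio_path)
    # Stage 3: a stronger model picks out candidate snippets.
    flagged = find_snippets(recording.transcript)
    # Stages 4 and 5 run once per flagged snippet, so audio with
    # nothing flagged never reaches the more expensive steps.
    for text in flagged:
        snippet = Snippet(text=text)
        snippet.start_seconds = timestamp(recording.audio_path, text)
        snippet.translation = translate(text)
        recording.snippets.append(snippet)
    return recording


# Stand-in stage functions so the sketch runs without any external service.
demo = run_pipeline(
    Recording(station="EXAMPLE-FM", audio_path="example.mp3"),
    transcribe=lambda path: "hola mundo. esto es una prueba.",
    find_snippets=lambda transcript: [
        part for part in transcript.split(". ") if "prueba" in part
    ],
    timestamp=lambda path, text: 12.5,
    translate=lambda text: "this is a test.",
)
```

&lt;p&gt;Passing each stage in as a function keeps the orchestration separate from any particular model, which is one plausible way to swap a cheap model for an expensive one at a single step.&lt;/p&gt;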
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/verdad-2.jpg" alt="Screenshot of a Prefect workflow dashboard showing the apricot-silkworm run execution timeline. Interface displays task runs including audio file transcription and processing tasks with timestamps from 11:05 PM to 11:09 PM. Bottom panel shows detailed logs of task creation and completion." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;h4 id="developing-the-prompts"&gt;Developing the prompts&lt;/h4&gt;
&lt;p&gt;The prompts used by VERDAD are &lt;a href="https://github.com/PublicDataWorks/verdad/tree/main/prompts"&gt;available in their GitHub repository&lt;/a&gt; and they are &lt;em&gt;fascinating&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Rajiv initially tried to get Gemini 1.5 Flash to do both the transcription and the misinformation detection, but found that asking that model to do two things at once frequently confused it.&lt;/p&gt;
&lt;p&gt;Instead, he switched to a separate prompt running that transcript against Gemini 1.5 Pro. Here's &lt;a href="https://github.com/PublicDataWorks/verdad/blob/main/prompts/Stage_3_analysis_prompt.md"&gt;that more complex prompt&lt;/a&gt; - it's 50KB in size and includes a whole bunch of interesting sections, including plenty of examples and a detailed JSON schema.&lt;/p&gt;
&lt;p&gt;Here's just one of the sections aimed at identifying content about climate change:&lt;/p&gt;
&lt;blockquote&gt;
&lt;h3 id="4-climate-change-and-environmental-policies"&gt;&lt;strong&gt;4. Climate Change and Environmental Policies&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;Disinformation that denies or minimizes human impact on climate change, often to oppose environmental regulations. It may discredit scientific consensus and promote fossil fuel interests.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Common Narratives&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Labeling climate change as a &lt;strong&gt;"hoax"&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Arguing that climate variations are natural cycles.&lt;/li&gt;
&lt;li&gt;Claiming environmental policies harm the economy.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Cultural/Regional Variations&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spanish-Speaking Communities&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Impact of climate policies on agricultural jobs.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Arabic-Speaking Communities&lt;/strong&gt;:
&lt;ul&gt;
&lt;li&gt;Reliance on oil economies influencing perceptions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Potential Legitimate Discussions&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Debates on balancing environmental protection with economic growth.&lt;/li&gt;
&lt;li&gt;Discussions about energy independence.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Examples&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Spanish&lt;/em&gt;: "El 'cambio climático' es una mentira para controlarnos."&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Arabic&lt;/em&gt;: "'تغير المناخ' كذبة للسيطرة علينا."&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Rajiv iterated on these prompts over multiple months - they are the core of the VERDAD project. Here's &lt;a href="https://github.com/PublicDataWorks/verdad/commit/3eac808e77b6d1aadf0de055a1d5287166dbb6d3"&gt;an update from yesterday&lt;/a&gt; informing the model of the US presidential election results so that it wouldn't flag claims of a candidate winning as false!&lt;/p&gt;

&lt;p&gt;Rajiv used both Claude 3.5 Sonnet and OpenAI o1-preview to help develop the prompt itself. Here's &lt;a href="https://gist.github.com/rajivsinclair/8fb0371f6eda25f9e5cc515cd77abd62"&gt;his transcript&lt;/a&gt; of a conversation with Claude used to iterate further on an existing prompt.&lt;/p&gt;

&lt;h4 id="the-human-review-process"&gt;The human review process&lt;/h4&gt;
&lt;p&gt;The final component of VERDAD is the web application itself. Everyone knows that AI makes mistakes, &lt;em&gt;a lot&lt;/em&gt;. Providing as much context as possible for human review is essential.&lt;/p&gt;
&lt;p&gt;The Whisper transcripts provide accurate timestamps (Gemini is sadly unable to provide those on its own), which means the tool can provide the Spanish transcript, the English translation and a play button to listen to the audio at the moment of the captured snippet.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2024/verdad-3.jpg" alt="Screenshot of VERDAD content moderation interface showing detailed view of a post titled False Claim of Trump Victory from WAXY radio station in Florida. Shows audio player with Spanish/English transcript toggle, green highlighted fact-check box. Post metadata indicates &amp;quot;Right&amp;quot; political leaning and timestamp Nov 6, 2024 23:06 GMT+7." style="max-width: 100%;" /&gt;&lt;/p&gt;

&lt;h4 id="want-to-learn-more-"&gt;Want to learn more?&lt;/h4&gt;
&lt;p&gt;VERDAD is under active development right now. Rajiv and his team are keen to collaborate, and are actively looking forward to conversations with other people working in this space. You can reach him at &lt;code&gt;help@verdad.app&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The technology stack itself is &lt;em&gt;incredibly&lt;/em&gt; promising. Pulling together a project like this even a year ago would have been prohibitively expensive, but new multi-modal LLM tools like Gemini (and Gemini 1.5 Flash in particular) are opening up all sorts of new possibilities.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/data-journalism"&gt;data-journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/youtube"&gt;youtube&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="data-journalism"/><category term="youtube"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="gemini"/><category term="digital-literacy"/></entry><entry><title>Quoting Mike Caulfield</title><link href="https://simonwillison.net/2024/Oct/11/mike-caulfield/#atom-tag" rel="alternate"/><published>2024-10-11T15:21:37+00:00</published><updated>2024-10-11T15:21:37+00:00</updated><id>https://simonwillison.net/2024/Oct/11/mike-caulfield/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://mikecaulfield.substack.com/p/copium-addicts-what-misinformation"&gt;&lt;p&gt;The primary use of “misinformation” is not to change the beliefs of other people at all. Instead, the vast majority of misinformation is offered as a service for people to &lt;em&gt;maintain&lt;/em&gt; their beliefs in face of overwhelming evidence to the contrary.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://mikecaulfield.substack.com/p/copium-addicts-what-misinformation"&gt;Mike Caulfield&lt;/a&gt;, via &lt;a href="https://www.theatlantic.com/technology/archive/2024/10/hurricane-milton-conspiracies-misinformation/680221/"&gt;Charlie Warzel&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/misinformation"&gt;misinformation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="misinformation"/><category term="digital-literacy"/></entry><entry><title>The AI trust crisis</title><link href="https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#atom-tag" rel="alternate"/><published>2023-12-14T16:14:11+00:00</published><updated>2023-12-14T16:14:11+00:00</updated><id>https://simonwillison.net/2023/Dec/14/ai-trust-crisis/#atom-tag</id><summary type="html">
    &lt;p&gt;Dropbox added some &lt;a href="https://help.dropbox.com/view-edit/dropbox-ai-how-to"&gt;new AI features&lt;/a&gt;. In the past couple of days these have attracted a firestorm of criticism. Benj Edwards rounds it up in  &lt;a href="https://arstechnica.com/information-technology/2023/12/dropbox-spooks-users-by-sending-data-to-openai-for-ai-search-features/"&gt;Dropbox spooks users with new AI features that send data to OpenAI when used&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The key issue here is that people are worried that their private files on Dropbox are being passed to OpenAI to use as training data for their models - a claim that is strenuously denied by Dropbox.&lt;/p&gt;
&lt;p&gt;As far as I can tell, Dropbox built some sensible features - summarize on demand, "chat with your data" via Retrieval Augmented Generation - and did a moderately OK job of communicating how they work... but when it comes to data privacy and AI, a "moderately OK job" is a failing grade. Especially if you hold as much of people's private data as Dropbox does!&lt;/p&gt;
&lt;p&gt;Two details in particular seem really important. Dropbox have an &lt;a href="https://www.dropbox.com/ai-principles"&gt;AI principles document&lt;/a&gt; which includes this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Customer trust and the privacy of their data are our foundation. We will not use customer data to train AI models without consent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They also have a checkbox &lt;a href="https://www.dropbox.com/account/ai"&gt;in their settings&lt;/a&gt; that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2023/dropbox-third-party.png" alt="Third-party AI: Use artificial intelligence (AI) from third-party partners so you can work faster in Dropbox. We only use technology partners we have vetted. Your data is never used to train their internal models, and is deleted from third-party servers within 30 days. Learn more. There is a toggle set to On." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;Update:&lt;/strong&gt; Some time between me publishing this article and four hours later, that link stopped working.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;I took that screenshot on my own account. It's toggled "on" - but I never turned it on myself.&lt;/p&gt;
&lt;p&gt;Does that mean I'm marked as "consenting" to having my data used to train AI models?&lt;/p&gt;
&lt;p&gt;I don't think so: I think this is a combination of confusing wording and the eternal vagueness of what the term "consent" means in a world where everyone agrees to the terms and conditions of everything without reading them.&lt;/p&gt;
&lt;p&gt;But a LOT of people have come to the conclusion that this means their private data - which they pay Dropbox to protect - is now being funneled into the OpenAI training abyss.&lt;/p&gt;
&lt;h4 id="people-dont-believe-openai"&gt;People don't believe OpenAI&lt;/h4&gt;
&lt;p&gt;Here's copy from that Dropbox preference box, talking about their "third-party partners" - in this case OpenAI:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Your data is never used to train their internal models, and is deleted from third-party servers within 30 days.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It's increasingly clear to me that people simply &lt;strong&gt;don't believe OpenAI&lt;/strong&gt; when they're told that data won't be used for training.&lt;/p&gt;
&lt;p&gt;What's really going on here is something deeper then: AI is facing a crisis of trust.&lt;/p&gt;
&lt;p&gt;I quipped &lt;a href="https://twitter.com/simonw/status/1735086765814542802"&gt;on Twitter&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;"OpenAI are training on every piece of data they see, even when they say they aren't" is the new "Facebook are showing you ads based on overhearing everything you say through your phone's microphone"&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's what I meant by that.&lt;/p&gt;
&lt;h4 id="facebook-dont-spy-microphone"&gt;Facebook don't spy on you through your microphone&lt;/h4&gt;
&lt;p&gt;Have you heard the one about Facebook spying on you through your phone's microphone and showing you ads based on what you're talking about?&lt;/p&gt;
&lt;p&gt;This theory has been floating around for years. From a technical perspective it should be easy to disprove:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Mobile phone operating systems don't allow apps to invisibly access the microphone.&lt;/li&gt;
&lt;li&gt;Privacy researchers can audit communications between devices and Facebook to confirm if this is happening.&lt;/li&gt;
&lt;li&gt;Running high quality voice recognition like this at scale is extremely expensive - I had a conversation with a friend who works on server-based machine learning at Apple a few years ago who found the entire idea laughable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The non-technical reasons are even stronger:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Facebook say they aren't doing this. The risk to their reputation if they are caught in a lie is astronomical.&lt;/li&gt;
&lt;li&gt;As with many conspiracy theories, too many people would have to be "in the loop" and not blow the whistle.&lt;/li&gt;
&lt;li&gt;Facebook don't need to do this: there are much, much cheaper and more effective ways to target ads at you than spying through your microphone. These methods have been working incredibly well for years.&lt;/li&gt;
&lt;li&gt;Facebook gets to show us thousands of ads a year. 99% of those don't correlate in the slightest to anything we have said out loud. If you keep rolling the dice long enough, eventually a coincidence will strike.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's the thing though: &lt;em&gt;none of these arguments matter&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;If you've ever experienced Facebook showing you an ad for something that you were talking about out loud moments earlier, you've already dismissed everything I just said. You have personally experienced anecdotal evidence which overrides all of my arguments here.&lt;/p&gt;
&lt;p&gt;Here's a Reply All podcast episode from November 2017 that explores this issue: &lt;a href="https://gimletmedia.com/shows/reply-all/z3hlwr"&gt;109 Is Facebook Spying on You?&lt;/a&gt;. Their conclusion: Facebook are not spying through your microphone. But if someone already believes that they are, there is no argument that can possibly convince them otherwise.&lt;/p&gt;
&lt;p&gt;I've experienced this effect myself - over the past few years I've tried talking people out of this, as part of my own personal fascination with how sticky this conspiracy theory is.&lt;/p&gt;
&lt;p&gt;The key issue here is the same as the OpenAI training issue: people &lt;strong&gt;don't believe&lt;/strong&gt; these companies when they say that they aren't doing something.&lt;/p&gt;
&lt;p&gt;One interesting difference here is that in the Facebook example people have personal evidence that makes them believe they understand what's going on.&lt;/p&gt;
&lt;p&gt;With AI we have almost the complete opposite: AI models are weird black boxes, built in secret and with no way of understanding what the training data was or how it influences the model.&lt;/p&gt;
&lt;p&gt;As with so much in AI, people are left with nothing more than "vibes" to go on. And the vibes are bad.&lt;/p&gt;
&lt;h4 id="this-really-matters"&gt;This really matters&lt;/h4&gt;
&lt;p&gt;Trust is really important. Companies lying about what they do with your privacy is a very serious allegation.&lt;/p&gt;
&lt;p&gt;A society where big companies tell blatant lies about how they are handling our data - and get away with it without consequences - is a very unhealthy society.&lt;/p&gt;
&lt;p&gt;A key role of government is to prevent this from happening. If OpenAI are training on data that they said they wouldn't train on, or if Facebook are spying on us through our phones' microphones, they should be hauled in front of regulators and/or sued into the ground.&lt;/p&gt;
&lt;p&gt;If we believe that they are doing this without consequence, and have been getting away with it for years, our intolerance for corporate misbehavior becomes a victim as well. We risk letting companies get away with real misconduct because we incorrectly believed in conspiracy theories.&lt;/p&gt;
&lt;p&gt;Privacy is important, and very easily misunderstood. People both overestimate and underestimate what companies are doing, and what's possible. This isn't helped by the fact that AI technology means the scope of what's possible is changing at a rate that's hard to appreciate even if you're deeply aware of the space.&lt;/p&gt;
&lt;p&gt;If we want to protect our privacy, we need to understand what's going on. More importantly, we need to be able to trust companies to honestly and clearly explain what they are doing with our data.&lt;/p&gt;
&lt;p&gt;On a personal level we risk losing out on useful tools. How many people cancelled their Dropbox accounts in the last 48 hours? How many more turned off that AI toggle, ruling out ever evaluating if those features were useful for them or not?&lt;/p&gt;
&lt;h4 id="what-can-we-do"&gt;What can we do about it?&lt;/h4&gt;
&lt;p&gt;There is something that the big AI labs could be doing to help here: tell us how you are training!&lt;/p&gt;
&lt;p&gt;The fundamental question here is about training data: what are OpenAI using to train their models?&lt;/p&gt;
&lt;p&gt;And the answer is: we have no idea! The entire process could not be more opaque.&lt;/p&gt;
&lt;p&gt;Given that, is it any wonder that when OpenAI say "we don't train on data submitted via our API" people have trouble believing them?&lt;/p&gt;
&lt;p&gt;The situation with ChatGPT itself is even more messy. OpenAI say that they DO use ChatGPT interactions to improve their models - even those from paying customers, with the exception of the "call us" priced &lt;a href="https://openai.com/blog/introducing-chatgpt-enterprise"&gt;ChatGPT Enterprise&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If I paste a private document into ChatGPT to ask for a summary, will snippets of that document be leaked to future users after the next model update? Without more details on HOW they are using ChatGPT to improve their models I can't come close to answering that question.&lt;/p&gt;
&lt;p&gt;Clear explanations of how this stuff works could go a long way to improving the trust relationship OpenAI have with their users, and the world at large.&lt;/p&gt;
&lt;p&gt;Maybe take a leaf from large scale platform companies. They publish public post-mortem incident reports on outages, to regain trust with their customers through transparency about exactly what happened and the steps they are taking to prevent it from happening again. Dan Luu has collected a &lt;a href="https://github.com/danluu/post-mortems"&gt;great list of examples&lt;/a&gt;.&lt;/p&gt;
&lt;h4 id="opportunity-local-models"&gt;An opportunity for local models&lt;/h4&gt;
&lt;p&gt;One consistent theme I've seen in conversations about this issue is that people are much more comfortable trusting their data to local models that run on their own devices than models hosted in the cloud.&lt;/p&gt;
&lt;p&gt;The good news is that local models are consistently both increasing in quality and shrinking in size.&lt;/p&gt;
&lt;p&gt;I figured out how to run Mixtral-8x7b-Instruct &lt;a href="https://fedi.simonwillison.net/@simon/111577242044966329"&gt;on my laptop&lt;/a&gt; last night - the first local model I've tried which really does seem to be equivalent in quality to ChatGPT 3.5.&lt;/p&gt;
&lt;p&gt;Microsoft's &lt;a href="https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/"&gt;Phi-2&lt;/a&gt; is a fascinating new model in that it's only 2.7 billion parameters (most useful local models start at 7 billion) but claims state-of-the-art performance against some of those larger models. And it looks like they trained it for around $35,000.&lt;/p&gt;
&lt;p&gt;While I'm excited about the potential of local models, I'd hate to see us lose out on the power and convenience of the larger hosted models over privacy concerns which turn out to be incorrect.&lt;/p&gt;
&lt;p&gt;The intersection of AI and privacy is a critical issue. We need to be able to have the highest quality conversations about it, with maximum transparency and understanding of what's actually going on.&lt;/p&gt;
&lt;p&gt;This is hard already, and it's made even harder if we straight up disbelieve anything that companies tell us. Those companies need to earn our trust. How can we help them understand how to do that?&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/trust"&gt;trust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dropbox"&gt;dropbox&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/local-llms"&gt;local-llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/training-data"&gt;training-data&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/microphone-ads-conspiracy"&gt;microphone-ads-conspiracy&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="trust"/><category term="dropbox"/><category term="ai"/><category term="openai"/><category term="local-llms"/><category term="llms"/><category term="training-data"/><category term="microphone-ads-conspiracy"/><category term="digital-literacy"/></entry><entry><title>Can We Trust Search Engines with Generative AI? A Closer Look at Bing’s Accuracy for News Queries</title><link href="https://simonwillison.net/2023/Feb/18/can-we-trust-search-engines-with-generative-ai/#atom-tag" rel="alternate"/><published>2023-02-18T18:09:19+00:00</published><updated>2023-02-18T18:09:19+00:00</updated><id>https://simonwillison.net/2023/Feb/18/can-we-trust-search-engines-with-generative-ai/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://medium.com/@ndiakopoulos/can-we-trust-search-engines-with-generative-ai-a-closer-look-at-bings-accuracy-for-news-queries-179467806bcc"&gt;Can We Trust Search Engines with Generative AI? A Closer Look at Bing’s Accuracy for News Queries&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Computational journalism professor Nick Diakopoulos takes a deeper dive into the quality of the summarizations provided by AI-assisted Bing. His findings are troubling: for news queries, which are a great test for AI summarization since they include recent information that may have sparse or conflicting stories, Bing confidently produces answers with important errors: for example, claiming the Ohio train derailment happened on February 9th when it actually happened on February 3rd.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/ndiakopoulos/status/1626840648002203649"&gt;@ndiakopoulos&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/bing"&gt;bing&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/search"&gt;search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/trust"&gt;trust&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-search"&gt;ai-assisted-search&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="bing"/><category term="search"/><category term="trust"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-search"/><category term="digital-literacy"/></entry><entry><title>Quoting Michael Hobbes</title><link href="https://simonwillison.net/2020/Oct/29/michael-hobbes/#atom-tag" rel="alternate"/><published>2020-10-29T15:06:52+00:00</published><updated>2020-10-29T15:06:52+00:00</updated><id>https://simonwillison.net/2020/Oct/29/michael-hobbes/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.huffpost.com/entry/internet-baby-boomers-misinformation-social-media_n_5f998039c5b6a4a2dc813d3d"&gt;&lt;p&gt;Seniors generally report having more trust in the people around them, a characteristic that may make them more credulous of information that comes from friends and family. There is also the issue of context: Misinformation appears in a stream that also includes baby pictures, recipes and career updates. Users may not expect to toggle between light socializing and heavy truth-assessing when they’re looking at their phone for a few minutes in line at the grocery store.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.huffpost.com/entry/internet-baby-boomers-misinformation-social-media_n_5f998039c5b6a4a2dc813d3d"&gt;Michael Hobbes&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/social-media"&gt;social-media&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/misinformation"&gt;misinformation&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="social-media"/><category term="misinformation"/><category term="digital-literacy"/></entry><entry><title>Quoting Kevin Roose</title><link href="https://simonwillison.net/2020/Oct/5/kevin-roose/#atom-tag" rel="alternate"/><published>2020-10-05T15:40:56+00:00</published><updated>2020-10-05T15:40:56+00:00</updated><id>https://simonwillison.net/2020/Oct/5/kevin-roose/#atom-tag</id><summary type="html">
    &lt;blockquote cite="https://www.nytimes.com/2020/10/03/insider/qanon-reporter.html"&gt;&lt;p&gt;I’ve often joked with other internet culture reporters about what I call the “normie tipping point.” In every emerging internet trend, there is a point at which “normies” — people who don’t spend all day online, and whose brains aren’t rotted by internet garbage — start calling, texting and emailing us to ask what’s going on. Why are kids eating Tide Pods? What is the Momo Challenge? Who is Logan Paul, and why did he film himself with a dead body?&lt;/p&gt;
&lt;p&gt;The normie tipping point is a joke, but it speaks to one of the thorniest questions in modern journalism, specifically on this beat: When does the benefit of informing people about an emerging piece of misinformation outweigh the possible harms?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.nytimes.com/2020/10/03/insider/qanon-reporter.html"&gt;Kevin Roose&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/journalism"&gt;journalism&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/kevin-roose"&gt;kevin-roose&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="journalism"/><category term="kevin-roose"/><category term="digital-literacy"/></entry><entry><title>This Is What Happens When Millions Of People Suddenly Get The Internet</title><link href="https://simonwillison.net/2017/Sep/19/what-happens-when-millions-people-suddenly-get-internet/#atom-tag" rel="alternate"/><published>2017-09-19T04:59:00+00:00</published><updated>2017-09-19T04:59:00+00:00</updated><id>https://simonwillison.net/2017/Sep/19/what-happens-when-millions-people-suddenly-get-internet/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.buzzfeed.com/sheerafrenkel/fake-news-spreads-trump-around-the-world"&gt;This Is What Happens When Millions Of People Suddenly Get The Internet&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
“Countries which come online quickly rank lowest in digital literacy &amp;amp; are most likely to fall for scams, fake news”

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/simonw/status/910111056336261126"&gt;Twitter&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/internet"&gt;internet&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/digital-literacy"&gt;digital-literacy&lt;/a&gt;&lt;/p&gt;



</summary><category term="internet"/><category term="digital-literacy"/></entry></feed>