Goblins, Plateaus, and Prices

2026-04-29 · The Fluency Briefing

Your Guide to What's Happening in AI and Why It Matters to You

Wednesday, April 29, 2026

Three stories landed on my desk this Wednesday that look unrelated but aren't: a researcher asked AI to count carbs some 27,000 times and the answers swung wildly, memory's share of iPhone component costs could more than quadruple because AI datacenters are soaking up the chip supply, and AI shopping agents are already buying things on your behalf while most stores have no idea how to serve them. The pattern?

AI is embedding itself into everything from your lunch plate to your phone bill to your checkout cart, and the gap between what it promises and what it actually delivers is where the real story lives.

Today in AI:

Your AI Nutritionist Is Guessing - A researcher sent the same food photo to four leading AI models over 500 times each. The results varied wildly, with one model's carb estimates for paella ranging from 55g to 484g, a difference equivalent to 42.9 units of insulin. Diabettech
AI Is Making Your Next iPhone More Expensive - According to a JPMorgan analysis cited by the Financial Times, memory could jump from 10% to 45% of iPhone component costs by 2027. AI hardware makers like Nvidia are outbidding Apple for limited chip supply from Samsung, SK Hynix, and Micron. MacRumors
AI Agents Are Already Shopping for You - PayPal's first Agentic Commerce Pulse Survey found nearly 95% of merchants can detect AI agent traffic on their sites, but only about one in five have product catalogs structured for machines to actually read. Fast Company
GPT-5.5 Went Full Goblin - OpenAI's latest model developed an obsession with goblin and gremlin metaphors, peppering its responses with fantasy creature references. LM Arena confirmed the spike in their traffic data, and nobody is entirely sure why. LessWrong
LLMs Might Have Plateaued at Real Coding - A LessWrong analysis of METR's programming benchmark data found that when the bar is set at code a human would actually approve, rather than code that merely passes tests, improvement has flatlined since early 2025. A constant function fits the data better than an upward trend. LessWrong
Scout AI Raised $100M to Build Military AI - The defense startup is training its "Fury" model on autonomous ATVs at a U.S. military base, with $11 million in contracts from DARPA and the Army. The company expects its tech to deploy with the 1st Cavalry Division by 2027. TechCrunch
AI Versus Superbugs - At WIRED Health, surgeon Ara Darzi argued AI diagnostics could slash the two-to-three-day wait for antibiotic resistance testing. With drug-resistant infections causing over a million deaths annually, faster diagnosis means doctors stop guessing and start treating. Wired
MIT Cracks Faster Privacy-Preserving AI - Researchers developed a method that accelerates federated learning by 81%, letting resource-constrained devices like smartwatches train AI models without sending personal data to a server. The technique handles networks of devices with wildly different capabilities. MIT News
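
For readers who want to check the math in the carb-counting item above, here is a minimal sketch of where the 42.9-unit figure plausibly comes from. The ratio of 1 unit of insulin per 10 grams of carbohydrate is our assumption: the item doesn't state one, real ratios vary from person to person, and this is simply the ratio that reproduces the number.

```python
# Carb estimates reported for the paella photo (from the item above).
LOW_CARBS_G = 55
HIGH_CARBS_G = 484

# ASSUMPTION: 1 unit of insulin covers 10 g of carbohydrate.
# The newsletter does not state a ratio; this one reproduces its figure.
GRAMS_PER_UNIT = 10


def insulin_units(carbs_g, grams_per_unit=GRAMS_PER_UNIT):
    """Convert grams of carbohydrate into insulin units at a fixed ratio."""
    return carbs_g / grams_per_unit


carb_gap_g = HIGH_CARBS_G - LOW_CARBS_G   # gap between highest and lowest estimate
unit_gap = insulin_units(carb_gap_g)      # the same gap expressed in insulin units

print(f"Carb gap: {carb_gap_g} g, insulin gap: {unit_gap:.1f} units")
```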

Today's Takeaway:

Here's the thing about AI in 2026: three of today's stories, taken together, reveal a widening gap between AI's surface-level confidence and its actual reliability. A researcher submits the same paella photo to Gemini 2.5 Pro over 500 times and gets carb estimates ranging from 55 grams to 484 grams, per Diabettech. Meanwhile, LessWrong's analysis of METR data shows that when you measure LLM coding by whether a human maintainer would actually merge the code, improvement has been flat since early 2025. And GPT-5.5 is out here talking about goblins for no discernible reason. The models look polished, sound authoritative, and increasingly ship with zero disclaimers.

What connects these dots is a trust problem hiding in plain sight. Think of it like a restaurant with gorgeous plating but inconsistent recipes. The carb-counting study isn't just a diabetes niche story; it's a proxy for every domain where people are quietly trusting AI outputs without checking them twice. The coding plateau suggests that once you raise the bar from "technically works" to "actually good," the improvement curve goes flat. If you're building a business process, a health tool, or a shopping agent around AI, the question isn't whether the model sounds right. It's whether it gives the same right answer tomorrow that it gave today. That distinction, between fluency and reliability, is the most expensive lesson companies will learn this year.

💡 Fluency Moment - Building your AI fluency, one term at a time.

"Stochastic Output"

In plain English: AI models pick their words with built-in randomness, so the same question can produce a different answer each time.

Think of it like: Asking a friend the same question on 500 different days and getting a different answer every time.

Why you'll hear about it: It's why AI models gave wildly different carb counts for the same food photo across hundreds of repeat runs.
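
A toy sketch of where that variation can come from. Language models typically score several candidate continuations and then sample one at random, with a "temperature" setting controlling how adventurous the pick is. Everything below (the candidate answers, their scores, the function name) is made up for illustration and is not taken from any model in today's stories.

```python
import math
import random


def sample_answer(scores, temperature, rng):
    """Pick one candidate at random, weighted by softmax(score / temperature)."""
    scaled = [score / temperature for score in scores.values()]
    top = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(list(scores), weights=weights, k=1)[0]


# HYPOTHETICAL scores a model might give to candidate carb estimates.
candidate_scores = {"55 g": 1.0, "120 g": 2.0, "200 g": 1.5, "484 g": 0.5}

rng = random.Random(0)  # fixed seed so the demo itself is repeatable

# Ordinary temperature: every candidate has a real chance of being picked.
warm = [sample_answer(candidate_scores, 1.0, rng) for _ in range(500)]

# Near-zero temperature: the top-scoring candidate wins almost every time.
cold = [sample_answer(candidate_scores, 0.01, rng) for _ in range(500)]

print("Distinct answers at temperature 1.0: ", len(set(warm)))
print("Distinct answers at temperature 0.01:", len(set(cold)))
```

In this toy, turning the temperature down narrows the spread. Whether a given AI product lets you adjust that setting is another matter, which is why repeat testing remains the practical check.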

🧰 Your Toolkit

Decision Framework: Should I Trust What AI Tells Me?

Is this a factual question with a right answer, or a creative task where small differences don't matter?
Have I asked the same question twice to see if I get the same answer, especially for health, money, or safety topics?
Can I quickly verify this with a trusted source like a doctor, official website, or calculator before acting on it?
Am I using AI as a starting point to learn more, or am I treating its answer as the final word?
Does the AI's answer change if I ask the same question in a slightly different way - and does that make me nervous?
Would the consequences of this answer being wrong be minor (like a recipe tweak) or serious (like a medical or financial decision)?

Revisit this framework whenever you're about to make a real-life decision based on an AI answer. The higher the stakes, the more questions you should ask before acting.

The Bottom Line

The Pattern: AI is everywhere now, from your grocery list to military ATVs to your iPhone's price tag. But the more deeply it integrates, the more its inconsistencies, costs, and weird goblin tendencies become impossible to ignore.

Why It Matters: We're past the honeymoon phase. The organizations and individuals who thrive won't be the ones adopting AI fastest; they'll be the ones who understand where it's dependable and where it's still guessing. The carb study alone should make anyone pause before trusting a single AI output with real consequences.

Your Move: Pick one AI tool you rely on regularly and run the same prompt three times. Compare the outputs. If the answers differ meaningfully, you have a first, rough read on how much trust to place in it, and that knowledge is worth more than any feature update.
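
If the tool returns a number (a carb count, a price, an estimate), the comparison can be as simple as the sketch below. The sample values and the 10% tolerance are placeholders we made up, and the function assumes positive numbers; pick a tolerance that matches the stakes.

```python
from statistics import mean


def consistency_report(answers, tolerance_pct=10.0):
    """Summarize how far repeated numeric answers to one prompt drift apart.

    Assumes `answers` is a non-empty list of positive numbers.
    """
    low, high = min(answers), max(answers)
    average = mean(answers)
    spread = high - low
    spread_pct = 100.0 * spread / average  # spread relative to the average answer
    return {
        "min": low,
        "max": high,
        "spread": spread,
        "spread_pct": round(spread_pct, 1),
        "consistent": spread_pct <= tolerance_pct,
    }


# HYPOTHETICAL answers from running the same prompt three times.
report = consistency_report([62.0, 75.0, 140.0])

for name, value in report.items():
    print(f"{name}: {value}")
```

Three runs is a small sample, so treat a "consistent" result as a reason to keep checking rather than proof of reliability.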

What We're Working On

✨ Founding Cohort Special - 60% Off! - Use code MAF20 to join for just $20/month (regularly $50). Get weekly group sessions & workshops, self-paced courses for all levels, access to tools & templates, challenges with peer feedback, and 24/7 support community. → Join Now

✨ Free 30-Minute AI Consultation - Discover how My AI Fluency can help your business unlock the potential of AI. We'll discuss your goals, explore practical AI opportunities for your industry, and outline clear next steps. → Schedule Free Call

✨ How AI-Fluent Are You? - Test your AI fluency with our interactive quiz. See how you stack up and discover what to learn next. → Take the Quiz

Fluently yours, The My AI Fluency Team