Sunday Rundown #143: Computer Use & Wizard Bartender
Your skimmable roundup of last week's AI releases.
Happy Sunday, friends!
Welcome back to the weekly AI news roundup.
In case you missed it, here’s this week’s Thursday deep dive:
If you’re consistently missing out on my emails, remember to check your “Promotions” tab and mark whytryai@substack.com as a “Safe Sender.”
🗞️ AI news
Here’s what happened in AI this week:
👩‍💻 AI releases
Anthropic news:
Claude Opus 4.8 brings incremental upgrades over 4.7 and several new features at the same cost. (But initial reactions are not positive.)
Dynamic Workflows let Claude Code write orchestration scripts to run hundreds of parallel subagents to handle complex tasks.
ElevenLabs news:
Dubbing v2 preserves emotional nuance when translating video content across languages by conditioning it on the original video. (Try for free.)
Music v2 lets you create full songs that can switch genre mid-track with improved instrumentals, multilingual support, and richer vocals. (Try for free.)
Microsoft news:
365 Copilot is now more deeply embedded and loads faster inside Office apps like Excel and Word, so you can invoke it without breaking your flow.
MAI-Image-2.5 is #3 on Arena’s text-to-image leaderboard and delivers sharp text and commercial-quality photography. (Try it free on Arena.)
OpenAI brought Codex computer use and remote control to Windows, so it can operate your desktop apps and let you continue coding sessions from your phone.
Perplexity made Computer available directly in Excel, Outlook, PowerPoint, and Word as native add-ins, so you can use it without leaving your Office apps.
Runway launched an MCP server that lets you create images and videos directly inside ChatGPT, Claude, Cursor, or Replit without leaving the app.
xAI launched Grok Build, its coding agent with automation tools and image/video generation, in early beta for SuperGrok and X Premium+ users.
🔬 AI research
Apple previewed a new Siri for iOS 27 powered by Google Gemini, with improved reasoning, persistent chat history, and a standalone app.
📖 AI resources
“DeepSWE” [BENCHMARK]: New “contamination-free” benchmark that measures coding agents on original, long-horizon software engineering tasks.
“How to evaluate AI agents (2026 edition)” [GUIDE]: Practical guide to testing AI agents in production by Ben Hylak.
🔀 AI random
YouTube is making AI labels more prominent across the platform and adding automatic detection to help spot photorealistic AI content.
🤦‍♂️ AI fail of the week
“Hey, barkeep, got any of them self-filling bottles with self-making cocktails that spontaneously catches on fire?”
“Say less.”
📹 Live & Learn is back
During the next Live & Learn session, I’ll be testing and rating AI logo makers.
Join the session on Tuesday, June 2, at 2 PM CEST (8 AM EDT):
Let me know what would make the session extra useful by taking this quick survey:
(It’s one open-ended question with a few AI-assisted follow-ups.)
Thanks!

