Sunday Rundown #99: Video Galore & Big Head Energy

Sunday Bonus #59: Custom Gemini Gem for finding the best AI model.

Daniel Nest

May 18, 2025


Heads up: The next Sunday Rundown will be in two weeks on Sunday, June 1.

Happy Sunday, friends!

Welcome back to the weekly look at generative AI that covers the following:

Sunday Rundown (free): this week’s AI news + a fun AI fail.
Sunday Bonus (paid): an exclusive segment for my paid subscribers.


Let’s get to it.

🗞️ AI news

Here are this week’s AI developments.

👩‍💻 AI releases

New stuff you can try right now:

Alibaba Wan open-sourced Wan2.1-VACE, a video model that lets you create, edit, and remix video clips using text, images, and videos as inputs.
Audible launched AI narration to help publishers convert their books into audiobooks. (AI translation is also in the works.)
ElevenLabs released a fun tool called SB-1 Infinite Soundboard, which lets you create custom soundboards using its text-to-audio model. (Try it for free.)
Genspark launched a Download Agent that autonomously searches the web for specific types of files—PDFs, images, videos, etc.—and saves them to an AI Drive.
Google rolled out AI-powered accessibility features, including updates to TalkBack (its screen reader), expressive captions, an OCR scanner for PDFs, and more.
LTX Studio released LTXV 13B Distilled, a super-fast video model that can generate high-quality video clips in just 12 seconds.
Manus AI news:
1. The agent can now generate images and combine this image generation with task planning to help you achieve goals.
2. All users now get one free daily task and 1,000 bonus credits upon sign-up, and power users have new subscription options.
Microsoft added a “Hey, Copilot” wake word on Windows, which lets users start Copilot chats using voice. Rolling out to Windows Insiders.
Notion launched AI Meeting Notes to auto-transcribe meetings, summarize key points, and generate action items directly in the app.
OpenAI news:
1. You can now save your Deep Research reports as nicely formatted PDFs with tables, images, linked citations, etc.
2. GPT-4.1 is now available directly in ChatGPT (instead of just via API). It’s especially good at coding tasks and following instructions.
3. Codex is a cloud-based coding agent that autonomously writes features, fixes bugs, proposes pull requests, etc. directly in ChatGPT. (Available to Pro, Enterprise, and Team users, rolling out to Edu and Plus soon.)
Stability AI open-sourced Stable Audio Open Small, a mini text-to-audio model that can run on a smartphone and create stereo clips in under 8 seconds.
Spotify’s AI DJ now lets Premium users request music by genre, mood, artist, etc. using their voice.
Tencent open-sourced HunyuanCustom, a video model that can use multimodal inputs to generate clips with consistent reference subjects. (Try it here.)
TikTok launched an AI Alive feature that lets users turn their photos into short animated video clips.
Vectara launched a Hallucination Corrector tool that flags and fixes AI hallucinations and inaccuracies by comparing AI output to source documents.
Windsurf launched an in-house model family called SWE-1 built to handle the entire software engineering process.

🔬 AI research

Cool stuff you might get to try one day:

Google rumors - there are lots of them ahead of Google I/O 2025:
1. In addition to its existing Audio Overviews, NotebookLM might be getting Video Overviews powered by the company’s Veo model.
2. The company is reportedly planning to announce a Pinterest-like AI feature that shows curated inspiration for fashion and interior design.
3. We might soon see upgraded versions of Google’s image and video models: Imagen 4 and Veo 3, respectively.
Tsinghua University introduced Absolute Zero Reasoner, an AI system that learns by self-generating and solving tasks without external data, achieving SOTA results in math and coding.

📖 AI resources

Helpful AI tools and stuff that teaches you about AI:

“How far can reasoning models scale?” [ARTICLE] - an attempt to establish realistic limits to scaling for reasoners by Epoch AI.
“Safety evaluations hub” [REFERENCE] - safety evaluation results for OpenAI’s models.

🔀 AI random

🤦‍♂️ AI fail of the week

That time GPT-4o turned me into a bobblehead:

Send me your AI fail for a chance to be featured in an upcoming Sunday Rundown.

💰 Sunday Bonus #59: Find the best AI model for a given task (custom Gemini Gem)

Last week, I shared a Custom GPT that helps you pick the right ChatGPT model for your needs.

This week, I thought, “What if that, but for, like…everything?!”

So I made a custom Gemini Gem that helps you identify the best GenAI model for any task—text, image, video, etc.—regardless of provider.

It recommends models based on relevant leaderboard rankings rather than speculation.

Just describe your task, and “Best AI Model Finder” tells you which model to use:

Prompt: “I want to animate a photo I already have”

Best AI Model Finder (custom Gem):

Based on the Artificial Analysis leaderboard for video models with image as input, the top-ranked model is Kling 2.0 by Kuaishou, with an Arena ELO of 1190.

You can verify this information on the leaderboard here: https://artificialanalysis.ai/text-to-video/arena?tab=leaderboard&input=image

I sometimes make mistakes, so please consult the above leaderboard for the most up-to-date ranking.

At the time of writing, this is accurate.

Note: This Gem focuses on ranked core AI models, not third-party tools or wrappers built on top of them.
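For the curious, here's a rough sketch of the "pick the top-ranked model" idea in Python. To be clear, this is not how the Gem works under the hood (as far as I can tell, a Gem is just a set of instructions for Gemini); it only illustrates the logic of trusting a ranking over a hunch. The `LeaderboardEntry` class, the function names, and the two "Placeholder" rows are made up for illustration. Only the Kling 2.0 row and the URL come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardEntry:
    """One row of a leaderboard (hypothetical structure, for illustration)."""
    model: str
    provider: str
    elo: int


def best_model(entries: list[LeaderboardEntry]) -> LeaderboardEntry:
    """Return the entry with the highest ELO score."""
    if not entries:
        raise ValueError("leaderboard is empty")
    return max(entries, key=lambda entry: entry.elo)


def recommend(entries: list[LeaderboardEntry], leaderboard_url: str) -> str:
    """Build a recommendation that always points back to its source."""
    top = best_model(entries)
    return (
        f"The top-ranked model is {top.model} by {top.provider}, "
        f"with an Arena ELO of {top.elo}. Verify here: {leaderboard_url}"
    )


LEADERBOARD_URL = (
    "https://artificialanalysis.ai/text-to-video/arena"
    "?tab=leaderboard&input=image"
)

# Only the Kling 2.0 row reflects the example above.
# The placeholder rows and their scores are invented so there is something to rank.
image_to_video = [
    LeaderboardEntry("Placeholder Model A", "Placeholder Lab", 1100),
    LeaderboardEntry("Kling 2.0", "Kuaishou", 1190),
    LeaderboardEntry("Placeholder Model B", "Placeholder Lab", 1050),
]

print(recommend(image_to_video, LEADERBOARD_URL))
```

The point of returning the link along with the answer is the same as in the Gem: rankings change quickly, so the recommendation should be easy to double-check.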

EDIT: Apparently, Gemini Gems can’t currently be shared with other people, so I made a Custom GPT with the same instructions. It’s slightly less reliable at getting things right the first time, but I find that simply telling it to “Check again” after the first response does the trick. (It will also always provide a relevant leaderboard link that you can double-check if in doubt.)

Why Try AI