Why Try AI

Sunday Rundown #99: Video Galore & Big Head Energy

Sunday Bonus #59: Custom Gemini Gem for finding the best AI model.

Daniel Nest
May 18, 2025
∙ Paid

Heads up: The next Sunday Rundown will be in two weeks on Sunday, June 1.

Happy Sunday, friends!

Welcome back to the weekly look at generative AI that covers the following:

  • Sunday Rundown (free): this week’s AI news + a fun AI fail.

  • Sunday Bonus (paid): an exclusive segment for my paid subscribers.

Every Sunday Bonus in one place

Let’s get to it.

🗞️ AI news

Here are this week’s AI developments.

👩‍💻 AI releases

New stuff you can try right now:

  1. Alibaba Wan open-sourced Wan2.1-VACE, a video model that lets you create, edit, and remix video clips using text, images, and videos as inputs.

  2. Audible launched AI narration to help publishers convert their books into audiobooks. (AI translation is also in the works.)

  3. ElevenLabs released a fun tool called SB-1 Infinite Soundboard, which lets you create custom soundboards using its text-to-audio model. (Try it for free.)

  4. Genspark launched a Download Agent that autonomously searches the web for specific types of files—PDFs, images, videos, etc.—and saves them to an AI Drive.

  5. Google rolled out AI-powered accessibility features like TalkBack (screen reader), expressive captions, OCR scanner for PDFs, and more.

  6. LTX Studio released LTXV 13B Distilled, a super-fast video model that can generate high-quality video clips in just 12 seconds.

  7. Manus AI news:

    1. The agent can now generate images and combine this image generation with task planning to help you achieve goals.

    2. All users now get one free daily task and 1,000 bonus credits upon sign-up, and power users have new subscription options.

  8. Microsoft added a “Hey, Copilot” wake word on Windows, which lets users start Copilot chats using voice. Rolling out to Windows Insiders.

  9. Notion launched AI Meeting Notes to auto-transcribe meetings, summarize key points, and generate action items directly in the app.

  10. OpenAI news:

    1. You can now save your Deep Research reports as nicely formatted PDFs with tables, images, linked citations, etc.

    2. GPT-4.1 is now available directly in ChatGPT (instead of just via API). It’s especially good at coding tasks and following instructions.

    3. Codex is a cloud-based coding agent that autonomously writes features, fixes bugs, proposes pull requests, etc. directly in ChatGPT. (Available to Pro, Enterprise, and Team users, rolling out to Edu and Plus soon.)

  11. Stability AI open-sourced Stable Audio Open Small, a mini text-to-audio model that can run on a smartphone and create stereo clips in under 8 seconds.

  12. Spotify’s AI DJ now lets Premium users request music by genre, mood, artist, etc. using their voice.

  13. Tencent open-sourced HunyuanCustom, a video model that can use multimodal inputs to generate clips with consistent reference subjects. (Try it here.)

  14. TikTok launched an AI Alive feature that lets users turn their photos into short animated video clips.

  15. Vectara launched a Hallucination Corrector tool that flags and fixes AI hallucinations and inaccuracies by comparing AI output to source documents.

  16. Windsurf launched an in-house model family called SWE-1 built to handle the entire software engineering process.


🔬 AI research

Cool stuff you might get to try one day:

  1. Google rumors - there are lots of them ahead of Google I/O 2025:

    1. NotebookLM might get Video Overviews powered by the company’s Veo model, in addition to its existing Audio Overviews.

    2. The company is reportedly planning to announce a Pinterest-like AI feature that shows curated inspiration for fashion and interior design.

    3. We might soon see upgraded versions of Google’s image and video models: Imagen 4 and Veo 3, respectively.

  2. Tsinghua University introduced Absolute Zero Reasoner, an AI system that learns by self-generating and solving tasks without external data, achieving SOTA results in math and coding.


📖 AI resources

Helpful AI tools and stuff that teaches you about AI:

  1. “How far can reasoning models scale?” [ARTICLE] - an attempt to establish realistic limits to scaling for reasoners by Epoch AI.

  2. “Safety evaluations hub” [REFERENCE] - safety evaluation results for OpenAI’s models.


🔀 AI random

Other notable AI stories of the week:

  1. Google news:

    1. The new AI Futures Fund allows startups to get expert support, investment, and early access to AI models like Gemini, Imagen, Veo, etc.

    2. AlphaEvolve is a Gemini-powered coding agent that helps develop better algorithms for math, chip design, and AI training.

🤦‍♂️ AI fail of the week

That time GPT-4o turned me into a bobblehead:

Final version used in this post.

Send me your AI fail for a chance to be featured in an upcoming Sunday Rundown.



💰 Sunday Bonus #59: Find the best AI model for a given task (custom Gemini Gem)

Last week, I shared a Custom GPT that helps you pick the right ChatGPT model for your needs.

This week, I thought, “What if that, but for, like…everything?!”

So I made a custom Gemini Gem that helps you identify the best GenAI model for any task—text, image, video, etc.—regardless of provider.

It recommends models based on relevant leaderboard rankings rather than speculation.

Just describe your task, and “Best AI Model Finder” tells you which model to use:

Prompt: “I want to animate a photo I already have”

Best AI Model Finder (custom Gem): “Based on the Artificial Analysis leaderboard for video models with image as input, the top-ranked model is Kling 2.0 by Kuaishou, with an Arena ELO of 1190. You can verify this information on the leaderboard here: https://artificialanalysis.ai/text-to-video/arena?tab=leaderboard&input=image. I sometimes make mistakes, so please consult the above leaderboard for the most up-to-date ranking.”

At the time of writing, this is accurate.

Note: This Gem focuses on ranked core AI models, not third-party tools or wrappers built on top of them.
