Sunday Rundown #68: AI Video Everywhere & a Head in a Fridge

Sunday Bonus #28: Steering the "Audio Overviews" in NotebookLM

Daniel Nest
Sep 22, 2024 ∙ Paid
Happy Sunday, friends!

Welcome back to the weekly look at generative AI that covers the following:

  • Sunday Rundown (free): this week’s AI news + a fun AI fail.

  • Sunday Bonus (paid): a goodie for my paid subscribers.

All Sunday Bonuses In One Place

Let’s get to it.

🗞️ AI news

Here are this week’s AI developments.

👩‍💻 AI releases

New stuff you can try right now:

  1. Alibaba Cloud unleashed over 100 open-source Qwen 2.5 models, with the flagship Qwen2.5-72B on par with or better than Llama 3.1 405B and GPT-4o on many LLM benchmarks, arguably making it the best open-source model out there.

  2. Amazon released an AI video generator for sellers to use in ad creation and Project Amelia: an all-in-one AI assistant that helps sellers with stats, answers, and suggestions.

  3. Kling AI launched version 1.5 of its video model and a “Motion Brush” tool that lets you better control the action.

  4. Luma Labs released a Dream Machine API to let developers build products using its video model.

  5. Microsoft is rolling out what it calls “the next wave” of Microsoft 365 Copilot with lots of business-oriented AI features and new tools coming to its suite of products.

  6. Snapchat is bringing text-to-video to a subset of creators as a beta test, with plans to make image-to-video available later as well. (I tested 9 image-to-video tools not so long ago.)

  7. Suno now lets you exclude specific styles, instruments, and vocals from generated songs. (Kind of like negative prompts in Midjourney and other image tools.)


🔬 AI research

Cool stuff you might get to try one day:

  1. Runway is slowly starting to roll out access to its Gen-3 Alpha Turbo API to make it easier for developers to integrate the video model into their products.

  2. YouTube is planning to roll out more AI features for creators, including its video model Veo for generating B-roll footage and an AI-powered “brainstorming buddy” that helps you generate video ideas.


📖 AI resources

Helpful stuff that teaches you about AI:

  1. Building OpenAI o1 (Extended Cut) [VIDEO] - a chat with the OpenAI team behind the o1 reasoning model with many curious insights.


🔀 AI random

Other notable AI stories of the week:

  1. Runway announced a partnership with Lionsgate to create a custom AI model based on Lionsgate’s proprietary catalog.

🤦‍♂️ AI fail of the week

“Ah, human heads! Classic Earth delicacy!” (Final version here.)

[Image: cartoon illustration of two aliens seen from behind in their futuristic living room, watching a large TV. On the screen, a man holds a fridge door open, looking dissatisfied and indecisive, while a speech bubble from the TV reads: “...abundance breeds indecision in our modern hunter-gatherer.”]


Anything to share?

Sadly, Substack doesn’t allow free subscribers to comment on posts with paid sections, but I am always open to your feedback. You can message me here:


💰 Sunday Bonus #28: How to steer the “Audio Overviews” in NotebookLM

I mentioned the new “Audio Overviews” feature in NotebookLM in last week’s issue. It automatically generates a short podcast out of any source material you upload.

But that mention didn’t do justice to just how impressive these AI podcasts are.

We’re not talking about a monotone robotic voice giving you a dry summary. These overviews genuinely feel like natural conversations.

The two AI speakers crack jokes, laugh, interrupt, and riff on each other’s statements, stop to catch a breath, occasionally stumble over their words, and so on.

As a taster, here’s a snippet of an audio overview made from the DALL-E 3 research paper. (Listen especially to the first 15 seconds and the part after the 1:15 mark.)

[Audio snippet: 1:42]

As it stands, you have no control over these Audio Overviews beyond uploading your sources and clicking the “Generate” button. The podcast will always have the same two speakers, and the way they tackle the topic is entirely up to NotebookLM.

But after some testing, I found a semi-reliable “hack” to nudge the podcast in the direction you want in terms of structure, topics covered, etc.

Let me show you.

This post is for paid subscribers
