Why Try AI

Sunday Rundown #93: AI Image Boom & Bad Journey

Sunday Bonus #53: My custom "Logo Prompter" GPT.

Daniel Nest
Mar 30, 2025

Happy Sunday, friends!

Welcome back to the weekly look at generative AI that covers the following:

  • Sunday Rundown (free): this week’s AI news + a fun AI fail.

  • Sunday Bonus (paid): an exclusive segment for my paid subscribers.

Every Sunday Bonus in one place

Let’s get to it.

🗞️ AI news

Here are this week’s AI developments.

👩‍💻 AI releases

New stuff you can try right now:

  1. Alibaba’s Qwen team released several new models:

    1. QVQ-Max, a visual reasoning model that can analyze images, solve math problems, and more.

    2. Qwen2.5-VL-32B, a smaller multimodal model that beats larger rivals in visual reasoning and math tasks.

    3. Qwen2.5-Omni, a multimodal model that can see, hear, talk, and write during real-time interactions.

  2. Anthropic introduced a "think" tool that lets developers trigger an additional reasoning step to help Claude better handle specific complex situations.

  3. DeepSeek upgraded its base model to V3-0324, improving its performance on coding and reasoning benchmarks.

  4. ElevenLabs introduced Actor Mode, which lets creators use their own voice to guide the AI speech model’s reading of their script.

  5. Google news:

    1. The new Gemini 2.5 Pro is the company’s most intelligent model to date and outperforms other top reasoning models on most benchmarks. (Try it for free on Google AI Studio.)

    2. Google Meet received nifty updates including AI-generated follow-up items, AI transcripts linked to relevant key video moments, and more.

  6. Ideogram AI released Ideogram 3.0, convincingly outperforming all existing image models in human evaluations. (But it hasn’t yet been benchmarked against this week’s other AI image newcomers, Reve and 4o.)

  7. Luma Labs added image-to-video capabilities to its Ray2 model, so you can animate starting images. (Luma suggests trying sketches and doodles.)

  8. OpenAI news:

    1. The new 4o image generation is a major paradigm shift in text-to-image AI (I explain why here).

    2. GPT-4o got a few under-the-hood updates and is now better at following instructions, being creative, and tackling complex coding tasks.

    3. Advanced Voice Mode in ChatGPT now has a better personality and is less likely to interrupt you during conversation.

  9. Perplexity now has answer tabs that let you filter for images, videos, shopping, jobs, and more.

  10. Pika Labs added a fun effect that lets you record a selfie video with your younger self.

  11. Reve AI launched Reve Image, a SOTA image model that tops leaderboards and is great at following instructions and rendering short text. (Try it for free.)


🔬 AI research

Cool stuff you might get to try one day:

  1. Alibaba introduced LHM, a model that can turn one photo into an animatable 3D avatar in seconds. (Try the demo.)

  2. Anthropic is rumored to soon expand Claude 3.7 Sonnet’s context window from 200K to 500K tokens.

  3. Microsoft is gearing up to launch two AI agents—Researcher and Analyst—to help with work tasks in Microsoft 365 Copilot. (Expected in April.)

  4. Midjourney is finally preparing to release the much-awaited V7 model. (It’s been over a year since V6 came out.)


📖 AI resources

Helpful AI tools and stuff that teaches you about AI:

  1. “Anthropic Economic Index: Insights from Claude 3.7 Sonnet” [REPORT] - the second issue, tracking the impact of the Claude 3.7 Sonnet launch.

  2. “Tracing the thoughts of a large language model” [REPORT] - an exploration of Claude’s inner workings by Anthropic.

  3. “Vibe Coding 101 with Replit” [VIDEO COURSE] - a free course by Replit’s President Michele Catasta and Head of Developer Relations Matt Palmer.

🤦‍♂️ AI fail of the week

Tried the prompt for this 4o cartoon in Midjourney. None of it worked.

Create a single cartoon page divided into four equally-sized sub-cartoons, maintaining a consistent visual style: minimalistic line art, vibrant color palette, expressive characters, and simple, relevant backgrounds. Each sub-cartoon contains clear dialogue in speech bubbles. The tone is subtle, dry humor with a touch of absurdity.

Sub-Cartoon 1: Setup
Setting: Two people sitting on a park bench. One is reading a newspaper, the other has a thick mustache and is looking directly at the viewer with a neutral, almost deadpan expression.
Visual details: Bright park background with trees and a clear sky, minimal but colorful.
Speech bubble: none. The mustached man breaks the fourth wall.

Sub-Cartoon 2: The Line
Setting: Same bench, same characters. The mustached man turns toward the person with the newspaper.
Dialogue: Mustached Man open the mouth: “I have a lot of jokes about unemployment.” Newspaper Reader: (still reading, no speech)
Visual details: Slight movement and shift in posture. The scene remains calm.

Sub-Cartoon 3: The Punchline
Setting: The newspaper reader lowers the paper and looks at the mustached man.
Dialogue: Mustached Man: “None of them work.”
Visual details: The mustached man delivers the line with the same deadpan expression. Reader looks mildly surprised. Simple tree or bird in background for subtle visual rhythm.

Sub-Cartoon 4: Return to Silence
Setting: Same bench. The mustached man turns back to look at the viewer again. The reader resumes reading the newspaper.
Visual details: Identical framing to the first panel. A bird flies past in the background for quiet comedic timing.

Maintain a clear and consistent artistic style across all panels: simple lines, expressive faces, flat shadows, bright and engaging colors. The humor should feel quiet, dry, and cinematic—like a visual dad joke delivered in deadpan style.



💰 Sunday Bonus #53: Create killer logos with my “Logo Prompter” GPT

GPT-4o is now remarkably good at making images.

So good, in fact, that it can reliably translate highly detailed, precise prompts into perfectly matching visuals.

One ideal use case for this? Logo creation.

(Yup, we’ve come a long way from my first-ever Sunday Showdown about logos.)

But creating an effective logo requires a thorough understanding of what it should communicate, your brand guidelines, desired visual style, and many other criteria. That’s not always an obvious process.

Which is why I made the “Logo Prompter” GPT.

It walks you step-by-step through your vision, asks for relevant materials, and then generates a detailed, context-rich logo prompt for 4o.

Note: I originally wanted to create a “Logo Crafter” GPT that would make the logo directly. But for now, custom GPTs still use DALL-E 3 instead of 4o as the image model. So there’s a copy-paste step at the end. (Don’t worry, the Logo Prompter talks you through it.) I’ll upgrade the GPT once 4o image capabilities are available in custom GPTs.

This post is for paid subscribers

© 2025 Daniel Nest