20 Comments
Michael Woudenberg:

The whole Gary Marcus thing cracks me up because, on the one hand, he's constantly grading AI against a standard its makers aren't claiming. He DEMANDS exponential improvement and then laughs when it doesn't happen, while not recognizing that GPT-5 is MUCH better than 3.5, which he also constantly dunked on (while those of us who have been using it have made it work for years now). On the other hand, he constantly warns of AGI, so you'd think he'd be happy that they're struggling instead of egging them on while feeding his own ego.

He used to have good insights… now he's so obnoxious I've had to look away.

Daniel Nest:

I dunno, man.

On the one hand, yes, he deliberately refuses to focus on the real practical benefits of these models, constantly highlighting their many shortcomings instead.

On the other hand, his initial premise that large language models aren't going to get us to AGI (he's a believer in neuro-symbolic AI) still holds up to this day, despite LLMs becoming more powerful. And he's not anti-AGI - he's just allergic to the often-unrealized hype surrounding LLMs specifically, which, given Sam Altman and his hype-train personality, is largely understandable.

Here's a relevant quote from Gary Marcus's latest article:

"The good news here is that science is self-correcting; new approaches will rise again from the ashes. And AGI—hopefully safe, trustworthy AGI– will eventually come. Maybe in the next decade."

At the same time, he dedicated like 4 articles to bashing GPT-5 in a single week, so it's like, chill, dude, you made your point by now!

Linda:

Great!! Thanks for being back!! The new v0.app is worth mentioning!!! 🤗🤗

Daniel Nest:

Thanks, it's good to be back!

As for the v0.dev => v0.app transition, that's on the upcoming Sunday Rundown. My cut-off for the above catch-up post was last Sunday (August 10) - everything launched after that is part of the usual Sunday Rundown.

Have you tried playing with the v0.app yet? What's your take if so?

Linda:

Great (of course you did not miss :-)!! I have not tried it yet, but some web-designer friends were really amazed!! I used v0.dev several times and I always loved this tool... so I am looking forward to tomorrow :-)

Daniel Nest:

Nice! I asked it for a Tetris-like game where you could earn coins, spend them on power-ups, and level up, with the game getting progressively more difficult. It one-shotted a decent game that still needed some work, but I did enjoy watching its agentic thinking process in action. I should try and play around with it some more - too many tools out there, too little time!

Andrew Sniderman 🕷️:

Welcome back! AlphaEarth, whaaaat. I think it's not bad that you missed the hue and cry over 5, it rubbed me wrong tbh

Daniel Nest:

Yeah, it was kind of nice to skip the hype cycle on this one. What exactly rubbed you the wrong way about the GPT-5 saga? Hope you've had a great summer otherwise!

And yeah, AlphaEarth Foundations be crazy!

Andrew Sniderman 🕷️:

There was a trend (it might have just been here, since Chris Best was the first one I saw doing it) of posting and mocking grossly incorrect generated images: maps, diagrams, etc. It's a little bit GPT-5's fault, because it would constantly offer to make a diagram after a prompt reply. It seems to do that a lot less now. Also Sama's fault for hyping it as PhD level. But regardless, I kept reflecting on how far it's come on image generation and how ridiculous it is that it can do it in the first place.

Daniel Nest:

Yeah, the disconnect between hype and reality is all OpenAI's (or rather, Sam's) doing - he's really becoming famous for this.

As for images and diagrams, I don't think those are a fair measure of a model's capabilities. You can absolutely have a smart reasoning model that knows exactly what should go on a diagram but is hampered by the limitations of the underlying image generation, which is simply unable to reproduce it. If I ask GPT-5 for all the letters of the alphabet followed by words starting with those letters, it should get it right 100% of the time. But if it then attempts to make an image of those same letters and words, it'll mess up 100% of the time, too - the underlying autoregressive image model just can't handle that much complexity accurately yet.

That's why, when people share examples of failed ChatGPT images as evidence of the reasoning model being stupid, I always treat those as either people not knowing about the disconnect between the "language" and the "image" parts of the model, or people acting in bad faith/trolling.

Have you tried using GPT-5 for any specific tasks where you could compare it to other models? Any verdicts?

EDIT: Just tried to do exactly what I described above and the results are as expected: https://www.youtube.com/watch?v=9CTKBUgEILU

Andrew Sniderman 🕷️:

Exactly! That's what was going on in my mind; you described it better than I ever could. In my daily usage over the past week or so, I've noticed it's less verbose/sycophantic (good), it decides to switch to reasoning/other models on its own (you might recall from before that I wasn't a fan of the model picker), oh, and it's fast. So fast. I haven't intentionally tried to push it with new or different tasks.

Daniel Nest:

True, when it's not in thinking mode, it's crazy fast. But I am primarily testing GPT-5 in parallel with o3 on some research-heavy stuff - I still have a soft spot for o3 and its tendency to present things in tables (many people mock that, but I mostly find it quite helpful for overview purposes).

So far, GPT-5 reaches largely the same robust conclusions as o3, so I think it's on par, but the jury is still out. Will be curious to hear what kind of things you discover eventually!

Phil Tanny:

"Proton launched Lumo, a privacy-first AI chat that requires no sign-ups, encrypts all chats, and doesn’t maintain logs or chat records."

Software that doesn't require signups!!! Holy mother of god, it's a miracle!!!!

Except of course, clicking the Start Chat button generates an error.

Daniel Nest:

Wait until you hear about duck.ai. Also, z.ai that I mentioned at the end is sign-up free too.

Phil Tanny:

Just tried Duck.ai. Cool, thanks. I like that, unlike ChatGPT, I can actually copy the text displayed in the output.

Z.ai appears dead, in my use anyway. Blank white page, nothing else.

Daniel Nest:

Maybe you should get a new computer/browser/Internet connection, seeing how every site falls for you. 😆 - jokes aside, I just used z.ai myself and it worked perfectly on my end.

https://imgur.com/a/g2iH0nK

As for copying text in ChatGPT, every response has a little "two stacked sheets" icon under it, which does just that.

Phil Tanny:

It's not that every site fails for me - 99.99% of them work just fine. It's only AI sites that can't deliver extremely basic, essential web features that were utterly reliable in 1998.

Apologies, but I have an absurd number of opinions on these topics, because I spent years personally coding systems as complicated as Substack from scratch.

What you see on AI sites is classic novice programmer syndrome. That is, code something fancy and supposedly neato that makes the programmer feel like they are a super cool dude, but then destroys basic functionality. Novice. Programmer. Syndrome.

If the AI features didn't work for me, then I would agree, advanced features may require the latest browsers.

But I'm not talking about advanced AI features, but rather very basic web interface features which were entirely reliable before many Substackers were born.

Sorry, I wasn't clear. I didn't mean the two-stacked-sheets icon, but rather copying text right off the page. Duck.ai does that perfectly fine, proving that it's possible with AI-delivered content.

Here's how I respond to AI sites that can't deliver basic web functionality, and try to tell me I'm the problem. I spend my money elsewhere.

Daniel Nest:

Phil: Writers on Substack are modern luddites clinging to the old ways of life, doomed to fall hopelessly behind. They must put their egos aside, admit that AI is the future, and embrace it fully.

Also Phil: Most AI sites can't deliver basic functionality and don't live up to my quality standards, so I shall stubbornly refuse to use their features if there's even a minor inconvenience or misalignment with my expectations.

I kid, I kid. But only somewhat!

I'm still confused about what you mean by copying text on duck.ai vs. something like ChatGPT. Literally every ChatGPT response can be copied in full by clicking a dedicated button under it. Or you can even click "Edit in Canvas" to have the entire response migrated into it for editing, copying, and any other manipulation. If you can share a video/screenshot of the comparison, I'd be curious to see what you mean!

Phil Tanny:

You've proven my point yourself by generously sharing duck.ai. That platform appears to be delivering OpenAI-generated content WITHOUT any of the problems I've been referencing, which proves those problems are unnecessary. Unnecessary problems that drive away even a small fraction of customers are an example of poor programming.

I would agree the problems I'm pointing to may be affecting only a small percent of users. But for an operation at the enormous scale of ChatGPT, driving away even a small percent of customers costs the company a lot of money. Probably enough to hire 10 new programmers, maybe even ones who know what they're doing.

You’re looking at this issue through the lens of one user complaining. Perhaps this is because, to my knowledge, you’ve never created a platform, managed a platform, or sold a platform. The reason I don’t have to scramble for pennies on Substack is that I have done all that, and so I look at such things through that lens, not as a user, but a former platform owner.

I think what you’re really reacting to is my bombastic confidence. :-) And that’s ok, no problem, I do get carried away, and do agree these issues may not merit the level of attention I’m giving them. The young dudes are supposed to challenge the old dudes, that’s your job! :-)

To address your question:

I was referring to copying small sections of text right off of the ChatGPT output page. I’ll often ask ChatGPT something, and it will give me a big wall of text, and I’ll only be interested in a small part of it. So I’ll try to copy that small section of text off the page. Like you can do on pretty much any other page on the Internet. For the last 30 years.

I agree I'm obsessed, and you're right to remind me of that. I became obsessed with such things from coding my own platforms for years. At this point I should let all of that go, but it seems old habits die hard.
