24 Comments
Oct 6, 2023 · Liked by Daniel Nest

Enjoyed your Discord link to Pika Labs but also see there is a Pica Ai. Are these two connected 🤔 ?

Author · Oct 6, 2023 (edited)

Heya, just looked it up, and nope: It doesn't look like they're related. As far as I can tell, Pica AI is text-to-image only, while Pika Labs is text-to-video. Also, Pica AI's Discord link points to something called Artguru.AI (https://discord.com/invite/hEG4nhFSAf).

Oct 6, 2023 · Liked by Daniel Nest

Thanks for that. I'll take a look.


“Downward Me” gave me a full guffaw!

Author

AI's got jokes!


I've had a little time with DALL-E 3 now, and it's definitely better in some ways and worse in others. Mostly, it's a lot better at understanding stuff.

You're right about prompt engineering being a dying art, although I think we need to start reconsidering where we draw the line between:

1. Normal language/speech, EG telling Siri what you want from the store

2. Prompt engineering (being very specific with the language in order to get a good result)

3. Coding (using specific arcane knowledge to manipulate the code itself)

I see 1, 2, and 3 all sort of starting to blend together. Are we coding if we are composing a cartoon in Dall-E? Kind of, yes, definitely maybe.

Author · Oct 5, 2023 (edited)

My take is that the distinction between 1 and 2 is going to get increasingly irrelevant, as AI gets so good at understanding you that it'll simply "prompt" itself under the hood based on your input. I actually just observed this happen in the ChatGPT Plus version of DALL-E 3 (literally got access to it 20 minutes after posting the article). I gave it the same one-line cartoon description and watched it turn it into four separate but related, more detailed prompts to generate alternative looks.

As for coding, that's still a realm of precision and knowledge of the more technical languages...but for how long? We've seen glimpses of AI helping amateurs code, and it seems highly likely that "coding" as we know it will also become obsolete with time. You'll just be able to tell AI exactly what you need and have the code written, tested, troubleshot, etc. without your intervention.
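For the technically curious: the same under-the-hood rewriting is visible if you call DALL-E 3 through the OpenAI API. Here's a minimal sketch, assuming the official OpenAI Python SDK and an API key in your environment; the prompt is just an illustration, not the cartoon from the article:

```python
# Minimal sketch: DALL-E 3 expanding a short prompt on its own.
# Assumes the OpenAI Python SDK (pip install openai) and OPENAI_API_KEY set.
from openai import OpenAI

client = OpenAI()

short_prompt = "A one-panel cartoon of two squirrels arguing over an acorn"

result = client.images.generate(
    model="dall-e-3",
    prompt=short_prompt,
    n=1,                 # dall-e-3 returns one image per request
    size="1024x1024",
)

image = result.data[0]
print("Prompt you sent:", short_prompt)
print("Prompt it used: ", image.revised_prompt)  # the model's expanded version
print("Image URL:      ", image.url)
```

Run the same short prompt a few times and you'll get several different expanded prompts back, which is roughly what the ChatGPT Plus version is doing when it serves up four alternative looks.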


I love that the 3 are sort of blending seamlessly.

Well, maybe not seamlessly, but we're making a lot of progress this year.

Author

Definitely. I see us moving firmly in the direction of the final level of abstraction between human and machine being broken. In the future, you'll just communicate with software the way you would with another human and have predictable results without learning any additional languages or interfaces.


I think it will go beyond the way we communicate with one another, or at least the way we have communicated up until now. I think we're going to bypass human language eventually.

Author

Oh yeah. In the even-longer term, definitely!

At the risk of sounding like the broken record that I actually am, Tim Urban's Neuralink article talks at length about the inefficiency of language at carrying meaning and how brain-to-brain interface is vastly superior: https://waitbutwhy.com/2017/04/neuralink.html

We can certainly imagine a future where language itself is seen as an outdated mode of communication.


Ok, I'm now in Bing Dalle. Thanks again for the education Daniel!

The cartoon process thing didn't work for me for some reason. I got images that were related, but not per the instructions. Squirrels discussing acorns became cartoon people discussing acorns.

However, when I forgot about cartoons and just made up my own prompt, it worked BLEEEPING perfectly. This solves the biggest problem I was having with Stable Diffusion, so I'm delighted. I can now describe and render two hippies at a time, and specify what each character is wearing etc. I've only done one prompt batch so far, but it was very encouraging. I had to stop though so I could come back here and report to all that you are a very helpful genius.

I'll keep happily experimenting with more complex scenes, and see what else I can learn.

Author

Awesome to hear you've found your way around DALL-E 3 so quickly. That's the beauty of it: You no longer have to stick to a prescriptive process from me or anyone else. It's about finding your own way to express what you want and counting on DALL-E 3 to understand you, which, for the most part, it does very well.

If you stumble upon some curious insights, feel free to share them!


A few quick reports....

1) Using Dalle I can create a character, and maintain that character reliably through a series of different prompt sessions. This is HUGE for what I'm doing. Stable Diffusion is much less reliable in this regard. If I want to create 100 different images of the same character doing all kinds of different things, it appears now I can. BIG DEAL!!

2) So far, the quality of the images is superb for my taste. Couldn't be happier.

3) DALLE doesn't seem to do transparent backgrounds very well (it tries), or solid green backgrounds for green screen background removal in my video editor, but I don't really care too much as removing backgrounds is easily accomplished in free online tools (one scripted alternative is sketched after this list).

4) I don't see choices for using different models as I have in SD, but I probably need to address this in the prompts I'm entering. Just a different way of working?

5) So far it looks like all output is in a square format, which I can't change. I can work with this. Or maybe I need to learn more...

I can easily be doing some things wrong, so the above report could change at any time.

PS: Whoopee!
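An aside on point 3: if you'd rather script the background removal than jump into an online tool, the open-source rembg library handles it in a few lines. A rough sketch with placeholder file names; rembg is a suggestion here, not one of the tools mentioned above:

```python
# Sketch: removing a background locally instead of via an online tool.
# Assumes `pip install rembg pillow`; file names are placeholders.
from PIL import Image
from rembg import remove

original = Image.open("dalle_character.png")     # image saved from DALL-E
cutout = remove(original)                        # RGBA image with the background made transparent
cutout.save("dalle_character_transparent.png")   # PNG keeps the alpha channel
```

The resulting PNG can usually be dropped straight into a video editor without any green-screen keying.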

Author · Oct 5, 2023 (edited)

Hey Phil,

Really happy it fits the bill for you, and thanks for reporting your observations!

1) That's awesome to hear! You can definitely get a consistent character out of Stable Diffusion, but it requires training a custom model on a range of images of that specific person/character. Having DALL-E do this without any additional work is indeed massive!

2) Nice! I just got access to DALL-E 3 inside ChatGPT Plus, and it appears the quality is even slightly better than the Bing version, but I haven't done enough thorough testing.

3) Interesting. I'll try it in the ChatGPT version and see if I have more luck. But like you said, that's easily done in post-editing. Also, Adobe Express does this automatically as I described here: https://www.whytryai.com/p/free-online-image-tools

4) With Stable Diffusion being open source, there are dozens (if not hundreds) of spin-off models that third-party developers have trained for different purposes (e.g. anime, photorealistic, etc.). DALL-E 3 is a closed model, so there ARE no additional models: DALL-E 3 is just DALL-E 3, just as Midjourney is purely Midjourney (although you can select previous versions of Midjourney if you want). But you can describe the styles and effects you want to get the look you need (e.g. oil painting, pencil sketch, 3D model, etc.)

5) That's correct for the Bing version, as I also highlighted a few days ago: https://www.whytryai.com/p/10x-ai-21-meta-ai-chatgpt-upgrades-pika-labs

But I'm happy to report that the ChatGPT Plus version lets you specify three aspect ratios: tall, square, and wide. Here's a "wide" version of the "I've got baggage" cartoon: https://i.imgur.com/hPDoyT7.png
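For anyone using DALL-E 3 through the API rather than ChatGPT Plus, those same three shapes map onto the size parameter. Another small sketch, same SDK and API-key assumptions as the earlier example, with a stand-in prompt:

```python
# Sketch: the ChatGPT Plus "tall / square / wide" options expressed as
# dall-e-3 size values in the OpenAI API.
from openai import OpenAI

client = OpenAI()

sizes = {"tall": "1024x1792", "square": "1024x1024", "wide": "1792x1024"}

for label, size in sizes.items():
    result = client.images.generate(
        model="dall-e-3",
        prompt="A one-panel cartoon of a suitcase in a therapist's office",
        size=size,
        n=1,
    )
    print(f"{label:>6}: {result.data[0].url}")
```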

Oct 5, 2023 (edited) · Liked by Daniel Nest

Hey Daniel, thanks for your report as well. I'm looking forward to your report on Dalle + ChatGPT.

Bing/Dalle did go down a few times on me, but only momentarily. It finally said goodbye, citing too much traffic on the server. But by the time that happened, I'd already gotten enough work done to keep me busy in the video editor.

I've got a new music video in the works, but was having trouble coming up with the main hippy character in SD to sing the song. Dalle nailed it instantly, first prompt I tried.

https://imgbox.com/cNbZsw2T

If you should ever have any characters you want to talk or sing, give me a shout, happy to help.

Author

Thanks Phil, I'll definitely keep it in mind!


Daniel, are you having any trouble with Bing/Dalle? It worked great at first, but now most of the time when I push the Create button, nothing happens. I may be out of credits or "boosts", whatever those are. But I don't see any info anywhere telling me what to do in that regard. Could be another case of user error? Puzzled...

Author · Oct 8, 2023 (edited)

There are rumors that the daily 100 credits have been reduced to 50 (or 25) per week:

https://www.reddit.com/r/bing/comments/17062jq/how_do_bing_image_creators_tokensboosts_work/

If you've hit that limit, I'm not sure there's anything you can do except wait until the credits reset, unfortunately. Or you could upgrade to ChatGPT Plus, which now has DALL-E 3 as well.


Thanks Daniel. Ok, now I know it's not me. Upgrading sounds like a plan. I'll hang back for now until I read your review. Have a good one!


Can you follow that process on a Mac? An iPad? Chromebook?

Author

It's all browser-based, so I don't see why not. Have you tried?

Oct 10, 2023 · Liked by Daniel Nest

No, and I don't think I will, although I'm tempted to learn. Just curious about how this new tool develops and how people are using it. I leave it to the pros to produce while learning, reporting, etc... I'm just a reader, and I'm pleasantly surprised by the results. And the examples were truly funny. Thanks for opening your window to the curious world.

Author

You got it!

If you do end up trying it for yourself, I'd love to hear about your experience.
