What We're Thinking About Re: ChatGPT Can "See"

How new ChatGPT features might bring back more hallucination concerns; using AI to help you make AI Automations

Welcome to another edition of the best damn newsletter in AI. Here we’ll break down AI topics that matter, open your mind to use cases, and keep you ahead of the curve.

Our #1 goal is to be useful. So please shoot us an email 📩 if you have questions or feedback, and especially if you implement something we share!

Here's what we're covering today:

  • ChatGPT will now be able to see, hear and speak back to you - what to watch out for

  • Using AI to help you make AI Automations (meta, we know)

  • Image generation models are in the news again + more

... and if someone forwarded this email to you, thank them 😉, and subscribe here!

Let’s get to it! 👇


ChatGPT goes multi-modal (& here’s what we’re thinking about)

The AI craze feels like it’s back - Microsoft announced Copilot is going mainstream and won’t just be a business tool; Google has teased Gemini, which they say will be on par with GPT-4…

And now ChatGPT is officially multi-modal.

It can hear you (well, it already could).

It can see what you send it (new!).

And it can speak back (new!).

To be mind-blown by the demos, read the full announcement from OpenAI.

But that’s not the only thing we’re paying attention to.

We haven’t seen specific studies on this (if you have, send them our way!). But it will be interesting to see whether the added modalities and data help or harm the hallucination problem.

In our work, we’ve seen that it’s harder to catch hallucinations when you give GPT more information.

It’s easy to think: “I sent it the whole article, why would it make something up?”

Or now: “I sent it that photo, why would it make something up?”

But remember, AI models like ChatGPT are still just predicting the next most likely word, sound, or image based on their training data and what you gave them.

By themselves, they don’t have a sense of what a “lie” is. And certainly not our cute term for it, a “hallucination”.

Hallucinations with text seem to be coming somewhat under control, partly because people increasingly know not to trust everything that comes out of the little chat box.

But with images and audio in the mix, it will be interesting to see how that changes our expectations.


Goodbye to Regex struggles (+ an AI automations tip)

If you’ve gone deep into automations before, you might be all too familiar with regex (aka regular expressions). If you haven’t, regex is basically an advanced way to search and filter text to extract the pattern you’re looking for.
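To make that concrete, here’s a tiny (made-up) example of regex doing what it’s good at - pulling a pattern out of messy text. The phone-number format is just our illustration:

```python
import re

# A regex that matches US-style phone numbers like "(555) 123-4567":
# \( \)   - literal parentheses
# \d{3}   - exactly three digits (area code), captured as a group
pattern = re.compile(r"\((\d{3})\) (\d{3})-(\d{4})")

text = "Call us at (555) 123-4567 or (555) 765-4321."
matches = pattern.findall(text)
print(matches)  # [('555', '123', '4567'), ('555', '765', '4321')]
```

Even this simple pattern is fiddly to write by hand - which is exactly the point of the next tip.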

Ok but that looks hard and annoying.

We know. We honestly hate writing regex.

But you know what’s great at regex? ChatGPT.

Enter in our Regex Writer prompt that we use. Simply:

  1. Give it your example input text

  2. Give it an example output

  3. Explain how the pattern typically looks (we’ve found this extra context gets noticeably better results, so we include it too)
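The three steps above might produce something like this. The order-ID format, the prompt wording, and the regex "ChatGPT returned" are all our hypothetical illustration - your inputs and outputs will differ:

```python
import re

# Step 1: example input text we'd paste into the prompt
example_input = "Order #A-1042 shipped on 2024-03-18"

# Step 2: the output we want extracted
example_output = "A-1042"

# Step 3: the pattern description we'd add, e.g. "an uppercase letter,
# a hyphen, then 3-5 digits, always preceded by a '#'".

# A regex ChatGPT might plausibly return for that prompt (our guess,
# not an official output) - and the important part: verify it yourself
# on your example before wiring it into an automation.
pattern = re.compile(r"#([A-Z]-\d{3,5})")

match = pattern.search(example_input)
assert match is not None
print(match.group(1))  # A-1042
```

The verification step at the end is the habit worth keeping: ChatGPT is great at regex, but you still want to run its answer against your own examples.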

👋 If you’re totally new to regular expressions, automations, and prompts like the one above 👆️, but want to learn to delegate your work to AI: hundreds of students have gone through our first course, Prompting for AI Operations, and it’ll likely be a prerequisite for our upcoming courses on automations and other fundamental AI skills. Shoot us an email if you’ve got questions!


For your reading list 📚

On the tension between AI and people’s jobs…

AI image models get a boost…

Leading industry perspectives…

  • Kevin Systrom, co-founder of Artifact and Instagram, shares his optimism about AI's potential and dismisses AI doomerism.

  • Anthropic's CEO believes there are no limits to AI development, challenging the notion of AI's boundaries.

And if you're really nerdy...

That's all!

We'll see you again on Thursday. Thoughts, feedback and questions are much appreciated - respond here or shoot us a note at [email protected].

... and if someone forwarded this email to you, thank them 😉, and subscribe here!


🪄 The AI Exchange Team