Fine Tuning GPT-3.5: Is it actually magic?
Our learnings from a small experiment; a pretty crazy tool that does AI generated code from designs
Welcome to another edition of the best damn newsletter in AI. Here we’ll break down AI topics that matter, open your mind to use cases, and keep you ahead of the curve.
Our #1 goal is to be useful. So please shoot us an email 📩 if you have questions or feedback, and especially if you implement something we share!
Here's what we're covering today:
A practical perspective on OpenAI’s new fine tuning capabilities
You can generate WHAT now? This is huge for engineers and founders
Some startup and investor drama has our interest piqued
Let’s get to it! 👇
Is fine tuning magic?
OpenAI had a big announcement on Tuesday: you can now fine-tune GPT-3.5 with your own data. Many people rushed to call this 'game changing'.
Rachel’s talked about this many times - fine tuning (so far) hasn’t been the magic sauce that everyone expects.
But we keep an open mind! So with the big news, we wanted to test for ourselves.
What did we find? Well let's dig in.
First, what is fine tuning? Why are people excited?
Large language models learn patterns of language based on the trillions of example texts that it’s given. And that initial training, creates what we call a “base model”.
After you have that base model, you can give it more specific examples of language to further refine the model and have it take on certain behaviors, or get good at certain tasks.
Fine tuning is when instead of using prompts to teach the model what you’re looking for, you give it a dataset of examples that it can learn from.
Take ChatGPT for instance. ChatGPT has been fine tuned on chat conversations, hence why it's good at chat!
The promise of fine tuning on our own data is that we get to teach ChatGPT a new set of skills, a new way to talk, new behaviors, etc.
What did OpenAI release?
AI nerds and businesses alike have been eagerly awaiting the chance to fine tune the GPT-3.5 model. Previously, fine tuning was limited to OpenAI's older models.
But with this new rollout, we can now apply that same process of teaching the model our specific task, behavior or data - to one of the best.
We put it to the test
We needed to start simple to get this research out to you asap. So we took an AI task we already have working for our business 100% based on prompts, and tested converting it to fine tuning.
Here's what we found:
Were the results better? We didn't notice any groundbreaking improvements. We'd have to do a much larger scale test to observe a slight performance improvement.
Was it easier?: Yes, and no. We have improved our prompt for this task over time, and while feeding in tons of examples is less intellectually challenging - getting all of those examples in the first place also takes time.
What about potential cost or speed improvements?: Our fine tuning prompt is about 1/10th the size of our normal prompt. Shorter prompts run faster, so this definitely could help with speed improvements if that were a concern. But given that fine tuning is about 8x as expensive as the normal model, the cost savings aren't material (yet).
So is Fine Tuning a bust?
Not necessarily. The huge caveat here is that we intentionally approached this project as an AI Operations person should - what's the best way to solve the business problem the fastest.
You could easily look at this project like a more traditional machine learning project - designing a very high quality dataset, building meticulous prompts even for your fine tune, and iterating and testing many variations.
We believe we could get material performance gains if we invested the time.
But the reality is that fine tuning takes time. Even this simple example took us about an hour to prepare the dataset and run the experiment.
Iteration on fine-tuned models will be much slower than just tweaking a prompt.
And it’s still not clear if the payoff is worth it.
OpenAI has hinted that fine tuning for GPT-4 is on the horizon, and we can't wait to see what it brings. Maybe it'll be the game-changer we're hoping for.
p.s. the reason we were able to run this experiment so fast is because of the principles we teach in our Prompting for AI Ops course on how to delegate your work to AI. We opened up registration to all members this week - and will be opening up broader soon!
Ok but this might be magic (AI writes code from designs)
This is a nerdy tool for any of the developers or startup founders out there.
But guys - we found an AI tool that can actually take designs in Figma, and convert them into ready-to-use code (React, with HTML/CSS or Tailwind). Check it out: https://kombai.com/
For the non-developers reading this. Here’s why you should care.
Writing code, building products and creating software is getting easier and easier. This will mean more people can build their ideas, and more excitingly - there will be better and better products to choose from. Maybe AI isn’t eating software; instead it’s birthing more of it??
(sorry that was weird)
For your reading list 📚
And if you’re really nerdy…
One of Cohere’s initial investors is looking to sell their stake. Could be because the investor isn’t doing so hot (Tiger Global, for those who track the venture capital world) or other reasons…
MEMBER ANNOUNCEMENTS 📆
🔥 member sessions have been popping off. If you’re looking to learn from and connect with peers on their AI adoption, these are for you 👇️
AI & Hollywood with John, Wednesday August 30 @ 9am PT
Marketing Industry Insights with Charlie, Friday September 1 @ 11am PT
Register for upcoming member sessions here, and reach out to us if you’d like to host or co-host one of your own!
We'll see you again on Tuesday. Thoughts, feedback and questions are much appreciated - respond here or shoot us a note at [email protected].
🪄 The AI Exchange Team