Written: February 22, 2024
Edited: February 25, 2024
Note: If you ever get stuck while implementing the directions in this article, you can refer to this finished Kaggle notebook and this finished Hugging Face space for a solution.
There's something magical that happens when you manage to get code to do what you want. The itch is almost universal among hackers and coders, and it's why they keep coming back again and again.
When I first got addicted to programming, I loved the fact I could write code that would do exactly what I wanted it to even while I was asleep. For instance, you're reading this and I'm doing something completely different. I love that.
With the recent boom in artificial intelligence, we can take this once step further and create code that will predict, classify, and generate content for us.
A great course you can use to learn about this is one by FastAI called Practical Deep Learning for Coders. I'll be summarizing the first few chapters in a condensed format here that should get you up off the ground with your first model trained and deployed for use in no time.
We'll be building an image classifier in this article, which means you can upload an image to it and it will predict whether it's a certain animal, shape, car, or whatever you decide to build.
I'd encourage you to think about what you want to make. Do you want to make an image classifier for dogs and cats? Perhaps you want to make one for different types of food?
"It is difficult to think of a major industry that AI will not transform. This includes healthcare, education, transportation, retail, communications, and agriculture. There are surprisingly clear paths for AI to make a big difference in all of these industries."
- Andrew Ng
The first thing we need to do is sign up for a Kaggle account so that you can have a server and temporary access to GPUs for training your model.
Kaggle is a site that, among other things, allows people like you to participate in competitions for data science/building models; it also will allow you to run code in "notebooks" (more on this in a second) and train models on its site using GPU servers.
Go ahead and head over to kaggle.com to create a free account. Also, make sure to go verify your account via phone before you're done, or else they won't let you connect your notebook to the internet (which is kind of crucial).
I'll be here when you get back.
Ready?
Awesome. Next, go ahead and hit the Create
button (which you'll see on the lefthand side if you go to your profile) to create a new notebook. Once you do, you'll see a blank notebook page that looks like this:
Remove that first block of code by highlighting it and using the ✂️ in the top bar. Then, go ahead and give it a new title like "My First AI Model" or "First AI Model" for now.
Okay, let's take a break for a second and talk through what we've done. First off, what is a "notebook" and why do we need it?
What I'm calling a "notebook" is really a Jupyter Notebook. What are those you might ask? Jupyter Notebooks are web-based development environments for data science and machine learning.
In other words, they're the environment we're going to run code in to create our AI models!
Now that you have a new notebook, go ahead and write the following code blocks in it (using the + Code
button when hovering in the main page):
!pip install fastbook
import fastbook fastbook.setup_book()
This is what it should look like:
Each code block should have a ▶️ play button on the lefthand side if you click on the code block itself. Go ahead and press it for each of these.
You'll see a lot of text spit out for the first code block, but it's just indicative that it's installing Python modules for us to use. Fastbook
is a dependency that speeds up our development by quick-installing a lot of the dependencies we'll need. The second code block is initial setup.
Next, add (and run) another code block with the following Python code:
from fastbook import * from fastai.vision.widgets import *
fastai
is a package developed by the company FastAI as a wrapper around PyTorch with lots of sane defaults so that you and I can get to hacking.
This is what it should look like once you're done:
Good job 🔥
Now it's time to download some real data so that we can train our model!
"I imagine a world in which AI is going to make us work more productively, live longer, and have cleaner energy."
- Fei-Fei Li
So far, we've only really been concerned with setup. But in this section, we're going to download the data we'll need to train our AI model!
Remember earlier when I encouraged you to think of what kind of image classifier you were wanting to train? Well, it's time to use what you've decided on and jump in with both feet.
Go ahead and write the following code in a new code block (replacing "dogs"
with whatever you want your model to predict):
results = search_images_ddg("dogs") len(results)
It should look like this:
Once you run that, it should say something like 200
underneath your code block. This is indicative of the number of results that were found matching your search!
Next, go ahead and add the following two code blocks and run them both:
destination = 'images/dogs.jpg' download_url(results[0], destination)
image = Image.open(destination) image.to_thumb(128,128)
It should look like this:
Nice. This is always a good idea to do as we're effectively sampling the image data to make sure we're pulling in something that we'd expect from our search query term.
Next, we need to setup all our queries. These will be other categories for images to fall into. I'm going to search for "dogs"
and "cats"
as well to keep with the whole animal theme:
animal_types = "dogs", "cats" path = Path("animals")
This declares a list of annimals (or whatever you decide to train your model on) that our image classifier will place images into.
Finally, we've defined an animals
path which will be our root directory where a separate folder for each animal_types
string will be created in to download its associated images into.
Next, go ahead and make and run this code block (replacing "animals"
with whatever the parent category is for the type of images you're downloading):
if not path.exists(): path.mkdir() for o in animal_types: dest = (path/o) dest.mkdir(exist_ok=True) results = search_images_ddg(o) download_images(dest, urls=results)
This will take quite a bit of time to run. You're using DuckDuckGo to download images. You can actually see them downloading in real time if you open up the right sidebar (with the arrow on the bottom-right part of your screen) and then dig down to the files highlighted here in red:
Once this code block finishes running, you now have your dataset for training your model!
Before using this data, we should remove any errored image paths that failed to download.
Go ahead and paste this code into a code block and then run it:
fns = get_image_files(path) failed = verify_images(fns) len(failed)
Here's what it looked like for me when I ran it:
Go ahead and remove those by putting this code into a code block and running it:
Note: This will unlink the errored images the first time you run it. If you run it a second time, it will error out as there aren't any image paths to remove anymore.
failed.map(Path.unlink);
Next, we need to sample the data we've downloaded to make sure it's valid. In order to do that, we need to setup up some data loaders to manipulate our images.
Go ahead and jump into the next section to do that when you're ready.
“Some people worry that artificial intelligence will make us feel inferior, but then, anybody in his right mind should have an inferiority complex every time he looks at a flower.”
- Alan Kay
In order to manipulate our data, we're going to use a class exported from fastai
called a Dataloader
.
Go ahead and create (and, like always, run) this code block:
animals = DataBlock( blocks=(ImageBlock, CategoryBlock), get_items=get_image_files, splitter=RandomSplitter(valid_pct=0.2, seed=42), get_y=parent_label, item_tfms=RandomResizedCrop(224, min_scale=0.5), batch_tfms=aug_transforms(), )
Okay. What's going on here?
Good question.
First, we have this line:
blocks=(ImageBlock, CategoryBlock)
This is telling the DataBlocks
exactly what kinds of variables we're using - images are the "independent variable" (images don't change) and the categories ("dogs" and "cats", in my case) are the dependent variable (they change depending on the type of image fed in).
Next, we have this line that will pull our downloaded images in for us:
get_items=get_image_files
After that, we have a line that appears to be splitting something:
splitter=RandomSplitter(valid_pct=0.2, seed=42)
This is a random generator (with a consistent seed) to split our data. The reason we need to split our data is to get a training and validation set. What are those? Great question.
The training set is (as the name says) what's used to train our model. The validation (or testing) set of data is what's used to validate that the model's results are consistent and correct.
For the get_y
value, this line merely applies a label to the Y
axis:
get_y=parent_label
The final two lines of code arguments go hand-in-hand:
item_tfms=RandomResizedCrop(224, min_scale=0.5), batch_tfms=aug_transforms()
The first one preps the images so they're the same size, the the second applies data augmentation to the images so that we have some variety (as life has variety, and so our data set should reflect partial/varietal images of the categories we're training the model about).
After we're done setting up this DataBlock
class, we can create a dataloader
to use for actual training:
dls = animals.dataloaders(path)
As always, go ahead and run the code for the DataBlock
class and your dataloaders
function call.
Here's what this will looks like once you're done:
Note: If you see an error while running this (like in my image below) about "'has_mps' is deprecated", disregard it.
Believe it or not, the hard part is done now. From here, we get to train our model!
Let's get the party started 🥳
"We can build a much brighter future where humans are relieved of menial work using AI capabilities."
- Andrew Ng
In order to train our model, go ahead and put this code block in and run it (I'll explain what it's doing while it's running since it's going to take a while):
learn = vision_learner(dls, resnet18, metrics=error_rate) learn.fine_tune(4)
When you run this, it's going to kick off actually training your model.
First, we're defining a vision_learner
(makes sense for an image classifier, right?) that takes in your new dataloader
that you created in the last section.
It's also using a pre-trained model, resnet18
, to speed up the process of training. Using pre-trained models as a jumping off point is an excellent strategy and one widely used in the industry.
Finally, we're telling the vision_learner
how to define metrics for the training session. In our case, we're specifically using error_rate
. We want to use the rate of errors of misclassified images to define our metrics.
After we set up our vision_learner
, we will go ahead and call fine_tune
on it and run it for 4
epochs. An epoch is an iteration round of training the model, analyzing the error rate, and learning from it.
Let's let this run for a bit. You should go make some coffee or tea. It could be a few minutes.
Done?
Okay. This is what it should look similar to (again, ignore the errors here if you have them):
Training loss (train_loss
) is how well the model is fitting the training data, and validation loss (valid_loss
) is an indication of how well the model fits the new data (from our DataBlock
s class with the random splitter on the data). Here's a pretty good breakdown on this topic; additionally, I'd highly recommend starting the FastAI course I mentioned at the start of this article.
For now, it's time to export our model and fire it up for use!
"AI will probably most likely lead to the end of the world, but in the meantime, there'll be great companies."
- Sam Altman
It turns out that it's actually quite easy to export your model (called a .pkl
file).
Add this code block to your notebook:
learn.export()
Run that, and you're ready to download the file!
Check out the files in the sidebar again. You should see an export.pkl
file like this:
If you hover over that, you should be able to click on three dots that open a menu where you can download the file.
Congrats. You've just exported a full AI model. Let's go ahead and figure out how to run it!
"AI is neither good nor evil. It's a tool. It's a technology for us to use."
- Oren Etzioni
The final piece of the puzzle is to create a Hugging Face account and host your model there so you can interact with it (using the Hugging Face site).
First, go ahead and sign up for free account at Hugging Face.
Once you're done, create a new "Space" (for running your model) by going to the "Spaces" page and clicking "Create new Space" which should look like this:
Got ahead and give your Space a name and use a license like MIT for it:
Select "Gradio" as the Space SDK and leave the other settings as-is before clicking "Create Space"!
Next, go ahead and clone down the Hugging Face repo you just created. The HTTPS or SSH cloning commands will be listed inside the page.
Once you've done that, add these files to your repo:
# In an app.py file import gradio as gr from fastai.vision.all import * learn = load_learner("export.pkl") # Update these to whatever your categories were for training categories = ("cats", "dogs") def classify_image(img): """Takes in an image and classifies it using a model""" pred, idx, probs = learn.predict(img) return dict(zip(categories, map(float, probs))) image = gr.Image() label = gr.Label() demo = gr.Interface(fn=classify_image, inputs=image, outputs=label) demo.launch(inline=False)
# In a requirements.txt file fastai
Commit them and then push them up to the Space. Once it's done, refresh the Hugging Face Space; you should see something like this:
Once it's done, the application will say that it's starting. This can take quite a bit of time, so be patient. Free tools aren't always the fastest.
However, you'll eventually see something much like this:
Oh no! What happened?
Don't worry. This was part of my plan. You see, trying to upload your export.pkl
file with Git would have thrown an error becuase of how large the file is.
You should head to the "Files" tab and then choose the "Add file" and "Upload files" option here:
On the next page, you'll be able to directly upload and then commit your export.pkl
file. Do that, and then wait. Once again, you'll be presented with a building page for your Hugging Face Space.
Once it's done, you should be presented with a beautiful UI for interacting with your model!
Go ahead and upload an image for whatever category you trained your model on. I'm going to upload a dog picture. Check out this prediction:
NOTE: If you see the wrong predictions consistently and the model is very confident (e.g. 100% confident in it being a cat when it's a dog photo), it's likely that the order of
categories
in yourapp.py
file is off compared to the order of probabilities (where you're zipping up probabilities against categories in yourdict(zip(categories, map(float, probs)))
line of code). Go back and fiddle with this (deploying iteratively to your Hugging Face space) until it's right. Contact me if you can't figure this out - I promise I'll respond.
It even works for my own dog, Aspen (she does have some slight cat-like qualities like chasing lasers, so this checks out):
Great job! You should be able to use your model now. Go back some steps and try experimenting with different models, uploading them here to interact with them once you're done training.
The sky's the limit for you now.
There you go. You've trained a new model to do what you want and learned how to deploy and interact with it.
From here, the sky is the limit. I'd highly recommend you go check out FastAI's Practical Deep Learning for Coders course if you want to learn more. This article covers roughly the first two lectures.
Feel free to shoot me a message on the contact form on this website with whatever you've created! I'd love to see it.
As always, thanks for reading.
Nathan ☕