How to deploy an ML model to Amazon SageMaker with a custom Docker container

If you have trained a model, if you now want to deploy it to a SageMaker endpoint, and if your first instinct is to throw your laptop out the window, please calm down and keep reading.

I have been there too. And it took me a long while to get my head above water.

Truth is, AWS documentation is a rabbit hole of mostly useless, outdated, and repetitive pages pointing at each other. Once you're in the maze, you don't get out.

Of course, there are blog posts some kind souls have put together over the years to help navigate the jungle. I was one of those souls and did my homework for the community here and here already. Still, nothing truly cut it: too many options, and documentation that is too confusing.

That’s when I decided to double down on the one solution that seemed worth investing time in.

One strategy to master them all.

Deploying to Amazon SageMaker via a custom Docker container.

Docker is the only technology that gives you full end-to-end control of the solution: the stack you want to use, the software to run, where to put the model and how to load it to run inference. It is entirely testable locally before you push it to the SageMaker black box. Because let’s admit it, AWS errors are close to gibberish. The pipeline will just throw a “Something went wrong” kind of exception. Good luck Googling that. Docker lets you minimize surprises when moving from your laptop to the cloud. It was built with reproducibility and platform agnosticism in mind, after all.

But… Docker is an “advanced” skill set, isn't it? Its use cases are always hidden somewhere in a hard-to-reach section of the docs, given that, well, it’s hard, so not for everyone, right? It is arguably not very user-friendly, but the reward is well worth the effort!

Once I understood how things were chained together between Docker and the SageMaker inference ecosystem, a whole new set of opportunities opened up.

No more weird Python SDKs hiding things from you to make things “easier”. They work half the time; the other half ends with you banging your head against a wall over incomprehensible, impossible-to-debug errors.

I hear you. “Say no more! Just help us get out of this nightmare!”

Sure thing, here's how:

  1. We’ll start by training a binary semantic segmentation model in fastai to detect human faces: a standard UNet trained on 1000 images from Microsoft’s FaceSynthetics dataset (see the training sketch right after this list). We'll do this on a GPU-powered EC2 instance on AWS by hooking up our local VSCode IDE via SSH. Essentially, everything is gonna happen in a Jupyter notebook, with the notebook running in the cloud. Sweet.
  2. Then we'll put together the inference script and test it locally, without any Docker involved (the serve-script sketch below shows its rough shape).
  3. Once we get it right, we'll move on to the Dockerfile and the Docker image wrapping up all the inference logic, making sure the serve script works with Docker too (see the Dockerfile sketch below). We'll explore two routes: bundling the model inside the container, and the AWS-recommended approach of uploading the artifacts to S3 and letting SageMaker unzip them into the endpoint.
  4. We can finally move to the cloud! This is when we pack everything up and deploy the model to SageMaker. We'll do that on a CPU-based real-time endpoint, using both the high-level Python SDK and the low-level boto3 client API (a boto3 sketch follows below).
  5. Enjoy!
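
To make step 1 concrete, here's a minimal sketch of what the fastai training code can look like. The dataset layout, the mask-naming convention, and the hyperparameters are illustrative assumptions, not the exact code from the video:

    # Sketch of step 1: binary segmentation UNet in fastai (paths assumed)
    from fastai.vision.all import *

    path = Path("face_synthetics")  # assumed layout: images/ and masks/ folders

    def label_func(img_path):
        # Map each image to its mask file (assumed naming convention)
        return path / "masks" / f"{img_path.stem}_seg.png"

    dls = SegmentationDataLoaders.from_label_func(
        path,
        bs=8,
        fnames=get_image_files(path / "images"),
        label_func=label_func,
        codes=["background", "face"],  # binary segmentation
    )

    learn = unet_learner(dls, resnet34)  # standard UNet on a ResNet34 backbone
    learn.fine_tune(5)
    learn.export("model.pkl")  # serialized learner we'll load at inference time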
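
For steps 2 and 3, the contract SageMaker has with a custom container is simple: it starts the container with the argument serve, probes GET /ping for health checks, and sends requests to POST /invocations on port 8080. Here's a hedged sketch of such a server; Flask is just one possible choice, and the paths are assumptions:

    # Sketch of the inference server (serve.py). Flask is an assumption;
    # any HTTP server answering /ping and /invocations on port 8080 works.
    from fastai.vision.all import load_learner
    from flask import Flask, Response, request

    app = Flask(__name__)
    # SageMaker unpacks model.tar.gz into /opt/ml/model at boot time;
    # if the model is baked into the image, copy it to the same path.
    learn = load_learner("/opt/ml/model/model.pkl")

    @app.route("/ping", methods=["GET"])
    def ping():
        # Any 200 response tells SageMaker the container is healthy
        return Response(status=200)

    @app.route("/invocations", methods=["POST"])
    def invocations():
        # Run inference on the raw image bytes in the request body
        _, _, probs = learn.predict(request.data)
        return Response(probs.numpy().tobytes(), mimetype="application/octet-stream")

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)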
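
The Dockerfile from step 3 can then be as small as this (base image, package versions, and paths are assumptions):

    # Sketch of the Dockerfile (versions and paths assumed)
    FROM python:3.10-slim

    RUN pip install --no-cache-dir fastai flask

    COPY serve.py /opt/program/serve.py

    # Route A: bake the model into the image at the path SageMaker uses.
    # Route B (AWS-recommended): drop this COPY and let SageMaker unpack
    # model.tar.gz from S3 into /opt/ml/model when the endpoint boots.
    COPY model.pkl /opt/ml/model/model.pkl

    EXPOSE 8080

    # SageMaker runs the image as `docker run <image> serve`; with an
    # exec-form ENTRYPOINT the trailing `serve` arrives as an argument
    # that serve.py can simply ignore.
    ENTRYPOINT ["python", "/opt/program/serve.py"]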
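
And for step 4, the low-level boto3 route boils down to three API calls. Every name, ARN, and URI below is a placeholder:

    # Sketch of step 4 with the low-level boto3 client (all names are placeholders)
    import boto3

    sm = boto3.client("sagemaker")

    sm.create_model(
        ModelName="face-segmentation",
        ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerRole",
        PrimaryContainer={
            "Image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/face-seg:latest",
            # Omit ModelDataUrl if the model is baked into the image (route A)
            "ModelDataUrl": "s3://my-bucket/models/model.tar.gz",
        },
    )

    sm.create_endpoint_config(
        EndpointConfigName="face-segmentation-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": "face-segmentation",
            "InstanceType": "ml.m5.large",  # CPU-based real-time endpoint
            "InitialInstanceCount": 1,
        }],
    )

    sm.create_endpoint(
        EndpointName="face-segmentation",
        EndpointConfigName="face-segmentation-config",
    )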

The code is shared via a dedicated GitHub repo you can clone to reproduce everything we go through.

Who should buy this product and who should not?

This course is for engineers who struggle to wrap their heads around deploying ML models to Amazon SageMaker. You either have no clue how to do it, or you tried and dropped it out of frustration, or you tried and succeeded but don’t really know why it worked. You have all the pieces of the puzzle but can’t quite put them together. This course is going to teach you one thing, and one thing only: getting your model behind a SageMaker endpoint with Docker.

I am talking to folks who are already familiar with Python, Deep Learning, and Docker. I won’t get into the details of any of those technologies, even though I will explain step by step what it is we are doing and why. If you are a complete newbie in any of the above, I don’t think this course is for you.

As for Docker, I highly recommend starting here: a fantastic 2-hour YouTube video that opened my eyes three years ago when I started my Docker journey.

As for Deep Learning, as always, go for the fastai courses and don’t look back. Jeremy Howard is the best in town, hands down.

I won’t cover much of AWS’s inner workings either; I’ll leave that for a future video if you feel it’s needed. This means I won’t explain in detail how S3 or IAM roles work. We’ll use those services and I’ll make sure to cover the basics, but nothing more.

Hope you guys enjoy this course as much as I enjoyed making it!

You'll get a 2-hour video tutorial in which I walk you through training a semantic segmentation model in fastai to recognize human faces and then deploying it to Amazon SageMaker with a custom Docker container. We'll use both the high-level SageMaker Python SDK and the low-level boto3 client API. We'll also see how to add the model artifacts directly inside the Docker image, or let SageMaker download them from S3 when spinning up the container at endpoint boot time.
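
To give you a taste: once the endpoint is InService, invoking it from Python takes a couple of lines (the endpoint name and content type below are assumptions):

    # Sketch: calling the deployed endpoint (names and content type assumed)
    import boto3

    runtime = boto3.client("sagemaker-runtime")

    with open("face.jpg", "rb") as f:
        response = runtime.invoke_endpoint(
            EndpointName="face-segmentation",
            ContentType="application/x-image",
            Body=f.read(),
        )

    mask_bytes = response["Body"].read()  # raw bytes returned by /invocations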

Size: 577 MB
Duration: 123 minutes
Resolution: 1824 x 1080 px