Hello there! Thank you for coming to my blog. This is a fascinating topic I have been wanting to try for a long time. In this post, we will go step by step through how to use a cloud service to store our machine learning models, and how to use that service from anywhere in the world to make predictions.
If this is your first visit to my blog: the purpose of this series of articles is to create a machine learning model that predicts how long an average visitor will wait in line at a Disney or Universal park before riding their favorite rollercoaster, and to use that information to optimize the visit by scheduling each park at the time of day when it is less crowded.
If you wish, you can watch my video tutorial on YouTube, but this article digs deeper.
Why is this a surprise topic, and why is it relevant?
In school, we learn the math and the background of how the algorithms work, and how we can combine and improve them. That is an essential part of AI, but personally, I am convinced that sooner or later your life will be easier if you understand how to use the cloud platforms to your advantage.
There is a very interesting book I recommend to readers, even if you are just beginning with ML or data science: “Machine Learning Design Patterns” by Sara Robinson et al. In chapter 3, the book discusses some problems you might face in industry. One important problem, for example:
You have developed a model to predict X, and it becomes so popular that it needs to serve millions of requests per minute, or even per second, 24/7. Is your machine able to handle this?
So far we have been using Jupyter notebooks and Anaconda to train and predict. Technically speaking, we have already done some cloud development when using Google Colab, but that is like cheating. Google, Microsoft, and Amazon (there are more, those are just my favorites) have invested millions of dollars in creating cloud-based environments for others to use. One key aspect of their services is scalability: in other words, the ability of a service to grow and handle more requests on demand.
If this term is new to you, let me give an example; otherwise, please go directly to the next paragraph. Let's say you are creating an ML model that predicts which items a customer will be most inclined to buy, based on the clicks he or she makes on a web page and the time spent on it. On a normal business day, the model works fine with some amount of calls, say X. But then Black Friday arrives and the number of calls to your model increases from X to 73X. Scalability is the ability of the platform to provide more resources to your process and handle more calls on demand. In this case, if our model was deployed in a single container, imagine the platform creating 50 more containers to run our model: to receive the calls, process the information, and return the predictions. This is done automatically, with minimal human intervention. And of course, it is not profitable to keep those 50 extra containers running once they are no longer needed, so we need to kill them and come back to only one or two.

I have introduced the concept of a container, so let me explain it. A container is a lightweight, isolated environment that runs on a server, similar in spirit to a virtual machine; the magic of containers is that you can configure exactly which resources each one gets. For example, your container can have only 8 GB of RAM running on Linux, or you can configure it to have 8 TB of storage and 32 GB of RAM running on Windows. There is software called “orchestrators” whose job is to create or delete containers according to the service's demand. And thankfully, Google takes care of this automatically.
Cloud-based services give us the opportunity to:
Handle millions of calls on demand
Have parallel services
Have 24/7 availability of your service
Store your information with redundancy
Query your model from anywhere in the world
Now that the importance of the cloud is a little clearer, let me introduce what we will be doing next: using GCP (Google Cloud Platform), we will upload an ML model, save it, and then send data and ask for predictions, everything in a serverless environment. So let's get our hands dirty.
Prediction Service on Google Cloud Platform
Before we jump directly into the code, we have to configure GCP and create a user.
The financial limitations of experimenting with the cloud
First of all, let me tell you that cloud services are never free. You will have some free trials, but to access them you will always need to give your credit card information to the service provider. Before you attempt to recreate what is presented in this blog, please be sure you are comfortable with this process. Another point I need to clarify is that I hereby decline any responsibility for any charges made to your account. Now, don't be too worried about this, because if you are a new Google user, as of the time I am writing this article, Google gives you 300 USD to spend over a period of 90 days on cloud services however you want, free of charge. But be careful, because any process that keeps running after that trial period will be automatically charged to your credit card. So make sure to keep these points in mind:
Keep in mind that if you create a new Google account, you will have 3 months to spend 300 USD of credit on the Google Cloud Platform.
You will have to give your credit card information to Google.
Make sure that once you finish your exercise, you either turn all your services off or configure your account so that nothing is charged to your card.
Let's start getting familiar with the GCP platform. All your documents and models, together with the general data, will be stored inside what Google calls a project. By default, when you first log into your account, Google will create a project for you called “My First Project”. I will leave the project's name as is, but you can change it by clicking the main menu (the one with 3 horizontal lines) and selecting IAM & Admin.
Getting familiar with GCP
When you first open the platform and log in, you will find a screen similar to this one:
Now, in order to store our data and our models, we need to use the Google Cloud Storage service, which we can access by clicking through the following path:
All our data will be stored in one or in several buckets, so now we need to create a new bucket.
You will need to name your bucket. It is pretty straightforward; just make sure to use only lowercase letters, numbers, hyphens, or underscores. Personally, I gave it a simple name and left the other parameters at their defaults. Click on Create.
Use the UPLOAD button to upload your model files and datasets to the bucket.
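If you prefer to do this step from Python instead of the console, a minimal sketch using the google-cloud-storage library (installed with pip install google-cloud-storage) could look like the following; the bucket and file names here are only examples, so replace them with your own:

from google.cloud import storage

client = storage.Client()                           # uses your default GCP credentials
bucket = client.bucket("my-disney-models-bucket")   # example bucket name
blob = bucket.blob("my_model/saved_model.pb")       # destination path inside the bucket
blob.upload_from_filename("saved_model.pb")         # local file to upload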
Now we need to tell the ML Engine which models and versions we want to use; for this we will use the Models tab of the AI Platform. To find it, open the main menu, click on MORE PRODUCTS, and scroll all the way down to the ARTIFICIAL INTELLIGENCE section, where you will find the AI Platform option.
Now click on Enable API -> Create Model, configure the region of your preference, then click on Create Model one more time.
IMPORTANT: On this screen there is a checkbox that says “regional endpoint”; this has to do with how the query travels and where the information is stored inside the Google platform. If you check this box, your model will be stored on a regional endpoint and the API call will be different. I spent 2 days struggling with this because the code I am using is not configured for a regional endpoint, so at this step, please uncheck this box.
Give your model a name and a quick description. Here you can also enable access to your model through the command line, but we will not be using the command line. Finally, click on Create.
Now let's create a version for our new model by selecting the model we just created, clicking on the 3 dots at the right of the screen to display the options, and then selecting Create version.
Now let's keep configuring the model's settings together with its environment. First, let's give the version a name and a short description. I chose to use the default container, as I am not yet able to provide the full configuration for a tailored container, and honestly, for my exercise no customization is necessary.
Now let's tell the server the container settings. For this post, I know I am using Python 3.7.
You can check the framework version using the command line or, as I will show you, using the Anaconda interface. But before that, I need to tell you that the Model URL is just the path where your .pb model is stored. This path is the one you created inside your bucket, and it is linked automatically, so you just need to follow your folder structure and select the folder where the model is stored; for example, if your SavedModel file sits at gs://my-bucket/my_model/saved_model.pb, the Model URL would point to gs://my-bucket/my_model/ (the bucket and folder names here are only an illustration).
In my case I know I am working with Python 3.7, but you can check it using commands at the prompt or directly in the notebook. To figure out your TensorFlow version you can also use commands, or use the Anaconda interface like this:
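If you would rather check both versions directly from the notebook, a couple of lines like these will do:

import sys
import tensorflow as tf

print(sys.version)       # Python version, e.g. 3.7.x
print(tf.__version__)    # TensorFlow version used to train and export the model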
And congratulations! Now we have configured our server to store and use our first machine learning model!
Predicting Service
So far we have uploaded our model and configured the service to autoscale. Now we need to actually consume that prediction service. As users, we need to request access to the Google platform in order to use the model and receive predictions, and there are two possible ways to do this.
As users, we can access the platform in two different ways: using the Google credentials we created, or using a service account. They both have their pros and cons, so let's discuss them a little.
If we hand our own credentials to several users, they will be able to help us update, modify, and upgrade our models and the general services, but there will be no control over who can do what, since they will all be super administrators of our project; generally speaking, this is not the best idea. Instead, we can create tailor-made profiles for each user that grant access only to specific parts of the platform and of our project. This way, each user has access only to the portion they will actually be using and not to the whole project. This is the “service account”, and we will be creating one for this purpose.
To create the service account, let’s click the main menu, place our cursor over IAM & Admin, and then click on Service Accounts. Once inside let’s click on CREATE SERVICE ACCOUNT
Now we will give the account a name and a brief description.
Technically speaking, we can give this user specific permissions to use only some of the services and some of our infrastructure. In this case I will not be limiting the service account, so I gave it the Owner role, with access to everything. Inside a company, however, you can limit the permissions by selecting a role such as ML Engine Developer.
Once you click on Done, you will have created the service account, but there are still several things we need to discuss to properly understand the cloud service. Google handles the security and privacy of all your data and models, which is why, in order to make a query, the user must have a token that grants access. To gain access to our service we will create a key, stored inside a JSON file. Please keep this JSON file in a safe location that is well known to you.
To create the key click the three dots at the top right of the service account you want to use. And then click on Manage Keys.
Once there, you will be presented with this screen; click on the ADD KEY dropdown menu and then on Create new key.
This part is extremely simple: just select the JSON option and click on Create. The file will be created and downloaded automatically to your local machine, inside your Downloads folder or wherever your default download location is.
As mentioned before, please keep that file safe. For this exercise I will store it inside the folder where my models and my notebook live, so my script will have no problem finding the credential.
Querying the model for predictions using Python and Jupyter
So far we have had a lot of fun creating and configuring the cloud environment, but now we need a script that will query our service and ask for a prediction. For this, as the subtitle says, we will be using Jupyter notebooks and Python 3.7.
I want to point out that my article is based on my favorite ML book, “Hands-on Machine Learning…” by Aurelien Geron; specifically, this part can be found in chapter 19. Now you may ask yourself: if there is a book, why should I read your post? Well, you should do both. The book explains the process very well, but the example is quite generic, and to create the model and the script you would have to run the whole code from the complete chapter. Here, on the other hand, I will focus only on the most relevant part of the code and show you how to adapt it to your needs.
So let's get our hands dirty with some code. The first part is to use our credentials to access the platform; for this we will use the JSON file, also known as the key. Personally, I saved this file in the same folder where I have my notebook, all my models, and my data. Of course, you will not see this file in the Git repository; please be careful when you handle yours.
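In the notebook, this step boils down to pointing the Google client libraries at the key. A minimal sketch (the file name below is just an example; use the name of your own key) looks like this:

import os

# Tell Google's client libraries where to find the service account key
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "my_service_account_key.json"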
The code is extremely simple; just make sure to change the string between quotation marks to the name of your own JSON file. Now that our script has the key to authenticate with the Google cloud service, let's move on.
pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
Before we continue, please make sure to install the Google API client libraries shown above; they handle the connection and the authentication. Now let's set some variables:
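In outline, and using hypothetical names for the project and the model, the variables look something like this:

import googleapiclient.discovery

project_id = "my-first-project-123456"   # example; use your own project ID
model_id = "my_model"                    # example; the name you gave your model on AI Platform
model_path = "projects/{}/models/{}".format(project_id, model_id)   # append "/versions/..." to target a specific version
ml_resource = googleapiclient.discovery.build("ml", "v1").projects()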
The project_id is the identifier of your project; Google creates this name by default and you can modify it only once. To check your project ID, you can go to your project as shown in the picture below:
The model_id is simply the name of your model.
The model_path and the ml_resource lines can be copied as they are; as of the time I am writing this post, that is the correct entry point for our process. Next, let's create a function that takes a single input for our model, asks the service for a prediction, and returns that prediction:
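A sketch of that function, following the same pattern as the book, could look like the one below; I am assuming the model was exported with the default serving signature, and that each element of the response is a dictionary keyed by the model's output name:

def predict(X):
    # X is assumed to be a NumPy array of already scaled feature values;
    # adapt the .tolist() line if you use a DataFrame or a plain list instead
    input_data_json = {"signature_name": "serving_default",
                       "instances": X.tolist()}
    request = ml_resource.predict(name=model_path, body=input_data_json)
    response = request.execute()
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["predictions"]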
The piece you need to be careful with is the line that builds the instances for the request: it depends on whether you are using a DataFrame, a list, or some other type of array. The variable X in this case contains the values of all our features, already normalized or standardized, so be careful, because that variable will be turned into JSON and sent via the Google API.
The rest of the code simply uses Google's client functions to ask the online service for the prediction and wait for an answer. And finally, the moment you were all waiting for: the actual query to the service.
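Put together, the call itself is just a couple of lines; the feature values below are made up purely for illustration, since your own model will expect its own number of (already scaled) features:

import numpy as np

X_new = np.array([[0.5, 0.2, 0.8, 0.1]])   # made-up, already-scaled feature values
print(predict(X_new))                      # one prediction per instance (here, the class probabilities)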
The first line holds the data that will be fed to the online model; the second calls our prediction function, which transforms the data into JSON, sends it, gets the JSON response back from the cloud service, and finally prints out the result. Let's see what my result looks like:
Victory!!!!!
We have queried our online model successfully. Here we can see the 3 probabilities that our multiclass classification neural network predicts. In this case, since there is an 83% probability that the result belongs to class 2, our waiting time is between 30 minutes and 1 hour, or in other words… “let's go ride the Harry Potter rollercoaster!”
Please feel free to take a look at my full code here in my google colab notebook:
https://colab.research.google.com/drive/14tXTyOlbcqw2DLtUi1KLYTAWYLTwK0kQ?usp=sharing
I will be recording a video and uploading it to YouTube shortly, to show you how to do this extraordinary process step by step, from top to bottom. I really hope this post was useful for you; I had a hard but very rewarding time writing it. See you next time, and keep in touch, because we will be using recurrent neural networks, GANs, and transfer learning very, very soon.