This lab is provided as part of AWS Innovate Data And AI/ML Edition.
âšī¸ You will run this lab in your own AWS account. Please follow directions at the end of the lab to remove resources to avoid future costs.
- Introduction
- Sentiment Analysis
- SageMaker Studio
- Setup
- SageMaker Studio UI Overview
- JupyterLab Notebook
- Conclusion
- Cleanup
- Survey
Amazon SageMaker is a fully managed service that removes the heavy lifting from each step of the machine learning workflow, and provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. The Amazon SageMaker BlazingText algorithm provides highly optimized implementations of the Word2vec and text classification algorithms. The Word2vec algorithm is useful for many downstream natural language processing (NLP) tasks, such as sentiment analysis, named entity recognition, machine translation, etc. Text classification is an important task for applications that perform web searches, information retrieval, ranking, and document classification. Learn more about Amazon BlazingText algorithm here.
Sentiment Analysis is an application of NLP. Teaching a machine to analyse a sentiment is a classification problem. Here the ML model undergoes a supervised learning where the input data has features that teach the model to predict the label(e.g. Positive, Negative, Neutral).
There are various use cases of Sentiment Analysis wherein businesses can gather the public sentiment on a product that was released recently.
In this lab, we will be using the SageMaker Studio. Amazon SageMaker Studio provides a single, web-based visual interface where you can perform all ML development steps. SageMaker Studio gives you complete access, control, and visibility into each step required to build, train, and deploy models.
All ML development activities including notebooks, experiment management, automatic model creation, debugging, and model and data drift detection can be performed within SageMaker Studio.
To know more. See here.
- Navigate to the Amazon SageMaker service in the AWS console.
- The SageMaker console is displayed as below. Click on the SageMaker Studio button. Before using the SageMaker studio, we will first need a SageMaker Domain. If you have a SageMaker Domain created already, you could use the same. Else a SageMaker domain will have to be created.
- To create a new SageMaker Domain enter a Name
aiml-innovate-2023
in the Name section,data-scientist-1
in the User Profile section and then Create a new role. - For this lab, we will allow the new execution role access to Any S3 bucket. Once selected, click on Create Role.
- Once back on the Create Domain page, Click on Submit button to create the Domain. The domain creation takes a few minutes.
- Once the domain is created, Click on the Launch Studio to Launch the SageMaker Studio. This takes a couple of minutes before the IDE is launched.
The SageMaker Studio UI has a few sections. To explore further, click on the below links:
- On SageMaker Studio, click on Git option on the Left Sidebar. Then click on the Clone a Repository.
- Enter the git repo URL as
https://github.com/phonghuule/sm-sentimentanalysis-bt.git
and Click on Clone. - Click on the File Browser option on the Left Sidebar. Open the
sentiment-analysis-blazingText.ipynb
at the locationsm-sentimentanalysis-bt/notebook
. - Select Image as
Data Science
, Kernel asPython 3
, Instance Type asml.t3.medium
and click on Select. - Run all cells as per instructions in the notebook.
In this lab we
- Used trained a machine learning learning model to perform a multi-label text classification using Supervised training.
- understood the nuances of the Input/Output interfaces to the BlazingText algorithm
- deployed the ML model using SageMaker Hosting services and invoked the Endpoint to understand the sentiment of the sentences.
sentences = [
"I love this product. I would order it again",
"I am not sure if I like this product. I probably do."
]
[
{
"label": [
"__label__3"
],
"prob": [
0.9555298686027527
]
},
{
"label": [
"__label__2"
],
"prob": [
0.9004751443862915
]
}
]
In the above example:
I love this product. I would order it again
evaluated to a __label__3
or a Positive sentiment with a 95.55% probability
and
I am not sure if I like this product. I probably do.
evaluated to a __label__2
or a Neutral sentiment with a 90.04% probability..
Follow the steps below to perform a cleanup and prevent accumulating usage charges from this lab.
-
On the Left Sidebar, click on Instances and Shutdown the Running Apps as below.
-
Then click File > Shut Down to shutdown the SageMaker Studio server and then click on Shutdown All. This will display Server stopped message.
-
After this, click on the domain
aiml-innovate-2023
, user profiledata-scienctist-1
and then delete all apps. At this there should only be thedefault
app remaining.
-
After the apps have been deleted, click on Edit on the Details section within the User Details screen.
-
Then click on Delete User to delete the user
data-scientist-1
.
Note: If there are other users please delete these to be able to delete the SageMaker Domain. -
Once back on the SageMaker Studio screen, click on Delete Domain to delete the domain created to host the SageMaker Studio.
-
Navigate to Amazon Sagemaker - Ineference - Endpoints to delete Endpoint created.
-
Navigate to Amazon Sagemaker - Ineference - Endpoint configurations to delete the configuration created.