Coder Social home page Coder Social logo

dsc-introduction-to-aws-sagemaker-atlanta-ds-062419's Introduction

Introduction to Amazon SageMaker

Introduction

In this lesson, we'll learn about Amazon SageMaker, and explore some of the common use cases it covers for data scientists.

Objectives

  • List the use cases of Amazon SageMaker

What is SageMaker?

SageMaker is a platform created by Amazon to centralize all the various services related to Data Science and Machine Learning. If you're a data scientist working on AWS, chances are that you'll be spending most (if not all) of your time in SageMaker getting things done. You can get to SageMaker by just searching for "SageMaker" inside the spotlight search bar in the AWS Console.

SageMaker Use Cases

When you visit the page for SageMaker, you'll notice that the following graphic highlighting the various use cases SageMaker can help with:

You'll also notice these same categories on the sidebar on the left side of the screen, with more detailed links to services that fall under each category:

Here's a brief explanation of what each of these service areas are used for in a professional data science setting.

Ground Truth

One of the hardest, most expensive, and most tedious parts of data science is getting the labels needed for supervised learning projects. For projects inside companies, it's quite common to start by gathering the proprietary data needed in order to train a model that can answer the business question and/or provide the service your company needs. One of the major use cases SageMaker provides is a well-structured way to manage data labeling projects. SageMaker GroundTruth allows you to manage private teams, in case your information is sensitive, or to manage public teams by leveraging AWS Mechanical Turk, which crowdsources labels from an army of public contractors that have signed up and are paid by the label.

Recently, Amazon launched an automated labeling service that makes use of machine learning models to generate labels in a human-in-the-loop format, where only labels that are above a particular confidence threshold (which you set yourself) are auto-generated by the model. This allows your contractors to focus only on the tough examples, and saves you from having to pay as much for labels for the easy examples which a model can handle.

Notebooks

These are exactly what they sound like -- cloud-based jupyter notebooks, a data scientist's 'bread and butter'! SageMaker notebooks are just like regular jupyter notebooks, with a bit more added functionality. For instance, it's quite easy to choose from a bunch of pre-configured kernels to select which version of Python/TensorFlow/etc. you want to use. You can start a notebook from scratch inside SageMaker and do all of your work in the cloud, or you can upload preexisting notebooks into SageMaker, allowing you to do you work on a local machine and move it over to the cloud when you're ready for training!

We strongly recommend you take a minute to poke around inside a SageMaker notebook to get a feel for what it looks like and what it can do. They're pretty amazing!

Training

SageMaker's training services allow you to easily leverage cloud computing with AWS's specialized GPU and TPU servers, allowing you to train massive models that simply wouldn't be possible on a local machine. There are a ton of configuration options, and you can easily set budgets, limits, training times, and even auto-tune your hyperparameters! Although this is outside the scope of our lessons on AWS, Amazon provides some pretty amazing (and fast!) tutorials about how to use more specific services like cloud training or model tuning once you've completed this section!

Inference

Arguably the most important part of the data science pipeline, Inference services focus on allowing you to create endpoints so that people can consume your models over the internet! One of the most handy parts of SageMaker's approach to inference is the fact that you can productionize your own model, or just use one of theirs! While there are certainly times where you'll need to create, train, and host your own model, AWS has made things simple by allowing you to use their own models and charging you on a per-use basis. For instance, let's say that you needed to make some time series forecasts. While you could go down the very complicated route of training your own model, you could also just make use of AWS SageMaker's DeepAR model, which uses the most cutting-edge time series model available to make forecasts on your data.

Summary

In this lesson, we learned about Amazon SageMaker, and explored some of the common use cases it covers for data scientists!

dsc-introduction-to-aws-sagemaker-atlanta-ds-062419's People

Contributors

mike-kane avatar sumedh10 avatar

Watchers

James Cloos avatar Kevin McAlear avatar  avatar Mohawk Greene avatar Victoria Thevenot avatar Yoan Ante avatar Belinda Black avatar Bernard Mordan avatar raza jafri avatar  avatar Joe Cardarelli avatar The Learn Team avatar Sophie DeBenedetto avatar  avatar Antoin avatar Alex Griffith avatar  avatar Amanda D'Avria avatar  avatar Nicole Kroese  avatar Kaeland Chatman avatar Lisa Jiang avatar Vicki Aubin avatar Maxwell Benton avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.