This is not a Google product.
This is the code accompanying this blog post, which shows how to build an AdaNet model with AutoEnsembleEstimator and TF Hub and train it on Cloud ML Engine.
See the blog post for details, and follow along below for a quick guide to training on ML Engine.
For this to work, you'll need to create an account and a project on Google Cloud Platform, enable billing, and enable the necessary APIs. Steps 1-4 of this quickstart explain how to do that.
You'll use gcloud to kick off and manage training jobs for your model. If you don't have it, install it here.
You'll use a Cloud Storage bucket to store all of the checkpoints for your model, along with the final model export. Follow this guide to create one.
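If you prefer the command line to the console, the bucket can also be created with gsutil, which ships with the Cloud SDK. The following is a minimal sketch, not part of the original guide: the function name ensure_bucket is hypothetical, and the bucket name and region are placeholders you would replace with your own.

```shell
# Hypothetical helper: create a regional bucket only if it does not exist yet.
ensure_bucket() {
  bucket="$1"   # placeholder, e.g. your-bucket-name
  region="$2"   # placeholder, e.g. your_cloud_project_region
  # "gsutil ls -b" lists the bucket itself and fails if it is missing or inaccessible.
  if gsutil ls -b "gs://${bucket}" >/dev/null 2>&1; then
    echo "gs://${bucket} already exists"
  else
    gsutil mb -l "$region" "gs://${bucket}"
  fi
}
```

You would call it as ensure_bucket your-bucket-name your_cloud_project_region. Using the same region for the bucket and the training job is generally a sensible choice.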
Time to start your training. Open your terminal and make sure gcloud is set to the project you created for this tutorial:
gcloud config set project your-project-name
Define the following environment variables:
export JOB_ID=unique_job_name
export JOB_DIR=gs://your/gcs/bucket/path
export PACKAGE_PATH=trainer/
export MODULE=trainer.model
export REGION=your_cloud_project_region
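Job IDs must be unique within a project, so reusing the same JOB_ID for a second run will fail. One way to avoid that is to append a timestamp, as in the sketch below. The prefix adanet_hub is a hypothetical name chosen for illustration; any prefix made of letters, digits, and underscores that starts with a letter should work.

```shell
# Hypothetical prefix; replace with any name you like.
JOB_PREFIX="adanet_hub"
# Append the current UTC date and time so each submission gets a fresh ID.
export JOB_ID="${JOB_PREFIX}_$(date -u +%Y%m%d_%H%M%S)"
echo "$JOB_ID"
```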
From the root directory of this repo, run the following command:
gcloud ml-engine jobs submit training $JOB_ID --package-path $PACKAGE_PATH --module-name $MODULE --job-dir $JOB_DIR --region $REGION --runtime-version "1.12" --python-version 3.5 --config config.yaml
Navigate to the ML Engine UI in your cloud console to monitor the progress of your job.
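The same information is available from the terminal. As a rough sketch, the helper below prints the job's current state and then streams its logs; the function name watch_job is hypothetical, while the describe and stream-logs subcommands are part of gcloud ml-engine jobs.

```shell
# Hypothetical helper: show the job state, then follow its logs.
watch_job() {
  job_id="$1"
  # Print only the state field of the job description.
  gcloud ml-engine jobs describe "$job_id" --format="value(state)"
  # Stream log lines from the job to the terminal.
  gcloud ml-engine jobs stream-logs "$job_id"
}
```

With the environment variables from above set, you would run watch_job $JOB_ID.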
You can also visualize metrics for your training job with TensorBoard. If you have TensorFlow installed locally, TensorBoard is already included. Run the following command to start it:
tensorboard --logdir=$JOB_DIR
To view it, navigate to localhost:6006 in your browser.
Once you've trained your model, it'll export the latest checkpoint to the Cloud Storage bucket path you specified. To quickly test out your model for prediction, you can use the local predict method via gcloud. Just create a newline-delimited JSON file with your test instances in the format your model is expecting. An example file for this model is in trainer/test-instances.json. Then run:
gcloud ml-engine local predict --model-dir=gs://path/to/saved_model.pb --json-instances=path/to/test.json
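If local predict rejects your instances file, it can help to first confirm that every line is valid JSON on its own. The sketch below is one way to check that, assuming python3 is on your path; the function name validate_ndjson is hypothetical. It only checks JSON syntax and does not verify that the fields match what the model expects.

```shell
# Hypothetical helper: succeed only if every non-empty line parses as JSON.
validate_ndjson() {
  python3 - "$1" <<'EOF'
import json
import sys

with open(sys.argv[1]) as f:
    for number, line in enumerate(f, 1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except ValueError as err:
            # Exit with status 1 and report the offending line on stderr.
            sys.exit("line %d is not valid JSON: %s" % (number, err))
EOF
}
```

For the example file in this repo, you would run validate_ndjson trainer/test-instances.json.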