☁ Explore Machine Learning Models with Explainable AI: Challenge Lab | logbook
In this article, we will go through lab GSP324, Explore Machine Learning Models with Explainable AI: Challenge Lab, which is labeled as an advanced-level exercise. TensorFlow is one of the most widely used machine learning frameworks in the industry. You will practice using Cloud AI Platform to build, train, and deploy TensorFlow models on the Home Mortgage Disclosure Act (HMDA) dataset for New York.
- Launching an AI Platform Notebook
- Downloading and exploring a sample dataset
- Building and training two different TensorFlow models
- Deploying models to the Cloud AI Platform
- Using the What-If Tool to compare the models
Start a JupyterLab Notebook instance
- In the Cloud Console, in the search bar, type in Notebook.
- Select Notebook for AI Platform.
- On the Notebook instances page, click New Instance.
In the Customize instance menu, select the latest version of TensorFlow without GPUs.
In the New notebook instance dialog, accept the default options and click Create.
- After a few minutes, the AI Platform console will display your instance name, followed by Open JupyterLab.
Click Open JupyterLab. Your notebook is now set up.
Download the Challenge Notebook
In your notebook, click the terminal.
Clone the repo:
git clone https://github.com/GoogleCloudPlatform/training-data-analyst
- Go to the enclosing folder:
Open the notebook file
- Download and import the dataset hmda_2017_ny_all-records_labels by running the first to the eighth cells (the Get the Train & Test Data section).
Build and train your models
In the second cell of the Train your first model on the complete dataset section, add the following lines to create the model.
model = Sequential()
model.add(layers.Dense(8, input_dim=input_size))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='sgd', loss='mse')
model.fit(train_data, train_labels, batch_size=32, epochs=10)
Copy the code to train the second model. Change model to limited_model, and change train_data, train_labels to limited_train_data, limited_train_labels. The code for the second model should look like the following.
limited_model = Sequential()
limited_model.add(layers.Dense(8, input_dim=input_size))
limited_model.add(layers.Dense(1, activation='sigmoid'))
limited_model.compile(optimizer='sgd', loss='mse')
limited_model.fit(limited_train_data, limited_train_labels, batch_size=32, epochs=10)
- Run the cells in this section and wait for the model training to finish.
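To make the architecture concrete, here is a minimal sketch in plain Python (no TensorFlow) of what this two-layer network computes for one input row: a Dense(8) layer with no activation, a Dense(1) layer with a sigmoid, and the mean squared error used as the loss. The weights, the three-feature input, and the labels are made-up placeholders, not values from the lab.

```python
import math


def dense(inputs, weights, biases):
    """Dense layer without activation: one weighted sum plus bias per unit."""
    return [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weights, biases)
    ]


def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(row, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: Dense(8) with no activation, then Dense(1) with sigmoid."""
    hidden = dense(row, hidden_w, hidden_b)
    logit = dense(hidden, [out_w], [out_b])[0]
    return sigmoid(logit)


def mse(predictions, labels):
    """Mean squared error, the loss passed to model.compile above."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical weights and data, for illustration only.
input_size = 3
hidden_w = [[0.1 * (i + 1)] * input_size for i in range(8)]
hidden_b = [0.0] * 8
out_w = [0.05] * 8
out_b = -0.5

rows = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
labels = [1, 0]

preds = [predict(r, hidden_w, hidden_b, out_w, out_b) for r in rows]
loss = mse(preds, labels)
print(preds, loss)
```

Because the first layer has no activation, the two layers compose into a single affine map followed by a sigmoid, so the network behaves much like a logistic regression trained with a squared-error loss.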
Deploy the models to AI Platform
Move on to the Deploy your models to the AI Platform section in the notebook.
- Replace the values of GCP_PROJECT and MODEL_BUCKET with your project ID and a unique bucket name.
- Set REGION to us-west1 (use the same region as the Notebook instance).
Run those three cells, then confirm in Cloud Storage that the bucket was created and the model files were uploaded.
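As a rough sketch of what the configuration might look like: the names GCP_PROJECT, MODEL_BUCKET, and REGION match the variables used in the gcloud commands in this article, but the values below are placeholders, and the check_config helper is purely illustrative and not part of the lab notebook. It assumes MODEL_BUCKET is a full gs:// URI without a trailing slash, since the commands use it as a path prefix in --origin.

```python
GCP_PROJECT = "your-project-id"           # placeholder, use your own project ID
MODEL_BUCKET = "gs://your-unique-bucket"  # placeholder, must be globally unique
REGION = "us-west1"                       # same region as the Notebook instance


def check_config(project, bucket, region):
    """Return a list of problems found in the deployment settings."""
    problems = []
    if not project or " " in project:
        problems.append("GCP_PROJECT must be a non-empty project ID")
    if not bucket.startswith("gs://") or len(bucket) <= len("gs://"):
        problems.append("MODEL_BUCKET must look like gs://bucket-name")
    if bucket.endswith("/"):
        problems.append("MODEL_BUCKET should not end with a slash")
    if not region:
        problems.append("REGION must not be empty")
    return problems


issues = check_config(GCP_PROJECT, MODEL_BUCKET, REGION)
print(issues or "configuration looks plausible")
```

A check like this only catches formatting slips; it does not verify that the project or bucket actually exists.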
Create your first AI Platform model: complete_model
Add the following code to the notebook cells for your COMPLETE model.
!gcloud ai-platform models create $MODEL_NAME --regions $REGION
!gcloud ai-platform versions create $VERSION_NAME \
  --model=$MODEL_NAME \
  --framework='TensorFlow' \
  --runtime-version=2.1 \
  --origin=$MODEL_BUCKET/saved_model/my_model \
  --staging-bucket=$MODEL_BUCKET \
  --python-version=3.7 \
  --project=$GCP_PROJECT
Remark: The gcloud ai-platform command group should be
Create your second AI Platform model: limited_model
Add the following code to the notebook cells for your LIMITED model.
!gcloud ai-platform models create $LIM_MODEL_NAME --regions $REGION
!gcloud ai-platform versions create $VERSION_NAME \
  --model=$LIM_MODEL_NAME \
  --framework='TensorFlow' \
  --runtime-version=2.1 \
  --origin=$MODEL_BUCKET/saved_limited_model/my_limited_model \
  --staging-bucket=$MODEL_BUCKET \
  --python-version=3.7 \
  --project=$GCP_PROJECT
Remark: The gcloud ai-platform command group should be
Troubleshooting runtime version issue
The lab had a serious bug when I carried it out on Jun 12, 2020. I couldn't pass the third checkpoint if I set up the AI Platform models according to the lab instructions. The issue seems to be caused by an inconsistency between the GCP training material and the Qwiklabs marking scheme. While the notebook guided me to create the models with runtime version 2.1 and Python 3.7, the checkpoint message specified a required runtime version of 1.14, as shown in the picture below.
Unfortunately, it still doesn't work if you just change the runtime version from 2.1 to 1.14. Runtime version 1.14 must be coupled with Python 3.5, according to the AI Platform documentation. Thus, in both gcloud ai-platform versions create commands, replace --runtime-version=2.1 with --runtime-version=1.14 and --python-version=3.7 with --python-version=3.5.
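As a hedged sketch of the corrected commands, the Python helper below assembles the versions create call with runtime version 1.14 and Python 3.5. The helper name versions_create_command and the placeholder values for the project, bucket, and version name are assumptions for illustration; only the flags and the two model paths are taken from the commands earlier in this article.

```python
def versions_create_command(version_name, model_name, origin, bucket, project,
                            runtime_version="1.14", python_version="3.5"):
    """Build the gcloud command as a list of arguments (nothing is executed)."""
    return [
        "gcloud", "ai-platform", "versions", "create", version_name,
        f"--model={model_name}",
        "--framework=TensorFlow",
        f"--runtime-version={runtime_version}",
        f"--origin={origin}",
        f"--staging-bucket={bucket}",
        f"--python-version={python_version}",
        f"--project={project}",
    ]


# Placeholder values, replace them with the ones set in your notebook.
bucket = "gs://your-unique-bucket"
project = "your-project-id"
version = "v1"

complete_cmd = versions_create_command(
    version, "complete_model", f"{bucket}/saved_model/my_model", bucket, project)
limited_cmd = versions_create_command(
    version, "limited_model",
    f"{bucket}/saved_limited_model/my_limited_model", bucket, project)

# Print the commands so they can be pasted into a notebook cell after "!".
print(" ".join(complete_cmd))
print(" ".join(limited_cmd))
```

Printing the commands rather than running them keeps the sketch free of side effects; in a notebook you could paste the output after a leading `!`, or pass the list to subprocess.run.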
Use the What-If Tool to explore biases
Run the last cell in the notebook to activate the What-If Tool. Explore the differences between the two models, and you should be able to answer the following questions:
1. In the Performance and Fairness tab, slice by sex (applicant_sex_name_Female). How does the complete model compare to the limited model for females?
2. Click on one of the datapoints in the middle of the arc. In the datapoint editor, change (applicant_sex_name_Female) to 0, and (applicant_sex_name_Male) to 1. Now run the inference again. How does the model change?
3. In the Performance and Fairness tab, use the fairness buttons to see the thresholds for the sexes for demographic parity between males and females. How does this change the thresholds for the limited model?
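Demographic parity, as used in question 3, asks that both groups receive positive predictions at the same rate. The sketch below illustrates the idea in plain Python. It is not how the What-If Tool is implemented, the function names are my own, and the scores are invented for illustration rather than taken from the HMDA data.

```python
def positive_rate(scores, threshold):
    """Fraction of scores classified positive (score >= threshold)."""
    return sum(s >= threshold for s in scores) / len(scores)


def parity_thresholds(scores_a, scores_b, target_rate, steps=100):
    """For each group, pick the grid threshold whose positive rate is
    closest to target_rate, so that both groups end up at a similar rate."""
    grid = [i / steps for i in range(steps + 1)]

    def best(scores):
        return min(grid, key=lambda t: abs(positive_rate(scores, t) - target_rate))

    return best(scores_a), best(scores_b)


# Invented model scores for two groups, for illustration only.
female_scores = [0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9]
male_scores = [0.3, 0.45, 0.5, 0.65, 0.7, 0.8, 0.85, 0.95]

# With one shared threshold, the two groups get different approval rates.
shared = 0.5
gap = positive_rate(male_scores, shared) - positive_rate(female_scores, shared)

# With per-group thresholds, the approval rates can be brought together.
t_female, t_male = parity_thresholds(female_scores, male_scores, target_rate=0.5)
print(gap, t_female, t_male)
```

In this toy data the group with generally higher scores needs a higher threshold to reach the same positive rate, which is the kind of shift to look for when the fairness button moves the thresholds in the tool.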
Congratulations! You completed this challenge lab.
⏱ Timestamps:
- 00:00 Start Lab
- 00:35 Start a JupyterLab Notebook instance
- 03:43 Download the Challenge Notebook
- 05:38 Build and train your models
- 21:40 Deploy the models to AI Platform (❌ runtime version = 2.1, Python 3.7)
- 37:09 Use the What-If Tool to explore biases
- 47:18 Deploy the models to AI Platform (✔️ Troubleshooting runtime version issue)