Authenticating AI Platform Notebooks against BigQuery in Python


When you use AI Platform Notebooks, by default any API calls you make to GCP use the default compute service account that your notebook runs under. This makes it easy to start getting stuff done, but sometimes you may want to use BigQuery to query data that your service account doesn’t have access to.

The instructions below describe how to use your personal account to authenticate with BigQuery. They apply specifically to Python-based notebooks. If you want to authenticate in an R-based notebook, you can find instructions for that here.

Normally you would run gcloud auth login from the JupyterLab terminal to log in to your personal user account and call Google APIs, but the BigQuery library's auth works differently: it doesn't pick up that login and keeps using the notebook's service account.

Instead, you need to create a credential object containing your user credentials and pass that to the BigQuery library.

Install the pydata_google_auth package:

%pip install pydata_google_auth

Restart the kernel: Kernel → Restart Kernel (from the Jupyter menu bar)

Import the library and create your credentials:

import pydata_google_auth

credentials = pydata_google_auth.get_user_credentials(
    ['https://www.googleapis.com/auth/bigquery'],
)

When you execute the above cell, the notebook will display:

  1. A clickable authentication URL (something like https://accounts.google.com/o/oauth2/auth?...)
  2. A text input box labeled “Enter verification code:”

Click the authentication link (or copy and paste it into your browser) and sign in with your Google account. After authorizing the application, Google will display a verification/authorization code on a confirmation page.

Copy that authorization code from the browser and paste it into the “Enter verification code” text box in your notebook, then press Enter.
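If you run the credentials cell often, it can help to wrap it in a small function. The sketch below is only an illustration: the function name and the force_reauth flag are my own, and it assumes that pydata_google_auth caches user credentials between runs by default and that passing pydata_google_auth.cache.REAUTH as credentials_cache makes it prompt again (handy if you signed in with the wrong account).

```python
BIGQUERY_SCOPE = "https://www.googleapis.com/auth/bigquery"


def get_bigquery_user_credentials(force_reauth=False):
    """Return user credentials scoped to BigQuery.

    Hypothetical helper: force_reauth=True asks pydata_google_auth to
    ignore any cached token and run the sign-in flow again.
    """
    # Imported here so the cell defining this helper can run
    # before the package has been installed.
    import pydata_google_auth
    import pydata_google_auth.cache

    kwargs = {}
    if force_reauth:
        # Assumption: REAUTH skips reading the cached credentials.
        kwargs["credentials_cache"] = pydata_google_auth.cache.REAUTH
    return pydata_google_auth.get_user_credentials([BIGQUERY_SCOPE], **kwargs)
```

Calling credentials = get_bigquery_user_credentials() should behave like the cell above, and get_bigquery_user_credentials(force_reauth=True) should show the authentication link again.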

Next you’ll want to reload the bigquery magic in your notebook. You ‘reload’ instead of ‘load’ because AI Platform Notebooks already loads the bigquery magic for you by default:

%reload_ext google.cloud.bigquery

from google.cloud.bigquery import magics
magics.context.credentials = credentials

Now when you use the bigquery magic it’ll use your personal credentials instead of the service account ones:

%%bigquery
SELECT name, SUM(number) as count
FROM my-private-project.usa_names.usa_1910_current
GROUP BY name
ORDER BY count DESC
LIMIT 10

And that’s all there is to it!

If you’d rather use Python code than invoke the bigquery magic, just create a client with the user credentials and query away!

from google.cloud import bigquery as bq
client = bq.Client(project="project-name", credentials=credentials)
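To make "query away" a bit more concrete, here is a minimal sketch of running a query with that client and collecting the rows as plain dictionaries. The helper name query_rows is hypothetical; it relies only on client.query(...).result() and on each returned row exposing items().

```python
def query_rows(client, sql):
    """Run sql with the given BigQuery client and return a list of dicts."""
    # client.query() submits the job; result() waits for it to finish
    # and returns an iterable of rows.
    rows = client.query(sql).result()
    return [dict(row.items()) for row in rows]


# Example use in the notebook, assuming `client` was created as above:
#   query_rows(client, "SELECT SESSION_USER() AS user")
```

The SESSION_USER() query in the comment is a quick sanity check: it should come back with your personal email address rather than the service account's, which confirms the client is really using your user credentials.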

Thanks to Anthony Brown for sharing instructions on how to use BigQuery with Jupyter Notebooks

Zain Rizvi

I build the infrastructure used by millions of developers around the world. At Meta, I work on the infrastructure for PyTorch. Previously at Google Cloud, Microsoft Azure, and Stripe.
