Kaggle is a popular platform for data science and machine learning competitions, where you can find and download various datasets for your projects. Google Colab is a free online service that allows you to run Python code in a Jupyter notebook environment, with access to GPUs and TPUs. Google Drive is a cloud storage service that lets you store and share files online.

In this tutorial, we will show you how to import Kaggle datasets into Google Colab using Google Drive. This will allow you to use the datasets in your Colab notebooks without having to download them manually or upload them to another cloud service.

The steps are as follows:

  1. Create a Kaggle account and generate an API token.
  2. Upload the API token file to your Google Drive.
  3. Mount your Google Drive in your Colab notebook.
  4. Install the Kaggle API package in your Colab notebook.
  5. Use the Kaggle API commands to download the datasets to your Google Drive or Colab notebook.
  6. Load the datasets into your Colab notebook.

Let’s go through each step in detail.

Step 1: Create a Kaggle account and generate an API token

To use the Kaggle API, you need to have a Kaggle account and generate an API token. To create a Kaggle account, go to https://www.kaggle.com/ and sign up with your email or social media account. You will need to verify your email and accept the terms of service.

To generate an API token, click on your profile picture in the top right corner. Then, click on “Settings” and scroll down to the “API” section. Click on “Create New Token” and a file named “kaggle.json” will be downloaded to your computer. This file contains your username and key for accessing the Kaggle API.
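Because the token is a small JSON file, a quick sanity check can catch a truncated or wrong download before you go any further. The sketch below assumes the two fields described above are stored under the keys "username" and "key"; the helper name read_kaggle_token is hypothetical and is not part of the Kaggle API.

```python
import json
from pathlib import Path


def read_kaggle_token(path):
    """Return (username, key) from a kaggle.json file, or raise ValueError.

    Assumes the token stores its two fields under "username" and "key".
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{path} does not contain a JSON object")
    missing = [field for field in ("username", "key") if not data.get(field)]
    if missing:
        raise ValueError(f"{path} is missing: {', '.join(missing)}")
    return data["username"], data["key"]
```

Treat the key like a password: avoid printing it in a notebook that you plan to share.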

Step 2: Upload the API token file to your Google Drive

To use the Kaggle API in your Colab notebook, you need to upload the “kaggle.json” file to your Google Drive. You can do this by going to Google Drive and clicking on “New” and then “File upload”. Select the “kaggle.json” file from your computer and upload it.

Alternatively, if you’d rather not use Google Drive, you can upload the file directly from your computer with the following code in your Colab notebook. Keep in mind that you will have to upload the file again in each new session if you go this route.

from google.colab import files
files.upload()

This will prompt you to select the file from your computer and upload it. You will see a message like this:

Saving kaggle.json to kaggle.json
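If you take the direct-upload route, the uploaded token still has to end up where the Kaggle API looks for it. Here is a minimal sketch of one way to do that. It relies on files.upload() returning a dictionary that maps each file name to its contents as bytes; the helper name save_uploaded_token is hypothetical.

```python
import os
from pathlib import Path


def save_uploaded_token(uploaded, kaggle_dir="~/.kaggle"):
    """Write the uploaded kaggle.json bytes into kaggle_dir and restrict access.

    `uploaded` is assumed to be the dict returned by files.upload(),
    mapping each file name to its contents as bytes.
    """
    if "kaggle.json" not in uploaded:
        raise KeyError("kaggle.json was not among the uploaded files")
    target_dir = Path(kaggle_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    target.write_bytes(uploaded["kaggle.json"])
    os.chmod(target, 0o600)  # owner read/write only
    return target


# In Colab (this part only works inside a Colab notebook):
# from google.colab import files
# save_uploaded_token(files.upload())
```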

Step 3: Mount your Google Drive in your Colab notebook

To access the files in your Google Drive from your Colab notebook, you need to mount your Google Drive as a virtual drive. You can do this by running the following code in your Colab notebook:

from google.colab import drive
drive.mount('/content/drive')

This will prompt you to authorize access to your Google Drive, either by clicking on a link and entering a code or by responding to a message that requests permission to access your Google Drive files. You will see one of the following messages:

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=...

Enter your authorization code:
··········
Mounted at /content/drive

or

Permit this notebook to access your Google Drive files?
This notebook is requesting access to your Google Drive files. Granting access to Google Drive will permit code executed in the notebook to modify files in your Google Drive. Make sure to review notebook code prior to allowing this access.

If you get the first message, navigate to the link, enter the code and then you will see that your Google Drive is mounted at “/content/drive”. You can browse the files in your Google Drive by using the file explorer on the left side of the Colab notebook or by using terminal commands like ls or cd.

If you get the second message, click “Connect to Google Drive” and follow the prompts in the pop-up window that opens. You may see a screen asking you to choose your Google account, followed by a screen saying:

Google Drive for desktop wants to access your Google Account

This will allow Google Drive for desktop to:
...
...
...
Make sure you trust Google Drive for desktop
You may be sharing sensitive info with this site or app. You can always see or remove access in your Google Account.
Learn how Google helps you share data safely.
See Google Drive for desktop’s Privacy Policy and Terms of Service.

Once you allow the connection, you’ll see that Google Drive is mounted at “/content/drive”. Again, you can browse the files in your Google Drive by using the file explorer on the left side of the Colab notebook or by using terminal commands like ls or cd.
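Besides the file explorer, you can also check from Python that the token you uploaded is visible after mounting. The following is a rough sketch; the helper name find_in_drive is hypothetical, and the default folder /content/drive/MyDrive assumes the mount point used above.

```python
from pathlib import Path


def find_in_drive(filename, drive_root="/content/drive/MyDrive"):
    """Return every path under drive_root whose file name equals `filename`."""
    root = Path(drive_root)
    if not root.is_dir():
        raise FileNotFoundError(f"{drive_root} not found; is Drive mounted?")
    # rglob walks all subfolders, so this can be slow on a very large Drive.
    return sorted(root.rglob(filename))
```

Calling find_in_drive("kaggle.json") returns an empty list if the file is not there, which is a hint to repeat the upload in Step 2.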


Code More, Distract Less: Support Our Ad-Free Site

You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.


Step 4: Install the Kaggle API package in your Colab notebook

To use the Kaggle API commands in your Colab notebook, you need to install the Kaggle API package. You can do this by running the following code in your Colab notebook:

!pip install kaggle

This will install the latest version of the Kaggle API package and its dependencies, if they’re not already installed.

Once you do this, you’ll want to place your kaggle.json file in a place that the Kaggle API knows to look. Here’s an example of some commands you can use to copy the kaggle.json file from your Google Drive to a Google Colab folder called .kaggle:

!mkdir -p ~/.kaggle
!cp "/content/drive/MyDrive/kaggle.json" ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
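The three shell commands above can also be written in plain Python, which makes it easier to fail with a clear message when the token is missing from Drive. This is only a sketch of the same copy-and-restrict steps; install_token is a hypothetical helper name.

```python
import os
import shutil
from pathlib import Path


def install_token(source, kaggle_dir="~/.kaggle"):
    """Copy a kaggle.json file into kaggle_dir with 600 permissions."""
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(f"No token found at {source}")
    target_dir = Path(kaggle_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    target = target_dir / "kaggle.json"
    shutil.copyfile(source, target)  # like cp
    os.chmod(target, 0o600)  # like chmod 600
    return target
```

In a notebook you would call it as install_token("/content/drive/MyDrive/kaggle.json"), using the same Drive path as the cp command.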

Step 5: Use the Kaggle API commands to download the datasets to your Google Drive or Colab notebook

To download a dataset with the Kaggle API commands, you need to know its name. You can find the name by browsing the Kaggle website or by using the !kaggle datasets list command in your Colab notebook.

For example, let’s say we want to download the iris dataset we used in our PyTorch classification and regression tutorial. To download it to Google Colab, we need to run the following command in our Colab notebook:

! kaggle datasets download -d 'saurabh00007/iriscsv'

Alternatively, if storage space isn’t a concern for you and you want to download the dataset to your Google Drive, you can run a command like this:

!kaggle datasets download -d 'saurabh00007/iriscsv' -p /content/drive/MyDrive/Kaggle

The -d flag specifies the name of the dataset which can be found in the URL of each Kaggle dataset, the -p flag specifies the path where we want to save the dataset in our Google Drive, and the ! symbol tells Colab to run the command as a shell command.

The second command will download a zip file containing the dataset files to our Google Drive folder named Kaggle. You can change the name of the folder or create a subfolder if you want.

You can repeat either one of these commands for any other dataset that you want to download from Kaggle. Just make sure to replace the name of the dataset and the path accordingly.
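If you download several datasets, it may be convenient to build the command in Python instead of retyping it. The sketch below uses only the -d and -p flags shown above and runs the same kaggle command through subprocess; the helper names build_download_command and download_all are hypothetical.

```python
import subprocess


def build_download_command(dataset, dest=None):
    """Build the argument list for `kaggle datasets download`.

    `dataset` must look like "owner/dataset-name"; `dest` is an optional
    folder passed through the -p flag.
    """
    owner, sep, name = dataset.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError(f"expected 'owner/dataset-name', got {dataset!r}")
    command = ["kaggle", "datasets", "download", "-d", dataset]
    if dest is not None:
        command += ["-p", str(dest)]
    return command


def download_all(datasets, dest=None):
    """Run the download command once per dataset, stopping on the first failure."""
    for dataset in datasets:
        subprocess.run(build_download_command(dataset, dest), check=True)
```

For example, download_all(["saurabh00007/iriscsv"], dest="/content/drive/MyDrive/Kaggle") mirrors the second command above.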

Step 6: Load the datasets into your Colab notebook

Once you have downloaded the datasets to your Google Drive, you can load them in your Colab notebook using pandas or any other Python library that can read data files.

To load a dataset from your Google Drive, you need to know the name and format of the data file that you want to load. You can find this information by unzipping the zip file that you downloaded from Kaggle and inspecting its contents using a command like this:

!unzip *.zip
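As an alternative to the shell command, Python’s built-in zipfile module can extract the archive and report the file names in one step. This is a minimal sketch; extract_dataset is a hypothetical helper name, and the name of the downloaded archive is something you should confirm in the file explorer.

```python
import zipfile


def extract_dataset(zip_path, dest="."):
    """Extract zip_path into dest and return the names of the files inside."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries so only real data files are reported.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
        archive.extractall(dest)
    return names
```

The returned names tell you both what each data file is called and, from its extension, what format it is in.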

For example, let’s say we want to load the Iris.csv file from the iris dataset we just downloaded. First, we recognize this file is a comma-separated values (CSV) file. To load it in our Colab notebook, we need to run the following code:

import pandas as pd
dataset = pd.read_csv('/content/Iris.csv')
dataset.head()

The pd.read_csv function reads a CSV file and returns a pandas DataFrame object. The /content/Iris.csv argument is the path where we saved the data file in our Colab notebook. If you saved the file to your Google Drive instead, update this path to match your storage location. The dataset.head() method displays the first five rows of the DataFrame.

You can repeat this code for any other data file you want to load from your Google Drive. Just make sure to replace the name and format of the data file and the path accordingly.
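Since the right path depends on whether you downloaded to the Colab notebook or to Google Drive, a small lookup can save some trial and error. The sketch below checks the two folders used in this tutorial; resolve_data_file and DEFAULT_FOLDERS are hypothetical names, and you should adjust the folder list to your own setup.

```python
from pathlib import Path

# Candidate folders taken from the commands in this tutorial; adjust as needed.
DEFAULT_FOLDERS = ("/content", "/content/drive/MyDrive/Kaggle")


def resolve_data_file(filename, folders=DEFAULT_FOLDERS):
    """Return the first existing path to `filename` among `folders`."""
    for folder in folders:
        candidate = Path(folder) / filename
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(folder) for folder in folders)
    raise FileNotFoundError(f"{filename} not found in: {searched}")
```

You could then load the file with pd.read_csv(resolve_data_file("Iris.csv")).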


In this tutorial, you learned how to use the Kaggle API to access and download datasets from Kaggle to your Google Colab notebook by first uploading your Kaggle API token. You also learned how to mount your Google Drive in your Colab notebook and load the datasets from there. This way, you can easily work with large datasets from Kaggle without having to download them manually or use up your Colab disk space. You can now explore the Kaggle datasets and apply your data analysis and machine learning skills to them with Google Colab.


