The Python Pandas read_excel function lets you import data from Excel files into a Pandas DataFrame. This tutorial includes several Pandas read_excel examples to teach you how to properly use the function to read Excel files in Python Pandas. It’s closely related to our article about the Pandas read_csv function, but this tutorial focuses exclusively on reading Excel files.


Importing Excel Files into a Pandas DataFrame

Data from an Excel formatted file can be retrieved using the Pandas from_excel function. This is a really neat way to read Excel files in Python, so let’s walk through a couple examples. I especially like it because Pandas serves as the bridge between my VBA Tutorials and Python Tutorials.

Suppose we have the sample Excel file Sample.xlsx stored in the same location as our Python script. The sample Excel file contains the following data within sheet Sheet1:

Sample Excel File

We can quickly import this spreadsheet data into Python using the following code:

import pandas as pd
df = pd.read_excel('Sample.xlsx', sheet_name='Sheet1')
print(df)
>   Row Name  Column A  Column B   Column C
> 0    Row 1         1          2         3
> 1    Row 2         4          5         6
> 2    Row 3         7          8         9

The above code easily read and imported the Excel file into a Pandas DataFrame, but it did so using several default settings that we should consider changing.

For example, the Row Name column values were imported as data, while we may want them to be read as index labels. Chances are each row in your spreadsheet isn’t going to be labeled like in our example, but it’s still something to keep in mind.

The default read_excel parameters also assumed that the first row was a list of column names, which it incorporated automatically as column labels within the DataFrame.

Fortunately, if we want to change how read_excel reads in these files, we have a several argument options we can play around with.

Note: If your Excel file isn’t stored in the same location as your Python script, you would replace Sample.xlsx with the full path to your spreadsheet.


Pandas read_excel Function

The Pandas read_excel function contains numerous options for importing data. We’re going to address the essential options for converting Excel files to Pandas DataFrames. These options are extremely common and really important to understand since they can drastically change how the read_excel function behaves.

Here are a few the options you’ll want to understand when working with the read_excel function in Pandas. These options follow the format

read_excel(io, [Other Options])

Important Pandas read_excel Options

Argument Description
io A string containing the pathname of the given Excel file.
sheet_name The Excel sheet name, or sheet number, of the data you want to import. The sheet number can be an integer where 0 is the first sheet, 1 is the second, etc. If a list of sheet names/numbers are given, then the output will be a dictionary of DataFrames. The default is to read all the sheets and output a dictionary of DataFrames.
header Row number to use for the list of column labels. The default is 0, indicating that the first row is assumed to contain the column labels. If the data does not have a row of column labels, None should be used.
names A separate Python list input of column names. This option is None by default. This option is the equivalent of assigning a list of column names to the columns attribute of the output DataFrame.
index_col Specifies which column should be used for row indices. The default option is None, meaning that all columns are included in the data, and a range of numbers is used as the row indices.
usecols An integer, list of integers, or string that specifies the columns to be imported into the DataFrame. The default is to import all columns. If a string is given, then Pandas uses the standard Excel format to select columns (e.g. "A:C,F,G" will import columns A, B, C, F, and G).
skiprows The number of rows to skip at the top of the Excel sheet. Default is 0. This option is useful for skipping rows in Excel that contain explanatory information about the data below it.

Pandas read_excel Examples

Naming Rows in your Excel DataFrame

Suppose in our earlier example we wanted to use the “Row Name” column as a series of row indices (instead of it defaulting to 0, 1, 2, etc.). We can use the read_excel index_col option to automatically set the row index labels:

import pandas as pd
df = pd.read_excel('Sample.xlsx', sheet_name='Sheet1', index_col=0)
print(df)
>           Column A  Column B   Column C
> Row Name                               
> Row 1            1          2         3
> Row 2            4          5         6
> Row 3            7          8         9

You can learn Python in half the time
I see people struggling with Python every day and I want to help. That's why I developed this systematic approach to learning Python - FAST. This powerful training program exposes you to the Python programming language in a natural way so learning is easy.

I want to join the free wellsrPRO Python Training program

Naming Columns in your Excel DataFrame

Alternatively, you can use the names option to specify your own set of column names. This way, you don’t have to rely on the column names specified in your spreadsheet.

import pandas as pd
colNames = ["Better A", "Better B", "Better C"]
df = pd.read_excel('Sample.xlsx', sheet_name='Sheet1', names=colNames, index_col=0)
print(df)
>           Better A  Better B  Better C
> Row Name                              
> Row 1            1         2         3
> Row 2            4         5         6
> Row 3            7         8         9

Skipping rows and headers

We can use the header and skiprows option to tell Pandas not to use the column labels already in our spreadsheet. Note that both header=None and skiprows=1 must be set in this case, as the first row of actual data will be skipped if the header option is missing.

import pandas as pd
df = pd.read_excel('Sample.xlsx', sheet_name='Sheet1', header=None, skiprows=1, index_col=0)
print(df)
>        1  2  3
> 0             
> Row 1  1  2  3
> Row 2  4  5  6
> Row 3  7  8  9

Specifying which columns to import

Finally, if we want to eliminate the index label column entirely and only import certain columns of data, we can tell the usecols option to only read certain columns from our Excel spreadsheet. In this example, we tell Pandas to only import the second and fourth columns ("B,D" in Excel):

import pandas as pd
df = pd.read_excel('Sample.xlsx', sheet_name='Sheet1', header=None, skiprows=1, usecols="B,D")
print(df)
>    0  1
> 0  1  3
> 1  4  6
> 2  7  9

How to import multiple sheets into Pandas

If no sheet_name is specified in our call to the read_excel function, all your sheets will be imported and the output will be a dictionary of DataFrames. The data in each sheet can be referenced by calling the corresponding “key,” which is just the sheet name or sheet number (starting at 0).

If you only want to import a couple sheets, rather than all the sheets, you have a couple options. Your first option is to just omit the sheet_name argument and work with the output dictionary to grab the sheet data you want. Again, the “key” in the dictionary is just the sheet name or sheet number and the corresponding value of each key is the DataFrame containing the sheet’s data.

Your second option is to pair the Pandas read_excel function with the ExcelFile function to store the data from each sheet in different DataFrames. You could run the code without the ExcelFile line, but I find it makes the code cleaner since you only have to specify the spreadsheet path once. Here’s an example:

import pandas as pd
xls = pd.ExcelFile('Sample.xlsx')
df1 = pd.read_excel(xls, sheet_name='Sheet1')
df2 = pd.read_excel(xls, sheet_name='Sheet2')

You can learn Python in half the time
I see people struggling with Python every day and I want to help. That's why I developed this systematic approach to learning Python - FAST. This powerful training program exposes you to the Python programming language in a natural way so learning is easy.

I want to join the free wellsrPRO Python Training program

I hope you found this Pandas read_excel tutorial helpful. We’ve presented a bunch of examples to show you how to import your Excel spreadsheets into a Pandas DataFrame.

If you found it helpful, share this article with your friends, classmates, and coworkers on Facebook, Twitter, and Google+! When you spread the word on social media, you’re helping us grow so we can continue to provide free tutorials like this one for years to come.

This article was written by Cody Gilbert, contributing writer for The Python Tutorials Blog.

About The Python Tutorials Blog

Ryan Wells

The Python Tutorials Blog was created by Ryan Wells, a Nuclear Engineer and professional VBA Developer. Ryan developed a unique 3-part free Excel training program to help others quickly learn VBA in a natural setting: right inside Excel. After his successful VBA Tutorials, which have helped hundreds of thousands learn to write better macros, he built The Python Tutorials Blog to teach people Python in a similar systematic way.