Introduction | Appending Rows | Joining Columns
Introduction to Merging Multiple DataFrames with Pandas concat
This tutorial describes the creation of new Pandas DataFrames by combining, or merging, multiple separate DataFrames using the Pandas Concat function. We will cover combining DataFrame rows from tables with the same number of columns, and combining DataFrame columns from tables with the same number of rows.
The concat
function has the following format:
[Joined DataFrame] = pandas.concat([List of DataFrames], axis=[axis], keys=[keys], ignore_index=[Index Bool], [Other])`
Each of the parameters of the Pandas concat function are described in the table below:
[Joined DataFrame] | The final concatenated DataFrame. |
[List of DataFrames] | A list or dictionary of DataFrames to be joined. |
[axis] | Axis on which DataFrames are joined. |
[keys] | A list of keys associated with each input DataFrame. The final [Joined DataFrame] will have key labels corresponding to the original separate DataFrames. |
[Index Bool] | Boolean value indicating whether the input DataFrame indices should be ignored and replaced with automated numbering. Default is |
[Other] | Other advanced options are available, but they're rarely used and would just muddy the waters in the tutorial. If you really want to see what they are, the Pandas Manual may be referenced for further information. |
To illustrate the use of the concat
function, we’ll assume that we have the following three DataFrames available for use:
import pandas as pd # Don't forget to import Pandas!
# df1, df2, and df3 are already defined
df1.head()
> A B C
> Index
> 1 A1 B1 C1
> 2 A2 B2 C2
> 3 A3 B3 C3
> 4 A4 B4 C4
> 5 A5 B5 C5
df2.head() # df2 contains rows extending df1
> A B C
> Index
> 7 A7 B7 C7
> 8 A8 B8 C8
> 9 A9 B9 C9
> 10 A10 B10 C10
> 11 A11 B11 C11
df3.head() # df3 contains columns extending df1
> D E F
> Index
> 1 D1 E1 F1
> 2 D2 E2 F2
> 3 D3 E3 F3
> 4 D4 E4 F4
> 5 D5 E5 F5
Now let’s use the Pandas concat
function to merge these DataFrames!
Appending Rows with Pandas concat
Let’s look at appending DataFrame rows. To append rows, the concat option axis
is set to
vdf = pd.concat([df1, df2])
print(vdf)
> A B C
> Index
> 1 A1 B1 C1
> 2 A2 B2 C2
> 3 A3 B3 C3
> 4 A4 B4 C4
> 5 A5 B5 C5
> 6 A6 B6 C6
> 7 A7 B7 C7
> 8 A8 B8 C8
> 9 A9 B9 C9
> 10 A10 B10 C10
> 11 A11 B11 C11
> 12 A12 B12 C12
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
If we wish to track the original DataFrames from which keys
option as follows:
vdf = pd.concat([df1, df2], keys=["df1", "df2"])
print(vdf)
> A B C
> Index
> df1 1 A1 B1 C1
> 2 A2 B2 C2
> 3 A3 B3 C3
> 4 A4 B4 C4
> 5 A5 B5 C5
> 6 A6 B6 C6
> df2 7 A7 B7 C7
> 8 A8 B8 C8
> 9 A9 B9 C9
> 10 A10 B10 C10
> 11 A11 B11 C11
> 12 A12 B12 C12
You can see the labels of the original Pandas DataFrames are clearly presented at the start of each table.
Note that indices are automatically imported from the source DataFrames, meaning that duplicate indices can occur. For example, what would happen if we used the Pandas concat function to merge two identical dataframes?
vdf = pd.concat([df1, df1]) # Suppose we join df1 to itself
print(vdf)
> A B C
> Index
> 1 A1 B1 C1
> 2 A2 B2 C2
> 3 A3 B3 C3
> 4 A4 B4 C4
> 5 A5 B5 C5
> 6 A6 B6 C6
> 1 A1 B1 C1
> 2 A2 B2 C2
> 3 A3 B3 C3
> 4 A4 B4 C4
> 5 A5 B5 C5
> 6 A6 B6 C6
In the resulting DataFrame above, we have duplicate index values. Each value is repeated twice since we merged the same DataFrame twice.
In tables containing separate information, we may want to renumber the resulting indices to avoid duplicate entries. We can use the ignore_index
flag to renumber the output indices in the final DataFrame. Let’s try to merge the
vdf = pd.concat([df1, df1], ignore_index=True)
print(vdf)
> A B C
> 0 A1 B1 C1
> 1 A2 B2 C2
> 2 A3 B3 C3
> 3 A4 B4 C4
> 4 A5 B5 C5
> 5 A6 B6 C6
> 6 A1 B1 C1
> 7 A2 B2 C2
> 8 A3 B3 C3
> 9 A4 B4 C4
> 10 A5 B5 C5
> 11 A6 B6 C6
See how the index values now start from 0 and go to 11? There are no duplicates, and this example also serves as a reminder that indexing starts at 0.
Joining Columns with Pandas concat
Now that we’ve learned how to stack two DataFrames on top of each other, let’s look at appending DataFrame columns to create a merged DataFrame (side by side). To append columns, the concat option axis
is set to
hdf = pd.concat([df1, df3], axis=1)
print(hdf)
> A B C D E F
> Index
> 1 A1 B1 C1 D1 E1 F1
> 2 A2 B2 C2 D2 E2 F2
> 3 A3 B3 C3 D3 E3 F3
> 4 A4 B4 C4 D4 E4 F4
> 5 A5 B5 C5 D5 E5 F5
> 6 A6 B6 C6 D6 E6 F6
An important note to make about the concat
function is that by default joins are performed using an “outer” join, which will keep all unmatched values and insert a null value in each position where a value is missing.
For example, if we join
hdf = pd.concat([df3, df2])
print(hdf)
> A B C D E F
> Index
> 1 NaN NaN NaN D1 E1 F1
> 2 NaN NaN NaN D2 E2 F2
> 3 NaN NaN NaN D3 E3 F3
> 4 NaN NaN NaN D4 E4 F4
> 5 NaN NaN NaN D5 E5 F5
> 6 NaN NaN NaN D6 E6 F6
> 7 A7 B7 C7 NaN NaN NaN
> 8 A8 B8 C8 NaN NaN NaN
> 9 A9 B9 C9 NaN NaN NaN
> 10 A10 B10 C10 NaN NaN NaN
> 11 A11 B11 C11 NaN NaN NaN
> 12 A12 B12 C12 NaN NaN NaN
This behavior can be altered by changing the join
option. Subscribe to our Python tutorials below because we will be discussing more about join methods in future tutorials on relational tables.
Code More, Distract Less: Support Our Ad-Free Site
You might have noticed we removed ads from our site - we hope this enhances your learning experience. To help sustain this, please take a look at our Python Developer Kit and our comprehensive cheat sheets. Each purchase directly supports this site, ensuring we can continue to offer you quality, distraction-free tutorials.
Did you find this free tutorial helpful? Share this article with your friends, classmates, and coworkers on Facebook and Twitter! When you spread the word on social media, you’re helping us grow so we can continue to provide free tutorials like this one for years to come.