A histogram is a type of graph used to plot data distributions. In one of our earlier tutorials, we explained how to draw different types of plots with the Python Seaborn library. In that tutorial, we learned how to plot a very basic histogram using the Seaborn library. This tutorial will take a more in-depth look at how to plot different types of histograms using the Python seaborn library.

Specifically, we’ll be plotting data from a pandas DataFrame using Seaborn’s sns.distplot() function. Note that distplot() has been deprecated since seaborn 0.11 in favor of histplot() and displot(), so the examples here may emit a deprecation warning on newer versions. This isn’t the first time we’ve talked about plotting histograms with Python. A couple of months ago, we had a full tutorial about plotting histograms with the built-in pandas DataFrame.hist method. The Seaborn histogram plotting features are a bit more flexible, so we’ll go into more detail about them here.

Seaborn Installation

Before you can start creating histograms with Seaborn, you need to install the Seaborn library. The following command installs the seaborn library using the pip installer:

$ pip install seaborn

The dataset

The dataset we’ll be using to demonstrate histograms with Seaborn is the titanic dataset. We’ve been using this dataset a lot in our recent tutorials because it has a great mix of numeric and categorical data and comes built-in with the Seaborn library.

The following script imports the seaborn library and loads the titanic dataset into your application.

import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter magic: render plots inline (omit this line outside a notebook)
%matplotlib inline

sns.set_style("darkgrid")
plt.rcParams["figure.figsize"] = [8, 6]  # default figure size in inches
titanic_dataset = sns.load_dataset('titanic')

Let’s display the first five rows of the dataset.

titanic_dataset.head()

Output:

titanic dataset header

Seaborn histograms

Now we’re going to show you how to plot different types of histograms with the Python seaborn library.

Basic Seaborn Histogram

To plot a simple histogram, use the distplot() function of the seaborn library. You need to pass the column of the pandas dataframe for which you want to display the data distribution. For instance, the following script plots a histogram for the age column of the Titanic dataset.

titanic_dataset.dropna(inplace = True)
sns.distplot(titanic_dataset["age"])

Output:

Basic Seaborn histogram

The output shows a histogram with a kernel density estimation (KDE) line.
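
Note that dropna() with no arguments removes every row that has a missing value in any column, not just in age. A minimal sketch of the difference, using a small hand-made DataFrame (the values are invented for illustration and are not taken from the Titanic dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing age, two missing deck values.
toy = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, 54.0],
    "fare": [7.25, 8.05, 53.10, 51.86],
    "deck": ["C", "E", np.nan, np.nan],
})

# Drops a row if ANY column is missing: only the first row survives.
all_columns = toy.dropna()

# Drops a row only if "age" is missing: three rows survive.
age_only = toy.dropna(subset=["age"])

print(len(all_columns), len(age_only))
```

If you only care about the age distribution, the subset form keeps more rows for the histogram.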

Removing KDE Line

You can remove the default KDE line from a histogram by passing False as the value for the kde parameter of the distplot() function, like this:

sns.distplot(titanic_dataset["age"], kde = False)

Output:

histogram without kde

See how the KDE line has been removed from the above output?


Displaying KDE Line Only

Conversely, if you want to remove the histogram bars and display only the KDE line, pass False as the value for the hist parameter. Look at the following example:

sns.distplot(titanic_dataset["age"], hist = False)

Output:

histogram-with-kde-only
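
For intuition about what the KDE line represents, here is a rough sketch of a Gaussian kernel density estimate in plain NumPy. The sample values, the fixed bandwidth of 5.0, and the helper name gaussian_kde_sketch are assumptions made for illustration; seaborn selects its bandwidth automatically and its implementation differs in detail.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    samples = np.asarray(samples, dtype=float)
    # Shape (len(grid), len(samples)): scaled distance from each grid point to each sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

ages = [22, 38, 26, 35, 35, 54, 2, 27, 14, 4]  # invented sample values
grid = np.linspace(-40, 100, 1401)
density = gaussian_kde_sketch(ages, grid, bandwidth=5.0)

# The curve is a density, so the area under it should be close to 1.
step = grid[1] - grid[0]
area = density.sum() * step
print(round(area, 3))
```

A smaller bandwidth would make the curve follow individual samples more closely, while a larger one would smooth it out.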

Changing Number of Bins

By default, distplot() chooses the number of bins automatically using the Freedman-Diaconis rule (capped at 50 bins) rather than a fixed count. You can increase or decrease the number of bins by passing an integer value to the bins parameter of the distplot() function. For instance, the following script plots a histogram with 20 bins.

sns.distplot(titanic_dataset["age"], kde = False, bins = 20)

Output:

changing histogram bins
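
To see how the bin count changes the way values are grouped, the following sketch uses NumPy's np.histogram on randomly generated values (a stand-in for the age column, not real Titanic data). It also calls np.histogram_bin_edges with bins="fd", which is NumPy's implementation of the Freedman-Diaconis rule.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
# Synthetic stand-in for an age column: 200 values clipped to the range 0..80.
ages = np.clip(rng.normal(loc=30, scale=14, size=200), 0, 80)

counts_10, edges_10 = np.histogram(ages, bins=10)
counts_20, edges_20 = np.histogram(ages, bins=20)

# Every value lands in exactly one bin, so both sets of counts sum to 200.
print(len(counts_10), counts_10.sum())
print(len(counts_20), counts_20.sum())

# Bin edges suggested by the Freedman-Diaconis rule for the same data.
fd_edges = np.histogram_bin_edges(ages, bins="fd")
print(len(fd_edges) - 1, "bins suggested by the Freedman-Diaconis rule")
```

More bins show finer detail but make each bar noisier; fewer bins smooth the picture at the cost of resolution.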

Plotting Multiple Histograms

Plotting multiple histograms in one plot is a straightforward process with seaborn, too. All you have to do is call the distplot() function twice with different dataframe columns. For instance, the following script plots two histograms: one for the age column and the other for the fare column. Passing a label to each call gives plt.legend() something to display.

sns.distplot(titanic_dataset["age"], label="age")
sns.distplot(titanic_dataset["fare"], label="fare")
plt.legend()

Output:

plotting multiple histograms

You can see the x-axis defaults to the label of the second histogram added. We’ll talk more about editing labels in a few sections. Also notice that because the KDE line is enabled, the y-axis shows a probability density rather than the raw counts seen in the kde = False examples.
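
The difference between a count axis and a density axis can be checked numerically. In this sketch (synthetic values again, not the Titanic data), np.histogram with density=True rescales the bar heights so that the total bar area is 1, which is what puts histogram bars on the same scale as a KDE curve.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
values = rng.normal(loc=30, scale=14, size=500)  # synthetic data

counts, edges = np.histogram(values, bins=20)
density, _ = np.histogram(values, bins=20, density=True)
widths = np.diff(edges)

print(counts.sum())               # total number of observations
print((density * widths).sum())   # total bar area, approximately 1.0
```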

Changing Histogram Orientations

By default, distplot() places the observed values on the x-axis, so the bars rise vertically. Passing True as the value for the vertical parameter puts the observed values on the y-axis instead, so the bars extend horizontally.

sns.distplot(titanic_dataset["age"], vertical= True, kde = False, bins = 20)

Output:

vertical histogram

Changing Histogram Colors

To change the color of your seaborn histograms, pass a color to the color parameter of the distplot() function. Matplotlib shorthand codes such as "r" work on their own; calling the set_color_codes() function of the seaborn module first is optional and remaps those shorthand codes to the matching colors in a seaborn palette. The following script plots a red histogram since we pass r as the value for the color parameter.

sns.set_color_codes()
sns.distplot(titanic_dataset["age"], kde = False, color = "r")

Output:

changing histogram colors

Adding Labels and Titles to a Histogram

To add labels and titles to a histogram, you can use the plt.xlabel, plt.ylabel, and plt.title functions as shown in the following script.

sns.set_color_codes()
sns.distplot(titanic_dataset["age"], kde = False, color = "g")

plt.xlabel("Age of Passengers", fontsize= 12)
plt.ylabel("Number of Passengers", fontsize= 12)
plt.title("Histogram for Passenger Age", fontsize= 15)

Output:

histogram-labels-titles


I hope you enjoyed our Seaborn histogram tutorial. For more ways to visualize your data with Python, subscribe to our email list. We don’t email often, but we’ll send you our most helpful tutorials to make sure you’re getting the most out of Python.