A histogram is a type of graph used to plot data distributions. In one of our earlier tutorials, we explained how to draw different types of plots with the Python Seaborn library. In that tutorial, we learned how to plot a very basic histogram using the Seaborn library. This tutorial will take a more in-depth look at how to plot different types of histograms using the Python seaborn library.

Specifically, we’ll be plotting data from a pandas DataFrame using Seaborn’s sns.distplot. This isn’t the first time we’ve talked about plotting histograms with Python. A couple of months ago, we had a full tutorial about plotting histograms with the built-in pandas DataFrame.hist method. The Seaborn histogram plotting features are a bit more flexible, so we’ll go into more detail about them here.

It’s worth mentioning that distplot is now a deprecated function and will be removed in a future version of Seaborn. You’ll need to update your code to use displot or histplot once it’s fully removed from Seaborn. We have a full tutorial on the different seaborn plot types.

Seaborn Installation

Before you can start creating histograms with Seaborn, you need to install the Seaborn library. The following command installs the seaborn library using the pip installer:

$ pip install seaborn

The Dataset

The dataset we’ll be using to demonstrate histograms with Seaborn is the titanic dataset. We’ve been using this dataset a lot in our recent tutorials because it has a great mix of numeric and categorical data and comes built-in with the Seaborn library.

The following script imports the seaborn library and loads the titanic dataset into your application.

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
sns.set_style("darkgrid")

plt.rcParams["figure.figsize"] = [8,6]
titanic_dataset = sns.load_dataset('titanic')

The %matplotlib inline line is a Jupyter Notebook magic command, so it will give you an error if you run it as a plain Python script. If you’re not using Jupyter, just comment it out and call plt.show() after each plot you create.

Let’s display the first five rows of the dataset.

titanic_dataset.head()

Output:

titanic dataset header

Seaborn Histograms

Now we’re going to show you how to plot different types of histograms with the Python seaborn library.

Basic Seaborn Histogram

To plot a simple histogram, use the distplot() function of the seaborn library. You need to pass the column of the pandas dataframe for which you want to display the data distribution. For instance, the following script plots a histogram for the age column of the Titanic dataset.

titanic_dataset.dropna(inplace = True)
sns.distplot(titanic_dataset["age"])

Output:

Basic Seaborn histogram

The output shows a histogram with a kernel density estimation (KDE) line.
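One thing to keep in mind about the dropna() call above: without arguments, it drops every row that has a missing value in any column, not just in age. The Titanic data has other columns with missing values, so this can discard more rows than you might expect. Here’s a minimal sketch of the difference, using a tiny made-up DataFrame (the values are hypothetical, chosen only to show the behavior).

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame standing in for the Titanic data
df = pd.DataFrame({
    "age":  [22.0, 38.0, np.nan, 35.0, 54.0],
    "fare": [7.25, 71.28, 7.92, 53.10, 51.86],
    "deck": [np.nan, "C", np.nan, "C", "E"],
})

# Drops rows with a missing value in ANY column
all_complete = df.dropna()

# Drops only the rows where age itself is missing
age_complete = df.dropna(subset=["age"])

print(len(df), len(all_complete), len(age_complete))
```

If you only care about one column, passing subset keeps more of your data in the histogram.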

Removing KDE Line

You can remove the default KDE line from a histogram by passing False as the value for the kde parameter of the distplot() function, like this:

sns.distplot(titanic_dataset["age"], kde = False)

Output:

histogram without kde

See how the KDE line has been removed from the above output?



Displaying KDE Line Only

Conversely, if you want to remove the histogram bars and display only the KDE line, pass False as the value for the hist parameter. Look at the following example:

sns.distplot(titanic_dataset["age"], hist = False)

Output:

histogram-with-kde-only

Changing Number of Bins

By default, distplot() chooses the number of bins automatically from the data, using the Freedman-Diaconis rule with an upper limit of 50 bins. You can set the number of bins yourself by passing an integer value to the bins parameter of the distplot() function. For instance, the following script plots a histogram with 20 bins.

sns.distplot(titanic_dataset["age"], kde = False, bins = 20)

Output:

changing histogram bins
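To get a feel for what the bin count means, here’s a NumPy-only sketch. It computes a bin count with the Freedman-Diaconis rule, which, as far as we know, is what distplot() falls back to when you don’t pass bins, and then bins the same data into exactly 20 bins. The ages array is synthetic, and the variable names are ours.

```python
import numpy as np

# Synthetic stand-in for an age column (an assumption, not the Titanic data)
rng = np.random.default_rng(0)
ages = rng.normal(loc=35, scale=12, size=200)

# Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
q75, q25 = np.percentile(ages, [75, 25])
width = 2 * (q75 - q25) / len(ages) ** (1 / 3)
n_bins_fd = int(np.ceil((ages.max() - ages.min()) / width))

# Explicitly asking for 20 bins gives 20 counts and 21 bin edges
counts, edges = np.histogram(ages, bins=20)

print(n_bins_fd, len(counts), len(edges))
```

More bins show finer detail but make the plot noisier; fewer bins smooth it out.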

Plotting Multiple Histograms

Plotting multiple histograms in one plot is a straightforward process with seaborn, too. All you have to do is call the distplot() function twice with different dataframe columns. For instance, the following script plots two histograms: one for the age column and the other for the fare column. We pass a label to each call so that plt.legend() has something to display.

sns.distplot(titanic_dataset["age"], label = "age")
sns.distplot(titanic_dataset["fare"], label = "fare")
plt.legend()

Output:

plotting multiple histograms

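You can see the x-axis defaults to the label of the second histogram added. We’ll talk more about editing labels in a few sections. Also notice that the y-axis shows a probability density rather than the raw counts we saw in the plots drawn with kde = False. That’s because distplot() normalizes the histogram whenever the KDE line is drawn.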
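To make the density scaling on the y-axis concrete, here’s a small NumPy-only sketch (the ages array is synthetic, made up for illustration). With density scaling, the bar heights are chosen so that the total area of the bars is 1, which is why the values no longer look like passenger counts.

```python
import numpy as np

# Synthetic stand-in for an age column (an assumption, not the Titanic data)
rng = np.random.default_rng(0)
ages = rng.normal(loc=35, scale=12, size=200)

# Raw counts: the bar heights add up to the number of observations
raw_counts, edges = np.histogram(ages, bins=20)

# Density: bar height times bin width, summed over all bins, is 1
density, _ = np.histogram(ages, bins=20, density=True)
area = (density * np.diff(edges)).sum()

print(raw_counts.sum(), area)
```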

Changing Histogram Orientations

You can change the default orientation of seaborn histograms by passing True as the value for the vertical parameter. This places the age values on the y-axis, so the bars extend horizontally instead of rising from the x-axis.

sns.distplot(titanic_dataset["age"], vertical= True, kde = False, bins = 20)

Output:

vertical histogram

Changing Histogram Colors

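To change the color of your seaborn histograms, pass a color to the color parameter of the distplot() function. Any matplotlib color works, including shorthand codes such as r. Calling the set_color_codes() function of the seaborn module first is optional: it remaps those single-letter codes to the matching colors of a seaborn palette, so r becomes the palette’s red instead of matplotlib’s pure red. The following script plots a red histogram since we pass r as the value for the color parameter.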

sns.set_color_codes()
sns.distplot(titanic_dataset["age"], kde = False, color = "r")

Output:

changing histogram colors
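If you’re curious what set_color_codes() actually changes, this short sketch compares how matplotlib resolves the shorthand r before and after the call. The "deep" palette name is seaborn’s default palette, and the variable names are ours.

```python
import matplotlib.colors as mcolors
import seaborn as sns

# Start from matplotlib's own single-letter color definitions
sns.set_color_codes("reset")
plain_red = mcolors.to_rgb("r")

# Remap the single-letter codes to seaborn's "deep" palette
sns.set_color_codes("deep")
seaborn_red = mcolors.to_rgb("r")

# Restore matplotlib's definitions so later plots are unaffected
sns.set_color_codes("reset")

print(plain_red, seaborn_red)
```

The two tuples differ, which is why a histogram drawn with color = "r" looks softer after set_color_codes() has been called.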

Adding Labels and Titles to a Histogram

To add labels and a title to a histogram, you can use the plt.xlabel(), plt.ylabel(), and plt.title() functions. The following script sets the x-axis label and the title; plt.ylabel() works the same way.

sns.set_color_codes()
sns.distplot(titanic_dataset["age"], kde = False, color = "g")

plt.xlabel("Age of Passengers", fontsize= 12)
plt.title("Histogram for Passenger Age", fontsize= 15)

Output:

histogram-labels-titles


I hope you enjoyed our Seaborn histogram tutorial. For more ways to visualize your data with Python, subscribe using the form below. We don’t email often, but we’ll send you our most helpful tutorials to make sure you’re getting the most out of Python.


Get Our Python Developer Kit for Free

I put together a Python Developer Kit with over 100 pre-built Python scripts covering data structures, Pandas, NumPy, Seaborn, machine learning, file processing, web scraping and a whole lot more - and I want you to have it for free. Enter your email address below and I'll send a copy your way.

Yes, I'll take a free Python Developer Kit