When importing or manipulating data within a Pandas DataFrame, you may notice that the data is represented as strings of numbers, rather than the numeric types themselves. Strings cannot be used in numeric calculations, and will not produce numerical summary data when using the Pandas describe
function. To remedy this, we can use a variety of Pandas operations to convert string data to numeric types.
Suppose we have a set of grades data contained within the Pandas DataFrame grades. We can use the head
and info
methods to examine the data:
import pandas as pd # Don't forget to import pandas!
grades.head() # Look at the first few rows (head) of the data
>    StudentID Homework Midterm Project Final
> 0       4560      100   97.68     100     A
> 1       5540    85.68   90.02   88.54     B
> 2       6889    92.06   85.74   88.84     B
> 3       6817    65.02    85.5   87.86     C
grades.info() # Output information about the DataFrame itself
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 4 entries, 0 to 3
> Data columns (total 5 columns):
> StudentID    4 non-null object
> Homework     4 non-null object
> Midterm      4 non-null object
> Project      4 non-null object
> Final        4 non-null object
> dtypes: object(5)
> memory usage: 240.0+ bytes
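The snippets in this tutorial assume a DataFrame named grades already exists. The tutorial does not show where the data came from, so the sketch below is only one hypothetical way to rebuild it by hand, with every value stored as a string. Passing dtype=object is an assumption made here so that the columns show up as object, as in the info output above.

```python
import pandas as pd

# Hypothetical reconstruction of the example data: every value is a string.
# dtype=object keeps the values as generic Python objects (an assumption,
# chosen to match the "object" columns reported by info above).
grades = pd.DataFrame(
    {
        "StudentID": ["4560", "5540", "6889", "6817"],
        "Homework": ["100", "85.68", "92.06", "65.02"],
        "Midterm": ["97.68", "90.02", "85.74", "85.5"],
        "Project": ["100", "88.54", "88.84", "87.86"],
        "Final": ["A", "B", "B", "C"],
    },
    dtype=object,
)

print(grades.dtypes)  # one dtype per column
```

With a DataFrame like this in place, the remaining examples can be run as written.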
We can see from the info
output that every column has been imported with the “object” data type, which in this case means the values are strings. If we want to summarize the data with the describe
method, we will receive a description of the objects rather than numeric summaries:
grades.describe()
>        StudentID Homework Midterm Project Final
> count          4        4       4       4     4
> unique         4        4       4       4     3
> top         5540    85.68   97.68   88.84     B
> freq           1        1       1       1     2
This tutorial will cover how to convert these Pandas strings into numbers so you can evaluate them numerically.
Pandas Convert String Column to Numeric
The simplest method of converting Pandas DataFrame data into numeric types is the to_numeric
function of Pandas. This function has the format [Numeric Column] = pandas.to_numeric([String Column])
where [String Column]
is the column^{1} of strings we wish to convert, and [Numeric Column]
is the new column of converted numbers. To convert a column within a DataFrame, you can simply assign the new numeric column back to the original column in the DataFrame. Since to_numeric
converts only one column of Pandas strings at a time, you’ll need to iterate over the columns with a for
loop to convert several of them. Take a look at this Python example to find out how:
import pandas as pd
cols = ['StudentID', 'Homework', 'Midterm', 'Project'] # We don't want to convert the Final grade column.
for col in cols: # Iterate over chosen columns
    grades[col] = pd.to_numeric(grades[col])
grades.info()
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 4 entries, 0 to 3
> Data columns (total 5 columns):
> StudentID    4 non-null int64
> Homework     4 non-null float64
> Midterm      4 non-null float64
> Project      4 non-null float64
> Final        4 non-null object
> dtypes: float64(3), int64(1), object(1)
> memory usage: 240.0+ bytes
grades.describe() # Now describe will report numeric summaries
>          StudentID    Homework    Midterm     Project
> count     4.000000    4.000000   4.000000    4.000000
> mean   5951.500000   85.690000  89.735000   91.310000
> std    1115.586692   14.973332   5.689156    5.807822
> min    4560.000000   65.020000  85.500000   87.860000
> 25%    5295.000000   80.515000  85.680000   88.370000
> 50%    6178.500000   88.870000  87.880000   88.690000
> 75%    6835.000000   94.045000  91.935000   91.630000
> max    6889.000000  100.000000  97.680000  100.000000
See how the describe
function now reports a numeric summary of our DataFrame? That’s the advantage of converting your Pandas strings to numbers.
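As an alternative to the explicit for loop, the same conversion can be written with DataFrame.apply, which calls pd.to_numeric once per selected column. This is a sketch rather than the method used in this tutorial; the small grades DataFrame below is a hypothetical stand-in for the example data.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's data, stored as strings.
grades = pd.DataFrame(
    {
        "StudentID": ["4560", "5540", "6889", "6817"],
        "Homework": ["100", "85.68", "92.06", "65.02"],
        "Final": ["A", "B", "B", "C"],
    },
    dtype=object,
)

cols = ["StudentID", "Homework"]  # leave the letter grade alone

# apply passes each selected column (a Series) to pd.to_numeric
grades[cols] = grades[cols].apply(pd.to_numeric)

print(grades.dtypes)
```

Both forms do the same work; the loop is easier to extend with per-column logic, while apply keeps the conversion on one line.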
The problem with this approach is that the to_numeric function guesses which data type to convert your strings to. You can see in the info
output that it converted the StudentID column to int64 and the three score columns to float64.
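To see the type guessing in isolation, the sketch below runs to_numeric on two tiny hypothetical Series: one holding only whole numbers and one that includes a decimal value. The variable names are illustrative and not part of the original example.

```python
import pandas as pd

# Only whole numbers: to_numeric picks an integer type
whole = pd.to_numeric(pd.Series(["4560", "5540"], dtype=object))

# At least one decimal value: the whole column becomes floating point
mixed = pd.to_numeric(pd.Series(["100", "85.68"], dtype=object))

print(whole.dtype)  # int64
print(mixed.dtype)  # float64
```

The chosen type therefore depends on the values present in the column, not on anything you specify.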
What if you wanted more control of how your numbers were converted? Pandas has a function for that, too!
Pandas Convert General Data Types
For most applications, the pandas.to_numeric
function above is sufficient. However, in situations where Pandas might pick a type you do not want (e.g. converting strings to floats instead of integers), we can use the astype
DataFrame method to specify the exact data type to which we wish to convert the data. This method is more generic than pandas.to_numeric
, as it can convert any data type. This method has the format [dtype2 Column] = [dtype1 Column].astype(dtype=[dtype2])
where [dtype1 Column]
is the original column or DataFrame^{2} and [dtype2 Column]
is the output column or DataFrame converted to Pandas data type [dtype2]
, where [dtype2]
is a NumPy data type. That was a mouthful. Take a look at these examples to help it make more sense.
Pandas Convert String to Float
Strings can be converted to floats using the astype
method with the NumPy data type numpy.float64
:
import pandas as pd
import numpy as np # To use the float64 dtype, we will need to import numpy
cols = ['Homework', 'Midterm', 'Project']
for col in cols: # Iterate over chosen columns
    grades[col] = grades[col].astype(dtype=np.float64)
grades.info()
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 4 entries, 0 to 3
> Data columns (total 5 columns):
> StudentID    4 non-null object
> Homework     4 non-null float64
> Midterm      4 non-null float64
> Project      4 non-null float64
> Final        4 non-null object
> dtypes: float64(3), object(2)
> memory usage: 240.0+ bytes
You can see in our info
output that the astype
method converted the columns as requested. Specifically, the three columns are now float64 data types.
Pandas Convert String to Int
Pandas strings can be converted to integers using the astype
method with the NumPy data type numpy.int64
:
import pandas as pd
import numpy as np # To use the int64 dtype, we will need to import numpy
grades["StudentID"] = grades["StudentID"].astype(dtype=np.int64)
grades.info()
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 4 entries, 0 to 3
> Data columns (total 5 columns):
> StudentID    4 non-null int64
> Homework     4 non-null object
> Midterm      4 non-null object
> Project      4 non-null object
> Final        4 non-null object
> dtypes: int64(1), object(4)
> memory usage: 240.0+ bytes
In this example, the StudentID column was converted from strings to integers, as is evident from the output of the info
method.
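The float and integer conversions can also be requested in a single call, because DataFrame.astype accepts a dictionary that maps column names to data types. The sketch below assumes a hypothetical grades DataFrame of strings like the one used throughout this tutorial; columns left out of the dictionary are not converted.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the tutorial's data, stored as strings.
grades = pd.DataFrame(
    {
        "StudentID": ["4560", "5540", "6889", "6817"],
        "Homework": ["100", "85.68", "92.06", "65.02"],
        "Midterm": ["97.68", "90.02", "85.74", "85.5"],
        "Final": ["A", "B", "B", "C"],
    },
    dtype=object,
)

# Map each column to the exact dtype we want; Final is left untouched
grades = grades.astype(
    {
        "StudentID": np.int64,
        "Homework": np.float64,
        "Midterm": np.float64,
    }
)

print(grades.dtypes)
```

Unlike to_numeric, nothing is guessed here: each listed column ends up with exactly the type named in the dictionary.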

^{1} Technically, pandas.to_numeric takes a pandas.Series object and returns the same. pandas.Series objects are data vectors with associated indices and metadata, which for all practical purposes are DataFrames of a single column.
^{2} Unlike pandas.to_numeric, which takes only a pandas.Series object, astype can convert an entire DataFrame. Because we usually only want to convert specific columns within a DataFrame, we can convert columns individually as we did with the pandas.to_numeric function.