Python Input/Output Motivating Example

Suppose you are, yet again, placed in the role of a geography teacher. You have done well in analyzing the tedious, repetitive data of your students thus far. To acknowledge your hard work, the school has “promoted” you to teaching a larger class of 300 students. Their semester grades were stored in a spreadsheet and saved as the Comma-Separated Values (CSV) file studentgrades.csv. As in our previous tutorial on Python class objects, we need to calculate the students’ final grades.

In our previous approaches to dealing with the grade data, we had to manually enter the grades in the form of a list or dictionary. This was fine when we only had a handful of students, but it would be tiring to do so with 300. We need a way to have Python automatically read in this data, process it, and produce some output. We can do this using several of Python’s Input/Output (I/O) tools.


Introduction to Python Input/Output (I/O)

There are several ways to read information into Python and to save information from Python. Which one to use depends heavily on the format of the data we wish to input or output. This tutorial breaks I/O up into the following formats:

  1. Terminal/User I/O
  2. Plain text files
  3. Comma-Separated Value (CSV) files

We will go over the most common ways to perform I/O with the formats above. Additional Python modules implement similar I/O strategies for specific applications such as SQL databases, and the standard library’s pickle and json modules provide I/O tools for Python objects. These will not be discussed in this introductory tutorial.


I/O from the Python Terminal

In the past few tutorials, we’ve extensively used the print function to print information out to the terminal.

print("Hello, world!")
> Hello, world!

While this may seem less than useful when executing commands directly from the terminal, it is necessary to provide output from the modules we have created. Formatting string output can be somewhat complex, and we will cover it in detail in a later tutorial. For now we can simply, though crudely, output text by concatenating string objects:

x = 21.2
y = True
out = "It's " + str(y) + ", the value is " + str(x)
print(out)
> It's True, the value is 21.2

We have not yet covered any form of user input from the terminal. We can request that a user input a value from the terminal using the input function. The input function has the format x = input(), where x is a string object containing the text that a user has input to the terminal. For example:

print("Input a number: ")
x = input()
print("You selected " + x)
> Input a number: 
>> 3   # >> represents keyboard input
> You selected 3

The returned output from the input command will always be a string:

print("Input a number: ")
x = input()
> Input a number: 
>> 3   
type(x)
> str

This is important to remember when the input is a numeric quantity, as it will need to be converted into a numeric type prior to any numeric calculations.

print("Input a number: ")
x = input()
> Input a number: 
>> 3   
y = x + 1  # This will fail. You cannot add an int to a str
> TypeError: can only concatenate str (not "int") to str
y = int(x) + 1  
print(y)
> 4
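
User input is not always a valid number, so the conversion itself can fail. The following is a minimal sketch of one way to guard against that; the helper name to_int is hypothetical and chosen only for illustration.

```python
def to_int(text):
    """Convert a string (such as the result of input()) to an int.

    Returns None when the text is not a whole number.
    """
    try:
        return int(text.strip())
    except ValueError:
        # int() raises ValueError for text such as "three" or "3.5"
        return None


print(to_int("3"))      # 3
print(to_int("three"))  # None
```

In a real program, the argument would come from x = input(), and a None result could be used to ask the user to try again.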

Opening and Closing Files with Python

The most basic aspect of file I/O in Python is the builtin open function and the file object it returns. The open function is given the pathname of a file, either absolute or relative to the working directory, and returns an object whose methods allow reading from and writing to that file. The open function call has the following format: open(path, mode, ...).

  • path: A string containing the pathname to the file we wish to open [1]
  • mode: The operation we intend to perform on the file. It has the following options:
    • "r": Read the file. This is the default setting.
    • "w": Write to the file. Creates the file if it doesn’t exist, and erases it if it does.
    • "a": Write to the end of a file. Creates the file if it doesn’t exist.
    • "r+": Read and write to the file. Starting position is at the start of the file.
    • "w+": Read and write to the file. Erases the file if it already exists. Starting position is at the start of the file.
    • "a+": Read and write to the file. Starting position is at the end of the file.
    • "t": Read/write data as text. This option can be added to the above (e.g. "r+t"). This is the default option.
    • "b": Read/write data as binary. This option can be added to the above (e.g. "r+b").
  • ...: This refers to other less common options available within the open function call.

The output of the open function is a file object (called the open object in this tutorial) that we assign to a variable: obj = open(filename). While the file is open, other programs may be prevented from accessing it, depending on the operating system and on what they are trying to do. When we are done with the file, we close it using the obj.close() method, where obj is the open object. Closing the file releases it for other programs and makes sure any written data is saved. Suppose we create a file named example.txt that contains the following text:

It was the best of times,
it was the worst of times,
it was the age of wisdom,
it was the age of foolishness,
it was the epoch of belief,
it was the epoch of incredulity.

Now we can begin using the open function to create the open object:

x = open("example.txt", mode="r") # Opening the text file in the working directory for reading only.
## We can do stuff with the file here ##
x.close()  # Close the file
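
The difference between the "w", "a", and "r" modes is easiest to see by running them in sequence. The following is a minimal sketch; the file name modes.txt and the temporary folder are assumptions made so the example does not touch any real files.

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()               # throwaway directory
path = os.path.join(folder, "modes.txt")  # hypothetical file name

f = open(path, mode="w")   # "w" creates the file (or erases an existing one)
f.write("first\n")
f.close()

f = open(path, mode="a")   # "a" keeps the contents and writes at the end
f.write("second\n")
f.close()

f = open(path, mode="r")   # "r" opens the file for reading only
after_append = f.read()
f.close()

f = open(path, mode="w")   # opening with "w" again erases what was there
f.write("third\n")
f.close()

f = open(path, mode="r")
after_rewrite = f.read()
f.close()

shutil.rmtree(folder)      # clean up the throwaway directory

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_rewrite))  # 'third\n'
```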

Working with Python Paths

In the above example, we only specified the name of the file without providing any context as to where it was located. The above code assumed that the file was located in the “working directory” of Python. We’ve mentioned this term before, but this is the first time we’ve had a real need to understand what it means.

When we run Python from either the terminal or a script, there is one location, called the working directory, from which relative paths are resolved. This directory is where Python will look for files that haven’t been given a full path. By default, the working directory is the directory from which Python was launched.

We can examine and manipulate the working directory using the os module:

import os
os.getcwd() # Returns a string with the working directory path
> 'C:\\Users\\Cody Gilbert'

The path format depends on the operating system you use; I am using a Windows operating system in this example.

To change the working directory, we can use the os.chdir(path) function where path is a string containing the path. For example, if we want to go up one level in the above environment, we can use the following:

os.chdir("C:\\Users") 
os.getcwd()
> 'C:\\Users'

We can create a list of all the files and directories within our working directory using the os.listdir function:

os.listdir()
> ['All Users',
>  'Cody Gilbert',
>  'Default',
>  'Default User',
>  'desktop.ini',
>  'Public']

Now that example.txt is no longer within the working directory, we’ll have to refer to it by its path. To verify that a file can be read, we can use the os.access(file, os.R_OK) function.

myfile = "C:\\Users\\Cody Gilbert\\example.txt"
os.access(myfile, mode=os.R_OK)  # mode=os.R_OK specifically checks readability 
> True

The os.access function returned True, therefore we know that there is a file at this path that is readable.
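
Because path separators differ between operating systems, it can help to build paths with os.path.join rather than typing the separators by hand. The sketch below is one way to combine the os functions above; the temporary folder and the file names are assumptions made for illustration.

```python
import os
import shutil
import tempfile

start = os.getcwd()            # remember the original working directory
folder = tempfile.mkdtemp()    # throwaway directory for this example

# os.path.join inserts the correct separator for the current operating system
path = os.path.join(folder, "example.txt")
f = open(path, mode="w")
f.write("It was the best of times,")
f.close()

readable = os.access(path, os.R_OK)                               # file exists
missing = os.access(os.path.join(folder, "nothing.txt"), os.R_OK) # file does not

os.chdir(folder)                                  # move into the folder
found_by_name = os.access("example.txt", os.R_OK) # a relative path now works
os.chdir(start)                                   # move back before cleaning up

shutil.rmtree(folder)

print(readable, missing, found_by_name)  # True False True
```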


Manually Reading and Writing files with Python

Note: This section is here to form a basis of learning. See Python Context Managers for the preferred method of reading and writing files in application.

We’ve now opened a file and closed it, but we have yet to do anything with the actual contents of the file. We can use the Python builtin methods of the open object to manipulate file contents.


Python read Method (Not Recommended)

We can use the read method to return the entire contents of the file as a string:

x = open("example.txt", mode="r")
all = x.read()
x.close()  # Be sure to close the file 
print(all)
> It was the best of times,
> it was the worst of times,
> it was the age of wisdom,
> it was the age of foolishness,
> it was the epoch of belief,
> it was the epoch of incredulity.

Caution: The read method will return the entire contents of the text file. For large text files, this can take up an enormous amount of memory. Avoid using the read method unless you are certain that the given file is sufficiently small.
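
If a file may be too large to read at once, the read method also accepts a size argument that limits how many characters are returned per call. The following is a minimal sketch of reading a file in pieces; the file name, its contents, and the chunk size of 32 are arbitrary assumptions.

```python
import os
import shutil
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "big.txt")  # hypothetical file name

f = open(path, mode="w")
f.write("abcdefghij" * 10)  # 100 characters of stand-in data
f.close()

chunk_size = 32
total = 0   # number of characters seen so far
chunks = 0  # number of read calls that returned data

f = open(path, mode="r")
while True:
    chunk = f.read(chunk_size)  # returns at most chunk_size characters
    if chunk == "":             # an empty string signals the end of the file
        break
    total += len(chunk)
    chunks += 1
f.close()

shutil.rmtree(folder)

print(chunks, total)  # 4 100
```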


Python Iterator Reading Method

The open object’s readline method works similarly to the read method, but it only returns a string up to and including the next newline character [2]. The open object is also an iterator over the lines of the file, meaning that each step of a loop over it returns the next line, just as each subsequent call to readline does. We can reproduce the string created using the read method by simply creating a for loop over the iterator.

x = open("example.txt", mode="r")
all = ""
for line in x:
	all += line
x.close()  # Be sure to close the file 
print(all)
> It was the best of times,
> it was the worst of times,
> it was the age of wisdom,
> it was the age of foolishness,
> it was the epoch of belief,
> it was the epoch of incredulity.

This method has the benefit of giving us the option to choose the data we preserve with each line iteration, as well as to perform additional operations within the for loop.
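
As an illustration of choosing which data to keep, the sketch below keeps only the lines that contain the word "age". It writes the example text to a temporary file first so that it can run on its own; the temporary folder is an assumption of the sketch.

```python
import os
import shutil
import tempfile

text = ("It was the best of times,\n"
        "it was the worst of times,\n"
        "it was the age of wisdom,\n"
        "it was the age of foolishness,\n"
        "it was the epoch of belief,\n"
        "it was the epoch of incredulity.")

folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.txt")
f = open(path, mode="w")
f.write(text)
f.close()

kept = []       # lines we decide to preserve
line_count = 0  # every line is counted, kept or not

x = open(path, mode="r")
for line in x:
    line_count += 1
    if "age" in line:
        kept.append(line.rstrip("\n"))  # drop the trailing newline
x.close()

shutil.rmtree(folder)

print(line_count)  # 6
print(kept)        # ['it was the age of wisdom,', 'it was the age of foolishness,']
```

Only one line is held in memory at a time, so the same loop works for files far larger than this one.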


Python write Method

Writing to a file in Python is simple: it’s much like using the print function, except that no newline is added for you. Writing a string to a file is performed with the obj.write(string) method, where string is a given string object you wish to write and obj is the open object.

string = ("\nIt was the season of light,\n" # Remember we need \n to start a new line
		+ "it was the season of darkness.")
x = open("example.txt", mode="a") # Switch to a to append
x.write(string)
x.close() 
# Now let's check the file by reading it
x = open("example.txt", mode="r") # Switch to r to read back in
all = ""
for line in x:
	all += line
x.close() 
print(all)
> It was the best of times,
> it was the worst of times,
> it was the age of wisdom,
> it was the age of foolishness,
> it was the epoch of belief,
> it was the epoch of incredulity.
> It was the season of light,
> it was the season of darkness.

Additional write statements will continue to append characters to the file. Starting again from the original example.txt:

string = ("\nIt was the season of light,\n" # Remember we need \n to start a new line
		+ "it was the season of darkness.")
x = open("example.txt", mode="a") # Switch to a to append
x.write(string)
string = ("\nIt was the spring of hope,\n"  # Change and add another string
		+ "it was the winter of despair.")
x.write(string)
x.close() 
# Now let's check the file by reading it
x = open("example.txt", mode="r") # Switch to r to read back in
all = ""
for line in x:
	all += line
x.close() 
print(all)
> It was the best of times,
> it was the worst of times,
> it was the age of wisdom,
> it was the age of foolishness,
> it was the epoch of belief,
> it was the epoch of incredulity.
> It was the season of light,
> it was the season of darkness.
> It was the spring of hope,
> it was the winter of despair.

Python Context Managers: the with Statement

Now that I’ve introduced you to the manual method of opening files with Python, let me go ahead and tell you to never use that method. Python has a more elegant way of handling the opening and closing of file objects than manually creating the open object and calling the close method.

The fundamental reason for wanting to shy away from using the open and close features manually is the potential for leaving a file open if an exception is raised after the open statement, but before the close statement. The file will be closed by the operating system once the Python process ends, but if the raised exception does not cause Python to stop, then the file can remain open and, on some systems, inaccessible to other programs. There are also other more technical reasons as to why we want to ensure files are closed after we are done with them.

Python’s solution to this problem is the with statement, which works with objects called context managers. A context manager provides what you can think of as a “sub-environment” under which the code is executed. It defines two special methods: anything within the __enter__ method is executed when the with block is entered, and anything within the __exit__ method is executed when the with block is exited. Most importantly, __exit__ is executed whether an exception is raised or not. The open object calls its close method within __exit__, so if any exception is raised, Python will cleanly close the file before the exception continues on.

A with statement has the format

with [object] as [variable]:
	[execution]

Like for loops and if statements, anything within the indented [execution] section will be performed with [variable] bound to the value returned by the object’s __enter__ method (for open, the file object itself). The __exit__ method is executed once the indented section is left, whether it completed normally or raised an exception.
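
To make the order of events concrete, here is a minimal sketch of a hand-written context manager. The class name Recorder and the events list are hypothetical and exist only to record when each method runs.

```python
events = []  # records the order in which things happen


class Recorder:
    """A toy context manager that logs when it is entered and exited."""

    def __enter__(self):
        events.append("enter")
        return self  # this value is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        events.append("exit")
        return False  # False means: do not swallow the exception


try:
    with Recorder() as r:
        events.append("body")
        raise RuntimeError("something went wrong")
except RuntimeError:
    events.append("handled")

print(events)  # ['enter', 'body', 'exit', 'handled']
```

Note that "exit" is recorded before "handled": the __exit__ method runs even though the body raised an exception.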

Let’s look at a simplified file I/O using the with statement.

with open("example.txt", mode="r") as x:
	all = ""
	for line in x:
		all += line

You can see that we no longer need to use the close method, and we are protected in the event an exception is raised.
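
The protection can be checked directly with the closed attribute of the file object. The sketch below deliberately raises an exception inside the with block; the temporary folder is an assumption made so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")
    with open(path, mode="w") as x:
        x.write("It was the best of times,\n")

    try:
        with open(path, mode="r") as x:
            first = x.readline()
            int(first)  # raises ValueError: the line is not a number
    except ValueError:
        pass  # the exception was raised inside the with block

    closed_after_error = x.closed  # the file object still exists, but is closed

print(closed_after_error)  # True
```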


Python csv Module

So far we’ve looked at reading in each line from a file as its own string, but what if we want to access the information within those rows like the columns in our motivating example data? One way is to read in the data using the with open method and separate the data using the split string method. split has the format [string].split([delimiter]), where [delimiter] is the delimiting or separating character, and [string] is the string we wish to split. The output will be a list of strings that are separated by the delimiter.

with open("example.txt", mode="r") as x:
	all = []
	for line in x:
		all.append(line.split(" ")) # Use a blank space delimiter
print(all)
> [['It', 'was', 'the', 'best', 'of', 'times,\n'],
>  ['it', 'was', 'the', 'worst', 'of', 'times,\n'],
>  ['it', 'was', 'the', 'age', 'of', 'wisdom,\n'],
>  ['it', 'was', 'the', 'age', 'of', 'foolishness,\n'],
>  ['it', 'was', 'the', 'epoch', 'of', 'belief,\n'],
>  ['it', 'was', 'the', 'epoch', 'of', 'incredulity.']]

The split method requires a bit of extra coding: notice the trailing \n left on the last word of most lines above. It also cannot handle fields that themselves contain the delimiter. A better way to import table data is by using the csv module.

The csv module is used primarily to read in Comma-Separated Value (CSV) files, however it can be used more generally to import any kind of delimited data file. The process for using the csv module for reading and writing is similar to iterating over an open object.
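
One place where the difference from split matters is a quoted field that contains the delimiter. The following sketch compares the two approaches on a single made-up record; the name column is hypothetical and does not appear in the tutorial’s data files.

```python
import csv
import io

# Hypothetical record: the name contains a comma, so it is wrapped in quotes
text = '4665178,"Smith, Jane",95.9\n'

# Splitting on every comma breaks the quoted name into two pieces
naive = text.rstrip("\n").split(",")

# csv.reader understands the quotes and keeps the name together
parsed = next(csv.reader(io.StringIO(text)))

print(naive)   # ['4665178', '"Smith', ' Jane"', '95.9']
print(parsed)  # ['4665178', 'Smith, Jane', '95.9']
```

Here io.StringIO stands in for an open file so that the example runs without any file on disk.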

For the next few sections, assume we have the following sample data taken from our motivating example and stored within the file sample.csv:

Student ID,Homework Total (20%),Project 1 (10%),Midterm (20%),Project 2 (10%),Final (40%)
4665178,95.9,91.9,91.2,90.9,99
4665187,83.1,83,75.7,92,78.6
4665203,89.9,73.6,98.9,84.1,95.2
4665219,81.3,88.5,84.8,81.2,91.1

Python csv.reader Function

Let’s use the csv module to read in data from sample.csv. Data can be read using the reader function with the format [object] = csv.reader([openobj], delimiter=[delimiter], ...), where [object] is a new csv.reader object, [openobj] is an open object, [delimiter] is the character we wish to use as a delimiting or separating value, and ... indicates other options that we won’t explore at this time. [delimiter] by default is “,”.

When using the csv functions, we need to add the option newline="" to the open statement. The csv module handles line endings itself; without this option, newlines inside quoted fields may be misread, and on platforms that use \r\n line endings an extra \r can be added to each row when writing.

The following example will read in sample.csv:

import csv
with open("sample.csv", mode="r", newline="") as x:
	reader = csv.reader(x) # Default delimiter is ","
	all = []
	for line in reader:
		all.append(line)
print(all)
> [['Student ID',
>   'Homework Total (20%)',
>   'Project 1 (10%)',
>   'Midterm (20%)',
>   'Project 2 (10%)',
>   'Final (40%)'],
>  ['4665178', '95.9', '91.9', '91.2', '90.9', '99'],
>  ['4665187', '83.1', '83', '75.7', '92', '78.6'],
>  ['4665203', '89.9', '73.6', '98.9', '84.1', '95.2'],
>  ['4665219', '81.3', '88.5', '84.8', '81.2', '91.1']]

If we instead wanted to obtain a list of all the words in example.txt, we would need to change the delimiter to be a blank character " ".

import csv
with open("example.txt", mode="r", newline="") as x:
	reader = csv.reader(x, delimiter=" ")
	all = []
	for line in reader:
		all.append(line)
print(all)
> [['It', 'was', 'the', 'best', 'of', 'times,'],
>  ['it', 'was', 'the', 'worst', 'of', 'times,'],
>  ['it', 'was', 'the', 'age', 'of', 'wisdom,'],
>  ['it', 'was', 'the', 'age', 'of', 'foolishness,'],
>  ['it', 'was', 'the', 'epoch', 'of', 'belief,'],
>  ['it', 'was', 'the', 'epoch', 'of', 'incredulity.']]

Python csv.writer Function

Now let’s look at how to use the csv module to write data to a file. The writer function is easiest to use when the data is a list of rows, where each row is itself a list of strings to be used as columns. The writer function has the format [object] = csv.writer([openobj], delimiter=[delimiter], ...) and creates an object similar to the reader object, but whose methods write delimited data to the given open file object. Its writerows([rows]) method takes an iterable of rows, such as a list of lists, and writes them to the file separated by [delimiter]. The default delimiter is again “,”.

As with csv.reader, the file should be opened with the option newline="" so that the csv module, rather than the open object, controls the line endings that are written.

If the all variable contains the csv.reader input from sample.csv, then we can write a new file sample2.csv:

import csv
# all is already defined
with open("sample2.csv", mode="w", newline="") as x:
	writer = csv.writer(x) # Default delimiter is ","
	writer.writerows(all)
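
The delimiter is chosen when the writer object is created, and the writer also offers a writerow method for a single row. The following sketch writes tab-delimited data and reads it back; it uses io.StringIO in place of a real file, and the two rows are a cut-down, illustrative version of the sample data.

```python
import csv
import io

rows = [["Student ID", "Final (40%)"],
        ["4665178", "99"]]

buffer = io.StringIO()                       # stands in for an open file
writer = csv.writer(buffer, delimiter="\t")  # delimiter is set on the writer
writer.writerow(rows[0])                     # writerow writes a single row
writer.writerows(rows[1:])                   # writerows writes many rows
text = buffer.getvalue()

# Read the text back using the same delimiter
restored = list(csv.reader(io.StringIO(text), delimiter="\t"))

print(text.splitlines())  # ['Student ID\tFinal (40%)', '4665178\t99']
print(restored == rows)   # True
```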

Python I/O Example

Now that we’ve seen how to read in data from a CSV file, let’s use this information to analyze the student grades from our motivating example.

The studentgrades.csv file contains only 300 lines of data, so we can read it directly into a list without worrying about memory. We read it into a list with the following:

import csv
with open("studentgrades.csv", mode="r", newline="") as x:
	reader = csv.reader(x) # Default delimiter is ","
	data = []
	for line in reader:
		data.append(line)

Now that we have the data, we can calculate the final grade for each student. Recall that each sublist element is a string, therefore we will have to convert them into numeric values before calculating the final grade.

header = data[0]  # Save the header
del(data[0])  # Delete the header string from the list
for student in data:
	final = ( float(student[1])*0.2  # Homework Total (20%)
			+ float(student[2])*0.1  # Project 1 (10%)
			+ float(student[3])*0.2  # Midterm (20%)
			+ float(student[4])*0.1  # Project 2 (10%)
			+ float(student[5])*0.4) # Final (40%)
	final = round(final, 2)  # Round to trim spurious digits. 
	student.append(final) # Add a new column with the final grade
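
The same calculation can be wrapped in a small function so that the weights live in one place. This is a sketch only; the names WEIGHTS and final_grade are hypothetical, and the row used to try it out is the first student from sample.csv.

```python
# Weights for columns 1 to 5, in the order they appear in the header
WEIGHTS = [0.2, 0.1, 0.2, 0.1, 0.4]


def final_grade(row):
    """Return the weighted final grade for one row of strings from the CSV file."""
    scores = [float(value) for value in row[1:6]]  # skip the Student ID column
    total = sum(score * weight for score, weight in zip(scores, WEIGHTS))
    return round(total, 2)


row = ['4665178', '95.9', '91.9', '91.2', '90.9', '99']
print(final_grade(row))  # 95.3
```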

Now we can save the data back to studentgrades.csv with the final grade included.

header.append("Final Grade")  # Add column to header
data.insert(0, header)  # Reinsert the new header
with open("studentgrades.csv", mode="w", newline="") as x:
	writer = csv.writer(x) # Default delimiter is ","
	writer.writerows(data)

Now we have automated a method for inserting a final grade into any table of student grades.


Python I/O Practice Problems

  1. Use studentgrades.csv to print the Student ID of the student with the highest grade.

  2. Use studentgrades.csv to create a new file passedstudent.csv that contains the grades of each student that had a passing grade (total grade at least 85).

  3. Suppose someone attempted to write a list of data to a file using the following code:

... # Other code
with open("example.txt", mode="w") as x:
	writer = csv.writer(x, delimiter=" ")
	writer.writerows(all)
... # Other code

However, they’re surprised to find that the output contains twice as many lines as expected. The text looks like:

[Line 1]

[Line 2]

[Line 3]

Why did this occur? How can this error be resolved?

  4. Suppose you executed Python within the directory C:\\My Documents\\Python, however the Python module and associated data that you want to use is in the directory C:\\My Documents\\Projects\\Project 1. What one-line command can you use to change directories?

  5. Suppose you have a file RawData.csv that contains millions of rows of data in a CSV format. You need to convert the file to RawData-Spaced.txt that contains the same data, but delimited by spaces. You cannot read the entire file into a list due to memory restrictions. Write a piece of code that performs this efficiently.


Solutions

1. We can do this by reading each student’s grade, calculating the final grade, and returning the maximum grade along with the student ID. We can also do this efficiently without having to read the entire file into memory.

import csv
maximum = 0
ID = None
with open("studentgrades.csv", mode="r", newline="") as x:
	reader = csv.reader(x) # Default delimiter is ","
	next(reader, None)  # Skip the first header row
	for student in reader:
		final = ( float(student[1])*0.2  # Homework Total (20%)
				+ float(student[2])*0.1  # Project 1 (10%)
				+ float(student[3])*0.2  # Midterm (20%)
				+ float(student[4])*0.1  # Project 2 (10%)
				+ float(student[5])*0.4) # Final (40%)
		if final > maximum:
			maximum = final
			ID = student[0]
print([ID, round(maximum, 2)])

2. We can do this by modifying the example’s code to remove all non-passing students prior to saving the file.

import csv
with open("studentgrades.csv", mode="r", newline="") as x:
	reader = csv.reader(x) # Default delimiter is ","
	data = []
	for line in reader:
		data.append(line)
passed = []  # This list will contain the students who passed
passed.append(data[0])  # Save the header to the passed student list
passed[0].append("Final Grade")  # Add Final Grade to the header
del(data[0])  # Delete the header string from the list
for student in data:
	final = ( float(student[1])*0.2  # Homework Total (20%)
			+ float(student[2])*0.1  # Project 1 (10%)
			+ float(student[3])*0.2  # Midterm (20%)
			+ float(student[4])*0.1  # Project 2 (10%)
			+ float(student[5])*0.4) # Final (40%)
	student.append(str(round(final, 2)))  # Append the final grade
	if final >= 85:
		passed.append(student) # Add passing student to list
with open("passedstudent.csv", mode="w", newline="") as x:
	writer = csv.writer(x) # Default delimiter is ","
	writer.writerows(passed)

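3. This occurred because the file was opened without the newline="" argument. The csv writer ends each row with \r\n by default; on platforms that use \r\n line endings, such as Windows, text mode then translates the \n into another \r\n, leaving an extra \r that shows up as a blank line. We can prevent this from happening by using the open function with the newline="" argument.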

4. We will need to change the working directory to the directory where our files are stored. We can use os.chdir to change the working directory within Python: os.chdir("C:\\My Documents\\Projects\\Project 1").

5. The file is far too large to read into a list directly, so we will have to write each row we encounter to the next file on-the-fly. We can do this using nested with statements.

import csv
with open("RawData.csv", mode="r", newline="") as x:
	with open("RawData-Spaced.txt", mode="w", newline="") as y:
		reader = csv.reader(x)
		writer = csv.writer(y, delimiter=" ") 
		for line in reader:
			writer.writerow(line)  # writerow writes a single row at a time

Instead of nested with statements, we could have used a single with statement that lists both files separated by a comma, in the format with [object 1] as [variable 1], [object 2] as [variable 2].
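
As a sketch of the single-statement form, the example below performs the same conversion on a tiny stand-in file. The temporary folder and the two rows of data are assumptions made so the example can run on its own.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "RawData.csv")
    target = os.path.join(folder, "RawData-Spaced.txt")

    # Create a small stand-in for the real (much larger) file
    with open(source, mode="w", newline="") as f:
        f.write("a,b,c\r\n1,2,3\r\n")

    # One with statement opens both files; both are closed when the block ends
    with open(source, mode="r", newline="") as x, \
         open(target, mode="w", newline="") as y:
        writer = csv.writer(y, delimiter=" ")
        for line in csv.reader(x):
            writer.writerow(line)  # only one row is in memory at a time

    with open(target, mode="r", newline="") as f:
        result = f.read()

print(repr(result))  # 'a b c\r\n1 2 3\r\n'
```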


  1. open also accepts other path-like objects, such as those produced by the pathlib module, but a path string will suffice.

  2. Recall that the newline character is represented in Python by the escape sequence \n (a line feed). Windows text files typically end each line with a carriage return followed by a line feed, \r\n.