The Python standard library provides several tools for reading and writing text files, but what about non-text files? With Python, it’s possible to open non-text files as well. We can do this in a variety of ways.
In most cases, opening non-text files in Python can be done using the standard library. But with the help of some modules, we can take this process much further.
In this tutorial, we’ll cover several methods for opening non-text files in Python. Using Python code examples, we’ll explore reading and writing binary files, a process that includes converting data between text and bytes.
We’ll also show you how to use the Pillow module, which is used for creating and editing image files. Next, we’ll look at using the wave module to read audio data. As a treat, we’ve included an example of generating an audio file from scratch with Python.
We’ll also examine how to use the open() function to read .csv files, and compare it to opening binary files. If you’re interested in learning how to read non-text files in Python, you’ve come to the right place.
What’s the Difference Between a Text File and Non-Text File?
Text files are normally composed of words and numbers. These files have lines of text, usually written in a language people can read (as opposed to machines).
Text files store each character as one or more bytes according to a character encoding, such as ASCII (American Standard Code for Information Interchange) or UTF-8. Text files often end with the extension .txt, although this isn’t always the case.
Non-text files, on the other hand, are files that contain data other than encoded text, such as images, audio, or compiled programs. Usually all that is needed to open a non-text file in Python is the standard library that is distributed with the language. But with the help of a module or two, we can take our Python skills to another level.
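To make the distinction concrete, here is a minimal sketch that writes a short string to a file and reads it back in both text mode and binary mode. The file name sample.txt and the temporary directory are assumptions made for the demo.

```python
import os
import tempfile

# Hypothetical file name, created in a temporary directory for this demo.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write four characters of text using an explicit encoding.
with open(path, "w", encoding="utf-8") as f:
    f.write("café")

# Text mode decodes the bytes on disk and returns a str.
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode returns the raw bytes exactly as they are stored.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text), len(as_text))
print(type(as_bytes), len(as_bytes))
```

The same file yields four characters in text mode but five bytes in binary mode, because UTF-8 stores the accented letter as two bytes.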
Reading and Writing Binary (Non-Text) Files in Python
Binary files, also known as non-text files, contain groups of binary digits (bits). Binary files group bits into sequences of eight, called bytes. Usually, these bytes represent something other than text data.
With Python, we can read and write binary files using standard functions. For instance, we can use the open() function to create a new binary file. To do so, we pass a mode string as the second argument. The mode 'wb' tells open() to open the file for writing (w) in binary mode (b), and adding a plus sign ('w+b') allows reading as well. In both cases, an existing file with the same name is overwritten.
After opening a new file, we can convert a list of numbers into bytes using the bytearray() function. The bytearray() function converts objects into an array of bytes. When given a list of integers, each value must be between 0 and 255. This binary data can be saved to disk using the file object’s write() method.
Example: Writing a Binary File to Disk
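byte_arr = [1, 2, 3, 4]
# convert the list of integers (each between 0 and 255) to a byte array
binary_format = bytearray(byte_arr)
# 'wb' opens the file for writing in binary mode;
# the with statement closes the file automatically
with open("binary_file", 'wb') as f:
    f.write(binary_format)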
Likewise, we can open a non-text file for reading with the open() function. When reading binary file types, pass the mode 'rb' to open(): r for read and b for binary. In binary mode, read() returns a bytes object rather than a string.
Example: Reading a Binary File
with open("binary_file", 'rb') as file:
    # read() returns a bytes object; list() turns it into a list of integers
    number = list(file.read())
print("Binary data =", number)
Output
Binary data = [1, 2, 3, 4]
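A list passed to bytearray() can only hold values from 0 to 255. For larger or negative integers, one option is the standard struct module, which packs numbers into a fixed number of bytes. The sketch below is an illustration only; the file name numbers.bin and the choice of 32-bit little-endian integers are assumptions.

```python
import os
import struct
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "numbers.bin")

values = [1, 300, 70000, -5]

# "<4i" means: little-endian, four signed 32-bit integers (16 bytes in total).
with open(path, "wb") as f:
    f.write(struct.pack("<4i", *values))

with open(path, "rb") as f:
    raw = f.read()

# Unpack with the same format string that was used for packing.
restored = list(struct.unpack("<4i", raw))
print(len(raw), restored)
```

The reader has to know the format string in advance, since a binary file carries no description of its own layout.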
How to Convert Binary Data to Text
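Using standard Python methods, it’s possible to convert text into binary data, and vice versa. The decode() method converts a bytes object into a string, and the encode() method converts a string back into bytes. Both take the name of a character encoding, such as UTF-8, which is a superset of ASCII. Doing so allows us to take a binary message and turn it into something a human could read.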
Example: Converting Binary Data in Python
# binary to text
binary_data = b'Hello World.'
text = binary_data.decode('utf-8') # decode the bytes into a string
print(text)
# convert text to bytes
binary_message = text.encode('utf-8')
print(type(binary_message))
binary_data = bytes([65, 66, 67]) # ASCII values for the letters A, B, and C
text = binary_data.decode('utf-8')
print(text)
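Decoding only works when the bytes really were produced with the encoding you name. The following sketch, using a sample string chosen for illustration, shows what happens when UTF-8 bytes are decoded as ASCII, and how the errors argument of decode() changes that behavior.

```python
text = "naïve café"

utf8_bytes = text.encode("utf-8")
# The two accented letters take two bytes each in UTF-8.
print(len(text), len(utf8_bytes))

# Decoding with the wrong codec raises an error ...
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)

# ... unless an error handler is supplied. "replace" substitutes
# the Unicode replacement character for every undecodable byte.
lossy = utf8_bytes.decode("ascii", errors="replace")
print(lossy)
```

Using errors="replace" keeps the program running, but the original characters are lost, so it is best reserved for display or logging.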
Opening an Image File in Python
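It’s possible to open image files using the open() function, but opening one in text mode isn’t very useful. In the following example, we open an image file in read mode and print the resulting file object. Notice that the output describes the file object, not the image data, and that the encoding shown depends on your platform. To read the raw bytes of an image, open it in binary mode ('rb') instead.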
file = open("boat.jpg",'r')
print(file)
Output
<_io.TextIOWrapper name='boat.jpg' mode='r' encoding='cp1252'>
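Opening the file in binary mode gives access to the actual bytes. As a small illustration, the sketch below checks whether a file begins with the three-byte JPEG start marker. The helper looks_like_jpeg() and the stand-in file fake.jpg are hypothetical names created for this demo; the stand-in contains only header bytes, not a real picture.

```python
import os
import tempfile

def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG start-of-image marker."""
    with open(path, "rb") as f:
        return f.read(3) == b"\xff\xd8\xff"

# Stand-in file for the demo: JPEG-style header bytes followed by zeros.
demo_path = os.path.join(tempfile.mkdtemp(), "fake.jpg")
with open(demo_path, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0" + bytes(12))

print(looks_like_jpeg(demo_path))
```

A check like this only inspects the first bytes. Decoding the pixels themselves is a job for an imaging library.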
Using Python modules, we can do much more than open image files. With the Pillow module, we can process the image to our liking.
Before you can use the Pillow module, you’ll need to install it. The easiest way to install a Python module is by running pip from the command prompt, or terminal.
Use the following command to install Pillow.
pip install pillow
With Pillow installed on your computer, you can open and edit photos and other image files. Using Pillow, we can read an image file and print its dimensions to the console.
How to Use the Pillow Module
Before we can use the Pillow module, we have to let Python know we want to use it in our program. We do this by importing the module. In the example below, we use from and import to include the Image module from PIL, the package name that Pillow installs.
Example: Open an Image File with PIL
from PIL import Image
# open the original image
img = Image.open("boat.jpg")
size = img.size
print(size)
# opens the image in the default picture viewer
img.show()
Pillow includes modules for editing photos. Using ImageOps, which is included with Pillow, we can invert the colors of a photo and display the result in your computer’s default photo viewer.
Example: Invert an Image using PIL
from PIL import Image, ImageOps
# open the original image
img = Image.open("boat.jpg")
size = img.size
print(size)
# invert the image with ImageOps
img = ImageOps.invert(img)
# opens the image in the default picture viewer
img.show()
Opening an Audio File in Python
Python provides tools for reading and creating audio files. Using the wave module from the standard library, we can open uncompressed .wav audio files and inspect their data. In the following example, we’ll open a .wav file named “portal.wav” and print its sample rate to the console.
Example: Open a .wav File with the Wave Module
import wave
audio = wave.open("portal.wav", 'rb')
print("Sample Rate:", audio.getframerate())
audio.close()
Output
Sample Rate: 44100
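The sample rate is only one of the properties stored in a .wav header. The sketch below reads a few more and uses them to compute the length of the clip. So that it runs on its own, it first creates a short silent file; the file name silence.wav and the 8000 Hz rate are assumptions for the demo.

```python
import os
import tempfile
import wave

path = os.path.join(tempfile.mkdtemp(), "silence.wav")

# Create a half-second silent mono file so the example has something to read.
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)               # 2 bytes per sample = 16-bit audio
    out.setframerate(8000)
    out.writeframes(bytes(2 * 4000))  # 4000 frames of zeros

with wave.open(path, "rb") as audio:
    channels = audio.getnchannels()
    sample_width = audio.getsampwidth()
    frame_rate = audio.getframerate()
    frame_count = audio.getnframes()

# Duration in seconds is the number of frames divided by frames per second.
duration = frame_count / frame_rate
print(channels, sample_width, frame_rate, frame_count, duration)
```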
Going a step further, we can generate our own audio files from scratch using the wave module. By assigning each frame of the audio a random value, we can generate an audio file of static noise.
Example: Generate an Audio file with the Wave Module
import wave, struct, random
# this example will generate an audio file of static noise
sample_rate = 44100 # hertz
duration = 1.0 # seconds
num_frames = int(sample_rate * duration)
sound = wave.open("sound.wav", 'wb')
sound.setnchannels(1) # mono
sound.setsampwidth(2) # 2 bytes per sample (16-bit audio)
sound.setframerate(sample_rate)
for i in range(num_frames):
    # 32767 is the maximum value for a signed 16-bit integer.
    random_val = random.randint(-32767, 32767)
    # '<h' packs the value as a little-endian short, the byte order WAV files use
    data = struct.pack('<h', random_val)
    sound.writeframesraw(data)
sound.close()
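Replacing the random values with a sine function produces a steady tone instead of static. The following is a minimal sketch of that idea: the 440 Hz frequency, the half-scale amplitude, and the file name tone.wav are assumptions chosen for the demo.

```python
import math
import os
import struct
import tempfile
import wave

sample_rate = 44100   # frames per second
duration = 1.0        # seconds
frequency = 440.0     # hertz (the musical note A4)
amplitude = 0.5       # fraction of full scale, assumed here to avoid clipping

num_frames = int(sample_rate * duration)
frames = bytearray()
for i in range(num_frames):
    # Value of the sine wave at frame i, in the range -amplitude..amplitude.
    sample = amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)
    # Scale to a signed 16-bit integer and pack as little-endian.
    frames += struct.pack("<h", int(sample * 32767))

# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "tone.wav")
with wave.open(path, "wb") as sound:
    sound.setnchannels(1)   # mono
    sound.setsampwidth(2)   # 16-bit samples
    sound.setframerate(sample_rate)
    sound.writeframes(bytes(frames))
```

Collecting the samples in a bytearray and writing them in one call avoids tens of thousands of small writes. Changing frequency changes the pitch of the tone.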
How to Open a CSV File In Python
Comma-separated values (CSV) files are a convenient way of storing and exchanging data. These files contain fields of numbers and/or text separated by commas.
Even though CSV files don’t end with the .txt extension, they are considered text files because they contain plain text characters. Learning how to open CSV files is a useful skill to have as a Python developer. Use the following examples to compare opening non-text files and text files in Python.
Using the example CSV file below, we’ll explore how to read CSV data using Python.
animal_kingdom.csv
"amphibians","reptiles","birds","mammals"
"salamander","snake","owl","coyote"
"newt","turtle","bald eagle","raccoon"
"tree frog","alligator","penguin","lion"
"toad","komodo dragon","chicken","bear"
Example: Using open() to Read a CSV File
with open("animal_kingdom.csv",'r') as csv_file:
    file = csv_file.read()
    print(file)
Running the code above will print the contents of animal_kingdom.csv to the console. There is, however, a better way to read CSV files in Python.
How to Use the CSV Module
The csv module is part of the Python standard library, so there’s no need to install it. Using the csv module gives us greater control over the contents of the CSV file. For example, we can read the field names from the first row using the reader() function, which returns each row as a list of strings.
Example: Using the csv module to read a csv file in Python
import csv
# newline='' lets the csv module handle line endings itself
with open("animal_kingdom.csv", 'r', newline='') as csv_file:
    csvreader = csv.reader(csv_file)
    # read the field names from the first row
    fields = next(csvreader)
    print('Field Names\n----------')
    for field in fields:
        print(field)
    print('\nRows\n---------------------')
    for row in csvreader:
        for col in row:
            print(col, end="--")
        print("\n")
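When a file has a header row, the csv module’s DictReader class may be more convenient than reader(), because it returns each row as a dictionary keyed by the field names. The sketch below recreates the first rows of the sample file in a temporary directory, an assumption made so the example runs on its own, and then pulls out a single column.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "animal_kingdom.csv")

# Recreate the first rows of the sample file so the example is self-contained.
rows = [
    ["amphibians", "reptiles", "birds", "mammals"],
    ["salamander", "snake", "owl", "coyote"],
    ["newt", "turtle", "bald eagle", "raccoon"],
]
with open(path, "w", newline="") as csv_file:
    csv.writer(csv_file, quoting=csv.QUOTE_ALL).writerows(rows)

# DictReader uses the first row as the keys for every following row.
with open(path, newline="") as csv_file:
    records = list(csv.DictReader(csv_file))

# Collect one column by name instead of by position.
birds = [record["birds"] for record in records]
print(birds)
```

Looking up columns by name keeps the code working even if the column order in the file changes.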
Conclusion
We’ve examined a variety of ways of working with non-text files in Python. By converting a list of integers into a byte array, we can create our own binary files. And with the help of the read() method, we can read the binary data and print it to the console.
The Python module Pillow is a great choice for opening and editing image files. If you’re interested in learning more about image processing, the Pillow module is an excellent starting point.
The wave module is distributed with Python. Use it to read and write WAV audio data. With a little mathematics, the wave module can be used to generate a variety of sound effects.