Excel Files are used in the Windows file system to store tabular data. Unlike a CSV file that contains a single sheet, excel files can store multiple sheets. In this article, we will discuss different ways to read multiple sheets from an Excel file into a pandas dataframe.
- Read Excel File Using The read_excel()Method
- Using the ExcelFile Class and read_excel() Method
- Using the ExcelFile Class and The parse() Method
- Pandas Read Multiple Excel Sheets By Name
- Pandas Read Multiple Excel Sheets By Position
- Pandas Read Multiple Excel Sheets into a Dictionary
- Read Multiple Excel Sheets into a Single Pandas DataFrame
- Conclusion
Read Excel File Using The read_excel()Method
To read an Excel file using the read_excel()
method defined in the pandas module, we will simply pass the file name to the read_excel()
method. After execution, the read_excel()
method returns the first sheet of the Excel file as a dataframe.
For the examples in this article, we will use the following Excel file.
You can read the above Excel file using the read_excel()
method in the pandas module as shown below.
import pandas as pd
filename='sample_excel_file.xlsx'
df = pd.read_excel(filename)
print("The dataframe is:")
print(df)
Output:
The dataframe is:
Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65
The Excel file in the above example contains three sheets. However, when we read an Excel file using the read_csv()
method, it returns the first sheet of the Excel file as pandas dataframe.
Using the ExcelFile Class and read_excel() Method
The ExcelFile
class is a great tool for reading multiple sheets from an Excel file in Python. The ExcelFile()
constructor takes the filename of the Excel file as its input argument and returns an ExcelFile
object. We can use this ExcelFile
object to read multiple sheets into pandas dataframes. For instance, you can read the name of all the sheets of an Excel file using the ExcelFile
class as shown below.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet_names=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet_names)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
In the above example, we first created an ExcelFile
object using the ExcelFile()
function. Then, we used the sheet_names
attribute to get the list of sheet names in the input Excel file.
Once we get the ExcelFile
object, we can use it to read a sheet from the Excel file. For this, we will first get the name of all the sheets using the sheet_names
attribute of the ExcelFile
object. Then, we will pass the ExcelFile
object as the first input argument and the name of the required sheet as the second input argument to the read_excel()
method. After execution of the read_excel()
method, you will get the sheet into the dataframe as shown below.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df = pd.read_excel(xls, 'class_1')
print("The dataframe is:")
print(df)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The dataframe is:
Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65
In this example, we passed the name of the Excel File as the first input argument and the name of the sheet as the second input argument to the read_excel()
method. After execution, it returns the specified sheet as a dataframe.
Using the ExcelFile Class and The parse() Method
If you don’t know the names of the sheets in the Excel file, you can use the parse()
method to read the sheet into a pandas dataframe. The parse()
method, when invoked on the ExcelFile
object, takes the position of the sheet in the Excel file. After execution, it returns the data frame as shown below.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
df = xls.parse(0)
print("The first dataframe is excel file is:")
print(df)
Output:
The first dataframe is excel file is:
Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65
In this example, we have passed the index 0 to the parse()
method. Hence, it returned the first sheet of the Excel file in the dataframe. To read other sheets, you can pass their respective position to the parse()
method as the input argument.
Pandas Read Multiple Excel Sheets By Name
To read multiple Excel sheets by name into pandas dataframes, we will first create an ExcelFile
object using the ExcelFile()
function. Then, we will use the sheet_names
attribute of the ExcelFile
object to get a list of all the sheet names in the Excel file. Finally, we will use the read_excel()
method to read the sheets into different dataframes as shown in the following example.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df1 = pd.read_excel(xls, 'class_1')
df2 = pd.read_excel(xls, 'class_2')
df3 = pd.read_excel(xls, 'class_3')
print("The first dataframe is:")
print(df1)
print("The second dataframe is:")
print(df2)
print("The third dataframe is:")
print(df3)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The first dataframe is:
Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65
The second dataframe is:
Student Physics Chemistry Math
0 Harry 68 92 69
1 Dan 74 95 96
2 Joshua 99 79 77
3 Jennifer 72 94 61
4 Lisa 98 99 93
5 Patricia 93 67 78
6 Elizabeth 99 76 78
The third dataframe is:
Student Physics Math Biology
0 Aditya 92 95 73
1 Chris 95 79 71
2 Sam 65 75 95
3 Harry 68 69 66
4 Golu 74 96 76
5 Joel 99 77 91
6 Tom 72 61 65
7 Harsh 98 93 95
8 Clara 93 78 79
9 Tina 99 78 94
Instead of reading the sheets into different variables, we can read multiple sheets of an Excel file into a list of dataframes.
For this, we will first create an empty list to store the output dataframes. Then, we will iterate through the sheet names using the for loop. While iterating, we will read each sheet using the read_excel()
method and add it to the list using the append()
method. After execution of the for loop, we will get all the dataframes created from the Excel file in a list.
You can observe this in the following example.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df_list=[]
for name in sheet:
df=pd.read_excel(xls, name)
df_list.append(df)
print("The list of dataframes is:")
print(df_list)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The list of dataframes is:
[ Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65,
Student Physics Chemistry Math
0 Harry 68 92 69
1 Dan 74 95 96
2 Joshua 99 79 77
3 Jennifer 72 94 61
4 Lisa 98 99 93
5 Patricia 93 67 78
6 Elizabeth 99 76 78,
Student Physics Math Biology
0 Aditya 92 95 73
1 Chris 95 79 71
2 Sam 65 75 95
3 Harry 68 69 66
4 Golu 74 96 76
5 Joel 99 77 91
6 Tom 72 61 65
7 Harsh 98 93 95
8 Clara 93 78 79
9 Tina 99 78 94]
Pandas Read Multiple Excel Sheets By Position
To read multiple Excel sheets into pandas dataframes by position, we will use the parse()
method and the ExcelFile()
function. For this, we will use the following steps.
- First, we will create an
ExcelFile
object using theExcelFile()
function. TheExcelFile()
function takes the name of the Excel file as its input argument and returns anExcelFile
object. - Next, we find the number of sheets in the Excel file. For this, we will first get the list of sheet names using the
sheet_names
attribute of theExcelFile
object. We will also get the length of the list containing sheet names using thelen()
function. - Once we get the number of sheets in the Excel file, we will use the
parse()
method to read all the sheets into the pandas dataframes. Theparse()
method, when invoked on anExcelFile
object, takes the position of the sheet as its input argument and returns the data into a dataframe.
You can observe the steps discussed above in the following example.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
for position in range(number_of_sheets):
df=xls.parse(position)
print("The dataframe at position {} is:".format(position))
print(df)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The dataframe at position 0 is:
Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65
The dataframe at position 1 is:
Student Physics Chemistry Math
0 Harry 68 92 69
1 Dan 74 95 96
2 Joshua 99 79 77
3 Jennifer 72 94 61
4 Lisa 98 99 93
5 Patricia 93 67 78
6 Elizabeth 99 76 78
The dataframe at position 2 is:
Student Physics Math Biology
0 Aditya 92 95 73
1 Chris 95 79 71
2 Sam 65 75 95
3 Harry 68 69 66
4 Golu 74 96 76
5 Joel 99 77 91
6 Tom 72 61 65
7 Harsh 98 93 95
8 Clara 93 78 79
9 Tina 99 78 94
In the above example, we printed the dataframes after reading them using a for loop and the parse()
method. Instead of printing the dataframes, you can also make a list of dataframes of all the sheets as shown below.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
df_list=[]
for position in range(number_of_sheets):
df=xls.parse(position)
df_list.append(df)
print("The list of dataframes is:")
print(df_list)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The list of dataframes is:
[ Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65,
Student Physics Chemistry Math
0 Harry 68 92 69
1 Dan 74 95 96
2 Joshua 99 79 77
3 Jennifer 72 94 61
4 Lisa 98 99 93
5 Patricia 93 67 78
6 Elizabeth 99 76 78,
Student Physics Math Biology
0 Aditya 92 95 73
1 Chris 95 79 71
2 Sam 65 75 95
3 Harry 68 69 66
4 Golu 74 96 76
5 Joel 99 77 91
6 Tom 72 61 65
7 Harsh 98 93 95
8 Clara 93 78 79
9 Tina 99 78 94]
In this example, we created a list to store the output list of dataframes. Inside the for loop, we have read the sheets into dataframes using the parse()
method and appended it to the list using the append()
method. After execution of the for loop, we get the list of dataframes as shown above.
Pandas Read Multiple Excel Sheets into a Dictionary
Instead of a list, we can read an Excel sheet into a Python dictionary. Here, the keys of the Python dictionary are the sheet names and the values are the dataframes corresponding to each sheet. We will read the Excel file using the read_excel()
method. However, we will set the sheet_name
parameter to None
in the read_excel()
method. After execution of the read_excel()
method, we will get the dictionary containing the dataframe as shown below.
import pandas as pd
myDict= pd.read_excel('sample_excel_file.xlsx',sheet_name=None)
print("The dictionary of dataframes is:")
print(myDict)
Output:
The dictionary of dataframes is:
{'class_1': Student Physics Chemistry Biology
0 Wade 92 76 73
1 Dave 95 96 71
2 Ivan 65 62 95
3 Riley 68 92 66
4 Gilbert 74 95 76
5 Jorge 99 79 91
6 Dan 72 94 65,
'class_2': Student Physics Chemistry Math
0 Harry 68 92 69
1 Dan 74 95 96
2 Joshua 99 79 77
3 Jennifer 72 94 61
4 Lisa 98 99 93
5 Patricia 93 67 78
6 Elizabeth 99 76 78,
'class_3': Student Physics Math Biology
0 Aditya 92 95 73
1 Chris 95 79 71
2 Sam 65 75 95
3 Harry 68 69 66
4 Golu 74 96 76
5 Joel 99 77 91
6 Tom 72 61 65
7 Harsh 98 93 95
8 Clara 93 78 79
9 Tina 99 78 94}
In this example, the read_excel()
method returned a dictionary containing dataframes instead of a single dataframe. This is due to the reason that we have set the sheet_name
parameter to None
.
Read Multiple Excel Sheets into a Single Pandas DataFrame
To read multiple Excel sheets into a single dataframe, we will first read the sheets into a list of dataframes using the approaches discussed in the previous sections. Then, we will use the pandas concat operation to concatenate all the dataframes. You can observe this in the following example.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
df_list=[]
for name in sheet:
df=pd.read_excel(xls, name)
df_list.append(df)
output_df=pd.concat(df_list, ignore_index=True)
print("The output dataframe is:")
print(output_df)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The output dataframe is:
Student Physics Chemistry Biology Math
0 Wade 92 76.0 73.0 NaN
1 Dave 95 96.0 71.0 NaN
2 Ivan 65 62.0 95.0 NaN
3 Riley 68 92.0 66.0 NaN
4 Gilbert 74 95.0 76.0 NaN
5 Jorge 99 79.0 91.0 NaN
6 Dan 72 94.0 65.0 NaN
7 Harry 68 92.0 NaN 69.0
8 Dan 74 95.0 NaN 96.0
9 Joshua 99 79.0 NaN 77.0
10 Jennifer 72 94.0 NaN 61.0
11 Lisa 98 99.0 NaN 93.0
12 Patricia 93 67.0 NaN 78.0
13 Elizabeth 99 76.0 NaN 78.0
14 Aditya 92 NaN 73.0 95.0
15 Chris 95 NaN 71.0 79.0
16 Sam 65 NaN 95.0 75.0
17 Harry 68 NaN 66.0 69.0
18 Golu 74 NaN 76.0 96.0
19 Joel 99 NaN 91.0 77.0
20 Tom 72 NaN 65.0 61.0
21 Harsh 98 NaN 95.0 93.0
22 Clara 93 NaN 79.0 78.0
23 Tina 99 NaN 94.0 78.0
In the above example, we have used the read_csv()
method to read the sheets from the Excel file. Instead, you can use the approach using the parse()
method as shown below.
import pandas as pd
xls = pd.ExcelFile('sample_excel_file.xlsx')
sheet=xls.sheet_names
print("The sheet names in the excel file are:")
print(sheet)
number_of_sheets=len(sheet)
df_list=[]
for position in range(number_of_sheets):
df=xls.parse(position)
df_list.append(df)
output_df=pd.concat(df_list, ignore_index=True)
print("The output dataframe is:")
print(output_df)
Output:
The sheet names in the excel file are:
['class_1', 'class_2', 'class_3']
The output dataframe is:
Student Physics Chemistry Biology Math
0 Wade 92 76.0 73.0 NaN
1 Dave 95 96.0 71.0 NaN
2 Ivan 65 62.0 95.0 NaN
3 Riley 68 92.0 66.0 NaN
4 Gilbert 74 95.0 76.0 NaN
5 Jorge 99 79.0 91.0 NaN
6 Dan 72 94.0 65.0 NaN
7 Harry 68 92.0 NaN 69.0
8 Dan 74 95.0 NaN 96.0
9 Joshua 99 79.0 NaN 77.0
10 Jennifer 72 94.0 NaN 61.0
11 Lisa 98 99.0 NaN 93.0
12 Patricia 93 67.0 NaN 78.0
13 Elizabeth 99 76.0 NaN 78.0
14 Aditya 92 NaN 73.0 95.0
15 Chris 95 NaN 71.0 79.0
16 Sam 65 NaN 95.0 75.0
17 Harry 68 NaN 66.0 69.0
18 Golu 74 NaN 76.0 96.0
19 Joel 99 NaN 91.0 77.0
20 Tom 72 NaN 65.0 61.0
21 Harsh 98 NaN 95.0 93.0
22 Clara 93 NaN 79.0 78.0
23 Tina 99 NaN 94.0 78.0
You can also read the Excel sheets into a dictionary and then concatenate them into a single dataframe as shown below.
import pandas as pd
myDict= pd.read_excel('sample_excel_file.xlsx',sheet_name=None)
print("The output dataframe is:")
df=pd.concat(myDict, ignore_index=True)
print(df)
Output:
The output dataframe is:
Student Physics Chemistry Biology Math
0 Wade 92 76.0 73.0 NaN
1 Dave 95 96.0 71.0 NaN
2 Ivan 65 62.0 95.0 NaN
3 Riley 68 92.0 66.0 NaN
4 Gilbert 74 95.0 76.0 NaN
5 Jorge 99 79.0 91.0 NaN
6 Dan 72 94.0 65.0 NaN
7 Harry 68 92.0 NaN 69.0
8 Dan 74 95.0 NaN 96.0
9 Joshua 99 79.0 NaN 77.0
10 Jennifer 72 94.0 NaN 61.0
11 Lisa 98 99.0 NaN 93.0
12 Patricia 93 67.0 NaN 78.0
13 Elizabeth 99 76.0 NaN 78.0
14 Aditya 92 NaN 73.0 95.0
15 Chris 95 NaN 71.0 79.0
16 Sam 65 NaN 95.0 75.0
17 Harry 68 NaN 66.0 69.0
18 Golu 74 NaN 76.0 96.0
19 Joel 99 NaN 91.0 77.0
20 Tom 72 NaN 65.0 61.0
21 Harsh 98 NaN 95.0 93.0
22 Clara 93 NaN 79.0 78.0
23 Tina 99 NaN 94.0 78.0
Conclusion
In this article, we discussed different ways to read an Excel file into a pandas dataframe. To learn more about Python programming, you can read this article on how to convert JSON to INI format in Python. You might also like this article on how to convert XML to JSON in Python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
Recommended Python Training
Course: Python 3 For Beginners
Over 15 hours of video content with guided instruction for beginners. Learn how to create real world applications and master the basics.