Python lists and pandas dataframes are two of the most commonly used data structures in Python. Lists handle sequential data, while dataframes handle tabular data. In this article, we will discuss different ways to convert a pandas dataframe to a list in Python.
Convert Pandas DataFrame to a List of Rows
When we iterate over the rows of a pandas dataframe, each row is represented as a series object that has the column names of the dataframe as its index and the values in the row as the associated values.
To convert a dataframe to a list of rows, we can use the iterrows() method and a for loop. The iterrows() method, when invoked on a dataframe, returns an iterator. The iterator yields each row as a tuple with the row index as the first element and a series containing the row data as the second element. We can iterate through the iterator to access all the rows.
To create a list of rows from the dataframe using the iterator, we will use the following steps.
- First, we will create an empty list to store the rows. Let us name it rowList.
- Next, we will iterate through the rows of the dataframe using the iterrows() method and a for loop.
- While iterating over the rows, we will add them to rowList. For this, we will use the append() method. The append() method, when invoked on the list, takes the current row as its input argument and adds the row to the list.
After execution of the for loop, we will get the output list of rows. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list()
for index,row in df.iterrows():
    rowList.append(row)
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[Roll 1
Maths 100
Physics 80
Chemistry 90
Name: 0, dtype: int64, Roll 2
Maths 80
Physics 100
Chemistry 90
Name: 1, dtype: int64, Roll 3
Maths 90
Physics 80
Chemistry 70
Name: 2, dtype: int64, Roll 4
Maths 100
Physics 100
Chemistry 90
Name: 3, dtype: int64]
In this example, we have created a list of rows from the dataframe. Note that the elements of the list are series objects, not arrays representing the rows.
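As a small sketch of how these series objects can be used, the snippet below rebuilds a shortened version of the dataframe (two rows, an assumption made only to keep the example brief) and reads values from the stored rows by column name. The variable names firstRow and mathsMarks are illustrative.

```python
import pandas as pd

# A shortened version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90},
                   {"Roll": 2, "Maths": 80, "Physics": 100, "Chemistry": 90}])

# iterrows() yields (index, series) tuples; we keep only the series.
rowList = [row for index, row in df.iterrows()]

# Each element is a series, so values can be looked up by column name.
firstRow = rowList[0]
print(type(firstRow).__name__)
print(firstRow["Maths"])

# Collect the Maths value from every stored row as a plain Python int.
mathsMarks = [int(row["Maths"]) for row in rowList]
print(mathsMarks)
```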
Pandas DataFrame to List of Arrays in Python
Instead of creating a list of rows, we can create a list of arrays containing the values in the rows of the dataframe. For this, we will take out the values of the dataframe using the values attribute. The values attribute of the dataframe contains a 2-D array containing the row values of the dataframe.
Once we get the values from the dataframe, we will convert the array to a list of arrays using the list() function. The list() function takes the 2-D array as its input and returns a list of 1-D arrays, one per row, as shown below.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list(df.values)
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[array([ 1, 100, 80, 90]), array([ 2, 80, 100, 90]), array([ 3, 90, 80, 70]), array([ 4, 100, 100, 90])]
In this example, you can observe that we have created a list of arrays from the dataframe.
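The pandas documentation recommends the to_numpy() method over the values attribute for getting the underlying array. A minimal sketch of the same conversion with to_numpy(), using a shortened two-row dataframe as an assumption for brevity:

```python
import pandas as pd

# A shortened version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90},
                   {"Roll": 2, "Maths": 80, "Physics": 100, "Chemistry": 90}])

# to_numpy() returns a 2-D array; list() splits it into one 1-D array per row.
arrayList = list(df.to_numpy())

print(len(arrayList))
print(arrayList[0])
```

For this all-integer dataframe, the result should match what list(df.values) produces.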
Convert Pandas DataFrame to a List of Lists
Instead of creating a list of arrays, we can also convert pandas dataframe into a list of lists. For this, we can use two approaches.
Pandas DataFrame to a List of Lists Using iterrows() Method
To convert a dataframe into a list of lists, we will use the following approach.
- First, we will create an empty list to store the output list.
- Next, we will iterate through the rows of the dataframe using the iterrows() method and a for loop. While iterating, we will convert each row into a list before adding it to the output list.
- To convert a row into a list, we will use the tolist() method. The tolist() method, when invoked on a row, returns the list of values in the row. We will add this list to the output list using the append() method.
After execution of the for loop, the pandas dataframe is converted to a list of lists. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=list()
for index,row in df.iterrows():
    rowList.append(row.tolist())
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[[1, 100, 80, 90], [2, 80, 100, 90], [3, 90, 80, 70], [4, 100, 100, 90]]
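One caveat worth knowing: because iterrows() packs each row into a single series, it does not preserve column dtypes across a row. The sketch below uses a hypothetical dataframe with an integer column and a float column (not part of the original example) to show the effect, and uses itertuples() as one possible alternative that keeps integers as integers.

```python
import pandas as pd

# Hypothetical data with mixed column types: Roll is integer, Score is float.
df = pd.DataFrame({"Roll": [1, 2], "Score": [91.5, 78.0]})

# With iterrows(), each row becomes one float series, so Roll turns into a float.
rowsFromIterrows = [row.tolist() for index, row in df.iterrows()]
print(rowsFromIterrows)

# itertuples(index=False) yields one tuple per row without building a series.
rowsFromItertuples = [list(row) for row in df.itertuples(index=False)]
print(rowsFromItertuples)
```

When every column has the same dtype, as in the article's example, both approaches give the same list of lists.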
Using tolist() Method And The Values Attribute
Instead of using the iterrows() method and the for loop, we can directly convert the pandas dataframe to a list of lists using the values attribute. For this, we will first obtain the values in the dataframe using the values attribute. Next, we will invoke the tolist() method on the values. This will give us the list of lists created from the dataframe. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rowList=df.values.tolist()
print("The list of rows is:")
print(rowList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of rows is:
[[1, 100, 80, 90], [2, 80, 100, 90], [3, 90, 80, 70], [4, 100, 100, 90]]
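A practical difference from the list of arrays shown earlier is that tolist() converts the values to plain Python objects. A brief sketch, again with a shortened two-row dataframe as an assumption:

```python
import pandas as pd

# A shortened version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90},
                   {"Roll": 2, "Maths": 80, "Physics": 100, "Chemistry": 90}])

listOfArrays = list(df.values)     # each element is a NumPy array
listOfLists = df.values.tolist()   # each element is a Python list of Python ints

print(type(listOfArrays[0]).__name__)
print(type(listOfLists[0]).__name__)
print(type(listOfLists[0][0]).__name__)
```

Plain Python lists and ints are usually easier to pass to code that does not expect NumPy types, for example when serializing to JSON.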
Get a List of Column Names From Dataframe
To get the column names of a dataframe, you can use the columns attribute. The columns attribute of a dataframe contains an Index object, not a Python list, that holds all the column names. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
nameList=df.columns
print("The list of column names is:")
print(nameList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of column names is:
Index(['Roll', 'Maths', 'Physics', 'Chemistry'], dtype='object')
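If a plain Python list is needed rather than an Index object, one option is to call the tolist() method on the columns attribute. A minimal sketch with a one-row dataframe (shortened here only for brevity):

```python
import pandas as pd

# A one-row version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90}])

# tolist() converts the Index object into a regular Python list.
nameList = df.columns.tolist()

print(type(nameList).__name__)
print(nameList)
```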
Alternatively, to get the column names as a Python list, you can pass the entire dataframe to the list() function. When we pass a dataframe to the list() function, it returns a list containing the column names of the dataframe. You can observe this in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
nameList=list(df)
print("The list of column names is:")
print(nameList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of column names is:
['Roll', 'Maths', 'Physics', 'Chemistry']
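This works because iterating over a dataframe yields its column labels rather than its rows. The sketch below, using a one-row dataframe for brevity, shows that a for loop over the dataframe produces the same names. The variable names namesFromLoop and columnName are illustrative.

```python
import pandas as pd

# A one-row version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90}])

# Looping over the dataframe itself visits the column labels in order.
namesFromLoop = []
for columnName in df:
    namesFromLoop.append(columnName)

print(namesFromLoop)
print(namesFromLoop == list(df))
```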
Convert Dataframe Column to a List in Python
To convert a dataframe column to a list, you can select the column and invoke the tolist() method on it, as shown in the following example.
import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90}]
df=pd.DataFrame(myDicts)
print("The original dataframe is:")
print(df)
rollList=df["Roll"].tolist()
print("The list of Roll column is:")
print(rollList)
Output:
The original dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
The list of Roll column is:
[1, 2, 3, 4]
In this example, you can observe that we have used the tolist() method to convert a dataframe column to a list.
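To convert every column to a list in one step, one option is the to_dict() method with the "list" orientation, which maps each column name to a list of its values. A minimal sketch with a shortened two-row dataframe (an assumption for brevity); the name columnLists is illustrative.

```python
import pandas as pd

# A shortened version of the dataframe used in this article.
df = pd.DataFrame([{"Roll": 1, "Maths": 100, "Physics": 80, "Chemistry": 90},
                   {"Roll": 2, "Maths": 80, "Physics": 100, "Chemistry": 90}])

# orient="list" returns a dictionary of the form {column name: list of values}.
columnLists = df.to_dict(orient="list")

print(columnLists["Roll"])
print(columnLists["Maths"])
```

Each entry should match the result of calling tolist() on the corresponding column.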
Conclusion
In this article, we discussed different ways to convert a pandas dataframe to a list in Python. We also discussed how to convert the dataframe to a list of rows as well as a list of lists. To learn more about Python programming, you can read this article on Dataframe Constructor Not Properly Called Error in Pandas. You might also like this article on how to split a string into characters in Python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!