Pandas dataframes are used to handle tabular data in Python. In this article, we will discuss how to select a row from a dataframe in Python. We will also discuss how we can use boolean operators to select data from a pandas dataframe.
Select Row From a Dataframe Using iloc Attribute
The iloc attribute contains an _iLocIndexer object that works as an ordered collection of the rows in a dataframe. It behaves much like list indexing, with positions starting at 0. To select a row, pass the position of the row inside square brackets after the iloc attribute, as shown below.
import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
position=1
row=myDf.iloc[position]
print("The row at position {} is :{}".format(position,row))
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The row at position 1 is :Class 1
Roll 12
Name Chris
Name: 1, dtype: object
Here, you can observe that the iloc attribute gives the row at the specified position as output.
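Because iloc follows list-style indexing, it should also accept negative positions, slices, and lists of positions. The following is a minimal sketch of that idea. It assumes an in-memory dataframe with the same rows as samplefile.csv, built here only so that the example runs without the file.

```python
import pandas as pd

# In-memory stand-in for samplefile.csv (an assumption made for this sketch).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
})

last_row = myDf.iloc[-1]        # negative positions count from the end
first_three = myDf.iloc[0:3]    # slice: positions 0, 1 and 2 (end is excluded)
picked = myDf.iloc[[0, 3, 6]]   # a list of positions returns a dataframe

print(last_row["Name"])
print(first_three["Name"].tolist())
print(picked["Name"].tolist())
```

A single position returns the row as a series, while a slice or a list of positions returns a dataframe.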
Select Row From a Dataframe Using loc Attribute in Python
The loc attribute of a dataframe works in a similar manner to the keys of a Python dictionary. It contains a _LocIndexer object that you can use to select rows from a pandas dataframe by their index labels. To select a row, pass its index label inside square brackets after the loc attribute, as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
index=2
row=myDf.loc[index]
print("The row at index {} is :{}".format(index,row))
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The row at index 2 is :Class 1
Roll 13
Name Sam
Name: 2, dtype: object
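With the default index, labels and positions happen to coincide, so loc and iloc look interchangeable. The sketch below uses a small dataframe with hypothetical index labels 10, 20, and 30 (not part of the sample file) to show where the two differ.

```python
import pandas as pd

# Hypothetical labels chosen so that labels and positions do not coincide.
myDf = pd.DataFrame(
    {"Roll": [11, 12, 13], "Name": ["Aditya", "Chris", "Sam"]},
    index=[10, 20, 30],
)

by_label = myDf.loc[20]        # row whose index label is 20
by_position = myDf.iloc[1]     # row at position 1 (the same row here)
label_slice = myDf.loc[10:20]  # label slices include both endpoints

# Label 1 does not exist in this index, so loc raises a KeyError.
try:
    myDf.loc[1]
    label_one_found = True
except KeyError:
    label_one_found = False

print(by_label["Name"], by_position["Name"])
print(label_slice["Name"].tolist())
print(label_one_found)
```

Note that a slice with loc includes the end label, unlike a slice with iloc, which excludes the end position.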
If you have defined a custom index for a dataframe, you can use the index label of a row to select it. If several rows share the same label, the loc attribute returns all of them as a dataframe, as shown below.
myDf=pd.read_csv("samplefile.csv",index_col=0)
print("The dataframe is:")
print(myDf)
index=1
row=myDf.loc[index]
print("The row at index {} is :{}".format(index,row))
Output:
The dataframe is:
       Roll      Name
Class
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The row at index 1 is :       Roll    Name
Class
1        11  Aditya
1        12   Chris
1        13     Sam
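When labels repeat, as in the output above, the result of loc is a dataframe rather than a single row. If the calling code should always receive a dataframe, one option is to pass the label inside a list. The sketch below assumes an in-memory copy of the sample data with Class set as the index.

```python
import pandas as pd

# In-memory stand-in for samplefile.csv, indexed by Class (an assumption).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
}).set_index("Class")

class_one = myDf.loc[1]      # label 1 is repeated, so three rows come back
class_three = myDf.loc[[3]]  # a list of labels always returns a dataframe

print(class_one["Name"].tolist())
print(class_three["Name"].tolist())
```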
If you have a multilevel index, you can pass a tuple containing one label per index level to select a row from a dataframe as shown below.
myDf=pd.read_csv("samplefile.csv",index_col=[0,1])
print("The dataframe is:")
print(myDf)
index=(1,12)
row=myDf.loc[index]
print("The row at index {} is :{}".format(index,row))
Output:
The dataframe is:
                Name
Class Roll
1     11      Aditya
      12       Chris
      13         Sam
2     1         Joel
      22         Tom
      44    Samantha
3     33        Tina
      34         Amy
The row at index (1, 12) is :Name Chris
Name: (1, 12), dtype: object
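With a multilevel index, you can also pass only the first level's label to get every row under it. The following sketch assumes an in-memory copy of the sample data with Class and Roll as the two index levels.

```python
import pandas as pd

# In-memory stand-in for samplefile.csv with a two-level index (an assumption).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
}).set_index(["Class", "Roll"])

one_row = myDf.loc[(1, 12)]  # full tuple: a single row, returned as a series
class_two = myDf.loc[2]      # first level only: all rows with Class 2

print(one_row["Name"])
print(class_two.index.tolist())
print(class_two["Name"].tolist())
```

In the partial selection, the Class level is dropped from the result, so the returned dataframe is indexed by Roll alone.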
Select Column Using Column Name in a Pandas Dataframe
To select a column from a dataframe, you can use the column name with square brackets as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
column_name="Class"
column=myDf[column_name]
print("The {} column is :{}".format(column_name,column))
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The Class column is :0 1
1 1
2 1
3 2
4 2
5 2
6 3
7 3
Name: Class, dtype: int64
If you want to select more than one column from a dataframe, you can pass a list of column names to the square brackets as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
column_names=["Class","Name"]
column=myDf[column_names]
print("The {} column is :{}".format(column_names,column))
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The ['Class', 'Name'] column is : Class Name
0 1 Aditya
1 1 Chris
2 1 Sam
3 2 Joel
4 2 Tom
5 2 Samantha
6 3 Tina
7 3 Amy
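The two outputs above differ in type: a single column name gives a series, while a list of names gives a dataframe, even when the list has only one name. A minimal sketch, assuming an in-memory copy of the sample data:

```python
import pandas as pd

# In-memory stand-in for samplefile.csv (an assumption made for this sketch).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
})

as_series = myDf["Class"]       # single name: a one-dimensional series
as_dataframe = myDf[["Class"]]  # list with one name: a one-column dataframe

print(type(as_series).__name__, as_series.shape)
print(type(as_dataframe).__name__, as_dataframe.shape)
```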
Boolean Masking in a Pandas Dataframe
Boolean masking is used to filter a dataframe by a condition. When we apply a comparison operator to a dataframe column, it returns a pandas series containing True and False values, one for each row, as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
boolean_mask=myDf["Class"]>1
print("The boolean mask is:")
print(boolean_mask)
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The boolean mask is:
0 False
1 False
2 False
3 True
4 True
5 True
6 True
7 True
Name: Class, dtype: bool
You can select rows from a dataframe using the boolean mask. For this, you need to pass the series containing the boolean mask to the square brackets operator as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
boolean_mask=myDf["Class"]>1
print("The boolean mask is:")
print(boolean_mask)
print("The rows in which class>1 is:")
rows=myDf[boolean_mask]
print(rows)
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The boolean mask is:
0 False
1 False
2 False
3 True
4 True
5 True
6 True
7 True
Name: Class, dtype: bool
The rows in which class>1 is:
Class Roll Name
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
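A boolean mask can also be combined with the loc attribute to filter rows and pick columns in one step. The sketch below assumes an in-memory copy of the sample data.

```python
import pandas as pd

# In-memory stand-in for samplefile.csv (an assumption made for this sketch).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
})

boolean_mask = myDf["Class"] > 1

# The first argument selects rows by the mask; the second selects columns.
rows = myDf.loc[boolean_mask, ["Roll", "Name"]]
print(rows)
```

The selected rows keep their original index labels (3 to 7 here), which is why the labels in the result do not start at 0.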
Instead of using square brackets, you can also use the where() method with a boolean mask. The where() method, when invoked on a dataframe, takes the boolean mask as its input argument and returns a new dataframe of the same shape in which the rows where the mask is False are filled with NaN values, as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
boolean_mask=myDf["Class"]>1
print("The boolean mask is:")
print(boolean_mask)
print("The rows in which class>1 is:")
rows=myDf.where(boolean_mask)
print(rows)
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The boolean mask is:
0 False
1 False
2 False
3 True
4 True
5 True
6 True
7 True
Name: Class, dtype: bool
The rows in which class>1 is:
Class Roll Name
0 NaN NaN NaN
1 NaN NaN NaN
2 NaN NaN NaN
3 2.0 1.0 Joel
4 2.0 22.0 Tom
5 2.0 44.0 Samantha
6 3.0 33.0 Tina
7 3.0 34.0 Amy
In the above example using the where() method, the rows where the boolean mask is False are filled with NaN values, and the integer columns are converted to floats so that they can hold NaN. You can drop the rows containing NaN values using the dropna() method as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
boolean_mask=myDf["Class"]>1
print("The boolean mask is:")
print(boolean_mask)
print("The rows in which class>1 is:")
rows=myDf.where(boolean_mask).dropna()
print(rows)
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The boolean mask is:
0 False
1 False
2 False
3 True
4 True
5 True
6 True
7 True
Name: Class, dtype: bool
The rows in which class>1 is:
Class Roll Name
3 2.0 1.0 Joel
4 2.0 22.0 Tom
5 2.0 44.0 Samantha
6 3.0 33.0 Tina
7 3.0 34.0 Amy
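The where() and dropna() combination is not fully equivalent to indexing with the mask directly. The following sketch compares the two on a shortened copy of the sample data in which one name has been replaced with a missing value. That missing value is hypothetical and is added only to illustrate the difference.

```python
import pandas as pd

myDf = pd.DataFrame({
    "Class": [1, 1, 2, 2, 3],
    "Roll": [11, 12, 1, 22, 33],
    # Hypothetical gap: Tom's name is replaced by a missing value.
    "Name": ["Aditya", "Chris", "Joel", None, "Tina"],
})

boolean_mask = myDf["Class"] > 1

direct = myDf[boolean_mask]                    # keeps every row where the mask is True
via_where = myDf.where(boolean_mask).dropna()  # also drops rows that already had a gap

print(len(direct), len(via_where))
print(direct["Roll"].dtype, via_where["Roll"].dtype)
```

Here, dropna() removes the row with the missing name even though it satisfies the condition, and the Roll column becomes a float column. Indexing with the mask directly avoids both effects.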
You can also use the logical operators & (and), | (or), and ~ (not) to create boolean masks from two or more conditions. Each condition must be enclosed in parentheses, as shown below.
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
print(myDf)
boolean_mask=(myDf["Class"]>1) & (myDf["Class"]<3)
print("The boolean mask is:")
print(boolean_mask)
print("The rows in which class>1 and <3 is:")
rows=myDf.where(boolean_mask).dropna()
print(rows)
Output:
The dataframe is:
Class Roll Name
0 1 11 Aditya
1 1 12 Chris
2 1 13 Sam
3 2 1 Joel
4 2 22 Tom
5 2 44 Samantha
6 3 33 Tina
7 3 34 Amy
The boolean mask is:
0 False
1 False
2 False
3 True
4 True
5 True
6 False
7 False
Name: Class, dtype: bool
The rows in which class>1 and <3 is:
Class Roll Name
3 2.0 1.0 Joel
4 2.0 22.0 Tom
5 2.0 44.0 Samantha
After creating the combined boolean mask, you can use it like any other mask to select the rows where it contains True, as the output above shows.
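Conditions can be combined in other ways as well. The sketch below, which assumes an in-memory copy of the sample data, uses the | and ~ operators and the isin() method of a series. The parentheses around each comparison are needed because & and | bind more tightly than comparison operators in Python.

```python
import pandas as pd

# In-memory stand-in for samplefile.csv (an assumption made for this sketch).
myDf = pd.DataFrame({
    "Class": [1, 1, 1, 2, 2, 2, 3, 3],
    "Roll": [11, 12, 13, 1, 22, 44, 33, 34],
    "Name": ["Aditya", "Chris", "Sam", "Joel", "Tom", "Samantha", "Tina", "Amy"],
})

# OR: rows in class 1, or with a roll number above 40.
either = myDf[(myDf["Class"] == 1) | (myDf["Roll"] > 40)]

# NOT: every row except those in class 2.
not_class_two = myDf[~(myDf["Class"] == 2)]

# isin(): rows whose class is one of the listed values.
in_set = myDf[myDf["Class"].isin([1, 3])]

print(either["Name"].tolist())
print(not_class_two["Name"].tolist())
print(in_set["Name"].tolist())
```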
Conclusion
In this article, we discussed how to select a row from a dataframe. We also discussed how to select a column from a dataframe and how to select multiple rows from a dataframe using boolean masking.