We use pandas dataframes for many data processing tasks in Python. Sometimes, we need to drop rows from a dataframe for various reasons. In this article, we will discuss different ways to drop rows from a pandas dataframe using the drop() method.
- The drop() Method
- Drop Rows From Pandas Dataframe by Index Labels
- Drop Rows From Pandas Dataframe by Position
- Drop the First Row From Pandas Dataframe
- Drop the Last Row From a Pandas Dataframe
- Drop Rows Inplace in a Dataframe
- Drop Row if Index Exists in a Pandas Dataframe
- Drop Multiple Rows by Index Labels in a Pandas Dataframe
- Drop Multiple Rows by Position From a Pandas Dataframe
- Drop the First N Rows in a Pandas Dataframe
- Drop the Last N Rows of a Dataframe
- Conclusion
The drop() Method
The drop() method can be used to drop columns or rows from a pandas dataframe. It has the following syntax.
DataFrame.drop(labels=None, *, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Here,
- The index parameter is used when we have to drop rows from the dataframe. It takes an index label or a list of index labels that have to be deleted as its input argument.
- The columns parameter is used when we need to drop columns from the dataframe. It takes a column name or a list of column names that need to be dropped as its input argument.
- The labels parameter represents the index or column labels that we need to remove from the dataframe. To drop rows from a dataframe, we use the index label. To drop two or more rows, we can also pass a list of index labels to the labels parameter.
- The axis parameter decides whether the values passed to the labels parameter refer to rows or columns. When we don’t use the index parameter, we can pass the index label of the row that needs to be deleted to the labels parameter. To drop a row, we set the axis parameter to 0, which is its default value. To drop a column, we set the axis parameter to 1.
- The level parameter is used to drop rows from a dataframe that has a multilevel index. It takes the position or the name of the index level in which the given labels are looked up.
- The inplace parameter is used to decide if we get a new dataframe after the drop operation or if we want to modify the original dataframe. When inplace is set to False, which is its default value, the original dataframe isn’t changed and the drop() method returns the modified dataframe after execution. To modify the original dataframe, you can set inplace to True.
- The errors parameter is used to decide if the drop() method raises an exception when a label is not found. By default, the errors parameter is set to "raise". Due to this, the drop() method raises a KeyError exception if any of the given labels doesn’t exist in the dataframe. If you don’t want the error to be raised, you can set the errors parameter to "ignore". After this, the drop() method suppresses the error and drops only the labels that exist.
After execution, the drop() method returns a new dataframe if the inplace parameter is set to False. Otherwise, it modifies the original dataframe and returns None.
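The relationship between these parameters can be checked with a small sketch. It assumes a dataframe built by hand from the data shown later in this article (a stand-in for grade2.csv) and a second, purely hypothetical dataframe named multi with a two-level index to illustrate the level parameter.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

# labels with axis=0 and index describe the same row drop.
by_labels = df.drop(labels=55, axis=0)
by_index = df.drop(index=55)
same_rows = by_labels.equals(by_index)

# labels with axis=1 and columns describe the same column drop.
by_labels_col = df.drop(labels="Grade", axis=1)
by_columns = df.drop(columns="Grade")
same_cols = by_labels_col.equals(by_columns)

print(same_rows, same_cols)

# Hypothetical two-level index, used only to illustrate the level parameter.
multi = pd.DataFrame(
    {"Score": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_tuples(
        [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "item"]
    ),
)

# Drop every row whose label in the "item" level is 1.
without_item_1 = multi.drop(index=1, level="item")
print(without_item_1)
```

In this sketch, the original df keeps all eight rows because inplace is left at its default value of False.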
Drop Rows From Pandas Dataframe by Index Labels
To drop rows of a dataframe by index labels, we will pass the index label to the labels parameter in the drop() method. After execution, the drop() method will return a dataframe with all the rows except the rows with the index label specified in the labels parameter. You can observe this in the following example.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
print("After dropping rows with index 55")
df=df.drop(labels=55)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping rows with index 55
The modified dataframe is:
Class Roll Name Grade
Marks
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
78 3 23 Radheshyam B
50 3 11 Bobby D
In the above example, we have created a dataframe using a csv file. Then, we have dropped the rows in the dataframe with index 55. In the output dataframe, you can observe that all the rows with index 55 are absent. Thus, the drop() method has deleted the rows with the specified index.
Instead of the labels parameter, you can use the index parameter in the drop() method to drop a row from a dataframe as shown in the following example.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
print("After dropping rows with index 55")
df=df.drop(index=55)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping rows with index 55
The modified dataframe is:
Class Roll Name Grade
Marks
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
78 3 23 Radheshyam B
50 3 11 Bobby D
In the above example, we have used the index parameter instead of the labels parameter to pass the index value as input to the drop() method. You can observe that the output is the same in both cases. Hence, you can use either the index or the labels parameter to drop rows from a pandas dataframe.
Drop Rows From Pandas Dataframe by Position
To drop rows from a dataframe by position, we will use the following steps.
- First, we will get the Index object of the dataframe using the index attribute.
- Next, we will get the element of the Index object present at the position of the row we want to drop from the dataframe using the indexing operator. This element will be the label of the row we want to delete.
- After obtaining the label of the row to be deleted, we can pass the label to the labels parameter as an input argument in the drop() method.
After execution of the drop() method, we will get the modified dataframe as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
position=3
print("After dropping row at position 3")
idx=df.index[position-1]
df=df.drop(labels=idx)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping row at position 3
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
In the above example, you can observe that we have dropped the row at the third position in the dataframe. Here, the row at the third position has index 82. Therefore, if there exists any other row with index 82, the row will also get deleted from the input dataframe.
In the above example, you can also pass the index label obtained from the index object to the index parameter in the drop() method. You will get the same result after execution of the program.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
position=3
print("After dropping row at position 3")
idx=df.index[position-1]
df=df.drop(index=idx)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping row at position 3
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
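Because drop() works on labels, the approach above removes every row that shares the label found at the chosen position. If exactly one row should go, a boolean mask over positions is one alternative. This is a sketch of a different technique than drop(), and it assumes a dataframe built by hand from the data shown above instead of reading grade2.csv. Position 2 is chosen because its label, 78, occurs three times.

```python
import numpy as np
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

position = 2  # 1-based position, as in the article's examples

# Label-based drop: removes all three rows labelled 78.
by_label = df.drop(index=df.index[position - 1])

# Position-based mask: True for every row except the one at the given position.
keep = np.arange(len(df)) != position - 1
by_position = df[keep]

print(len(by_label), len(by_position))
```

Here by_label keeps five rows, while by_position keeps seven because only Clara's row is removed.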
Drop the First Row From Pandas Dataframe
To drop the first row from a dataframe, we will first obtain the index label of the first row using the index attribute.
Then, we will pass the index label to the index parameter in the drop() method to drop the first row of the dataframe as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
position=1
print("After dropping first row")
idx=df.index[position-1]
df=df.drop(index=idx)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping first row
The modified dataframe is:
Class Roll Name Grade
Marks
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
78 3 23 Radheshyam B
50 3 11 Bobby D
In this example, we first used the dataframe index and the indexing operator to obtain the index label of the row at the first position, i.e. index 55. Then, we passed the index label to the index parameter in the drop() method.
In the output, you can observe that more than one row has been dropped from the dataframe. This is because the drop() method drops rows by index label. Hence, all the rows that have the same index as the first row are dropped from the input dataframe.
Drop the Last Row From a Pandas Dataframe
To drop the last row from the dataframe, we will first obtain the total number of rows in the dataframe using the len() function. The len() function takes the dataframe as its input argument and returns the total number of rows in the dataframe.
After obtaining the total number of rows, we will obtain the index label of the last row using the index attribute. After this, we will pass the index label to the labels parameter in the drop() method to drop the last row of the dataframe as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
total_rows=len(df)
position=total_rows-1
print("After dropping last row")
idx=df.index[position]
df=df.drop(labels=idx)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping last row
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
In this example, we have dropped the last row from the input dataframe. Again, if the input dataframe contains rows that have the same index as the last row, all such rows will also be deleted.
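As a possible shortcut, the label of the last row can also be read with negative indexing on the Index object, which avoids the len() call. The sketch below assumes a dataframe built by hand from the data shown above. It also shows slicing with iloc, a different technique than drop(), which removes only the last row even when its label is duplicated.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

# Negative indexing returns the label of the last row directly.
last_label = df.index[-1]
by_label = df.drop(index=last_label)

# Positional slice: keeps every row except the last one, regardless of labels.
by_slice = df.iloc[:-1]

print(last_label, len(by_label), len(by_slice))
```

Both results match here because the label 50 appears only once in the index.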
Drop Rows Inplace in a Dataframe
In the examples given in the previous sections, the drop() method doesn’t modify the original dataframe. Instead, it creates and returns a new dataframe, which we assigned back to the variable df. If you want to modify the existing dataframe instead of creating a new one, you can set the inplace parameter to True in the drop() method as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
total_rows=len(df)
position=total_rows-1
print("After dropping last row")
idx=df.index[position]
df.drop(index=idx,inplace=True)
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping last row
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
In this example, we have set the inplace parameter to True in the drop() method. Hence, the input dataframe is modified instead of creating a new dataframe. In this case, the drop() method returns None.
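The difference in return values can be checked with a short sketch. It assumes a dataframe built by hand from the data shown above instead of reading grade2.csv.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

# Default (inplace=False): a new dataframe is returned and df is untouched.
new_df = df.drop(index=50)
rows_before = len(df)

# inplace=True: df itself is modified and the call returns None.
result = df.drop(index=50, inplace=True)
rows_after = len(df)

print(rows_before, rows_after, result)
```

One consequence is that writing df = df.drop(..., inplace=True) would replace df with None, so the result of an inplace call should not be assigned back.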
Drop Row if Index Exists in a Pandas Dataframe
If the index label passed to the drop() method doesn’t exist in the dataframe, the drop() method runs into a Python KeyError exception as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
print("After dropping row at index 1117")
df.drop(index=1117,inplace=True)
print("The modified dataframe is:")
print(df)
Output:
KeyError: '[1117] not found in axis'
In the above example, we have tried to drop a row with index 1117 from the input dataframe. The index 1117 is not present in the input dataframe. Hence, the drop() method runs into a KeyError exception.
By default, the drop() method raises the KeyError exception if the index label passed to the labels or the index parameter doesn’t exist in the dataframe. To suppress the exception when the index doesn’t exist and drop rows if the index exists, you can set the errors parameter to "ignore" as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
print("After dropping row at index 1117")
df.drop(index=1117,inplace=True,errors="ignore")
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping row at index 1117
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
In this example, we have suppressed the exception by setting the errors parameter to "ignore" in the drop() method. Hence, when the index label passed to the labels or the index parameter doesn’t exist in the input dataframe, the drop() method has no effect on the input dataframe.
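When a list mixes existing and missing labels, errors="ignore" still drops the labels that exist. The sketch below illustrates this, assuming a dataframe built by hand from the data shown above. The variable names labels_to_drop and existing are illustrative only. The explicit membership check at the end is an alternative that keeps the default errors="raise" behaviour for everything else.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

labels_to_drop = [55, 1117]  # 55 exists in the index, 1117 does not

# With the default errors="raise", one missing label is enough for a KeyError.
try:
    df.drop(index=labels_to_drop)
    raised = False
except KeyError:
    raised = True

# With errors="ignore", the rows labelled 55 are dropped and 1117 is skipped.
cleaned = df.drop(index=labels_to_drop, errors="ignore")

# Alternative: filter the labels first, then drop with the default error handling.
existing = [label for label in labels_to_drop if label in df.index]
explicit = df.drop(index=existing)

print(raised, len(cleaned), existing)
```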
Drop Multiple Rows by Index Labels in a Pandas Dataframe
To drop multiple rows by index labels in a pandas dataframe, you can pass the list containing index labels to the drop() method as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
indices=[55,88]
print("After dropping rows at indices 55,88")
df.drop(index=indices,inplace=True,errors="ignore")
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping rows at indices 55,88
The modified dataframe is:
Class Roll Name Grade
Marks
78 2 23 Clara B
82 3 33 Tina A
78 3 15 Prashant B
78 3 23 Radheshyam B
50 3 11 Bobby D
In the above example, we have passed the list [55, 88] to the index parameter in the drop() method. Hence, all the rows with index 55 and 88 are dropped from the input dataframe.
Drop Multiple Rows by Position From a Pandas Dataframe
To drop multiple rows by position from a dataframe, we will first find the index labels of all the rows present at the positions that we want to drop using Python indexing and the index attribute. Then, we will pass the list of index labels to the labels parameter in the drop() method as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
positions=[3,4,5]
indices=[df.index[i-1] for i in positions]
print("After dropping rows at positions 3,4,5")
df.drop(labels=indices,inplace=True,errors="ignore")
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping rows at positions 3,4,5
The modified dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
55 3 27 Aditya C
50 3 11 Bobby D
In the above example, we have deleted the rows at positions 3, 4, and 5. For this, we have used list comprehension and indexing to obtain the index labels at the specified positions. Then, we passed the list of indices to the labels parameter in the drop() method to drop the rows by position in the pandas dataframe. Because the row at position 5 has index 78, the other rows with index 78 are dropped as well.
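The labels for several positions can also be collected in one step by indexing the Index object with a list. The sketch below assumes a dataframe built by hand from the data shown above and compares the label-based drop with a boolean mask over positions. The mask is a different technique than drop() and removes only the three requested rows.

```python
import numpy as np
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

positions = [3, 4, 5]                      # 1-based, as in the article
zero_based = [p - 1 for p in positions]    # positions in the Index object start at 0

# Indexing the Index object with a list returns the labels 82, 88 and 78.
labels = df.index[zero_based]
by_label = df.drop(index=labels)           # also removes the other rows labelled 78

# Mask that is False only at the requested positions.
keep = ~np.isin(np.arange(len(df)), zero_based)
by_position = df[keep]

print(list(by_label["Name"]))
print(list(by_position["Name"]))
```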
Drop the First N Rows in a Pandas Dataframe
To drop the first n rows of the dataframe, we will first find the index labels of the first n rows using the index attribute of the dataframe. Then, we will pass the index labels to the drop() method as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
n=3
indices=[df.index[i] for i in range(n)]
print("After dropping first 3 rows")
df.drop(index=indices,inplace=True,errors="ignore")
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping first 3 rows
The modified dataframe is:
Class Roll Name Grade
Marks
88 3 34 Amy A
50 3 11 Bobby D
In the above example, we intended to drop only the first three rows from the pandas dataframe. However, more rows are dropped when the drop() method is executed. This is because the drop() method deletes rows by index label. Hence, all the rows having the same indices as the first three rows are dropped from the dataframe.
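If only the first n rows should be removed, positional slicing with iloc is one alternative to drop(). The sketch below assumes a dataframe built by hand from the data shown above. It also shows that slicing the Index object gives the same labels as the list comprehension used in the example.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

n = 3

# Slicing the Index object yields the labels of the first n rows: 55, 78, 82.
labels = df.index[:n]
by_label = df.drop(index=labels)   # drops every row labelled 55, 78 or 82

# Positional slice: skips exactly the first n rows and keeps the rest.
by_position = df.iloc[n:]

print(list(by_label["Name"]))
print(list(by_position["Name"]))
```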
Drop the Last N Rows of a Dataframe
To drop the last n rows of a dataframe, we will first find the total number of rows in the dataframe using the len() function. Then, we will find the index labels of the last n rows using the index attribute and the indexing operator. After obtaining the index labels, we will pass them to the index parameter in the drop() method to drop the rows as shown below.
import pandas as pd
df=pd.read_csv("grade2.csv",index_col="Marks")
print("The dataframe is:")
print(df)
total_rows=len(df)
n=3
indices=[df.index[i] for i in range(total_rows-n,total_rows)]
print("After dropping last 3 rows")
df.drop(index=indices,inplace=True,errors="ignore")
print("The modified dataframe is:")
print(df)
Output:
The dataframe is:
Class Roll Name Grade
Marks
55 2 27 Harsh C
78 2 23 Clara B
82 3 33 Tina A
88 3 34 Amy A
78 3 15 Prashant B
55 3 27 Aditya C
78 3 23 Radheshyam B
50 3 11 Bobby D
After dropping last 3 rows
The modified dataframe is:
Class Roll Name Grade
Marks
82 3 33 Tina A
88 3 34 Amy A
In this example, we intended to drop only the last three rows from the pandas dataframe. However, more rows are dropped when the drop() method is executed. This is because the drop() method deletes rows by index label. Hence, all the rows having the same indices as the last three rows are dropped from the dataframe.
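To remove only the last n rows, a positional slice with iloc can be used instead of drop(). The sketch below assumes a dataframe built by hand from the data shown above. The stop position is written as len(df) - n because a slice of the form iloc[:-n] would return an empty dataframe when n is 0.

```python
import pandas as pd

# Stand-in for grade2.csv, rebuilt by hand from the output shown in this article.
df = pd.DataFrame(
    {
        "Class": [2, 2, 3, 3, 3, 3, 3, 3],
        "Roll": [27, 23, 33, 34, 15, 27, 23, 11],
        "Name": ["Harsh", "Clara", "Tina", "Amy", "Prashant", "Aditya", "Radheshyam", "Bobby"],
        "Grade": ["C", "B", "A", "A", "B", "C", "B", "D"],
    },
    index=pd.Index([55, 78, 82, 88, 78, 55, 78, 50], name="Marks"),
)

n = 3
total_rows = len(df)

# Labels of the last n rows: 55, 78, 50.
labels = df.index[total_rows - n:]
by_label = df.drop(index=labels)            # drops every row labelled 55, 78 or 50

# Positional slice: keeps the first total_rows - n rows and nothing else.
by_position = df.iloc[:total_rows - n]

print(list(by_label["Name"]))
print(list(by_position["Name"]))
```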
Conclusion
In this article, we have discussed different ways to drop rows from a pandas dataframe. To know more about the pandas module, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!