The pandas apply() or applymap() method is used to apply a function to the values in a dataframe or a series. In this article, we will discuss the syntax and use of the pandas apply function in Python.
The apply() Method
The apply() method has the following syntax.
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
- The func parameter takes the function to be applied. When the apply() method is invoked on a series, a function that takes a single value as input and provides a single value as output, such as the square root function, is executed on each value in the series. When the apply() method is invoked on a dataframe, the function should take a series as its input. If the function is an aggregate function such as the sum function, it is executed with the entire row or column as the input.
- The axis parameter specifies whether rows or columns are passed to the input function. By default, it has the value 0 or 'index', which means that the input function is applied to each column. To apply the function to each row, you can set the axis parameter to 1.
- The raw parameter determines whether a row or column is passed to the input function as a Series or as an ndarray object. By default, it is set to False, which means that the apply() method passes each row or column as a Series. If you want to improve the performance of the code, you can set the raw parameter to True. The input function then receives ndarray objects as its input.
- The result_type parameter is used only when the axis parameter is set to 1. It can take four values.
  - When the result_type parameter is set to 'expand', list-like results are turned into columns.
  - When the result_type parameter is set to 'reduce', the apply() method returns a Series if possible rather than expanding list-like results. This is the opposite of 'expand'.
  - When the result_type parameter is set to 'broadcast', results are broadcast to the original shape of the DataFrame, and the original index and columns are retained.
  - If the result_type parameter is set to None, which is its default value, the return value of the apply() method depends on the return value of the input function. List-like results are returned as a Series of those lists. However, if the input function returns a Series, these are expanded to columns.
After execution, the apply() method returns a new series or dataframe. The original object is not modified.
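The parameters described above can be tried out on a small dataframe. The following is a minimal sketch rather than a complete reference; the dataframe and the lambda functions are made up for illustration.

```python
import pandas as pd

# Small illustrative dataframe; the values are chosen only for this sketch.
df = pd.DataFrame({"Maths": [100, 80, 90], "Physics": [80, 100, 80]})

# axis=0 (default): the function receives each column as a Series.
column_totals = df.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series.
row_totals = df.apply(lambda row: row.sum(), axis=1)

# raw=True: the function receives each column as a NumPy ndarray.
column_range = df.apply(lambda arr: arr.max() - arr.min(), raw=True)

# A function that returns a list for every row.
def min_max(row):
    return [row.min(), row.max()]

# result_type=None (default): the lists are kept as values of a Series.
as_series = df.apply(min_max, axis=1)

# result_type="expand": each list is spread over new columns 0 and 1.
expanded = df.apply(min_max, axis=1, result_type="expand")

# result_type="broadcast": the lists are written back into the original
# shape, so the list length must match the number of columns.
broadcast = df.apply(min_max, axis=1, result_type="broadcast")

print(column_totals)
print(row_totals)
print(column_range)
print(as_series)
print(expanded)
print(broadcast)
```

Here, the 'broadcast' result keeps the column names Maths and Physics, while the 'expand' result uses the integer column labels 0 and 1.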
Pandas Apply a Function to a Series
To apply a function to a pandas series, you can simply pass the function as an input argument to the apply() method as shown below.
import pandas as pd
import numpy as np
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
newSeries=series.apply(np.sqrt)
print("The updated series is:")
print(newSeries)
Output:
The series is:
0 100
1 90
2 80
3 90
4 70
5 100
6 60
dtype: int64
The updated series is:
0 10.000000
1 9.486833
2 8.944272
3 9.486833
4 8.366600
5 10.000000
6 7.745967
dtype: float64
In the above example, the apply() method, when invoked on the series, takes the numpy.sqrt function as its input argument. The function is executed on every element of the series and we get the output series.
Here, we have passed the numpy.sqrt function from the NumPy library to the apply() method. You can also pass a custom function to the apply() method as shown below.
import pandas as pd
import numpy as np
def fun1(x):
    # Return the name of the number if it is in the dictionary, else the number itself.
    nameDict={100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"}
    if x in nameDict:
        return nameDict[x]
    else:
        return x
numbers=[100,90,80,90,70,100,60]
series=pd.Series(numbers)
print("The series is:")
print(series)
newSeries=series.apply(fun1)
print("The updated series is:")
print(newSeries)
Output:
The series is:
0 100
1 90
2 80
3 90
4 70
5 100
6 60
dtype: int64
The updated series is:
0 Hundred
1 Ninety
2 Eighty
3 Ninety
4 Seventy
5 Hundred
6 Sixty
dtype: object
In this example, we have created a function fun1() that takes a number as input and returns its name in words if the number is present in the dictionary. Otherwise, it returns the number unchanged. When we pass fun1() to the apply() method, you can observe that the function is executed on all the elements of the series.
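A custom function can also take extra arguments. The apply() method forwards them through its args parameter and through additional keyword arguments. The sketch below assumes a hypothetical helper named grade() with a made-up pass mark; neither is part of pandas.

```python
import pandas as pd

def grade(score, pass_mark, label="Pass"):
    """Hypothetical helper: label a score relative to a pass mark."""
    return label if score >= pass_mark else "Fail"

series = pd.Series([100, 90, 80, 90, 70, 100, 60])

# args supplies extra positional arguments (pass_mark=80);
# any extra keyword argument (label) is forwarded to the function as well.
result = series.apply(grade, args=(80,), label="Cleared")
print(result)
```

Each element of the series is passed as the first argument of grade(), and the remaining arguments stay the same for every call.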
Pandas Apply Function to a Dataframe
Instead of a series, you can also use the apply() method with a pandas dataframe. Which method to use depends on the function you want to apply.
If you want to apply a vectorized function such as numpy.sqrt, which can operate on an entire series at once, you can invoke the apply() method on the dataframe and pass the function as an input argument. After execution, the apply() method returns a new dataframe as shown below.
import pandas as pd
import numpy as np
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.apply(np.sqrt)
print("The updated dataframe is:")
print(newDf)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1.000000 10.000000 8.944272 9.486833
1 1.414214 8.944272 10.000000 9.486833
2 1.732051 9.486833 8.944272 8.366600
3 2.000000 10.000000 10.000000 9.486833
4 2.236068 9.486833 9.486833 8.944272
5 2.449490 8.944272 8.366600 8.366600
If you have defined a custom function that works on individual values, the apply() method doesn't work with the function on a dataframe, because apply() passes each column to the function as a series instead of passing single values. When we pass such a function to the apply() method as input, the program runs into a Python TypeError exception as shown below.
import pandas as pd
import numpy as np
def fun1(x):
    # Works on a single value; fails when x is an entire Series.
    nameDict={100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"}
    if x in nameDict:
        return nameDict[x]
    else:
        return x
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.apply(fun1)
print("The updated dataframe is:")
print(newDf)
Output:
TypeError: unhashable type: 'Series'
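To avoid the above error saying TypeError: unhashable type: 'Series', you can use the applymap() method instead of the apply() method to use a custom function on the dataframe values. In pandas 2.1 and later, applymap() is deprecated in favor of the DataFrame.map() method, which is used in the same way.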
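To see why the error occurs, it helps to check what the apply() method actually hands to the function. The sketch below uses a made-up two-column dataframe and two hypothetical helpers, record_type() and to_name(). It also shows one possible workaround that stays with apply(): calling apply() on each column inside the outer call.

```python
import pandas as pd

df = pd.DataFrame({"Maths": [100, 80], "Physics": [80, 100]})

received = []

def record_type(x):
    """Hypothetical helper: remember the type of every argument received."""
    received.append(type(x).__name__)
    return x

df.apply(record_type)
print(received)  # every entry names the Series type, not a single number

name_dict = {100: "Hundred", 80: "Eighty"}

def to_name(x):
    """Hypothetical helper that expects a single value."""
    return name_dict.get(x, x)

# The outer apply() passes each column; the inner Series.apply() passes each value.
element_wise = df.apply(lambda col: col.apply(to_name))
print(element_wise)
```

Since a whole Series cannot be used as a dictionary key, a function that performs a dictionary lookup on its argument fails when it receives a column.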
Apply Custom Function to Pandas Dataframe Values
You can pass a user-defined function to the applymap() method to apply a custom function on the pandas dataframe as shown below.
import pandas as pd
import numpy as np
def fun1(x):
    # Return the name of the number if it is in the dictionary, else the number itself.
    nameDict={100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"}
    if x in nameDict:
        return nameDict[x]
    else:
        return x
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.applymap(fun1)
print("The updated dataframe is:")
print(newDf)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1 Hundred Eighty Ninety
1 2 Eighty Hundred Ninety
2 3 Ninety Eighty Seventy
3 4 Hundred Hundred Ninety
4 5 Ninety Ninety Eighty
5 6 Eighty Seventy Seventy
In this example, we have used the applymap() method instead of the apply() method to apply a custom function to a dataframe. Hence, the program doesn't run into any errors.
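Newer pandas releases (2.1 and later) provide DataFrame.map() as the replacement for applymap() and emit a deprecation warning for the older name. If the same code has to run on both older and newer versions, one possible approach is a small wrapper like the one below. The function name apply_elementwise() is hypothetical and is not part of pandas.

```python
import pandas as pd

def apply_elementwise(frame, func):
    """Hypothetical helper: use DataFrame.map() when the installed pandas
    provides it, and fall back to applymap() on older versions."""
    # Check the class, not the instance, so a column named "map" cannot interfere.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(func)
    return frame.applymap(func)

name_dict = {100: "Hundred", 90: "Ninety", 80: "Eighty"}

def to_name(x):
    # Return the name of the number if known, else the number itself.
    return name_dict.get(x, x)

df = pd.DataFrame({"Maths": [100, 80, 90], "Physics": [80, 100, 80]})
new_df = apply_elementwise(df, to_name)
print(new_df)
```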
Pandas Apply a Function to a Column in a Dataframe
Instead of the entire pandas dataframe, you can also apply any function on a column in the dataframe. For this, you just need to invoke the apply() method on the given column and pass the input function to the apply() method as shown below.
import pandas as pd
import numpy as np
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].apply(np.sqrt)
print("The updated dataframe is:")
print(df)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1 10.000000 80 90
1 2 8.944272 100 90
2 3 9.486833 80 70
3 4 10.000000 100 90
4 5 9.486833 90 80
5 6 8.944272 70 70
A column in a pandas dataframe is essentially a series object. Hence, the apply() method works on a column of a pandas dataframe in the same manner it works on a series.
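This can be checked directly. The short sketch below, using a made-up dataframe, confirms that a selected column is a Series. It also shows that a vectorized function such as numpy.sqrt can be called on the column without apply(), which gives the same values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Roll": [1, 2, 3], "Maths": [100, 80, 90]})

column = df["Maths"]
print(type(column).__name__)  # the column is a pandas Series

# Both calls compute the square root of every value in the column.
via_apply = column.apply(np.sqrt)
direct = np.sqrt(column)
print(np.allclose(via_apply, direct))
```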
Apply Custom Function to One Column in a Dataframe
You can also apply a user-defined function to a column in a dataframe using the apply() method as shown in the following example.
import pandas as pd
import numpy as np
def fun1(x):
    # Return the name of the number if it is in the dictionary, else the number itself.
    nameDict={100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"}
    if x in nameDict:
        return nameDict[x]
    else:
        return x
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].apply(fun1)
print("The updated dataframe is:")
print(df)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1 Hundred 80 90
1 2 Eighty 100 90
2 3 Ninety 80 70
3 4 Hundred 100 90
4 5 Ninety 90 80
5 6 Eighty 70 70
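When the custom function is only a dictionary lookup, the Series.map() method offers an alternative to apply(), since it accepts the dictionary itself. The following is a minimal sketch with a made-up dataframe. Note one difference from fun1(): values that are missing from the dictionary become NaN instead of being left unchanged.

```python
import pandas as pd

name_dict = {100: "Hundred", 90: "Ninety", 80: "Eighty", 70: "Seventy", 60: "Sixty"}

df = pd.DataFrame({"Roll": [1, 2, 3], "Maths": [100, 80, 90]})

# Series.map() looks up each value of the column in the dictionary.
mapped = df["Maths"].map(name_dict)

# The same result using apply() with a function.
applied = df["Maths"].apply(lambda x: name_dict.get(x, x))

print(mapped.tolist())
print(applied.tolist())
```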
Pandas Apply Function to Multiple Columns in a Dataframe
Instead of a single column, you can also apply a function to multiple columns in a dataframe. For this, you need to select the required columns of the dataframe and then apply the function on the selected columns using the apply() method as shown in the following example.
import pandas as pd
import numpy as np
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df[["Maths","Physics", "Chemistry"]]=df[["Maths","Physics", "Chemistry"]].apply(np.sqrt)
print("The updated dataframe is:")
print(df)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1 10.000000 8.944272 9.486833
1 2 8.944272 10.000000 9.486833
2 3 9.486833 8.944272 8.366600
3 4 10.000000 10.000000 9.486833
4 5 9.486833 9.486833 8.944272
5 6 8.944272 8.366600 8.366600
The above approach works only if the function given to the apply() method can operate on an entire column at once, as a vectorized function such as numpy.sqrt does.
Apply Custom Function to Multiple Columns in a Dataframe
If you want to apply a custom function on pandas dataframe values, you can use the applymap() method instead of the apply() method as shown below.
import pandas as pd
import numpy as np
def fun1(x):
    # Return the name of the number if it is in the dictionary, else the number itself.
    nameDict={100:"Hundred", 90:"Ninety", 80:"Eighty", 70:"Seventy", 60:"Sixty"}
    if x in nameDict:
        return nameDict[x]
    else:
        return x
myDicts=[{"Roll":1,"Maths":100, "Physics":80, "Chemistry": 90},
{"Roll":2,"Maths":80, "Physics":100, "Chemistry": 90},
{"Roll":3,"Maths":90, "Physics":80, "Chemistry": 70},
{"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
{"Roll":5,"Maths":90, "Physics":90, "Chemistry": 80},
{"Roll":6,"Maths":80, "Physics":70, "Chemistry": 70}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df[["Maths","Physics", "Chemistry"]]=df[["Maths","Physics", "Chemistry"]].applymap(fun1)
print("The updated dataframe is:")
print(df)
Output:
The input dataframe is:
Roll Maths Physics Chemistry
0 1 100 80 90
1 2 80 100 90
2 3 90 80 70
3 4 100 100 90
4 5 90 90 80
5 6 80 70 70
The updated dataframe is:
Roll Maths Physics Chemistry
0 1 Hundred Eighty Ninety
1 2 Eighty Hundred Ninety
2 3 Ninety Eighty Seventy
3 4 Hundred Hundred Ninety
4 5 Ninety Ninety Eighty
5 6 Eighty Seventy Seventy
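The applymap() method handles one value at a time. If a custom function needs several columns of the same row together, the apply() method with the axis parameter set to 1 can be used on the selected columns. The sketch below assumes a hypothetical helper best_subject() and a made-up dataframe.

```python
import pandas as pd

df = pd.DataFrame({
    "Roll": [1, 2, 3],
    "Maths": [100, 80, 90],
    "Physics": [80, 100, 80],
    "Chemistry": [90, 90, 70],
})

subjects = ["Maths", "Physics", "Chemistry"]

def best_subject(row):
    """Hypothetical helper: name of the column holding the highest mark in a row."""
    return row.idxmax()

# With axis=1, each row of the selected columns is passed as a Series.
df["Best"] = df[subjects].apply(best_subject, axis=1)
print(df)
```

Here, the function receives the three marks of a student as one Series, so it can compare them with each other.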
Conclusion
In this article, we have discussed different ways to apply a function to a dataframe using the apply() method. We also discussed how to apply a custom function to a pandas dataframe using the applymap() method.
To learn more about python programming, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!