Right Join Dataframes in Python - PythonForBeginners.com

The right join is an SQL operation that combines two tables while keeping every row of the right-hand table. In this article, we will discuss how to perform the right join operation on two pandas dataframes in Python.

What is the Right Join Operation?

Consider two tables A and B, where table A contains the details of students in a class and table B contains the marks of the students. Both tables have a common column ‘Name’. When we perform the (A right join B) operation, we get a table that contains all the rows from table B, each combined with its matching row from table A. Rows from table B that have no matching row in table A are still included in the output, with the columns that come from table A left empty. However, rows from table A that have no matching row in table B are omitted from the final result.

Hence, we will get a new table that contains personal details as well as marks of the students whose marks are given in table B. The output table will also contain the marks of students whose details are not given in table A. However, the output will not contain the details of students whose marks are not given in table B.

We can also perform the right join operation on pandas dataframes, as dataframes contain data in a tabular form. For this, we can use either the merge() method or the join() method, as discussed in the following sections.
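To see these rules on a small scale, the following sketch builds two tables in code instead of reading them from files. The names details and marks, and all the values in them, are made up for illustration and are not the data used in the rest of this article.

```python
import pandas as pd

# Hypothetical stand-ins for table A (student details) and table B (marks)
details = pd.DataFrame({"Name": ["Aditya", "Chris", "Joel"], "Age": [14, 15, 14]})
marks = pd.DataFrame({"Name": ["Aditya", "Chris", "Bobby"], "Marks": [91, 85, 40]})

# (A right join B): keep every row of marks, attach matching rows of details
result = details.merge(marks, how="right", on="Name")
print(result)
```

Here, Joel is dropped because he has no row in marks, while Bobby is kept with NaN in the Age column because he has no row in details.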

The programs in this article read two CSV files, name.csv and grade.csv, which contain the student details and the grades respectively.

Right Join DataFrames Using the merge() Method in Python

We can perform the right join operation on dataframes using the merge() method. For this, we invoke the merge() method on the first (left) dataframe and pass the second (right) dataframe as its first input argument. Additionally, we pass the name of the column to be matched to the ‘on’ parameter and the literal ‘right’ to the ‘how’ parameter. After execution, the merge() method returns the output dataframe, as shown in the following example.

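import pandas as pd

# Load the student details and the grades from the CSV files
names = pd.read_csv("name.csv")
grades = pd.read_csv("grade.csv")
# Right join: keep every row of grades and match rows of names on "Name"
resultdf = names.merge(grades, how="right", on="Name")
print("The resultant dataframe is:")
print(resultdf)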

Output:

The resultant dataframe is:
   Class_x  Roll_x        Name  Class_y  Roll_y Grade
0      1.0    11.0      Aditya        1      11     A
1      1.0    12.0       Chris        1      12    A+
2      2.0     1.0        Joel        2       1     B
3      2.0    22.0         Tom        2      22    B+
4      3.0    33.0        Tina        3      33    A-
5      3.0    34.0         Amy        3      34     A
6      NaN     NaN  Radheshyam        3      23    B+
7      NaN     NaN       Bobby        3      11     D

Rows in the first dataframe that have no matching row in the second dataframe are not included in the output. However, every row of the second dataframe is included, even if it has no matching row in the first dataframe. In that case, the columns that come from the first dataframe are filled with NaN, which is also why those columns are displayed as floating-point numbers. You can observe this in the above output, where the rows for Radheshyam and Bobby contain NaN in the Class_x and Roll_x columns.
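If you want to check which rows of the output had no match, one option is the indicator parameter of the merge() method. The following is a minimal sketch that uses small made-up dataframes rather than the CSV files above.

```python
import pandas as pd

# Made-up data for illustration only
names = pd.DataFrame({"Name": ["Aditya", "Joel"], "Roll": [11, 1]})
grades = pd.DataFrame({"Name": ["Aditya", "Bobby"], "Grade": ["A", "D"]})

# indicator=True adds a "_merge" column that records where each row came from
result = names.merge(grades, how="right", on="Name", indicator=True)

# In a right join, "_merge" is "both" for matched rows and
# "right_only" for rows that exist only in the second dataframe
unmatched = result[result["_merge"] == "right_only"]
print(unmatched)
```

The unmatched dataframe contains only the rows of grades for which names has no entry.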

If both dataframes contain columns with the same name that are not used as the join column, pandas adds the _x and _y suffixes to those column names. The _x suffix marks the columns from the dataframe on which the merge() method is invoked, and the _y suffix marks the columns from the dataframe that is passed as the input argument to the merge() method.
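The default suffixes can be replaced with more descriptive ones through the suffixes parameter of the merge() method. The sketch below uses made-up dataframes, and the suffix strings "_names" and "_grades" are arbitrary choices.

```python
import pandas as pd

# Made-up data; "Class" and "Roll" appear in both dataframes
names = pd.DataFrame({"Name": ["Aditya", "Joel"], "Class": [1, 2], "Roll": [11, 1]})
grades = pd.DataFrame(
    {"Name": ["Aditya", "Bobby"], "Class": [1, 3], "Roll": [11, 11], "Grade": ["A", "D"]}
)

# suffixes takes a (left, right) pair that replaces the default ("_x", "_y")
result = names.merge(grades, how="right", on="Name", suffixes=("_names", "_grades"))
print(result.columns.tolist())
```

The join column Name and the non-overlapping column Grade keep their original names. Only the overlapping columns receive a suffix.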

Right Join DataFrames Using the join() Method in Python

Instead of using the merge() method, we can use the join() method to perform the right join operation on the given dataframes. The join() method, when invoked on a dataframe, takes another dataframe as its first input argument. Additionally, we pass the name of the column in the calling dataframe that is to be matched against the index of the other dataframe to the ‘on’ parameter, and the literal “right” to the ‘how’ parameter. After execution, the join() method returns the output dataframe, as shown in the following example.

import pandas as pd

names = pd.read_csv("name.csv")
grades = pd.read_csv("grade.csv")
# join() matches against the index of the other dataframe,
# so make "Name" the index of grades first
grades = grades.set_index("Name")
resultdf = names.join(grades, how="right", on="Name", lsuffix="_names", rsuffix="_grades")
print("The resultant dataframe is:")
print(resultdf)

Output:

The resultant dataframe is:
     Class_names  Roll_names        Name  Class_grades  Roll_grades Grade
0.0          1.0        11.0      Aditya             1           11     A
1.0          1.0        12.0       Chris             1           12    A+
3.0          2.0         1.0        Joel             2            1     B
4.0          2.0        22.0         Tom             2           22    B+
6.0          3.0        33.0        Tina             3           33    A-
7.0          3.0        34.0         Amy             3           34     A
NaN          NaN         NaN  Radheshyam             3           23    B+
NaN          NaN         NaN       Bobby             3           11     D

While using the join() method, keep in mind that the values to be matched must form the index of the dataframe that is passed as the input argument to the join() method. This is why the example calls set_index("Name") on the grades dataframe before joining. If the dataframes share some column names, you need to specify suffixes using the lsuffix and rsuffix parameters, otherwise the join() method raises an error. The values passed to these parameters help us identify which column comes from which dataframe. Also note that the output keeps the row labels of the calling dataframe, so the rows of the second dataframe that have no match get NaN as their row label.
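As an alternative to passing the ‘on’ parameter, both dataframes can be indexed by the matching column before calling join(). The following is a minimal sketch with made-up data; it is one possible arrangement, not the only way to use join().

```python
import pandas as pd

# Made-up data for illustration only
names = pd.DataFrame({"Name": ["Aditya", "Joel"], "Roll": [11, 1]})
grades = pd.DataFrame({"Name": ["Aditya", "Bobby"], "Grade": ["A", "D"]})

# With "Name" as the index on both sides, join() matches index against index
result = names.set_index("Name").join(grades.set_index("Name"), how="right")

# reset_index() turns the "Name" index back into an ordinary column
print(result.reset_index())
```

Because the two dataframes have no overlapping column names once Name is the index, no lsuffix or rsuffix is needed here.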

Conclusion

In this article, we have discussed two approaches to perform the right join operation on dataframes in Python: the merge() method and the join() method.
