The Series data structure is used to handle one-dimensional data in Python. In this article, we will discuss how to create a series using the pandas module, look at its properties, and manipulate it using examples.
- What Is a Series in Pandas?
- Create a Pandas Series in Python
- The Data Type of Elements in a Pandas Series
- None Type Value in a Pandas Series
- Access Data From a Series Using the Indexing Operator
- Access Data From a Series Using iloc in Python
- Access Data From a Series Using loc Attribute in Python
- Insert Data Into a Pandas Series
- Delete Data From a Pandas Series in Python
- Update Data in a Pandas Series
- Conclusion
What Is a Series in Pandas?
You can consider a pandas series as a combination of a list and a dictionary. In a series, all the elements are stored in order and you can access them using indices.
Just like we access values from a python dictionary using key names, you can assign labels to the elements in a pandas series and access them using the labels.
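The list-like and dictionary-like behaviour described above can be sketched with a tiny example. The values and the labels "x", "y", and "z" are made up for illustration only.

```python
import pandas as pd

# Hypothetical data: three scores with made-up labels.
scores = pd.Series([10, 20, 30], index=["x", "y", "z"])

# List-like access: by position, using the iloc attribute.
first = scores.iloc[0]

# Dictionary-like access: by label, using the indexing operator.
middle = scores["y"]

print(first, middle)
```

Both styles of access are discussed in detail in the later sections of this article.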
Create a Pandas Series in Python
To create a series, we use the pandas.Series() function. It takes a list or a python dictionary as its input argument and returns a series. We have discussed the use of the Series() function in the following sections.
Convert Python List to Pandas Series
You can create a pandas series using the elements of a list. The Series() function takes the list as its input argument and returns a Series object as shown below.
import pandas as pd
names = ['Aditya', 'Chris', 'Joel']
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
Output:
The input list is:
['Aditya', 'Chris', 'Joel']
The series is:
0 Aditya
1 Chris
2 Joel
dtype: object
In the output, you can see that the list elements are in the second column. The first column in a series consists of the indices of the series. The indices are used to access the elements from the Series.
By default, the index in the series starts with 0. However, you can explicitly assign the indices to the Series using the index parameter in the Series() function.
The index parameter takes a list of index values and assigns the index to the elements in the series as shown below.
import pandas as pd
names = ['Aditya', 'Chris', 'Joel']
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names, index=["A","B", "C"])
print(mySeries)
Output:
The input list is:
['Aditya', 'Chris', 'Joel']
The series is:
A Aditya
B Chris
C Joel
dtype: object
Here, we have passed the list ["A", "B", "C"] to the index parameter of the Series() function. Due to this, “A”, “B”, and “C” have been assigned as index labels to the rows of the series.
Here, you need to keep in mind that the number of indices should be equal to the number of elements in the series. Otherwise, the program will run into an error as shown below.
import pandas as pd
names = ['Aditya', 'Chris', 'Joel']
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names, index=["A","B", "C", "D"])
print(mySeries)
Output:
The input list is:
['Aditya', 'Chris', 'Joel']
The series is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_6004/861141107.py in <module>
4 print(names)
5 print("The series is:")
----> 6 mySeries=pd.Series(names, index=["A","B", "C", "D"])
7 print(mySeries)
ValueError: Length of values (3) does not match length of index (4)
In the above example, we have passed a list containing four elements to the index parameter. However, there are only three elements in the series. Hence, the program runs into a ValueError exception.
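If the labels come from somewhere you don't control, you may want to handle the mismatch instead of letting the program stop. The following is a minimal sketch of one way to do that with a try-except block; falling back to None is just an assumption made for this illustration.

```python
import pandas as pd

names = ['Aditya', 'Chris', 'Joel']
labels = ["A", "B", "C", "D"]  # one label more than the number of elements

try:
    mySeries = pd.Series(names, index=labels)
except ValueError as err:
    # The lengths differ, so pandas refuses to build the series.
    mySeries = None
    print("Could not create the series:", err)
```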
Instead of passing the list of labels to the Series() function, you can also assign the list of labels to the index attribute of the series. This will create the index labels for the series as shown below.
import pandas as pd
names = ['Aditya', 'Chris', 'Joel']
print("The input list is:")
print(names)
print("Series before index creation:")
mySeries=pd.Series(names)
print(mySeries)
mySeries.index=["A","B", "C"]
print("Series after index creation:")
print(mySeries)
Output:
The input list is:
['Aditya', 'Chris', 'Joel']
Series before index creation:
0 Aditya
1 Chris
2 Joel
dtype: object
Series after index creation:
A Aditya
B Chris
C Joel
dtype: object
In this example, instead of using the index parameter, we have used the index attribute of the Series object to create index for the rows in the series.
Convert Python Dictionary to Series in Python
To make a pandas series with labels, you can also use a python dictionary. When we pass a dictionary to the Series() function, the keys of the dictionary become the index labels. The values corresponding to a key become the data values in the series. You can observe this in the following example.
import pandas as pd
names = {"A":'Aditya', "B":'Chris', "C":'Joel'}
print("The input dictionary is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
Output:
The input dictionary is:
{'A': 'Aditya', 'B': 'Chris', 'C': 'Joel'}
The series is:
A Aditya
B Chris
C Joel
dtype: object
In the above example, you can observe that the keys of the dictionary have become the index labels. The corresponding values of the dictionary are assigned to the rows associated with the indices.
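You can also combine a dictionary with the index parameter. As a hedged sketch of that behaviour: the labels given in index select and order the dictionary keys, and a label with no matching key gets the missing value NaN. The label "Z" below is made up to show the missing case.

```python
import pandas as pd

names = {"A": 'Aditya', "B": 'Chris', "C": 'Joel'}

# "Z" is a made-up label that is not a key of the dictionary.
mySeries = pd.Series(names, index=["C", "A", "Z"])
print(mySeries)
```

Here, "B" is left out because it is not in the index list, and "Z" is present with a missing value.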
Instead of a list or a dictionary, you can also pass a tuple or other ordered iterable objects to the Series() function to create a pandas series. However, you cannot pass an unordered iterable object such as a set as an input to the Series() function to create a series. Doing so will make your program run into an error as shown below.
import pandas as pd
names = {'Aditya', 'Chris', 'Joel'}
print("The input set is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
Output:
The input set is:
{'Joel', 'Aditya', 'Chris'}
The series is:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/tmp/ipykernel_6004/4101083988.py in <module>
4 print(names)
5 print("The series is:")
----> 6 mySeries=pd.Series(names)
7 print(mySeries)
TypeError: 'set' type is unordered
Here, we have passed a set to the Series() function. Due to this, the program runs into a TypeError exception and shows the message that the set type is unordered.
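A tuple works directly, and a set can be used once it is converted to an ordered object. The sketch below uses sorted() for the conversion, which is only one possible choice; list() would also work if you don't care about the resulting order.

```python
import pandas as pd

# A tuple is ordered, so it can be passed directly.
names_tuple = ('Aditya', 'Chris', 'Joel')
fromTuple = pd.Series(names_tuple)

# A set is unordered; sorted() returns a list with a well-defined order.
names_set = {'Aditya', 'Chris', 'Joel'}
fromSet = pd.Series(sorted(names_set))

print(fromTuple)
print(fromSet)
```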
The Data Type of Elements in a Pandas Series
When a series is created from the elements of a list or a dictionary, the datatype of the elements in the series is decided based on the data type of the input elements.
For instance, if you pass a list of integers to the Series() function, the resultant data type of the series will be int64 as shown below.
import pandas as pd
names = [1,2,3,4]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2, 3, 4]
The series is:
0 1
1 2
2 3
3 4
dtype: int64
The datatype of series elements is:
int64
The above condition is true for floating point numbers too. However, when we pass a list of floats and ints to the Series() function, the resultant data type of the Series is float64, as all the elements are converted to the highest level of compatible data type. You can observe this in the following example.
import pandas as pd
names = [1,2,3.1,4.2]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2, 3.1, 4.2]
The series is:
0 1.0
1 2.0
2 3.1
3 4.2
dtype: float64
The datatype of series elements is:
float64
In the above examples, the datatypes are int64 and float64. These are the default integer and floating point types in pandas, regardless of whether the program runs on a 32-bit or a 64-bit machine. NumPy, however, can choose platform-dependent integer types such as int32 when it creates arrays, so don’t worry if you see such a type in a series built from a NumPy array.
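If you need a particular data type instead of the inferred one, the Series() function also accepts a dtype parameter. The following is a minimal sketch; the choice of int32 and float64 here is only for illustration.

```python
import pandas as pd

numbers = [1, 2, 3, 4]

# Ask for a specific data type instead of relying on inference.
small = pd.Series(numbers, dtype="int32")
asFloat = pd.Series(numbers, dtype="float64")

print(small.dtype)
print(asFloat.dtype)
```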
When you pass a list of strings to the Series() function, the resultant data type of the series elements is “object” and not string as shown in the following example.
import pandas as pd
names = ['Aditya', 'Chris', 'Joel']
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
['Aditya', 'Chris', 'Joel']
The series is:
0 Aditya
1 Chris
2 Joel
dtype: object
The datatype of series elements is:
object
When we pass a list containing ints, floats, and strings to the Series() function, the resultant datatype of the series elements is “object”.
import pandas as pd
names = [1, 2.2, 'Joel']
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2.2, 'Joel']
The series is:
0 1
1 2.2
2 Joel
dtype: object
The datatype of series elements is:
object
The elements are assigned the object datatype because an object series can hold any Python value. This lets pandas store elements of different types together in a single series.
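To see that an object series keeps each element as its original Python value, you can inspect the type of every element. This is a small sketch; the variable name elementTypes is made up for the illustration.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, 'Joel'])

# The series as a whole has the object dtype...
print(mySeries.dtype)

# ...but each element is still an int, a float, or a str.
elementTypes = [type(value).__name__ for value in mySeries]
print(elementTypes)
```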
None Type Value in a Pandas Series
While creating series objects, a special case arises when we pass a list containing the value None to the Series() function.
When we pass a list that contains strings along with the value None to the Series() function, the data type of the series is “object”. Here, the value None is stored as the object type.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
The datatype of series elements is:
object
However, when we pass a list of integers containing the value None, None is converted to NaN, which is a floating point representation for a value that doesn’t exist. Thus, the datatype of the series becomes float64. Similarly, when we pass the value None in a list of floating point numbers, None is converted to NaN.
import pandas as pd
names = [1, 2.2, 3.2, None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2.2, 3.2, None]
The series is:
0 1.0
1 2.2
2 3.2
3 NaN
dtype: float64
The datatype of series elements is:
float64
In the previous example, None was stored as a NoneType object because the series contains a string. In this example, None is stored as the floating point value NaN because the series contains only numbers. Hence, you can say that pandas chooses the best datatype for the series according to the compatibility of the existing elements.
When we pass a list containing ints, floats, and strings along with None to the Series() function, None is stored as the object type. You can observe this in the following example.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
print("The datatype of series elements is:")
print(mySeries.dtype)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
The datatype of series elements is:
object
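Whether the missing value ends up as None or as NaN, you can detect it in the same way. The following sketch uses the isna() method of the series for this; the variable names mixed and numeric are made up for the illustration.

```python
import pandas as pd

mixed = pd.Series([1, 2.2, "Aditya", None])   # object dtype, keeps None
numeric = pd.Series([1, 2.2, 3.2, None])      # float64 dtype, None becomes NaN

# isna() marks the missing entry in both cases.
print(mixed.isna().tolist())
print(numeric.isna().tolist())
```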
Access Data From a Series Using the Indexing Operator
You can access data from a series using the indexing operator just as you do it to access list elements. For this, you can pass the position of the series element in the indexing operator as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
index=2
myVal=mySeries[index]
print("Element at index {} is {}".format(index,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
Element at index 2 is Aditya
If you have assigned labels to the indices, you can use the labels in the indexing operator to access the series elements. This is similar to how we access values of a dictionary using keys and the indexing operator.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names, index=["A","B", "C", "D"])
print(mySeries)
index="B"
myVal=mySeries[index]
print("Element at index {} is {}".format(index,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
Element at index B is 2.2
While using the indexing operator, you cannot use the position of the elements to access the elements when integers are used as index labels. For instance, consider the series in the following example. Here, the index labels are integers. Hence, we cannot use index 0 to access the first element in the series or index 1 to access the second element of the series, and so on. Doing so will result in the KeyError exception as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names, index=[4,5,6,7])
print(mySeries)
index=0
myVal=mySeries[index]
print("Element at index {} is {}".format(index,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
4 1
5 2.2
6 Aditya
7 None
dtype: object
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/tmp/ipykernel_6004/208185265.py in <module>
7 print(mySeries)
8 index=0
----> 9 myVal=mySeries[index]
10 print("Element at index {} is {}".format(index,myVal))
KeyError: 0
Hence, you can only use the index labels while accessing the elements using the indexing operator in these cases. However, you can use the iloc attribute of the pandas series objects to access the elements using their positions in the series.
Access Data From a Series Using iloc in Python
The functioning of the iloc attribute is similar to list indexing. The iloc attribute contains an _iLocIndexer object that you can use to access data from the series. You can simply use the position inside the square brackets with the iloc attribute to access the elements of a pandas series as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
position=0
myVal=mySeries.iloc[position]
print("Element at position {} is {}".format(position,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
Element at position 0 is 1
If you are using integers as index labels for the Series, this doesn’t have any effect on how the iloc attribute works. The iloc attribute is used to access the element at a certain position. Hence, it doesn’t matter what index we use, the iloc attribute works the same way.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=[1,2,3,4])
print(mySeries)
position=0
myVal=mySeries.iloc[position]
print("Element at position {} is {}".format(position,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
1 1
2 2.2
3 Aditya
4 None
dtype: object
Element at position 0 is 1
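Because iloc works like list indexing, negative positions and slices are also available. Here is a minimal sketch on a series with integer labels; the labels 4 to 7 are chosen only to make it obvious that positions, not labels, are being used.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=[4, 5, 6, 7])

# A negative position counts from the end, as in a list.
lastVal = mySeries.iloc[-1]

# A slice excludes the stop position, as in a list.
middle = mySeries.iloc[1:3]

print(lastVal)
print(middle)
```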
Access Data From a Series Using loc Attribute in Python
The loc attribute of a Series works in a similar manner to the keys of a python dictionary. The loc attribute contains a _LocIndexer object that you can use to access data from the series. You can use the index label inside the square brackets with the loc attribute to access the elements of a pandas series as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
index="A"
myVal=mySeries.loc[index]
print("Element at index {} is {}".format(index,myVal))
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
Element at index A is 1
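The loc attribute also accepts a list of labels or a slice of labels. One detail worth knowing is that, unlike a list slice, a label slice includes the end label. The following is a short sketch of both forms.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=["A", "B", "C", "D"])

# A list of labels returns a new series with just those rows.
picked = mySeries.loc[["A", "D"]]

# A label slice includes both the start and the end label.
labelSlice = mySeries.loc["A":"C"]

print(picked)
print(labelSlice)
```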
Insert Data Into a Pandas Series
To insert a single element into a series, you can use the loc attribute or the append() method.
To insert data into a series with index labels, you can use the loc attribute. Here, we will assign the label and value to the series in the same way we add new key-value pairs in a python dictionary.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
index="E"
mySeries.loc[index]=1117
print("The modified series is:")
print(mySeries)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
A 1
B 2.2
C Aditya
D None
E 1117
dtype: object
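Whether an assignment through loc inserts a new element or overwrites an existing one depends on whether the label is already present, just as with dictionary keys. The sketch below checks the length of the series after each kind of assignment; the variable names lengthAfterUpdate and lengthAfterInsert are made up for the illustration.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=["A", "B", "C", "D"])

# "D" already exists, so its value is replaced and the length stays 4.
mySeries.loc["D"] = 1117
lengthAfterUpdate = len(mySeries)

# "E" is a new label, so a new element is added at the end.
mySeries.loc["E"] = "Joel"
lengthAfterInsert = len(mySeries)

print(lengthAfterUpdate, lengthAfterInsert)
```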
The append() method is used to append a Series to another series. When invoked on a series, it takes another series as its input argument, appends it to the original series, and returns a new series containing elements from both series.
To insert an element into a series, we will first create a new series with the given element. After that, we will append the new series to the existing series using the append() method as shown in the following example.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
newSeries=pd.Series([1117])
mySeries=mySeries.append(newSeries)
print("The modified series is:")
print(mySeries)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
A 1
B 2.2
C Aditya
D None
0 1117
dtype: object
You can observe that the index of the output series mixes the labels “A” to “D” with the integer 0. This is due to the fact that the index of the new series and the existing series have been merged along with the elements. To get a consistent index, you can use the ignore_index=True parameter in the append() method, which replaces all the labels with positions starting from 0, as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
newSeries=pd.Series([1117])
mySeries=mySeries.append(newSeries, ignore_index=True )
print("The modified series is:")
print(mySeries)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
0 1
1 2.2
2 Aditya
3 None
4 1117
dtype: object
If the existing series has index labels and the data to be inserted also contains a specific label for the index, you can also use the append() method to add a new element to the series as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The input list is:")
print(names)
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
newSeries=pd.Series([1117],index=["P"])
mySeries=mySeries.append(newSeries)
print("The modified series is:")
print(mySeries)
Output:
The input list is:
[1, 2.2, 'Aditya', None]
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
A 1
B 2.2
C Aditya
D None
P 1117
dtype: object
The append() method was deprecated in pandas 1.4 and has been removed in pandas 2.0 (I am currently using pandas 1.4.3). If you are using the append() method on a series and getting an AttributeError, you are using a newer version of pandas. In that case, use the pandas.concat() function to add the element to the series.
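As a sketch of the replacement, the pandas.concat() function takes a list of series and returns a new series with the elements of all of them. It also accepts the ignore_index parameter. The variable names combined and renumbered are made up for the illustration.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=["A", "B", "C", "D"])
newSeries = pd.Series([1117], index=["P"])

# Keep the labels of both series.
combined = pd.concat([mySeries, newSeries])

# Discard the labels and renumber from 0.
renumbered = pd.concat([mySeries, newSeries], ignore_index=True)

print(combined)
print(renumbered)
```

Like append(), concat() does not modify the input series. It returns a new one.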
Delete Data From a Pandas Series in Python
To delete data from a series in Python, you can use the drop() method. The drop() method, when invoked on a series object, takes an index label or a list of index labels as its input argument. After execution, it returns a new series without the data at the specified index labels.
To delete a single element from a series having index labels, you can pass the index label to the drop() method as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
mySeries=mySeries.drop("A")
print("The modified series is:")
print(mySeries)
Output:
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
B 2.2
C Aditya
D None
dtype: object
To delete elements at multiple index labels, you can pass a list of index labels to the drop() method as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
mySeries=mySeries.drop(["A","D"])
print("The modified series is:")
print(mySeries)
Output:
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
B 2.2
C Aditya
dtype: object
In the above examples, the drop() method returns a new series that we assign back to mySeries. The original series object isn’t modified. To delete the elements from the original series itself, you can use the inplace=True parameter in the drop() method as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
mySeries.drop(["A","D"], inplace=True)
print("The modified series is:")
print(mySeries)
Output:
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
B 2.2
C Aditya
dtype: object
The drop() method always works with index labels. In a series without explicit index labels, the default labels are the same as the positions of the elements. Hence, you can pass the position of an element to the drop() method as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
mySeries.drop(0, inplace=True)
print("The modified series is:")
print(mySeries)
Output:
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
The modified series is:
1 2.2
2 Aditya
3 None
dtype: object
To delete elements at multiple positions in such a series, you can pass the list of index values to the drop() method as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
mySeries.drop([0,1], inplace=True)
print("The modified series is:")
print(mySeries)
Output:
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
The modified series is:
2 Aditya
3 None
dtype: object
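Since drop() expects labels, deleting by position from a series with custom labels needs one extra step. One possible approach, sketched below, is to look up the labels at the given positions through the index attribute and pass those to drop(). The variable names labelsToDrop and trimmed are made up for the illustration.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=["A", "B", "C", "D"])

# Translate positions 0 and 1 into the labels stored at those positions.
labelsToDrop = mySeries.index[[0, 1]]

# drop() then removes the elements with those labels.
trimmed = mySeries.drop(labelsToDrop)

print(trimmed)
```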
Update Data in a Pandas Series
To update an element at a given index, you can use the indexing operator with the assignment operator as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names)
print(mySeries)
mySeries[0]=12345
print("The modified series is:")
print(mySeries)
Output:
The series is:
0 1
1 2.2
2 Aditya
3 None
dtype: object
The modified series is:
0 12345
1 2.2
2 Aditya
3 None
dtype: object
For a series having index labels, you can use the index labels with the assignment operator as shown below.
import pandas as pd
names = [1, 2.2, "Aditya", None]
print("The series is:")
mySeries=pd.Series(names,index=["A","B","C","D"])
print(mySeries)
mySeries["D"]="Chris"
print("The modified series is:")
print(mySeries)
Output:
The series is:
A 1
B 2.2
C Aditya
D None
dtype: object
The modified series is:
A 1
B 2.2
C Aditya
D Chris
dtype: object
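The loc and iloc attributes can be used for updates too, which avoids any ambiguity between labels and positions. The following is a minimal sketch that updates one element by position and two elements by label in a single assignment.

```python
import pandas as pd

mySeries = pd.Series([1, 2.2, "Aditya", None], index=["A", "B", "C", "D"])

# Update by position with iloc.
mySeries.iloc[0] = 12345

# Update several labels at once with loc and a list of new values.
mySeries.loc[["B", "D"]] = ["Chris", "Joel"]

print(mySeries)
```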
Conclusion
In this article, we have discussed how to create a series data structure in Python using the pandas module. We also discussed indexing in a series, how to delete elements from a series, how to update elements in a series, and how to insert elements into a series.
To know more about Python programming, you can read this article on list comprehension in Python. You might also like the article on how to create chat application in Python.