XML files are used to store data as well as to transmit data in software systems. This article discusses how to read, write, update, and delete data from an XML file in Python. For this task, we will use the xmltodict module in python.
What is an XML File?
XML (eXtensible Markup Language) is a markup language that is used to store and transmit data. It is similar to HTML in structure. But, unlike HTML, XML is designed to store and manipulate data, not to display data. XML uses a set of markup symbols to describe the structure and meaning of the data it contains, and it can be used to store any type of data, including text, numbers, dates, and other information.
An XML file is a plain text file that contains XML code, which can be read by a wide range of software applications, including web browsers, text editors, and specialized XML tools. The structure of an XML file consists of elements, which are defined by tags, and the data within those elements is stored as text or other data types.
The syntax for declaring data in an XML file is as follows.
<field_name> value <field_name>
To understand this, consider the following example.
<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>
The above XML string is a simple XML document that describes the details of an employee.
- The first line of the document,
<?xml version="1.0"?>
, is the XML declaration and it specifies the version of XML that is being used in the document. - The root element of the document is
<employee>
, which contains several other elements: - The
<name>
element stores the name of the employee, which is “John Doe”. - The
<age>
element stores the age of the employee, which is “35”. - The
<job>
element contains information about the employee’s job, including the<title>
(Software Engineer), the<department>
(IT), and the<years_of_experience>
(10) elements. - The
<address>
element contains information about the employee’s address, including the<street>
(123 Main St.),<city>
(San Francisco),<state>
(CA), and<zip>
(94102) elements.
Each of these elements is nested within the parent <employee>
element, creating a hierarchical structure. This structure allows for the data to be organized in a clear and concise manner, making it easy to understand and process.
XML is widely used for data exchange and storage because it is platform-independent, meaning that XML data can be transported and read on any platform or operating system, without the need for proprietary software. It is also human-readable, which makes it easier to debug and maintain, and it is extensible, which means that new elements can be added as needed, without breaking existing applications that use the XML data.
XML files are saved using the .xml extension. Now, we will discuss approaches to read and manipulate XML files using the xmltodict module. You can install this module using pip by executing the following command in your command prompt.
pip3 install xmltodict
Create an XML File in Python
To create an XML file in python, we can use a python dictionary and the unparse()
method defined in the xmltodict module. For this, we will use the following steps.
- First, we will create a dictionary containing the data that needs to be put into the XML file.
- Next, we will use the
unparse()
method to convert the dictionary to an XML string. Theunparse()
method takes the python dictionary as its input argument and returns the XML representation of the string. - Now, we will open an XML file in write mode using the
open()
function. Theopen()
function takes the file name as its first input argument and the literal“w”
as its second input argument. After execution, it returns a file pointer. - Next, we will write the XML string into the file using the
write()
method. Thewrite()
method, when invoked on the file pointer, takes the XML string as its input argument and writes it to the file. - Finally, we will close the file using the
close()
method.
After execution of the above steps, the XML file will be saved in the file system. You can observe this in the following example.
import xmltodict
employee={'employee': {'name': 'John Doe',
'age': '35',
'job': {'title': 'Software Engineer', 'department': 'IT', 'years_of_experience': '10'},
'address': {'street': '123 Main St.', 'city': 'San Francisco', 'state': 'CA', 'zip': '94102'},
}}
file=open("employee.xml","w")
xml_string=xmltodict.unparse(employee)
file.write(xml_string)
file.close()
Instead of using the write()
method, we can directly write the XML data into the file using the unparse()
method. For this, we will pass the python dictionary as the first input argument and the file pointer as the second input argument to the unparse()
method. After execution of the unparse()
method, the data will be saved to the file.
You can observe this in the following example.
import xmltodict
employee={'employee': {'name': 'John Doe',
'age': '35',
'job': {'title': 'Software Engineer', 'department': 'IT', 'years_of_experience': '10'},
'address': {'street': '123 Main St.', 'city': 'San Francisco', 'state': 'CA', 'zip': '94102'},
}}
file=open("employee.xml","w")
xmltodict.unparse(employee,file)
file.close()
The output file looks as follows.
Read an XML File in Python
To read an XML file in python, we will use the following steps.
- First, we will open the file in read mode using the
open()
function. Theopen(
) function takes the file name as its first input argument and the python literal “r” as its second input argument. After execution, it returns a file pointer. - Once we get the file pointer, we will read the file using the
read()
method. Theread()
method, when invoked on the file pointer, returns the file contents as a python string. - Now, we have read the XML file into a string. Next, we will parse it using the
parse()
method defined in the xmltodict module. Theparse()
method takes the XML string as its input and returns the contents of the XML string as a python dictionary. - After parsing the contents of the XML file, we will close the file using the
close()
method.
After executing the above steps, we can read the XML file into a python dictionary. You can observe this in the following example.
import xmltodict
file=open("employee.xml","r")
xml_string=file.read()
print("The XML string is:")
print(xml_string)
python_dict=xmltodict.parse(xml_string)
print("The dictionary created from XML is:")
print(python_dict)
file.close()
Output:
The XML string is:
<?xml version="1.0" encoding="utf-8"?>
<employee><name>John Doe</name><age>35</age><job><title>Software Engineer</title><department>IT</department><years_of_experience>10</years_of_experience></job><address><street>123 Main St.</street><city>San Francisco</city><state>CA</state><zip>94102</zip></address></employee>
The dictionary created from XML is:
{'employee': {'name': 'John Doe', 'age': '35', 'job': {'title': 'Software Engineer', 'department': 'IT', 'years_of_experience': '10'}, 'address': {'street': '123 Main St.', 'city': 'San Francisco', 'state': 'CA', 'zip': '94102'}}}
Add a New Section to an XML File in Python
To add a new section to an existing XML file, we will use the following steps.
- We will open the XML file in “r+” mode using the
open()
function. This will allow us to modify the file. Then, we will read it into a python dictionary using theread()
method and theparse()
method. - Next, we will add the desired data to the python dictionary using key-value pairs.
- After adding the data to the dictionary, we will erase the existing data from the file. For this, we will first go to the start of the file using the
seek()
method. Then, we will erase the file contents using thetruncate()
method. - Next, we will write the updated dictionary as XML to the file using the
unparse()
method. - Finally, we will close the file using the
close()
method.
After executing the above steps, new data will be added to the XML file. You can observe this in the following example.
import xmltodict
file=open("employee.xml","r+")
xml_string=file.read()
python_dict=xmltodict.parse(xml_string)
#add a single element
python_dict["employee"]["boss"]="Aditya"
#Add a section with nested elements
python_dict["employee"]["education"]={"University":"MIT", "Course":"B.Tech", "degree":"Hons."}
file.seek(0)
file.truncate()
xmltodict.unparse(python_dict,file)
file.close()
The output file looks as follows.
In this example, you can observe that we have added a single element as well as a nested element to the XML file.
To add a single element to the XML file, we just need to add a single key-value pair to the dictionary. To add an entire section, we need to add a nested dictionary.
Update Value in an XML File Using Python
To update a value in the XML file, we will first read it into a python dictionary. Then, we will update the values in the dictionary. Finally, we will write the dictionary back into the XML file as shown below.
import xmltodict
file=open("employee.xml","r+")
xml_string=file.read()
python_dict=xmltodict.parse(xml_string)
#update values
python_dict["employee"]["boss"]="Chris"
python_dict["employee"]["education"]={"University":"Harvard", "Course":"B.Sc", "degree":"Hons."}
file.seek(0)
file.truncate()
xmltodict.unparse(python_dict,file)
file.close()
Output:
Delete data From XML File in Python
To delete data from the XML file, we will first read it into a python dictionary. Then, we will delete the key-value pairs from the dictionary. Next, we will dump the dictionary back into the XML file using the unparse()
method. Finally, we will close the file using the close()
method as shown below.
import xmltodict
file=open("employee.xml","r+")
xml_string=file.read()
python_dict=xmltodict.parse(xml_string)
#delete single element
python_dict["employee"].pop("boss")
#delete nested element
python_dict["employee"].pop("education")
file.seek(0)
file.truncate()
xmltodict.unparse(python_dict,file)
file.close()
Output:
In this example, you can observe that we have deleted a single element as well as a nested element from the XML file.
Conclusion
In this article, we have discussed how to perform create, read, update, and delete operations on an XML file in python using the xmltodict module. To learn more about XML files, you can read this article on how to convert XML to YAML in Python. You might also like this article on how to convert JSON to XML in python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
Recommended Python Training
Course: Python 3 For Beginners
Over 15 hours of video content with guided instruction for beginners. Learn how to create real world applications and master the basics.