We use different file formats to store and transmit data generated from software systems. Sometimes, we need to convert an XML file to JSON to communicate between two software. This article discusses how to convert an XML file or string into JSON format in Python.
What is XML?
XML stands for Extensible Markup Language. This is a standard markup language for describing and structuring data. XML was designed to store and transport data, and it is widely used in various applications and technologies, such as web services, database systems, document formats, and many others.
XML uses tags to define elements, attributes to define properties of those elements, and a hierarchical structure to organize the data. XML documents can be validated against a specific schema or Document Type Definition (DTD) to ensure their conformance to a specific standard.
For instance, the following is an XML string containing the details of an employee.
<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>
In the above XML string, <employee> is the root element. You can observe that each element in the string is represented using the following syntax.
<element_name> value </element_name>
XML is a platform-independent language. We can use it on any operating system and easily share it between different applications and systems. It is also human-readable and can be easily understood and edited by both machines and humans.
What is JSON File Format?
JSON stands for JavaScript Object Notation. It is a lightweight, text-based data interchange format that is easy for humans to read and write and easy for machines to parse and generate. Its structure resembles a python dictionary.
JSON is based on a subset of the JavaScript programming language, but it is language-independent and can be used with any programming language. It is often used as a data format for web APIs, as it is a common way to exchange data between the client and server.
JSON uses a simple syntax to represent data structures, such as objects and arrays. Data is represented using key-value pairs, where the keys are strings and the values can be of any JSON-supported data type, such as strings, numbers, objects, and arrays.
Following is the JSON representation of data shown in the previous XML string.
{
"employee": {
"name": "John Doe",
"age": 35,
"job": {
"title": "Software Engineer",
"department": "IT",
"years_of_experience": 10
},
"address": {
"street": "123 Main St.",
"city": "San Francisco",
"state": "CA",
"zip": 94102
}
}
}
JSON is more concise and easier to parse than XML, making it a popular choice for data transfer between applications and systems. It is also widely supported by modern programming languages, making it an ideal format for web-based applications.
To convert XML files and strings to JSON, we will use the xmltodict module and the json module.
XML String to JSON String in Python
To convert an XML string to a JSON string, we will first convert the XML string to a Python dictionary. For this, we will use the parse() method defined in the xmltodict module. The parse() method takes an XML string as its input argument and returns the corresponding dictionary.
Next, we will convert the python dictionary to a JSON string. For this, we will use the dumps() method defined in the json module. The dumps() method takes the python dictionary as its input argument and returns the corresponding JSON string after execution.
You can observe this in the following example.
import xmltodict
import json
xml_string="""<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>"""
print("The XML string is:")
print(xml_string)
python_dict=xmltodict.parse(xml_string)
json_string=json.dumps(python_dict)
print("The JSON string is:")
print(json_string)
Output:
The XML string is:
<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>
The JSON string is:
{"employee": {"name": "John Doe", "age": "35", "job": {"title": "Software Engineer", "department": "IT", "years_of_experience": "10"}, "address": {"street": "123 Main St.", "city": "San Francisco", "state": "CA", "zip": "94102"}}}
Convert XML String to JSON File in Python
Instead of converting it into a string, we can also convert an XML string to a JSON file. For this, we will use the following steps.
- First, we will convert the XML string to a dictionary using the parse() method defined in the xmltodict module.
- Then, we will open an empty JSON file in write mode using the open() function. The open() function takes the file name as its first input argument and the python literal “w” as its second argument. After execution, it returns a file pointer.
- Once we get the file pointer, we will save the python dictionary as JSON to the file using the dump() method defined in the json module. The dump() method takes the dictionary as its first argument and the file pointer as the second input argument. After execution, it saves the dictionary into the json file as a json object.
- Finally, we will close the file using the close() method.
After executing the above steps, we can easily convert an XML string to a JSON file. You can observe this in the following example.
import xmltodict
import json
xml_string="""<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>"""
python_dict=xmltodict.parse(xml_string)
file=open("person.json","w")
json.dump(python_dict,file)
file.close()
The output file looks as follows.
XML File to JSON String in Python
To convert an XML file into a JSON string, we can use the dumps() method defined in the json module. For this, we will first open the XML file in read mode using the open() function. Then, we will read the contents of the XML file as a string using the read() method.
We will use the following XML file to convert it into a JSON string.
Once we get the XML string, we will convert it to a python dictionary using the parse() method defined in the xmltodict module. Next, we will convert the dictionary into JSON string using the dumps() method defined in the json module.
You can observe this in the following example.
import xmltodict
import json
xml_file=open("person.xml","r")
xml_string=xml_file.read()
python_dict=xmltodict.parse(xml_string)
json_string=json.dumps(python_dict)
print("The JSON string is:")
print(json_string)
Output:
The JSON string is:
{"employee": {"name": "John Doe", "age": "35", "job": {"title": "Software Engineer", "department": "IT", "years_of_experience": "10"}, "address": {"street": "123 Main St.", "city": "San Francisco", "state": "CA", "zip": "94102"}}}
Convert XML File to JSON File in Python
We can also convert an XML file to a JSON file in python. For this, we will use the following steps.
- First, we will open the XML file in read mode using the open() function.
- Next, we will read the contents of the XML file as a string using the read() function.
- Then, we will convert the XML string to a python dictionary using the parse() method defined in the xmltodict module.
- Now that we have the XML file as a dictionary, we will open an empty JSON file in write mode using the open() function.
- Next, we will dump the dictionary into the json file using the dump() method defined in the JSON module. The dump() method takes the dictionary as its first argument and the file pointer as the second input argument. After execution, it saves the dictionary into the json file as a json object.
- Finally, we will close the files using the close() method.
After execution of the above steps, the data in the XML file will be saved in the JSON file. You can observe this by executing the following code.
import xmltodict
import json
xml_file=open("person.xml","r")
xml_string=xml_file.read()
python_dict=xmltodict.parse(xml_string)
file=open("person.json","w")
json.dump(python_dict,file)
file.close()
After execution of the above code, the XML data from “person.xml” will be saved into “person.json” in JSON format.
Conclusion
In this article, we have discussed ways to convert an XML string or file into JSON format in Python. To learn more about file conversions, you can read this article on how to convert JSON to XML in python. You might also like this article on how to convert YAML to XML in Python.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!
Recommended Python Training
Course: Python 3 For Beginners
Over 15 hours of video content with guided instruction for beginners. Learn how to create real world applications and master the basics.