XML and YAML are two of the most widely used file formats in software engineering. Sometimes, we need to convert an XML file to YAML or vice versa. This article discusses how to convert an XML string or file to YAML format in Python.
What is XML Format?
XML is a markup language used for encoding documents in a human and machine-readable format. It defines elements using tags, which enclose content and can have attributes for additional information. XML is commonly used for data exchange and defining data formats between different systems. It can be verbose and complex compared to other formats, such as JSON or YAML.
Following is an XML document that contains the data of an employee.
<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>
This XML string represents an employee record. It starts with an XML declaration that specifies the XML version.
- The root element is <employee>, which contains child elements for the employee’s name, age, job, and address.
- The employee’s name is contained in the <name> element, and their age is contained in the <age> element.
- The <job> element contains child elements for the employee’s job title, department, and years of experience.
- The <address> element contains child elements for the employee’s street address, city, state, and zip code.
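The structure described above can be checked with a short sketch. It uses xml.etree.ElementTree from the Python standard library, which this article does not otherwise rely on, and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

xml_string = """<?xml version="1.0"?>
<employee>
  <name>John Doe</name>
  <age>35</age>
  <job>
    <title>Software Engineer</title>
    <department>IT</department>
    <years_of_experience>10</years_of_experience>
  </job>
  <address>
    <street>123 Main St.</street>
    <city>San Francisco</city>
    <state>CA</state>
    <zip>94102</zip>
  </address>
</employee>"""

# Parse the string; fromstring() returns the root element
root = ET.fromstring(xml_string)
print(root.tag)

# Direct children of the root element, in document order
child_tags = [child.tag for child in root]
print(child_tags)

# Children of <job>, collected as tag -> text pairs
job_fields = {field.tag: field.text for field in root.find("job")}
print(job_fields)
```

Note that every value in job_fields is a string, because XML element text carries no type information of its own.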
The XML format is widely used in different areas, including web services, document processing, and data exchange between different systems.
One advantage of XML is that it allows for the definition of structured data formats. This can be helpful in ensuring that data is consistent and accurate. However, because XML can be verbose and more complex than other formats such as JSON or YAML, it may not always be the best choice for all situations.
What is the YAML File Format?
YAML (short for “YAML Ain’t Markup Language”) is a human-readable data serialization format often used for configuration files, data exchange between different systems, and storing structured data. It is designed to be easy to read and write for both humans and machines and uses indentation and simple syntax to define data structures.
YAML supports a wide range of data types, including strings, numbers, booleans, arrays, and maps (also known as dictionaries or hashes). It allows for comments and references between different document parts and can include complex data structures such as nested arrays and maps.
The data shown in the previous XML example can be stored in the YAML file format as shown below.
employee:
  name: John Doe
  age: 35
  job:
    title: Software Engineer
    department: IT
    years_of_experience: 10
  address:
    street: 123 Main St.
    city: San Francisco
    state: CA
    zip: 94102
Notice how much easier this is to read than the XML version. That’s why YAML files are often preferred for storing configurations and structured data.
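To see the YAML data types in practice, here is a minimal sketch that loads the same employee data with the safe_load() function from the yaml module. It assumes the PyYAML package is installed, and the variable names are illustrative.

```python
import yaml

yaml_string = """
employee:
  name: John Doe
  age: 35
  job:
    title: Software Engineer
    department: IT
    years_of_experience: 10
  address:
    street: 123 Main St.
    city: San Francisco
    state: CA
    zip: 94102
"""

# safe_load() turns the YAML text into nested Python dictionaries
data = yaml.safe_load(yaml_string)
employee = data["employee"]

# Unquoted numbers are loaded as integers, plain text as strings
print(type(employee["age"]))
print(type(employee["name"]))
print(employee["job"]["title"])
```

Here the nesting comes entirely from indentation, so the indentation has to be consistent for the document to load as intended.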
Convert XML String to YAML String in Python
To convert an XML string to YAML, we will use the xmltodict module and the yaml module. Both are third-party packages (the yaml module is provided by PyYAML), so they need to be installed before use. For this, we will use the following steps.
First, we will convert the XML string to a Python dictionary using the parse() function defined in the xmltodict module. The parse() function takes an XML string as its input argument and returns a Python dictionary.
After this, we will convert the Python dictionary to a YAML string using the dump() function defined in the yaml module. When it is given only the dictionary as its input argument, the dump() function returns a YAML string.
You can observe this in the following example.
import xmltodict
import yaml
xml_string="""<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>"""
print("The XML string is:")
print(xml_string)
python_dict=xmltodict.parse(xml_string)
yaml_string=yaml.dump(python_dict)
print("The YAML string is:")
print(yaml_string)
Output:
The XML string is:
<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>
The YAML string is:
employee:
  address:
    city: San Francisco
    state: CA
    street: 123 Main St.
    zip: '94102'
  age: '35'
  job:
    department: IT
    title: Software Engineer
    years_of_experience: '10'
  name: John Doe
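In the output above, values such as the age are quoted because XML element text is always parsed as a string. If you want numbers in the YAML, one possible approach is the postprocessor argument of parse(), sketched below. The helper convert_numbers() is a hypothetical name introduced for this example, and keeping the zip code as text is a design assumption, since zip codes can start with a zero.

```python
import xmltodict
import yaml

xml_string = """<employee>
  <name>John Doe</name>
  <age>35</age>
  <address>
    <city>San Francisco</city>
    <zip>94102</zip>
  </address>
</employee>"""


def convert_numbers(path, key, value):
    # Hypothetical helper: turn digit-only text into an int,
    # but leave zip codes as strings (assumption for this sketch)
    if key != "zip" and isinstance(value, str) and value.isdigit():
        return key, int(value)
    return key, value


# parse() calls the postprocessor for every element it builds
python_dict = xmltodict.parse(xml_string, postprocessor=convert_numbers)
yaml_string = yaml.dump(python_dict)
print(yaml_string)
```

With this sketch, the age should be written without quotes, while the zip code stays a quoted string.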
Convert XML String to YAML File in Python
You can also write the converted YAML to a file instead of a string. To convert an XML string to a YAML file, you can use the following steps.
- First, we will convert the XML string to a Python dictionary using the parse() function defined in the xmltodict module.
- Next, we will open a YAML file in write mode using the open() function. The open() function takes the file name as its first input argument and the mode string “w” as its second input argument. After execution, it returns a file object.
- Now, we will dump the Python dictionary to the YAML file using the dump() function defined in the yaml module. Here, the dump() function takes the dictionary as its first argument and the file object as its second input argument.
- Finally, we will close the file using the close() method. After this, the YAML file will be saved to storage.
You can use the above steps to convert an XML string to a YAML file as shown below.
import xmltodict
import yaml
xml_string="""<?xml version="1.0"?>
<employee>
<name>John Doe</name>
<age>35</age>
<job>
<title>Software Engineer</title>
<department>IT</department>
<years_of_experience>10</years_of_experience>
</job>
<address>
<street>123 Main St.</street>
<city>San Francisco</city>
<state>CA</state>
<zip>94102</zip>
</address>
</employee>"""
python_dict=xmltodict.parse(xml_string)
file=open("person.yaml","w")
yaml.dump(python_dict,file)
file.close()
After execution, the file person.yaml contains the same YAML content as the string output in the previous example.
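The same file-writing steps can also be sketched with a with statement, which closes the file automatically even if an error occurs. This is an alternative to calling close() by hand, not a change to the approach. The temporary directory and the shortened XML string are assumptions made so the sketch does not overwrite a real person.yaml.

```python
import os
import tempfile

import xmltodict
import yaml

xml_string = "<employee><name>John Doe</name><age>35</age></employee>"
python_dict = xmltodict.parse(xml_string)

# Write into a temporary directory (assumption for this sketch)
output_path = os.path.join(tempfile.mkdtemp(), "person.yaml")

# The file is closed automatically when the with block ends
with open(output_path, "w", encoding="utf-8") as yaml_file:
    yaml.dump(python_dict, yaml_file)

# Read the file back to check what was written
with open(output_path, "r", encoding="utf-8") as yaml_file:
    written = yaml_file.read()
print(written)
```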
Convert XML File to YAML String in Python
We can also convert an XML file to a YAML string. For this, we can use the following steps.
- First, we will open the XML file in read mode using the open() function. Here, the open() function takes the file name as its first input argument and the mode string “r” as its second input argument. After execution, it returns a file object.
- Next, we will read the contents of the XML file using the read() method. The read() method, when invoked on the file object, returns the file contents as a string.
- Once we get the XML string, we will convert it to a Python dictionary using the parse() function defined in the xmltodict module. The parse() function takes the XML string as its input argument and returns the Python dictionary.
- Finally, we will use the dump() function defined in the yaml module to convert the Python dictionary to a YAML string. The dump() function takes the dictionary as its input argument and returns the YAML string.
We will convert an XML file named person.xml that contains the employee record shown earlier.
You can observe the entire process in the following example.
import xmltodict
import yaml
xml_file=open("person.xml","r")
xml_string=xml_file.read()
# Close the XML file once its contents have been read
xml_file.close()
python_dict=xmltodict.parse(xml_string)
yaml_string=yaml.dump(python_dict)
print("The YAML string is:")
print(yaml_string)
Output:
The YAML string is:
employee:
  address:
    city: San Francisco
    state: CA
    street: 123 Main St.
    zip: '94102'
  age: '35'
  job:
    department: IT
    title: Software Engineer
    years_of_experience: '10'
  name: John Doe
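The keys in the output above appear in alphabetical order rather than in the order of the XML elements, because dump() sorts dictionary keys by default. If you prefer to keep the original order, a minimal sketch is to pass sort_keys=False to dump(). This assumes PyYAML 5.1 or later, and the shortened XML string is only for illustration.

```python
import xmltodict
import yaml

xml_string = """<employee>
  <name>John Doe</name>
  <age>35</age>
  <job>
    <title>Software Engineer</title>
    <department>IT</department>
  </job>
</employee>"""

python_dict = xmltodict.parse(xml_string)

# sort_keys=False keeps the keys in the order they were inserted
yaml_string = yaml.dump(python_dict, sort_keys=False)
print(yaml_string)
```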
Convert XML File to YAML File in Python
To convert an XML file to a YAML file in Python, you first need to open the XML file, read its contents, and parse them into a Python dictionary. Then, you can open a YAML file in write mode and dump the dictionary into it as shown below.
import xmltodict
import yaml
xml_file=open("person.xml","r")
xml_string=xml_file.read()
# Close the XML file once its contents have been read
xml_file.close()
python_dict=xmltodict.parse(xml_string)
file=open("person.yaml","w")
yaml.dump(python_dict,file)
file.close()
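If you need this conversion in more than one place, the steps can be wrapped in a small function. The following is a sketch under stated assumptions: xml_file_to_yaml_file() is a hypothetical helper name, UTF-8 is assumed as the file encoding, and the usage part writes throwaway files to a temporary directory.

```python
import os
import tempfile

import xmltodict
import yaml


def xml_file_to_yaml_file(xml_path, yaml_path):
    """Hypothetical helper: read XML from xml_path and write YAML to yaml_path."""
    # Read and parse the XML file; the with block closes it afterwards
    with open(xml_path, "r", encoding="utf-8") as xml_file:
        python_dict = xmltodict.parse(xml_file.read())
    # Dump the dictionary into the YAML file
    with open(yaml_path, "w", encoding="utf-8") as yaml_file:
        yaml.dump(python_dict, yaml_file)
    return python_dict


# Usage with throwaway files in a temporary directory
work_dir = tempfile.mkdtemp()
xml_path = os.path.join(work_dir, "person.xml")
yaml_path = os.path.join(work_dir, "person.yaml")
with open(xml_path, "w", encoding="utf-8") as f:
    f.write("<employee><name>John Doe</name><age>35</age></employee>")

result = xml_file_to_yaml_file(xml_path, yaml_path)
print(result)
```

Returning the parsed dictionary is optional; it is included here only so the caller can inspect what was written.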
Conclusion
In this article, we have discussed how to convert an XML string or file to a YAML string or file in Python. To learn more about file conversions, you can read this article on how to convert YAML to XML in Python. You might also like this article on how to convert a Python dictionary to XML.
I hope you enjoyed reading this article. Stay tuned for more informative articles.
Happy Learning!