Convert XML to CSV Using Python

A guide on converting XML to CSV using Python.

Unlike HTML, which has a fixed set of tags, XML lets each file define its own tags. We can use these tags to select the columns we need for the CSV.

This process has two conversions: XML → DICT → CSV

Sample XML:

<employees>
    <employee>
        <name>Carolina</name>
        <role>Data Engineer</role>
        <age>24</age>
    </employee>
    <employee>
        <name>Roosaka</name>
        <role>Data Scientist</role>
        <age>27</age>
    </employee>
    <employee>
        <name>Kumar</name>
        <role>Machine Learning Engineer</role>
        <age>31</age>
    </employee>
    <employee>
        <name>Vijay</name>
        <role>Devops Engineer</role>
        <age>26</age>
    </employee>
</employees>

Step 1: Import the required packages

The csv module ships with Python. xmltodict is a third-party package, so install it first if the import fails (pip install xmltodict):

#1
import csv
import xmltodict

Step 2: Converting XML to DICT

#2
# Reading xml file
with open("sample.xml", 'r') as file:
    filedata = file.read()

# Converting XML to a Python dictionary
data_dict = xmltodict.parse(filedata)

Depending on the xmltodict version, the parsed result may be an ordered dictionary (OrderedDict) rather than a plain dict, so the following code converts each employee element to a regular dict.

#3
# creating a list of employee data
employee_data_list = [dict(x) for x in data_dict["employees"]["employee"]]

This is what employee_data_list looks like. From here we can pick out the columns we need.

[{'name': 'Carolina', 'role': 'Data Engineer', 'age': '24'},
 {'name': 'Roosaka', 'role': 'Data Scientist', 'age': '27'},
 {'name': 'Kumar', 'role': 'Machine Learning Engineer', 'age': '31'},
 {'name': 'Vijay', 'role': 'Devops Engineer', 'age': '26'}]
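
One edge case is worth noting here: when the XML contains only a single <employee> element, xmltodict returns that element as one dict rather than a list of dicts, so the list comprehension would iterate over its keys instead of over records. A minimal sketch of a guard follows; as_list is a hypothetical helper name used for illustration and is not part of xmltodict.

```python
def as_list(value):
    """Return value as a list of records.

    as_list is a hypothetical helper for illustration, not an xmltodict function.
    """
    if value is None:            # nothing to iterate over
        return []
    if isinstance(value, list):  # repeated elements already form a list
        return value
    return [value]               # a single element arrives as one mapping


# Shape of the parsed data when the XML holds exactly one <employee>
single = {"employees": {"employee": {"name": "Carolina", "role": "Data Engineer", "age": "24"}}}

employee_data_list = [dict(x) for x in as_list(single["employees"]["employee"])]
print(employee_data_list)
```

xmltodict.parse also accepts a force_list argument, for example force_list=('employee',), which asks the parser to return a list for the named elements even when only one is present.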

Step 3: Converting Dict to CSV

Build the header and the rows of employee data, then write them to the CSV file.

#4
# Selecting headers for CSV
HEADERS = ['name', 'role', 'age']

rows = []

# Iterating through each element to get row data
for employee in employee_data_list:
    name = employee["name"]
    role = employee["role"]
    age = employee["age"]

    # Adding data of each employee to the rows list
    rows.append([name, role, age])

# Writing to CSV
with open('employee_data.csv', 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(HEADERS)
    writer.writerows(rows)
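
Because each employee is already a dict, the standard library's csv.DictWriter can write the records without building the rows list by hand. The sketch below is an alternative to the csv.writer approach above, not a replacement for it. It writes to an in-memory buffer so the result can be inspected, and the extra "team" key is invented purely to show how column selection works.

```python
import csv
import io

HEADERS = ["name", "role", "age"]

# Sample records; the "team" key is made up to show that extra keys are skipped
employee_data_list = [
    {"name": "Carolina", "role": "Data Engineer", "age": "24", "team": "Platform"},
    {"name": "Roosaka", "role": "Data Scientist", "age": "27", "team": "Research"},
]

# StringIO stands in for open('employee_data.csv', 'w', newline="")
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=HEADERS, extrasaction="ignore")
writer.writeheader()                  # header row is taken from fieldnames
writer.writerows(employee_data_list)  # keys not listed in fieldnames are ignored

csv_text = buffer.getvalue()
print(csv_text)
```

With extrasaction="ignore", only the keys listed in HEADERS reach the file, so the header list doubles as the column selection.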

Conclusion

The data can also be processed directly as an ordered dict, but to make the data easier to read I converted each element to a regular dictionary in code snippet #3.
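
To illustrate that point: an OrderedDict supports the same key lookups as a regular dict, so the row-building step works without the conversion. A small sketch, using a hand-built OrderedDict as a stand-in for one parsed record:

```python
from collections import OrderedDict

HEADERS = ["name", "role", "age"]

# Hand-built stand-in for one parsed <employee> element
employee = OrderedDict([("name", "Vijay"), ("role", "Devops Engineer"), ("age", "26")])

# OrderedDict is a dict subclass, so key access is identical
row = [employee[column] for column in HEADERS]
print(row)                         # ['Vijay', 'Devops Engineer', '26']
print(isinstance(employee, dict))  # True
```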
