
How to Generate Automated Word Documents with Python

Automating the repetitive tasks that you shouldn’t be wasting time over

Most of us can recall a moment in our data science careers when we spent a good part of a day producing a multitude of repetitive, tedious reports that could have been generated automatically. It goes without saying that your time is far more valuable than filling in tables and writing headers, sub-headers, and figure numbers. Spend it on something ever so slightly more purposeful, like reading up on TDS or checking out the latest frameworks for serverless computing. Literally anything but filling in those pesky rows and columns.

Python-docx

Luckily, there's a package to remedy our predicament: python-docx, together with its companion docxtpl (python-docx-template), which adds Jinja2-style templating on top of it. If you haven't already perused the documentation, then I highly advise you to do so. It is quite possibly one of the most intuitive and self-explanatory APIs I have worked with in recent times. It allows you to automate document generation by inserting text, filling in tables, and rendering images into your report on demand. Without further ado, let's go ahead and generate an automated report of our own!

Creating a Template

Before you can proceed, you must first create a template document. This is a normal Microsoft Word document (.docx) formatted exactly the way you want your automated report to look, down to every nitty-gritty detail such as typefaces, font sizes, formatting, and page structure. The only thing you need to do afterward is create placeholders for your automated content and declare them with variable names, as shown below.

[Figure: the template document with its placeholder variables]

As you've probably guessed, any automated content can be declared inside a set of double curly brackets: {{variable_name}}. This includes text and images. Tables are a little more complicated. Create a table with a template row that includes all the columns, then add one row above it and one row below it with the following notation:

First row:

{%tr for item in variable_name %}

Last row:

{%tr endfor %}

Please note that in the figure above the variable names are:

  • table_contents for the Python list of dictionaries that will store our tabular data (one dictionary per row)
  • Index for the dictionary key that fills the first column
  • Value for the dictionary key that fills the second column
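To make the shape of the table data concrete, here is a minimal sketch of how a plain dictionary could be turned into the rows that the table loop iterates over. The helper name to_table_rows and the sample data are hypothetical; only the Index and Value keys come from the template described above.

```python
def to_table_rows(data):
    """Convert a {key: value} mapping into one row dictionary per entry.

    Each row exposes 'Index' and 'Value', matching the placeholder names
    used in the template row. (Hypothetical helper, for illustration only.)
    """
    return [{'Index': key, 'Value': value} for key, value in data.items()]


# Hypothetical sample data standing in for real report values
rows = to_table_rows({'a': 0.25, 'b': 0.5})
print(rows)
```

Each element of the resulting list becomes one rendered table row, so the number of rows in the report follows the length of the list.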

Once done, save your document in your working directory as a .docx file and proceed with writing the code to invoke the template and generate an automated document.

Source Code

Now that you’ve created your template, fire up Anaconda or any Python IDE of your choice and install the following packages:

pip install python-docx
pip install docxtpl
pip install matplotlib
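Because a pip package name and its import name can differ (python-docx, for instance, is imported as docx), a quick check like the one below may save some confusion. This is only a sketch using the standard library; the function name find_missing is hypothetical.

```python
import importlib.util

# Import names, not pip names: python-docx is imported as "docx".
REQUIRED_MODULES = ('docx', 'docxtpl', 'matplotlib')


def find_missing(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing()
if missing:
    print('Missing modules:', ', '.join(missing))
else:
    print('All required modules are installed.')
```

If anything is reported as missing, install it with pip before running the rest of the script.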

Then start a new script and import all the required packages:

import datetime
import random

import matplotlib.pyplot as plt
from docx.shared import Cm
from docxtpl import DocxTemplate, InlineImage

Subsequently, import your created template and instantiate your variables:

# Import template document
template = DocxTemplate('automated_report_template.docx')

# Generate list of random values
table_contents = []
x = []
y = []
for i in range(12):
    number = round(random.random(), 3)
    # One dictionary per table row; the keys match the template placeholders
    table_contents.append({'Index': i, 'Value': number})
    x.append(i)
    y.append(number)

# Plot random values and save figure
fig = plt.figure()
plt.plot(x, y)
fig.savefig('image.png', dpi=fig.dpi)

# Import saved figure, scaled to a width of 10 cm
image = InlineImage(template, 'image.png', Cm(10))

Finally, declare your variables in a dictionary and use it to render your automated Word document:

# Declare template variables
now = datetime.datetime.now()  # single timestamp so day, month, and year agree
context = {
    'title': 'Automated Report',
    'day': now.strftime('%d'),
    'month': now.strftime('%b'),
    'year': now.strftime('%Y'),
    'table_contents': table_contents,
    'image': image,
}

# Render automated report
template.render(context)
template.save('generated_report.docx')
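If you generate reports regularly, it can help to build the date fields in one place. The following is a minimal sketch under that assumption; the helper name date_fields is hypothetical, and the fixed date is only there to make the output predictable.

```python
import datetime


def date_fields(now=None):
    """Return the day, month, and year strings used by the template context."""
    if now is None:
        now = datetime.datetime.now()
    return {
        'day': now.strftime('%d'),    # zero-padded day of the month
        'month': now.strftime('%b'),  # abbreviated month name (locale-dependent)
        'year': now.strftime('%Y'),   # four-digit year
    }


# Fixed, hypothetical date so the result is reproducible
fields = date_fields(datetime.datetime(2021, 3, 5))
print(fields)
```

The returned dictionary can be merged into the rest of the context, for example with {**date_fields(), 'title': 'Automated Report'}.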

Results

And there you have it: your automatically generated report in short order! And in case you're wondering, yes, you can generate as many rows and columns as you want in your table, and you can render as many pages as needed in your document.

[Figure: the generated report]
