My document generation workflow with Markdown, YAML, Jinja2 and WeasyPrint
Here is the workflow I use to generate simple text documents (résumés, cover letters, etc.) from Markdown, YAML and Jinja2 templates.
Summary:
- input document is in Markdown with YAML frontmatter
- HTML conversion using a Jinja2 template
- PDF conversion from HTML with WeasyPrint
Good old make coordinates the different steps.
The nice things about this approach are:
- you write your content in Markdown;
- you can add structured data in YAML;
- you can use your CSS skills to style the document;
- it is VCS friendly (usable diff, merge, etc.);
- you can easily share the Jinja template and CSS style between documents.
Input file
The input document is a Markdown file with YAML frontmatter and looks like this:
---
name: John Doe
title: Super hero
address:
- 221B Baker Street
- London
- UK
lang: en
phone: +XX-X-XX-XX-XX-XX
email: john.doe@example.com
website: http://www.example.com/john.doe/
---
## Introduction
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat
non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
## Discussion
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat
non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
## Conclusion
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat
non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
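To make the layout concrete, here is a minimal sketch of how such a file separates into its two parts. It uses only the Python standard library and leaves the YAML text unparsed; the function name `split_frontmatter` and the sample text are made up for illustration.

```python
def split_frontmatter(text):
    """Return (frontmatter, body) for a document that starts with a '---' line."""
    lines = text.splitlines()
    if not lines or lines[0].rstrip() != "---":
        raise ValueError("missing opening '---' line")
    for i in range(1, len(lines)):
        if lines[i].rstrip() == "---":
            # Everything between the two delimiters is YAML,
            # everything after the second one is Markdown.
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("missing closing '---' line")


sample = """---
name: John Doe
lang: en
---
## Introduction
Lorem ipsum.
"""

frontmatter, body = split_frontmatter(sample)
print(frontmatter)  # the two YAML lines
print(body)         # the Markdown part
```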
HTML conversion
I am using a Jinja2 template to convert the input Markdown document into HTML:
<html xmlns="http://www.w3.org/1999/xhtml" lang="{{ lang | escape }}">
  <head>
    <meta charset="utf-8"/>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title>{{ title | escape }}</title>
    <link rel="stylesheet" type="text/css" href="style.css"/>
  </head>
  <body>
    <header>
      <address>
        <strong>{{ name | escape }}</strong><br/>
        {% for line in address %}
        {{ line | escape }}<br/>
        {% endfor %}
      </address>
      <span class="details">
        <p><span title="{{ 'Téléphone' if lang == 'fr' else 'Phone' }}">☎</span>
          <a href="tel:{{ phone | escape }}">{{ phone | escape }}</a></p>
        <p><a href="mailto:{{ email | escape }}">{{ email | escape }}</a></p>
        <p><a href="{{ website | escape }}">{{ website | escape }}</a></p>
      </span>
    </header>
    <h1>{{ title | escape }}</h1>
    {{ body }}
  </body>
</html>
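The template applies `| escape` to every value taken from the frontmatter, while `{{ body }}` is inserted as-is because it is already HTML. As a rough illustration of what escaping does, the sketch below uses `html.escape` from the Python standard library as a stand-in for the Jinja2 filter. Both replace `&`, `<` and `>` with the same entities, which is all this example relies on; the entities used for quote characters may differ. The sample value is made up.

```python
from html import escape

# A hypothetical frontmatter value containing characters that are special in HTML.
title = "R&D <lead>"

# Once escaped, the value can be placed in the page without being read as markup.
safe_title = escape(title)
print(safe_title)  # R&amp;D &lt;lead&gt;
```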
The conversion is done with a Python script (./render):
#!/usr/bin/env python3
"""Render a Markdown document with YAML frontmatter to HTML via a Jinja2 template."""

from sys import argv
import re

import yaml
import markdown
from markdown.extensions.extra import ExtraExtension
from jinja2 import Environment, FileSystemLoader
from atomicwrites import atomic_write

# Markdown extensions ("extra" adds tables, definition lists, etc.)
EXT = [
    ExtraExtension()
]

# Autoescaping is off because the template escapes each value explicitly
# and inserts the already-rendered HTML body as-is.
env = Environment(
    loader=FileSystemLoader('.'),
    autoescape=False,
)
template = env.get_template('template.j2')

filename = argv[1]
out_filename = argv[2]

# A frontmatter delimiter: "---" followed by optional trailing whitespace
RE = re.compile(r'^---\s*$', re.M)


def split_document(data):
    """
    Split a document into a YAML frontmatter and a body
    """
    lines = data.splitlines()
    if not lines or not RE.match(lines[0]):
        raise Exception("Missing YAML start")
    for i in range(1, len(lines)):
        if RE.match(lines[i]):
            # The YAML sits between the two delimiters, the body after the second.
            # safe_load avoids the Loader argument that yaml.load_all now requires.
            head = yaml.safe_load("\n".join(lines[1:i]))
            body = "\n".join(lines[i+1:])
            return (head, body)
    raise Exception("Missing YAML end")


with open(filename, "r") as f:
    content = f.read()

(head, body) = split_document(content)
body_html = markdown.markdown(body, extensions=EXT)

with atomic_write(out_filename, overwrite=True) as f:
    f.write(template.render(**head, body=body_html))
Called as:
./render doc.md doc.html
PDF conversion
I am using WeasyPrint to generate PDF from HTML:
weasyprint doc.html doc.pdf
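The same conversion can be done from Python, which may be convenient if the PDF step is ever folded into the render script. This is only a sketch: the function name `html_to_pdf` is made up, and WeasyPrint is imported inside the function so that the module can still be loaded where it is not installed.

```python
def html_to_pdf(html_path, pdf_path):
    """Convert an HTML file to PDF, like `weasyprint html_path pdf_path`."""
    # Imported lazily so this module loads even where WeasyPrint is not installed.
    from weasyprint import HTML

    # Relative URLs such as style.css are resolved against the HTML file's location.
    HTML(filename=html_path).write_pdf(pdf_path)
```

Calling `html_to_pdf("doc.html", "doc.pdf")` should then be equivalent to the command above.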
WeasyPrint supports some of the CSS for paged media, such as @page rules and page-break properties:
@page {
  size: A4;
  margin: 1cm;
  margin-top: 2cm;
  margin-bottom: 2cm;
}
@media print {
  body {
    margin-top: 0;
    margin-bottom: 0;
  }
}
h1, h2, h3 {
  page-break-after: avoid;
  page-break-inside: avoid;
}
li {
  page-break-inside: avoid;
}
It also supports links, PDF bookmarks, attachments, fonts, etc.
Make
Currently I am using a Makefile to compose the different steps:
.PHONY: all clear

all: doc.html

# Remove generated files (-f: do not fail if they are missing)
clear:
	rm -f doc.html doc.pdf

doc.html: doc.md template.j2 render
	./render doc.md doc.html

doc.pdf: doc.html
	weasyprint doc.html doc.pdf
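What make adds over a plain shell script is that it only reruns a step when its output is missing or older than one of its inputs. Here is a rough Python sketch of that rule, with a made-up helper name `needs_rebuild`, ignoring everything else make does:

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any of its dependencies."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

For the doc.html rule, `needs_rebuild("doc.html", ["doc.md", "template.j2", "render"])` roughly corresponds to the check make performs before running ./render.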