How to reverse a Jinja template

Not so long ago I had to update some email templates, 15 or so. Looking at their content, I noticed that there was very little difference between them, and I was even able to find a pattern: header, body, footer. One solution was to copy and paste the updated text into each of these templates, but then I would have to repeat the same operation for every little change, and I didn’t like that. All of this would be much easier with a layout file that contains placeholders, a template that fills those placeholders, and a templating engine that generates the final output.

I had already used Jinja in the past, so I started working on the idea of reversing a Jinja template. By that I mean taking the layout and the output and generating the template from them. And yes, it is possible. I will show how.

This is a very limited solution that only supports detecting blocks, but it was enough in my case.

Let’s get started. For this demo I will use a very simple HTML template.

TL;DR

Check the project on GitHub: github.com/gabihodoroaga/jinja-reverse.

The base template

Let’s take for example a sample file with this content

<html>
    <head>
        <title>My page</title>
    </head>
    <body>
        <h1>Titile</h1>
        <p>Test 1 - Sample title</p>
        <div>
            <p>Test 1 - sample content</p>
        </div>
    </body>
</html>

and save this file to samples/test_1.html.

Then we need to define the blocks, that is, the parts of this file that differ from one template to another. It’s not necessary to clear the content of the blocks, because the script ignores whatever sits between the block tags in the base file.

<html>
    <head>
        <title>My page</title>
    </head>
    <body>
        <h1>Titile</h1>
        {% block title %}
        <p>SampleBlock</p>
        {% endblock %}
        <div>
            {% block content %}
            <p>SampleBlock</p>
            {% endblock %}
        </div>
    </body>
</html>

and save this file as templates/base.html. This will be our base or layout file.
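Before moving on, it can help to check that the block markers are written in a form the script will recognize. The snippet below is a minimal sketch, not part of the project: it applies the same block-detection regex that the reverse script in the next section uses to an inline copy of the body of base.html and lists the block names it finds. The names `BLOCK_REGEX` and `find_block_names` are made up for this illustration.

```python
import re

# Same block-detection regex as the reverse script in this post.
# It expects exactly "{% block name %}" and "{% endblock %}" with single spaces,
# and block names made of letters and underscores only.
BLOCK_REGEX = r"(\r\n|\r|\n)([ \t]*)({%\sblock\s([a-zA-Z_]+)\s%}(.*?){% endblock %})"

# Inline copy of the body of templates/base.html (assumed here for the demo).
base_template = """<body>
        <h1>Titile</h1>
        {% block title %}
        <p>SampleBlock</p>
        {% endblock %}
        <div>
            {% block content %}
            <p>SampleBlock</p>
            {% endblock %}
        </div>
    </body>"""


def find_block_names(template_text):
    """Return the block names in the order they appear in the template."""
    matches = re.finditer(BLOCK_REGEX, template_text, flags=re.DOTALL)
    return [m.group(4) for m in matches]


block_names = find_block_names(base_template)
print(block_names)  # ['title', 'content']
```

If a block is missing from the list, the most likely cause is a marker the regex does not cover, such as a digit in the block name or a named `{% endblock title %}`.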

The reverse script

The approach is:

  • use regex to find the blocks in our base
  • generate a new regex pattern that matches everything except the blocks
  • use the new pattern to find the content of the blocks in the sample file
  • generate the child template
import os, re, glob, argparse

def readfile(file_path):
    with open(file_path,"r") as f:
        return f.read()

def generate(template_folder, base_file_name, sample_folder, sample_pattern):
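    # Matches a whole block together with the line break and indentation before it.
    # Groups: 1 = line break, 2 = indentation, 3 = the block, 4 = block name, 5 = placeholder content
    block_regex = r"(\r\n|\r|\n)([ \t]*)({%\sblock\s([a-zA-Z_]+)\s%}(.*?){% endblock %})"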

    base_template = readfile(os.path.join(template_folder, base_file_name))

    template_pattern = ""
    start = 0
    block_names = []
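    for m in re.finditer(block_regex, base_template, flags=re.DOTALL):
        # literal text before the block, then a lazy group that captures the block content
        template_pattern+=re.escape(base_template[start:m.span()[0]]) + "(.*?)"
        # continue right after {% endblock %}
        start = m.span(3)[1]
        # remember the block name and its indentation
        block_names.append((m.group(4),m.group(2)))
    # append the rest of the base file; this completes the child template pattern
    template_pattern+=re.escape(base_template[start:])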
    # loop through all sample files
    for sample_file_name in glob.glob(os.path.join(sample_folder, sample_pattern)):
        sample_file = readfile(sample_file_name)
        output_template = os.path.join(template_folder,os.path.splitext(os.path.basename(sample_file_name))[0]+".j2")

        with open(output_template,"w+") as t:
            t.write("{% extends \""+base_file_name+"\" %}")
            # find and write the content of the blocks
            for m in re.finditer(template_pattern, sample_file, flags=re.DOTALL):
                for idx, g in enumerate(m.groups()):
                    t.write("\n")
                    t.write("{% block " + block_names[idx][0] + " %}")
                    t.write(g)
                    t.write("\n{% endblock %}")

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('-f', '--folder',default='templates', help="Specify the templates folder.")
    parser.add_argument('-b', '--base',default='base.html', help="Specify the base file.")
    parser.add_argument('-s', '--samples', default='samples', help="Specify the samples folder.")
    parser.add_argument('-g', '--glob',default='*.html', help="Specify the glob pattern.")
    args = parser.parse_args()
    generate(args.folder, args.base,args.samples,args.glob)

if __name__ == '__main__':
    main()

and save this file as reverse.py.
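To make the middle two steps of the approach more concrete, here is a minimal sketch that runs the same pattern-building logic on two short in-memory strings instead of files. The helper name `build_pattern` and the tiny `base` and `sample` strings are invented for this illustration; the regex and the loop mirror the script above.

```python
import re

# Same block-detection regex as in reverse.py.
BLOCK_REGEX = r"(\r\n|\r|\n)([ \t]*)({%\sblock\s([a-zA-Z_]+)\s%}(.*?){% endblock %})"


def build_pattern(base_template):
    """Turn the base template into a regex whose groups capture block content."""
    pattern, names, start = "", [], 0
    for m in re.finditer(BLOCK_REGEX, base_template, flags=re.DOTALL):
        # Escaped literal text before the block, then a lazy capture group.
        pattern += re.escape(base_template[start:m.start()]) + "(.*?)"
        # Resume right after "{% endblock %}".
        start = m.end(3)
        names.append(m.group(4))
    # Escaped literal text after the last block.
    pattern += re.escape(base_template[start:])
    return pattern, names


# Hypothetical one-block layout and a rendered page that follows it.
base = "<h1>Hi</h1>\n{% block msg %}\n<p>placeholder</p>\n{% endblock %}\n<hr>"
sample = "<h1>Hi</h1>\n<p>Hello, Ana</p>\n<hr>"

pattern, names = build_pattern(base)
match = re.search(pattern, sample, flags=re.DOTALL)
captured = dict(zip(names, match.groups()))
print(captured)  # {'msg': '\n<p>Hello, Ana</p>'}
```

Note that the captured text starts with the line break that preceded the block in the base file. That is why the block content in a generated child template begins on its own line and keeps its original indentation.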

Now let’s test our script

python3 reverse.py

and a new file named templates/test_1.j2 should be generated with the following content

{% extends "base.html" %}
{% block title %}
        <p>Test 1 - Sample title</p>
{% endblock %}
{% block content %}
            <p>Test 1 - sample content</p>
{% endblock %}

This is it. Now you have a base template and a child template.
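One way to gain some confidence in a generated child template is a round-trip check: put the extracted block content back into the base file and compare the result with the original sample. The sketch below does this with plain string substitution from the standard library, not with Jinja itself, so it only checks that the extraction was lossless; Jinja's real output may differ in whitespace around the blocks. The names `rebuild` and `CHILD_BLOCK_REGEX` and the shortened strings are assumptions made for this example.

```python
import re

# Same block-detection regex as in reverse.py.
BLOCK_REGEX = r"(\r\n|\r|\n)([ \t]*)({%\sblock\s([a-zA-Z_]+)\s%}(.*?){% endblock %})"
# Matches the blocks in the form reverse.py writes them to the child template.
CHILD_BLOCK_REGEX = r"{% block ([a-zA-Z_]+) %}(.*?)\n{% endblock %}"


def rebuild(base_template, child_template):
    """Substitute the child's block content into the base (not a Jinja render)."""
    contents = dict(re.findall(CHILD_BLOCK_REGEX, child_template, flags=re.DOTALL))
    # Each match covers the line break, the indentation and the whole block,
    # which is exactly the span the captured content replaces.
    return re.sub(
        BLOCK_REGEX,
        lambda m: contents[m.group(4)],
        base_template,
        flags=re.DOTALL,
    )


# Shortened stand-ins for templates/base.html, templates/test_1.j2
# and samples/test_1.html.
base = (
    "<body>\n"
    "    {% block title %}\n"
    "    <p>SampleBlock</p>\n"
    "    {% endblock %}\n"
    "</body>"
)
child = (
    '{% extends "base.html" %}\n'
    "{% block title %}\n"
    "    <p>Test 1 - Sample title</p>\n"
    "{% endblock %}"
)
sample = "<body>\n    <p>Test 1 - Sample title</p>\n</body>"

rebuilt = rebuild(base, child)
print(rebuilt == sample)  # True
```

A mismatch here usually means the sample does not follow the base layout exactly, in which case the script would have written a child template without any blocks.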

Conclusion

Jinja is a very powerful and designer-friendly templating engine, and together with this script it can save you a lot of time. At least for me it did.

I also created a project and repository on GitHub. Check it out: github.com/gabihodoroaga/jinja-reverse.