Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, PHP, and whatever else I happen to feel like hacking at.

Python PDF generation with Snakelets

Jerry Stratton, May 15, 2007

In the process of converting my web site from a FileMaker hack to a Django hack, I’m going to need dynamic PDF generation. The Dining After Midnight site desperately needs an update, but one of its features is a three-fold PDF of late-night restaurants.

A Google search indicated that the tool I need to use is ReportLab Toolkit. None of the examples looked easy, but they looked easier than anything else.

What I really want to be able to do is generate PDF using a Django template. I broke this task down into two steps. One step was to be able to create a PDF in Python. The other step was to be able to embed Python into Django templates.

This article is about the first step. As a test of being able to just plain create a PDF file and display it dynamically over the web, I decided to remove Django entirely from the problem, and use Snakelets. I decided to add a PDF option to my Quick & Dirty Snakelets “blog”. Snakelets is a great way to test programming ideas, and in this case, once I got past the ReportLab examples, it turned out to be very easy.

The first step was to add a new URL to the list of snakelets in the blog app’s __init__.py:


    snakelets = {
        "pdf": blog.PDF,
        "index.sn": blog.Blog,
        "*": blog.Blog,
    }

This is the same as before, except that I’ve added a line telling Snakelets to call the PDF class in blog.py when I ask for /blog/pdf/.
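To make the role of the "*" entry concrete, here is a toy lookup over a mapping shaped like the one above. It is only an illustration of the idea, not Snakelets’ actual dispatch code: the `resolve` function and the string stand-ins for the classes are hypothetical, and it assumes that exact names win over shell-style wildcard patterns.

```python
from fnmatch import fnmatch

# Stand-ins for blog.PDF and blog.Blog so the sketch runs on its own.
snakelets = {
    "pdf": "PDF",
    "index.sn": "Blog",
    "*": "Blog",
}

def resolve(path, mapping):
    """Return the handler for path: exact match first, then wildcards."""
    if path in mapping:
        return mapping[path]
    for pattern, handler in mapping.items():
        if fnmatch(path, pattern):
            return handler
    return None

print(resolve("pdf", snakelets))
print(resolve("some-other-page", snakelets))
```

Under those assumptions, a request for "pdf" reaches the PDF handler and anything unrecognized falls through to the catch-all entry.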

The PDF class itself has only two methods on it: the required “serve” method, and a makePDF method that “serve” calls to generate the PDF.


    #creates a PDF version of the site
    class PDF(Blog):
        def serve(self, request, response):
            response.setContentType("application/pdf")
            out = response.getOutput()
            self.makePDF(out)

This, again, is fairly simple. Except for the “self.makePDF()” method, which I’ll get to in a bit, there isn’t anything really special here. I’ve made PDF a descendant of my Blog class, and I’ve overridden the Blog class’s “serve” method. It sets the content type to application/pdf so that Snakelets will tell the browser to expect a PDF document. Then it gets the response output stream, so that I can write directly to Snakelets’ version of the page, and sends that output stream to makePDF.
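Because “serve” only touches two methods on the response, it can be exercised without a running server. The sketch below is a stand-in built on assumptions, not part of Snakelets: `FakeResponse` is a hypothetical stub that mimics only `setContentType` and `getOutput`, the Blog parent class is left out, and `makePDF` is replaced with a placeholder that writes a PDF header line.

```python
import io

class FakeResponse:
    """Hypothetical stub mimicking the two response methods serve() uses."""
    def __init__(self):
        self.content_type = None
        self._out = io.BytesIO()

    def setContentType(self, content_type):
        self.content_type = content_type

    def getOutput(self):
        return self._out

class PDF:
    """Trimmed-down version of the PDF snakelet, without the Blog parent."""
    def serve(self, request, response):
        response.setContentType("application/pdf")
        out = response.getOutput()
        self.makePDF(out)

    def makePDF(self, destination):
        # Placeholder: the real method builds the document with ReportLab.
        destination.write(b"%PDF-1.4\n")

response = FakeResponse()
PDF().serve(None, response)
print(response.content_type)
```

The same stub could be handed to the real class to capture the generated bytes in memory instead of sending them to a browser.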

That’s all standard Snakelets. The second step is to generate a PDF version of the blog. First, I need to import the necessary features from ReportLab. I’m making heavy use of Platypus, which is ReportLab Toolkit’s “easy” version.

    from reportlab.platypus import Paragraph, SimpleDocTemplate, KeepTogether
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.lib.units import inch
    from reportlab.lib.pagesizes import letter

The SimpleDocTemplate is for generating very simple PDF documents (it probably won’t suffice for the multi-column format I’m going to need in the end). The other imports are there because I’m going to want to keep some paragraphs together (KeepTogether), work in inches (inch), and generate letter-size pages (letter).
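ReportLab measures everything in points, 72 to the inch, so the inch and letter imports are just numbers. The arithmetic below restates that without importing ReportLab; `INCH` and `LETTER` are local stand-ins named for this sketch, set to the values I understand `reportlab.lib.units.inch` and `reportlab.lib.pagesizes.letter` to hold.

```python
# Local stand-ins for ReportLab's constants (assumed values, in points).
INCH = 72.0
LETTER = (8.5 * INCH, 11 * INCH)   # (width, height) of a letter page

# A paragraph spacing of .04 inches, expressed in points.
space_after = INCH * 0.04

print(LETTER)
print(round(space_after, 2))
```

So a spacing of .04 inches works out to a little under three points.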

Here is the makePDF() method for the PDF class:


        def makePDF(self, destination):
            posts = []
            #create the basic page and stylesheet
            style = getSampleStyleSheet()
            pdf = SimpleDocTemplate(destination, pagesize=letter)
            #display the title and description of the blog
            title = self.getTitle()
            description = self.getDescription()
            posts.append(Paragraph(title, style["Heading1"]))
            posts.append(Paragraph(description, style["Normal"]))
            #each paragraph should have a little space afterwards
            #(set once; this modifies the shared "Normal" style)
            paraStyle = style["Normal"]
            paraStyle.spaceAfter = inch*.04
            #go through each blog post and display its title and content
            for postKey in self.app.posts.posts:
                #get the post
                post = self.app.posts.posts[postKey]
                body = post.body
                title = post.title
                #the parts of the post will go into items list
                items = []
                headline = Paragraph(title, style["Heading2"])
                items.append(headline)
                for paragraph in body.split("\n"):
                    #Python 2: drop any non-ASCII bytes from the text
                    para = paragraph.decode("ascii", "ignore")
                    para = Paragraph(para, paraStyle)
                    items.append(para)
                #a post should not break across pages
                item = KeepTogether(items)
                posts.append(item)
            #write the finished document to the destination stream
            pdf.build(posts)

It loops through each blog post; for each post it adds the title as a headline, then loops through the body line by line and adds each line as a paragraph with .04 inches of space after it. The post’s items are wrapped in KeepTogether so that a post doesn’t break across pages. The document is built directly onto the output stream that Snakelets gave it, and that’s it. Now when I go to http://george.local:9080/blog/pdf I get a PDF version of the blog.
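One thing the loop above does not handle: Platypus treats Paragraph text as XML-like markup, so a post containing a bare “&” or “<” can be misread or rejected, and blank lines in a post become empty paragraphs. Here is a minimal sketch of a pre-processing step, written for Python 3 and using only the standard library. `paragraph_texts` is a hypothetical helper, not part of ReportLab or of the blog code.

```python
from xml.sax.saxutils import escape

def paragraph_texts(body):
    """Split a post body into markup-safe strings, skipping blank lines."""
    texts = []
    for line in body.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip blank lines rather than emit empty paragraphs
        texts.append(escape(line))  # escapes &, < and > as XML entities
    return texts

for text in paragraph_texts("Fish & chips\n\n1 < 2"):
    print(text)
```

Each returned string could then be passed to Paragraph along with paraStyle, in place of the raw line.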
