Python PDF generation with Snakelets
In the process of converting my web site from a FileMaker hack to a Django hack, I’m going to need dynamic PDF generation. The Dining After Midnight site desperately needs an update, but one of its features is a three-fold PDF of late-night restaurants.
A Google search indicated that the tool I need to use is ReportLab Toolkit. None of the examples looked easy, but they looked easier than anything else.
What I really want to be able to do is generate PDFs using a Django template. I broke this task down into two steps. One step was to be able to create a PDF in Python. The other step was to be able to embed Python into Django templates.
This article is about the first step. As a test of being able to just plain create a PDF file and display it dynamically over the web, I decided to remove Django entirely from the problem, and use Snakelets. I decided to add a PDF option to my Quick & Dirty Snakelets “blog”. Snakelets is a great way to test programming ideas, and in this case, once I got past the ReportLab examples, it turned out to be very easy.
The first step was to add a new URL to the list of snakelets in the blog app’s __init__.py:
    snakelets = {
        "pdf": blog.PDF,
        "index.sn": blog.Blog,
        "*": blog.Blog,
    }
This is the same as before, except that I’ve added a line telling it to call the PDF class in the blog.py file, when I ask for /blog/pdf/.
The PDF class itself has only two methods on it: the required “serve” method, and a makePDF method that “serve” calls to generate the PDF.
    #creates a PDF version of the site
    class PDF(Blog):

        def serve(self, request, response):
            response.setContentType("application/pdf")
            out = response.getOutput()
            self.makePDF(out)
This, again, is fairly simple. Except for the “self.makePDF()” method, which I’ll get to in a bit, there isn’t anything really special here. I’ve made PDF a descendant of my Blog class, and I’ve overridden the Blog class’s “serve” method. It sets the content type to application/pdf so that Snakelets will tell the browser to expect a PDF document; it gets the response output stream so that I can write directly to Snakelets’ version of the page; and then it sends that output stream to makePDF.
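Because makePDF only needs something it can write to, the handler can be exercised without a running server. The sketch below is a stand-in built on assumptions, not Snakelets code: FakeResponse is a hypothetical class that only mimics the two response methods used above (setContentType and getOutput), and make_pdf writes placeholder bytes where the real method would call ReportLab.

```python
import io


class FakeResponse:
    """Hypothetical stand-in for the Snakelets response object (not the real class)."""

    def __init__(self):
        self.content_type = None
        self._out = io.BytesIO()

    def setContentType(self, content_type):
        self.content_type = content_type

    def getOutput(self):
        return self._out


def make_pdf(destination):
    # Placeholder for the real makePDF: anything with a write() method will do.
    destination.write(b"%PDF-placeholder")


def serve(response):
    # Same order as the serve method above: set the type, get the stream, write.
    response.setContentType("application/pdf")
    make_pdf(response.getOutput())


response = FakeResponse()
serve(response)
```

After the call, response.content_type holds the MIME type and the in-memory stream holds whatever the writer produced, which is all the real serve method relies on.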
That’s all standard Snakelets. The second step is to generate a PDF version of the blog. First, I need to import the necessary features from ReportLab. I’m making heavy use of Platypus, which is ReportLab Toolkit’s “easy” version.
    from reportlab.platypus import Paragraph, SimpleDocTemplate, KeepTogether
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.lib.units import inch
    from reportlab.lib.pagesizes import letter
The SimpleDocTemplate is for generating very simple PDF documents (it probably won’t suffice for the multi-column format I’m going to need in the end). I’m going to want to keep some paragraphs together, I’ll be working in inches, and generating letter-size pages.
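ReportLab measures lengths in points, 72 to the inch, so the inch and letter imports are really just numbers. The sketch below uses local stand-ins so that it runs without ReportLab installed: INCH and LETTER are names made up for this example, set to the values that reportlab.lib.units.inch and reportlab.lib.pagesizes.letter are assumed to have.

```python
# Stand-ins for reportlab.lib.units.inch and reportlab.lib.pagesizes.letter;
# assumed values, defined locally so the example has no dependencies.
INCH = 72.0                       # points per inch
LETTER = (8.5 * INCH, 11 * INCH)  # (width, height) in points

# The .04-inch paragraph spacing used in makePDF, converted to points.
space_after = INCH * .04

print("letter page: %.0f x %.0f points" % LETTER)
print("space after each paragraph: %.2f points" % space_after)
```

So inch*.04 works out to a little under three points, which is a small gap.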
Here is the makePDF() method for the PDF class:
        def makePDF(self, destination):
            posts = []
            #create the basic page and stylesheet
            style = getSampleStyleSheet()
            pdf = SimpleDocTemplate(destination, pagesize=letter)
            #display the title and description of the blog
            title = self.getTitle()
            description = self.getDescription()
            posts.append(Paragraph(title, style["Heading1"]))
            posts.append(Paragraph(description, style["Normal"]))
            #go through each blog post and display its title and content
            for postKey in self.app.posts.posts:
                #get the post
                post = self.app.posts.posts[postKey]
                body = post.body
                title = post.title
                #each paragraph should have a little space afterwards
                paraStyle = style["Normal"]
                paraStyle.spaceAfter = inch*.04
                #the parts of the post will go into items list
                items = []
                headline = Paragraph(title, style["Heading2"])
                items.append(headline)
                for paragraph in body.split("\n"):
                    para = paragraph.decode("ascii", "ignore")
                    para = Paragraph(para, paraStyle)
                    items.append(para)
                #a post should not break across pages
                item = KeepTogether(items)
                posts.append(item)
            pdf.build(posts)
            return
It loops through each blog post; for each blog post it adds the title as a headline. Then it loops through each paragraph and adds each paragraph with .04 inches of space after it. It writes the PDF document directly to the output stream that Snakelets gave it, and that’s it. Now when I go to http://george.local:9080/blog/pdf I get a PDF version of the blog.
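The inner loop of makePDF is Python 2: the post body is a byte string, and decode("ascii", "ignore") drops anything that is not ASCII. A rough Python 3 equivalent of that text preparation might look like the sketch below. clean_paragraphs is a hypothetical helper, not part of the blog code. Skipping blank lines and escaping markup characters are additions in this sketch, made on the assumption that Platypus reads Paragraph text as XML-like markup, so a bare & or < in a post could otherwise trip it up.

```python
from xml.sax.saxutils import escape


def clean_paragraphs(body):
    """Split a post body into strings suitable for handing to Paragraph."""
    paragraphs = []
    for line in body.split("\n"):
        # Python 3 spelling of the original decode("ascii", "ignore")
        text = line.encode("ascii", "ignore").decode("ascii").strip()
        if not text:
            continue  # assumption: blank lines should not become empty paragraphs
        # Escape &, < and > so they are read as text, not markup
        paragraphs.append(escape(text))
    return paragraphs


sample = "Fish & chips\n\nCafé open < 3 a.m."
print(clean_paragraphs(sample))
```

Each returned string can then be wrapped as Paragraph(text, paraStyle), just as the loop in makePDF does.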
- Snakelets
- “Snakelets is a very simple-to-use Python web application server. It provides a threaded web server, Ypages (Python HTML template language) and Snakelets: code-centric page request handlers. Snakelet’s focus is to make the creation of dynamic web sites as quick and easy as possible.”
- ReportLab Toolkit
- “The ReportLab Open Source PDF library is a proven industry-strength PDF generating solution, that you can use for meeting your requirements and deadlines in enterprise reporting systems.”
- Django
- “Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.” Oh, the sweet smell of pragmatism.
- Dining After Midnight
- There is nothing quite like the hunger you get at three in the morning when everyone else has gone to sleep. If you’re hanging with the late crowd in San Diego, come and see where you can time out for a bite after midnight!
