Adding links to PDF in Python
In Multi-column PDFs I went over how to add frames to a reportlab toolkit PDF document using Python. I did this for my San Diego Dining After Midnight web page. While it’s meant for printing, PDFs can contain hyperlinks, and with the growth of PDAs with quality screens, PDFs will probably become even more useful on the computer than on paper. So why not add links to the restaurants and to the map to get to the restaurant?
Adding a linked area to a PDF
There are two ways to make a link in reportlab. The first is by placing it on a rectangle on the page. For example, at the top of each page I have a header that contains the hostname for the Dining After Midnight web site. I create it using a function that draws the hostname in the upper right corner of the page.
    def addHeader(canvas, document):
        canvas.saveState()
        hostname = "DiningAfterMidnight.com"
        hostlink = "http://www." + hostname + "/"
        fontsize = 12
        fontname = 'Times-Roman'
        headerBottom = document.bottomMargin+document.height-document.topMargin
        # the link area extends a little below the baseline and up to the top of the text
        bottomLine = headerBottom - fontsize/4
        topLine = headerBottom + fontsize
        lineLength = document.width+document.leftMargin
        canvas.setFont(fontname,fontsize)
        canvas.drawRightString(lineLength, headerBottom, hostname)
        # the text ends at lineLength, so it starts one string-width to the left
        hostnamewidth = canvas.stringWidth(hostname)
        linkRect = (lineLength, bottomLine, lineLength-hostnamewidth, topLine)
        canvas.linkURL(hostlink, linkRect)
        # balance the saveState() call at the top of the function
        canvas.restoreState()
The lines that compute linkRect and call canvas.linkURL are the new code; they create a link to the site over the hostname text. If you’re familiar with HTML, this is not like HTML at all. In HTML, we specify what text the link “belongs to”. In PDF, we specify the rectangular portion of the page that the link occupies. If we want that rectangle to correspond to some specific text, we need to determine where that text is and how big it is.
Since the hostname is drawn on the page using drawRightString, we already know the lower right hand corner of the text. The width is available from the stringWidth method.
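The rectangle arithmetic can be pulled out into a small helper. This is only a sketch: the name link_rect_for_right_aligned_text and its parameters are hypothetical, not part of reportlab, and the numbers in the example are made up. The quarter-font-size allowance below the baseline is the same rough figure used in addHeader. The corners are listed left edge first here, which is a choice of this sketch.

```python
def link_rect_for_right_aligned_text(right_x, baseline_y, text_width, font_size):
    """Return (x1, y1, x2, y2) covering text whose right edge is at right_x.

    Hypothetical helper, not a reportlab function. text_width is assumed to
    come from something like canvas.stringWidth(text).
    """
    left_x = right_x - text_width
    bottom = baseline_y - font_size / 4  # rough allowance for descenders
    top = baseline_y + font_size         # rough allowance for the text height
    return (left_x, bottom, right_x, top)


# Example with made-up numbers: a 12 point string, 130 points wide,
# drawn so that it ends at x=540 on a baseline at y=720.
rect = link_rect_for_right_aligned_text(540, 720, 130, 12)
print(rect)
```

In addHeader, a tuple like this is what gets passed as the second argument to canvas.linkURL.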
Linking text in a PDF
The platypus extension to reportlab makes link creation sort of easier. Sort of, because there’s a bit of a gotcha if the linked text already contains other markup.
The way to mark some text in platypus as being linked is to surround it with the link tag. If you are familiar with HTML, the link tag is very similar to the a tag in HTML:
    <link href="http://www.hoboes.com/Mimsy/">Mimsy Were the Borogoves</link>
This, for example, modifies the Dining After Midnight PDF code to generate linked restaurant names and linked street addresses:
    name = restaurant.name
    name = name.replace('’', '<unichar name="RIGHT SINGLE QUOTATION MARK"/>')
    if restaurant.url:
        link = restaurant.url
        if link.live:
            name = '<link href="' + link.get_absolute_url() + '">' + name + '</link>'
    address = restaurant.address()
    address = address.replace('’', '<unichar name="RIGHT SINGLE QUOTATION MARK"/>')
    address = '<link href="' + restaurant.mapref() + '">' + address + '</link>'
    items.append(platypus.Paragraph(name, nameStyle))
    items.append(platypus.Paragraph(address, infoStyle))
It’s simple enough: it’s all text manipulation. The “restaurant” object contains its name, its address, and a link object. And it knows how to generate a map link (mapref) to its address.
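Since the link markup is plain string concatenation, it can be wrapped in a helper. This is a minimal sketch, assuming that paragraph text is parsed as XML-like markup, so that a bare & in a name or URL should be escaped. The function name linked, the sample restaurant name, and the example URL are all made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def linked(text, url):
    """Wrap plain text in platypus-style <link> markup.

    Hypothetical helper. It assumes `text` is plain text with no markup of
    its own yet, so that escaping it is safe.
    """
    # quoteattr escapes &, < and > in the URL and adds the surrounding quotes.
    return '<link href=%s>%s</link>' % (quoteattr(url), escape(text))


markup = linked("Tom & Jerry's Diner", "http://www.example.com/map?q=1&z=2")
print(markup)
```

Because the helper escapes its input, any markup such as the unichar replacement would have to be applied in a way that does not get escaped again; that ordering is left out of this sketch.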
The problem is that if you have any markup or special characters already in the address or title, those, and only those, linked texts will be underlined. To the reader, it appears that some of the links are underlined, and some aren’t, which really ends up looking as if some texts aren’t linked when they really are.
The reason is that platypus automatically underlines links if other markup appears in the link text, but not if it doesn’t. This includes adding unicode characters using platypus’s unichar markup. As far as I can tell, this doesn’t look like a bug. I have no idea why it’s there, though. I can’t find any mention of why it happens or even that it happens in the documentation.
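To see in advance which entries will be affected, one option is to check each link text for tags before wrapping it. This sketch relies only on the behaviour described above (a link gets underlined when its text contains other markup), which comes from observation rather than from the reportlab documentation. The function has_inner_markup and the sample names are hypothetical.

```python
import re

# Matches an XML-style tag such as <unichar .../> or <b>.
_TAG = re.compile(r'<[^>]+>')


def has_inner_markup(link_text):
    """True if the text that will go inside <link>...</link> contains a tag."""
    return bool(_TAG.search(link_text))


# Made-up names: the first contains unichar markup, the second is plain.
names = [
    'Example<unichar name="RIGHT SINGLE QUOTATION MARK"/>s Cafe',
    'Example Diner',
]
flagged = [n for n in names if has_inner_markup(n)]
print(flagged)
```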
I “fixed” it by commenting out the line “tx._canvas.line(t_off+x1, y, t_off+x2, y)” on line 464 of reportlab/platypus/paragraph.py:
    for x1,x2,link in xs.links:
        #tx._canvas.line(t_off+x1, y, t_off+x2, y)
        _doLink(tx, link, (t_off+x1, y, t_off+x2, yl))
    xs.links = []
This may end up disabling some underlining capabilities in platypus, but since I don’t use underlining it’s not a problem for me. Your mileage may vary.
- Dining After Midnight
- There is nothing quite like the hunger you get at three in the morning when everyone else has gone to sleep. If you’re hanging with the late crowd in San Diego, come and see where you can time out for a bite after midnight!
- ReportLab Toolkit
- “The ReportLab Open Source PDF library is a proven industry-strength PDF generating solution, that you can use for meeting your requirements and deadlines in enterprise reporting systems.”
