Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, PHP, and whatever else I happen to feel like hacking at.

Django 1.0 feedgenerator and unique IDs

Jerry Stratton, September 9, 2008

I’ll have a longer post on upgrading from Django 0.96.2 to Django 1.0 later, hopefully this weekend. But here’s a note on generating RSS feeds with a unique ID. The new version of feedgenerator.py in django/utils supports adding unique_id to items, but it still doesn’t check whether the unique ID is a permalink; it still assumes by default that it is.

For my purposes, this is easy to fix. Lines 244 and 245 are:

    if item['unique_id'] is not None:
        handler.addQuickElement(u"guid", item['unique_id'])

I’m going to make the assumption that I’m only passing in a link once. If I’m passing a link as unique_id, I won’t pass it in as link. So if link is not None, unique_id is not a permalink:

    if item['unique_id'] is not None:
        if item['link'] is not None:
            handler.addQuickElement(u"guid", item['unique_id'], {u"isPermaLink": "false"})
        else:
            handler.addQuickElement(u"guid", item['unique_id'])

Note that link is a required parameter for SyndicationFeed.add_item.
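To see what the patched branch produces without touching Django, here is a minimal standalone sketch that uses only the standard library. The helpers add_quick_element, add_guid, and render_guid are hypothetical names made up for this illustration; add_quick_element only approximates what Django’s handler.addQuickElement does, and the sample IDs and URLs are invented.

```python
import io
from xml.sax.saxutils import XMLGenerator


def add_quick_element(handler, name, contents=None, attrs=None):
    """Hypothetical stand-in for Django's handler.addQuickElement."""
    handler.startElement(name, attrs or {})
    if contents is not None:
        handler.characters(contents)
    handler.endElement(name)


def add_guid(handler, item):
    """Write <guid>, flagging it as a non-permalink when the item also has a link."""
    if item["unique_id"] is not None:
        if item["link"] is not None:
            # A separate link exists, so the unique ID is assumed not to be a URL.
            add_quick_element(handler, "guid", item["unique_id"], {"isPermaLink": "false"})
        else:
            # No separate link: leave the RSS default, where guid is a permalink.
            add_quick_element(handler, "guid", item["unique_id"])


def render_guid(item):
    """Render only the guid element for one item dict and return it as a string."""
    out = io.StringIO()
    add_guid(XMLGenerator(out, "utf-8"), item)
    return out.getvalue()


if __name__ == "__main__":
    print(render_guid({"unique_id": "post-42", "link": "http://example.com/42"}))
    print(render_guid({"unique_id": "http://example.com/42", "link": None}))
```

Because link is required by add_item, the first branch is the one that will normally run in a real feed; the else branch only matters if link is explicitly passed as None.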

In response to Django syndication feed guid: Django’s syndication feed makes it fairly easy to set up a feed from any object, but it uses the object’s link as the unique ID for that object. This doesn’t always work.

August 6, 2012: Fixing Django’s feed generator without hacking Django

I installed security update 1.4.1 for Django yesterday, and when I went to hack feedgenerator.py I thought I’d take another look at somehow subclassing or otherwise overriding the offending code. It’s been a long time since I wrote that hack and maybe I’ve learned enough about Django and/or Python to stop having to hack Django’s source every time I upgrade.

The offending code is in add_item_elements in django.utils.feedgenerator.Rss201rev2Feed. When creating a feed, however, I don’t subclass Rss201rev2Feed, I subclass django.contrib.syndication.views.Feed. In fact, all of my feeds inherit from a base subclass called NSFeed.

Feed uses Rss201rev2Feed by way of DefaultGenerator. It’s just a property, feed_type, on the Feed class. So I overrode the feed_type property with my own subclass of Rss201rev2Feed and was able to override add_item_elements. I tested it by just putting in one line, “pass”, and checking the feed contents; it was just a bunch of empty items, as hoped for. Replacing “pass” with a “super” call to get the parent method’s functionality restored the feed.

Unfortunately, add_item_elements does a lot of work—it adds everything via a series of if/then statements. It uses an XMLGenerator subclass—the “handler” variable—to add elements to itself depending only on the dict entries in the “item” variable. My first thought was to let the parent add_item_elements do its work and then just add the isPermaLink attribute to the newly-added guid element. As far as I can tell, however, XMLGenerator is focused purely on XML generation, with no methods for XML modifications.

Fortunately, guid is an optional element. If the item dict has no unique ID, add_item_elements doesn’t create one. So I can add the guid element to handler myself, with isPermaLink set to false, and then set the item’s unique_id to None before passing both through to the parent. By then the item already has a guid element, and the parent doesn’t add another.

Note that as far as I can tell, none of these classes are documented beyond their signature, so they’re likely subject to change in any Django revision.
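The override pattern described above can be sketched without Django installed. Everything here is an assumption for illustration: StandInRssFeed is a hypothetical stand-in that only imitates the way Rss201rev2Feed.add_item_elements writes an element for each dict entry that is set, and NonPermalinkGuidFeed and render_item are invented names. It is a sketch of the technique, not Django’s code.

```python
import io
from xml.sax.saxutils import XMLGenerator


class StandInRssFeed:
    """Hypothetical stand-in for django.utils.feedgenerator.Rss201rev2Feed."""

    def add_item_elements(self, handler, item):
        # Imitates the parent: an element is written only if its dict entry is set.
        for name, key in (("title", "title"), ("link", "link"), ("guid", "unique_id")):
            if item[key] is not None:
                handler.startElement(name, {})
                handler.characters(item[key])
                handler.endElement(name)


class NonPermalinkGuidFeed(StandInRssFeed):
    """Writes guid itself with isPermaLink="false", then hides it from the parent."""

    def add_item_elements(self, handler, item):
        if item["unique_id"] is not None:
            handler.startElement("guid", {"isPermaLink": "false"})
            handler.characters(item["unique_id"])
            handler.endElement("guid")
            # Work on a copy with unique_id cleared so the parent skips guid
            # and the caller's dict is left untouched.
            item = dict(item, unique_id=None)
        super().add_item_elements(handler, item)


def render_item(feed, item):
    """Render a single <item> with the given feed object and return the XML."""
    out = io.StringIO()
    handler = XMLGenerator(out, "utf-8")
    handler.startElement("item", {})
    feed.add_item_elements(handler, item)
    handler.endElement("item")
    return out.getvalue()


if __name__ == "__main__":
    sample = {
        "title": "Hello",
        "link": "http://example.com/hello",
        "unique_id": "tag:example.com,2012:hello",
    }
    print(render_item(NonPermalinkGuidFeed(), sample))
```

In the real thing, the subclass would inherit from Rss201rev2Feed and be assigned to feed_type on the Feed subclass. Copying the dict rather than setting unique_id to None in place is a choice made for this sketch; it keeps the override from changing data that other code may still read.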

February 8, 2011: feedgenerator potentially improved

Hey! I was browsing my referrers today and noticed I was getting hits from Django’s bug tracker. Looks like this hack won’t be necessary to create valid RSS feeds as of some yet-to-be-determined version after 1.3.

andreiko’s solution involves adding a new property to a Feed; the feed object will use that property to determine whether the guid provided is a permalink or not.

Here’s the sample:

    class Rss(Feed):
        title = "Chicagocrime.org site news"
        link = "http://chicagocrime.org/rss/"
        description = "Updates on changes and additions to chicagocrime.org."
        guid_is_permalink = False

Looks like a great solution. Once this issue is fixed, I won’t have to hack the Django source when new versions come out—this is the last remaining hack that I use.