Mimsy Were the Borogoves

Object-oriented HTML with Python

Jerry Stratton, March 16, 2005

When I chose to write the character management software for my role-playing game in Python instead of (most likely) PHP, I also had to accept that I could no longer intersperse HTML and programming code so easily.

Rather than work up something PHP-like for Python, I decided to try and take advantage of the features that drew me to Python for this project: objects, inheritance, and scope.

Good HTML is hierarchical; HTML elements contain other HTML elements and do not overlap them.

Hello World

Here is your basic hello, world! script. It creates a HEAD and a BODY part for the web page, and then combines them into an HTML part.

  • import makeHTML
  • pageTitle = 'Hello World'
  • pageHead = makeHTML.part('head')
  • pageHead.addPart('title', content=pageTitle)
  • pageBody = makeHTML.part('body')
  • pageBody.addPart('h1', content=pageTitle)
  • pageBody.addPart('p', content="Oh no, not again!")
  • pageBody.addPart('hr')
  • fullPage = makeHTML.part('html')
  • fullPage.addPiece(pageHead)
  • fullPage.addPiece(pageBody)
  • print fullPage.make()

This will produce the HTML code:

  • <html>
    • <head><title>Hello World</title></head>
    • <body>
      • <h1>Hello World</h1>
      • <p>Oh no, not again!</p>
      • <hr />
    • </body>
  • </html>

You can see a couple of things that I required of my HTML generator:

  • Correctly-indented code: this is not just a matter of making “view source” look nice. It is also a significant aid in tracking down bugs.
  • Empty tags close efficiently and in a manner that makes it easy to read the code.
  • Tags that contain only one item do not needlessly appear on multiple lines.

Most of this is really my personal preference, but I like my personal preferences. That’s part of why I write my own scripts.

Now, because classes can inherit, we can make this easier still: some parts are needed on every page, so we can subclass those parts of the page.

  • import makeHTML
  • pageTitle = 'Hello World'
  • pageHead = makeHTML.head(pageTitle)
  • pageBody = makeHTML.body(pageTitle)
  • pageBody.addPart(content="Oh no, not again!")
  • pageBody.addPart('hr')
  • fullPage = makeHTML.page([pageHead, pageBody])
  • fullPage.make()

This displays:

  • Content-type: text/html
  • <html>
    • <head><title>Hello World</title></head>
    • <body>
      • <h1>Hello World</h1>
      • <p>Oh no, not again!</p>
      • <hr />
    • </body>
  • </html>

Feeding Mimsy

Here’s an example of a more realistic but still simple application that takes an RSS feed and displays it as a web page. It assumes that you have Mark Nottingham’s RSS.py to read the RSS feed.

  • import makeHTML
  • import RSS
  • pageTitle = 'Mimsy Were the Borogoves'
  • mimsyLink = makeHTML.part('a', content="Mimsy Were the Borogoves", attributes={'href':'http://www.hoboes.com/Mimsy/'})
  • pageHead = makeHTML.head(pageTitle)
  • pageHead.addPiece(makeHTML.styleSheet('stylesheet'))
  • pageBody = makeHTML.body(pageTitle)
  • pageBody.addPart(content="This is an RSS feed from " + mimsyLink.make() + ".")
  • pageBody.addPiece(makeHTML.headline("Latest Headlines from Mimsy"))
  • feed = RSS.TrackingChannel()
  • feed.parse('http://www.hoboes.com/Mimsy/?RSS')
  • entries = makeHTML.part("dl")
  • for article in feed.listItems():
    • articleURL = article[0]
    • articleData = feed.getItem(article)
    • articleTitle = articleData.get((RSS.ns.rss10,'title'))
    • articleDescription = articleData.get((RSS.ns.rss10,'description'))
    • articleLink = makeHTML.part('a', content=articleTitle, attributes={'href':articleURL})
    • entryTitle = makeHTML.part("dt", content=articleLink)
    • entryText = makeHTML.part("dd", content=articleDescription)
    • entries.addPieces([entryTitle, entryText])
  • pageBody.addPiece(entries)
  • fullPage = makeHTML.page([pageHead, pageBody])
  • fullPage.make()

makeHTML.py in depth

How does this work? The makeHTML.py file defines a class called "part", which represents the most basic HTML tag. A part knows how to add IDs, attributes, and content to itself when it is made. It also knows how to add other parts to itself, and how to compile itself and its parts into HTML.

A ‘page’ is a subclass of a part that is, at its most basic, an ‘html’ tag. It inherits all of the abilities of a part, and enhances two portions: it knows how to accept parts when it is made, and it knows that when it compiles itself, it should also print itself out with the text “Content-type: text/html” at the head.
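The makeHTML.py listing in this article does not include the page, head, or body classes, so here is only a rough guess at how such subclasses might look. It is written for Python 3, and the names Part, Head, Body, Page, and add_part are stand-ins invented for this sketch rather than the real makeHTML names. The stand-in Part ignores attributes and indentation, and the blank line printed after the header is the usual CGI convention, which I am assuming applies here.

```python
class Part:
    """Simplified stand-in for the article's part class: no attributes, no indentation."""

    def __init__(self, code="p", content=None):
        self.code = code
        self.pieces = []
        if isinstance(content, list):
            self.pieces.extend(content)
        elif content is not None:
            self.pieces.append(content)

    def add_part(self, code="p", content=None):
        self.pieces.append(Part(code, content))

    def make(self):
        if not self.pieces:
            return "<" + self.code + " />"
        inner = "".join(p if isinstance(p, str) else p.make() for p in self.pieces)
        return "<" + self.code + ">" + inner + "</" + self.code + ">"


class Head(Part):
    """Hypothetical: a head element that starts out with a title."""

    def __init__(self, title):
        Part.__init__(self, code="head")
        self.add_part("title", content=title)


class Body(Part):
    """Hypothetical: a body element that starts out with an h1 headline."""

    def __init__(self, title):
        Part.__init__(self, code="body")
        self.add_part("h1", content=title)


class Page(Part):
    """Hypothetical: an html element that accepts its parts when it is made."""

    def __init__(self, pieces=None):
        Part.__init__(self, code="html", content=pieces)

    def make(self):
        # Compile like any other part, then print the result after a CGI header.
        html = Part.make(self)
        print("Content-type: text/html")
        print()  # assumed: CGI expects a blank line between headers and body
        print(html)
        return html  # returned as well, purely as a convenience for this sketch


body = Body("Hello World")
body.add_part(content="Oh no, not again!")
body.add_part("hr")
html = Page([Head("Hello World"), body]).make()
```

The point of the sketch is that each subclass only overrides the one or two things it does differently; everything else comes from the base class.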

The big piece of this is the ‘make’ method.

  • class part:
    • def __init__(self, code="p", content=None, style=None, id=None, attributes=None):
      • self.style = style
      • self.id=id
      • self.pieces = []
      • self.code = code
      • if attributes is None:
        • self.attributes = {}
      • else:
        • self.attributes = attributes
      • if isinstance(content, list):
        • self.addPieces(content)
      • elif content is not None:
        • self.addPiece(content)
    • def addPiece(self, thePart):
      • self.pieces.append(thePart)
    • def addPieces(self, theParts):
      • for part in theParts:
        • self.addPiece(part)
    • def addAttribute(self, attributename, attributevalue):
      • self.attributes[attributename] = attributevalue
    • def addPart(self, code='p', content=None, style=None, id=None, attributes=None):
      • newPart = part(code, content, style, id, attributes)
      • self.addPiece(newPart)
    • def make(self, tab="\t"):
      • startHTML = '<' + self.code
      • if (self.attributes):
        • for attribute in self.attributes:
          • content = self.attributes[attribute]
          • if content is None:
            • startHTML += ' ' + attribute
          • else:
            • startHTML += ' ' + attribute + '="' + str(content) + '"'
      • if (self.style):
        • startHTML += ' class="' + self.style + '"'
      • if (self.id):
        • startHTML += ' id="' + self.id + '"'
      • if self.pieces:
        • startHTML += '>'
        • partItems = [startHTML]
        • if len(self.pieces) > 1:
          • sep = "\n" + tab
          • finalSep = sep[:-1]
          • newtab = tab + "\t"
        • else:
          • newtab = tab
          • sep = ""
          • finalSep = ""
        • for piece in self.pieces:
          • if isinstance(piece, str):
            • partItems.append(piece)
          • elif isinstance(piece, (int, float)):
            • partItems.append(str(piece))
          • elif piece is None:
            • partItems.append("")
          • else:
            • partItems.append(piece.make(newtab))
        • code = sep.join(partItems)
        • code += finalSep + '</' + self.code + '>'
        • return code
      • else:
        • startHTML += ' />'
        • return startHTML

This class is made up of six methods: the initialization method, addPiece, addPieces, addAttribute, addPart, and make.

initialization

The initialization method accepts the tag (here called “code”), which defaults to a paragraph, some content text, a style (a CSS class), a CSS ID, and some attributes. The attributes must be a dictionary, of the form {'attributename':'attributevalue'}. For example:

  • mimsyLink = makeHTML.part('a', content="Mimsy Were the Borogoves", attributes={'href':'http://www.hoboes.com/Mimsy/'})

This will create a link (“a” tag) surrounding the phrase “Mimsy Were the Borogoves”, with the href attribute of the URL for this web site.

addPiece

The addPiece method is pretty simple: it does nothing except add what it is sent to the list of items that the part contains. Normally, you’ll add either text or another part.

addPieces

The addPieces method is just as simple: it loops through the list and adds each item in the list to this part, one by one. It simply calls “addPiece” for each item to do this.

addAttribute

The addAttribute method adds another named attribute to the part. Attributes are things like the “src” in the IMG tag, or “href” in the A tag.
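To see what happens to those attributes later, here is a small Python 3 sketch that mirrors the attribute, class, and id steps of the make method as a standalone function. The function name start_tag and the parameter name element_id are made up for this illustration; in makeHTML.py this logic lives inside make itself.

```python
def start_tag(code, attributes=None, style=None, element_id=None):
    """Build the opening of a tag (without the closing '>'), following make's steps."""
    html = "<" + code
    # Python 3.7+ dictionaries keep insertion order; older versions may reorder attributes.
    for name, value in (attributes or {}).items():
        if value is None:
            html += " " + name  # bare attribute, such as "checked"
        else:
            html += " " + name + '="' + str(value) + '"'
    if style:
        html += ' class="' + style + '"'
    if element_id:
        html += ' id="' + element_id + '"'
    return html


print(start_tag("a", {"href": "http://www.hoboes.com/Mimsy/"}))
print(start_tag("input", {"type": "checkbox", "checked": None}, style="opt"))
```

An attribute whose value is None comes out as a bare name, which is how valueless HTML attributes can be expressed in the dictionary.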

addPart

The addPart method is a quick way of adding other parts to this part. It accepts the tag name (which defaults to a paragraph), the content, style, id, and attributes dictionary. It has the same syntax as the initialization method.

make

The parts assemble themselves into HTML using the make method. The make method accepts an indentation string (a tab by default), but otherwise gets all of its information from the object. It loops through each piece that this part contains and, if the piece is a string or a number, adds its text to the current HTML; a piece of None adds nothing. If the piece is another part, it calls the make method on that part and appends the result. A part with no pieces closes itself as an empty tag.
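The part of make that decides between one line and several lines is easy to miss in the full listing, so here it is pulled out as a standalone Python 3 function. The name wrap_children is invented for this sketch, and it takes children that have already been turned into strings instead of walking a list of parts.

```python
def wrap_children(code, rendered, tab="\t"):
    """Wrap a list of already-rendered child strings in <code>...</code>, as make does."""
    if not rendered:
        # No pieces at all: close the tag immediately.
        return "<" + code + " />"
    if len(rendered) > 1:
        # Several pieces: each goes on its own line, one indent level in.
        sep = "\n" + tab
        # The closing tag sits one level back out, so drop the last tab.
        final_sep = sep[:-1]
    else:
        # A single piece stays on the same line as its tags.
        sep = ""
        final_sep = ""
    return sep.join(["<" + code + ">"] + rendered) + final_sep + "</" + code + ">"


print(wrap_children("hr", []))
print(wrap_children("p", ["Oh no, not again!"]))
print(wrap_children("body", ["<h1>Hello World</h1>", "<hr />"]))
```

These three cases correspond to the three requirements listed near the top of the article: empty tags close themselves, single-item tags stay on one line, and everything else is indented.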

Special children

For the most part, the children of part make it slightly easier and somewhat more readable to do things like create headlines and add style sheets. The potential, however, is for much greater automation of HTML part creation. For example, there is the linkedList class:

  • class linkedList(part):
    • def __init__(self, links=None, outer = "ul", inner="li", oclass=None, iclass=None, id=None, attributes=None):
      • part.__init__(self, code=outer, style=oclass, id=id, attributes=attributes)
      • self.innercode = inner
      • self.innerstyle = iclass
      • if isinstance(links, list):
        • self.addLinks(links)
    • def addLink(self, link):
      • [url, name] = link
      • link = part("a", attributes={"href": url}, content=name)
      • listitem = part(self.innercode, content=link, style=self.innerstyle)
      • self.pieces.append(listitem)
    • def addLinks(self, links):
      • for link in links:
        • self.addLink(link)

This inherits everything from the basic part but adds a distinction between outer code and inner code (defaulting to an unordered list and a list item). It also accepts a list of two-item lists, each holding the URL of the item and the name of the item. For example:

  • import makeHTML
  • import RSS
  • feed = RSS.TrackingChannel()
  • feed.parse('http://www.hoboes.com/Mimsy/?RSS')
  • articles = []
  • for article in feed.listItems():
    • articleURL = article[0]
    • articleData = feed.getItem(article)
    • articleTitle = articleData.get((RSS.ns.rss10,'title'))
    • articles.append([articleURL, articleTitle])
  • urlList = makeHTML.linkedList(articles, outer="ol")
  • print urlList.make()

This will result in the HTML:

  • <ol>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=125">Code in HTML</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=84">Berlin Stories</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=96">Criminal Profiles</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=92">End of Society</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=91">Prisoners in Times</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=87">FireBlade Reviews</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=78">Mistress of Mistresses</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=86">Hunter Thompson</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=83">Racist Forfeiture</a></li>
    • <li><a href="http://www.hoboes.com/Mimsy/?ART=81">MP3tunes.com</a></li>
  • </ol>

RSS part

There is a similar class for creating tables, and even a class for creating tables by column instead of by row. It is easy to automate the creation of repeated web page parts by extending the basic part into special classes. For example, we’ve been using the RSS feed separately from our HTML creation and merging them after grabbing the data. But chances are any application we create is going to display the RSS data in similar ways. There’s no reason we can’t create something like this to handle that:

  • import makeHTML
  • import RSS
  • class RSSDisplay(makeHTML.part):
    • def __init__(self, rssSource, id=None, style=None):
      • makeHTML.part.__init__(self, code="dl", id=id, style=style)
      • self.readFeed(rssSource)
    • def readFeed(self, rssSource):
      • feed = RSS.TrackingChannel()
      • if feed.parse(rssSource):
        • for article in feed.listItems():
          • articleURL = article[0]
          • articleData = feed.getItem(article)
          • articleTitle = articleData.get((RSS.ns.rss10,'title'))
          • articleDescription = articleData.get((RSS.ns.rss10,'description'))
          • articleLink = makeHTML.part('a', content=articleTitle, attributes={'href':articleURL})
          • entryTitle = makeHTML.part("dt", content=articleLink)
          • entryText = makeHTML.part("dd", content=articleDescription)
          • self.addPieces([entryTitle, entryText])
      • else:
        • self.addPart("dt", content="Unable to load rss feed " + rssSource)

Go ahead and call this with:

  • mimsyFeed = RSSDisplay('http://www.hoboes.com/Mimsy/?RSS')
  • print mimsyFeed.make()

And there you have good, readable HTML for integrating with other web services or local applications.

There was a problem with makePart where it ignored attributes on elements that didn’t have content. I’ve fixed it in the archive and in the example above.
