Mimsy Were the Borogoves

Hacks: Articles about programming in Python, Perl, PHP, and whatever else I happen to feel like hacking at.

Preflighting blog comments in the Pythonista share screen

Jerry Stratton, August 22, 2018

[Image: Pythonista share screen]

Pythonista’s share screen lets you run any script you have stored locally; you can also bring up the console, and add commonly-used scripts to a shortcuts area.

The more I use it, the more useful Pythonista becomes at customizing my iPad and iPhone experience. I don’t do a whole lot of commenting on other people’s blogs. One I do comment on occasionally is the Ace of Spades HQ, and it has a somewhat… fragile…[1] commenting system. Specifically, (a) there’s a disconnect between how it handles UTF-8 characters and what it tells browsers it can handle, which means any diacriticals or smart quotes will cause it to fail spectacularly; and (b) it uses a primitive BBCode-like means of italicizing that makes it very possible to “end up in the barrel” by not closing your tags. Not closing your tags means every succeeding comment will be italicized (or bolded, or underlined, or so on) too.

I do a lot of my commenting on the iPhone or iPad, where it is very easy to get screwed over by either autocorrect or fat fingers.

It occurred to me yesterday that this was a great opportunity to start using Pythonista’s sharing extension. You can turn on Pythonista in the sharing window, and it will allow you to run scripts.[2] Normally, this means running scripts on a text document that you’re “sharing” with Pythonista, but it also has access to the clipboard and whatever text is currently selected.

Having written a script to ensure that I don’t post non-ASCII characters and that I don’t forget to close my brackets, I realized I could make formatting a lot easier by using a simple Markdown converter.


    from unicodedata import normalize, name
    from codecs import register_error
    from re import sub
    import clipboard
    import appex

    # codec error handler: text is the UnicodeEncodeError raised by .encode
    def ascify(text):
        characters = text.object[text.start:text.end]
        newValue = ''
        for character in characters:
            # name() raises ValueError for unnamed characters unless given a default
            characterName = name(character, '')
            if character in '’‘':
                character = "'"
            elif character in '”“':
                character = '"'
            elif character == '—':
                character = '--'
            elif characterName.startswith('COMBINING '):
                character = ''
            else:
                # unknown character: report it and drop it
                print(character, ord(character), characterName)
                character = ''
            newValue += character
        # return the replacement text and the position at which to resume encoding
        return newValue, text.end
    register_error('ascify', ascify)

    def markdown2BB(text):
        newText = ''
        for line in text.splitlines():
            # quoted lines are italicized
            if line.startswith('> '):
                line = '[i]' + line[2:] + '[/i]'
            # bold must be converted before italics, since both use asterisks
            line = sub(r'\*\*([^*]+)\*\*', r'[b]\1[/b]', line)
            line = sub(r'\*([^*]+)\*', r'[i]\1[/i]', line)
            line = sub(r'~~([^~]+)~~', r'[s]\1[/s]', line)
            line = sub(r'__([^_]+)__', r'[u]\1[/u]', line)
            newText += line + "\n"
        newText = newText.strip()
        return newText

    def closeBBCode(text):
        for code in 'ibus':
            start = '[' + code + ']'
            end = '[/' + code + ']'
            starts = text.count(start)
            ends = text.count(end)
            # append one end code for every unmatched start code
            while ends < starts:
                text += end
                ends += 1
        return text

    # prefer shared or selected text; fall back to the clipboard
    text = appex.get_text()
    if not text:
        text = clipboard.get()
    if text:
        fixed = normalize('NFKD', text).encode('ascii', 'ascify')
        fixed = fixed.decode('ascii')
        fixed = markdown2BB(fixed)
        fixed = closeBBCode(fixed)
        clipboard.set(fixed)
        #print(fixed)
    appex.finish()

[Image: iPad selection highlight bar]

This does three things:

  1. It uses normalize and encode to turn any non-ASCII characters into their ASCII equivalents.
  2. It converts simple Markdown to BBCode-style formatting.
  3. It makes sure that any manual formatting is closed.

In the first case, it sets up a function called ascify that will be called every time .encode runs into a non-ASCII character. The function doesn’t need to worry about most of them: normalize, in NFKD form, has already degraded ‘é’ to an ‘e’ followed by a combining accent, and ‘…’ to ‘...’. It does not, however, know that smart quotes need to be converted to their straight equivalents, or that em dashes need to be turned into double dashes. Because .encode still sends the function the COMBINING character left over from each diacritical, the function drops any character whose Unicode name begins with COMBINING.

If the function is sent any character it doesn’t know what to do with, it prints out the character, the numerical code for the character, and its Unicode name, and then drops the character from the text.
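To see why the handler only has to deal with a handful of characters, it can help to look at what normalize hands to .encode. This is a small sketch using only the standard library, so it runs outside Pythonista; the describe helper is an illustrative name and not part of the script above.

```python
from unicodedata import name, normalize

def describe(text):
    """List (code point, Unicode name) for each character of text after NFKD."""
    return [(ord(c), name(c, 'UNNAMED')) for c in normalize('NFKD', text)]

# An accented letter splits into the plain letter plus a COMBINING mark, and the
# ellipsis becomes three periods. The smart quote and the em dash have no
# decomposition, so they come through unchanged and must be handled by hand.
for sample in ('é', '…', '’', '—'):
    print(repr(sample), describe(sample))
```

Any character that shows up here with a code point above 127 is one the error handler will be asked about.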

In the second case, it just converts double asterisks to bold, single asterisks to italics, double tildes to strikethrough, and double underscores to underline.[3] There is a semi-standard on the blog for quotes to be italicized; so I convert Markdown quotes to just italicize the line. It will, in other words, take this:

[Image: Sharing pop-up]

> The Picture of Dorian Gray is fairly short…

**Yes**. Very, very ~~much sucks donkey balls~~ good in my opinion.

And return this:

[i]The Picture of Dorian Gray is fairly short...[/i]

[b]Yes[/b]. Very, very [s]much sucks donkey balls[/s] good in my opinion.
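If you want to try the conversion rules away from the iPad, something like the following should work. It is a table-driven restatement of the same idea rather than the script itself; RULES and markdown_to_bb are names I made up for the sketch, and the sample text is assumed to have already been reduced to ASCII.

```python
from re import sub

# Order matters: '**' has to be handled before '*', because both use asterisks.
RULES = [
    (r'\*\*([^*]+)\*\*', r'[b]\1[/b]'),
    (r'\*([^*]+)\*', r'[i]\1[/i]'),
    (r'~~([^~]+)~~', r'[s]\1[/s]'),
    (r'__([^_]+)__', r'[u]\1[/u]'),
]

def markdown_to_bb(text):
    """Convert simple Markdown-style markers to BBCode, one line at a time."""
    converted = []
    for line in text.splitlines():
        # A quoted line is italicized as a whole.
        if line.startswith('> '):
            line = '[i]' + line[2:] + '[/i]'
        for pattern, replacement in RULES:
            line = sub(pattern, replacement, line)
        converted.append(line)
    return '\n'.join(converted).strip()

sample = ('> The Picture of Dorian Gray is fairly short...\n'
          '\n'
          '**Yes**. Very, very ~~bad~~ good, *in my opinion*.')
print(markdown_to_bb(sample))
```

One thing to watch for with rules this simple: anything wrapped in double underscores gets underlined, so a comment that mentions __init__ will come out as [u]init[/u].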

In the third case, it just counts up the number of start and end codes for italics, bold, underline, and strikethrough, and if there are more start codes than end codes, it appends enough end codes to match.
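The counting approach can be sketched on its own like this. The close_bbcode name and the codes parameter are illustrative; the logic is only a count of start and end codes, so it guarantees that everything gets closed but not that the end codes nest in the right order.

```python
def close_bbcode(text, codes='ibus'):
    """Append an end code for every start code that has no matching end code."""
    for code in codes:
        missing = text.count('[' + code + ']') - text.count('[/' + code + ']')
        if missing > 0:
            text += ('[/' + code + ']') * missing
    return text

print(close_bbcode('[b]bold and [i]italic'))
print(close_bbcode('[i]one [b]two'))
```

End codes are always appended in the order i, b, u, s, so the second example is closed but not properly nested. That is enough to stay out of the barrel, which is all the script is trying to do.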

It first tries to get text using appex.get_text(), which will provide text only if some text is highlighted and the share button from the highlight bar was chosen, or if an actual text file has been shared. If not, it will look in the clipboard for text.

In both cases, once it has modified the text it sets the clipboard to the modified (fixed) text. I use it by highlighting my proposed comment, sharing to ascify in Pythonista, and then pasting back the modified text.

This is pretty basic stuff. But it has, so far, kept me out of the barrel on the site[4] and made it a lot easier to conform to the fragile standards of the comments system there.

Obviously, your own needs will be different, but the basic idea of preflighting using Pythonista and the share screen works very well. And the use of Markdown for comments really is nice. While I do the Markdown conversion by hand, because my needs are both simple and nonstandard, if you’re using this for comments on a blog that supports links and images you could use the Markdown modules built in to Pythonista.

To run the script directly from Safari on the iPad or iPhone, you’ll need to tell iOS to show Pythonista in the list of sharing options:

[Image: Turn on Pythonista in Sharing Extension]

If Pythonista isn’t already showing up, add it to your share options by flipping the switch to green. You can also change where it appears in the list using the drag bars.

  1. It’s the backend that’s fragile. The front-end is perfect, at least for the kind of commenting that goes on on the site. It’s simple and eschews a lot of the stuff that gets in the way of commenting, such as cascading comments. It was the major inspiration for my own comment system.

  2. The scripts must be local; they cannot be on iCloud. Hopefully this can be changed in future updates.

  3. The latter two are, of course, not Markdown, but they are things that get used in the blog’s comments section. Especially strikethrough, which is often used for sarcasm.

  4. Of course, it also ensures that if I do end up in the barrel, it will be for a spectacular screwup.
