Mimsy: Hacks

Goodreads: What books did I read last week and last month?—Friday, February 26th, 2016

I started using Goodreads in 2014, and it’s a nice way of tracking what books I read and what I thought about them. One thing I have definitely noticed missing, however, is an advanced search. Every once in a while I want to see what and how many books I read in the last week, or the last month, or the last x days, and there is no such search. You have to count them up yourself.

At the end of the year, I wanted to find all of the books from 2015 that I’d rated at 5. The only way I could find to do this was to eyeball the list. An advanced search would have made this easy.

In my Django database of books purchased, this sort of data is easy to drill into. It’s very easy to see how many books I purchased in February 2015, for example. And since one of my New Year’s Resolutions is that I am going to read more than I buy, comparing books purchased to books read is important!

However, while it does not provide an advanced search, Goodreads does offer a data export so that you can save your data to your hard drive. If you use Goodreads extensively, it’s a good idea to make regular backups. The export creates a CSV (comma-separated values) file of your books, ratings, and reviews: pretty much everything associated with each book you read. This allows us to make an advanced search of our own.

I chose Python because I’m familiar with it, and because it has both a CSV module and an in-memory SQL module (based on SQLite 3) built in.

The drawback to the CSV module is that it is relatively old and not Unicode-aware. In its defense, it assumes everything is UTF-8, but it doesn’t mark it that way so that the rest of Python knows. Goodreads, fortunately, provides its CSV file as UTF-8. It’s not too hard to make a Python generator that will return Unicode/UTF-8 values when importing from the Goodreads CSV file.

One more trick is that the CSV reader only knows of one type of value, the string. But if we want last week’s data, we need to be able to search on a date. So, riffing on LMatter’s Stack Overflow code, I made a UnicodeDictReader that also converts Goodreads’s Date Read to a Python date.

This part of the script then loops through every line in the csv file and inserts the relevant data into an on-the-fly sqlite3 database.
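As a rough sketch of that pipeline (not the script from the post), the following reads a CSV, converts the read date, loads an in-memory SQLite table, and queries the last x days. It is written for Python 3, where the csv module already returns text, so no UnicodeDictReader is needed. The sample rows, the column names, and the YYYY/MM/DD date format are assumptions about what the Goodreads export looks like.

```python
import csv
import io
import sqlite3
from datetime import date, datetime, timedelta

# Hypothetical sample standing in for the Goodreads export file; the column
# names and the YYYY/MM/DD date format are assumptions about that file.
SAMPLE_CSV = """Title,Author,My Rating,Date Read
Dune,Frank Herbert,5,2016/02/20
Emma,Jane Austen,4,2016/01/30
Ulysses,James Joyce,3,
"""

def load_books(csv_file):
    """Copy every book that has a read date into an in-memory SQLite table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE books (title TEXT, author TEXT, rating INTEGER, date_read TEXT)"
    )
    for row in csv.DictReader(csv_file):
        if not row["Date Read"]:
            continue  # no read date, so the book is not finished yet
        read_on = datetime.strptime(row["Date Read"], "%Y/%m/%d").date()
        # ISO dates (YYYY-MM-DD) sort correctly even when stored as text
        connection.execute(
            "INSERT INTO books VALUES (?, ?, ?, ?)",
            (row["Title"], row["Author"], int(row["My Rating"]), read_on.isoformat()),
        )
    return connection

def read_since(connection, today, days):
    """Return (title, rating) pairs for books read in the last `days` days."""
    cutoff = (today - timedelta(days=days)).isoformat()
    cursor = connection.execute(
        "SELECT title, rating FROM books WHERE date_read >= ? ORDER BY date_read",
        (cutoff,),
    )
    return cursor.fetchall()

books = load_books(io.StringIO(SAMPLE_CSV))
last_week = read_since(books, date(2016, 2, 26), 7)
last_month = read_since(books, date(2016, 2, 26), 30)
```

With a real export, the `io.StringIO` sample would be replaced by the opened export file, and `date.today()` would take the place of the fixed date.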

Converting an existing Django model to Django-MPTT—Friday, February 12th, 2016

For a long time I’ve been painfully aware of a glaring inefficiency in my custom blog software built off of Django. A lot of what it does relies on the tree model: folders within folders. Finding all the front-page articles for my main blog means getting all the children of each category, and then any descendants of those children, and so on. Previewing a new blog article can take a minute or more, mainly because of the recursive nature of building the sidebar. There are only seven articles on the front page, but they require looking through a tree with 1,505 articles spread over five levels.

I’ve been aware of various solutions for turning Django into more of a tree-hugger, but they’ve tended either to not work well with an existing installation, or if they did, their documentation hid that. For example, django-mptt until a few years ago hid the very important .rebuild method, which is necessary for building the tree data for an existing dataset.

Having just finished the basic installation, however, django-mptt is very helpful. On my very simple, not very tree-like blog, the Walkerville Weekly Reader, publish time dropped by around 13% to 21%; since it only took about two minutes anyway, that’s not necessarily a big deal. But Mimsy Were the Borogoves, which I’ve been running since the nineties, has a lot more articles and a much more complex structure. Publish time dropped from about 30 minutes to about 10 minutes. That’s worth the installation troubles.
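The speedup comes from the technique django-mptt is named for, Modified Preorder Tree Traversal: every node stores a left and a right number, and all of a node’s descendants fall between those two numbers, so finding them is a single range comparison rather than a recursive walk. The following is a plain-Python illustration of the idea only, not django-mptt’s code; the folder names are hypothetical.

```python
def number_tree(children, root):
    """Walk the tree once, depth first, giving each node (left, right) numbers."""
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in children.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root)
    return bounds

def descendants(bounds, node):
    """Find all descendants with one range comparison per node, no recursion."""
    left, right = bounds[node]
    inside = [name for name, (lo, hi) in bounds.items() if left < lo and hi < right]
    return sorted(inside, key=lambda name: bounds[name][0])

# Hypothetical folder structure: parent -> list of children
children = {
    "blog": ["hacks", "food"],
    "hacks": ["python", "django"],
    "python": ["goodreads"],
}
bounds = number_tree(children, "blog")
under_hacks = descendants(bounds, "hacks")
```

The numbering has to be computed once for existing data, which is why a rebuild step matters when converting an existing installation.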

Converting my model to use django-mptt

I installed django-mptt by downloading version 0.8.0, unpacking it, and running setup, but, despite the documentation claiming that it’s okay with Django 1.8.x, it performed an automatic upgrade to Django 1.9.1. I had to downgrade again using:

  • sudo pip uninstall django
  • sudo pip install django==1.8.8

I’m using django-ajax-selects version 1.3.6, which doesn’t work with Django 1.9.x. While the 1.4 version is compatible with Django 1.9, the 1.4 version doesn’t work on inlines, and inlines are the main reason I use ajax-selects. I have a list of 9,000 URLs and 14,000 pages that can be attached via inline to any page, not to mention the keywords and authors and so forth. Without ajax-selects, it takes forever for browsers to render pages because those become pull-down menus.

Converting to django-mptt is very easy. I added this to the top of my models.py:

  • from mptt.models import MPTTModel, TreeForeignKey, TreeManager

I had to include TreeManager because I have a custom manager. If you use the standard manager, you shouldn’t need it.

  • class PageManager(models.Manager):
  • class Page(models.Model):
    • parent = models.ForeignKey('self', blank=True, null=True)

Test classes and objects in python—Saturday, February 6th, 2016

One of the critical advances in computer programming since I began programming in the eighties is objects. An object is a thing that can include both functions and variables. In Python, an object is an instance of a class. For example, Django relies heavily on classes for its models. In Adding links to PDF in Python the main class used is for a restaurant. Each object is an instance of the restaurant class.

But one of the great things about object-oriented programming is that the things that access your objects don’t care what class an object is. They care only whether it has the appropriate functions and variables. When they are on an object, a function is called a method and a variable is called a property.

Any object can masquerade as another object by providing the same methods and properties. This means that you can easily make test classes that allow creating objects for use in the PDF example.

In Adding links to PDF in Python, I had a Django model for the Restaurants and a Django model for the Links that were each restaurant’s web page. But because Django models are nothing more than (very useful) classes, you can make a fake Restaurant and fake Link to impersonate what the code snippet expects.

  • # in real life, the Link class would probably pull information from a database of links
  • # and live would be whether it is currently a valid link,
  • # and get_absolute_url would be the actual URL for that link
  • class Link():
    • def __init__(self, title):
      • self.title = title
    • def live(self):
      • return True
    • def get_absolute_url(self):
      • return "http://www.example.com/" + self.title.replace(" ", "_")
  • # in real life, the Restaurant class would probably be a table of restaurants
  • # and would store the name of each restaurant, an id for the restaurant's web site
  • # and the restaurant's address
  • class Restaurant():
    • def name(self):
      • return "The Green Goblin"
    • def url(self):
      • myURL = Link("The Green Goblin")
      • return myURL
    • def address(self):
      • return "1060 West Addison, Chicago, IL"
    • def mapref(self):
      • return "https://www.google.com/maps/place/" + self.address().replace(" ", "+")

Save that as restaurant.py.

Objects are created from classes using:
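A minimal sketch of that step, not the original post’s code: the two test classes are repeated here in shortened form so that the block runs by itself, and `describe` is a hypothetical helper standing in for the code that consumes a restaurant. It only calls methods, so it never checks which class it was handed.

```python
class Link:
    """Shortened copy of the test Link class."""
    def __init__(self, title):
        self.title = title

    def get_absolute_url(self):
        return "http://www.example.com/" + self.title.replace(" ", "_")

class Restaurant:
    """Shortened copy of the test Restaurant class."""
    def name(self):
        return "The Green Goblin"

    def url(self):
        return Link(self.name())

def describe(place):
    # Hypothetical consumer: works with any object that provides
    # name() and url().get_absolute_url(), whatever its class.
    return place.name() + " <" + place.url().get_absolute_url() + ">"

# Calling the class creates an object (an instance) of that class.
restaurant = Restaurant()
summary = describe(restaurant)
```

With the full classes saved as restaurant.py, the same objects would come from `from restaurant import Restaurant` followed by `Restaurant()`.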

What app keeps stealing focus?—Friday, January 8th, 2016

I’ve been having a problem lately on Mac OS X Yosemite, 10.10, with losing focus on the window I’m typing in. It usually doesn’t happen often, but sometimes it happens three or four times over several minutes. The screen doesn’t change: the menu bar still tells me I’m in Safari or iA Writer or whatever. But suddenly my typing is going nowhere.

A quick Google search on “Mac OS X application window keeps losing focus” showed that I wasn’t alone. The problem seems to be that some background app is stealing the focus, but since it’s a background app the menu doesn’t change from the current app.

On How do I tell which app stole my focus in OS X?, medmunds provided a neat little Python script for displaying the current active app as soon as the active app changes. It also displays the path to the app so that you know specifically where the offender lies.

I quickly realized that I keep a lot of windows open, and the Terminal window soon disappeared behind all of my working windows. Obviously once I noticed a focus change, I could go back to Terminal to see what it was, but if it happened while I was reading a web page, I might not notice it for a long time—scrolling doesn’t need focus to work.

So I did another search on “Python Mac OS X speak” and found an answer by arainchi that should, and did, allow importing the Mac’s built-in speech synthesizer software. I added a volume reduction (hearing a loud voice announcing the app change every time I switched an app became annoying very quickly) and improved the responsiveness of the script. Mac OS X Python supports fractional sleep times, so I reduced it to two-tenths of a second instead of a whole second.
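The polling part of such a script can be sketched without any Mac-specific calls. In this sketch, `get_current` and `on_change` are stand-ins for the frontmost-app lookup and the speech call, neither of which is reproduced here; the `checks` and `sleep` parameters are assumptions added so the loop can be exercised with simulated data.

```python
import time

def watch(get_current, on_change, interval=0.2, checks=None, sleep=time.sleep):
    """Poll get_current() and call on_change(value) whenever the value changes.

    interval is in seconds and may be fractional. checks limits the number
    of polls; None means poll forever.
    """
    previous = None
    count = 0
    while checks is None or count < checks:
        current = get_current()
        if current != previous:
            on_change(current)
            previous = current
        count += 1
        sleep(interval)

# Simulated run: the active app is sampled four times and changes twice.
samples = iter(["Safari", "Safari", "Dock", "Safari"])
announced = []
watch(lambda: next(samples), announced.append, checks=4, sleep=lambda seconds: None)
```

A shorter interval catches brief focus thefts that a one-second poll could miss, at the cost of more frequent lookups.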

Draw a circle on an iPad map from three points in Pythonista—Tuesday, December 1st, 2015

I happened across a copy of 57 Practical Programs & Games in BASIC while traveling a month ago, and was once again fascinated by the simple toolchests we used back in the seventies and eighties. I fooled around with it a bit, typing up the programs in HotPaw BASIC on the iPad. I now have the BASIC code in HotPaw necessary to tell what day of the week any date, post 1752, fell on, as well as a Chi-Square evaluator that I’ll probably never use.

Day of the week in HotPaw BASIC

While reading through it, I came across the code for Circle determined by three points and thought about how cool it would be to use that simple code on modern mobile tools. It should be a snap to write an on-the-fly app in HotPaw BASIC or Pythonista. Take a snapshot of a map, tap three points, and see what the circle is.

HotPaw BASIC does not appear to have access to the iPad’s photo library, but Pythonista does. It has a photos module that allows you to pick_image and several methods in the scene module for simple manipulations and display.

The code itself is pretty simple. Create a scene, override touch_end to capture points (touch_end is similar to onClick in JavaScript), and use the BASIC code from 57 Programs, converted to Python, to determine the center and radius of a circle given three points on the screen.
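The geometry is independent of Pythonista. Here is a minimal sketch using the standard circumcenter formula, which may or may not be the method the BASIC listing uses; the function name is hypothetical, and points are plain (x, y) tuples such as the touch locations captured in touch_end.

```python
import math

def circle_from_points(p1, p2, p3):
    """Return ((center_x, center_y), radius) of the circle through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Twice the signed area of the triangle; zero means the points are in a line.
    d = 2 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if d == 0:
        raise ValueError("the three points are collinear; no circle passes through them")
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    center_x = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    center_y = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    radius = math.hypot(x1 - center_x, y1 - center_y)
    return (center_x, center_y), radius

# A 3-4-5 right triangle: the circle's center is the midpoint of the hypotenuse.
center, radius = circle_from_points((0, 0), (4, 0), (0, 3))
```

Taps that land nearly in a straight line produce a very large radius, so an app would probably want to reject those as well as exactly collinear points.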

icalBuddy and eventsFrom/to—Wednesday, August 19th, 2015

I use icalBuddy extensively along with GeekTool to display events on my desktop. I have the fairly standard sort of “here are your upcoming events”:

  • icalBuddy --excludeCals Television --excludeEventProps url --dateFormat "%A" --includeOnlyEventsFromNowOn eventsToday+4

This shows everything from the rest of today through four days from now. But you’ll notice it excludes one calendar: I really don’t need to see a list of the television shows I sometimes watch. They’re not that important to me.

Up until about a week ago, I showed only television shows for today:

  • icalBuddy --includeCals Television --excludeEventProps url --dateFormat "%A" --includeOnlyEventsFromNowOn eventsToday

And it worked fine, until the local old movies television station had William Castle’s classic 13 Ghosts on at 1:20 AM a few weeks ago. I missed it, because 1:20 AM isn’t today, it’s tomorrow. But by the time I look at it tomorrow morning, 1:20 AM is long gone.

I initially changed it to “eventsToday+1”, but that clutters up my desktop with events for tomorrow night, which I don’t need to know about now. I don’t plan my life around television shows, I just want to know if there’s something interesting right now. What I really want is for today’s list to include until tomorrow morning. The icalBuddy man page indicates that it’s possible to specify a range that includes an hour on a relative end date, but the documentation is currently wrong. For the option “eventsFrom:START to:END”, it says:

Print events occurring between the two specified dates. The dates (START and END) may be specified in a natural language form (such as "tomorrow at noon" or "june 10 at 6 pm") or as relative dates (such as "today+3" or "yesterday-2") but the safest format is "YYYY-MM-DD HH:MM:SS +HHMM"

Specifying tomorrow at noon, or tomorrow at 8 am, or tomorrow at anything just shows everything from tomorrow. A quick use of --debug confirmed it: icalBuddy interprets “tomorrow at” anything to be “[tomorrow] at 11:59:59 PM Central Daylight Time”.

I verified, however, that the to: option can accept partial days, by using the actual date (August 20 at 7 pm, for example), so I messed around until I found a format that works. Rather than “tomorrow at time”, use “time tomorrow”:

  • icalBuddy --includeCals Television --excludeEventProps url -f --dateFormat "%A" --includeOnlyEventsFromNowOn eventsFrom:today to:"noon tomorrow"
  • icalBuddy --includeCals Television --excludeEventProps url -f --dateFormat "%A" --includeOnlyEventsFromNowOn eventsFrom:today to:"9 am tomorrow"

The first is interpreted as noon tomorrow, and the second as 9 am tomorrow, so the list shows only tomorrow’s events up to that time instead of all day.
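Another way around the natural-language parsing is to generate the explicit “YYYY-MM-DD HH:MM:SS +HHMM” form that the man page calls safest. The sketch below only builds the two arguments; the function name is hypothetical, the fixed -0500 offset is an assumption for Central Daylight Time, and I have not checked how icalBuddy handles these strings beyond what the quoted documentation says.

```python
from datetime import datetime, timedelta, timezone

def events_range(now, end_hour):
    """Build eventsFrom:/to: arguments running from now until end_hour tomorrow."""
    end = (now + timedelta(days=1)).replace(
        hour=end_hour, minute=0, second=0, microsecond=0
    )
    stamp = "%Y-%m-%d %H:%M:%S %z"  # the format the man page calls safest
    return ["eventsFrom:" + now.strftime(stamp), "to:" + end.strftime(stamp)]

# Assumption: a fixed offset for Central Daylight Time. A fixed offset is wrong
# on the two nights a year when daylight saving time changes.
central_daylight = timezone(timedelta(hours=-5))
now = datetime(2015, 8, 19, 20, 30, 0, tzinfo=central_daylight)
arguments = events_range(now, 9)
```

Each list item is meant to be passed as one command-line argument, the same way the quoted “noon tomorrow” is in the commands above.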

Retry SSH connections after transient error—Monday, October 20th, 2014

The Timeout class works great for retrying connections after they time out, but what about more prosaic errors? I’ve been getting a bunch of AuthenticationException errors in my Python/Paramiko connection attempts lately. I’d been just capturing all SSHExceptions (of which AuthenticationException is a subclass) and reporting the error, but this is just a transient error that almost always goes away on the very next upload.

That makes it a perfect candidate for retrying the connection. I renamed the class from Timeout to Persistence, because this more generic class is going to be more persistent at making connections.

  • import signal
  • from time import sleep
  • from paramiko import SSHException, AuthenticationException
  • # TimeoutError is built into Python 3; under Python 2 it has to be defined or imported separately
  • class Persistence(object):
    • def __init__(self, function=None, seconds=30, tries=3, errorMessage='Timeout'):
      • self.seconds = seconds
      • self.tryLimit = tries
      • self.tries = 1
      • self.function = function
      • self.errorMessage = errorMessage
    • def act(self):
      • # arm the alarm so that a hung connection triggers handleTimeout
      • signal.signal(signal.SIGALRM, self.handleTimeout)
      • signal.alarm(self.seconds)
      • try:
        • self.function()
      • except AuthenticationException as error:
        • self.tryAgain(AuthenticationException(error), 'Authentication exception')
      • signal.alarm(0)
    • def tryAgain(self, exception, message):
      • # give up once the try limit is reached; otherwise wait a little longer before each retry
      • if self.tries >= self.tryLimit:
        • raise exception
      • else:
        • print('%s try %s %s' % (message, self.tries, self.errorMessage))
        • sleep(2*self.tries)
        • self.tries = self.tries + 1
        • self.act()
        • print('Succeeded on try %s' % self.tries)
    • def handleTimeout(self, signum, frame):
      • self.tryAgain(TimeoutError(self.errorMessage), 'Timed out')

All it really does is add a tryAgain method that can be called both by the handleTimeout method and by any exception handler in the try/except. If the call still fails on the final try (the third, by default), the exception is passed back up as normal.
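The retry idea can also be shown without signals or Paramiko. This is a simplified sketch, not the Persistence class: `TransientError`, `retry`, and `flaky_upload` are hypothetical names, and the sleep function is a parameter only so that the example can run without actually waiting.

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for an error that usually clears up on its own."""

def retry(action, tries=3, transient=(TransientError,), sleep=time.sleep):
    """Call action(); on a transient error wait 2, 4, ... seconds and try again."""
    for attempt in range(1, tries + 1):
        try:
            return action()
        except transient:
            if attempt == tries:
                raise  # out of tries: pass the exception back up as normal
            sleep(2 * attempt)

# Simulated upload that fails twice and then succeeds.
outcomes = iter([TransientError("auth"), TransientError("auth"), "uploaded"])

def flaky_upload():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome

waits = []
result = retry(flaky_upload, sleep=waits.append)
```

A loop like this keeps the call stack flat, whereas the class above retries by calling act again from inside tryAgain; either works for a small number of tries.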

AppleScript, variables, and dropped filenames in Automator—Monday, October 6th, 2014

Figure: Prepend Disclaimer workflow. Simple workflow to prepend a known PDF to an arbitrary dropped PDF.

Over on Stack Overflow the other day, someone asked an intriguing question about wanting to prepend a disclaimer page to hundreds of PDF files. Now, merging multiple PDFs into a single PDF is the kind of task Automator excels at, and it’s very easy to do. Prepending a single page to multiple PDF files, however, is a classic loop. This is not something Automator excels at. It requires thinking outside the workflow.

The basic workflow is an application—application workflows take dropped items and do things with them.

  1. Get Specified Finder Items (the disclaimer)
  2. Combine PDF Pages (appending pages)
  3. Move Finder Items (to a special folder created for the combined items)

Figure: Disclaimer Loop workflow. Automator can loop through dropped items one at a time using Run Workflow.

However, this only works for one file at a time. How to prepend the disclaimer to hundreds of files? Automator has a loop workflow action, but all it does is go back to the beginning of the workflow. It doesn’t do anything to incrementally loop through the list of dropped files. For that, we need the Run Workflow action. This runs a different workflow, and can send its own dropped items one at a time to that second workflow.

But, there’s a second problem: the Combine PDF Pages action creates files with a random filename. There is no option to tell it to take one of the dropped files and use that as its new filename.

Automator has some very impressive variable options: there are variables that are really AppleScripts, variables that are really shell scripts, variables containing many system names and common system paths. There is no variable that contains the dropped filename, however, and the AppleScript/Shell Script variables don’t take any arguments. As far as I can tell, the variables are all, basically, static, even if in a dynamic way. There is, thus, no way to modify the dropped file’s path using either an AppleScript variable or a Shell Script variable in order to get the dropped file’s base name.
