Goodreads: What books did I read last week and last month?

Jerry Stratton, February 26, 2016

I started using Goodreads in 2014, and it’s a nice way of tracking what books I read and what I thought about them. One thing I have definitely noticed missing, however, is an advanced search. Every once in a while I want to see what and how many books I read in the last week, or the last month, or the last x days, and there is no such search. You have to count them up yourself.

At the end of the year, I wanted to find all of the books from 2015 that I’d rated at 5. The only way I could find to do this was to eyeball the list. An advanced search would have made this easy.

In my Django database of books purchased, this sort of data is easy to drill into. It’s very easy to see how many books I purchased in February 2015, for example. And since one of my New Year’s Resolutions is that I am going to read more than I buy, comparing books purchased to books read is important!

However, while it does not provide an advanced search Goodreads does offer data export so that you can save your data to your hard drive. If you use Goodreads extensively it’s a good idea to make regular backups. The export creates a CSV, or comma-separated file, of your books, ratings, reviews, pretty much everything associated with each book you read. This allows us to make an advanced search of our own.

I chose Python because I’m familiar with it, and because it has both a CSV module and an in-memory SQL module (based on SQLite 3) built in.

The drawback to the CSV module is that it is relatively old and not unicode aware. In its defense it assumes everything is utf–8, but it doesn’t mark it that way so that the rest of Python knows. Goodreads, fortunately, provides its CSV file as utf–8. It’s not too hard to make a Python generator that will return unicode/UTF8 values when importing from the Goodreads csv file.

One more trick is that the csv reader only knows of one type of value, the string. But if we want last week’s data, we need to be able to search on a date. So, riffing on LMatter’s stack overflow code, I made a UnicodeDictReader that also converts Goodreads’s Date Read to a Python date.

This part of the script then loops through every line in the csv file and inserts the relevant data into an on-the-fly sqlite3 database.

[toggle code]

# import data
# csv is not unicode
#http://stackoverflow.com/questions/5004687/python-csv-dictreader-with-utf-8-data
def UnicodeDictReader(utf8_data):
- csvReader = DictReader(utf8_data)
- if args.verbose:
  - print csvReader.fieldnames
- for row in csvReader:
  - yield {key: massageValues(key, value) for key, value in row.iteritems()}
def massageValues(key, value):
- if key == 'Date Read':
  - if value:
    - (year, month, day) = value.split('/')
    - return date(int(year), int(month), int(day))
  - else:
    - return None
- return unicode(value, 'utf-8')
csvfile = args.csvfile[0]
goodreads = UnicodeDictReader(csvfile)
# create database
sq3connection = sqlite3.connect(':memory:')
sq3connection.row_factory = sqlite3.Row
cursor = sq3connection.cursor()
cursor.execute('CREATE TABLE books (title, author, authorLastFirst, rating int, dateRead date, bookshelf)')
for row in goodreads:
- book = {
  - 'title': row['Title'],
  - 'author': row['Author'],
  - 'authorLastFirst': row['Author l-f'],
  - 'rating': row['My Rating'],
  - 'dateRead': row['Date Read'],
  - 'bookshelf': row['Exclusive Shelf'],
- }
- cursor.execute('INSERT INTO books VALUES (:title, :author, :authorLastFirst, :rating, :dateRead, :bookshelf)', book)
# SEARCH CODE WILL GO HERE
# clean up
# commit is necessary for testing, if writing to a real file
sq3connection.commit()
sq3connection.close()

Note that if you are modifying the script to import other fields, you may find it useful to be able to view the database using the sqlite3 command-line command. Change “:memory:” to, say, “testfile.sqlite3” and you should see a file by that name show up in the current working directory.

As you can see, that code expects a csv file name. Here’s the code with the ArgumentParser setup and all of the command-line options:

[toggle code]

#!/usr/bin/python
import argparse, sqlite3
from datetime import date, timedelta
from csv import DictReader
sortFields = {
- 'title': 'title ASC',
- 'author': 'authorLastFirst ASC',
- 'rating': 'rating DESC',
- 'date': 'dateRead DESC',
}
parser = argparse.ArgumentParser(description='Search Goodreads export file.')
parser.add_argument('csvfile', metavar='CSVFILE', type=argparse.FileType('r'), nargs=1, help='a Goodreads file to search')
parser.add_argument('--author')
parser.add_argument('--bookshelf')
parser.add_argument('--rating', type=int, choices=range(1, 6))
parser.add_argument('--rating-exact', type=int, choices=range(1, 6))
parser.add_argument('--title')
parser.add_argument('--period', choices=['week', 'month', 'year', 'week-to-date', 'month-to-date', 'year-to-date'])
parser.add_argument('--days', type=int)
parser.add_argument('--order', default='date', choices=sortFields.keys())
parser.add_argument('--verbose', action='store_true')
args = parser.parse_args()

With these options, I can do searches such as for last week, or the last 14 days:

~/bin/goodreads goodreads_library_export.csv --period week
 **** Drunkard's Walk by Frederik Pohl (2016-01-21)
   ** The Mother Tongue - English & How It Got That Way by Bill Bryson (2016-01-21)
  *** Treks Not Taken by Steven R. Boyett (2016-01-20)
   
   
~/bin/goodreads goodreads_library_export.csv --days 14
 **** Drunkard's Walk by Frederik Pohl (2016-01-21)
   ** The Mother Tongue - English & How It Got That Way by Bill Bryson (2016-01-21)
  *** Treks Not Taken by Steven R. Boyett (2016-01-20)
 **** A Dangerous Friend by Ward Just (2016-01-15)
  *** Envisioning Information by Edward R. Tufte (2016-01-15)

The searches can be combined, as well. For example, the books I enjoyed the most from last year:

~/bin/goodreads goodreads_library_export.csv --rating 5 --period year
***** The Dying Earth (The Dying Earth, #1) by Jack Vance (2015-11-30)
***** Lebanese Cuisine: Over Two Hundred Authentic Recipes Designed for the Gourmet, the Vegetarian, the Healthfood Enthusiast by Madelain Farah (2015-11-10)
***** Advise & Consent by Allen Drury (2015-10-08)
***** The Collected Stories of Philip K. Dick, Volume 5: We Can Remember It For You Wholesale (2015-09-21)
***** The Congressman Who Loved Flaubert: 21 Stories and Novellas by Ward Just (2015-09-19)
***** The Case for Democracy: The Power of Freedom to Overcome Tyranny and Terror by Natan Sharansky (2015-07-09)
***** Mockingbird by Walter Tevis (2015-04-30)
***** Zen in the Art of Writing by Ray Bradbury (2015-03-04)
***** The Origins and Development of the English Language by John Algeo (2015-01-31)

And books that were at least okay last month:

~/bin/goodreads goodreads_library_export.csv --rating 2 --period month
 **** The Best of C. L. Moore by C.L. Moore (2015-12-29)
  *** The Space Scavengers by Cleve Cartmill (2015-12-20)
 **** Knights of the Dinner Table: Bundle of Trouble vol. 45 by Jolly R. Blackburn (2015-12-18)
  *** Astro City Vol. 8: Shining Stars by Kurt Busiek (2015-12-17)
  *** Green Magic: The Fantasy Realms of Jack Vance (2015-12-17)
  *** The Salads Cookbook by Southern Living Magazine (2015-12-16)
 **** Intellectuals and Society by Thomas Sowell (2015-12-14)
 **** Dear Committee Members by Julie Schumacher (2015-12-10)
  *** The Best of Fritz Leiber (2015-12-06)
 **** Scoop by Evelyn Waugh (2015-12-01)

Everything I’ve read by Ray Bradbury:

~/bin/goodreads goodreads_library_export.csv --author bradbury --bookshelf read
***** Zen in the Art of Writing by Ray Bradbury (2015-03-04)
***** Dandelion Wine (Green Town, #1) by Ray Bradbury (2014-09-03)
 **** The Illustrated Man by Ray Bradbury (2014-06-10)
 **** The October Country by Ray Bradbury
  *** I Sing the Body Electric! & Other Stories by Ray Bradbury
***** Something Wicked This Way Comes (Green Town, #2) by Ray Bradbury
 **** Fahrenheit 451 by Ray Bradbury
***** The Martian Chronicles by Ray Bradbury

It can also search through titles; for example, here are all of the books with “best of” in the title that I’ve marked on Goodreads as having been read:

~/bin/goodreads goodreads_library_export.csv --bookshelf read --title "best of"
 **** The Best of C. L. Moore by C.L. Moore (2015-12-29)
  *** The Best of Fritz Leiber (2015-12-06)
 **** For the Love of Mike: More of the Best of Mike Royko (2015-09-09)
 **** The Best of Edmond Hamilton (2015-08-30)
  *** The Best of Cordwainer Smith (2015-07-15)
 **** The Best of Fredric Brown (2015-07-02)
 **** One More Time: The Best of Mike Royko (2015-06-20)
  *** The Best of Omni Science Fiction No. 2 by Ben Bova (2015-04-16)

Each book’s line is printed out from a Book class. I often find it useful to put a database row into a class to make it easier to output custom data from the row. Here’s the simple one I use for a book row:

[toggle code]

# a book class to make it easier to access and display book data
class Book:
- def __init__(self, row):
  - self.data = row
- def __getattr__(self, name):
  - if name in self.data.keys():
    - return self.data[name]
  - else:
    - return None
- def stars(self):
  - stars = self.rating * '*'
  - stars = stars.rjust(5)
  - return stars
- def line(self):
  - line = '%s %s' % (self.stars(), self.title)
  - if self.author:
    - line = line + ' by %s' % self.author
  - if self.dateRead:
    - line = line + ' (%s)' % self.dateRead
  - elif self.bookshelf and not args.bookshelf:
    - line = line + ' (%s)' % self.bookshelf
  - return line

And, finally, I construct a query from whatever options were provided on the command line:

[toggle code]

# construct query
query = 'SELECT * FROM books'
searches = []
searchValues = []
if args.author:
- searches.append('author LIKE ?')
- searchValues.append('%'+args.author+'%')
if args.bookshelf:
- searches.append('bookshelf=?')
- searchValues.append(args.bookshelf)
if args.rating:
- searches.append('rating >= ?')
- searchValues.append(args.rating)
if args.rating_exact:
- searches.append('rating IS ?')
- searchValues.append(args.rating_exact)
if args.title:
- searches.append('title LIKE ?')
- searchValues.append('%'+args.title+'%')
if args.days:
- searches.append('dateRead >= ?')
- searchValues.append(date.today() - timedelta(days=args.days))
if args.period:
- searches.append('dateRead BETWEEN ? AND ?')
- today = date.today()
- if args.period.endswith('-to-date'):
  - endDate = today
  - if args.period.startswith('year-'):
    - startDate = today.replace(month=1, day=1)
  - elif args.period.startswith('month-'):
    - startDate = today.replace(day=1)
  - else:
    - weekSoFar = today.weekday()+1
    - if weekSoFar == 7:
      - weekSoFar = 0
    - startDate = today - timedelta(days=weekSoFar)
- else:
  - if args.period == 'year':
    - startDate = date(today.year-1, 1, 1)
    - endDate = date(today.year-1, 12, 31)
  - elif args.period == 'month':
    - month = today.month - 1
    - year = today.year
    - if month == 0:
      - month = 12
      - year = year - 1
    - startDate = date(year, month, 1)
    - endDate = today.replace(day=1) - timedelta(days=1)
  - else:
    - weekSoFar = today.weekday()+2
    - if weekSoFar == 8:
      - weekSoFar = 1
    - endDate = today - timedelta(days=weekSoFar)
    - startDate = endDate - timedelta(days=7)
- searchValues.append(startDate)
- searchValues.append(endDate)
if searches:
- query = query + ' WHERE ' + ' AND '.join(searches)
query = query + ' ORDER BY ' + sortFields[args.order]
# perform search
if args.verbose:
- print query
- print
resultCount = 0
results = cursor.execute(query, searchValues)
for row in results:
- book = Book(row)
- resultCount = resultCount + 1
- print book.line()
print
print "Found %i book%s." % (resultCount, 's'[resultCount==1:])

The only particularly complex option is for the date-related periods, since it needs to calculate when the last month, year, or week was. For most search options, if you need to add a new one all you’ll have to do is add a new “if args.option”, append a query format to searches and the query data to search for to searchValues.

You can use the –verbose option to see what the current script passes to .execute for queries.

CSV File Reading and Writing: “The csv module implements classes to read and write tabular data in CSV format… Programmers can also describe the CSV formats understood by other applications or define their own special-purpose CSV formats.”
Goodreads advanced search (Zip file, 5.0 KB): A Python script for searching a Goodreads CSV export file.
Jerry@Goodreads: Jerry Stratton at Goodreads: On my Goodreads account, you can see how erudite and well-read I am.
Python CSV DictReader with UTF-8 data at StackExchange: “I’d like to access the row columns by name as it is done by csv.DictReader but with UTF-8 encoded CSV input file.”

More Python

Quick-and-dirty old-school island script: Here’s a Python-based island generator using the tables from the Judges Guild Island Book 1.
Astounding Scripts on Monterey: Monterey removes Python 2, which means that you’ll need to replace it if you’re still using any Python 2 scripts; there’s also a minor change with Layer Windows and GraphicConverter.
Test classes and objects in python: One of the advantages of object-oriented programming is that objects can masquerade as each other.
Timeout class with retry in Python: In Paramiko’s ssh client, timeouts don’t seem to work; a signal can handle this—and then can also perform a retry.
Percentage-based random tables: Our current random item generator assumes that each item shows up as often as any other item. That’s very OD&D-ish. But AD&D uses percentage dice to weight toward some monsters and items more than others.
30 more pages with the topic Python, and other related pages

More SQLite

Maven: SQLite front end: Cutedge has a SQLite3 front-end that almost works.
Use LIKE to find duplicate strings in SQLite: SQLite is case sensitive with strings—but not if you use LIKE.

Comments?

The undiscovered comment form, whose bourn no poster returns.

Your email, URL, and location are optional—but I won’t be able to contact you if you don’t leave a working email. Your email does not get displayed, your URL and location do. Your name is required but may vary as the needs of the day demand, or you can just use the anonymous Hark Thrice name. You can use the following tags: <em>, <a>, <blockquote>. Use them wisely and post intelligently. Comments may take some time to approve, especially if I’m stuck in a Mexican jail.

If you have private comments, or questions about this page, please, leave a message on the Negative Space Comments Page.

Lost?

If you’re looking for something here, use the search box in the navigation to limit your search to this part of the site, or use the Negative Space search page.

Jerry

The light shines in the darkness, and the darkness has never put it out. — John (The Gospel According to John)

Contents of Negative Space™ as a whole Copyright © 1994-2024 Jerry Stratton. Individual copyrights remain held by their respective authors unless they specify otherwise. Site titles, such as Negative Space, Strange Bedfellows, Biblyon Broadsheet, Highland Games, and FireBlade Coffeehouse are trademarks of Jerry Stratton.

Code and code snippets, to the extent that they are copyrightable, may be re-distributed under the terms of the GNU General Public License 3.

Goodreads: What books did I read last week and last month? last modified February 26th, 2016.

Your comment
Your name
Your email
Your web page
Your location

Mimsy Were the Borogoves

Goodreads: What books did I read last week and last month?

More Python

More SQLite

Editorials

Books, Movies, & Music

Technology & Hacks

Food

42 Astounding Scripts

Walkerville Reader

Biblyon Broadsheet

About Mimsy

Comments?

Lost?

Mimsy Were the Borogoves

Goodreads: What books did I read last week and last month?

More Python

More SQLite

Editorials

Books, Movies, & Music

Technology & Hacks

Food

42 Astounding Scripts

Walkerville Reader

Biblyon Broadsheet

Blogroll

Keep in touch

About Mimsy

Comments?

Lost?