Goodreads Microblog
I read a lot. I also run a Ghost blog at partiallypeaceful.com. For years, I'd finish a book, rate it on Goodreads, and that was it. Maybe I'd remember to post something on the blog. Usually I wouldn't.
I wanted every book I read to show up on partiallypeaceful.com. Not as a big review post. Just a short microblog entry: cover image, star rating, a few thoughts if I wrote a review, and a link to buy it somewhere that isn't Amazon.
So I built a cron job. It runs once a day, checks my Goodreads "read" shelf, and publishes anything new to Ghost. The whole thing is about 150 lines of Python. Here's how it works.
The pipeline
Every morning at 8 AM, a cron job on my home server fires `goodreads-to-ghost`. It pulls my Goodreads RSS feed (every shelf gets one at `https://www.goodreads.com/review/list_rss/<user-id>?shelf=read`), parses the XML, and walks through each book in reverse chronological order.
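To make the fetch-and-parse step concrete, here's a minimal sketch rather than the actual script. It assumes each feed `<item>` carries plain child elements named `book_id`, `book_title`, and `author_name`. The `user_rating` element and the function names `parse_shelf` and `fetch_shelf` are assumptions made for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://www.goodreads.com/review/list_rss/{user_id}?shelf=read"


def parse_shelf(xml_data):
    """Return one dict per <item> in a shelf feed, in feed order."""
    root = ET.fromstring(xml_data)
    books = []
    for item in root.iter("item"):
        # user_rating is an assumed field name; a missing or empty value means 0.
        rating_text = item.findtext("user_rating", default="").strip()
        books.append({
            "id": item.findtext("book_id", default="").strip(),
            "title": item.findtext("book_title", default="").strip(),
            "author": item.findtext("author_name", default="").strip(),
            "rating": int(rating_text or 0),
        })
    return books


def fetch_shelf(user_id, timeout=30):
    """Download the 'read' shelf feed and parse it."""
    url = FEED_URL.format(user_id=user_id)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_shelf(resp.read())
```

Keeping the parsing separate from the download means the parser can be tested against a saved copy of the feed without hitting Goodreads.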
For each book, it checks Ghost for an internal tag called `#goodreads-book-{id}`. If the tag is there, skip. If not, this book is new.
For new books, three things happen. One: the cover image gets downloaded from Goodreads and re-uploaded to Ghost's image library via the Admin API. Goodreads puts size suffixes in its cover URLs, so the parser strips those out to get the full-resolution original. Two: the post HTML gets assembled: the Ghost-hosted cover linked to Goodreads, a "Read [title] by [author]" line, a Unicode star rating, the review text in a blockquote if I wrote one, and a Bookshop.org link. Three: the post gets created through Ghost's Admin API with public tags `microblog` and `books`, plus the internal `#goodreads-book-{id}` tag.
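Here's a rough sketch of the cover cleanup and post assembly, under a few loud assumptions: the size suffix is taken to look like `._SX98_` or `._SY475_` right before the file extension, the review text is taken to be HTML already, and the helper names (`full_size_cover`, `stars`, `build_post`) and the `book` dict keys are hypothetical. The Bookshop.org URL is passed in because how it's derived isn't covered here.

```python
import html
import re

# Assumption: suffixes look like "._SX98_" or "._SY475_" before the extension.
SIZE_SUFFIX = re.compile(r"\._S[XY]\d+_(?=\.)")


def full_size_cover(url):
    """Strip the size suffix so the URL points at the original image."""
    return SIZE_SUFFIX.sub("", url)


def stars(rating, out_of=5):
    """Render a rating as filled and empty Unicode stars, clamped to 0..out_of."""
    rating = max(0, min(out_of, int(rating)))
    return "★" * rating + "☆" * (out_of - rating)


def build_post(book, cover_url, bookshop_url):
    """Assemble the Ghost post dict for one book."""
    title = html.escape(book["title"])
    author = html.escape(book["author"])
    link = html.escape(book["link"])
    cover = html.escape(cover_url)
    shop = html.escape(bookshop_url)
    parts = [
        f'<a href="{link}"><img src="{cover}" alt="{title}"></a>',
        f"<p>Read <em>{title}</em> by {author}</p>",
        f"<p>{stars(book['rating'])}</p>",
    ]
    if book.get("review"):
        # Assumption: the review is already HTML, so it is not escaped again.
        parts.append(f"<blockquote>{book['review']}</blockquote>")
    parts.append(f'<p><a href="{shop}">Buy on Bookshop.org</a></p>')
    return {
        "title": f"Read {book['title']} by {book['author']}",
        "html": "\n".join(parts),
        "status": "published",
        "tags": [
            {"name": "microblog"},
            {"name": "books"},
            {"name": f"#goodreads-book-{book['id']}"},
        ],
    }
```

The returned dict would be wrapped as `{"posts": [post]}` and sent to the Admin API posts endpoint with `?source=html`, so Ghost converts the HTML into its own editor format.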
Once the post is live, Ghost fires a `post.published` webhook. A separate process, `ghost-webhook-forwarder`, sits on port 9001 waiting for it. When the webhook hits, it sends a `repository_dispatch` event to GitHub. That kicks off a GitHub Actions workflow that rebuilds the Astro site. The new book post shows up on partiallypeaceful.com within a couple minutes.
Why RSS instead of the Goodreads API
Goodreads stopped issuing API keys in 2020. Their RSS feeds are still maintained though, and they include everything I need: book ID, title, author, cover URLs, rating, review text, publication date. No authentication required.
The parsing was the only challenge. Goodreads has two different RSS item formats. The standard shelf feed uses custom namespace fields (`book_id`, `book_title`, `author_name`). The updates feed buries the data in HTML descriptions with CSS classes like `bookTitle` and `authorName`. I wrote the parser to handle both. The updates feed parser uses Python's `HTMLParser` from the standard library to pull structured fields out of description blobs. Not elegant, but solid.
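For the updates feed, the general shape of an `HTMLParser`-based extractor looks something like the sketch below. It's illustrative only: the class name `UpdateItemParser`, the output keys, and the sample markup in the usage note are assumptions. The only things taken from the feed are the CSS class names `bookTitle` and `authorName`.

```python
from html.parser import HTMLParser


class UpdateItemParser(HTMLParser):
    """Collect the text inside elements with class bookTitle / authorName."""

    WANTED = {"bookTitle": "title", "authorName": "author"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field currently being captured, if any
        self._tag = None      # tag name that opened the capture

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for css_class, field in self.WANTED.items():
            # Only the first match for each field is kept.
            if css_class in classes and field not in self.fields:
                self._current, self._tag = field, tag
                self.fields[field] = ""

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Stop at the first closing tag with the same name as the opener.
        if self._current and tag == self._tag:
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = self._tag = None


def parse_update_description(description_html):
    parser = UpdateItemParser()
    parser.feed(description_html)
    parser.close()
    return parser.fields
```

Fed something like `<a class="bookTitle" href="...">Some Title</a>`, it returns `{"title": "Some Title"}`. It doesn't track nesting depth, so an element that contains another element of the same tag name would end the capture early.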
Idempotency
Cron jobs fail. Network blips, Ghost doing maintenance, whatever. If the job gets rerun after a partial failure, or just runs twice, duplicate posts are a bad look.
Internal tags solve this. Before creating a post, the job asks Ghost: "any posts with tag `hash-goodreads-book-{id}`?" (Ghost turns the leading `#` of an internal tag's name into `hash-` in its slug, and the slug is what the filter matches.) If yes, skip. The tag is internal so it doesn't show up in the blog's tag cloud. It's purely operational plumbing.
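The lookup itself can be sketched as a single filtered request against the Admin API's posts endpoint. This is an illustration under assumptions: the function names are hypothetical, `token` is a valid Admin API JWT obtained elsewhere, and the filter relies on internal tags getting slugs that start with `hash-`.

```python
import json
import urllib.parse
import urllib.request


def existing_post_url(ghost_url, book_id):
    """Build the Admin API URL that looks for a post tagged for this book."""
    query = urllib.parse.urlencode({
        "filter": f"tag:hash-goodreads-book-{book_id}",
        "limit": "1",     # one hit is enough to know the book was published
        "fields": "id",   # keep the response small
    })
    return f"{ghost_url.rstrip('/')}/ghost/api/admin/posts/?{query}"


def already_published(ghost_url, token, book_id):
    """Return True if Ghost already has a post carrying this book's tag."""
    req = urllib.request.Request(
        existing_post_url(ghost_url, book_id),
        headers={"Authorization": f"Ghost {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return len(json.load(resp)["posts"]) > 0
```

Because the check happens per book, a run that dies halfway can simply be started again: the books that made it through get skipped, and the rest get published.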
Dry-run mode is the other safety net. `goodreads-to-ghost --dry-run` prints exactly what it would publish without touching Ghost. I test every change this way before letting cron run unattended.
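A dry-run flag like this is cheap to wire up with `argparse`. The sketch below is one way to do it, not the script's real structure: `publish_all` and its `create_post` callback are hypothetical names, and only the `--dry-run` flag itself comes from the tool.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="goodreads-to-ghost")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print what would be published without calling Ghost",
    )
    return parser.parse_args(argv)


def publish_all(posts, dry_run, create_post, out=print):
    """Publish each post, or only describe it when dry_run is set."""
    published = 0
    for post in posts:
        if dry_run:
            out(f"[dry-run] would publish: {post['title']}")
            continue
        create_post(post)  # the only place that talks to Ghost
        published += 1
    return published
```

Funnelling every write through one callback keeps the dry-run guarantee easy to reason about: if `create_post` is never called, nothing on Ghost changes.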
Zero runtime dependencies
I'm stubborn about dependencies on small tools. Every library is a future maintenance headache. This project has none. No `requests`, no `httpx`, no Ghost SDK. All stdlib: `urllib` for HTTP, `xml.etree.ElementTree` for RSS, `hmac` and `hashlib` for Ghost's JWT-based Admin API auth, `html.parser` for the Goodreads updates feed.
The Ghost Admin API auth was the most satisfying piece. Split the API key into key ID and hex secret, build a JWT signed with HMAC-SHA256, and send it in the `Authorization` header using Ghost's own scheme (`Authorization: Ghost <token>`, not `Bearer`). About 15 lines of code.
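For reference, stdlib-only token generation can look roughly like this. It's a sketch, not the script's exact code: it assumes the Admin API key has the `id:secret` form with a hex-encoded secret, a five-minute lifetime, and an `/admin/` audience. The function name and the `now` and `lifetime` parameters are made up for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def ghost_admin_token(admin_api_key, now=None, lifetime=300):
    """Build a short-lived HS256 JWT from an 'id:secret' Admin API key."""
    key_id, secret_hex = admin_api_key.split(":")
    iat = int(now if now is not None else time.time())
    header = {"alg": "HS256", "typ": "JWT", "kid": key_id}
    payload = {"iat": iat, "exp": iat + lifetime, "aud": "/admin/"}
    # The signing input is base64url(header) + "." + base64url(payload).
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # The secret is hex in the key string and must be decoded to raw bytes.
    signature = hmac.new(
        bytes.fromhex(secret_hex), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The resulting string goes in the `Authorization` header with the `Ghost` scheme. Tokens are short-lived by design, so it's simplest to mint a fresh one at the start of each run.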
The rebuild pipeline
My blog is a static Astro site on Cloudflare Pages. When Ghost gets a new post, the site needs to be rebuilt to pick it up.
The webhook forwarder bridges that gap. It's a tiny HTTP server on my home server behind Tailscale. Ghost sends `post.published` webhooks to it. The forwarder calls GitHub's `repository_dispatch` API, GitHub Actions runs `npm run build`, and Cloudflare gets the new static files.
It's intentionally bare. Parses the JSON, logs the post title, and fires the dispatch. No queue, no retries, no state. If it fails, it fails. Each build regenerates the whole site, so the next webhook catches whatever was missed.
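A forwarder along these lines fits in a few dozen lines of stdlib Python. The sketch below shows the shape, not the real `ghost-webhook-forwarder`: the repo name, token, and event type are placeholders, the helper names are hypothetical, and it assumes the webhook body nests the post under `post.current`.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder values for illustration only.
GITHUB_REPO = "your-user/your-site"
GITHUB_TOKEN = "replace-me"
EVENT_TYPE = "ghost-post-published"


def build_dispatch(repo, token, event_type, title):
    """Build (but don't send) the repository_dispatch request."""
    body = json.dumps({
        "event_type": event_type,
        "client_payload": {"title": title},
    }).encode("utf-8")
    return urllib.request.Request(
        f"https://api.github.com/repos/{repo}/dispatches",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def post_title(webhook_body):
    """Pull the title out of a webhook payload (assumed shape: post.current.title)."""
    payload = json.loads(webhook_body)
    return payload.get("post", {}).get("current", {}).get("title", "(untitled)")


class Forwarder(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        title = post_title(self.rfile.read(length))
        print(f"post.published: {title}")
        # No retries: if the dispatch raises, this connection is dropped
        # and the server keeps running for the next webhook.
        request = build_dispatch(GITHUB_REPO, GITHUB_TOKEN, EVENT_TYPE, title)
        urllib.request.urlopen(request, timeout=30).close()
        self.send_response(204)
        self.end_headers()


def serve(port=9001):
    """Listen for Ghost webhooks until interrupted."""
    HTTPServer(("0.0.0.0", port), Forwarder).serve_forever()
```

On the GitHub side, the workflow would need an `on: repository_dispatch` trigger that matches the chosen event type.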
Is this overengineered?
Yeah, probably. I could paste my Goodreads reviews into the Ghost admin manually. 30 seconds per book. But I'd forget. I'd skip weeks. And the point of having a personal blog is that it reflects what I'm actually reading, not what I remembered to post about.
The automation removes the friction completely. Finish a book, rate it on Goodreads, write a review if I have thoughts. It appears on the blog the next morning. I don't think about it.