I like the idea. You could use ISBN codes for the book IDs.
ISBNs should certainly be involved somewhere. However, if you did use them as IDs, you’d probably need a way to link or aggregate related ISBNs: a hardcover, a paperback and an ebook with the same content will have separate ISBNs, while something like a review is not usually tied to the format. Somebody looking for a review of a book will expect to see all the reviews in one place, not spread across separate entries.
openlibrary.org deserves a mention here: it lets you create book lists that others can subscribe to via Atom.
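To make the aggregation idea concrete, here is a minimal sketch. It assumes a hypothetical `work_id` field that ties editions together; that field, the record layout and the placeholder ISBN strings are illustrative and not part of the proposed spec.

```python
from collections import defaultdict

# Hypothetical edition records: each ISBN points at a shared work_id.
# The ISBN strings and the "work_id" field are placeholders.
editions = [
    {"isbn": "isbn-hardcover", "format": "hardcover", "work_id": "work-1"},
    {"isbn": "isbn-paperback", "format": "paperback", "work_id": "work-1"},
    {"isbn": "isbn-ebook", "format": "ebook", "work_id": "work-1"},
    {"isbn": "isbn-other", "format": "paperback", "work_id": "work-2"},
]

# Reviews are keyed by whichever edition the reviewer happened to read.
reviews = [
    {"isbn": "isbn-hardcover", "text": "Great."},
    {"isbn": "isbn-ebook", "text": "Still great on an e-reader."},
]


def reviews_by_work(editions, reviews):
    """Collect review texts under the work, regardless of edition format."""
    isbn_to_work = {e["isbn"]: e["work_id"] for e in editions}
    grouped = defaultdict(list)
    for review in reviews:
        work_id = isbn_to_work.get(review["isbn"])
        if work_id is not None:  # skip reviews for unknown ISBNs
            grouped[work_id].append(review["text"])
    return dict(grouped)


print(reviews_by_work(editions, reviews))
```

Both reviews end up under the same work even though they were written against different ISBNs.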
I like it.
One change I’d consider is a properties section where people can embed various metadata, such as which “shelf” the book is on (read, to-read, never-read-again, etc.).
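As a rough sketch of what that could look like: the `properties` object and the `shelf` key below are assumed names for illustration, not something the proposed spec defines.

```python
import json

# Hypothetical feed entries; "properties" and "shelf" are illustrative names.
books = [
    {"title": "Book A", "properties": {"shelf": "read"}},
    {"title": "Book B", "properties": {"shelf": "to-read"}},
    {"title": "Book C"},  # entries without properties stay valid
]


def on_shelf(books, shelf):
    """Return titles of books whose properties place them on the given shelf."""
    return [
        b["title"]
        for b in books
        if b.get("properties", {}).get("shelf") == shelf
    ]


print(json.dumps(on_shelf(books, "to-read")))
```

Keeping the section optional means existing feeds without it would still parse.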
Perhaps this should use the BookBrainz database for book IDs and the like? MusicBrainz is the standard open music database, and BookBrainz aims to be the same for books (and it’s by the same organization!).
I find it so depressing that this concept isn’t gaining more traction, faster, for many more things.
I really liked this idea!
I’ve already been maintaining a reading list page on my blog, and just added a quick-and-dirty JSON feed based on the proposed spec: https://benjamincongdon.me/books/index.json
Right now, it’s being auto-updated from my Goodreads profile using a small Python script I wrote, but I like the idea that this could eventually be completely independent from Goodreads.
I really hope something like this gains traction. I use Book Catalogue on Android to catalogue my books. It uses Amazon, Goodreads, or LibraryThing on the back end, though it appears that most of the time it only gets good data from Amazon.
I’d love to have a more cross-platform tool for it, because I want to use my phone for scanning bar codes, but updating the metadata on the phone is horrible. A decent open standard for the data is a great step.
One of the things that Book Catalogue struggles with is omnibus editions. I have, for example, a load of Agatha Christie novels that are in 2-book volumes. Most of these are sufficiently old that they don’t have ISBNs (and still have the original non-PC titles), but even providing the data manually there isn’t a good way of separating the physical and logical book. I want to be able to see the list of logical books (e.g. see Murder on the Orient Express as a ‘book’, even though it’s in an omnibus physical book combined with another logical book) but also then find the physical book that contains it. For extra fun, a couple of the more popular books appear on my bookshelf in multiple omnibus editions. I’d like to be able to see a single version of the logical book and find all of the physical books that it’s in.
It’s actually worse than that, because I have a few books that are published with the opposite layout: A single logical book split into multiple physical books (and also published as a single physical book edition). For example, Stephen King books are quite bulky to hold and so you can buy some of them split into 3 or 6 smaller volumes.
I think that’s where a good specification becomes really hard. Do you have separate entries for each copy of the logical books within the JSON for a single physical volume, and rely on some unique identifier (spoiler: ISBNs are not this!) to deduplicate them? Or do you have the logical books as first-class objects that you reference from the physical books? How do you unify single-volume editions with multi-volume editions of a single logical book?
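A minimal sketch of the second option, with logical books as first-class objects that physical volumes reference. All IDs, field names (`contains`, `part`, `of`) and placeholder titles are assumptions made up for illustration.

```python
# Logical books ("works") are first-class; physical volumes reference them.
works = {
    "w-orient": {"title": "Murder on the Orient Express"},
    "w-second": {"title": "Second novel in the omnibus"},  # placeholder
    "w-long": {"title": "A long novel"},  # placeholder
}

# A "part" marker means the volume holds only a slice of the work.
volumes = [
    {"id": "v-omnibus-a", "contains": [{"work": "w-orient"}, {"work": "w-second"}]},
    {"id": "v-omnibus-b", "contains": [{"work": "w-orient"}]},
    {"id": "v-long-1", "contains": [{"work": "w-long", "part": 1, "of": 3}]},
    {"id": "v-long-2", "contains": [{"work": "w-long", "part": 2, "of": 3}]},
    {"id": "v-long-3", "contains": [{"work": "w-long", "part": 3, "of": 3}]},
    {"id": "v-long-single", "contains": [{"work": "w-long"}]},
]


def volumes_for(work_id, volumes):
    """Ids of every physical volume holding all or part of a work."""
    return [
        v["id"]
        for v in volumes
        if any(c["work"] == work_id for c in v["contains"])
    ]


def complete_copies(work_id, volumes):
    """Ids of volumes holding the whole work (no 'part' marker).

    A full set of split volumes is not treated as a complete copy here.
    """
    return [
        v["id"]
        for v in volumes
        if any(c["work"] == work_id and "part" not in c for c in v["contains"])
    ]


print(volumes_for("w-orient", volumes))
print(complete_copies("w-long", volumes))
```

With this layout the list of logical books is just `works`, and no deduplication by ISBN is needed; the cost is that every volume entry has to carry references instead of being self-contained.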
The other thing that I don’t like about this is that it seems to have the same problem of ontology drift for author names. I really, really want a book metadata format to have a layer of indirection between author ID and author names. For example, if I look at War of the Worlds, I see that different people have called the author “HG Wells”, “H.G. Wells”, “H. G. Wells” and “Herbert George Wells”. Each book should have a list of unique author IDs and then the alternative names of the authors should be listed separately so that you can correct ontology drift by simply merging author metadata and updating the author IDs. Books should also possibly include a preferred name for the author for that book (for example, consider Iain Banks vs Iain M. Banks: the name used is also a genre signifier). Does a user want to display The Wasp Factory and Use of Weapons under the same author or not? They were written by the same human, but they’re generally in different sections in a library. How do you capture that in the metadata?
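Here is one way the indirection could look, as a sketch only. The author IDs, the `author_ids` and `credited_as` fields and the merge helper are all hypothetical names, not anything from the proposed spec.

```python
# Author records live apart from books; books refer to them by ID.
authors = {
    "a-wells": {"names": ["H. G. Wells", "H.G. Wells", "HG Wells"]},
    "a-wells-dup": {"names": ["Herbert George Wells"]},  # drifted duplicate
    "a-banks": {"names": ["Iain Banks", "Iain M. Banks"]},
}

books = [
    {"title": "The War of the Worlds", "author_ids": ["a-wells-dup"]},
    {"title": "The Wasp Factory", "author_ids": ["a-banks"],
     "credited_as": {"a-banks": "Iain Banks"}},
    {"title": "Use of Weapons", "author_ids": ["a-banks"],
     "credited_as": {"a-banks": "Iain M. Banks"}},
]


def display_name(book, author_id, authors):
    """Prefer the per-book credited name, else the author's first listed name."""
    credited = book.get("credited_as", {})
    return credited.get(author_id, authors[author_id]["names"][0])


def merge_authors(authors, books, keep, drop):
    """Fold author `drop` into `keep` and rewrite book references in place."""
    for name in authors[drop]["names"]:
        if name not in authors[keep]["names"]:
            authors[keep]["names"].append(name)
    del authors[drop]
    for book in books:
        remapped = [keep if a == drop else a for a in book["author_ids"]]
        book["author_ids"] = list(dict.fromkeys(remapped))  # dedupe, keep order
        credited = book.get("credited_as")
        if credited and drop in credited:
            credited.setdefault(keep, credited.pop(drop))


merge_authors(authors, books, keep="a-wells", drop="a-wells-dup")
for book in books:
    print(book["title"], [display_name(book, a, authors) for a in book["author_ids"]])
```

Whether The Wasp Factory and Use of Weapons get grouped together is then a display decision: group by `author_ids` for "same human", or by the credited name for "same shelf".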