I've learned something about how the systems of traditional publishing work, since I started in 2012. As a career systems technologist, I've paid particular attention to the data systems, standards, and tools that I've been able to learn about.
Traditionally published authors who have gone hybrid or converted completely to independent publishing have shared a lot about traditional publishing with the indie community. We hear far less from the techies in the book trade, though, so it isn't easy to come up to speed on the technical systems that traditional publishers use.
Why do I care?
I can't help but notice, whenever I compare my own ebook listings at a retailer with a traditional publisher's listings, that theirs are often cleaner and more complete. Since they also have large catalogues to maintain, I want to know how that's done, so that I can achieve the same effect. My own catalogue is now 24 titles, so it's not just a matter of data quality but also of data quantity.
Managing and Updating Metadata for Many Titles
Whenever I have a bright idea about a better way to manage keywords or categorization, adjust pricing, or format book descriptions, I find myself facing my catalogue and shaking my head at the thought of 24 titles times all my distributors.
Today my ebooks are widely distributed. I go directly to Amazon, Barnes & Noble, Kobo, and Smashwords, and I use PublishDrive and StreetLib for Apple and Google Play, and for dozens of international retailers. That's in effect six retailers and distributors, for each of whom I might have to update 24 titles: as many as 144 separate updates for a single catalogue-wide change.
It would be so much better to have a single database for all my titles and just use that to update all my trading partners, so that they could update their own trading partners or international sites, wouldn't it? My massive spreadsheet can marshal all the data, perhaps, but that just sits on my computer and doesn't communicate with anyone.
Ensuring Metadata Quality
Here's a tiny example. So-called “smart quotes” (left- or right-facing single or double quotes) are not understood by all systems, depending on things like character encoding. When you create your book descriptions in a spreadsheet (with default smart quotes) and copy and paste them into a system that only understands apostrophes and straight double quotes, they often show up as unprintable or garbled characters.
Once you discover the problem and understand the issue, you have long and short book descriptions, author bios, and other text items to clean up, across all your distributed titles. Just fixing your spreadsheet so it doesn't happen again won't fix the retailers and distributors — those all have to be updated.
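A clean-up like this is easy to script before the text ever leaves the spreadsheet. What follows is only a minimal sketch in Python, not part of my actual workflow: the function names and the sample fields are made up for illustration, and it handles just the four curly quote characters.

```python
# Map the four common curly quote characters to plain ASCII equivalents.
SMART_TO_PLAIN = str.maketrans({
    "\u2018": "'",   # left single quote
    "\u2019": "'",   # right single quote, also used as an apostrophe
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
})


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(SMART_TO_PLAIN)


def fields_with_smart_quotes(fields: dict) -> list:
    """Return the names of text fields that still contain curly quotes."""
    return [name for name, value in fields.items()
            if value != straighten_quotes(value)]


# Hypothetical metadata fields for one title.
book = {
    "short_description": "\u201cIt\u2019s a hunt,\u201d she said.",
    "author_bio": "Writes fantasy and science fiction.",
}

print(fields_with_smart_quotes(book))
print(straighten_quotes(book["short_description"]))
```

Running the check across every description and bio shows which fields need attention, but the corrected text still has to be pushed to each retailer and distributor.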
Providing paragraph breaks is another painful area where the learning curve can be lengthy before you arrive at a solution that works (almost) everywhere.
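One approach is to keep descriptions as plain text with blank lines between paragraphs and convert them to simple HTML paragraph tags at the last moment. The Python sketch below assumes the receiving system accepts basic `<p>` markup, which you would need to confirm for each retailer or distributor; the function name is my own invention.

```python
import re
from html import escape


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated paragraph in <p> tags.

    Line breaks inside a paragraph are collapsed to single spaces, and
    the characters &, < and > are escaped so they survive as text.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    cleaned = (" ".join(block.split()) for block in blocks)
    return "".join(f"<p>{escape(block, quote=False)}</p>"
                   for block in cleaned if block)


description = "A hunt begins.\n\nCats & dogs\ntake sides."
print(paragraphs_to_html(description))
```

Keeping the master copy as plain text means the same description can still be pasted into forms that reject HTML.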
Presenting the Same Marketing-Specific Metadata as Traditional Publishers
Not only do I want my book metadata to be correct, I want it to be as complete as a traditional publisher's. I want my book pages at a retailer to look just as good as anyone else's book pages, so that a prospective buyer has no reason to think of my books as somehow lower in quality.
A few of the leading retailers who pioneered direct access for indies have invested in serious efforts to make that possible (Amazon, Kobo, Barnes & Noble, even Ingram & CreateSpace). But there are hundreds of potential retailers out there, and they're not in the business of special catering for indie authors.
Using the Book Trade's Own Solutions
Traditional publishers are companies like any other in a long-established industry, and they have to share data with their trading partners. The general technical term for the standards that make this possible is EDI — Electronic Data Interchange.
At its base, all it really means is that some version of your company's datafile can get sent to another company to facilitate business. That could include financial and manufacturing data — my order goes to your company and your invoice comes back to me.
For the book trade, we're talking about a publisher's book data: the books' files, images, and metadata. To keep this discussion simple, I'm restricting this article to ebooks, but print and audio editions are also candidates.
A special standard has evolved for the book trade: ONIX (ONline Information eXchange).
What is ONIX?
I've spoken at length before about ONIX and I suggest you pause and read that article. I'll wait — it's a lot to digest.
If we were traditional publishers, we'd be crazy not to use ONIX to maintain and transmit our book data to our trading partners, especially since there are intermediaries who can handle some of the nuts and bolts of the data transmission as well as the sales data and invoicing with their trading partners.
On the other hand, traditional publishers have IT departments, and we don't. Also, not all of our trading partners' trading partners (retailers all over the world) are ready to handle ONIX datafiles themselves (many of them insist on spreadsheets), and we can't afford the sorts of intermediaries that traditional publishers use to ease that situation.
How Can Indies Manage ONIX Data?
So, even if we want to, and are prepared to learn how, can we use ONIX data, and does it help?
I'm an IT department all by myself, so I decided to run the experiment.
Here's how it works…
Like most of us, I have a big fat spreadsheet with all the metadata for my books. A subset of that information is useful for packaging up as an ONIX datafile.
Why a subset? For starters, my spreadsheet has identifiers for EPUB, MOBI, Print, and Audio editions. At this time, only EPUB is being handled by the distributors available to us. I also have lots of identifiers which are retailer-specific (e.g., Amazon ASIN numbers) or bibliographic rather than commercial (ISNI numbers). So not all the data in my master spreadsheet is appropriate for my ONIX file.
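To make the idea of a subset concrete, here is a small Python sketch that keeps only the titles with an EPUB identifier and only the columns worth sending. The column names and the sample rows (including the placeholder ISBNs and ASINs) are invented for illustration; my real spreadsheet is laid out differently and has far more columns.

```python
import csv
import io

# Hypothetical column names: the ones worth carrying into an ONIX record.
ONIX_COLUMNS = ["title", "author", "epub_isbn", "description", "price_usd"]


def onix_subset(rows):
    """Keep titles that have an EPUB ISBN, and only the ONIX-relevant columns."""
    return [
        {column: row.get(column, "") for column in ONIX_COLUMNS}
        for row in rows
        if row.get("epub_isbn", "").strip()
    ]


# Stand-in for a master spreadsheet exported as CSV (placeholder values).
sample_csv = io.StringIO(
    "title,author,epub_isbn,print_isbn,asin,description,price_usd\n"
    "Book One,A. Author,9780000000001,9780000000002,B000000001,First book.,4.99\n"
    "Book Two,A. Author,,9780000000004,B000000002,Print only.,5.99\n"
)

subset = onix_subset(csv.DictReader(sample_csv))
for row in subset:
    print(row)
```

The retailer-specific and print-only columns stay in the master spreadsheet; they simply never make it into the export.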
I picked up the ONIXEDIT software to create the ONIX XML datafile. This presents a complicated series of forms that allows me to define every single ONIX XML data element that is useful for my books. This is not for the faint of heart: the learning curve is substantial and it helps quite a bit if you're something of a data geek already. But it can be done, there is plenty of documentation, and the technical support folks are very responsive, especially considering their more usual customers are techies already in the book trade.
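For a sense of what those forms are filling in, here is a rough Python sketch that builds a bare-bones ONIX 3.0 message for one ebook. It is not what ONIXEDIT produces and it is nowhere near a complete record (there is no publishing or supply detail, and the XML namespace is omitted). The element names follow my understanding of the ONIX 3.0 reference tags, and the code values in the comments are assumptions that should be checked against the current EDItEUR code lists. The sender, title, and ISBN are placeholders.

```python
import xml.etree.ElementTree as ET
from datetime import date


def build_product(record_ref, isbn13, title, author):
    """Build a minimal <Product> element for one EPUB edition."""
    product = ET.Element("Product")
    ET.SubElement(product, "RecordReference").text = record_ref
    ET.SubElement(product, "NotificationType").text = "03"  # assumed: confirmed record

    identifier = ET.SubElement(product, "ProductIdentifier")
    ET.SubElement(identifier, "ProductIDType").text = "15"  # assumed: ISBN-13
    ET.SubElement(identifier, "IDValue").text = isbn13

    detail = ET.SubElement(product, "DescriptiveDetail")
    ET.SubElement(detail, "ProductComposition").text = "00"  # assumed: single item
    ET.SubElement(detail, "ProductForm").text = "ED"         # assumed: digital download
    ET.SubElement(detail, "ProductFormDetail").text = "E101"  # assumed: EPUB

    title_detail = ET.SubElement(detail, "TitleDetail")
    ET.SubElement(title_detail, "TitleType").text = "01"      # assumed: main title
    element = ET.SubElement(title_detail, "TitleElement")
    ET.SubElement(element, "TitleElementLevel").text = "01"   # assumed: product level
    ET.SubElement(element, "TitleText").text = title

    contributor = ET.SubElement(detail, "Contributor")
    ET.SubElement(contributor, "SequenceNumber").text = "1"
    ET.SubElement(contributor, "ContributorRole").text = "A01"  # assumed: author
    ET.SubElement(contributor, "PersonName").text = author
    return product


def build_message(sender_name, products):
    """Wrap product records in an <ONIXMessage> with a simple header."""
    message = ET.Element("ONIXMessage", release="3.0")
    header = ET.SubElement(message, "Header")
    sender = ET.SubElement(header, "Sender")
    ET.SubElement(sender, "SenderName").text = sender_name
    ET.SubElement(header, "SentDateTime").text = date.today().strftime("%Y%m%d")
    message.extend(products)
    return message


message = build_message(
    "Example Press",  # placeholder sender
    [build_product("example-0001", "9780000000001", "Book One", "A. Author")],
)
print(ET.tostring(message, encoding="unicode"))
```

Even this stub shows why a forms-based editor earns its keep: nearly every value is a code from a controlled list rather than free text.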
Where Can We Use ONIX Data?
Well, that's the good news. The retailers you go to directly have online forms instead, but the two most advanced and forward-thinking distributors available to us, StreetLib and PublishDrive, each accept ONIX data feeds. That was, in fact, the knowledge that spurred me on to take this step. At the time I started, I had all my books already at PublishDrive and was just about to put them all up on StreetLib.
The great thing about ONIX data is that I only need to make one master file, and both my distributors can use it. So I built my ONIX master file using ONIXEDIT, and made one output file for PublishDrive with 2 new book bundles, and a different output file for StreetLib with all my titles.
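The one-master, many-outputs idea can be sketched in a few lines of Python. This is only an illustration of the selection step, not how ONIXEDIT does it; the record keys and the `select_titles` helper are hypothetical.

```python
# Hypothetical master catalogue keyed by an internal record reference.
master = {
    "novel-01": {"title": "Book One", "epub_isbn": "9780000000001"},
    "novel-02": {"title": "Book Two", "epub_isbn": "9780000000003"},
    "bundle-01": {"title": "Bundle One", "epub_isbn": "9780000000005"},
    "bundle-02": {"title": "Bundle Two", "epub_isbn": "9780000000007"},
}


def select_titles(catalogue, keys=None):
    """Return records for the given keys, or every record when keys is None."""
    if keys is None:
        return list(catalogue.values())
    missing = [key for key in keys if key not in catalogue]
    if missing:
        raise KeyError(f"not in master file: {missing}")
    return [catalogue[key] for key in keys]


# One distributor only needs the new bundles; the other gets everything.
new_bundles_only = select_titles(master, ["bundle-01", "bundle-02"])
full_catalogue = select_titles(master)

print(len(new_bundles_only), len(full_catalogue))
```

Because both outputs are drawn from the same master, a correction made once shows up in every file generated afterwards.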
It's taken a little fine-tuning from responsive tech support at the distributors and corrections and adjustments from me in building the ONIX records. Even ONIXEDIT got involved, since there were parts of the ONIX 3.0 standard they hadn't implemented yet.
And it worked!
I almost wish there were some sort of universal change I had to make to all my titles right now, just to revel in a push-button update. I will certainly be going this route with my next releases — a permanent change in my processes.
The Current State of the Art
What about the distributors themselves? So far, only a limited number of their publishing customers (indie or trad) are taking advantage of the ONIX possibility, but some of those have large catalogues of titles. That number can only grow as more customers take advantage of the functionality on offer.
There are some less fundamental ONIX marketing data elements, not yet supported by the distributors, that we would find compelling (e.g., editorial reviews), which I hope to see added to the input specifications as soon as possible. My wants are simple — all I want is everything the traditional publishers can pass to their trading partners.
More distributors like PublishDrive and StreetLib might arise, but those two have set the bar high enough that I would expect any newcomers to offer ONIX input feeds too, especially if they hope to attract publishers as well as indies.
The ONIX feeds were (and are) an investment for the distributors in becoming and staying competitive, just like they're an investment for me, and I hope they deepen that investment.
I anticipated this would take some real effort to get past the learning curve, and even at that I underestimated it, but I'm past that now. Learning what the individual data elements are for was an education in itself about the commercial data that the book trade uses. In passing, the Bowker ISBN records hold few mysteries for me anymore. Those fields are all part of the ONIX data set, and if you're large enough (I'm not), you can send ONIX feeds to Bowker, too, and control the whole publication/release cycle that way. It also means you can use ONIX documentation to understand exactly what the Bowker data fields are trying to collect, and get smarter about the book trade's requirements.
I now have functioning ONIX feeds to my two big distributors, and that makes me very happy.
But is this a bridge too far? Maybe not for small publishers, but probably so for most indies…
A service provider could arise who takes ordinary per-title data from a form or spreadsheet and creates ONIX records, for a fee. So instead of investing in ONIXEDIT and learning how to create the ONIX records, an indie could fill out a very full form, get back an ONIX file, and use it for both distributors. On the other hand, an indie today can just fill out the standard form at both distributors and skip an intermediary, so I can't see that happening at this point.
I look at it as an investment in the future, partly to ease the effort of maintaining more and more titles in my catalogue, and partly as a way of providing all the data the traditional publishers do, once the distributors catch up.
Willing to invest in the learning curve yourself?