Category: Just for Writers

Book Metadata, ONIX, and Digital Asset Management (DAM) systems

Posted in Distribution, and Just for Writers

Book Metadata

Maintaining the metadata for your books can be quite a chore. Joel Friedlander's overview covers some of the basics.

It takes me two pages of my multi-page management spreadsheet to hold the simple columnar data (ISBNs, Library of Congress (LoC) numbers, publication dates, page counts, etc.) and textual data (blurbs in various lengths, keywords, BISAC codes, etc.). Some of that data is static, but parts of it are changeable based on marketing experiments, temporary sales, and so forth.

When we go to a distributor site that caters to indies, we are usually presented with a form to fill out for each book. Since I strive for consistency, I always have to open up the form for a book I've already posted to that distributor, to make sure I fill out all the questions the same way across all my titles.

How do the big boys handle this?

Well, that's a question, isn't it? It's hard to get the details. The big traditional publishers have complex internal needs (moving a title from acquisition through editing, formatting, marketing, and publication) involving different departments and requirements. Their management systems are built with that in mind: they provide a single shareable record for each title that holds both private in-house data and public data intended for their trading partners (distributors, retailers, etc.).

The output of their systems these days tends to be in the form of ONIX records (see below); this much I know. But exactly how they share those records with their partners is obscure to me. (Alas, trad-published authors who've added indie publishing share insights with us, but we don't meet a lot of back-office technical types from the traditional publishing firms.) You can see how vague the specific details are in an overview on metadata maintenance that's meant to be helpful, or in a discussion about refreshing metadata as things change.

The big traditional publishers have much more complicated problems. As indies or micro-publishers, we get to choose what we want to deal with, and some of the industry tools are available for us to use, if we think it's important enough.

Checking up on book distributors

Posted in Distribution, and Just for Writers

So, you've gone wide and international with your ebook distribution, and your print edition is in Ingram's database, making it available to a fair chunk of the world's bookstores, both physical and online. Your dashboards that list your titles with your various distributors all look fine and dandy. You've given them your books, and they're making sure they're getting into the world's bookstores.

Time to sit back, proud of your books' availability in online stores all over the world, right?

If only it were that simple.

How are my distributors doing?

It’s not easy to figure that out.

I've been trying to sort out my various distribution options recently as I retire a couple of distributors and take on new ones. It's a confusing area, and the lists you can get of their channel partners are not always current or complete. I was focused on who had the best reach, or reached unique retailers, with reasonable returns and the ability to turn channels on and off to avoid duplication.

I get to retailers in a variety of ways.

Ebooks

  • Direct from my website (ecommerce). Gumroad (in several formats).
  • Direct upload. Amazon, Kobo, Barnes & Noble. If I could (no Mac), this would include Apple iBooks.
  • Hybrid (storefront & distribution). Smashwords.
  • Distributor. PublishDrive, Streetlib (coming soon).

Of these, I use PublishDrive to reach every channel (including Apple iBooks and Google Play) that I don't go to directly. I restrict Smashwords to its storefront and its unique partners only. PublishDrive, Streetlib, and Smashwords all let you select or disable individual partner channels to avoid overlap.

Already there are complications — Kobo is also a distributor, so though I go there directly, it distributes my titles to its own partners. There is no ability to pick and choose among Kobo's partners, so it's up to me to avoid enabling one of Kobo's partners at one of my other distributors. (Perhaps that can be controlled at the manual level, via email requests to Kobo, but I prefer something more automated and reliable.)
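That kind of duplication check can be sketched as a simple comparison of partner lists. This is only a rough illustration, not any distributor's actual tool or API: the function name `find_overlaps` and the "Retailer A" and "Retailer B" entries are hypothetical placeholders, and the real lists would have to be typed in by hand from each distributor's site.

```python
def find_overlaps(channels):
    """Return {retailer: sorted list of routes} for retailers reached more than once.

    `channels` maps each route (a direct upload or a distributor) to the set
    of retailers that route feeds.
    """
    routes_by_retailer = {}
    for route, retailers in channels.items():
        for retailer in retailers:
            routes_by_retailer.setdefault(retailer, set()).add(route)
    return {
        retailer: sorted(routes)
        for retailer, routes in routes_by_retailer.items()
        if len(routes) > 1
    }


# Hypothetical partner lists, for illustration only.
channels = {
    "Kobo (direct)": {"Kobo", "Retailer A", "Retailer B"},
    "PublishDrive": {"Apple iBooks", "Google Play", "Retailer B"},
    "Smashwords": {"Smashwords store", "Retailer A"},
}

for retailer, routes in sorted(find_overlaps(channels).items()):
    print(f"{retailer}: reached via {', '.join(routes)}")
```

Any retailer that shows up in the output is a channel to disable at one of the distributors that allows per-channel selection.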

Other complications — I have to manually request special retail pricing for Google Play, to keep its automated discounting from creating a problem with Amazon. Hard to find distributors that will let you set per-channel pricing, but I think that must be essential to adjust pricing in different parts of the world (like India).

Audio

  • Distributor. AuthorsRepublic.
  • Direct upload. CD Baby (coming soon).

Print

  • Hybrid (storefront & limited distribution). Createspace. (Not expanded distribution.)
  • Distributor. Ingram LSI.

Createspace only distributes to Amazon so there are no channels to disable to avoid duplication. Ingram can't provide a list of print partners — much too broad, and much of its reach is through intermediate distributors or aggregators. No telling where your books will end up at online retailers.

Where are your books, really?

I began by taking the lists of known channel partners from PublishDrive, Smashwords, Kobo, and AuthorsRepublic. I then went to each of those sites and tried to find my books there. That alone was an eye-opener.

What happens to my metadata when it leaves the house?

Posted in Distribution, Just for Writers, and Publishing

That's the title of an excellent if brief essay by Laura Dawson of Numerical Gurus. Her site is an excellent resource for the explanations and history of some of the acronyms that haunt the world of books.

Since I seem to be on a kick lately with what metadata exists and how it sloshes around through the book ecosystem, I thought we could all benefit.

How many of those girls are properly dressed (um, properly formatted data)? And how can you keep them clean, out there in the big ol' world? Where there are boys, and parties, and fast cars, and lots of dark alleys to wander into.

We've all seen it. We spend time perfecting the metadata in our feeds, send it out to our trading partners, and then have to take complaints from agents, authors, and editors. “Why is it like that on Amazon?”

The truth is, data ingestion happens on whatever schedule a given organization has decided to adhere to. Proprietary data gets added. Not all the data you send gets used. Data points get mapped. So what appears on any trading partner's system may well differ somewhat from what you’ve sent out. There are so many different players in the metadata arena that can affect what a book record looks like. When you send your information to Bowker, they add proprietary categories, massage author and series names, add their own descriptions, append reviews from sources they license – and send out THAT information to retailers and libraries. The same thing happens at Ingram, at Baker & Taylor – so what appears on a book product page is a mishmash of data from a wide variety of sources, not just you.

What does your book look like to booksellers?

Posted in Distribution, Just for Writers, and Publishing

Print on Demand (POD) (versus short-run print jobs) is the typical method used initially by indie authors, and the two big providers are Createspace (owned by Amazon) and Ingram, either via Ingram LSI (Lightning Source) or IngramSpark.

The merits of Createspace vs Ingram are a common discussion topic among indie authors who produce paperback editions. This post updates an earlier analysis and focuses on what your books look like to booksellers placing orders with Ingram.

The recommended practice these days is to use both vendors for print, if you can: Createspace (without their expanded distribution option) for Amazon, for inexpensive orders for inventory, for an online store, and for direct shipping; and Ingram for everything else. (The recent news about availability of print from Amazon KDP seems to signal that access to Createspace directly might change.)

Some authors create a separate Library edition, just to use that part of Createspace's expanded distribution with a Createspace ISBN.

If you don't want to go to the bother (and the expense) of getting your own ISBN (a whole separate discussion) Createspace will supply you with an ISBN owned by Createspace for you to use, for free. (If you have an ISBN, you can use your own.)

Since the common progression for indies seems to be to start with Createspace only, and Createspace has an expanded distribution option and a free ISBN that gets your books into Ingram (“Booksellers and Online Retailers”), the question often comes up: why bother going to Ingram directly?

Why go to Ingram directly, in addition to Createspace?

The manufactured products are slightly different (quality issues with a small and debatable preference given to Ingram), and unlike Createspace, the Ingram edition costs money: a title setup fee (circa $49), an annual market fee (to stay listed in Ingram's database) ($12), and a revision fee for any change in cover or content ($40 each). The fee details vary a bit between Ingram LSI (mostly for traditional publishers) and IngramSpark (mostly for indies) and coupons/discounts are not infrequently available.

And you need your own ISBN, a not-inconsiderable expense in the US.

But there are other concerns.

Bookseller-specific issues

1) Discounts

A bookstore with good credit and broad needs may use Ingram as its main supplier. Other bookstores use smaller, more targeted suppliers who get their list of offerings from Ingram (and charge a fee).

Ingram allows you to set the same standard discount that traditional publishers use: 55%. Createspace's maximum is 40%.

Here's what that means. Ingram takes 15% of that discount for its services. It subtracts that from the books you list directly with Ingram, but it also subtracts that from the books it lists that were given to it by Createspace via expanded distribution.

So at 55% (Ingram's standard), minus Ingram's 15%, there remains 40%. Some of that may go to an intermediate distributor. Whatever's left over is the bookseller's potential profit, which he may discount to push sales.

At 40% (Createspace's max), once you subtract Ingram's fee of 15%, all that's left is 25% for the intermediate distributor(s) and bookseller to share. That is unattractive to many booksellers. Some won't even order books to fulfill customer requests at that small a profit to themselves.
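The arithmetic above can be laid out as a small sketch. It uses only the percentages quoted in this post (55%, 40%, and Ingram's 15%); the function name `left_for_booksellers` is made up for illustration, and actual terms may differ.

```python
INGRAM_SHARE = 15  # percentage points Ingram keeps, per the figures quoted above


def left_for_booksellers(wholesale_discount, ingram_share=INGRAM_SHARE):
    """Percentage of list price left for intermediate distributors and the bookseller."""
    return wholesale_discount - ingram_share


for vendor, discount in [("Ingram direct", 55), ("Createspace expanded", 40)]:
    remaining = left_for_booksellers(discount)
    print(f"{vendor}: {discount}% - {INGRAM_SHARE}% = {remaining}%")
```

The 15-point gap between the two results is the whole argument: it comes straight out of what the bookseller and any intermediate distributor have to share.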

2) Free ISBN / Publisher name

Createspace offers its own ISBN, if you don't have or want to use one of your own.

The general rule is: whoever owns the ISBN is the Publisher, from the perspective of forms and databases. Some recent forms have separated the two things, so that the owner of the ISBN and the “Publisher of record” and the provider of the data feed on a form can be different things, but that is far from general.

For a while, the Createspace data downstream showed Publisher = Createspace, even if you used your own ISBN. Now that only happens if you use a Createspace ISBN. And even so, what shows on a form depends on what the form uses for data sources: if it shows ISBN and looks up the ISBN, then it would get the ISBN owner. If it simply assumes the supplier of the data feed (Createspace) is the owner, then it shows Createspace as the publisher. That's how audiobooks often show up as Publisher = AudiobookFeedProvider even when the ISBN belongs to the Publisher.
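The two behaviors described above can be sketched as one small function. Everything here is hypothetical (the function name, the placeholder ISBN strings, and "My Imprint"); it only illustrates the difference between a form that looks up the ISBN owner and one that assumes the feed provider is the publisher. Falling back to the feed provider when the ISBN is unknown is my own assumption.

```python
def displayed_publisher(isbn, feed_provider, isbn_owners, looks_up_isbn):
    """Publisher name a downstream form would show under each strategy."""
    if looks_up_isbn:
        # The form resolves the ISBN to its registered owner.
        # Assumption: an unknown ISBN falls back to the feed provider.
        return isbn_owners.get(isbn, feed_provider)
    # The form assumes whoever supplied the data feed is the publisher.
    return feed_provider


# Placeholder registry, for illustration only.
isbn_owners = {"OWN-ISBN": "My Imprint", "CS-ISBN": "Createspace"}

print(displayed_publisher("OWN-ISBN", "Createspace", isbn_owners, looks_up_isbn=True))
print(displayed_publisher("OWN-ISBN", "Createspace", isbn_owners, looks_up_isbn=False))
```

The same ISBN produces two different publisher names depending only on how the form gets its data, which is the audiobook situation in miniature.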

How booksellers see a traditionally published book when they order from Ingram

The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-4

Posted in Just for Writers, and Publishing


Index of topics


ISNI – International Standard Name Identifier

This is one of the international standards for disambiguating identical names for different individuals. You can contact one of the regional standards bodies and request an ISNI (free). It's intended for all sorts of creatives — writers, musicians, artists, etc. — as well as anyone else in the public eye who needs to be uniquely identifiable, including scientists, living and dead. Einstein has an ISNI number.

You can see at the bottom links to VIAF and the LC (Library of Congress LCCN-N) authority records mentioned in the MARC and LCCN sections above.

They are not the place to send all your ISBNs to — links to a list of books on your site will do the trick.

The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-3

Posted in Just for Writers, and Publishing




MARC – MAchine-Readable Cataloguing

This is a standard for a digital version of the information carried in a CIP or PCIP block — and more.

Here's a MARC record for one of my books, from my local library. Easier to show than to tell.

This is how it looks in the library's internal catalogue. The MARC specification goes on forever with all sorts of optional fields, and once the record gets passed along to Worldcat (see OCLC below), a lot more is added to the “Linked Data” XML record they maintain, with a great deal of additional info.

The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-2

Posted in Just for Writers, and Publishing




CIP / PCIP – Cataloguing-in-Publication / Publisher's Cataloguing-in-Publication

Remember the card catalogues in libraries (if you're old enough)? This is the same sort of information, done as a brief bibliographic block that goes at the bottom of the copyright page of your book. It takes the place of your LCCN statement (because it incorporates the LCCN on the last line) like this:

Library of Congress Cataloging-in-Publication Data

Myers, Karen, 1953-
To carry the horn : the hounds of Annwn : 1 / Karen Myers. — 1st ed.
p. cm.
ISBN 978-0-9635384-0-6
1. Fox hunting–Fiction. 2. Fantasy fiction. I. Title.
PS3613.Y4726T6 2012
813′.6–dc23

2012040017

Even though LCCNs are only for print editions (see above), and the LCCN is included in the CIP block, the same CIP block applies to all formats of the work (Print, EPUB, MOBI, Audio). A PCIP block (see below) can be created for books without LCCNs, including books that have no print edition.

Let's look at some of these cryptic numbers.

The ISBN is the 13-digit version. It has a complex history of its own. A digression: the original ISBN (International Standard Book Number) was 10 digits long. As it began to run out of numbers to accommodate new books, it evolved into a version of the 13-digit EAN (European Article Number), an identifier related to the UPC (Universal Product Code) for any manufactured article in the world. The first 3 digits of the EAN indicate the country and, since books were already considered to be international, the prefixes “978” and “979” were dedicated to books, and consequently referred to, jokingly, as the country of “Bookland”.

The 10-digit ISBN is always called an “ISBN”, but the 13-digit version can be referred to as an “ISBN” or, in a nod to its derivation, as an “EAN”.
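For the curious, the relationship between the two forms can be sketched in a few lines. This is a minimal illustration using the standard EAN-13 check-digit rule (weights alternating 1 and 3); the function names are my own, and the conversion only covers the “978” prefix, since “979” ISBNs have no 10-digit equivalent.

```python
def isbn13_check_digit(first12):
    """Check digit for the first 12 digits of an ISBN-13 / EAN-13."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10):
    """Convert a 10-digit ISBN to its 13-digit "Bookland" form (978 prefix)."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected 10 characters after removing hyphens")
    core = "978" + digits[:9]  # drop the old check digit, add the prefix
    return core + isbn13_check_digit(core)


def is_valid_isbn13(isbn13):
    """True if the string is 13 digits (hyphens ignored) with a correct check digit."""
    digits = isbn13.replace("-", "").replace(" ", "")
    return (
        len(digits) == 13
        and digits.isdigit()
        and isbn13_check_digit(digits[:12]) == digits[12]
    )


# The ISBN from the CIP block above.
print(is_valid_isbn13("978-0-9635384-0-6"))
```

Note that the old 10-digit check digit is thrown away and recomputed, which is why you can't turn one form into the other just by gluing “978” on the front.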

The “PS3613.Y4726T6 2012” is a Library of Congress Call Number (which, despite its initials, is not to be confused with the LCCN above). This book is in “American Literature / Prose fiction”.

The “813′.6–dc23” is a Dewey Decimal Classification. This book is in “American Literature in English”.

The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more

Posted in Just for Writers, and Publishing

This 4-part post covers a lot of the bibliographic data that holds the knowledge part of the book trade together, with a section for each of these.

I've written about ISBNs elsewhere. If you're a member of the #neverISBN or the #oneISBNtoRuleThemAll tribes, then this post is not for you — the ISBN holds it all together. It's a prerequisite for all of this. And remember, the ISBN identifies a single format of your work.

Many of these standards are international, but some of the national library stuff is, well, national. I'm describing the situation in the US, but other countries have similar setups.

The intent of this post is to provide basic orientation for indie authors. To find out more about these standards and the groups that maintain them, break out your search engines and go to work.

Will you sell any more books if you enable these standards for your books? Probably not. But there are other reasons to create and maintain high-quality bibliographic data for your books, not least of which is future-proofing your work and making it just that bit more appetizing for library acquisition.

Ready or not, let's dive right in.




LCCN – Library of Congress Control Number

The Library of Congress (LoC) is the “library of record” for the United States. Check out the link; it's had a long and fascinating history. The LoC created its own cataloguing system, the Library of Congress Classification, and it also assigns each catalogued item an individual identifying number, the LCCN.

LCCNs refer only to print editions. You get one from the Library of Congress by asking for one. It can be a confusing process, mostly because of the nomenclature of the various programs. (Here is a useful guide.) Basically, you sign up for a program (the Pre-Assigned Control Number program, or PCN) that allows you to request an LCCN, which is in the form of YYYYnnnnnn, where “YYYY” is the current year, and “nnnnnn” is a numerical sequence that starts over each year.

Once granted, the LCCN goes on the copyright page of your book like this:

Library of Congress Control Number: 2012040017
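To make the YYYYnnnnnn layout concrete, here is a minimal sketch that splits a control number into its two parts. The name `parse_lccn` is made up for illustration, and the pattern covers only the ten-digit form described above, not any other way an LCCN might be written.

```python
import re

# Four-digit year followed by a six-digit sequence, as described above.
LCCN_PATTERN = re.compile(r"^(\d{4})(\d{6})$")


def parse_lccn(lccn):
    """Split a YYYYnnnnnn control number into (year, serial)."""
    match = LCCN_PATTERN.match(lccn.strip())
    if match is None:
        raise ValueError(f"not in YYYYnnnnnn form: {lccn!r}")
    return int(match.group(1)), int(match.group(2))


year, serial = parse_lccn("2012040017")
print(year, serial)
```

For the number above, that gives the year 2012 and the serial 40017.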

Incidentally, the LoC makes LCCN records for other things. For example, it maintains various “authority” lists such as subjects and names, and other systems can refer to them (such as Worldcat (OCLC) below).

