
Category: Publishing

Book Metadata, ONIX, and Digital Asset Management (DAM) systems

Posted in Distribution, and Just for Writers

Book Metadata

Maintaining the metadata for your books can be quite a chore. Joel Friedlander's overview covers some of the basics.

It takes me two pages of my multi-page management spreadsheet to hold the simple columnar data (ISBNs, Library of Congress (LoC) numbers, publication dates, page counts, etc.) and textual data (blurbs in various lengths, keywords, BISAC codes, etc.). Some of that data is static, but parts of it are changeable based on marketing experiments, temporary sales, and so forth.

When we go to a distributor site that caters to indies, we are usually presented with a form to fill out for each book. Since I strive for consistency, I always have to open up the form for a book I've already posted to that distributor, to make sure I fill out all the questions the same way across all my titles.

How do the big boys handle this?

Well, that's a question, isn't it? It's hard to get the details. The big traditional publishers have complex internal needs (moving a title from acquisition through edit, formatting, marketing, and publication) involving different departments and requirements. Their management systems are crafted with that in mind, to give them a single shareable record for each title that holds both private in-house data and public data intended for their trading partners (distributors, retailers, etc.).

The output of their systems these days tends to be in the form of ONIX records (see below) — this much I know. But exactly how they share those records with their partners is obscure to me. (Alas, trad-published authors who've added indie-publishing share insights with us, but we don't meet a lot of back-office technical types from the traditional publishing firms.) You can see how vague the specific details are for an overview on metadata maintenance that's meant to be helpful, or for a discussion about refreshing metadata as things change.

The big traditional publishers have much more complicated problems. As indies or micro-publishers, we get to choose what we want to deal with, and some of the industry tools are available for us to use, if we think it's important enough.


Checking up on book distributors

Posted in Distribution, and Just for Writers

So, you've gone wide and international with your ebook distribution, and your print edition is in Ingram's database, making it available to a fair chunk of the world's bookstores, both physical and online. Your dashboards that list your titles with your various distributors all look fine and dandy. You've given them your books, and they're making sure they're getting into the world's bookstores.

Time to sit back, proud of your books' availability in online stores all over the world, right?

If only it were that simple.

How are my distributors doing?

It’s not easy to figure that out.

I've been trying to sort out my various distribution options recently as I retire a couple of distributors and take on new ones. It's a confusing area, and the lists you can get of their channel partners are not always current or complete. I was focused on who had the best reach, or reached unique retailers, with reasonable returns and the ability to turn channels on and off to avoid duplication.

I get to retailers in a variety of ways.

Ebooks

  • Direct from my website (ecommerce). Gumroad (in several formats).
  • Direct upload. Amazon, Kobo, Barnes & Noble. If I could (no Mac), this would include Apple iBooks.
  • Hybrid (storefront & distribution). Smashwords.
  • Distributor. PublishDrive, Streetlib (coming soon).

Of these, I use PublishDrive to reach every channel (including Apple iBooks and Google Play) that I don't go to directly. I restrict Smashwords to its storefront and its unique partners only. PublishDrive, Streetlib, and Smashwords all let you select or disable individual partner channels to avoid overlap.

Already there are complications — Kobo is also a distributor, so though I go there directly, it distributes my titles to its own partners. There is no ability to pick and choose among Kobo's partners, so it's up to me to avoid enabling one of Kobo's partners at one of my other distributors. (Perhaps that can be controlled at the manual level, via email requests to Kobo, but I prefer something more automated and reliable.)
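One way to keep an eye on that kind of overlap is to list which retailers each route reaches and flag any retailer that shows up more than once. Here is a minimal sketch in Python. The function name and the sample lists are illustrative assumptions, and "Example Kobo Partner" is a made-up placeholder, since the real partner lists change and are not always complete.

```python
from collections import defaultdict

def find_overlaps(routes):
    """Return {retailer: [routes]} for every retailer reached by more than one route."""
    reached_by = defaultdict(list)
    for route, retailers in routes.items():
        for retailer in retailers:
            reached_by[retailer].append(route)
    return {r: sorted(via) for r, via in reached_by.items() if len(via) > 1}

# Hypothetical sample data: "Example Kobo Partner" is a placeholder, not a real store.
routes = {
    "Direct upload": {"Amazon", "Kobo", "Barnes & Noble"},
    "Kobo (as distributor)": {"Example Kobo Partner"},
    "PublishDrive": {"Apple iBooks", "Google Play", "Example Kobo Partner"},
    "Smashwords": {"Smashwords storefront"},
}

overlaps = find_overlaps(routes)
for retailer, via in overlaps.items():
    print(f"{retailer}: reached via {', '.join(via)}")
```

Any retailer that prints here is a candidate for disabling at one of the distributors that allows per-channel selection.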

Other complications — I have to manually request special retail pricing for Google Play, to keep its automated discounting from creating a problem with Amazon. Hard to find distributors that will let you set per-channel pricing, but I think that must be essential to adjust pricing in different parts of the world (like India).

Audio

  • Distributor. AuthorsRepublic.
  • Direct upload. CD Baby (coming soon).

Print

  • Hybrid (storefront & limited distribution). Createspace. (Not expanded distribution.)
  • Distributor. Ingram LSI.

Createspace only distributes to Amazon so there are no channels to disable to avoid duplication. Ingram can't provide a list of print partners — much too broad, and much of its reach is through intermediate distributors or aggregators. No telling where your books will end up at online retailers.

Where are your books, really?

I began by taking the lists of known channel partners from PublishDrive, Smashwords, Kobo, and AuthorsRepublic. I then went to each of those sites and tried to find my books there. That alone was an eye-opener.


What happens to my metadata when it leaves the house?

Posted in Distribution, Just for Writers, and Publishing

That's the title of an excellent if brief essay by Laura Dawson of Numerical Gurus. Her site is an excellent resource for the explanations and history of some of the acronyms that haunt the world of books.

Since I seem to be on a kick lately with what metadata exists and how it sloshes around through the book ecosystem, I thought we could all benefit.

How many of those girls are properly dressed (um, properly formatted data)? And how can you keep them clean, out there in the big ol' world? Where there are boys, and parties, and fast cars, and lots of dark alleys to wander into.

We've all seen it. We spend time perfecting the metadata in our feeds, send it out to our trading partners, and had to take complaints from agents, authors, and editors. “Why is it like that on Amazon?”

The truth is, data ingestion happens on whatever schedule a given organization has decided to adhere to. Proprietary data gets added. Not all the data you send gets used. Data points get mapped. So what appears on any trading partner's system may well differ somewhat from what you’ve sent out. There are so many different players in the metadata arena that can affect what a book record looks like. When you send your information to Bowker, they add proprietary categories, massage author and series names, add their own descriptions, append reviews from sources they license – and send out THAT information to retailers and libraries. The same thing happens at Ingram, at Baker & Taylor – so what appears on a book product page is a mishmash of data from a wide variety of sources, not just you.


What does your book look like to booksellers?

Posted in Distribution, Just for Writers, and Publishing

Print on Demand (POD) (versus short-run print jobs) is the typical method used initially by indie authors, and the two big providers are Createspace (owned by Amazon) and Ingram, either via Ingram LSI (Lightning Source) or IngramSpark.

The relative merits of Createspace and Ingram are a common discussion topic among indie authors who produce paperback editions. This post is an update of this analysis and focuses on what your books look like to booksellers placing orders with Ingram.

The recommended practice these days is to use both vendors for print, if you can: Createspace (without their expanded distribution option) for Amazon, for inexpensive orders for inventory, for an online store, and for direct shipping; and Ingram for everything else. (The recent news about availability of print from Amazon KDP seems to signal that access to Createspace directly might change.)

Some authors create a separate Library edition, just to use that part of Createspace's expanded distribution with a Createspace ISBN.

If you don't want to go to the bother (and the expense) of getting your own ISBN (a whole separate discussion), Createspace will supply you with an ISBN owned by Createspace for you to use, for free. (If you have an ISBN, you can use your own.)

Since the common progression for indies seems to be to start with Createspace only, and Createspace has an expanded distribution option and a free ISBN that gets your books into Ingram (“Booksellers and Online Retailers”), the question often comes up: why bother going to Ingram directly?

Why go to Ingram directly, in addition to Createspace?

The manufactured products are slightly different (quality issues with a small and debatable preference given to Ingram), and unlike Createspace, the Ingram edition costs money: a title setup fee (circa $49), an annual market fee (to stay listed in Ingram's database) ($12), and a revision fee for any change in cover or content ($40 each). The fee details vary a bit between Ingram LSI (mostly for traditional publishers) and IngramSpark (mostly for indies) and coupons/discounts are not infrequently available.

And you need your own ISBN, a not-inconsiderable expense in the US.

But there are other concerns.

Bookseller-specific issues

1) Discounts

A bookstore with good credit and broad needs may use Ingram as its main supplier. Other bookstores use smaller, more targeted suppliers who get their list of offerings from Ingram (and charge a fee).

Ingram allows you to set the same standard discount that traditional publishers use: 55%. Createspace's maximum is 40%.

Here's what that means. Ingram takes 15% of that discount for its services. It subtracts that from the books you list directly with Ingram, but it also subtracts that from the books it lists that were given to it by Createspace via expanded distribution.

So at 55% (Ingram's standard), minus Ingram's 15%, there remains 40%. Some of that may go to an intermediate distributor. Whatever's left over is the bookseller's potential profit, which he may discount to push sales.

At 40% (Createspace's max), once you subtract Ingram's fee of 15%, all that's left is 25% for the intermediate distributor(s) and bookseller to share. That is unattractive to many booksellers. Some won't even order books to fulfill customer requests at that small a profit to themselves.
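The arithmetic in the two cases above can be written out as a tiny Python sketch. The percentages are the ones quoted in this post. The $15.00 list price (held in cents) is an assumed figure for illustration only, and the function names are my own.

```python
INGRAM_SHARE_PCT = 15  # Ingram's cut, in percentage points of list price (figure from the post)

def left_for_trade(wholesale_discount_pct, ingram_share_pct=INGRAM_SHARE_PCT):
    """Percentage of list price left for intermediate distributors and the bookseller."""
    return wholesale_discount_pct - ingram_share_pct

def cents_left(list_price_cents, wholesale_discount_pct):
    """The same margin expressed in cents for a given list price."""
    return list_price_cents * left_for_trade(wholesale_discount_pct) // 100

LIST_PRICE_CENTS = 1500  # assumed $15.00 paperback, purely illustrative

for label, discount in [("Ingram direct", 55), ("Createspace expanded distribution", 40)]:
    pct = left_for_trade(discount)
    dollars = cents_left(LIST_PRICE_CENTS, discount) / 100
    print(f"{label}: {discount}% - {INGRAM_SHARE_PCT}% = {pct}% (${dollars:.2f} per copy)")
```

On that assumed list price, the gap between the two routes is the difference between $6.00 and $3.75 per copy to be shared by everyone downstream of Ingram.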

2) Free ISBN / Publisher name

Createspace offers its own ISBN, if you don't have or want to use one of your own.

The general rule is: whoever owns the ISBN is the Publisher, from the perspective of forms and databases. Some recent forms have separated the two things, so that the owner of the ISBN and the “Publisher of record” and the provider of the data feed on a form can be different things, but that is far from general.

For a while, the Createspace data downstream showed Publisher = Createspace, even if you used your own ISBN. Now that only happens if you use a Createspace ISBN. And even so, what shows on a form depends on what the form uses for data sources: if it shows ISBN and looks up the ISBN, then it would get the ISBN owner. If it simply assumes the supplier of the data feed (Createspace) is the owner, then it shows Createspace as the publisher. That's how audiobooks often show up as Publisher = AudiobookFeedProvider even when the ISBN belongs to the Publisher.
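The two behaviours described above can be sketched in a few lines of Python. This is only a guess at the logic, not how any particular retailer's form actually works: the function, its parameters, and the lookup table are all hypothetical, and the table is a stand-in for whatever ISBN registry a real system might consult.

```python
def displayed_publisher(isbn, feed_provider, isbn_owners, looks_up_isbn):
    """Return the name a form would show as Publisher.

    looks_up_isbn=True  -> the form resolves the ISBN to its registered owner.
    looks_up_isbn=False -> the form assumes the feed provider is the publisher.
    """
    if looks_up_isbn and isbn in isbn_owners:
        return isbn_owners[isbn]
    return feed_provider

# Hypothetical lookup table standing in for an ISBN registry.
isbn_owners = {"9780963538406": "Perkunas Press"}

with_lookup = displayed_publisher("9780963538406", "Createspace", isbn_owners, True)
without_lookup = displayed_publisher("9780963538406", "Createspace", isbn_owners, False)
print(with_lookup)
print(without_lookup)
```

The same book and the same feed give two different publisher names, depending only on which data source the form trusts.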

How booksellers see a traditionally published book when they order from Ingram


Looking for a tune

Posted in Audiobook, Tales of Annwn, and Under the Bough

That's usually a topic for my fiddling website, but not this time.

This year I'm planning to do several of my audio books. That includes the stories from Tales of Annwn, and one of those (Under the Bough) includes a song.

I better come up with a tune for it. Oops.

It's a rollicking drunks-at-the-wedding sort of ditty. If any readers would care to make suggestions, I'll be glad to consider them before rolling my own, and give you a credit in the audiobook. Welsh or general Celtic styling is what I have in mind.


What did she see in him?
Who could explain?
Another full glass,
And we’ll not mind the pain.
Pain, no pain,
Again and again,
Another full glass,
And we’ll not mind the pain.

Over and under him,
Country or town,
Give us one more
And we’ll drink it right down.
Down, down,
Away with her gown.
Give us one more
And we’ll drink it right down.

Lift up your glasses,
And do what is right.
Wish them the best,
Of both day and of night.
Night, night,
An inspiring sight,
Wish them the best,
Of both day and of night.


The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-4

Posted in Just for Writers, and Publishing


Index of topics

 


ISNI – International Standard Name Identifier

This is one of the international standards for disambiguating identical names for different individuals. You can contact one of the regional standards bodies and request an ISNI (free). It's intended for all sorts of creatives — writers, musicians, artists, etc. — as well as anyone else in the public eye who needs to be uniquely identifiable, including scientists, living and dead. Einstein has an ISNI number.

You can see at the bottom links to VIAF and the LC (Library of Congress LCCN-N) authority records mentioned in the MARC and LCCN sections above.

They are not the place to send all your ISBNs to — links to a list of books on your site will do the trick.
 


ISTC – International Standard Text Code

This is one of the international standards for aggregating all manifestations of a text under a single identifier. For example, I have 4 formats for my book To Carry the Horn — trade paperback, epub, mobi, and streaming audio. That's one text, 4 manifestations. For some sorts of sales analysis, it's useful to be able to lump those things together. That's what the ISTC is used for. You can contact the standards body and request an ISTC for each of your titles (free).

In the MARC section above, you saw that Worldcat had its own version of the same concept.
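To make the "lump those things together" idea concrete, here is a small Python sketch that totals unit sales per work across formats. The work ids are placeholders standing in for a work-level identifier such as an ISTC, and every sales figure is invented for the example.

```python
from collections import defaultdict

# (work_id, format, units_sold)
# work_id is a placeholder for a work-level identifier; all numbers are invented.
sales_rows = [
    ("WORK-1", "trade paperback", 3),
    ("WORK-1", "epub", 10),
    ("WORK-1", "mobi", 7),
    ("WORK-1", "streaming audio", 2),
    ("WORK-2", "epub", 4),
]

def units_by_work(rows):
    """Sum units sold per work, ignoring which format each sale was in."""
    totals = defaultdict(int)
    for work_id, _format, units in rows:
        totals[work_id] += units
    return dict(totals)

totals = units_by_work(sales_rows)
for work_id, units in totals.items():
    print(f"{work_id}: {units} units across all formats")
```

Without a shared work-level key, the four formats of one title would look like four unrelated products in this kind of report.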
 


OCLC – Online Computer Library Center

OCLC is an organization whose member libraries cooperatively build and maintain the Worldcat database of library records.

Much of what they carry is described in the MARC section above. Zero in on a typical record and expand all the sections to see the sort of data that they focus on.

Not only do they maintain the bibliographic data about the works in their system and the libraries that hold them but they also maintain or cross-reference to a variety of authority files.

Worldcat even maintains its own summaries by person, using the Library of Congress LCCN-N identifier from the LCCN section above. Handy for seeing a summary of libraries (without the details, alas).


Hope this has been useful for setting a context for some of the alphabet soup you may have encountered.




The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-3

Posted in Just for Writers, and Publishing




MARC – MAchine-Readable Cataloguing

This is a standard for a digital version of the information carried in a CIP or PCIP block — and more.

Here's a MARC record for one of my books, from my local library. Easier to show than to tell.

This is how it looks in the library's internal catalogue. The MARC specification goes on forever with all sorts of optional fields, and when the record gets passed along to Worldcat (see OCLC below) a lot more has been added to the “Linked Data” XML record they maintain, with a great deal of additional info.

It's easier to show you most of this data in XML format, for the book mentioned above, than to attempt to explain the details. This comes from the Linked Detail section at the bottom of a book record in Worldcat. Here the data has been amplified and cross-referenced, as part of a general internet initiative to connect isolated sets of domain expertise into larger supersets.

Primary Entity

<http://www.worldcat.org/oclc/814529418> # To carry the horn : the hounds of Annwn : 1
a schema:Book, schema:CreativeWork ;
library:oclcnum "814529418" ;
library:placeOfPublication <http://id.loc.gov/vocabulary/countries/vau> ;
library:placeOfPublication <http://experiment.worldcat.org/entity/work/data/1413542221#Place/hume_va> ; # Hume, Va.
schema:about <http://id.worldcat.org/fast/933467> ; # Fox hunting
schema:about <http://dewey.info/class/813.6/e23/> ;
schema:about <http://experiment.worldcat.org/entity/work/data/1413542221#Topic/fox_hunting> ; # Fox hunting
schema:bookEdition "1st ed." ;
schema:bookFormat bgn:PrintBook ;
schema:creator <http://viaf.org/viaf/56058221> ; # Karen Myers
schema:datePublished "2012" ;
schema:exampleOfWork <http://worldcat.org/entity/work/id/1413542221> ;
schema:genre "Fiction"@en ;
schema:genre "Fantasy fiction"@en ;
schema:inLanguage "en" ;
schema:isPartOf <http://experiment.worldcat.org/entity/work/data/1413542221#Series/hounds_of_annwn> ; # Hounds of Annwn ;
schema:name "To carry the horn : the hounds of Annwn : 1"@en ;
schema:productID "814529418" ;
schema:publication <http://www.worldcat.org/title/-/oclc/814529418#PublicationEvent/hume_va_perkunas_press_2012> ;
schema:publisher <http://experiment.worldcat.org/entity/work/data/1413542221#Agent/perkunas_press> ; # Perkunas Press
schema:workExample <http://worldcat.org/isbn/9780963538406> ;
wdrs:describedby <http://www.worldcat.org/title/-/oclc/814529418> ;
.

Who uses MARC records and associated XML records? Libraries do. Anyone can use a MARC record (I suspect that Ingram does for its Ipage product above, for example), but it's primarily for libraries.

When a library receives a book, it typically looks on a repository like Worldcat (see OCLC below) to see if some other library has already created a MARC record for that book in that format (the MARC record is specific to print, ebook, audiobook, etc.). If it finds such a record, it may duplicate it or modify it for its own intra-library catalogue.

If the MARC record isn't there already, the library can create one and populate the repository with it so that other libraries later on can benefit.

In theory, every trained librarian can create a MARC record. In practice… I recently donated books to 3 branches of my local library system only to discover that all 3 branches created separate and contradictory records, with errors and typos, until my 9 donated books ended up with 19 records (and two different versions of the author) in their internal regional library system. That was bad enough, but then those poor records began to make their way into the Worldcat repository. Only those same libraries can correct the records (and they're working on it now that I've explained the ramifications).

We can't make the records — we're not authorized. We can't add data to Worldcat — only libraries and a few others can.

As for the CIP / PCIP block above, there are 3rd parties who make MARC records for publishers and place them on Worldcat, FiveRainbows being one. For a fee.

So, on the one hand, libraries will do it for you for free (Overdrive, for example, is good for this), but on the other hand you're stuck with whatever errors they create. If you want to control the process and create the MARC record proactively via a 3rd party who can also produce the PCIP block, then you have that option.

There's interesting information in the XML record.

  • The official Worldcat (OCLC) number: 814529418.
  • The VIAF number for the author: 56058221. VIAF (Virtual International Authority File) is a way of uniquely identifying an author or other contributor. It is different from ISNI (see below), but you can see that my ISNI number is included as a cross-reference.
  • The Worldcat Entity id: 1413542221 which seeks to identify the common work with its multiple formats, similar to how the ISTC works (see below). It also includes a “creator” person id which is not the VIAF id, though if you scroll to the bottom of that link, you will find matching cross-reference links to the VIAF number and the ISNI, as well as to the Library of Congress name authority from the LCCN section above.
  • Lots of other useful tidbits.
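The identifiers listed above can be pulled out of the record mechanically. Here is a rough Python sketch using regular expressions on a few lines excerpted from the record, typed with plain double quotes. The pattern names and the `extract_ids` helper are my own, and a real linked-data parser would be the more robust choice for anything beyond a quick look.

```python
import re

# A few lines excerpted from the record shown earlier (plain quotes assumed).
record = '''<http://www.worldcat.org/oclc/814529418>
library:oclcnum "814529418" ;
schema:creator <http://viaf.org/viaf/56058221> ;
schema:exampleOfWork <http://worldcat.org/entity/work/id/1413542221> ;
schema:workExample <http://worldcat.org/isbn/9780963538406> ;
'''

# Illustrative patterns, one per identifier of interest.
PATTERNS = {
    "oclc": r'library:oclcnum\s+"(\d+)"',
    "viaf": r"<http://viaf\.org/viaf/(\d+)>",
    "work": r"<http://worldcat\.org/entity/work/id/(\d+)>",
    "isbn": r"<http://worldcat\.org/isbn/(\d+)>",
}

def extract_ids(text):
    """Return the first match for each pattern that appears in the text."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(1)
    return found

ids = extract_ids(record)
for name, value in ids.items():
    print(f"{name}: {value}")
```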

To see more of the XML record in context, go here, and scroll down, and click on Linked Data. You'll find it educational.

Part of what you see with all the cross-referencing is that there are lots of standards kicking around — some international, some parochial, and some experimental. Libraries have been handling bibliographic information for quite a long time.

I don't know how the cross-referencing is done in detail — for example, how did they look up my ISNI number which is not in the MARC record or the CIP block? I've read of automated cross-referencing programs and lots of backoffice attention to the exceptions those programs kick out, but there must be quite a bit of dirty data scattered among the automatable clean records.




The world of deep metadata for your books: LCCN, PCIP, MARC, ISNI, ISTC, OCLC, and more-2

Posted in Just for Writers, and Publishing



 


CIP / PCIP – Cataloguing-in-Publication / Publisher's Cataloguing-in-Publication

Remember the card catalogues in libraries (if you're old enough)? This is the same sort of information, done as a brief bibliographic block that goes at the bottom of the copyright page of your book. It takes the place of your LCCN statement (because it incorporates the LCCN on the last line) like this:

Library of Congress Cataloging-in-Publication Data

Myers, Karen, 1953-
To carry the horn : the hounds of Annwn : 1 / Karen Myers. — 1st ed.
p. cm.
ISBN 978-0-9635384-0-6
1. Fox hunting–Fiction. 2. Fantasy fiction. I. Title.
PS3613.Y4726T6 2012
813′.6–dc23

2012040017

Even though LCCNs are only for print editions (see above), and the LCCN is included in the CIP block, the same CIP block applies to all formats of the work (Print, EPUB, MOBI, Audio). A PCIP block (see below) can be created for books without LCCNs, including books that have no print edition.

Let's look at some of these cryptic numbers.

The ISBN is the 13-digit version. It has a complex history of its own. A digression: the original ISBN (International Standard Book Number) was 10 digits long. As it began to run out of numbers to accommodate new books, it evolved into a version of the 13-digit EAN (European Article Number), an identifier related to the UPC (Universal Product Code) for any manufactured article in the world. The first 3 digits of the EAN indicate the country and, since books were already considered to be international, the prefixes “978” and “979” were dedicated to books, and consequently referred to, jokingly, as the country of “Bookland”.

The 10-digit ISBN is always called an “ISBN”, but the 13-digit version can be referred to as an “ISBN” or, in a nod to its derivation, as an “EAN”.
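For the curious, the relationship between the two forms is mechanical: a 10-digit ISBN becomes a 13-digit one by putting "978" in front of its first nine digits and recomputing the final check digit with the EAN rule (weights alternating 1 and 3). A minimal Python sketch follows, with function names of my own choosing. The 10-digit form used in the example is worked backwards from the ISBN in the CIP block above rather than taken from the book.

```python
def isbn13_check_digit(first12):
    """EAN-13 check digit: weights alternate 1, 3 over the first 12 digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)

def isbn10_to_isbn13(isbn10):
    """Convert a 10-digit ISBN (hyphens allowed) to its 13-digit 978 form."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + digits[:9]  # the old ISBN-10 check digit is discarded
    return core + isbn13_check_digit(core)

def is_valid_isbn13(isbn13):
    """True if the string has 13 digits and a correct check digit."""
    digits = isbn13.replace("-", "").replace(" ", "")
    return (
        len(digits) == 13
        and digits.isdigit()
        and isbn13_check_digit(digits[:12]) == digits[12]
    )

converted = isbn10_to_isbn13("0-9635384-0-3")
print(converted)
print(is_valid_isbn13("978-0-9635384-0-6"))
```

Note that the last digit changes in the conversion, so simply prefixing "978" to an old ISBN does not give a valid 13-digit number.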

The “PS3613.Y4726T6 2012” is a Library of Congress Call Number (which, despite its initials, is not to be confused with the LCCN above). This book is in “American Literature / Prose fiction”.

The “813′.6–dc23” is a Dewey Decimal Classification. This book is in “American Literature in English”.

This information comes into use in unusual places. For example, the page that booksellers see when they look at Ingram Ipage to place an order shows this detail:

Notice that the LCCN in the above example ends in 40017, for a book published in October of 2012 — in other words, in that year there were probably about 60000-70000 books for which publishers requested an LCCN and then a CIP block from the Library of Congress.
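Reading the number that way amounts to splitting it into a year and a running serial. A small Python sketch of that split, assuming the current 10-digit layout (four-digit year followed by a six-digit serial number); the function name is mine, and older or hyphenated LCCN forms are deliberately not handled:

```python
def parse_lccn(lccn):
    """Split a 10-digit LCCN into (year, serial number)."""
    digits = lccn.strip()
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError("expected a 10-digit LCCN such as 2012040017")
    return int(digits[:4]), int(digits[4:])

year, serial = parse_lccn("2012040017")
print(f"year {year}, serial {serial}")
```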

Alas, the LoC only goes to the effort of creating the CIP block for about 50000 books per year. They have a semi-automated process, but it still requires human intervention, and they've taken steps to deal with the onslaught of volume since 2012.

Indies need not apply. That's right — if you're an imprint with only one author (specifically, that has published fewer than three books by an author other than yourself), the LoC assumes your books are not likely to be in sufficient demand that it is worthwhile to create a CIP block in advance of a request. They can always create a CIP block later, if enough libraries ask for one.

So while I received a CIP block for my first book, above, the LoC declined to do the same for my next book, two months later.

What can you do?

There are a handful of independent 3rd parties who create CIP blocks for publishers. The result is referred to as a PCIP (a Publisher's version of the CIP) to distinguish it from the sanctioned LoC CIP. Companies such as FiveRainbows, Donahue Group, Quality Books, and others will prepare a PCIP block for your books, for a fee. Some of them also create MARC records (see below).

There is more useful information from Joel Friedlander here, especially as the discussion is continued in the comments on his post. This comment from Lisa Shiel of FiveRainbows is especially helpful for sorting through some of the CIP and LCCN-related acronyms.

