This article is crossposted to Jaye Manus's website.
Silly me. I’m an old programmer and I pride myself on trying to get my ebooks “just so”, as if I were writing a piece of code. I want to create worthy offerings to add to humanity’s river of books; at the very least, they should be shiny and well-scrubbed.
So when Jaye Manus offered to judge the formatting of a few books from her blog fans, I hopped right on board, and she was very kind in her review. But I read with horror things like “squishy line spacing” and links to chapters not working quite as they should, systematically.
I use an EPUB reader and hadn’t seen the book on a Kindle device other than the PC Previewer, so it was useful to see this from the Kindle reader’s perspective, since none of my buyers had complained (yet). Without a Kindle device, I hadn’t realized quite how irritating it was to not properly trigger the “Cover” and “TOC” hard buttons.
Now on the one hand, it wasn’t really broken, but on the other hand, I want perfection in book formatting, and some cosmetic and graphic flourishes. I’m not willing to settle for “good enough”, so Jaye was nice enough to coach me through some of the issues.
If you’re content with auto-conversion from EPUB to MOBI or vice versa, or output direct to ebook formats from products like Scrivener, then this is overkill for you and you can stop reading now. But if you want as much control as possible over the results without killing yourself, you might find the following approach useful.
Originally I formatted my ebooks in raw HTML using tips from a variety of online recommendations, like Guido Henkel’s. The first book was a real learning curve for me, but after that it wasn’t difficult to just do the same for later books. Automated search/replace macros took care of things like wrapping lines with tags, and so forth.
I took the HTML output, opened it in Calibre, added the cover and converted it to EPUB using Calibre defaults (more or less). I did the same with a separate conversion to MOBI, which required me to maintain a different HTML file because of the way Calibre generates the MOBI TOC. These outputs were what I was uploading to the distributors. The MOBI files produced this way were not ideal, possibly because of AZW vs MOBI choices (in other words, me as a Calibre user, not necessarily Calibre as a tool), and I was left with two files to maintain (EPUB and MOBI) and the Smashwords EPUB as a third file, since my approach wasn’t modular. So every time I found a typo…
Jaye has taught me a better way…
I poured my HTML file for a book, broken up into chapters, into a Sigil template. Each part of my book has a separate file: Beginning Blurb, Title Page, Copyright Page, Also-By-This-Author Page, Small TOC, Chap 1,…, Chap Last, Guide & Name Index, If-You-Liked-This-Book Page, Excerpt-From-Next-Book Page, Author Bio Page, Long TOC.
I have 3 outputs: MOBI, EPUB, Smashwords EPUB. The MOBI version differs from the two EPUBs in two ways: the “stylesheet.css” file is a little different, and the Cover page is treated differently (EPUBs require an extra step). My EPUB and my Smashwords EPUB differ only in the content of the Copyright Page.
Now, thinking in the long term, I expect that the differences between MOBI output and EPUB output are likely to be persistent [UPDATE 2021 — I was wrong, MOBI no longer required by Amazon], and other devices may come along and generate different optimum stylesheet requirements. So I’m fine with having two different (but very similar) stylesheets which I maintain externally and copy in as needed into the stylesheet.css shell.
Likewise, the fact that Smashwords requires its own ISBN means only that I maintain two different external Copyright Page HTML documents and copy the contents of whichever one I want into the shell in Sigil. UPDATE 2016 — Smashwords no longer requires a unique ISBN – you can use the ISBN for your own EPUB file. So you only need two Copyright Page HTML documents: one for MOBI and one for EPUB, with the different ISBNs. UPDATE 2021 — Amazon now accepts (prefers) EPUB files and does its own conversion to MOBI. So now I only need one ISBN for an ebook, and only the EPUB format.
Both of these make use of a simple modular structure.
So, what happens when I finish a new book and want to format it?
PREP
1) I copy the Sigil file (MOBI version UPDATE 2021 — EPUB) from the previous book and rename it.
2) I create Copyright.HTML files for both the normal and Smashwords copyright pages by copying the ones for the last book, renaming them, and updating the content.
3) I create a new Title page (it’s a graphic) and a new Cover.
MAIN CONTENT (MOBI [UPDATE 2021 — EPUB])
4) I work on the MOBI version first (it’s the master) [UPDATE 2021 — Only EPUB is now required]. I copy the text in, chapter by chapter, the front blurb, and the back excerpt. I run saved searches to wrap the lines with <p></p> tags and to convert special characters to named entities.
5) I update the Also By and If You Like This Book pages by hand.
6) I run a saved search to update the Title field on all the HTML pages to the new work, and update the equivalent fields in the TOC.NCX and CONTENT.OPF files.
7) If the book is a little longer or shorter (number of chapters) than the last one, I update the TOC.NCX and CONTENT.OPF files and the HTMLTOC file.
8) I update the metadata in the TOC.NCX and CONTENT.OPF files. This allows me to do some things that Calibre either doesn’t do or that I don’t know how to find, such as set a UUID (Universally Unique Identifier) for my short stories that don’t have ISBNs, embed book descriptions, add keywords, etc. There’s a great tool for this that Jaye told me about:
9) Run the file through Kindle Previewer (which runs Kindlegen) and check the results.
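For anyone who would rather script the saved searches in step (4) than run them inside Sigil, here is a minimal Python sketch of the same idea. It is not the Sigil search itself. The function names are made up for illustration, and it assumes each paragraph sits on its own line in the pasted text.

```python
import html
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters with HTML named entities where one exists."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp > 127 and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(ch)  # ASCII, or no named entity available: keep as-is
    return "".join(out)


def wrap_paragraphs(raw: str) -> str:
    """Wrap every non-blank line in <p></p> tags (hypothetical helper)."""
    wrapped = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between paragraphs
        # Escape &, < and > first, then convert curly quotes, dashes, etc.
        safe = html.escape(line, quote=False)
        wrapped.append(f"<p>{to_named_entities(safe)}</p>")
    return "\n".join(wrapped)


if __name__ == "__main__":
    print(wrap_paragraphs("It\u2019s a start.\n\nSalt & pepper."))
```

Escaping the ampersands before converting the special characters matters: done in the other order, the `&` in every freshly created entity would itself be escaped.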
How much time does this take? I just updated my entire backlist (3 novels, 5 short stories, 1 story collection) to Sigil – it takes me about an hour for a novel, and 20 minutes for a short story. Making the short story collection from the already-formatted short story files was truly trivial.
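For the UUID mentioned in step (8), a small Python sketch can generate one and show the kind of identifier line that goes in CONTENT.OPF. This assumes an EPUB 2 style OPF, where the identifier's `id` (here the assumed value `BookId`) matches the `unique-identifier` attribute on the package element and the same value is repeated in the `dtb:uid` entry of TOC.NCX. The helper names are hypothetical.

```python
import uuid


def new_book_identifier() -> str:
    """Return a random (version 4) UUID in URN form."""
    return f"urn:uuid:{uuid.uuid4()}"


def opf_identifier_element(identifier: str, element_id: str = "BookId") -> str:
    """Format the identifier as an EPUB 2 style dc:identifier line.

    element_id is an assumption; use whatever your own OPF's
    unique-identifier attribute points at.
    """
    return (
        f'<dc:identifier id="{element_id}" opf:scheme="UUID">'
        f"{identifier}</dc:identifier>"
    )


if __name__ == "__main__":
    print(opf_identifier_element(new_book_identifier()))
```

Generate the identifier once per title and keep it. A reissued file with a new UUID looks like a different book to some reading systems.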
EPUB VARIANTS
10) Substitute the content of the EPUB stylesheet into the stylesheet.css.
11) Run the Sigil tool to insert the cover. (Kindlegen does that a different way for MOBI) [UPDATE 2021 — Only EPUB is now needed by Amazon].
12) For the Smashwords variant, substitute the contents of the Smashwords copyright.html for the default one.
That’s it. I import the files into Calibre for one last look to make sure they seem healthy, and do a quick scroll through on my EPUB reader. If I find typos, I fix them in the MOBI version (and the Scrivener original) and redo steps (10)-(12).
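Because an EPUB is a zip archive, the substitutions in steps (10) and (12) could also be scripted rather than pasted by hand. The following Python sketch copies a master file and swaps in replacement content for named entries. The function name is made up, and the internal paths in the usage note are assumptions about how the Sigil file is laid out, so check them against your own file.

```python
import zipfile


def build_variant(master_path: str, out_path: str, replacements: dict) -> None:
    """Copy an EPUB, replacing the entries named in `replacements`.

    `replacements` maps an internal path to the new bytes for that entry.
    """
    with zipfile.ZipFile(master_path) as src:
        missing = set(replacements) - set(src.namelist())
        if missing:
            raise KeyError(f"not in master EPUB: {sorted(missing)}")
        with zipfile.ZipFile(out_path, "w") as dst:
            # infolist() preserves the original order, so "mimetype"
            # stays first if it was first in the master file.
            for info in src.infolist():
                data = replacements.get(info.filename)
                if data is None:
                    data = src.read(info.filename)
                # The mimetype entry must be stored uncompressed.
                if info.filename == "mimetype":
                    method = zipfile.ZIP_STORED
                else:
                    method = zipfile.ZIP_DEFLATED
                dst.writestr(info.filename, data, compress_type=method)
```

A Smashwords variant might then be built with something like `build_variant("book.epub", "book-smashwords.epub", {"OEBPS/Text/copyright.xhtml": new_page_bytes})`, where the file names and the internal path are placeholders. Fixing a typo in the master and rerunning the script regenerates every variant.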
Why not use Calibre? I am confused by the various options and clearly, for MOBI conversion, I wasn’t doing it quite right. Also, my original HTML file was one big file with a stylesheet and all chapters together, making modular changes clumsy. Calibre created its own version of the styles it found, and they weren’t always what I expected. It’s a big black box to me, and there were some issues with the results, which may be my fault, not Calibre’s.
Why not use Scrivener? As with Calibre, you are at the mercy of however Scrivener chooses to carry out each conversion. Since the Scrivener text isn’t in HTML, there are all the issues of named entity conversions to deal with, and you have little control over the default styles. The results may be clean, but you can’t do anything special, such as use graphic chapter heads, scene dividers, and so forth, at least not in the Windows version. Perhaps there’s a way…
Other tools, like Scrivener, will take your word processor input and generate EPUB and MOBI output, but the black box in between what you write and what they produce leaves you at the mercy of the limitations of others, and so your output will remain at best functional vanilla. That’s not a bad thing, but we can do better.
It’s really not that hard to go through the learning curve once. After that, each new book becomes quite easy. Your book designers or people like Jaye can help you get started by setting up the first one and explaining how it works.
UPDATED (April, 2016)
My processes are rather simplified from this now (no Calibre) and external things have changed (Smashwords allows your ordinary EPUB ISBN — you don't need a separate one).
In brief:
- Start w/Sigil file from last book (MOBI variant UPDATE 2021 — EPUB). Fill with new content. Update the Title statement on each content page. Replace special characters with HTML named entities. (Sigil's Saved Searches will help with this).
- Adjust TOC.NCX and CONTENT.OPF files to match the number of chapters (STRUCTURE) and the correct METADATA.
- Update the Cover, Title page, Copyright page, If You Like page, Other Books By page. Update separator images.
- Save, and create EPUB variant (different stylesheet, modification in Cover treatment).
- EPUB version is done. Copy, rename, save. Use for EPUB retailers.
- Use Kindlegen to create MOBI file from MOBI variant and test for Kindle button behavior. Personal MOBI version is done. Copy, rename, save. Use only for direct distribution (sale from publisher/author site).
- Use MOBI variant (which is an epub filetype) for MOBI retailer (Amazon). Their conversion into MOBI is more robust for some applications.
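When the chapter count changes, the TOC.NCX adjustment in the list above is mostly repetitive, so it could be generated. This Python sketch builds the chapter entries for an NCX navMap. The chapter file names (`Text/chap01.xhtml` and so on) and the "Chapter N" labels are assumptions, not the names used in the actual template, and front and back matter entries would still be added by hand.

```python
import xml.etree.ElementTree as ET


def build_nav_map(chapter_count: int, first_play_order: int = 1) -> ET.Element:
    """Build a navMap element with one navPoint per chapter."""
    nav_map = ET.Element("navMap")
    for n in range(1, chapter_count + 1):
        order = first_play_order + n - 1
        point = ET.SubElement(
            nav_map,
            "navPoint",
            {"id": f"navPoint-{order}", "playOrder": str(order)},
        )
        label = ET.SubElement(point, "navLabel")
        ET.SubElement(label, "text").text = f"Chapter {n}"
        # Hypothetical file naming scheme; match it to your own files.
        ET.SubElement(point, "content", {"src": f"Text/chap{n:02d}.xhtml"})
    return nav_map


if __name__ == "__main__":
    print(ET.tostring(build_nav_map(3), encoding="unicode"))
```

The `first_play_order` parameter is there so the chapters can be numbered after any front matter entries that precede them in the table of contents.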
Unfortunately, Sigil is outputting EPUB files for me with a bunch of gobbledygook which, as far as I can tell, makes no difference to the actual formatting, but throws their system into a tizzy and disallows you from distributing the book. 😛 So it’s back to the drawing board for me. (And I’m not doing novels; I’m doing non-fiction with essential illustrations, so it’s a real headache if/when I have to strip out all the formatting and start from nothing.)
If you or any of your readers have any suggestions for how to get this stuff stripped out without starting from scratch, I’d love to hear them.