For centuries, printed books have included indexes to give readers easy access to particular aspects of their contents. But since the beginning of the digital age, many content providers have considered indexes extraneous, even though indexes offer users a higher degree of success than search functions in finding information; at least one study has shown that finding content takes less time via indexes than via searching.
Indexes began vanishing with the advent of full-text search on mainframes once space was available to store the whole content of journals and books. CD-ROMs and web sites also originally elected not to include indexes. Users pushed back, wanting indexes with each new technology, but the pattern is now repeating itself with e-books.
Three reasons are often cited to explain the absence of indexes in e-books: (1) search replaces indexes; (2) there are no pages in e-books; and (3) making indexes functional in an e-book costs money. But let’s look at the facts:
Search Can’t Replace Indexes
Admittedly, search technology has improved over the years, but natural language still befuddles search engines because they look for patterns of words and grammatical clues, not the meanings behind the written text, to present a set of possible results. Searchers then have to click on each result to see if it’s useful.
Indexes, on the other hand, are written by information specialists who analyze content and develop a list of terms representing its topics, their breakdowns, and their relationships. Indexes are crystallizations of access paths to the content’s elements.
The index terms are really a kind of metacontent. They disambiguate homographs, relate synonyms and similar topics, consolidate terminology in multi-authored works, ignore passing mentions, include references to images, guide users to preferred terms, indicate the span of coverage, break large topics into subtopics, and more. Often, they make content accessible to casual readers as well as experts in a field by providing alternate terminology.
Page Numbers Aren’t Necessary
It is true that many e-books don’t display page numbers. However, page breaks are often encoded in e-books, so they can be destination points for links from an index. Also, some publications use section numbers or other types of locators that can be cited in an index. These make suitable candidates for hyperlinking as well.
Costs Are Low
Publishers often pay hundreds or thousands of dollars to have indexes created for their physical books, yet they seem ready to toss this investment aside when creating an e-book. Partly this may be due to the lack of tools for converting legacy titles. However, most major conversion houses can create a linked, active index during conversion, and new books produced by single-source publishing can easily yield indexes for both print and electronic publications, possibly with the index locators linking to paragraphs rather than page breaks.
The tools will only improve over time. Compared to the other costs of developing a book, the cost for including an index in an e-book edition is quite small.
Current and Future E-book Index Options
Today’s two major e-book formats are Kindle (MOBI/KF8) and EPUB2. For new titles, Adobe InDesign (Creative Cloud) produces an EPUB2 or EPUB3 file with a linked index if the index is embedded in the book file. Other, less common workflows exist as well. For legacy books, when conversion houses work from PDF files and a PDF contains an index, the index can and should be included in the e-book. Many conversion houses can use a script to link page numbers from the index, and cross-references may also be linked within the index.
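As a rough illustration of what such a linking script does, here is a minimal Python sketch. It assumes the conversion has already recorded where each printed page begins; the `PAGE_TARGETS` mapping, the file names, and the anchor ids are hypothetical, and real conversion tools will differ. The pattern treats any standalone number as a locator, so a production script would first separate the heading from its locators.

```python
import re
from html import escape

# Hypothetical mapping from printed page number to the content file and
# anchor id where that page break was encoded during conversion.
PAGE_TARGETS = {
    12: "chapter01.xhtml#page12",
    13: "chapter01.xhtml#page13",
    14: "chapter01.xhtml#page14",
    87: "chapter05.xhtml#page87",
}

# A locator is a page number, optionally followed by a hyphen or en dash
# and an end page (a page range such as 12-14).
LOCATOR = re.compile(r"\b(\d+)(?:[-\u2013](\d+))?\b")


def link_locators(entry, targets):
    """Wrap each page locator in an index entry with a link to its page break."""

    def to_link(match):
        start_page = int(match.group(1))
        href = targets.get(start_page)
        if href is None:
            # No recorded page break for this number: leave the text unlinked.
            return match.group(0)
        return f'<a href="{escape(href, quote=True)}">{match.group(0)}</a>'

    return LOCATOR.sub(to_link, entry)


linked = link_locators("metadata, 12-14, 87", PAGE_TARGETS)
print(linked)
```

A page range is linked to its first page, which is where a reader would start reading.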
In Kindle and EPUB2, index entries are structured as simple paragraphs for display; that is, nothing in the markup indicates the index’s actual structure. In the near future, as EPUB3 continues to be adopted by more reading systems and workflows, creating EPUB3-formatted books, particularly nonfiction titles, will become more common. (EPUB3 is designed to be readable by EPUB2 reading systems, although not all of the functionality may be available.) In addition, a proposed EPUB3 Indexes Specification has been developed that defines a standard encoding of index structure, which reading systems can exploit to offer new functionality and an improved user experience.
For instance, when a user highlights a section of content in a book, the reading system could show the user the index entries that point to that section, and the user could then easily jump to related content. Indexes and content would become a two-way street for exploring a book.
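The reverse lookup described here can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each index entry is reduced to a heading plus a page span, and the `Entry` type, the sample `INDEX`, and the function name are hypothetical rather than anything defined by the proposed specification.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One index entry: a heading and the page span its locator covers."""
    heading: str
    first_page: int
    last_page: int


# Hypothetical sample index for illustration only.
INDEX = [
    Entry("cross-references", 40, 42),
    Entry("locators", 41, 41),
    Entry("metadata", 12, 14),
]


def entries_for_span(index, first_page, last_page):
    """Return the headings whose locators overlap the highlighted span."""
    return sorted({
        entry.heading
        for entry in index
        # Two spans overlap when each one starts before the other ends.
        if entry.first_page <= last_page and entry.last_page >= first_page
    })


print(entries_for_span(INDEX, 41, 41))
```

A reading system could present the returned headings next to the highlighted passage, each linked back to the other places that entry points to.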
The EPUB3 Indexes Specification also breaks new ground: because all indexes will be tagged the same way, index crawlers will be able to retrieve an index or its headings for use in guiding potential purchasers to a book. This means, for example, that:
- Each e-book index could be displayed to a user for perusal.
- The index’s headings could be matched against the headings for indexes in other books owned by the user to suggest another purchase to the user.
- Index entries could be mashed up for a user to browse, with links to the books they came from.
- The main headings from all book indexes could be used as search terms on a publisher’s or retailer’s site.
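To make the heading-matching idea concrete, here is a small Python sketch that ranks catalog titles by how much their main headings overlap with those of books a user already owns. It assumes the headings have already been extracted by a crawler; the scoring method (shared headings divided by total distinct headings), the function names, and the sample titles are all assumptions for illustration, not part of the specification.

```python
def normalize(heading):
    """Lowercase a heading and collapse whitespace so variants compare equal."""
    return " ".join(heading.lower().split())


def heading_overlap(headings_a, headings_b):
    """Share of distinct headings that two sets of index headings have in common."""
    set_a = {normalize(h) for h in headings_a}
    set_b = {normalize(h) for h in headings_b}
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def suggest(owned, catalog):
    """Rank catalog titles by overlap with the headings of the owned books."""
    owned_headings = [h for headings in owned.values() for h in headings]
    scored = [
        (heading_overlap(owned_headings, headings), title)
        for title, headings in catalog.items()
    ]
    # Highest overlap first; ties broken alphabetically; drop zero-overlap titles.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in ranked if score > 0]


# Hypothetical sample data for illustration only.
owned = {"Book A": ["Metadata", "Locators", "Cross-references"]}
catalog = {
    "Book B": ["metadata", "locators", "typography"],
    "Book C": ["gardening", "soil"],
    "Book D": ["Metadata", "fonts", "kerning", "leading"],
}
print(suggest(owned, catalog))
```

Exact matching after normalization is the simplest possible choice; a retailer would likely also want to account for synonyms and cross-references, which the index itself already records.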
In short, by providing indexes in their e-books, publishers can already give readers a better, faster way than search offers for finding what they want. And as reading systems adopt the EPUB3 Indexes Specification, providing indexes in e-books will give readers still more discovery approaches that will lead to increased sales.
David Ream is the president of Leverage Technologies, Inc., a consulting/software firm that serves publishers and currently produces EPUB3-draft-compliant indexes. He has worked with indexers and provided software for indexes since 1975. Currently co-chair of the American Society for Indexing’s Digital Trends Task Force (DTTF), he is also co-chair of the International Digital Publishing Forum’s EPUB3 Indexes Working Group. To learn more: www.LevTechInc.com; DaveReam@LevTechInc.com.