PUBLISHED DECEMBER 2015
by Phil Madans, Executive Director of Digital Publishing Technology, Hachette Book Group
You walk into a room and see six people you’ve never met. Your host says, “You really must meet Jane. She’s the one in the blue jacket.” The color of her jacket has now differentiated Jane from the rest of the group and identified her. This kind of identification can work in certain situations, but think about the same exchange taking place in Times Square on New Year’s Eve, where tens of thousands of people might be wearing blue.
Then think of booksellers and libraries offering thousands or hundreds of thousands of books. How are potential buyers going to find yours? By the cover? Not likely. By the title? Maybe if it’s unique, but the same title may appear on many different books (the WorldCat database lists more than 46,000 entries for The Gift, for example).
Two things ensure a smooth path to a particular book: an identifier and metadata. Together, they are the basis of discovery. What follows focuses primarily on identifiers in the book industry, but it also covers metadata, because one won’t work without the other.
An identifier is “generally a sequence of alpha-numeric characters” that “unambiguously differentiates one thing from another in a particular context,” as defined by “BISG Best Practices for Identifying Digital Products.”
Identifiers are everywhere. Look inside your wallet and you will probably find at least a few, including your driver’s license number, Social Security number, credit card numbers, and library card number. Each identifies you as a unique individual in a particular context, and each is linked to information (metadata) pertinent to that context.
Key identifiers for publishers are the relatively new ISNI; the well-established ISBN, with added guidelines for the digital era; the DOI, which is used primarily by nontrade publishers; and the ISLI, which was officially released earlier this year.
Along with such standard identifiers, proprietary identifiers are used within the industry today.
Standard identifiers, which are distributed publicly and designed to ensure unambiguous communication between and among trading partners, are regulated, maintained, and revised by accredited governing bodies with a clear chain of command and a standard set of policies and procedures. Stringent control is necessary to keep duplicate or counterfeit identifiers from entering the marketplace.
Standard identifiers make clear information paths possible. Consider, for instance, the chain of information between publishers and their customers. Publishers attach metadata to each book’s ISBN and send it to bookstores, data aggregators, distributors, and, ultimately, the end consumer. And communication goes both ways. Customers send publishers information about sales, inventory, bestseller status, reviews, and feedback.
Unlike standard identifiers, proprietary identifiers should not be transmitted between trading partners. Any organization can create its own proprietary IDs. Think of Amazon’s ASIN, for example. Amazon created the ASIN so it could easily track the thousands of different types of products it sells regardless of the originating industry. But only Amazon generates ASINs. If another company tried to generate and distribute them, duplicate identifiers would cause confusion in the marketplace.
If you want more information, or have a fondness for acronyms, you can access the comprehensive reference guide to identifiers used in every segment of publishing via bisg.org/guide-identifiers.
Four Must-Have Identifiers for Publishers
International Standard Name Identifier (ISNI). The relatively new ISNI unambiguously identifies contributors to creative works—writers, musicians, composers, artists, producers, publishers, agents, and so on. Since it identifies a public identity, a writer who publishes under more than one name would have more than one ISNI, which could be linked through metadata unless the writer’s real identity is not to be publicly known.
Over the past several years, the ISNI database has been populated with more than nine million entries, including entries for 500,000 organizations. That work has been an international effort by a number of national libraries and other online resources such as OCLC. Since it is common for many people to have the same name, the vetting process is stringent and time-consuming.
The ISNI is considered a “bridge identifier.” It is meant to be a connection across multiple domains and linked to other standard or proprietary IDs in other databases.
Anyone can register for an ISNI. Bowker, which is the US Registration Agency, offers signup via its website and has started including ISNIs in its Books In Print metadata. ISNIs have also been incorporated into Wikipedia pages. To search the ISNI database, go to isni.org/search. If you’re not in it, you can apply for inclusion, and if you are in it but see mistakes in your entry, you can submit corrections to the registry.
International Standard Book Number (ISBN). This familiar unique product identifier for books enables discovery, standardized processing, and distribution throughout the global supply chain. It’s made machine-readable and represented on physical products by an EAN-13 barcode.
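Part of what makes a 13-digit ISBN machine-friendly is its final check digit, which lets software catch most single-digit typing or scanning errors. The sketch below uses the standard EAN-13 weighting (alternating 1 and 3); the function names are illustrative only, and the sample number is simply a commonly cited valid ISBN, not one taken from this article.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the check digit for the first 12 digits of an ISBN-13."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Digits in odd positions (1st, 3rd, ...) are weighted 1; even positions are weighted 3.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Return True if the string is 13 digits (hyphens/spaces ignored) with a correct check digit."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(is_valid_isbn13("978-0-306-40615-7"))  # True
print(is_valid_isbn13("978-0-306-40615-8"))  # False (wrong check digit)
```

A passing check only shows that the number is well formed. It says nothing about whether the ISBN was actually issued by a registration agency.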
ISBNs are appropriate for books, chapters, maps, and audiobooks. They should not be assigned to greeting cards, updatable databases, web pages, games, or music.
Substantial change in a book’s content (rule of thumb, 20 percent) requires a new ISBN.
Originally created to identify physical products in a physical supply chain, the ISBN has faced some challenges as it is applied to digital products. In response, the BISG Identification Committee created best practice recommendations (which are available at bisg.org/best-practices-identifying-digital-content-0).
The key recommendation is this:
A publisher should always assign a unique ISBN to serve as the identifier of each unique digital book it releases into the supply chain in order to maintain an official link by which metadata and sales information can be communicated back and forth. As this is the beginning of the supply process and represents the root form of the content, it is important at this stage that the publisher use an official ISBN rather than a proprietary identifier.
It is easy to differentiate between physical products such as a hardcover and a paperback. It is clear that an ISBN should never be reused (for instance, a new paperback print-on-demand version of an out-of-print hardcover title should get a new ISBN). And it is also clear that a digital book must not be assigned the same ISBN as that book in physical format; even if the physical book is no longer in print, the physical book ISBN cannot be reused for the digital book.
But how do you decide what identifier to use for products that exist only as strings of 1s and 0s? Consider three key elements: content, format, and usage constraints.
Digital books that have differing content should be assigned different identifiers (think all-text and enhanced e-book editions).
Digital books that have differing file formats should be assigned different identifiers (an EPUB and a PDF are as different as a hardcover and a paperback).
Digital books with different usage constraints (i.e., different rights granted to consumers) should be assigned different identifiers.
How do you determine whether the content, format, and usage constraints of particular electronic files indicate that they need their own identifiers? Metadata. If the metadata is describing different products, the products need different identifiers.
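The rule in the preceding paragraphs can be sketched in a few lines: treat the combination of content, format, and usage constraints as the product's key, and give each distinct combination its own identifier. This is only an illustration under simplifying assumptions. The field names, the sample values, and the placeholder identifiers ("ID-1" and so on) are hypothetical, and real product metadata carries many more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalProduct:
    content: str       # e.g. "text-only" or "enhanced"
    file_format: str   # e.g. "EPUB" or "PDF"
    usage: str         # e.g. "printing allowed" or "no printing"


def assign_identifiers(products, available_ids):
    """Give each distinct (content, format, usage) combination its own identifier."""
    ids = iter(available_ids)
    assigned = {}
    for product in products:
        if product not in assigned:  # identical metadata means the same product
            try:
                assigned[product] = next(ids)
            except StopIteration:
                raise ValueError("not enough identifiers for the distinct products") from None
    return assigned


catalog = [
    DigitalProduct("text-only", "EPUB", "no printing"),
    DigitalProduct("text-only", "PDF", "no printing"),
    DigitalProduct("text-only", "PDF", "printing allowed"),
    DigitalProduct("enhanced", "EPUB", "no printing"),
    DigitalProduct("text-only", "EPUB", "no printing"),  # same metadata as the first entry
]
# Placeholder identifiers; in practice these would be properly assigned ISBNs or proprietary IDs.
assigned = assign_identifiers(catalog, ["ID-1", "ID-2", "ID-3", "ID-4"])
for product, identifier in assigned.items():
    print(identifier, product)
```

Five catalog entries yield four identifiers here, because the last entry describes the same product as the first, while the two PDFs differ only in whether printing is allowed and still get separate identifiers.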
The table at right offers a few examples of how metadata determines product. In these five cases, differences in metadata indicate different products that need their own separate identifiers. Even a slight difference such as the ability to print or not print a PDF file requires a different identifier.
Note that the label in the chart is Identifier, not ISBN. In many cases, the identifier of a digital product should be an ISBN, but sometimes a proprietary identifier is perfectly acceptable. The key factor is how the product is being presented and sold. If it is being sold only from the source company’s website, or only in a closed system, a proprietary identifier is all that is required. But if it is being made publicly available and being distributed with metadata to points within the marketplace, it needs an ISBN; and a third party (such as a conversion house) can assign one if the publisher doesn’t.
In any event, ISBNs are assigned, not created. One ISBN registration agency per country or community is designated by the International ISBN Agency to assign and distribute ISBNs to the publishers and self-publishers located in that area. Nobody should ever present a number not issued by these agencies as an ISBN.
Third parties can assign correctly acquired ISBNs for their customers and clients. But before you opt to use such an ISBN, note that it will be registered to the third party, not to you as the publisher, and you may not be able to take that ISBN with you if you decide to switch partners. Also, be careful when you see online vendors reselling ISBNs. It’s very likely that you will not be listed as the Publisher of those ISBNs at the registration agency, which means that metadata fed out to the industry by the agency will list a different publisher for your book.
For a wealth of details, see the International ISBN Agency’s User Manual, which is downloadable free at isbn-international.org/content/isbn-users-manual. Sometime in the coming year, after a periodic revision concludes, a revised user manual will be available.
The Digital Object Identifier (DOI). A victim of acronym confusion, the DOI has sometimes been misconstrued as an identifier of a digital object. In fact, it is a unique and persistent digital identifier for objects that can be either digital or physical.
What makes it persistent? A simple URL link on a web page can have a very short shelf life. Content changes; content is moved to different pages; websites are bought or sold or renamed; URLs change and links get broken. The DOI solves this problem by creating a unique link that is displayed on the source web page instead of a hyperlink to the target URL. When the DOI is clicked, it resolves to the DOI registration database, finds the associated target URL there, and redirects the request to the required page.
Because the DOI remains the same if the target content changes or moves, there’s no need to change the source link on web pages. Only the target URL in the DOI registry needs to be updated. Changing the URL in that one database will direct all future requests to the new location. No broken links.
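The indirection described above can be illustrated with a toy in-memory registry. This is not the real DOI system or its API; the class name, the DOI "10.1234/example", and the example.com URLs are all made up for the sketch. The point is only that source pages keep citing the same DOI while the target URL is changed in one place.

```python
class ToyDoiRegistry:
    """Toy stand-in for a DOI registration database (illustrative only)."""

    def __init__(self):
        self._targets = {}  # maps DOI string -> current target URL

    def register(self, doi: str, url: str) -> None:
        if doi in self._targets:
            raise ValueError(f"{doi} is already registered")
        self._targets[doi] = url

    def update_target(self, doi: str, new_url: str) -> None:
        if doi not in self._targets:
            raise KeyError(doi)
        self._targets[doi] = new_url  # the DOI itself never changes

    def resolve(self, doi: str) -> str:
        """Return the URL a request for this DOI would be redirected to."""
        return self._targets[doi]


registry = ToyDoiRegistry()
registry.register("10.1234/example", "https://old.example.com/articles/42")
print(registry.resolve("10.1234/example"))

# The content moves; only the registry entry is updated.
registry.update_target("10.1234/example", "https://new.example.com/content/42")
print(registry.resolve("10.1234/example"))
```

Every page that cites the DOI continues to work after the move, because none of them ever stored the target URL directly.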
The DOI is used extensively in scientific and journal publishing where linking and cross-linking of material and references are very common. It has not gained much adoption in general trade publishing, but new opportunities are being explored.
DOIs are not product identifiers and should not be used in place of an ISBN for digital products. However, an ISBN can be made actionable (clickable) by encoding it into a DOI to create an ISBN-A. For more information: dx.doi.org/10.1000/182.
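As a rough illustration of what encoding an ISBN into a DOI looks like, the sketch below assumes the ISBN-A layout is "10." followed by the 978 or 979 prefix, a period, the registration group and publisher prefix run together, a slash, and then the title enumerator plus check digit. That layout is an assumption to verify against the documentation linked above before relying on it. The function name and the sample ISBN are hypothetical, and the function expects a fully hyphenated ISBN-13 because the publisher prefix boundary cannot be recovered from bare digits.

```python
def isbn_to_isbn_a(hyphenated_isbn13: str) -> str:
    """Rearrange a hyphenated ISBN-13 into the assumed ISBN-A form 10.<prefix>.<group+publisher>/<title+check>."""
    parts = hyphenated_isbn13.split("-")
    if len(parts) != 5 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError("expected five hyphen-separated numeric elements")
    prefix, group, publisher, title, check = parts
    if prefix not in ("978", "979") or sum(len(p) for p in parts) != 13:
        raise ValueError("not a 13-digit ISBN with a 978/979 prefix")
    return f"10.{prefix}.{group}{publisher}/{title}{check}"


# Made-up ISBN used purely to show the rearrangement of elements.
print(isbn_to_isbn_a("978-1-2345-9999-0"))  # 10.978.12345/99990
```

The string produced this way only becomes actionable once the corresponding DOI has actually been registered; the rearrangement alone does not create a working link.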
International Standard Link Identifier (ISLI). The ISLI is one of the newest members of the International Organization for Standardization (ISO) family of identifiers, which also includes the ISBN, the DOI, and the ISNI. Officially published by ISO in May 2015, it is an identifier for links between entities specified by associated metadata. The link ID is one-way.
Here are two examples:
- Jack marries Jill (because the link is one-way, Jill marries Jack would need a different ISLI).
  - “Marries” is the link and would have metadata about date, place, location, and so on.
- http://dx.doi.org/XX/XXX is referenced in ISBN 978XXXXXXXXXX
  - The music file located here [DOI] can be found in this book.
  - Metadata can be registered to the place in the book where the reference occurs to ID the piece of music.
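The one-way nature of these links can be modeled as a directed relationship with its own identifier and metadata. The sketch below is a toy model, not the ISLI syntax or registry: the class names, the "LINK-0001" style identifiers, and the metadata fields are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Link:
    source: str    # where the link starts
    relation: str  # what kind of link it is
    target: str    # where the link points


class ToyLinkRegistry:
    """Assigns one identifier per directed link and stores metadata about it."""

    def __init__(self):
        self._ids = {}
        self._metadata = {}
        self._counter = count(1)

    def register(self, link: Link, **metadata) -> str:
        if link not in self._ids:
            self._ids[link] = f"LINK-{next(self._counter):04d}"
        link_id = self._ids[link]
        self._metadata.setdefault(link_id, {}).update(metadata)
        return link_id

    def metadata_for(self, link_id: str) -> dict:
        return dict(self._metadata[link_id])


links = ToyLinkRegistry()
forward = links.register(Link("Jack", "marries", "Jill"), place="hypothetical town")
reverse = links.register(Link("Jill", "marries", "Jack"))
print(forward, reverse)  # LINK-0001 LINK-0002
print(links.metadata_for(forward))
```

Because source and target are part of the key, swapping them produces a different link with a different identifier, which mirrors the Jack and Jill example.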
More information about the ISLI will appear as it continues to take shape. Its official website (still in beta, but accessible) is isli-international.org/isli/en/index.
Without metadata, an identifier has no meaning. A fingerprint can be used to identify a person, but only if there is metadata related to the fingerprint about the person it belongs to—name, gender, Social Security number, age, height, eye color, hair color, and the like. Otherwise it is just an ink smudge.
Similarly, without an identifier, book metadata elements are just random pieces of information floating around untethered, lost in a crowd. Standard identifiers are about enabling communication that makes each book findable and trackable. Using the right identifiers coupled with the right metadata can dramatically increase a publisher’s ability to communicate with trading partners and readers, and therefore dramatically increase the likelihood of making sales and acquiring actionable data.
Phil Madans is the executive director of Digital Publishing Technology for the Hachette Book Group and chair of the Book Industry Study Group Identification Committee. This article is adapted from a webinar he delivered that was co-produced by BISG and IBPA.