PUBLISHED MARCH 2017
by Deb Vanasse, Reporter, IBPA Independent
A media company uses proprietary algorithms to identify trending topics, then assigns nonfiction book projects to freelancers to meet the demand. A pair of authors creates an algorithm to identify the elements of bestselling novels. Publishers mine the data generated by e-book readers to inform editorial and marketing decisions.
In these ways and more, today’s publishing innovators are reaching deep into the vast stores of data generated by 21st century technologies to make smarter, nimbler decisions. In order to put the numbers to work for your publishing business, you need to understand the types of data that can be gathered and the ways it can—and can’t—be used.
Traditionally, the numbers on which publishers have most relied have been sales figures and other so-called “small data.” As explained by Tom Davenport, author of Big Data @ Work, “Small data is usually from your internal transaction systems, is structured, doesn’t change often, and usually fits on a single server.” In contrast, he says, “Big data is typically external, isn’t structured nicely in rows and columns, is continually flowing, or requires multiple servers to accommodate.”
Through the internet, copious amounts of big data are generated—so much, in fact, that companies hire analysts to interpret the numbers, suggesting tactics based on consumer behavior. For instance, big data might measure the success of a social media campaign or determine whether a certain search term is driving traffic to a product’s web page.
A company should approach big-data analytics with a definitive plan, Davenport says. “The most important thing is to have some clear business objectives and questions you are trying to address with the management and analysis of big (or, for that matter, small) data,” he explains. “It doesn’t work well to just look into the data and fish around for some insights. Starting with a question like, ‘Who are our most loyal customers and how can we get them to buy more from us’ will yield much more valuable results.”
Founded by veterans of the tech and digital media industries, book publisher Callisto Media uses big data to address the broader question of what topics will generate the most interest among readers. Using a proprietary algorithm, the company mines big data, then assigns teams of freelancers to produce nonfiction books that are published through one of Callisto’s 14 imprints. According to Jim Milliot, editorial director at Publishers Weekly, Callisto was identified in March 2016 as one of the fastest-growing American publishers.
Others in the book industry tout the value of data that measures not just general consumer interest as quantified through web searches but the behaviors of the subset of consumers who actually buy and read books. Using reader analytics—data mined from e-book reader behavior—publishers can inform acquisition, editorial, and marketing decisions.
Value in Numbers
Responding to a digital innovation challenge supported by Penguin Random House, Andrew Rhomberg turned the attentions of his UK-based Jellybooks start-up toward reader analytics. “We have been developing, refining, improving, and scaling it ever since,” he says.
In an arrangement Rhomberg tags as “book candy for readers, data candy for publishers,” Jellybooks offers free advance reader copies to e-book readers in exchange for their participation in test reading campaigns that involve consensual data collection. Through a range of supported reading apps, Jellybooks observes behaviors such as reading velocity and completion rates. The platform also asks readers to explain the reasons behind their behaviors, such as why they selected a certain title or why they abandoned a book.
Gathered pre-publication, reader analytics help publishers allocate resources and refine marketing efforts. Jellybooks analyzes the demographics of readers who download titles—young or old, female or male, commuters or weekend readers. They also identify what titles readers deem to be guilty pleasures and what ones they’ll rave about to their friends. In addition, readers weigh in on covers, titles, and book descriptions.
Rhomberg believes that reader analytics offer publishers more value than traditional pre-publication market testing. “You could spend tens of thousands on traditional focus groups and research methods, and all you would get are opinions from readers who may or may not have actually read the book,” he says.
Operating out of Seattle, Bluefire is another innovator in reader analytics, developing apps that enable retailers and libraries to aggregate data on reader behaviors. When readers are logging longer sessions or reading all the way through, that’s the mark of a page-burner, notes Bluefire founder and CEO Micah Bowers—a better predictor, potentially, than past sales for gauging how apt readers are to enjoy similar titles.
In its white paper “Publishing in the Era of Big Data,” Kobo affirms the value of reader analytics over traditional sales data. One example involves a comparison of award-winning titles, one with an open rate of 95 percent and a completion rate of 66 percent, the other with a more lackluster open rate of 25 percent and a completion rate of only 35 percent. The numbers point clearly to which author will have the stronger following.
Kobo’s white paper also points out how publishers can benefit from reader analytics that identify high-engagement, low-sales titles that warrant more marketing attention as well as high-completion rate authors whose sales potential may be under-realized. By comparing open rates, completion rates, average reading time, average reading sessions, and average reading length per session, publishers can also evaluate the performance of imprints, of titles within a series, and of serialized segments of a book.
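For readers who want to see how such metrics fit together, here is a minimal Python sketch. It assumes a hypothetical per-title record of copies acquired, opens, and completions. The field names, the rate definitions, and the thresholds are illustrative assumptions, not Kobo's own formulas, and the sample figures are invented.

```python
from dataclasses import dataclass


@dataclass
class TitleStats:
    """Hypothetical per-title counts; field names are illustrative."""
    title: str
    copies: int       # copies sold or downloaded
    opens: int        # readers who opened the book at least once
    completions: int  # readers who reached the end


def open_rate(s: TitleStats) -> float:
    # Assumed definition: share of acquired copies that were ever opened.
    return s.opens / s.copies if s.copies else 0.0


def completion_rate(s: TitleStats) -> float:
    # Assumed definition: share of openers who finished the book.
    return s.completions / s.opens if s.opens else 0.0


def high_engagement_low_sales(stats, min_completion=0.6, max_copies=500):
    """Return titles that readers finish but few people have acquired."""
    return [s.title for s in stats
            if completion_rate(s) >= min_completion and s.copies <= max_copies]


# Invented sample data, for illustration only.
catalog = [
    TitleStats("Title A", copies=200, opens=190, completions=125),
    TitleStats("Title B", copies=4000, opens=1000, completions=350),
]

for s in catalog:
    print(f"{s.title}: open {open_rate(s):.0%}, completion {completion_rate(s):.0%}")
print("Worth more marketing:", high_engagement_low_sales(catalog))
```

The cutoffs passed to `high_engagement_low_sales` are arbitrary; a publisher would tune them against its own list.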
Likewise, reader analytics can inform the decision of whether sales of a series will benefit from the first book being offered as a free download (permafree). On average, the download-to-sales conversion rate is 1.4 percent. But citing a six-week study, Mark Lefebvre, Kobo’s director of self-publishing and author relations, points out that among readers who open a free title (most don’t), the rate of sales jumps to 10 percent. For those who complete reading the free title, the rate of sales hits a whopping 51 percent.
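The arithmetic behind those three rates is the same calculation applied to ever-narrower groups of readers. The short Python sketch below shows this. The reader counts are invented so that they reproduce the reported percentages; they are not figures from Kobo's study, and the function and variable names are hypothetical.

```python
def conversion_rate(buyers: int, cohort: int) -> float:
    """Share of a cohort that went on to buy the next book in the series."""
    return buyers / cohort if cohort else 0.0


# Invented counts chosen to match the percentages quoted in the article.
funnel = [
    # (cohort label, readers in cohort, readers in cohort who bought)
    ("downloaded the free title", 10_000, 140),
    ("opened it", 1_200, 120),
    ("finished it", 200, 102),
]

for label, cohort, buyers in funnel:
    print(f"Readers who {label}: {conversion_rate(buyers, cohort):.1%} bought")
```

The point of splitting the funnel this way is that a low overall rate can hide a very high rate among the readers who actually engage with the free book.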
Ultimately, says Rhomberg, reader analytics are about making publishers more productive by taking some of the guesswork out of the process and helping publishers better connect with readers. “Let’s stop this being a lottery and make it more like a profession,” he says.
To publish by the numbers, companies need to be able to access reader data that has been analyzed in a way that makes it meaningful—hence the term analytics. They also need to be nimble enough to act, giving independent publishers an advantage over larger, more siloed operations.
As Rhomberg points out, e-tailers such as Amazon and Google gather data on reader behavior, but they don’t share it with publishers. The Jellybooks solution involves publishers offering pre-publication titles and pointing readers to the Jellybooks platform, where data on their behaviors is gathered and analyzed. As of now, Jellybooks works mostly with large publishers. But by the end of 2017, Rhomberg plans to roll out a self-service option to generate reader analytics for smaller presses, to the tune of between $600 and $1,000 per title.
During business review sessions with high-volume Kobo Writing Life (KWL) publishers, Kobo shares certain reader analytics. Lefebvre says that in the future Kobo hopes to offer data on reader engagement to more publishers through the KWL dashboard feature so that they, too, can use analytics to inform decisions.
At Bluefire, Bowers works primarily with companies and institutions that have the resources (in the tens of thousands of dollars) to brand their own reading apps. But no matter how publishers gather reader data, he notes that they need to institutionalize ways to think through the information and consider the best ways to act on it. In too many instances, he laments, departments within a publishing house don’t talk to each other, and after a book is released, it moves out of the cycle of consciousness.
What Numbers Can’t Tell—or Shouldn’t
To those on the forefront of using analytics, publishing by the numbers only makes sense. For others, the prospect of honing editorial and marketing decisions by the numbers raises ethical and practical concerns.
To authors who worry that data will decide what gets published, Rhomberg points to Nielsen, which, for decades, has provided viewer analytics that shape decisions in the TV industry. “Would you rather decisions be based on the sales data of a book perceived as similar to yours or on how real flesh-and-blood readers engage with your manuscript?” he asks. Publishing has always had a commercial side, he notes, and data just makes that more transparent.
And, as Rhomberg points out, it’s not as if computer algorithms have the final say in publishing decisions—at least not yet. “The data helps human editors, publishers, publicists, and marketing executives do their jobs better,” he says. “People may praise that ‘literary gut feel,’ but gut feel is also biased, relying on rules of thumb that may be out of date. Data doesn’t care if you are white or black, female or male. It only cares about how readers enjoy the book. It’s extremely meritocratic.”
Davenport concurs. “I am sure there are some concerns here, but the fact is that other industries are so far ahead of book publishers in getting and using data for editorial and marketing decisions that I don’t think the industry has to worry too much,” he says. “In the long run, there is the issue of creating more shallow content because that’s what readers often want—the issue many newspapers have already faced, for example—but that is a general trend in media businesses anyway.”
Data mining also evokes privacy concerns. “Some readers may not want publishers to know who they are, so you have to get their permission for any steps you take with their data,” Davenport explains. “But that’s a good policy in any industry, and if you offer them something in exchange for the use of their data, most customers are okay with that exchange of value.”
At Jellybooks, the exchange of value involves readers being able to access their own data to learn more about how they read. “It’s like Fitbit for e-books,” says Rhomberg. “Consensual data is happy data.”
Bluefire and Kobo subscribe to anonymity in the distribution of reader data. “We never share information unless there is enough sales and reading data associated with a particular title or group of titles in order to ensure that the readers can remain anonymous,” Lefebvre says.
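One common way to put a policy like this into practice is to suppress any title with too few distinct readers. The Python sketch below illustrates that idea only. It is not a description of how Kobo or Bluefire handle data; the threshold of 50 readers, the event format, and the function name are all assumptions made for the example.

```python
MIN_READERS = 50  # hypothetical threshold; a real policy would set its own


def shareable_titles(reading_events, min_readers=MIN_READERS):
    """Count distinct readers per title and keep only titles with enough.

    reading_events is an iterable of (reader_id, title) pairs. Only the
    aggregate reader count is returned, never the reader IDs themselves.
    """
    readers_per_title = {}
    for reader_id, title in reading_events:
        readers_per_title.setdefault(title, set()).add(reader_id)
    return {
        title: len(readers)
        for title, readers in readers_per_title.items()
        if len(readers) >= min_readers
    }


# Invented sample events, for illustration only.
events = [(f"reader-{i}", "Title A") for i in range(60)]
events += [(f"reader-{i}", "Title B") for i in range(3)]

print(shareable_titles(events))
```

In this sample, Title B is withheld because three readers are too few to keep any one of them anonymous.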
The Future of Analytics
What does the expanding use of analytics bode for publishing? Will the analysis of reader behaviors become so finely tuned that books will be written by formula, making authors obsolete?
“Maybe 50 years from now a computer could write a bestseller,” posits Bowers. In the shorter term, he says, analysis becomes yet another tool for evaluating a book. Publishers might be able to run a book through an algorithm that predicts a book’s engagement rate, but that’s only another data point, adding to those achieved through more traditional practices such as the distribution of galleys.
Look for some interesting developments in the use of analytics as e-book standards continue to develop, says Rhomberg, especially if the industry moves from native reading apps to responsive web apps. He also predicts that publishing lead times will shorten as analytics enable publishers to better understand the forces that drive demand for certain titles.
As readers continue to demand content in smaller pieces, Davenport anticipates more disaggregation of content, with more of it becoming available online. “That will mean that publishers can get a better idea of who is reading it and what they like,” he explains. “If you look at a lot of the things Amazon is doing—with Prime, Singles, Search Inside This Book, book recommendations, and so forth—those are a lot of the same kinds of things that traditional book publishers need to be doing. It may also mean that small publishers will need to form alliances and collaborations so that they can afford these innovations and offer a broad enough range of content to appeal to a wide variety of readers.”
Whatever the future of analytics brings, independent publishers are likely to be on the forefront of the innovations, says Lefebvre. “Smaller publishers and indie authors tend to have a better handle on the author/reader engagement element,” he says. “They are usually publishing either niche titles to a very specific market or know their customers better than any of the larger multinational companies. They take the best of traditional publishing in terms of professionalism and quality standards and combine that with the flexibility, digital adeptness, and savvy of the indie author.”
Deb Vanasse is co-founder of 49 Writers and founder of the author co-op Running Fox Books. She is the author of 17 books. Among her most recent are Write Your Best Book, a practical guide to writing books that rise above the rest, and What Every Author Should Know, a comprehensive guide to book publishing and promotion, as well as Wealth Woman: Kate Carmack and the Klondike Race for Gold.