Video and the Digital Book: Exploring Risky and Promising Possibilities

by Peter Meyers, Content Strategist

What happens to a book — and, more importantly, to readers — when you add video to the text?

Because our reading devices can display moving images, we often conclude that they should. The allure is clear. A seemingly high-fidelity audiovisual capture, a real-life screenshot. Surely the footage of Martin Luther King’s “I Have a Dream” speech exceeds the transportive power of a written report.

But before adding a clip to a book, it is worth asking a few questions about it:

  • Does it really depict, comprehensively, everything its multimedia sheen suggests?
  • What are the cognitive costs its presence imposes on the reader’s focus?
  • What does an author lose in the quest to direct a reader’s attention—Look here, at this thing, in this way—by adding an editorial element that mixes ambient information with key bits in a pell-mell fashion?

Managing Video Modules

A book is a road map of an argument or a story. It charts the path an author wants you to follow. And yet its guidance differs from geographic maps in one key way: Only those markers the writer wishes you to see are made visible. Good writing is a guide that depicts only essential stopping points. It is aware of the surrounding universe alongside its path, but it understands that saturating readers with detail helps no one. An awareness that there is too much to know is precisely why readers turn to books in the first place. Words, and the partnership they form with the human mind, are particularly well made for this task.

Words direct us; their linear path channels our attention. They cause our eyes’ teeming, wide-span receptors to narrow their focus, and consume each word or phrase in sequence. How unique among all media, among life’s enveloping input, this opportunity to reduce what we observe.

I am not looking to make a case that video is an inferior form of media. But its effects on us are so substantially different from the effects of the written word that we need to think carefully before mixing the two.

A video module positioned on a screen of text exerts a disruptive force. It is an attention magnet, its presence impossible to ignore, even for the committed word lover. Consider the two layouts shown below.

The long blocks of text on the top spread generate an immersive rhythm. Contrast that layout with the one containing video. Upon reaching that second page, readers find two forces tugging on their attention: the text on the top line and the video several paragraphs below it. But our readers are not computers, programmed rigidly to process media bits in the order in which they appear on screen. Their eyes catch sight of that video box and its finger-friendly play button. Its implicit invitation:

Down here! Come! Live a little! Who among us can resist?

The chart on the right breaks down the behavior of readers when confronted with video in a book.

So, what is to be done with video and the digital book? To begin with: a rigorous audit of any clip under consideration. Does the text truly need its support?

When prose alone has difficulty conveying concepts, moving images make sense. Think about cookbooks, for example—how to make an omelet or roll sushi. These guides are consulted mainly in snatches; accessing them, people rarely seek an immersive state. Their frequent shifts among presentation styles—body text for introduction and background, bullet list for ingredients, numbered steps for cooking instructions—mirror the reader’s switch between mental modes, between review and action. Video is at home in this type of cognitively diverse environment. The same can’t be said for long-form narrative.

Where we position videos also deserves careful thought. The worst implementations scatter clips haphazardly, as though the simple treat of watching things move on screen is enough to satisfy readers. Order matters. At a minimum, videos should be placed at logical points on the reading path; they should serve as seamless partners with the text and with the case both are making to the reader.

Achieving tonal consistency—does the video have the same editorial personality as the text?—is another challenge. Incorporating existing footage or commissioning new videos should be done with an eye, and ear, to achieving continuity and cohesion. One experiment in so-called “enhanced” e-books from a large publisher provides a good example of how easy it is to get these matters wrong.

Archival videos from NBC were added to a section of a book on the World War II D-Day invasion. The prose they accompanied was well-crafted popular history by a serious storyteller educating the reader. It was the kind of writing you might encounter in the New York Times Magazine or the Atlantic.

Problems began with the videos’ musical background. One had a patriotic, almost jingoistic, quality. Another verged on parody: the soundtrack of cartoonish fiends preparing to do bad things.

Imagine watching a movie for which a dozen different composers were independently commissioned to score different scenes; that’s what the videos in this book felt like. If I had to pick a video whose music was closest to the book’s verbal tone, I’d select one featuring a lone piano, a simple melody, and minimal flourishes. Sound design is now part of a publisher’s job.

Even things like video editing choices have an effect. In one video featuring the German commander Rommel, dramatic cuts created a feeling that a shocking truth was about to be exposed. That this doesn’t happen—we just see Rommel walking around, looking mean—deflates the tension that the music and the video editing build up. All in all, an editorial experience likely at odds with what the author sought to create.

Considering Control

If the main problem with video inside a book is its disruptive nature, the stark shift it triggers from media that a reader controls (text) to media that unspools automatically (video), is there a way to diminish this jolt?

One option involves radically reducing the amount of text used. That’s a solution that asks text to behave in a more video-friendly way. What if we asked video to return the favor?

Here is where a little-used form of video deserves more attention and experimentation. I’m not sure there’s a standard name for what I have in mind. It’s certainly related to stop-motion video. Another way to describe it: “scrubbable video.” Its signature attribute is that it repositions control from within the autoplaying video back to the viewer.

The cookbook app Hello Cupcake uses a version of this technique. Its instructional video shifts from being a “view only” presentation to one where the learner actively controls aspects such as progress speed. Users can swipe, slowly, to the right to advance the video or swipe in the opposite direction to go back.

The connection between the speed of your finger and the stop-motion changes provides a soothing sense of control, a result closer to the mindset that prevails when we process text. And while this particular app doesn’t contain large blocks of body text, it’s not hard to imagine the addition of text and how the two elements—movable pictures and prose—would serve readers better than the usual arrangement.
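
For readers who build digital books, the core of scrubbable video can be sketched in a few lines. What follows is a minimal illustration, not a description of how Hello Cupcake or any other app works: the `Scrubber` class, its field names, and the numbers in the usage example are all hypothetical. It assumes the clip is a sequence of stop-motion frames and that a fixed amount of finger travel advances one frame.

```python
from dataclasses import dataclass


@dataclass
class Scrubber:
    """Hypothetical helper: maps horizontal finger travel to a stop-motion frame."""

    frame_count: int          # total frames in the clip (assumed known)
    pixels_per_frame: float   # finger travel needed to move one frame (assumed)
    offset_px: float = 0.0    # accumulated travel, kept so slow swipes still count

    def drag(self, dx_pixels: float) -> int:
        """Apply a swipe: positive moves forward, negative moves back."""
        max_offset = (self.frame_count - 1) * self.pixels_per_frame
        # Clamp so the reader cannot scrub past the first or last frame.
        self.offset_px = max(0.0, min(max_offset, self.offset_px + dx_pixels))
        return self.frame

    @property
    def frame(self) -> int:
        """Index of the frame to display for the current finger position."""
        return int(self.offset_px // self.pixels_per_frame)


# Usage: a 10-frame clip where 20 pixels of travel equals one frame.
scrubber = Scrubber(frame_count=10, pixels_per_frame=20.0)
print(scrubber.drag(50))    # swipe right 50 px -> 2
print(scrubber.drag(-30))   # swipe back 30 px  -> 1
```

The point of the sketch is that the frame on screen depends only on where the reader’s finger has traveled, never on a clock. Nothing moves unless the reader moves it, which is the same arrangement text offers.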

A video is a shiny, opaque media object that can disrupt the sense of control we value as readers. Returning control to the reader is one way to avoid this problem. Imagining and exploring others as well will bring us closer to video compositions that are better suited to books.

Peter Meyers, a New York City–based content strategist, frequently speaks and writes about digital books. He has been a contributor to the New York Times, the Wall Street Journal, Wired, Salon, and the Village Voice. This article is excerpted from his new book, Breaking the Page: Transforming Books and the Reading Experience. To learn more: @petermeyers and breakingthepage.com.

