High-tech tracking enables publishers to watch your reading habits

E-readers such as the Kindle produce data on how deeply into a book readers tend to go. But the publishing industry and libraries express only mild interest in using the numbers. AP

Since the days of Gutenberg, we’ve had clues about what people read.

But only clues.

We could tell how many copies of a book a publisher printed, the number sold, the portion that stores couldn’t unload. Libraries could track which titles their patrons checked out most often, what novels drew the longest waiting lists and which memoirs went untouched for years.

Our insight into actual reading habits mostly stopped there.

It’s possible, for instance, that many a lover of literature (or those aspiring to be literary) knew that Melville wanted them to call his narrator Ishmael but abandoned his tale 100 chapters before Ahab went down with that pale whale.

But thanks to electronic readers such as Amazon’s popular Kindle, now we know, or at least know more, about what people actually read. Spoiler alert — we don’t read much of what we buy.

Electronic books make it possible to log which books readers actually begin, which of those they finish and whether there’s a particular passage where they tend to drop off.

That holds new possibilities for publishers, librarians, marketers and authors. Based on how their readers interact with books, they could all tailor what they do.

They mostly don’t.

The publishing industry remains steeped in old traditions and its own faith in what makes for a good read, and good sales. Librarians look to other metrics — primarily what folks check out — to build their collections. Marketers show only mild interest. And authors either think they know what their public wants or find themselves allergic to by-the-numbers formulas.

“The industry’s not that far along with it,” said Thad McIlroy, an electronic publishing consultant at The Future of Publishing.

Still, it’s early in this particular chapter of the Digital Age. Lovers of e-reader analytics imagine a time when the data becomes an important muse.


For publishers, the emerging data could offer game-changing hints about which writers to put their money behind. If a first-time author has only middling sales, but readers tended to finish what he wrote, that could be reason to plop down a hefty advance financing his next try.

On the other hand, if a writer’s book sells better than expected but readers don’t dive very deep into the work, then maybe a smaller investment in her second book makes sense.

For the libraries that put tax dollars into the publishing marketplace, the new data might help them pick better books — or at least ones that their readers appreciate more.

Yet libraries, note those who decide how to stock them, aren’t like websites. They don’t draw nourishment from clicks.

(The web has always feasted on metrics and on statistical analyses that only grow more sophisticated. Editors and reporters at The Star, for instance, know not only which stories draw the most eyeballs, but how far the average reader scrolls down through a particular story, how many seconds readers spend on it and how many clicks an article generates to other posts on the website.)

Of course, the collection managers say, they want books that will get read.

Still, libraries buy some books realizing that few people will consume them. If someone looks for that obscure title that fills a research hole, or that odd but notable examination of local history, that reader counts on the library to have it waiting. A book on how to raise a blind child will appeal to a tiny audience, but the anxious parents who can’t afford to buy a copy dearly hope that the library can lend them one.

It’s a balance, the librarians say. They’ve long delved into statistics on which items prove most popular.

Some just look skeptically at what the numbers from e-readers would actually show.

“There are a hundred reasons I may give up on a book depending on what’s going on in my life,” said Terri Clark, the collection development manager for the Mid-Continent Public Library. “I just don’t know how you would collect data that would give you an accurate metric on the why that particular item was not completed all the way through.”

For instance, someone might check out an audiobook for a family driving vacation. Dad loves it, but the rest of the family gets bored and wants to listen to the radio. If the digital audiobook played for only its first 10 minutes, the data would make it look like a dud.

Or a single reader might consume one book in multiple ways. Maybe they prefer the hardback when sitting at the pool, but then pick up the story on an audiobook during their morning commute and finish the novel with the e-book version on their phone. The data might suggest that the three check-outs were a flop, with none finished from beginning to end. But for the library patron, they were the height of convenience.

Some analysts think the aggregate numbers would balance out such anomalies, forming a clearer sense of whether a book was truly a library success.

“Right now, we don’t have any good indicators of the efficacy of a checkout,” said Adam Wathen, the collection development manager at the Johnson County Library. “Metrics would be valuable.”

For the most part, the companies that sell electronic books don’t offer libraries the same sort of reading statistics that the publishing industry has begun to study.

Even if libraries had those numbers, collection managers see real limits to their value.

People often never intend to read books from beginning to end. Patrons checking out a car manual probably just need instruction on how to fix the one thing on their busted Camry. Who stretches out on the beach to devour an entire cookbook?

“How much somebody reads a particular book isn’t the only way to measure things,” Wathen said. “We want to have available material they tell us they want.”

The key statistic long used to measure success on that score is the “holds ratio.” At the Johnson County Library, for instance, the goal is a five-to-one holds ratio, meaning an average of five library users waiting for every copy of a new book. It can be a tricky line to walk. Lowering that ratio is great for folks in line for popular material, but it comes at the expense of diverting money that could broaden what’s available.
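
To make the holds-ratio arithmetic concrete, here is a minimal Python sketch. The function names and the sample figures (40 patrons waiting on 5 copies) are made up for illustration; they are not the Johnson County Library's actual numbers or tooling.

```python
import math


def holds_ratio(holds, copies):
    """Average number of waiting patrons per copy the library owns."""
    if copies <= 0:
        raise ValueError("the library must own at least one copy")
    return holds / copies


def copies_needed(holds, target_ratio):
    """Fewest copies that keep the holds ratio at or below the target."""
    return max(1, math.ceil(holds / target_ratio))


# Made-up example: 40 patrons waiting on 5 copies, with a five-to-one goal.
holds, copies, target = 40, 5, 5
ratio = holds_ratio(holds, copies)
extra = max(0, copies_needed(holds, target) - copies)
print(f"holds ratio {ratio:.1f} to 1; buy {extra} more copies to reach {target} to 1")
```

In this invented case the library would be at eight to one and would need three more copies to reach its goal, which is money that could otherwise have gone toward other titles.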


Libraries are also culturally resistant to tracking what their patrons do. They’ve long been fastidious about scrubbing records of what you check out to guard your privacy. Keeping tallies of what you read, even if those tables are aggregated and anonymized, might send the wrong signal.

The publishing industry is driven by different motivations, chiefly sales. Amazon, for instance, comes up with its “recommended for you” titles based on past purchases. The company — it did not respond to requests for comment for this story — doesn’t reveal how it might use Kindle-produced data on how thoroughly you’ve read the books you bought.

E-book sellers collect reading data through a sync function. To allow readers to hop from one device to the next as they move through a book — from desktop to tablet to phone — the syncing tells you what page of the book you last had open.

That’s where the numbers come from. Even the most avid readers of e-books never open about a fourth of the electronic titles they buy. Average readers start just 60 percent of their e-books and finish only 40 percent, says e-book retailer Kobo, “meaning that on average a digital reader buys almost twice the amount of books that they read from cover to cover.”
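
As a rough illustration of how synced page positions could be rolled up into figures like these, here is a minimal Python sketch. The record layout, the names and the 95 percent "finished" threshold are all assumptions made for this example, not a description of how Kobo, Amazon or any other retailer actually computes its numbers.

```python
from dataclasses import dataclass


@dataclass
class SyncRecord:
    """One purchased e-book and the furthest page any device synced (hypothetical layout)."""
    title: str
    furthest_page: int  # 0 means the book was never opened
    total_pages: int


# Assumption: a reader who reaches 95% of the pages counts as having finished.
FINISHED_THRESHOLD = 0.95


def reading_shares(records):
    """Return the share of purchased titles that were started and that were finished."""
    if not records:
        raise ValueError("need at least one record")
    started = sum(1 for r in records if r.furthest_page > 0)
    finished = sum(
        1 for r in records
        if r.furthest_page / r.total_pages >= FINISHED_THRESHOLD
    )
    return {
        "started": started / len(records),
        "finished": finished / len(records),
    }


# Made-up purchases for one reader, chosen to mirror the 60/40 split quoted above.
purchases = [
    SyncRecord("Title A", 320, 320),  # read to the end
    SyncRecord("Title B", 410, 412),  # effectively finished
    SyncRecord("Title C", 57, 600),   # abandoned early
    SyncRecord("Title D", 0, 250),    # never opened
    SyncRecord("Title E", 0, 180),    # never opened
]

shares = reading_shares(purchases)
print(f"started: {shares['started']:.0%}, finished: {shares['finished']:.0%}")
```

A sketch like this captures only how far a reader got, not why they stopped, which is the limitation the librarians quoted earlier point to.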

The industry assumes that physical books, often gifts or trophies put on shelves to impress, get read far less.

Micah Bowers, the founder and CEO of Bluefire Reader, works with publishing houses on electronic books and makes apps for readers. The company has invested in collecting statistics but found the industry largely indifferent.

“The reality is that it’s not being used yet,” he said. “There’s a fear of data in the book industry. … It’s a very emotional, gut industry. They pride themselves on being able to call winners.”

He argues that book publishers are missing out. Imagine that the data from an e-cookbook shows an extraordinary number of searches for “paleo.” Bowers argues that could be used to sell more copies of the title by promoting its value to people on the caveman-style paleo diet. Or it could reveal to a publisher the promise of putting out a full-on paleo cookbook.

Skeptics abound. Many argue that the publishing world already knows what sells: books by celebrities, romance novels and mysteries. More high-minded literary works, they contend, wouldn’t benefit from geeking out over the numbers.

Consider, suggests McIlroy, the electronic publishing consultant, Harper Lee’s classic “To Kill a Mockingbird.” A great first novel. And even though it pre-dated e-reader data by a half century, it’s safe to say that it wasn’t just widely purchased but widely read. Along comes the discovery of a sort of prequel by Lee, “Go Set a Watchman.”

The metrics on the first book would suggest a hit with the second. Instead, its sales dove after an initial spike because readers and reviewers didn’t find the same heart of “Mockingbird” in “Watchman.”

“There’s no harm in having this data,” McIlroy said. “But I can’t see the action that I would take from it.”

Authors could be the ultimate audience for the metrics. They also figure to be dubious. Thoughtful notes from an editor, input from an agent or a colleague often make a good book great. But crowdsourcing based on e-reader metrics represents less welcome input.

Candice Millard has written two nonfiction works, “The River of Doubt” and “Destiny of the Republic,” that both drew strong reviews and landed on The New York Times’ best seller lists. The Leawood author says data from e-readers are “not something that the publisher shares with me or I seek out.”

“Do I want to know that? No, not really.”

She said her motivation comes from her enthusiasm for the subjects she explores. Millard said she ultimately wants to write books in ways that make sense to her. She saw readers on Amazon criticizing her first book, about Teddy Roosevelt’s journey through the Amazon, for including too much natural history. But understanding the jungle was critical to appreciating Roosevelt’s time there, she said.

“Of course, I hope my books appeal to other people,” Millard said. “But I write what interests me.”

Scott Canon: 816-234-4754, @ScottCanon