Today, data metrics have replaced editorial instinct, and there are very few questions that remain unanswered about what device readers are choosing or how engaged they are with journalistic content. Except for what is perhaps the biggest question of all: what’s the impact of all this data mining on the quality of journalism?

By Ira Basen, Future of News Editor

What would you think if you knew that someone was monitoring how long it was taking you to read a story on your favourite website? And that they also knew what device you were reading it on, what route you took to get to that page and how often you’d been there before?

Chances are you’d be pretty unhappy about your privacy being invaded. You’d complain about the pervasiveness of the “surveillance state.” You’d wonder who gave CSIS or your ISP the right to spy on your online habits that way.

Well, here’s a surprise. There actually are people out there collecting that data, but they’re not who you might think they are. In this case, the snoops are more likely to be the publishers of your favourite news sites—The Globe and Mail, Gawker, CBC, BuzzFeed, New York Times—they’re all doing it.   

Yes, the same people who like to rail on about the evils of government spying are themselves collecting enough data about your online habits to make a CSIS agent blush.  

Data is king

In online publishing today, data is king. In order to drive traffic to your site and make yourself attractive to advertisers, you need to know everything you can possibly know about what kinds of stories will bring readers to your site, keep them there, and get them to come back.

On one level, there’s nothing new about this. Publishers have always been interested in discovering what readers wanted. But until recently, they had to rely on gut instinct, or the results of the occasional survey or focus group.  

But today, data has replaced instinct, and there are very few questions that remain unanswered.  Except for what is perhaps the biggest question of all: what’s the impact of all this data mining on the quality of journalism? 

Charting Chartbeat

The quest to accumulate ever more data about online readers has spawned an enormous new industry. Several companies now offer sophisticated analytical tools to publishers, but the largest is Chartbeat, located in a sprawling sixth-floor office in midtown Manhattan, directly above the legendary Strand Bookstore.

Chartbeat was founded in 2009 and now provides data to about 5,000 publishers and media companies worldwide, including 80 per cent of the top 50 publishers in the U.S. (Time, Forbes, Gawker), and several in Canada, including CBC.      

I recently spent a couple of hours in the Chartbeat office speaking to Joe Alicata, who manages the company’s publishing platform, and Adam Clarkson, a Canadian who is in charge of user experience. We were sitting in front of a large wall-mounted computer monitor as they described the kind of data they accumulate and make available to their clients.

“Every 15 seconds we get a packet of data which describes exactly where the user is on the page,” Clarkson explained. “We know whether they have scrolled or typed on their keyboard, where they came from, from a traffic source standpoint and a geographical standpoint and whether they've been to the site X number of times in the past 30 days or not.”

All of this data is important because these days, the kinds of stories news outlets cover and the way they cover them largely depend on when users are reading stories and what devices they’re reading them on.

For example, in the mornings, when people tend to get their news on their phones while on their way to work, a few hundred words written as a list (“7 Things You Should Know About the Fair Elections Act”) will generate more traffic than a 2,000-word article.

All about engagement

But the most important metric for publishers today is what the people at Chartbeat call engagement, or how long readers spend on a page. The longer you spend reading a story, the more likely you are to notice an ad on that page, and maybe even click on it. That’s why advertisers tend to be more interested in sites with lots of “engaged” users than in sites that simply attract a lot of pageviews.

But how can data analysts in an office in New York City possibly know how long it takes you to read an article? The answer lies in that mouse you’re holding in your hand.

“Users typically read with their mouse,” Joe Alicata explained. “They show activity by mouse movements every 5 or so seconds on average. So we use that as a threshold to say if we don't see activity occurring then we mark that user as idle at that point and when they start interacting again, we effectively start the timer again and say they're an engaged active user.”
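For readers curious about how such a timer might work, here is a minimal sketch in Python of the idea Alicata describes. It is not Chartbeat's actual code. The function name `engaged_seconds`, the list of activity timestamps and the exact handling of the idle gap are all assumptions made for illustration; only the roughly five-second threshold comes from the interview.

```python
def engaged_seconds(activity_times, idle_threshold=5.0):
    """Estimate engaged time from timestamps (in seconds) of user activity.

    Hypothetical illustration only, not Chartbeat's implementation.
    Assumption: after each mouse movement, scroll or keystroke the reader
    counts as engaged until the next activity or until `idle_threshold`
    seconds pass, whichever comes first. Time after the final recorded
    activity is not counted.
    """
    times = sorted(activity_times)
    total = 0.0
    # Walk through consecutive pairs of activity events.
    for previous, current in zip(times, times[1:]):
        gap = current - previous
        # A long gap means the reader went idle, so cap what we count.
        total += min(gap, idle_threshold)
    return total


if __name__ == "__main__":
    # Activity at 0, 2 and 4 seconds, a long pause, then activity at 30 and 33.
    print(engaged_seconds([0, 2, 4, 30, 33]))
```

In this example the reader is credited with 12 seconds of engagement rather than the full 33: the 26-second pause counts for only five seconds before the reader is treated as idle, and the timer picks up again when activity resumes at the 30-second mark.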

Too much information?    

The massive quantities of data that companies such as Chartbeat make available are both a blessing and a curse for editors and publishers. They can now tailor their content to meet their readers’ needs in ways that would have been inconceivable just a few years ago. And giving customers what they want has always been a pretty reliable key to running a successful business.

Plus, trying to increase “engagement” by getting readers to spend more time on the page may actually be good for journalism. Sites that simply chase pageviews tend to skew towards shallow and superficial content. But if you want readers to spend a few minutes on your story, you’d better give them something worth reading.

On the other hand, too much information can be hazardous to the health of journalism. It’s fine to give readers what you know they want, but what they don’t want is also important, and so is introducing them to stories they don’t yet know they want because they haven’t been exposed to them. Great journalism is about challenging readers, not pandering to them.

Adam Clarkson of Chartbeat said that when they first started sending out a continuous stream of data to publishers, several of them expressed concerns about where it might all be leading.

“You're giving me a realtime dashboard that's telling me what's popular right now,” Clarkson recalls some clients saying. “That's going to lead me to just continuing to write about the things that are most clicked on which is probably going to be wardrobe malfunctions or content that isn't necessarily the type of quality that we want to write about.”

Clarkson’s response: “Don’t deviate from your essential editorial plan. Look for a specific audience and find the fit between that audience and the content you produce.” In other words, don’t chase the same audience everyone else is chasing by seeking the lowest common denominator. 

Today’s sophisticated metrics make it easy to ignore gut instincts and make editorial decisions based on real-time numbers and data about stories that are trending. Maybe too easy.    

Listen to Ira Basen’s documentary, 13 Fascinating Things You Probably Didn't Know About Online News, which aired on CBC.

Tamara Baluja is an award-winning journalist with CBC Vancouver and the 2018 Michener-Deacon fellow for journalism education. She was the associate editor for J-Source from 2013-2014.