Screens Might Be as Bad for Mental Health as ... Potatoes

Psychologists can’t seem to agree on what technology is doing to our sense of well-being. Some say digital devices have become a bane of modern life; others claim they’re a balm for it. Between them lies a shadowy landscape of non-consensus: As the director of the National Institutes of Health recently told Congress, research into technology’s effects on our thoughts, behaviors, and development has produced limited, and often contradictory, findings.

As if that uncertainty weren’t vexing enough, many of those findings have sprung from the same source: Giant data sets that compile survey data from thousands or even millions of participants. “The problem is, two researchers can look at the same data and come away with completely different findings and prescriptions for society,” says psychologist Andrew Przybylski, director of research at the Oxford Internet Institute. “Technological optimists tend to find positive correlations. If they’re pessimists, they tend to find negative ones.”

In the latest issue of Nature Human Behaviour, Przybylski and coauthor Amy Orben use a novel statistical method to show why scientists studying these colossal data sets have been getting such different results, and why most of the associations researchers have found, positive and negative, are very small and probably not worth freaking out about.

Consider the Millennium Cohort Study. A long-term study on the health outcomes of kids born in the UK in 2000 and 2001, the survey contains dozens of questions whose answers a researcher could reasonably interpret as relevant to a person’s well-being. Those questions span topics as disparate as self-esteem, suicidal thoughts, and overall life satisfaction. “But different researchers have different conceptions of well-being and can choose different questions to fit that conception,” Orben says.

Whether they realize it or not, a researcher who chooses to focus only on certain questions is making a decision to pursue one analytical path to the exclusion of many, many others. How many? In the case of the MCS, combining the survey’s questions on well-being with those on things like TV watching, videogame habits, and social media use produces a total of 603,979,752 analytical paths a researcher could take. Combine them with questions directed to the caregivers of study participants, and that figure balloons to 2.5 trillion.
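To see how quickly those paths multiply, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that a researcher defines well-being as any non-empty subset of the available questions, then pairs it with one technology measure and one choice about control variables. The function names and the counts passed in (10 questions, 5 measures, 2 control options) are hypothetical and are not the actual figures behind the MCS totals.

```python
def count_outcome_definitions(n_questions: int) -> int:
    """Number of ways to pick a non-empty subset of n_questions survey items."""
    return 2 ** n_questions - 1


def count_analytical_paths(n_wellbeing_questions: int,
                           n_tech_measures: int,
                           n_control_options: int) -> int:
    """Paths = (outcome definitions) x (technology measures) x (control choices).

    This multiplication rule is an illustrative assumption, not the
    counting scheme used in the study.
    """
    return (count_outcome_definitions(n_wellbeing_questions)
            * n_tech_measures
            * n_control_options)


# Hypothetical survey: 10 well-being questions, 5 technology measures,
# and 2 control options (with or without control variables).
toy_paths = count_analytical_paths(10, 5, 2)
print(toy_paths)
```

Even this toy survey yields more than 10,000 distinct analyses, and every added question roughly doubles the total.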

Granted, the vast majority of those 2.5 trillion results are not all that interesting. But the sprawling nature of these data sets allows for associations to emerge that are technically statistically significant but are very, very small. In science, large sample sizes are generally considered to be a good thing. Yet when you combine the large number of analytical paths afforded by subjective survey questions with an enormous number of survey participants, it opens the door to statistical skullduggery like p-hacking—the practice of fishing for favorable results in a large set of data.
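A minimal sketch shows why sample size matters here. It takes one very weak correlation and estimates its p-value at two sample sizes, using a normal approximation to the test statistic (an assumption that is reasonable for large samples). The correlation of 0.02 and both sample sizes are made-up values chosen for illustration, and `approx_p_value` is a hypothetical helper, not a function from any statistics package.

```python
import math


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r across n people.

    Converts r to a t statistic, then treats that statistic as standard
    normal. This is an approximation, not an exact t-test.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


r = 0.02                                       # hypothetical, very weak correlation
small_sample_p = approx_p_value(r, 100)        # 100 participants
large_sample_p = approx_p_value(r, 1_000_000)  # one million participants
variance_explained = r * r                     # share of variance this r accounts for

print(small_sample_p, large_sample_p, variance_explained)
```

The same tiny correlation is nowhere near significant with 100 participants but clears the conventional 0.05 threshold easily with a million, even though it accounts for the same sliver of variance in both cases.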

“Researchers will essentially torture the data until it gives them a statistically significant result that they can publish,” Przybylski says. (Not all researchers who report such results do so with the intention to deceive. But researchers are people; science as an institution may strive for objectivity, but scientists are nevertheless susceptible to biases that can blind them to their misuse of data.) “We wanted to move past this kind of statistical cherry-picking. So we decided to look for a data-driven method to collect the whole orchard, all at once.”

Przybylski and Orben found that method in a statistical tool called specification curve analysis. Rather than investigate a single analytical path through the Millennium Cohort Study, SCA allowed them to investigate 20,000 of them. It also permitted them to probe all 41,338 paths through two other large-scale data sets, called Monitoring the Future and the Youth Risk and Behaviour Survey, that are commonly used to assess the association between digital habits and adolescent well-being.
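The general idea can be sketched in a few lines, with heavy caveats. The toy below runs every combination of well-being questions against every technology measure on synthetic data, then sorts the resulting effects into a curve. It is not the authors' implementation: it uses simple correlations rather than regressions with control variables, skips any significance testing, and every variable name and number in it is invented for illustration.

```python
import random
from itertools import combinations
from statistics import median


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def specification_curve(outcomes, predictors):
    """Run one analysis per (subset of outcome questions, predictor) pair.

    Each subset of questions is averaged into a composite well-being
    score. Returns (effect, questions, predictor) tuples sorted by effect.
    """
    names = sorted(outcomes)
    n_people = len(outcomes[names[0]])
    results = []
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            composite = [sum(outcomes[q][i] for q in subset) / k
                         for i in range(n_people)]
            for pred_name, pred_values in predictors.items():
                effect = pearson(pred_values, composite)
                results.append((effect, subset, pred_name))
    results.sort(key=lambda row: row[0])
    return results


# Synthetic data: 500 imaginary teens, two technology measures,
# three well-being questions with a weak built-in link to TV hours.
rng = random.Random(0)
n = 500
tech = {
    "tv_hours": [rng.uniform(0, 6) for _ in range(n)],
    "social_media_hours": [rng.uniform(0, 6) for _ in range(n)],
}
wellbeing = {
    name: [5 - 0.05 * tech["tv_hours"][i] + rng.gauss(0, 1) for i in range(n)]
    for name in ("self_esteem", "life_satisfaction", "mood")
}

curve = specification_curve(wellbeing, tech)
effects = [effect for effect, _, _ in curve]
print(len(curve), min(effects), median(effects), max(effects))
```

Reporting the spread and the median of the whole curve, rather than one hand-picked specification, is what keeps any single analytical choice from driving the conclusion.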

The result was a series of visualizations that map the wide gamut of potential effects researchers could detect in the three repositories, and they reveal several important things: One, that small changes in analytical approach can lead to dramatically different findings along that spectrum. Two, that the correlation between technology use and well-being is negative. And three, that this correlation is very, very small, explaining—at most—0.4 percent of the variation in adolescent well-being.
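For a sense of scale, the 0.4 percent figure can be converted into a correlation coefficient. The sketch below assumes the figure behaves like the R-squared of a simple one-predictor linear association, so the correlation is its square root with a negative sign; that reading is an assumption made for illustration, not a calculation reported by the researchers.

```python
import math

variance_explained = 0.004  # 0.4 percent, the upper bound quoted above

# Assumption: treat the figure as R-squared from a single-predictor linear
# association, so |r| is its square root. The sign is negative because the
# reported correlation is negative.
correlation = -math.sqrt(variance_explained)
variance_unexplained = 1 - variance_explained

print(round(correlation, 3), variance_unexplained)
```

Under that assumption the correlation is roughly -0.06, which leaves about 99.6 percent of the variation in adolescent well-being to be explained by something other than technology use.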

To put it in perspective, the researchers compared the link between technology use and adolescent well-being to that of other factors examined by the large-scale data sets. “Using technology is about as associated with well-being as eating potatoes,” Przybylski says. In other words: hardly at all. By the same logic, bullying had an effect size four times greater than screen use. Smoking cigarettes? 18 times. Conversely, getting enough sleep and eating breakfast were positively associated with adolescent well-being at a magnitude 44 and 30 times that of technology use, respectively.

Put another way: Technology’s impact on well-being might be statistically significant, but its practical significance, according to existing data sets, appears negligible. “The level of association documented in this study is incongruent with the level of panic we see around things like screen time,” says University of California, Irvine psychologist Candice Odgers, who researches how technology affects kids’ development and was unaffiliated with the study. “It really highlights the disconnect between conversations in the public sphere and what the bulk of the data are showing us.”

What the study doesn’t do is close the book on questions surrounding technology’s effects. Instead, it highlights the need for more nuanced questions. Not all screen time is created equal, but most studies to date treat it as monolithic. “That’s like asking if food is good or bad for you, and in the end, questions like that will never help us,” Orben says. “We need to stop the debate about the effect of generic tech use on well-being and open space for more and better research about the kind of technologies people are using, who’s using them, and how.”