Exploring limits to prediction in complex social systems

2016-02-02
How predictable is success in complex social systems? In spite of a recent profusion of prediction studies that exploit online social and information network data, this question remains unanswered, in part because it has not been adequately specified. In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand, and inherent unpredictability of complex social systems on the other. We then use this model to motivate an illustrative empirical study of information cascade size prediction on Twitter. Despite an unprecedented volume of information about users, content, and past performance, our best-performing models can explain less than half of the variance in cascade sizes. In turn, this result suggests that even with unlimited data, predictive performance would be bounded well below deterministic accuracy. Finally, we explore this potential bound theoretically using simulations of a diffusion process on a random scale-free network similar to Twitter. We show that although higher predictive power is possible in theory, such performance requires a homogeneous system and perfect ex ante knowledge of it: even a small degree of uncertainty in estimating product quality or slight variation in quality across products leads to substantially more restrictive bounds on predictability. We conclude that realistic bounds on predictive accuracy are not dissimilar from those we have obtained empirically, and that such bounds for other complex social systems for which data is more difficult to obtain are likely even lower.
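
The idea of a bound on explained variance can be illustrated with a toy simulation. The sketch below is not the paper's model or code: it assumes a preferential-attachment graph as a stand-in for a scale-free network, an independent-cascade spreading rule, and a per-edge transmission probability as a stand-in for "product quality". All function names, sizes, and probabilities are hypothetical choices made for illustration. It measures how much of the variance in cascade size would be explained by an oracle that knows each product's quality exactly but not the random seed user or the random path of the spread.

```python
import random
from statistics import mean, pvariance


def preferential_attachment_graph(n, m, rng):
    """Grow an undirected graph in which new nodes prefer high-degree nodes."""
    neighbors = {i: set() for i in range(n)}
    endpoints = []  # each node appears once per incident edge
    # Start from a small clique of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            neighbors[i].add(j)
            neighbors[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Sampling an edge endpoint picks a node proportionally to degree.
            targets.add(rng.choice(endpoints))
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            endpoints += [new, t]
    return neighbors


def cascade_size(neighbors, seed, p, rng):
    """Independent cascade: each newly active node gets one chance to
    activate each inactive neighbor, succeeding with probability p."""
    active = {seed}
    frontier = [seed]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in neighbors[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def explained_variance(groups):
    """R^2 obtained by predicting every outcome with its own group's mean."""
    all_sizes = [s for g in groups for s in g]
    residual = 0.0
    for g in groups:
        mu = mean(g)
        residual += sum((s - mu) ** 2 for s in g)
    residual /= len(all_sizes)
    return 1 - residual / pvariance(all_sizes)


rng = random.Random(0)
graph = preferential_attachment_graph(n=500, m=2, rng=rng)
qualities = [0.05, 0.10, 0.15, 0.20]  # assumed values, one group per quality
groups = [
    [cascade_size(graph, rng.randrange(500), q, rng) for _ in range(200)]
    for q in qualities
]
r2 = explained_variance(groups)
print(f"R^2 of an oracle that knows quality exactly: {r2:.2f}")
```

Because the group mean is the best possible prediction given quality alone, the printed value is an upper limit for any model restricted to that information in this toy setting. Whatever variance remains comes from the choice of seed and the randomness of the spreading process itself.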

Related Articles

2017-03-01
1703.00535 | stat.ML

Many recommendation algorithms rely on user data to generate recommendations. However, these recomme…

2018-05-08

Gender prediction has typically focused on lexical and social network features, yielding good perfor…

2019-01-16

Microblogging platforms constitute a popular means of real-time communication and information sharin…

2019-01-29

Social media communications are becoming increasingly prevalent; some useful, some false, whether un…

2019-03-18
1903.07562 | physics.soc-ph

Human achievements are often preceded by repeated attempts that initially fail, yet little is known …

2019-03-27

The ability to track and monitor relevant and important news in real-time is of crucial interest in …

2019-04-15

Blended courses that mix in-person instruction with online platforms are increasingly popular in sec…

2019-03-12

Social network and publishing platforms, such as Twitter, support the concept of a secret proprietar…

2018-08-09

Social media, once hailed as a vehicle for democratization and the promotion of positive social chan…

2017-11-02
1711.00726 | cs.SI

Recent work has done a good job in modeling rumors and detecting them over microblog streams. Howev…

2017-07-20

Understanding the dynamics of social interactions is crucial to comprehend human behavior. The emerg…

2018-10-05

Modeling human behavioral data is challenging due to its scale, sparseness (few observations per ind…

2019-02-10

Social media are nowadays one of the main news sources for millions of people around the globe due t…

2018-12-21

The prediction of information diffusion or cascade has attracted much attention over the last decade…

2018-09-11

Models of contagion dynamics, originally developed for infectious diseases, have proven relevant to …

2018-10-25

Online forums provide rich environments where users may post questions and comments about different …