The Deceptive Practice of Faking the System: The Case of Rome B&Bs Reviews
23 Pages. Posted: 21 Oct 2018
Date Written: September 24, 2018
We present an innovative technique, based on unsupervised machine learning, for revealing potentially fake items in a corpus of user-generated reviews. We focus on reviews of B&Bs in Rome (taken from a major portal in the sector), drawing attention to the impact that fake reviews have on the rankings of the corresponding establishments. After an initial exploration of the dataset, we analyzed a substantial number of reviews (more than 237,000). The analysis shows that more than 50,000 reviews were written by users with a common username pattern and share tightly clustered stylistic and lexical features. We consider these reviews to be fake. If policy makers decided to regulate the online review sector by removing all fakes, 44 B&Bs would disappear completely from the portal's rankings.
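As a rough illustration of the final step described in the abstract (removing flagged reviews and observing the effect on rankings), the following minimal sketch may help. It is not the authors' method: the username pattern, the function names, the ranking rule (mean rating), and the toy data are all assumptions introduced here for illustration, and the unsupervised clustering of stylistic and lexical features is not reproduced.

```python
import re
from collections import defaultdict

# Hypothetical username pattern (an assumption, not taken from the paper):
# a run of letters followed by at least three digits, e.g. "user48213".
SUSPECT_USERNAME = re.compile(r"^[A-Za-z]+\d{3,}$")


def is_suspect(username):
    """Return True if the username matches the hypothetical pattern."""
    return SUSPECT_USERNAME.match(username) is not None


def rank_after_removal(reviews):
    """Drop suspect reviews, then rank the remaining B&Bs by mean rating.

    reviews: iterable of (bnb_name, username, rating) tuples.
    Returns (ranking, dropped), where `dropped` lists the B&Bs left with
    no reviews at all, which therefore vanish from the ranking.
    """
    kept = defaultdict(list)
    all_bnbs = set()
    for bnb, username, rating in reviews:
        all_bnbs.add(bnb)
        if not is_suspect(username):
            kept[bnb].append(rating)
    # Highest mean rating first; ties broken alphabetically by name.
    ranking = sorted(kept, key=lambda b: (-sum(kept[b]) / len(kept[b]), b))
    dropped = sorted(all_bnbs - set(kept))
    return ranking, dropped


# Toy data, invented for illustration only.
toy_reviews = [
    ("B&B Alfa", "giulia_r", 4),
    ("B&B Alfa", "user48213", 5),
    ("B&B Beta", "user77120", 5),
    ("B&B Beta", "guest90311", 5),
    ("B&B Gamma", "travel.tom", 5),
]
ranking, dropped = rank_after_removal(toy_reviews)
print("Ranking after removal:", ranking)
print("Dropped from ranking:", dropped)
```

In this toy setup, a B&B whose reviews all come from pattern-matching usernames ends up with nothing to rank, which mirrors in miniature how a listing could disappear entirely once flagged reviews are removed.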
Keywords: machine learning, spam detection techniques, electronic word of mouth, regulatory systems, consumer protection.
JEL Classification: D8, D18, D83, G18, L83