Paper Introduction
Online reviews of products and services have become significantly more important over the last two decades. Reviews sway the probability of purchase through both review score and review volume~\cite{MASLOWSKA20171}. Statistics show that 90% of consumers read reviews before making a purchase \cite{}. It has been found that, on average, the conversion rate of a product increases by 270% as it gains reviews; for high-priced products, reviews can increase the conversion rate by as much as 380%~\cite{Askalidis16}. Under such competitive, ranked conditions it becomes worthwhile for unscrupulous merchants to create fake reviews: on TripAdvisor, in 80% of cases a hotel could be made more visible than a competing hotel using just 50 deceptive reviews \cite{}. Faking reviews is an established problem \cite{}, and one that is actively exploited: roughly 1 in 5 (20%) of Yelp reviews are marked as fake by Yelp's own filtering algorithm \cite{DBLP:journals/corr/abs-1805-10364}.
We treat opinion spam detection as a binary classification problem, in which each review is classified as either deceptive or genuine. However, the biggest problem in this domain is the lack of large, labelled datasets \cite{}. Large corpora of unlabelled online reviews exist, but manual labelling is costly, tedious and subjective \cite{}. The most widely used dataset is OpSpam \cite{}, consisting of 1,600 reviews of 20 Chicago hotels: 400 genuine positive and 400 genuine negative reviews taken from TripAdvisor, and 400 deceptive negative and 400 deceptive positive reviews collected via Mechanical Turk \cite{}. However, it is worth noting that OpSpam may not be indicative of actual real-world opinion spam \cite{}. Another is the much larger corpus of 1,035,045 Yelp reviews produced by Rayana et al. using the Yelp spam filter, split into three subsets: 359,052 New York restaurant reviews, 67,395 Chicago hotel and restaurant reviews, and 608,598 restaurant reviews from five states, ordered by zip code. There is also a dataset known as DeRev \cite{}, consisting of 6,819 reviews downloaded from Amazon, covering 68 books and written by 4,811 different reviewers, subjectively split using suspicion clues into 41% deceptive and 59% genuine. Finally, there is a small dataset compiled by Li et al. \cite{}, consisting of the OpSpam hotel data together with positive restaurant and doctor reviews gathered using Mechanical Turk; it was mainly used to run cross-domain experiments and is rarely used in recent research.
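To make the binary framing concrete, the sketch below shows one way a labelled corpus such as OpSpam could be read into review/label pairs. The directory layout assumed here (an `op_spam_v1.4` root whose subfolder names contain "deceptive" or "truthful") is an assumption about the released archive, not something specified on this page; paths should be adjusted to the local copy of the corpus.

```python
# Minimal sketch: load OpSpam-style review files into (text, label) pairs.
# The op_spam_v1.4 folder layout is assumed, not guaranteed; adjust the
# root path and label heuristic to match your copy of the corpus.
from pathlib import Path

def load_opspam(root="op_spam_v1.4"):
    texts, labels = [], []
    for path in Path(root).rglob("*.txt"):
        # Label 1 = deceptive, 0 = genuine, inferred from the folder name.
        label = 1 if "deceptive" in str(path.parent).lower() else 0
        texts.append(path.read_text(encoding="utf-8", errors="ignore"))
        labels.append(label)
    return texts, labels

if __name__ == "__main__":
    texts, labels = load_opspam()
    print(f"{len(texts)} reviews, {sum(labels)} labelled deceptive")
```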
In our research, we trained a number of supervised statistical models, of which the SVM performed best. We also trained a number of supervised neural models, of which convolutional networks performed best. Finally, we progressed to experimenting with semi-supervised approaches, namely Generative Adversarial Networks (GANs), as recent papers have shown promise in matching or beating state-of-the-art results by supplementing supervised classifiers with generative models. In the following sections we explore the classification problem in more detail, present the results of our experiments and research, and offer some ideas about possible future developments in the domain of opinion spam detection.
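For illustration, here is a hedged sketch of the kind of supervised statistical baseline mentioned above: a linear SVM over TF-IDF features using scikit-learn. The feature set (word uni- and bi-grams) and the hyperparameters are illustrative assumptions, not the exact configuration used in our experiments.

```python
# Sketch of a supervised SVM baseline for deceptive-review classification.
# TF-IDF uni/bi-gram features and the SVM settings are illustrative choices,
# not necessarily the configuration used in the experiments described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def train_svm_baseline(texts, labels):
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=42
    )
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True, min_df=2),
        LinearSVC(C=1.0),
    )
    model.fit(X_train, y_train)
    print(classification_report(y_test, model.predict(X_test),
                                target_names=["genuine", "deceptive"]))
    return model
```

Combined with the loader sketched earlier, `train_svm_baseline(*load_opspam())` would train and evaluate such a baseline on a random held-out split.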
- ACLSW 2019
- Our datasets
- Experiment Results
- Research Analysis
- Hypothesis
- Machine Learning
- Deep Learning
- Paper Section Drafts
- Word Embeddings
- References/Resources
- Correspondence with H. Aghakhani
- The Gotcha! Collection