
Data Science Research Analysis


Analysis of research papers in opinion spam detection in the last 24 months

[Niall Walsh, 13/11/18]

The godfather of all opinion spam detection research appears to be the 2008 paper by Jindal and Liu. It has 1116 citations and has been cited by almost every paper on the topic since.

Also, an incredibly comprehensive and useful paper published in 2015 by Crawford dissects the approaches taken so far across supervised, unsupervised and semi-supervised learning, in both the review-centric and reviewer-centric domains.

We now wish to survey the most recent cutting-edge methods used to tackle this problem in the scholarly domain, to get a better idea of the best direction going forward and of what can be improved on for our own publication.

I did this using Google Scholar: for each dataset we use, I searched for the paper that originally produced it, found the papers that cite it, and sorted them by most recent.

Yelp CHI, Yelp NYC, Yelp ZIP

Rayana and Akoglu, 2015: 'Collective Opinion Spam Detection: Bridging Review Networks and Metadata'

The above was found to be the first paper to use the three large Yelp datasets and to make them publicly available. I had to contact Shebuti Rayana by email in order to obtain access.

There were 93 listed citations on Google Scholar. Here is the full list.

The most relevant of these appear to be as follows:

This paper provides a highly useful, in-depth cross-examination of a wide range of research papers concerning opinion spam detection: the methods used, datasets, features, and the criteria that separate 'spam' from 'ham'. It also looks at how a domain-specific spam detector (e.g. Reviewskeptic) can be adapted, using the criteria outlined in the paper, to perform better cross-domain.

This paper builds on the 2016 paper by Wang listed below, extending the word embeddings with a domain classifier so that the resulting classifier is more domain-adaptive. This is proposed as a solution to the cold-start problem in deception detection, where the current model has not seen enough data to classify reviews in a new domain. The results were very conclusive, significantly beating the cutting edge on the same data. (A hedged sketch of the domain-classifier idea follows below.)
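
The mechanism described here, a domain classifier attached to the shared review representation so that the learned features transfer to a new domain, is in the spirit of gradient-reversal domain adaptation. The sketch below is a hypothetical PyTorch illustration of that idea under my own assumptions (the `GradReverse` layer, the `EmbeddingBag` encoder and the two heads are placeholders), not the architecture from the paper above.

```python
# Hypothetical gradient-reversal sketch of a domain-adaptive spam classifier
# (illustrative only; not the architecture used in the paper above).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, negated (scaled) gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainAdaptiveSpamModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, n_domains=2):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, emb_dim)   # shared review encoder (mean of word embeddings)
        self.spam_head = nn.Linear(emb_dim, 2)                 # deceptive vs. genuine
        self.domain_head = nn.Linear(emb_dim, n_domains)       # e.g. hotel vs. restaurant

    def forward(self, token_ids, offsets, lam=1.0):
        h = self.encoder(token_ids, offsets)
        spam_logits = self.spam_head(h)
        # The domain head sees reversed gradients, pushing the shared encoder
        # toward domain-invariant (and hence more transferable) features.
        domain_logits = self.domain_head(GradReverse.apply(h, lam))
        return spam_logits, domain_logits
```

During training the spam loss and the domain loss would be summed, so the encoder learns features that predict deception while the reversed gradient discourages features that merely give away the domain.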

  • Wang et al, 2016: 'Learning to Represent Review with Tensor Decomposition for Spam Detection'

    This paper proposes the use of tensor decomposition to represent the relationships between reviewers and products, using 11 automatically generated relations built from two basic patterns. This representation is then concatenated with the review text to give a more accurate representation of a review, so classifiers can detect opinion spam more readily. The results showed that combining the reviewer embeddings, product embeddings and review-text bigrams increased accuracy on hotels and restaurants by about 5% using a 50:50 data balance. (A hedged sketch of the feature-concatenation step follows below.)
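
As a rough illustration of that feature-concatenation step (not the paper's actual pipeline), the sketch below joins placeholder reviewer/product embeddings, standing in for the factor matrices a tensor decomposition would produce, with review-text n-gram counts and trains a linear SVM. The names, dimensions and toy data are all assumptions.

```python
# Sketch: concatenate (placeholder) reviewer/product embeddings with review-text
# n-gram features and train a linear classifier. The tensor decomposition itself
# is not reproduced; `reviewer_emb` / `product_emb` stand in for its factor matrices.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

def build_features(reviews, reviewer_emb, product_emb, vectorizer):
    """reviews: list of dicts with 'text', 'reviewer_id', 'product_id'."""
    text_ngrams = vectorizer.transform([r["text"] for r in reviews])
    dense = np.vstack([
        np.concatenate([reviewer_emb[r["reviewer_id"]], product_emb[r["product_id"]]])
        for r in reviews
    ])
    return hstack([text_ngrams, csr_matrix(dense)])

# --- toy usage (labels and embeddings are made up) ---
train = [{"text": "great stay, lovely staff", "reviewer_id": 0, "product_id": 0},
         {"text": "worst hotel ever, total scam", "reviewer_id": 1, "product_id": 0}]
y = [0, 1]
rng = np.random.default_rng(0)
reviewer_emb = rng.normal(size=(2, 8))   # stand-in for reviewer factor matrix
product_emb = rng.normal(size=(1, 8))    # stand-in for product factor matrix

vec = CountVectorizer(ngram_range=(1, 2)).fit([r["text"] for r in train])  # unigrams + bigrams
clf = LinearSVC().fit(build_features(train, reviewer_emb, product_emb, vec), y)
```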

DERev Corpus (DEception in Review corpus)

Fornaciari et al, 2014: 'Identifying fake Amazon reviews as learning from crowds'

The above paper introduced the DERev dataset. I had to email Fornaciari in order to obtain access.

There were 23 listed citations on Google Scholar. Here is the full list.

Sorted by date, the most interesting and relevant of these appear to be as follows:

OpSpam Corpus

Ott et al., 2011: 'Finding Deceptive Opinion Spam by Any Stretch of the Imagination'

Ott et al., 2013: 'Negative Deceptive Opinion Spam'

Above are the papers where the OpSpam dataset was compiled. It is available freely for download on Myle Ott's personal site. This dataset was published as the first 'gold standard' review deception corpus.

Important takeaways from these papers:

  • Standard n-gram text categorization techniques can detect negative deceptive opinion spam with performance far surpassing that of human judges (a minimal sketch of this baseline follows after this list).
  • The combination of negative and positive deceptive and genuine reviews revealed correlations between deception and sentiment:
    • Fake reviews included less spatial language.
    • Negative fake reviews had more verbs relative to nouns than truthful ones.
    • Highly emotional language was frequent in fake reviews ('terrible', 'disappointed', 'luxurious', 'elegant').
    • Emphasis of self is less evident in fake negative reviews than in positive ones.
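
The n-gram baseline from the first takeaway is straightforward to reproduce with scikit-learn. The sketch below is a minimal illustration rather than Ott et al.'s exact pipeline: the feature set (TF-IDF unigrams + bigrams), the LinearSVC classifier and the 5-fold evaluation are my assumptions, and loading the OpSpam text files is left out.

```python
# Minimal n-gram baseline for deceptive-review classification, in the spirit of
# Ott et al. (assumed setup: TF-IDF unigrams + bigrams, linear SVM, 5-fold CV).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def ngram_baseline(texts, labels):
    """texts: list of review strings; labels: 1 = deceptive, 0 = truthful."""
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), lowercase=True),  # unigrams + bigrams
        LinearSVC(C=1.0),
    )
    scores = cross_val_score(model, texts, labels, cv=5)      # mean accuracy estimate
    return model.fit(texts, labels), scores.mean()
```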

There were 748 listed citations on Google Scholar. Here is the full list.

Sorted by date, the most relevant of these citations are as follows:

  • Aghakhani et al, 2018: 'Detecting Deceptive Reviews using Generative Adversarial Networks'

    The above paper is the first to apply generative adversarial networks (GANs) to the deceptive review detection problem, so the chance of finding novelty in this approach is high, as is the opportunity to beat the current cutting edge. Main takeaways:

    • GANs are rarely stable; however, the use of two discriminator networks in this study was found to maintain stability, as it can prevent mode collapse by learning from both the genuine and deceptive distributions.
    • The generator is modeled as a stochastic policy agent in reinforcement learning (RL). It is an RNN (LSTM) network pre-trained using MLE (Maximum Likelihood Estimation). The discriminators, D and D', are CNNs pre-trained by minimizing the cross-entropy. D and D' use Monte Carlo search to estimate the intermediate action-value and pass it to the generator as the RL reward.
    • It was found that this method, although a semi-supervised learning approach, matched the accuracy of state-of-the-art supervised machine learning approaches, showing that FakeGAN is effective at detecting fake reviews.
    • FakeGAN's accuracy converges at 89.1% with d = 6 and g = 1. A traditional GAN architecture only manages 84%; the state of the art is currently ~89.8%.
  • Kumar et al, 2018: 'Detecting Review Manipulation on Online Platforms with Hierarchical Supervised Learning'

    This paper applies a hierarchical approach to classification that models interactions between users as univariate and multivariate distributions, stacks those distributions, and evaluates them using classifiers we have already looked at, including SVM and kNN. It contributes to the literature by incorporating distributional aspects of features into ML techniques.

  • Wang, 2017: '“Liar, Liar Pants on Fire”: A New Benchmark Dataset for Fake News Detection'

    This paper, although not directly involved in fake review detection, provides a significant new dataset for the modelling of deceptive text, and can be used for generating stance detection models.

  • Ren et al, 2017: 'Neural networks for deceptive opinion spam detection: An empirical study'

    The above paper is a great resource due to its comprehensive treatment of conventional neural network techniques applied to opinion spam detection. It addresses the problem that linguistic and psychological cues alone are not enough to accurately encode the semantic meaning of a review. A neural-network-based approach is therefore proposed to learn a document-level representation of a review: sentence representations are learned with a CNN, then combined via a gated RNN to yield a document vector. These vectors are used directly as features for detecting opinion spam. In-domain and cross-domain tests both outperform the state of the art, supporting the hypothesis that deep learning is the way to go. (A compressed sketch of this architecture follows below.)
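
Below is a compressed PyTorch sketch of that architecture: a CNN encodes each sentence, and a gated RNN (here a GRU) runs over the sentence vectors to produce a document representation used for classification. Dimensions, filter sizes and pooling choices are my own assumptions, not Ren et al.'s reported configuration.

```python
# Sketch of a CNN-sentence / GRU-document encoder for opinion spam detection.
# Illustrative only: dimensions, filter sizes and pooling are assumptions,
# not the configuration reported by Ren et al.
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    """Encodes one sentence (a sequence of word ids) into a fixed-size vector."""
    def __init__(self, vocab_size, emb_dim=100, n_filters=64, kernel_size=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=1)

    def forward(self, word_ids):                      # (batch, sent_len)
        x = self.emb(word_ids).transpose(1, 2)        # (batch, emb_dim, sent_len)
        x = torch.relu(self.conv(x))                  # (batch, n_filters, sent_len)
        return x.max(dim=2).values                    # max-pool over words

class DocumentModel(nn.Module):
    """GRU over sentence vectors; the final state is the document representation."""
    def __init__(self, vocab_size, n_filters=64, doc_dim=64):
        super().__init__()
        self.sent_enc = SentenceCNN(vocab_size, n_filters=n_filters)
        self.gru = nn.GRU(n_filters, doc_dim, batch_first=True)
        self.clf = nn.Linear(doc_dim, 2)              # deceptive vs. genuine

    def forward(self, doc):                           # (batch, n_sents, sent_len) of word ids
        b, s, l = doc.shape
        sent_vecs = self.sent_enc(doc.view(b * s, l)).view(b, s, -1)
        _, h = self.gru(sent_vecs)                    # h: (1, batch, doc_dim)
        return self.clf(h[-1])                        # logits per document
```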
