Estimating sequential bias in online reviews: A Kalman filtering approach


Knowledge-Based Systems 27 (2012) 314–321 Contents lists available at SciVerse ScienceDirect Knowledge-Based Systems journal homepage: www.elsevier...



Knowledge-Based Systems journal homepage: www.elsevier.com/locate/knosys

Riyaz T. Sikora ⇑, Kriti Chauhan

Department of Information Systems, University of Texas at Arlington, P.O. Box 19437, Arlington, TX 76019, United States

Article info

Article history: Received 4 April 2011; received in revised form 7 September 2011; accepted 19 October 2011; available online 3 November 2011

Keywords: Online ratings; Reviewer bias; Sequential bias; Kalman filter; Mining consumer reviews

Abstract

Online reviews of products, along with reviewer-related data, are regarded by many as one of the most significant knowledge base systems created by online commerce websites. They have played a big role in fueling the popularity and growth of electronic marketplaces like Amazon and eBay. Although the main attraction of online reviews is that they are perceived by most consumers to be independent and unbiased, many studies have shown the existence of various types of biases inherent in product reviews. In this paper we present a novel approach to estimating the bias in reviews using a Kalman filtering technique that is computationally feasible and can update the estimate of bias with every new review without having to store all the past ratings information. We further extend our model to study the existence of sequential bias in the reviews. We use panel data from 19 different products collected from Amazon.com and show the existence of sequential bias in ratings that depends on previous review and reviewer characteristics. © 2011 Elsevier B.V. All rights reserved.

⇑ Corresponding author. Tel.: +1 817 272 5397; fax: +1 817 272 5801. E-mail addresses: [email protected] (R.T. Sikora), [email protected] (K. Chauhan).
0950-7051/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.knosys.2011.10.011

1. Introduction

Online reviews of products are becoming ubiquitous and have played a big role in fueling the popularity and growth of electronic marketplaces like Amazon and eBay. Online opinion and consumer-review sites have correspondingly changed the way consumers shop, enhancing or even supplanting traditional sources of consumer information such as advertising [15]. Collaborative filtering techniques that use online reviews have made recommender systems very useful for online retailers in recommending related products to customers [16]. Many studies have documented the positive effects of online reviews on product sales. In some product categories, such as electronics, surveys suggest that online reviews have a greater influence on purchase decisions than any other medium [3].

Many online sites like Amazon not only provide consumer reviews of products but also present detailed information about each reviewer. This user-generated content is regarded by many as one of the most significant knowledge base systems created by online commerce websites, and it can be mined for useful information related to consumer reviews and their effect on sales.

Although the main attraction of online reviews is that they are perceived by most consumers to be independent and unbiased, many studies have shown the existence of various types of biases inherent in product reviews [2]. Since biased ratings can have a deteriorating effect on the reputation models used by a marketplace [25], it is in the interest of providers of electronic marketplaces to mitigate the effects of bias by estimating and correcting for the bias in the ratings.

Most traditional online businesses also publish reputation profiles for sellers that reflect the average of the ratings received in previous transactions. However, using such summary statistics can be misleading if most of the ratings are extreme. For example, Hu et al. [7] have shown the presence of a bimodal distribution in an overwhelming majority of online reviews. Moreover, because of the importance of these ratings, there is an incentive for traders to partake in strategic behavior (for example, shilling) to artificially inflate their rating. Many reputation mechanisms have been proposed to counter the effects of such strategic behavior [24,13]. In this paper our focus is on estimating and studying the presence of bias in ratings that might be unintentional rather than strategic.

Most approaches for estimating the bias in reviews require all the past data to be processed every time the estimate of bias has to be updated. In this paper we present a novel approach to studying the presence of bias in reviews that does not impose this restriction. We develop a two-level model to estimate and quantify the bias. At the base level, we model the temporal dynamics of the ratings vector as a linear dynamic system. We use the Kalman filtering technique to estimate the biased and unbiased parts of the ratings. Being a filtering technique, this approach is computationally feasible and can update the estimate of bias with every new rating without having to store all the past ratings information.

We further extend our base model by incorporating a second-level model to study the existence of sequential bias in the reviews. Sequential bias is the effect of past reviews on the current review, and


we hypothesize that this bias is dependent on both the review characteristics (e.g., how helpful a review is) and the reviewer characteristics (e.g., how trustworthy a reviewer is). We model the effect of these characteristics of previous reviews on the current rating as another linear dynamic system. We use panel data from 19 different products collected from Amazon.com and classify the products into the two categories of search and experience goods that have been studied before. Our results show a significant presence of sequential bias in the ratings for both the search and the experience goods. Our results further show that the sequential bias in experience products is predominantly explained by reviewer characteristics, whereas the bias in search goods is explained by both review and reviewer characteristics. To the best of our knowledge this is the first study of its kind that tries to quantify the presence of sequential bias in consumer reviews.

The rest of the paper is organized as follows. In the next section we discuss related work. In Section 3, we describe our data set and the relevance of the different variables pertaining to the review and reviewer characteristics. Section 4 presents our base model and explains the Kalman filtering technique for estimating the bias. In Section 5 we present an example of the computation process of our method applied to one of the products. In Section 6 we discuss a method for validating our model and present the results of the base model. Section 7 extends our base model by adding a second level for modeling the bias as a sequential bias. In Section 8 we present the results of our 2-level model. Finally, Section 9 provides some additional discussion and concludes the paper.

2. Related work

There has been a lot of work done on studying online reviews. Broadly speaking, this work can be classified into two groups. The first group of related articles studies the helpfulness of online reviews, what factors make the reviews helpful, and how they affect product sales. The most relevant study from this group is the work done by Ghose and Ipeirotis [5]. They study the helpfulness of reviews and their impact on the sales of a product. They also develop an extensive model to predict the helpfulness of a review based on reviewer and review characteristics, including a thorough textual analysis of reviews. They provide empirical results to validate their model on data collected from Amazon.com on almost 400 products. Our study is motivated by this work. We use a similar type of product data from Amazon.com. Whereas their study is focused on predicting the helpfulness of a review (as defined by the helpful votes it receives) and its impact on product sales, we are concerned with estimating the sequential bias in the ratings.

Mudambi and Schuff [18] also study what factors make online reviews helpful. They develop a parsimonious model that studies the effect of review extremity and review depth on the helpfulness of the review, and provide empirical results to validate their model based on data for six products collected from Amazon.com. Bobadilla et al. [1] present a new metric for combining the information about helpfulness of online reviews with the similarity of votes given by pairs of users to improve the performance of recommender systems. Hijikata et al. [6] present a social summarization method that uses social relationships in online auctions for summarizing feedback comments. Forman et al. [4] study the effect that disclosure of reviewer identity has on review helpfulness and product sales.
Using data from Amazon.com they show that online community members rate reviews containing identity-descriptive information more positively, and the prevalence of reviewer disclosure of identity information is associated with increases in subsequent online product sales.


The second group of related articles studies the presence of different types of biases found in online reviews. Li and Hitt [15] investigate self-selection bias that exists when the early buyers (and hence the early reviewers) hold different preferences about the quality of the product than do later consumers. They study the effect of self-selection bias on product sales and show that it decreases consumer surplus. Staddon and Chow [22] study the presence of reviewer bias in book reviews on Amazon.com that result from an undisclosed relationship between a reviewer and author. Hu et al. [8] document the existence of online book review manipulation by vendors, publishers, and writers by using data from Amazon and Barnes and Noble, and show that the manipulation strategy of firms is a monotonically decreasing function of the product’s true quality or the mean consumer rating of that product. Lauw et al. [14] study the presence of bias in online reviews, without attributing the bias to any possible causes. They develop two related concepts of bias in reviews and controversy of products, and present an Inverse Reinforcement model linking both the concepts. They show the effectiveness of their model by experimental results on real-life and synthetic data sets. One of the drawbacks of this model is that it enforces higher-order data requirement. To quantify the bias in ratings for a particular product, their model requires the ratings of other products given by the reviewers of that product, all the ratings of those other products, and so on. In effect, their model requires data from a bipartite network of reviewers and products, even if the goal is to quantify the bias in the ratings of only one product. Moreover, because of propagation effects, their model requires all the calculations to be repeated every time a new rating is added to that network. Sequential bias has also been studied before in different contexts. 
Page and Page [20] show how sequential presentation of alternatives induces systematic biases in the way performances are evaluated. Rabin and Schrag [21] study the role played by first impressions in creating confirmatory bias where people misinterpret new information as supporting previously held hypotheses. Kapoor and Piramuthu [12] study the dynamics of bias that is introduced as a result of sequential ordering of online product reviews. However, their study and results are based on synthetic data. In the next section we discuss the data set used in this paper that was collected from Amazon and explain the relevance of the different variables in the data set.

3. Data set and variables

Previous studies in marketing have broadly classified products into search or experience goods and have shown the differences in consumer behavior between these two categories [19,9]. Search goods are those whose quality can be observed before buying the product (for example, electronics), whereas experience goods are those where consumers have to consume or experience the product to determine its quality (for example, movies). For our study we created a panel data set of 19 products spanning both search and experience goods, and collected detailed ratings data about these products from Amazon.com that had been posted up to May 2010. Table 1 provides summary statistics of the 19 products collected.

Since our goal is to study the presence of sequential bias and quantify it, we hypothesize that, to the extent there is sequential bias in the reviews, it would primarily depend on the helpfulness of the past reviews as well as the trustworthiness and experience of the past reviewers. Towards that end, we collected various product-specific, review-specific, and reviewer-specific characteristics for each of the products that can serve as proxies for evaluating the helpfulness of the past review and the trustworthiness and experience of the reviewer. Table 2 lists the different variables that were collected.

Table 1
Summary statistics of the 19 products in the data set.

| Category | Title | List price | Number of ratings | Average rating | Standard dev. |
|---|---|---|---|---|---|
| Experience: Books | Breaking Dawn (Twilight Saga, Book No. 4) | $22.99 | 5082 | 3.575 | 1.624 |
| Experience: Books | Eat, Pray, Love; Elizabeth Gilbert | $15.00 | 2072 | 3.619 | 1.603 |
| Experience: Movies | Avatar (2 Disc Blue Ray/DVD Combo) | $39.99 | 1005 | 3.595 | 1.609 |
| Experience: Movies | The Dark Knight (+BD Live) [Blu-ray] | $24.98 | 1268 | 4.131 | 1.329 |
| Experience: Movies | 300 [Blu-ray] | $24.98 | 1183 | 4.025 | 1.352 |
| Experience: Movies | The Matrix Reloaded (DVD) | $12.98 | 1443 | 3.484 | 1.479 |
| Experience: Music CD | Let It Be... Naked (The Beatles) | $18.98 | 640 | 4.186 | 1.3 |
| Experience: Music CD | Destiny Fulfilled (Destiny’s Child) | $7.99 | 582 | 3.533 | 1.391 |
| Experience: Music CD | Death Magnetic (Metallica) | $18.98 | 1003 | 3.952 | 1.299 |
| Experience: Music CD | I Dreamed A Dream (Susan Boyle) | $11.98 | 1565 | 4.54 | 0.971 |
| Search: Video games | Grand Theft Auto IV (PlayStation3 Game) | $29.99 | 342 | 3.956 | 1.259 |
| Search: Electronics | Flip Ultra HD Camcorder 120 min (Black) | $199.99 | 699 | 3.794 | 1.371 |
| Search: Electronics | Nikon Coolpix L20 10MP Digital Camera | $129.99 | 227 | 3.56 | 1.52 |
| Search: HDTVs | Samsung LN52B750 52-Inch LCD HDTV | $1,500.00 | 248 | 4.423 | 1.184 |
| Search: HDTVs | Panasonic TC-L26X1 26-Inch LCD HDTV | $549.95 | 226 | 4.566 | 0.782 |
| Search: HDTVs | Sony KDL-52W4100 52-Inch LCD HDTV | $2,499.99 | 205 | 4.507 | 1.174 |
| Search: Printers | Brother HL-2170W 23ppm Laser Printer | $299.99 | 528 | 4.263 | 1.154 |
| Search: Printers | HP Officejet All-in-One Printer (Q7311A#ABA) | $338.40 | 337 | 3.608 | 1.541 |
| Search: Printers | Epson 600 All-in-One Printer (C11CA18201) | $199.99 | 245 | 3.551 | 1.572 |

Table 2
The variables collected for our study.

| Type | Variable | Explanation |
|---|---|---|
| Product | Retail price | The retail price at Amazon.com |
| Product | Average rating | Average rating of the posted reviews |
| Product | Number of reviews | Total number of reviews posted for the product |
| Review | Rating | Number of stars (1–5) |
| Review | Review length | The length of review in words |
| Review | Helpful votes | The number of helpful votes for the review |
| Review | Total votes | The total number of votes for the review |
| Review | Helpfulness | Helpful votes/total votes |
| Reviewer | Number of past reviews | Number of reviews posted by the reviewer |
| Reviewer | Past helpful votes | Number of helpful votes accumulated in the past by the reviewer |
| Reviewer | Past total votes | Number of total votes accumulated in the past by the reviewer |
| Reviewer | Past helpfulness | Past helpful votes/past total votes |
| Reviewer | Real name | Has the reviewer disclosed his/her real name? |
| Reviewer | Nickname | Does the reviewer have a nickname listed in the profile? |
| Reviewer | Anniversary | Does the reviewer list his/her wedding anniversary? |
| Reviewer | Birthday | Does the reviewer list his/her birthday? |
| Reviewer | Location | Does the reviewer disclose his/her location? |
| Reviewer | Web page | Does the reviewer have a homepage listed? |
| Reviewer | Interests | Does the reviewer list his/her interests? |
| Reviewer | In my own words | Does the reviewer share his/her tastes in other products? |
| Reviewer | Personal data count | How many of the above 8 variables are disclosed? |

For each product we collected the retail price listed on Amazon.com at the time the data was collected, the total number of reviews, and the average rating of all the posted reviews. For each individual review we collected the rating (number of stars on a scale of 1–5), the length of the review in words, the number of helpful votes, and the total number of votes received by that review. Amazon, like many other online retailers, allows all users to vote on whether reviews posted by other users are helpful. Furthermore, for each individual reviewer we scanned their profile and collected summary information about the reviewer such as the total number of reviews they have written, the total number of votes their reviews have received, and the total number of helpful votes they have received. Amazon also allows users to provide additional personal information on their profile page. We collected information about eight such variables that are listed in Table 2. All these variables are binary, indicating whether the user provided that information or not. We then created a summary variable called the personal data count that counts the number of the above eight personal variables for which the user has provided information. The personal data count variable can take values from 0 to 8. In the next section we present our base model for modeling the bias in online ratings and using the Kalman filter for estimating it.

4. Modeling and analysis of bias

4.1. Base model

We model the ratings time series as a linear dynamic system given by the following equation:

$$x_{t+1} = A x_t + w_t, \tag{1}$$

where $A$ is the system matrix, $w_t$ is zero-mean Gaussian noise representing the system noise, and $x_t$ is the rating vector at time $t$ consisting of the unbiased ($u_t$) and biased ($b_t$) parts as shown below

$$x_t = [u_t \;\; b_t]^T. \tag{2}$$

The modeled ratings are related to the observed ratings by the following equation:

$$z_t = H x_t + v_t, \tag{3}$$

where zt is the actual observed rating at time t, H is the observation matrix, and vt is zero mean Gaussian noise representing the measurement noise. The system and the measurement noise are given by,

$$w_t \sim N(0, Q), \tag{4}$$

$$v_t \sim N(0, R), \tag{5}$$

where $Q$ is the system noise covariance matrix and $R$ is the measurement noise variance. In our case we want to estimate the ratings vector $x_t$ based on the actual observed ratings $z_t$. We impart our domain knowledge into the model by specifying the estimated variance and covariance of the components of $x_t$. In our case, we set $R = 0.1$ as the measurement noise variance and the $Q$ matrix to 0 except for $Q(2, 2) = 0.01$, since the unbiased rating is fixed but the biased rating has some variance. The system matrix is $A = I$, and the observation matrix is $H = [1 \;\; 1]$.

We use the Kalman filter [11,17], which is a Bayes optimal minimum mean-squared-error estimator for linear systems with Gaussian noise, for estimating the ratings vector $x_t$. Since the Kalman filter is a recursive data processing algorithm for solving the discrete-data linear filtering problem, it updates the estimate of the system variables at each time step using only the observation at that time, without having to store all the past observations.

The Kalman filter estimates a process by using a form of feedback control. The filter estimates the process state at some time and then obtains feedback in the form of (noisy) measurements. It works by using the dual steps of time update (also known as prediction) and measurement update (also known as correction) at each iteration. The time update equations are responsible for projecting forward (in time) the current state, $x_t$, and error covariance estimates, $P$, to obtain the a priori estimates for the next time step, as shown below

$$\hat{x}'_t = A \hat{x}_{t-1}, \tag{6}$$

$$P'_t = A P_{t-1} A^T + Q. \tag{7}$$
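As a short worked illustration (not part of the original derivation), substituting the base-model settings stated in Section 4.1, $A = I$ and $Q$ zero except for $Q(2,2) = 0.01$, into Eqs. (6) and (7) gives:

```latex
\hat{x}'_t = \hat{x}_{t-1},
\qquad
P'_t = P_{t-1} + \begin{bmatrix} 0 & 0 \\ 0 & 0.01 \end{bmatrix}.
```

Under these settings the prediction step leaves both state estimates unchanged and only adds 0.01 to the variance of the bias component at each step.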

The measurement update equations are responsible for the feedback for incorporating a new measurement, zt, into the a priori estimate to obtain an improved a posteriori estimate, as shown below

$$K_t = P'_t H^T \left(H P'_t H^T + R\right)^{-1}, \tag{8}$$

$$\hat{x}_t = \hat{x}'_t + K_t \left(z_t - H \hat{x}'_t\right), \tag{9}$$

$$P_t = \left(I - K_t H\right) P'_t, \tag{10}$$

where $K_t$ is the Kalman gain that minimizes the a posteriori error covariance, and $(z_t - H\hat{x}'_t)$ is the residual that reflects the discrepancy between the predicted measurement $H\hat{x}'_t$ and the actual measurement $z_t$. The Kalman filter has been shown to be very powerful as it supports estimations of past, present, and even future states, and it can do so even when the precise nature of the modeled system is unknown.

4.2. Kalman filter algorithm’s computational complexity

Fig. 1 presents the algorithm implementing the Kalman filter. Assume that the state vector $x$ is an $n \times 1$ vector, the system matrix $A$ is $n \times n$, the error covariance estimate matrix $P$ is $n \times n$, the system noise covariance matrix $Q$ is $n \times n$, the observation vector $H$ is $1 \times n$, and the Kalman gain vector $K$ is $n \times 1$. We can now analyze the computational complexity of each of the five steps 9–13 of the algorithm as shown in Table 3. We can see from Table 3 that the overall computational complexity of the Kalman filter algorithm is $O(n^3)$.

Fig. 1. The Kalman filter algorithm for the base model.

Table 3
Computational complexity of the Kalman filter algorithm.

| Algorithm steps | Complexity |
|---|---|
| $x' = A x$ | $O(n^2)$ |
| $P' = A P A^T + Q$ | $O(n^3)$ |
| $K = P' H^T (H P' H^T + R)^{-1}$ | $O(n^2)$ |
| $x = x' + K (z - H x')$ | $O(n)$ |
| $P = (I - K H) P'$ | $O(n^3)$ |

In the next section we present an example application of the Kalman filter algorithm on one data set.
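The five steps analyzed in Table 3 can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation. The function names `kalman_step` and `run_base_model` are hypothetical, and the sketch assumes a scalar observation, so the inverse in Eq. (8) becomes an ordinary division. The settings ($R = 0.1$, $Q(2,2) = 0.01$, $A = I$, $H = [1\;1]$) and the initialization (first rating as the unbiased part, zero bias, all entries of $P$ set to 1.0) are taken from Sections 4.1 and 5.

```python
def kalman_step(x, P, z, A, H, Q, R):
    """One predict/correct cycle for a scalar observation z.

    x: state estimate (list of n floats); P, A, Q: n x n nested lists;
    H: observation row (list of n floats); R: measurement noise variance.
    """
    n = len(x)
    # Time update, Eqs. (6)-(7): x' = A x, P' = A P A^T + Q
    x_prior = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    AP = [[sum(A[i][k] * P[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    P_prior = [[sum(AP[i][k] * A[j][k] for k in range(n)) + Q[i][j]
                for j in range(n)] for i in range(n)]
    # Measurement update, Eqs. (8)-(10). H P' H^T + R is a scalar here.
    PHt = [sum(P_prior[i][j] * H[j] for j in range(n)) for i in range(n)]
    S = sum(H[i] * PHt[i] for i in range(n)) + R
    K = [v / S for v in PHt]
    residual = z - sum(H[i] * x_prior[i] for i in range(n))
    x_post = [x_prior[i] + K[i] * residual for i in range(n)]
    HP = [sum(H[k] * P_prior[k][j] for k in range(n)) for j in range(n)]
    P_post = [[P_prior[i][j] - K[i] * HP[j] for j in range(n)]
              for i in range(n)]
    return x_post, P_post, K, residual


def run_base_model(ratings):
    """Filter a rating sequence with the base-model settings of Section 4.1."""
    A = [[1.0, 0.0], [0.0, 1.0]]
    H = [1.0, 1.0]
    Q = [[0.0, 0.0], [0.0, 0.01]]
    R = 0.1
    x = [float(ratings[0]), 0.0]   # unbiased part = first rating, bias = 0
    P = [[1.0, 1.0], [1.0, 1.0]]   # all entries initialised to 1.0
    trace = []
    for z in ratings[1:]:
        x, P, K, residual = kalman_step(x, P, z, A, H, Q, R)
        trace.append({"z": z, "x": x, "K": K, "residual": residual})
    return trace


if __name__ == "__main__":
    # First six Grand Theft Auto IV ratings, as listed in Table 4.
    for row in run_base_model([5, 5, 5, 4, 4, 3]):
        print(row["z"], [round(v, 2) for v in row["x"]],
              [round(v, 2) for v in row["K"]])
```

Only the current `x` and `P` are carried from one rating to the next, which is the property the paper relies on to avoid storing past ratings. With $n = 2$ the cubic cost of the matrix products is negligible.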

5. Example application of the Kalman filter for the base model

We show an example of applying the Kalman filtering algorithm of Fig. 1 to the base model for the data set for the Grand Theft Auto IV video game. As shown in Table 1, this product had data related to a total of 342 ratings. We begin by initializing the various matrices and vectors as mentioned in the algorithm. As mentioned in Section 4.1 above, we set $R = 0.1$ as the measurement noise variance and the $Q$ matrix to 0 except for $q_{22} = 0.01$, since the unbiased rating is fixed but the biased rating has some variance. The system matrix is $A = I$, and the observation matrix is $H = [1 \;\; 1]$. Table 4 shows the actual rating ($z$) of the product in the first column. Since the first rating for this product is 5, we initialize the elements of vector $x$ as $x_1 = 5.0$ and $x_2 = 0.0$. All the elements of matrix $P$ are initialized to 1.0 as shown in the first row of Table 4.

For the next time period, we calculate the a priori estimate $x'$ using step 9 of the algorithm. Since $A$ is an identity matrix, the a priori estimate $x'$ is the same as the estimate $x$ from the previous time period, as shown in the second row of Table 4. We update the values of the a priori estimate $P'$ as shown in the second row of Table 4, using step 10 of the algorithm and the above-mentioned values of $A$ and $Q$. We now calculate the value of the Kalman gain vector $K$ using step 11 of the algorithm and the above calculated value of $P'$. As shown in the second row of Table 4, we get $k_1 = 0.49$ and $k_2 = 0.49$. Using this value of $K$ and the actual rating ($z$) of the product in this time period, we update the estimate $x$ using step 12 of the algorithm. Finally, using the values of $K$ and $P'$, we update the matrix $P$ using step 13. The second row of Table 4 shows the values for the vector $x$ and matrix $P$ that are calculated at the end of this iteration. This whole process is repeated over several iterations. Table 4 shows all the intermediate values of the different variables for the first five iterations.

Table 4
An example application of the Kalman filter.

| z | x1' | x2' | p11' | p12' | p21' | p22' | k1 | k2 | x1 | x2 | p11 | p12 | p21 | p22 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | – | – | – | – | – | – | – | – | 5.00 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 5 | 5.00 | 0.00 | 1.00 | 1.00 | 1.00 | 1.01 | 0.49 | 0.49 | 5.00 | 0.00 | 0.03 | 0.02 | 0.02 | 0.03 |
| 5 | 5.00 | 0.00 | 0.03 | 0.02 | 0.02 | 0.04 | 0.23 | 0.28 | 5.00 | 0.00 | 0.02 | 0.01 | 0.01 | 0.02 |
| 4 | 5.00 | 0.00 | 0.02 | 0.01 | 0.01 | 0.03 | 0.14 | 0.24 | 4.86 | -0.24 | 0.01 | 0.00 | 0.00 | 0.02 |
| 4 | 4.86 | -0.24 | 0.01 | 0.00 | 0.00 | 0.03 | 0.10 | 0.23 | 4.79 | -0.38 | 0.01 | 0.00 | 0.00 | 0.02 |
| 3 | 4.79 | -0.38 | 0.01 | 0.00 | 0.00 | 0.03 | 0.07 | 0.23 | 4.70 | -0.70 | 0.01 | 0.00 | 0.00 | 0.03 |

Fig. 2 shows the plot of the actual rating ($z$) along with the unbiased rating ($x_1$) and bias ($x_2$) for the first few iterations of the Kalman algorithm. Figs. 3 and 4 show the plots of the Kalman gain components and the error covariance matrix components for the first few iterations. As can be seen, both the Kalman gain and the estimation error covariance matrix quickly stabilize and remain constant, with the estimation error covariance converging to 0 fairly quickly. In the next section we discuss validating the above model and present the results of the base model on our Amazon data set of 19 products.

Fig. 2. Actual (z) and estimated unbiased rating (x1) together with bias (x2).

6. Empirical results for the base model

6.1. Validity of the model

Superior performance of the Kalman filter depends on proper fine-tuning of the filter parameters (especially $Q$ and $R$). Under conditions where $Q$ and $R$ are in fact constant, both the estimation error covariance $P_t$ and the Kalman gain $K_t$ stabilize quickly and then remain constant. One way to evaluate the performance of the Kalman filter is to evaluate the statistical properties of the residual. Kailath [10] showed that one such measure is to test whether the residual is a white noise sequence. We use a test for whiteness that was originally proposed by Stoica [23] and has since been widely used in system identification and filtering. Let $d$ be a discrete random process, in our case the residuals, and let

$$r_i = \frac{1}{n} \sum_{t=0}^{n-i} d_t d_{t+i} \tag{11}$$

be a consistent estimate of its theoretical covariance computed from its sample $d_0, d_1, \ldots, d_n$. Then the sequence $d_t$ is a white noise sequence at a significance level of 0.05 if, for all large $k$ ($k \geq 3$),

$$\sum_{i=1}^{k} r_i^2 < \left(k + 1.65\sqrt{2k}\right) r_0^2 / n. \tag{12}$$

We apply the above test to the residuals at each stage and calculate the fraction of residuals that meet the statistical properties of white noise. We consider the model to be a good fit at a level of 0.95 if 95% or more of the residuals meet the above criteria.

Fig. 3. Kalman gain vector components.
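The whiteness check of Eqs. (11) and (12) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The function names `autocovariance` and `is_white` are hypothetical. The sample is taken to be $d_0, \ldots, d_n$ (so a list of length $n + 1$), the sum in Eq. (11) is read as running from $t = 0$ to $n - i$ with a $1/n$ factor, and the lag cutoff `k` is a caller-supplied parameter with an assumed default of 3. How the per-stage samples are formed in the paper is not spelled out here, so the sketch tests a single sequence.

```python
import math


def autocovariance(d, lag):
    """Sample covariance estimate r_lag of Eq. (11) for the sample d_0..d_n."""
    n = len(d) - 1
    return sum(d[t] * d[t + lag] for t in range(0, n - lag + 1)) / n


def is_white(d, k=3):
    """Return True if the sequence passes the inequality of Eq. (12)."""
    n = len(d) - 1
    r0 = autocovariance(d, 0)
    lhs = sum(autocovariance(d, i) ** 2 for i in range(1, k + 1))
    rhs = (k + 1.65 * math.sqrt(2 * k)) * r0 ** 2 / n
    return lhs < rhs


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 100    # no correlation at any non-zero lag
    constant = [1.0] * 101           # perfectly correlated at every lag
    print(is_white(impulse), is_white(constant))
```

A single impulse has zero sample covariance at every non-zero lag and therefore passes, whereas a constant sequence is strongly correlated at every lag and fails.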

Fig. 4. Estimation error covariance matrix components.

6.2. Discussion of the results

We first conducted experiments to study the presence of bias in the product ratings as modeled by Eqs. (1)–(5). The only part of the data set used for this study was the time series of the ratings for each product. We applied the recursive Kalman filtering algorithm using Eqs. (6)–(10) to each product's data. The resulting time series of residuals was analyzed for white noise using Eqs. (11) and (12), indicating goodness of fit for the proposed model. The results of these experiments are presented in Table 5. It shows the unbiased part of the ratings ($u$ in Eq. (2)), the average bias, and the proportion of the residuals that satisfy the statistical requirement of being white noise. Although the amount of bias estimated by the model varies from rating to rating, we can estimate the average bias by comparing the average rating with the estimated unbiased rating. For example, for Eat, Pray, Love the average rating of 3.619 against an unbiased rating of 4.98 gives an average bias of about -1.36. As can be seen, the proportion of white noise residuals exceeds the 0.95 level for every product, validating our base model and confirming the presence of bias in the ratings of products in all categories. Moreover, the bias is negative for most of the products, except for the electronic items and a couple of products in the experience category.

Table 5
Results of Kalman filter for the base model.

| Category | Product | Average rating | Unbiased rating (u) | Average bias | White noise residuals |
|---|---|---|---|---|---|
| Experience: Books | Breaking Dawn (Twilight Saga, Book No. 4) | 3.575 | 3.17 | 0.40 | 1.000 |
| Experience: Books | Eat, Pray, Love; Elizabeth Gilbert | 3.619 | 4.98 | -1.36 | 1.000 |
| Experience: Movies | Avatar (2 Disc Blue Ray/DVD Combo) | 3.595 | 4.59 | -0.99 | 1.000 |
| Experience: Movies | The Dark Knight (+ BD Live) [Blu-ray] | 4.131 | 4.71 | -0.58 | 1.000 |
| Experience: Movies | 300 [Blu-ray] | 4.025 | 4.46 | -0.44 | 0.993 |
| Experience: Movies | The Matrix Reloaded (DVD) | 3.484 | 4.34 | -0.86 | 1.000 |
| Experience: Music CD | Let It Be... Naked (The Beatles) | 4.186 | 2.99 | 1.20 | 1.000 |
| Experience: Music CD | Destiny Fulfilled (Destiny’s Child) | 3.533 | 4.18 | -0.64 | 1.000 |
| Experience: Music CD | Death Magnetic (Metallica) | 3.952 | 4.08 | -0.13 | 1.000 |
| Experience: Music CD | I Dreamed A Dream (Susan Boyle) | 4.54 | 5.00 | -0.46 | 0.996 |
| Search: Video Games | Grand Theft Auto IV (PlayStation3 Game) | 3.956 | 4.72 | -0.76 | 1.000 |
| Search: Electronics | Flip Ultra HD Camcorder 120 min (Black) | 3.794 | 2.87 | 0.93 | 1.000 |
| Search: Electronics | Nikon Coolpix L20 10MP Digital Camera | 3.56 | 2.50 | 1.06 | 1.000 |
| Search: HDTVs | Samsung LN52B750 52-Inch LCD HDTV | 4.423 | 4.74 | -0.32 | 1.000 |
| Search: HDTVs | Panasonic TC-L26X1 26-Inch LCD HDTV | 4.566 | 4.95 | -0.38 | 0.983 |
| Search: HDTVs | Sony KDL-52W4100 52-Inch LCD HDTV | 4.507 | 4.78 | -0.27 | 1.000 |
| Search: Printers | Brother HL-2170W 23ppm Laser Printer | 4.263 | 4.35 | -0.09 | 0.969 |
| Search: Printers | HP Officejet All-in-One Printer (Q7311A#ABA) | 3.608 | 4.57 | -0.96 | 1.000 |
| Search: Printers | Epson 600 All-in-One Printer (C11CA18201) | 3.551 | 4.89 | -1.34 | 1.000 |

Having confirmed the presence of bias in the ratings with our base model, we now extend our model to study the presence of sequential bias in the ratings. We hypothesize that there exists a sequential bias in the reviews that depends on the previous reviews. Specifically, we hypothesize that the sequential bias depends on both the review characteristics and reviewer characteristics of the previous review that are listed in Table 2. We model the biased part of the rating, bt, in Eq. (2) as follows:

bt ¼

ait yit1 þ v 0t ;

ð13Þ

i¼1

where, v 0t is zero mean Gaussian noise representing bias measurement noise, y1–y3 are the review characteristics, and y4–y7 are reviewer characteristics of the previous review as given below: y1 = log(1 + HelpfulVotes), y2 = log(1 + TotalVotes), y3 = log(1 + ReviewLength), y4 = log(1 + NumberOfPastReviews), y5 = PersonalDataCount, y6 = log(1 + PastTotalVotes), y7 = log(1 + PastHelpfulVotes). When establishing the effect of different independent variables simultaneously on a dependent variable, it is important to make sure that the absolute value and the range of values of some variables do not dominate other variables. Since some variables (for e.g., the number of total votes) can take infinitely large values, we normalize these variables by modeling their effect in a loglinear relationship. We model the coefficient vector as a linear dynamic system given by the following equation:

atþ1 ¼ A0 at þ w0t ;

ð15Þ

We can re-write Eq. (13) in vector form relating the bias at time t with the review and reviewer characteristics from time t  1 as shown below

bt ¼ Yt1 at þ v 0t ;

7. Modeling sequential bias

7 X

at ¼ ½a1t a2t . . . a7t T :

ð14Þ

where A0 is the system matrix, w0t is zero mean Gaussian noise representing the system noise, and at is the coefficient vector at time t consisting of the seven coefficients from Eq. (13) as shown below

ð16Þ

where bt is the actual observed bias at time t, Y is the observation matrix, and v 0t is zero mean Gaussian noise representing the measurement noise. As in the base model, the system and the measurement noise are given by,

w0t  Nð0; Q 0 Þ;

ð17Þ

v 0t  Nð0; R0 Þ; 0

ð18Þ 0

where Q is the system noise covariance matrix and R is the measurement noise variance. In our case we want to estimate the coefficient vector at based on the actual observed bias bt that is estimated by the base model. As before we use the Kalman filter to estimate the coefficient vector at. In effect we use two Kalman filters in sequence. At each iteration, the first Kalman filter uses the observed rating zt to estimate the ratings vector xt, and the second Kalman filter uses the bias part of the estimated ratings vector to estimate the coefficient vector at. We impart our domain knowledge into the model by specifying the estimated variance and covariance of the components of at. In our case, we set R0 = 0.1 as the measurement noise variance. The system matrix is A = I, and the observation matrix Y is set by the observed values of the review and reviewer characteristics. We set the system noise covariance matrix as shown below since we assume that the effect of review and reviewer characteristics on the bias are independent of each other but their individual effects have some variance

$$Q'(i,j) = \begin{cases} 0 & \forall\, i \neq j, \\ 0.01 & \forall\, i = j. \end{cases} \tag{19}$$
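The second-level filter set up above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it uses the standard Kalman predict/update recursion with $A' = I$, $Q' = 0.01\,I$, and $R' = 0.1$ as stated in the text, while the function names, the zero initial coefficient vector, the identity initial covariance, and the sample input values are all assumptions made for the example.

```python
import numpy as np


def features(helpful, total, length, past_reviews, personal, past_total, past_helpful):
    """Build the characteristic vector y1..y7 of one review (hypothetical helper).

    All count-like inputs are log-transformed as log(1 + x); the personal
    data count (y5) is used as is, following the definitions in the text.
    """
    return np.array([
        np.log1p(helpful),       # y1
        np.log1p(total),         # y2
        np.log1p(length),        # y3
        np.log1p(past_reviews),  # y4
        float(personal),         # y5
        np.log1p(past_total),    # y6
        np.log1p(past_helpful),  # y7
    ])


def kalman_step(a, P, y_prev, b_obs, q=0.01, r=0.1):
    """One predict/update cycle for b_t = y_prev . a_t + v'_t with A' = I.

    a      : current coefficient estimate, shape (7,)
    P      : current estimate covariance, shape (7, 7)
    y_prev : characteristics of the previous review (observation row)
    b_obs  : bias observed at time t (output of the base-level filter)
    """
    n = a.size
    # Predict: coefficients follow a random walk, so only the covariance grows.
    a_pred = a
    P_pred = P + q * np.eye(n)
    # Update with the scalar bias observation.
    innovation = b_obs - y_prev @ a_pred
    s = y_prev @ P_pred @ y_prev + r   # innovation variance (scalar)
    K = P_pred @ y_prev / s            # Kalman gain, shape (7,)
    a_new = a_pred + K * innovation
    P_new = P_pred - np.outer(K, y_prev) @ P_pred
    return a_new, P_new, float(innovation)


# Hypothetical usage: start from zero coefficients and identity covariance
# (both assumed, not taken from the paper) and process one made-up review.
a, P = np.zeros(7), np.eye(7)
y_prev = features(3, 10, 250, 12, 2, 40, 25)
a, P, innov = kalman_step(a, P, y_prev, b_obs=-0.4)
```

Because the observation is a scalar, the innovation variance `s` is a scalar as well, so no matrix inversion is needed and each update costs only a few 7-dimensional products. Only `a` and `P` have to be kept between reviews.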

8. Empirical results for the 2-level model

To delineate the effects of review and reviewer characteristics, we first tested the above model using review characteristics ($y_1$–$y_3$) and reviewer characteristics ($y_4$–$y_7$) separately, and then combined them in a single model with all seven characteristics. We applied the recursive Kalman filtering algorithm using Eqs. (6)–(10) to each product's data in two stages. In the first stage it uses the observed rating $z_t$ to estimate the ratings vector $x_t$, and in the second stage it uses the bias part of the estimated ratings vector to estimate the coefficient vector $a_t$. Note that all the variables that are based on votes (total votes, helpful votes, etc.) are dynamic in nature and vary with time for each review. Although it appears that we are using fixed values for these variables in our model, they are used as a proxy for the effect of previous reviews on the bias and are in fact dynamic, as they change with every new rating that is added to the data set. As before, the resulting time series of residuals was analyzed for white noise using Eqs. (11) and (12), indicating goodness of fit for the proposed model.

Table 6. Results of the dual Kalman filter on the sequential bias models. Significant values are shown in bold and italic. The Review, Reviewer, and Combined columns give the white noise residuals ratio for each version of the model; $a_1$–$a_7$ are the model coefficients.

| Type | Category | Product | Review | Reviewer | Combined | a1 | a2 | a3 | a4 | a5 | a6 | a7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Experience | Books | Breaking Dawn (Twilight Saga, Book No. 4) | 0.732 | 0.856 | 0.941 | 0.939 | 0.456 | 0.133 | 0.320 | 0.021 | 0.315 | 0.343 |
| Experience | Books | Eat, Pray, Love; Elizabeth Gilbert | 0.872 | 1.000 | 1.000 | 0.501 | 1.240 | 0.830 | 0.005 | 0.218 | 1.155 | 0.158 |
| Experience | Movies | Avatar (2 Disc Blue Ray/DVD Combo) | 0.886 | 1.000 | 1.000 | 1.190 | 1.496 | 0.238 | 0.348 | 0.244 | 0.658 | 0.590 |
| Experience | Movies | The Dark Knight (+ BD Live) [Blu-ray] | 0.753 | 0.946 | 1.000 | 0.067 | 0.484 | 0.199 | 0.028 | 0.105 | 0.429 | 0.382 |
| Experience | Movies | 300 [Blu-ray] | 0.673 | 1.000 | 0.978 | 0.668 | 0.860 | 0.307 | 0.491 | 0.425 | 0.511 | 1.259 |
| Experience | Movies | The Matrix Reloaded (DVD) | 0.557 | 1.000 | 0.943 | 0.460 | 0.327 | 0.401 | 0.315 | 0.021 | 0.262 | 0.443 |
| Experience | Music CD | Let It Be... Naked (The Beatles) | 1.000 | 0.508 | 1.000 | 0.420 | 0.257 | 0.643 | 0.395 | 0.553 | 1.553 | 1.391 |
| Experience | Music CD | Destiny Fulfilled (Destiny’s Child) | 0.674 | 1.000 | 1.000 | 0.470 | 0.629 | 0.296 | 0.425 | 0.170 | 0.406 | 0.712 |
| Experience | Music CD | Death Magnetic (Metallica) | 0.778 | 1.000 | 1.000 | 0.174 | 0.368 | 0.008 | 0.014 | 0.040 | 0.280 | 0.131 |
| Experience | Music CD | I Dreamed A Dream (Susan Boyle) | 0.707 | 0.909 | 1.000 | 0.421 | 0.531 | 0.220 | 0.496 | 0.062 | 0.452 | 0.848 |
| Search | Video Games | Grand Theft Auto IV (PlayStation3 Game) | 1.000 | 1.000 | 1.000 | 0.210 | 0.144 | 0.418 | 0.378 | 0.333 | 0.498 | 0.674 |
| Search | Electronics | Flip Ultra HD Camcorder 120 min (Black) | 1.000 | 1.000 | 1.000 | 0.267 | 0.181 | 0.194 | 0.706 | 0.125 | 0.252 | 0.756 |
| Search | Electronics | Nikon Coolpix L20 10MP Digital Camera | 0.604 | 1.000 | 1.000 | 0.197 | 0.022 | 0.244 | 1.505 | 0.209 | 0.335 | 0.896 |
| Search | HDTVs | Samsung LN52B750 52-Inch LCD HDTV | 0.995 | 1.000 | 1.000 | 0.894 | 0.836 | 0.328 | 0.207 | 0.302 | 0.905 | 1.105 |
| Search | HDTVs | Panasonic TC-L26X1 26-Inch LCD HDTV | 1.000 | 1.000 | 1.000 | 0.352 | 0.184 | 0.165 | 0.369 | 0.082 | 0.271 | 0.371 |
| Search | HDTVs | Sony KDL-52W4100 52-Inch LCD HDTV | 1.000 | 1.000 | 1.000 | 0.042 | 0.006 | 0.584 | 0.033 | 0.335 | 0.318 | 0.016 |
| Search | Printers | Brother HL-2170W 23ppm Laser Printer | 1.000 | 1.000 | 1.000 | 0.157 | 0.407 | 0.871 | 1.480 | 0.802 | 0.329 | 0.694 |
| Search | Printers | HP Officejet All-in-One Printer (Q7311A#ABA) | 0.826 | 1.000 | 1.000 | 0.040 | 0.157 | 0.393 | 0.964 | 0.205 | 0.468 | 0.843 |
| Search | Printers | Epson 600 All-in-One Printer (C11CA18201) | 1.000 | 1.000 | 1.000 | 0.821 | 1.037 | 1.814 | 5.359 | 0.765 | 1.209 | 0.617 |

The results of these experiments are presented in Table 6. It shows the proportion of the residuals that satisfy the statistical requirement of being white noise for the three versions of the model (review, reviewer, and combined), and the model coefficients of the seven characteristics. All results that are significant at the 0.95 level based on the white noise residuals ratio are highlighted in bold and italic. From the results for the combined model, it is clear that both the search and experience goods show a significant presence of sequential bias in their ratings. However, the review characteristics are not significant in explaining the sequential bias for experience goods; reviewer characteristics alone explain the presence of sequential bias for these goods. For search goods, in contrast, both the review and reviewer characteristics significantly impact the presence of sequential bias.
Although there are some differences between products, one can draw some general conclusions about the effects of individual characteristics on the sequential bias by comparing the direction of the bias reported in Table 5 with the signs of the coefficients in Table 6. For example, among the reviewer characteristics, the total number of reviews, the personal data count, and the total number of votes all have a positive effect on the bias, whereas the number of helpful votes has a negative effect. We find a similar pattern for the review characteristics of search products: the length of the review and the total number of votes have a positive effect, whereas the number of helpful votes has a negative effect.
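The significance pattern discussed in this section can be tallied directly from the white noise residuals columns of Table 6. The sketch below is only an illustration: it assumes that "significant at the 0.95 level" means a white noise residuals ratio of at least 0.95, and the dictionary layout and the function name `count_passing` are hypothetical. The values are transcribed from Table 6 in product order.

```python
# White noise residuals ratios from Table 6, one list per model version.
EXPERIENCE = {
    "review":   [0.732, 0.872, 0.886, 0.753, 0.673, 0.557, 1.000, 0.674, 0.778, 0.707],
    "reviewer": [0.856, 1.000, 1.000, 0.946, 1.000, 1.000, 0.508, 1.000, 1.000, 0.909],
    "combined": [0.941, 1.000, 1.000, 1.000, 0.978, 0.943, 1.000, 1.000, 1.000, 1.000],
}
SEARCH = {
    "review":   [1.000, 1.000, 0.604, 0.995, 1.000, 1.000, 1.000, 0.826, 1.000],
    "reviewer": [1.000] * 9,
    "combined": [1.000] * 9,
}


def count_passing(ratios, level=0.95):
    """Return (number of products with ratio >= level, number of products).

    Treating ratio >= level as "significant" is an assumption of this sketch.
    """
    return sum(r >= level for r in ratios), len(ratios)


for name, group in (("experience", EXPERIENCE), ("search", SEARCH)):
    for model, ratios in group.items():
        passed, total = count_passing(ratios)
        print(f"{name:10s} {model:8s} {passed}/{total}")
```

Under that reading, the review-only model reaches the threshold for a single experience product, whereas the reviewer-only and combined models reach it for most experience products and for every search product, which matches the qualitative conclusions stated above.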

9. Conclusions and future work

We analyzed the online ratings of a set of products from Amazon.com, classified as either search or experience goods, as a linear dynamic system and showed the presence of bias in the ratings by using a Kalman filtering approach. Unlike other approaches presented in the literature, our approach does not require the storage of all the past ratings data to estimate the bias. Moreover, it can update its estimate of the bias every time a new rating is posted, making the approach practical enough to be implemented by an online retailer as a service to consumers to mitigate the effects of bias on product sales. We further modeled the bias estimated by the base model as a sequential bias that depends on review and reviewer characteristics of the past reviews. Using a second level of Kalman filter in tandem with the base-level model, we estimated the effect of these characteristics on the bias. The key results from our model can be summarized as follows:

- Both the search and experience goods show a significant presence of bias in their ratings.
- For the majority of the products, the average bias present in the ratings tends to be negative, i.e., the unbiased rating estimated by our model is generally greater than the average rating.
- The bias estimated by the Kalman filter can be modeled as a sequential bias that depends on the review and reviewer characteristics of the past reviews.
- The sequential bias in experience products is predominantly explained by reviewer characteristics, whereas the bias in search goods is explained by both the review and reviewer characteristics.
- For the reviewer characteristics, the total number of reviews, the personal data count, and the total number of votes have a positive effect on the bias, whereas the number of helpful votes has a negative effect.
- For the review characteristics, the length of the review and the total number of votes have a positive effect on the bias, whereas the number of helpful votes has a negative effect.


We consider this a first step in building a better understanding of the presence of bias in online ratings and estimating the effects of temporal dynamics on the ratings. There are several interesting directions in which this work can be further explored. One can study whether a forum or a platform can itself have an effect on the bias in the ratings by analyzing the temporal dynamics of ratings of the same product on different forums. For example, reviews of a particular book could be collected from Amazon and Barnes & Noble, or reviews of an electronic item could be studied from Amazon and CNET, to see if there are any differences in their dynamics on different platforms.

There are, however, some drawbacks to using the Kalman filtering technique as presented in this paper. The foremost is that the filter parameters Q and R have to be properly fine-tuned. Although the measurement error R is easier to measure, the system noise covariance Q has to be fine-tuned. Also, there is an underlying assumption that the process being modeled is linear. If either the process or the measurement is non-linear, then one has to use the extended Kalman filter.