Available online at www.sciencedirect.com
www.elsevier.com/locate/procedia
Procedia Computer Science 132 (2018) 1202–1209
International Conference on Computational Intelligence and Data Science (ICCIDS 2018)

A Novel Approach to Web-Based Review Analysis Using Opinion Mining

Monica Malik a, Sharib Habib a, Parul Agarwal a,*
a Department of Computer Science and Engineering, Jamia Hamdard, New Delhi-110062, India

Abstract
With the advent of e-commerce, a huge amount of data is being generated. This data is useful if knowledge can be extracted from it from a business perspective. People often use website reviews to decide whether a product should be bought. Opinion mining makes this process of selection and decision making easier. Although several techniques exist for opinion mining based decision making, the approach in this paper is novel. In addition to generating opinions from reviews collected from e-commerce websites and calculating the overall sentiment for decision making, it incorporates the priority that the buyer gives to particular features of a product when making the final decision. This is done through additional weights, which the user can enter and adjust according to priority. The reason is that the priority of a particular feature of a product may vary from person to person. Moreover, the final decision lies in the buyer's hands, in addition to the opinions collected and analysed from reviews.

Keywords: Priority; Weights; Web-based reviews; Opinion Mining; Ontology; Modified OGC.

* Corresponding author. Tel.: +919873076361. E-mail address: [email protected]
1. Introduction
Opinion mining, also referred to as sentiment analysis, is the study of people's emotions, sentiments, behavioural patterns and opinions towards objects such as situations, events, products, persons, organizations and similar entities. People base their buying decisions on reviews. Well-known examples of websites that carry reviews include www.amazon.com, www.flipkart.com and www.ebay.com. These websites allow customers to express their views about the merchandise they have bought, so a new user facing a buying decision can read the reviews and benefit from them. Gruber [1] defines an ontology as the specification of a conceptualization. An ontology stipulates knowledge about a particular domain in a form that is easily
1877-0509 © 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/3.0/) Peer-review under responsibility of the scientific committee of the International Conference on Computational Intelligence and Data Science (ICCIDS 2018). 10.1016/j.procs.2018.05.035
comprehensible to both developers and computers. Documents that contain metadata are annotated using ontologies. An ontology supports information retrieval and reasoning, which makes data usable across several applications [2]. According to [3], a lexicon is a language-specific ontology, whereas a thesaurus is termed both a domain-specific and an application-specific ontology. Examples of domain-specific ontologies are manufacturing, corporate law and laptop manufacturing; examples of application-specific ontologies are airline reservations, conference organization and inventory control.
Recent years have seen a massive increase in the amount of content available, both as information on the internet [4] and as data published in articles. Approximately 1.1 billion people use the internet, and approximately 768 million of them are regular users of social media websites, according to the marketing research company comScore. The easy availability of almost all types of data on the web has increased the importance of social media and e-commerce websites. The internet is now used not only to gather information about products but also to read reviews and make an informed decision before buying. People also use this information to stay up to date on social and political issues around the globe [5]. Because information spreads quickly and effectively on the internet, organizations have become more cautious about their reputations. In [6], the authors state that reviews influence about 82% of purchase decisions. The reason is simple: opinions about products and services such as hotel bookings, movie tickets and restaurant reservations are shared among users.
This is easily understood by taking the example of a well-known e-commerce website such as www.flipkart.com, where users shop over the internet. A new user who wants to buy a commodity accesses the website, reads the reviews and opinions expressed by previous users about the product, and can then compare products [7]. The comments posted by users for different products usually cover various issues linked to the product; some comments are positive, some are negative and others are neutral [8]. A negative review that contains words like 'not', 'never' or 'didn't' needs special attention [4]. When a review discusses a feature and describes product attributes at different levels, it becomes difficult to interpret. For example, a review may say that HP is average in general but that the HP Pavilion is good. Integrating the sentiments of a customer with the features of a product is therefore a very important task [8].

2. Literature Survey
The process of opinion mining is explained in [7], where the author proposes a framework that analyses the existing research in opinion mining. In the first step, item mining identifies the particular product for which opinions can be mined, e.g. laptops, Fit-bit, mobiles, cameras, various Bluetooth devices, etc. The opinion given by a user can be either good or bad. A bad opinion about a piece of merchandise does not imply that all of its features are disliked. Feature mining, the second step, is therefore vital, because it lends credibility to a particular opinion. After the features have been extracted, the third step identifies the feature sentiment, which represents the weak and strong points of the product's features, for example a good battery life, light weight, beautiful colours and a good user interface.
The outcome of the above steps is the overall sentiment, which can be expressed as "Yes" or "No", for example 'Yes, buy this product', 'No, don't buy this product', 'Yes, this product is recommended' or 'No, this product is not recommended'. This saves buyers a lot of time and helps them to make a decision quickly. In [8], the authors propose an architecture that uses a multidimensional model to combine users' sentiments about a product with their remarks about its items. In this method, the first step is to identify the entities on the basis of the customers' reviews. The opinions or sentiments identified are then converted into an attribute table by using a 7-point polarity scale (-3 to 3), as shown in Table 1. An opinion table containing the polarities is constructed from the users' reviews. The authors then use their proposed formula to calculate the opinion polarity, which is referred to as OGC. Opinions can thus be examined for any product [9]. Good conclusions need good-quality information, so it is indispensable to evaluate the quality of information in the limited amount of time available [10].
Table 1. Polarity Representation

Polarity   Represents
  -3       Very Poor
  -2       Poor
  -1       Weak
   0       Neutral
   1       Acceptable
   2       Very Good
   3       Excellent

3. Approach Used
The approach used in this paper is a modification of the approach described in [9]. The proposed model proceeds stepwise as follows:
1. 100 reviews of the product 'Fit-Bit' were taken from the well-known e-commerce website www.amazon.com. A snapshot of the product [11] is shown in Fig. 1(a) and its reviews are shown in Fig. 1(b).
Fig. 1. (a) Product overview; (b) Product reviews.
2. A sample of 30 reviews is shown in this paper. The attributes and their corresponding polarity values (as described in Section 2) are shown in Table 2.

Table 2. Attributes Polarity

Comment   Attributes Polarity (AP)
1, 2      (Battery,2) (Display,2) (Accuracy,2) (Waterproof,-3) (Accuracy,1) (Synchronization,1)
3         (Accuracy,2) (Battery,2)
4         (Accuracy,2) (General,2) (Durable,2)
5         (Accuracy,2) (Connectivity,2)
6, 7      (Accuracy,2) (General,2) (Synchronization,1) (Synchronization,1) (Alarm,0)
8         (General,1) (Waterproof,-3)
9         (Battery,3) (Connectivity,2)
10, 11    (General,1) (Waterproof,-3) (Synchronization,1) (Accuracy,1) (Durable,2) (Connectivity,2)
12        (Accuracy,2) (Synchronization,1)
13        (Features,2) (General,2)
14, 15    (Reliable,2) (Accuracy,2) (Synchronization,-1) (Accuracy,2)
16        (Display,2) (Synchronization,2)
17        (Synchronization,1) (Battery,2)
18        (Display,2) (Waterproof,-3)
19        (Durable,2) (Connectivity,2)
20        (Battery,2) (Synchronization,1)
21        (Accuracy,1) (Alarm,0)
22        (Synchronization,-1) (Connectivity,1)
23        (Synchronization,-1) (General,1)
24        (Accuracy,1) (Synchronization,1)
25        (Waterproof,-3) (Accuracy,2)
26        (Synchronization,1)
27        (General,2)
28        (Synchronization,1)
29        (Connectivity,2)
30        (Accuracy,3)
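
The tabulation of attribute-polarity pairs into per-attribute polarity counts can be sketched as follows. This is a minimal Python illustration rather than the authors' implementation: the function name `build_polarity_matrix` and the dictionary layout are assumptions, and only comments 3, 4, 5, 8 and 9 of Table 2 are used as input.

```python
from collections import Counter, defaultdict

# Attribute-polarity pairs for comments 3, 4, 5, 8 and 9 of Table 2.
SAMPLE_REVIEWS = {
    3: [("Accuracy", 2), ("Battery", 2)],
    4: [("Accuracy", 2), ("General", 2), ("Durable", 2)],
    5: [("Accuracy", 2), ("Connectivity", 2)],
    8: [("General", 1), ("Waterproof", -3)],
    9: [("Battery", 3), ("Connectivity", 2)],
}

POLARITIES = range(-3, 4)  # the 7-point scale, -3 .. 3


def build_polarity_matrix(reviews):
    """Count, for every attribute, how often each polarity value occurs."""
    counts = defaultdict(Counter)
    for pairs in reviews.values():
        for attribute, polarity in pairs:
            if polarity not in POLARITIES:
                raise ValueError(f"polarity out of range: {polarity}")
            counts[attribute][polarity] += 1
    # One row per attribute, one column per polarity value from -3 to 3.
    return {attr: [c[p] for p in POLARITIES] for attr, c in counts.items()}


matrix = build_polarity_matrix(SAMPLE_REVIEWS)
for attribute, row in sorted(matrix.items()):
    print(f"{attribute:<13}", row)
```

Each row of the result has the same layout as the polarity columns of an opinion table: seven counts, ordered from -3 to 3.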
3. In [8], the authors proposed the formula for calculating the OGC values, given in Equation (1). The resulting OGC values are shown in Table 3.

$$\mathrm{OGC}_i = \sum_{p=-3}^{3} p \, n_{i,p} \qquad (1)$$

Here $n_{i,p}$ is the number of reviews that assign polarity $p$ to attribute $i$.

Every user making a purchase decision for a particular product gives one attribute priority over another. For example, Person A may assign a higher priority to the battery attribute and a lower priority to durability. To capture this, weights are assigned to the attributes and applied to the OGC calculated using Equation (1). The modified formula is shown in Equation (2).

$$\mathrm{Modified\ OGC}_i = W_i \times \mathrm{OGC}_i \qquad (2)$$

where $W_i$ is the weight assigned by the user to attribute $i$.
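
The Battery row of Table 3 suggests how the two quantities are obtained: four reviews at polarity 2 and one at polarity 3 give an OGC of 11, and a weight of 0.2 gives a Modified OGC of 2.2. The Python sketch below assumes that the OGC is the sum of each polarity value multiplied by its count, and that the Modified OGC is the weight multiplied by the OGC. The function names are illustrative and do not come from the paper.

```python
POLARITIES = (-3, -2, -1, 0, 1, 2, 3)


def ogc(counts):
    """Sum of polarity * count over the 7-point scale (assumed form of Eq. 1)."""
    if len(counts) != len(POLARITIES):
        raise ValueError("expected one count per polarity value")
    return sum(p * n for p, n in zip(POLARITIES, counts))


def modified_ogc(counts, weight):
    """OGC scaled by the buyer's weight for the attribute (assumed form of Eq. 2)."""
    return weight * ogc(counts)


# Polarity counts and weights as listed in Table 3.
battery = [0, 0, 0, 0, 0, 4, 1]
waterproof = [5, 0, 0, 0, 0, 0, 0]
sync = [0, 0, 3, 0, 10, 1, 0]

print(ogc(battery), round(modified_ogc(battery, 0.2), 2))
print(ogc(waterproof), round(modified_ogc(waterproof, 0.2), 2))
print(ogc(sync), round(modified_ogc(sync, 0.1), 2))
```

Under these assumptions the three rows reproduce the OGC and Modified OGC values listed for Battery, Waterproof and Sync in Table 3.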
Table 3. Calculated OGC, Weight and Modified OGC

                       Polarity
Attributes     -3  -2  -1   0   1   2   3    OGC    Wi    Modified OGC   Orientation
Battery         0   0   0   0   0   4   1     11   0.2        2.2        POSITIVE
Display         0   0   0   0   0   3   0      6   0.1        0.6        POSITIVE
Accuracy        0   0   0   0   4   4   1     25   0.1        1.5        POSITIVE
Waterproof      5   0   0   0   0   0   0    -15   0.2       -3          NEGATIVE
Sync            0   0   3   0  10   1   0      9   0.1        0.9        POSITIVE
General         0   0   0   0   3   4   0     11   0.01       0.11       POSITIVE
Durable         0   0   0   0   0   3   0      6   0.09       0.54       POSITIVE
Alarm           0   0   0   2   0   0   0      0   0.01       0          POSITIVE
Connectivity    0   0   0   0   1   5   0     11   0.09       0.99       POSITIVE
Reliable        0   0   0   0   0   1   0      2   0.01       0.02       POSITIVE
Features        0   0   0   0   0   1   0      2   0.09       0.18       POSITIVE
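
The effect of buyer-specific weights on the final decision can be illustrated with three OGC values from Table 3 (Battery 11, Waterproof -15, Sync 9). The paper does not spell out how the per-attribute values are combined into the final "Yes" or "No", so the sketch below assumes, purely for illustration, that the Modified OGC values are summed and that a non-negative total means "Yes". The two weight profiles and the function names are hypothetical.

```python
# OGC values taken from Table 3.
OGC = {"Battery": 11, "Waterproof": -15, "Sync": 9}


def orientation(value):
    """Label a value as in Table 3, where a Modified OGC of 0 is listed as POSITIVE."""
    return "POSITIVE" if value >= 0 else "NEGATIVE"


def decide(ogc_values, weights):
    """Return (total, decision) under the assumed sum-and-threshold rule."""
    # Assumption: weights are normalised to sum to 1, as the Wi column of Table 3 does.
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    total = sum(weights[attr] * value for attr, value in ogc_values.items())
    return total, "Yes" if total >= 0 else "No"


# Hypothetical buyers: one cares most about battery life, one about waterproofing.
battery_first = {"Battery": 0.5, "Waterproof": 0.2, "Sync": 0.3}
waterproof_first = {"Battery": 0.2, "Waterproof": 0.7, "Sync": 0.1}

for name, weights in [("battery_first", battery_first),
                      ("waterproof_first", waterproof_first)]:
    total, decision = decide(OGC, weights)
    print(name, round(total, 2), orientation(total), decision)
```

With the same reviews, the recommendation changes from "Yes" to "No" once waterproofing carries most of the weight, which is the kind of buyer-specific outcome the weighting is meant to capture.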

4. Experimental Results
The snapshots in Fig. 2 (a) to (g) depict the working of the approach used.
Fig. 2. (a) Implementation Main Page; (b) Generation of Matrix; (c) Generation of Matrix; (d) Generation of Matrix; (e) Generation of Matrix; (f) Weight Allocation; (g) Ontology Review

5. Conclusion
The main aim of the paper was to evaluate reviews in an efficient manner. Customers post many reviews, but only a few prove to be significant, and the result obtained depends largely on the type of reviews. Random reviews were collected and sentiment analysis was conducted on them. The opinion polarity was calculated using the weight method. The purpose of introducing the weight method was to consider, in addition to the sentiments of the customers whose reviews were analysed, the priorities of the customer who wishes to buy the product. By assigning weights to the attributes according to the priority that the user wishes to give them, it has been observed that the proposed model works effectively. The result obtained is thus more specific to the buyer's choice, which makes the decision-making process more accurate.

References
[1] Gruber, T. R. (1993) “A translation approach to portable ontology specifications.” Knowledge Acquisition, 5(2): 199-220.
[2] Zhou, L., and Chaovalit, P. (2008) “Ontology supported polarity mining.” Journal of the Association for Information Science and Technology, 59(1): 98-110.
[3] Spyns, P., Tang, Y. and Meersman, R. (2008) “An ontology engineering methodology for DOGMA.” Applied Ontology, 3(1-2): 13-39.
[4] Deshpande, M. and Sarkar, A. (2010) “BI and sentiment analysis.” Business Intelligence Journal, 15(2): 41-49.
[5] Eirinaki, M., Pisal, S. and Singh, J. (2012) “Feature-based opinion mining and ranking.” Journal of Computer and System Sciences, 78(4): 1175-1184.
[6] Economics, D. A. (2013) “Economic contribution of the Great Barrier Reef”: 1-42.
[7] Binali, H., Wu, C. and Potdar, V. (2010) “Computational Approaches for Emotion Detection in Text”, in Ismail, L., Chang, E. and Karduck, A. P. (eds), IEEE International Conference on Digital Ecosystems and Technologies: 172-177.
[8] Yaakub, M. R., Li, Y., Algarni, A. and Peng, B. (2012) “Integration of opinion into customer analysis model”, in Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology: 164-168.
[9] Haider, S. Z. (2012) “An Ontology-Based Sentiment Analysis: A case study (Dissertation)”: 1-103.
[10] Sukumaran, S. and Sureka, A. (2006) “Integrating structured and unstructured data using text tagging and annotation.” Business Intelligence Journal, 11(2): 8-17.
[11] https://www.amazon.in/Fitbit-Wireless-Activity-Tracker-Wristband/dp/B01K9S260E/ref=pd_cp_364_4?_encoding=UTF8&psc