Advanced Engineering Informatics 25 (2011) 162–176
Ontology-based customer preference modeling for concept generation
Dongxing Cao a,c,*, Zhanjun Li b, Karthik Ramani c
a Department of Mechanical Engineering, Hebei University of Technology, Tianjin, China
b EaglePicher Medical Power, Plano, TX, USA
c School of Mechanical Engineering, Purdue University, West Lafayette, IN, USA
Article info
Article history: Received 2 November 2009; Received in revised form 16 May 2010; Accepted 19 July 2010; Available online 19 August 2010
Keywords: Ontology; Design semantics; Design information; Customer preference; Concepts
Abstract
Customers often hold different preferences for the same product in terms of attributes such as function, shape, color, and cost. The ideas in the mind of the customer can be represented by higher-level concepts. However, the actual shape, color, and cost embodied in the product can only be viewed as lower-level features. In this paper, a model of preference elicitation from customers is proposed to bridge the gap between low-level features and high-level concepts. First, the attributes of customer preferences are classified using preference taxonomies that we develop. These taxonomies are represented using unstructured documents that are directly collected from customer descriptions. Second, the documents or catalogs of design requirements, containing textual descriptions and survey reports, are normalized by using an ontology-based semantic representation. Semantic rules are developed to describe the low-level features of customer preferences and to build an ontological knowledge base. Third, customer preferences are mapped to domain ontologies for driving high-level concept generation. A customer preference modeling framework is developed to construct a vector space model that measures the similarity between two preference concept ontologies. Finally, an empirical study is carried out in which five different customer groups are surveyed about their cell phone preferences. The query results are analyzed to assess the validity of concept generation from the customer preferences. © 2010 Elsevier Ltd. All rights reserved.
* Corresponding author. Tel.: +86 22 60204935/242; fax: +86 10 950507 to 716480. E-mail address: [email protected] (D. Cao).
1474-0346/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.aei.2010.07.007
1. Introduction
In today’s rapidly changing market, demand for a product, which determines an enterprise’s strategy, is often influenced by customer preferences [10]. Customers exhibit heterogeneity in their preferences and buying behavior relative to the same product [21]. However, the ideas in the minds of customers are flexible, and they often do not know exactly what they want until they see it [43]. As there is no fixed benchmark in their minds, they are not always satisfied with the existing products. Furthermore, ambiguous terms or phrases cannot exactly describe the preferences customers have in mind, which makes product development very challenging. In these cases, a virtual preference model is imagined by the product designers to elicit feedback from the customers. Although the intention of a customer is a very subjective issue based on high-level concepts, many people often have similar preferences for the same type of products. Therefore, it is necessary to build a framework that captures customer preferences and guides the engineers in working towards successful product development. However, the existing approaches, such as
West et al. [40], Ji et al. [17], Orsborn et al. [32], and Erin et al. [14], mainly depend on statistical measures that are often based on low-level features. A customer’s preferences for a product can be viewed as a reflection of his or her inner world. They depend on the customer’s behavior and intention. Ha [16] developed a customer management analysis model which tracks customer behavior and predicts customer behavior patterns. The accuracy of the predictive model is evaluated by using real-world data. Chen et al. [7] concluded that multicultural factors are the most important issues for eliciting and managing customer requirements to achieve success in new product development. They maintained that product development for an enterprise should focus on “how customers do it” rather than “what customers do.” In the past decades, the voice of customers has been widely accepted as a crucial source of input for obtaining design metrics and specifications for product concept generation [8,22]. Traditional methods for concept generation, such as quality function deployment (QFD) and the house of quality, mainly focused on special groups, product surveys, and environmentally-friendly studies to assess customer needs [15]. Psychographic factors also affect customers’ interest in specific product preferences. Erin et al. [14] proposed a framework for understanding preference inconsistencies by studying behavioral psychology and gave three examples of preference inconsistencies.
For example, some customers may only like the communication function of the cell phone, while others may want a camera function or only some of its performance features and attributes. Because the preferences of different customers vary so widely, it is difficult to position a product precisely on the market. Our development of a flexible representation of high-level customer preferences allows for rapid feedback from the customers towards product development and enables a more dynamic product development strategy. Ontology-based information retrieval has been successfully applied to semantic indexing [6,22]. A large amount of product information is described in design documents, such as users’ requirements, customer dialogs, and survey reports. At the same time, the concepts of customer preferences are hidden within these documents. Extracting customer preference information is essential to achieve success in new product development. However, few studies focus on customer preference modeling through ontology-based information retrieval. Thus, a theoretical prototype that concentrates on customer preference modeling is needed at the earliest stages of product design. We attempt to build a new model that addresses this issue by including ontology-based information retrieval.
The rest of this paper is organized as follows. In Section 2, we review related work. In Section 3, we introduce the proposed approach and ontology modeling for customer preferences. In Section 4, the elicitation techniques for customer preferences are presented. The process of preference semantic extraction is described in Section 5. In Section 6, we give a detailed description of the prototype and show how our ontology-based model compares with traditional keyword-based search techniques. An empirical study is presented in Section 7. Finally, in Section 8, we present the conclusions and discussions.
2. Related work
2.1. Methods for customer preferences
In engineering design, several models have been proposed for understanding customer preferences to support new product development [32]. Quality function deployment (QFD) offers a suitable expression of customer needs through the house of quality [15]. Also, based on product features and functionality, QFD can be mapped to customer preferences by using fuzzy sets [36]. Although preferences depend on customers’ subjective scales, product shapes cannot be ignored because they have a considerable influence on customer purchase decisions [3]. Furthermore, from aesthetic and psychological points of view, Orsborn et al. [32] quantitatively explored form preference by using a utility function, and Erin et al. [14] proposed a design decision model of preference inconsistency. In past decades, customer satisfaction has been widely studied [10], for example through the SCSB (Swedish Customer Satisfaction Barometer), the ACSI (American Customer Satisfaction Index) [11], and the NCSB (Norwegian Customer Satisfaction Barometer). However, these research models are mostly based on statistical measures for low-level features. In general, higher-level concepts or ideas with specific domain knowledge can embody the designers’ intentions, while lower-level concepts affect the final schematic configuration [41]. It is necessary to provide an approach that combines low-level feature groups and high-level semantic clusters by identifying the customers’ vague descriptions of preferences [5]. Therefore, a quantitative approach is needed to shorten the critical gap between the product designer’s envisioned features and latent high-level customer concepts [43]. In fact, customers often show different preferences relative to the same type of products, covering function, shape, color, and cost. These attributes can certainly affect the market activity of the product in the future [21] and are hence important to
monitor for active feedback to product development. Also, customers often do not know what they want, and their preferences may be influenced by browsing and experimenting with the options from new product concepts. Customer preference representations that can be described by using textual descriptions and shape representations can be said to be lower-level features. The activity of concept generation by designers strives to reduce the gap between unconstrained/higher-level customer beliefs and constrained/lower-level representations, as shown in Fig. 1 [43]. A flexible representation with which the customers interact to indicate their preferences can be modeled using new representations that combine shape and ontologies. The results of the customer interaction by search, relevance feedback, and modifications can provide valuable input to the design activity and even group the customers into different categories or classes for further concept generation [20]. Constrained/lower-level features are found in highly structured documents, such as project proposals, final reports, geometric shapes, and CAD drawings, which lie at the bottom of the design spectrum. At the same time, unconstrained/high-level concepts are found in unstructured, fragmentary documents, such as interviews, design logbooks, and text descriptions in drawings, which sit at the top of the design spectrum [26,41,42]. The middle of the design spectrum remedies the defects at both ends and essentially shortens the critical gap between low-level features and high-level concepts, as shown in Fig. 1.
2.2. Customer preferences for shape ontologies
Low-level features generally embody the shape, size, number of entities, etc., in which the shape can be specified from a geometric point of view or can be sensed [1]. Orsborn et al. [33] analyzed the fundamental features of components of vehicle classes to generate new designs based upon the derived shape relationships.
In general, the shape is described by different features in the hierarchy. This hierarchical representation describes the main shape categories that can be identified [2]. The customers have a great interest in shape preferences which depend on their desires [24]. For example, some people like a standard cell phone, some people like to have clean lines and soft edges, some people like a cell phone with a slider design, and some people like a flip phone. A variety of shape information is provided to the customers for selection from knowledge in the shape repository. The domain knowledge is needed to describe the shape information. A specific shape type is associated with the shape information. Some important properties of the shape description are also described by using an associated text sheet. The shape type hierarchy captures information regarding the shape features that can be processed by a shape semantic description [12]. A high-level hierarchical relationship describes the main concepts of the shape ontology which include shape program, shape repository, shape concepts, etc. Shape program contains the program rules and semantic structure [32]. These can be extracted
Fig. 1. Different stage spectrum of concept generation process.
from the text information. Shape repository stores the shape semantic information and structural information. File information is used to describe the shape concepts, which capture information regarding a product or shape associated with the various shape models stored in the repository. The concept of shape can help us obtain the group of shapes that share some common features. We can map the structure information to the shape features. It is also possible to decompose the shapes into hierarchical chunks or features which can be annotated with text [28]. Shape texts store information related to a shape, such as its size, material, and color. The shape description is the main part of the shape concept in the ontology and encapsulates information that is inherent in the shape model. It also constitutes the basic concept of customer preferences, which can be further extended and defined in the domain ontologies.
2.3. Customer preferences for semantic ontologies
Indexing by concepts, rather than by character strings, has been the motivation of a large body of research in information retrieval [6,29]. An important task is how to extract preference terms from all kinds of information and how to manage them efficiently [5,26]. It is challenging to discover, extract, and manage preferences effectively in the preliminary design stage. A fundamental deficiency of current information retrieval methods is the precision problem, in which the meaning of the indexed words is not exactly what the customer is seeking [22]. Customers sometimes express and describe the same preferences using different terms and phrases because of different contexts, needs, or linguistic habits. In fact, individual words provide unreliable evidence about the conceptual topic or meaning of a document. There are usually many ways to express a given preference concept, so the literal terms in a customer query may not match those of a relevant document.
In addition, most words have multiple meanings. Therefore, terms in a customer query will literally match terms in documents that are not of interest to the customer. To extract preferences, we may consider any document to consist of scattered information that might come from customers’ comments on a product. Many semantic concept similarities and statistical word measurements have been researched [13], and one well-known application tool is WordNet [31], which is an online lexical reference system used in semantic analysis and text information extraction across many domains.
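The term-mismatch problem described above can be illustrated with a minimal sketch. It assumes a small hand-made lexicon that maps surface words to preference concepts; the concept names imitate the prefixed naming used later in this paper, but the lexicon entries, the example texts, and the helper names `concept_vector` and `cosine` are hypothetical and are not taken from the authors' system.

```python
import math
from collections import Counter

# Hypothetical lexicon (an assumption for illustration only): several surface
# words map to the same preference concept.
LEXICON = {
    "camera": "F-DIGITAL-CAMERA",
    "photo": "F-DIGITAL-CAMERA",
    "pictures": "F-DIGITAL-CAMERA",
    "cheap": "COS-LOW-END-PHONE",
    "inexpensive": "COS-LOW-END-PHONE",
    "flip": "SH-FLIP-PHONE",
    "clamshell": "SH-FLIP-PHONE",
}


def concept_vector(text):
    """Count the concepts triggered by the words of a text."""
    words = text.lower().split()
    return Counter(LEXICON[w] for w in words if w in LEXICON)


def cosine(u, v):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


query = "cheap clamshell with camera"
document = "an inexpensive flip phone that takes good pictures"

# Literal keyword overlap between the query and the document.
shared_words = set(query.split()) & set(document.split())

# Overlap after both texts are indexed by concept.
similarity = cosine(concept_vector(query), concept_vector(document))

print(shared_words, similarity)
```

In this toy case the query and the document share no literal word, yet both trigger the same three concepts, so a concept-level comparison still relates them. A real lexicon would need word-sense handling, which this sketch ignores.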
3. The proposed approach
3.1. Overview
Product designers always like to extract useful information from documents in order to carry out design tasks. As the input content from customers consists of unstructured documents, most of it is qualitatively described, such as user requirements, survey reports, transaction data, and customer dialogs. We need analysis to transform them into formal documents. The transformation operation combines qualitative and quantitative aspects. Qualitative transformation is used to characterize design information of the unstructured arrangement in an abstract manner and turn it into formal documents. This allows product designers to improve the concept description according to customers’ requirements. On the other hand, quantitative transformation can be used to provide a canonical document description, which allows designers to easily understand, evaluate, and reuse previous design information. In general, most of the existing information is in unstructured documents. These documents need product designers’ analysis in order to extract useful information. At the beginning of design, the original content of preferences from customers, such as survey reports, transaction data, and customer dialogs, should be filtered. A normal design information text or document is used to extract preference information after transformation. Automatically extracting semantics from the normalized document requires recognizing the syntactic structure as well as the semantic meaning of the text. Linguistic knowledge and domain knowledge are needed to fulfill preference semantic extraction. To accurately represent the preference semantics in design information texts and documents, we need to extract as much relevant information as possible. A preference knowledge base is constructed by analyzing and collecting the varied product preference terms. It can be used to evaluate customer preferences for different products. The knowledge base includes the preference lexicon, domain ontology, semantic rules, and so on. Fig. 2 depicts the schematic of the infrastructure of a prototype for design preference information extraction. Its four aspects (A, B, C, and D) are described as follows.
First of all, the original material documents from customers, such as user requirements, survey reports, and transaction data, are acquired and then transformed into design information texts or documents. We then adopt methods to deal with this unstructured information.
Second, terms and concepts are extracted from these documents based on preference semantic structures, such as noun phrases, verb phrases, adverb phrases, and adjective phrases. They can be effectively represented by using ontology-based design semantic analysis and information extraction. The concepts of customer preferences are classified into different taxonomies in order to further acquire the relationships between two concepts. Specific thesauruses or lexicons are built to capture preference concepts from the preference knowledge base. The ontology expression and the preference semantic extraction process are described, and the process of semantic extraction is based on a shallow natural language processing algorithm for the domain ontology.
Next, the preference ontology concepts are established after carrying out the extraction operation. A concept-document matrix is built for customer preference information retrieval. The extraction algorithm and the preference concept measures are described and used for preference ontology modeling.
Finally, an empirical study of design preference extraction is introduced and five group queries are processed. A prototype system interface is provided to aid the process of preference information retrieval in order to capture and generate the concepts of customer preferences.
3.2. Ontology modeling for customer preferences
An ontology is a formal, explicit specification of a shared conceptualization [39], where conceptualization refers to an intended model of the world’s phenomena identified by its concepts and relations. Explicit means that the concepts and relations are explicitly defined, while formal means that it can be communicated across people and computers. Therefore, an ontology defines a set of representational terms that we call concepts. They can be described by adopting hierarchical correlations or tree structures [19]. Taxonomies, on the other hand, are viewed only as concept classifications in the hierarchy. A taxonomy simply links concepts by using ontology relationships. Most ontology concepts have multiple parents and form complex inheritance relations. Some concepts share a common genetic attribute with each other. At present, ontology modeling for customer preferences faces two main problems: one is the extraction of the semantic concepts by using the preference words, and the other is the document indexing from customers’ requirements. As for the first problem, the key issue is to identify appropriate preference concepts and build a preference
Fig. 2. System architecture of customer preference modeling.
lexicon based on customer documents. As for the second problem, the precision of extraction depends on the semantic expressions employed in customer requests when preference terms are indexed from customer documents. A hierarchical analysis process has been used to aggregate preferences in a group using a pair-wise approach [9]. However, a significant assumption of this method is that all decision makers in the group are equally important. That is, the information is handled equally, and no group member’s preference is considered superior to another’s. Ontology modeling provides an effective approach to indexing terms/concepts which can be used to match customer requests. However, the taxonomy acquisition of customer preferences for different products is subjective to some degree. Taxonomies are generated by brainstorming or by interviewing or conversing with customers. In similar circumstances, we can acquire preference ontologies [26]. Fig. 3 presents the taxonomy of the customer preference ontology, which comes from cell phone handbooks or knowledge resources. For example, cell phone handbooks often classify engineering components, which can be clustered into an ontology model as concepts and taxonomies in the hierarchy. Each component is described in detail, including its attributes such as
material, physical, geometric, and functional properties, which can easily be identified and mapped to ontologies as well as their corresponding relationships. The customer preference ontology includes concepts, taxonomies, and relationships. Each taxonomical concept is acquired from various engineering knowledge resources. We can adopt terms or phrases to describe the concepts of the taxonomy as well as their relationships with other concepts. For example, multimedia belongs to the Function taxonomy of a cell phone. We can represent it as F-MULTIMEDIA, where the prefix of each concept represents the taxonomy to which the concept belongs. Relationships can then be structured between concepts across taxonomies, for example has_feature (COL-SIVER, SH-KITTY-PHONE), in which COL-SIVER stands for a color concept in the color taxonomy and SH-KITTY-PHONE represents a shape concept in the shape taxonomy [18,26]. Table 1 lists the customer preference ontological concepts and acquisition resources for a cell phone, including the number of concepts in each taxonomy. The classification of their relationships is given in Table 2. At present, we have collected 10 taxonomies, 450 documents, 312 preference concepts, and 7 types of relationships in the customer preference ontology. Standard worksheets have been developed to easily acquire the preference ontology and lexicon. At the same time, these worksheets can automatically upload the required data into the Protégé editor (http://protégé.stanford.edu). Therefore, the proposed customer preference ontological concepts can also be presented by using Protégé 3.1, which is one of the most widely used ontology editors. Protégé provides a visual tool for preference ontology editing, including concept, taxonomy, and relationship building as well as preference ontology visualization [23].

Fig. 3. The taxonomy of preference ontologies.

Table 1. Preference ontological concepts and acquisition resources.
Taxonomy | No. of concepts | Example of concepts | Acquisition resources
Function | 116 | Voice, text, multimedia, memory, chat apps, MP3, internet, digital camera, bluetooth, etc. | Collecting function concepts from http://www.ssrnarena.com and the other websites
Environment | 6 | Radiation protection, man–machine friendly, recycle, health risk, etc. | Environmental concepts based on eco-friendly and green manufacturing technology, etc.
Shape | 22 | Flip phone, kitty phone, hand-writing and PDA, lighter-shaped, moustache phone, etc. | http://www.halfbakery.com
Performance | 91 | Good performance, signal strength, coverage area, large speaker, long talk time, etc. | Investigating different cell phone performances based on customers
Cost | 6 | Top grade phone, middle price phone, low end phone, etc. | Separating them according to price difference in customer preferences
Color | 13 | Black, white, green, red, yellow, silver, oyster color, etc. | According to the existing colors of cell phone on the market
Material | 13 | Metal, polycarbonate, plastic, stainless, synthesis materials, etc. | Manufacturing materials used as main parts of cell phones
Standard | 8 | Communication protocols, power, voltage, Wi-Fi, port, AM/FM, etc. | Cell phones use standards in different areas and countries
Model | 30 | Blackberry Bold 9000, Motorola Hint QA30, Nokia N97, Curve 3520, etc. | Different brand models for customer uses on the market
Attachment | 7 | Headset, lanyard, leather portfolio, clip, etc. | http://www.amazon.com

Table 2. Classification of the relationships.
Relationship | Concept | Definitions of the relationship | Examples
is_a | F-VOICE/F-MP3 | Relationships between parent and son or special and general | Is_a(F-VOICE, F-MP3)
has_part | E-HEALTH-RISK/E-CYCLE | Relationships between part and whole | Has_part(E-HEALTH-RISK, E-CYCLE)
has_function | F-VOICE/P-LONG-TALK | Refer to the connection between two concepts | Has_function(F-VOICE, P-LONG-TALK)
use_material | COS-LOW-END-PHONE/M-METAL | The type of materials | Use_material(COS-LOW-END-PHONE, M-METAL)
has_property | SH-FLIP-PHONE/M-STEELLESS | Physical attribute/geometric attribute | Has_property(SH-FLIP-PHONE, M-STEELLESS)
has_feature | COL-SIVER/SH-KITTY-PHONE | Geometric shape | Has_feature(COL-SIVER, SH-KITTY-PHONE)
has_standard | F-3G/ST-NETWORK | Domain specific standards | Has_standard(F-3G, ST-NETWORK)
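The prefixed concept names and typed relationships of this section can be sketched as a small data structure. This is only an illustration under stated assumptions: the prefix table is pieced together from the concept names quoted in the text and in Table 2, and the names `Relation`, `taxonomy_of`, and `related` are hypothetical helpers, not part of Protégé or of the authors' worksheets.

```python
from collections import namedtuple

# Taxonomy prefixes inferred from the concept names quoted in the text
# (an assumption; prefixes for the Model and Attachment taxonomies are not
# shown in the source and are therefore left out).
PREFIXES = {
    "F": "Function", "E": "Environment", "SH": "Shape", "P": "Performance",
    "COS": "Cost", "COL": "Color", "M": "Material", "ST": "Standard",
}

# A relationship is stored as a (name, subject, object) triple.
Relation = namedtuple("Relation", ["name", "subject", "object"])


def taxonomy_of(concept):
    """Return the taxonomy encoded by a concept's prefix."""
    prefix = concept.split("-", 1)[0]
    return PREFIXES[prefix]


# A few relationship examples copied from Table 2.
RELATIONS = [
    Relation("has_feature", "COL-SIVER", "SH-KITTY-PHONE"),
    Relation("has_function", "F-VOICE", "P-LONG-TALK"),
    Relation("use_material", "COS-LOW-END-PHONE", "M-METAL"),
    Relation("has_standard", "F-3G", "ST-NETWORK"),
]


def related(concept, relations=RELATIONS):
    """List (relation name, other concept) pairs in which a concept takes part."""
    pairs = []
    for r in relations:
        if r.subject == concept:
            pairs.append((r.name, r.object))
        elif r.object == concept:
            pairs.append((r.name, r.subject))
    return pairs


print(taxonomy_of("SH-KITTY-PHONE"))   # taxonomy read from the SH prefix
print(related("F-VOICE"))              # relationships involving F-VOICE
```

Storing relationships as plain triples keeps cross-taxonomy links, such as a color concept attached to a shape concept, easy to query without committing to any particular ontology editor format.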
4. Elicitation techniques of customer preferences
4.1. Hierarchical attribute of preferences
Enterprises aim to build up a good image of their products in customers’ minds [10]. They often question customers in order to find out the needs that are not met by existing products. Then they develop a product towards a set of market demands, define the product in terms of preference attributes, and assess the degree of demand for new products where no product currently exists. After finishing the analysis of needs and preferences, product designers can work towards concept generation in order to customize product configurations. However, preference cannot be viewed as equivalent to demand [30]. Preference is subjective and is related to
customer behavior with personal feelings, whereas demand is more objective, mainly depends on other factors, such as availability, familiarity, public praise, and advertising, and is backed by willingness to purchase. A best-selling product is based on a favorable customer preference [11]. First, the main factors from customer perspectives should be identified and the domain knowledge of the product should be collected in a professional survey before the product is launched. Second, a survey activity is conducted to determine the customers’ needs and desires before putting the new product on the market. This survey can be analyzed by using a software tool to determine the specific customer preferences. On this basis, the acceptance of potential customers can be measured and market simulations become feasible. Therefore, customer preferences can support demand analysis, conceptual design, and embodiment design, and at the same time, they are related to experimental results, public praise, and market surveys, as shown in Fig. 4.
4.1.1. Experimentation
In the past two decades, enterprises have employed traditional approaches to generate different concepts of a design and to conduct experiments with customers to capture preferences [15]. By using a software tool as the customer service platform, it is much easier to run experiments online. Most of them always
Fig. 4. The attribute of preferences.
ran such experiments and observed a rise in the browsing rate through clicks, determining within a few days whether a new design increased sales. If a product advertisement is allowed on a website, we can discover in a few hours whether experiment results or ad click rates increase, and the transaction data can reveal the commercial activity of customers purchasing products. Some popular websites certainly have a high ad click rate. When customers shop, transaction data can reveal their preferences for a particular product and enable results targeted to specific buyer groups or buyer categories (see Section 7). How easy it is to identify customer preferences depends on the online context and on the customer’s willingness to buy a product. However, these text descriptions are disorganized, although they contain a great deal of information. It is necessary to extract customer preferences by adopting effective methods, such as AHP (Analytic Hierarchy Process) [9], statistical methods [40], and decision algorithms [37]. In addition, customer preferences can be viewed as a multidimensional function of price, features, quality, performance, brand, distribution channel, safety, usability, etc. Therefore, they have multidimensional properties.
4.1.2. Public praise
A good public evaluation, i.e., public praise, is a powerful asset for a product [4]. Some products leave a deep-rooted impression in customers’ minds and are therefore likely to sell well [10]. Here, public praise can be divided into subjective impression and firsthand experience. Subjective impression includes exterior shape and brand consciousness. These directly describe which exterior shape the customers like the most and which brand they fall in love with at first sight. Firsthand experience, in turn, covers performance traits and configuration. These indicate the customers’ level of approval of a product’s function, performance, configuration, cost, shape, etc.
4.1.3. Survey
Enterprises often ask their customers preference questions about a product so that they can provide better service and improve customer satisfaction [4]. Surveys are sometimes described as informal conversations between product designers and customers. A number of measures can be taken to conduct the survey [4]. A common survey method is the interview. The product designer or enterprise first puts forward questions that relate to preferences for a particular product function, feature, shape, cost, or even service quality. Then the customers answer these questions. For example, designers often ask the question:
How often do you use a cell phone? Four potential answers are given as follows:
(1) Never,
(2) once in a while,
(3) 1–5 times a day,
(4) more than 5 times a day.
Customers can answer such multiple-choice questions immediately. Sometimes, they give direct answers on a scale, which can improve the survey data quality. This method can largely eliminate customers’ subjective bias. It yields metric data that can be analyzed with far more statistical rigor than is justified for traditional surveys. The other survey method is the questionnaire. Questionnaires can be distributed to special groups whose members share a certain common preference [37]. We can analyze their preferences to determine the relative weight of the preference for each separate attribute. Respondents with similar preferences can be identified, and their characteristics or profiles can suggest a label for each customer segment. These respondents can further urge the companies to customize a special product feature for the different groups. Generally speaking, a scale is commonly used in survey questions to elicit preferences or evaluations. In this paper, we use a five-level scale to describe the degree of customer preference, where “5” denotes strong preference and “1” denotes weak preference. The scale values may bias the selected results, which depend on customers’ personal desires [34]. A questionnaire has been distributed, and the results from 56 questionnaires of different cell phone customers have been obtained. Many concepts exist within each brand. A normalized scale is calculated as follows:
$$\text{Scale} = Q_5 - (Q_{c\,\max} - Q)\,\frac{Q_5 - Q_1}{Q_{c\,\max} - Q_{c\,\min}} \tag{1}$$
where $Q_5$ is the highest scale value and $Q_1$ is the lowest scale value, $Q_{c\,\max}$ is the maximum number of concepts, $Q_{c\,\min}$ is the minimum number of concepts, and $Q$ is the actual number of concepts. The concept counts of the different models are thus normalized to the 1–5 interval. The scales of five models, normalized to 1–5 for function, shape, color, and cost, are shown in Table 3.

Table 3
Different scales corresponding to preferences for cell phones.

Taxonomies   Models
             Bold™    Pearl 8110   Hint QA30   Nokia E75   Tundra
Function     5        2.3          4.2         1           3.9
Shape        5        1.8          4.2         1           4.2
Color        1        3            5           1           1
Cost         3.4      5            1           3.4         1.8

4.2. Transmutability of customer preferences

Customer preference is relative rather than absolute [38]. It changes with time span, scene, and attribute. The time span, which depends on the product category, is uncertain: sometimes it is long and sometimes short. The scene, which depends on customer behavior, is associated with different cultures and geographies [7]. In addition, customer preference may change when the values of attributes such as function, shape, or cost change. It is much easier to ask attribute questions of customers who are sensitive buyers. Therefore, the enterprise should be concerned with "How does the attribute affect a customer's willingness to buy a cell phone?" However, the existing approaches can only evaluate the diversity of preferences or conveniently generate alternatives, even though each customer has a different interest in or preference for the same product. We can view an attribute as a preference concept, i.e., what we can learn about customer preferences and which targets are technically infeasible or unrealistic. As a result, the preference concept is simply the result of what customers want. Product designers can therefore employ lower-level variables to achieve goals for preference characteristics or attributes. For example, sturdy and durable are perceptual attributes that can be translated into a set of technical specifications for physical characteristics, such as loading conditions, allowable deflections, and yield strength. These specifications are satisfied through manipulation of design variables, such as metal thickness and spring tension, subject to inviolable physical and geometric constraints [30].
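To make the normalization of Eq. (1) concrete, the following minimal Python sketch maps a concept count onto the 1–5 scale. The function name, its argument names, and the concept counts are hypothetical and are not taken from the survey data; the sketch only assumes the linear form of Eq. (1).

```python
def normalized_scale(q, q_c_min, q_c_max, q1=1.0, q5=5.0):
    """Map a concept count q in [q_c_min, q_c_max] onto [q1, q5] as in Eq. (1)."""
    if q_c_max == q_c_min:
        raise ValueError("q_c_max and q_c_min must differ")
    return q5 - (q_c_max - q) * (q5 - q1) / (q_c_max - q_c_min)


# Hypothetical concept counts for three unnamed models (illustration only).
counts = {"model_a": 10, "model_b": 8, "model_c": 5}
q_min, q_max = min(counts.values()), max(counts.values())

for name, q in counts.items():
    print(name, round(normalized_scale(q, q_min, q_max), 1))
```

The largest count maps to the top of the scale and the smallest to the bottom, so only the relative position of a count between the two extremes matters.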
5. Preference semantic extraction

5.1. Preference semantic representation

Semantic ambiguity often occurs in design queries when customers do not know the exact expressions or related concepts they want to pursue, although they may have some contextual clues, such as the functional preference of the design and other interacting parts of the product in question. A preference lexicon is a better way to evaluate customer preferences. Lexical terms are the natural language words or phrases of the corresponding concepts. They are used to map concepts to the words of texts and to explicitly represent the vocabularies of different ontology concepts. Word morphs, abbreviations, acronyms, and synonyms of a word or phrase are therefore also lexical terms and share the same concept as the original lexical term [25]. Noun phrases, verb phrases, adverb phrases, and prepositional phrases can also be extracted as preference terms. The morphs of the original lexical terms can be obtained easily and automatically from WordNet (http://wordnet.princeton.edu/) [31], whereas other terms must be acquired manually because WordNet is a general lexical resource rather than a specific preference lexicon.

We aim to extract implicit customer preferences from product domain knowledge. Because the existing case studies mostly concern special products, the extracted texts have certain limitations. Once a preference lexicon is built, it can be used for concept indexing and extended to improve the possibilities for preference evaluation. However, it is not easy to model and extract the semantic information of implicit customer preferences from design texts, where it is embedded in natural language. Such preference information is implicit within engineering design texts and is difficult to extract from unstructured documents. To overcome this difficulty, we build a preference semantic model and its mapping to the ontology concepts: we identify linguistic forms of preferences, produce a specific preference lexicon to support automatic indexing, develop customer preference ontology concepts, and generate design alternatives. A preference lexicon can show what the customers want. Fig. 5 represents a common preference lexicon for the cell phone, which describes the preference terms for cell phone functions, performance, shape, cost, color, and so on. Each can be decomposed further in the hierarchy.

Fig. 5. The lexicon of customer preferences.

Semantic rules link preference terms and concepts together to build customer preference concepts that aid in searching for design information. In general, there are two types of semantic rules. The first combines a preference term with a concept, where each term is a noun, verb, adjective, adverb, pronoun, etc., and each concept is composed of several words. For example, the combination of "RED" and "SH-FLIP-PHONE" forms a new preference concept, "SH-RED-FLIP-PHONE." The second combines two concepts. For example, "F-WORD" and "F-TEXT" constitute a new concept, "F-WORD-TEXT." A concept produced by semantic synthesis is called an Instance Concept (IC) and has a certain entity meaning. Instance concepts are connected by specific relationships, such as is_a, has_function, has_part, and has_material. These relationships are the basis for forming concept ontology relationships and are the main part of building semantic rules.

The process of building a semantic rule base is shown in Fig. 6. First, semantic rules are input by the designers; the rules can be formalized or edited, and a validity check is carried out. If there is any conflict among the rules, they are returned and edited again. Then a redundancy check is performed, and redundant rules are eliminated or merged. Finally, all rules that pass the checks are put into the rule base; the others are rejected.

Fig. 6. Process of semantic rule base establishment.

We use instance concepts to structure design semantics through a set of slots and relations. Each instance concept has several slots that describe its functions, properties, materials, and relationships. While the system runs, the documents are scanned for instance concepts and their specific values, and each concept corresponds to a slot value. For example, the design object cell phone has a slot "has_part" that corresponds to the instance concept "SHOW SCREEN," which is scanned and tagged in the process of indexing sentences. The concept "SHOW SCREEN" in turn has a function slot, "SLIDE MOTION," and a material slot, "has_material"; that is, "SHOW SCREEN" is made of the material "RDP." In the same way, the cell phone has a function slot, "has_function," with the function "COMMUNICATION" and three properties, "TEXT, AUDIO, and VIDEO," as shown in Fig. 7.

Fig. 7. Ontology-based design semantic expressions.

Customer preference ontologies involve two kinds of relationships, internal and external. An internal relation holds within the same taxonomy. For example, "F-TEXTENGLISH" and "F-WORD" both belong to the function taxonomy, so their relationship is internal. In contrast, "SH-FLIP-PHONE" and "M-STEELLESS" belong to different taxonomies, so their relationship is external. Fig. 8 presents the internal preference ontology relationships of the cell phone, in which different line types denote different ontology relationships. Each ontology concept corresponds to a node in the ontology tree. In this taxonomy, the hierarchical decomposition can be treated as a set of psychological semantics of customer perception about a particular cell phone, which is an interaction of customer involvement [8]. It comprises a tree architecture running from the highest level to the lowest level of the ontology concepts of customer preferences [29].

Fig. 8. Preference ontology internal relationships of a cell phone.

5.2. Text information extraction

We assume that the input design information is expressed in plain English. If the input is transaction data, it must first be converted into identifiable text; all inputs need to be transformed into structured text information. The text of the customer request is tokenized after stemming and removing the stop words. According to the preference lexicon, customer preference words are tagged to mark their positions. Preference terms and phrases are recognized by indexing the preference domain knowledge base. Using a list of synonyms, these tokens are associated with concepts in the ontology through Depth First Search (DFS) or Breadth First Search (BFS) [27,22]. After the preference semantics embedded in the customer queries have been extracted, the concepts are generated by matching terms and phrases in the ontology. The operations of the algorithm, shown in Fig. 9, are described as follows.

Fig. 9. The process of customer preference extraction.
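The extraction flow outlined above can be sketched roughly in Python as follows. This is a minimal illustration rather than the system's implementation: the stop-word list and the lexicon entries are hypothetical, stemming and POS tagging are omitted, and a plain dictionary lookup stands in for the DFS/BFS search over the ontology. The concept labels are borrowed from examples used elsewhere in the paper.

```python
import re

# Hypothetical stop-word list (auxiliary words to discard).
STOP_WORDS = {"want", "a", "and", "with"}

# Hypothetical preference lexicon: lexical term -> candidate ontology concepts.
LEXICON = {
    "camera": ["F-CAMERA"],
    "multimedia": ["F-MULTIMEDIA"],
    "red": ["COL-RED"],
    "flip": ["S-FLIP"],
    # A polysemous term maps to more than one concept and needs disambiguation.
    "picture": ["STATIC-PICTURE", "DYNAMIC-PICTURE"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop auxiliary words that carry no preference information."""
    return [t for t in tokens if t not in STOP_WORDS]


def tag_tokens(tokens):
    """Pair each token with its candidate concepts, or ["UNKNOWN"] if absent."""
    return [(t, LEXICON.get(t, ["UNKNOWN"])) for t in tokens]


def extract(query):
    """Run tokenization, stop-word removal, and lexicon tagging in sequence."""
    return tag_tokens(remove_stop_words(tokenize(query)))


for token, concepts in extract(
    "want a camera and multimedia cell phone with a red flip phone"
):
    print(token, concepts)
```

Tokens with more than one candidate concept are the ones that the later disambiguation step has to resolve from context.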
5.2.1. Stemming stop words and tokenizing

Auxiliary words such as pronouns, common verbs, common nouns, adjectives, and frilly words are removed from the phrases. The tokens (words) and punctuation symbols are marked by analyzing the input texts.

5.2.2. POS tagging

Each word is first looked up in the preference lexicon and marked with its most likely POS tag as defined there. Automatic POS assignment is combined with manual correction to improve the speed and accuracy of the mapping process. If a word has no match in the lexicon, it is assigned an unknown tag. Any incorrect tags are removed during manual correction [27].

5.2.3. Recognizing terms and phrases

The purpose of recognizing a concept is to select the most appropriate terms or phrases in the domain ontology. This stage can be divided into two steps.
(1) Concept matching: the tagged terms/phrases are assigned to the concepts they refer to. Words that match a preference lexicon term are assigned the pertinent ontology concept. Note that multiple concepts may be assigned to a single word or a series of words/phrases because different concepts may have the same lexicon term.
(2) Concept disambiguation: a word or term that matches multiple concepts causes ambiguity. This ambiguity arises from polysemantic and elliptical semantic structures [26]. It can be resolved by referring to the context of the term or phrase, that is, the concepts with which its adjacent words/phrases are tagged.

5.2.4. Joining relationships

Two concepts are joined by a certain semantic relation. The joining phase scans the sentences iteratively to generate relationships between two concept instances according to the semantic rules [25]. The two concepts may be related by meronymy, holonymy, hyponymy, hypernymy, causality, etc. These lexical relationships are expressed as has_part, is_a, has_property, etc. The degree of similarity between different concepts should also be considered. In the next section, the lexical relationships among the keywords are built, and semantic analysis is employed to extract the information.

5.3. Preference concept disambiguation

In the process of customer querying and ontological indexing, semantic ambiguities often result in lower retrieval precision or even in retrieval errors. In theory, three kinds of ambiguity may appear in text indexing:

(1) Polysemy: a term or phrase may match several concepts, resulting in semantic ambiguity. For example, picture appears in the function taxonomy concepts "STATIC-PICTURE" and "DYNAMIC-PICTURE" because both concepts have the same lexical term picture.
(2) Accuracy of term description: some concepts can be expressed using different terms, phrases, or synonyms that differ only slightly in semantics. For example, "VOICE-COMMUNICATION" and "INTERNET-COMMUNICATION" are two close concepts with a subtle difference in meaning, and they are stored separately in the knowledge base.
(3) Ellipsis and acronym: omitting part of a sentence structure or a demonstrative pronoun may lead to semantic error or misunderstanding, and some special term acronyms also cause mistakes. For example, the special terms "INFORMATION-RETRIVAL" and "IMAGE-RECOGNITION" share the acronym "IR," which is ambiguous without additional explanation.

These ambiguities are direct causes of lower concept retrieval precision. For example, suppose customers would like a cell phone priced at about $80 with a volume of 80 × 40 × 10 cm³. The two occurrences of the number "80" are ambiguous. In the preference ontology concepts, we have divided customer preferences into different classifications (see Section 4.1).
By attaching different taxonomical labels when tagging terms, such as COS-MIDDLE PRICE EIGHTY and SH-SIZE EIGHTY, we can distinguish them. A detailed algorithm for concept disambiguation is described in Section 6.2.

6. Customer preference evaluations
6.1. Vector space representation

In the traditional vector space model, a vector is used to represent each item or document [35]. Each element of the vector corresponds to a keyword associated with the given document, and the value assigned to that element reflects the importance of the term in representing the semantics of the document. A database of documents described by terms is represented as a term-by-document matrix [6]. The rows of the matrix are called the document vectors, and the columns of the matrix are the term vectors. Thus, a matrix element is the weighted frequency with which the term occurs in the document. In this paper, a corpus matrix of document-concepts is built on the basis of the term-by-document matrix, in which the rows are document descriptions of different cell phone brands, as shown in Fig. 10. The columns stand for the concepts that appear in the documents, where each concept consists of several terms or words. The matrix values $a_{ij}$ are weights that represent the importance of concepts in documents.

Fig. 10. Vector space model of corpus matrix.

Suppose that each concept is composed of a set of lexical terms $(t_1, t_2, t_3, \ldots, t_i, \ldots, t_n)$. Each term may consist of multiple words, each of which is assigned a weight $g_k$ with $g_k \in [0, 1]$. To calculate the concept score $C_{score}$ for each $C_i$, we first calculate the term score $T_{score}$ of all its lexical terms as follows:

$$T_{score}(ij) = \frac{\#\ \text{of words in the document } d_i \text{ that match}}{\#\ \text{of words in the document}} \sum_{k=1}^{H} g_k \tag{2}$$

Let us assume that a document $d_j$ includes $H$ words and that $k$ is the indexing position of a word in the document from left to right, that is, the order of the tagged words. $g_k$ is the weight of each word, and the weights sum to 1 ($\sum g = 1$). $g$ equals 1 if the matched term contains only one word, and $g$ is 0 when there is no matched word. According to Eq. (2), the value of $T_{score}$ is normalized between 0 and 1. The $C_{score}$ is then taken as the maximum of all its $T_{score}$ values [18]:

$$C_{score}(i) = \mathrm{Max}\big(T_{score}(ij)\big) \tag{3}$$

Following the traditional indexing approach, we combine the domain ontology knowledge with keyword-based indices [6]. The procedure of document annotation and item weighting is the same as in the keyword-based extraction and indexing process, and an effective ranking algorithm is developed on this basis. In doing so, we obtain the concept frequency (cf) and the inverse document frequency (idf) [22]. We then calculate the weight $w_{ij}$ of a characteristic item as follows:

$$w_{ij} = cf_{ij} \times idf_j = cf_{ij} \times \big(\log_2(N/n_j) + 1\big) \tag{4}$$

where $cf_{ij}$ is the frequency of concept $C_j$ in document $d_i$, $N$ is the number of documents, and $n_j$ is the number of documents that involve the concept $C_j$. From Eq. (4), the value of $w_{ij}$ increases with $cf_{ij}$ and decreases with $n_j$. The distance between two document vectors is represented by their similarity. The similarity between documents $d_i$ and $d_j$ is defined as the cosine of the angle between the two vectors:

$$\mathrm{Sim}(d_i, d_j) = \frac{\sum_{k=1}^{m} w_{ik}\, w_{jk}}{\sqrt{\sum_{k=1}^{m} w_{ik}^2 \, \sum_{k=1}^{m} w_{jk}^2}} \tag{5}$$

When carrying out a query operation, $d_i$ in the above model can be viewed as the query from a customer. By measuring the similarity between customers' queries over preference ontology concepts and the documents of different cell phone brands, we can rank the documents.
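As a rough illustration of Eqs. (4) and (5), the sketch below weights concept frequencies with the cf-idf formula and ranks documents by cosine similarity to a query. The document names, concept frequencies, and function names are hypothetical. Weighting the query with the corpus idf, and skipping query concepts that never occur in the corpus, are assumptions made for this example.

```python
import math


def inverse_document_frequency(corpus):
    """idf_j = log2(N / n_j) + 1 for every concept in the corpus (Eq. 4)."""
    n_docs = len(corpus)
    doc_freq = {}
    for concept_freq in corpus.values():
        for concept, cf in concept_freq.items():
            if cf > 0:
                doc_freq[concept] = doc_freq.get(concept, 0) + 1
    return {c: math.log2(n_docs / n_j) + 1 for c, n_j in doc_freq.items()}


def weight_vector(concept_freq, idf):
    """w_ij = cf_ij * idf_j; concepts unknown to the corpus are skipped."""
    return {c: cf * idf[c] for c, cf in concept_freq.items() if c in idf and cf > 0}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse weight vectors (Eq. 5)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_freq, corpus):
    """Return (document, similarity) pairs sorted from most to least similar."""
    idf = inverse_document_frequency(corpus)
    query_vec = weight_vector(query_freq, idf)
    scores = [
        (name, cosine_similarity(query_vec, weight_vector(cf, idf)))
        for name, cf in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical concept frequencies per document (illustration only).
CORPUS = {
    "doc_a": {"F-CAMERA": 2, "COL-RED": 1},
    "doc_b": {"F-MULTIMEDIA": 1, "S-FLIP": 1},
    "doc_c": {"F-CAMERA": 1, "F-MULTIMEDIA": 1, "COL-RED": 1, "S-FLIP": 1},
}
QUERY = {"F-CAMERA": 1, "F-MULTIMEDIA": 1, "COL-RED": 1, "S-FLIP": 1}

for name, sim in rank(QUERY, CORPUS):
    print(name, round(sim, 3))
```

Sparse dictionaries are used instead of a dense matrix so that concepts missing from a document simply contribute zero to the dot product.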
6.2. Ontology concept measures

Lexical ambiguity can be resolved by a concept similarity measure. We measure the distance between two concepts of the phrase/keyword clusters corresponding to particular product attributes. In our model, the customer preference ontology makes indexing convenient: it provides index terms/concepts that can be matched with customer queries. For example, given the customer query "want a camera and multimedia cell phone with a red flip phone," after the auxiliary words "want," "a," "and," and "with" are removed, the keywords "camera" and "multimedia" belong to the function taxonomy and stand for "F-CAMERA" and "F-MULTIMEDIA," while the keywords "red" and "flip" belong to the color and shape taxonomies and stand for "COL-RED" and "S-FLIP." By putting a taxonomical label in front of the keywords, we can easily index the preference terms from the lexicon. When concepts are correlated, the associated concepts are assigned greater weight based on their minimal distance from each other in the ontology and on their own matching scores, which depend on the number of words they match. In general, an ambiguous concept that is related to other concepts has a higher score and a greater probability of being retained than an uncorrelated ambiguous concept [18].

To calculate the similarity between two concepts, we need to build the interrelationships of the ontology concepts. Here, we represent our ontology as a directed acyclic graph (DAG). Each node in the DAG expresses a concept that includes a label name and a synonym list. The synonym list of a concept contains a set of keywords through which the concept can be matched with customer queries. Fig. 11 represents a small portion of the preference ontological relationships of the cell phone, in which each line type represents a different interrelationship between ontology concepts [18]. Suppose that the query keywords match the concepts $C_1, C_2, \ldots, C_i, \ldots, C_n$, and that each selected concept $C_i$ has a score based on the number of lexical terms $(t_1, t_2, \ldots, t_i, \ldots, t_m)$ from its synonym list that match the customer query. The keywords in customer queries are sought using DFS or BFS, which matches each keyword with the lexical terms of a concept. The score is calculated from the keywords matched by $T_{ij}$ and $C_i$ using Eqs. (2) and (3). The shortest distance, or least number of arcs, between two matched concepts in the preference ontology is defined as the Concept Distance (CD):

$$CD(i,j) = \begin{cases} 1 + \mathrm{Min}\big(\text{Number of arcs}(C_i, C_j)\big) & \text{if they have a common parent node} \\ \text{Infinite} & \text{if they have no common parent node} \end{cases} \tag{6}$$
Note that if the concepts are at the same level and no path exists, their distance is infinite (see Fig. 11). For example, the concept distance between "FUNCTION" and "SHAPE" is infinite, whereas "F-COMMUNICATION" and "F-VOICE" are linked by a "Part_of" relation, so their distance is 1. Similarly, the concept distance between "F-COMMUNICATION" and "F-MESAGE" is 2, and the concept distance between "F-VOICE" and "P-CLEAR" is 1. We can also calculate the weighted concept score, which is directly related to the $C_{score}$ of all its correlated concepts and inversely related to their CDs, as follows:
$$\omega C(i) = C_{score}(i) + \sum_{k=j}^{n} \frac{C_{score}(k)}{CD(i,k)} \tag{7}$$
where $\omega$ stands for the weight of a concept, which is related not only to the frequency of a keyword but also to the description of the corresponding concept in the documents. Items in different positions of a document are given different weights. For example, a concept appearing in the function taxonomy, such as "F-COMMUNICATION," may be weighted more heavily than concepts appearing in the other taxonomies, depending on customers' preferences and desires for a specific concept. According to the above similarity measure, a concept is highly correlated with others if it is close to them in the directed acyclic graph (DAG), i.e., semantically closer, or if it has more words matching a particular query keyword, i.e., lexically closer. By clustering the concepts in a document by these similarity measures, for example into function, performance, shape, and cost preferences, we can rank the documents by how well they match what the customers like the most.
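The concept distance and the weighted concept score can be sketched in Python as below, under several stated assumptions. The arcs are hypothetical and were chosen only so that the distances quoted in the text are reproduced; the actual arcs are those of Fig. 11. The ontology is traversed as an undirected graph, and "no common parent node" is approximated by "no connecting path." Eq. (6) as printed adds 1 to the minimum arc count, whereas the worked examples correspond to the raw arc count, so the sketch exposes this as an `offset` argument whose default of 0 is an assumption. The sum in Eq. (7) is taken over the other matched concepts.

```python
import math
from collections import deque

# Hypothetical arcs (illustration only; not read from Fig. 11).
ARCS = [
    ("FUNCTION", "F-COMMUNICATION"),
    ("F-COMMUNICATION", "F-VOICE"),
    ("F-COMMUNICATION", "F-TEXT"),
    ("F-TEXT", "F-MESAGE"),
    ("F-VOICE", "P-CLEAR"),
    ("SHAPE", "SH-FLIP-PHONE"),
]


def build_graph(arcs):
    """Build an undirected adjacency map from a list of (node, node) arcs."""
    graph = {}
    for a, b in arcs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def concept_distance(graph, start, goal, offset=0):
    """Minimum number of arcs between two concepts (BFS), or inf if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1 + offset
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return math.inf


def weighted_scores(graph, c_scores):
    """Add to each C_score the scores of the other concepts divided by their CD."""
    result = {}
    for i, score_i in c_scores.items():
        total = score_i
        for k, score_k in c_scores.items():
            if k == i:
                continue
            cd = concept_distance(graph, i, k)
            if math.isfinite(cd):  # unreachable concepts contribute nothing
                total += score_k / cd
        result[i] = total
    return result


ONTOLOGY = build_graph(ARCS)

# Hypothetical concept scores for three matched concepts.
C_SCORES = {"F-VOICE": 1.0, "P-CLEAR": 0.5, "SH-FLIP-PHONE": 1.0}
print(weighted_scores(ONTOLOGY, C_SCORES))
```

In this toy graph, "F-VOICE" and "P-CLEAR" reinforce each other because they are one arc apart, while "SH-FLIP-PHONE" keeps its own score because no path connects it to the other two.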
6.3. Evaluation analysis

Evaluation analysis uses the collected preference catalogs as the benchmark and compares the retrieval performance of the ontology-based search with that of the keyword-based search. The case study was carried out by five different groups. Each group has different viewpoints about cell phone preferences, based on its members' background knowledge and shopping experiences. The objective is to describe briefly what kind of cell phones they like the most. Each participant needs to provide at least 8–10 queries about the functions, performance, and shapes they favor the most. Participants are also required to attach a short description as the context of each query, for example, why the function is needed and how it is used. These constitute design specifications for new cell phones.

Fig. 11. Customer preference ontological external relationships of the cell phone.

The effectiveness of retrieval is usually measured by two quantities [18]:
$$\text{Recall} = \frac{\#\ \text{of relevant concepts that are retrieved}}{\#\ \text{of relevant concepts}} \tag{8}$$

$$\text{Precision} = \frac{\#\ \text{of relevant concepts that are retrieved}}{\#\ \text{of retrieved concepts}} \tag{9}$$
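A minimal Python sketch of the two measures follows, taking precision as the fraction of retrieved concepts that are relevant, as defined in the accompanying text. The function name and the example concept sets are hypothetical.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for two collections of concept labels."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant concepts that were retrieved
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: four relevant concepts, three retrieved, two in common.
relevant = {"F-CAMERA", "F-MULTIMEDIA", "COL-RED", "S-FLIP"}
retrieved = {"F-CAMERA", "COL-RED", "P-CLEAR"}

recall, precision = recall_precision(retrieved, relevant)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Returning 0.0 for an empty set is a convention chosen here to avoid division by zero.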
Two metrics are usually used to describe the quality of preference retrieval. Recall is the proportion of relevant concepts that are retrieved by the system, and precision is the proportion of retrieved concepts that are relevant. Precision is a measure of accuracy, while recall is a measure of how much good information is retrieved. In general, recall must be weighed against precision to determine which overall strategies matter most for customer preferences.

7. Empirical study

We designed a virtual experiment platform to obtain the customer preference ontology for concept generation. Three cell phone brands, Blackberry, Motorola, and Nokia, were selected for this empirical study. Each brand includes 10 series of cell phones, as follows:

Blackberry: Bold 9000, Curve 8320, Curve 8520, Pearl 8110, Tour 9630, Storm 9500, 7130g, Pearl Flip 8220, 8830, 6230;
Motorola: Hint QA30, Tundra, MOTO W233, Aura, V80, XT800 ZHISHANG, MT701, QUENCH, Karma QA1, Motocubo A45;
Nokia: Nokia N97, Nokia E75, Nokia 6010, 5800 XpressMusic, 7900 Prism, Nokia 5330, Nokia X6, 2220 Slide, 8600 Luna 5, Nokia C5.

We assume that customer queries focus on terms or keywords about the cell phone. Different customers may ask about different aspects, such as function, shape, color, and cost, depending on their professions, domain knowledge, cultural background, and so on. To carry out this empirical study, we distributed questionnaires to five different groups and collected the 56 survey forms mentioned above. We also collected documents for the three brands, each with 10 series of cell phones; in total, we obtained 450 description documents from websites. On average, each document contains about 4.42 sentences and 61.43 words. Five different groups of customers were investigated, and the experimental data and texts were processed. The five groups are as follows:
G1: a group of industry engineers;
G2: a group of university faculties;
G3: a group of company executives;
G4: a group of graduate students;
G5: a group of freshmen.
The objective of concept generation involves identifying customer needs and then mapping those needs into a set of cell phone attributes or specifications. In this case study, where the designer would like to generate a new cell phone from concept clusters, the following basic requirements must be satisfied:

• A hybrid cell phone with a touch screen and a hardware keyboard.
• Push button to realize Talk, Bluetooth, MP3, Video; fairly easy to master.
• Long battery life (over 3 weeks), some functions and extras.
• Digital camera/digital player; fairly easy to operate.
• Security features that are environmentally friendly.

Fig. 12 presents the interface of a prototype system, in which the nearside of the interface handles customer queries about preference concepts from the different groups. Users inquire about taxonomical concepts, such as functional, shape, or cost concepts, and select the concepts they like the most from the tree structure on the left. The top right of the interface performs preference concept indexing. On the one hand, users can directly input a cell phone model name to search for it and display its document information at the bottom right. On the other hand, the system can automatically select and rank the documents in the knowledge base that are closest to the customer queries.
Fig. 12. Interface of prototype system.
Based on this prototype system, we first formulate the queries from customers in order to extract preference terms and concepts. During customer querying, terms and phrases are recognized using DFS or BFS. For example, industry engineers like practical and easy-to-operate cell phones for technical communication or business negotiation. College faculty members like cell phones with reliable performance and good voice quality for educational activities. Company executives like high-grade, luxurious cell phones with wide screens to show their social status. Graduate students like delicate, dynamic exteriors with multimedia functions. University freshmen like small and exquisite cell phones with flip or kitty shapes in bright colors. They inquire about function, exterior, price, performance, use environment, color, etc. The quantitative distribution of the keywords/ontology concepts corresponding to different taxonomies within the collected customers' documents is shown in Fig. 13.

We collected a total of 45 queries from the five groups; three of them were eliminated because they were not related to customer preferences, for example, impractical and imaginary future cell phone functions. The remaining 42 queries are classified as general queries, specific queries, and context queries [18]. General queries are associated with the upper-level concepts of the ontology, such as customer preferences for different cell phone brands or their series. Specific queries are associated with lower-level concepts of the taxonomies, e.g., cell phone performance, shape features, and material attributes. Context queries cannot easily be described without context; in them, customers specify a certain context to make the query unambiguous, such as cell phone performance parameters or quantitative indexes.

Protégé 3.1 (http://protege.stanford.edu/) was used to generate the domain ontology, with the preference taxonomies serving as the basis of the concept hierarchies. The lexical terms of a concept were modeled as a slot attribute of the concept class. Protégé also supports the domain ontology model in several formats, such as XML, OWL, and RDF. The domain ontology model was translated into XML scripts and input into the system [23]. Ontology concepts are built on the basis of customer preferences. Their interrelationships and the number of each type are statistically calculated and shown in Fig. 14. Table 4 compares the empirical results of the queries, giving the recall and precision obtained for each type of query. These results show that ontology-based retrieval is superior to traditional keyword-based retrieval.

Fig. 13. Distribution of keywords/concepts in different taxonomies.

Fig. 14. Distribution of different interrelationships.

Table 4
Recall/precision based on ontology and keywords.

Types and number of queries   Recall                         Precision
                              Ontology (%)   Keyword (%)     Ontology (%)   Keyword (%)
General    10                 92             16              82             65
Specific   27                 S3             19              87             84
Context    5                  26             12              73             61

As the members of each group have different cultural backgrounds, genders, and ages, they are interested in different concepts [37]. We can index the concepts of customer queries and determine which taxonomies the customers prefer, as shown in Fig. 15, where the Y-axis is the percentage of concepts and the X-axis shows the different concept taxonomies. Customer preferences differ markedly among the groups across the taxonomies.

Fig. 15. The customer preferences of the five different groups corresponding to taxonomies.

The G1 group queries are mainly concerned with ease of use, word clarity, better sound, and so on. After carrying out document retrieval, we obtained the "Nokia E75," which was close to the G1 group requirements. However, some concepts still need to be added to satisfy customer preferences, such as touch phone, mute sound in camera, etc. Likewise, on the basis of the G2 group queries we obtained the "Tour 9630" model, which was close to customer requirements but still differed slightly. We needed to add some new concepts to fit customer preferences, such as call waiting, WIFI, edit text, etc. Although the retrieval returned the most suitable model for customers, it still cannot satisfy customer preferences completely, so some additional concepts need to be attached as a supplement. A small difference remains between the taxonomical concepts and customers' personal preferences. The objective of information retrieval is to find the closest documents and detect the difference between the existing concepts and new concepts. Table 5 presents, for each group, the retrieved customer preference model and the number of supplementary concepts. These appended concepts, such as emergency call, voice help, accident discernment, and less radiation, will become the trend for redesigning the existing models in the future.

Table 5
Different G preference models and concepts.

Group No.   Returned No. of questionnaires   Models of cell phone   Num. of customer concepts   Num. of concept supplement
G1          8                                Nokia E75              22                          3
G2          12                               Tour 9630              17                          3
G3          9                                Nokia N97              24                          2
G4          15                               Hint QA30              19                          4
G5          12                               Pearl 8110             22                          2

In fact, every cell phone brand confines its functions within a certain range, and perfection cannot be obtained within the given cost limits. Sometimes performance is good, but the functions are not remarkable. Some designs are pretty good, but performance and customer service are unlikely to satisfy customers' desires entirely. Companies attempt to bargain with customers over the supply of cell phones in order to cater to customer preferences.

8. Conclusions and discussions

In this research, a customer preference ontology is developed, and preference design information is extracted to build a preference knowledge base that includes a preference lexicon, a domain ontology, and semantic rules. An ontology-based model is given for information retrieval. The generation of concepts and the selection of information are based on the customer preference ontology.
We have shown how the ontology can be used to generate and measure design concepts in customer queries. We used preference domain knowledge about the cell phone to describe the proposed approach, but the results can be applied to other similar products. Based on customer preference concepts extracted from 450 documents and an empirical study of five customer groups, our ontology-based retrieval demonstrates its superiority to keyword-based search techniques. However, further research needs to consider the following four aspects.

With the development of new technology, the amount of unstructured and informal design information, such as engineers’ logs, images, and nonstandard language descriptions from customers, is steadily increasing. These texts are less likely to comply with a formal document format [6], yet they remain part of the material from which customer preferences are extracted. It is difficult to extract ontology concept semantics from such documents, and this merits further investigation.

Information extraction of customer preferences is currently based on indexing the sentence semantic rules, in which the preference lexicon and domain knowledge are crucial to achieve
information retrieval. However, we do not consider document syntactic structures and syntactic rules. To develop an automatic document indexing system for customer preferences that minimizes human intervention, sentence structures have to be analyzed on the basis of syntactic rules [25]. Further work is needed toward this next step.

Customer preferences are relative rather than absolute [38]. At a particular time, customers may show a strong liking for certain cell phones, but their preferences may later change in favor of another cell phone. Therefore, we would like to build an ontology that is easy to update and can adapt dynamically to changes in customer preferences. In addition, as time goes on, customer perceptions and product concepts keep changing along with customer preferences [24]. An automatic analysis approach is needed that keeps abreast of these changes and responds quickly and simply to shifts in customer preferences and in the market. Such an approach may combine ontological and statistical methods.

In this paper, the preference lexicon collects only the most positive context terms and phrases. Negative context terms [41], such as the negative adverbs “no, not, hardly, rarely” or the negative adjectives “bad, ridiculous, impracticable, troublesome,” are not yet included. Moreover, a double negation amounts to an affirmation, a construction that is common in both writing and speech. The frequencies of such words may be informative for some customers, but not for all. A more accurate language model for eliciting customer preferences will be developed to take this aspect into account.
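As an illustration of the last point, the following minimal sketch shows one way negation could be taken into account when scoring a customer statement. It is not the language model envisaged above. The negators and negative adjectives are the examples quoted in the text, the positive terms are hypothetical, and the rule that every negator preceding a lexicon term flips its polarity is a simplifying assumption that ignores syntactic scope.

```python
import re

# Negators and negative adjectives quoted in the text above.
NEGATORS = {"no", "not", "hardly", "rarely"}
NEGATIVE_TERMS = {"bad", "ridiculous", "impracticable", "troublesome"}
# Hypothetical positive lexicon entries, for illustration only.
POSITIVE_TERMS = {"good", "useful", "convenient"}


def polarity(sentence: str) -> int:
    """Sum the signed polarity of lexicon terms in a sentence.

    Assumption: each negator seen since the previous lexicon term flips
    the sign of the next term, so an even number of negators (a double
    negation) leaves the polarity unchanged.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    score = 0
    pending_negators = 0
    for token in tokens:
        if token in NEGATORS:
            pending_negators += 1
        elif token in POSITIVE_TERMS or token in NEGATIVE_TERMS:
            sign = 1 if token in POSITIVE_TERMS else -1
            if pending_negators % 2 == 1:
                sign = -sign
            score += sign
            pending_negators = 0
    return score


print(polarity("The keypad is not bad"))
print(polarity("The menu is not good"))
print(polarity("It is not that the camera is not good"))
```

Under these assumptions, “not bad” is scored as positive, “not good” as negative, and the doubly negated third sentence as positive again.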
Acknowledgments

This research is partially sponsored by the National Natural Science Foundation of China (Grant No. 50775065) and the Natural Science Foundation of Hebei Province, China (Grant No. E2008000102). The authors acknowledge partial support from the Product Lifecycle Management (PLM) Center at Purdue University and the Center for Advanced Manufacturing (CAM). The authors thank the anonymous reviewers for their helpful suggestions on this study.

References

[1] I. Biederman, Recognition-by-components: a theory of human image understanding, Psychological Review 94 (1987) 115–147.
[2] M. Berkowitz, Product shapes as a design innovation strategy, Journal of Product Innovation Management 4 (4) (1987) 274–283.
[3] P.H. Bloch, Seeking the ideal form: product design and consumer response, Journal of Marketing 59 (3) (1995) 16–29.
[4] R.N. Bolton, A dynamic model of the duration of the customer’s relationship with a continuous service provider: the role of satisfaction, Marketing Science 17 (1) (1998) 45–65.
[5] D. Cao, K. Ramani, Z. Li, Guiding concept generation based on ontology for customer preference modeling, in: International Symposium Series on Tools and Methods of Competitive Engineering (TMCE), Ancona, Italy, April 12–16, 2010.
[6] P. Castells, M. Fernandez, D. Vallet, An adaptation of the vector space model for ontology based information retrieval, IEEE Transactions on Knowledge and Data Engineering 19 (2) (2007) 261–272.
[7] C.H. Chen, L.P. Khoo, Y. Yan, Evaluation of multicultural factors from elicited customer requirements for new product development, Research in Engineering Design: Theory, Applications and Concurrent Engineering 14 (3) (2003) 119–130.
[8] C.H. Chen, L.P. Khoo, W. Yan, A strategy for acquiring customer requirement patterns using laddering technique and ART2 neural network, Advanced Engineering Informatics 16 (2002) 229–240.
[9] A. Chwolka, M.G. Raith, Group preference aggregation with the AHP – implications for multiple-issue agendas, European Journal of Operational Research 132 (2001) 176–186.
[10] C. Fornell, A national customer satisfaction barometer: the Swedish experience, Journal of Marketing 56 (1992) 6–21.
[11] C. Fornell, M.D. Johnson, E.W. Anderson, J. Cha, B.E. Bryant, The American Customer Satisfaction Index: nature, purpose, and findings, Journal of Marketing 60 (4) (1996) 7–18.
[12] N. Crilly, J. Moultrie, P.J. Clarkson, Shaping things: intended consumer response and the other determinants of product form, Design Studies 30 (2009) 224–254.
[13] S. Deerwester, S.T. Dumais, G.W. Furnas, T.K. Landauer, R. Harshman, Indexing by latent semantic analysis, Journal of the American Society for Information Science 41 (6) (1990) 391–407.
[14] E.F. MacDonald, R. Gonzalez, P.Y. Papalambros, Preference inconsistency in multidisciplinary design decision making, Journal of Mechanical Design 131 (2009) 031009-1–031009-13.
[15] A. Griffin, J.R. Hauser, The voice of the customer, Marketing Science 12 (1) (1993) 1–27.
[16] S.H. Ha, Applying knowledge engineering techniques to customer analysis in the service industry, Advanced Engineering Informatics 21 (2007) 293–301.
[17] H. Ji, M.C. Yang, T. Honda, A probabilistic approach for extracting design preferences from design team discussion, in: Proceedings of the ASME International Design Engineering Technical Conference & Computer and Information in Engineering Conference, Las Vegas, September 27, 2007.
[18] L. Khan, D. McLeod, E. Hovy, Retrieval effectiveness of an ontology-based model for information retrieval, International Journal of Very Large Data Base (VLDB) 13 (2004) 71–85.
[19] Y. Kitamura, R. Mizoguchi, Ontology-based systemization of functional knowledge, Journal of Engineering Design 15 (4) (2004) 327–351.
[20] T.Y. Lee, Adaptive text extraction for new product development, in: Proceedings of the ASME International Design Engineering Technical Conference & Computer and Information in Engineering Conference, San Diego, August 30, 2009.
[21] G.S. Linoff, M.J.A. Berry, Mining the Web: Transforming Customer Data into Customer Value, John Wiley & Sons, New York, 2001.
[22] S. Li, A semantic vector retrieval model for desktop documents, Journal of Software Engineering and Applications 2 (2009) 55–59.
[23] S.C.J. Lim, Y. Liu, W.B. Lee, Product analysis and variants derivation based on a semantically annotated product family ontology, in: Proceedings of the ASME International Design Engineering Technical Conference & Computer and Information in Engineering Conference, San Diego, August 30, 2009.
[24] Y.C. Lin, H.H. Lai, C.H. Yeh, Consumer-oriented product form design based on fuzzy logic: a case study of mobile phones, International Journal of Industrial Ergonomics 37 (2007) 531–543.
[25] Z. Li, C. Yang, K. Ramani, A methodology for engineering ontology acquisition and validation, Artificial Intelligence for Engineering Design, Analysis and Manufacturing 23 (2009) 37–51.
[26] Z. Li, V. Raskin, K. Ramani, Developing engineering ontology for information retrieval, Journal of Computing and Information Science in Engineering 8 (2008) 011003-1–011003-13.
[27] M. Marcus, B. Santorini, M.A. Marcinkiewicz, Building a large annotated corpus of English: the Penn Treebank, Computational Linguistics 19 (2) (1993) 313–330.
[28] J. McCormack, J. Cagan, Supporting designer’s hierarchies through parametric shape recognition, Environment and Planning B – Planning and Design 29 (2002) 913–931.
[29] C. McMahon, A. Lowe, S. Culley, Waypoint: an integrated search and retrieval system for engineering documents, Journal of Computing and Information Science in Engineering 4 (4) (2004) 329–338.
[30] J.J. Michalek, Preference Coordination in Engineering Design Decision-Making, Ph.D. thesis, University of Michigan, Ann Arbor, 2005.
[31] G.A. Miller, WordNet: a lexical database for English, Communications of the ACM 38 (11) (1995) 39–41.
[32] S. Orsborn, J. Cagan, P. Boatwright, Quantifying aesthetic form preference in a utility function, Journal of Mechanical Design 131 (2009) 061001-1–061001-10.
[33] S. Orsborn, P. Boatwright, J. Cagan, Identifying product shape relationships using principal component analysis, Research in Engineering Design, 2007. doi:10.1007/s00163-007-0036-8.
[34] R. Mugge, P.C.M. Govers, J.P.L. Schoormans, The development and testing of a product personality scale, Design Studies 30 (2009) 287–302.
[35] G. Salton, Automatic Text Processing, Addison-Wesley, Wokingham, MA, 1998.
[36] M.J. Scott, E.K. Antonsson, Aggregation functions for engineering design tradeoffs, Fuzzy Sets and Systems 99 (3) (1998) 253–264.
[37] T.K. See, K. Lewis, A decision support formulation for design teams: a study in preference aggregation and handling unequal group members, in: ASME 2005 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, Long Beach, California, USA, September 24–28, 2005.
[38] P. Slovic, The construction of preference, American Psychologist 50 (5) (1995) 364–371.
[39] R. Studer, V.R. Benjamins, D. Fensel, Knowledge engineering: principles and methods, Data and Knowledge Engineering (DKE) 25 (1–2) (1998) 161–197.
[40] P.M. West, P.L. Brockett, L.L. Golden, A comparative analysis of neural networks and statistical methods for predicting consumer choice, Marketing Science 16 (1997) 370–391.
[41] M.C. Yang, W.H. Wood III, M.R. Cutkosky, Design information retrieval: a thesauri-based approach for reuse of informal design information, Engineering with Computers 21 (2005) 177–192.
[42] M.C. Yang, H. Ji, A text-based analysis approach to representing the design selection process, in: International Conference on Engineering Design, ICED’07, Cite Des, Paris, France, August 28–31, 2007.
[43] R. Zhao, W.I. Grosky, Narrowing the semantic gap – improved text-based web document retrieval using visual features, IEEE Transactions on Multimedia 4 (2) (2002) 189–200.