Journal Pre-proof Examining end users’ ability to select business services: a conceptual framework and an empirical study Padmal Vitharana, Amiya Basu
PII: S0378-7206(18)30436-1
DOI: https://doi.org/10.1016/j.im.2019.103241
Reference: INFMAN 103241
To appear in: Information & Management
Received Date: 22 May 2018
Revised Date: 20 November 2019
Accepted Date: 23 November 2019
Please cite this article as: Padmal Vitharana, Amiya Basu, Examining end users’ ability to select business services: a conceptual framework and an empirical study, (2019), doi: https://doi.org/10.1016/j.im.2019.103241
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier.
Examining end users' ability to select business services: a conceptual framework and an empirical study
Padmal Vitharana Martin J. Whitman School of Management, Syracuse University, Syracuse, NY 13244-2130, USA Tel.: +1-315-4433132 Fax: +1-315-4435457 Email:
[email protected]
Amiya Basu Martin J. Whitman School of Management, Syracuse University, Syracuse, NY 13244-2130, USA Tel.: +1-315-4433783 Fax: +1-315-4435457 Email:
[email protected]
Abstract

Software made from autonomous business services is gaining popularity. End users can now build large applications by assembling a suite of services. Because some end users might have limited knowledge of their requirements and the functionality of available services, the key challenge is to find the services needed to build an application. Finding the services that match requirements requires specialized knowledge (knowledge of requirements and of the functionality of available services), not merely general competence. Moreover, the complexity of the requirements could also hinder the ability of end users to select services. However, there is little research into how end users’ sophistication and requirement complexity affect their ability to avoid duplication (i.e., select the most cost-effective set of services) and select a set of services that satisfy their requirements. We provide a conceptual framework for the choice problem faced by the decision maker and develop a set of hypotheses on end users’ sophistication and requirement complexity, and the impact of these factors on outcome performance: the ability to avoid duplication and select the appropriate services. We then conduct an empirical study to test the hypotheses. Empirical results offer support for all hypotheses. Our work has several implications. We demonstrate both conceptually and empirically that end users’ naivety has a significant impact on service duplication. For a profit-maximizing service vendor, knowledge of the end user’s sophistication/naivety enables different pricing strategies: (1) a pure component strategy, (2) a pure bundling strategy, or (3) a mixed bundling strategy.

Keywords: Business services, Service duplication, Empirical study, Requirements
1 Introduction
Over the years, software systems built from components and web services have become commonplace (Daya et al. 2016; Kessler 2008). The service-oriented architecture provides the architectural model for implementing applications using web services. Such applications rely on a set of services that communicate with each other and with other applications using a messaging protocol over the Internet (Haines and Rothenberger 2010). More recently, microservice-based application development has started to gain traction in the industry. Unlike web services, a microservice represents a single function around a business capability, encompasses its own data resources, and is quickly deployable (Daya et al. 2016). Instead of the monolithic architecture prevalent in conventional development, the nascent architecture relies on building applications using independently deployable microservices. Software is no longer built from scratch, and the emerging trend to build applications using business services is likely to continue into the foreseeable future.

There are many vendors offering components and web services, such as Amazon Web Services (https://aws.amazon.com) and ComponentSource (https://www.componentsource.com). For example, Amazon’s pay-as-you-go model allows organizations to pay for services as needed, thereby affording greater responsiveness to change without overcommitting their budgets. Many services can be downloaded and incorporated into applications or can be invoked over a network, over the Internet, or in the cloud using a software-as-a-service (SaaS) subscription model (Newman 2015; Singleton 2016). Microservice-based development is in its infancy, but marketplaces for vendors offering microservices could start to emerge as the paradigm gains further momentum (Singleton 2016). Because these are digital products, services purchased outright are generally not returnable, while subscription contracts can be terminated for a fee.

While the service-based paradigm has clear advantages, end users face several challenges in building applications using business services. The service selection process involves a search in which requirements are matched against the functionality of a suite of services. Some end users may lack full understanding of their requirements,[1] which constrains their ability to select the services needed (Vitharana et al. 2016). Finding the services that match requirements requires specialized knowledge (the end user’s knowledge of requirements and of the functionality of available services), not merely general competence. Research has acknowledged this distinction between sophisticated and naive analysts, including end users (e.g., Berry 1995), and the resulting difference in their capacity to find the matching services (e.g., Vitharana 2003).
Furthermore, the complexity of the requirements has been shown to inhibit end users’ ability to engage effectively in the requirement analysis task (Browne and Ramesh 2002; Vitharana et al. 2016). It is easy to see why requirement complexity could further complicate the service selection process, although there is little research examining this cause-effect relationship in the context of service-based software development. Consumer research postulates that decision makers’ knowledge of product or service characteristics affects their ability to select those that meet their needs (Alba and Hutchinson 1987; Basu and Vitharana 2009). Likewise, some end users may have limited knowledge of the functionality of the available services, which further hinders their ability to find the services that match their requirements. Naive end users with limited knowledge of both the requirements of the application and the functionality of available services may select services that do not have the capabilities needed for the application or may rely on a suboptimal solution with duplicate service functionality, thereby greatly increasing the cost of software development.

While these challenges are acknowledged, research to date has not fully examined two key antecedents of the end user’s performance in business service–based application development: (1) the end user’s knowledge of requirements and functionality of available services and (2) the complexity of the requirements. In this article, we address this gap in our understanding of service-based development. We provide a conceptual framework for the choice problem faced by the decision maker and develop a set of hypotheses on the end user’s sophistication and requirement complexity, and their impact on outcome performance: the ability to avoid duplication and select the appropriate services.

[1] In the case of a developer or an analyst searching for business services matching a customer requirement, the risk associated with a lack of understanding of customer requirements is even greater.
Our research design is twofold. First, we develop a mathematical model to conceptualize the end users’ service selection process. In this model, we consider an end user selecting services one at a time until she is confident that the selected services fulfill the functionality needed for the application. This process proceeds in stages, where the end user selects a service, reviews it, and decides whether to stop or continue the search for additional services. Second, we conduct an empirical study to test the hypotheses.
Our research makes several contributions. We contribute to theory by conceptualizing how the aforementioned antecedents impact end users’ ability to build service-based applications. Our conceptualization offers insights into the impacts of end users’ sophistication/naivety and requirement complexity on service selection. We also present a simple metric developed for assessing end users’ naivety/sophistication (termed NAISOP) based on both perceptual and objective measures. Overall, this work provides one of the first attempts to both theoretically and empirically examine end users’ performance with regard to service duplication. For management focused on profit maximization, knowledge of the end user’s sophistication/naivety enables different pricing strategies: (1) a pure component strategy, (2) a pure bundling strategy, or (3) a mixed bundling strategy.
2 Literature review
End users’ lack of knowledge of requirements hinders their ability to develop systems that meet their needs (Moody et al. 1998). Unlike conventional software development, the service-based paradigm relies on building applications by assembling a suite of existing business services that match requirements (Vitharana 2003). Hence, end users’ ability to select the required business services that are typically stored in a large repository depends on their understanding of their needs. End users’ inability to fully grasp the requirements could also lead to unnecessary duplication of service functionality and, as a result, increase software development costs.
Brucks (1985) found that understanding of the product class characteristics facilitates information search behavior. Hence, similarly to the scenario of a general consumer searching for a specific product, the end users’ ability to select the required business services also depends on their understanding of the characteristics or features of the business services considered to fulfill those needs (Alba and Hutchinson 1987). Service repositories use various service characteristics and features to catalog services. For example, Vitharana et al. (2003a, 2003b) introduced a cataloging scheme using characteristics such as rules applicable to services to catalog them in the repository so that end users searching for services that match their requirements can find them on the basis of the rule facet. Therefore, understanding of product characteristics such as rules applicable to services being considered plays a pivotal role in this search exercise.
The two aforementioned knowledge aspects (understanding of both end users’ needs and the characteristics or features of the business services considered to fulfill those needs) manifest themselves in end users’ sophistication with respect to the task at hand, that is, finding business services that match requirements. In decision-making scenarios such as service-based development, those who have diminished capacity with regard to these two knowledge aspects can be called naive consumers, a term used by scholars such as Dean (1999). Irrespective of the end users’ sophistication or naivety, the complexity of the requirements also impacts the search for business services matching requirements (Moody et al. 1998; Vitharana et al. 2016).

In the build-by-assembly business services paradigm, the end users’ performance corresponds to their ability to select the services that fulfill their requirements (Vitharana et al. 2003a). In many cases involving large repositories, end users could find multiple sets of business services that might satisfy the given requirement (Vitharana et al. 2003a, 2003b). Because these services have different prices, rational end users strive to identify the suite of business services that is most cost-effective. However, as their sophistication in terms of knowledge of requirements and characteristics or features of the business services considered is bound to differ, some end users are likely to consider duplicate solutions; that is, select a suite of services with characteristics and features over and beyond those required to satisfy their requirements. Selecting duplicate business services would increase the cost of developing service-based software systems.
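The notions of a most cost-effective suite and of duplicate functionality can be illustrated with a small sketch. Everything in it is hypothetical: the catalog, service names, prices, and function labels are invented for illustration, the brute-force search is not the authors' method, and treating "duplication" as "at least one selected service could be dropped without losing coverage" is an assumption made only for this example.

```python
from itertools import combinations

# Hypothetical catalog: service name -> (price, set of functions provided).
CATALOG = {
    "S1": (10, {"chart"}),
    "S2": (15, {"pivot"}),
    "S3": (22, {"chart", "pivot"}),
    "S4": (12, {"regression"}),
}

def covers(selection, required):
    """Return True if the selected services jointly provide every required function."""
    provided = set()
    for name in selection:
        provided |= CATALOG[name][1]
    return required <= provided

def cost(selection):
    """Total price of the selected services."""
    return sum(CATALOG[name][0] for name in selection)

def cheapest_cover(required):
    """Brute-force search for the lowest-cost selection that covers the requirement."""
    best = None
    names = sorted(CATALOG)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if covers(combo, required) and (best is None or cost(combo) < cost(best)):
                best = combo
    return best

def has_duplication(selection, required):
    """Assumed definition: some service can be dropped and the rest still covers."""
    return any(
        covers([s for s in selection if s != dropped], required)
        for dropped in selection
    )

required = {"chart", "pivot"}
best = cheapest_cover(required)
```

In this toy catalog, the single service S3 covers the requirement at a lower price than S1 and S2 together, while a selection such as S1 plus S3 also covers it but pays for the chart function twice.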
The sophistication level of customers also impacts the vendor’s pricing strategy. Services, and more generally any products, can be offered individually or as a bundle, where two or more services are offered as a package, typically at a discounted price. The vendor must choose from three “bundling” strategies: pure bundling, where services are offered as a bundle only; pure components, where no bundle is offered; and mixed bundling, where the customer can buy services either individually or as a bundle (Ghosh and Balachander 2007). For a less sophisticated end user, a bundle of services makes it less likely that she would miss something important. Thus, the end user gains from bundling because of reduced search and transaction costs (Harris and Blair 2006; Yadav and Monroe 1993). In contrast, a sophisticated end user can easily discern which services she needs and would prefer to buy services individually. Vendors could gain from bundling because it can potentially generate greater profits through price discrimination and demand expansion (Adams and Yellen 1976; Eppen et al. 1991; Hitt and Chen 2005; Schmalensee 1984; Venkatesh and Mahajan 1993). The choice of bundling strategy depends on the relative sizes of the naive and sophisticated end-user segments, and a pure bundling strategy is more attractive if end users largely lack sophistication (Basu and Vitharana 2009).
3 Conceptual framework and hypotheses
In the present study, we examine how an end user’s ability to select appropriate business services for an application depends on her knowledge of the requirements and of the characteristics and features of available services (i.e., level of sophistication), and on the complexity of the requirements. To develop hypotheses, we draw from research in two distinct disciplines: consumer psychology and information systems. In consumer psychology, Brucks (1985) examined how a consumer’s knowledge of a product category and the complexity of the choice task affect the consumer’s search behavior, and found that a more knowledgeable (i.e., sophisticated) consumer seeks less information about inappropriate alternatives, particularly when the task is complex. Sujan (1985) found that a more knowledgeable (sophisticated) consumer has greater ability to match a given problem to an appropriate category, thus facilitating the search for a solution. Both findings suggest that a more sophisticated customer will be able to search over alternative products more efficiently and make fewer mistakes. Alba and Hutchinson (1987) postulated that a consumer with greater experience is better able to isolate information relevant to a given task. Summarizing, we conclude that a more sophisticated end user should be less likely to select a business service not relevant to a given requirement and more likely to recognize a service that meets the requirements of a specific application.
From an information systems point of view, the challenge faced by the end user is to map the problem space as represented by the requirements to the solution space as represented by the suite of available business services (Vitharana 2003). The end users’ sophistication plays a significant role in their ability to map the problem space to the solution space (Ravichandran and Rothenberger 2003; Vitharana 2003). A sophisticated end user can be expected to have greater understanding of her needs and therefore the problem space than a naive end user. Likewise, a sophisticated end user is more likely to be aware of the characteristics or features offered by those business services that are being considered (i.e., solution space) vis-à-vis a naive end user. Ko et al. (2006) reported that in a software maintenance exercise developers search for cues in the code to determine which specific code segment needs to be changed. We could easily draw an analogy between the scenario considered in this article and software maintenance, where requirements (i.e., problem space) are mapped against system code (i.e., solution space). It can be expected that sophisticated end users who are better versed in their own requirements as well as the characteristics and features of available business services will more readily identify the right cues, which would help them more precisely map the problem space to the solution space. Such cues could include a functional description of the business services, applicable rules, expected inputs and outputs, and exception conditions (Ravichandran and Rothenberger 2003; Vitharana et al. 2016). Therefore, the challenges faced by more naive end users in mapping the problem space to the solution space would lead to a larger number of duplications and incorrect picks when compared with more sophisticated end users.
The end user faces an added challenge when requirements are more complex. Mapping of requirements (i.e., problem space) to the business services (i.e., solution space) that match them is hindered by the complex nature of needs and wants. Requirement complexity could manifest itself in the form of complex decision rules, exceptions, and interrelationships between subrequirements (Crowston and Kammerer 1998; Vitharana et al. 2016). Again, analyzability of software code during a maintenance exercise is germane to the scenario considered in this article. Shaft and Vessey (2006) demonstrated that software maintainers’ performance has a positive relationship with the cognitive fit between their mental representation of the code and the new requirement. When requirements are relatively simple, we could expect end users to more readily realize a cognitive fit between requirements and the relevant business services considered to fulfill them. In contrast, complex requirements could cloud end users’ ability to form accurate mental models in mapping requirements to available business services. This is likely to result in a larger number of duplications and incorrect picks when compared with a scenario involving simpler requirements.

There is a clear convergence between the consumer psychology and information systems research streams with respect to the conceptualization of the decision scenario under consideration. The decision makers, regardless of whether they are general consumers or software end users, differ in their sophistication in terms of knowledge of the need or requirement and knowledge of the products or business services available to them. The two research streams converge to postulate that this variation in sophistication, together with the complexity of the need in the case of a consumer or the complexity of the requirement in the case of an end user, affects decision makers’ outcome performance: the ability to avoid duplication and select the appropriate services. Hence, we summarize the above as the following four hypotheses regarding how an end user picks services to fulfill a given requirement:
Hypothesis 1. If the level of requirement complexity is the same, a more naive end user will have a larger number of duplications than a less naive end user.

Hypothesis 2. If the level of end user sophistication is the same, the number of duplications will increase with requirement complexity.

Hypothesis 3. If the level of requirement complexity is the same, a more naive end user will have a larger number of incorrect picks than a less naive end user.

Hypothesis 4. If the level of end user sophistication is the same, the number of incorrect picks will increase with requirement complexity.
In Appendix A, we present a mathematical model based on search behavior and
stopping time that also leads to the same hypotheses. In this model, we consider an end user who is selecting services with corresponding functionality. We assume that there is a large number of services and the end user selects services one at a time until she has chosen all the services needed for the application. This process proceeds in stages, where the end user selects a service, reviews it, and decides whether to stop or continue the search for additional services (see Fig. 1 for an illustration).
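The staged process just described can be sketched as a simple loop. This is only an illustration of the select, review, and stop-or-continue cycle, not the Appendix A model: the function name, the `is_confident` predicate, and the candidate list are hypothetical stand-ins introduced here.

```python
from typing import Callable, Iterable, List

def search_and_stop(candidates: Iterable[str],
                    n_needed: int,
                    is_confident: Callable[[str], bool]) -> List[str]:
    """Select services one at a time until n_needed have been accepted.

    is_confident models the end user's own judgment of relevance after
    reviewing a service; it is an assumption, not a measure from the study.
    """
    chosen: List[str] = []
    i = 1                              # index of the service currently sought
    for service in candidates:         # select the next service and review it
        if i > n_needed:               # every needed service has been found: stop
            break
        if is_confident(service):      # confident of relevance?
            chosen.append(service)
            i += 1                     # move on to the next needed service
    return chosen

picked = search_and_stop(["a", "b", "c", "d"], 2, lambda s: s != "a")
```

Under this sketch, a less sophisticated end user could be represented by a predicate that accepts services that are in fact irrelevant or redundant, which is one way to picture how incorrect picks and duplications could arise.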
Fig. 1. Business services search and stop process. [Flowchart: Start; i = 1; Select service; decision “Confident of relevance?” (Yes: i ← i + 1; No: return to service selection); decision “i > number of services needed?” (Yes: Stop; No: return to service selection).]

4 Empirical study
4.1 Experimental task

The task was to select the “most cost-effective business service or services” needed to complete five sets of requirements. Appendix B provides details of the experimental task. The associated dataset contained information about universities in the United States, and the tasks related to various aspects of analyzing this dataset. The dataset and the corresponding tasks[2] were selected because of the participants’ general familiarity with the domain. Before the tasks were presented, descriptions of six business services were presented (see Appendix C). Following Brucks (1985), we presented the end user, as the decision maker, with hypothetical brands to avoid an internal search of known business services within the decision maker’s memory. Thus, the decision maker had to process the information presented about the business service alternatives to make an appropriate selection. At any time during the subsequent problem-solving task, the participants had the opportunity to review these business services. The dataset was presented in a browser window in a table format (with scrollable rows and columns) without any reference to Microsoft Excel.

[2] We presented each requirement as a task. So, collectively, we refer to the corresponding sets of requirements as the simple task and the complex task.

4.2 Instrument development

We developed an instrument to measure the participants’ knowledge of data analysis (see Appendix D), which was subsequently used to develop an aggregated measure of their NAISOP (knowledge of requirements and functionality of available services).

4.3 Experimental design and protocol

We used a controlled experiment to test the hypotheses. An experiment website was built to conduct the study, including the administration of the survey questions. Participants were assigned alternately to the simple task or the complex task. Before the main study was conducted, a pilot study was performed with 16 students. The objectives were to test the website, the survey instrument, and the overall flow of the experiment. On the basis of the results of the pilot study, a few minor changes to the instrument and the website were made. For the main experiment, both students and professionals were included as participants to diversify the level of sophistication. Hence, students from a university in the northeastern United States and professionals (from a panel maintained by Qualtrics) were recruited. The data collection was performed over an 8-month period. For their participation, a student and a professional received $5 and $26, respectively.

4.4 Data collection

At the start of the experiment, all participants answered online survey questions about their personal profiles (demographics) and knowledge of data analysis, number of statistics courses taken, and experience with Excel (number of months). A total of 379 participants completed the study.

4.5 Analysis
Participants’ naivety/sophistication (NAISOP) was assessed using both a perceptual measure and an objective measure. The perceptual aspect was assessed using their perceived knowledge of data analysis described earlier. The objective aspect was assessed using the number of statistics courses they had taken and the number of months of Microsoft Excel experience. While other metrics such as personal IQ or college GPA could be used to distinguish between naive and sophisticated participants, we chose this combination of perceived knowledge of data analysis and number of statistics courses and Excel experience because the tasks are statistics related and the business services offer the statistics-related functionality of common spreadsheet software such as Microsoft Excel (e.g., charts, pivot tables, and regression analysis). Cronbach’s alpha for the aggregated NAISOP construct was 0.90, demonstrating sufficient reliability. Appendix E provides details of the measurement model for NAISOP.
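For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The toy response matrix is invented; the actual NAISOP items and their scaling are described in Appendix E and are not reproduced here.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: three respondents, two items.
alpha_perfect = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
alpha_partial = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

When the items move together perfectly, the statistic reaches its maximum of 1; weaker agreement among items lowers it.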
To further establish reliability and validity of the aggregated NAISOP construct, we examined its correlation with two related measures. There were strong correlations between NAISOP and the number of data analysis classes (correlation coefficient 0.514, p < 0.001) and between NAISOP and months of data analysis experience (correlation coefficient 0.611, p < 0.001). Hence, NAISOP is a suitable measure to distinguish between naive and sophisticated participants.
In coding dependent variables, we considered each task separately such that duplication for one task was scored as 0 (no duplication) or 1 (duplication). The same scoring scheme was used for the selection of correct business services for each task. Hence, given that there were five tasks, for the duplication and correct dependent variables, the possible minimum and maximum values were 0 and 5, respectively. Scoring each task with 0/1 and then adding the scores for the five tasks afforded an effective way to deal with potential outliers. Table 1 presents relevant descriptive statistics.
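The coding scheme just described can be sketched in a few lines. The input format is an assumption made for illustration: the study does not state how raw selections were stored, so the per-task counts of redundant services below are hypothetical.

```python
def score_tasks(flags):
    """Sum per-task 0/1 scores; any nonzero (truthy) value counts exactly once."""
    return sum(1 if flag else 0 for flag in flags)

# Hypothetical participant: number of redundant services picked in each of five tasks.
redundant_per_task = [0, 3, 1, 0, 7]
duplicates = score_tasks(redundant_per_task)
```

Because each task contributes at most 1, a single task with many redundant picks (7 in this invented example) cannot dominate the participant's total, which is the outlier-limiting property noted above.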
Table 1. Descriptive statistics.

Variable                      Mean or percentage
Age                           28 years
Female                        40%
No. of statistics courses     4.15
Excel experience              33.9 months
Status
  Undergraduate               43%
  Graduate                    32%
  Professional                24%
Duplicates
  0                           6.60%
  1                           12.14%
  2                           15.57%
  3                           17.41%
  4                           17.15%
  5                           31.13%
Correct
  0                           14.78%
  1                           10.29%
  2                           11.61%
  3                           22.16%
  4                           27.97%
  5                           13.19%
The regression models used to test the hypotheses are as follows:

$$\text{Duplicates} = \beta_0 + \beta_1 \times \text{NAISOP} + \beta_2 \times \text{Complexity} + \varepsilon_1,$$
$$\text{Correct} = \gamma_0 + \gamma_1 \times \text{NAISOP} + \gamma_2 \times \text{Complexity} + \varepsilon_2.$$

Although data were collected from students in many disciplines and professionals across many firms, it is possible that the error terms are correlated because of a common effect. Seemingly unrelated regression estimation (SURE) corrects for correlated error terms (Davidson and MacKinnon 1993; Greene 2002). In SURE, a linear regression model consisting of a set of regression equations is generated. Each regression equation has its own dependent variable and a set of exogenous variables. The equations appear to be unrelated although they are related through the correlation in the errors. It is argued that joint estimation of the set of equations using generalized least squares produces more efficient estimates than estimating each equation individually by ordinary least squares (Theil 1971; Zellner 1962). Hence, we tested our hypotheses using SURE models (Johnston 1984). The results obtained with Stata version 15 are shown in Table 2.
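The joint estimation described above can be written compactly. What follows is a sketch of the textbook feasible generalized least squares form of seemingly unrelated regression (Zellner 1962), not necessarily the exact routine run in Stata; the stacked symbols $y$, $Z$, $\theta$, $\Sigma$, and the residual vectors $e_j$ are notation introduced here for illustration, and the divisor $n$ in the covariance estimate is one common convention.

```latex
% y_1 = Duplicates, y_2 = Correct; both equations share the
% n-row regressor matrix X = [1, NAISOP, Complexity].
\begin{aligned}
y_1 &= X\beta + \varepsilon_1, \qquad y_2 = X\gamma + \varepsilon_2,\\
y &= \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},\quad
Z = I_2 \otimes X,\quad
\theta = \begin{pmatrix} \beta \\ \gamma \end{pmatrix},\quad
\operatorname{Cov}(\varepsilon) = \Sigma \otimes I_n,\\
\hat{\sigma}_{jk} &= \tfrac{1}{n}\, e_j^{\top} e_k
  \quad \text{(} e_j \text{: residuals from equation-by-equation OLS)},\\
\hat{\theta} &= \left[ Z^{\top} \bigl(\hat{\Sigma}^{-1} \otimes I_n\bigr) Z \right]^{-1}
  Z^{\top} \bigl(\hat{\Sigma}^{-1} \otimes I_n\bigr)\, y .
\end{aligned}
```

The off-diagonal element of $\Sigma$ carries the cross-equation error correlation that the text refers to as a common effect.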
Table 2. Seemingly unrelated regression estimation analysis.

Dependent variable: Duplicates
Variable          Coefficient   Standard error   t         p         χ2       R2
Intercept (β0)    3.157         0.227            13.88     < 0.001
NAISOP (β1)       −0.002        0.001            −3.45     < 0.01
Complexity (β2)   1.507         0.145            10.40     < 0.001
Model                                                      < 0.001   118.54   0.24

Dependent variable: Correct
Variable          Coefficient   Standard error   t         p         χ2       R2
Intercept (γ0)    3.551         0.235            15.13     < 0.001
NAISOP (γ1)       0.002         0.001            2.29      < 0.05
Complexity (γ2)   −2.055        0.150            −13.73    < 0.001
Model                                                      < 0.001   192.53   0.34

n = 379.
All hypotheses were supported.

• End users’ sophistication impacted their ability to avoid duplication in selecting only those business services required to accomplish the given task (t = −3.45, p < 0.01), thereby supporting hypothesis 1.
• Task complexity impacted end users’ ability to avoid duplication in selecting only those business services required to accomplish the given task (t = 10.40, p < 0.001), thereby supporting hypothesis 2.
• End users’ sophistication also impacted their ability to select the business services with the functionality needed to accomplish the given task (t = 2.29, p < 0.05), thereby establishing support for hypothesis 3.
• Task complexity impacted end users’ ability to select the business services with the functionality needed to accomplish the given task (t = −13.73, p < 0.001), thereby establishing support for hypothesis 4.
Furthermore, a simple t test revealed that the end users’ status (professional vs. student) had no significant impact on their ability to avoid duplication and to select the services needed although, as shown in Table 3, the groups differed significantly on demographics. This counterintuitive result is interesting because professionals, who have greater sophistication, were expected to perform better than students. The R2 values showed explanatory power of 24% for avoiding duplication and 34% for selecting correct business services.
Table 3. Demographics among undergraduates, graduate students, and professionals.

Group           Age (years)   Data analysis (0–100 scale)   Statistics classes   Data analysis classes   Excel experience (months)
Undergraduate   22            58                            2                    1                       10
Graduate        29            67                            4                    3                       28
Professional    38            86                            9                    7                       86
5 Discussion

We theorized that the end users’ sophistication (or lack thereof) and task complexity impact their ability to avoid duplication and select the business services with the functionality needed. Our empirical results support this premise. Furthermore, we highlight the significance of perceptual and objective aspects of the decision makers’ knowledge in forming the aggregate measure for their sophistication. In the marketing literature, only consumers’ knowledge of available product choices is typically considered as a decision parameter impacting their purchase behavior (Alba and Hutchinson 1987; Sujan 1985). This approach implicitly assumes that the customer knows about her needs, and it is not necessary to consider knowledge of the needs themselves explicitly. In contrast, the end user examined in this article requires knowledge of both her own needs and business services that can fulfill those needs. Thus, the end users’ knowledge of the requirement considered in this article (i.e., selection of business services needed to complete the requirement) is unique in that it needs to include both their knowledge of the requirements and their knowledge of the available choices. Therefore, in developing measures for the decision makers’ sophistication with respect to the task at hand, we need to include both dimensions of knowledge.

5.1 Theoretical contribution
lP
This study makes several theoretical contributions. We provide a conceptual framework for the choice problem faced by an end user in the selection of business services needed to fulfill a given requirement. This conceptualization offers the basis for the key premise in the article that end users’ sophistication impacts their performance, that is, the ability to avoid duplication and select the services needed. We conceptualized the need to consider both perceptual and objective aspects of end users’ knowledge in developing a measure for sophistication, as they represent two distinct dimensions of the decision makers’ knowledge. The aggregated NAISOP measure is pivotal in capturing end users’ sophistication in terms of both their knowledge of available business services and their knowledge of the requirements.
5.2 Managerial implications
Our work has several implications for managers. End users’ performance depends both on the knowledge of their requirements and on their knowledge of the functionality of available business services, not on merely one alone. For a profit-maximizing vendor, the realization of disparate end user sophistication allows the vendor to have different pricing strategies, such as a pure component strategy, a pure bundling strategy, or a mixed bundling strategy. Sophisticated end users may be more inclined to purchase individual business services because of their superior knowledge of available choices and the task itself. On the other hand, naive end users may be enticed to purchase bundles to help them mitigate the risk of leaving out any crucial functionality needed for the task at hand.

In the present study, we found that professionals do not perform significantly better than students in identifying the correct services for a given task. Also, students and professionals are equally prone to duplicating services. These findings suggest that most business customers would prefer bundles, which are more likely to include items that fulfill their needs. Unless there are high royalty costs involved, the marginal costs for the marketer are low. The combination of low marginal cost and the customer’s uncertainty about the value of a given service suggests that a pure bundling strategy is optimal for a market such as this (Basu and Vitharana 2009). Clearly, this recommendation is based on our examination of a specific domain, and further research is needed to determine if the results are similar in other domains.

5.3 Limitations and directions for future research
We considered two key variables, end users’ sophistication and requirement complexity, that could impact end users’ ability to avoid duplication and select the business services needed. We tried to mitigate threats to internal validity by putting necessary controls in place during the experiment. For example, participants were randomly assigned to simple and complex tasks on an even/odd basis. The website offered a control setting that included solicitation of demographic information, the instrument, the task, and the dataset. However, other variables, such as knowledge of the domain (in our case, university admissions), could also impact their performance. Representational methods for conventional reusable assets are shown to impact developers’ understanding of those assets (Frakes and Pole 1994). Likewise, the effectiveness of the business service descriptions (e.g., cataloging scheme implemented in the service repository) could also impact the end users’ ability to select the services and avoid duplication. Research has also shown that an improvement in media richness reduces the cost of information search (Maity et al. 2018), thereby enhancing the end user’s ability to select the services needed. Future research needs to account for other possible variables that could explain the variation in outcome performance.

One threat to external validity and generalizability stems from the selection of the participant pool. We tried to mitigate this threat by selecting a participant pool with a diverse background (e.g., students in accounting, finance, etc., and professionals across multiple organizations). Moreover, we used a participant pool with a broad background in terms of the number of statistics courses taken (range 0–20, mean 4.1, standard deviation 5.0) and months of Excel experience (range 0–100 months, mean 33.9 months, and standard deviation 38.5 months).³ Nonetheless, future research needs to include alternative domains and corresponding participant pools.
Conflict of interest. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
³ The survey used a slider whose maximum was set at 20 for statistics classes taken and at 100 for months of Excel experience.
References

Adams, W.J. and J.L. Yellen (1976), Commodity bundling and the burden of monopoly, Quarterly Journal of Economics, 90 (August), 475–498.
Alba, J.W. and J.W. Hutchinson (1987), Dimensions of consumer expertise, Journal of Consumer Research, 13(4), 411–454.
Basu, A. and P. Vitharana (2009), Impact of customer knowledge heterogeneity on bundling strategy, Marketing Science, 28(4), 792–801.
Berry, D.M. (1995), The importance of ignorance in requirements engineering, Journal of Systems and Software, 28(1), 179–184.
Browne, G.J. and M.B. Rogich (2001), An empirical investigation of user requirements elicitation: comparing the effectiveness of prompting techniques, Journal of Management Information Systems, 17(4), 223–249.
Browne, G.J. and V. Ramesh (2002), Improving information requirements determination: a cognitive perspective, Information & Management, 39(8), 625–645.
Brucks, M. (1985), The effects of product class knowledge on information search behavior, Journal of Consumer Research, 13 (June), 1–16.
Crowston, K. and E.E. Kammerer (1998), Coordination and collective mind in software requirements development, IBM Systems Journal, 37(2), 227–245.
Davidson, R. and J.G. MacKinnon (1993), Estimation and Inference in Econometrics, Oxford University Press, Oxford.
Daya, S., N. van Duy, K. Eati, C.M. Ferreira, D. Glozic, V. Gucer, M. Gupta, S. Joshi, V. Lampkin, M. Martins, S. Narain, and R. Vennam (2016), Microservices from Theory to Practice: Creating Applications in IBM Bluemix Using the Microservices Approach, IBM Redbooks.
Dean, D.H. (1999), Brand endorsement, popularity, and event sponsorship as advertising cues affecting consumer pre-purchase attitudes, Journal of Advertising, 28(3), 1–12.
Eppen, G.D., W.A. Hanson, and R.K. Martin (1991), Bundling – new products, new markets, low risk, Sloan Management Review, 32(4), 7–14.
Frakes, W.B. and T.P. Pole (1994), An empirical study of representation methods for reusable software components, IEEE Transactions on Software Engineering, 20(8), 617–630.
Ghosh, B. and S. Balachander (2007), Competitive bundling and counterbundling with generalist and specialist firms, Management Science, 53(1), 159–168.
Greene, W.H. (2002), Econometric Analysis, fifth edition, Prentice Hall, Upper Saddle River.
Haines, M.N. and M.A. Rothenberger (2010), How a service-oriented architecture may change the software development process, Communications of the ACM, 53(8), 135–140.
Harris, J. and E.A. Blair (2006), Consumer preference for product bundles: the role of reduced search costs, Journal of the Academy of Marketing Science, 34(4), 506–513.
Hitt, L.M. and P. Chen (2005), Bundling with customer self-selection: a simple approach to bundling low-marginal-cost goods, Management Science, 51(10), 1481–1493.
Johnston, J. (1984), Econometric Methods, third edition, McGraw-Hill, New York.
Kessler, E. (2008), Assembling COTS software in a certifiable safety-critical domain, Information Systems Journal, 18(3), 299–324.
Kraft, D.H. and T. Lee (1979), Stopping rules and their effect on expected search length, Information Processing & Management, 15, 47–58.
Ko, A.J., B.A. Myers, M.J. Coblenz, and H.H. Aung (2006), An exploratory study of how developers seek, relate, and collect relevant information during software maintenance tasks, IEEE Transactions on Software Engineering, 32(12), 971–987.
Lilien, G.L., A. Rangaswamy, G.H. Van Bruggen, and K. Starke (2004), DSS effectiveness in marketing resource allocation decisions: reality vs. perception, Information Systems Research, 15(3), 216–235.
Maity, M., M. Dass, and P. Kumar (2018), The impact of media richness on consumer information search and choice, Journal of Business Research, 87, 36–45.
Moody, J.W., J.E. Blanton, and P.H. Cheney (1998), A theoretically grounded approach to assist memory recall during information requirements determination, Journal of Management Information Systems, 15(1), 79–98.
Ravichandran, T. and M.A. Rothenberger (2003), Software reuse strategies and component markets, Communications of the ACM, 46(8), 109–114.
Ross, S. (2006), A First Course in Probability, seventh edition, Prentice Hall, Upper Saddle River.
Schmalensee, R. (1984), Gaussian demand and commodity bundling, The Journal of Business, 57(1), S211–S230.
Shaft, T.M. and I. Vessey (2006), The role of cognitive fit in the relationship between software comprehension and modification, MIS Quarterly, 30(1), 29–55.
Singleton, A. (2016), The economics of microservices, IEEE Cloud Computing, 3(5), 16–20.
Sujan, M. (1985), Consumer knowledge: effects on evaluation strategies mediating consumer judgments, Journal of Consumer Research, 12(1), 31–46.
Theil, H. (1971), Principles of Econometrics, Wiley, New York.
Venkatesh, R. and V. Mahajan (1993), A probabilistic approach to pricing a bundle of products or services, Journal of Marketing Research, 30(4), 494–508.
Vitharana, P. (2003), Risks and challenges of component-based software development, Communications of the ACM, 46(8), 67–72.
Vitharana, P., F.M. Zahedi, and H. Jain (2003a), Design, retrieval, and assembly in component based software development, Communications of the ACM, 46(11), 97–102.
Vitharana, P., F.M. Zahedi, and H. Jain (2003b), Knowledge-based repository scheme for storing and retrieving business components: a theoretical design and an empirical analysis, IEEE Transactions on Software Engineering, 29(7), 649–664.
Vitharana, P., F.M. Zahedi, and H.K. Jain (2016), Enhancing analysts’ mental models for improving requirements elicitation: a two-stage theoretical framework and empirical results, Journal of the Association for Information Systems, 17(12), 1.
Yadav, M.S. and K.B. Monroe (1993), How buyers perceive savings in a bundle price: an examination of a bundle’s transaction value, Journal of Marketing Research, 30(3), 350–358.
Zellner, A. (1962), An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias, Journal of the American Statistical Association, 57(298), 348–368.
Appendix A: A model and an empirical test⁴

In this appendix, we provide a simple model based on search behavior and stopping time to derive the hypotheses examined in this article. We consider a decision maker who is selecting services that can fulfill a predetermined set of functions. To simplify the model, we assume that there is a large pool of services, and the decision maker selects services one at a time until she is confident she has picked all the services needed for the given application. This process may proceed in stages, where the decision maker selects some services, reviews them, and decides whether to stop or continue the search. Each time the decision maker picks a service, three cases are possible:

1. The decision maker picks a service that is not relevant to the given application. We call this event an incorrect pick, and denote the probability this happens by $q_1$. We assume that if an incorrect pick occurs, the decision maker eventually recognizes that fact and continues the search.

2. The decision maker picks a service that is relevant to the given application but, after picking, she is not confident it is relevant. As the decision maker stops her search only when she is confident she has picked all services she needs, the event where she picks a relevant service but is not confident it is relevant leads to her picking at least another item that performs the same function. Thus, the item originally picked becomes a duplicate pick. We denote the probability of this event by $q_2$.
3. The decision maker picks a service that is relevant, and is also confident that it is relevant. The probability this happens is $p = 1 - q_1 - q_2$. Clearly, the last item selected must meet this criterion.

If the decision maker always picks the correct services and is confident in her selection, we will have $p = 1$ and $q_1 = q_2 = 0$. Otherwise, $q_1$ or $q_2$ should exceed zero. The related literature in consumer psychology (Alba and Hutchinson 1987; Brucks 1985; Sujan 1985) suggests that $q_1$ should be higher for less knowledgeable decision makers. By the same logic, we posit that $q_2$ should also be higher for less knowledgeable decision makers and that, for the same level of decision maker knowledge, $q_1$ and $q_2$ should be higher for more complex requirements.

⁴ We thank the associate editor for the suggestion to place the analytical model in an appendix and provide the corresponding empirical test.

Formally, we assume that the decision maker is not confident with any incorrect pick, and she stops the search after she has picked r services she is confident are useful for the given requirement. As even simple requirements may involve many functions, we assume that r is the same for all cases, simple or complex. Consider first the dichotomy where, after picking, the decision maker is either confident or not confident that the service is useful for the application. The probability that she stops her search after picking n services is the same as the probability that she picks exactly n services to have r services, including the last service picked, that she is confident are useful. The probability of this event is given by the negative binomial probability distribution; that is,

(A1) $P(n) = \binom{n-1}{r-1}\, p^{r} (1-p)^{n-r}.$
Kraft and Lee (1979) used this distribution to model search lengths for an information retrieval system. The expected search length is given by

(A2) $E(n) = \dfrac{r}{p}$

(see, e.g., Ross 2006, page 187).
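As a quick numerical check of (A1) and (A2), the following Python sketch sums the stopping probabilities and compares the resulting mean search length with r/p. The function names and the values r = 5 and p = 0.6 are illustrative assumptions, not quantities estimated in the study, and the infinite sum is truncated at a point where the remaining tail is negligible for these values.

```python
from math import comb


def stop_probability(n: int, r: int, p: float) -> float:
    """P(n) from (A1): probability that the search stops after exactly n picks."""
    if n < r:
        return 0.0  # at least r picks are needed to collect r confident picks
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


def expected_search_length(r: int, p: float) -> float:
    """E(n) from (A2)."""
    return r / p


# Illustrative values (assumptions, not taken from the study).
r, p = 5, 0.6
n_max = 400  # truncation point for the infinite sum

total = sum(stop_probability(n, r, p) for n in range(r, n_max + 1))
mean = sum(n * stop_probability(n, r, p) for n in range(r, n_max + 1))

print(f"sum of P(n): {total:.6f}")
print(f"mean search length: {mean:.4f} vs r/p = {expected_search_length(r, p):.4f}")
```

The two printed means should agree up to floating-point error, which is what (A2) asserts for any 0 < p ≤ 1.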
Expected numbers of incorrect and duplicate picks. Suppose, after picking n services, the decision maker is confident she has picked r services that are useful and stops her search. Thus, she also picked (n − r) services that are either incorrect or duplicate picks. For each of these (n − r) picks, the probability that it is incorrect is $\dfrac{q_1}{q_1 + q_2} = \dfrac{q_1}{1-p}$, and the probability that it is a duplicate is $\dfrac{q_2}{1-p}$. Hence, if the search stops after n steps, the expected numbers of incorrect and duplicate picks are $\dfrac{q_1}{1-p}(n-r)$ and $\dfrac{q_2}{1-p}(n-r)$, respectively. Since the probability of stopping after n steps is P(n), the expected number of incorrect picks is given by

(A3) $\displaystyle\sum_{n=r}^{\infty} \frac{q_1}{1-p}\,(n-r)\,P(n) = \frac{q_1}{1-p}\,[E(n) - r] = \frac{q_1}{1-p}\left(\frac{r}{p} - r\right) = \frac{q_1 r}{p} = \frac{q_1 r}{1 - q_1 - q_2}.$

Similarly, the expected number of duplicates is $\dfrac{q_2 r}{1 - q_1 - q_2}$.
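The closed forms in (A3) can be checked against a direct simulation of the three-outcome search process described above. The sketch below is only an illustration: the parameter values (r = 5, q1 = 0.2, q2 = 0.1), the number of simulated searches, and the function names are assumptions chosen for the example, not values from the experiment.

```python
import random


def simulate_search(r: int, q1: float, q2: float, rng: random.Random):
    """Simulate one search; return (incorrect, duplicate) counts at stopping."""
    correct = incorrect = duplicate = 0
    while correct < r:  # stop once r confident, relevant picks are made
        u = rng.random()
        if u < q1:
            incorrect += 1      # case 1: irrelevant service
        elif u < q1 + q2:
            duplicate += 1      # case 2: relevant, but not confident
        else:
            correct += 1        # case 3: relevant and confident
    return incorrect, duplicate


def expected_counts(r: int, q1: float, q2: float):
    """Expected incorrect and duplicate picks from (A3)."""
    p = 1 - q1 - q2
    return q1 * r / p, q2 * r / p


# Illustrative values (assumptions, not estimates from the study).
rng = random.Random(0)
r, q1, q2 = 5, 0.2, 0.1
runs = 50_000

results = [simulate_search(r, q1, q2, rng) for _ in range(runs)]
sim_inc = sum(inc for inc, _ in results) / runs
sim_dup = sum(dup for _, dup in results) / runs
exp_inc, exp_dup = expected_counts(r, q1, q2)

print(f"incorrect picks: simulated {sim_inc:.3f}, (A3) {exp_inc:.3f}")
print(f"duplicate picks: simulated {sim_dup:.3f}, (A3) {exp_dup:.3f}")
```

Raising q1 or q2 in `expected_counts` while holding the other fixed increases both expected counts, because either change lowers p in the denominator.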
Since we posit that $q_1$ increases if the requirement is more complex or the decision maker is more naive, the expected number of incorrect picks should also increase. We have a similar result for duplicate picks. Summarizing, we have the following hypotheses:

Hypothesis 1. If the level of requirement complexity is the same, a more naive end user will have a larger number of duplications than a less naive end user.

Hypothesis 2. If the level of end user sophistication is the same, the number of duplications will increase with requirement complexity.
Hypothesis 3. If the level of requirement complexity is the same, a more naive end user will have a larger number of incorrect picks than a less naive end user.

Hypothesis 4. If the level of end user sophistication is the same, the number of incorrect picks is larger for greater requirement complexity.

Distributions of numbers of incorrect picks and duplications. We now show that under our assumptions, the numbers of incorrect and duplicate picks also follow negative binomial distributions. Let y and z denote the numbers of incorrect picks and duplications when a given search ends with r correct picks. Thus, for the first (r − 1 + y + z) picks, there are (r − 1) correct picks, y incorrect picks, and z duplicate picks, and pick number (r + y + z) (the last pick) is a correct pick. As each pick is independent, the probability of this occurring is given by

(A4) $\left[\dfrac{(r-1+y+z)!}{(r-1)!\,y!\,z!}\; p^{r-1} q_1^{y} q_2^{z}\right] \times p = \dfrac{(r-1+y+z)!}{(r-1)!\,y!\,z!}\; p^{r} q_1^{y} q_2^{z}.$

The term in brackets in the above expression is the multinomial probability of observing (r − 1) correct picks, y incorrect picks, and z duplicate picks among the first (r − 1 + y + z) picks (Ross 2006, page 267). Hence, the probability that a given value y of incorrect picks occurs is
$$
\begin{aligned}
\phi(y) &= \sum_{z=0}^{\infty} \frac{(r-1+y+z)!}{(r-1)!\,y!\,z!}\; p^{r} q_1^{y} q_2^{z} \\
&= \frac{p^{r} q_1^{y}}{(r-1)!\,y!} \sum_{z=0}^{\infty} \frac{(r-1+y+z)!}{z!}\; q_2^{z} \\
&= \frac{(r-1+y)!}{(r-1)!\,y!}\; p^{r} q_1^{y} (1-q_2)^{-(r-1+y)} \sum_{z=0}^{\infty} \frac{(r-1+y+z)!}{z!\,(r-1+y)!}\; q_2^{z} (1-q_2)^{r-1+y} \\
&= \frac{(r-1+y)!}{(r-1)!\,y!}\; p^{r} q_1^{y} (1-q_2)^{-(r+y)} \sum_{z=0}^{\infty} \left[\frac{(r-1+y+z)!}{z!\,(r-1+y)!}\; q_2^{z} (1-q_2)^{r-1+y}\right] \times (1-q_2) \\
&= \frac{(r-1+y)!}{(r-1)!\,y!}\; p^{r} q_1^{y} (1-q_2)^{-(r+y)} \times \sum_{z=0}^{\infty} \psi(z),
\end{aligned}
$$

where ψ(z) is the probability, for a negative binomial process with probability of success $(1 - q_2)$ that stops after (r + y) cases of success, that there are z cases of failure before the process stops. Thus, $\sum_{z=0}^{\infty} \psi(z) = 1$; that is,

(A5) $\phi(y) = \dfrac{(r-1+y)!}{(r-1)!\,y!}\; p^{r} q_1^{y} (1-q_2)^{-(r+y)} = \dfrac{(r-1+y)!}{(r-1)!\,y!}\; p_1^{r}\, p_2^{y},$

where $p_1 = \dfrac{p}{1-q_2}$ and $p_2 = \dfrac{q_1}{1-q_2}$.
Since $p_1 + p_2 = 1$, it follows from (A5) that (y + r) is the length of a negative binomial process that stops after r cases of success, where the probability of success in each trial is $p_1$. Hence, the number of incorrect picks y is generated by a negative binomial process. Similarly, the number of duplicate picks z is also generated by a negative binomial process.
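The step from (A4) to (A5) can also be verified numerically: summing the joint probability in (A4) over z should reproduce the negative binomial form in (A5). The sketch below does this for a few values of y. The parameter values and function names are illustrative assumptions, and the sum over z is truncated at 200 terms, beyond which the contribution is negligible for q2 = 0.1.

```python
from math import factorial


def joint_prob(y: int, z: int, r: int, q1: float, q2: float) -> float:
    """(A4): y incorrect and z duplicate picks before the r-th correct pick."""
    p = 1 - q1 - q2
    coef = factorial(r - 1 + y + z) // (
        factorial(r - 1) * factorial(y) * factorial(z)
    )
    return coef * p**r * q1**y * q2**z


def marginal_incorrect(y: int, r: int, q1: float, q2: float) -> float:
    """(A5): negative binomial probability of y incorrect picks."""
    p = 1 - q1 - q2
    p1, p2 = p / (1 - q2), q1 / (1 - q2)
    coef = factorial(r - 1 + y) // (factorial(r - 1) * factorial(y))
    return coef * p1**r * p2**y


# Illustrative values (assumptions, not estimates from the study).
r, q1, q2 = 5, 0.2, 0.1
for y in range(4):
    summed = sum(joint_prob(y, z, r, q1, q2) for z in range(200))
    closed = marginal_incorrect(y, r, q1, q2)
    print(f"y={y}: sum over z = {summed:.8f}, (A5) = {closed:.8f}")
```

Swapping the roles of q1 and q2 in `marginal_incorrect` gives the corresponding distribution for the number of duplicate picks.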
Empirical test. We now present results obtained with negative binomial regression with dependent variables Duplicates and Correct. The results, obtained with the glm.nb function from the MASS package in R version 3.3.1, are presented in Table A1.
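For readers who wish to relate the columns of Table A1 to one another, the following sketch recomputes the Wald z statistic (estimate divided by standard error) and its two-sided p-value for three of the reported coefficients. It is not the authors' estimation code: the estimates and standard errors are copied from Table A1, the p-values assume a standard normal reference distribution, and the rate-ratio interpretation assumes the default log link of glm.nb and a 0/1 coding of Complexity, neither of which is stated explicitly in the text.

```python
from math import erfc, exp, sqrt


def wald_z(estimate: float, std_error: float) -> float:
    """Wald statistic: coefficient estimate divided by its standard error."""
    return estimate / std_error


def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal reference distribution."""
    return erfc(abs(z) / sqrt(2))


# (estimate, standard error) pairs as reported in Table A1.
rows = {
    "Duplicates ~ NAISOP": (-0.00077, 0.00028),
    "Duplicates ~ Complexity": (0.4798, 0.05900),
    "Correct ~ NAISOP": (0.00056, 0.00029),
}

for name, (est, se) in rows.items():
    z = wald_z(est, se)
    print(f"{name}: z = {z:.3f}, p = {two_sided_p(z):.3g}")

# Under a log link, exp(coefficient) is a multiplicative effect on the
# expected count (assumption: Complexity is coded 0 = simple, 1 = complex).
print(f"rate ratio for Complexity (Duplicates): {exp(0.4798):.3f}")
```

Small differences between the recomputed z values and those in Table A1 are expected because the reported estimates and standard errors are rounded.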
Table A1. Results of negative binomial regression.

Dependent variable: Duplicates

Variable           Estimate    Standard error   z         p
Intercept (α0)     1.1145      0.09016          12.36     < 2 × 10^−16
NAISOP (α1)        −0.00077    0.00028          −2.748    0.006
Complexity (α2)    0.4798      0.05900          8.13      4.2 × 10^−16
Model: χ² = 74.36 (df = 2), p < 0.001

Dependent variable: Correct

Variable           Estimate    Standard error   z         p
Intercept (β0)     1.2353      0.09216          13.40     < 2 × 10^−16
NAISOP (β1)        0.00056     0.00029          1.892     0.0585
Complexity (β2)    −0.7111     0.06309          −11.27    2 × 10^−16
Model: χ² = 138.20 (df = 2), p < 0.001
Comparison shows that these results are consistent with the results presented in Table 2 and support all four hypotheses.

Appendix B: Description of task

You are given a dataset that has information about all the universities in the United States. Your supervisor has asked you to investigate various aspects of the data to gain greater insights. Once the analysis is complete, you are required to give a presentation to your supervisor by including charts illustrating your findings.

Select the most cost-effective business service or services needed to complete each of the following tasks, and in one or two sentences briefly describe why each service is required.

Important note: You are asked to select only the service or services needed to complete the given task and give the subsequent presentation to your supervisor. You are not asked to actually do the data analysis and give the presentation.

Simple task
Task 1: Find the average and standard deviation of the number of applicants, admissions, and enrolled students for private schools. Find the average and standard deviation of the number of applicants, admissions, and enrolled students for public schools. How do these numbers compare? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 2: Find the average and standard deviation of the number of applicants, admissions, and enrolled students in each geographical region. How do these numbers compare? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 3: Find the average and standard deviation of the admissions-to-applicants ratio, and the enrolled students to admissions ratio for each private/public and geographical region subset. (For example, private schools in the northeast is a subset.) How do these numbers compare? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 4: Find the average SAT Math 75th percentile score and divide the data into two groups: (1) at or above this average; (2) below this average. Compare these two groups in terms of the admissions-to-applicants ratio. (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 5: In New York State, identify schools with the ten highest tuition rates. For these ten schools, find the average, maximum, and minimum admission rates. (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Complex task
Task 1: How does the undergraduate graduation rate (within 4 years) depend on the following three features of a school: size of the school, tuition rate, and SAT Math 75th percentile score of entrants? Your model should include the three factors (size, tuition rate, SAT Math 75th percentile score) simultaneously. (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 2: Beyond the three factors listed in task 1 above, other factors may also affect the graduation rate. Overall (considering all factors simultaneously), what are the three most important factors (features/characteristics of the school) that determine the undergraduate graduation rate? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 3: How does tuition depend on the following three factors considered at the same time: geographical area, the nature of the school (public or private), and selectivity (defined as total number of admissions divided by total number of applicants)? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 4: How does the percentage of freshmen receiving financial aid depend on the type of school (private vs. public), tuition rate, and percentage of students who graduate? (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Task 5: How does the percentage of female students depend on geographical location? Statistically test if the percentage of female students is the same for all geographical locations. (Remember, after you complete the task, your supervisor wants you to present your findings.)
Which service or services do you need to complete this task? (Check all that apply.)

Appendix C: Business services

Core Studio (price $200). This service can be used to enter data into a spreadsheet and save them in a file. This file can then be opened again. It offers the ability to sort, filter, and conditionally format data. It offers the ability to make logical comparisons between two values. The features include the following: (1) create a new (blank) spreadsheet file, (2) enter data, (3) save file, (4) open an existing file, (5) sort data, (6) filter data, (7) conditionally format data, and (8) make logical comparisons between two values.

Chart Expert (price $300). This service creates charts such as line, pie, column, bar, area, scatter, and clustered column charts in a variety of colors. In addition, it creates geographical maps. It offers the ability to label axes and insert a title for the chart. The features include the following: (1) create a new line, pie, column, bar, area, scatter, or clustered column chart or geographical map, (2) open an existing chart or a geographical map, (3) format charts and maps using colors and number formatting, and (4) format axis labels and chart/map titles.
Pivot Professional (price $400). This service creates pivot tables for summarizing, aggregating, and cross-tabulating data. Subsets of data can be created and formatted to illustrate simple statistics such as means, medians, counts, maximums, and minimums. It has an easy-to-use drag-and-drop capability to visually create new tables for data subsets. It offers the ability to create charts such as line, pie, column, bar, area, scatter, and clustered column charts in a variety of colors. In addition, it creates geographical maps. It offers the ability to label axes and insert a title for the chart. The features include the following: (1) create a new pivot table, (2) open an existing pivot table, (3) create subsets of larger datasets to summarize, aggregate, and cross-tabulate data, (4) format pivot tables with simple statistics such as means, medians, counts, maximums, and minimums, (5) filter data in the pivot table, (6) easy-to-use drag-and-drop capability to visually create new tables for data subsets, (7) using data in the pivot table, create a new line, pie, column, bar, area, scatter, or clustered column chart or a geographical map, (8) open an existing chart or a geographical map, (9) format charts and maps using colors and number formatting, and (10) format axis labels and chart/map titles.

Text Authority (price $300). This is a comprehensive service for manipulating text. Text Authority offers all the text functions users need to work with text in your data. The features include the following: (1) join several texts into one long text string, (2) segment text strings that are separated by a delimiter (e.g., fruit/plumb), (3) determine the number of characters in a text string, (4) replace characters within a text string (replace “a” with “b”), (5) substitute new text for old text in a text string (changing “le” to “ility” changes “able” to “ability”), (6) remove spaces from text, and (7) convert lowercase to uppercase (and vice versa).

AlphaStatistics (price $400). This has simple statistical functionality. AlphaStatistics allows users to report basic statistics and conduct simple statistical analysis of their data. This service is ideal for those who need only basic statistical features. The features include the following: (1) calculate minimum, maximum, average, median, mode, standard deviation, and variance and (2) calculate correlation, covariance, moving averages, and confidence intervals.
PinnacleStats (price $600). This has a comprehensive set of functions to conduct a wide array of statistical analyses. PinnacleStats is your ultimate statistical suite for all your statistical needs. This service is ideal for those who need to conduct advanced statistical analyses. The features include the following: (1) calculate minimum, maximum, average, median, mode, standard deviation, and variance, (2) calculate correlation, covariance, moving averages, and confidence intervals, (3) ANOVA, (4) multivariate ANOVA, (5) F test, t test, and z test, (6) structural equation modeling, (7) regression, (8) time-series analysis, (9) χ² test, (10) discriminant analysis, (11) binary logit, and (12) factor analysis.

Appendix D: Measurement of knowledge of data analysis
The items of the knowledge of data analysis construct were measured on a semantic differential continuous scale from 0 (very low) to 100 (very high).

My knowledge of data analysis

Data analysis refers to the process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.

DA1: I characterize my understanding of the various aspects of data analysis as:
DA2: I characterize my understanding of what data analysis involves as:
DA3: I characterize my expertise in data analysis as:
DA4: I characterize my ability to answer questions on data analysis as:
DA5: In general, I characterize my overall knowledge of data analysis as:
Appendix E: Development of NAISOP measure

This appendix details the measurement model that uses perceptual knowledge of data analysis together with the objective measures of statistics classes and Excel experience (in months) to derive the second-order NAISOP measure (Table E1).

Table E1. Factor loadings for NAISOP.

                                  Loading   t         p
1st-order DA (data analysis)
  DA1                             0.954     159.467   0.000
  DA2                             0.950     138.227   0.000
  DA3                             0.957     153.884   0.000
  DA4                             0.968     215.751   0.000
  DAGEN                           0.983     281.129   0.000
1st-order STATEXC (statistics classes and months of Excel experience)
EXCMNTS
STATCLA
0.657
12.745
0.000
0.744
16.386
0.000
0.669
15.748
0.000
0.816
22.688
0.000
2nd-order NAISOP
DA
STATEXC
Padmal Vitharana is a professor of information systems in the Martin J. Whitman School of Management at Syracuse University. He received his PhD degree from the University of Wisconsin-Milwaukee. His research expertise lies in system analysis and design. His research has been published in leading journals, such as the IEEE Transactions on Software Engineering, IEEE Transactions on Systems, Man, and Cybernetics, Journal of Management Information Systems, Marketing Science, Journal of the Association for Information Systems, Communications of the ACM, Database for Advances in Information Systems, Communications of the Association for Information Systems, Information Resource Management Journal, and Information & Management.

Amiya Basu is a professor of marketing in the Martin J. Whitman School of Management at Syracuse University. He received his PhD degree from the Stanford Graduate School of Business of Stanford University. His research interests include salesforce compensation, pricing, and stochastic models. His research has been published in Marketing Science, Journal of Marketing Research, Journal of Retailing and International Journal of Research in Marketing.