Innovation analytics: Leveraging artificial intelligence in the innovation process




Business Horizons (xxxx) xxx, xxx

Available online at www.sciencedirect.com

ScienceDirect www.journals.elsevier.com/business-horizons

Chinmay Kakatkar a, Volker Bilgram a, Johann Füller a,b,*

a HYVE AG, Schellingstrasse 45, 80799 Munich, Germany
b University of Innsbruck, 6020 Innsbruck, Austria

KEYWORDS Innovation analytics; Artificial intelligence; Innovation process; Front-end innovation; Machine learning

Abstract Artificial intelligence (AI) is about imbuing machines with a kind of intelligence that is mainly attributed to humans. Extant literature, coupled with our experiences as practitioners, suggests that while AI may not be ready to completely take over highly creative tasks within the innovation process, it shows promise as a significant support to innovation managers. In this article, we broadly refer to the derivation of computer-enabled, data-driven insights, models, and visualizations within the innovation process as innovation analytics. AI can play a key role in the innovation process by driving multiple aspects of innovation analytics. We present four different case studies of AI in action based on our previous work in the field. We highlight benefits and limitations of using AI in innovation and conclude with strategic implications and additional resources for innovation managers. © 2019 Kelley School of Business, Indiana University. Published by Elsevier Inc. All rights reserved.

* Corresponding author
E-mail addresses: [email protected] (C. Kakatkar), [email protected] (V. Bilgram), johann. [email protected] (J. Füller)

https://doi.org/10.1016/j.bushor.2019.10.006
0007-6813/© 2019 Kelley School of Business, Indiana University. Published by Elsevier Inc. All rights reserved.

1. The promise of AI

Artificial intelligence (AI) is currently a popular topic in business and its application has been explored across disciplines by academics and practitioners (Chui et al., 2018; Kietzmann & Pitt, 2020). One recurring conclusion is that AI will affect some business activities more than others, depending on the degree of creativity inherent to the activity. The higher the level of creativity, the harder it will be for AI to add value. This poses an interesting dilemma for managers responsible for driving the process of innovation within firms: Can AI significantly support the innovation process even if highly creative tasks cannot be completely automated?

AI is about imbuing machines with a kind of intelligence that is mainly attributed to humans (Russell & Norvig, 2010). Arguably, AI can play the role of creative enabler and partner of the innovation manager across the data-driven innovation process. As with the predominant use of AI in other practical applications, considerable value can be captured using AI to automate mundane yet effort-intensive tasks (e.g., smart data gathering, web crawling, data cleansing). Taken further, data scientists can help innovation managers leverage unstructured big data (consisting of text, sound, and video), which is ubiquitous in the innovation process (Davenport & Patil, 2012; George, Osinga, Lavie, & Scott, 2016; Paschen, Pitt, & Kietzmann, 2020).

In this article, we describe how AI can realistically be deployed at the fuzzy front end of innovation. We discuss how AI can enable innovation analytics, a term we use to describe the derivation of computer-enabled, data-driven insights, models, and visualizations within the innovation process. To frame our discussion, we consider the innovation process as a double diamond model that spans the exploration and selection of concepts in the problem and solution space. We present four case studies of AI in action, one for each part of the innovation process, based on our previous work in the field. The project teams in these case studies consist not only of traditional innovation practitioners but also data scientists that help the teams unlock the potential of AI, demonstrating how AI-enabled innovation analytics can yield richer insights in a cost-effective manner. We conclude with implications for innovation managers and highlight the benefits as well as the limitations of using AI.

Figure 1. An iteration of the innovation process at the fuzzy front end

2. The innovation process as a double diamond

The fuzzy front end of the innovation process entails a convergence-divergence dynamic that spans problems and solutions; this representation is often called the double diamond, made famous by a similarly named visual framework developed by the U.K. Design Council (Howard, Culley, & Dekoninck, 2008). The double diamond captures a generic process that is widely applicable to innovation practices and helps frame our subsequent discussion. Crucially, the double diamond breaks down the innovation process into phases that can be directly mapped to different use-cases of AI-enabled innovation analytics. Figure 1 presents one iteration of the double diamond, which will serve as our reference point for the fuzzy front end of the innovation process.

The double diamond consists of two basic dichotomies (Dorst & Cross, 2001). First, we differentiate between problems and solutions. A problem can be defined as the unmet need of a given stakeholder (e.g., a customer or user of a company’s product). A solution is the tangible or intangible innovation that can solve a given problem. Second, we differentiate between exploration and selection. Exploration captures the notion of generating and discovering new insights, while selection involves filtering and combining the insights.

Mapping these dichotomies produces the four distinct phases of the double diamond. In the problem exploration phase, we seek to understand the full range of possible problems in our given innovation context. The problem selection phase whittles down the identified problem set to a more tractable subset based on several criteria related to business constraints (e.g., availability of budget and necessary capabilities, opportunity costs), predefined strategic objectives (e.g., achieving operational efficiency, boosting growth), and other outside influences (e.g., market demands). In the solution exploration phase, the shortlist of problems acts as a reference point for generating new solutions and discovering or reframing existing solutions that may be relevant. The shortlist of problems focuses and limits the exploration of solutions; this is different from the situation in problem exploration in which the point of reference may be something more abstract (e.g., negative feelings customers may have toward a product). Finally, the solution selection phase picks out the most promising solutions to be processed further. Solutions are typically selected by a panel of subject matter experts and top executives, judging on areas such as novelty, feasibility, and commercial viability.

3. AI for innovation analytics

Computers often play a distinctly servile role within the innovation process, used predominantly to handle tasks that innovation managers might perceive as too monotonous or arduous to do manually. Such tasks can involve routine data processing along predefined procedures, replicating results, and storing data in various file formats and databases. AI promises to fundamentally expand the role of technology in the innovation process by elevating computers from mere servants to partners, thereby empowering humans to further express creative strengths and values. In addition to the capabilities of ‘normal’ computers, AI-powered computers can perform deeper analyses of data (e.g., recognizing patterns in data, deriving latent variables, spotting anomalies), help make decisions under uncertainty (e.g., generating predictions, dealing with information asymmetry), and improve over time by continuously incorporating external feedback.

Data scientists, with their rare and potent mix of AI-relevant programming skills, statistical know-how, and commercial acumen (Davenport & Patil, 2012), can help innovation managers unlock the potential value of AI in innovation projects. AI can substantially influence at least four main drivers of innovation analytics: (1) specification of objectives, (2) data collection and preparation, (3) modeling, and (4) value capture. These aspects are discussed in subsections 3.1 to 3.4.

3.1. Specification of objectives

The objectives of using innovation analytics in a given situation can be mapped to concrete, AI-enabled analyses, which can generally be broken down into four categories (Wedel & Kannan, 2016):

1. Descriptive analysis is exploratory in nature and concerned with summarizing and visualizing historical data;
2. Diagnostic analysis uses past data to establish links between different concepts or events, which allows the innovation manager to drill down into specific parts of the data and combine them to develop and test hypotheses;
3. Predictive analysis synthesizes past and even real-time data to build models that can forecast or guess the future state of variables the innovation manager may be interested in; and
4. Prescriptive analysis not only predicts the future but is also opinionated in the sense that it can recommend what to do in the future and how to do it.

3.2. Collection and preparation of data

Several types of data can be collected from various sources throughout the innovation process. Some of this is structured data, which can be neatly represented as tables of well-defined rows (i.e., the observations) and columns (i.e., the variables or attributes). Structured data has long been a staple of innovation analytics, and many established statistical methods for regression and classification analyses depend on such data. Yet, a significant amount of unstructured data is also collected during the innovation process from textual, audio, and video sources. Unstructured data needs to be parsed and coded in some way to extract underlying nuggets of information (e.g., semantics, tone, central themes). So far, the analysis of unstructured data in innovation research has heavily, if not totally, relied on labor-intensive, qualitative methods like netnography (Kozinets, 2002). AI-based methods for text mining offer a complementary approach to analyzing unstructured data (Wedel & Kannan, 2016).
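To make the step from unstructured to structured data concrete, the following minimal sketch turns free-text posts into a table of word counts, with one row per post and one column per word. It is an illustration only: the posts are invented, the function names are hypothetical, and a simple bag-of-words count is assumed as the coding scheme rather than any method used in the projects described in this article.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(posts):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for post in posts for tok in tokenize(post)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def to_count_vector(post, vocabulary):
    """Return one table row: how often each vocabulary word occurs in the post."""
    counts = Counter(tokenize(post))
    row = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            row[vocabulary[tok]] = n
    return row

# Hypothetical forum posts, invented for illustration only.
posts = [
    "My skin feels dry after every shower",
    "Dry skin again, this lotion does not help",
]
vocabulary = build_vocabulary(posts)
table = [to_count_vector(post, vocabulary) for post in posts]
```

Each row of `table` can then be fed to the regression and classification methods that expect structured input, at the cost of discarding word order and context.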

3.3. Modeling

Modeling the real world is a central motivation behind using AI for innovation analytics. Figure 2 shows a two-by-two matrix for organizing common algorithms and determining when to use them. One dimension of the matrix distinguishes between regression and classification algorithms. Regression algorithms are used to analyze continuous or ordinal output data (e.g., the sales forecast of a newly launched product), whereas classification algorithms are highly suitable for unordered categorical output data (e.g., predicting the key customer segment for a new product). The other dimension of the matrix differentiates between supervised and unsupervised learning. In supervised learning, a dataset containing the mapping between input and output data is given, and the task is to derive a mathematical function that best approximates this mapping. In unsupervised learning, output data is not labeled as such, and the task is to identify patterns in the data in a more exploratory manner. Supervised learning could be used to train a model on historical sales data in order to predict the success of new product launches, whereas unsupervised learning might use clustering techniques to discover commonalities and anomalies in the data.

Figure 2. Organizing AI algorithms by type and labeling of output data

3.4. Value capture

It is important to capture the value that is created using AI-based innovation analytics. Value can be captured at the stages of AI output and the corresponding model, and as the innovation team reflects on the implications of the output of the model. The output of descriptive and diagnostic analyses can encompass a range of visualizations, from basic tables and graphs to tree-based and network-based diagrams of latent structures in the data; the latter especially emphasizes the value of AI. Meanwhile, in predictive and prescriptive analyses, key parts of the output could include the predicted future state, the amount of confidence that can be ascribed to this prediction, and the importance or weighting assigned to each of the input variables in deriving the prediction (Varian, 2014). Besides the output, the model itself can be captured for future analysis (e.g., by saving the matrix of input variable weightings as a digital file format), which has several advantages: The model can be reused, critiqued, and shared with others. Finally, the innovation team can also capture value by comparing their insights to those produced by the AI. Whereas the insights will likely overlap to some degree, the AI may also yield insights that are new to the innovation team.
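The distinction between supervised and unsupervised learning drawn in Section 3.3 can be sketched on a toy dataset. The example below is a simplified illustration under stated assumptions: the sales figures and the labels "hit" and "flop" are invented, a nearest-centroid rule stands in for a supervised classifier, and a basic two-cluster k-means loop stands in for unsupervised clustering.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Supervised: labeled examples (input x, output label) are given.
def fit_centroids(xs, labels):
    """Learn one centroid per label from labeled 1-D inputs."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: mean(vals) for label, vals in groups.items()}

def predict(x, centroids):
    """Assign x to the label whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

# Unsupervised: only inputs are given; the groups must be discovered.
def two_means(xs, iterations=10):
    """Split 1-D inputs into two clusters with a basic k-means loop (k=2).

    Assumes xs contains at least two distinct values.
    """
    low, high = min(xs), max(xs)  # start from the two extreme points
    for _ in range(iterations):
        cluster_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        cluster_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(cluster_low), mean(cluster_high)
    return sorted(cluster_low), sorted(cluster_high)

# Hypothetical first-month sales (thousand units) of past product launches.
sales = [2.0, 3.0, 2.5, 9.0, 10.0, 11.0]
labels = ["flop", "flop", "flop", "hit", "hit", "hit"]

centroids = fit_centroids(sales, labels)  # uses the labels
forecast = predict(8.0, centroids)        # label for a new launch
clusters = two_means(sales)               # ignores the labels
```

The supervised path needs the historical labels to learn the mapping, whereas the unsupervised path recovers a similar grouping from the inputs alone and leaves the interpretation of each cluster to the innovation team.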

4. Case studies of AI-based innovation analytics in practice

We present four case studies, one for each phase in the innovation process shown earlier in Figure 1. Each originates from our involvement in related innovation projects and was carried out by cross-functional teams consisting of traditional innovation analysts/managers and data scientists with AI expertise. Rather than attempting to provide a deep dive of AI methods within the confines of these short case studies, our aim is to inspire traditional innovation managers to consider working more closely with data scientists and leverage the potential value of AI in innovation projects.

4.1. Case #1: Discovering consumer needs for personal care products

The first case concerns a large German manufacturer of personal care products. In the pursuit of new blockbuster products, the company commissioned an innovation team to conduct an in-depth analysis of online user-generated content (UGC) on consumer needs. The goal was to derive a set of key problem areas in the body care segment around which ideas for new products could be developed. Going through the large volume of relevant UGC requires a high level of manual effort, leading innovation managers to limit their scope of analysis based on experience as well as time and resource constraints. In this case, the innovation team decided to investigate AI’s potential to reduce analysis burden and even improve the quality of insights generated. Two different objectives were specified: AI would be used (1) to distinguish plausible consumer needs and the problems they imply from mere chatter and other editorial content, and (2) to generate a descriptive clustering of the identified needs, which could later be refined by the innovation team. AI could thus facilitate a more extensive exploration of the problem domain than the purely manual approach and leverage more of the UGC-related big data.

The team began by using its domain expertise to identify and search online discussion forums related to body care, yielding about 1.75 million posts. Most of the posts were up to 3 sentences long. Posts related to consumer needs then had to be identified in this initial haul of content. Such a binary classification of posts required a subset of the posts to be labeled as consumer need/not a consumer need in order to allow a supervised algorithm to classify the rest of the posts. To label the subset of posts, an instruction manual for qualitative coding was prepared and a member of the project team labeled about 5,000 posts manually. Finally, using an automated procedure, each of the words in the raw textual data was encoded as a unique numerical vector that could be fed into the AI algorithms (Goldberg, 2016).

Data scientists on the team built the algorithm for identifying consumer needs using neural nets (Goldberg, 2016). Neural nets can decompose the task of learning about a large piece of textual data into several simpler tasks. Upon classifying the UGC as consumer needs/not consumer needs, a descriptive clustering of the consumer needs data was performed using an unsupervised algorithm based on latent Dirichlet allocation (LDA; Blei, Ng, & Jordan, 2003). The LDA approach considered the entire set of forum posts and supposed the existence of a certain number of topics (each consisting of a set of related words) that could be variously combined to generate each of the forum posts. The LDA task would be to infer such a set of topics while ensuring a balance between the number of topics and the words per topic. Crucially, the topics served as approximations of the themes present in the UGC related to consumer needs.
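The labeling-then-classifying workflow described above can be sketched in a few lines. This is not the team's model: a single perceptron over word counts is assumed here as a much simpler stand-in for the neural nets and word vectors used in the project, and the posts, labels, and function names are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train_perceptron(labeled_posts, epochs=10):
    """Learn one weight per word from (text, is_need) pairs; is_need is 1 or 0."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, is_need in labeled_posts:
            tokens = tokenize(text)
            score = bias + sum(weights.get(t, 0.0) for t in tokens)
            predicted = 1 if score > 0 else 0
            error = is_need - predicted  # -1, 0, or +1
            if error:
                # Shift the bias and word weights toward the correct label.
                bias += error
                for t in tokens:
                    weights[t] = weights.get(t, 0.0) + error
    return weights, bias

def classify(text, weights, bias):
    """Return 1 if the post looks like a consumer need, else 0."""
    score = bias + sum(weights.get(t, 0.0) for t in tokenize(text))
    return 1 if score > 0 else 0

# Hypothetical manually labeled subset (1 = consumer need, 0 = not a need).
labeled = [
    ("I wish my lotion absorbed faster", 1),
    ("I need a deodorant that does not stain shirts", 1),
    ("Welcome to the forum, please read the rules", 0),
    ("Thanks for sharing, see you next week", 0),
]
weights, bias = train_perceptron(labeled)

# Apply the trained classifier to posts that were never labeled.
need = classify("I wish there was a shampoo for my dry scalp", weights, bias)
chatter = classify("Please read the forum rules before posting", weights, bias)
```

With only four labeled posts the learned weights are fragile; the point of the sketch is the division of labor, in which a small hand-coded subset trains a model that then classifies the remaining posts automatically.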
Value was captured from the application of AI in several ways. First, the algorithms identified needs-related posts with 75% accuracy. The LDA approach generated topics that seemed plausible enough that the innovation team saw serious potential in using AI more consistently in future projects as a way to reduce the manual burden at the initial stages of data analysis. Far from replacing traditional innovation practitioners, however, the use of AI exhibited a way to redefine their role within the analytical process. Practitioners can move away from the monotonous task of coding text and generating initial topic clusters and toward orchestrating and improving the interpretation process in collaboration with data scientists. The visualization of the output (e.g., color-coded topic clusters derived from LDA) made the inner workings of the AI algorithms more transparent, further aiding the innovation team’s understanding of the raw textual content. Finally, one team member with domain expertise independently conducted a qualitative analysis of the text to derive potential problem areas for consumers and found that the AI output largely validated her insights. She did generate fewer initial problem or topic clusters than the LDA-based approach, however, and was thus able to draw on the automated AI output to enrich her manual findings.

4.2. Case #2: Identifying high-impact problems related to semiconductor chips

The second case study concerns a leading American manufacturer of semiconductor chips. While its direct customers were primarily other businesses (e.g., computer and smartphone manufacturers), the company was also keenly aware of the importance of serving the needs of end users that range from oblivious consumers to professionals and serious hobbyists considered to be lead users of chip-based products. These lead users may be active online, functioning as influencers on social media and thus shaping the end users’ views on semiconductor chips. By considering the problems that affect this user group, the company believed it could target high-impact problems related to the chips. It set up a cross-functional innovation team that included data scientists to analyze the end users’ data in order to identify potential lead users and develop an initial shortlist of high-impact problems. The company limited the scope of analysis to end users based in the U.K., which happened to be a strategically important market.

In the absence of AI, the innovation team may have responded to such a client project by conducting a qualitative scan of relevant UGC on social media platforms, blogs, and forums to manually derive key characteristics of potential lead users, identify a sample of such lead users, and then dive deeper into the sample to produce an initial shortlist of high-impact problems. With the help of data scientists, the online UGC could be mined more efficiently by using AI to yield a greater sample of lead users than the innovation team could have picked manually in the project timeframe. As such, the team had two specific objectives for AI: to (1) identify U.K.-based lead users of semiconductor chips via their online activity and (2) derive an initial breakdown of the key problem areas highlighted by the lead users’ UGC.

Collecting and preparing data to enable the AI modeling was far from straightforward. For example, it was difficult to ascertain whether users, let alone lead users, were based in the U.K. Geographic locations of forum domains based on IP addresses were used as proxies for identifying U.K.-based users. Another key difficulty was separating UGC related to semiconductor chips from editorial content, idle chatter, and other metacontent related to the forum mechanics (e.g., posting, uploading). Thus, as with the first case study, the data scientists built a prescriptive algorithm to classify the UGC and filter out relevant posts. The data collection and preparation ultimately yielded about 80,000 relevant posts fit for further analysis.

The team developed AI models for each of the specified research objectives. To identify potential lead users, members took an unsupervised learning approach and clustered users based on their influence, a combination of their reach (e.g., based on the number of followers they had on social media) and reputation (e.g., the number of times their posts were cited, liked, or forwarded). In operational terms, this meant mathematically formalizing the online activity of the users as a network of nodes (one for each unique user) and edges between pairs of nodes (representing interactions between the respective users). These network metrics served to operationalize the users’ level of influence and produce a handful of user clusters (Kakatkar & Spann, 2019); the team deemed one of the clusters as most representative of lead users. Using supervised learning, members manually identified and labeled a small subset of lead users and used it to train a classification model to approximately identify the remaining lead users; a random check conducted by the team suggested that 90% of the identified lead users were plausible.
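The idea of operationalizing influence as a combination of reach and reputation on a user network can be illustrated with a short sketch. Everything specific in it is an assumption made for illustration: the user names, follower counts, and interactions are invented, reputation is measured as the number of incoming interactions (in-degree), and the two components are normalized and weighted equally. The article does not state which network metrics or weights the team actually used.

```python
from collections import Counter

def influence_scores(followers, interactions):
    """Combine reach and reputation into one score per user.

    followers: dict mapping user -> follower count (reach).
    interactions: list of (actor, author) pairs, one per citation, like,
    or forward of the author's post by the actor. Each pair is a directed
    edge, so an author's reputation is the in-degree of that node.
    """
    reputation = Counter(author for _actor, author in interactions)
    max_reach = max(followers.values()) or 1
    max_reputation = max(reputation.values(), default=0) or 1
    # Equal weighting of the two normalized components is an assumption.
    return {
        user: 0.5 * followers[user] / max_reach
        + 0.5 * reputation.get(user, 0) / max_reputation
        for user in followers
    }

# Hypothetical users and interactions, invented for illustration only.
followers = {"ana": 1200, "ben": 300, "cy": 40, "dee": 10}
interactions = [
    ("ben", "ana"), ("cy", "ana"), ("dee", "ana"),
    ("ana", "ben"), ("dee", "ben"),
    ("ben", "cy"),
]
scores = influence_scores(followers, interactions)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Scores of this kind could serve as input features for the clustering step, with the highest-scoring cluster inspected manually as a candidate set of lead users.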
Next, to derive a shortlist of high-impact problems, the team analyzed the UGC of the lead users using two combined approaches. First, members used an unsupervised approach to determine the key points in a given lead user’s posts by adopting the LDA method discussed earlier. In parallel, they used a supervised learning approach to determine the valence of the posts (i.e., whether posts refer to something positive or negative about semiconductor chips). They also used curated semantic models of the English language to determine positive/negative valences; while these curated models happened to be proprietary, highly effective open-source models can be found online (e.g., see nltk.org and spacy.io).
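As a rough illustration of valence detection, the sketch below scores a post by counting words from a positive and a negative word list. It is deliberately simplistic and differs from the project: the team combined supervised learning with proprietary curated semantic models, whereas the tiny lexicon here is hypothetical and was invented for this example.

```python
import re

# Hypothetical mini-lexicon; a real project would use a curated resource.
POSITIVE = {"fast", "efficient", "reliable", "great", "cool"}
NEGATIVE = {"hot", "slow", "noisy", "expensive", "throttles"}

def valence(post):
    """Return 'positive', 'negative', or 'neutral' from lexicon word counts."""
    tokens = re.findall(r"[a-z]+", post.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

complaint = valence("This chip runs hot and throttles under load")
praise = valence("Fast and efficient, great for gaming")
question = valence("Which socket does it use?")
```

A word-count approach like this ignores negation and context (e.g., "not fast"), which is one reason curated models or supervised classifiers are preferred in practice.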

Value was captured in terms of efficiency gains and depth of insight produced during the project. Within roughly 4 weeks, the innovation team identified about 200 potential lead users and uncovered 8 key problem areas (e.g., trade-offs between chip performance and energy efficiency, requirements for semiconductor chips in the premium gaming segment). The team estimated that identifying a similar number of potential lead users manually would have taken 4-5 times longer and resulted in higher project costs. Although the shortlist of problem areas was not completely surprising to the team in hindsight, some of what the data revealed (e.g., gaming, energy trade-offs) was illuminating. The modular and transparent nature of the AI modeling and visualizations reassured the innovation team of the soundness of its chosen approaches and empowered it to work closely with the data scientists to customize the AI models further.

4.3. Case #3: Solution exploration via crowdfunding platforms

The third case study is primarily concerned with solution exploration, although it touches on elements of solution selection as well (Bilgram, Gluth, & Piller, 2017). The innovation team considered a perennial question of innovation management: Given a specific problem, is it possible to generate solutions with a high likelihood of success? This is difficult because, at the time a novel solution is generated, there may be too little information to gauge its potential for success in the market. Crowdfunding platforms represent an interesting source of data for predicting market success. When solutions (ranging from products to services and ideas for startups) are pitched on a crowdfunding platform, other people can fund the development and delivery of the solutions. The amount of funding attracted represents a tangible measure of the market demand for a given solution. Recent research by Kaminski, Hopp, and Tykvová (2019) showed that the funding volume of crowdfunding clusters correlates with venture capital investments in the respective clusters. Analyzing the features of well-funded solutions on crowdfunding platforms could stimulate the exploration of additional promising solutions.

It would be difficult, if not impossible, for someone to manually comb through all the solution-related content on crowdfunding platforms. The content is typically unstructured data (e.g., a textual description and images of the solution). As such, the innovation team opted to use AI for solution exploration. It looked to derive descriptive clusters of solutions published on various crowdfunding websites and investigate the aggregated funding for the solutions per cluster as an indicator of their relevance to users.

Data scientists on the team built a web crawler to automate the collection of solution-related data from the crowdfunding websites. The data included a textual description of the solution, the funding goal (i.e., the stated amount of funding the person proposing the solution wished to receive from the crowd), the funding deadline, and the resultant funding that was achieved by the deadline. To keep the scope of analysis tractable, only high-tech product solutions published between 2013 and 2015 were considered. In total, they collected about 5,700 solutions. As in the first case study, the data scientists used unsupervised learning based on LDA to cluster the solutions in order to reveal latent topics in the textual descriptions. They then went a step further, grouping the initial clusters based on semantic similarities to produce a smaller set of larger metaclusters that could easily be analyzed by innovation analysts/managers.

Value in this case study was captured in several ways. First, 68 distinct solution clusters were derived, most of which consisted of about 50 solutions. Out of these, 11 metaclusters emerged. One was labeled smart home and living and included related solution clusters like smart lighting and smart security. Using the quantitative funding data, the total funding volume and success rate per cluster were also determined. The combination of LDA-based clustering and funding statistics yielded insights that largely validated the team’s understanding of the solution types that would be uploaded on the crowdfunding platforms. For instance, several clusters corresponded to the solution space opened up by the Internet of Things, which was trending during the 2013-2015 timeframe.
However, the solution clusters that received the largest amount of funding included topics that were not necessarily trending (e.g., headphones, DIY). The analyses suggested that by combining the views of the solution producers (inferred via LDA clustering) and potential consumers (based on funding data), the team could optimize the exploration of new solutions.
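Once each solution has been assigned to a cluster, the funding statistics per cluster reduce to a simple aggregation. The sketch below assumes hypothetical records with invented figures, and it assumes that a campaign counts as successful when the funding raised meets or exceeds its goal; the article does not define success rate, so that rule is an illustrative choice.

```python
from collections import defaultdict

def funding_by_cluster(solutions):
    """Aggregate total funding and success rate for each solution cluster.

    Each solution is a dict with 'cluster', 'goal', and 'raised' keys.
    Assumption: a campaign is successful if it raised at least its goal.
    """
    totals = defaultdict(lambda: {"count": 0, "raised": 0.0, "successes": 0})
    for s in solutions:
        stats = totals[s["cluster"]]
        stats["count"] += 1
        stats["raised"] += s["raised"]
        stats["successes"] += s["raised"] >= s["goal"]
    return {
        cluster: {
            "total_raised": stats["raised"],
            "success_rate": stats["successes"] / stats["count"],
        }
        for cluster, stats in totals.items()
    }

# Hypothetical crawled records, invented for illustration only.
solutions = [
    {"cluster": "smart lighting", "goal": 50000, "raised": 20000},
    {"cluster": "smart lighting", "goal": 30000, "raised": 45000},
    {"cluster": "headphones", "goal": 100000, "raised": 250000},
    {"cluster": "headphones", "goal": 80000, "raised": 90000},
]
summary = funding_by_cluster(solutions)
```

Sorting the resulting summary by total funding or success rate gives the consumer-side view that the team compared against the producer-side view from the LDA clusters.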

4.4. Case #4: Selecting promising crowdsourced solutions for new product development

The final case study centers around the challenge of selecting promising solutions for further consideration within new product development processes (Kakatkar, de Groote, Fueller, & Spann, 2018). A large American food company conducted an innovation contest focused on generating new solutions for the traditional chocolate bar; participation in the contest was open to all members of the public. While this allowed the company to source several different solutions relatively easily, selecting the most promising solutions among the group was still a challenge. Selection is usually the responsibility of a jury team consisting of senior executives and subject matter experts, which can be expensive to carry out and unreliable as each jury member comes with his/her own bias. An innovation team was set up to address the issue of solution selection.

AI was used in two ways in the selection process. First, given a database of solutions harvested during the contest, AI was used to derive latent features of each solution. Kornish and Ulrich (2014) suggested that the future success of a solution may be explained not only by external factors such as market conditions but also by observable and latent features of the solution itself (e.g., its form and function). Second, AI models were developed to link solution features to measures of solution success (e.g., ratings in an innovation contest, sales data in case the solutions have already been launched in the market) to provide a basis for automating solution selection.

A little over 1,000 people from more than 60 countries took part in the chocolate innovation contest. The contest ran for 6 weeks, at the end of which roughly 470 solutions were rated by an expert jury team that included senior executives for product marketing at the food company. The community of contestants was also able to rate these solutions. The innovation team received a dataset that included the textual idea descriptions submitted by contestants as well as average expert and community ratings for each idea. Data scientists on the innovation team built two different AI models.
First, they used LDA to derive latent topics in the textual solution descriptions (e.g., chocolate flavor, color, taste, seasonality), which were taken to be latent features of the corresponding ideas. Second, they implemented a supervised learning approach based on a random forest consisting of 500 decision trees, which modeled the statistical relationship between solution features (i.e., the inputs, or x variables) and ratings (i.e., the outcomes, or y variables). Each decision tree suggested a somewhat different relationship between the x and y variables (e.g., varying correlations in positive and negative directions). To avoid overcommitting to the suggestion of any one decision tree, the random forest model took an aggregated view by essentially averaging the output of all 500 decision trees.

Value was captured in several respects in this case. First, the LDA-based method yielded 20 different features of solutions for redesigning chocolate bars, including features related to ingredients, taste, smell, seasonality, and texture. Moreover, the innovation team was intrigued to find solutions that shared similar features but received positive and negative ratings from the community or expert jury, suggesting that subtle changes in the combination of features could have a significant impact on the solution ratings. Interestingly, the random forest model was up to 23% better at predicting the average expert ratings than the average community ratings based on the solution features. Moreover, the innovation team used the Gini coefficient, a feature importance measure for random forests (Varian, 2014), to identify the features that primarily accounted for the link to high ratings; in this case study, the key features happened to be related to seasonal attributes (e.g., whether the chocolate bar was especially suitable in the summer) and social attributes (e.g., to what extent the bar could be shared with others). The ability to track the process of AI modeling transparently, coupled with the predictive power of the resulting model, suggests that AI can indeed help automate solution selection. Table 1 summarizes the case studies along the phases of the innovation process and the aspects of innovation analytics.
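The averaging idea behind the random forest can be sketched with a heavily simplified ensemble. The code below is an illustration under stated assumptions, not the team's model: each "tree" is reduced to a single split on one randomly chosen binary feature and is trained on a bootstrap sample, the three feature names and all ratings are invented, and feature importance is not computed.

```python
import random

def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single binary feature."""
    left = [y for x, y in zip(rows, targets) if x[feature] == 0]
    right = [y for x, y in zip(rows, targets) if x[feature] == 1]
    overall = sum(targets) / len(targets)
    # Fall back to the overall mean when one side of the split is empty.
    left_mean = sum(left) / len(left) if left else overall
    right_mean = sum(right) / len(right) if right else overall
    return feature, left_mean, right_mean

def predict_stump(stump, x):
    """Return the mean rating on the side of the split that x falls on."""
    feature, left_mean, right_mean = stump
    return right_mean if x[feature] == 1 else left_mean

def fit_forest(rows, targets, n_trees=500, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample], feature))
    return forest

def predict_forest(forest, x):
    """Average the predictions of all trees in the ensemble."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)

# Hypothetical binary topic features: [seasonal, shareable, exotic_flavor],
# paired with invented average jury ratings.
rows = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
ratings = [4.5, 4.0, 3.0, 2.0, 4.6, 2.2]

forest = fit_forest(rows, ratings)
seasonal_shareable = predict_forest(forest, [1, 1, 0])
exotic_only = predict_forest(forest, [0, 0, 1])
```

Because each stump sees a different resample and a different feature, individual predictions vary, and the ensemble average smooths out that variation. A production model would use full decision trees and a tested library implementation.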

5. Implications for innovation managers

AI can play an important role in the innovation process, from the exploration of problems to the selection of solutions. As underscored by our real-life case studies, AI can substantially drive innovation analytics. In our research, we found three main implications for how AI can fundamentally change the way innovation managers think about leveraging such technology in the innovation process. First, AI draws considerable value from big data; innovation teams can leverage large volumes of data and carry out insightful analyses that are highly scalable and replicable. Although there is still much room to enhance the sophistication of algorithms designed to parse unstructured data (Kakatkar & Spann, 2019; Wedel & Kannan, 2016), the tools currently available can cover a number of low-hanging fruits among the use cases in the innovation process (e.g., sentiment analysis, cluster analysis, content coding). Second, AI can empower

Table 1. Summary of case studies

Case studies
  Case 1: Discovering consumer needs for personal care products
  Case 2: Identifying high-impact problems related to semiconductor chips
  Case 3: Solution exploration via crowdfunding platforms
  Case 4: Solution selection for new product development

Phase of innovation process
  Case 1: Problem exploration
  Case 2: Problem selection
  Case 3: Solution exploration
  Case 4: Solution selection

Specification of objectives (What is the focus of the analysis: descriptive, diagnostic, predictive, or prescriptive?)
  Case 1:
    • Prescriptive identification of consumer needs
    • Descriptive clustering of needs to produce qualitative insights
  Case 2:
    • Prescriptive identification of U.K.-based lead users of chips
    • Derive shortlist of key problem areas highlighted by UGC of lead users
  Case 3:
    • Generating descriptive clusters of solutions published on various crowdfunding websites
  Case 4:
    • Derive latent features of solutions
    • Find links between solution features and measures of solution success

Collection and preparation of data (What will the data look like? How will the data be collected? What will the data preparation entail?)
  Case 1:
    • 1.75 million posts scraped from 12+ online forums
    • Manually coded 5,000 posts to produce training data for supervised algorithm used later on
  Case 2:
    • Used web domain of forums as a proxy for identifying U.K.-based users
    • Scraped forums and filtered out UGC using a supervised AI algorithm to yield ~80,000 relevant posts
  Case 3:
    • Using a web crawler to retrieve 5,700+ solutions from crowdfunding websites
    • Focused on timespan from 2013–2015 and the high-tech sector
  Case 4:
    • Data was collected from an innovation contest done by a global food company on the topic of designing new chocolate bars
    • ~470 solutions rated by an expert jury were selected

Modeling (Supervised or unsupervised? Regression or classification?)
  Case 1:
    • Supervised neural network for identifying consumer needs
    • Unsupervised topic modeling using LDA for clustering needs
  Case 2:
    • Unsupervised clustering based on network analysis to identify influencers
    • Derived shortlist of problems using a combination of LDA and supervised learning
  Case 3:
    • Unsupervised learning with LDA to cluster solutions based on latent topics that emerged from the textual descriptions
    • Merged clusters to get a set of larger meta-clusters
  Case 4:
    • LDA was used to derive latent features from the textual solution descriptions
    • Supervised learning based on a random forest model was used to find the link between solution features and ratings

Value capture (What is the AI output? What can we learn from the AI output? How does the AI complement the researcher?)
  Case 1:
    • 75% accuracy in classifying consumer needs
    • Generated insightful need clusters
    • AI output validated and enriched innovation team's understanding
  Case 2:
    • Efficiently identified ~200 potential lead users
    • Uncovered ~8 key problem areas for chip makers
    • AI visualizations (networks, color-coding sentiment in text) yielded additional insights
  Case 3:
    • 68 distinct solution clusters (most consisting of ~50 solutions) and 11 meta-clusters were derived
    • Clusters and funding statistics jointly provided insights
  Case 4:
    • LDA-based method yielded 20 different features of solutions for redesigning chocolate bars
    • Random forest model was up to 23% better at predicting expert ratings than community ratings and shed light on important features


Table 2. AI resources for innovation practitioners

Online courses (for a guided introduction by key people that have shaped our understanding and usage of AI today)
  • Coursera courses on executive data science (Johns Hopkins University) that enable innovation managers to lead analytics projects
  • Coursera courses on applied data science with Python specialization (University of Michigan), which covers text mining and social network analysis
  • Coursera courses on machine learning and deep learning led by Andrew Ng, a pioneer of online learning, cofounder of Coursera, adjunct professor at Stanford University, and former head of Baidu AI Group/Google Brain

Readings (for a deeper immersion in the subject matter, to understand its implications for innovation managers)
  • Books written for a general audience (e.g., Agrawal, Gans, & Goldfarb, 2018; Domingos, 2015)
  • Textbooks for definitions of key concepts and standard algorithms (e.g., Russell & Norvig, 2010; Goodfellow, Bengio, & Courville, 2016)
  • Academic articles related to AI by economics and management scholars (e.g., Varian, 2014; Wedel & Kannan, 2016; George, Osinga, Lavie, & Scott, 2016; Kakatkar, Groote, Füller, & Spann, 2018)

Tools (for implementing AI use cases in practice)
  • Two popular languages for AI are Python and R
  • Anaconda is a user-friendly distribution of Python that includes virtually everything that innovation teams need to get started out of the box (Python, AI software libraries, code editor, testing visualization support)
  • Similarly, RStudio is a convenient programming environment for R
  • The NLTK and spaCy libraries for Python are particularly useful for text mining
  • The NetworkX library for Python and the software Gephi are useful for network analysis
  • Other point-and-click tools for educational purposes and quickly building AI models include KNIME and Dataiku

Demos and sample code (for seeing concrete uses of AI that are relevant to innovation analytics)
  • Various AI demos by Google (https://experiments.withgoogle.com/collection/ai)
  • Various AI demos, related papers, and code by leading AI experts (http://deeplearning.net/demos/)
  • Build and play with neural networks in your online browser (https://playground.tensorflow.org/)
  • Tutorial on text mining and LDA with Python (https://github.com/RaRe-Technologies/gensim)
  • Simple guide to LDA (https://www.analyticsvidhya.com/blog/2016/08/beginners-guide-to-topic-modeling-in-python/)

innovation managers to work closely with data scientists to delegate tasks of greater creative complexity to the computer; our case studies from the field suggest that AI can often help validate creative insights and minimize our creative blind spots. Third, AI can enable those engaged in the innovation process to better answer existing questions and also to ask better questions based on AI models that account for a range of complex interactions between variables; such models can also be used to inductively derive new hypotheses or questions about the innovation scenario (Puranam, Shrestha, He, & von Krogh, 2018).

The use of AI in innovation also comes with its challenges. Traditional innovation teams may not have the know-how to build and use AI models and will thus need to collaborate closely with data scientists, ideally making them core team members from the start. Also, current technological limitations mean that, at least in the short term, the output of AI may not be as contextually nuanced as analyses prepared by humans. Any contributions from AI should be evaluated critically and complemented by the innovation team as needed. Moreover, AI methods, which typically deal with correlations, do not obviate the need for controlled experiments to establish causal effects.

In their book Prediction Machines, Agrawal, Gans, and Goldfarb (2018, p. 9) claimed that “everyone has had or will soon have an AI moment,” which is essentially characterized by the realization that AI is not just another technology but something fundamentally deeper that forces us to reexamine our understanding of being human. For those engaged in the innovation process, such a realization could come when working with AI to discover insights that neither human nor computer could achieve alone. To this end, we compiled a set of resources in Table 2 for getting started with AI in practice. The resources are by no means exhaustive but cover a range of introductory learning material and open-source tools, with brief notes on their application to innovation analytics. We hope these resources are of particular value to traditional innovation practitioners who wish to collaborate more closely with data scientists to make use of AI during the course of innovation projects in the future.

References

Agrawal, A., Gans, J., & Goldfarb, A. (2018). Prediction machines: The simple economics of artificial intelligence. Boston, MA: Harvard Business Review Press.
Bilgram, V., Gluth, O., & Piller, F. (2017). Crowdfunding data as a source of innovation. Marketing Review St. Gallen, 34(3), 10–18.
Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(4/5), 993–1022.
Chui, M., Manyika, J., Miremadi, M., Henke, N., Chung, R., Nel, P., et al. (2018). Notes from the AI frontier: Insights from hundreds of use cases. Washington, DC: McKinsey & Company.
Davenport, T. H., & Patil, D. J. (2012). Data scientist. Harvard Business Review, 90(5), 70–76.
Domingos, P. (2015). The master algorithm: How the quest for the ultimate learning machine will remake our world. New York, NY: Basic Books.
Dorst, K., & Cross, N. (2001). Creativity in the design process: Co-evolution of problem-solution. Design Studies, 22(5), 425–437.
George, G., Osinga, E. C., Lavie, D., & Scott, B. A. (2016). Big data and data science methods for management research. Academy of Management Journal, 59(5), 1493–1507.
Goldberg, Y. (2016). A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research, 57, 345–420.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge, MA: The MIT Press.
Howard, T. J., Culley, S. J., & Dekoninck, E. (2008). Describing the creative design process by the integration of engineering design and cognitive psychology literature. Design Studies, 29(2), 160–180.
Kakatkar, C., de Groote, J., Fueller, J., & Spann, M. (2018). The DNA of winning ideas: A network perspective of success in new product development. In Academy of management proceedings (p. 11047). Briarcliff, NY: AOM.
Kakatkar, C., & Spann, M. (2019). Marketing analytics using anonymized and fragmented tracking data. International Journal of Research in Marketing, 36(1), 117–136.
Kaminski, J., Hopp, C., & Tykvová, T. (2019). New technology assessment in entrepreneurial financing: Does crowdfunding predict venture capital investments? Technological Forecasting and Social Change, 139, 287–302.
Kietzmann, J., & Pitt, C. (2020). AI and machine learning: What general managers need to know. Business Horizons, 63(2) (XXX–XXX).
Kornish, L., & Ulrich, K. (2014). The importance of the raw idea in innovation: Testing the sow’s ear hypothesis. Journal of Marketing Research, 51(1), 14–26.
Kozinets, R. V. (2002). The field behind the screen: Using netnography for marketing research in online communities. Journal of Marketing Research, 39(1), 61–72.
Paschen, U., Pitt, C., & Kietzmann, J. (2020). Artificial intelligence: Building blocks and an innovation typology. Business Horizons, 63(2) (XXX–XXX).
Puranam, P., Shrestha, Y. R., He, V. F., & von Krogh, G. (2018). Algorithmic induction through machine learning: Opportunities for management and organization research (Working paper No. 2018/11/STR). Paris, France: INSEAD.
Russell, S. J., & Norvig, P. (2010). Artificial intelligence: A modern approach (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
Varian, H. (2014). Big data: New tricks for econometrics. Journal of Economic Perspectives, 28(2), 3–28.
Wedel, M., & Kannan, P. K. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97–121.