Accepted Manuscript
Survey of Conversational Agents in Health
Joao Luis Zeni Montenegro, Cristiano André da Costa, Rodrigo da Rosa Righi
PII: S0957-4174(19)30228-3
DOI: https://doi.org/10.1016/j.eswa.2019.03.054
Reference: ESWA 12580
To appear in: Expert Systems With Applications
Received date: 10 September 2018
Revised date: 30 March 2019
Accepted date: 30 March 2019
Please cite this article as: Joao Luis Zeni Montenegro, Cristiano André da Costa, Rodrigo da Rosa Righi, Survey of Conversational Agents in Health, Expert Systems With Applications (2019), doi: https://doi.org/10.1016/j.eswa.2019.03.054
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
• We propose a conversational agents taxonomy in health, detailing challenges and goals
• The methodology consisted in a systematic literature review of the area
• Elderly people are considered a huge challenge to conversational agents in health
• The main goal of the conversational agents is to assist users and physicians.
• Text and multimodal (speech, facial and gestures) dialog are widely used in health
Survey of Conversational Agents in Health
Joao Luis Zeni Montenegro, Cristiano André da Costa, Rodrigo da Rosa Righi
Software Innovation Laboratory - SOFTWARELAB, Applied Computing Graduate Program, Universidade do Vale do Rio dos Sinos - UNISINOS, Av Unisinos 950, Sao Leopoldo, 93022-750, Brazil, Phone: 55 5135908161, Fax: 55 5135908162, Email:
[email protected]
Abstract
Artificial intelligence (AI) has transformed the world and the relationships among humans as the learning capabilities of machines have allowed for a new means of communication between humans and machines. In the field of health, there is much interest in new technologies that help to improve and automate services in hospitals. This article aims to explore the literature related to conversational agents applied to health care, searching for definitions, patterns, methods, architectures, and data types. Furthermore, this work identifies an agent application taxonomy, current challenges, and research gaps. In this work, we use a systematic literature review approach. We guide and refine this study and the research questions by applying Population, Intervention, Comparison, Outcome, and Context (PICOC) criteria.
The present study investigated approximately 4145 articles involving conversational agents in health published over the last ten years. In this context, we finally selected 40 articles based on their approaches and objectives as related to our main subject. As a result, we developed a taxonomy, identified the main challenges in the field, and defined the main types of dialog and contexts related to conversational agents in health. These results contributed to discussions regarding conversational health agents, and highlighted some research gaps for future study.
Keywords: Conversational Agents, Chatbot, Expert Systems, Systematic Review, Health.
1. Introduction
There are many technologies driving considerable changes in the health field involving physicians, patients, and other health professionals in general. Technology growth in this field has led to improvements in usability and accessibility (Bickmore et al., 2009). Among the most popular techniques in this field, dialog interfaces, tracking, system displays, and artificial intelligence stand out (Rizzo et al., 2011).
Many solutions are being developed with these techniques, such as living assistants and social robots (Amrhein et al., 2016). The interactions between humans and robots are gaining importance in assorted fields, such as speech and object recognition, and natural language understanding. As a result, these technologies have achieved improved user-robot interactions (Novielli, 2010).
Robot assistants are related to intelligent virtual agents, also known as embodied conversational agents, while simple conversational agents are more common (Yasavur et al., 2014). The primary definition of conversational agents is related to a computer program or artificial intelligence able to hold a conversation with humans through natural language processing (NLP). Moreover, conversational agents can emulate a human personality to ease their interactions with real users (Abdul-Kader & Woods, 2015). Conversational agents use complex semantic and syntactic grammars or highly technical scripting languages to parse user utterances in their language construction processes (Herbert & Kang, 2018).
Conversational agents are useful tools in many contexts owing to their voice-enabled and/or text-based interactions (Fadhil & Gabrielli, 2017). Conversational agents are being used in many applications in health, generating much discussion about the technology and producing various proposals for implementation in this area (Knuepffer, 2015). In the health field, such agents could be useful in guiding users by providing relevant information about a disease or discussing the results of clinical tests (van Heerden et al., 2017).
Conversational agents can communicate by text messages to perform tasks such as providing health care monitoring and delivering health care advice to people of all ages (Lisetti et al., 2012). However, there are some challenges regarding the design of suitable interfaces, seeking the best fit for different profiles, and applying personal touches for communicating with different users (Tokunaga et al., 2016).
From another perspective, conversational agents are being used to help clinicians to identify symptoms and improve assessment skills, diagnosis, interview techniques, and interpersonal communication (Rizzo et al., 2011). One problem is the difficulty of engaging in emotional dialog with users. In many contexts, agents do not show feelings, and this can affect their credibility (Eisman et al., 2016).
There are several challenges in engaging users and conversational agents. Agents can be beneficial in smartphones for individuals, especially in emergencies such as identifying suicide or self-harm situations, although recent studies have proven the vulnerabilities of agents such as Siri, Google Now, and S Voice when used for these purposes (Torous et al., 2018). In Wargnier et al. (2015), the agent provides proactive functions, sending messages to help elderly users in situations where they are distracted. In the same context, reminiscence strategies are considered challenging when related to design interfaces, conversation logic, and meaningful metrics directed at elderly adults, owing to deteriorated mental health and reduced capabilities, which is becoming a concern in many countries with aging populations (Nikitina et al., 2018).
The main finding of our research was a taxonomy emphasizing health and computational aspects, divided into three major topics: interactions, dialogs, and architectures. We further sought to answer various questions related to the goals and future challenges of agents in the health domain, as well as to review the main interaction contexts and models of communication used more frequently. Moreover, we revised the text to elucidate on several questions that have not been clearly answered as potential topics for future studies.
Our study is organized into six sections. Section 2 presents a summary of the literature review. Section 3 discusses the methods adopted in this survey. Section 4 discusses the results, while Section 5 presents some challenges and directions for future work. Finally, conclusions are presented in Section 6.
2. Conversational Agents
One premise of conversational agents is engagement, which is defined as a conversational process between agents and users. This process should observe and analyze the behavior of other participants, with voice or visual tools helping to focus the attention of individuals on communication. In this sense, conversational agents play a vital role in facilitating engagement, which is a major goal of the technology developer (Wargnier et al., 2015).
Conversational agents may be considered virtual avatars with verbal or nonverbal communication characteristics, executing a human-to-machine interactive task (Wargnier et al., 2015). Also called relational agents, they can emulate some human characteristics, using different types of interactions such as speech, gaze, hand gestures, and other nonverbal modalities (Wang et al., 2015). These types of interactions are called multimodal communication, and they are very useful in the health field because they provide potential personal interfaces to interact with users from the health care domain (Turunen et al., 2011).
In the health context, the use of agents for education has been widely explored, with several applications having been introduced in the therapeutic and patient care context (Wang et al., 2015). Robots have been tested continuously for elderly care with features that support physicians by providing advice and insights to treat elderly patients (Heerink et al., 2010).
Conversational agents act in any environment to gather information and use it in their interactions through knowledge engineering processes (López et al., 2008). This gathering process can allow an agent to perform different functions based on the current necessity, e.g., some studies used speech and language techniques to detect dementia in patients (Tanaka et al., 2017a; Fraser et al., 2016; Aramaki et al., 2016).
In this sense, agents are helpful tools for human-machine interaction, allowing the input of data via natural language, processing sentences, and returning answers accurately through text or speech (Galvao et al., 2004). Some agents apply probabilistic techniques to improve dialog. Natural language processing (NLP) is one of the primary tools for interactions between humans and agents. This technique is focused mainly on signal processing, syntactic and semantic analysis of discourse, and pragmatics, thus allowing natural language generation to provide meaningful conversation based on context analysis (Ganesh et al., 2008). In this way, conversational agents can be successful in creating an informal connection with their users (Iftene & Vanderdonckt, 2016).
Therefore, conversational agents can use artificial intelligence algorithms to interpret user dialog and conduct useful interactions with users. Approaches such as reinforcement learning have gained popularity in the conversational agents field, demonstrating better results than approaches based on rules and other techniques (Yasavur et al., 2014).
2.1. Related Work
Articles related to the context of our study are presented in the following review. The survey by Abdul-Kader & Woods (2015) discussed different techniques, architecture designs, and methods from papers related to speech interactions through conversational agents, identifying significant improvements in this area over the past decade.
Architectures have become a common research topic in the last few years. The study by D’Alfonso et al. (2017) presented a multimodal semantic architecture based on ontologies and inferences. In D’Haro et al. (2017), the focus was to develop a modular platform to combine different types of technologies to provide testing and checking. Conversational agents play an essential role in integration with other technologies in this architecture.
In the study by Kar & Haldar (2016), an architecture involving the Internet of Things (IoT) and conversational agents was proposed to identify the challenges and shortcomings of existing IoT systems.
Recently, another systematic literature review of conversational agents focused on health care was published (Laranjo et al., 2018). The study presents specific characteristics of conversational agents, posing research questions toward the health domain by assessing aspects including main characteristics and study methods. In this sense, there are some critical differences with our study. In their article, the focus is more specific to the health field than on the intersection of health and computer science. For instance, their survey does not include many articles on computer science, which is reflected in the final corpus of 40 articles in our survey and 17 in theirs. Furthermore, our proposed taxonomy is more comprehensive, expanding on the agent and architecture aspects, and details systems and techniques used by conversational agents. Finally, the research questions aim to elucidate some aspects different from the other study, including goals, main contexts, and proposals in the health domain.
According to several research results, and considering the related work, this article answers the central question: What are the main interactions, architectures, and dialogues that involve conversational agents? In this sense, this article aims to develop a systematic literature review of conversational agents in health care to better understand the concepts, challenges, goals, types of interactions, and contexts related to this subject.
3. Material and Methods
We applied a Systematic Literature Review (SLR) in this study, defining processes to interpret, identify, and assess literature related to our subject, with the general aim of answering all questions listed in this article. The main concept of SLR is to produce a summarized content of a study area, thus offering more evidence than a traditional literature review (Stapić et al., 2012).
3.1. Study Design
This section discusses the research design and the steps which were used to accomplish this study. The focus of this work is on performing a systematic literature review of recent studies related to conversational agents, providing an overview of this area. According to Turner et al. (2010), SLRs yield consistent results to solidify a phenomenon, such as a new technology. This method involves collecting a large number of articles to organize, analyze, and identify essential gaps that should be addressed in future work.
The primary goal of this study is related to conversational agent patterns, goals, types of interactions, and future directions. Supporting this method, we followed recognized guidelines (Turner et al., 2010; Petticrew & Roberts, 2008) to present all steps applied in the systematic review, and used recent publications pertaining to this method to deepen our knowledge. The scope of the SLR is defined by the steps below, based on a similar structure used by other studies that used this method (Roehrs et al., 2017).
• Research questions: introduce the research questions investigated;
• Search strategy: outline the strategy and libraries explored to collect data;
• Article selection: explain the criteria for selecting the studies;
• Distribution of studies: present how studies are distributed chronologically;
• Quality assessment: describe the quality assessment applied to the selected studies;
• Data extraction: compare the selected studies and research questions.
3.2. Research Questions
In this section, we seek to describe general and specific questions to guide this research.
General questions:
GQ1 What is the taxonomy for conversational agents in health?
GQ2 What is the state of the art related to conversational agents in health?
GQ3 What are the challenges related to conversational agents in health?
Specific questions:
SQ4 What are the main contexts for interaction of conversational agents in health?
SQ5 What are the main dialogue components used by conversational agents in health?
SQ6 In terms of architecture, what are the main systems and techniques used?
The main concerns and topics related to conversational agents in health are highlighted through the questions above. First, we considered three questions as being general owing to their comprehensiveness. GQ1 concerns the main definitions and classifications of conversational agents in health to assist in generating a taxonomy. Second, GQ2 is a classification of relevant articles considered by our criteria as state of the art. Last, GQ3 is related to challenges and open questions involving conversational agents in health.
We also listed three specific questions related to conversational agents in health, proposing discussions related to particular topics within the central questions. SQ4 seeks to identify the main goals, domains and contexts related to conversational agents in health. The focus of SQ5 is on main dialogue components used by conversation agents in the health field. Finally, SQ6 seeks to summarize the systems and techniques used in conversational agent architectures in the health field.
3.3. Search Strategy
The initial process of this research applied a research query to academic and scientific databases to help answer our questions. The process of query construction was based initially on the authors’ previous experience on the subject, correlating known terms such as synonyms, acronyms, and words inserted in the same context. Figure 1 presents the research string used to search the selected databases.
Figure 1: Search string used for database queries
Search string: ("Conversational agents" OR "Chatbots" OR "Embodied conversational agent" OR "Relational agent" OR "Intelligent Virtual Agents") AND ("Health" OR "Healthcare") AND ("Hospital")
We used the PICOC methodology (population, intervention, comparison, outcome, and context) proposed by Petticrew & Roberts (2008) to refine our research string. Figure 2 demonstrates the steps taken to refine the research string by following the PICOC method.
Figure 2: PICOC process to refine the research string.
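The search string in Figure 1 is a conjunction of three groups of OR-joined terms. As a minimal sketch of how such a string can be assembled from term lists, the Python snippet below rebuilds it; the function and variable names are illustrative assumptions, not part of the original study, and each database may require its own query syntax.

```python
def or_group(terms):
    """Quote each term and join the terms with OR inside parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(groups):
    """Join the OR-groups with AND to form the full search string."""
    return " AND ".join(or_group(group) for group in groups)


# Term groups taken from Figure 1; the variable names are hypothetical.
agent_terms = [
    "Conversational agents",
    "Chatbots",
    "Embodied conversational agent",
    "Relational agent",
    "Intelligent Virtual Agents",
]
domain_terms = ["Health", "Healthcare"]
setting_terms = ["Hospital"]

query = build_query([agent_terms, domain_terms, setting_terms])
print(query)
```

Keeping the terms in lists makes it straightforward to add synonyms or acronyms during PICOC refinement without editing the string by hand.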
3.4. Article Selection
The article selection process used filtering methods to improve the results that fit our main objectives. The process follows these steps: removal of duplicates, application of exclusion criteria, removal of impurities, and abstract and text filtering. In this context, we initially removed duplicate articles present in various databases. Once we found all related articles, we removed the studies that did not address our criteria. For this purpose, we used the terms of population and intervention derived from the PICOC method as follows:
• Exclusion criterion 1: Articles do not address "Conversational Agents" and related acronyms (population criterion I)
• Exclusion criterion 2: Articles do not address "Health," "Healthcare," or similar words (intervention criterion II)
The next step focused on impurity removal, deleting theses, dissertations, and books, focusing only on scientific articles from journals and conferences. We also removed articles that were three pages or less in length.
The final process applied filters to abstracts and full texts in order to select the ideal corpus. We applied a text filter, selecting only articles with similar content to the main topics selected for this article, reading all texts and focusing on each article's proposals, methods, and architectures. As suggested by Zaveri et al. (2016), three reviewers analyzed all the studies, verifying aspects important to our selection, such as interactions, dialogues, and architecture concepts. Moreover, these reviewers compared their article choices and, based on mutual agreement, selected a final list of articles.
3.5. Quality Assessment
The corpus quality was a concern of this research. As a prerequisite, we verified the quality of selected articles through the presence of the following characteristics: the research proposal, context, literature review, related work, methods, results, conclusion, and future work.
We conducted a quality analysis of conferences and journals using the h-index evaluation metric, which is a single-number criterion assessing the productivity and the impact of a scientific journal or conference. Specifically, we used the h5-index, as calculated by Google Scholar¹. The h5-index is the h-index for articles published in the last 5 complete years for a specific journal or conference. It is the largest number h such that h articles published in 2014-2018 have at least h citations each. For instance, the h5-index is 10 when, in the last 5 years, 10 published articles have at least 10 citations each (Google, 2019). As a criterion, we required an h5-index of 5 or higher for an article to be accepted in this review. Additionally, we included two more indexes, the SJR and best quartile² for journal publications and the CORE³ for articles in conferences, aiming at expanding the quality assessment. However, some conferences and journals did not have a classification in these indexes. One major reason is that some journals and conferences are newer, sometimes with few editions. We opted to keep in the survey the articles from these publications that attain the minimum h5-index score, because they were significant to our discussion and, in many cases, the publication was related to the field of conversational agents.
¹ http://scholar.google.com
3.6. Data Extraction
This section consisted of analyzing the relationship between the research questions proposed and the selected studies, verifying which articles could answer each question.
Correlations of the questions according to the article contents are listed in Table 1. Thereby, this method improves focus on articles that answer our questions.
4. Results and Discussion
In this section, we present the results and discussions based on the topics previously elaborated for conversational agents.
4.1. Recruitment
We answered our questions in separate sections based on 40 articles remaining after filtration. We seek to contribute through discussions on the field of conversational health agents, highlighting the main challenges, models, techniques, contexts, and current goals of this subject in recent research.
² http://www.scimagojr.com
³ http://portal.core.edu.au
Table 1: Quality Assessment of article structure and related questions.
Section        Description       Research Questions
Title          Open Content      GQ1, GQ3, SQ4, SQ6
Abstract       Open Content      All questions
Keywords       Open Content      GQ1, SQ5, SQ6
Introduction   Article Content   All questions
Method         Article Content   All questions
Results        Article Content   All questions
Discussion     Article Content   All questions
Conclusion     Article Content   All questions
4.2. Conducting the Search Strategy
We chose ten databases (ACM, CiteSeerx, IEEE, JMIR, PMC, Scholar, Science, Springer, Taylor & Francis, and Wiley) to evaluate articles for this research. Our criterion was the relevance of these databases regarding health literature. The process of selection of articles in each database used the procedures mentioned previously to select studies published over the past 10 years. Conversational agents rely on Natural Language Understanding (NLU) and Natural Language Generation (NLG) to complete tasks, which have accomplished significant breakthroughs over the past decade (Jacquet et al., 2018). Analyzing our corpus, we can observe that the number of articles has increased in recent years. Finally, we opted to exclude patents and citations, and only selected articles published in English.
4.3. Article Selection
Our search process is detailed in Figure 3, showing all processes used for exclusion and filtration.
Figure 3: Article selection process.
Initially, our search string found 4190 articles in different databases. We first removed duplicate studies, resulting in 3966 studies remaining. We excluded 1724 studies that did not specifically address conversational agents, and 958 studies not related to "health" or "healthcare" were excluded, resulting in 405 articles remaining.
The next stage involved removing impurities such as theses, dissertations, books, and articles of three pages or less. In this stage, the remaining 405 articles were filtered by abstract, which removed articles that did not address our subject, leaving 181 remaining studies. We then followed the guidelines by Zaveri et al. (2016) to further filter articles by abstract and full text. We then proceeded with the quality assessment, based on our previously defined criteria. Following this analysis, we further eliminated 2 articles that did not attain the minimum expected quality, resulting in a final corpus of 40 articles.
We present the final corpus, divided into journal articles (Table 2) and articles published in conference proceedings (Table 3). The tables also show the indexes considered in the quality evaluation: h5-index, SJR and quartile for journal publications, and CORE for articles in conferences.
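The first automated stages of the selection process (duplicate removal, the two exclusion criteria, and impurity removal) can be sketched as a small filter pipeline. The Python snippet below is a minimal illustration under stated assumptions: the `Record` fields, the keyword lists, and the sample records are hypothetical, duplicates are detected by normalized title only, and the abstract/full-text reading and quality assessment stages performed by the reviewers are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    abstract: str
    kind: str   # e.g. "journal", "conference", "thesis", "book"
    pages: int


# Hypothetical keyword lists standing in for the exclusion criteria.
AGENT_TERMS = ("conversational agent", "chatbot", "relational agent", "virtual agent")
HEALTH_TERMS = ("health",)  # substring match also covers "healthcare"


def mentions(record, terms):
    """Return True if any term occurs in the title or abstract."""
    text = f"{record.title} {record.abstract}".lower()
    return any(term in text for term in terms)


def select(records):
    # Step 1: remove duplicates (assumed here: same normalized title).
    seen, unique = set(), []
    for record in records:
        key = record.title.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    # Step 2: exclusion criteria 1 (agents) and 2 (health).
    on_topic = [r for r in unique
                if mentions(r, AGENT_TERMS) and mentions(r, HEALTH_TERMS)]
    # Step 3: impurity removal (non-articles and articles of three pages or less).
    return [r for r in on_topic
            if r.kind in ("journal", "conference") and r.pages > 3]


# Toy records, invented for illustration only.
records = [
    Record("A chatbot for diabetes care", "Health coaching dialog.", "journal", 12),
    Record("A Chatbot for Diabetes Care", "Health coaching dialog.", "journal", 12),
    Record("Chatbots in banking", "Customer service.", "conference", 8),
    Record("Hospital scheduling", "Health logistics.", "journal", 10),
    Record("Relational agents for elderly health", "PhD work.", "thesis", 150),
    Record("Virtual agent triage", "Healthcare pilot.", "conference", 3),
]
selected = select(records)
print([r.title for r in selected])
```

Keyword matching is only a coarse first pass; in the survey itself the later stages rely on human reading and mutual agreement among reviewers.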
Table 2: Final corpus of articles published in journals.
Id    Article                       Publisher        h5-Index   SJR    Quartile
A01   Tanaka et al. (2017b)         IEEE             180        1.16   Q1
A02   Sebastian & Richards (2017)   Elsevier         110        1.55   Q1
A03   D’Alfonso et al. (2017)       Frontiers        84         1.04   Q1
A04   Hudlicka (2013)               Elsevier         53         1.38   Q1
A05   Zhang et al. (2017)           Elsevier         53         1.38   Q1
A06   Bickmore et al. (2011)        Elsevier         50         1.03   Q1
A07   King et al. (2013)            Taylor-Francis   37         1      Q1
A08   Turunen et al. (2011)         Elsevier         33         0.54   Q2
A09   Bresó et al. (2016)           Wiley            21         0.43   Q1
A10   Yasavur et al. (2014)         Springer         20         0.36   Q2
A11   Rizzo et al. (2011)           JVRB             20         0.3    Q3
A12   Tanaka et al. (2017a)         IEEE             17         0.48   Q2
A13   Tielman et al. (2017)         IOSPress         17         0.28   Q3
A14   Heerink et al. (2010)         Springer         15         0.3    Q3
A15   Chuah et al. (2013)           MIT              13         0.23   Q3
A16   Karpagam & Saradha (2014)     ICT              12         0.11   Q4
A17   Shaked (2017)                 IET              11         0.32   Q3
A18   Hirano et al. (2017)          SAGE             10         0.39   Q3
A19   Wells et al. (2015)           PMC              7          0.21   Q3
Table 3: Final corpus of articles published in conferences.
Id    Selected Article            Publisher   h5-Index   CORE
A20   Nikitina et al. (2018)      ACM         74         C
A21   Magerko et al. (2011)       AAAI        69         C
A22   Kasinathan et al. (2017)    IEEE        34         -
A23   Fadhil (2018)               ACM         22         -
A24   López et al. (2008)         IEEE        18         B
A25   Ring et al. (2016a)         MIT         17         A
A26   Zhang et al. (2014)         Springer    17         B
A27   Bickmore et al. (2009)      ACM         17         C
A28   Baskar & Lindgren (2014)    Springer    16         C
A29   Jin et al. (2017)           ACL         16         -
A30   Amato et al. (2017)         WAIAH       15         B
A31   Lisetti et al. (2012)       FLAIRS      15         C
A32   Ochs et al. (2017)          CASA        15         -
A33   Alesanco et al. (2017)      Springer    14         -
A34   Tokunaga et al. (2016)      IEEE        12         C
A35   van Heerden et al. (2017)   IEEE        10         -
A36   Wargnier et al. (2016)      IEEE        10         -
A37   Lokman & Zain (2010)        Springer    10         -
A38   Rentea et al. (2012)        IEEE        9          -
A39   Kowatsch et al. (2017)      ETH         6          B
A40   Ni et al. (2017)            ISKE        5          B
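The h5-index column in Tables 2 and 3 follows the definition given in Section 3.5: the largest h such that h articles from the last five complete years have at least h citations each. A minimal sketch of that computation and of the acceptance threshold (h5-index of 5 or higher) is shown below; the citation lists are invented for illustration, and in the survey the values were read from Google Scholar rather than computed by the authors.

```python
def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


MIN_H5 = 5  # threshold stated in Section 3.5


def meets_threshold(citations_last_5_years):
    """Apply the survey's minimum h5-index criterion to a venue."""
    return h_index(citations_last_5_years) >= MIN_H5


# Hypothetical per-article citation counts for two venues.
venue_a = [25, 18, 12, 10, 10, 10, 10, 10, 10, 10, 3, 1]
venue_b = [9, 4, 4, 2, 0]

print(h_index(venue_a), meets_threshold(venue_a))
print(h_index(venue_b), meets_threshold(venue_b))
```

The first venue matches the example in the text (10 articles with at least 10 citations each), while the second has too few well-cited articles to pass the threshold.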
4.4. Data Extraction and Answers to the Research Questions
In this section, we discuss and answer the general and specific questions asked in this study.
GQ1 What is the taxonomy for conversational agents in health?
To obtain a better comprehension of conversational agents in health, we created a taxonomy, as shown in Figure 4. We organized the taxonomy in nodes, each one covering a central concept related to the area. The nodes are further detailed in terms of attributes, describing how the state of the art in the field addresses each of the characteristics of a conversational agent in health.
The primary purpose of this taxonomy is to categorize recurrent ideas by organizing concepts and connections. The taxonomy characterizes agent features and, at the same time, considers contexts and original proposals.
We began the building process with an in-depth analysis of all articles to identify characteristics, patterns, and categories of the corpus. We then defined three central concepts, represented as nodes in the taxonomy, to analyze the (1) Interactions, (2) Dialog, and (3) Architectures attributes found in the literature.
The first node, Interactions, defines the contexts for using conversational agents in health, not focusing on the type of agent that acts in these contexts. The second node, Dialog, defines the types of approaches applied in the articles selected and all models of communication used by agents in health care. The third node, Architectures, categorizes the techniques and systems used in the selected articles.
Following each node, the taxonomy presents the corpus attributes related to each central concept. This detailing is presented later in the article, when we answer the specific questions.
Figure 4: Taxonomy of conversational agents
GQ2 What is the state of the art related to conversational agents in health?
The selected articles that could pertain to the state of the art were analyzed by the number of citations and relevance of the studies from 2008 to 2018, and are presented in Figure 5. They were ordered according to the h5-index metric, from the highest to the lowest score. We analyzed the articles according to the number of citations they have received in the Google Scholar database. Moreover, we calculated the average number of citations per year, considering the interval from the year of publication until 2018. This approach gives the reader an overview of the relevance of articles considered in this survey.
Figure 5: State of the Art articles considered in this survey with number of citations and average per year.
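The average number of citations per year described above is a simple ratio of total citations to the number of years between publication and 2018. The sketch below illustrates it; whether the interval counts the publication year itself is not stated in the text, so the `inclusive` flag is an assumption, and the example numbers are invented.

```python
LAST_YEAR = 2018  # end of the interval used in the survey


def citations_per_year(total_citations, publication_year, inclusive=True):
    """Average citations per year from the publication year until 2018.

    Assumption: with inclusive=True the publication year counts as a full
    year (2013..2018 is 6 years); with inclusive=False it does not (5 years).
    """
    years = LAST_YEAR - publication_year + (1 if inclusive else 0)
    if years <= 0:
        raise ValueError("interval must cover at least one year")
    return total_citations / years


# Hypothetical article: 120 citations, published in 2013.
print(citations_per_year(120, 2013))                    # 6-year interval
print(citations_per_year(120, 2013, inclusive=False))   # 5-year interval
```

Normalizing by age in this way keeps recent articles from being penalized simply for having had less time to accumulate citations.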
GQ3 What are the challenges related to conversational agents in health?
The last general question focuses on conversational agent challenges in the health context. In this sense, we sought to pose questions that were not answered in the corpus and could indicate future research directions. We summarized and clustered the results into five groups to facilitate interpretation, as shown in Table 4. Supporting this question, we discuss some challenges raised by selected articles related to conversational agents in health, focusing on the macro challenges and dividing them into a few groups.
Table 4: Challenges pertaining to conversational agents in health care.
Group   Challenges                          Article Id
G1      Dialog Generation                   A04, A10, A15, A24, A25, A28, A35, A36, A40.
G2      Integration with other technology   A26, A27, A30, A38.
G3      Elderly people                      A07, A17, A20, A34.
G4      User experience                     A01, A02, A03, A11, A16, A21, A22, A23, A31, A32, A33, A39.
G5      New approaches                      A05, A06, A08, A09, A12, A13, A14, A18, A19, A29, A37.
The first group (G1) is related to concerns regarding the generation and understanding of dialog by agents as open questions proposed to be solved in
ED
future work, as well as new natural language processing algorithms for comprehension of a user’s message. The second group (G2) is related to integration of conversational agents with other technologies. Additional challenges are re-
PT
lated to the adaptation and use of conversational agents for elderly people in our third group (G3). Challenges related to user experience regarding interactions and new interfaces are topics in group four (G4), referring to challenges in
CE
user context and user experience. Group five (G5) is related to new approaches involving methodologies and architectures.
AC
Technical challenges are quite common when discussing health agents. Concerns involving dialog generation and context in agents (López et al., 2008; Hudlicka, 2013; Jin et al., 2017), as well as new natural language processing algorithms for comprehending the user’s message (Wargnier et al., 2016; Ring et al., 2016a), are common themes. Natural language generation modules should be improved to handle ambiguities and variety in the context of the conversation (López et al., 2008). Generating text and speech responses in web environments can minimize patients’ costs of access to agents (Hudlicka, 2013). Improvements in voice recognition systems in hospital environments, as well as more elaborate responses, are part of the future work of Wargnier et al. (2016). The classification and identification of depression topics in user messages are also part of future work in the health area (Ring et al., 2016a).
The second group is related to the integration of agents with other technologies, such as electronic health records (EHRs) and medical documents, as well as integration with sensors and knowledge representation bases (Bickmore et al., 2009; Rentea et al., 2012). Hospital records should serve as a basis for agent consultations, rather than being kept in separate stations. Using EHRs jointly with assistants could help prevent diseases in the future by providing relevant information to the patient (Bickmore et al., 2009). Questions related to the adaptation and use of conversational agents for the elderly are very pertinent. In this scenario, the agents could act as reminders to assist elderly people with mental health problems (Nikitina et al., 2018). Furthermore, studies to improve the quality of care based on the reactions of elderly people to agents are of interest (Tokunaga et al., 2016). Multimodal architectures based on speech interactions for older people are another focus for future research (Shaked, 2017).
Challenges related to user experience regarding interactions and new interfaces were discussed in several articles (Alesanco et al., 2017; Karpagam & Saradha, 2014; Rizzo et al., 2011; Kowatsch et al., 2017; Lisetti et al., 2012; Magerko et al., 2011; Tanaka et al., 2017b; Kasinathan et al., 2017), which discussed improvements in communication techniques involving the user, the use of emotions in the dialog, and the design of interfaces that make the conversation more realistic. 3D interfaces for user interaction should be more widely adopted, offering multimodality to the agent (Karpagam & Saradha, 2014). Face-to-face interfaces will also be explored in future work to help in obesity treatment (Kowatsch et al., 2017). Moreover, new approaches to methodologies and architectures are considered challenges. Modeling human-agent interactions, exploring new hospital contexts (Zhang et al., 2017), and new imitation methods (Lokman & Zain, 2010) are some future challenges noted in the corpus. Recommendation systems may serve as a tool to detect dementia more easily (Tanaka et al., 2017a).
SQ4 What are the main contexts for interaction of conversational agents in health?
We investigated all article goals, domains, and contexts involving conversational agents in health. In this sense, Table 5 clarifies our findings.

Table 5: Environment interaction of Conversational agent.

| Node | Level 1 | Level 2 | Article Id |
|---|---|---|---|
| Interaction | Health Goals | Assistance | A03, A05, A06, A08, A10, A16, A18, A19, A21, A24, A25, A34. |
| | | Train | A01, A04, A11, A20, A22, A23, A30, A31, A38. |
| | | Elderly | A14, A17, A32, A35. |
| | | Diagnosis | A12, A27, A36, A39. |
| | | Education | A02, A13, A15, A26, A28, A33. |
| | | Prevention | A07, A09, A29, A37, A40. |
| | Health Contexts | Patient | A01, A03, A04, A05, A06, A07, A08, A09, A10, A12, A13, A14, A17, A18, A19, A20, A22, A23, A25, A26, A27, A28, A31, A33, A34, A35, A36, A37, A38, A39. |
| | | Physician | A11, A16, A29, A30, A32, A40. |
| | | Student | A02, A15, A21, A24. |
| | Health Domain Areas | Dermatology | A33. |
| | | Hospital | A01, A11, A18, A20, A26, A32, A34. |
| | | Therapy | A02, A03, A05, A14, A24, A30, A37. |
| | | Cardiology | A21. |
| | | Neurology | A19, A22, A23, A40. |
| | | Mindfulness | A04. |
| | | Endocrinology | A07, A36. |
| | | Nutrition | A06, A08, A38. |
| | | G.Practitioner | A09, A10, A12, A13, A15, A16, A17, A25, A27, A28, A29, A31, A35, A39. |

Training goals refer to coaching agents focused on enabling students to perform future tasks and consider new situations, or on assisting physicians to improve their daily practice. The aim of this first group is to provide training on real situations that medical students or health professionals may face in their daily lives (Hudlicka, 2013; Rizzo et al., 2011). In this sense, the agents can simulate patients in environments such as hospitals or emergencies in which health professionals must make critical decisions (López et al., 2008; Magerko et al., 2011). Education agents focus on teaching health professionals and patients to understand health concepts, and are sometimes described as a tool for improving patients’ health literacy (Bickmore et al., 2009). Some tools are based on mobile devices, focusing on questions and tips aimed at achieving better patient health (Tielman et al., 2017), or serve as tools for teaching medical students about specific pathologies (Chuah et al., 2013; Jin et al., 2017). Educator agents may act as breastfeeding counselors for mothers, providing interpersonal continuity of care (Zhang et al., 2014). The focus of the third group is on prevention, assisting in improving the relationships between patients and physicians (Rentea et al., 2012). Agents can help patients choose the most appropriate disease prevention pathway (Lokman & Zain, 2010), or support suicide prevention and coping with depression (Bresó et al., 2016). An application for preventive mental health care is presented by Hirano et al. (2017). Conversational agent assistants support physicians and patients in performing health-related daily tasks (Zhang et al., 2017; Ring et al., 2016b). In the case of patients, they can assist in therapeutic and cognitive treatments (McArthur et al., 2007; van Heerden et al., 2017). Assistants could be tailored to mental health issues, helping in recovery (D’Alfonso et al., 2017). The fifth group focuses on diagnosis, in which the agent helps patients and physicians to predict diseases from symptoms or behaviors. These conversational agents could work in a clinical environment or on a patient’s mobile device (Alesanco et al., 2017; Kasinathan et al., 2017). Combining natural language processing with knowledge-driven methods could provide a diagnostic capability that helps doctors make decisions (Ni et al., 2017). The health care field presents many studies of assistants focused on elderly users (Turunen et al., 2011), and the sixth group focuses on this task. Some studies seek to discover a useful relationship between agents’ conversational interfaces and older users (Heerink et al., 2010), and along similar lines, different types of avatars and integrations of agents with elderly people have been researched (Tokunaga et al., 2016). Physical activity counselors have shown satisfactory results, improving the health of elderly people, and seem a good alternative to explore further in the future (King et al., 2013).
We attempted to identify the contexts within which conversational agents act in the health field. In this sense, we identified three types of context in which they can act: patients, physicians, and students. Patient-centered applications have been developed, including agents focused on delivering diagnoses (van Heerden et al., 2017) and on assisting patients in understanding diagnostic methods and results (Bickmore et al., 2009). Agents may focus on treatment in areas such as mental health, therapies, and skin tags via smartphones or other devices (Alesanco et al., 2017; D’Alfonso et al., 2017; Lisetti et al., 2012; Nikitina et al., 2018). Conversational agents can also act in prevention with patients if they have access to patients’ health information, or they may offer assistance in overcoming physical inactivity and obesity (Turunen et al., 2011; Rentea et al., 2012). Agents can also assist in health literacy by helping patients to understand basic medical concepts (Bickmore et al., 2009). New chatbot approaches help patients choose the most appropriate disease prevention pathway by asking for different information (starting from a general level and moving to pathway-specific questions) and support the related prevention check-up and the final diagnosis (Amato et al., 2017). In the doctor context, the agent can act in health care institutions or in a ubiquitous manner through a mobile device. Agents may perform various tasks, such as serving as a support tool in the detection of patients with dementia or in controlling obesity. In addition to training professionals in the health environment, they can act as personal assistants to doctors (Kowatsch et al., 2017; Karpagam & Saradha, 2014; Rizzo et al., 2011). Agents in the student context may work with tutorials and practices involving medical education (Magerko et al., 2011; López et al., 2008). Moreover, students can be trained by agents, for example by being presented with real situations as problems related to anorexia nervosa (Sebastian & Richards, 2017) or by participating in experiments conducted by agents (Chuah et al., 2013).
We also listed the most representative domains where conversational agents act, which revealed considerable variability in the domains in which this technology can be used. Agents have been developed to interact in many health areas, helping in hospitals and emergency rooms (Chuah et al., 2013; Magerko et al., 2011), as well as serving, for example, as an aid to self-care for depression (Ring et al., 2016b) and nutrition (Turunen et al., 2011). Guidelines for prescription medications in dermatology are delivered to health professionals by a conversational agent (Alesanco et al., 2017).
SQ5 What are the main dialogue components used by conversational agents in health?
The primary objective behind this question is to identify and classify all dialogue components used in our corpus. The main components of conversational agent dialogues found in our review are presented in Table 6.

Table 6: Conversational agent dialogue type.

| Node | Level 1 | Level 2 | Article Id |
|---|---|---|---|
| Dialog | Dialog Types | Dialog Generation | A11, A20, A22, A28, A29, A32. |
| | | Dialog Planner | A06, A09, A13, A31, A35, A37. |
| | | Dialog Engine | A02, A03, A05, A07, A14, A15, A23, A24, A25, A26, A30, A39, A40. |
| | | Dialog Management | A01, A04, A08, A10, A12, A16, A17, A18, A19, A21, A27, A33, A34, A36, A38. |
| | Agent Types | Counseling | A03, A04, A05, A06, A07, A09, A10, A11, A13, A14, A15, A16, A17, A19, A21, A25, A26, A27, A29, A30, A31, A32, A33, A35, A37, A38, A39, A40. |
| | | Coach | A01, A02, A04, A08, A12, A18, A20, A22, A23, A28, A34, A36. |
| | Communication models | Multimodal | A02, A04, A05, A07, A09, A11, A12, A13, A15, A16, A17, A19, A20, A21, A22, A26, A27, A29, A31, A32, A36. |
| | | Speech | A10, A14, A24, A25. |
| | | Text | A01, A03, A06, A08, A18, A23, A28, A30, A33, A34, A35, A37, A38, A39, A40. |

Dialog models and different types of dialog representation were classified in this section. Regarding dialog types, the most representative in our sample was dialog management, which is responsible for managing the state of the dialog during interactions between an agent and an individual (López et al., 2008). The dialog engine parses the input and computes a reply for the end user (Amato et al., 2017). Dialog generation produces replies to users through machine learning algorithms or statistical approaches (Jin et al., 2017). The main feature of the dialog planner is its use of a flexible plan, so that the dialog-based interaction can be conducted dynamically based on the knowledge that the system has acquired about the user (Lisetti et al., 2013).
Moreover, we have identified the main types of agents present in health dialogues: coaching agents and counseling agents. Coaching agents are used in general to train physicians or improve their handling of daily tasks and, in some cases, to train patients to learn a new routine (Magerko et al., 2011). Coaching agents can provide brief interventions to change the behavior of an individual, or attempt to influence an individual to take better actions in the future (Yaghoubzadeh et al., 2013). Counselors can be used as training tools that support improved health care for patients suffering mainly from emotional health problems, or that improve physician and nurse working conditions (Hudlicka, 2013). Counselor agents are socially engaged, building alliances with their patients and encouraging them to help improve their own health (Johnson et al., 2004). This classification is presented in Table 6.
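The dialog management role described in this subsection, namely keeping track of the state of the dialog across turns, can be illustrated with a minimal slot-filling sketch. The class names, slot names, and prompts below are hypothetical and are not taken from any surveyed system; real dialog managers would also interpret the user's message rather than store it verbatim.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogState:
    """State tracked across turns: slot values and the raw user messages."""
    slots: Dict[str, Optional[str]]
    history: List[str] = field(default_factory=list)


class DialogManager:
    """Asks one question per empty slot, in the order the prompts were given."""

    def __init__(self, prompts: Dict[str, str]) -> None:
        self.prompts = prompts
        self.state = DialogState(slots={name: None for name in prompts})

    def next_prompt(self) -> Optional[str]:
        # Return the question for the first unfilled slot, or None when done.
        for name, value in self.state.slots.items():
            if value is None:
                return self.prompts[name]
        return None

    def receive(self, user_message: str) -> Optional[str]:
        # Record the turn, fill the first empty slot, then pick the next question.
        self.state.history.append(user_message)
        for name, value in self.state.slots.items():
            if value is None:
                self.state.slots[name] = user_message.strip()
                break
        return self.next_prompt()


# Hypothetical two-question intake dialog.
manager = DialogManager({
    "symptom": "What symptom are you experiencing?",
    "duration": "How long has it lasted?",
})
first_question = manager.next_prompt()
second_question = manager.receive("headache")
```

Keeping the state in one object separate from the prompts is a design choice that lets a planner or generator component, as distinguished in Table 6, read the same state without owning it.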
Furthermore, we present the main communication models found. Interaction by text can work through text messages or web platforms, and can create conversations similar to interactions in social networks (Alesanco et al., 2017). Consequently, many studies seek to improve this type of interaction, such as Kowatsch et al. (2017), who aim to design text message interfaces specifically to support interactions between patients and physicians. Text dialog is helpful when combined with social networks, as it is familiar to most users (van Heerden et al., 2017). Text techniques were emphasized by Fadhil (2018), who compared different dialogue styles, contrasting plain text with approaches using emojis in conversations. Also, a new dialog architecture was proposed for conducting a dialogue between a human and an agent on health-related topics, in which each component performs a set of tasks that enable the agent to engage in a dialogue (Baskar & Lindgren, 2014).
Speech interactions can help elderly people in many situations as a more natural form of interaction (Ring et al., 2016a). Some studies apply speech methods in robots to provide a better interface for interactions with users (Heerink et al., 2010). In another study, by López et al. (2008), speech interactions were used to train students in realistic situations driven by virtual patients who act as ordinary patients manifesting different symptoms, encouraging students to think of possible correlated diseases.
Multimodal dialogues may include text and speech interactions mixed with other techniques. In our sample, multimodal architectures were dominant. Some researchers, such as Tanaka et al. (2017b), argue that multimodal interactions may help people overcome social difficulties by improving their narrative, conversation, and social skills. The work by Wargnier et al. (2016) aimed to create a multimodal agent to support elderly people at a hospital using a text-to-speech method, converting text messages to voice using a specific engine. This type of approach is beneficial to elderly people or people who have problems reading text messages. There are also multimodal dialogues that combine different types of interactions, such as text, speech, hand gestures, and facial gestures. This can be seen in the study by Hudlicka (2013), which aims to create a personal connection with a user through verbal and non-verbal interactions. Facial and gesture models are considered highly beneficial for establishing a level of realism with patients by using virtual characters. The intent is to apply meditation techniques after this system is trained using a set of pedagogical strategies to support all user questions and behaviors. In some cases, agents based on multimodal interactions are intended to improve understanding when users are not familiar with a specific technology (Bickmore et al., 2009). This type of model can also assist in training individuals in medical skills through gestures and facial interactions (Rizzo et al., 2011).
SQ6 In terms of architecture, what are the main systems and techniques used?
To define the architecture construct, we divided this node into two subnodes, characterized by the types of systems and techniques. In this taxonomy node, not all 40 articles were classified, since some articles did not present the techniques or systems employed. The taxonomy also yielded various techniques for processing natural language, such as statistical methods and machine learning methods. All the points discussed are presented in Table 7.

Table 7: Conversational agent Architecture.

| Node | Level 1 | Level 2 | Article Id |
|---|---|---|---|
| Architectures | Techniques | Wizard of Oz | A13, A14, A15, A17, A19, A23, A36. |
| | | Reinforcement Learning | A10. |
| | | RBES | A22. |
| | | CNN | A29. |
| | | Pattern Matching | A03, A09, A11, A18, A25, A27, A30, A32, A33, A35, A39. |
| | | SVM | A12. |
| | | Semantic-based | A04, A06, A08, A20, A24, A28, A37, A38. |
| | | Word2Vec | A40. |
| | | AIML | A16. |
| | Systems | Free TTS | A10, A16, A31. |
| | | BEAT | A05, A26. |
| | | Pocket Sphinx | A25. |
| | | Sonic | A11. |
| | | Mobile Coach | A39. |
| | | Kinect | A36. |
| | | Dialogflow | A33, A35. |
| | | Watson | A30. |
| | | Blender | A02. |
| | | ACKTUS | A28. |
| | | MDDagent | A01. |
| | | Snack | A12. |
| | | iCat | A14. |
| | | Google C. NLP | A20. |
| | | Java Beans | A21. |
| | | Telegram | A23. |
| | | OpenDial | A32. |

Techniques such as convolutional neural networks (CNNs) have been applied in the health context to assist conversational agents in tasks such as identifying users’ intent from their questions and interpreting their inquiries about the provided results (Jin et al., 2017). The Wizard of Oz technique is an efficient way to examine user interaction with computers and allows rapid iterative development of dialog wording and logic. Some experiments have used this method to evaluate the attention-monitoring performance of the Louise agent and for interactive development tests (Wargnier et al., 2016). More complex approaches, such as Markov chains and reinforcement learning (RL), were used in a virtual counseling system that delivers brief alcohol health interventions through spoken dialogue interactions (Yasavur et al., 2014). Alternatively, a rule-based expert system (RBES) model was used by Kasinathan et al. (2017) to represent dialog patterns, receiving inputs from patients in the form of text and extracting keywords to interpret meaning and to produce meaningful responses. Using inference, this approach provides a diagnosis of possible diseases based on the existing symptoms entered by users.
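The rule-based flow described above (keyword extraction from a text message, followed by inference over symptom rules) can be sketched roughly as follows. This is not the system of Kasinathan et al. (2017); the rule table, the condition names, and the response wording are hypothetical placeholders and carry no medical meaning.

```python
import re
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy rules (NOT medical knowledge): each placeholder condition
# lists the symptom keywords that must all be present for the rule to fire.
RULES: Dict[str, FrozenSet[str]] = {
    "condition_a": frozenset({"fever", "cough"}),
    "condition_b": frozenset({"headache", "nausea"}),
}
VOCABULARY: Set[str] = set().union(*RULES.values())


def extract_keywords(message: str) -> Set[str]:
    """Keep only the words of the message that appear in the rule vocabulary."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return {token for token in tokens if token in VOCABULARY}


def infer_conditions(symptoms: Set[str]) -> List[str]:
    """Return every condition whose required symptoms are all present."""
    return sorted(name for name, required in RULES.items() if required <= symptoms)


def respond(message: str) -> str:
    """Turn a free-text message into a reply listing the matching conditions."""
    matches = infer_conditions(extract_keywords(message))
    if not matches:
        return "I could not match your symptoms to a known pattern."
    return "Possible matches: " + ", ".join(matches)


reply = respond("I have a fever and a bad cough")
```

Because the rules are plain data, a domain expert could in principle extend the table without touching the matching logic, which is one commonly cited appeal of rule-based designs.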
Moreover, almost all studies are inclined to use natural language processing with different types of models, such as bag-of-words. Especially in the health area, these techniques could be combined with a word2vec model pre-trained on a large dataset of medical documents. Additionally, medical documents can be represented by ontologies stored in OWL structures, permitting reasoning techniques to extract implicit knowledge (Bickmore et al., 2011).
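As a rough illustration of the bag-of-words model mentioned above, the sketch below counts word occurrences and compares two messages by cosine similarity. The function names and the tokenization rule are assumptions made for this example; the surveyed systems may represent text differently, and a word2vec model would replace the raw counts with learned dense vectors.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Map each lowercase alphanumeric token to its number of occurrences."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user message compared with a hypothetical stored question.
user_vector = bag_of_words("I have had a headache since yesterday")
stored_vector = bag_of_words("How long have you had the headache?")
score = cosine_similarity(user_vector, stored_vector)
```

Bag-of-words ignores word order and synonyms, which is one reason embedding models such as word2vec are often layered on top of it.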
Finally, the last sub-node covers systems, that is, the frameworks and applications used to support research on conversational agents in health. In this sense, we list the most representative tools adopted by researchers in our corpus. Multimodal studies usually have a speech feature in their architectures mixed with other components. Pocket Sphinx is a speech recognition system used with a US-English acoustic model and dictionary to provide a grammar-based analysis tool. Facial systems such as Kinect are commonly used to classify expressions and assist in emotional analysis (Wargnier et al., 2016). Text-to-speech is a recurrent task in conversational agent studies and can also involve hand gestures, body posture shifts, gaze shifts, eyebrow movement, and head nods. In this sense, the BEAT system could help to interpret these gestures automatically, passing the translation to a text-to-speech tool (Wells et al., 2015). APIs for text dialog, such as the Dialogflow system, could help in NLP and NLU tasks (Alesanco et al., 2017).
5. Challenges and Future Directions
Based on our research questions, we discuss potential future studies in the health care field. Regarding diseases, future research is envisioned that focuses on enabling agents to deliver individualized therapy suggestions based on linguistic analysis of interactions or postings by patients (D’Alfonso et al., 2017). Moreover, agents should be trained with more data, using crowdsourced information, to help in the reminiscence field, which studies the mental, social, and emotional well-being of older adults and requires enhanced data to improve sustained support for the elderly (Nikitina et al., 2018).
In another context, improving agent features is considered a future challenge. Facial and gesture techniques need improvement to include hard-of-hearing users, in addition to generally improving agent social skills in human-computer interactions, focusing on the types of agents and feedback that produce the most benefits (Tanaka et al., 2017b; Karpagam & Saradha, 2014). Moreover, tools are being developed to help health care specialists insert and manage content through dialog with agents (Sebastian & Richards, 2017). Future work on natural language generators that handle context in linguistic phenomena such as anaphora and deixis should be pursued, along with the creation of rules for interpreting ambiguities to improve dialog management (López et al., 2008). In this context, assistants with access to EHR data sources could be developed for preventive tasks, using patient information to change patient behavior in some instances (Rentea et al., 2012). Extending training sets to test and optimize techniques such as speech recognition with more samples is also considered a future challenge, and further evaluation and improvement are recommended (Yasavur et al., 2014).
In the health care field, there are efforts focusing on allowing patients to extract their information directly from a hospital’s electronic medical records, making this process faster (Bickmore et al., 2009). Investigating the patient perception of agents and evaluating the results in a controlled group is another future work proposed for this field (Kowatsch et al., 2017). Disclosing test results through chat messages is an excellent option to engage a user. Toward this end, convolutional neural networks or other automated systems are capable of quickly determining positive, negative, or invalid results, and sending a result to the patient (van Heerden et al., 2017).
6. Conclusion
The present study discusses different aspects of conversational agents in the health care field. We list research questions to obtain specific answers regarding our subject, thus qualifying the information of this survey. Initially, we provide a taxonomy clarifying concepts related to agents restricted to the specific delimited context. In our taxonomy, we derived many groups based on their similarities and relationships, led by three central concepts: Architectures, Dialogs, and Interactions. From the conception of the proposed taxonomy, the diversity of relationships and their ramifications can be observed in relation to conversational agents in health applications. We classified our sample by citation number to determine the articles considered state of the art in our survey. We analyzed metrics such as article quality based on citations received since the year of publication. From the taxonomy created, we were able to classify the conversational agent goals into six representative classes: training, education, prevention, assistance, diagnosis, and elderly assistance. In the same context, we identified the major types of dialog interaction in health care through agents, dividing them into text, speech, and multimodal interactions. In addition, we evaluated the main contexts in which conversational agents act according to our corpus, identifying patients, physicians, and students as the most relevant.
Our research has some limitations. We included two determinant criteria for our results in the research, which were health in general and multimodal models. These criteria directed our search string to results in both selected fields. However, the research questions were proposed to seek aspects relevant to the health field, so some excess literature resulted. The constrained time period limited the survey to some extent, as our focus was on recent articles only. Our approach evaluated articles published in the last ten years, seeking the latest methodologies and architectures. Another limitation was that we focused on scientific articles, not considering commercial tools, patents, or software.
Regarding challenges and concerns associated with conversational agents in the health field, we identified a number of potential future works related to new health context areas and to new proposals such as health literacy, which is expected to become a trend based on discussions in some of the articles surveyed. This trend is expected to grow in the health care environment based on increased interest in providing medical education to patients and students. Another concern involved improvements to conversational agents related to interactions, interfaces, and models of learning, with a focus on facilitating user engagement. Finally, the elderly were identified as a challenge to conversational agent adoption in different environments, requiring further study.
Acknowledgment
The authors would like to thank the Brazilian National Council for Scientific and Technological Development - CNPq (Grant Numbers 303640/2017-0 and 405354/2016-9) for supporting this work.
References
Abdul-Kader, S. A., & Woods, J. (2015). Survey on chatbot design techniques in speech conversation systems. International Journal of Advanced Computer Science and Applications, 6, 72–80.
Alesanco, Á., Sancho, J., Gilaberte, Y., Abarca, E., & García, J. (2017). Bots in messaging platforms, a new paradigm in healthcare delivery: application to custom prescription in dermatology. In European Medical and Biological Engineering Conference Nordic-Baltic Conference on Biomedical Engineering and Medical Physics 2017 (pp. 185–188). Springer.
Amato, F., Marrone, S., Moscato, V., Piantadosi, G., Picariello, A., & Sansone, C. (2017). Chatbots meet ehealth: automatizing healthcare, . 14, 381–388.
Amrhein, A., Cyra, K., & Pitsch, K. (2016). Processes of reminding and requesting in supporting people with special needs: Human practices as basis for modeling a virtual assistant? In EDIA@ ECAI (pp. 14–19).
Aramaki, E., Shikata, S., Miyabe, M., & Kinoshita, A. (2016). Vocabulary size in speech may be an early indicator of cognitive impairment. PloS one, 11, 13.
Baskar, J., & Lindgren, H. (2014). Cognitive architecture of an agent for human-agent dialogues. In International Conference on Practical Applications of Agents and Multi-Agent Systems (pp. 89–100). Springer.
Bickmore, T. W., Pfeifer, L. M., & Jack, B. W. (2009). Taking the time to care: empowering low health literacy hospital patients with virtual nurse agents. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 1265–1274). ACM.
Bickmore, T. W., Schulman, D., & Sidner, C. L. (2011). A reusable framework for health counseling dialogue systems based on a behavioral medicine ontology. Journal of biomedical informatics, 44, 183–197.
Bresó, A., Martínez-Miranda, J., Botella, C., Baños, R. M., & García-Gómez, J. M. (2016). Usability and acceptability assessment of an empathic virtual agent to prevent major depression. Expert Systems, 33, 297–312.
Chuah, J. H., Robb, A., White, C., Wendling, A., Lampotang, S., Kopper, R., & Lok, B. (2013). Exploring agent physicality and social presence for medical team training. Presence: Teleoperators and Virtual Environments, 22, 141–170.
D’Alfonso, S., Santesteban-Echarri, O., Rice, S., Wadley, G., Lederman, R., Miles, C., Gleeson, J., & Alvarez-Jimenez, M. (2017). Artificial intelligence-assisted online social therapy for youth mental health. Frontiers in psychology, 8, 796.
D’Haro, L. F., Niculescu, A. I., Cai, C., Nair, S., Banchs, R. E., Knoll, A., & Li, H. (2017). An integrated framework for multimodal human-robot interaction. In Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2017 (pp. 076–082). IEEE.
Eisman, E. M., Navarro, M., & Castro, J. L. (2016). A multi-agent conversational system with heterogeneous data sources access. Expert Systems with Applications, 53, 172–191.
Fadhil, A., & Gabrielli, S. (2017). Addressing challenges in promoting healthy lifestyles: the al-chatbot approach. In Proceedings of the 11th EAI International Conference on Pervasive Computing Technologies for Healthcare (pp.
CR IP T
261–265). ACM. Fadhil, S. G. W. Y. . Y. B. A., A. (2018). The effect of emojis when interacting
with conversational interface assisted health coaching system. In In Pro-
ceedings of the 12th EAI International Conference on Pervasive Computing Technologies for Healthcare. ACM.
AN US
Fraser, K. C., Meltzer, J. A., & Rudzicz, F. (2016). Linguistic features identify alzheimers disease in narrative speech. Journal of Alzheimer’s Disease, 49 , 407–422.
Galvao, A. M., Barros, F. A., Neves, A. M., & Ramalho, G. L. (2004). Personaaiml: An architecture developing chatterbots with personality. In Proceedings
M
of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3 (pp. 1266–1267). IEEE Computer Society.
ED
Ganesh, S., Harsha, S., Pingali, P., & Verma, V. (2008). Statistical transliteration for cross language information retrieval using hmm alignment model and crf. In Proceedings of the 2nd Workshop on Cross Lingual Information
PT
Access.
Google (2019). Google scholar metrics. https://scholar.google.com/intl/en/scholar/metrics.htmlmetrics. https://scholar.google.com/intl/en/scholar/metrics.html#
CE
URL:
metrics.
AC
van Heerden, A., Ntinga, X., & Vilakazi, K. (2017). The potential of conversational agents to provide a rapid hiv counseling and testing services. In the Frontiers and Advances in Data Science (FADS), 2017 International Conference on (pp. 80–85). IEEE.
Heerink, M., Kr¨ ose, B., Evers, V., & Wielinga, B. (2010). Relating conversa-
35
ACCEPTED MANUSCRIPT
tional expressiveness to social presence and acceptance of an assistive social robot. Virtual reality, 14 , 77–84. Herbert, D., & Kang, B. H. (2018). Intelligent conversation system using multi-
CR IP T
ple classification ripple down rules and conversational context. Expert Systems with Applications, .
Hirano, M., Ogura, K., Kitahara, M., Sakamoto, D., & Shimoyama, H. (2017).
Designing behavioral self-regulation application for preventive personal mental healthcare. Health psychology open, 4 , 2055102917707185.
AN US
Hudlicka, E. (2013). Virtual training and coaching of health behavior: Example
from mindfulness meditation training. Patient education and counseling, 92 , 160–166.
Iftene, A., & Vanderdonckt, J. (2016). Moocbuddy: a chatbot for personalized learning with moocs. In RoCHI–International Conference on Human-
M
Computer Interaction (p. 91). Public Library of Science.
Jacquet, B., Masson, O., Jamet, F., & Baratgin, J. (2018). On the lack of prag-
ED
matic processing in artificial conversational agents. In International Conference on Human Systems Engineering and Design: Future Trends and Applications (pp. 394–399). Springer.
Jin, L., White, M., Jaffe, E., Zimmerman, L., & Danforth, D. (2017). Combining cnns and pattern matching for question interpretation in a virtual patient dialogue system. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (pp. 11–21).
Johnson, W., LaBore, C., & Chiu, Y.-C. (2004). A pedagogical agent for psychosocial intervention on a handheld computer. In AAAI Fall Symposium on Dialogue Systems for Health Communication (pp. 22–24).
Kar, R., & Haldar, R. (2016). Applying chatbots to the internet of things: Opportunities and architectural elements. arXiv preprint arXiv:1611.03799 , (p. 9).
Karpagam, K., & Saradha, A. (2014). An intelligent conversation agent for health care domain. ICTACT Journal on Soft Computing, 4 .
Kasinathan, V., Xuan, F. S., Wahab, M. H. A., & Mustapha, A. (2017). Intelligent healthcare chatterbot (hecia): Case study of medical center in malaysia. In Open Systems (ICOS), 2017 IEEE Conference on (pp. 32–37). IEEE.
King, A. C., Bickmore, T. W., Campero, M. I., Pruitt, L. A., & Yin, J. L. (2013). Employing virtual advisors in preventive care for underserved communities: results from the compass study. Journal of health communication, 18 , 1449–1464.
Knuepffer, C. (2015). Chat-bots for people with parkinson's disease: Science fiction or reality? In Driving Reform: Digital Health is Everyone's Business: Selected Papers from the 23rd Australian National Health Informatics Conference (HIC 2015) (p. 128). IOS Press volume 214.
Kowatsch, T., Nißen, M., Shih, C.-H. I., Rüegger, D., Volland, D., Filler, A., Künzler, F., Barata, F., Hung, S., Büchter, D. et al. (2017). Text-based healthcare chatbots supporting patient and health professional teams: Preliminary results of a randomized controlled trial on childhood obesity. In Persuasive Embodied Agents for Behavior Change (PEACH2017). ETH Zurich.
Laranjo, L., Dunn, A. G., Tong, H. L., Kocaballi, A. B., Chen, J., Bashir, R., Surian, D., Gallego, B., Magrabi, F., Lau, A. et al. (2018). Conversational agents in healthcare: a systematic review. Journal of the American Medical Informatics Association, .
Lisetti, C., Amini, R., Yasavur, U., & Rishe, N. (2013). I can help you change! an empathic virtual agent delivers behavior change health interventions. ACM Transactions on Management Information Systems (TMIS), 4 , 19.
Lisetti, C. L., Yasavur, U., De Leon, C., Amini, R., Visser, U., & Rishe, N. (2012). Building an on-demand avatar-based health intervention for behavior change. In FLAIRS Conference.
Lokman, A. S., & Zain, J. M. (2010). Chatbot enhanced algorithms: A case study on implementation in bahasa malaysia human language. In International Conference on Networked Digital Technologies (pp. 31–44). Springer.
López, V., Eisman, E. M., & Castro, J. L. (2008). A tool for training primary health care medical students: The virtual simulated patient. In Tools with Artificial Intelligence, 2008. ICTAI’08. 20th IEEE International Conference on (pp. 194–201). IEEE volume 2.
Magerko, B., Deen, J., Idnani, A., Pantalon, M., & D’Onofrio, G. (2011). Dr. vicky: A virtual coach for learning brief negotiated interview techniques for treating emergency room patients. In AAAI Spring Symposium: AI and Health Communication.
McArthur, S. D., Davidson, E. M., Catterson, V. M., Dimeas, A. L., Hatziargyriou, N. D., Ponci, F., & Funabashi, T. (2007). Multi-agent systems for power engineering applications-part i: Concepts, approaches, and technical challenges. IEEE Transactions on Power systems, 22 , 1743–1752.
Ni, L., Lu, C., Liu, N., & Liu, J. (2017). Mandy: Towards a smart primary care chatbot application. In International Symposium on Knowledge and Systems Sciences (pp. 38–52). Springer.
Nikitina, S., Callaioli, S., & Baez, M. (2018). Smart conversational agents for reminiscence. arXiv preprint arXiv:1804.06550 , 1 , 6.
Novielli, N. (2010). Hmm modeling of user engagement in advice-giving dialogues. Journal on Multimodal User Interfaces, 3 , 131–140.
Ochs, M., De Montcheuil, G., Pergandi, J.-M., Saubesty, J., Pelachaud, C., Mestre, D., & Blache, P. (2017). An architecture of virtual patient simulation platform to train doctor to break bad news. In Conference on Computer Animation and Social Agents (CASA).
Petticrew, M., & Roberts, H. (2008). Systematic reviews in the social sciences: A practical guide. John Wiley & Sons.
Rentea, V., Rentea, M., Isaroiu, M., Bogdan, N., & Ioanitescu, R. (2012). Prevention assistant–risk evaluation based on sparse data. In Emerging Intelligent Data and Web Technologies (EIDWT), 2012 Third International Conference on (pp. 158–165). IEEE.
Ring, L., Bickmore, T., & Pedrelli, P. (2016a). An affectively aware virtual therapist for depression counseling. In ACM SIGCHI Conference on Human Factors in Computing Systems (CHI) workshop on Computing and Mental Health.
Ring, L., Bickmore, T., & Pedrelli, P. (2016b). Real-time tailoring of depression counseling by conversational agent. Iproceedings, 2 , e27.
Rizzo, A., Kenny, P., & Parsons, T. D. (2011). Intelligent virtual patients for training clinical skills. JVRB-Journal of Virtual Reality and Broadcasting, 8 , 9.
Roehrs, A., da Costa, C. A., da Rosa Righi, R., & de Oliveira, K. S. F. (2017). Personal health records: a systematic literature review. Journal of medical Internet research, 19 , 21.
Sebastian, J., & Richards, D. (2017). Changing stigmatizing attitudes to mental health via education and contact with embodied conversational agents. Computers in Human Behavior , 73 , 479–488.
Shaked, N. A. (2017). Avatars and virtual agents–relationship interfaces for the elderly. Healthcare technology letters, 4 , 83.
Stapić, Z., López, E. G., Cabot, A. G., de Marcos Ortega, L., & Strahonja, V. (2012). Performing systematic literature review in software engineering. In CECIIS 2012-23rd International Conference. ACM.
Tanaka, H., Adachi, H., Ukita, N., Ikeda, M., Kazui, H., Kudo, T., & Nakamura, S. (2017a). Detecting dementia through interactive computer avatars. IEEE journal of translational engineering in health and medicine, 5 , 1–11.
Tanaka, H., Negoro, H., Iwasaka, H., & Nakamura, S. (2017b). Embodied conversational agents for multimodal automated social skills training in people with autism spectrum disorders. PloS one, 12 , 15.
Tielman, M. L., Neerincx, M. A., van Meggelen, M., Franken, I., & Brinkman,
W.-P. (2017). How should a virtual agent present psychoeducation? influence of verbal and textual presentation on adherence. Technology and Health Care, (pp. 1–16).
Tokunaga, S., Horiuchi, H., Tamamizu, K., Saiki, S., Nakamura, M., & Yasuda, K. (2016). Deploying service integration agent for personalized smart elderly care. In Computer and Information Science (ICIS), 2016 IEEE/ACIS 15th International Conference on (pp. 1–6). IEEE.
Torous, J., Nicholas, J., Larsen, M. E., Firth, J., & Christensen, H. (2018). Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements. Evidence-based mental health, 21 , 116–119.
Turner, M., Kitchenham, B., Brereton, P., Charters, S., & Budgen, D. (2010). Does the technology acceptance model predict actual use? a systematic literature review. Information and Software Technology, 52 , 463–479.
Turunen, M., Hakulinen, J., Ståhl, O., Gambäck, B., Hansen, P., Gancedo, M. C. R., de La Cámara, R. S., Smith, C., Charlton, D., & Cavazza, M. (2011). Multimodal and mobile conversational health and fitness companions. Computer Speech & Language, 25 , 192–209.
Wang, C., Bickmore, T., Bowen, D. J., Norkunas, T., Campion, M., Cabral, H., Winter, M., & Paasche-Orlow, M. (2015). Acceptability and feasibility of a virtual counselor (vicky) to collect family health histories. Genetics in Medicine, 17 , 822.
Wargnier, P., Carletti, G., Laurent-Corniquet, Y., Benveniste, S., Jouvelot, P., & Rigaud, A.-S. (2016). Field evaluation with cognitively-impaired older adults of attention management in the embodied conversational agent louise. In Serious Games and Applications for Health (SeGAH), 2016 IEEE International Conference on (pp. 1–8). IEEE.
Wargnier, P., Malaisé, A., Jacquemot, J., Benveniste, S., Jouvelot, P., Pino, M., & Rigaud, A.-S. (2015). Towards attention monitoring of older adults with cognitive impairment during interaction with an embodied conversational agent. In Virtual and Augmented Assistive Technology (VAAT), 2015 3rd IEEE VR International Workshop on (pp. 23–28). IEEE.
Wells, K. J., Vázquez-Otero, C., Bredice, M., Meade, C. D., Chaet, A., Rivera, M. I., Arroyo, G., Proctor, S. K., & Barnes, L. E. (2015). Acceptability of an embodied conversational agent-based computer application for hispanic women. Hispanic health care international: the official journal of the National Association of Hispanic Nurses, 13 , 179.
Yaghoubzadeh, R., Kramer, M., Pitsch, K., & Kopp, S. (2013). Virtual agents as daily assistants for elderly or cognitively impaired people. In International Workshop on Intelligent Virtual Agents (pp. 79–91). Springer.
Yasavur, U., Lisetti, C., & Rishe, N. (2014). Intelligent virtual agents and spoken dialog systems come together to deliver brief health interventions. Journal on Multimodal User Interfaces, in press, 1 , 19.
Zaveri, A., Rula, A., Maurino, A., Pietrobon, R., Lehmann, J., & Auer, S. (2016). Quality assessment for linked data: A survey. Semantic Web, 7 , 63–93.
Zhang, Z., Bickmore, T., Mainello, K., Mueller, M., Foley, M., Jenkins, L., & Edwards, R. A. (2014). Maintaining continuity in longitudinal, multimethod health interventions using virtual agents: the case of breastfeeding promotion. In International Conference on Intelligent Virtual Agents (pp. 504–513). Springer.
Zhang, Z., Bickmore, T. W., & Paasche-Orlow, M. K. (2017). Perceived organizational affiliation and its effects on patient trust: Role modeling with embodied conversational agents. Patient education and counseling, 100 , 1730–1737.
Author Contribution
Joao Luis Zeni Montenegro: Writing-Original draft preparation, Conceptualization, Methodology, Investigation. Cristiano André da Costa: Conceptualization, Methodology, Investigation, Writing-Reviewing and Editing, Supervision. Rodrigo da Rosa Righi: Methodology, Investigation, Writing-Reviewing and Editing.