Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 135 (2018) 315–322
www.elsevier.com/locate/procedia
3rd International Conference on Computer Science and Computational Intelligence 2018
Forming of Dyadic Conversation Dataset for Bahasa Indonesia
Cuk Tho a, Arden S. Setiawan a, Andry Chowanda a,∗
a Computer Science Department, School of Computer Science, Bina Nusantara University, Jl. KH. Syahdan No. 9, Jakarta 11480, Indonesia
Abstract
Computers assist humans in almost all aspects of daily life. Although computers outperform humans in some tasks, they remain socially ignorant tools: they neither understand nor are capable of holding a natural conversation with humans. To build a system that can naturally understand and communicate with humans, it is essential to train the system with natural conversation data. This paper proposes a dataset of natural dyadic conversations in the Indonesian language, for which the literature suggests that the number of available conversation datasets is inadequate. The dataset contains 3164 words (formal words as well as slang, i.e. informal non-standard words) annotated from the recordings of five groups; the largest number of words belongs to the Food topic (Group 3, 826 words) and the smallest to the Travelling topic (Group 2, 372 words). The dataset also contributes a pre-trained conversation model built with deep learning (LSTM). The model was trained for 10000 iterations with 128 batches and 4 hidden layers, resulting in a perplexity of 2.01.
© 2018 The Authors. Published by Elsevier Ltd.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Selection and peer-review under responsibility of the 3rd International Conference on Computer Science and Computational Intelligence 2018.
Keywords: Natural Conversation; LSTM; Indonesian Datasets; Corpus; Dyadic
1. Introduction
Computer technology has become an important aspect of human life and assists humans in most of their daily activities, such as working or doing chores. It also improves the quality of life, not only in daily life but also in the medical area 1,2,3. Consequently, the demand for sophisticated technology that can understand humans keeps growing. Since the computer is an unsocial machine, such tasks are not easy for it: it needs large amounts of data and advanced algorithms to be trained to understand humans. The computer, as a machine, needs to learn the human language. Machine learning follows the same process as parents teaching a child to understand what they want, and the input to that learning mechanism is called a dataset. Many techniques 4,5,6 deal well with the interaction between computers and human languages. However, the availability of datasets is limited (particularly publicly available ones), especially in Bahasa Indonesia 7. Moreover, the literature also suggests that there is an inadequate number of audio-visual datasets of dyadic conversation in Bahasa Indonesia.
∗ Corresponding author. Tel.: +62-21-534-5830
E-mail address: [email protected]
1877-0509 © 2018 The Authors. Published by Elsevier Ltd.
10.1016/j.procs.2018.08.179
Table 1. Datasets for Dyadic Communication
Dataset | Topic | # of Dialogues/Words | Language
Twitter Corpus 12 | Various Topics | 1.3M Dialogues | English
Sina Weibo Corpus 13 | Various Topics | 4.5M Dialogues | Chinese
Ubuntu Dialogue Corpus 14 | Ubuntu Topics | 930K Dialogues | English
Movie Dialog Dataset 15 | Film Topics | 3.1M Dialogues | English
Cornell Movie Dialogue Corpus 16 | Film Topics | 220K Dialogues | English
AMI Meeting Corpus 17 | Meeting Topics | 175 Dialogues | English
British National Corpus 18 | Various Topics | 10M Dialogues | English
Longman Spoken American Corpus 19 | Various Topics | 5M Dialogues | English
Cambridge and Nottingham Corpus 20 | Various Topics | 5M Dialogues | English
Indonesian News Corpus 7 | News Topics | 7.1B Words | Indonesian
Indonesian Chat Corpus 8 | Various Topics | 300 Dialogues | Indonesian
This paper is concerned with four aspects of producing a Bahasa Indonesia corpus: method, format, time, and equipment. The method is deep learning, in particular methods related to voice and datasets. The format is the file format of the voice recordings and the annotation results. The equipment, such as microphones and recording devices, is placed in three directions. To produce good-quality voice recordings, the room should be free from surrounding noise. This research gathered a basic dataset from 10 persons in groups of two. Each group discussed a certain topic, and the discussion was recorded for about 10 minutes. The purpose of the recording is to obtain a dataset of daily conversation in Bahasa Indonesia, which is then used to train the computer with a deep learning method. The literature suggests that there are not many datasets of natural dyadic human-human conversation in Bahasa Indonesia 8,7.
Dyadic is a term that describes the relationship between two persons. In psychology, dyadic analysis is part of the study of romantic relationships between couples 9. There are three approaches to analysing dyadic data: multi-dyadic analysis, affective analysis, and collaborative auto-ethnography 10. In computer technology, the study of dyadic interaction also covers converting voice into data that the computer can understand. Two things influence the quality of the voice: the vocal cords and articulation 11. If the computer can understand the meaning of the voice, it can carry out the instructions. In addition, with current technology, the computer can interact with and speak to humans according to the data provided.
Hence, this paper contributes a collection of dyadic conversation data in Bahasa Indonesia. More than 3100 words were annotated and evaluated with an LSTM (dual encoder, sequence-to-sequence) architecture. The results show a perplexity of 2.01 from training done with TensorFlow, with a total of 10000 iterations, 128 batches, and 4 hidden layers.
2. Related Work
2.1. Corpus for Natural Language Processing
The voice recordings are collected and used as input to train the computer. At the moment, few datasets or corpora are available for building models of natural dyadic communication, particularly in local languages such as Indonesian 7. Most of the available corpora are in English, and only a few are in Bahasa Indonesia. Another obstacle is that the dataset formats differ from one another, so additional work has to be done to align them. Table 1 shows some of the well-known publicly available datasets. It shows that the amount of dialogue data in Bahasa Indonesia is very small compared with English: 300 dialogues in Bahasa Indonesia versus up to 10 million dialogues in English. The English-language datasets cover various topics, namely Meeting, Film, Future, Holiday, School, etc. Most of the datasets were collected from social media (e.g. Twitter and Sina Weibo) or from dictionaries (e.g. British National, Longman, and Cambridge). The others were collected from various sources such as online newspapers, meetings, and movie dialogues. Moreover, some datasets offer audio and visual data in addition to text. This could help researchers from the social sciences and psychology to analyse the social and psychological aspects of the interaction in detail (e.g. the AMI Meeting Corpus). Currently, no dyadic conversation dataset in the Indonesian language is publicly available for research. The Indonesian News 7 and Indonesian Chat 8 corpora only provide text.
Fig. 1. Sequence to Sequence Model 6
2.2. Machine Learning
In learning, there are two important elements: learners and teachers. Learners learn from what the teacher gives. The same applies in machine learning, where the learner is the machine and the teacher is the provided dataset. The result of this learning activity is the basis on which the computer executes a command or even produces new knowledge. A learning strategy is also necessary to achieve the learning objective. Learning strategies can be categorized into two types, depending on the amount of inference the learner performs on the received information 11. The first category is no inference at all, and the second is significant inference. No inference at all means that a programmer inputs all the needed information into the computer. This gives the computer significant knowledge but without any inference. The second category allows the computer to learn new knowledge or even form new concepts, which requires significant inference. The computer can learn from the provided dataset, with which it is trained using a deep learning algorithm. There are many deep learning algorithms, but since dialogue is continuous and sequential, the appropriate algorithm is a Recurrent Neural Network (RNN), especially the Long Short-Term Memory (LSTM) model 6. An example is shown in Fig. 1, which depicts the LSTM using the title of Alice in Wonderland. Sutskever, Vinyals, & Le 6 introduced the dual encoder LSTM, in which expected questions and answers are trained together. First, each word of the question sentence h is converted into a vector using word2vec until the special character [EOS] is detected.
The weight W is updated and passed to the second network, which handles the correct answer (g) to question h (see Fig. 1 for details).
3. Methodology
An overview of the research activity is presented in Fig. 2. There are several stages: define, data collection, data annotation, data training, conversation model, and evaluation. In the define stage, the boundary of the research is defined, as well as the research object and methods. The second stage is data collection. At this stage, participants were recruited to join the recording sessions. The expected number of participants was around 60 to 80, but due to time and equipment constraints there were only 10 participants, giving five recordings of around eight to ten minutes each. Each session consisted of a recorded conversation between two participants (see Fig. 3). Several conversation topics (Food, Traveling, and Future) were provided, but participants were also allowed to use their own topic if they were not comfortable with the topics provided; as a result, one group discussed student activities instead. The interaction and conversation were recorded using two high-resolution mobile phones. During the recording, two microphones were also used, placed 30 cm away from the participants (see Fig. 4), to record the conversation specifically.
Fig. 2. Research Methodology
Fig. 3. Recording Sessions, Interlocutor A (left), and Interlocutor B (right)
The next step was data annotation, in which the annotators transcribed the recordings. The annotated data were processed into a dataset in JSON format, chosen because chatbot systems generally use JSON or XML. The next stage was data training, which had three sub-phases: first, training with the available Bahasa Indonesia dyadic communication dataset; second, training with the proposed Bahasa Indonesia dyadic communication dataset; and last, training with the combined datasets. The conversation model resulting from this training was evaluated at the end of the research and is described in the next section.
The participants were asked to have a discussion with one another. Each dialogue or interaction lasted around eight to ten minutes per session. A summary of the data for the five groups is presented in Table 2. Two groups (Group 1 and Group 4) discussed the same topic: Future. Group 1 focused on their future careers after finishing their studies at the university. Group 4 also talked a lot about their career plans after university, but additionally discussed the possibility of continuing their education with a Master's degree. Group 2 talked a lot about their travel experiences, and Group 3 had an interesting discussion about food and eating places around the campus. Finally, the last group chose their own topic and talked about several student activities, such as university assignments, exams, and lab activities.
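As an illustration of the annotation and pairing steps described in this section, the following minimal Python sketch parses one JSON record and turns it into a question-answer pair terminated by [EOS]. The record layout (the keys `topic`, `utterances`, `speaker`, `text`) and the helper names are assumptions made for this example, not the published schema of the dataset. Integer ids stand in for the word2vec vectors, and grouping consecutive utterances two by two is inferred from Table 2, where each group has half as many dialogues as utterances. The sample question and answer are the ones quoted in the Results section.

```python
import json

EOS = "[EOS]"  # end-of-sentence marker mentioned in Section 2.2

# Hypothetical annotation record; the key names are assumptions for illustration.
SAMPLE_RECORD = """
{
  "topic": "example",
  "utterances": [
    {"speaker": "A", "text": "Kamu makan apa nanti?"},
    {"speaker": "B", "text": "Tidak tahu nih"}
  ]
}
"""


def tokenize(text):
    """Lowercase, split on whitespace, strip edge punctuation, append [EOS]."""
    tokens = [word.strip(".,?!").lower() for word in text.split()]
    return [tok for tok in tokens if tok] + [EOS]


def to_pairs(record):
    """Group consecutive utterances two by two into (question, answer) pairs."""
    turns = record["utterances"]
    return [
        (tokenize(q["text"]), tokenize(a["text"]))
        for q, a in zip(turns[0::2], turns[1::2])
    ]


def build_vocab(pairs):
    """Assign an integer id to every token; id 0 is reserved for [EOS]."""
    vocab = {EOS: 0}
    for question, answer in pairs:
        for tok in question + answer:
            vocab.setdefault(tok, len(vocab))
    return vocab


record = json.loads(SAMPLE_RECORD)
pairs = to_pairs(record)
vocab = build_vocab(pairs)
# Integer ids are a stand-in for the word vectors fed to the encoder.
encoded = [([vocab[t] for t in q], [vocab[t] for t in a]) for q, a in pairs]

print(pairs)
print(encoded)
```

In a real pipeline the integer ids would index an embedding table (for example, one initialised from word2vec), and the encoded question and answer sequences would be fed to the encoder and decoder networks respectively.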
In total, 3164 words were annotated from the recordings; the largest number of words belongs to the Food topic (826 words) and the smallest to Travelling (372 words). Group 5 (Student activities) has the largest number of utterances and dialogues spoken during the session, 64 and 32 respectively, while Group 1 has the lowest, 20 and 10 respectively. The Food topic (Group 3, maximum of 238 words per utterance) and the Future: Career topic (Group 1, maximum of 135 words per utterance) have relatively long utterances in their dialogues. This happened when the interlocutors were trying to explain something to each other (see the first example described in the Results section). All groups have the same minimum number of words per utterance: one. These are usually greeting words such as hi, halo, or bye. Finally, a summary of the whole collected dataset is given in the Results section.
Table 2. Summary of the Recording (Datasets)
Group | Topic | # of Words | Max Words/Utterance | # of Utterances | # of Dialogues
1 | Future: Career | 786 | 135 | 20 | 10
2 | Traveling | 372 | 45 | 28 | 14
3 | Food | 826 | 238 | 22 | 11
4 | Future: Career & Study | 642 | 85 | 24 | 12
5 | Student activities | 538 | 23 | 64 | 32
- | SUMMARY | 3164 (TOTAL) | 105.2 (AVERAGE) | 158 (TOTAL) | 79 (TOTAL)
Fig. 4. Data Collection Settings
4. Results
As mentioned in the previous section, the language used in the conversations is not formal Bahasa Indonesia but the daily spoken language, especially that of young people, who mostly use slang. This is particularly beneficial for the Natural Language Processing research community. However, annotating the conversations was challenging and time-consuming: because annotation tools for Bahasa Indonesia are limited, the annotation for this paper was done manually. The annotated conversations were used as training data for the deep learning algorithm under test, a Dual Encoder Recurrent Neural Network, more specifically the Dual Encoder Long Short-Term Memory (LSTM) architecture 6,7. For example, the question is "Kamu makan apa nanti?" (What are you going to eat later?) and the answer is "Tidak tahu nih" (I don't know) (see Fig. 5).
Fig. 5. Example of Dual Encoder LSTM 6,7
The collected dyadic conversation dataset contains a total of 79 dialogues with 158 long and short utterances. The average of the maximum words per utterance across the groups is 105.2 words, meaning that the conversations were quite intimate. The largest maximum number of words per utterance belongs to the Food group (Group 3), where the
interlocutors tried to explain how they usually order or get their food. The second largest belongs to the first group, with the topic of future career, where one of the interlocutors described their career aspirations and compared them with the life of a family member. Some of the conversations are quite serious, such as those on careers and on how to eat clean and healthy, while others are relaxed and fun, such as those on traveling. Some conversation examples are illustrated below.
In the first example (topic: Travelling), the participants discussed a trip to a famous island in Indonesia, Komodo Island. Judging from the conversation style and non-verbal cues (e.g. facial expressions, gestures, etc.), both interlocutors already knew each other before the experiment. Hence, the conversation style in this session was informal Indonesian with many informal words such as kemaren, cmn, and aer. This can benefit natural language processing systems, which can then work with non-formal language, as is to be expected in natural conversation. In the second example (topic: Student Activities) the interlocutors also knew each other, so the language used was informal Indonesian as well (e.g. gue, gw, lu, ntar, belom).
Topic: Traveling
....
Interlocutor A: Kemaren lu ngapain aja ke pulau komodo? (What did you do on Komodo Island?)
Interlocutor B: Sebenernya gw ga ke pulau komodo doang. Gw ke .... itu ada komodo, pulau ... (Actually, I did not only go to Komodo Island. I went to [places that have] Komodos, the ... Island)
Interlocutor A: Terus itu lu absennya gimana? (How about your class attendance?)
Interlocutor A: Oh tenang aja kan jatah absen ada 3. Terus gw kemaren liat yang lu maen di dalem aer tuh. (Oh, don't worry, I have 3 attendance limits. I also saw [your picture] yesterday, the one where you were playing underwater.)
Interlocutor B: Itu beda lagi tuh bukan di pulau komodo. (That was different, it was not on Komodo Island.)
Interlocutor A: Jadi klo komodo itu galak ga? (Is the Komodo fierce?)
Interlocutor B: Reptil itu kan cmn ada di indonesia. Kalo digigit pusing sumpah. Jadi dilidah ada bakteri yang mematikan , gigitnya ga sakit cmn bakterinya itu. Di situ gaada rumah sakit tuh di pulau komodo, harus di bali. Kalo lu ga pernah jalan? (That reptile is only found in Indonesia. If you were bitten, you would feel dizzy. On the tongue, there are deadly bacteria. The bite is not painful, but the bacteria are. There is no hospital on Komodo Island. You must go to Bali. Did you go somewhere?)
Interlocutor A: Pernahnya ke puncak pass sih. cmn kena 1 way jadi gw pulang lagi deh. Tapi lu jalan mlu deh ajak dong sekali-kali. (I went to Puncak Pass, but there was one-way traffic, so I headed back. But you always travel, invite me someday.)
....
Topic: Student Activities
....
Interlocutor A: Gab, lu tau ga? Ntr gw UAP loh besok. Lu udah ngerti belom materinya? (Gab, did you know? Tomorrow is my laboratory exam. Have you mastered the subject?)
Interlocutor B: Udahlah orang gw udah UAP. (I have, because I had the exam already.)
Interlocutor A: Oh gimana UAPnya? (Oh, how was it?)
Interlocutor B: Kayak quiz aja ada inheritance gitu-gitu. Cuman kuiz kan ada 2 soal ini ada 1 soal. Cuman ada yang update, add, delete, sama sort of view. (Just like the quiz [given by the instructor], there was inheritance. In the quiz there were 2 questions, but this only had one. But there was update, add, delete, sort of view.)
Interlocutor A: Gw ga ngerti nih gimana dong? Besok UAP-nya lagi Ajarin dong. (I do not understand the subject. The exam is tomorrow. Please teach me.)
Interlocutor B: Belajar sendiri bisa lah. (You can learn it by yourself.)
Interlocutor A: Gw denger-denger jem 3 kita ada rapat ya, itu bahas apa lagi? (I heard we will have a meeting at 3 o'clock. What will be the meeting agenda?)
Interlocutor B: Itu bahas buat final day. Itu kan buat H-1. Kita kan hari ini H-1. (The agenda is to discuss the final day. It is for D-1; today is D-1.)
....
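The per-group figures in Table 2 can be recomputed from the transcribed utterances. The sketch below is one possible way to do so, assuming that words are whitespace-separated tokens and that a dialogue is a pair of consecutive utterances (both are assumptions; the paper does not state its counting rules). It is applied to the last four utterances of the Student Activities excerpt and then used to cross-check the totals reported in Table 2. The function name `summarize` is illustrative only.

```python
def summarize(utterances):
    """Compute Table 2 style statistics for a list of utterance strings."""
    lengths = [len(u.split()) for u in utterances]  # assumed: whitespace tokens
    return {
        "words": sum(lengths),
        "max_words_per_utterance": max(lengths),
        "utterances": len(utterances),
        "dialogues": len(utterances) // 2,  # assumed: two utterances per dialogue
    }


# Last four utterances of the Student Activities excerpt.
excerpt = [
    "Gw ga ngerti nih gimana dong? Besok UAP-nya lagi Ajarin dong.",
    "Belajar sendiri bisa lah.",
    "Gw denger-denger jem 3 kita ada rapat ya, itu bahas apa lagi?",
    "Itu bahas buat final day. Itu kan buat H-1. Kita kan hari ini H-1.",
]
stats = summarize(excerpt)
print(stats)

# Per-group values as reported in Table 2 (Groups 1 to 5).
words = [786, 372, 826, 642, 538]
max_words = [135, 45, 238, 85, 23]
n_utterances = [20, 28, 22, 24, 64]
n_dialogues = [10, 14, 11, 12, 32]

total_words = sum(words)
avg_max_words = sum(max_words) / len(max_words)
total_utterances = sum(n_utterances)
total_dialogues = sum(n_dialogues)
print(total_words, avg_max_words, total_utterances, total_dialogues)
```

The sums and the average computed from the per-group values reproduce the SUMMARY row of Table 2 (3164 words, an average maximum of 105.2 words per utterance, 158 utterances, and 79 dialogues).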
Fig. 6. Training Results with 10000 iterations
The training was conducted using Python and the TensorFlow API with a total of 10000 iterations, 128 batches, and 4 hidden layers. The results are shown in Fig. 6. At the beginning (1000th iteration), the model gives a perplexity of 17.04. Perplexity measures how uncertain the model is when predicting the data; the smaller the value, the better. At the 2000th iteration, the model gives a better result of 6.17. The result becomes stable from the 6000th iteration and finally reaches 2.01 at the 10000th iteration. The complete results at every 1000 iterations are Y = [17.04, 6.17, 4.03, 2.17, 2.06, 2.28, 2.17, 2.08, 2.01, 2.01]. This indicates that, with the experimental setting provided, 9000 to 10000 iterations are needed before the perplexity becomes constant; however, a different experimental setting could produce different results.
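For reference, perplexity is commonly computed as the exponential of the average negative log-likelihood that the model assigns to the reference tokens. The sketch below assumes this standard definition with natural logarithms (the paper does not state the exact formula used) and shows what the reported values correspond to in terms of per-token cross-entropy. The function name and the toy probabilities are illustrative only.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) of the reference tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# Toy check: a model that assigns probability 0.25 to every reference token
# is as uncertain as a uniform choice among 4 words.
uniform_four = perplexity([0.25] * 10)
print(round(uniform_four, 6))

# Perplexity values reported in the paper, one per 1000 iterations.
reported = [17.04, 6.17, 4.03, 2.17, 2.06, 2.28, 2.17, 2.08, 2.01, 2.01]
for step, ppl in enumerate(reported, start=1):
    # Under the assumed definition, cross-entropy is the log of the perplexity.
    print(f"iteration {step * 1000:>5}: perplexity {ppl:5.2f}, "
          f"cross-entropy {math.log(ppl):.3f} nats")
```

Under this definition, a perplexity of 2.01 means that the model is, on average, about as uncertain as if it were choosing uniformly between two candidate words at each step.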
5. Discussion and Future Work
This paper presents a dataset that can be used to train computers to interact with humans. Although it is not as large as the English datasets, it adds to the number of datasets available in Bahasa Indonesia. The Dual Encoder Long Short-Term Memory algorithm can be used to train on the conversation data. It is hoped that, along with this dataset, other research will produce annotation programs for Bahasa Indonesia. The dataset can also be used in further research on machine learning. It consists of more than 3100 words annotated from natural conversations between friends, the most popular topic being Food with 826 words. The dataset was used to train a model for natural conversation in the Indonesian language, resulting in a perplexity of 2.01. Moreover, the model could capture not only formal words but also slang (non-standard) words in the conversation, because the data were mostly gathered from conversations between friends. The dataset could be used as deep/machine learning data for Natural Language Processing (e.g. conversation, argumentation mining, summarization, sentence extraction, sentiment analysis, etc.). Although the number of words is not large, the dataset consists of natural conversations between friends in which many informal Indonesian words are captured. Finally, the trained conversation model is also publicly available for use in systems such as an Embodied Conversational Agent (ECA) 21,22 or a chatbot 22, so that the ECA or chatbot can understand and hold natural conversations with humans.
Acknowledgement
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research, and the support of the BINUS University Grant 2017 for the operation of this research.
References
1. Chowanda, A., Flintham, M., Blanchfield, P., Valstar, M. Playing with social and emotional game companions. In: International Conference on Intelligent Virtual Agents. Springer; 2016, p. 85–95.
2. Chowanda, A., Blanchfield, P., Flintham, M., Valstar, M. Computational models of emotion, personality, and social relationships for interactions in games. In: Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems; 2016, p. 1343–1344.
3. Chowanda, A., Blanchfield, P., Flintham, M., Valstar, M. Play smile game with erisa. In: IVA 2015, Fifteenth International Conference on Intelligent Virtual Agents. 2015.
4. Mikolov, T., Chen, K., Corrado, G., Dean, J. Efficient estimation of word representations in vector space. 2013.
5. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J. Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems. 2013, p. 3111–3119.
6. Sutskever, I., Vinyals, O., Le, Q.V. Sequence to sequence learning with neural networks. In: Advances in neural information processing systems. 2014, p. 3104–3112.
7. Chowanda, A., Chowanda, A.D. Recurrent neural network to deep learn conversation in indonesian. Procedia Computer Science 2017;116:579–586.
8. Koto, F. A publicly available indonesian corpora for automatic abstractive and extractive chat summarization. In: LREC. 2016.
9. Manning, J., Kunkel, A. Qualitative approaches to dyadic data analyses in family communication research: an invited essay. Journal of Family Communication 2015;15(3):185–192.
10. Schrodt, P. Quantitative approaches to dyadic data analyses in family communication research: An invited essay. Journal of Family Communication 2015;15(3):175–184.
11. Camastra, F., Vinciarelli, A., Yu, J. Machine learning for audio, image and video analysis. Journal of Electronic Imaging 2009;18(2):029901–029901.
12. Ritter, A., Cherry, C., Dolan, B. Unsupervised modeling of twitter conversations. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics; 2010, p. 172–180.
13. Shang, L., Lu, Z., Li, H. Neural responding machine for short-text conversation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers); vol. 1. 2015, p. 1577–1586.
14. Lowe, R., Pow, N., Serban, I.V., Pineau, J. The ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems. In: 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue. 2015, p. 285.
15. Dodge, J., Gane, A., Zhang, X., Bordes, A., Chopra, S., Miller, A., et al. Evaluating prerequisite qualities for learning end-to-end dialog systems. 2015.
16. Danescu-Niculescu-Mizil, C., Lee, L. Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. In: Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics. Association for Computational Linguistics; 2011, p. 76–87.
17. Renals, S., Hain, T., Bourlard, H. Recognition and understanding of meetings the ami and amida projects. In: Automatic Speech Recognition & Understanding, 2007. ASRU. IEEE Workshop on. IEEE; 2007, p. 238–247.
18. Leech, G.N. 100 million words of english: the british national corpus (bnc). 1992.
19. Stern, K. The longman spoken american corpus: providing an in-depth analysis of everyday english. AGE 2005;18(24):20.
20. McCarthy, M. Spoken language and applied linguistics. Ernst Klett Sprachen; 1998.
21. Chowanda, A., Blanchfield, P., Flintham, M., Valstar, M. Erisa: Building emotionally realistic social game-agents companions. In: International Conference on Intelligent Virtual Agents. Springer; 2014, p. 134–143.
22. Zhu, W., Chowanda, A., Valstar, M. Topic switch models for dialogue management in virtual humans. In: International Conference on Intelligent Virtual Agents. Springer; 2016, p. 407–411.