Available online at www.sciencedirect.com
ScienceDirect Procedia Computer Science 82 (2016) 122 – 126
Symposium on Data Mining Applications, SDMA2016, 30 March 2016, Riyadh, Saudi Arabia
Readability of Arabic Medicine Information Leaflets: A Machine Learning Approach Sihaam Alotaibia, Maha Alyahyaa, Hend Al-Khalifaa, Sinaa Alageelb and Nora Abanmyb a
College of Computer and Information Science, King Saud University, Riyadh ,Saudi Arabia b College of Pharmacy, King Saud University, Riyadh ,Saudi Arabia
Abstract This paper presents a project that explores the possibility of assessing the readability level of Arabic medicine information leaflets using machine learning techniques. There are a number of popular readability formulas and tools that have been successfully used to assess the readability of health-related information in several languages. However, there is limited work on the readability assessment of health-related information, specifically medicine information leaflets in Arabic. We describe the design of a tool that uses machine learning to assess the readability of medicine information leaflets. We utilize a corpus comprising 1112 medicine information leaflets annotated with three difficulty levels. Based on a study of existing literature, we selected a number of features influencing text difficulty. The tool will help specialized organizations in medicine information leaflets production to produce the leaflets at appropriate level of reading for the majority of leaflets consumers. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license ©2016 The Authors. Published by Elsevier B.V. (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-reviewunder underresponsibility responsibility of the Organizing Committee of SDMA2016. Peer-review of the Organizing Committee of SDMA2016 Keywords: Text Readability; Medicine Information Leaflets; Machine Learning; Arabic Language; Arabic Natural Language Processing.
1. Introduction Text readability is described as “the ease of understanding or comprehension because of the writing style” 1. Text readability assessment provides a measure of the text appropriateness to target readers. It provides a number of potential benefits in many fields such as education and healthcare. In education, it helps teachers match between texts and the reading abilities of students. In healthcare, it helps to provide health-related information such as medical instructions understood by the average patient.
* Corresponding author. Tel.: +966532221149 . E-mail address:
[email protected]
1877-0509 © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the Organizing Committee of SDMA2016 doi:10.1016/j.procs.2016.04.017
Sihaam Alotaibi et al. / Procedia Computer Science 82 (2016) 122 – 126
There are a number of popular readability formulas and tools that have been successfully used to assess the readability of health-related information in several languages2, 3. To the best of our knowledge, there have been a number of researches on automatic Arabic readability measurements of different types of documents such as curriculum textbooks and web articles4, 5, 6, however, no existing research has explored the readability of healthrelated information, specifically medicine information leaflets in Arabic. Medicine Information leaflets or package inserts are considered an important source of health-related information. Medicine Information leaflets are leaflets including information about medical conditions, doses, side effects that packed with medicines to give the user more information about the medicine. 1.1. Problem definition In Saudi Arabia, a survey of over 2000 community pharmacy customers found that the number of respondents who read package inserts (PIs), or at least ask someone else to read the PIs exceeds 85% 7. This indicates that PIs sometimes become the main source of information for some patients for many reasons8. First, self-medication is common in Saudi Arabia9. Second, the purchase of prescription-only medications (POM) without community pharmacies prescription is common in Saudi Arabia10. This means that the patient assumes the bulk of responsibility for medication safety not only for over-the-counter (OTC) medications but also for medications such as antibiotics, nonsteroidal anti-inflammatories, and oral contraceptives8. Third, the information provided by physicians or pharmacists that describe dose and frequency, precautions, and adverse effects of their prescribed medications has been shown to be suboptimal11. Challenges in reading and understanding such written medicine information leaflets may represent one cause for the high rates of medication errors, such as poor adherence which attenuates optimum medicine benefit. To eliminate this challenge it is necessary to find ways to improve the readability of these leaflets and one such method is based on the assessment of text readability. The main goal of this project is to develop an Arabic text readability assessment tool for consumers of medicine information leaflets. Therefore, our project aims to answer the following question: Given some medicine information leaflets that are written in Arabic, can a readability level of the text be measured automatically and accurately, and to what extent is it comparable to human judgment? 2. Literature review Researchers have tried to use several approaches and features that help to achieve an accurate estimate of the readability level for health-related texts when developing readability assessment tools. Several studies have followed the traditional approach which is based on formulas 12, 13. The researchers found that readability formulas are not specifically tailored for medical content 14. However, these formulas revealed to serve as an accepted initial approximation for predicting the consumer's comprehension of health topics15. Machine learning techniques such as Support Vector Machine, Decision Trees and Naïve Bayes were used in the development of a model that assesses the readability of web-based health information in a number of studies2, 16. In addition, the readability assessment research that uses computational approach is nearly new for the Arabic language. It has been applied on different types of documents such as curriculum textbooks and web articles 4, 5, 24. However, there is a lack of computational approaches for the purpose of readability assessment of health-related text in Arabic language. 3. Methodology In this section, we will describe the tools used in our readability assessment tool, to help to generate the statistical language models and accomplish readability assessment task. Then we will describe the readability assessment tool development phases.
123
124
Sihaam Alotaibi et al. / Procedia Computer Science 82 (2016) 122 – 126
3.1. Used tools x The SRI Language Modeling Toolkit17 SRILM is a toolkit for building and applying statistical language models (LMs), used in speech recognition and machine translation and other applications17. SRILM consists of a set of C++ class libraries implementing language models and a set of executable programs built on top of these libraries to perform standard tasks such as training LMs and testing them on data. SRILM runs on UNIX and Windows platforms. SRILM integrated into our tool in the feature extraction phase to generate statistical language models for each readability level. x Weka Machine Learning Software 20 Weka is a machine learning workbench which provides tools for data pre-processing, classification, regression, clustering, and visualization. It also provides a graphical user interface 20.Weka toolkit integrated into our tool to generate classification algorithms such as SVM, decision tree, and naïve bayes. 3.2. Development phases Automatic text classification is the process of classifying text into predefined classes 21. Recently, the techniques of automatic text classification have been utilized in readability assessment field 4. The effectiveness of classifying a given text into readability levels depends on factors including the selection of the dataset, the choice of algorithm, and the selection of effective features to be extracted from the dataset 22. The development of our tool includes the following phases: dataset collection, data preprocessing, features extraction, classifier training, and testing (Fig.1).
Fig. 1. The Readability Assessment Tool
3.2.1. Dataset collection Our corpus, acquired from King Abdullah Arabic Health Encyclopedia and the Saudi Food and Drug Agency, contains 1112 Medicine Information leaflets in Arabic, annotated with three difficulty levels: easy, intermediate, and difficult. The annotation process was performed by human annotators. The readability of leaflets will not be assessed for the whole document. Instead we will assess readability of selected sentences that relevant to patients from each leaflet. 3.2.2. Data pre-processing The preprocessing phase focuses on Arabic natural language processing tools and techniques to preprocess the leaflets. The tools can be used to facilitate the extraction of features for the machine learning algorithm. We used
Sihaam Alotaibi et al. / Procedia Computer Science 82 (2016) 122 – 126
AraNLP library23 to apply some preprocessing steps to the leaflets. It includes a tokenizer, part-of-speech tagger (POS-tagger), normalizer, and a punctuation and diacritic remover. We also applied a normalization to Arabic letters that have different forms to a unified form such as ( ﺁ, ﺇ, ﺃto )ﺍ.The initial results from preprocessing phase showed an increasing in the accuracy of the lexical features extraction. Moreover, the size of a leaflet dropping a 20%, which improves the efficiency of features extraction. 3.2.3. Features Selection and Extraction The main goal of this phase is to identify statistical features that can help in the prediction of readability of medicine information leaflets. Also to represent the output text from the previous step as a feature vector that will be the input of the learning algorithm 4. Features and combination of features are evaluated to find the best features that provide metrics that measure readability with high accuracy. Most of particular features are motivated by prior work (shallow text features, lexical difficulty features and partof-speech ratio features). Shallow text features utilized by most readability formulas. They concentrate on factors, such as word and sentence length, average word length and average sentence length, based on the hypothesis that longer sentences containing long words are more difficult to read 2. The second feature set is lexical difficulty features. Since vocabulary plays an important role in health related text readability, the most important element taken into account are concerned with lexical features 2. For example, the unigram language models which have been shown to be effective in the literatures of readability assessment 2, 18, 19. Unigram language models features are expected to capture most of vocabulary’s influence 2. We used the POS-based ratio features inspired by Forsyth 5, POS-based ratio features were shown to be the most informative in making predictions. POS-based ratio features are the ratio of the individual POS type in Arabic (nouns, verbs, adjectives, and pronoun) occurrences to the token count in the document. 3.2.4. Machine Learning Algorithm To investigate the possible contribution of different types of features on the readability prediction of medicine information leaflets, we use machine learning techniques, in particular support vector machines (SVMs), to train a classification model on data with class labels, known as training dataset. Then the SVM classifier will be used on data without class labels, the testing dataset, to produce the most likely readability class label (easy, intermediate, and difficult). We selected the Support Vector Machine (SVM), because of their prior success in automatic readability assessment 25, 26. Support vector machines (SVMs) are a set of supervised learning methods used for classification, regression and more. SVM formally defined by a separating hyperplane. In other words, given labeled training data, the algorithm produces an optimal hyperplane which categorizes new examples 27. 3.2.5. Evaluation In order to evaluate our automated readability assessment tool, we will use 10-fold cross validation as the evaluation method. The main benefit of cross fold validation is that it maximizes the training set size. It also provides sound test results which are more generalizable. We will measure the performance of our readability assessment tool in term of accuracy rate. The accuracy rate is the percentage of documents for which the readability measure correctly predicted their gold-standard levels. 4. Conclusion In this paper we described our methodology to develop an automatic text readability assessment tool of medicine information leaflets for the Arabic language. We implemented the first increment in the methodology which is the data preprocessing by utilizing an ANLP library. For future work, we will complete on the implementation of our methodology increments. We will conduct the performance evaluation of our tool. We will show the contributions of the selected features in the readability of Arabic medicine information leaflets and the major findings will be derived for further improvements.
125
126
Sihaam Alotaibi et al. / Procedia Computer Science 82 (2016) 122 – 126
Acknowledgements This Project was funded by the National Plan for Science, Technology and Innovation (MAARIFAH), King Abdul-Aziz City for Science and Technology, Kingdom of Saudi Arabia, Award Number (INF 2822). References 1. Klare G. The measurement of readability.ACM Journal of ComputerDocumentation; 2000. vol. 24, no. 3, pp. 107-121. 2. Wang Y. Automatic recognition of text difficulty from consumers health information. In Computer-Based Medical Systems, in19th IEEE International Symposium on IEEE. 2006. pp. 131-136. 3. Cronin M, O’Hanlon S & O’Connor M. Readability level of patient information leaflets for older people. In Irish journal of medical science. 2011. pp.139-142. 4. Al-Khalifa S & Al-Ajlan A. Automatic readability measurements of the Arabic text: An exploratory study. Arabian Journal for Science and Engineering; 2010. 35. 5. Forsyth JN. Automatic Readability Prediction for Modern Standard Arabic.M.S. thesis.Department of Linguistics and English Language.Master of Arts, Brigham Young University, Brigham Young. 2014. 6. Cavalli-Sforza V, El Mezouar M &Saddiki H. Matching an Arabic Text to a Learners’ Curriculum. 2014. 7. Bawazir, SA, Abou-Auda, HS, Gubara, OA, Al-Khamis, KI, & Al-Yamani MJ. Public attitude toward drug technical package inserts in SaudiArabia. Journal of Pharmacy Technology; 2003.19(3). pp. 209-218. 8. Al-aqeel SA. Evaluation of medication package inserts in Saudi Arabia', Drug. Healthcare and patient safety.4, 33, 2012. 9. Alghanim SA. Self-medication practice among patients in a public health caresystem.EMHJ.17(5). 2011. 10. Abdulhak AA, Al Tannir MA, Almansor MA, Almohaya MS, Onazi AS, Marei MA, Aldossary OF, Obeidat SA, Obeidat MA, Riaz MS, Tleyjeh IM. Non prescribed sale of antibiotics in Riyadh, Saudi Arabia: a cross sectional study. BMC Public Health. 2011 Jul 7; 11(1):538. 11. Svarstad BL, Bultman DC & Mount JK. Patient counseling provided in community pharmacies: effects of state regulation, pharmacist age, and andbusyness. Journal of the American Pharmacists Association: JAPhA; 2003. 44(1). pp. 22-29. 12. Edmunds MR, Barry RJ, &Denniston AK. Readability assessment of online ophthalmic patient information.In JAMA ophthalmology 2013; pp.1610-1616. 13. Cheng C & Dunn M. Health literacy and the Internet: A study on the readability of Australian online health information. Australian and New Zealand journal of public health; 2015. 39(4), 309-314. 14. Shedlosky-Shoemaker R, Sturm AC, Saleem M, Kelly KM. Tools for assessing readability and quality of health-related web sites. J Genet Couns. 2009; 18:49–59. 15. Gemoets D, Rosemblat G, Tse T, & Logan R. Assessing readability of consumer health information: an exploratory study. In Medinfo 2004.pp.869-873. 16. Leroy G, Miller T, Rosemblat G, & Browne A. A balanced approach to health information evaluation: A vocabulary based naïve Bayes classifier and readability formulas. Journal of the American Society for Information Science and Technology; 2008. 59(9).1409-1419. 17. Speech.sri.com, "STAR Laboratory: SRI Language Modeling Toolkit", 2016. [Online]. Available: http://www.speech.sri.com/projects/srilm/. [Accessed: 25- Feb- 2016]. 18. Feng L, Jansche M, Huenerfauth M, &Elhadad N. A comparison of features for automatic readability assessment. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters. In Association for Computational Linguistics. 2010. pp. 276-284. 19. Aluisio S, Specia L, Gasperin C &Scarton. Readability assessment for text simplification.In Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications. 2010. pp. 1-9. 20. Cs.waikato.ac.nz, 'Weka 3 - Data Mining with Open Source Machine Learning Software in Java', 2015. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/. [Accessed: 27- May- 2015]. 21. El-HaleesAM. Arabic Text Classification Using Maximum Entropy.The Islamic University Journal (Series of Natural Studies and Engineering); 15(2007). pp. 157–167. 22. Larsson P. Classification into Readability Levels: Implementation and Evaluation. 2006. 23. Althobaiti M, Kruschwitz U, &Poesio M. AraNLP: A Java-based library forthe processing of Arabic text. In Proceedings of the 9th Language Resources and Evaluation Conference (LREC). 2014. 24. Shen W, Williams J, Marius T & Salesky E. A language-independent approach to automatic text difficulty assessment for second-language learners. MASSACHUSETTS INST OF TECH LEXINGTON LINCOLN LAB. 2013. 25. François T &Fairon C. An AI readability formula for French as a foreign language. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning in Association for Computational Linguistics. 2012. pp. 466-477. 26. Qumsiyeh R & Ng YK. ReadAid: a robust and fully-automated readability assessment tool. In Tools with Artificial Intelligence (ICTAI), 23rd IEEE International Conference on IEEE. 2011. pp. 539-546 27. Cristianini N, &Shawe-Taylor J.An introduction to support vector machines and other kernel-based learning methods. Cambridge university press. 2000.