Mining authentic student feedback for faculty using Naïve Bayes classifier


Procedia Computer Science 132 (2018) 1171–1183

International Conference on Computational Intelligence and Data Science (ICCIDS 2018)

Sandhya Maitra a,*, Sushila Madan b, Rekha Kandwal c, Prerna Mahajan d

a Banasthali Vidyapith, P.O. Banasthali Vidyapith, Rajasthan 304022, India
b Lady Sri Ram College for Women, Lajpat Nagar-IV, New Delhi, Delhi 110024, India
c Mahan Institute Of Technologies, Shivaji Enclave, New Delhi, Delhi 110024, India
d Institute of Information Technology and Management, D Block, Janakpuri, New Delhi-110058, India

Abstract

The output of traditional analysis of student feedback for classroom delivery of faculty suffers from inaccuracy due to non-consideration of the influence of various direct and indirect quality features related to the student, such as regularity in class attendance, effort, academic background, course outcomes achieved, and positive attitude, on the feedback measure. Consequently, the output of traditional faculty feedback analysis is not a true indicator of faculty effectiveness in the teaching-learning process. The paper presents a proactive and outcome-based faculty feedback analysis model which uses a Naïve Bayes classifier to cull out and classify the feedback provided by each student into valid or invalid categories on the basis of the relative effect of the aforementioned quality features on the feedback measure. These quality features are used to refine the feedback measure. The method attempts to address the imprecision and so overcome the limitations of the traditional model. Consequently, the output of faculty feedback analysis results in a more refined and accurate Faculty Effectiveness Index. The Faculty Effectiveness Index is calculated as a weighted average of only the valid feedback measures, with the validity of feedback taken as the associated weight. The classifier takes into consideration the independent contribution of each of the features as well as the multiple evidences of their occurrences in the feedback provided by each student. The method also suggests a comprehensive feedback form comprising two parts, namely a subjective feedback part for eliciting feedback in the traditional manner and an outcome-based feedback part to collect information on the aforesaid quality features related to the student which influence the feedback measure.

© 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/3.0/)
Peer-review under responsibility of the scientific committee of the International Conference on Computational Intelligence and Data Science (ICCIDS 2018).

*Corresponding Author: [email protected]

10.1016/j.procs.2018.05.032


Keywords: Educational data mining; Machine learning; Naïve Bayes classifier; Higher education institution.

Nomenclature

FEI        Faculty Effectiveness Index
SVM        Support Vector Machine
AUC        Area Under Curve
F-measure  2 * [(Precision * Recall) / (Precision + Recall)]
RSS        Really Simple Syndication
GIS        Geographic Information System
K-NN       K-Nearest Neighbour

1. Introduction

Educational data mining is susceptible to uncertainties of several kinds, such as ambiguity and vagueness, with inadequate measures to handle them. The success of the teaching-learning process is largely dependent on continuous quality monitoring with feed-forward mechanisms such as faculty feedback to control the progress of the process. Sometimes, due to lack of correlation between internal and university examination marks and students' feedback, suitable corrective actions are not taken and student-centred teaching does not take place [1]. The reason is that the legitimacy of the student feedback measure is not given due consideration, and corrective actions are taken based on a feedback measure with inherent uncertainties. Quality management necessitates reduction of uncertainty in data analysis, which creeps in due to non-consideration of interrelationships between quality features such as attendance and feedback, besides results, student-level progression and placements. The effect of interrelationships between the aforesaid quality features should be studied, analyzed and interpreted to facilitate an integrated and comprehensive evaluation. The data for the pilot study has been collected from an Indian Higher Education Institution.

Student feedback is considered a major parameter for academic performance indicator score evaluation according to regulating bodies worldwide. It is important to measure the impact of several interventions, such as teaching-learning pedagogy and both feedback and feed-forward monitoring and control mechanisms, on the future performance of the institution. However, the consideration of student feedback as a major parameter for faculty effectiveness has been opposed by teachers as counterproductive for both teachers and students in the prevailing Indian political and social situations. According to [2], the output of the teaching-learning process is not the student but the skills and competency acquired by the student, and this definition requires the students to take an active role in the education process.

The traditional procedure for eliciting faculty feedback is ineffective for facilitating student-centric learning and progression, as it does not provide any information on bottlenecks in the teaching-learning process. Traditional feedback analysis does not provide the requisite inputs to address the actual problem being faced by each individual student. Moreover, it is viewed more as a reactive approach in which the assessor, who is the student, tends to attribute his or her progress to the teacher alone, who is viewed as the assessee. It is seldom viewed as a productive procedure to unearth the real bottlenecks in the teaching-learning process. The individual features of the student or assessor are not taken into consideration in feedback analysis. Additionally, the instrument used to elicit the feedback is subjective and does not focus on skills acquired or course outcomes achieved by the student. Therefore it is imperative to adopt a proactive and outcome-based process of feedback elicitation, validation and analysis to reap the true benefits of feedback analysis. This provides the impetus to refine the traditional student feedback by reducing the uncertainty in the measure which occurs due to the influence of individual student features. As student feedback for classroom delivery is a feed-forward mechanism, a probabilistic classifier is more suitable for feedback data analysis.
The Naïve Bayes classifier can be used to accurately classify the feedback based on the degree of influence of the different features on the feedback.




Moreover, a Naïve Bayes classifier converges quicker than discriminative models like logistic regression, as it needs comparatively less training data. A further justification for adopting the model is that even though the Naïve Bayes assumption does not hold, the classifier is fast, easy to apply and performs quite well. It also helps to study the independent effect of these features on the learning progress of a student.

2. Background

A quality management approach is recommended for the teaching-learning process. Further, Higher Education Institutions should adopt a total quality management approach for managing, scrutinizing and enhancing the quality of teaching and learning [1]. The role of feedback analysis and the requirement of reliable feedback are emphasized for continuous improvement in the teaching-learning process to benefit the stakeholders of the process [3]. The strengths and weaknesses assessed for the teaching-learning process of all types of higher education institutions include lack of correlation between internal and university examination marks, negligence of student-centric teaching, students' feedback and corrective actions, besides other aspects [1]. The problem is that internal assessments are more subjective while external assessments are objective. Students face different types and levels of bottlenecks in teaching-learning, some of which may be attributed to teacher attributes and some to learner attributes.

A tool based on a comprehensive analytic hierarchy process performs quantitative and qualitative analysis for evaluating teacher performance. It tests whether the degree comparison design is logical by projecting itself towards a consistent target, ensuring objectivity in subjective assessment by different universities and colleges [3]. This shows that objectivity can be brought to subjective assessment even in the student feedback process. The issue of students as subjects, from the point of view of their learning benefits, has been discussed in research focused on students and their learning processes. It is observed that learning with experiments is a source of motivation for students in their subjects [4]. It is also important to know the measure of benefit accrued by each individual student and the reasons for variation. A student satisfaction index evaluation system based on factor analysis is assessed, and the model is modified through empirical study. It is observed that student satisfaction is directly related to teaching equipment and materials, especially e-learning, network resources and teacher guidance, and not to the academic standards of teachers or teaching methods [5]. No attempt has been made to address the impact of students' attributes on their satisfaction levels. A student satisfaction survey, considered a quality management technique in higher education, is conducted by [6], which shows that student feedback grades teaching-learning activities above all other aspects such as infrastructure, services, etc. Student feedback also grades a faculty in the teaching-learning process, but the factors that influence student feedback for classroom delivery of faculty are not sufficiently defined. As a result the feedback is clouded by additional factors and is not a true indication of lacunae in the teaching process. Such feedback cannot be acted upon to administer productive changes to streamline the teaching-learning process. The Naïve Bayesian machine learning approach is one of the most popular methods used in predictive analytics.
A mix of machine learning approaches such as Naïve Bayes, SVM, K-Nearest Neighbour (K-NN) and Neural Network classifiers for binomial classification, together with natural language processing techniques, is applied by [7] on student feedback data. The RapidMiner tool is used to find the polarity of student feedback on the basis of a set of predefined features of teaching and learning. The procedure utilised different evaluation criteria for each case and performed a comparison. The results suggest the Naïve Bayes algorithm to be superior in accuracy and recall in comparison with the other algorithms. The teaching and learning features extracted from feedback are classified as negative or positive depending upon their ability to reveal the feature that requires improvement. [8] have conducted a study on the reliability of student feedback ratings, or quantitative features, based on evidence offered by linguistic aspects of the free text accompanying the feature space. Higher qualitative evidence in the text indicates higher numerical ratings for an honest feedback. The correlation between quantitative ratings and qualitative evidence from comments on the feature is considered for the same. [9] have applied the Naïve Bayes statistical machine learning algorithm for classification of Gujarati documents into predefined categories. It is observed that the classifier is more accurate for a randomly selected 10-fold partition as compared to a 2-fold partition of the data, as in the latter case documents used for test data may not have been trained properly by the classifier. The accuracy without feature selection is higher compared to using the classifier with feature selection. It is also observed to work well on small data sets. [10] performed a comparative analysis of Naïve Bayes, Decision Tree, and k-Nearest Neighbour classifiers for searching alternative designs of energy usage by a building before its construction, which revealed that the performance of Naïve Bayes is the best in terms of precision, recall, F-measure, accuracy, and Area Under Curve (AUC).


It is observed that the distribution of dependencies among all attributes over the classes affects the classification of Naïve Bayes, not merely the dependencies themselves. [11] discussed two supervised machine learning algorithms, K-Nearest Neighbour (K-NN) and Naïve Bayes, in terms of their overall accuracy, precision and recall values for movie and hotel reviews. It was observed that for movie reviews Naïve Bayes gave far better results than K-NN, while for hotel reviews the algorithms gave almost the same accuracies. [12] used the Naïve Bayes classifier algorithm to analyse and predict the risk levels involved in loans, with a good classification rate of the order of 63.85 per cent. The default probability is explained by variables measuring working capital, leverage, solvency, profitability and cash flow indicators. [13] proposed text segmentation using a Naïve Bayes classifier on a large training data set generated from small overlapping blocks of eight document images with text strings of different fonts, sizes, intensity values and background models, and observed that the Naïve Bayes classifier showed accurate results and comparatively much less computational time than other text segmentation techniques. [14] performed research on text classification using three classification algorithms, namely Naïve Bayes, SVM (Support Vector Machine) and the Stanford Tagger, trained on two different data sets for 5 categories, and observe that Naïve Bayes is a better text classification model owing to its simplicity, besides requiring only a small amount of training data to estimate the necessary parameters affecting classification. It converges faster than discriminative models such as logistic regression, which require a large training data set. It is a high-bias/low-variance classifier which performs better than low-bias/high-variance classifiers such as logistic regression or K-NN on small training sets. A Naïve Bayes classifier was proposed by [15] for prediction of Bronchopulmonary Dysplasia in extremely premature infants on the basis of combinations of fourteen different features in their second week after birth. The algorithm gives decent results and can be used alongside Support Vector Machine or Logistic Regression, especially where a very limited number of measured parameters is available. [16] made use of a Naïve Bayes classifier to build a distributed multi-human location algorithm for a binary pyroelectric infrared sensor tracking system. The model uses a two-level regional location, static partitioning and dynamic partitioning, in which a Naïve Bayes classifier is first used to simplify the human location in a static sub-region and, taking this initial location as the center, is again used to define a new secondary dynamic sub-region and ultimately the human location. It is shown through simulation and experimental results that the Naïve Bayes classifier is superior to a neural network classifier in terms of single-target locating accuracy, multiple-target accuracy, reduced complexity, memory utilization for test sample maintenance, additional computation space, and execution time for both single and multiple target tests. [17] applied a Naïve Bayes classifier to identify the precise location of an earthquake of a particular magnitude from dense earthquake data provided by an RSS feed (GeoRSS data), which GIS software finds cumbersome to identify. The efficient Naïve Bayes classifier accurately predicts the outcome in GeoRSS data.
[18] proposed a Naïve Bayes classifier which identifies misclassified noisy instances of a phone call training data set using conditional probabilities of the attributes and the independence assumption. The effectiveness of the technique is verified using a decision tree machine learning algorithm to classify user phone call behaviour as accept, reject, missed or outgoing.

The Naïve Bayes classifier, being a technique based on probabilities, is applicable to a wide range of domains according to the above literature survey. It is computationally fast, has linear complexity and does not require a large amount of training data. Furthermore, the algorithm justifies the membership of an instance in a particular class with the help of probabilities. The Naïve Bayes algorithm has been applied to educational data mining in different contexts. Therefore the Naïve Bayes classifier is suitable for effective classification of student feedback for faculty on classroom delivery, as a classroom constitutes a smaller data set. The student features influencing the measure appear as multiple evidences, and the feedback needs to be validated taking into account the degree of influence of these features on the measure.

3. Faculty feedback validation

Feedback validation is not given due importance in the analysis of higher education data. The need of the hour is to design an effective mechanism to measure qualitative aspects and perform combined data analyses of correlated variables, namely feedback, attendance and outcomes. The model is based on a probabilistic approach, as it is a feed-forward mechanism used during the ongoing teaching-learning process when final result data is not yet available.




Hence it is not possible to adopt a fully deterministic approach. The intermediate inputs provided by the proposed model suggest timely corrective actions to counter the entropy in the teaching-learning process. The machine learning approach used is the Naïve Bayes technique. Further, multiple regression and correlation techniques have been used to ascertain the existence of correlations between the parameters. Feedback analysis is performed and the feedback is quantified using weights, which are assigned carefully. In the context of student feedback on classroom delivery there are various features, some independent and some dependent, which affect the feedback. It is imperative that these interdependencies are considered to obtain a legitimate or valid measure of the student feedback. The quantification of the independent contributions of these features towards student feedback provides a basis for further analysis related to student progression. Furthermore, according to the National Board of Accreditation, the recent approach to education is outcome-based and focused on the skills and potential of students, which have a direct relation to industrial needs. Therefore it is perceived that faculty feedback should be a more proactive process which takes into consideration the self-assessment of a student on skills acquired as a result of attending the course, besides other aspects.

3.1 Correlations

The SPSS package is used to establish the correlations between the various features and the self-assessment of student learning.

Table 1. Correlations

                   Self Assessment   Effort
Self Assessment    1                 0.302
Effort             0.302             1

The correlation between self-assessment and effort is significant at the 0.05 level (1-tailed), as evident from Table 1. In a pilot study conducted on participants from the Masters in Computer Applications course of an Indian higher education institution, it was observed that there exists a significant correlation between Regularity and Self Assessment, as shown in Fig. 1. A significant regression of Self Assessment on Effort is also observed. Further, there is a significant regression of self-assessment on student regularity, student category, student effort and student attitude, as evident from the data analysis.


Fig. 1. Correlation between regularity and self assessment
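For readers wishing to reproduce the checks of Section 3.1 outside SPSS, the following R sketch (our illustration, not the authors' script; the data frame scores and its columns regularity, effort and self_assessment are hypothetical stand-ins for the Part B scores) computes the one-tailed Pearson correlation of Table 1 and the regression of self-assessment on the student features:

# Hypothetical per-student scores; real values would come from Part B of the form.
scores <- data.frame(
  regularity      = c(8, 6, 8, 4, 6, 8, 2, 6, 8, 4),
  effort          = c(4, 3, 4, 2, 3, 4, 1, 2, 4, 2),
  self_assessment = c(4, 3, 4, 2, 3, 4, 1, 3, 4, 2)
)
# One-tailed Pearson correlation, as reported for Effort in Table 1
cor.test(scores$self_assessment, scores$effort, alternative = "greater")
# Regression of self-assessment on the student features
summary(lm(self_assessment ~ regularity + effort, data = scores))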

3.2 Outcome Based Feedback

A new feedback form is proposed which provides a completely new dimension to traditional feedback forms. The form elicits feedback in a proactive manner and incorporates outcome-based feedback features, thus accentuating the significance of skill-based feedback and its usability in providing important inputs to the teaching-learning process and assisting in quality management of the process. Further, these inputs orient the teaching-learning activity towards instilling the requisite skills in the students and increasing their employability. It is suggested by [19][20][21] that features such as Regularity, Effort, Category and Attitude of the feedback provider influence feedback either in a counterproductive or in an unproductive manner. It is important to refine feedback considering the influence of these features so as to provide valid feedback. Such valid feedback goes a long way in giving the right direction to the teaching-learning process. The validity of feedback can be relied upon to draw various other conclusions regarding the progress of the curriculum as well as the extent to which the Regularity, Effort, Category and Attitude of the student hamper or assist the student during the teaching-learning process, as applicable to each individual case. The feedback form also elicits responses on self-assessment of the student based on skills acquired as well as knowledge and expertise gained.

3.3 Naïve Bayes Classifier

Naïve Bayes is a classifier which uses Bayes' theorem. It predicts membership probabilities for each class, i.e. the probability that a given record or data point belongs to a particular class. The class with the highest probability is considered the most likely class. The Naïve Bayes classifier assumes that all the features are unrelated to each other, in order to study their individual effect on the feedback. The presence or absence of a feature does not influence the presence or absence of any other feature. Though the features may depend on each other and on the existence of other features, all of them are considered to contribute independently to the probability that a feedback is valid. The hypothesis is tested for given multiple evidences (features), so the calculations become complicated.




To simplify the work, the feature independence approach is used to 'uncouple' multiple evidences and treat each as an independent one.

Let us consider a training dataset with 60 records and 2 classes. The data is cleaned and contains no missing values. The two classes are associated with the feedback types, namely valid and invalid. The predictor feature set consists of 4 mutually exclusive features, denoted together with self-assessment by the following notation: R_i = Regularity, E_i = Effort, A_i = Attitude, C_i = Category, and SA_i = Self Assessment of the ith student. A Bayesian classifier is formulated taking into account the five features of Regularity, Category, Effort, Attitude and Self Assessment. The feedback form is designed such that Part A collects student feedback in the traditional form, while Part B elicits information on student effort, student regularity, student category, student attitude and self-assessment. The score of the student on each of these features is collected based on the corresponding metrics assigned in the form. Further, the probabilities of these five features are calculated for each student with respect to their individual score and the best total scores for these features for an ideal feedback provider. The probability of each of the mutually exclusive features for an individual student equals (student's score on the feature) / (total best score for the feature). The student features, namely Category, Effort, Attitude and Regularity, are considered mutually exclusive events, while Student Self Assessment is an event which can occur with any of these mutually exclusive events. The posterior probabilities of student effort, student regularity, student category and student attitude given self-assessment also provide additional information on the extent to which each feature affects the learning of each individual student. Further, the conditional probability of Self Assessment given each of the mutually exclusive events is assigned for every student according to relative significance as below.

P(SA_i|E_i) = 0.25,  P(SA_i|R_i) = 0.25,  P(SA_i|C_i) = 0.25,  P(SA_i|A_i) = 0.25

(1)

The total probability of Self Assessment for a particular student, considering all features, is given by the total probability theorem as follows:

P(SA_i) = P(R_i) x P(SA_i|R_i) + P(E_i) x P(SA_i|E_i) + P(A_i) x P(SA_i|A_i) + P(C_i) x P(SA_i|C_i)

(2)

The posterior probability of the contribution of a feature to the obtained Self Assessment for a particular student i, using Naïve Bayes' theorem, is as follows:

P(R_i|SA_i) = (P(R_i) x P(SA_i|R_i)) / P(SA_i),  P(E_i|SA_i) = (P(E_i) x P(SA_i|E_i)) / P(SA_i)

(3)

P(A_i|SA_i) = (P(A_i) x P(SA_i|A_i)) / P(SA_i),  P(C_i|SA_i) = (P(C_i) x P(SA_i|C_i)) / P(SA_i)

(4)
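To make equations (1)-(4) concrete, the following R sketch (an illustration under the paper's definitions, not the authors' code; the score and best-score values are hypothetical) computes the feature probabilities, the total probability of self-assessment, and the posterior contribution of each feature for a single student:

# Equations (1)-(4) for one student; scores and best scores are hypothetical.
score <- c(R = 30, E = 10, A = 5, C = 20)   # student's scores on the four features
best  <- c(R = 44, E = 19, A = 7, C = 34)   # total best scores of an ideal provider
p_feature  <- score / best                  # P(R_i), P(E_i), P(A_i), P(C_i)
p_sa_given <- c(R = 0.25, E = 0.25, A = 0.25, C = 0.25)   # equation (1)
p_sa <- sum(p_feature * p_sa_given)         # equation (2), total probability theorem
posterior <- (p_feature * p_sa_given) / p_sa   # equations (3) and (4)
round(posterior, 3)                         # contribution of each feature to SA_i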

The format of the newly designed feedback form is presented in the Appendix at the end of the paper. The feedback validation model can be represented as a conceptual model, as shown in Fig. 2 below:

[Fig. 2 is a flow diagram: student feedback is computed from Part A; probabilities are computed from the outcome-based Part B; the probabilities are input to the Naïve Bayes classification, which separates out the valid feedback; weights of the valid feedback are computed; and the FEI is computed from the valid feedback and the valid feedback weights.]

Fig. 2. Feedback Analysis Model

4. Algorithm for validation and computation of Faculty Effectiveness Index

The following conventions hold for the ith student: PNR_i = P(irregularity), PNE_i = P(no effort), PNA_i = P(negative attitude), PNC_i = P(poor or unsuitable academic background or category), PNSA_i = P(negative assessment), ME_i = P(multiple evidences), Valid_i = P(validity of feedback), Invalid_i = P(non-validity of feedback), FEI = Faculty Effectiveness Index.


I.  Elicit the student feedback (F_i) of each student for a faculty for a subject, using the traditional form Part A as well as the outcome-based form Part B. Set i = 0, N = 0.
II. While (i <= total number of students) {
    1. Compute the student feedback F_i for student i from form Part A.
    2. Compute
       a. P(R_i) = (student's score of R_i) / (total best score of R_i)
       b. P(E_i) = (student's score of E_i) / (total best score of E_i)
       c. P(A_i) = (student's score of A_i) / (total best score of A_i)
       d. P(C_i) = (student's score of C_i) / (total best score of C_i)
       e. PNR_i = 1 - PR_i,  PNE_i = 1 - PE_i
       f. PNA_i = 1 - PA_i,  PNC_i = 1 - PC_i
       g. PNSA_i = 1 - PSA_i
    3. Input PR_i, PE_i, PA_i, PC_i, PSA_i, PNR_i, PNE_i, PNA_i, PNC_i, PNSA_i to the Naïve Bayes classifier.
    4. Naïve Bayes classifier {
       Compute
       a. ME_i = PR_i * PE_i * PA_i * PC_i * PSA_i * 0.5 + PNR_i * PNE_i * PNA_i * PNC_i * PNSA_i * 0.5
       b. P(Valid_i | ME_i) = (PR_i * PE_i * PA_i * PC_i * PSA_i * 0.5) / ME_i
       c. P(Invalid_i | ME_i) = (PNR_i * PNE_i * PNA_i * PNC_i * PNSA_i * 0.5) / ME_i
       d. If P(Valid_i | ME_i) >= P(Invalid_i | ME_i) then Feedback Class_i := Valid
          else Feedback Class_i := Invalid
       }
    5. If (Feedback Class_i = Valid) { N = N + 1; assign weight W_i = P(Valid_i | ME_i) }
    6. i = i + 1
    }
III. Calculate FEI = (Σ_valid W_i F_i) / N

Fig. 3. Algorithm for validation and computation of Faculty Effectiveness Index
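A compact R rendering of the Fig. 3 algorithm follows (our sketch, not the authors' published implementation; the column names f, pr, pe, pa, pc and psa are hypothetical). It classifies each student's feedback, weights the valid ones by their validity posterior, and averages:

# Sketch of Fig. 3; assumes one row per student with the feature probabilities
# already computed as (score / total best score) from Part B of the form.
students <- data.frame(
  f   = c(82, 45, 67, 90, 38),        # Part A feedback score F_i
  pr  = c(0.9, 0.3, 0.7, 0.8, 0.2),   # P(R_i)
  pe  = c(0.8, 0.4, 0.6, 0.9, 0.3),   # P(E_i)
  pa  = c(0.7, 0.5, 0.8, 0.9, 0.4),   # P(A_i)
  pc  = c(0.9, 0.6, 0.5, 0.8, 0.3),   # P(C_i)
  psa = c(0.8, 0.4, 0.7, 0.9, 0.2)    # P(SA_i)
)
valid_ev   <- with(students, pr * pe * pa * pc * psa * 0.5)   # valid half of step 4a
invalid_ev <- with(students,
                   (1 - pr) * (1 - pe) * (1 - pa) * (1 - pc) * (1 - psa) * 0.5)
me       <- valid_ev + invalid_ev          # ME_i, step 4a
p_valid  <- valid_ev / me                  # P(Valid_i | ME_i), step 4b
is_valid <- p_valid >= invalid_ev / me     # step 4d
w        <- p_valid[is_valid]              # weights W_i, step 5
fei      <- sum(w * students$f[is_valid]) / sum(is_valid)   # step III
fei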

5. Results

The classifier, developed using the R programming language, was applied on a sample data set of 1000 students of an Indian Higher Education Institution affiliated with a state university and showed satisfactory results. A two-sample Z-test for means, applied to the subjective (traditional) feedback and the validated outcome-based feedback of the students respectively for a faculty on classroom delivery, gave the following result:

data: df$SF and df$new_sf
z = 9.2879, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval: 5.065038  8.952574
sample estimates: mean of x = 48.03904, mean of y = 41.03023

where the random variables are X = sf (student feedback elicited through the traditional approach) and Y = new_sf (validated and outcome-based student feedback).

Fig. 4. Results of the two-sample Z-test for means applied to the Faculty Effectiveness Index computed on student feedback elicited through the traditional approach and validated outcome-based student feedback respectively
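The output in Fig. 4 has the shape of the two-sample z-test produced, for example, by the z.test function of the BSDA package in R; the snippet below is a hedged sketch under that assumption, with simulated data standing in for the real df$SF and df$new_sf columns:

# Simulated stand-in data; the real df would hold the traditional feedback
# scores (SF) and the validated outcome-based ones (new_sf) for 1000 students.
library(BSDA)
set.seed(1)
df <- data.frame(SF = rnorm(1000, 48, 17), new_sf = rnorm(1000, 41, 16))
# Large-sample two-sample z-test; population sd's approximated by sample sd's.
z.test(df$SF, df$new_sf, alternative = "two.sided", mu = 0,
       sigma.x = sd(df$SF), sigma.y = sd(df$new_sf), conf.level = 0.99)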


6. Conclusion and future scope

This research has applied the Naïve Bayes machine learning classifier to build a Feedback Validation Model in educational data mining. The probabilistic approach of the model can effectively unearth the reasons for the learning progress of the student in the teaching-learning process on an individual basis. It also specifies the extent to which each of the reasons affects the learning progress of each student. It provides a feed-forward mechanism for timely rectification of deviations through interventions such as counseling, additional classes etc. as per the intermediate reflections. The feedback, which is influenced by different factors, is segregated from unproductive influences which can otherwise distort its authenticity and project false or fabricated feedback. Further, the true feedback is evaluated. Such feedback, devoid of unproductive influences, shall prove to be a good measure of student feedback and consequently of faculty performance as well. The authenticated feedback serves as a tool to help assess the effectiveness of teaching-learning in a more proactive and objective manner. It also focuses on skills and outcomes achieved and helps improve the outcomes. The Feedback Validation Model shall be an effective tool for quality management of the teaching-learning process in higher educational institutions.

There is ample scope for future work using this model. Anonymity of student feedback can be maintained by developing an identity code generator program based on random numbers. A student can generate a random identity which he or she can use to link his or her subjective feedback with its objective counterpart. Different variants of hybrid Naïve Bayes classifiers can be explored for feedback authentication and their accuracies compared in terms of uncertainty reduction in feedback data analysis. The Naïve Bayes classifier can also be used to explore other areas of educational data mining such as placement analysis.
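As a minimal sketch of the anonymity idea above (hypothetical, not part of the published model), a random identity code can be generated in R and written on both parts of the form to link them without revealing the student:

# Hypothetical random identity code generator for anonymous feedback linkage.
gen_code <- function(len = 8) {
  pool <- c(LETTERS, 0:9)                        # alphanumeric pool
  paste(sample(pool, len, replace = TRUE), collapse = "")
}
gen_code()   # e.g. "K3TQ9BZA"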

7. Compliance with ethical standards

Informed consent was obtained from all individual participants included in the study.

Appendix A. Feedback Form Part A

Faculty:    Subject:

Each attribute is graded A, B, C or D, listed here from best to worst:

1. Knowledge of subject: Exceedingly knowledgeable (A), Adequately knowledgeable (B), Inadequately informed (C), Poorly informed (D)
2. Ability to explain (oral): Well articulated (A), Fairly articulated (B), Satisfactorily articulated (C), Unsatisfactorily articulated (D)
3. Use of examples: Uses many examples (A), Uses examples adequately (B), Infrequently uses examples (C), Never uses examples (D)
4. Teaching material: Very well organised (A), Fairly well organised (B), Satisfactorily well organised (C), Unsystematic (D)
5. Use of training aids: Very good usage (A), Satisfactory usage (B), Rarely uses (C), Never uses (D)
6. Opportunity to raise questions and discussion: Ample opportunity (A), Occasional opportunity (B), Rare opportunity (C), Discourages questions (D)
7. Regularity in engaging classes: Very regular (A), Fairly regular (B), Irregular (C), Takes few classes (D)
8. Impartiality: Impartial (A), Generally fair (B), Occasionally partial (C), Always biased (D)
9. Attitude towards student's difficulty: Extremely helpful (A), Usually helpful (B), Avoids addressing (C), Unhelpful (D)
10. Overall impression of teaching: Excellent (A), Very good (B), Satisfactory (C), Poor (D)

Appendix B. Feedback Form Part B (Student Profile) [A Representative Sample]

Student Code:

I. CATEGORY
1. Please provide the details of your qualification (percentage, with scores per band):
   1 Xth: 6, 4, 2, 1
   2 XIIth: 8, 4, 2, 1
   3 Graduation: 10, 4, 2, 1
   4 Post-Graduation (aggregate up to last semester's %): 10, 4, 2, 1
2. Please rate your understanding of the following ICT applications: Excellent (6), Good (5), Fair (4), Low (2), Does not exist (1)
   1 Word Processing; 2 File Navigation; 3 Internet Browsing; 4 Emailing; 5 Presentation Tools; 6 Social networking sites
3. Why did you choose your stream? Please tick the relevant option: 1 I like technological innovation / management (2); 2 Parents' / guardians' pressure (1); 3 No other option (1); 4 Others (1)

II. REGULARITY
1. Are you normally on time for your classes? Early (8), Normal time (6), Late (4), Very late (2)
2. How well are you able to follow your planned time schedules? Mostly (8), Sometimes (6), Rarely (4), Never (2)
3. Besides your classes, how many hours do you spend on self-study in a day? 0 hrs (1), 1-2 hrs (4), 3-4 hrs (6), >=5 hrs (8)
4. Generally, how often do you visit your library? Rarely (1), 1-2 days (4), 3-4 days (6), Daily (8)
5. How often do you skip your classes? Not at all (8), Sometimes (6), Very often (4), Always (1)
6. What is your attendance in the course, approximately? 90-100% (12), 70-90% (10), 50-70% (8), <50% (4)
7. Time outlay for other activities: Rarely (4), Relatively less (3), Moderate (2), Relatively more (1)

III. EFFORT
1. How many journals / technical or business magazines do you study? Many (4), Some (3), Few (2), None (1)
2. How many seminars have you attended during the last 2 years? Many (4), Some (3), Few (2), None (1)
3. Have you ever written any research paper / article / abstract? Many (4), Some (3), Few (2), None (1)
4. If yes, please specify the number. Many (4), Some (3), Few (2), None (1)
5. How often do you generally (Always (3), Sometimes (1), Never (0)):
   - participate in class discussion
   - revise the lessons/topics taught in the class
   - try to logically approach a problem
   - try to learn new things on my own

IV. SELF ASSESSMENT
1. If you have to evaluate your performance in the subject, where do you place yourself? <50% (1), 50-60% (2), 60-75% (3), >=75% (4)
2. Please rate yourself on the technical graduate skills acquired and outcomes achieved as a consequence of attending classes for the subject (Excellent (4), Good (3), Fair (2), Low (1)):
   - Documentation skills
   - Development skills
   - Analytical skills
   - Problem solving skills
   - Fundamental knowledge
   - Project Management skills

V. ATTITUDE
1. How often do you generally praise your (Always (1), Sometimes (1), Never (0)):
   - Classmates
   - Teachers
   - Staff
   - Institution
   - System
2. How did you feel when criticized by Teachers / Peers? Took it positively (2), Felt offended (1), Self pity (1), Ignored (0)




References

[1] Ramachandran, Mahadevan, Shivaprakash, N.C., and Bose, S.K. (2017). Quality assessment of technical education in Indian Engineering Institutions. In IEEE Global Engineering Education Conference (EDUCON) (pp. 973-977). Berlin: IEEE.
[2] Li, Jun-Lin, Zhao, Jian-Hua, and Xue, Gui-Ying. (2011). Design of the index system of the college teachers' performance evaluation based on AHP approach. In International Conference on Machine Learning and Cybernetics (pp. 995-1000). Guilin: IEEE.
[3] Yeap, Boon-Han. (2008). A New Perspective for Quality Improvement in Teaching and Learning Processes. In Edu-Com 2008 International Conference, Sustainability in Higher Education: Directions for Change (pp. 583-590). Australia: Edith Cowan University.
[4] Staron, Miroslaw. (2007). Using Experiments in Software Engineering as an Auxiliary Tool for Teaching: A Qualitative Evaluation from the Perspective of Students' Learning Process. In 29th International Conference on Software Engineering (pp. 673-676). Washington DC: IEEE Computer Society.
[5] Xie, Jiafeng, and Guo, Hui. (2010). Study on the Evaluation Model of Student Satisfaction Based on Factor Analysis. In International Conference on Computational Intelligence and Software Engineering (pp. 1-4). New Jersey: IEEE.
[6] Mărcuş, Andrei, Zaharie, Monica, and Osoian, Codruta. (2009). Student Satisfaction as a Quality Management Technique in Higher Education. In International Association of Computer Science and Information Technology - Spring Conference (pp. 388-390). New Jersey: IEEE.
[7] Dhanalakshmi, V., Bino, Dhivya, and Saravanan, A.M. (2016). Opinion mining from student feedback data using supervised learning algorithms. In 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC) (pp. 1-5). Muscat: IEEE.
[8] Kannan, Rajkumar, Bielikova, Maria, Andres, Frederic, and Balasundaram, S.R. (2011). Understanding Honest Feedbacks and Opinions in Academic Environments. In COMPUTE '11: The Fourth Annual ACM Bangalore Conference (pp. 1-4). New York: ACM. DOI: 10.1145/1980422.1980443.
[9] Rakholia, Rajnish, and Saini, Jatinderkumar. (2017). Classification of Gujarati Documents using Naïve Bayes Classifier. Indian Journal of Science and Technology, 10(5): 1-9. DOI: 10.17485/ijst/2017/v10i5/103233.
[10] Ashari, Ahmed, Paryudi, Iman, and Tjoa, A. Min. (2013). Performance Comparison between Naïve Bayes, Decision Tree and k-Nearest Neighbor in Searching Alternative Design in an Energy Simulation Tool. (IJACSA) International Journal of Advanced Computer Science and Applications, 4(11): 33-39.
[11] Dey, Lopamudra, Chakraborty, Sanjay, Biswas, Anuraag, Bose, Beepa, and Tiwari, Sweta. (2016). Sentiment Analysis of Review Datasets Using Naïve Bayes and K-NN Classifier. International Journal of Information Engineering and Electronic Business, 8(4): 54-62.
[12] Krichene, Aida. (2017). Using a naive Bayesian classifier methodology for loan risk assessment. Journal of Economics, Finance and Administrative Science, 22(42): 3-24. DOI: 10.1108/jefas-02-2017-0039.
[13] Haji, M., and Katebi, S. (2005). An Efficient Text Segmentation Technique Based on Naive Bayes Classifier. ICGST-GVIP, 5(7): 27-36.
[14] Tilve, Amey, and Jain, Surbhi. (2017). Survey on Machine Learning Techniques for Text Classification. International Journal of Engineering Sciences & Research Technology, 6(2): 513-519. DOI: 10.5281/zenodo.322477.
[15] Wajs, Wieslaw, Ochab, Marcin, Wais, Piotr, Trojnar, Kamil, and Wojtowicz, Hubert. (2017). Advanced Solutions in Diagnostics and Fault Tolerant Control. Advances in Intelligent Systems and Computing: 281-288. DOI: 10.1007/978-3-319-64474-5.
[16] Yang, Bo, Lei, Yiqun, and Yan, Bei. (2016). Distributed Multi-Human Location Algorithm Using Naive Bayes Classifier for a Binary Pyroelectric Infrared Sensor Tracking System. IEEE Sensors Journal, 16(1): 216-223. DOI: 10.1109/jsen.2015.2477540.
[17] Netti, K., and Radhika, Y. (2017). A model for accurate prediction in GeoRSS data using Naive Bayes Classifier. Journal of Scientific and Industrial Research, 76: 473-476.
[18] Sarker, Iqbal, Kabir, Muhammad, Colman, Alan, and Han, Jun. (2017). An Improved Naive Bayes Classifier-based Noise Detection Technique for Classifying User Phone Call Behavior. In Fifteenth Australasian Data Mining Conference (AusDM 2017) (pp. 1-15). Melbourne: Springer International Publishing.
[19] Zerihun, Zenawi, Beishuizen, Jos, and Van Os, Willem. (2012). Student learning experience as indicator of teaching quality. Educational Assessment, Evaluation and Accountability, 24(2): 99-111. DOI: 10.1007/s11092-011-9140-4.
[20] Haslett, Betty. (1976). Attitudes toward teachers as a function of student academic self-concept. Research in Higher Education, 4(1): 41-58. DOI: 10.1007/bf00991460.
[21] Dochy, Filip, Moerkerke, George, and Segers, Mien. (1999). The Effect of Prior Knowledge on Learning in Educational Practice: Studies Using Prior Knowledge State Assessment. Evaluation & Research in Education, 13(3): 114-131. DOI: 10.1080/09500799908666952.