Automated staff assignment for building maintenance using natural language processing


Automation in Construction 113 (2020) 103150






Yunjeong Mo (a), Dong Zhao (b,⁎), Jing Du (c), Matt Syal (d), Azizan Aziz (e), Heng Li (f)

a Construction Management Department, University of North Florida, 1 UNF Drive, Jacksonville, FL 32224, USA
b School of Planning, Design, and Construction, Michigan State University, East Lansing, MI 48824, USA
c Department of Civil and Coastal Engineering, University of Florida, Gainesville, FL 32611, USA
d School of Planning, Design, and Construction, Michigan State University, East Lansing, MI 48824, USA
e Center for Building Performance and Diagnostics, Carnegie Mellon University, Pittsburgh, PA 15213, USA
f Department of Building and Real Estate, Hong Kong Polytechnic University, Hong Kong, China

Keywords: Building service request; Construction management; Machine learning; NLP; Text mining

Abstract

Staff assignment is the decision-making process of determining the appropriate workforce with the required skills to perform a specific task. Staff assignment is critical to the success of construction projects, especially when responding to routine requests such as change orders and building services. However, its effectiveness is low because requests are processed manually by management personnel. To improve the productivity of staff assignment, this paper creates a machine learning model that reads service request texts and automatically assigns workforce and priority through the technique of natural language processing (NLP). The dataset used for modeling contains 82,106 building maintenance records collected over a three-year period from over 60 buildings on a university campus. The results show a 77% accuracy for predicting workforce and an 88% accuracy for predicting priority, indicating considerably high performance for multiclass and binary classification. Different from existing studies, the NLP model highlights the value of stop-words and punctuation in learning service request texts. The NLP model presented in this study provides a solution for staff assignment and offers a piece of the puzzle of information system automation in the construction industry. This study has an immediate implication for building maintenance and, in the long term, contributes to human-building interactions in smart buildings by connecting human feedback to building control systems.

1. Introduction

Staff assignment for task scheduling is critical to the success of construction projects. Staff assignment refers to the decision-making process of determining the appropriate staff with the particular skills required to perform a task that meets service demand [1,2]. Staff assignment for construction management occurs across the entire project cycle, from building design to facility operations and maintenance. Project managers need to quickly decide and allocate suitable workers and resources to respond to task requests for configuration changes, change order processing, design review, technical notification, and building maintenance [3,4]. For example, prompt change order processing during construction requires efficient staff assignment that finds an appropriate crew to perform a construction task.

Staff assignment is challenging for construction management. (1) A construction project is complex and involves a large number of crews with particular skills, such as electricians, plumbers, masons, carpenters, steelworkers, and machine operators. (2) Staff assignment must be both accurate and quick because construction projects often have tight schedules and high associated costs. (3) Staff assignment is a repeated routine for project managers since the design and scheduling of a construction project often change.

The extant studies primarily view staff assignment as a problem of scheduling [5], human resources [6], or resource optimization [7]. These studies provide insight into the selection and acquisition of the expertise required to succeed at the project level. However, a gap remains in improving staff assignment at the task level in a case-by-case manner. Automating staff assignment for routines (e.g., crew selection) improves productivity, but most assignment is manually processed by construction managers or professionals (termed assigners hereafter). The effectiveness is low. The quality of staff assignment is strongly reliant on the professionalism and experience of individual assigners



⁎ Corresponding author.
E-mail addresses: [email protected] (Y. Mo), [email protected] (D. Zhao), du.j@ufl.edu (J. Du), [email protected] (M. Syal), [email protected] (A. Aziz), [email protected] (H. Li).
https://doi.org/10.1016/j.autcon.2020.103150
Received 24 November 2019; Received in revised form 20 February 2020; Accepted 23 February 2020
0926-5805/ © 2020 Elsevier B.V. All rights reserved.


[8]. Inappropriate decisions from assigners may lead to losses and project delays, so assigners must be cautious, often under high pressure [1]. As a result, staff assignment is often slow, and assigners may be overburdened when a considerable volume of requests rushes to them [9].

Therefore, the objective of this study is to create a prediction model that can automatically assign staff in response to task requests documented in unstructured texts. The context of this study is building maintenance, for three specific reasons: (1) Staff assignment is a substantial routine for building maintenance, which must handle a large number of service requests [10]. (2) Building maintenance is an important phase in a building lifecycle and takes more than 80% of the duration and costs [11]. (3) Building functionality depends on efficient maintenance, which is expected to be reactive, reliable, economic, and occupant-satisfying [12]. In response, our model uses the natural language processing (NLP) technique to convert massive unstructured texts into structured and actionable information for automated staff assignment. Different from prior studies that optimize staff assignment through operations research, we address this resource optimization problem through machine learning. Wu, et al. [13] used NLP to optimize BIM objects and suggested NLP for project resource optimization and automatic estimation. Here, we convert the optimization problem into a classification problem and train NLP models to determine the appropriate workforce by learning from past staff assignment cases. The data used for the case study in this paper are from a national university's operations division.

2. Background

2.1. Building service request processing

Building maintenance efficiency is a key performance indicator for efficient facility management. It represents how efficiently maintenance is performed, assuming that the work is mostly of the preventive and corrective types [14]. For example, periodical building inspection is a routine of preventive maintenance, and building service request processing is a typical corrective maintenance task that aims to identify and rectify a fault in building systems in response to service requests. 90-97% of building maintenance is related to processing service requests, and the majority of them are unplanned [8]. The quality of processing unplanned requests is the most important indicator for measuring maintenance performance [15]. Fig. 1 exhibits the processing of building service requests in the following steps: (1) Service reception: identify undesirable building status from occupant requests and inspector reports; (2) Service coordination: analyze service requests and schedule appropriate crews; and (3) Service evaluation: assess service quality and user satisfaction through questionnaires, user complaints, and statistical reports.

Staff assignment and prioritization are the two cores of task scheduling. Staff prioritization is also critical to improving the general condition of buildings. The priority is often judged by a problem's criticality; for example, tasks that protect lives and avoid building damage are prioritized. Staff assignment for building maintenance requires assigners to acquire knowledge and experience of both building systems and construction workflow. The assigners are expected to effectively coordinate service orders and manage repair, capital, and alteration projects. The high workload and intensity often result in slow responsiveness and low productivity. Therefore, an automated staff assignment system is needed to expedite service responses and raise maintenance efficiency.

2.2. NLP and its applications

NLP is a computational technique that allows for human-like language processing of naturally occurring texts [16]. NLP enables a machine to understand texts and perform linguistic analysis. The analysis extracts a meaningful representation from free texts based on sentence structure, grammatical structure, and sentence components such as nouns, adjectives, and verbs [17,18]. Early NLP applications used a complex set of hand-written rules, while recent NLP applications use machine learning algorithms, extending their scope to the artificial intelligence field [19]. Logistic regression (LR), Naïve Bayes (NB), and support vector machine (SVM) are three widely used algorithms for NLP modeling. LR maximizes probability estimates owing to its scalability to multiple class values and the substantial advantage of model sparsity [20]. NB can classify documents quickly and accurately even when predictors are fairly weak [19]. SVM can identify linear and polynomial separators and address text classification with high-dimensional input spaces [21].

NLP emerges as a powerful tool to improve productivity in the architecture, engineering, and construction (AEC) domain, although it has been used in other domains for years [22]. Salama and El-Gohary [23] presented an NLP application for automatic construction document classification. Their classification was binary, and each document was classified as either positive or negative. NB, SVM, and maximum entropy (ME) were their algorithms, and the performance was evaluated by precision and recall. Zhang, et al. [24] applied NLP to analyze construction accident reports, and their classification was multiclass with eleven labels. LR, NB, SVM, K-nearest neighbor (KNN), and Decision Tree (DT) were used, and the performance was evaluated with the F1 score. Other NLP applications in the AEC field include: the extraction of construction safety information from safety reports written in an unstructured text format [25,26]; the classification of stakeholders' concerns and client opinions on the construction process of AEC projects [27]; the prediction of the cost overrun level of a construction project at the estimation stage [28]; and the knowledge map of knowledge items and semantic clarity for facility management. Nevertheless, few NLP applications address staff assignment in the AEC domain, including building maintenance, where this paper is rooted.

3. Methods

A feature is a measurable property or characteristic of data [29]. In this NLP study, we define the feature as the bag-of-words (BOW), a representation of text that describes the occurrence of words. We define the label as the metadata of the dataset. Specifically, the unigram is the basic type of BOW feature; the bigram (even trigram), metadata, and regular expression are additional types of feature. It is noted that metadata can be included as features in machine learning [19].

3.1. Data overview

In this study, the data for model development are 82,106 service requests and solutions covering a period of three years. We collected the data from the operations division at a national university in the United States. The service requests are from 60 educational buildings and dorms, whose sizes vary from 22,000 to 295,000 square feet. The requests were sent by occupants, building managers, or university-level inspectors and saved in a central database on campus (IBM Maximo). All the service requests have been manually processed by the assigners in the operations division. The solutions, such as crew and priority, decided by the assigners were recorded in the database, and these labeled data were used as the ground truth for the modeling in this study. Details of the labels in this dataset are listed in Table 1.

3.2. Model development

The goal of the model is to predict the solution for staff assignment based on the service request. Among the four labels of the solution (see Table 1), the labels Crew and Priority are the outputs to be predicted for staff assignment. The prediction of Crew is a multiclass classification and the prediction of Priority is a binary classification because the


Fig. 1. Workflow of service request processing: service reception (phone, email, online, inspection), service coordination (prioritization, tasking), and service evaluation (questionnaire administration, complaint evaluation, statistical reporting).

dataset specifies 19 Crew classes and two Priority classes (listed in Table 2). It is notable that electricians (EL), laborers (LB), and plumbers (PL) are the major workforce for building maintenance, accounting for 51% of Crew, and that regular tasks (RG) dominate 81% of Priority. The texts in the label Short description in the dataset are chosen as inputs to predict the outputs. The texts in the label Long description would also be ideal inputs, but they are optional, and a majority of the requests do not provide a long description. Other labels (e.g., Location) are alternate inputs used as metadata to supplement Short description. Metadata is structured information that explains an information resource or helps to manage the information. Therefore, the variables for model development are described as follows:

Output #1: Yc = {yc,1, yc,2, yc,3, ..., yc,19}
Output #2: Yp = {yp,1, yp,2}
Inputs: X ∈ ℝ^d

where Yc denotes the label Crew; yc,i denotes the 19 crew classes (1 ≤ i ≤ 19); Yp denotes the label Priority; yp,j denotes the two priority classes (j = 1, 2); and ℝ^d denotes the d-dimensional feature space with d features. In addition, 79,526 service request instances were used for the NLP model development after missing data removal.

Table 2. Summary of output labels and classes.

Label     Class  Description                                 Service request #    %
Crew      AC     Air conditioning                            8020                 10.08
          AM     Auto mechanic                               558                  0.70
          CP     Carpenters                                  6549                 8.23
          CT     Custodial                                   5234                 6.58
          EC     Contractor craft - Other                    4576                 5.75
          EL     Electricians                                12,356               15.52
          EV     Elevator service specialist                 977                  1.23
          GD     Gardeners                                   782                  0.98
          HS     Housing & dining personnel                  244                  0.31
          LB     Laborer                                     18,471               23.21
          LK     Locksmiths                                  6310                 7.93
          MS     Mason                                       625                  0.79
          PL     Plumbers                                    10,088               12.67
          PT     Painters                                    1011                 1.27
          RF     Roofing contractor                          285                  0.36
          SF     Steam fitters                               2163                 2.72
          TD     Truck driver                                1279                 1.61
Priority  EX     Expedited services with faster execution    15,016               18.87
          RG     Regular service tasks, not expedited        64,574               81.13
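The formulation above maps each request text to a d-dimensional BOW vector X ∈ ℝ^d with a class label from Crew or Priority. A minimal Python sketch of this representation (the study itself used Weka; the three requests below are an illustrative stand-in for the 79,526 instances, not data from the paper):

```python
from collections import Counter

# Hypothetical mini-corpus of (Short description, Crew label) pairs.
requests = [
    ("office is too hot", "AC"),
    ("light is out", "EL"),
    ("clogged toilet", "PL"),
]

# Build the d-dimensional unigram feature space from the corpus vocabulary.
vocab = sorted({tok for text, _ in requests for tok in text.split()})

def to_bow(text):
    """Map a request text to its bag-of-words count vector over vocab."""
    counts = Counter(text.split())
    return [counts[tok] for tok in vocab]

X = [to_bow(text) for text, _ in requests]   # inputs: X in R^d
y_crew = [label for _, label in requests]    # output #1: one of the Crew classes
```

Any multiclass classifier (LR, NB, or SVM, as in the paper) can then be trained on (X, y_crew); the Priority output is the same construction with two classes.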

Table 1. Data summary.

Request labels:
- ID: The unique number assigned to each service request.
- Short description: The required description of the building problem, e.g., "Please investigate gas small in 8100 corridor stairwell"; "Heating is not working"; and "Light is out".
- Long description: The optional extended description of the building problem, e.g., "The light flickers continuously. Someone came and changed the light bulb, but it did not do anything. It still flickers. I was told in the beginning of the year that this was a problem with several rooms and that it was electrical. It still causes a problem with my studying because at night it is very dark. My eyes have to get accustomed to light changes constantly."
- Location: The building name where the problem is located.
- How to submit: The channel used to deliver the request, i.e., e-mail, phone, or web.

Solution labels:
- Crew: The workforce assigned to perform the service task, e.g., electrician.
- Priority: The privilege of the assigned service task, i.e., expedited or regular.
- Duration: The hours to complete the assigned service task.
- Supervisor: The person who supervises the assigned service task.


Fig. 2. Flowchart to develop the automated service scheduling model: Step 1, data preparation (perform descriptive analysis, process data and treat missing data, split the dataset into train, dev, and test sets); Step 2, feature extraction (extract and explore baseline features, evaluate feature extraction methods); Step 3, algorithm selection (run main algorithms, evaluate algorithm performance); Step 4, error analysis (analyze errors with the confusion matrix); Step 5, feature engineering (extract additional features, re-run the selected algorithm, and loop until satisfied); Step 6, performance evaluation (explore engineered features, evaluate final algorithm performance).

3.3. Modeling procedure

Fig. 2 exhibits the modeling procedure, a workflow of six steps with which we developed the NLP model. In data preparation, we divided the data into three sets: a training set (60%) for feature extraction, a development set (20%) for feature engineering, and a testing set (20%) for final performance evaluation, following the approach by Owoputi, et al. [30]. Table 3 lists the eight pre-processing methods (M1-M8) used to extract features. They represent eight combinations of BOW components used to exclude redundant information; for example, M2 includes punctuation as a feature, and M7 includes stop-words as features and applies stemming. We evaluated their performance and chose one for feature selection. In error analysis, we used the confusion matrix (also called the error matrix) to assess accuracy for feature engineering, following the approach by Visa, et al. [31]. In feature engineering, we added optional features to the model to improve accuracy in an explorative way. Often, error analysis and feature engineering form a loop in which new features are iteratively tested until accuracy can no longer be improved.

We used three machine learning algorithms: LR, NB, and SVM. They are efficient in classifying BOW features and are frequently used in text classification [32,33]. These algorithms have been used by similar NLP studies in the construction field [23-25]. We used the default Weka parameter settings since they have been optimized to be satisfactory for most learning problems and ensure acceptable baseline prediction performance [34]. Khoshgoftaar, et al. [35] suggest not changing the parameter settings in Weka unless an improvement in classifier performance is evidenced in preliminary analysis.

Table 3. Summary of BOW components and the methods (M1-M8) that include each.

- Punctuation: a mark that separates sentences by meaning, e.g., ",", ".", and "!"; included in M2, M5, M6, and M8.
- Stop-words: frequently-used words that do not carry significant meaning but functionally help build meaning, e.g., "a," "am," "please," "too," and "how"; included in M3, M5, M7, and M8.
- Stem: a word's root form that reduces inflected words, e.g., "work, works, and working"; applied in M4, M6, M7, and M8.

M1 includes none of the three components.
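The M1-M8 combinations toggle three independent pre-processing choices. A self-contained sketch of such a tokenizer (the stop-word list is an illustrative subset, and the crude suffix stripper stands in for a real stemmer such as Porter's; neither is from the paper):

```python
import re
import string

STOP_WORDS = {"a", "am", "please", "too", "how", "is", "the"}  # illustrative subset

def naive_stem(token):
    # Crude suffix stripping standing in for a real stemmer:
    # "works" -> "work", "working" -> "work".
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text, keep_punct=False, keep_stop=False, stem=False):
    """Tokenize a request text under one M1-M8 style combination of components."""
    # Split into word tokens and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    if not keep_punct:
        tokens = [t for t in tokens if t not in string.punctuation]
    if not keep_stop:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if stem:
        tokens = [naive_stem(t) for t in tokens]
    return tokens

# M1-style (no components kept) vs. M5-style (punctuation and stop-words, no stemming):
m1 = preprocess("Office is too hot!")
m5 = preprocess("Office is too hot!", keep_punct=True, keep_stop=True)
```

Under this sketch, M1 reduces the request to ["office", "hot"], while M5 keeps the stop-words "is" and "too" and the trailing "!", which is exactly the information the paper later finds valuable.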


3.4. Model evaluation

We used five statistics to evaluate the model's prediction performance: accuracy, kappa, precision, recall, and F1 score [36]. Accuracy is the percentage of correct predictions. Kappa measures the proportion of true agreement beyond chance to mitigate biased accuracy values by adjusting expected accuracy with a coincidence factor [37-39]. Precision is the ratio of true positive predictions to all positive predictions. Recall is the ratio of true positive predictions to the sum of true positive and false negative predictions. The F1 score is the harmonic mean of precision and recall. These statistics are described in Eqs. (1)-(5):

Accuracy = (True Positive + True Negative) / Total    (1)

Kappa = (P(A) − P(E)) / (1 − P(E))    (2)

where P(A) is the probability of agreement and P(E) is the probability of chance agreement.

Precision = True Positive / (True Positive + False Positive)    (3)

Recall = True Positive / (True Positive + False Negative)    (4)

F1 Score = (2 × Precision × Recall) / (Precision + Recall)    (5)

4. Results

4.1. Algorithm selection

Table 4 summarizes the model performance for feature extraction. The results indicate that the LR algorithm with M8 reaches the greatest performance for predicting Crew (e.g., accuracy = 0.742), and LR with M5 reaches the greatest performance for predicting Priority (e.g., accuracy = 0.873).

Table 4. Comparison of feature extraction methods (⁎: selected method).

Label     Method  # Features  Accuracy (LR / NB / SVM)  Kappa (LR / NB / SVM)
Crew      M1      3448        0.740 / 0.698 / 0.736     0.701 / 0.652 / 0.698
          M2      3470        0.742 / 0.690 / 0.736     0.702 / 0.642 / 0.697
          M3      3509        0.741 / 0.693 / 0.738     0.702 / 0.647 / 0.699
          M4      2930        0.736 / 0.704 / 0.729     0.697 / 0.660 / 0.690
          M5      3531        0.740 / 0.686 / 0.738     0.701 / 0.638 / 0.699
          M6      2952        0.741 / 0.693 / 0.733     0.702 / 0.646 / 0.694
          M7      2981        0.738 / 0.695 / 0.733     0.698 / 0.650 / 0.694
          M8⁎     3003        0.742 / 0.687 / 0.735     0.703 / 0.640 / 0.696
Priority  M1      3448        0.870 / 0.854 / 0.867     0.523 / 0.526 / 0.526
          M2      3470        0.871 / 0.851 / 0.868     0.531 / 0.518 / 0.527
          M3      3509        0.871 / 0.850 / 0.869     0.531 / 0.526 / 0.534
          M4      2930        0.866 / 0.851 / 0.864     0.507 / 0.518 / 0.507
          M5⁎     3531        0.873 / 0.844 / 0.869     0.539 / 0.497 / 0.535
          M6      2952        0.867 / 0.849 / 0.865     0.513 / 0.510 / 0.514
          M7      2981        0.866 / 0.847 / 0.865     0.513 / 0.519 / 0.515
          M8      3003        0.870 / 0.842 / 0.867     0.526 / 0.490 / 0.524

Table 5 lists the results of algorithm selection for Crew and Priority. The results indicate that LR is the appropriate algorithm for predicting Crew in this dataset, as it produces the highest values of accuracy (0.742), kappa (0.703), precision (0.740), recall (0.742), and F1 score (0.741). Pairwise t-tests further confirm that LR performs better than NB (t = 17.884, p < .001) and SVM (t = 4.053, p < .001) at the 99% level. Similarly, the results indicate that LR is the appropriate algorithm for predicting Priority in this study, with the highest values of accuracy (0.873), kappa (0.539), precision (0.864), recall (0.873), and F1 score (0.868). Pairwise t-tests confirm that LR performs better than NB at the 99% level (t = 9.428, p < .001) and SVM at the 90% level (t = 1.827, p = .068). Therefore, LR is selected and SVM is considered as an alternative.

Table 5. Comparison of algorithm performance.

Label     Algorithm  Accuracy  Kappa  Precision  Recall  F1-score
Crew      LR         0.742     0.703  0.740      0.742   0.741
          NB         0.687     0.640  0.685      0.687   0.686
          SVM        0.735     0.696  0.732      0.735   0.733
Priority  LR         0.873     0.539  0.864      0.873   0.868
          NB         0.844     0.497  0.843      0.844   0.844
          SVM        0.869     0.535  0.862      0.869   0.866

4.2. Feature selection

Fig. 3 displays an example of the confusion matrix for the Crew prediction. The confusion matrix is a special type of contingency table. In the matrix, each column represents the instances of an actual label and each row represents the instances of a predicted label [40]. In particular, the instances along the diagonal are correct predictions because the predicted labels match the actual ones. The percentage in each cell indicates the proportion by row. For example, the bottom-left cell (SF/AC, 23) means that 23% of actual "SF" are predicted as "AC", which is incorrect. The matrix illustrates that most incorrect predictions are related to the crew laborers (LB column). The results suggest that the LB label is a major source of prediction errors. In a similar way, error analysis for the Priority class suggests improving the "expedited" (EX) label.

We then iteratively explored new feature types such as metadata, bigrams, and regular expressions. As a result, the metadata Location, bigrams, and ten regular expressions were added to the model. For example, the regular expression "door(\s\w+)? not (close|closing)" is used to address the erroneous unigram "door" because "door" is related to phrases such as "door was not closing" and "door not close." The regular expression "(light|lights|power)(\s\w+)? out" is used to address the erroneous unigram "light." In a regular expression, "\s" refers to whitespace characters such as space, tab, and newline, and "\w" refers to word characters; "|" is the OR operator, "+" means one or more, and "?" means once or none.

Table 6 summarizes the feature selection comparisons. Each option (row) denotes a combination of feature selections in the model; for example, the Option 1 model includes only the unigram feature. Pairwise t-tests confirm that Option 3 is selected for predicting Crew and Priority. Option 3 performs better than Option 2 (t = 5.355, p < .001 for Crew; t = 5.073, p < .001 for Priority), which in turn performs better than Option 1 (t = 8.564, p < .001 for Crew; t = 2.615, p = .009 for Priority). Option 3 also performs slightly better than Option 4 (t = 1.942, p = .052 for Priority). The findings suggest that, in this case, metadata and bigrams improve the prediction, while the inclusion of regular expressions does not obviously improve the prediction because the features in this case are not too complicated for n-grams to explain [17]. The final selected features help correct approximately 35% of previously incorrect predictions.

4.3. Performance evaluation

Table 7 lists the final prediction performances for Crew and Priority. The t-tests indicate that the accuracy for Crew improves from 0.742 to 0.772 at the 99% level (t = 11.871, p < .001). The 3% improvement is meaningful for Crew since a 70% accuracy is difficult to achieve for multiclass classification with high dimensionality [41].
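The five evaluation statistics of Eqs. (1)-(5) follow directly from the confusion counts of a binary classifier. A small self-contained sketch (function and variable names are ours, not from the paper):

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, kappa, precision, recall, and F1 (Eqs. (1)-(5)) from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total                          # Eq. (1)
    # Kappa: observed agreement P(A) against chance agreement P(E),
    # where P(E) sums the chance of agreeing on each class.
    p_a = accuracy
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (p_a - p_e) / (1 - p_e)                       # Eq. (2)
    precision = tp / (tp + fp)                            # Eq. (3)
    recall = tp / (tp + fn)                               # Eq. (4)
    f1 = 2 * precision * recall / (precision + recall)    # Eq. (5)
    return accuracy, kappa, precision, recall, f1
```

For the multiclass Crew output, the same quantities are computed per class and averaged, which is how Weka reports the weighted precision, recall, and F1 shown in Tables 4 and 5.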


Fig. 3. Confusion matrix of error analysis for Crew, where cells on the diagonal indicate correct predictions and other cells indicate incorrect predictions (rows: predicted labels; columns: actual labels; the 19 classes comprise the 17 crew codes of Table 2 plus INS and LS; cell values are percentages by row).

The accuracy for Priority improves from 0.873 to 0.880 at the 99% level (t = 4.055, p < .001), which is a very good performance. Table 8 lists the top 20 features with a high correlation coefficient to the output variables. In the NLP model, each selected feature is an input variable, and Crew and Priority are the two output variables. The pre-and-post comparison indicates that eight new bigrams enter the top 20 features for predicting Crew and nine new bigrams enter for predicting Priority. The results suggest that bigrams play an important role in improving performance in this case. We interpret that service request texts are short, and this nature requires bigrams to improve prediction power.

Table 7. Comparison between initial and final performances.

Label     Test     Accuracy  Kappa  Precision  Recall  F1-score
Crew      Initial  0.742     0.703  0.740      0.742   0.741
          Final    0.772     0.738  0.770      0.772   0.771
Priority  Initial  0.873     0.539  0.864      0.873   0.868
          Final    0.880     0.580  0.874      0.880   0.877
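The pairwise t-tests reported throughout Section 4 compare two models over repeated evaluation runs. A stdlib sketch of the paired t statistic with hypothetical per-fold accuracies (the actual t values in the paper come from the study's own runs, not from these numbers):

```python
import math
import statistics

def paired_t(scores_a, scores_b):
    """Paired t statistic over matched per-fold scores of two models.

    A large |t| suggests the mean difference is unlikely to be chance;
    compare against a t table with len(scores_a) - 1 degrees of freedom.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return mean / (sd / math.sqrt(n))

# Hypothetical per-fold accuracies for two classifiers:
lr_folds = [0.74, 0.75, 0.73, 0.76, 0.74]
nb_folds = [0.69, 0.70, 0.68, 0.70, 0.69]
t_stat = paired_t(lr_folds, nb_folds)
```

Pairing the folds removes the fold-to-fold variance that both models share, which is why the paired test is the standard choice for comparing classifiers on the same splits.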

5. Discussion

5.1. Implications for staff assignment automation

Our model provides implications for staff assignment in both the pre- and post-construction phases. At the preconstruction phase, our model can be used to automate staff assignment in response to change order requests. An automatic classification system using NLP plays a significant role in construction process automation since a large amount of information in construction projects is exchanged in the form of text documents, including change order requests [42]. Change orders are one of the most general concerns for construction management, and changes occur at any phase of a project for various reasons and from different sources [43]. The model reported in this study can be applied to automate staff assignment by analyzing the texts of change order requests, such as work state, processes, and methods, in a variety of work conditions and scope changes.

Our results suggest that stop-words and punctuation are valuable BOW components for NLP applications on staff assignment. This finding is consistent with existing studies that underline the prediction power of stop-words and punctuation. Iacobelli, et al. [44] included stop-words in features to classify the personality of bloggers and obtained the highest accuracy. They highlight the importance of stop-words (common words) for classifying personal traits since their presence facilitates more accurate classification performance. Davidov, et al. [45] claimed that the inclusion of punctuation provides a substantial boost to classification quality in their analysis of a Twitter dataset. Another study in the science domain by Baradad and Mugabushaka [46] confirmed that corpus-specific stop-words improve the accuracy of textual analysis in scientometrics. In our case, the stop-word "too", as the top-ranked feature, has the highest correlation coefficient with the workforce selection. It emphasizes unfavorable indoor quality and plays an important role in distinguishing the Crew classes. The word "too" is

Table 6. Results of feature engineering (⁎: selected option).

Label     Option  Unigram  Meta-data  Bi-gram  Regular expression  Accuracy  Kappa  Precision  Recall  F1-score  # Features
Crew      1       Yes      -          -        -                   0.741     0.703  0.741      0.741   0.741     3003
          2       Yes      Yes        -        -                   0.753     0.716  0.753      0.753   0.753     3005
          3⁎      Yes      Yes        Yes      -                   0.767     0.733  0.766      0.767   0.766     12,451
          4       Yes      Yes        Yes      Yes                 0.767     0.733  0.766      0.767   0.766     12,452
Priority  1       Yes      -          -        -                   0.872     0.534  0.863      0.872   0.868     3531
          2       Yes      Yes        -        -                   0.875     0.545  0.866      0.875   0.871     3533
          3⁎      Yes      Yes        Yes      -                   0.884     0.588  0.877      0.884   0.880     13,066
          4       Yes      Yes        Yes      Yes                 0.884     0.588  0.877      0.884   0.881     13,067


Table 8. Top selected features before and after feature engineering.

                Pre feature engineering           Post feature engineering
Label     Rank  Feature       Corr    Type        Feature           Corr    Type
Crew      1     too           0.3306  unigram     too               0.3306  unigram
Crew      2     ac            0.3084  unigram     ac                0.3084  unigram
Crew      3     air           0.2600  unigram     air               0.2600  unigram
Crew      4     cold          0.2460  unigram     check_ac          0.2466  bigram
Crew      5     thermostat    0.1982  unigram     cold              0.2460  unigram
Crew      6     warm          0.1880  unigram     ac_.              0.2338  bigram*
Crew      7     check         0.1847  unigram     please_check      0.2058  bigram
Crew      8     ahu           0.1825  unigram     too_cold          0.2050  bigram
Crew      9     unit          0.1787  unigram     too_hot           0.2039  bigram
Crew      10    fan           0.1768  unigram     BOL_too           0.1992  bigram
Crew      11    hot           0.1625  unigram     thermostat        0.1982  unigram
Crew      12    temp          0.1547  unigram     warm              0.1880  unigram
Crew      13    a/c           0.1535  unigram     check             0.1847  unigram
Crew      14    degree        0.1466  unigram     ahu               0.1825  unigram
Crew      15    hvac          0.1439  unigram     unit              0.1787  unigram
Crew      16    cool          0.1437  unigram     fan               0.1768  unigram
Crew      17    humid         0.1336  unigram     too_warm          0.1683  bigram
Crew      18    temperature   0.1307  unigram     hot               0.1625  unigram
Crew      19    vent          0.1284  unigram     cold              0.1550  bigram
Crew      20    flow          0.1276  unigram     temp              0.1547  unigram
Priority  1     osr           0.2007  unigram     osr               0.2007  unigram
Priority  2     toilet        0.1850  unigram     BOL_osr           0.1912  bigram
Priority  3     ~             0.1800  unigram*    toilet            0.1850  unigram
Priority  4     breaker       0.1697  unigram     ~                 0.1800  unigram*
Priority  5     clogged       0.1683  unigram     osr_~             0.1785  bigram*
Priority  6     (             0.1584  unigram*    ~_(               0.1769  bigram*
Priority  7     )             0.1583  unigram*    breaker           0.1697  unigram
Priority  8     elevator      0.1578  unigram     clogged           0.1683  unigram
Priority  9     tripped       0.1549  unigram     (                 0.1584  unigram*
Priority  10    unclog        0.1474  unigram     )                 0.1583  unigram*
Priority  11    not           0.1324  unigram     elevator          0.1578  unigram
Priority  12    alarm         0.1272  unigram     tripped           0.1549  unigram
Priority  13    .             0.1173  unigram*    breaker_EOL       0.1511  bigram
Priority  14    reset         0.1169  unigram     unclog_toilet     0.1475  bigram
Priority  15    commode       0.1114  unigram     unclog            0.1474  unigram
Priority  16    stuck         0.1084  unigram     tripped_breaker   0.1468  bigram
Priority  17    relamp        0.1064  unigram     BOL_tripped       0.1437  bigram
Priority  18    no            0.1054  unigram     BOL_clogged       0.1353  bigram
Priority  19    light         0.1024  unigram     clogged_toilet    0.1333  bigram
Priority  20    power         0.0943  unigram     not               0.1324  unigram

Note: BOL = beginning of a line; EOL = end of a line; "_" = space; Corr = absolute correlation coefficient; * denotes features that involve punctuation.
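The "Corr" column in Table 8 is the absolute correlation between a feature's presence in a request and the target label. For binary presence indicators this can be computed with plain Pearson correlation; the sketch below uses a four-line toy corpus and an invented crew label, not the study's 82,106 records:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy request corpus; labels are 1 for the HVAC crew, 0 otherwise (made-up data)
requests = ["office too hot", "too cold in lab", "clogged toilet", "relamp light out"]
labels = [1, 1, 0, 0]
presence = [1 if "too" in r.split() else 0 for r in requests]
score = abs(pearson_r(presence, labels))
print(score)  # 1.0: "too" perfectly separates the HVAC requests in this toy set
```

Repeating this for every unigram and bigram in the vocabulary and keeping the top-ranked features is one way to reproduce a ranking of the kind shown in Table 8.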

elevator accidents, and flooding. Examples of “osr” requests include “building fire alarm,” “elevator make[s] loud noise and kind of jumping,” “running faucet,” and “clogged toilet.”

usually used with the words “hot,” “cold,” or “humid” to indicate ACrelated work in the request texts. For example, “public spaces too cold in building”; “office is too hot”; and “career services too humid. Please check AC.” Similarly, the word “please” delivers a high prediction power when used with the verb “check” as a bigram. The inclusion of BOW components may be case by case. Our results suggest M8 for the prediction of Crew prediction and M5 for the Priority prediction. The difference between M5 and M8 is that M5 does not include stemming. The inclusion is highly dependent on the nature of dataset and may not apply to other NLP studies. For example, punctuation and stop-words help retain the features that capture properties of natural sentences from original text annotations in this study [47]. Stemming here helps differentiate varying forms of verbs that have a role for accurate classification. For example, “water leaked” and “water leaking” inform different circumstances which help Priority prediction. Yet, this function of verb differentiation does not help Crew prediction. For example, either “water leaks” or “water leaking” would need a plumber to fix it. Customized terms are useful to improve the prediction of staff assignment. In this study, the term osr is crafted by the operations division to designate “operations-and-systems-related” requests. Many service requests tagged with “osr” are classified as “expedited” class in the Priority because the failure to respond to “osr” requests may damage operational systems, for example, resulting in fires, electric shocks,
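The feature choices discussed in this section (retaining stop-words and punctuation, adding beginning- and end-of-line markers, and combining unigrams with bigrams) can be sketched as a small tokenizer. The study's own pipeline used Weka-based tooling; the standalone function below, including its name and the exact "BOL"/"EOL" marker strings, is only our illustration:

```python
import re

def features(text):
    """Unigrams plus bigrams that keep stop-words and punctuation, with
    BOL/EOL boundary markers, mirroring the feature style shown in Table 8."""
    # Split into word tokens and standalone punctuation; no stop-word removal
    tokens = re.findall(r"[a-z0-9']+|[^\sa-z0-9]", text.lower())
    padded = ["BOL"] + tokens + ["EOL"]
    unigrams = tokens
    bigrams = [f"{a}_{b}" for a, b in zip(padded, padded[1:])]
    return unigrams + bigrams

feats = features("Office is too hot. Please check AC")
print("too_hot" in feats)       # True: high-signal bigram kept
print("please_check" in feats)  # True: stop-word retained inside a bigram
print("." in feats)             # True: punctuation survives as a feature
```

Dropping stop-words or punctuation at this stage would discard bigrams such as "please_check" and "ac_." that Table 8 ranks among the strongest predictors.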

5.2. Implications for building maintenance practices

The findings unveil important concerns from occupants that designers, engineers, and operators should consider starting from the building design phase.

Indoor environmental quality (IEQ) is the top concern among occupants. The highly ranked features are thermal comfort-related words (e.g., "temperature," "warm," "hot," "cool," "cold," "degree," or "humid") and HVAC-related words (e.g., "AC," "thermostat," "AHU," "fan," or "vent"). This finding indicates that a large number of service requests are associated with occupants' thermal comfort. This is understandable, as people spend 90% of their time in buildings and become sensitive to temperature and humidity. In response, frequent IEQ inspections and a well-prepared HVAC workforce are recommended for efficient building maintenance.

Timely building feedback may reduce wasted maintenance time and effort. The word "check" is the only highlighted verb feature, rather than "fix" or "repair". This strong action request implies occupants' uncertainty and anxiety about the performance of building systems: whenever they feel uncomfortable, they prefer to request services to verify potential problems. The finding indicates a lack of building feedback, so that building operation status cannot be delivered to occupants in a timely manner. As a solution, we suggest applying smart building technology to address the shortage of human-building interactions in building maintenance [48].

The results suggest that the mechanical, electrical, and plumbing (MEP) workforce should always be ready for an expedited response to urgent building problems, since most urgent building problems are related to electricity and water. The feature selection for predicting the Priority indicates three key predictors: "clogged toilet," "elevator," and "relamp light." These objects are closely related to human health and safety; any undesirable status of these objects requires expedited service.

5.3. Implications for smart building systems

This study provides a piece of the puzzle for the big blueprint of building systems automation. (1) An immediate application of our NLP model is the development of a text classification module as an add-in to existing building service systems. The NLP module can also be applied to web-based building management systems, for example, to detect building performance deficits [49]. (2) Our NLP model can be integrated into the emerging Building Information Modeling-Facility Management (BIM-FM) systems, which underline the value of human request responses [50]; it provides a text-data processing module that connects building users with BIM-FM. This function can be extended to a smart-city scale, ultimately supporting accurate decision making for city planners, designers, and managers. (3) Our NLP model can be incorporated with speech recognition-based control in smart buildings [51]. Users can send service requests by talking, which is more user-friendly and efficient; the model can classify a request once the voice is recognized and transcribed. (4) Our NLP model contributes to intelligent building control systems and human-building interactions. In commercial buildings, the central control manages the HVAC, lighting, elevator, communication, and security systems. Although most control systems have self-diagnosis functions, they still need inputs from human feedback [52]. Our model can close the loop of human-building interaction by automatically linking human feedback to the building controls.

5.4. Limitations and future research

Limitations exist in this study. First, there is a considerable difference between the accuracy and kappa values for the Priority prediction. The accuracy of predicting the regular label can easily reach 80% because the data distribution is skewed; considering the high probability of a correct prediction occurring by chance, the kappa value may be the more robust indicator. Second, a limitation of our sampling approach is that stratified sampling, rather than the random sampling used in this study, could ensure that the train and test sets contain approximately the same percentage of samples of each target class as the complete set. Our sampling approach does not, however, affect the scientific rigor of the results. Third, this study did not optimize parameters and used the default Weka settings in modeling. Although the performance is acceptable, we recognize that parameter optimization could improve it further [34]. Future studies can address these limitations by adopting data transformation, sampling strategies, and ensemble-based methods [53,54], or by applying more sophisticated datasets and tools. In addition, other formulations such as binary classification may be explored to improve prediction in the future.

6. Conclusion

Staff assignment is a key component of task scheduling in construction projects. However, staff assignment is challenging due to low productivity in construction management practices, since it is manually processed by assigners. To address this problem, this paper reports an NLP model that automatically decides the workforce at 77% accuracy and the work priority at 88% accuracy. In other words, given a service request message, our model can automatically assign the correct crew at a 77% rate and determine the correct work priority at an 88% rate. This accuracy is high for multiclass classification with high dimensionality. This study makes four contributions to the body of knowledge and industry practices. (1) The model automates the decision making that determines the appropriate workforce with the required skills to perform tasks, which reduces processing time and the management team's workload. (2) The results contribute to building maintenance practices and recommend building status feedback to occupants and a sufficient MEP workforce. (3) The model contributes to NLP applications in the AEC domain and highlights the value of stop-words, punctuation, and customized terms. (4) The model offers a piece of the puzzle for building systems automation and enhanced human-building interactions.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] A.T. Ernst, H. Jiang, M. Krishnamoorthy, D. Sier, Staff scheduling and rostering: a review of applications, methods and models, Eur. J. Oper. Res. 153 (1) (2004) 3–27, https://doi.org/10.1016/S0377-2217(03)00095-X.
[2] E. Yang, I. Bayapu, Big data analytics and facilities management: a case study, Facilities 38 (3/4) (2019) 268–281, https://doi.org/10.1108/F-01-2019-0007.
[3] M. Arias, J. Munoz-Gama, M. Sepúlveda, J.C. Miranda, Human resource allocation or recommendation based on multi-factor criteria in on-demand and batch scenarios, European Journal of Industrial Engineering 12 (3) (2018) 364–404, https://doi.org/10.1504/EJIE.2018.092009.
[4] W.J. Gutjahr, S. Katzensteiner, P. Reiter, C. Stummer, M. Denk, Competence-driven project portfolio selection, scheduling and staff assignment, CEJOR 16 (3) (2008) 281–306, https://doi.org/10.1007/s10100-008-0081-z.
[5] M.C. Wu, S.H. Sun, A project scheduling and staff assignment model considering learning effect, Int. J. Adv. Manuf. Technol. 28 (11–12) (2006) 1190–1195, https://doi.org/10.1007/s00170-004-2465-0.
[6] K. El-Dash, Assessing human resource management in construction projects in Kuwait, Journal of Asian Architecture and Building Engineering 6 (1) (2007) 65–71, https://doi.org/10.3130/jaabe.6.65.
[7] W.J. Gutjahr, P. Reiter, Bi-objective project portfolio selection and staff assignment under uncertainty, Optimization 59 (3) (2010) 417–445, https://doi.org/10.1080/02331931003700699.
[8] K.O. Roper, R.P. Payant, The Facility Management Handbook, AMACOM Books, New York, USA, 2014.
[9] Y. Liu, J. Wang, Y. Yang, J. Sun, A semi-automatic approach for workflow staff assignment, Comput. Ind. 59 (5) (2008) 463–476, https://doi.org/10.1016/j.compind.2007.12.002.
[10] K.M. Emde, Estimating Health Risks from Infrastructure Failures, IWA Publishing, 2007.
[11] A. Tatiya, D. Zhao, M. Syal, G.H. Berghorn, R. LaMore, Cost prediction model for building deconstruction in urban areas, J. Clean. Prod. 195 (2018) 1572–1580, https://doi.org/10.1016/j.jclepro.2017.08.084.
[12] B. Wauters, The added value of facilities management: benchmarking work processes, Facilities 23 (3/4) (2005) 142–151, https://doi.org/10.1108/02632770510578511.
[13] S. Wu, Q. Shen, Y. Deng, J. Cheng, Natural-language-based intelligent retrieval engine for BIM object database, Comput. Ind. 108 (2019) 73–88, https://doi.org/10.1016/j.compind.2019.02.016.
[14] D. Amaratunga, D. Baldry, M. Sarshar, Assessment of facilities management performance–what next? Facilities 18 (1/2) (2000) 66–75, https://doi.org/10.1108/02632770010312187.
[15] A. Parida, G. Chattopadhyay, Development of a multi-criteria hierarchical framework for maintenance performance measurement, J. Qual. Maint. Eng. 13 (3) (2007) 241–258, https://doi.org/10.1108/13552510710780276.
[16] E.D. Liddy, Natural language processing, Encyclopedia of Library and Information Science, CRC Press, Boca Raton, 2001, https://doi.org/10.1081/e-elis3-120008664.
[17] A. Kao, S.R. Poteet, Natural Language Processing and Text Mining, Springer, Berlin, 2007, https://doi.org/10.1007/978-1-84628-754-1.
[18] Y. Mo, D. Zhao, M. Syal, A. Aziz, Construction Work Plan Prediction for Facility
Management Using Text Mining, Computing in Civil Engineering, ASCE Publishing, 2017, pp. 92–100, https://doi.org/10.1061/9780784480847.012.
[19] I.H. Witten, E. Frank, M.A. Hall, C.J. Pal, Practical machine learning tools and techniques, Data Mining (2016) 91–160, https://doi.org/10.1016/b978-0-12-804291-5.00004-0.
[20] A. Genkin, D.D. Lewis, D. Madigan, Sparse logistic regression for text categorization, Working Group on Monitoring Message Streams Project Report, Center for Discrete Mathematics and Theoretical Computer Science, 2005. Retrieved from http://archive.dimacs.rutgers.edu/Research/MMS/loglasso-v3a.pdf (accessed June 26, 2018).
[21] F. Colas, P. Brazdil, Comparison of SVM and some older classification algorithms in text classification tasks, Proceedings of 2006 International Conference on Artificial Intelligence in Theory and Practice, Springer, 2006, pp. 169–178, https://doi.org/10.1007/978-0-387-34747-9_18.
[22] G.G. Chowdhury, Natural language processing, Annu. Rev. Inf. Sci. Technol. 37 (1) (2003) 51–89, https://doi.org/10.1002/aris.1440370103.
[23] D.M. Salama, N.M. El-Gohary, Semantic text classification for supporting automated compliance checking in construction, J. Comput. Civ. Eng. 30 (1) (2013) 04014106, https://doi.org/10.1061/(ASCE)CP.1943-5487.0000301.
[24] F. Zhang, H. Fleyeh, X. Wang, M. Lu, Construction site accident analysis using text mining and natural language processing techniques, Autom. Constr. 99 (2019) 238–248, https://doi.org/10.1016/j.autcon.2018.12.016.
[25] A.J.P. Tixier, M.R. Hallowell, B. Rajagopalan, D. Bowman, Automated content analysis for construction safety: a natural language processing system to extract precursors and outcomes from unstructured injury reports, Autom. Constr. 62 (2016) 45–56, https://doi.org/10.1016/j.autcon.2015.11.001.
[26] D. Zhao, A.P. McCoy, B.M. Kleiner, J. Du, T.L. Smith-Jackson, Decision-making chains in electrical safety for construction workers, J. Constr. Eng. Manag. 142 (1) (2016) 4015055, https://doi.org/10.1061/(ASCE)CO.1943-7862.0001037.
[27] X. Lv, N. El-Gohary, Text analytics for supporting stakeholder opinion mining for large-scale highway projects, Procedia Engineering 145 (2016) 518–524, https://doi.org/10.1016/j.proeng.2016.04.039.
[28] T.P. Williams, J. Gong, Predicting construction cost overruns using text mining, numerical data and ensemble classifiers, Autom. Constr. 43 (2014) 23–29, https://doi.org/10.1016/j.autcon.2014.02.014.
[29] C.M. Bishop, Pattern Recognition and Machine Learning, Springer, Cambridge, U.K., 2006.
[30] O. Owoputi, B. O'Connor, C. Dyer, K. Gimpel, N. Schneider, N.A. Smith, Improved part-of-speech tagging for online conversational text with word clusters, Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics, The Association for Computational Linguistics, 2013, pp. 380–390. Retrieved from https://www.aclweb.org/anthology/N13-1039.pdf (accessed November 10, 2019).
[31] S. Visa, B. Ramsay, A.L. Ralescu, E. Van Der Knaap, Confusion matrix-based feature selection, Proceedings of the 22nd Midwest Artificial Intelligence and Cognitive Science Conference, CEUR-WS Publishing, 2011, pp. 120–127. Retrieved from http://ceur-ws.org/Vol-710/paper37.pdf (accessed November 10, 2019).
[32] E. Mayfield, C.P. Rose, An interactive tool for supporting error analysis for text mining, Proceedings of the 2010 Conference of the North American Chapter of the Association for Computational Linguistics, The Association for Computational Linguistics, 2010, pp. 25–28. Retrieved from https://www.aclweb.org/anthology/N10-2007.pdf (accessed November 10, 2019).
[33] M.D. Shermis, J. Burstein, Handbook of Automated Essay Evaluation: Current Applications and New Directions, Routledge, London, U.K., 2013, https://doi.org/10.4324/9780203122761.
[34] C. Thornton, F. Hutter, H.H. Hoos, K. Leyton-Brown, Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms, Proceedings of the 19th International Conference on Knowledge Discovery and Data Mining, ACM, 2013, pp. 847–855, https://doi.org/10.1145/2487575.2487629.
[35] T.M. Khoshgoftaar, C. Seiffert, J. Van Hulse, A. Napolitano, A. Folleco, Learning with limited minority class data, Proceedings of 6th International Conference on Machine Learning and Applications, IEEE, 2007, pp. 348–353, https://doi.org/10.1109/icmla.2007.76.
[36] S. Shirowzhan, S.M. Sepasgozar, H. Li, J. Trinder, P. Tang, Comparative analysis of machine learning and point-based algorithms for detecting 3D changes in buildings over time using bi-temporal LiDAR data, Autom. Constr. 105 (2019) 102841, https://doi.org/10.1016/j.autcon.2019.102841.
[37] K.J. Boyko, Detection and Elimination of Rock Face Vegetation from Terrestrial Lidar Data Using the Virtual Articulating Conical Probe Algorithm, Doctoral dissertation, Geological Engineering, Missouri University of Science and Technology, 2019. Retrieved from https://search.proquest.com/docview/2272175994 (accessed December 26, 2019).
[38] L. Daly, G.J. Bourke, Interpretation and Uses of Medical Statistics, John Wiley & Sons, New York, USA, 2008, https://doi.org/10.1002/9780470696750.
[39] R. Garg, E. Oh, A. Naidech, K. Kording, S. Prabhakaran, Automating ischemic stroke subtype classification using machine learning and natural language processing, J. Stroke Cerebrovasc. Dis. 28 (7) (2019) 2045–2051, https://doi.org/10.1016/j.jstrokecerebrovasdis.2019.02.004.
[40] E. Mayfield, D. Adamson, C. Rose, LightSide Researcher's Workbench User Manual, Carnegie Mellon University, Pittsburgh, PA, 2014. Retrieved from http://ankara.lti.cs.cmu.edu/side/LightSide_Researchers_Manual.pdf (accessed January 30, 2020).
[41] D.M. Farid, L. Zhang, C.M. Rahman, M.A. Hossain, R. Strachan, Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks, Expert Syst. Appl. 41 (4) (2014) 1937–1946, https://doi.org/10.1016/j.eswa.2013.08.089.
[42] W. Yu, J. Hsu, Content-based text mining technique for retrieval of CAD documents, Autom. Constr. 31 (2013) 65–74, https://doi.org/10.1016/j.autcon.2012.11.037.
[43] B.-G. Hwang, L.K. Low, Construction project change management in Singapore: status, importance and impact, Int. J. Proj. Manag. 30 (7) (2012) 817–826, https://doi.org/10.1016/j.ijproman.2011.11.001.
[44] F. Iacobelli, A.J. Gill, S. Nowson, J. Oberlander, Large Scale Personality Classification of Bloggers, Springer, Berlin, ISBN 9783642245701, 2011, pp. 568–577, https://doi.org/10.1007/978-3-642-24571-8_71.
[45] D. Davidov, O. Tsur, A. Rappoport, Enhanced sentiment learning using Twitter hashtags and smileys, 2010 Conference of International Committee on Computational Linguistics, The Association for Computational Linguistics, 2010, pp. 241–249. https://www.aclweb.org/anthology/C10-2028.pdf (accessed December 3, 2019).
[46] V.P. Baradad, A. Mugabushaka, Corpus specific stop words to improve the textual analysis in scientometrics, Proceedings of 15th International Conference of the International Society for Scientometrics and Informetrics, ISSI Society, 2015, pp. 999–1005. http://www.issi-society.org/proceedings/issi_2015/0999.pdf (accessed December 23, 2019).
[47] R. Kraft, J. Zien, Mining anchor text for query refinement, Proceedings of the 13th International Conference on World Wide Web, ACM, 2004, pp. 666–674, https://doi.org/10.1145/988672.988763.
[48] D. Zhao, A.P. McCoy, J. Du, P. Agee, Y. Lu, Interaction effects of building technology and resident behavior on energy consumption in residential buildings, Energy and Buildings 134 (2017) 223–233, https://doi.org/10.1016/j.enbuild.2016.10.049.
[49] E. Corry, P. Pauwels, S. Hu, M. Keane, J. O'Donnell, A performance assessment ontology for the environmental and energy management of buildings, Autom. Constr. 57 (2015) 249–259, https://doi.org/10.1016/j.autcon.2015.05.002.
[50] T.W. Kang, C.H. Hong, A study on software architecture for effective BIM/GIS-based facility management data integration, Autom. Constr. 54 (2015) 25–38, https://doi.org/10.1016/j.autcon.2015.03.019.
[51] O. Müller, I. Junglas, S. Debortoli, J. vom Brocke, Using text analytics to derive customer service management benefits from unstructured data, MIS Q. Exec. 15 (4) (2016) 243–258. https://aisel.aisnet.org/misqe/vol15/iss4/4/ (accessed July 30, 2018).
[52] M. Asadullah, A. Raza, An overview of home automation systems, Proceedings of 2nd International Conference on Robotics and Artificial Intelligence, IEEE, 2016, pp. 27–31, https://doi.org/10.1109/ICRAI.2016.7791223.
[53] N.V. Chawla, Data mining for imbalanced datasets: an overview, Data Mining and Knowledge Discovery Handbook, Springer, Berlin, 2009, pp. 875–886, https://doi.org/10.1007/978-0-387-09823-4_45.
[54] B. Krawczyk, Learning from imbalanced data: open challenges and future directions, Progress in Artificial Intelligence 5 (4) (2016) 221–232, https://doi.org/10.1007/s13748-016-0094-0.