Automation in Construction 16 (2007) 392 – 407 www.elsevier.com/locate/autcon
Product data modeling using GTPPM — A case study

Ghang Lee a,*, Rafael Sacks b, Charles Eastman c

a Department of Architectural Engineering, Yonsei University, 134 Sinchon-Dong, Seodaemun-Gu, Seoul 120-749, Republic of Korea
b Civil and Environmental Engineering, Technion - Israel Institute of Technology
c College of Architecture/College of Computing, Georgia Institute of Technology

Accepted 13 May 2006
Abstract

The Georgia Tech Process to Product Modeling (GTPPM) is a formal process-centric product modeling method. It enables capture of domain-specific information and work processes through process modeling. The method was initially developed in response to the need to integrate multiple use-cases with differing data definitions from different companies. It automates and applies formal methods to aspects of process and product modeling that have traditionally been negotiated by committees. It has also been deployed in several research projects as an information flow analysis method rather than as a product modeling method. This paper reports a case of deploying GTPPM as a product modeling method, using a purpose-built software tool for its implementation. The case study, in the domain of precast concrete construction, demonstrates that it is possible to semi-automatically derive a product data model from the collected information through normalization, information integration, and conflict resolution processes.

© 2006 Elsevier B.V. All rights reserved.

Keywords: GTPPM; Product modeling; Process modeling; Precast concrete
1. Introduction

Product (data) modeling started in the 1980s as an effort to develop a standard data format for exchanging product information between different software applications. Several international product modeling efforts, such as ISO-STEP and IAI-IFC, have been initiated, and much work has been invested in developing standard product models. (See the Glossary at the end of this paper for acronyms and abbreviations used in this paper.) The most comprehensive effort in the AEC arena is the IAI's development of the Industry Foundation Classes (IFC) model, which is intended to cover the whole gamut of building and construction. The current IFC model – version 2x2 [12] – has received broad acceptance and has been used in a growing number of projects.

⁎ Corresponding author. Tel.: +82 2 2123 5785; fax: +82 2 365 4668. E-mail address: [email protected] (G. Lee).
0926-5805/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.autcon.2006.05.004

Current product modeling practices can be categorized into two types. The ISO-STEP product modeling process, from elicitation of domain knowledge, through reconciliation of the meanings of vernacular terms, proposal and refinement of appropriate data structures, and 'democratic' approval of product model candidates, has largely been performed by task groups (e.g., TC 184 SC/4 workgroups) supported by product modeling professionals. The resultant model, called an Application Reference Model, is later integrated with other standard ISO-STEP models, called Integrated Resources and Application Protocols. As a result, current product modeling practice is very time-consuming and laborious. It commonly takes about 5 to 10 years to generate and validate one product model, including the committee review process to approve the developed model as a standard.

The IAI has taken a slightly different approach. Since its inauguration in August 1994, the IAI focused on developing a framework for the IFC based on ISO-STEP models until the first commercial-level model (IFC 1.5.1) was released four years later, in July 1998. Since then, IFC extension models have been developed as addenda to the main framework model. In general, it takes less time to develop IFC extension models than ISO-STEP models because IFC extension models are often defined as extended subtype structures of the IFC kernel layer, and by only a small number of product modelers. Nevertheless, the development of extension models still takes almost 3 years on average (Table 1). The consequence of long development durations is that product model development often lags behind development of the very applications that generate the data whose transfer the models are intended to
Table 1
Development durations for IFC extension models (source: Ref. [29])

| IFC extension model                                             | Development period     | Development duration (years:months) |
| HVAC performance validation [BS-7]                              | January 1998–May 2003  | 5:4 |
| HVAC modeling and simulation [BS-8]                             | June 2001–May 2003     | 3:0 |
| Network IFC: IFC for cable networks in buildings [BS-9]         | January 2002–May 2003  | 1:4 |
| Code compliance support [CS-4]                                  | May 2001–May 2003      | 2:1 |
| Electrical installations in buildings [EL-1]                    | April 2002–May 2003    | 1:2 |
| Costs, accounts and financial elements [FM-8]                   | March 2001–May 2003    | 2:3 |
| Material selection, specification and procurement [PM-3]        | 2001–May 2003          | 2:0 |
| Steel frame constructions [ST-1]                                | October 1998–May 2003  | 4:7 |
| Reinforced concrete structures and foundation structures [ST-2] | January 1997–May 2003  | 6:4 |
| Precast concrete construction [ST-3]                            | 2001–May 2003          | 2:0 |
| Structural analysis model and steel constructions [ST-4]        | August 2000–May 2003   | 2:8 |
| IFC drafting extension [XM-4]                                   | April 2001–April 2003  | 2:0 |
| Average                                                         |                        | 2:9 |
facilitate. This issue will grow as product models become more commonly used. While development durations for any product model or product model extension naturally depend on budget and scope, it appears that extension models with complex geometry information (e.g., the IFC-ST series in Table 1) require longer durations than other extension models. The process can be made more efficient, and its results more reliable, than at present.

A new process-centric product modeling method, the Georgia Tech Process to Product Modeling (GTPPM) method, was developed with these issues in mind. The strategy was to develop an effective method that starts by allowing each company to define its current or planned processes in detail. This research effort included the development of formal analysis methods for converting the raw information flow data into semantic structures, and for converting those structures into product model constructs. It also involved conceptual and user interface development to facilitate domain experts' entry of relevant, correct, and complete information used in the activities being modeled. The logic and rules governing the method's function have been documented previously [7,17,21].

GTPPM has been deployed and used across a spectrum of industrial and academic applications. It has been deployed iteratively in the Precast Concrete Software Consortium1 (PCSC)

1 The PCSC is a consortium of major precast concrete producers in Canada and the US formed in 2001. Its goals are to fully automate and integrate engineering, production, and construction operations, to gain productivity, and ultimately to increase market share. As the means to achieve these goals, the PCSC chose to develop an intelligent 3D parametric CAD system and a Precast Concrete Product Model (PCPM) to enable data exchange between the diverse systems used through the lifecycle of precast concrete pieces.
project between 2001 and 2004 [27] and in research projects at Purdue [3], Carnegie Mellon [9,28], Georgia Tech [27], and the Technion [25]. Sacks et al. [27] collected fourteen information flow models developed by North American precast concrete producers using GTPPM and analyzed and reported on various aspects of the information flows of the North American precast concrete fabrication process. That study reviewed the information flows of different project delivery types (e.g., design-build, subcontract, piece supply) in terms of the level of detail (the ratio of the number of activities to the number of information flows) and the Design Structure Matrix [23], which was extracted automatically from GTPPM. Song et al. [28] used GTPPM to represent pipe fabrication and material tracking processes as part of the FIATECH Smart Chips project. Ergen et al. [9] used the concept of "material flow" embedded in GTPPM to formally represent precast supply chains as component-based information flow. Navon and Sacks [25] captured and represented the control processes of a large company with sophisticated information systems using GTPPM to study issues in Automated Project Performance Control (APPC). In none of these projects, however, was GTPPM used to the full extent of its capabilities as a product modeling tool, as it is in the case study presented here; in all of them it was deployed as a high-level information flow and process analysis method rather than as the product modeling method it was initially conceived to be.

In this paper, we describe a case study of deploying GTPPM as a product modeling method and discuss its advantages as such. The case study was conducted with three precast concrete fabricators in the US from 2003 to 2004 as part of an effort to develop a preliminary product data model to support data exchange between three different processes. Section 2 introduces the general structure and characteristics of GTPPM.
Section 3 describes how GTPPM was implemented as a software application. Sections 4.1 and 4.2 present two aspects of the Requirements Collection Method (RCM). Section 4.3 details how a GTPPM model can be represented in an SQL format for comparison with the database structures of legacy systems that are targets for integration. Section 4.4 provides empirical data on the degree of modeling effort required for the process, and Section 4.5 summarizes the Logical Product Modeling (LPM) step.

2. A standard product modeling method and GTPPM

The current international standard product modeling process [13] is composed of three steps, which are represented by three model types: the Application Activity Model (AAM), the Application Reference Model (ARM), and the Application Interpreted Model (AIM). The AAM defines the use-cases of information in terms of processes and information flows. The ARM is a data model defined in the context of certain information use-cases, without reference to related but external engineering domains or to common information concepts used in multiple domains. The ARM is restructured into the AIM product model by integrating the common information concepts, called Integrated Resources, and the relations to external part models. The AIM is the final data model, integrated with the other standard Parts.
The AAM modeling phase is the first step of product modeling. It is equivalent to the requirements collection and modeling (RCM) phase (or, more generally, process modeling) of a general data modeling process [8]. There are several process modeling methods, such as flowcharts, UML, and DFDs. IDEF0 is the most commonly used in product modeling and is endorsed for use by ISO-10303, the STEP standard [13]. In addition to the basic process model semantics (e.g., Activity, Flow, Decision, etc.), IDEF0 allows modelers to describe Input Information, Control, Output Information, and Mechanism (ICOM), but only as short phrases.

In standard practice, the ARM modeling phase develops a full list of information items required for the processes defined in the AAM; this is called the logical product modeling (LPM) phase. EXPRESS-G and EXPRESS are the logical product modeling languages [14].

Current practice focuses on the development of an AAM that reflects a general, common process followed industry-wide. As a result, AAMs are high-level and do not reflect detailed workflows. In an ideal situation, however, multiple AAMs representing various targeted information use-cases and workflows would be developed, in order to define a standard product model that can support the various information use-cases defined within multiple companies at a more detailed level. However, since IDEF0 and other process modeling methods do not provide a mechanism to elicit individual input and output information items from a given information-use context, AAM modeling becomes merely a task of understanding general industry practice and defining the overall scope of a product model. We are not aware of any standard product model that has been developed from more than one AAM. Furthermore, the process of collecting domain terms and definitions from domain experts and translating them into a product model has been informal.
The validity and completeness of the collected information items cannot be formally evaluated until the resultant product model is implemented as a translator. Ironically, the detailed and varied information exchange scenarios between different software applications, and the required subset information, need to be specified prior to the development of a translator or the creation of views; this specification should have been done in the process of developing a standard product model. A project to compensate for this problem in the IFC, initiated in a slightly different context, is the Information Delivery Manual (IDM) project, an effort to define different "data views (conformance classes)" and information delivery protocols for different processes [30,31].

GTPPM is an effort to provide a formal method to define and reuse information requirements throughout the Requirements Collection and Modeling (RCM) phase, the Logical Product Modeling (LPM) phase, and subsequent product-model deployment phases. GTPPM assumes the following process:

a. Requirements collection and modeling (RCM) phase: Domain experts, with help from product modeling experts, specify various information use-cases (flows) and the specific information items used in each use-case.
b. Logical product modeling (LPM) phase: GTPPM defines rules to resolve conflicts between collected information items
and to restructure them in an integrated product model. A preliminary product model can be derived automatically from the collected information items using these rules. Product modeling experts elaborate and finalize the Application Reference Model (ARM). This model is then integrated with the Integrated Resources to generate the AIM.

The sub-sections below briefly describe the general architecture and characteristics of GTPPM. Detailed descriptions of the rules, examples, and tutorials can be found in other resources [17,18]. Elaborated and modified rules are to be published in Refs. [19,20,22].

2.1. Requirements collection and modeling phase

RCM is a graphical requirements-collection-and-modeling method for capturing information in the context of its use. It is similar to other requirements-collection-and-modeling methods for data modeling in that it is also based on a general process modeling concept, but it differs from them in that GTPPM provides users with the logic and mechanisms to define the specific information items required by each activity. An RCM model consists of two parts: process modeling and specification of product information.

2.1.1. Process modeling components

Each process model represents a sequence of activities and the information flows between them. GTPPM defines an activity as a logical step of processing information. It receives input information and yields output information:

Output := Activity(Input)

In this respect, an activity is interchangeable with a system or a function. For example, a flowchart may represent a business process or an algorithm. A sequence of activities (i.e., receiving and producing information) naturally forms an information flow. The GTPPM tool includes thirteen process modeling components (Fig. 1).
The difference between the GTPPM RCM module and other process modeling methods lies not in the process modeling components themselves, but in the relations between the process modeling components and specific information items. For example, a static information source (e.g., building codes, industry standards) cannot be defined within a process and can carry only pre-defined information items, whereas a dynamic information repository (e.g., a database management system) can store and return any information item generated or used in a process. Another example is that high-level activities do not carry any definition of specific information items, but work only as a grouping mechanism for detailed activities. In order to avoid any conflict in information definitions between high-level activities and their constituent detailed activities, all information items are stored and carried in detailed activities alone.

The collected RCM process models can be considered the targeted information use-cases for the Universe of Discourse (UoD). Different users (or companies, or applications) may use
Fig. 1. An example of a GTPPM–RCM model.
information in different ways. GTPPM (RCM) encourages domain experts to generate a process model based on their current workflow or their envisioned future workflow. It does not require (and indeed discourages) compromise in process modeling through alignment with any 'typical' or 'industry-standard' workflow or process.

2.1.2. Specification of product information

In GTPPM, product information can be specified in two ways: in unrestricted local terms or in a machine-readable format. Terms defined using unrestricted local terms are called vernacular information items (VIIs), and information items specified in a formal manner are called information constructs (ICs). The terms used to describe the same information often differ from organization to organization and from company to company, and local terminology may vary or conflict. For this reason, the method allows domain experts to specify the information used by each activity in local terms, using VIIs, first, and then map them to ICs to support automation of the analysis process. If domain experts are able to work directly in terms of information constructs, the VII specification process can be omitted.

ICs are defined in a machine-processable format following simple syntactic rules [19]. Modelers can specify the information used by each activity in a consistent and analyzable way using ICs. To avoid semantic ambiguities, information constructs are specified using terms predefined in an information menu. (If necessary, more terms can be added later.) An information menu is a collection of tokens used in a universe of discourse (UoD), with a classification structure similar to the parts of speech in a language. The classification structure restricts the ways in which tokens can be strung together when constructing information items. An example of an IC is "project + non_residential_building*parking_deck {project_code}".
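This IC syntax, in which entities are chained with '+' (association or decomposition) and '*' (specialization) and attributes appear in curly brackets, is simple enough to parse mechanically. The sketch below is illustrative only, written against the syntax as described here rather than taken from the GTPPM tool's code:

```python
import re

def parse_ic(ic):
    """Split an information construct into its entity path and attributes.

    Returns ([(entity, relation), ...], [attributes]); the relation is
    'root' for the first entity, '+' for association/decomposition, and
    '*' for specialization.
    """
    ic = ic.strip()
    attrs = []
    m = re.search(r'\{([^}]*)\}\s*$', ic)   # trailing {a, b, ...} attribute list
    if m:
        attrs = [a.strip() for a in m.group(1).split(',') if a.strip()]
        ic = ic[:m.start()].strip()
    tokens = [t.strip() for t in re.split(r'[+*]', ic)]
    relations = ['root'] + re.findall(r'[+*]', ic)
    return list(zip(tokens, relations)), attrs
```

For the example IC above, the sketch yields the path project, non_residential_building (+), parking_deck (*), with the attribute project_code attached to the last entity, as the syntax rules prescribe.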
The plus symbol (+) denotes the association ("associated with") or decomposition ("part of") relations, the asterisk symbol (*) denotes the specialization ("type of") relation, and attributes are enclosed in curly brackets. The attributes always belong to the last entity in the concatenation. (See Ref. [19] for more details on the notation and the rules.) Examples of VIIs and ICs are provided in Table 2. Any two given companies (denoted Company I and Company II in Table 2) may use different terms within their organizations. If domain experts are unfamiliar with the structure of ICs, they can start by defining the information items required for their processes using VIIs. Later, either domain experts or product modelers can map them to ICs.

2.2. Logical product modeling phase

The ICs collected through the RCM phase are analyzed, integrated, and converted into a product model through the Logical Product Modeling (LPM) phase. LPM is an algorithmic process to derive a product model from collected information constructs. It is composed of two main steps:

• Extraction and integration of information constructs (ICs) from multiple RCM models.
• Normalization of the collected information constructs into a formal product data model.

Since GTPPM encourages the domain experts to define all possible different information use-cases and information

Table 2
Mapping between company-specific terms in VIIs and information constructs (ICs)

| Company I (VIIs)  | Company II (VIIs)          | Information Constructs (ICs) |
| Site name         | Construction site name     | SITE{name}                   |
| Site address      | Construction site location | SITE{address}                |
| Estimated weight  | Load                       | PIECE + LOADS{weight, unit}  |
| Piece mark        | Mark number                | PIECE{piece mark}            |
| Serial number     | Control number             | PIECE{control number}        |
Fig. 2. GTPPM system configuration.
constructs required by them, some of the information constructs collected in the LPM phase may have conflicting definitions. The LPM process resolves such conflicts between information constructs and integrates the collected information constructs into a single well-formed product data model. The normalization and conflict resolution rules are defined as Design Patterns [1,11]. The current LPM process is composed of twelve design patterns, which are similar to the normal forms in relational databases [2,4,5]. They are similar in that both seek to eliminate anomalies and redundancies in a data structure and to incrementally define a well-formed data model. The major difference is that the LPM integration and normalization process is a schema-level normalization, whereas relational database normalization is an instance-level normalization. A simple example of a design pattern is that, if one information construct defines Entity A as a subtype of Entity B with Attribute N, and another information construct defines Entity B as also having Attribute N, then Attribute N should be removed from Entity A, since A can inherit Attribute N from Entity B when the two information constructs are integrated. Details on the LPM normalization design patterns can be found in Ref. [20]. The last step, harmonizing the LPM result with the Integrated Resources and the related external constructs in other data models, is undertaken by the domain experts using conventional methods.
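The subtype/supertype pattern just described can be sketched as a small procedure. The dictionary layout below is a hypothetical stand-in for the tool's internal schema representation, not GTPPM's actual code:

```python
def pull_up_attributes(entities):
    """Drop attributes from an entity that an ancestor already declares,
    so they are inherited rather than redundantly re-declared.

    entities: {name: {'supertype': parent name or None, 'attrs': set of names}}
    """
    for ent in entities.values():
        inherited = set()
        parent = ent['supertype']
        while parent is not None:            # walk up the supertype chain
            inherited |= entities[parent]['attrs']
            parent = entities[parent]['supertype']
        ent['attrs'] -= inherited            # remove redundant declarations
    return entities
```

With Entity A a subtype of Entity B and both declaring Attribute N, the procedure removes N from A, mirroring the design pattern described above.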
3. GTPPM implementation

Product modeling is a collaborative effort between domain experts and product modeling experts (mediators). GTPPM structures and formalizes the collaboration between domain experts and product modeling experts and automates the product model development procedures. To realize these capabilities, a GTPPM software tool has been implemented as an MS Visio® add-on using the Visio graphic engine and Application Programming Interface (API) (Fig. 2). The GTPPM shapes listed in Fig. 1 were modeled and defined as a Visio Stencil, and their behaviors and functions were then implemented using the Visio API. Examples of behavior include automated shape and identifier creation and update. Examples of functions include syntax checkers, the information flow consistency checker, and information collectors. An Excel file was used as a data repository for the information menu and the vernacular data dictionary and was linked to the GTPPM RCM module through the Microsoft Component Object Model (COM) interface. The GTPPM LPM module, which was also implemented using the Visio API, includes various export functions and information integration and normalization functions. It can export the activity names, the context, and the information associated with them as an Excel file, and can also integrate and normalize information collected from the RCM model into a single integrated EXPRESS model.

The GTPPM method is unique in that it enables process modeling with explicit detailed information flows; this is also the key feature of the software tool that distinguishes it from other graphical process modeling interfaces. Not only do users place symbols representing activities, information flows, and controls, but the system prompts for detailed information item lists (drawn from the information menu for the domain) for each activity and flow. It also checks for consistency in the information flows, as explained in Section 4.1. The tool is designed to support alternative modeling approaches, as illustrated in Fig. 3(a) to (c). Information items can be defined by domain experts or by product modeling experts. Information required by each task (or activity) can be defined as ICs, as shown in Fig. 3(a); if domain experts prefer to specify information items using their local terms, they may use VIIs, following the procedure shown in Fig. 3(b). In the latter case, the VIIs are eventually mapped to ICs. Finally, GTPPM analyzes and derives a product model from the collected ICs in the LPM procedure shown in Fig. 3(c). ICs can be collected from different process models and integrated into a single model; conflicts between ICs are resolved in the LPM phase.

4. A test case
GTPPM was deployed in a test-case product-modeling project. In order to capture different types of information use-cases, the management processes (i.e., estimating, bidding, production, and shipping) of two precast producers, Company A and Company B, and the precast concrete designing/drafting process of Company C were modeled using the GTPPM–RCM module. Companies A and B were chosen because they had advanced database management systems for managing estimation, production, and shipping information, which could be compared with a product model generated through the GTPPM method. Company C was chosen because it had well-developed guidelines for designing precast concrete pieces. Based on these guidelines, the designing/drafting processes for double tees (Fig. 4) and exterior columns were modeled. The RCM models of the three companies were developed by the process modeling experts (the authors) based on interviews with management-level personnel at each company. The models thus generated were later reviewed by the domain experts at the three companies. Based on the review
comments, the models were then elaborated. The information items were first defined as vernacular information items (VIIs), based on the interviews and on various types of forms provided by the three companies, and were then mapped to information constructs (ICs). The following sections describe the modeling process in detail, with examples, test-case results, and lessons learned.
4.1. Mapping between Vernacular Information Items (VIIs) and Information Constructs (ICs)

This section describes in detail how VIIs and ICs were created and mapped. Once the process models were developed, the groups of information items transferred from one
Fig. 3. Two possible GTPPM (RCM) modeling procedures (using ICs alone and using VIIs) and the GTPPM (LPM) procedure. a) the Requirements Collection and Modeling (RCM) process using Information Constructs (ICs). b) the Requirements Collection and Modeling (RCM) process using Vernacular Information Items (VII). c) the Logical Product Modeling (LPM) process.
Fig. 4. A stack of double tees.

activity to another were modeled as information sets (Fig. 5), based on standard company reports required by the end of certain activities (e.g., job summary sheet, turnover meeting checklist, piece tag, and packing slip). The information set is GTPPM's mechanism for grouping and naming collections of information items. The sets were first defined with vernacular information items (VIIs). Examples of specified information sets and their items are as follows (a 'takeoff list' is a report of the quantity of products and subcomponents):
PROJECT INFORMATION SHEET {; project name; location; report date; purchaser; address; city_state_zip; project size; job#; contract value; taxes; status; type; sold as; detailed project requirements; Sales Rep; estimator;}

PACKING SLIP {; address; city_state_zip; job#; truck number; trailer number; truck driver; payment method; po#; piece mark; piece qty; piece description; comments; contents packaged by; contents checked by; contents received by; delivered date;}

PIECE TAG {; bar code; piece weight; piece mark;}

TAKEOFF LIST {; project name; location; job#; product type id; product element id; product name; product qty; product size; product u/m; estimator; estimate no; area code; distance between the project site and the plant; piece mark; piece depth; piece width; piece unit length; piece weight; load name; total loads; total # of pieces; piece qty;}
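The information sets above are, in effect, named groups of items. A minimal sketch of such a structure (the names are illustrative, not the GTPPM tool's data model):

```python
from dataclasses import dataclass, field

@dataclass
class InformationSet:
    """A named collection of information items passed between activities."""
    name: str
    items: list = field(default_factory=list)

def shared_items(a, b):
    """Items that two information sets have in common."""
    return set(a.items) & set(b.items)

# Two of the sets listed above, abbreviated for illustration.
piece_tag = InformationSet("PIECE TAG", ["bar code", "piece weight", "piece mark"])
packing_slip = InformationSet("PACKING SLIP", ["job#", "piece mark", "piece qty"])
```

Treating sets as plain collections makes overlap between company reports (here, the shared 'piece mark' item) directly computable.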
Fig. 5. Examples of information sets (company names are mosaicked).
The VIIs were then mapped to ICs using the Information Item Mapper (Fig. 6), in consultation with the domain experts. Fig. 6 shows the user interface of the Information Item Mapper. The left-hand window panes list tokens by their relations (e.g., the specialization relation, the association relation, and attributes). Tokens in specialization relations inherit attributes from their parents. Users can form information constructs (ICs) by navigating through the lists, and then map the resulting ICs to VIIs in the right-hand window (labeled 'Mapped Information').

VIIs and ICs were generally mapped one to one. However, several VIIs and ICs were mapped one to many or many to one. (This implies that some VIIs and ICs are in fact in a many-to-many relation.) VIIs that were synonyms were mapped to single ICs. Some VIIs actually comprised several pieces of information and had to be mapped to several ICs. An example of the latter is galvanized embed order status. In order to keep track of the order status of a product or a part in terms of data management, we need to know specifically which item has been ordered, what the purchase order identifier is, and so on. However, when such information is maintained in a paper format, it is recorded informally and freely as one long note. To clarify the precise meaning, the data recorded in the VII galvanized embed order status was mapped to four distinct ICs as follows:

PIECE + MATERIAL * HARDWARE{;type;};
PIECE + MATERIAL * HARDWARE{;id;};
PIECE + MATERIAL * HARDWARE + PURCHASE_ORDER{;status;};
PIECE + MATERIAL * HARDWARE + PURCHASE_ORDER{;id;}
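Such a one-to-many mapping amounts to a lookup table from VIIs to lists of ICs. A sketch using the example above (the table layout and function name are assumptions for illustration):

```python
# VII -> IC lookup table; 'galvanized embed order status' maps one-to-many.
vii_to_ic = {
    "piece mark": ["PIECE{piece mark}"],
    "galvanized embed order status": [
        "PIECE+MATERIAL*HARDWARE{type}",
        "PIECE+MATERIAL*HARDWARE{id}",
        "PIECE+MATERIAL*HARDWARE+PURCHASE_ORDER{status}",
        "PIECE+MATERIAL*HARDWARE+PURCHASE_ORDER{id}",
    ],
}

def viis_to_ics(viis, table):
    """Expand vernacular items into their mapped information constructs."""
    return [ic for v in viis for ic in table.get(v, [])]
```

With such a table in place, the conversion of whole information sets from VIIs to ICs (described below) reduces to a mechanical expansion.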
Some VIIs may have different meanings to different people, depending on their familiarity with the domain. The VII rebar schedule is a good example. A rebar schedule is not a time-based activity schedule for making or placing rebars; rather, it denotes a listing of rebars for production, including diameters, lengths, and shape designations, often, but not always, including an abstract 2D representation of the bent rebar shapes. In the mapping process, ambiguous VIIs such as rebar schedule were mapped to ICs based on the definitions, data types, examples, references, and synonyms of the VIIs (the right side of Fig. 6).

VIIs specified in information sets were automatically converted to ICs according to the mapped relations between VIIs and ICs. The input and output information items of activities were specified using information sets as the targets of information production. The consistency of the information flows was then checked using the automated routines available in the tool.

4.2. Ensuring logical consistency in information process models

As mentioned in the Introduction, GTPPM was first used by fourteen PCSC member companies for analyzing their sales, design, engineering, and production processes [27]. Analysis of
Fig. 6. Mapping ambiguous terms based on the descriptions.
Fig. 7. The activity information window.
these models revealed logical inconsistencies in many of the information flows. For example, information items modeled as generated in a source activity were never used in any subsequent activity, or items reported as received in a subsequent activity were neither imported into nor generated by the source activity. These issues motivated the development and implementation of a rigorous method to validate the consistency of the information flows defined within process models.

The dynamic consistency checking method validates the consistency of an information flow as the flows are entered by the domain experts, based on the availability of information: i.e., if input information required for an activity is not provided by the upstream activities, the information flow is inconsistent. The method defines the relationships between different information types (i.e., input, output, generated information, and remaining information) as rules (see Ref. [21] for more details on the rules).

In the test-case project, the dynamic consistency checking method was used to check the validity of the collected information. Fig. 7 shows the GTPPM Activity Information Window. The activity "Schedule Pieces to Fabrication Areas" receives input information and returns output information. The goal of dynamic consistency checking is to eliminate two types of inconsistency indicators: the unavailable and the not-provided information items. The unavailable information items are the
input information items that are not provided by upstream activities:

UnavailableInformation ≡ ∪{x | x ∈ input(A)} − ∪{y | y ∈ output(up(A))}

where
x, y        information items;
up(A)       upstream activities of an activity A;
output(A)   output information of an activity A;
input(A)    input information of an activity A.

New information items can be generated as a result of an activity, but they cannot be generated when they are merely transferred from one activity to another. Such an inconsistency can be corrected by taking one of the following remedies, or other approaches, depending on the cause.

– Add new output information items to one of the upstream activities.
– Remove the unavailable information items from the input item list of the current activity.
Fig. 8. The consistency checker.
– Add a new activity that can provide the unavailable information items.
– Add or redirect information flows.

The not-provided information items are the information items that are required by downstream activities, but have not been provided by any upstream activities. The not-provided information is formally defined as follows:
Not-providedInformation ≡ ∪{x | x ∈ input(dn(A))} − ∪{y | y ∈ output(up(dn(A)))}

where
dn(A)       downstream activities of an activity A.
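As an illustration (ours, not the GTPPM tool's actual implementation), the two set definitions above can be computed directly over a simple activity graph. The activity and information-item names below are hypothetical:

```python
def upstream(graph, act):
    """All activities transitively preceding `act`, i.e. up(A)."""
    seen, stack = set(), list(graph[act]["pred"])
    while stack:
        a = stack.pop()
        if a not in seen:
            seen.add(a)
            stack.extend(graph[a]["pred"])
    return seen

def downstream(graph, act):
    """All activities transitively following `act`, i.e. dn(A)."""
    seen, stack = set(), list(graph[act]["succ"])
    while stack:
        a = stack.pop()
        if a not in seen:
            seen.add(a)
            stack.extend(graph[a]["succ"])
    return seen

def unavailable(graph, act):
    """U{x | x in input(A)} - U{y | y in output(up(A))}."""
    provided = set().union(*(graph[u]["output"] for u in upstream(graph, act)))
    return set(graph[act]["input"]) - provided

def not_provided(graph, act):
    """U{x | x in input(dn(A))} - U{y | y in output(up(dn(A)))}."""
    dns = downstream(graph, act)
    needed = set().union(*(graph[d]["input"] for d in dns)) if dns else set()
    provided = set().union(
        *(graph[u]["output"] for d in dns for u in upstream(graph, d)), set())
    return needed - provided

# A three-activity chain; "erection_drawing" is required but never produced.
graph = {
    "Prepare piece list": {"pred": [], "succ": ["Schedule pieces"],
                           "input": set(), "output": {"piece_list"}},
    "Schedule pieces":    {"pred": ["Prepare piece list"], "succ": ["Ship pieces"],
                           "input": {"piece_list", "erection_drawing"},
                           "output": {"schedule"}},
    "Ship pieces":        {"pred": ["Schedule pieces"], "succ": [],
                           "input": {"schedule"}, "output": set()},
}
print(unavailable(graph, "Schedule pieces"))      # {'erection_drawing'}
print(not_provided(graph, "Prepare piece list"))  # {'erection_drawing'}
```

The sketch follows the set formulas literally: an item is flagged as soon as no transitively upstream activity outputs it, which is exactly the condition the dynamic checker tests as flows are entered.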
Similar to the unavailable information problem, the not-provided information items can be eliminated by editing the information lists of relevant activities; adding or removing activities; or adding, deleting, or redirecting information flows. The same logic has been implemented in the Consistency Checker (Fig. 8), which can detect and mark any inconsistent or empty activity in an RCM model. This batch checking function is especially useful when a product modeling expert needs to check the validity of RCM models collected from non-product-modeling experts (i.e., domain experts). By deploying the dynamic consistency checking method, missing or illogical information items were detected and
corrected in the three collected RCM models. Note, however, that the dynamic consistency checking method can be rendered ineffective if a modeler repeatedly copies one information definition list from activity to activity without rigorously compiling the information items one by one, because there will be no logical inconsistency between the resulting identical information inputs and outputs.

4.3. Practicality of an automatically generated model

The Company A and Company B information constructs were normalized into two separate preliminary product models in EXPRESS. In order to compare the results with the data structures of Company A's current database management system, the information constructs collected from Company A's model were also normalized into an SQL schema. For this process, an SQL code generation module was developed and used to show referential relationships between TABLEs. Fig. 9 shows the SQL table structure generated from the Company A model with referential relations. Company A's ERP system ran on multiple database management systems, a mix of commercial and custom-built products: an MS Access®-based estimation system; two Oracle®-based systems for production scheduling, shipping, inventory, and purchase management; a legacy accounting/costing system; an engineering/drawing management system; and a human resource/payroll system. However, only limited sets of information could be exchanged between the different database
Fig. 9. An SQL table structure of the Company A model with referential relations.
management systems. The company was in the process of developing a central database that would integrate the dispersed databases and that could acquire geometric information and bills of materials (BOMs) directly from an advanced 3D CAD system.

Direct and quantitative comparison between the automatically generated data model and the data schemas of Company A's ERP system was limited because of the fundamental differences between them and also because of the differences between the terms used in the two data schemas. For example, the automatically generated model was designed as one large schema while Company A's existing database system was a distributed set of schemas. The automatically generated data model was based on an object-oriented modeling approach (i.e., EXPRESS) whereas Company A's systems were relational databases using SQL. In order to flatten the inheritance structure of the object-oriented model, the automatically generated EXPRESS model was translated into an SQL model based on one [32] of several mapping methods from EXPRESS (object-oriented) models to relational databases [10,16,24,26,32,33]. The automatically generated SQL model included thirty-seven TABLEs. Still, the fundamental differences in terms and structure remained; they prohibited the authors from comparing the two schemas directly and quantitatively. Each TABLE and its attributes were instead reviewed by the authors and the IT manager at Company A. In terms of the TABLEs and the attributes associated with them, sixteen TABLEs were categorized as over-defined, nineteen TABLEs as closely defined, and only two as under-defined:

1. Over-defined: TABLEs that include more information than Company A's current data models: ASSEMBLY, BIDDING, BOM (bill of materials), BUILDING_CODE, CONSTRAINTS, DIMENSIONS, ENGINEERING, EQUIPMENT, ERECTION, GEOMETRY (2D, 3D), MOLD, QC_CHECK, SHIPPING, SURFACE_TREATMENT, TRUCK_LOADS.
2. Closely defined: TABLEs that define information at the same level as Company A's current data models: DESIGN REQUIREMENTS, DOCUMENTATION, DRAWING, ERECTION_DRAWING, ESTIMATION, HARDWARE, HARDWARE_LIST, LABOR, MATERIAL, PIECE, PIECE_DRAWING, PIECE_LIST, PRODUCTION_AND_HANDLING, POUR, PRESTRESSING, PROJECT, REINFORCEMENT, SCHEDULE, SITE.
3. Under-defined: TABLEs that lack necessary information: BATCH (mix recipe), CONCRETE (mix recipe).

The over-defined TABLEs include additional attributes and entities that were not managed by Company A at that time, but that the company wished to manage in the near term. Examples include 3D geometry and engineering information, equipment information, quality control and constraint check information, and additional shipping and BOM information. The two TABLEs related to the concrete mix process were under-defined because the concrete mix process was defined with little detail in the requirements collection model. This shows the sensitivity of GTPPM to its requirements collection model; it can only create a product model based on the use-cases that are specified in the requirements collection process.

4.4. Modeling effort required

One of the goals of the GTPPM method is to reduce the degree of human effort required for product modeling. Empirical evidence of the impact the method has on the requirements collection and the logical product modeling phases was recorded through development of the process models for Companies A, B,
G. Lee et al. / Automation in Construction 16 (2007) 392–407
403
Fig. 10. A part of a double tee modeling process.
and C, described in the previous sections, and compilation of a sample unified product model described in Section 4.5 below.

In contrast to the models of Companies A and B, which focused on management and administrative procedures, the process model prepared at Company C captured precast concrete engineering procedures. The procedures for designing and drafting prestressed double tee elements (Fig. 4) and exterior columns were selected and modeled. Engineering processes are more difficult to model than administrative processes because of the high degree of domain expertise they embody and because of their complexity; even domain experts with more than 10 years of experience find it difficult to describe engineering and design processes in a systematic way. In this case, the information items of each activity were defined directly, without using information sets. As before, they were first defined as vernacular information items (VIIs) and then mapped to information constructs (ICs).

The major difference between a business management process and a designing/drafting process, in terms of information flow, is that information flow in the designing/drafting process is cumulative: i.e., a model of a precast concrete structure behaves as a data repository. As soon as a designer adds a shape or text to a precast concrete model or to a drawing, it represents certain information. Moreover, such design information affects not only the activities immediately following it, but also many other activities that appear later in the process. Therefore, the information transfer can be modeled using dynamic (information) repositories. In this case study, precast concrete pieces were modeled using dynamic
repositories, a concept similar to a database, allowing dynamic inputs and outputs, as shown in Fig. 10. In Fig. 10, "DT model" and "Drawings from clients" are examples of dynamic repositories. They receive and redistribute collections of information to other activities.
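To make the dynamic-repository concept concrete, the following minimal Python sketch (ours, not part of the GTPPM tool; the repository and item names are hypothetical) shows an information store that accumulates deposited items and redistributes any subset of them to later activities:

```python
class DynamicRepository:
    """A dynamic (information) repository: unlike a direct activity-to-activity
    flow, it accumulates information and lets any later activity draw on
    everything deposited so far, modeling the cumulative nature of design data."""

    def __init__(self, name):
        self.name = name
        self.items = {}  # information item name -> value

    def deposit(self, **items):
        # e.g., a designer adding a shape or text to the DT model
        self.items.update(items)

    def provide(self, *names):
        # downstream activities retrieve any subset of accumulated information
        return {n: self.items[n] for n in names if n in self.items}


dt_model = DynamicRepository("DT model")
dt_model.deposit(strand_pattern="12-strand")   # an early design activity
dt_model.deposit(dap_depth_mm=150)             # a later detailing activity
# A much later activity can still retrieve early design decisions:
print(dt_model.provide("strand_pattern"))      # {'strand_pattern': '12-strand'}
```

The point of the design is that deposits and retrievals are decoupled in time, which is exactly why a repository, rather than a chain of point-to-point flows, is the right abstraction for design information.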
Table 3
Statistics of model components

Process model components            Company A   Company B   Company C
Activities
  Internal detail                        98          96          55
  External detail                        13          29           7
  Internal high level                     9           7           4
  External high level                    13          29           7
  Total (nA)                            133         161          73
Flows
  Information flow (nF)                 210         179         160
  Feedback flow                          14           9           3
  Material flow                          70          64           0
Other
  Dynamic repository                     10          14          21
  Static information source               6          10           2
  Continue                               84          42          18
Information items
  Information sets                       24           6           0
  VIIs (non-distinctive)                192         186           0
  ICs (distinctive)                     135         231          85
Degree of information
  dependence (nF/nA)                   1.58        1.11        2.19
Table 4
Modeling durations (hours)

Task                                               Company A   Company B   Company C
Process modeling                                       24          24           6
Capturing detailed information flows using VIIs      12.5           3           2
Mapping VIIs to ICs                                   7.5           2           2
Total                                                  44          29          10
Another difference between the previous Company A and Company B models and the Company C model is that the Company C model included specific types of products. Since the Company A model focused on the management process, types of pieces were defined by generic information such as product name or piece-mark, whereas, in the Company C designing/drafting model, types of pieces were defined specifically as spandrel, pc_column, floor_piece, etc. In order to design a piece, designers need to know which type of piece (at an object level) is connected to which other type of piece. For this reason, although the Company C model dealt with only two product types (double tees and spandrels), the definitions of adjacent pieces and connections were also captured.

The GTPPM process models collected from Companies A, B, and C contained 135, 231, and 85 distinctive information constructs respectively (Table 3). In terms of the number of process components, there was no significant difference between the Company A and Company B models; the models included 133 and 161 activities and 210 and 179 information flows respectively. The Company C model was smaller than the two previous models in terms of both the number of process components and the number of distinctive information items, because it dealt with only a small portion of the design and engineering process. It included 73 activities and 160 flows. The degree of information dependence in each model, which is the ratio of the number of information flows (nF) to the number of activities (nA) [27], was 1.58, 1.11, and 2.19 respectively. Table 4 shows the modeling hours recorded for the three models. The RCM modeling process took 44, 29, and 10 h respectively, whereas development of Company A's ERP system had occupied commercial database developers for several months. The difference between 44 hours and several months is significant given that the resultant model was very close to the data model of the working ERP system.
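The degree-of-information-dependence figures can be reproduced directly from the nF and nA totals in Table 3:

```python
# (nF, nA) totals per model, taken from Table 3
models = {
    "Company A": (210, 133),
    "Company B": (179, 161),
    "Company C": (160, 73),
}

# Degree of information dependence = number of flows / number of activities
for name, (n_flows, n_activities) in models.items():
    print(f"{name}: nF/nA = {n_flows / n_activities:.2f}")
# Company A: nF/nA = 1.58
# Company B: nF/nA = 1.11
# Company C: nF/nA = 2.19
```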
Although this study did not focus on productivity, this result could be strengthened in subsequent studies through controlled experiments focusing on the time-reduction issue.

4.5. Logical product modeling and refinement

Information constructs collected from the above three models were integrated into one model through the LPM process. This phase is automated and does not require user input. The integrated model included 129 entities; for reference and comparison, CIS/2 LPM 6 [6] has 731 entities and PCC-IFC Version 0.9 [15] has 413 entities. The syntax of the automatically generated integrated product model was validated using two
syntax checking tools (those embedded in the commercial tool EXPRESS Data Management (EDM®) Supervisor Version 4.5 and in the shareware Expresso Version 3.1.4). The integrated model was then implemented as a physical database management system using the EDM Server. Fig. 11 illustrates an EXPRESS-G model of the integrated piece definitions from the Company A, Company B, and Company C models ('dt' in the model represents the double tee entity). The EXPRESS-G model was automatically generated by importing the integrated model into STEP Tools® 'as-is', without any refinement or modification.

As shown in Fig. 11, a product model automatically derived from various process models using GTPPM is not complete and requires further refinement. This is not due to any logical problem in GTPPM, but due to its sensitivity to the specified processes. For example, in Fig. 11, among the three subtypes ('spandrel', 'pc_column', and 'floor_piece') of 'piece', only 'spandrel' has 3-dimensional geometry information. This is not an error, and it does not mean that 'pc_column' or 'floor_piece' will never have 3-dimensional geometry. It simply means that, in the information constructs collected from the three test cases described above, only 'spandrel' had a case where it was associated with 3-dimensional geometry information. This may seem unreasonable; however, it can be traced to the fact that the test cases were taken from companies using 2-dimensional CAD systems [27]. In such cases, if a modeler identifies missing information that has not been captured through the GTPPM process, it should be added to the final model during the refinement process. Furthermore, the current version of the GTPPM tool was implemented assuming that entity data types would be elaborated in the product-model refinement phase, and so it sets STRING as the default data type. The WHERE clauses and the cardinalities of entities are also assumed to be defined in the refinement phase.
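The object-oriented-to-relational flattening used in Section 4.3 can be sketched as follows. This is a minimal illustration of one common strategy (one TABLE per concrete entity, with supertype attributes copied down), not necessarily the specific mapping of Ref. [32]; the entity and attribute names are hypothetical, and STRING echoes the tool's default data type:

```python
# Hypothetical mini-schema: entity name -> (supertype or None, own attributes)
schema = {
    "piece":       (None,    ["piece_mark", "project_id"]),
    "spandrel":    ("piece", ["geometry_3d"]),
    "pc_column":   ("piece", ["base_plate"]),
    "floor_piece": ("piece", ["topping"]),
}

def inherited_attrs(entity):
    """Collect attributes along the supertype chain (supertype attributes first)."""
    chain = []
    while entity is not None:
        supertype, attrs = schema[entity]
        chain = attrs + chain
        entity = supertype
    return chain

def flatten_to_sql(schema):
    """One TABLE per leaf entity, with inherited attributes copied in.

    STRING stands in for the default data type assigned before refinement."""
    supertypes = {s for s, _ in schema.values() if s}
    ddl = []
    for entity in schema:
        if entity in supertypes:
            continue  # supertypes are absorbed into their concrete subtypes
        cols = ", ".join(f"{a} STRING" for a in inherited_attrs(entity))
        ddl.append(f"CREATE TABLE {entity} ({cols});")
    return ddl

for stmt in flatten_to_sql(schema):
    print(stmt)
# CREATE TABLE spandrel (piece_mark STRING, project_id STRING, geometry_3d STRING);
# CREATE TABLE pc_column (piece_mark STRING, project_id STRING, base_plate STRING);
# CREATE TABLE floor_piece (piece_mark STRING, project_id STRING, topping STRING);
```

Other published mappings instead keep one table per class with join keys, or one table per inheritance tree [10,16,24,26,33]; the trade-off is between join cost and redundancy.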
As an example of such refinement, the direct association relations between 'dt' and the two connection types (dap and chord in Fig. 11) may be constrained by WHERE clauses during the manual modification process, and other modifications can also be made. Since these issues are usually dealt with in the last phase of product modeling, they may not be critical. However, a mechanism to capture domain rules and translate them into WHERE clauses is a challenging topic and is worthy of the future attention of researchers in information modeling.

5. Conclusions

As the variety of AEC software applications increases, their nature becomes more diverse, and the adoption of advanced information technology in real projects grows, the need for standard product models to enable data exchange between different applications increases. However, traditional standard product modeling practice involves a long and iterative review process. GTPPM is an effort to systematically collect rich information input from domain experts at the start of the process, in a form that can be structured and used to automate product model generation downstream, shortening the duration required for complete product model development. It is also an effort to logically structure and formalize the process of generating a
Fig. 11. Automatically generated PIECE definitions.
product model, transforming it from an intuitive art into an engineering procedure. An additional benefit is that the method allows participating companies to track how their corporate procedures have been embedded into the final product model.

GTPPM has been implemented and improved through several test cases with the North American Precast Concrete Software Consortium (PCSC). It has also been used in a number of other research projects. Through these and other projects, GTPPM has had to address not only embedded logical and formal capabilities, but also user-oriented and organizational realities, which led to multiple refinements and enhancements. The case studies have shown that its application is feasible. Early indications are that it fulfils the goal of reducing product modeling duration, although no firm conclusions can be drawn regarding the quality of its output vis-à-vis that of a more traditional product modeling approach. It is, however, more rigorous and more easily automated.

In the development of the GTPPM method and the software tool that supports its implementation, we found that the logical and the user-oriented capabilities were both needed and interdependent: one could not be completely defined before the other. Working with a sizable number of organizations allowed iterative cycles of development, deployment, and refinement. Iterative refinement is a necessary component of such research and its validation; it is of value both as a method to be followed by others and as a means to define the logic behind the decisions made. We believe such an empirical process is necessary for knowledge capture research in general.

In addition to automating the capture of information for, and the generation of, new product models, the GTPPM method can also be used for the update and extension of existing product models. Furthermore, its potential applications may not be limited to product modeling.
For example, the systematic, integrated collection of process and information flow data, and its semantic structuring into syntactic units, as realized in GTPPM, can also be applied to systematically capturing information requirements, as in the Information Delivery Manual (IDM); to the development of other knowledge-rich systems, such as enterprise resource planning systems; to process reengineering; and in support of knowledge elicitation for knowledge-based system development.

Appendix A. Glossary
APPC: Automated Project Performance Control
CIS/2: CIMSteel (Computer-Integrated Manufacturing for Construction Steelwork) Integration Standards Release 2
DFD: Data Flow Diagram(s)
GTPPM: Georgia Tech Process to Product Modeling (Method); a product modeling method developed to expedite the product modeling process by providing a logical integration mechanism between the requirements collection and modeling phase and the logical modeling phase
– IC: Information Construct, a concatenation of tokens
– VII: Vernacular Information Item
IDEF: Integration Definition for Function Modeling
– ICOM: Input, Control, Output, and Mechanism
IAI IFC
– IAI: International Alliance for Interoperability
– IFC: Industry Foundation Classes
ISO STEP
– ISO: International Organization for Standardization
– STEP: STandard for the Exchange of Product model data
ISO STEP model types
– AAM: Application Activity Model
– AIM: Application Interpreted Model
– ARM: Application Reference Model
LPM: Logical Product Modeling
RCM: Requirements Collection and Modeling
SQL: Structured Query Language (a query and transaction specification language for the relational data model)
UML: Unified Modeling Language
UoD: Universe of Discourse
References [1] C. Alexander, S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, S. Angel, A Pattern Language: Towns, Buildings, Construction, Oxford University Press, New York, 1977. [2] P.A. Bernstein, J.R. Swenson, D. Tsichritzis, A unified approach to functional dependencies and relations, in: W.F. King (Ed.), Proceedings of the 1975 ACM SIGMOD International Conference on Management of Data, ACM, San Jose, California, 1975, pp. 237–245. [3] D. Castro-Lacouture, B2B e-Work Intranet Solution Design for Rebar Supply Interactions, Ph.D. thesis, School of Civil Engineering, Purdue, 2003. [4] E.F. Codd, Extending the data base relational model to capture more meaning, ACM Transactions on Database Systems (TODS) 4 (4) (1979) 397–434. [5] E.F. Codd, Further normalization of the data base relational model, in: R. Rustin (Ed.), Data Base System, vol. 6, Prentice-Hall, Englewood Cliffs, N. J., 1972, pp. 33–64. [6] A. Crowley, CIMSteel Integration Standards Release 2 (CIS/2), http:// www.cis2.org/ (2003, Last Accessed: 2005). [7] C.M. Eastman, G. Lee, R. Sacks, A new formal and analytical approach to modeling engineering project information processes, in: K. Agger, P. Christiansson, R. Howard (Eds.), CIB W78, vol. 2, Aarhus School of Architecture, Aahus, Denmark, 2002, pp. 125–132. [8] R. Elmasri, S. Navathe, Fundamentals of Database Systems, Fourth ed. Addison Wesley Longman, Inc., Reading, MA, 2004 Edition 4. [9] E. Ergen, B. Akinci, R. Sacks, Formalization and automation of effective tracking and locating of precast components in a storage yard, EIA-9: Eactivities and intelligent support in design and the built environment, 9th EuropIA International Conference, Istanbul, Turkey, 2003, pp. 31–36. [10] J. Fong, Translating object-oriented database transactions into relational transactions, Information and Software Technology 44 (2002) 41–51. [11] E. Gamma, R. Helm, R. Johnson, J. 
Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison Wesley, 1994. [12] IAI, Industry Foundation Classes IFC2x Edition 2, http://www.iai-international.org/Model/R2x2_add1/index.html (2003, Last Accessed: 2006). [13] ISO TC 184/SC 4, ISO 10303-1:1994 Industrial automation systems and integration- Product data representation and exchange- Part 1: Overview and fundamental principles, International Organization for Standardization, 1994.
[14] ISO TC 184/SC 4, ISO 10303-11:1994 Industrial automation systems and integration- Product data representation and exchange- Part 11: Description methods: The EXPRESS language reference manual, International Organization for Standardization, 1994. [15] K. Karstila, A. Laitakari, M. Nyholm, P. Jalonen, V. Artoma, T. Hemio, K. Seren, Ifc2x PCC v09 Schema in EXPRESS, PCC-IFC Project Team, IAI Forum Finland, 2002. [16] K.H. Law, T. Barsalou, G. Wiederhold, Management of complex structural engineering objects in a relational framework, Engineering with Computers, vol. 6, Springer-Verlag, New York, 1990, pp. 81–92. [17] G. Lee, GTPPM Official Website, http://dcom.arch.gatech.edu/glee/gtppm (2002, Last Accessed: 2005). [18] G. Lee, A New and Formal Process to Product Modeling (PPM) Method and its Application to the Precast Concrete Industry, Ph.D. dissertation, College of Architecture, Georgia Institute of Technology, 2004. [19] G. Lee, C.M. Eastman, R. Sacks, Grammatical rules for specifying product information to support automated product data modeling, Advanced Engineering Informatics 20 (2006) 155–170. [20] G. Lee, C.M. Eastman, R. Sacks, Twelve Design Patterns for Integrating and Normalizing Product Model Schemas, Computer-Aided Civil and Infrastructure Engineering (CACAIE) (in press). [21] G. Lee, R. Sacks, C.M. Eastman, Dynamic information consistency checking in the requirements analysis phase of data modeling (Keynote), in: Z. Turk, R. Scherer (Eds.), eWork and eBusiness in Architecture, Engineering and Construction— European Conference for Process and Product Modeling (ECPPM), A.A. Balkema, Slovenia, 2002, pp. 285–291. [22] G. Lee, R. Sacks, C.M. Eastman, Eliciting Information for Product Modeling using Process Modeling, Data and Knowledge Engineering (in review). [23] MIT, UIUC, The Design Structure Matrix Website, http://www.dsmweb.org/ (2003, Last Accessed: 2006).
[24] S. Monk, J.A. Mariani, B. Elgalal, H. Campbell, Migration from relational to object-oriented databases, Information and Software Technology 38 (1996) 467–475. [25] R. Navon, R. Sacks, Status and Research Agenda of Automated Project Performance Control (APPC), ASCE Journal of Construction Engineering and Management (in review). [26] J.W. Rahayu, E. Chang, T.S. Dillon, D. Taniar, A methodology for transforming inheritance relationships in an object-oriented conceptual model to relational tables, Information and Software Technology 42 (2000) 571–592. [27] R. Sacks, C.M. Eastman, G. Lee, Process model perspectives on management and engineering procedures in the North American precast/prestressed concrete industry, ASCE Journal of Construction Engineering and Management 130 (2) (2004) 206–215. [28] J. Song, C. Haas, C. Caldas, E. Ergen, B. Akinci, C.R. Wood, J. Wadephul, FIATECH Smart Chips Project: Field Trials of RFID Technology for Tracking Fabricated Pipe – Phase II, FIATECH, 2004. [29] VTT, IAI IFC Model Development, http://ce.vtt.fi/iaiIFCprojects/ (2004, Last Accessed: 2005). [30] J. Wix, Information Delivery Manual (IDM) Using IFC to Build SMART— The IDM Project Official Website, http://www.iai.no/idm/learningpackage/idm_index.htm (2005, Last Accessed: 2005). [31] J. Wix, Information Delivery Manual (IDM): Enabling Information Exchange in AEC/FM Business Processes, Jeffrey Wix Consulting Ltd, UK, 2005, p. 33. [32] S.-J. You, D. Yang, C.M. Eastman, Relational DB implementation of STEP based product model, CIB World Building Congress 2004, Toronto, Ontario, Canada, 2004. [33] X. Zhang, J. Fong, Translating update operations from relational to object-oriented databases, Information and Software Technology 42 (2000) 197–210.