DATA INTEGRATION FOR QUALITY ASSURANCE IN COMPUTER INTEGRATED MANUFACTURING

J. T. Tou and K. Chung

Center for Information Research, University of Florida, Gainesville, FL 32611, USA
Abstract. This paper presents the design concept of an intelligent information system for verification, validation, association, and conversion of test data which are gathered at various workcells in a CIM environment. The technician at each workcell enters test data into a central database. Experience has told us that the database may be erroneous due to wrong entry, wrong unit, wrong format, wrong code, and wrong scale. Thus the database must be verified and validated before it becomes useful. The data in the validated database are then grouped and categorized into a well-defined hierarchical structure. To facilitate the use of the database, the data are dynamically reorganized to meet the needs of various users.

Keywords. Computer-integrated manufacturing; data integration; data verification; data validation; data association; data conversion; pattern-directed knowledge-based system.
INTRODUCTION

Computer integrated manufacturing (CIM) is a key to improving tomorrow's industrial productivity. The design of CIM is a multi-disciplinary effort, which involves not only mechano-industrial engineering and control systems theory, but also computer science, software engineering, and information technology. As a matter of fact, the latter disciplines will play a leading role of increasing importance, because most factories have been automated island-wise and integrated locally for efficient production, and the major problems in CIM perhaps lie in information-based integration. A manufacturing organization consists of a number of functional units, i.e., corporate planning, engineering design, production planning, marketing planning, research and development, manufacturing, warehousing, and product distribution. These functional units are linked by management information flow, by technology information flow, by materials flow, or by a combination of these [1]. It is information which holds the key to the integration of manufacturing processes. This paper is concerned with some fundamental problems in the information-based integration of automated manufacturing.

As the shopfloor has increased in speed, complexity, and scope, the requirements imposed on data collection, verification and validation, integration, and decision support techniques for quality assurance have exceeded the capability of traditional manual techniques [2][3]. The increase of information flow from various sources and the reduced reaction-time requirements have dictated the automation of data collection and data integration as well as of decision support for process control.

We propose the design of an Intelligent Quality Information System (IQIS) which performs various functions for quality control, such as data collection, data integration, and decision support. In this paper, we focus our attention on the data integration strategies of an IQIS for data verification, data validation, data association, and data conversion, as summarized in Fig. 1. We apply this concept of data integration to the test data collected from a PCB production operation at an electronic manufacturing company.

DATA INTEGRATION

In a CIM environment, technicians at various workcells enter test data into a central database, which we refer to as the raw database. Our idea of data integration is to perform data verification and data validation of the raw database. The validated data are then automatically grouped and categorized to form a quality assurance database for the whole manufacturing organization. We call this operation data association. To facilitate the use of the quality assurance database, the system dynamically reorganizes the data structure to meet user demands. This operation is referred to as data conversion. These four basic operations of data integration are discussed in this section. Prior to data verification, a normalization operation is performed to unify the description of each entity.

The process of data normalization applies a number of rules to the relational model, which is used to describe all entities, in order to unify the attribute relations in the entity representation. These rules prove to be useful guidelines because the relations formed by the normalization process make the data easier to interpret, verify, and manipulate.
In the case of printed-circuit board (PCB) test data, a single data entity may contain multiple test results (attributes defect_id and location). In this case, the normalization rule should ensure that all attributes are atomic (in the smallest possible components); that is, there is only one value for each domain and not a set of values, as shown in the following example. Before normalization, the entity representation contains multiple test results:

   (pcb_id, workorder, op_no, defect_id1, location1, defect_id2, location2, defect_id3, location3, date)

After normalization, the entity is described by:

   (pcb_id, workorder, op_no, defect_id1, location1, date)
   (pcb_id, workorder, op_no, defect_id2, location2, date)
   (pcb_id, workorder, op_no, defect_id3, location3, date)
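The following Python sketch illustrates this flattening step; the record layout and function name are ours, not the paper's:

   def normalize(entity, max_results=3):
       """Split one raw record holding repeated (defect_id, location) pairs
       into atomic entities, one test result per tuple (first normal form)."""
       head = (entity["pcb_id"], entity["workorder"], entity["op_no"])
       rows = []
       for n in range(1, max_results + 1):
           defect = entity.get(f"defect_id{n}")
           if defect is None:
               continue  # unused result slot in the raw record
           rows.append(head + (defect, entity[f"location{n}"], entity["date"]))
       return rows

   raw = {"pcb_id": "053627", "workorder": "4U316", "op_no": "601",
          "defect_id1": "171", "location1": "C13TR14", "date": "050589"}
   print(normalize(raw))
   # [('053627', '4U316', '601', '171', 'C13TR14', '050589')]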
Data Verification

The data verification stage is designed to detect and identify erroneous data entities in the raw database, and to notify the next stage of data validation to correct these entities. This stage consists of two types of verification operations: (1) attribute-based verification and (2) constraint-based verification. In performing data verification, we utilize two types of reference databases, i.e., an entity/attribute dictionary for attribute-based verification and a value dictionary for constraint-based verification, which contain the knowledge necessary for data verification. Attribute-based verification performs two levels of verification for each attribute of an entity, i.e., the syntactic level for data format and the semantic level for data value, code, unit, and scale, based on the entity/attribute dictionary which describes the allowable formats, values, codes, and scales for each attribute of an entity. Constraint-based verification performs product-dependent dynamic verification based on the value dictionary which describes the relationship between primary attributes and constrained secondary attribute values.

In performing attribute-based verification, we utilize the entity/attribute dictionary database which specifies the allowable domain for each attribute. This allows the information providers, such as test operators or computerized machines, to cooperate with others even though they use different formats, scales, or units. The entity/attribute dictionary is defined as

   E_i -> {A_i1(F,C,V,U,S), ..., A_ik(F,C,V,U,S), ..., A_iN(F,C,V,U,S)}
   F = {f_1, f_2, f_3, ...}
   C = {c_1, c_2, c_3, ...}
   V = (lower limit, upper limit)
   U = {u_1(range_1), u_2(range_2), ...}
   S = {s_1(range_1), s_2(range_2), ...}        (1)

where E_i, A_ik, and N denote the entity, the kth attribute of entity E_i, and the number of attributes, respectively; F, C, V, U, and S denote the sets of allowable formats (f_1, f_2, ...), codes (c_1, c_2, ...), values (lower and upper limits), units (u_1, u_2, ...), and scales (s_1, s_2, ...).

In performing the semantic level of attribute-based verification, if the attribute has a unit field (or scale field), several allowable units (or scales) may be defined within their ranges. For instance, an attribute may have several allowable units, such as "Cm", "m", and "Km", as shown in Fig. 2. The system reads an attribute value from the raw database and determines its unit by finding the range corresponding to its value. The unit "Cm" is the desired unit, which has the first priority for the range matching. If the proper unit is found, the procedure stops with the unit identification code for the unit/scale unification in the data validation stage. If not, the data entity is identified as erroneous.

Let us consider the example of the entity 'pcbtest' for typical PCB test data, as shown in Fig. 3, which will be used for every illustration in this paper. The first three attributes are entered by a bar code reader and the last three attributes are entered by a test technician to send the PCB inspection results into a central database via a multiplexer channel. The data entity (053627, 4U316, 601, 171, C13TR14, 050589) in Fig. 3 can be interpreted by the system as follows: the PCB (053627: serial no.) for an amplifier (4U316: product type) has a defect (171: solder bridge) between the capacitor (C13) and the transistor (TR14), which was detected at the workcell (601: solder inspection) on May 5, 1989. Fig. 4 illustrates the structure of the entity/attribute dictionary for PCB test data, which allows, in this specific example, different formats and codes. The verification procedure is applied to the attributes of each entity in order to identify erroneous data in terms of format and code, by performing the comparison operation. The underlined format has the first priority for verification (or comparison) as the desired format.
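As a concrete illustration, here is a minimal Python sketch of attribute-based verification against such a dictionary; the dictionary contents, regex encodings of formats, and helper names are hypothetical:

   import re

   # Hypothetical entity/attribute dictionary for the 'pcbtest' entity.
   # Each attribute lists allowable formats (as regexes, first = desired),
   # allowable codes, and allowable units with their value ranges (Fig. 2).
   PCBTEST_DICT = {
       "op_no":     {"formats": [r"^\d{3}$"], "codes": {"601", "610", "637"}},
       "defect_id": {"formats": [r"^\d{3}$"], "codes": {"171", "172", "178"}},
       "length":    {"formats": [r"^\d+(\.\d+)?$"],
                     "units": [("Cm", 10, 100), ("m", 0.1, 1), ("Km", 0.0001, 0.001)]},
   }

   def verify_attribute(name, value):
       """Return (ok, error_or_unit_code) for one attribute of an entity."""
       spec = PCBTEST_DICT[name]
       # Syntactic level: the value must match one of the allowable formats.
       if not any(re.match(f, value) for f in spec["formats"]):
           return False, "wrong format"
       # Semantic level: code check against the allowable code set.
       if "codes" in spec and value not in spec["codes"]:
           return False, "wrong code"
       # Semantic level: unit determination by range matching, in priority order.
       if "units" in spec:
           x = float(value)
           for unit, lo, hi in spec["units"]:
               if lo <= x <= hi:
                   return True, unit   # unit id code passed on to data validation
           return False, "value outside all unit ranges"
       return True, None

   print(verify_attribute("defect_id", "171"))   # (True, None)
   print(verify_attribute("length", "0.5"))      # (True, 'm')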
In addition to attribute-based verification, product-dependent, process-dependent constraint-based verification is performed using the constrained attribute relationships. This verification is based on the value dictionary describing the constraints between attributes, which are dynamically changeable upon inquiry from the manufacturing department. The value dictionary is defined as

   PA(E_i) = {p_1, p_2, ...}
   A_i -> (A_1, ..., A_(i-1), A_(i+1), ..., A_N)        (2)

where PA(E_i) denotes the set of primary attributes (p_1, p_2, ...) for an entity E_i, and A_i is a primary attribute which constrains the values and codes of the secondary attributes (A_1, ..., A_(i-1), A_(i+1), ..., A_N). From this general definition, we can easily see that each attribute can be both a primary attribute and a secondary attribute for other attributes.

For instance, a primary attribute 'workorder' may have the allowable values of secondary attributes such as 'pcb_id', 'op_no', and 'date' in the test data entity 'pcbtest (pcb_id, workorder, op_no, defect_id, location, date)'. Furthermore, a primary attribute 'op_no' may have the allowable values of the secondary attribute 'defect_id', as shown in Fig. 5. In this example, the system searches the entity name ('pcbtest'), the primary attributes ('workorder', 'op_no'), and their secondary attributes, as illustrated in Fig. 6. By performing this constraint-based verification, we can examine and remove the product-dependent semantic level of error. The value dictionary can also be updated and modified by a system operator via the interactive user interface, which should guarantee the consistency and accuracy of the dictionary.
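A minimal sketch of the constraint-based check, assuming a value dictionary keyed by (entity, primary attribute, value); all names and constraint encodings are illustrative:

   # Hypothetical value dictionary: a primary attribute value constrains the
   # allowable values/codes of secondary attributes (cf. Figs. 5 and 6).
   VALUE_DICT = {
       ("pcbtest", "workorder", "4U316"): {
           "pcb_id": ("range", 400000, 401000),
           "op_no":  ("codes", {"610", "601", "637"}),
           "date":   ("range", 10588, 30588),   # 010588 .. 030588
       },
       ("pcbtest", "op_no", "601"): {
           "defect_id": ("range", 171, 178),
       },
   }

   def constraint_check(entity_name, record):
       """Verify each secondary attribute against the constraints imposed
       by the primary attribute values present in the record."""
       errors = []
       for attr, value in record.items():
           constraints = VALUE_DICT.get((entity_name, attr, value))
           if not constraints:
               continue  # this attribute value acts as no primary attribute
           for sec_attr, (kind, *spec) in constraints.items():
               sec_val = record.get(sec_attr)
               if sec_val is None:
                   continue
               if kind == "range":
                   lo, hi = spec
                   if not (lo <= int(sec_val) <= hi):
                       errors.append((sec_attr, sec_val, f"outside {lo}..{hi}"))
               elif kind == "codes" and sec_val not in spec[0]:
                   errors.append((sec_attr, sec_val, "invalid code"))
       return errors

   rec = {"pcb_id": "400123", "workorder": "4U316", "op_no": "601",
          "defect_id": "171", "date": "010588"}
   print(constraint_check("pcbtest", rec))   # [] -> no constraint violations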
Data Validation

The data verification stage passes erroneous data with error codes to the data validation stage. Data validation accomplishes the data correction task in order to generate an error-free raw database, based on the verification results of the previous stage. In performing this task, all attributes are transformed into the desired representation by unifying the allowable formats, units, and scales, in order to facilitate the following stage of data association, which generates the hierarchical data structure.
The data validation task includes the following subtasks:

a) Format modification. This subtask modifies any alternative formats into the desired format. The system requests data re-entry for an attribute which has an invalid format, in the user-interactive mode.

b) Unit/scale unification. The previous stage of data verification determines the unit (or scale) of an attribute by the range corresponding to its value. The system unifies the unit/scale by multiplying the attribute value by an appropriate factor for the desired unit/scale, as sketched after this list.

c) Invalid code (or value) elimination. If invalid codes (or values) are detected, the system has to decide whether the corresponding entity will be eliminated or an inquiry for data re-entry will be issued. The decision depends on how important the entity is. In our application, erroneous test data entities are kept from further processing with an error flag.

d) Statistics for erroneous data. In addition to the above generic functions of data verification, the system generates statistical information for erroneous data, which can be utilized to reduce errors by informing operators of their most frequent mistakes in data entry.
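A minimal sketch of subtask b), assuming "Cm" is the desired unit and using hypothetical conversion factors:

   # Hypothetical conversion factors to the desired unit "Cm".
   TO_CM = {"Cm": 1.0, "m": 100.0, "Km": 100000.0}

   def unify_unit(value, unit_code, desired="Cm"):
       """Convert a verified attribute value to the desired unit by
       multiplying with the appropriate factor (subtask b)."""
       factor = TO_CM[unit_code] / TO_CM[desired]
       return value * factor

   print(unify_unit(0.5, "m"))   # 50.0 (in Cm)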
Data Association

The error-free raw database, which has been verified and validated through the previous operations, is automatically grouped and categorized to form a well-structured quality assurance database. In the data association stage, we represent the database by hierarchical entity-attribute-value relations, which are referred to as category files. The raw database, collected in a random fashion, is semantically categorized into an associative tree by searching the data entity, identifying the key attribute for classification, and assigning each entity to the proper node in the associative tree structure. Each node of the category file hierarchy represents an entity associated with a set of attributes and values, as illustrated in Fig. 7. The categorization procedure is defined as follows:
   FOR i = 1 TO M                         ; for each entity
      FOR j = 1 TO L                      ; for each level of hierarchy
         E'_i = C_ij(E_i, K_ij, B_ij)     ; generate the categorized entity E'_i

where

   L    : the number of levels of the hierarchy for entity E_i
   M    : the number of entities involved in the operation
   K_i  : the set of key attributes for E_i
   K_ij : the key attribute for the jth level of hierarchy
   B_i  : the set of possible branches (categories) for E_i
   B_ij : the set of possible branches for the jth level of hierarchy
   C_i  : the set of categorization operators for E_i
   C_ij : the categorization operator for the jth level of hierarchy
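A Python rendering of this procedure, assuming the PCB key attributes discussed below; the equality-based categorization operator is an illustrative choice:

   from collections import defaultdict

   # Key attributes, one per level of the hierarchy (K_ij for j = 1..L).
   KEY_ATTRIBUTES = ["workorder", "op_no", "defect_id"]

   def categorize(entities, keys=KEY_ATTRIBUTES):
       """Assign each entity to a node of the associative tree whose path
       is the sequence of its key-attribute values (the C_ij operators
       here are simple equality tests on the key attributes)."""
       tree = defaultdict(list)
       for entity in entities:                    # FOR i = 1 TO M
           path = tuple(entity[k] for k in keys)  # FOR j = 1 TO L
           tree[path].append(entity)              # categorized entity E'_i
       return tree

   raw = [{"pcb_id": "053627", "workorder": "4U316", "op_no": "601",
           "defect_id": "171", "location": "C13TR14", "date": "050589"}]
   for path, members in categorize(raw).items():
       print(path, "->", len(members), "entity")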
In a PCB manufacturing operation, our approach to test data grouping is based on the test workcell structure. The categorization of the various tests is based on the defect type. The raw database is reorganized by the values of the key attributes, i.e., workorder, op_no, and defect_id, which are examined to locate each entity in the associative tree. For the data association, we use the entity-category-relationship model [4-6], where a category is defined as a subset of an entity. This model is known to have a more tree-like structure than the well-known entity-relationship model. In Fig. 7, the node 'test data', as a category of the QA database, has several attributes corresponding to the various test workcells. In turn, each node for a test workcell may have subsets obtained by categorizing the function of the test workcell into several defect domains (or categories). Each category for a defect domain also maintains many defect types as its attributes.
After grouping and categorizing data entities based on the identity (or similarity) of the key attributes, we can further associate the inter-related entities in the leaf nodes of the associative tree into a more structured shape. In performing data association, we expect to encounter different entities in similar domains which can be merged into a single entity (or into a hierarchical form relating those entities by a tree-like structure). Three types of similar domains, i.e., identical, enclosed, and overlapped, are considered for the data merging, as follows:

a) Identical entities. Operation: A(E_j) = A(E_i), where A(E_i) is the set of attributes of entity E_i. The merging operation for identical entities (E_i, E_j), which have the same attributes, is to eliminate the duplicated declaration for entities in the same domain by unifying the entities.

b) Enclosed entities. Operation: A(E_i) ⊂ A(E_j), A(E'_j) = A(E_j) − A(E_i). The enclosed entities, where the entity E_j in a larger domain encloses another entity E_i, can be combined to form a hierarchical representation by assigning the parent node to E_i and assigning the son node to the entity E'_j generated by the above operation.

c) Overlapped entities. Operation: A(E_k) = A(E_i) ∩ A(E_j), A(E'_i) = A(E_i) − A(E_k), A(E'_j) = A(E_j) − A(E_k). The overlapped entities (E_i, E_j), which have a set of common attributes A(E_k), can be merged by assigning the parent node to a new entity E_k containing the overlapped (common) attributes and assigning son nodes to the new entities (E'_i, E'_j) containing the non-overlapped attributes.
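These three cases reduce to set operations on attribute sets; a minimal sketch with illustrative entity names:

   def merge(e_i, e_j):
       """Classify two entities by their attribute sets and return the
       merged structure: identical, enclosed, or overlapped (or disjoint)."""
       a_i, a_j = set(e_i["attrs"]), set(e_j["attrs"])
       common = a_i & a_j
       if a_i == a_j:                     # a) identical: unify the entities
           return {"node": e_i["name"], "attrs": a_i}
       if a_i < a_j:                      # b) enclosed: parent holds common part
           return {"node": e_i["name"], "attrs": a_i,
                   "sons": [{"node": e_j["name"] + "'", "attrs": a_j - a_i}]}
       if common:                         # c) overlapped: new parent E_k
           return {"node": "E_k", "attrs": common,
                   "sons": [{"node": e_i["name"] + "'", "attrs": a_i - common},
                            {"node": e_j["name"] + "'", "attrs": a_j - common}]}
       return None                        # disjoint domains: nothing to merge

   solder = {"name": "solder_test", "attrs": ["pcb_id", "op_no", "defect_id"]}
   coating = {"name": "coating_test", "attrs": ["pcb_id", "op_no", "thickness"]}
   print(merge(solder, coating))  # overlapped: parent E_k holds pcb_id, op_no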
Data Conversion (and Extraction)

This stage of data integration provides demand-oriented (or problem-oriented) dynamic data reorganization to support various user applications. The data conversion aims at decision support for the designated level of user by regrouping and utilizing the category files built in the previous stage. Since different user classes have different views of (demands on) the central database, our approach to data conversion focuses on regrouping the category files into user-oriented formats. Before starting the data conversion, an analysis of user demands should be performed based on the expected user inquiries. For our specific application in PCB manufacturing operation, we classify user demands into four classes: a) repair operation, b) statistical analysis, c) process control, and d) process improvement.

In order to support the repair operation, the data conversion reorganizes the category file of test data into a new structure based on the key attributes 'pcb_id' and 'op_no' (repair workcell number). Therefore, repair technicians can efficiently retrieve test results and perform repair operations for each PCB by entering the key attributes. Furthermore, the category file can be regrouped into a more convenient form for direct utilization to support statistical analysis. For instance, the system directly utilizes the category file to provide the defect histogram during a certain period. History data for each PCB can easily be retrieved from the new hierarchical structure which is generated by performing data conversion based on the key attribute 'pcb_id'. Fig. 8 illustrates an example of data conversion for supporting the repair operation and statistical analysis.
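A sketch of such regrouping, with a flat record list standing in for the category file; the names are illustrative:

   from collections import defaultdict

   def convert(records, keys):
       """Regroup validated test records into a new hierarchy keyed by the
       given attributes (e.g., pcb_id and op_no for the repair view)."""
       view = defaultdict(list)
       for r in records:
           view[tuple(r[k] for k in keys)].append(r)
       return view

   records = [{"pcb_id": "053627", "op_no": "601", "defect_id": "171"},
              {"pcb_id": "053627", "op_no": "637", "defect_id": "082"}]
   repair_view = convert(records, ["pcb_id", "op_no"])   # repair operation
   history_view = convert(records, ["pcb_id"])           # PCB history data
   print(sorted(repair_view))  # [('053627', '601'), ('053627', '637')]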
In designing the decision support system for process control, diagnostic trees for the various defect types are designed by capturing the knowledge and expertise of experienced engineers and technicians as well as the design data for the manufacturing processes. The observed defects are converted into a defect pattern which serves as the input to the intelligent information system for performing diagnostic analysis. This portion of the intelligent information system is a knowledge-based expert system, and knowledge engineering techniques are applied to seek possible solutions. The design of a knowledge-based expert system may follow two approaches [8]: (1) the rule-based approach and (2) the pattern-directed approach [6][7]. The rule-based approach makes use of a collection of "if-then" rules. The pattern-directed approach is based upon the construction of pyramid-like know-how, which we refer to as a knowledge hierarchy. The PADIKS (pattern-directed knowledge-based system) approach is especially powerful in dealing with problems with classificatory properties, such as medical diagnosis [4][5], agricultural applications [7], and information retrieval.

By utilizing the diagnostic tree based on field engineers' experience and the design data for the manufacturing processes, we can generate the relationships between defects and causes. Fig. 9 illustrates several possible causes for each defect. Corresponding to each defect cause, there is a set of defects associated with that cause. Each defect has a confidence factor to indicate its frequency of occurrence when a given cause is present. Another confidence factor, indicating the frequency of occurrence of a defect cause when a given defect is observed, can also be recorded. These data provide the relationships between defect causes and defects, which can be represented as (<defect>, <frequency>) -> [(<cause>, <confidence factor>), ...], for example:
   (<solder ball>, <5%>) -> [(low preheat time, 0.4), (low solder fluidity, 0.2),
      (solder contact time, 0.2), (low preheat temp., 0.1), (low solder temp., 0.1), ...]      (3)

   (<excess solder>, <5%>) -> [(slow conveyor speed, 0.3), (high wave height, 0.3),
      (nonuniform flux, 0.2), (board direction, 0.1), ...]                                     (4)

   (<insufficient solder>, <5%>) -> [(low preheat temp., 0.4), (fast conveyor speed, 0.3),
      (low wave height, 0.3), (high solder pot temp., 0.2), (foam flux not wetting, 0.1),
      (lead contamination, 0.1), (hole contamination, 0.05), ...]                              (5)
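These relations map naturally onto a Python mapping; the ranking helper below is our illustration of prioritizing causes by confidence factor, not the paper's algorithm:

   # Defect -> list of (possible cause, confidence factor), from relations (3)-(5).
   DEFECT_CAUSES = {
       "solder ball": [("low preheat time", 0.4), ("low solder fluidity", 0.2),
                       ("solder contact time", 0.2), ("low preheat temp.", 0.1),
                       ("low solder temp.", 0.1)],
       "excess solder": [("slow conveyor speed", 0.3), ("high wave height", 0.3),
                         ("nonuniform flux", 0.2), ("board direction", 0.1)],
       "insufficient solder": [("low preheat temp.", 0.4), ("fast conveyor speed", 0.3),
                               ("low wave height", 0.3), ("high solder pot temp.", 0.2)],
   }

   def likely_causes(defect, top=3):
       """Return the highest-confidence causes for an observed defect."""
       return sorted(DEFECT_CAUSES[defect], key=lambda c: -c[1])[:top]

   print(likely_causes("solder ball"))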
In order to reduce the time required for rule-based reasoning, the well-known Pareto diagram based on the Pareto principle, which provides graphic guidance for prioritized problem solving, is employed by considering the major defect causes first. The Pareto principle, which has been widely accepted by the manufacturing community, can be simply stated as follows: a few of the manufacturing process parameters (the vital few) cause most of the quality problems, whereas most process parameters (the trivial many) account for very little of the quality problems in the manufacturing operation. The Pareto diagram is generated by tabulating the number of observed defects versus the defect type during a certain period of the manufacturing operation. This diagram is useful for finding the vital parameters and for updating the confidence values between a defect and its vital parameters. However, it does not consider dynamic interactions between the set of vital parameters for each defect. This dynamic interaction problem is under investigation.
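A minimal sketch of the Pareto tabulation (defect counts by type, sorted in descending order):

   from collections import Counter

   def pareto(defect_observations):
       """Tabulate observed defects by type and sort so that the 'vital few'
       defect types appear first."""
       return Counter(defect_observations).most_common()

   obs = ["171", "171", "178", "171", "172", "178", "171"]
   for defect_id, n in pareto(obs):
       print(defect_id, n)   # 171: 4, 178: 2, 172: 1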
Based upon the PADIKS approach, the observed PCB defects are the input, which is converted into a pattern vector. This pattern vector is utilized to find the most probable defect causes and to recommend the appropriate process control by navigating through the tree-like hierarchical knowledge structure.
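A hedged sketch of this navigation step, scoring candidate causes by accumulating confidence factors over the observed defect pattern; the scoring rule and the abbreviated cause table are assumptions, not the paper's algorithm:

   # Abbreviated defect -> (cause, confidence factor) table, as in relations (3)-(5).
   DEFECT_CAUSES = {
       "solder ball": [("low preheat time", 0.4), ("low preheat temp.", 0.1)],
       "insufficient solder": [("low preheat temp.", 0.4), ("low wave height", 0.3)],
   }

   def rank_causes(pattern_vector):
       """Score each candidate cause by accumulating the confidence factors
       of the observed defects that point to it."""
       scores = {}
       for defect in pattern_vector:
           for cause, cf in DEFECT_CAUSES.get(defect, []):
               scores[cause] = scores.get(cause, 0.0) + cf
       return sorted(scores.items(), key=lambda c: -c[1])

   observed = ["solder ball", "insufficient solder"]   # defect pattern vector
   print(rank_causes(observed))  # 'low preheat temp.' scores highest (0.5)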
CONCLUSION

In this paper we have presented the design concept of an intelligent quality information system (IQIS) for the electronics engineering and manufacturing industries. Four basic operations for data integration (data verification, data validation, data association, and data conversion) have been discussed. The concept of data integration has been applied to the test data collected from a PCB production operation at an electronic manufacturing company. Based on the design strategies discussed in this paper, we have been developing a prototype system in collaboration with the electronics industry.
REFERENCES

[1] J. T. Tou (1985). Design of expert systems for integrated production automation. Journal of Manufacturing Systems, vol. 4, no. 2.
[2] J. T. Tou, K. Chung, J. Park, and T. Jeong (1988). Intelligent information system for computer integrated manufacturing. Proc. 1st Conf. on PROCIM, pp. 10-12.
[3] E. L. Waltz and D. M. Buede (1986). Data fusion and decision support for command and control. IEEE Trans. Sys., Man, and Cybern., vol. SMC-16, no. 6, pp. 865-879.
[4] J. T. Tou (1978). Design of a medical knowledge system for diagnostic consultation and clinical decision-making. Proc. Int'l Comp. Symp.
[5] J. T. Tou (1978). MEDIKS - a medical knowledge system. Proc. 31st Annual Conf. on Engineering in Medicine and Biology.
[6] L. C. Chang and J. T. Tou (1984). MEDIKS - a medical knowledge system. IEEE Trans. Sys., Man, and Cybern., vol. SMC-14, no. 5, pp. 746-750.
[7] J. T. Tou and J. M. Cheng (1983). Design of a knowledge-based expert system for applications in agriculture. Proc. IEEE Symp. on Automating Intelligent Behavior.
[8] J. T. Tou (1985). Knowledge engineering revisited. Int. J. of Comp. and Inf. Sci., vol. 14, no. 3, pp. 123-133.

ACKNOWLEDGEMENT

The work reported in this paper was supported by the Florida High Technology and Industry Council under grant 4910451406512.
Fig. 1. Design concept for data integration in a CIM environment. Data verification: identify erroneous data in the raw database. Data validation: correct or remove erroneous data in the raw database. Data association: group and categorize the data into a well-defined hierarchical structure for the quality assurance database. Data conversion: dynamically reorganize the data structure to meet user demands.
Fig. 2. Allowable units and their ranges for an attribute: Cm (range 10 to 100, the desired unit), m (range 0.1 to 1), Km (range 0.0001 to 0.001).
Fig. 3. An example of the entity for PCB test data.
entity: pcbtest
   pcb_id (PCB serial no.): 053627, format I6
   workorder (workorder no. for each product type): 4U316, format A5
   op_no (ID of process step or workcell no.): 601, format I3
   defect_id (type of defect): 171, format A3
   location (location of defect): C13TR14, format A7
   date (data collection date): 050589, format I6
(An: alphanumeric string of length n; In: integer of length n; Fn: fixed-point decimal number.)

Fig. 4. Entity/attribute dictionary for PCB test data.
entity: pcbtest
   attribute(1): pcb_id (format(I6, A6))
   attribute(2): workorder (format(A5), code(4U316, ...))
   attribute(3): op_no (format(I3, A3), code(020, ..., 902))
   attribute(4): defect_id (format(A3), code(050, ..., F00))

Fig. 5. Example of primary and secondary attributes.
entity: pcbtest
   primary attribute: workorder (4U316: PCB for amplifier)
      secondary attributes: pcb_id (400000, 401000: lower and upper limits of the PCB serial number for product 4U316); op_no (610: mechanical test, 601: solder joint, 637: coating test); date (010588, 030588: manufacturing dates)
   primary attribute: op_no (610, 601, ..., 637)
      secondary attribute: defect_id ((251, ..., 254), (171: solder bridge, 172: missing solder, ..., 178: cold solder), ..., (081, ..., 084))

Fig. 6. Value dictionary for constraint-based verification.
entity: pcbtest (primary attributes: workorder, op_no)
   workorder 4U316 constrains: pcb_id (400000-401000), date (010588-030588), op_no
   op_no 610 constrains: defect_id (251-254); op_no 601 constrains: defect_id (171-178); op_no 637 constrains: defect_id (081-084)

Fig. 7. Category file for the QA database in PCB manufacturing.
QA database -> {test data, repair history data, process control}. Test data are categorized by test workcell (mechanical test, solder test, component test, in-circuit test, cleanliness test, coating test; workcell nos. 610, 601, 630, 635, 595, 590, 640, 637), each workcell by defect domain (e.g., component, solder, hardware, electrical defect), and each domain by defect type (e.g., wrong component, wrong location, missing component, wrong polarity, part not seated, improperly mounted, short leads, long leads, lead improperly bent, chipped/cracked/broken component, damaged PCB; solder bridges, missing solder, excess solder, insufficient solder, pinholes, solder bubbles, cold solder, shorted, open; wrong/missing/damaged/loose hardware; tolerance, intermittent, voltage, frequency, resistance, capacitance, voltage gain, rise/fall time, truth table, leakage).
Fig. 8. An example of the data conversion operation. Raw PCB data pass through data verification, validation, and association into a category file based on defect type; data conversion, driven by a menu-driven user command interpreter and a graphical user interface for user inquiries (e.g., a) PCB history data? b) repair operation?), produces a) a new hierarchy based on pcb_id and b) a new hierarchy based on pcb_id and op_no.
Fig. 9. Diagnostic tree for solder defects. The major defect-cause categories include mechanical adjustment, solder formation, preheat conditions, heat source, and component quality.