COMPUTER VISION AND IMAGE UNDERSTANDING
Vol. 63, No. 2, March, pp. 273–286, 1996 ARTICLE NO. 0019
Perfecting Vectorized Mechanical Drawings
YUAN CHEN, NOSHIR A. LANGRANA, AND ATISH K. DAS
Department of Mechanical and Aerospace Engineering, Rutgers—State University of New Jersey, P.O. Box 909, Piscataway, New Jersey 08855
Received August 12, 1993; accepted February 24, 1995
This article describes a system which has been developed to convert scanned engineering drawings into a vectorized file. The vectorized file consists of lines, arcs, and circles and generates a description of paper-based engineering drawings. The scanned engineering drawing is initially vectorized by a system called RENDER using line-tracking and curve-fitting techniques. However, the results obtained after the initial vectorization are not adequate. This paper discusses a post-processing system called P-RENDER which has been developed to further refine the vectorized line drawings and to recreate the drawing with the exact numbers of lines and arcs. Dashed entities are also detected and recreated. For large C- or D-size engineering drawings the refinement algorithm also leads to a considerable reduction in the size of the CAD database. © 1996 Academic Press, Inc.
1. INTRODUCTION
A mechanical drawing is a geometric description of an object. It is mostly composed of parametric curves and text annotation. Conversion of mechanical drawings is essentially a computer vision problem. The task at hand is not only to produce a compact encoding of the paper drawing in electronic format, but also to obtain an intelligent interpretation of geometric information of the object being drawn [1]. Much of the work to date has been addressed toward the problem of automating segmentation and recognition of components in mechanical drawings and documents [1–7]. It is intended that the system described here be incorporated with one of these systems called RENDER (restoration of engineering drawings or blue prints into CAD data base) developed by Nagasamy and Langrana [2]. This system automatically recognizes and generates a description of paper-based engineering drawings. Like the other vectorizing methods published in the literature [3–5], the line extraction algorithm (such as line tracking and curve fitting) used in RENDER is based on low-level local processing. An individual segment is fitted with straight lines, and if the fitting is not acceptable the segment is fitted with higher order curves such as conic [2, 6] or cubic splines [7]. There are three noteworthy
drawbacks in this low-level bottom-up processing technique [1]. First, only local information is used and the fitting process is sensitive to local distortions that are introduced during preprocessing (e.g., thinning). Second, an adverse side effect is that a segment can be a part of a single curve only (e.g., an improper starting point may divide an arc into two arcs). Third, dashed entities cannot be extracted; i.e., dashed lines or dashed circles are vectorized as isolated small line segments. The vectorizing algorithm requires a number of parameters such as angle tolerance or arc deviation to alleviate these problems. However, the problems have not been solved completely [2, 8].

The current research presents an approach to alleviate the above-mentioned problems. A post-processing system called P-RENDER has been developed. P-RENDER recreates the output obtained from RENDER. It includes procedures for attribute assignment, recreation of arcs, recognition of dashed entities, and global alignment of the points in the drawing. The processing steps involved in recreating the arcs present in the vectorized drawing are described in Section 2. The techniques used for the detection and recreation of the dashed entities are described in Section 3. Global alignment is discussed in Section 4. Refinement of a large drawing is presented in Section 5. Errors and the current limitations of the system are discussed in Sections 6 and 7.

2. REFINING THE VECTORIZED DRAWING
2.1. Pattern Primitives

RENDER [2] was used to provide the vectorized image of the scanned drawing in this research. Primitives (lines, arcs, circles, etc.) were recognized from the digital data. The CAD format for each primitive has the form "x, y, attribute." The attributes of the points were designated based on the primitives vectorized [9]. The attribute of a point denotes whether the point is a starting point or end point of a line, the center of a circle, the center of an arc, etc. Figure 1 shows the vectorized information in the line segments and the defined pattern primitives: A to G, which were used previously, and H to L, which were developed in this study for dashed line entities. The vectorized information is
1077-3142/96 $18.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
FIG. 1. Pattern primitives.
stored in the form of clusters. The coordinate and attribute information of all the points which are connected to each other by arcs or straight line segments is clustered together in clockwise or counterclockwise order. The first point in this cluster is called the starting point and the last point is called the end point.

2.2. Improper Vectorization

In general, a vectorization system such as RENDER has three major problems: (i) inability to correctly vectorize
the segment, (ii) inability to assign proper starting and end points, and (iii) inability to extract dashed line entities. Figure 2 shows a drawing with a circle, three lines AB, CD, and EF, and three arcs BC, DE, and FA. This figure shows a simple yet realistic vectorization problem. The task in vectorization of this drawing is not only to correctly vectorize all line segments, arcs, and the circle, but also to locate tangent points and concentric center points. Several scanning resolutions (72, 144, and 300 DPI) were tried. Some information was lost when a resolution of 72 DPI was used. A resolution of 300 DPI introduced some noise into the images. The resolution of 144 DPI was found to be optimum.

The drawing shown in Fig. 2 was scanned and vectorized. A magnified form of the vectorized image is shown in Fig. 3. Figure 3a shows that the arc BC became a straight line and an arc. Figure 3b shows that the arc DE became an arc, a straight line, and an arc. Figure 3c shows that the arc FA became five straight lines. All the straight line segments representing various portions of arcs were vectorized as small straight line segments. Usually, small straight line segments do not occur in mechanical engineering drawings. Therefore, in this paper these straight line segments are called pseudo line segments (PLS). PLS are always adjacent to the arcs. A default value of 0.21 in. is used as the threshold length for detecting the PLS: if the length of a line segment is less than the threshold length, it is considered a PLS. The value of 0.21 in. was arrived at after testing a large number of drawings which were drawn on 8½ × 11 in. paper. The value of the threshold length can be modified by the user.

FIG. 2. A shaft support fixture.

A series of algorithms has been developed to refine vectorized drawings. They include relocation of the starting and end points, recreation of arcs, recreation of dashed line entities, and global alignment (Fig. 4).

2.3. Location of a Proper Starting Point
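As a rough illustration of the PLS length test introduced above, on which both the starting point relocation and the arc recreation rely, the following Python sketch flags segments shorter than the threshold. The segment representation (a pair of (x, y) points in inches) and the function names are assumptions made for this example; they are not taken from RENDER or P-RENDER.

```python
import math

# Default threshold length quoted in the text, in inches (user adjustable).
PLS_THRESHOLD_IN = 0.21


def segment_length(seg):
    """Euclidean length of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def is_pls(seg, threshold=PLS_THRESHOLD_IN):
    """A segment shorter than the threshold is treated as a pseudo line segment."""
    return segment_length(seg) < threshold


def split_pls(segments, threshold=PLS_THRESHOLD_IN):
    """Partition segments into (pls, regular) lists, preserving input order."""
    pls = [s for s in segments if is_pls(s, threshold)]
    regular = [s for s in segments if not is_pls(s, threshold)]
    return pls, regular
```

For example, `is_pls(((0.0, 0.0), (0.1, 0.1)))` is true because the length is about 0.14 in., while a 1 in. segment is kept as a regular line.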
The location of the starting point is important from the point of view of recreation of the arcs. The algorithm for the recreation of the arc requires that the starting point does not fall anywhere on the original arc. This helps in bracketing the PLS and the smaller arc segments which are a part of the original arc. Initially the starting point of the vectorized drawing is forced to the bottom left-hand corner. This is accomplished by reviewing the sum of the x and y coordinates of all the points. The point which has the minimum sum is chosen as the initial starting point. However, this location may not be the desirable starting point, and so additional processing is done. In the example shown in Fig. 3, the starting point was at G. In Fig. 5, the starting point was at D. In Fig. 5 this is a satisfactory starting point. However, this is not true for Fig. 3. The starting point is relocated if one of two criteria is satisfied: (i) the straight line segment is identified to be a PLS, or (ii) the starting point is identified as belonging to an arc. In Fig. 3, point G is the starting point of an arc; therefore, the starting point is moved to the next point E in the clockwise direction. Since the segment EF is a straight line and not a PLS, location E becomes a desirable starting point. If the vectorized image were ordered counterclockwise with starting point G, the desirable starting point obtained using the above criteria would have been point D.

2.4. Recreation of the Arc Segment
FIG. 3. Vectorized image of the shaft support fixture.
After vectorization, an original arc may be represented by a combination of lines and arcs which must be combined to recreate the original arc. The two steps involved in the recreation of an arc are: (i) to trap all arc segments and/or PLS representing the original arc, and (ii) to compute the starting point, the end point, and the center of the arc.

2.4.1. Location of the Arc. The length of a PLS is much smaller than that of an actual straight line segment. The lengths of all the straight line segments present in the vectorized drawing are compared to locate the PLS, and the arc(s) adjacent to these PLS are trapped or bracketed to obtain the group of PLS and arc(s) representing each original arc. This is followed by determining the starting point, the end point, and the center of the recreated arc. As shown in Fig. 3, the starting point is E, and EF is a straight line segment which must be kept, but the straight lines L1, L2, . . ., L5 were identified as PLS and they were bracketed. Since segment AB is not identified as a PLS, it was not included; thus, the arc between EF and AB was identified. Similarly the arcs were identified between straight lines AB and CD, and between CD and EF.

FIG. 4. Flow chart for refinement of the vectorized drawing.

The next task is to compute the starting point, the end point, and the center. Usually, there are four possible arcs which can be recreated (Fig. 6).
(1) Figure 6i shows a general circular arc with tangent lines.
(2) Figure 6ii shows that the sides of arc BC form an angle of 90°.
(3) Figure 6iii shows that both sides of the semicircular arc BC have the same slope.
(4) Figure 6iv shows that semicircular arc BC has sides which are going in opposite directions.

Figure 7 shows a general circular arc in which the tangent lines are not parallel. Consider arc BC with a starting point B and end point C. The reconstruction procedure is as follows:
FIG. 5. A template.
FIG. 6. Four kinds of arcs which can be recreated.
FIG. 7. A general circular arc with tangent lines.
FIG. 9. A semicircular arc between parallel lines.
(1) Determine the point of intersection of the lines AB and CD, i.e., point I.
(2) Determine the bisection line IJ.
(3) From point B, drop a perpendicular such that BE ⊥ AB.
(4) From point E, drop a perpendicular on DI such that EF ⊥ DI.
(5) Compute G as the average of points C and F.
(6) From the new point G create a line GH ⊥ DI, and thus determine point H.
(7) From point H, drop a perpendicular onto AI to obtain point K.

The arc is reconstructed with the starting point K, the end point G, and the center H. This method does not work when an intersection point I cannot be found. This occurs in two cases, as shown in Fig. 6iii and Fig. 6iv.

Figure 8 shows a semicircular arc BC between AB and CD, where AB and CD are in the same direction. In this case, starting point B and end point C are very close to their exact positions. This is because the arc and its sides form sharp angles. Therefore, the center point E is computed by averaging the points B and C.

Figure 9 shows a semicircular arc BC between lines AB and CD where AB ∥ CD. To recreate the arc shown in Fig. 9, the line BE is created perpendicular to AB. The point F is determined as the mean of points E and C. The straight line FG perpendicular to CD is created. In this manner, the starting point G and end point F of the semicircular arc are computed. The center point H is the mean of points G and F.
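The seven-step construction for the general arc can be sketched in Python as follows. This is only an illustration under stated assumptions: points are (x, y) tuples, E and H are taken to be the points where the perpendiculars of steps (3) and (6) meet the bisector IJ (the text does not say this explicitly), and the bisector is taken as the internal bisector of the rays from I toward A and toward D. All function names are hypothetical.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1])


def _unit(v):
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def _perp(v):
    # Direction rotated by 90 degrees.
    return (-v[1], v[0])


def intersect(p, d, q, e):
    """Intersection of lines p + t*d and q + s*e, or None if they are parallel."""
    det = d[0] * e[1] - d[1] * e[0]
    if abs(det) < 1e-12:
        return None
    t = ((q[0] - p[0]) * e[1] - (q[1] - p[1]) * e[0]) / det
    return (p[0] + t * d[0], p[1] + t * d[1])


def foot(p, q, d):
    """Foot of the perpendicular dropped from p onto the line q + t*d."""
    t = ((p[0] - q[0]) * d[0] + (p[1] - q[1]) * d[1]) / (d[0] ** 2 + d[1] ** 2)
    return (q[0] + t * d[0], q[1] + t * d[1])


def reconstruct_arc(A, B, C, D):
    """Return (K, G, H): start point, end point, and center of the arc
    between line AB (ending at B) and line CD (starting at C)."""
    d_ab, d_cd = _sub(B, A), _sub(D, C)
    I = intersect(A, d_ab, C, d_cd)                 # step 1
    if I is None:
        raise ValueError("AB and CD are parallel; use the semicircular cases")
    u, v = _unit(_sub(A, I)), _unit(_sub(D, I))
    bis = (u[0] + v[0], u[1] + v[1])                # step 2: direction of IJ (assumed internal bisector)
    E = intersect(B, _perp(d_ab), I, bis)           # step 3: BE perpendicular to AB, E assumed on IJ
    F = foot(E, C, d_cd)                            # step 4: EF perpendicular to DI
    G = ((C[0] + F[0]) / 2, (C[1] + F[1]) / 2)      # step 5: average of C and F
    H = intersect(G, _perp(d_cd), I, bis)           # step 6: GH perpendicular to DI, H assumed on IJ
    K = foot(H, A, d_ab)                            # step 7: perpendicular from H onto AI
    return K, G, H
```

With this construction the distances from H to K and from H to G are equal, so the recreated arc is tangent to both lines. The parallel cases of Fig. 6iii and Fig. 6iv raise an error here and would need the averaging rules described for Figs. 8 and 9.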
FIG. 8. A semicircular arc between collinear lines.
The vectorization of a keyway presents a different problem. As shown, Fig. 9a and Fig. 10 are very similar. The only difference is the size of the arc. There is an arc between points B and C in both figures. In Fig. 9a, chord BC has the maximum length of all the chords of the arc, which results in a semicircular arc. In Fig. 10, chord BC is not the chord of maximum length. Thus, the length of chord BC determines whether the arc is a semicircular arc or a keyway's arc. In Fig. 10 the keyway's arc BC consisted of five lines and two arcs. Using the starting point relocation procedure, point A was identified as a starting point. The point B was identified as the beginning of the arc and point C as the end point of the arc. The center was found using the following steps:

(1) Find the center line EF from the mean value of points B and C. The center point of the arc BC must be on this line.
(2) Find all chords from the starting point B, and their perpendicular bisectors. All the bisectors will intersect the line EF. The mean of the intersection points is taken as the center point.

FIG. 10. A keyway arc.

3. RECOGNITION OF DASHED ENTITIES

Dashed lines are used extensively in engineering drawings to represent center lines, axes of symmetry, and hidden lines. Several approaches have been proposed for recognition of dashed lines [1, 10]. These approaches are designed to recognize dashed lines from skeletonized images. Lai and Kasturi [10] used a three-pass algorithm to extract dashed entities from maps and mechanical drawings. Pao and Jayakumar [1] used the Hough transform to extract dashed entities. It is known that pixel-based algorithms [1] including ours [2] are computationally expensive. The three-pass algorithm proposed by Lai and Kasturi [10] has been implemented here to detect dashed lines from vectorized drawings.

FIG. 11. Intersecting dashed entities.
FIG. 13. Regular multiple-point dashed line.

3.1. Algorithm for Detecting Dashed Entities

Potential dashed line entities are composed of isolated straight line segments, i.e., two-point entities, with attribute numbers 1 and 2 (Fig. 1, Item B). But when a dashed entity crosses another entity (dashed or nondashed), some line segments may be connected together and become a three-point entity. In Fig. 11 one dashed line crosses a dashed circle. At the intersection point, it becomes a three-point line segment and the points in this three-point entity have attribute numbers 1, 2, and 2 (or 122) (Fig. 11b). It is obvious that one part of this three-point line segment belongs to the dashed line and the other part belongs to the dashed circle. Thus, the first task is to separate this kind of three-point line segment into two line segments. The second step is to detect and cluster the line segments which belong to dashed entities. In this step, a current
FIG. 14. Dashed circle.
FIG. 12. Regular two-point dashed line.
FIG. 15. Dashed semicircle.
FIG. 16. Regular circular dashed arc.
segment is linked with a potential next segment if the angle variation (Δθ) and gap distance (G) between the two segments are less than the thresholds for angle variation (θT) and gap distance (GT), respectively. If more than one segment satisfies the above criteria, the segment with the smallest Δθ is chosen. For example, in Fig. 11c, if line C is the current segment, either of the two candidate lines B and D could be chosen. The priority for smaller Δθ makes line D cluster with line C. The above two steps are used to detect and cluster dashed entities. Once the dashed entities are clustered, the attribute numbers of the points representing the entity are of the form 121212 and so on (Fig. 12a).

The third step is to classify and recreate the dashed entities. The following six categories are used to classify the dashed entities.

(1) Regular Two-Point Dashed Straight Line. As shown in Fig. 12a, the elements of the clustered entity have the same slope, same gap, and same length (all within certain tolerances). Such an entity is classified as a regular two-point dashed line and is recreated by taking the average value of the x and y coordinates. The result is shown in Fig. 12b. The attribute numbers of the points representing the recreated entity become 11 and 2. The attribute number 11 stands for a dashed entity as defined in Fig. 1.
(2) Regular Multiple (More than Two)-Point Dashed Straight Line. Figure 13a shows a clustered dashed entity in which there are points (such as points B and C) where there is a change of slope. In such a case, the dashed entity is recreated as a multiple-point dashed line which has a starting point, points where there is a change of slope, and the end point. For example, the entity shown in Fig. 13a is recreated as a four-point dashed line with starting point A, two points B and C where there is a change of slope, and end point D. The attributes of the points are 11, 2, 3, and 2 (Fig. 13).
(3) Dashed Circle. If a clustered entity has a closed end and a uniform change of slope, it is classified as a dashed circle. The center and radius of the dashed circle are calculated, and the recreated dashed circle is stored as a two-point entity. The points which are stored are the center point (attribute 11) and a point on the circumference (attribute 4) (Fig. 14 and Fig. 1, Item I).
(4) Dashed Semicircle. If the clustered entity has a closed end and a uniform change of slope followed by a regular dashed line, it is classified as a dashed semicircle. The center and radius of the semicircle are calculated. The recreated dashed semicircle is stored as a four-point entity (Fig. 15). Based on the definition of attributes shown in Fig. 1, the attributes of the points are 11, 5, 7, and 2 or 11, 2, 5, and 6 depending on the direction of the semicircle.
(5) Regular Circular Dashed Arc. Figure 16a shows an example of a dashed semicircular arc within a clustered entity. Two steps are necessary to recreate the entity. First, the regular dashed line portions are recreated (Fig. 16b). Second, the dashed arc is recreated by calculating the center of the arc (Fig. 16c).
(6) Alternating Long and Short Dashed Center Lines. The identification of alternating long and short dashed center lines is done by comparing the length of the current segment with that of the previous segments for both odd and even segments. If the current segment is an even numbered segment then its length is compared with the average length of the previous even segments, and a similar process is carried out for the odd numbered segments. If the lengths of all the even numbered segments are close to each other and the lengths of all odd numbered segments are also close to each other (within a certain tolerance) and the average length of the even segments is different from the average length of the odd segments, then they are identified as belonging to this group. If such an entity is identified (Fig. 17a), it is recreated by computing an average straight line through all long and short line segments. The location of both end points is calculated from this straight line. The recreated entity is a two-point entity with attribute numbers 21 and 2 (Fig. 17b; Fig. 1, Item L).

FIG. 17. Alternating long and short dashed center lines.
FIG. 18. Vectorized image of a drawing containing 153 line segments.
FIG. 20. The drawings in Fig. 18 after refinement of the dashed lines; represented by 77 line segments.
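The linking rule of the second step and the test for category (6) can be sketched in Python as below. This is a hedged illustration rather than the authors' implementation: the default thresholds are the values reported for the test drawing in Section 3.2 (3° and 0.13 in.), while the way the gap is measured (smallest endpoint-to-endpoint distance), the undirected orientation difference, the use of overall means instead of running means, and the tolerances `tol` and `min_diff` are assumptions made for this example.

```python
import math

THETA_T_DEG = 3.0  # angle threshold reported in Section 3.2
GAP_T_IN = 0.13    # gap threshold reported in Section 3.2, in inches


def orientation_deg(seg):
    """Undirected orientation of ((x1, y1), (x2, y2)) in [0, 180) degrees."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_variation(a, b):
    """Smallest angle between the orientations of two segments."""
    d = abs(orientation_deg(a) - orientation_deg(b))
    return min(d, 180.0 - d)


def gap(a, b):
    """Assumed gap measure: smallest distance between endpoints of a and b."""
    return min(math.dist(p, q) for p in a for q in b)


def next_segment(current, candidates, theta_t=THETA_T_DEG, gap_t=GAP_T_IN):
    """Return the candidate that passes both thresholds with the smallest
    angle variation, or None if no candidate qualifies."""
    best, best_dtheta = None, None
    for cand in candidates:
        dtheta = angle_variation(current, cand)
        if dtheta < theta_t and gap(current, cand) < gap_t:
            if best is None or dtheta < best_dtheta:
                best, best_dtheta = cand, dtheta
    return best


def is_alternating(lengths, tol=0.02, min_diff=0.05):
    """Category (6) test: even and odd segment lengths are each uniform
    within tol, and their means differ by more than min_diff."""
    even, odd = lengths[0::2], lengths[1::2]
    if not even or not odd:
        return False
    mean_e, mean_o = sum(even) / len(even), sum(odd) / len(odd)
    uniform = (all(abs(v - mean_e) <= tol for v in even)
               and all(abs(v - mean_o) <= tol for v in odd))
    return uniform and abs(mean_e - mean_o) > min_diff
```

Repeatedly calling `next_segment` from the last linked segment would grow one cluster; the priority on the smaller angle variation is what resolves the choice between two qualifying candidates at a crossing.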
3.2. Experimental Results of Dashed Entity Recognition

The dashed line detection algorithm described in this section was applied to the test mechanical drawing shown in Fig. 18. This figure is a vectorized drawing output from RENDER. The initial drawing was 8½ × 11 in. and was scanned at a resolution of 144 DPI. The vectorized drawing contained 153 line segments. The dashed line recognition algorithm identified 8 center lines, 11 hidden lines, and one dashed circle as shown in Fig. 19. The two threshold parameters for angle variation and gap distance, θT and GT, used for this drawing are 3° and 0.13 in., respectively. Figure 20 shows the dashed lines after refinement, which involved calculation of the location of the points used in the recreated entity. The total number of entities is reduced from 153 to 77. This is a significant reduction in the stored data.
FIG. 19. Dashed line entities recognized in the drawings shown in Fig. 18.
FIG. 21. Limitations of the dashed line recognition algorithm.
FIG. 22. Results of refinement of the vectorized drawings shown in Fig. 18.
3.3. Limitations of Dashed Entity Recognition

Figure 21 shows the limitations of the dashed line recognition algorithm. For 8½ × 11 in. drawings scanned at a resolution of 144 DPI, the minimum gap threshold (GT) was found to be 0.13 in. Intersecting dashed entities can be detected and recreated as shown in Fig. 21a. Two different kinds of dashed lines are used in Fig. 21 to differentiate between the vectorized figure and the recreated figure. The recreated dashed lines are assigned a predefined size. If the distance between two adjacent dashed entities is less than the defined gap distance threshold (GT), the two dashed entities may get mixed together (Fig. 21b). Furthermore, the recreation will fail if the entities are not vectorized correctly. In Fig. 21c it can be seen that there is an arc in the dashed semicircle. The recreation algorithm will not be able to recreate this semicircle correctly. The recreation algorithm also requires a minimum of four line segments to recreate dashed arcs or dashed circles. In Fig. 21d, the three line segments did not satisfy this requirement and therefore could not be recreated. The algorithm for the recreation of a dashed circle is based on a uniform change in the slope of the segments. Hence, all the segments need to be perfectly recognized to recreate a dashed circle. Finally, three segments were needed to recreate the dashed arc in Fig. 21d.
4. GLOBAL ALIGNMENT
The algorithms used in the previous section to recreate the drawings were local processes only; i.e., the algorithm investigates one line segment at a time. But it is obvious
FIG. 23. The refined drawings overlaid on the vectorized drawings shown in Fig. 18.
FIG. 24. A C-sized drawing after refinement (22 × 17 in.).
that in mechanical drawings all lines (points) are interrelated. For example, in Fig. 18, the center lines in the adjacent views must be aligned, and in Fig. 2, arc DE and the circle must have the same center point. Thus, from the global point of view, related lines (points) must be aligned. This is also true in Fig. 22c, where the center lines between the adjacent views must be aligned and concentric circles and/or arcs must have the same center point. The alignment algorithm searches each point, collects the coordinate values which are close to each other within a threshold tolerance, and takes the average values as the new coordinate values for these related lines (points). The altered related lines (points) are thus aligned. Figure 22 shows the results of alignment. All the arcs, dashed lines, and the circle are modified and aligned. In both Figs. 22a and 22b points a, b, c, and d are aligned. In Fig. 22c points a, b, c, d, e, f, g or g, h, i, j or k, l, m, n, o etc. are all aligned. The success of all the refinements can be seen in Fig. 23, where the input drawing (Fig. 18) and the final recreated drawing are overlaid. This type of one-level higher processing seems to be efficient, reliable, and accurate. The errors are discussed in Section 6.

In summary, three types of global alignment have been performed:
• All concentric circles and arcs have the same center point.
• Center lines between adjacent views are collinear.
• Since mechanical components are being analyzed, the orthogonality of adjacent sides is preserved.
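The averaging step of the alignment algorithm can be sketched in Python as follows. The sketch treats x and y coordinates independently and groups sorted values whose consecutive differences are within the tolerance; both choices, the tolerance of 0.05 in., and the function names are assumptions made for this example, since the text only states that nearby coordinate values are collected and averaged.

```python
def align_values(values, tol):
    """Replace each group of nearby values by the group mean.

    Values are sorted, and a new group starts whenever the difference to the
    previous sorted value exceeds tol (an assumed grouping rule).
    """
    aligned = list(values)
    order = sorted(range(len(values)), key=lambda i: values[i])

    def flush(group):
        if group:
            mean = sum(values[i] for i in group) / len(group)
            for i in group:
                aligned[i] = mean

    group = []
    for i in order:
        if group and values[i] - values[group[-1]] > tol:
            flush(group)
            group = []
        group.append(i)
    flush(group)
    return aligned


def align_points(points, tol=0.05):
    """Align x and y coordinates of (x, y) points independently."""
    xs = align_values([p[0] for p in points], tol)
    ys = align_values([p[1] for p in points], tol)
    return list(zip(xs, ys))
```

Applied to the vectorized points B(1.014, 10.074) and R(5.687, 10.043) quoted in Section 6, the two y values are averaged to 10.0585, in line with the refined value of 10.058 reported there, while the x values are far apart and stay unchanged.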
5. REFINEMENT ON A LARGE VECTORIZED DRAWING
Figure 24 is the refined form of a C-size (22 × 17 in.) drawing. The vectorized output obtained for the drawing had 606 entities, of which 234 were small straight line segments or dashed entities. These were reduced to 35 dashed entities. Thus, after refinement, the total number of entities was reduced from 606 to 407. All the straight line entities were properly aligned after refinement. Although a large number of entities were successfully vectorized and refined, this drawing, from a mechanical drawing point of view, is far from being perfect. To show the difficulties, specific portions of the drawing have been marked manually so that they can be commented on individually.

1. Part-1, which has a partially hidden circle, was represented by a solid arc and small line segments after refinement. The algorithm does not have additional knowledge to integrate this information to make it a complete circle.
2. In Part-2 the dashed rectangle, the side plate (composed of 4 line segments), and the circle were correctly recognized. However, the center lines were not recognized and they were represented by small line segments. The dashed line and rectangle were recognized perfectly in Part-3.
3. In Part-4 also the rectangles and the dashed lines were recognized correctly. The system was unable to recognize all the dashed lines in Part-5.
4. In Part-6 the horizontal and vertical dashed lines were recognized individually; however, the system failed to combine them.

FIG. 25. Recreation of circles which are tangent to each other in pairs.
TABLE 1 Relative Errors in Length for the Figure Shown
TABLE 2 Relative Errors in Slopes for the Figure Shown

6. ERROR DISCUSSION

The input to the system described in this paper consisted of images of machine drawings digitized at a resolution of 144 DPI using a commercially available scanner. There are two possible sources of error during the process. (1) Usually the average thickness of the line segments
TABLE 3 Absolute Errors in Length for the Figure Shown
is 0.032 (thick) to 0.016 in. (thin) [13]. The scanning resolution used is 144 DPI (dots per inch), which means that there are 2 to 4 dots per line. This may result in a truncation error in the computation of the length. However, this error is very small and may not be significant. (2) Since the extraction of lines from the scanned image is based on thinning and vectorizing algorithms, there is a small distortion in the straight lines and arcs. For example, in Table 1 the coordinates of points B and R after vectorizing were B(1.014, 10.074) and R(5.687, 10.043). BR should be a horizontal line. However, the y coordinates of points B and R are not identical; there is a small distortion (0.031 in.). The recreation algorithm and the global alignment method developed in this study will force vertical, horizontal, or collinear lines to have the same x or y coordinates (or be collinear). After refinement, the y coordinates of both points B and R are 10.058. This demonstrates the need for global alignment.

Table 1 shows the local errors for the figure above it. The length and slope of each line segment were measured. The absolute error and percentage error are also shown in Table 1. The maximum local error was 0.019 in., which occurred in line QR, and the maximum percentage local error was 5.32%, which occurred in line Za. The slope errors are listed in Table 2. The maximum slope error of 3.076° occurred in line TU.

An isolated circle is vectorized without distortion in RENDER. Hence, the center point of a circle can be used as a reference point to measure the absolute error in length in drawings which contain a circle. The figure in Table 3 has a circle whose center is used as a reference point to measure the absolute lengths. The maximum absolute error occurred in line SB, where the error was 0.020 in. or 1.581%. The local and absolute errors discussed above show that regardless of which point is chosen as a reference point, the refinement process has generated realistic and accurate vectorized drawings.

Figure 25 shows five circles of which four are tangent to each other in pairs. Figure 25a is the vectorized drawing and Fig. 25b is the recreated drawing. It is obvious that the recreation of a tangent circle will have larger errors. In this example, tangent points A and B were located but C could not be located. The table shown in Fig. 25 indicates that the maximum error is 0.020 in.

FIG. 26. Arcs which cannot be recreated correctly.
FIG. 27. Missing or incorrectly recreated arcs.

7. CURRENT LIMITATIONS
Although the current system was successful in vectorizing and quantifying engineering drawings, the system has certain limitations.

(1) The system is unable to handle large skew in the drawings.
(2) The threshold length defined to distinguish PLS from the actual straight lines is an important parameter for locating an arc and moving the starting point to a proper location. The user must input this value or select the default value of 0.21 in. This process can be repeated with a modified value by the user if the selected value does not work. There is another limitation in the system associated with the threshold length used for detecting PLS. Suppose that in the actual drawing, between two straight lines, there is a group of small line segments whose lengths are smaller than the threshold length. In such a case, the system will consider them as PLS and replace them with an arc.
(3) Arcs with tangent lines, semicircular arcs between collinear lines, and semicircular arcs between parallel lines can be recreated. All other arcs, such as those shown in Fig. 26, would be incorrectly recreated. The system is not capable of handling S curves either.
(4) Arcs, circles, and chamfers of size smaller than 0.2 in. may not be recognized successfully because of the small number of pixels representing them. Figure 27a shows a vectorized drawing. In the original drawing, arc A's radius is 0.1 in., arc H's radius is 0.15 in., and the radii of arcs B, C, D, E, F, and G are 0.2 in. After vectorizing, arc A was missed, and arcs C, D, F, and H became chamfers. Since the arcs are small, the refinement algorithms cannot recreate the arcs correctly, as shown in Fig. 27b.

One of the possible ways by which the above errors can be avoided is by recognizing and analyzing the dimensional
information present in the drawing, which will indicate the presence of an arc and its dimensions. However, procedures for recognizing dimensional information present in engineering drawings first need to be developed; this is currently being investigated.
8. CONCLUSIONS
The algorithms presented in this research have demonstrated the use of further refinement on the vectorized drawings. RENDER was used as the input/preprocessing mechanism, and we believe that this methodology could prove useful in vectorization of line drawings in areas other than mechanical engineering. Even though the earlier system was capable of recognizing primitives such as lines, arcs, and circles, the recognition accuracy was dependent on the type and complexity of the drawings as well as the quality of the original scanned document. This is due to the local nature of the low-level processing and vectorization techniques. To alleviate this, the current research concentrates on developing a higher level refinement of the existing system and on reconstructing mechanical drawings with the exact number of lines and arcs, and their order. The information generated in this manner is in a form suitable for two-dimensional geometric feature recognition. For large C- or D-size engineering drawings the refinement algorithm can further reduce the size of the CAD database. The significance of the present work is that we are successfully able to convert paper drawings into a usable CAD-oriented model. We have shown how this method can improve upon the low-level local processing in line extraction. We believe this information is useful to the research community.

REFERENCES

1. D. Pao and R. Jayakumar, Graphic feature extraction for automatic conversion of engineering line drawing, in International Conference on Document Analysis and Recognition, Vol. 2, pp. 533–541, 1991.
2. V. Nagasamy and N. A. Langrana, Engineering drawing processing and vectorization system, Comput. Vision Graphics Image Process. 49, 1990, 379–397.
3. I. Chakravarty, A single-pass chain generating algorithm for region boundaries, Comput. Vision Graphics Image Process. 15, 1981, 182–193.
4. M. S. Landy and Y. Cohen, Vectorgraph coding: Efficient coding of line drawings, Comput. Vision Graphics Image Process. 30, 1985, 331–344.
5. J. R. Parker, Extracting vectors from raster images, Comput. Graphics 12, 1988, 75–79.
6. T. P. Clement, The extraction of line-structured data from engineering drawings, Pattern Recognit. 14, 1981, 43–52.
7. J. P. Bixler and L. T. Watson, Spline-based recognition of straight lines and curves in engineering drawings, Image Vision Comput. 6(4), 1988, 262–269.
8. R. Kasturi and L. O'Gorman, Document image analysis: An overview of techniques for graphics recognition, Syntact. Struct. Pattern Recognit. Workshop 1, 1990, 179–197.
9. B. Bailey, Prime MEDUSA User's Guide, Prime Computer, Natick, MA, 1988.
10. C. P. Lai and R. Kasturi, Detection of dashed lines in engineering drawings and maps, in International Conference on Document Analysis and Recognition, Vol. 2, pp. 507–515, 1991.
11. D. Casasent and R. Krishnapuram, Curved object location by Hough transformations and inversions, Pattern Recognit. 20, 1987, 181–188.
12. R. Kasturi and S. T. Bow, A system for interpretation of line drawings, IEEE Trans. Pattern Anal. Mach. Intell. 12, 1990, 978–991.
13. C. Jensen and J. D. Helsel, Fundamentals of Engineering Drawing, pp. 49–63, McGraw-Hill, New York, 1985.