Perceptual grouping for symbol chain tracking in digitized topographic maps


Pattern Recognition Letters 20 (1999) 355–365

P. Gamba a,*, A. Mecocci b

a Dipartimento di Elettronica, Università di Pavia, Via Ferrata 1, 27100 Pavia, Italy
b Facoltà di Ingegneria, Università di Siena, Via Roma 77, 53100 Siena, Italy

Received 18 February 1998; received in revised form 23 October 1998

Abstract

In this paper we propose a new algorithm that applies perceptual grouping to detect and track discontinuous chains of symbols in digitized maps. The procedure is based on an artificial intelligence kernel that supervises three auxiliary processes: the Search Strategy Generation module, which is responsible for the strategy used to scan pixels; the Symbol Detection (SD) module, which extracts the recognized symbols; and the Cost Function Evaluation (CFE) module, which assigns a global quality index to each symbol by considering the whole course of the line. Selected Gestalt rules are used to optimize the grouping procedures. After presenting the algorithm, we discuss the extraction of dotted and dashed lines from digitized topographic maps. Experimental results on many maps of the Istituto Geografico Militare Italiano (IGMI) show very good behavior: 92% of the discontinuous lines were correctly chained, and the percentage of incorrectly classified symbols is also very small. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Perceptual grouping; Symbol chain tracking; Document analysis

* Corresponding author. Tel.: 39 382 505 923; fax: 39 382 422 583; e-mail: [email protected]

0167-8655/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0167-8655(99)00003-3

1. Introduction

In many applications, especially for Geographic Information System (GIS) purposes, it is often necessary to retrieve the information contained in archival maps, either hand-drawn or printed. The process requires digitizing the source and interpreting it to extract symbols, structures, text and comments. In particular, the extraction and interpretation of map symbols is complicated by the fact that they can group to form discontinuous chains. These structures, although seldom treated in the technical literature, are important elements in different types of drawings. For example, in topographic maps discontinuous symbol-chains are used to indicate, among other things, administrative boundaries (dashed lines) or regional boundaries (dotted lines). Furthermore, in technical drawings similar chains are used to define symmetry axes (again dashed lines) or view sections (dash-point-dash lines). It is therefore important to know how to track these lines in an efficient and sufficiently precise way.

However, the usual line extraction techniques (Caponetti et al., 1984; Kasturi et al., 1990) are not suitable to be applied directly to this type of structure, even when more complex knowledge-based approaches are introduced (Yamada et al., 1993; den Hartog et al., 1996; Boatto et al., 1992; Joseph and Pridmore, 1992; Ilg, 1990; Fisher et al., 1990; Arias et al., 1993; Deseilligny et al., 1993), since the a priori knowledge about continuous and discontinuous lines is inherently different.

In this paper we propose a complete procedure based on a perceptual grouping approach that exploits the characteristics of discontinuous chains of symbols and the a priori knowledge about the symbols to be tracked. The proposed system has proved suitable for the automatic extraction of this type of chain from digitized images. Some preliminary results have been presented in a conference paper (Gamba et al., 1997); here we report the general theory and test the method on the extraction of dashed and dotted discontinuous lines.

The main idea comes from the consideration that automated discontinuous line tracking consists first in identifying elementary symbols and then in grouping them into consistent chains: it therefore combines low and higher level operations. Typical low level operations are the shape recognition procedures, devoted to recognizing the instances of the elementary symbols (dashes, dots, asterisks, ...). The results of these procedures can be insufficiently reliable: outcomes depend too much on noise, on morphological differences between the models and the true symbols, and on the disturbing context. Misclassifications cannot be completely avoided; instead, uncertainties and ambiguities can be resolved by exploiting the perceptual grouping concepts of continuity, collinearity, proximity and periodicity. In this paper we show how chain shape continuity, chain collinearity, proximity and periodicity can be used as features to group symbols efficiently with the part of the chain already tracked.
The algorithm uses these concepts (translated into simple rules) to search for a set of alternative successors to the last symbol detected, and to pick the most probable one on the basis of a cost function. However, we need to stress that even the grouping procedures are prone to errors: wrong decisions can be made at each step of the process, and incorrect directions in chain grouping may result. To recover from these errors and restore previously discarded paths, we maintain information about unused successors by means of a well-known artificial intelligence technique, the A* algorithm.

Thanks to the proposed architecture, excellent results have been obtained in tracking different types of discontinuous symbol-chains in topographic maps. The system has been designed to be general and robust against different types of errors (low level misclassifications, as well as wrong subchain detection). These characteristics allow the system to be tailored easily to different chain tracking problems by changing only a few parameters.

The work is organized as follows. Section 2 presents a brief overview of the existing literature on the extraction of map symbol-chains or discontinuous lines, outlining the problems dealt with in this paper. Section 3 introduces the overall search strategy as well as the different modules of the system. Section 4 presents the results of applying the method to the extraction of dot–dot and dash–dash chains in some digitized maps, and Section 5 discusses the results obtained, pointing out the advantages of the procedure as well as the problems still open.

2. Brief overview of related literature

Many works have been presented on the interpretation of digitized maps and drawings, but many of them are devoted to the extraction of lines and/or symbols (Yamada et al., 1993; den Hartog et al., 1996; Boatto et al., 1992; Joseph and Pridmore, 1992; Ilg, 1990; Fisher et al., 1990; Arias et al., 1993; Deseilligny et al., 1993; Reither et al., 1996; Che et al., 1996; Aoki et al., 1996; Ramel et al., 1996; Trier et al., 1996), without any connection between them.
More precisely, many papers that focus on symbol or line extraction (Yamada et al., 1993; Boatto et al., 1992; Reither et al., 1996; Che et al., 1996; Aoki et al., 1996; Ramel et al., 1996; Trier et al., 1996) work toward a more precise recognition of the searched items by means of suitable operators. Only a few of them (see Ramel et al., 1996, for instance) point out that some kind of feedback loop must be included in the procedure to reduce the complexity of the operators, enhancing the structure of the recognition network.

Knowledge-based approaches to map interpretation (den Hartog et al., 1996; Joseph and Pridmore, 1992; Ilg, 1990; Arias et al., 1993; Deseilligny et al., 1993), instead, aim at a higher level of information extraction by means of search or grouping rules. Even though what knowledge means in document analysis has not been completely clarified, these approaches try to translate the way a human operator acts into a suitable decision strategy. However, since no general knowledge for the map interpretation task exists, the method has been applied to a limited set of problems, generally where drawing rules are already defined (see Boatto et al., 1992; Arias et al., 1993, for instance). No attempt has been made, as far as the authors know, to apply knowledge-based analysis to discontinuous chains of symbols.

However, the most recent knowledge-based approaches in this area have reached results very similar to ours. In den Hartog et al. (1996), for instance, the idea of building a general environment driven by a set of rules defined by the particular problem is presented: with this approach, a top-down extraction procedure is governed by a bottom-up knowledge representation. The method can be considered very similar to what we introduce in our work: a general search procedure defined for discontinuous chains drives a low-level extraction and classification module aware of the particular line to track.

From this very short overview, we observe that, even if fully automated conversion of digitized images has been studied for a long time, some interesting problems are still open. The task is far from having a satisfactory and general solution: there seems to be a lack of a general environment easily adaptable to different discontinuous lines of symbols.

3. Overall system architecture

As outlined in the introduction, discontinuous symbol-chain tracking can be decomposed into finding the occurrences of a particular symbol and then grouping them in a consistent way. Therefore, the architecture of the extraction algorithm proposed here mirrors the main tasks of the problem. Fig. 1 presents the blocks of the overall system: we isolate three modules, each interacting with the others:

1. The Successor Strategy Generation (SSG) module: it is responsible for the strategy followed to scan the pixels in the image while searching for symbols. It stops when it has found a number $N_s$ of possible successors, corresponding to the different possibilities for continuing the tracking process.
2. The Symbol Detection (SD) module: it decides whether or not there is an elementary symbol in the image region currently selected.
3. The Cost Function Evaluation (CFE) module: it defines the quality of every tracking alternative by associating a cost with each possible choice. The minimum cost suggests the direction to take.

However, this three-stage architecture alone is not fully functional for the problem of discontinuous symbol-chain tracking. In fact, the work performed by these modules has to be carefully driven in order to recover from possible erroneous paths.

Fig. 1. The program flow of the symbol-chain extracting algorithm, showing the main modules of the system.
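The interplay of the three modules in a single tracking step can be sketched as follows. This is only a minimal illustration in Python, not the authors' implementation: the callables `propose_regions`, `detect_symbol` and `evaluate_cost` are hypothetical stand-ins for the SSG, SD and CFE modules, and the `Candidate` record is an assumption introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class Candidate:
    """A possible successor of the current symbol (hypothetical record)."""
    position: Point
    confidence: float  # quality reported by the symbol detector
    cost: float = 0.0  # filled in by the cost function


def tracking_step(
    current: Point,
    propose_regions: Callable[[Point], Iterable[Point]],              # stands in for SSG
    detect_symbol: Callable[[Point], Optional[Tuple[Point, float]]],  # stands in for SD
    evaluate_cost: Callable[[Point, Point], float],                   # stands in for CFE
    n_successors: int,
) -> List[Candidate]:
    """Collect up to n_successors candidates and rank them by cost."""
    candidates: List[Candidate] = []
    for region in propose_regions(current):
        found = detect_symbol(region)
        if found is None:
            continue  # no elementary symbol in this region
        position, confidence = found
        candidates.append(Candidate(position, confidence))
        if len(candidates) == n_successors:
            break  # the scan stops once N_s successors are found
    for candidate in candidates:
        candidate.cost = evaluate_cost(current, candidate.position)
    # The minimum-cost candidate comes first and suggests the direction to take.
    return sorted(candidates, key=lambda c: c.cost)
```

The sketch returns the whole ranked list rather than only the winner, on the assumption that a supervising procedure may want to keep the unused successors for later recovery from an erroneous path.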


Therefore, an artificial intelligence kernel drives and controls the process.

3.1. The kernel

The kernel is based on the A* algorithm (Ritch and Knight, 1991; Ginsberg, 1993), since this approach can be easily adapted to our problem. The logical structure created by A* to store information about the symbol search is a tree, where each node represents a detected symbol: the children of a given node are its possible successors. This structure is implemented using two lists, called the O- (open) and C- (closed) lists. The first one contains all the terminal nodes, that is, the set of points from which it is possible to restart exploration; the other one contains all the internal nodes of the tree, that is, the nodes for which the candidate successors have already been computed. By means of this second list it is possible to reconstruct the chain from the current symbol back to the starting symbol.

However, even if the A* algorithm in its classical formulation (Ritch and Knight, 1991) allows backtracking up to the starting point, in our application backtracking can be limited (and memory requirements reduced), since paths very "far" from the current one are very likely useless. In other words, only open paths whose distance $L$ from the current node is less than a maximum backtracking length $L_b$ may be re-examined. Moreover, we introduce in the CFE the possibility of negative costs (since rewarding the best path with a cost decrement reduces the chance that it has to be re-analyzed). Although this seems to violate the lower cost bound estimate requirement, the limit on the backtracking length allows these negative costs to be chosen wisely without any inconvenience. On the other hand, the larger the absolute value of these negative costs, the less reliable and more complex backtracking over longer paths becomes. Therefore, our choice was to establish first the length limit, and then define the rules (an example is presented in Section 3.4) to compute the path cost.
These rules are chosen by requiring that the algorithm works well in some key situations, like bifurcations, gaps, and so on.
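The open-list search with a bounded backtracking length described above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' code: the function name `track_chain` and its callable arguments are hypothetical, the distance $L$ is approximated by the difference in tree depth between the current node and an open node, and the closed list is represented implicitly by the `parent` map that stores the tree links.

```python
import heapq
from typing import Callable, Dict, Hashable, Iterable, List, Optional


def track_chain(
    start: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    step_cost: Callable[[Hashable, Hashable], float],
    is_end: Callable[[Hashable], bool],
    max_backtrack: int,
) -> Optional[List[Hashable]]:
    """Minimum-cost tree search with a bounded backtracking length."""
    # O-list: terminal nodes still open for exploration, ordered by path cost.
    # Entries are (cost, insertion order, depth, node); the order breaks ties.
    open_list = [(0.0, 0, 0, start)]
    # Tree links (child -> parent), used to reconstruct the chain.
    parent: Dict[Hashable, Optional[Hashable]] = {start: None}
    current_depth = 0
    pushed = 1
    while open_list:
        cost, _, depth, node = heapq.heappop(open_list)
        if current_depth - depth > max_backtrack:
            continue  # too far behind the current node: never re-examined
        current_depth = depth
        if is_end(node):
            chain = []
            while node is not None:  # walk back to the starting symbol
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for child in successors(node):
            if child in parent:
                continue  # already part of the tree
            parent[child] = node
            # step_cost may be negative to reward a promising continuation.
            new_cost = cost + step_cost(node, child)
            heapq.heappush(open_list, (new_cost, pushed, depth + 1, child))
            pushed += 1
    return None  # no admissible path left: an operator would have to step in
```

With a generous `max_backtrack` the search can return to an early alternative after a rewarded but dead-end branch; with a small value that alternative is dropped, which is the trade-off between memory and recoverability discussed in the text.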

Following the above-mentioned guidelines, the kernel drives the search starting from a region supplied by the operator, where the root of the search tree is to be found. It activates the search strategy generation module, which scans the region around the current symbol, calling the SD subroutine step by step. According to the search strategy generation settings, $N_s$ possible successors are found: assuming no errors in the tracking routine we could search for only one successor, but since this is very unlikely, we need more candidates to improve reliability. They are stored in the O-list, and the CFE module is called to compute the cost from the current node to each successor: the lowest-cost node is chosen as the winner. With the new current symbol, all these steps are repeated until the end of the search is reached, either when no more regions to be scanned exist or when an inconsistency requiring a human operator arises. In our experience, this happens only in a few, very particular situations. Inconsistencies are detected on the basis of equal-cost alternatives or by detecting ambiguities in the search algorithm by means of a control index, defined as

$$l(h) = \frac{n_i - n_{i-h}}{N_s \, (L_i - L_{i-h})}, \qquad (1)$$

where $n_i$ is the total number of nodes at the $i$th step and $L_i$ is the maximum depth of the tree. If the tracking goes on without ambiguity, at each step $N_s$ nodes are added to the tree and the depth increases by 1, so $l(h)$ remains invariant for a given $h$. On the contrary, if ambiguous patterns are found, the nodes are still added but the depth does not grow, leading to larger values of $l(h)$. If $l(h)$ grows moderately, the current region is marked and highlighted to invite the operator to make an interactive correction. If $l(h)$ becomes too large, the algorithm is facing an unsolvable ambiguity and the exploration stops.

Finally, at the end of the automated tracking and before the last interactive correction, an automatic post-processing phase is introduced to correct the most evident misclassifications. Isolated symbols are eliminated and missing symbols are introduced by interpolation. However, this step affects only a few percent of all the extracted symbols.

3.2. The SD module

The SD module is responsible for extracting the symbols from the context and has been implemented as a chain of low level operators. It is controlled by the search strategy generation module, which supplies the subimage to be analyzed. If the symbol-chain has a complex structure with different elementary symbols concatenated, this module is also responsible for changing the type of symbol currently searched for, in order to respect the proper sequence. After each step, if a symbol is found, the SD exits with a confidence number that indicates the quality of the detected symbol (a sort of fuzzy membership). This value gives a first estimate for the choice among the possible alternatives. The module is divided into two blocks:

1. SD. It works on the grey levels or colors of the region of interest in the image, and produces a binary output mask. This operation is strongly related to the quality of the image. When working with a scanned topographic map, for example, the global quality depends on the hard copy, on the scanning procedure applied to the original map and also on the density of informative elements. The procedures range from simple grey level thresholding, to contrast control by grey histogram stretching, to more complex algorithms such as suitable noise removal, filtering and adaptive segmentation techniques. An open-structured library including the above-mentioned pre-processing and segmentation procedures (Pavlidis, 1982; Ballard and Brown, 1982; Beveridge et al., 1989) has been implemented.
2. Symbol identification. If a symbol is detected, the symbol identifier is called. It compares some predetermined features with those of the allowed symbols and classifies the symbol accordingly.

3.3. The search strategy generation module

The search strategy generation module is based on some perceptual grouping rules introduced by the psychologists of the so-called Gestalt school (Wertheimer, 1938, 1958). They explain how a

human observer immediately identifies symbols and structures in an image. The idea is that a single element does not allow the underlying overall structure to be perceived, but a suitable grouping does. In particular, when searching for a discontinuous symbol-chain, attention must be paid to the continuity of the chain, and to the collinearity and periodicity of the symbols that form it. These perceptual grouping rules (Lowe, 1985) are used to explore the possible correlations between symbols and to choose the right one. Therefore, they have been implemented inside the search strategy generation module, in order to ease the handling of two search steps:

1. the definition of the possible search directions for the successor of a given symbol;
2. the choice, if more successors are found, of the most probable one.

As for the first step, the task is to perform a suitable analysis of the image around the current starting point. This is done by adopting three Gestalt principles: continuity, collinearity and periodicity.

Continuity is introduced into the scan strategy by means of the $d_s$ parameter (depth search): $d_s$ is chosen by looking at the typical distance between symbols and at the typical map to analyze. If the chain does not have gaps, a good depth search may be three or four times the typical distance between symbols (since a few missing symbols can be tolerated). If there are gaps, a typical gap size $g_t$ and a maximum gap size $g_m$ must be initially guessed by the operator and are then automatically updated by the system during the search. The depth search might be a value slightly greater than $g_m$, to get over all the gaps, or a value near $g_t$, to get automatically over the most frequent gaps. A large value for $d_s$ can be used if we want to stress the importance of continuity (individual symbols can be missed, but without losing the line).

Collinearity is instead translated into our scan strategy by means of the $c_s$ parameter (search corner). Indeed, since topographic lines are fundamentally composed of long rectilinear or low curvature parts, a small search corner should be wide enough to include almost all the line courses. On the other hand, since line curvature is unpredictable, abrupt direction changes are possible, so that a wide search corner would be required to avoid losing any line. The natural idea is to choose a search region of variable amplitude, depending on the search depth reached. We set a small search corner for the first part of the exploration to track the most frequent situation (linear or low curvature); however, if after the typical distance between symbols (periodicity) we have not reached a reliable continuation of the line, the corner is enlarged to handle a direction change and/or a gap. In other words, we look for collinearity as long as no large gap is found: the wider the gap, the more important proximity concepts become.

Finally, periodicity is translated into our scan strategy by means of the $r_s$ parameter (grid dimension): it refers to the sampling grid used when the image is scanned, and it is linked with the typical dimension of the symbol. The search strategy generation module examines only the points of the image that are cross-points of a square grid with a spacing of $r_s$ pixels.

As a final note, we must observe that the settings for all these parameters must be carefully chosen: the trade-off is between fast tracking (corner and depth as small as possible) and reliable tracking (corner and depth big enough to include all the possible line shapes).

3.4. The CFE module

The second step where Gestalt principles have been introduced is the final choice of the successor and its grouping with the symbol-chain. This is done by the CFE module, which assigns a cost to each exploration step, embodying the knowledge about the symbol-chain structure from a global point of view, and looking at the relationships among elementary symbols without considering how these symbols have been detected. As seen above, the A* algorithm defines the whole search tree. Therefore, the CFE estimates the quality of each branch in this tree (with the aim of exploring first the most promising alternative) by exploiting the A* cost function $f'$:

$$f' = g + h', \qquad (2)$$

where the $g$ function weights the cost of the part already tracked, while $h'$ is an estimate of the cost from the current to the goal point. As we can make no assumptions on the end of the search, we set $h' = 0$ (giving a so-called minimum cost search).

The cost function $g$ needs to be carefully defined. As already observed, it takes care of the continuity, collinearity and periodicity of the symbol chain, as well as the proximity among different subsections of the line. As an example, we report here some of the Gestalt rules in a pseudo-code translation, where $d_1, d_2, d_3, a_1, a_2, a_3$ are defined in Fig. 2, $d_t$ is the typical distance between symbols and $c_t$ is the typical curvature of the line.

1. To reward collinearity and periodicity in sequences with low curvature and regular spacing:
   · if $d_1 = d_t$ and $a_1 = c_t$ then $f'$ is decreased;
   · else $f' = f' + 5 d_1 + 2 a_1$.
2. To reward proximity at the beginning of a sequence after a gap:
   · if $d_1 = d_3 = d_t$ and $d_2 > 3 d_t$ then $f'$ is again decreased (but more than in the previous case).
3. To penalize anti-collinearity in abrupt direction changes:
   · if $a_1 > 60$ then $f'$ is strongly increased;
   · if $a_1 + a_2 + a_3 > 170$ then $f'$ is set to a high value, to discard the path.

Fig. 2. The graphic definition of the cost function evaluation $a_i$ and $d_i$ parameters.

We note again that in $f'$ negative costs are introduced: this could be a problem for the original A* algorithm. However, since we constrain the backtracking capability to a limited value $L_b$ (as observed in Section 3), the negative

costs can be chosen with respect to this length so that no problem arises.

Finally, it should be clear from the above-mentioned rules that the search is affected by the quality of the chain extracted, in the sense that less aligned or oddly spaced symbols correspond to paths with higher search costs, and are therefore less preferable than straight, regular chains. No link is instead established with the quality of each extracted symbol (i.e. how similar it is to the actual symbol we are looking for), because this is too closely related to the quality of the scanned map to be simply generalized.

4. Experimental results

To test the program and the algorithm, a first trial was carried out on a small portion of some grey-scanned IGMI (Istituto Geografico Militare Italiano) maps. The parameters defined in this training step were then used to scan the entire maps, which are very useful as a testing ground, since they are full of lines and symbols and show many disturbing elements (for instance, shades to render altitude). Moreover, these maps are also interesting because differences in symbol characteristics may occur. Indeed (see Fig. 3):

· differences from symbol to symbol are already present in the original drawings;
· noise is introduced by the digitizing process;
· for different (but possibly very near) places on a drawing the same grey level value sometimes identifies the background and sometimes the foreground: even for a single symbol the background may differ considerably from one part to another, since shades may appear both gradually and suddenly;
· symbols may touch or intersect other lines, letters and other symbols of similar or different grey level values.

Fig. 3. A typical part of a topographic map, full of symbols, lines and text.

Many geographical symbol-chains are depicted on these maps: in particular, we tested the program on dotted and dashed lines.

Dotted lines (representing municipal bounds) are composed of sequences of slightly spaced black dots, aligned in very short as well as quite long sequences. The distance between subsequent dots varies in the map, and large gaps are frequently present. Moreover, when a gap occurs, collinearity may not be useful to continue the chain tracking, since the line could continue with a completely different orientation. The curvature of dotted lines is unpredictable too: long rectilinear parts, but also high curvature or cusps, can be found. Other problems are the fact that a discretized circle looks the same as any similar squared region, the background and foreground variability, and the intersecting lines with similar grey level values.

Dashed lines, instead, represent regional bounds; generally, they are more regular than the above-mentioned dotted lines, but abrupt direction changes are still possible. Moreover, dashes may sometimes be folded to follow the course of the line, but generally their length is nearly the same all over the scanned image.

To deal with these symbol-chains, it is necessary not only to properly set the search strategy generation and CFE modules, exploiting the Gestalt rules to detect and group these symbols (as observed in Section 3.4), but also to tune the SD module settings. Since SD is performed by a sort of hierarchical classification scheme (different valuable features are computed, starting from the simpler (and less time-consuming) ones and moving to the more complex and specific ones), this step was carried out by training these procedures on parts of one of the maps to be interpreted.

After this training step, the algorithm was applied to the whole map: Figs. 4 and 5 show some results of the tracking process for dotted lines, while Fig. 6 presents those for dashed lines in other parts of the same map. Both Figs. 4 and 5 present situations where the usual line-following algorithms fail: a large corner (i.e. a sudden, notable change in tracking direction), and a gap in the dot chain. The problems are overcome by the capability of our algorithm to adapt its search step and angle to the environment: in the middle images of both figures the tracked shaded area shows that the program lets its parameters relax in order to find a possible way to keep tracking. Moreover, Fig. 4 stresses the capability of our algorithm to correctly track the dot chain even in the presence of many disturbing elements (the other lines, the text, the shades). A few (exactly two) dots have been missed, but the chain continues in the right direction. Fig. 5, instead, is also an example of the importance of Gestalt rules in the search strategy: the isolated house near the character 'O', although initially recognized as a dot, was discarded since the collinearity of the other dot chain block was rewarded more than the proximity of this false positive. Finally, Fig. 6(b) is a typical example of a situation where the backtracking capabilities of the A* kernel are useful: after the bifurcations, the right chain is followed, but information about the left chain is maintained and used when it can be fully exploited.

Fig. 4. Experimental results of extraction of dotted lines from IGMI maps: (a) the tracking process starts on the original image (search areas are shaded); (b) the symbol chain has a large corner; (c) the tracking algorithm succeeds in following the chain by widening the search angle.

Fig. 5. More dotted lines from IGMI maps: (a) the tracking process starts on the original image (search areas are shaded); (b) the symbol chain has a large gap; nevertheless, (c) the chain is tracked.

Fig. 6. Two experimental results of extraction of dashed lines from IGMI maps: (a) dashed line with simple bifurcation; (b) dashed line with more complex crossings.

To conclude this section, a closer look at the results is interesting, to compare the overall behavior of the algorithm with how it works in a given zone: the discussion is useful to highlight the advantages as well as the weak aspects of the method. From a global point of view, the results are very satisfactory: for the dotted lines, the percentage of dots correctly chained is 95% over the whole test set. The symbol-chain extractor program also worked well in the detection of dashed lines from the same IGMI maps. However, due to the folded dashes, the line tracking process was a little less efficient, with 90% correctly chained symbols.


From a local point of view, instead, considering the behavior of the symbol-chain extractor in different zones of the maps, we found that it varies noticeably from place to place: this is due to the presence of pictorial shades and other elements, originally introduced in the maps to improve their readability (for instance, to render the altitude). In the parts of the map where this problem is present, the misclassification rate becomes higher and the percentage of correctly chained symbols may drop to 75%. On the other hand, there are regions where the lines are tracked with a 100% precision rate.

5. Discussion

The program realized to test the algorithm has proved useful in the extraction of discontinuous symbol-chains from digitized images, provided that a suitable training step to define a number of parameters has been carried out. Automatic recognition in every situation is still a distant goal, but most of the situations encountered in the extraction process have been handled in an efficient and robust way. We say efficient, since a small number of rules has been used to manage a variety of situations; and we say robust, since the system works well even in the presence of noise or disturbing context, as discussed before.

Errors can occur when the searched symbols are missed or when other, different symbols are misclassified: we found that, extracting one symbol chain at a time, the second error source is less important than the first, since local misclassifications are corrected by the overall search strategy in almost every case. However, when an unrecoverable error happens, it is generally due to the SD module: only in a limited set of situations do the search strategy generation and CFE modules get the wrong result.
In particular, we found that SD limitations are due to the following reasons:
· poor matching of the symbols;
· presence of objects very similar to the searched symbol (for instance, isolated houses instead of dots, or road blocks instead of dashes);
while search strategy failures are related to
· complex bifurcations without a leading direction to be followed;
· gap length much longer than the average value found up to that moment.
Note that these last cases are also problematic for a human operator, which is why we consider the behavior of the test program satisfactory.

As for the time required by the search, the average tracking time $\langle t \rangle$ is about 1 s per symbol on a 486 (33 MHz) PC. This time is directly related to the SD module analysis and to the search strategy generation time required to find the successor of the current symbol. We found that a generic formula can be defined,

$$ t = K_t\, g_t^2 + t_a + t_o, \qquad (3) $$

where we observe that, when dealing with gaps, $t$ increases with the square of the gap length. The time $t_a$ depends on the shape of the symbol defined in the SD module, and the constant $K_t$ varies according to the strategy that the Search Strategy Generation module adopts (on the basis of the characteristics of the symbol chain) to bridge the gaps in the lines. The last term, the overhead time $t_o$, accounts for the normal tracking strategy time and varies from zone to zone, since the chain features are not uniform over the tested images.

Finally, we want to stress that the most important feature of our algorithm is that the perceptual grouping concepts introduced in the program make it possible to scan the digitized test images without a global pre-processing step (such as a segmentation (den Hartog et al., 1996)), working on a limited subset of the starting image. This yields a significant saving of memory as well as of CPU time, since the useless parts of the image are never analyzed: only regions containing symbols of interest are studied.
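The quadratic dependence of the tracking time on the gap length in Eq. (3) can be illustrated with a minimal sketch. The function name `tracking_time`, its argument names, and the numerical values below are hypothetical and chosen only for illustration in arbitrary time units; they are not parameters measured in the experiments.

```python
def tracking_time(gap_length, k_t, t_a, t_o):
    """Per-symbol tracking time with the form of Eq. (3).

    gap_length : length of the gap to be bridged (assumed non-negative)
    k_t        : constant tied to the gap-bridging search strategy
    t_a        : time tied to the symbol shape analysed by the SD module
    t_o        : overhead of the normal tracking strategy
    All values are illustrative assumptions, in arbitrary units.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    # The gap term grows with the square of the gap length.
    return k_t * gap_length ** 2 + t_a + t_o


if __name__ == "__main__":
    # Doubling the gap length quadruples only the gap term,
    # while t_a and t_o stay fixed.
    for gap in (0, 2, 4):
        print(gap, tracking_time(gap, k_t=0.5, t_a=2.0, t_o=1.0))
```

With these illustrative values, the gap term goes from 2 to 8 when the gap length doubles from 2 to 4, while the remaining terms contribute a constant 3; this is why long gaps are expected to dominate the search time.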

6. Conclusions

A new algorithm to extract discontinuous lines from generic images has been proposed. Our approach improves on the limited performance of low-level operators by means of a grouping procedure endowed with Gestalt rules based on the topological
connections and on a priori knowledge about the type and geometry of the discontinuous lines. The general extraction environment consists of independent modules, each devoted to a particular task. The knowledge and the perceptual grouping concepts can therefore be easily introduced into the system even for different line styles and image contents.

References

Aoki, Y., Shio, A., Aray, H., Okada, K., 1996. A prototype system for interpreting hand-sketched floor plans. In: 13th IAPR International Conference on Pattern Recognition, Vienna, August 1996, Vol. III, pp. 747–751.
Arias, J.F., Lai, C.P., Chandran, S., Kasturi, R., Chhabra, A., 1993. Interpretation of telephone system manhole drawings. In: Second International Conference on Document Analysis and Recognition, Tsukuba Science City, October 1993, pp. 365–368.
Ballard, D.H., Brown, C.M., 1982. Computer Vision. Prentice-Hall, Englewood Cliffs, NJ.
Beveridge, J.R., Griffith, J., Kohler, R.R., Hanson, A.R., Riseman, E.M., 1989. Segmenting images using localized histograms and region merging. Internat. J. Comput. Vision 2, 311–347.
Boatto, L., Consorti, V., Del Buono, M., Di Zenzo, S., Eramo, V., Esposito, A., Melcarne, F., Meucci, M., Morelli, A., Mosciatti, M., Scarci, S., Tucci, M., 1992. An interpretation system for land register maps. IEEE Computer 25 (7), 25–33.
Caponetti, L., Chiaradia, M.T., Distante, A., Veneziani, M., 1984. A track following algorithm for contour lines in digital binary maps. In: Levialdi, S. (Ed.), Digital Image Analysis. Pitman, London, pp. 149–154.
Che, L.-H., Liao, H.-Y., Wang, J.-Y., Fan, K.-C., Hsieh, C.-C., 1996. An interpretation system for cadastral maps. In: 13th IAPR International Conference on Pattern Recognition, Vienna, August 1996, Vol. III, pp. 711–715.
den Hartog, J.E., ten Kate, T.K., Gerbrands, J.J., 1996. Knowledge-based interpretation of utility maps. Comput. Vision and Image Understanding 63 (1), 105–117.
Deseilligny, M.P., le Men, H., Stamon, G., 1993. Map understanding for GIS data capture: algorithms for road network graph reconstruction. In: Second International Conference on Document Analysis and Recognition, Tsukuba Science City, October 1993, pp. 676–679.

Fisher, J.L., Hinds, S., D'Amato, D., 1990. A rule-based system for document image segmentation. In: Tenth IAPR International Conference on Pattern Recognition, Atlantic City, June 1990, pp. 567–572.
Gamba, P., Lilla, M., Mecocci, A., 1997. Extraction of discontinuous chains of symbols by means of perceptual grouping. In: Proceedings of the 1997 IEEE International Conference on Image Processing (ICIP '97), S. Barbara, CA, 26–29 October 1997, Vol. II, pp. 423–425.
Ginsberg, M., 1993. Essentials of Artificial Intelligence. Morgan Kaufman, S. Mateo, California.
Ilg, M., 1990. Knowledge-based understanding of road maps and other line images. In: Tenth IAPR International Conference on Pattern Recognition, Atlantic City, June 1990, Vol. I.
Joseph, S.H., Pridmore, T.P., 1992. Knowledge-directed interpretation of mechanical engineering drawings. IEEE Trans. Pattern Anal. Machine Intell. 14 (9), 928–940.
Kasturi, R., Bow, S.T., El-Masri, W., Gattiker, J.R., Mokate, U.B., 1990. A system for interpretation of line drawings. IEEE Trans. Pattern Anal. Machine Intell. 12 (10), 978–992.
Lowe, D.G., 1985. Perceptual Organization and Visual Recognition. Kluwer Academic Publishers, Dordrecht.
Pavlidis, T., 1982. Algorithms for Graphic and Image Processing. Computer Science Press, Rockville, Maryland.
Ramel, J.-Y., Vincent, N., Emptoz, H., 1996. Combining local and global vision for technical document understanding. In: 13th IAPR International Conference on Pattern Recognition, Vienna, August 1996, Vol. III, pp. 773–777.
Reither, E., Li, Y., Delle Donne, V., Lalonde, M., Hayne, C., Zhu, C., 1996. A system for efficient and robust map symbol recognition. In: 13th IAPR International Conference on Pattern Recognition, Vienna, August 1996, Vol. III, pp. 783–787.
Ritch, E., Knight, K., 1991. Artificial Intelligence, 2nd ed. International Edition.
Trier, O.D., Taxt, T., Jain, A.K., 1996. Gray scale processing of hydrographic maps. In: 13th IAPR International Conference on Pattern Recognition, Vienna, August 1996, Vol. III, pp. 870–874.
Wertheimer, M., 1938. Laws of organization in perceptual forms. In: Ellis, W. (Ed.), A Source Book of Gestalt Psychology. Harcourt, New York.
Wertheimer, M., 1958. Principles of perceptual organizations. In: Beardsley, D., Wertheimer, M. (Eds.), Readings in Perception. Van Nostrand, Princeton, NJ.
Yamada, H., Yamamoto, K., Hosokawa, K., 1993. Directional mathematical morphology and reformalized Hough transformation for the analysis of topographic maps. IEEE Trans. Patt. Anal. Machine Intell. PAMI-15 (4), 380–387.