Journal of Network and Computer Applications (1997) 20, 171–190
Distributed fax message processing system

S. C. Hui∗, K. Y. Chan∗ and G. Y. Qian†

∗Division of Software Systems, School of Applied Science, Nanyang Technological University, Nanyang Avenue, Singapore 639798
†Institute of Systems Science, National University of Singapore, Heng Mui Keng Terrace, Kent Ridge, Singapore 119597

In this paper, a distributed fax message processing system named AutoFax is described. The AutoFax system provides capabilities for recognizing, storing and managing fax messages automatically in a distributed environment. Fax message images are interpreted and routed to the appropriate receiver’s mailbox. In addition, the contents of each fax message are recognized and stored according to their logical and layout formats in order to reduce the amount of storage required. The proposed system is based on a client–server architecture. The server is responsible for recognizing and storing fax messages. The client provides an interface for the user to compose, manage and send fax messages.

© 1997 Academic Press Limited
1. Introduction

Facsimile or fax has become increasingly important as a communications tool in business, industry and even the personal environment. The fax system combines the functions of the telephone and the photocopier. Fax messages are usually transmitted in CCITT Group 3 format via telephone lines. One way to handle fax messages automatically in the office environment is to integrate fax systems with personal computers (PCs). Several commercial PC fax systems, such as InfoFax, FaxNet, FaxWorks and Object Fax, have been constructed and marketed. These systems connect PCs directly to the other party’s fax devices through the dial-up telephone line. Users are able to send and receive faxes directly from their PCs. Documents sent using PC fax systems are clearer and sharper because the image is generated directly for transmission, without the intervening scanning of a paper document and its associated noise. Finally, there is no need to photocopy each fax received for distribution.

However, in these PC fax systems, the fax messages are captured as images. The cost of storing these fax images is an important consideration in the procurement process for these systems. An uncompressed A4-size fax image occupies about 1 megabyte. With conventional compression techniques the storage requirement is reduced to about 40 to 80 Kbytes, depending on the content of the fax image. However, fax volumes can be very high and this may still render the system prohibitively expensive to use. In addition, these PC fax systems are mainly operated in a single-user environment. This means that all the fax messages are received and saved in files within the system. As more messages are received, the size and number of files also increase, and it becomes increasingly difficult to locate specific fax messages just by browsing through the files.

E-mail: {asschui, askychan}@ntuvax.ntu.ac.sg; E-mail: [email protected].
In order to meet the challenge of today’s fax communication environment, a system known as AutoFax has been developed for analysing, recognizing and managing fax messages. The AutoFax system transforms incoming fax message images into computer-readable form automatically. It captures fax messages from telephone lines, converts them from CCITT Group 3 compressed image form into a standard image format such as TIFF (tag image file format), and finally recognizes and stores them for later retrieval. The system interprets each fax message by determining where it comes from, who the sender and receiver are, what the subject of the message is, and so on, and routes the message to the appropriate receiver’s mailbox in a distributed environment. In addition, it converts the contents of the body of fax messages into corresponding ASCII text and image data, and stores them according to their logical and layout formats. In this way, the amount of storage required for each fax message can be reduced considerably.

This research is directed towards producing an intelligent fax message processing system to be used in a distributed environment. In this paper, we focus on describing the development of the AutoFax system. The remainder of this paper is organized as follows. Section 2 introduces the concepts of document image processing. In Section 3, the system architecture of AutoFax is described. Section 4 describes the fax message analysis process, focusing on address block identification and address data analysis. Section 5 presents the user interface. Section 6 discusses the implementation and gives some experimental results. Conclusions and future work are given in Section 7.
2. Document image processing

Fax message processing can be classified under the area of document image processing (DIP) [1], which is defined as the process of transforming information presented on paper documents into an equivalent symbolic representation accessible to any kind of computer information processing. The goal of DIP is to provide a simple way to recover the high-level semantic description (logical structure) of a document for later retrieval. As shown in Fig. 1, DIP consists of three major processes [2]: layout extraction, logical analysis and text recognition.

The layout extraction process extracts a hierarchy of layout objects such as text and graphics blocks, lines and words based on their presentation on the page. The document image is first captured and preprocessed through skew correction [3–5] and noise removal [6]. It is then subjected to block segmentation, which divides the document image into rectangular blocks. Different techniques for block segmentation have been proposed, such as the run-length smoothing algorithm (RLSA) [3,7], projection profile cuts [8], Hough transform [3,9], crossing counts [10], and others [11,12]. During block classification, the blocks determined by the segmentation process are classified into one of two categories, text or non-text (image or graphics), according to physical information such as size and density. Text blocks can be further segmented into lines and words.

In logical analysis, the layout objects and their compositions are geometrically analysed to identify corresponding logical objects that can be related to a human-perceptible meaning of the content. For example, in fax messages, the document structure associates its contents with a hierarchy of logical objects such as sender, recipient, date, body and signature. Different document types usually have different
Figure 1. Document image processing. [Diagram labels: business documents (paper documents: postal letters, business letters, forms, fax messages; electronic documents); fax machine; scanner; document images; document image processing: layout extraction (skew correction, noise removal, block segmentation, block classification), logical analysis, text recognition (character recognition, word recognition); document database.]
formats and different layouts. Their corresponding logical meanings are also very different. Currently, different approaches have been adopted for logical analysis: structure transformation [3], sample matching [14,15] and description language compilation [16–18]. These approaches have been applied to a wide range of document image processing applications including postal automation [19], paper understanding of newspapers and journals [20], and processing of forms [21–23], tables [24,25] and technical papers [26,27]. Postal automation (address recognition) systems process the addresses on letters at very high speed and obtain both the zip code and the city name. Forms processing usually has to handle business forms in large volumes. Paper document understanding generally processes a full-size page, which may include graphics as well as images.
Figure 2. System architecture of AutoFax. [Diagram labels: dial-up telephone line (incoming fax, outgoing fax); fax server: fax manager, fax recognizer, object manager, main storage (fax message database); client: user interface (authoring tool, query tool), local object manager, local storage.]
After layout extraction, a list of text blocks and non-text blocks of the document image is generated. For non-text blocks, compression can be performed before storing them in a document database. Text blocks are usually processed and recognized using optical character recognition (OCR) engines. The objective of text recognition [28,29] is not only to recognize the individual characters of a word, but also to verify that the word is valid. To achieve this, it is necessary to carry out contextual postprocessing, which improves recognition accuracy by using a large dictionary.
3. System architecture

The system architecture of AutoFax is shown in Fig. 2. The system is based on a client–server architecture. A set of personal computers is connected to a fax server by a local area network. A fax modem is attached directly to the fax server. The fax server consists of three specialized components, each performing certain message processing functions. When fax messages come in via the telephone lines, the Fax Manager [30] is responsible for receiving them. The fax messages
Figure 3. An overview of fax message processing. [Diagram labels: fax image; layout extraction: preprocessing, block segmentation, address block identification (address block, other blocks), block classification (TEXT, NONTEXT); knowledge base: address block knowledge, address data knowledge; text recognition: OCR engine; logical analysis: address data analysis; recognized data; fax message database.]
are forwarded to the fax recognizer for fax message processing. The fax messages are then passed to the object manager [31], which is responsible for the management of fax messages. It provides both data modeling and multimedia storage support for the AutoFax system. An object-oriented data model [32] is constructed based on the fax message format.

Each client subsystem has a user interface which consists of authoring and query tools to compose and retrieve fax messages. The authoring tool is used for fax message browsing and composition. The query tool is used to formulate queries for message retrieval. The local object manager functions similarly to the server’s object manager, except that it stores only the fax messages which are currently being edited or created by the client.
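To make the idea of an object-oriented fax message model more concrete, the following Python sketch groups the logical fields named in Section 4.2 (sender, receiver, date, subject, number of pages) with the TEXT and NONTEXT blocks of the message body. It is only an illustration and not the authors' Versant schema: the class names `Party`, `Block` and `FaxMessage`, the `route` helper and the dictionary-of-lists mailbox are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Party:
    """Sender or receiver; telephone and fax numbers are optional."""
    who: str
    tel: Optional[str] = None
    fax: Optional[str] = None


@dataclass
class Block:
    """One body block: recognized text, or raw image data for NONTEXT."""
    kind: str                      # "TEXT" or "NONTEXT"
    content: Union[str, bytes]


@dataclass
class FaxMessage:
    sender: Party
    receiver: Party
    date: Optional[str] = None
    subject: Optional[str] = None  # described as optional in the paper
    pagenum: Optional[int] = None  # described as optional in the paper
    blocks: List[Block] = field(default_factory=list)


def route(message: FaxMessage, mailboxes: Dict[str, List[FaxMessage]]) -> None:
    """Append the message to the mailbox keyed by the receiver's name.

    Keying mailboxes by name is a hypothetical routing rule for this sketch.
    """
    mailboxes.setdefault(message.receiver.who, []).append(message)


mailboxes: Dict[str, List[FaxMessage]] = {}
msg = FaxMessage(
    sender=Party("A. Sender", fax="555-0100"),
    receiver=Party("B. Receiver"),
    subject="quotation",
    blocks=[Block("TEXT", "Please find our quotation attached.")],
)
route(msg, mailboxes)
```

Storing recognized text as strings and only the NONTEXT blocks as image data mirrors the storage-saving argument made in the introduction.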
4. Fax message processing

Fax message processing is carried out by the fax recognizer. An overview of fax message processing is shown in Fig. 3. During layout extraction, the preprocessing operation converts the incoming fax message from CCITT Group 3 compression format into the standard TIFF image format. Skew correction and noise reduction are performed to improve subsequent recognition operations. Block segmentation breaks up each fax image into blocks using the projection profile technique. Address block identification locates and identifies the address block using some a priori address block knowledge about fax messages. Logical information such as sender, receiver and date is assumed to reside within this address block. After the address block identification operation, the fax message consists of two types of blocks: the address block and other blocks. The block classification operation then classifies all other blocks as either TEXT or NONTEXT (i.e. images or graphics).

The address data analysis is performed during logical analysis. It analyses the logical information of the address block using the address data knowledge stored in the system. The address block and the TEXT blocks are further segmented into words, which are recognized with an OCR engine [33] during the text recognition process. The recognized fax data are then stored in the object database [31] together with the NONTEXT blocks according to their logical and layout formats.

4.1 Address block identification

As no fixed fax message format exists, address block identification is not an easy problem. To a large extent, this problem is very similar to address recognition in postal automation [19]. The address block identification processes incoming fax messages, identifies the address block and extracts the sender’s name, receiver’s name, telephone and fax numbers, company name, etc. The address block knowledge is mainly used for identifying the location of the address block. In order to collect this knowledge, we need to consider some common properties of address blocks in fax messages.
These properties are not only effective for distinguishing the address block from other blocks, but can also be obtained in a simple way. To describe these properties, let us consider three sample fax messages with different address block formats shown in Fig. 4. From Fig. 4, it may be noted that the address blocks are usually in the upper half of the fax message cover page. Restricting address block identification to the upper half of the page is therefore effective.

In Fig. 4(a), the address block is bordered by a rectangle. This rectangle can easily be recognized by using the horizontal projection profile. However, two or more rectangles can co-exist in a fax message (as in Fig. 4(a)). According to our study of fax message formats, the address block is usually located within the biggest rectangle (that is, the tallest block). There are also situations in which the rectangle is not the address block (as shown in Fig. 4(b) and (c)).

In Fig. 4(b) and (c), the address blocks are not enclosed by rectangles. However, both fax messages have a special character string located at the center just above the address block: ‘TELEFAX MESSAGE’ in Fig. 4(b) and ‘quotation form’ in Fig. 4(c). In other words, we can locate the beginning of the address block from this special character string. Fortunately, this string can be found in most fax messages. As shown in Fig. 4(b), the end of the address block can easily be located because there is a separation line between the address block and the text body.

Figure 5 gives the address block knowledge. The symbol ::= in Fig. 5 means that the right part of this symbol is the definition of its left part. After preprocessing and
Figure 4. Three sample fax messages.
block segmentation, several blocks are generated as shown in Fig. 6. The address block can be found among the generated blocks using the address block knowledge. In Fig. 5, the content between StartBlock and EndBlock is the AddressBlock, and braces {} indicate its range from the first parameter to the last parameter. The StartBlock is defined as the block under the CenterBlock, as indicated by the symbol ⇓. The CenterBlock is a special character string located at the centre just above the address block. The EndBlock is defined as the block above the body content (StartBody), as indicated by the symbol ⇑. The symbol && means logical ‘AND’ and the symbol ∀ means logical ‘OR’.
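The horizontal projection profile segmentation that produces such blocks can be sketched as follows. This is a minimal illustration in Python rather than the authors' C++ implementation; the function names, the list-of-rows binary image representation and the `min_gap` parameter are assumptions made for the example.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = black pixel, 0 = white pixel


def horizontal_profile(image: Image) -> List[int]:
    """Count the black pixels in every row of the image."""
    return [sum(row) for row in image]


def segment_blocks(image: Image, min_gap: int = 1) -> List[Tuple[int, int]]:
    """Split the image into horizontal blocks.

    A block is a run of rows containing ink; two blocks are separated by at
    least `min_gap` consecutive blank rows. Returns (top, bottom) row
    indices, with bottom inclusive.
    """
    blocks: List[Tuple[int, int]] = []
    top = None        # first row of the block being built, or None
    last_ink = None   # most recent row that contained ink
    for y, count in enumerate(horizontal_profile(image)):
        if count > 0:
            if top is None:
                top = y
            last_ink = y
        elif top is not None and y - last_ink >= min_gap:
            blocks.append((top, last_ink))
            top = None
    if top is not None:
        blocks.append((top, last_ink))
    return blocks


def average_height(blocks: List[Tuple[int, int]]) -> float:
    """Average block height in rows (0.0 if there are no blocks)."""
    if not blocks:
        return 0.0
    return sum(bottom - top + 1 for top, bottom in blocks) / len(blocks)


page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
blocks = segment_blocks(page)
```

Raising `min_gap` merges blocks separated by only a few blank rows, which is one simple way to keep closely spaced text lines together.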
Figure 5. Address block knowledge.
The CenterBlock is defined as a block that starts in the left portion of the image, whose width is less than half of the width of the image, and which is located at the Center position. If the CenterBlock can be found in the fax image, the StartBlock is defined as the block just under the CenterBlock. Otherwise, the CenterBlock is defined as the first block of the fax image; in other words, the StartBlock is defined as the second block, because it is uncommon for the first block to be the address block.
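The CenterBlock test and the StartBlock fallback rule described above can be sketched as follows. This is one possible reading of the definition and not the rule set of Fig. 5: the `Block` class, the function names, the centring `tolerance` and the choice of the first matching block are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    left: int   # x coordinate of the leftmost black pixel
    right: int  # x coordinate of the rightmost black pixel (inclusive)


def is_center_block(block: Block, image_width: int, tolerance: float = 0.1) -> bool:
    """True if the block is narrower than half the image and roughly centred.

    `tolerance` (a fraction of the image width) is an assumed parameter.
    """
    width = block.right - block.left + 1
    if width >= image_width / 2:
        return False
    block_mid = (block.left + block.right) / 2
    image_mid = (image_width - 1) / 2
    return abs(block_mid - image_mid) <= tolerance * image_width


def find_start_block(blocks: List[Block], image_width: int) -> int:
    """Return the 0-based index of the StartBlock.

    The StartBlock is the block just under the first CenterBlock found.
    If no CenterBlock exists, the first block is treated as the CenterBlock,
    so the StartBlock is the second block (index 1). The caller must check
    that the returned index is still inside the upper half of the page.
    """
    for index, block in enumerate(blocks):
        if is_center_block(block, image_width):
            return index + 1
    return 1


with_title = [Block(0, 900), Block(350, 650), Block(50, 400)]
without_title = [Block(0, 900), Block(50, 400)]
```

For a 1000-pixel-wide image, `find_start_block(with_title, 1000)` selects the block under the centred title, while `find_start_block(without_title, 1000)` falls back to the second block.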
Figure 6. Block segmentation for the upper half of the three sample fax messages.
If there exists a block which is a Separator (a horizontal line) or is the Contents of the fax message, then this block is defined as StartBody. The block which is above the StartBody is defined as EndBlock. If the StartBody cannot be found, the EndBlock is defined as the last block of the upper half image. In general, the StartBody can be found in most fax messages. The analysis algorithm for the address block identification is given as follows.
Algorithm Address−Block−Identification:
    Load the address block knowledge;
    Break the upper half image into blocks using the horizontal projection profiles;
    Calculate the average height of the blocks (HL);
    Find the CenterBlock which is a special character string located at the center;
    If (NOT found)
        CenterBlock := 1;
    Endif
    StartBlock := CenterBlock+1;
    Find the StartBody which is a separator or the body of the fax message;
    If (found)
        EndBlock := StartBody−1;
    Else
        EndBlock := the last block of the upper half image;
    Endif
    AddressBlock is between StartBlock and EndBlock;
    Return (AddressBlock)

As shown in the above algorithm, the knowledge about the address block is loaded first. The horizontal projection profile technique is used to segment the image into blocks. The average height of these blocks is then calculated; it is an important parameter for finding special text lines such as the CenterBlock and StartBody. The block segmentation of the upper half page of the three sample fax messages (Fig. 4) using the horizontal projection profile is shown in Fig. 6. As shown in Fig. 6(b), the StartBody can easily be located because there is a separation line between the address block and the text body. If there is no such line, it is difficult to determine where to separate the address block from the text body (as in Fig. 6(c)). In this case, the following situations should be considered in order to distinguish the address block from the text body.

Case (1): the address block has two columns and the text body has only one column [Fig. 6(c)]. We can use the vertical projection profile to determine the number of columns in each block. For example, there are six blocks which have two columns under the special character string in Fig. 6(c). These blocks are obviously part of the address block. The seventh block has only one column (i.e. ‘Date: 06 10 93’), but it is not wide enough to be part of the text body and it lies at the left-hand side of the image, so it should also be considered part of the address block. In general, such a block contains a ‘Date’, ‘Subject’, ‘Hello’ or ‘Dear Sir/Madam’ line. If it is a ‘Date’ or ‘Subject’ line, it should be treated as part of the address block. If it is a ‘Hello’ or ‘Dear Sir/Madam’ line, it does not matter whether it is assigned to the address block or to the text body. The eighth block under the special character string has only one column and is wide enough to be considered the beginning of the text body. That is, in Fig. 6(c), the address block consists of the seven blocks under the special character string.

Case (2): both the address block and the text body have only one column but the width of the address block is less than that of the text body. Assuming that there is no rectangle
enclosing the address block in Fig. 6(a), we can consider the width and the position of each block to determine whether it should be part of the address block.

Case (3): the number of black pixels of the address block is less than that of the text body. Some fax messages have no obvious difference in the number and width of columns between the address block and the text body. In this case, we can calculate the number of black pixels in each block. Blocks which have fewer black pixels are possible candidates for the address block.

There are two situations in which address block identification will fail. The first is when the CenterBlock is the last block of the upper half image, because the address block cannot lie in the lower half of the fax image. The second is when the EndBlock has a lower block number than the StartBlock, that is, it lies above the StartBlock. If the address block cannot be found according to the currently stored knowledge about the address block, the existing knowledge needs to be updated. The knowledge is designed to allow users to add, delete or modify its contents.

4.2 Address data analysis

The purpose of address data analysis is to obtain logical information about the sender (name, address, telephone number, fax number), receiver (name, address, telephone number, fax number), date, number of pages and subject. Some of these are optional (such as subject and number of pages) because only some fax messages contain this information. In general, keywords such as ‘from’, ‘to’, ‘date’ and ‘subject’ are used in most fax messages to indicate the logical information. A keyword-based approach is applied since fax messages have keywords attached to the address block. For example, the receiver information usually comes after the ‘to’ or ‘attention’ keyword and the sender information comes after the ‘from’ keyword.
These properties provide a good way to recognize address data accurately and effectively. Figure 7 gives the definition of the logical information of fax messages. During address data analysis, all capital letters are converted into small letters for processing. In Fig. 7, five logical data items are defined: sender, receiver, date, subject and pagenum (number of pages). Both sender and receiver information consist of three parts: who the sender/receiver is, the telephone number and the fax number. The brackets [] indicate that the telephone number and fax number are optional. The symbol ⇒ indicates the contents that follow the specified parameter. For example, ‘SUBJECT ::= ⇒ (subj ∀ subject)’ defines the contents of ‘SUBJECT’ as whatever follows the keyword ‘subj’ or ‘subject’. The information after the keyword ‘from’, ‘fm’ or ‘fr’ is the name and the address of the sender. In order to find the name and the address of the receiver, the keyword ‘attn’ or ‘attention’ should be located first. If it exists, the name and address of the receiver follow ‘attention’ and ‘to’; otherwise, both the name and address of the receiver follow the keyword ‘to’. In Fig. 7, ‘R−Who ::= {Attn, To}’ means that if ‘Attn’ is found, R−Who consists of ‘Attn’ and ‘To’; otherwise, R−Who contains only the ‘To’.

Three address block formats are displayed in Fig. 8. In order to carry out address data analysis, we need to identify the keywords for each individual logical component. Receiver information is more difficult to identify than the others because it has different
Figure 7. The definition of the logical information of fax messages.
kinds of formats. Some fax messages put the name of the receiver after ‘Attn’ [Fig. 8(b)] and others put the name of the receiver directly after ‘To’ [Fig. 8(a) and (c)]. The keyword ‘attention’ should therefore be searched for first in order to know where to find the name of the receiver. The name of the sender follows ‘from’, and the address of the sender follows the sender name, starting from the next line. The date has several fixed formats such as DD-MM-YY, DD/MM/YY, DDMMYY and DD.MM.YY. In addition, the month can be written in letters, for example 2 Nov 94 and 14 April 1995. The number of pages is usually a numeral, and the telephone number and the fax number consist of digits and dashes.

The success rate of the keyword-based approach depends on the recognition rate of the OCR engine. Even with the most sophisticated OCR technology available today, a recognition rate of 100% cannot be reached. Therefore, it is necessary to add word verification before the analysis of logical data. In the address block of the fax message, the number of keywords is limited. The dictionary-based approach is the most straightforward and effective way to check the OCR output and correct it if necessary. In the dictionary, we store the limited set of keywords including ‘date’, ‘subj’, ‘subject’, ‘page’, ‘pages’, ‘no of pages’, ‘# of pages’, ‘total pages’, ‘fm’, ‘fr’, ‘from’, ‘tel’, ‘telephone’, ‘phone’, ‘phone#’, ‘fax’, ‘fax#’, ‘faxno’, ‘attn’, ‘attention’ and ‘to’. All OCR output that does not match these keywords is considered to be the contents following the keywords. We have used the ScanWorx OCR [33] to recognize the text contents in the address block. ScanWorx returns a tilde (~) for unrecognized and questionable characters. The OCR output from ScanWorx may look like ‘F~om’, ‘~ate’, ‘Su~je~t’, which correspond to ‘From’, ‘Date’
Figure 8. Three address block formats.
and ‘Subject’. We have used the following method to confirm the OCR output against a stored dictionary which contains these limited keywords. There are three steps for word verification in our approach.
• Each word in the dictionary is marked as ‘unmatched’ if any of its characters differs from the recognized character at the same position in the test word. For example, if the test word is ‘f~om’, the unmatched words are those whose first character is not ‘f’, whose third character is not ‘o’ or whose fourth character is not ‘m’. In this step, most words in the dictionary are marked as ‘unmatched’.
• If the number of unmarked words in the dictionary is more than one, the words whose length is not equal to that of the test word are marked as ‘unmatched’.
• If the number of unmarked words is equal to one, the word verification is successful. Otherwise, the test word is not considered to be a keyword.

After word verification, the logical information is recognized item by item. For each logical data item, the first step is to find the keyword defined in Fig. 7 and then extract the contents that follow the keyword. If the keyword cannot be found, the analysis for that data item is unsuccessful. For example, if the keyword ‘Subject’ is found, the contents after that keyword are taken to be the subject of the fax message. Otherwise, the fax message has no subject information. If the main keywords such as ‘From’ and ‘To’ do not exist, the analysis fails.
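The three verification steps and the keyword-based extraction that follows them can be sketched in Python as shown below. This is an illustrative reading of the description, not the authors' implementation. The keyword list is the one given in the text; the tilde as the unknown-character marker follows the description of the ScanWorx output; the function names, the `FIELD_OF` mapping and the assumption that each field appears on one line as ‘keyword: contents’ are hypothetical simplifications.

```python
from typing import Dict, List, Optional, Sequence

KEYWORDS = [
    "date", "subj", "subject", "page", "pages", "no of pages", "# of pages",
    "total pages", "fm", "fr", "from", "tel", "telephone", "phone", "phone#",
    "fax", "fax#", "faxno", "attn", "attention", "to",
]
UNKNOWN = "~"  # marker for unrecognized or questionable characters


def verify_keyword(test_word: str,
                   dictionary: Sequence[str] = KEYWORDS) -> Optional[str]:
    """Return the dictionary keyword matching the OCR output, or None."""
    word = test_word.lower()  # capital letters are converted to small letters
    # Step 1: discard entries that differ at any recognized position.
    candidates = [
        entry for entry in dictionary
        if all(ch == UNKNOWN or (i < len(entry) and entry[i] == ch)
               for i, ch in enumerate(word))
    ]
    # Step 2: if several entries remain, discard those of a different length.
    if len(candidates) > 1:
        candidates = [entry for entry in candidates if len(entry) == len(word)]
    # Step 3: verification succeeds only if exactly one entry is left.
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical mapping from verified keywords to logical data items.
FIELD_OF = {
    "from": "sender", "fm": "sender", "fr": "sender",
    "to": "receiver", "attn": "attention", "attention": "attention",
    "date": "date", "subj": "subject", "subject": "subject",
}


def extract_fields(lines: List[str]) -> Dict[str, str]:
    """Extract the contents following each verified keyword.

    Assumes each line has the form 'keyword: contents' (a simplification).
    """
    fields: Dict[str, str] = {}
    for line in lines:
        head, _, rest = line.partition(":")
        field_name = FIELD_OF.get(verify_keyword(head.strip()))
        if field_name and rest.strip():
            fields.setdefault(field_name, rest.strip())
    return fields


address_block = ["F~om: A. Sender", "To: B. Receiver",
                 "~ate: 2 Nov 94", "Su~je~t: Quotation"]
fields = extract_fields(address_block)
```

The length filter in step 2 is what separates short keywords from longer ones sharing the same prefix: ‘to’ also survives step 1 against ‘total pages’, and only the length comparison leaves a single candidate.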
5. User interface

A user interface was designed to give access to the various fax message management facilities of the AutoFax system. The design objective was to provide an easy-to-use interface that gives fax users flexibility in using the fax facilities. Fax replying and forwarding are also supported through the user interface. The user interface is divided into the following areas:

5.1. Icon menu area

This presents the users with a set of icons displayed on the screen. The users can select any indicated action by simply clicking the mouse on the appropriate icon. The functions supported include the following options:
• Compose: to call the editor to create a new fax message.
• Send: to send the selected fax message.
• Reply: to reply to a fax message.
• Forward: to forward a fax message.
• Extract: to extract the fax messages from the main storage to the local storage.
• Move: to move the selected fax message to another folder.
• Misc: for miscellaneous commands like folder maintenance (e.g. creating, renaming and emptying folders) and overall maintenance (e.g. undoing the last command).
• Trash: to delete the selected fax message.
• Print: to print a selected fax message.
5.2. Folder listing area

This area displays the folders available to the user. New folders can be created by the user as necessary. Three basic folders are initially created by the system for each user. The INBOX folder is used to keep the new fax messages received. The TRASH folder is used to keep the fax messages that were deleted. The DRAFTS folder is used to keep the fax messages created by the user.

5.3. Message listing area

The message listing area is the area where the fax message information is displayed. The fax messages can be read by clicking the displayed fax message header on the screen directly using a mouse.
Figure 9. User interface.
5.4. Message composition and editing area

This consists of the facilities for fax message creation, message sending and message browsing. It includes an authoring tool for the user to create fax messages. The major design consideration for the authoring tool is to provide an easy and natural interface for the users. The authoring tool allows the inclusion of different objects such as images and graphics.

5.5. Query input area

This area is used for formulating a query. Users start by selecting an appropriate attribute field in the attribute area. They can then directly enter and edit the values of that attribute in the query input area. Conditions having multiple values associated with a single attribute are joined together with either an AND or an OR operation. The relational operators NOT, >, <, =, >= and <=, together with parentheses, are also provided for comparison operations between attributes and values. The formulated query is translated into the corresponding database query commands for execution.

Figure 9 shows the user interface of the AutoFax system. In the query input area shown, the user has asked to retrieve all fax messages stored in the DRAFTS folder. They are displayed in the message listing area. Finally, when a fax message is selected, its content is displayed in the message composition and editing area.
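The way conditions on a single attribute are combined in Section 5.5 can be illustrated with a small Python sketch. It evaluates the conditions directly against in-memory records instead of translating them into database query commands as AutoFax does, and it omits the NOT operator and parentheses. The record fields, function names and tuple-based condition format are assumptions made for the example.

```python
import operator
from typing import Any, Callable, Dict, List, Tuple

# Relational operators offered in the query input area (NOT omitted here).
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}

Condition = Tuple[str, Any]  # (relational operator, value)


def matches(message: Dict[str, Any], attribute: str,
            conditions: List[Condition], joiner: str = "AND") -> bool:
    """Test one message against several conditions on a single attribute."""
    results = [OPERATORS[op](message[attribute], value)
               for op, value in conditions]
    return all(results) if joiner == "AND" else any(results)


def run_query(messages: List[Dict[str, Any]], attribute: str,
              conditions: List[Condition],
              joiner: str = "AND") -> List[Dict[str, Any]]:
    """Return the messages that satisfy the combined conditions."""
    return [m for m in messages if matches(m, attribute, conditions, joiner)]


messages = [
    {"folder": "INBOX", "subject": "quotation", "pagenum": 3},
    {"folder": "DRAFTS", "subject": "reply", "pagenum": 1},
    {"folder": "DRAFTS", "subject": "order", "pagenum": 6},
]
drafts = run_query(messages, "folder", [("=", "DRAFTS")])
mid_length = run_query(messages, "pagenum", [(">=", 2), ("<=", 5)])
short_or_long = run_query(messages, "pagenum", [("<", 2), (">", 5)], joiner="OR")
```

The first query corresponds to the Fig. 9 example of retrieving every message in the DRAFTS folder.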
Table 1. Results of address block identification

Address block identification    Number of test fax messages    %
Successful                      53                             88.3
Unsuccessful                    7                              11.7
Total                           60
Figure 10. Fax messages with special format.
6. Experimental results

The AutoFax system was developed using C++ together with the Versant object database system [31] running on a Sun Sparc 2 workstation at Nanyang Technological University. The OCR Engine used is the Xerox Imaging Systems ScanWorX Application Programmer’s Interface Toolkit [33]. The Fax Manager [30] was developed and linked to the AutoFax system.

An experiment was conducted as follows. Sixty fax messages with varying formats were tested. The experiment consisted of two phases. The first phase was the address block identification. For those successful cases, the second phase was then carried out on the address data analysis. At present, between 30 and 60 seconds are required to process each fax message.

The test results of the address block identification are given in Table 1. Some examples of the test fax messages can be found in Qian [32]. The address block identification was
Table 2. Results of address data analysis

Address data analysis    Number of test fax messages    %
Successful               48                             90.6
Unsuccessful             5                              9.4
Total                    53
carried out according to the address block knowledge given in Fig. 5. The address blocks of 53 of the 60 test fax messages were recognized and identified, a success rate of 88.3%. The address block identification failed for the remaining seven test fax messages. There are two main reasons for the failures: (1) poor print quality and (2) special fax message format. An example of the special fax message format is given in Fig. 10(a). In this example, it is difficult for the address block identification process to identify the address block, which should exclude the hospital name and logo. Another example of the special fax message format is given in Fig. 10(b). In this example, there are several separation lines which are confused with the separator defined in the address block knowledge given in Fig. 5. In the address block knowledge, the separation line is considered to be the separator between the address block and the fax body. However, this type of special fax format is not common.

After successful identification of the address block, the address data analysis is applied to analyse the logical information. The test results of the address data analysis for the 53 fax messages whose address blocks were identified successfully in the previous process are given in Table 2. Forty-eight of the 53 fax messages were successfully analysed using the keyword-based approach, giving a success rate of 90.6%. The failures of this process can be attributed to: (1) incorrect recognition results from the OCR engine and (2) the absence of some important keywords. The recognition results depend on the print quality, the OCR engine and the word verification algorithm. Moreover, logical information can also be handwritten, in which case handwritten character recognition is necessary. As the ScanWorx OCR engine used in this research can only recognize printed characters, handwritten fax messages cannot be recognized.
Another reason for unsuccessful address data analysis was incomplete keyword information. These are the shortcomings of the keyword-based approach. To tackle these two cases, the address block can be analysed in the same way as in business letter processing [32] to deduce the receiver and sender information.
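The keyword-based analysis and its two failure modes can be illustrated with a minimal sketch. This is not the AutoFax implementation: the keyword table, the field names and the function `analyse_address_block` are hypothetical, and the sketch assumes that each address line has the form "Keyword: value" after OCR.

```python
import re

# Hypothetical keyword table (an assumption for illustration only);
# the keyword list actually used by AutoFax is not reproduced here.
FIELD_KEYWORDS = {
    "receiver": ("to", "attn", "attention"),
    "sender": ("from",),
    "fax_number": ("fax no", "fax number", "fax"),
    "date": ("date",),
}


def analyse_address_block(lines):
    """Map OCR text lines of an address block to logical fields by keyword."""
    fields = {}
    for line in lines:
        # Split "Keyword: value" at the first colon; lines without a colon
        # carry no keyword and are skipped (failure mode 2: missing keyword).
        keyword, sep, value = line.partition(":")
        if not sep:
            continue
        # Normalise the keyword: lower-case, keep only letters and spaces.
        key = re.sub(r"[^a-z ]", "", keyword.lower()).strip()
        for field, keywords in FIELD_KEYWORDS.items():
            # The first matching line wins for each field.
            if key in keywords and field not in fields:
                fields[field] = value.strip()
                break
    return fields


if __name__ == "__main__":
    # Invented sample data, not taken from the test set in the paper.
    clean = ["To: Dr. Tan", "From: A. Sender", "Fax No.: 7991234"]
    print(analyse_address_block(clean))
    # An OCR misread of "To" as "T0" loses the receiver (failure mode 1).
    print(analyse_address_block(["T0: Dr. Tan", "From: A. Sender"]))
```

In this sketch, a misrecognized keyword or a line without any keyword simply yields no field, which is why a layout-based analysis such as that used for business letters [32] is a natural fallback.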
7. Conclusion and future work
The development of AutoFax shows considerable potential for enhancing, and adding value to, the use of fax for communications. In this paper, we have described the overall process of fax message processing and discussed address block identification and address data analysis. Once the logical data of a fax message has been identified, the message is routed and distributed to the appropriate receiver(s). The recognized data is stored in the Versant object database system for later retrieval. Experimental results on a variety of fax messages have been given to demonstrate that the proposed analysis approach is applicable to most commonly-used fax message formats.
Currently, the AutoFax system is being enhanced to tackle the problem of special fax messages in order to improve the recognition rate. In addition, the system is being extended to support remote query through telephone lines. Users will be able to request reading or re-faxing of their fax messages from a remote site via a telephone. The system will read all the incoming fax messages for that user and deliver the text portion of the fax messages as voice output. If the user wants the fax messages to be re-faxed to a remote site, the system will initiate a send message command automatically and forward the stored messages by fax to the remote user.
References
1. N. Marovac 1992. Document recognition: concepts and implementation. SIGOIS Bulletin, 13 (Dec), 28–38.
2. A. Dengel 1992. ANASTASIL: a system for low-level and high-level geometric analysis of printed documents. Structured Document Image Analysis. Springer–Verlag.
3. D. X. Le, G. R. Thoma and H. Wechsler 1994. Automated page orientation and skew angle detection for binary document images. Pattern Recognition, 27 (Oct), 1325–1344.
4. R. Smith 1995. A simple and efficient skew detection algorithm via text row accumulation. Proceedings of Third International Conference on Document Analysis and Recognition (Montreal, Canada), August, 1134–1148.
5. C. L. Yu, Y. Y. Tang and C. Y. Suen 1995. Document skew detection based on the fractal and least squares method. Proceedings of Third International Conference on Document Analysis and Recognition (Montreal, Canada), August, 1149–1152.
6. J. L. Fisher, S. C. Hinds and D. P. D’Amato 1990. A rule-based system for document image segmentation. 10th International Conference on Pattern Recognition (Atlantic City, NJ), June, 567–572.
7. K. Y. Wong, R. G. Casey and F. M. Wahl 1982. Document analysis system. IBM Journal of Research and Development, 647–656.
8. G. Nagy, S. C. Seth and S. D. Stoddard 1988. Document analysis system with an expert system. Proceedings of the ACM Conference Document Processing System, 169–176.
9. L. A. Fletcher and R. Kasturi 1988. A robust algorithm for text string separation from mixed text/graphics images. IEEE Transactions on PAMI, 10 (Nov), 910–918.
10. T. Akiyama and N. Hagita 1990. Automated entry system for printed documents. Pattern Recognition, 23, 1141–1153.
11. O. T. Akindele and A. Belaid 1993. Page segmentation by segment tracing. Second International Conference on Document Analysis and Recognition (Tokyo), 341–344.
12. H. S. Baird 1994. Background structure in document images. International Journal of Pattern Recognition and Artificial Intelligence, 8, 1013–1030.
13. S. Tsujimoto and H. Asada 1990. Understanding multi-articled documents. Proceedings of the 10th International Conference on Pattern Recognition (Atlantic City, NJ), June, 551–556.
14. X. L. Hao, J. T. L. Wang, M. O. Bieber and P. A. Ng 1993. A tool for classifying office documents. Proceedings of the IEEE International Conference on Tools with AI, 427–434.
15. G. S. D. Farrow, C. S. Xydeas and J. P. Oakley 1995. Model matching in intelligent document understanding. Proceedings of Third International Conference on Document Analysis and Recognition (Montreal, Canada), August, 293–296.
16. J. Higashino, H. Fujisawa, Y. Nakano and M. Ejiri 1986. A knowledge-based segmentation method for document understanding. Proceedings of 8th International Conference on Pattern Recognition (Paris), 745–748.
17. H. Fujisawa and Y. Nakano 1992. A top-down approach to the analysis of document images. Structured Document Image Analysis. Springer–Verlag.
18. C. L. Yu, Y. Y. Tang and C. Y. Suen 1993. Document architecture language (DAL) approach to document processing. Second International Conference on Document Analysis and Recognition (Tokyo), 103–106.
19. P. W. Palumbo, S. N. Srihari, J. Soh, R. Sridhar and V. Demjanenko 1992. Postal address block location in real time. IEEE Computer, 25 (July), 34–42.
20. G. Nagy, S. Seth and M. Viswanathan 1992. A prototype document image analysis system for technical journals. IEEE Computer, 25 (July), 10–22.
21. R. Casey et al. 1993. Intelligent forms processing system. Machine Vision and Applications, 5, 143–155.
22. S. L. Taylor, M. Lipshutz, D. A. Dahl and C. Weir 1993. An intelligent document understanding system. Second International Conference on Document Analysis and Recognition (Tokyo), 107–110.
23. A. Ting, M. K. Leung, S. C. Hui and K. Y. Chan 1995. A syntactic business form classifier. Proceedings of Third International Conference on Document Analysis and Recognition (Montreal, Canada), August, 301–304.
24. T. Watanabe, Q. Luo and N. Sugie 1993. Towards practical document understanding of table-form documents: its framework and knowledge representation. Second International Conference on Document Analysis and Recognition (Tokyo), 510–515.
25. E. Green and M. Krishnamoorthy 1995. Model-based analysis of printed tables. Proceedings of Third International Conference on Document Analysis and Recognition (Montreal, Canada), August, 214–217.
26. M. Okamoto and A. Miyazawa 1992. An experimental implementation of a document recognition system for papers containing mathematical expressions. Structured Document Image Analysis. Springer–Verlag.
27. M. Viswanathan 1992. Analysis of scanned documents - a syntactic approach. Structured Document Image Analysis. Springer–Verlag.
28. C. J. Wells, L. J. Evett, P. E. Whitby and R. J. Whitrow 1990. Fast dictionary look-up for contextual word recognition. Pattern Recognition, 23, 501–508.
29. J. Zhu and J. J. Hull 1994. Image-based word recognition in oriental language document images. Proceedings of the International Conference on Pattern Recognition, 300–304.
30. K. Y. Leong 1995. Unix Fax Server. Final Year Project Report. School of Applied Science, Nanyang Technological University, Singapore.
31. 1992. Versant C++ Application Toolset Manual. VERSANT Object Technology, June.
32. G. Y. Qian 1995. Business Document Image Processing. M.A.Sc. Thesis. School of Applied Science, Nanyang Technological University, Singapore.
33. Xerox Imaging Systems Inc. Xerox Company 1993. ScanWorx API Programmer’s Guide.
Siu Cheung Hui is a senior lecturer in the Division of Software Systems at the Nanyang Technological University, Singapore. He received his B.Sc. degree in Mathematics in 1983 and D. Phil degree in Computer Science in 1987 from the University of Sussex, UK. He worked in IBM China/Hong Kong Corporation as a system engineer/instructor from 1987 to 1990. His current research interests include document retrieval, database systems, object-oriented technology and multimedia systems.
Guo Yu Qian is a software engineer in the Institute of Systems Science, National University of Singapore. He received his B.Sc. degree in Computer Science in 1986 from Shanghai Jiao Tong University, China and M.A.Sc. degree in Computer Science in 1995 from Nanyang Technological University, Singapore. He worked at Shanghai University of Finance and Economics as an assistant lecturer from 1986 to 1993. His research interests are document image processing and handwritten character recognition.
Tony K. Y. Chan, B.E., Ph.D. received both degrees in Electrical Engineering from the University of N.S.W., Sydney, Australia in 1966 and 1972 respectively. His working experience includes technical and management positions with local and multinational electronics and computer companies as well as local startup ventures. In 1990, he joined the Nanyang Technological University, Singapore as senior lecturer in the Division of Computing Systems, School of Applied Science. His research interests are in the fields of computer applications, storage and I/O peripheral devices and parallel computing with Transputers.