JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION
Vol. 8, No. 2, June 1997, pp. 135–145. Article No. VC970338
An Image Compression and Indexing System Using Neural Networks

J. Jiang

Department of Computer Studies, Loughborough University, Loughborough LE11 3TU, United Kingdom

Received April 10, 1996; revised August 29, 1996
A powerful neural network system and its parallel computing implementation are presented in this paper for both image compression and indexing. For image compression, a distortion-equalized fuzzy competitive learning algorithm is developed for direct vector quantization of input images. The output of the neural network is then used to construct histograms for image indexing, in which a weighted counting of codewords scheme is proposed to reflect and classify the image contents. Experiments show that the proposed system achieves substantially improved performance in both image compression and indexing, providing promising potential for further research on real-time indexing of compressed image databases. © 1997 Academic Press
1. INTRODUCTION
Image indexing technology has been successfully used in the management of visual databases, image recognition, and multimedia computing. At present, the retrieval of images from visual databases is handled mainly by content-based and text-search technologies. Content-based technology [1–6] requires features to be constructed from the analysis of shapes, colors, and textures; these serve as indices to access a look-up table and to retrieve a number of candidates that are close to the input query image. Text-search retrieval [5, 6], by contrast, relies on paradigms that store a key-word description of each image's content, created by users on input. Pointers are often adopted to connect the text-search index with the raw images. Retrieving an image then involves a textual search over the user-designed key words to find similar images. The method is often inaccurate and time consuming because the key-word description cannot possibly be accurate for the content of every individual image in the database. Content-based technology is often combined with it to achieve better performance under this circumstance, especially when image samples or object sketches are used as queries to retrieve images from the database. The basic idea of the technique is to extract characteristic features from target images and match them with those of the query. These features are typically derived from shape,
texture, or color properties of both the query and the targets. After matching, the retrieved images are ordered with respect to the query image according to their similarity measures. At present, most image indexing algorithms are designed for noncompressed formats, so that shape detection, texture recognition, and color analysis can be performed to construct various image retrieval systems. Considering that images normally contain a huge amount of data, large visual databases must either reduce the input image size to keep the database manageable or establish correspondingly large memory spaces to accommodate it. One possible solution is to use compression technology. Although the JPEG, MPEG, and H.26x (x = 1, 2, 3) families of standards provide powerful compression performance, problems occur in indexing the compressed images, because no technology is available for feature analysis of images in compressed formats. Therefore, a new research initiative is required to redevelop algorithms for both image compression and indexing. This paper presents recent research work on a weighted counting histogram image indexing algorithm and a neural network design for a complete image compression and indexing system based on vector quantization and VQ indexing techniques [7]. The paper is organized as follows: Section 2 discusses a distortion-equalized fuzzy competitive learning neural network algorithm for direct image compression by vector quantization; Section 3 is devoted to the design of the so-called weighted counting histogram image indexing algorithm based on the usage of vector quantization codewords; Section 4 proposes a structural design and parallel computing implementation of the complete neural network system for both image compression and indexing; finally, experimental results and conclusions are given in Sections 5 and 6, respectively.

2. NEURAL NETWORK ALGORITHM FOR IMAGE COMPRESSION
Neural networks for image compression can be classified into direct and indirect approaches [8]. The former includes back-propagation narrow-channel learning and various learning vector quantizations; the latter is mainly developed to assist conventional techniques such as the DCT, wavelets, and fractals. The general structure of a vector quantization neural network is summarized in Fig. 2.1, where two layers of neurons are constructed, one for input and the other for output. Images are split into blocks of N pixels. Between the input and output layers there are full connections, represented by coupling weights {w_ij : i = 1, 2, ..., N; j = 1, 2, ..., M}. In fact, the codebook of the vector quantization is represented by the coupling weights {w_ij} after the training stage is completed. As the input vector is N-dimensional, the codebook contains M codewords, each corresponding to one output neuron. Thus, the coupling weights can also be described as a set of M vectors, {w_ij : i = 1, 2, ..., N; j = 1, 2, ..., M} = {W_j : j = 1, 2, ..., M}, where each vector W_j is N-dimensional.

FIG. 2.1. Vector quantization neural network.

Image compression in such a neural network proceeds in two stages: training and encoding. At the training stage, a set of input images goes through the network block by block to allow the neural network to learn the best possible coupling weights for representing the codebook. Encoding then begins, using the closest neuron weight to represent each individual block of the input image. The output code therefore consists of the codebook and the labels of all the blocks in the input image. The compression ratio for an image of size X × Y under these circumstances can be expressed as

    CR = (X × Y × B) / (N × M × B + bits for labels of codewords),    (2.1)

where B stands for the number of bits used for each pixel. When a static codebook is used, i.e., the trained codebook is used for encoding all images, the compression ratio for the specific image can be calculated without the codebook term N × M × B in the denominator of the above equation. Without entropy coding such as Huffman or arithmetic coding, the bits for the labels of codewords can be expressed as

    bits for labels of codewords = (X × Y / N) log2 M,

where N is the block size, normally 4 × 4 or 8 × 8. In the image compression community, the bit rate is sometimes used together with the compression ratio to measure compression performance, especially when image transmission is involved. Since the bit rate is normally expressed in bits/pixel, it can be defined as

    bit rate = [(X × Y / N) log2 M] / (X × Y) = (log2 M) / N  bits/pixel.

For images of 512 × 512 vector quantized in blocks of 4 × 4, as an example, the bit rates are 0.44 and 0.37 bits/pixel for codebook sizes of 128 and 64, respectively.
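To make these definitions concrete, the short Python sketch below (our illustration, not part of the paper) evaluates Eq. (2.1) and the bit rate; the argument names mirror the symbols in the text, and the example reproduces the 0.44 bits/pixel figure quoted above.

```python
import math

def vq_compression_stats(X, Y, B, N, M, static_codebook=False):
    """Compression ratio (Eq. 2.1) and bit rate for a VQ-coded
    X x Y image with B bits/pixel, block size N, codebook size M."""
    label_bits = (X * Y / N) * math.log2(M)   # bits for labels of codewords
    codebook_bits = 0 if static_codebook else N * M * B
    cr = (X * Y * B) / (codebook_bits + label_bits)
    bit_rate = math.log2(M) / N               # bits/pixel
    return cr, bit_rate

# 512 x 512 image, 8 bits/pixel, 4 x 4 blocks (N = 16), 128 codewords:
cr, br = vq_compression_stats(512, 512, 8, 16, 128, static_codebook=True)
print(f"CR = {cr:.1f}, bit rate = {br:.2f} bits/pixel")   # CR = 18.3, 0.44
```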
An under-utilization problem exists for competitive-learning neural networks [8–10] in vector quantization: some of the neurons are left out of the training process and never win the competition. Measures such as Kohonen's self-organizing feature map [9], frequency-sensitive learning [10], and fuzzy competitive learning [11] have been adopted to tackle this problem. In this section, we propose a distortion-equalized fuzzy competitive learning algorithm, not only to solve the under-utilization problem but also to improve the image compression performance in terms of the PSNR of the reconstructed images. The originality of the idea is that not only are fuzzy memberships incorporated into the learning process and the updating of each neuron weight, but an extra distortion term obtained from the learning process is also introduced and fuzzified to scale the distance measurement, providing a scheme in which the best possible classification of all the input vectors is learned evenly among the neurons. Under this scheme, each neuron is allocated a distortion V_j(t), j = 1, 2, ..., M, by its learning process, with initial values V_j(0) = 1. The distance between the input vector x_k and the neuron weights W_j is then modified to

    D_kj = d(x_k, W_j(t)) = [V_j(t) / Σ_{l=1}^{M} V_l(t)] ‖x_k − W_j(t)‖².    (2.2)
At each training cycle, finally, a fuzzy membership function e_kj(t) for each neuron is constructed to update both the distortion V_j and the neuron weights:

    V_j(t + 1) = V_j(t) + (e_kj)^m D_kj,    (2.3)

    W_j(t + 1) = W_j(t) + (e_kj)^m α (x_k − W_j(t)),    (2.4)
where m ≥ 1 is a fuzzy index which controls the fuzziness of the whole partition, and

    e_kj = (1/D_kj)^{1/(m−1)} / Σ_{l=1}^{M} (1/D_kl)^{1/(m−1)}
is the fuzzy membership function. Throughout the learning process, the fuzzy idea is incorporated to help optimize the partition of the input vectors as well as to avoid the under-utilization problem. In summary, the proposed algorithm is as follows.

(1) Initialization. (i) Set the required number of competing neurons, M. (ii) Randomly set each neuron's initial N-dimensional weight vector w_ij(0): i = 1, 2, ..., N; j = 1, 2, ..., M. (iii) Set the initial average overall distortion D_0 = ∞. (iv) Set V_j(0) = 1, j = 1, 2, ..., M, where V_j is the amount of distortion mapped to the jth codeword.

(2) Modified distance calculation. Calculate the distance between the input vector x_k and all M competing neurons using the modified squared Euclidean distance given in Eq. (2.2).

(3) Calculate the fuzzy membership function e_kj of each neuron for the input vector.

(4) Update V_j by Eq. (2.3).

(5) Update neuron weights (learning). Each neuron's weight vector W_j is updated by Eq. (2.4), in which t is the iteration number and α is the learning rate. In our experiments, α is set to 0.05.

(6) Comparison of the average distortion values. If (D_{m−1} − D_m)/D_m ≥ ε, where ε ≥ 0 is a threshold value, repeat steps 2–5. Here

    D_m = (1/n) Σ_{k=0}^{n−1} min_j d(x_k, W_j)

is the overall average distortion, computed every time the training of n input vectors is completed.

From the above procedure it can be concluded that two ideas are adopted to improve the neural network's performance in partitioning the input vectors into the final codebook. One is to use a fuzzy membership function to update both the neuron weights and the distortion mapped to each neuron in the learning process; the other is to construct, from the learning process, a distortion for each neuron that modifies the distance measure between neuron weights and input vectors. The fuzzy membership provides a better controlling mechanism to identify those neurons which are related to the input vector and should be updated each time a vector is processed. The same advantage is exploited in determining how much distortion should be mapped to each individual neuron within each basic training cycle. In addition, the distortion plays an important role in adjusting the distance measurement in such a way that learning and updating are evenly distributed among all the neurons as the value of m varies. The optimization of codewords is not affected, because the fuzzy membership specifies the extent to which every neuron learns from the input vector, and the fuzzy distortion mapping is introduced only into the distance computation, where a principle similar to reinforcement-or-punishment is applied.
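As an illustration of steps (1)–(6), the following NumPy sketch implements one possible realization of the training loop, assuming the membership exponent 1/(m − 1) reconstructed above; the function name train_defcl and the small numerical guard are our own additions, not from the paper.

```python
import numpy as np

def train_defcl(vectors, M, m=1.1, alpha=0.05, eps=1e-3, seed=0):
    """Distortion-equalized fuzzy competitive learning: modified
    distances (Eq. 2.2), fuzzy memberships e_kj, and the updates of
    V (Eq. 2.3) and W (Eq. 2.4), with the stop test of step (6)."""
    rng = np.random.default_rng(seed)
    n, N = vectors.shape
    W = rng.random((M, N))          # initial codewords, step (1)(ii)
    V = np.ones(M)                  # distortion per neuron, step (1)(iv)
    D_prev = np.inf                 # D_0 = infinity, step (1)(iii)
    while True:
        total = 0.0
        for x in vectors:
            sq = np.sum((x - W) ** 2, axis=1)       # ||x_k - W_j||^2
            D = (V / V.sum()) * sq                  # Eq. (2.2)
            inv = (1.0 / np.maximum(D, 1e-12)) ** (1.0 / (m - 1.0))
            e = inv / inv.sum()                     # fuzzy memberships
            V = V + (e ** m) * D                    # Eq. (2.3)
            W = W + ((e ** m) * alpha)[:, None] * (x - W)  # Eq. (2.4)
            total += sq.min()                       # winner's distortion
        D_curr = total / n                          # average distortion D_m
        if (D_prev - D_curr) / D_curr < eps:        # stop test, step (6)
            return W                                # trained codebook
        D_prev = D_curr
```

Here vectors would hold the flattened 4 × 4 image blocks, one 16-dimensional row per block.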
3. WEIGHTED COUNTING OF CODEWORDS FOR IMAGE INDEXING

Although vector quantization is a lossy compression process, the nature of the compressed image data is still related to pixels/codewords, which maintain reasonably good patterns and features in comparison with the original image. Therefore, in principle, all the techniques of content-based image retrieval, such as shape-based, color-based, and texture-based methods, can still be applied to image indexing. The main advantage of incorporating image compression is to reduce the storage space while the quality is kept acceptable. This is particularly useful when large-scale visual databases are involved. Around images compressed with vector quantization, two simple schemes have been built for image indexing [7]. One uses a histogram of labels and the other a usage map. The histogram is constructed from the labels which are used to address codewords in the codebook. Basically, the histogram is an M-dimensional feature vector representing the encoded image. For an image f_m, as an example, the histogram of labels can be defined as

    H(f_m) = [N_fm(1), N_fm(2), ..., N_fm(M)]^T,
where N_fm(i) stands for the number of times that label i has been selected for encoding the input image. The second image indexing scheme uses a so-called usage map to measure similarity. The usage map is designed to record the usage of each individual codeword. Like the histogram of labels, the usage map is an M-dimensional feature vector, with each element z_i defined as

    z_i = 1 if codeword i is used, 0 otherwise.    (3.1)
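As a minimal sketch (our illustration; the codeword labels are taken to be 0-based integers), both feature vectors can be derived directly from the sequence of labels produced by the encoder:

```python
import numpy as np

def histogram_of_labels(labels, M):
    """H(f) = [N_f(1), ..., N_f(M)]: how many times each codeword
    label was selected when encoding the image."""
    H = np.zeros(M, dtype=int)
    for k in labels:
        H[k] += 1
    return H

def usage_map(labels, M):
    """z_i = 1 if codeword i is used at least once, else 0 (Eq. 3.1)."""
    return (histogram_of_labels(labels, M) > 0).astype(int)

labels = [3, 3, 7, 1, 3, 7]               # labels of the encoded blocks
print(histogram_of_labels(labels, 8))     # [0 1 0 3 0 0 0 2]
print(usage_map(labels, 8))               # [0 1 0 1 0 0 0 1]
```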
In general terms, both schemes make use of the fact that the usage of codewords reflects the content of the encoded input image. Hence, image retrieval can be accomplished by applying various similarity measures to the query image and the target images. The record of usage in the histogram of labels, however, only keeps the information on the number of times each codeword
has been selected to represent image blocks. The histogram does not carry any information indicating the positions at which the codewords are used in the input image. Since position is another important factor, representing a significant amount of information about the content of the image, it should be taken into consideration to further enhance the indexing performance. A large area of similar background, for instance, might yield a histogram similar to that of an image in which the same background is spread out in small pieces throughout the whole image. Under this circumstance, the indexing accuracy will suffer, since the numbers of times the codewords are used may be roughly equal for both images. Performance enhancement can therefore be expected if the histogram is constructed in such a way that not only the usage of codewords is recorded but the position of each codeword used is also taken into account. Designs of histograms reflecting this idea could be very complicated and might significantly increase the number of operations needed to compare two images. The simplest way of approaching the problem, as a start, is to construct the histogram by considering the consecutiveness of each codeword used; this indirectly reflects the position of each codeword used in representing the whole image. Specifically, we can subject the counting of each codeword to a weighting scheme based on the number of consecutive usages of the same codeword. The counting of an individual codeword is then carried out each time by a general function defined as

    count(codeword) = count(codeword) + x ± weight(x),    (3.2)

where y_1 ≤ weight(x) ≤ y_2 and x stands for the number of consecutive usages. As an example, assume we have two codewords, A and B, labelled k_1 and k_2, and the encoding sequence shown in Fig. 3.1; the total count for codeword A is then obtained as

    count(A) = count(A) + 4 ± weight(4).

FIG. 3.1. Encoding sequence for codewords A and B.

The weighting is therefore applied only to consecutive counts; otherwise it would not be able to reflect the information about the position of each codeword. If the same codeword appears two and then three times at different locations in the image, for instance, the weightings of 2 and 3 have to be applied separately in counting the codeword:

    count(codeword) = count(codeword) + 2 ± weight(2) + 3 ± weight(3).

As can be seen in Eq. (3.2), the weighted counting can be designed as either positive or negative, giving the scheme two options: positive weighting and negative weighting. Positive weighting has the effect that the image blocks represented by the codeword are highlighted or enhanced, since the overall counts are increased; in other words, the role played by this part of the image in determining the similarity is enhanced. On the contrary, negative weighting reduces the significance of those blocks, since their counts are decreased. Many options are available for the specific design of the weighting function. The principle is to map all the images in the database into a one-dimensional space in which any two of them have a reasonable distance, so that each image can be retrieved unambiguously. The specific design involves the values of y_1 and y_2, which also define the maximum range for the consecutive-usage mapping and the weighting function; the consecutive usage x lies in the range [1, (X × Y)/N]. Depending on the statistical analysis of the database, the weighting functions can be designed as linear, nonlinear, or partly linear, for example

    y = y_2 (1 − e^{−a_1(x−1)}),    (3.3)

    y = k x  for x > x_0,  y = 0  otherwise,    (3.4)

    y = y_1 (e^{a_2(x−1)} − 1).    (3.5)

Corresponding to the above equations, Fig. 3.2 illustrates three examples of nonlinear simulation of the weighting function used to scale the counting of the codewords selected in compressing input images. Other models can also be built to simulate the weighted counting of codewords, under the principle that the positions of the codewords used in the image should be reflected as clearly as possible in the scaling process.

FIG. 3.2. Three examples of nonlinear weighting functions.
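The run-based weighted counting can be summarized in a few lines of Python (a sketch under assumed coefficient values, not the paper's implementation): itertools.groupby isolates each run of consecutive identical labels, and the positive flag selects positive or negative weighting.

```python
import math
from itertools import groupby

def weight_model1(x, y2=4.0, a1=0.5):
    """First weighting model, y = y2(1 - exp(-a1(x - 1))), Eq. (3.3);
    the coefficient values here are illustrative only."""
    return y2 * (1.0 - math.exp(-a1 * (x - 1)))

def weighted_histogram(labels, M, weight=weight_model1, positive=True):
    """Weighted counting of codewords, Eq. (3.2): each run of x
    consecutive uses of a label contributes x +/- weight(x)."""
    H = [0.0] * M
    for label, run in groupby(labels):
        x = len(list(run))              # number of consecutive usages
        H[label] += x + weight(x) if positive else x - weight(x)
    return H

# Codeword A (= 0) used 4 times consecutively, then B (= 1) twice:
print(weighted_histogram([0, 0, 0, 0, 1, 1], 2))
# count(A) = 4 + weight(4), count(B) = 2 + weight(2)
```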
4. PARALLEL COMPUTING FOR IMAGE COMPRESSION AND INDEXING
In this section, we present a parallel computing design for the complete neural network system proposed in this paper, implementing both image compression and indexing. The overall parallel computing structure of the neural network is illustrated in Fig. 4.1, which contains three one-dimensional arrays of parallel processing/computing units. These computing units are organized to implement the input, output, and hidden layers of neurons in the neural network. In accordance with the algorithm designs of the previous sections, the system works in three stages, described as follows.

Training. A set of images is divided into blocks, or vectors, of 4 × 4 pixels, and those blocks are sent to the input layer to train the neural network. In this process, the distortion-equalized fuzzy learning algorithm discussed in Section 2 is implemented to optimize the vector quantization codebook. The overall system therefore comprises two parts: one is the neural network for learning vector quantization, and the other is the indexing of images in the visual database. As shown in Fig. 4.1, all the neurons at the hidden layer are designed as a one-dimensional array of computing units, each consisting of two further arithmetic sub-units.
FIG. 4.1. Parallel computing and neural network system.
TABLE 2 Further Experiments for Fuzzy Competitive Learning
FIG. 4.2. Broadcasting operations.
One sub-unit computes the distortion-equalized distance between each neuron weight vector and the input vector; the other updates the neuron weight each time a vector goes through the network. If the whole set of neuron weights, or the codebook, is represented by M vectors as W = {w_ij : i = 1, 2, ..., N; j = 1, 2, ..., M} = {W_j : j = 1, 2, ..., M}, the first sub-unit computes the distances d(x_k, W_j), j = 1, 2, ..., M, described in Eq. (2.2). Each distance calculated is then broadcast from its neuron at the hidden layer to all the neurons at the output layer, as shown in Fig. 4.2. This collective operation enables the neurons at the output layer to perform the competition and select the closest codeword for the input vector. The second arithmetic sub-unit at the hidden layer completes the updating of the distortion V_j mapped to each neuron and of its weight, as explained in Section 2. At the output layer, the competition and the computation of the membership functions {e_kj, j = 1, 2, ..., M} are primarily implemented at the training stage, where the competition also serves to calculate the overall average distortion D_m that controls when training should be terminated. The outputs are fed back to the hidden layer to update both the distortions V_j and the neuron weights W_j.

Encoding and storing. After the neural network is trained, the codebook is ready for compressing input images. As the vectors pass through the neural network, the closest codeword is selected on a competitive basis to represent each block of the input image, and its label is sent to the visual database for storage. The compression ratio is defined as the number of bits contained in the original image divided by the number of bits used to represent the codebook and the individual labels. The codebook is applied to compress all the images in the same database once the training stage is completed; the content of the database therefore includes one codebook for the entire database and the labels for each individual image.

Indexing. As the input image is encoded by the neural network, the neurons of the output layer not only select the best possible codeword for each input vector but also count, by Eq. (3.2), the number of times each neuron weight is selected on a competitive basis. The various weighting schemes given in Eqs. (3.3)–(3.5) need to be considered in designing the counting operations inside each neuron at this layer. A practical implementation of the weighted counting operations can build a look-up table from those equations, since counts not far from each other often produce very close weights; in this way, the whole operation can be accelerated. As an example, the first neuron counts the number of times that the first neuron weight, W_1, wins the competition. Each time a run of repeated counting stops, the counter addresses the look-up table and produces an appropriate weighting value to obtain the total count. Hence, at the end of the encoding, the histogram of labels for the input image is available at the output layer as an M-dimensional vector, H_L = [H_1, H_2, ..., H_M], where H_j represents the number of times that the jth neuron won the competition throughout the whole encoding process. As seen in Fig. 4.1, the M-dimensional histogram H_L is used directly to address the visual database.
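A possible realization of this look-up-table acceleration (our sketch, using the first weighting model with illustrative coefficients) precomputes weight(x) for every admissible run length, so each output neuron obtains its weighting by a single table access instead of evaluating an exponential:

```python
import math

def build_weight_lut(max_run, y2=4.0, a1=0.5):
    """Precompute weight(x) of Eq. (3.3) for x = 1 .. max_run; entry 0
    is unused padding so the run length indexes the table directly."""
    return [0.0] + [y2 * (1.0 - math.exp(-a1 * (x - 1)))
                    for x in range(1, max_run + 1)]

# For 512 x 512 images in 4 x 4 blocks, a run is at most
# (512 * 512) / 16 = 16384 blocks long:
lut = build_weight_lut(16384)
x = 4                          # a run of 4 consecutive wins just ended
count_increment = x + lut[x]   # positive weighting looked up, not computed
```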
TABLE 1 Experimental Results for Image Compression (m = 1.1)
TABLE 3 Further Experiments for the Proposed Algorithm
The histogram H_L is the feature vector of the input image and can be used as the image index [7]. The similarity between two images f_m and f_n can be measured using the L2 metric:

    d(f_m, f_n) = Σ_{j=1}^{M} (H_fm(j) − H_fn(j))².
The system is also suitable for implementing image indexing by usage maps [7]. In this case, the neurons at the output layer record the usage of each codeword instead of counting the number of times that the codeword is selected. The usage function for the output neurons is given in Eq. (3.1). The histogram becomes H_L = [z_1, z_2, ..., z_M], and the similarity measure between two images f_m and f_n can be designed as

    d_2(f_m, f_n) = Σ_{j=1}^{M} H_fm(j) ⊕ H_fn(j),

where ⊕ denotes the exclusive-OR operation.
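Reading the operator in the last equation as the exclusive OR of the binary usage-map entries, both similarity measures reduce to a few lines (a sketch of ours, not the paper's implementation):

```python
import numpy as np

def l2_similarity(Hm, Hn):
    """L2 metric between two histograms of labels:
    d(fm, fn) = sum_j (H_fm(j) - H_fn(j))^2."""
    return float(np.sum((np.asarray(Hm) - np.asarray(Hn)) ** 2))

def usage_map_similarity(Zm, Zn):
    """XOR-based measure between two binary usage maps, i.e. the
    Hamming distance: d2(fm, fn) = sum_j z_fm(j) XOR z_fn(j)."""
    return int(np.sum(np.bitwise_xor(np.asarray(Zm), np.asarray(Zn))))

print(l2_similarity([3, 0, 2], [1, 1, 2]))          # 2^2 + 1^2 + 0 = 5
print(usage_map_similarity([1, 0, 1], [1, 1, 0]))   # 0 + 1 + 1 = 2
```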
5. EXPERIMENTS
To assess the performance of the proposed algorithms and the neural network system, two sets of experiments are designed. One uses the distortion-equalized fuzzy learning algorithm to compress image samples directly,
without any preprocessing; in other words, no comparisons are made with neural networks that employ preprocessing stages to reduce redundancy before VQ is applied, such as predictive VQ neural networks [12], DCT-based VQ [13], and wavelet VQ neural networks [14]. The other set of experiments indexes image databases using the proposed algorithm, in comparison with other VQ-based indexing schemes [7]. Since [7] has shown that VQ-based indexing outperforms pixel-based indexing, our main concern here is whether the proposed algorithm provides better performance than the VQ-based scheme. A number of competitive-learning neural networks have been designed for vector quantization, such as basic competitive learning [9], frequency-sensitive learning [10], Kohonen's self-organizing feature map [9], and fuzzy competitive learning [11]. Since Ref. [11] has shown that the fuzzy competitive learning neural network achieves VQ performance superior to that of the frequency-sensitive and Kohonen feature-map neural networks, we choose the fuzzy competitive learning algorithm as the baseline against which to assess our proposed algorithm. Hence, in the first set of experiments, we use the proposed algorithm to compress four image samples, Lena, Jet, Peppers, and Baboon, of which only Lena is inside the training set. All the image samples are of the same size, 512 × 512. The experimental results, compared with those of the other two algorithms, are summarized in Table 1, where the figures given are PSNRs in dB. The compression performance is kept the same for all the algorithms, the number of neurons being 128. It can be seen from Table 1 that the proposed algorithm consistently outperforms the other algorithms tested. In the comparison with fuzzy competitive learning, the fuzziness index is set to m = 1.1. To extend the comparison further, we also tried other values, as shown in Tables 2 and 3, where the fuzziness index varies from 1.1 to 1.5. The experimental results presented in Tables 2 and 3 clearly show that the proposed algorithm overwhelmingly
FIG. 5.1. Reconstructed images by neural networks.
TABLE 4 Coefficient Values
provides better quality for reconstructed images in terms of PSNR values when variable fuzziness is applied. The flavor of the comparison is also shown in Fig. 5.1, where the image sample Lena is singled out for a visual comparison of the reconstructed image qualities produced by the algorithms tested.

For the second set of experiments, we designed three nonlinear weighting functions, corresponding to Eqs. (3.3), (3.4), and (3.5), based on the proposed indexing scheme described in Section 3, to retrieve images from a simulated visual database of about 500 images. A linearly decreasing effect is added to the third model, based on Eq. (3.5), to reduce the rate of decrease beyond a certain value of the consecutive counts. This slightly modifies the equation for the model to

    y = y_1 (e^{a(x−1)} − 1)  for x ≤ b,
    y = k (x − b) + y_1 (e^{a(b−1)} − 1)  for x > b.
The values of all the coefficients for the three models are given in Table 4. Ideally, experiments should be carried out to assess the indexing performance with all the coefficients varying inside a specified range. Owing to time limits, unfortunately, we were unable to complete all these experiments and report the best coefficient values at this stage. Hence, the coefficient values we used are obtained purely from the principle that the weighted values should reflect the position information of each codeword inside the input image as much as possible. Taking the first model as an example, the coefficient y_2 controls the degree of highlighting of the repeated blocks. If y_2 is chosen too large, so-called over-highlighting occurs, which prevents other blocks from being considered in the retrieval; if y_2 is chosen too small, the highlighting is insufficient. The coefficient a controls the position of the highlight for the repeated blocks: the highlighting effect moves to lower counts as a increases and to higher counts as a decreases.

Using histograms of labels, we tested the algorithm against the other VQ-based scheme [7]. The complete neural network system designed in Fig. 4.1 is basically the same, except for the later indexing part, where different histograms are constructed. The overall experimental results are given in Table 5, where the rank is defined as the order of a similar image among the retrieved ones. Specifically, for each query image we retrieved 12 images and arranged them in ascending order of their similarity measurements; the rank value is then the position of the similar image. If the similar image is not among the first 12 images, a further 58 images are retrieved to determine its rank, and an exhaustive search has to be performed to determine rank values greater than 60. An example of four retrieved images is shown in Fig. 5.2. The figures in Table 5 are success rates in percent. From this table it can be seen that the proposed indexing scheme outperforms the VQ-based one [7] for all three models in terms of the success rate at the first rank; the performance at all other ranks is also competitive.
6. CONCLUSIONS
In this paper, we have proposed a novel neural network design and its parallel computing implementation for two algorithms, one for image compression and the other for image indexing. The following conclusions can be drawn to summarize the work presented.

The neural network design also covers the specific
TABLE 5 Retrieval Rates for Image Indexing
FIG. 5.2. Image indexing examples.
parallel computation of each individual operation needed to implement the proposed algorithms across the different layers. As hardware designs for basic competitive-learning VQ neural networks are already available [15], the neural network is ready to be extended into a real-time image compression and indexing system, as outlined in Fig. 4.1.
The experiments carried out support the claim that the proposed system provides improved performance for both image compression and indexing compared with existing technology in this area.

For the indexing scheme, we have covered only a few nonlinear weighted countings of VQ labels for constructing the histograms. Similar linearly weighted counting models can
also be further established. Further work is required to analyze all the available weighting functions and to obtain guidelines, for various visual databases and image indexing schemes, on the best possible coefficient values and models.

The proposed indexing scheme in the compressed domain has the options of positive and negative weighting. As positive weighting increases the overall counts, it is best used to highlight those consecutive image blocks that should be recognized in the indexing process; over-highlighting, however, prevents other important blocks from being taken into account. Coefficients are therefore selected to adjust the range of blocks to be appropriately highlighted. Long runs of consecutive blocks do not need highlighting, simply because their counts are already large enough to carry weight in the similarity calculation. Negative weighting is normally used to reduce the dominance of blocks with too many consecutive counts; such blocks remain recognizable, since their counts are never reduced to zero. In addition, high counts should be decreased more, on the basis that the decreased counts add distance with respect to other codewords and so sharpen the distinction. Consider a standard 512 × 512 image: the total number of 4 × 4 blocks is (512 × 512)/(4 × 4) = 16,384. If one block repeats itself 2000 times and its count is reduced to 100 by the negative weighting, for instance, the 1900 counts removed leave the distinction to other blocks when a target image happens to have 100 counts for the same block. For other blocks with low counts, the negative weighting should be designed so that little reduction is made, to avoid affecting their distances from the target images.

One possible argument against the compressed-image indexing scheme is that the retrieved images do not keep the same quality as their originals; the scheme appears to trade reduced image quality for memory savings. In fact, compressed-image indexing works on the same principle as lossy image compression: the scheme can be applied only in situations where a reduction of image quality is tolerable. In addition, various measures can be adopted to enhance the image quality, such as preprocessing techniques, including DCT and wavelet transforms, or predictive coding. For applications in which any loss of information is unacceptable, lossless compression techniques [16, 17] can be incorporated to develop new compressed-image indexing algorithms. This could well be another important direction for further research. Finally, further research around lossy image databases can also be considered, to upgrade the compression performance in line with the JPEG and MPEG standards and other existing technologies. This will certainly involve the analysis of image contents after various transforms such as the DCT, fractals, and wavelets.

ACKNOWLEDGMENT

The author thanks Darren Butler for his contribution to part of the experiments reported in this paper.
REFERENCES

1. M. Flickner, H. Sawhney, W. Niblack, et al., Query by image and video content: The QBIC system, Computer 28, No. 9, 1995, 23–32.
2. E. G. M. Petrakis and S. C. Orphanoudakis, Methodology for the representation, indexing and retrieval of images by content, Image Vision Comput. 11, No. 8, 1993, 504–521.
3. A. Califano and R. Mohan, Multidimensional indexing for recognizing visual shapes, IEEE Trans. Pattern Anal. Mach. Intell. 16, No. 4, 1994, 373–392.
4. F. Stein and G. Medioni, Structural indexing: Efficient 2-D object recognition, IEEE Trans. Pattern Anal. Mach. Intell. 14, No. 12, 1992, 1198–1204.
5. R. Srihari, Automatic indexing and content-based retrieval of captioned images, Computer 28, No. 9, 1995, 49–56.
6. V. E. Ogle and M. Stonebraker, Chabot: Retrieval from a relational database of images, Computer 28, No. 9, 1995, 40–48.
7. F. Idris and S. Panchanathan, Algorithms for the indexing of compressed images, in Proceedings, Visual '96: International Conference on Visual Systems, Victoria, Australia, February 5–6, 1996, pp. 303–308.
8. J. Jiang, Neural network technology for image compression, in Proceedings, IBC '95: International Broadcasting Convention, Amsterdam, September 14–18, 1995, pp. 250–256.
9. S. C. Ahalt, A. K. Krishnamurthy, et al., Competitive learning algorithms for vector quantization, Neural Networks 3, 1990, 277–290.
10. A. K. Krishnamurthy et al., Neural networks for vector quantization of speech and images, IEEE J. Selected Areas Commun. 8, 1990, 1449–1457.
11. F. L. Chung and T. Lee, Fuzzy competitive learning, Neural Networks 7, No. 3, 1994, 539–551.
12. N. Mohsenian and N. M. Nasrabadi, Predictive vector quantization using a neural network, in Proceedings, ICASSP-93, Vol. 5, Minneapolis, April 27–30, 1993, pp. 245–248.
13. C. H. Hsieh, DCT-based codebook design for vector quantization of images, IEEE Trans. Circuits Systems Video Technol. 2, No. 4, 1992, 401–409.
14. T. Denk, K. Parhi, and V. Cherkassky, Combining neural networks and the wavelet transform for image compression, in Proceedings, ICASSP-93, Vol. I, 1993, pp. 637–640.
15. W. C. Fang et al., A VLSI neural processor for image data compression using self-organizing networks, IEEE Trans. Neural Networks 3, No. 3, 1992, 506–519.
16. P. G. Howard and J. S. Vitter, New methods for lossless image compression using arithmetic coding, Inform. Process. Manage. 28, No. 6, 1992, 765–779.
17. S. Takamura and M. Takagi, Lossless image compression with lossy image using adaptive prediction and arithmetic coding, in Proceedings, DCC '94: Data Compression Conference, 1994, pp. 166–175.
JIANMIN JIANG received his B.Sc. from Shandong Mining Institute, China, in 1982, his M.Sc. by research from the China University of Mining & Technology in 1985, and his Ph.D. from the University of Nottingham in 1994. He joined Loughborough University as a visiting scholar and later moved to the University of Nottingham as a research assistant funded by SERC in the UK. He completed the research work for his Ph.D. in 1992 and joined Bolton Institute as a lecturer in electronics. In Bolton, he established a strong research group, funded by EPSRC in the UK and by industry, working on data compression, image compression, genetic algorithms, and neural networks. He has recently moved to Loughborough University, again as a lecturer, working on parallel computing algorithms and data compression.