Highlights

• A novel local pattern referred to as Local Derivative Radial Pattern (LDRP) is proposed.
• The proposed LDRP is based on the gray-level difference of pixels along a line.
• Multi-level coding in different directions is used instead of binary coding.
• A new similarity measure is presented which is more robust against image rotation.
Local Derivative Radial Patterns: A New Texture Descriptor for Content-Based Image Retrieval

Sadegh Fadaei, Rassoul Amirfattahi, Mohammad Reza Ahmadzadeh

Digital Signal Processing Research Lab., Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran
Abstract

In this paper, we propose a novel local pattern descriptor called Local Derivative Radial Pattern (LDRP) for texture representation in content-based image retrieval. All prior local patterns are based on the gray-level differences of pixels located on a square or circle. Since many actual textures can be represented by the intensity relationship of pixels along a line, these methods have limited ability to represent texture information. Moreover, in prior methods the difference between the referenced pixel and its adjacent pixel is encoded with only two, three or four values, which leads to loss of image information. The proposed LDRP is based on the gray-level differences of pixels along a line and their weighted combinations. In addition, multi-level coding in different directions is used instead of binary coding. The performance of the proposed method is compared with prior methods including local binary pattern (LBP), local ternary pattern (LTP), local derivative pattern (LDP), local tetra pattern (LTrP) and local vector pattern (LVP). The proposed LDRP outperforms all of these methods by at least 3.82% and 5.17% in terms of average precision on the Brodatz and VisTex databases, respectively.

Keywords: Content-based image retrieval (CBIR), texture descriptor, local patterns, local derivative radial pattern (LDRP).

1. Introduction

Today, various photography devices such as digital cameras, web cameras and mobile phones have caused rapid growth in the number of digital images, and consequently voluminous digital image databases have been constructed. Therefore, saving, searching, organizing and managing digital databases have become significant and indispensable tasks. Image retrieval is an important search topic in this field, and it is generally divided into two main groups: text-based image retrieval (TBIR) and content-based image retrieval (CBIR) [1, 2, 3].

TBIR was initially introduced for searching images in the 1970s. In this method, one or more words are allocated manually to describe each image; these words are then used by a database management system to perform the retrieval [4, 5]. TBIR is based on keywords, so its implementation is an easy task. However, TBIR methods have two disadvantages: first, it is hard or sometimes impossible to assign features manually for large databases; second, different individuals may never assign the same words to an image [5, 6].
To solve the problems of TBIR methods, Kato first introduced CBIR in 1992 [7]. The main difference between TBIR and CBIR is that TBIR requires human intervention, while in CBIR the retrieval process is performed automatically. Color, texture and shape are the typical image features used to represent and index an image in CBIR systems [8, 9, 10, 11].

Texture is one of the most important features of an image and has a high potential to discriminate images from each other [12, 13]. In general, texture feature extraction methods are divided into four categories: statistical methods, model-based methods, structural methods and signal processing methods [14, 15]. Statistical methods analyze the spatial distribution of gray-levels and derive a collection of statistical features; the most popular are the co-occurrence matrix, the autocorrelation function, Tamura features and Wold features [16, 17]. Model-based methods attempt to obtain a process that produces the texture: a model is first assumed for the image and its parameters are then estimated such that the image synthesized from the model closely resembles the input image, and the estimated parameters are taken as texture descriptor features. Among the various model-based methods, the Markov random field is the most popular; the simultaneous auto-regressive (SAR) model, an instance of the Markov random field, has been very successful in texture representation [5]. Structural methods describe texture primitives and their placement rules. These methods are useful for textures with large primitives and regular patterns; since most real textures do not have such regular geometries, their applications are restricted [15]. Signal processing techniques have been used extensively for texture feature extraction. In these methods, some special mathematical transforms are applied to the image, and the resulting coefficients are used as image features. Several transforms have been introduced for signal processing methods: the discrete Fourier transform, discrete sine transform, discrete cosine transform, Gabor filters and the wavelet transform [14, 18, 19]. In [14], Farsi and Mohammadzadeh applied the Hadamard transform to the LL component of the image at different levels of decomposition. The wavelet transform was used for texture feature extraction by Chun et al. in [18]: since the V component of the HSV color space often includes texture information, the wavelet transform is applied to this component and its BDIP (block difference of inverse probabilities) and BVLC (block variation of local correlation coefficients) features are extracted as texture features. BDIP and BVLC represent local brightness changes and local texture smoothness, respectively [20].
The remainder of this paper is organized as follows. Section 2 reviews previous related works based on local patterns. The proposed LDRP is explained in detail in Section 3. Experimental results and discussion are presented in Section 4. Finally, conclusions are given in Section 5.

2. Related works
Local binary pattern (LBP) is a local feature based on the gray-level differences between a referenced pixel and its adjacent pixels. It is generated using a weighted combination of the first-order derivatives of the referenced pixel in different directions. Since uniform LBP generates inappropriate patterns for textures with irregular shapes and edges, Liao et al. presented the dominant LBP method [21]. Dominant LBP is robust against image rotation and histogram equalization, and it is also more robust against noise than LBP. Guo et al. introduced the LBP variance descriptor to preserve the global spatial information of the image [22]; later, they developed another LBP descriptor which is more complete than conventional LBP for local texture features [23]. Based on the combination of LBP and the Haar wavelet, Su et al. proposed the structured local binary Haar pattern (SLBHP) [24]. In SLBHP, instead of calculating the magnitude of the difference between the referenced pixel and its neighbors, the polarity of this difference is considered. Ahonen et al. proposed soft histograms for LBP (SLBP), which are more robust against noise than conventional LBP [25]. The fuzzy local binary pattern (FLBP), proposed by Iakovidis et al., is another extension of LBP in which fuzzy logic is used to represent local texture patterns [26].

The histogram of oriented gradients (HOG) uses local information to extract texture features and is based on the distribution of local intensity gradients. Since the appearance and shape of a local object are well characterized by this distribution, the method is appropriate for object detection [27]. The gradient is first calculated, the image is then divided into several blocks, and finally the normalized histogram of each block is used as the feature vector. Since this feature vector is long and demands a large amount of memory, Chandrasekhar et al. developed the compressed HOG method, a variant of HOG with a low bit rate [28]; later, they presented various quantization schemes for compressed HOG [29]. In addition, Wang et al. [30] illustrated that combining LBP and HOG can considerably improve retrieval performance on some databases. Tan et al. [31] showed that LBP features are very sensitive to noise in near-uniform image regions and introduced the local ternary pattern (LTP) method, an extension of LBP.
Since LBP and LTP consider only the first-order derivative, they cannot describe fine image details. Hence, Zhang et al. suggested the local derivative pattern (LDP), which extracts more details of an image by employing second- or higher-order derivatives [32]. For example, the second-order LDP describes the changes of derivative directions between a referenced pixel and its neighbors. As LDP is not robust against rotation, Guo et al. proposed the local directional derivative pattern (LDDP) [33]. Similar to LDP, LDDP is based on second- or higher-order derivatives. In LDP, the nth-order derivatives of the referenced pixel and its adjacent pixels are calculated in each direction separately, and the histogram of patterns in each direction is constructed as features; in LDDP, the nth-order derivative is calculated without considering direction [33]. Since the features extracted by LBP, LTP and LDP are based only on the positive and negative directions of edges, these methods can be improved by differentiating between edges in more than two directions to obtain higher performance. Subrahmanyam et al. proposed the local tetra pattern (LTrP), which introduces four directions for each pixel using horizontal and vertical derivatives [3]. Oberoi et al. [34] showed that the retrieval accuracy of medical images is improved by applying the Fourier transform to LTrP patterns. Due to the high redundancy in LTrP, Fan and Hung proposed the local vector pattern (LVP) [35]. This method produces LVP patterns by calculating derivatives of multiple points in different directions and combining these derivatives. They also attempted to decrease the length of the feature vector by introducing a new algorithm referred to as the comparative space transform (CST). The CST algorithm extracts features with more information by applying a type of dynamic linear decision function that is more robust against noise [35]. Moreover, Dong et al. proposed a texture classification and retrieval method that models adjacent shearlet subband dependences with linear regression [36]; this method addresses the weakness of traditional wavelets on images containing distributed discontinuities. Another modelling approach, the heterogeneous and incrementally generated histogram (HIGH), was proposed in [37] to model wavelet coefficients using four local features in wavelet subbands.

In general, LBP, LTP, LDP, LTrP and LVP are based on the gray-level differences between the referenced pixel and its nearest neighbors; that is, these patterns are defined over various combinations of gray-level differences of pixels located on a square or circle. Since many natural textures can be described by the relationship of pixel intensities along a line, these methods have limited ability to represent texture information. Besides, the difference between the referenced pixel and its adjacent pixels is encoded with only two, three or four values in the aforementioned methods, which may lose much of the image information; more image information is preserved if multi-level coding is used instead of binary coding. In this paper, we propose the LDRP to obviate these drawbacks. LDRP is based on the gray-level differences of pixels along a line and their weighted combinations, so these patterns extract meaningful texture information from the image. In addition, multi-level coding in different directions is used instead of binary coding, which leads to higher precision in image retrieval.
3. Proposed LDRP

3.1. LDRP patterns

As mentioned, LBP and LTP can be considered general definitions of micropatterns which describe the texture without extracting much information from the relationship between adjacent pixels. LDP is derived from LBP in different directions with higher-order derivatives, and LTrP extends LDP from a 1D spatial relationship to 2D. LVP was introduced to improve precision and reduce the redundancy of previous works; it extracts various 2D spatial structures of the image with the help of the CST algorithm. All aforementioned patterns are based on the gray-level differences between the referenced pixel and its neighbors, combined through binary coding of these differences. Because of this binary coding, a great amount of image information is lost; moreover, most of these methods cannot describe radial patterns well. Thus, in this paper we propose LDRP which, unlike the previous patterns, uses radial patterns and multi-level coding instead of rotational patterns and binary coding, respectively.

Here, we propose a set of novel features based on the first-order derivative of the image. To define these features, four directions 0°, 45°, 90° and 135° are considered. As shown in Fig. 1, the location of the nth pixel relative to $g_c$ in direction $\alpha$ is denoted by $g_{\alpha,n}$. The reference pixel $g_c$ is defined as

$g_c = g_{0^\circ,1} = g_{45^\circ,1} = g_{90^\circ,1} = g_{135^\circ,1}$    (1)

For an image $I$ with $k$ gray-levels, if $I(g_c)$ denotes the gray-level of pixel $g_c$, the first-order derivative of $g_c$ along direction $\alpha$ is defined as

$I_\alpha^1(g_c) = I_\alpha^1(g_{\alpha,1}) = I(g_{\alpha,2}) - I(g_{\alpha,1})$    (2)
[Figure 1: Representing referenced pixel $g_c$ and its neighbors in different directions for radial patterns.]

By considering the above equation, four matrices are extracted for each image: $I_{0^\circ}^1$, $I_{45^\circ}^1$, $I_{90^\circ}^1$ and $I_{135^\circ}^1$. Since the image has $k$ gray-levels, two adjacent pixels take values between 0 and $k-1$, which leads to different values of $I_\alpha^1$; hence $I_\alpha^1$ may take $2k-1$ integer values:

$I_\alpha^1 \in \mathbb{Z}, \quad -(k-1) \le I_\alpha^1 \le (k-1)$    (3)
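As a concrete illustration, the following Python sketch (our own, not code from the paper) computes the four first-order derivative matrices of Eq. (2) for an image already quantized to k gray levels; the diagonal offset conventions and the wrap-around border handling are simplifying assumptions of this sketch.

import numpy as np

def first_order_derivatives(img, k):
    # img: 2D array with integer gray levels in [0, k-1]
    img = img.astype(np.int32)
    # (row, col) offset of the next pixel along each direction
    # (the 45/135-degree conventions are our assumption)
    offsets = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}
    derivs = {}
    for alpha, (dr, dc) in offsets.items():
        # bring I(g_{alpha,2}) onto I(g_{alpha,1}) and subtract, Eq. (2)
        shifted = np.roll(img, shift=(-dr, -dc), axis=(0, 1))
        derivs[alpha] = shifted - img   # values in [-(k-1), k-1], Eq. (3)
    return derivs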
In the following, we define a series of features based on these derivatives that form local patterns. We nominate them local derivative radial patterns (LDRP), represented by $LDRP_{P,\alpha}^n$, where $n$, $P$ and $\alpha$ denote the order of the derivative, the number of adjacent pixels in the pattern, and the pattern direction, respectively.

LDRPs based on the first-order derivative of 2 adjacent pixels: To define these features, the first-order derivative of 2 adjacent pixels is used. The local patterns based on the first-order derivative of 2 adjacent pixels in direction $\alpha$ for $g_c$ are defined as

$LDRP_{2,\alpha}^1(g_c) = LDRP_{2,\alpha}^1(g_{\alpha,1}) = I_\alpha^1(g_{\alpha,1}) + k - 1$    (4)
The constant value $k-1$ in (4) is added to make $LDRP_{2,\alpha}^1$ nonnegative, so its range is

$LDRP_{2,\alpha}^1 \in \mathbb{Z}, \quad 0 \le LDRP_{2,\alpha}^1 \le 2(k-1)$    (5)

In other words, according to (5), $LDRP_{2,\alpha}^1$ is always smaller than $2(k-1)+1 = 2k-1$. The histogram of each matrix ($\{LDRP_{2,\alpha}^1 \mid \alpha = 0^\circ, 45^\circ, 90^\circ, 135^\circ\}$) is calculated as

$P(LDRP_{2,\alpha}^1 = h) = \frac{1}{|LDRP_{2,\alpha}^1|} \sum_{j=1}^{N} \sum_{i=1}^{M} f_1\big(LDRP_{2,\alpha}^1(i,j),\, h\big); \quad h \in [0,\, 2(k-1)]$    (6)

where $|LDRP_{2,\alpha}^1|$ denotes the number of elements of $LDRP_{2,\alpha}^1$ and the function $f_1(x,y)$ is defined as

$f_1(x,y) = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{if } x \neq y \end{cases}$    (7)
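Under the same assumptions as the sketch above, the two-pixel codes of Eq. (4) and the normalized histogram of Eq. (6) can be computed as follows (again our illustration, not the authors' code):

import numpy as np

def ldrp2_histogram(deriv_alpha, k):
    # Eq. (4): shift by k-1 so the codes lie in [0, 2(k-1)]
    codes = deriv_alpha + (k - 1)
    # Eq. (6): normalized (2k-1)-bin histogram, P(LDRP^1_{2,alpha} = h)
    hist = np.bincount(codes.ravel(), minlength=2 * k - 1)
    return hist / codes.size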
Fig. 2(a) shows a sample of the Brodatz database and Fig. 2(b) shows the histogram of $LDRP_{2,\alpha}^1$ of Fig. 2(a) for $k = 8$ and $\alpha = 45^\circ$. The number of bins in this histogram is $2k-1 = 2(8)-1 = 15$ and $LDRP_{2,45^\circ}^1$ is smaller than 15.

[Figure 2: (a) A sample of the Brodatz database, (b) histogram of $LDRP_{2,45^\circ}^1$ for the image in (a), (c) weighted image histogram.]
The most important point is that the numbers of states generating the various bins are not the same. For instance, the number of states in which 2 adjacent pixels lead to a derivative value of 0 differs from the number of states which lead to a derivative value of 4. Table 1 shows all states in which 2 adjacent pixels lead to the various derivative values for an image with $k = 5$. Each pair in the row "various states for generating $I_\alpha^1$" of Table 1 gives the gray-levels of 2 adjacent pixels which generate a specific value of $LDRP_{2,\alpha}^1$. According to this table, the number of generating states for $LDRP_{2,\alpha}^1 = h$ is

$s(LDRP_{2,\alpha}^1 = h) = k - |h - k + 1|$    (8)
Table 1: Various states in which 2 adjacent pixels lead to different derivative values for an image with k = 5.

$I_\alpha^1$                         | -4    | -3          | -2                | -1                      | 0                             | 1                       | 2                 | 3           | 4
$LDRP_{2,\alpha}^1$                  | 0     | 1           | 2                 | 3                       | 4                             | 5                       | 6                 | 7           | 8
Various states generating $I_\alpha^1$ | (4,0) | (3,0) (4,1) | (2,0) (3,1) (4,2) | (1,0) (2,1) (3,2) (4,3) | (0,0) (1,1) (2,2) (3,3) (4,4) | (0,1) (1,2) (2,3) (3,4) | (0,2) (1,3) (2,4) | (0,3) (1,4) | (0,4)
Number of states                     | 1     | 2           | 3                 | 4                       | 5                             | 4                       | 3                 | 2           | 1
Thus, the importance of the bins in the histogram of Fig. 2(b) is not equal, and for feature definition based on this histogram an appropriate weight must be assigned to each bin. So we define the features as

$F_{LDRP_{2,\alpha}^1} = \{w_0 P(LDRP_{2,\alpha}^1 = 0),\; w_1 P(LDRP_{2,\alpha}^1 = 1),\; \ldots,\; w_{2k-2} P(LDRP_{2,\alpha}^1 = 2k-2)\}$    (9)
The higher the number of states for a bin (derivative value), the larger that bin of the histogram, as shown in Fig. 2(b): the side bins (far from the center) have smaller values than the central bins. If this histogram is used directly as features, the side bins (large absolute derivatives) contribute little information to the feature vector, so weighting is crucial to compensate for this effect. Based on the above description, a lower weight should be given to lower absolute derivative values (the central bins of the histogram in Fig. 2(b)) and a higher weight to higher absolute derivative values (the side bins). As shown in Fig. 2(b), moving from the sides of the histogram toward its center, the bin values roughly double at each step; that is, the histogram value doubles when the number of states generating the derivative increases by one. We therefore define the weight of bin $h$ as

$w_h = \frac{1}{2^{s(LDRP_{2,\alpha}^1 = h)}} = \frac{1}{2^{k - |h - k + 1|}}$    (10)
Fig. 2(c) shows the weighted histogram of Fig. 2(b). The $h$th element of the feature vector based on LDRPs for 2 adjacent pixels and direction $\alpha$ is defined as

$F_{LDRP_{2,\alpha}^1}(h) = \frac{1}{2^{k - |h - k + 1|}}\, P(LDRP_{2,\alpha}^1 = h - 1); \quad h = 1, 2, \ldots, 2k-1$    (11)

Finally, the LDRP feature vector is constructed by concatenating the feature vectors in the four directions 0°, 45°, 90° and 135°:

$F_{LDRP_2^1} = \{F_{LDRP_{2,0^\circ}^1},\, F_{LDRP_{2,45^\circ}^1},\, F_{LDRP_{2,90^\circ}^1},\, F_{LDRP_{2,135^\circ}^1}\}$    (12)
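Putting Eqs. (10)-(12) together, a minimal sketch (our own) of the full two-pixel first-order feature vector is given below; it assumes the derivative matrices of the earlier sketch.

import numpy as np

def ldrp2_features(derivs, k):
    # derivs: dict mapping each direction (0, 45, 90, 135) to its
    # first-order derivative matrix from Eq. (2)
    h = np.arange(2 * k - 1)
    weights = 1.0 / 2.0 ** (k - np.abs(h - k + 1))      # Eq. (10)
    feats = []
    for alpha in (0, 45, 90, 135):
        codes = derivs[alpha] + (k - 1)                 # Eq. (4)
        hist = np.bincount(codes.ravel(), minlength=2 * k - 1) / codes.size
        feats.append(weights * hist)                    # Eq. (11)
    return np.concatenate(feats)                        # Eq. (12)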
LDRPs based on the first-order derivative of 3 adjacent pixels: These patterns are obtained from the distribution of two first-order derivatives of 2 adjacent pixels. As shown in Fig. 3, the two $LDRP_{2,\alpha}^1$ codes of three adjacent pixels along a specified direction form a pair, and this pair is combined to create the LDRP based on the first-order derivative of 3 adjacent pixels. As noted above, $LDRP_{2,\alpha}^1$ is smaller than $2k-1$, so the vector constructed from each pair of $LDRP_{2,\alpha}^1$ codes is a number in base $2k-1$, denoted $(LDRP_{2,\alpha}^1(g_{\alpha,1}),\, LDRP_{2,\alpha}^1(g_{\alpha,2}))_{(2k-1)}$. Hence, LDRPs based on the first-order derivative of 3 adjacent pixels are defined as

$LDRP_{3,\alpha}^1(g_{\alpha,1}) = (LDRP_{2,\alpha}^1(g_{\alpha,1}),\, LDRP_{2,\alpha}^1(g_{\alpha,2}))_{(2k-1)} = LDRP_{2,\alpha}^1(g_{\alpha,1}) \cdot (2k-1) + LDRP_{2,\alpha}^1(g_{\alpha,2})$    (13)
[Figure 3: Constructing LDRPs based on the first-order derivative of 3 adjacent pixels.]
By using (13), four matrices (one per direction) are obtained in which every element is in the range $[0, (2k-1)^2 - 1]$; in fact, according to (13), each pair of $LDRP_{2,\alpha}^1$ codes is mapped to a number in this range. Therefore, the number of bins in each matrix histogram is $(2k-1) \times (2k-1) = (2k-1)^2$, and the histogram is calculated as

$P(LDRP_{3,\alpha}^1 = h) = \frac{1}{|LDRP_{3,\alpha}^1|} \sum_{j=1}^{N} \sum_{i=1}^{M} f_1\big(LDRP_{3,\alpha}^1(i,j),\, h\big); \quad h \in [0,\, (2k-1)^2 - 1]$    (14)
As discussed above, the weighting process is necessary, since using this histogram without weighting leads to a feature vector with poor results. Similar to (10), each histogram bin is weighted as

$w_h = \frac{1}{2^{s(LDRP_{3,\alpha}^1 = h)}} = \frac{1}{2^{s(LDRP_{2,\alpha}^1 = h_1)}} \cdot \frac{1}{2^{s(LDRP_{2,\alpha}^1 = h_2)}} = \frac{1}{2^{k-|h_1-k+1|}} \cdot \frac{1}{2^{k-|h_2-k+1|}} = \frac{1}{2^{2k-(|h_1-k+1|+|h_2-k+1|)}}; \quad h = (h_1, h_2)_{(2k-1)}$    (15)
The $h$th element of the feature vector based on LDRPs for 3 adjacent pixels and direction $\alpha$ is defined as

$F_{LDRP_{3,\alpha}^1}(h) = \frac{1}{2^{2k-(|h_1-k+1|+|h_2-k+1|)}}\, P(LDRP_{3,\alpha}^1 = h-1); \quad h = 1, 2, \ldots, (2k-1)^2, \quad h = (h_1, h_2)_{(2k-1)}$    (16)

Finally, the LDRP feature vector is constructed by concatenating the feature vectors in the four directions as $F_{LDRP_3^1} = \{F_{LDRP_{3,0^\circ}^1}, F_{LDRP_{3,45^\circ}^1}, F_{LDRP_{3,90^\circ}^1}, F_{LDRP_{3,135^\circ}^1}\}$. Fig. 4 illustrates an example of feature extraction using LDRPs based on the first-order derivative of 2 and 3 adjacent pixels.
LDRPs based on the first-order derivative of P adjacent pixels: In general, these patterns are constructed from the $P-1$ LDRPs of 2 adjacent pixels:

$LDRP_{P,\alpha}^1(g_{\alpha,1}) = (LDRP_{2,\alpha}^1(g_{\alpha,1}),\, LDRP_{2,\alpha}^1(g_{\alpha,2}),\, \ldots,\, LDRP_{2,\alpha}^1(g_{\alpha,P-1}))_{(2k-1)}$
$\quad = LDRP_{2,\alpha}^1(g_{\alpha,1}) \cdot (2k-1)^{P-2} + LDRP_{2,\alpha}^1(g_{\alpha,2}) \cdot (2k-1)^{P-3} + \cdots + LDRP_{2,\alpha}^1(g_{\alpha,P-2}) \cdot (2k-1) + LDRP_{2,\alpha}^1(g_{\alpha,P-1})$
$\quad = \sum_{i=1}^{P-1} LDRP_{2,\alpha}^1(g_{\alpha,i}) \cdot (2k-1)^{P-i-1}$    (17)
[Figure 4: Feature extraction example using LDRPs based on the first-order derivative of 2 and 3 adjacent pixels.]
In the next step, we obtain the histogram of these patterns in the four directions. The number of histogram bins for each direction is $(2k-1)^{P-1}$, and the histogram value in each bin is calculated as

$P(LDRP_{P,\alpha}^1 = h) = \frac{1}{|LDRP_{P,\alpha}^1|} \sum_{j=1}^{N} \sum_{i=1}^{M} f_1\big(LDRP_{P,\alpha}^1(i,j),\, h\big); \quad h \in [0,\, (2k-1)^{P-1} - 1]$    (18)

The weight for each bin of the histogram in (18) is defined as

$w_h = \frac{1}{2^{s(LDRP_{P,\alpha}^1 = h)}} = \prod_{i=1}^{P-1} \frac{1}{2^{s(LDRP_{2,\alpha}^1 = h_i)}} = \prod_{i=1}^{P-1} \frac{1}{2^{k-|h_i-k+1|}}; \quad h = (h_1, h_2, \ldots, h_{P-1})_{(2k-1)}$    (19)
Using the LDRPs and their weights, the $h$th element of the feature vector for $P$ adjacent pixels and direction $\alpha$ is defined as

$F_{LDRP_{P,\alpha}^1}(h) = \prod_{i=1}^{P-1} \frac{1}{2^{k-|h_i-k+1|}} \cdot P(LDRP_{P,\alpha}^1 = h-1); \quad h = 1, 2, \ldots, (2k-1)^{P-1}, \quad h = (h_1, h_2, \ldots, h_{P-1})_{(2k-1)}$    (20)

Finally, the LDRP feature vector is constructed by concatenating the feature vectors in the four directions as $F_{LDRP_P^1} = \{F_{LDRP_{P,0^\circ}^1}, F_{LDRP_{P,45^\circ}^1}, F_{LDRP_{P,90^\circ}^1}, F_{LDRP_{P,135^\circ}^1}\}$, whose dimension is $N_F = 4 \times (2k-1)^{P-1}$.
LDRPs based on the nth-order derivative of P adjacent pixels: These patterns are constructed from combinations of the nth-order derivatives of $P$ adjacent pixels, $P \ge n+1$. Since the nth-order derivative is defined over $n+1$ pixels, LDRPs based on the nth-order derivative of $n+1$ adjacent pixels for $g_c$ are defined as

$LDRP_{n+1,\alpha}^n(g_c) = LDRP_{n+1,\alpha}^n(g_{\alpha,1}) = I_\alpha^n(g_{\alpha,1}) + 2^{n-1}(k-1)$    (21)

The constant term $2^{n-1}(k-1)$ in (21) is added to avoid negative values of $LDRP_{n+1,\alpha}^n$. The range of these patterns is

$LDRP_{n+1,\alpha}^n \in \mathbb{Z}, \quad 0 \le LDRP_{n+1,\alpha}^n \le 2^n(k-1)$    (22)

In other words, based on (22), $LDRP_{n+1,\alpha}^n$ is always smaller than $2^n(k-1)+1$. After forming the patterns for $n+1$ adjacent pixels, the patterns for $P$ pixels are defined as

$LDRP_{P,\alpha}^n(g_{\alpha,1}) = (LDRP_{n+1,\alpha}^n(g_{\alpha,1}),\, LDRP_{n+1,\alpha}^n(g_{\alpha,2}),\, \ldots,\, LDRP_{n+1,\alpha}^n(g_{\alpha,P-n}))_{[2^n(k-1)+1]} = \sum_{i=1}^{P-n} LDRP_{n+1,\alpha}^n(g_{\alpha,i}) \cdot [2^n(k-1)+1]^{P-n-i}$    (23)
The histogram of the patterns in (23) is calculated as

$P(LDRP_{P,\alpha}^n = h) = \frac{1}{|LDRP_{P,\alpha}^n|} \sum_{j=1}^{N} \sum_{i=1}^{M} f_1\big(LDRP_{P,\alpha}^n(i,j),\, h\big); \quad h \in [0,\, (2^n(k-1)+1)^{P-n} - 1]$    (24)

To define the weight of each bin, we first define the number of states generating $LDRP_{n+1,\alpha}^n = h$:

$s(LDRP_{n+1,\alpha}^n = h) = 2^{n-1}(k-1) - |2^{n-1}(k-1) - h + 1|$    (25)
Table 2: Feature vector dimension before and after the dimension reduction process (k = 4).

                                  | P=2 | P=3 | P=4  | P=5  | P=6
$N_F$ before dimension reduction | 28  | 196 | 1372 | 9604 | 67228
$N_F$ after dimension reduction  | 28  | 148 | 700  | 3124 | 13468
Then, the weight for each bin is calculated as

$w_h = \frac{1}{2^{s(LDRP_{P,\alpha}^n = h)}} = \prod_{i=1}^{P-n} \frac{1}{2^{s(LDRP_{n+1,\alpha}^n = h_i)}} = \prod_{i=1}^{P-n} \frac{1}{2^{2^{n-1}(k-1) - |2^{n-1}(k-1) - h_i + 1|}}; \quad h = (h_1, h_2, \ldots, h_{P-n})_{[2^n(k-1)+1]}$    (26)

The $h$th element of the feature vector for $P$ adjacent pixels and direction $\alpha$ is defined as

$F_{LDRP_{P,\alpha}^n}(h) = \prod_{i=1}^{P-n} \frac{1}{2^{2^{n-1}(k-1) - |2^{n-1}(k-1) - h_i + 1|}} \cdot P(LDRP_{P,\alpha}^n = h-1); \quad h = (h_1, h_2, \ldots, h_{P-n})_{[2^n(k-1)+1]}$    (27)

Finally, the LDRP feature vector is constructed by concatenating the feature vectors in the four directions as $F_{LDRP_P^n} = \{F_{LDRP_{P,0^\circ}^n}, F_{LDRP_{P,45^\circ}^n}, F_{LDRP_{P,90^\circ}^n}, F_{LDRP_{P,135^\circ}^n}\}$, with dimension $N_F = 4 \times [2^n(k-1)+1]^{P-n}$.
3.2. Dimension reduction of feature vector

Based on $LDRP_{P,\alpha}^1$, the feature vector dimension of an image with $k$ gray-levels is $N_F = 4 \times (2k-1)^{P-1}$. Table 2 gives the feature vector dimensions for $k = 4$ and $P = 2, 3, 4, 5, 6$. According to Table 2, the dimension increases dramatically as $P$ is raised: each time the number of adjacent pixels $P$ increases by one, the dimension is multiplied by $2k-1$. Considering the definition of the histogram bins, some bins are always empty once the features involve 3 or more adjacent pixels; because of the limited range of gray-level values, some cases (bins) can never occur. As an example, for $k = 4$ and $P = 3$, bin 0 never occurs in $LDRP_{3,\alpha}^1$, i.e. none of the elements of this matrix is 0. The reason is that a bin-0 code would require the pair of adjacent $LDRP_{2,\alpha}^1$ codes to be $(0, 0)$, that is, the pair $(-3, -3)$ in the $I_\alpha^1$ matrix; the gray values of the 3 adjacent pixels would then have to be $(6, 3, 0)$, which is impossible when $k = 4$, so the histogram value of this bin is always zero. For the same reason, for $k = 4$ and $P = 3$, the bins 0, 1, 2, 7, 8, 14, 34, 40, 41, 46, 47 and 48 never occur. These bins should be removed from the histogram, which reduces the feature vector dimension. In general, for features over $P$ adjacent pixels, the histogram value of a bin is zero when the absolute sum of any 2, 3, ..., or $P-1$ adjacent derivatives of that bin is greater than $k-1$. The dimension reduction is performed according to Algorithm 1; a sketch of its feasibility test is given after the listing.

Algorithm 1: Dimension reduction
Input: All features, k, P. Output: New (reduced) features.
1. For x = 0, 1, ..., (2k-1)^(P-1) - 1
2.   Convert x to a number in base (2k-1): x = (k_1 k_2 ... k_(P-1))_(2k-1).
3.   Add -(k-1) to each of k_1, k_2, ..., k_(P-1).
4.   For y = 2, 3, ..., P-1
5.     For z = 1, 2, ..., P-y
6.       Calculate K = |k_z + k_(z+1) + ... + k_(z+y-1)|.
7.       If K > (k-1), remove the (x+1)th, ((2k-1)^(P-1)+x+1)th, (2(2k-1)^(P-1)+x+1)th and (3(2k-1)^(P-1)+x+1)th features (the same bin in each of the four direction blocks) and go to 10.
8.     End for z
9.   End for y
10. End for x
End of the algorithm
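A compact Python sketch of the feasibility test behind Algorithm 1 (our reading of steps 2-7) follows; it can be checked against Table 2.

def valid_bins(k, P):
    # A bin x of one direction's histogram survives only if no run of
    # 2, 3, ..., P-1 consecutive derivative digits has |sum| > k-1.
    # For k = 4, P = 3 this keeps 37 of the 49 bins, i.e. 4 x 37 = 148
    # features over the four directions, matching Table 2.
    base = 2 * k - 1
    keep = []
    for x in range(base ** (P - 1)):
        digits, v = [], x
        for _ in range(P - 1):                  # base-(2k-1) digits of x,
            digits.append(v % base - (k - 1))   # mapped back to derivatives
            v //= base
        digits.reverse()
        if all(abs(sum(digits[z:z + y])) <= k - 1
               for y in range(2, P) for z in range(P - y)):
            keep.append(x)
    return keep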
Table 2 reports the feature vector dimension before and after the dimension reduction process.

3.3. Similarity measure

Suppose that $q$ and $db$ denote the query image and one of the images in the database, respectively, with feature vectors $f_q = (f_{q1}, f_{q2}, \ldots, f_{qN_F})$ and $f_{db} = (f_{db1}, f_{db2}, \ldots, f_{dbN_F})$. The distance between the two images is calculated as [3]

$d_{q,db} = \sum_{i=1}^{N_F} \left| \frac{f_{qi} - f_{dbi}}{1 + f_{qi} + f_{dbi}} \right|$    (28)
If the image is rotated, its features change: rotating the image circularly shifts its features. Therefore, in order to have a similarity measure robust against image rotation, the features of the first image should be compared with the shifted features of the second image. Hence, to calculate the distance between two images, the following equation is defined:

$D_{q,db} = \min_s\; d_{q,db}\big(f_q,\; \mathrm{Circshift}(f_{db}, s)\big)$    (29)

[Figure 5: Comparisons between $d_{q,db}$ and $D_{q,db}$ for an image and its rotations. (a) The original image; rotations of the original image by (b) 45°, (c) 90°, (d) 120°, (e) 135° and (f) 180°. Measured distances: $d_{a,b} = 0.0091$, $d_{a,c} = 0.0077$, $d_{a,d} = 0.0119$, $d_{a,e} = 0.0123$, $d_{a,f} = 0.0102$; $D_{a,b} = 0.0091$, $D_{a,c} = 0.0071$, $D_{a,d} = 0.0118$, $D_{a,e} = 0.0122$, $D_{a,f} = 0.0088$.]
In fact, based on (29), the distance between the two images is obtained for different shifts of the second image, and the minimum value is taken as the final distance. Fig. 5 shows comparisons between $d_{q,db}$ and $D_{q,db}$ for an image and its rotations. Since the angular resolution of the LDRPs is limited (0°, 45°, 90° and 135°), only four shifts are needed: if the image is rotated by 45°, its features are shifted by 1/4 of the feature vector dimension, so each shift is proportional to that dimension. Given the dimension $4 \times [2^n(k-1)+1]^{P-n}$ of the feature vector of LDRPs based on the nth-order derivative of $P$ adjacent pixels, the four shifts are

$s_i = i \cdot \frac{4 \times [2^n(k-1)+1]^{P-n}}{4} = i \cdot [2^n(k-1)+1]^{P-n},$    (30)

where $i = 0, 1, 2, 3$. Finally, according to (29) and (30), the proposed similarity measure is defined as

$D_{q,db} = \min_i\; d_{q,db}\big(f_q,\; \mathrm{Circshift}(f_{db},\; i \cdot [2^n(k-1)+1]^{P-n})\big)$    (31)
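The measure of Eqs. (28)-(31) can be sketched as follows (our illustration):

import numpy as np

def distance(fq, fdb):
    # Eq. (28)
    return np.sum(np.abs((fq - fdb) / (1.0 + fq + fdb)))

def rotation_robust_distance(fq, fdb, n, k, P):
    # Eqs. (30)-(31): one circular shift per 45-degree direction block
    block = (2 ** n * (k - 1) + 1) ** (P - n)
    return min(distance(fq, np.roll(fdb, i * block)) for i in range(4))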
3.4. Proposed CBIR system

Fig. 6 shows the flowchart of the proposed CBIR system. The advantages of LDRP over other local patterns can be summarized as follows:

1. LBP, LTP, LDP, LTrP and LVP encode the image with only two ("0" or "1"), three ("-1", "0" or "1"), two ("0" or "1"), four ("1", "2", "3" or "4") and two ("0" or "1") distinct values, respectively. In the SLBP and FLBP methods, if the absolute intensity difference between two adjacent pixels is smaller than a threshold it is coded to different levels, otherwise to zero or one. In contrast with these methods, the proposed LDRP considers multiple thresholds and encodes the image with arbitrarily many distinct values, which preserves more image information and leads to higher retrieval accuracy.

2. All previous local patterns consider the relationship between the referenced pixel and its nearest neighbors: they are defined over the gray-level differences of pixels on a square or circle, while LDRP is constructed from the gray-level differences along a line and the combination of their weights. Since many actual textures can be represented by the relationship of pixel intensities along a line, the proposed LDRP describes textures better and extracts more meaningful information from the image.

3. Because LDRP is based on the gray-level differences of pixels along a line, it can represent more varied and complex patterns than the other descriptors; in particular, its multi-level coding gives a powerful definition of complex patterns. It may be concluded that LDRP is more comprehensive than LBP, SLBP, FLBP, LTP, LDP, LTrP and LVP.

4. The proposed similarity measure is more robust against image rotation than those of the other methods.

4. Experimental results

In order to evaluate the performance of the proposed method, various experiments have been conducted on the Brodatz [38] and VisTex [39] databases. The experiments were run on a machine with a 3.3 GHz Core i5 CPU and 8 GB of RAM under the Windows 7 operating system. We denote the number of images in a database by $N_T$ and the number of images in a category by $N_C$.
[Figure 6: The flowchart of the proposed CBIR system: the query and database images are converted to gray-scale and reduced to k levels, LDRP patterns are computed in the four directions (0°, 45°, 90°, 135°), weighted histograms are formed and their bins reduced to build the feature vector, and similarity measurement produces the retrieval results.]
In each experiment, an image in the database is given to the system as the query, the distance between the query image and each image in the database is calculated using (31), and the $m$ images with the minimum distances are retrieved. Finally, the precision and recall for each query are computed; averaging over all images in the database yields the total average precision and recall, as sketched below.
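A minimal sketch of this evaluation protocol (the function name and array layout are our assumptions) is:

import numpy as np

def precision_recall(dists, query_cat, cats, m, NC):
    # dists: Eq. (31) distances from the query to all NT database images
    # cats:  category label of each database image
    top = np.argsort(dists)[:m]                  # m nearest images
    relevant = int(np.sum(cats[top] == query_cat))
    return relevant / m, relevant / NC           # precision, recall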
4.1. Brodatz database

The Brodatz database, which consists of 112 gray-scale images of 640 × 640 pixels, has been used to evaluate texture-based CBIR systems in many studies [3, 21, 40, 41, 42]. In this work, the database is augmented as follows. Each image is rotated by 0°, 20°, 45°, 70°, 90°, 120°, 135° and 180°. Each rotated image is then divided into 4 sub-images, and a fifth sub-image is taken from the central part of the rotated image, yielding 5 sub-images of size 320 × 320 per rotation. Since Brodatz has 112 categories with one image per category, in the augmented database each category contains $N_C = 8 \times 5 = 40$ images and the total number of images is $N_T = 112 \times 40 = 4480$. Fig. 7 shows the created images of one category; a sketch of the augmentation is given after the figure.

[Figure 7: Created images of a category from the Brodatz database.]
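The following sketch illustrates the augmentation (the exact cropping of the rotated images, i.e. four quadrants plus the center, is our assumption):

from scipy.ndimage import rotate

def augment(img, angles=(0, 20, 45, 70, 90, 120, 135, 180), size=320):
    # 8 rotations x (4 quadrant crops + 1 central crop) = 40 sub-images
    subs = []
    for a in angles:
        r = rotate(img, a, reshape=False, order=1)
        h, w = r.shape[:2]
        starts = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
                  ((h - size) // 2, (w - size) // 2)]
        for (y, x) in starts:
            subs.append(r[y:y + size, x:x + size])
    return subs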
Fig. 8 shows the total average precision of the proposed method for different k, P and m = 20 on the Brodatz database. According to Fig. 8, increasing k initially increases the precision and recall: a larger k increases the resolution of the patterns, so they generate a feature vector with more information. However, an excessive increase of k generates noisy patterns and lowers the precision and recall, especially for larger P. Moreover, increasing P increases the precision and recall, especially for smaller k: the higher the P, the more meaningful and complex the patterns, so the generated features lead to higher retrieval accuracy.

[Figure 8: Total average precision for the proposed method on the Brodatz database for different k, P and m = 20.]

Fig. 9 shows the total average precision of the proposed method for different derivative orders (n) and P, with k = 4 and m = 20, on the Brodatz database. It can be concluded from Fig. 9 that increasing n decreases the precision, due to the noise generated by higher-order derivatives.

[Figure 9: Total average precision for the proposed method for different derivative orders (n) and P, k = 4 and m = 20, on the Brodatz database.]

Table 3 reports the total average precision of LDRP and the prior methods (LBP, SLBP, FLBP, LTP, LDP, LTrP and LVP) on the Brodatz database for different m; for each m, the best result is shown in bold. Fig. 10 depicts the total average precision of these methods for m = 20. From this figure we observe that: (1) the first-order LDRP outperforms LBP by 8.05%, SLBP by 37.42%, FLBP by 29.73% and LTP by 3.82% in terms of average precision; (2) in the first-order derivative space, the precision of LDRP is higher than that of LDP, LTrP and LVP by 11.44%, 26.60% and 13.88%, respectively; (3) in the second-order derivative space, the precision of LDRP is enhanced by 15.18%, 27.90% and 20.32% compared with LDP, LTrP and LVP, respectively; (4) in the third-order derivative space, LDRP works better than LDP, LTrP and LVP by 19.79%, 35.02% and 29.29%, respectively.

[Figure 10: Comparisons of LDRP with prior methods in terms of total average precision on the Brodatz database for m = 20.]

Moreover, the results of Table 3 show that as the number of retrieved images (m) increases, the gap between LDRP and the other methods widens: LDRP not only outperforms the other methods for all numbers of retrieved images but does so by a larger margin for larger m. In addition, to compare LDRP with the prior methods, the precision versus the number of top matches (m) and versus recall is depicted in Fig. 11. It is evident that LDRP outperforms the others (LBP, SLBP, FLBP, LTP, LDP, LTrP and LVP).
Table 3: Comparisons of LDRP with prior methods in terms of total average precision on the Brodatz database.

Total Average Precision (%)
Order      Method           m=5    m=10   m=15   m=20   m=25   m=30   m=35   m=40
1st-order  LBP              98.31  92.13  84.10  72.76  64.38  55.40  48.52  43.19
           SLBP             88.82  65.95  52.24  43.39  37.21  32.60  29.12  26.44
           FLBP             93.45  72.87  60.12  51.08  44.52  39.31  35.29  32.01
           LTP              98.60  94.04  87.45  76.99  69.18  59.45  52.08  46.73
           LDP              98.15  90.36  81.60  69.37  60.69  52.44  46.02  41.06
           LTrP             96.29  78.03  64.57  54.21  46.80  40.53  35.73  31.98
           LVP(D=1)         96.53  86.12  74.84  63.35  55.36  48.52  43.15  38.82
           LVP(D=2)         97.25  88.47  77.56  66.93  58.94  51.35  45.38  40.65
           LVP(D=3)         96.47  86.51  75.24  64.08  56.03  49.08  43.63  39.24
           LDRP(P=5, k=4)   99.23  94.91  88.59  80.09  72.58  63.88  57.08  51.57
           LDRP(P=5, k=5)   99.28  94.93  88.49  79.80  72.29  63.28  56.38  50.85
           LDRP(P=6, k=4)   99.38  94.96  88.68  80.81  73.56  64.92  57.98  52.40
2nd-order  LDP              97.42  86.69  75.22  61.83  52.96  46.20  41.14  37.02
           LTrP             95.42  74.07  59.47  49.11  42.06  36.70  32.56  29.30
           LVP(D=1)         95.80  76.28  62.17  51.27  44.00  38.53  34.38  31.02
           LVP(D=2)         96.51  78.92  66.60  56.69  49.59  43.05  38.01  34.02
           LVP(D=3)         95.63  75.90  62.62  52.56  45.70  40.09  35.70  32.18
           LDRP(P=5, k=3)   98.55  92.87  85.08  76.42  68.70  60.18  53.51  48.15
           LDRP(P=5, k=4)   98.77  93.42  86.14  77.01  69.24  60.07  52.93  47.38
           LDRP(P=6, k=3)   98.25  91.98  84.42  75.61  68.11  60.60  54.60  49.72
3rd-order  LDP              96.60  82.01  67.35  53.72  45.03  38.72  33.99  30.43
           LTrP             93.78  63.27  47.93  38.49  32.26  27.91  24.62  22.10
           LVP(D=1)         93.39  64.85  49.49  39.77  33.59  29.16  25.80  23.20
           LVP(D=2)         94.22  67.88  53.70  44.22  37.80  32.85  29.11  26.19
           LVP(D=3)         93.45  67.50  51.60  41.63  35.17  30.55  27.05  24.30
           LDRP(P=4, k=3)   95.84  85.54  75.46  65.57  57.91  51.08  45.71  41.46
           LDRP(P=5, k=2)   94.21  83.63  74.62  66.39  59.59  53.08  47.76  43.50
           LDRP(P=5, k=3)   98.03  90.98  82.51  73.51  65.64  57.30  50.76  45.59

[Figure 11: Comparisons of LDRP with prior methods in terms of the total average precision on the Brodatz database (a) versus the number of retrieved images (m) and (b) versus the total average recall.]
4.2. VisTex database

The VisTex database has been widely used to evaluate texture-based systems and consists of 54 color images of 512 × 512 pixels [3, 43, 44, 45]. Similar to the Brodatz database, we augment VisTex; the procedure is the same except that the sub-image size is 256 × 256. Therefore, in the augmented database each category contains $N_C = 8 \times 5 = 40$ images and the total number of images is $N_T = 54 \times 40 = 2160$. A series of experiments analogous to those on Brodatz was conducted. Fig. 12 illustrates the total average precision of the LDRP method for different k, P and m = 20; the behavior of LDRP on VisTex versus k and P is similar to that on Brodatz.

[Figure 12: Total average precision for the proposed method on the VisTex database for different k, P and m = 20.]

Fig. 13 shows the total average precision with respect to the number of adjacent pixels (P) and derivative order (n) for k = 4 and m = 20; again, increasing n lowers the precision. Table 4 reports the total average precision of the prior local patterns and the proposed LDRP on the VisTex database for different values of m; for each m, the best result is shown in bold. Fig. 14 shows the total average precision for m = 20. From this figure, it can be concluded that: (1) the first-order LDRP works better than LBP by 11.42%, SLBP by 36.03%, FLBP by 25.86% and LTP by 5.17% in terms of average precision; (2) in the first-order derivative space, the precision of LDRP outperforms LDP, LTrP and LVP by 14.40%, 29.02% and 17.16%, respectively; (3) in the second-order derivative space, the precision of LDRP is enhanced by 17.67%, 28.84% and 25.69% compared with LDP, LTrP and LVP, respectively; (4) in the third-order derivative space, the precision of LDRP outperforms LDP, LTrP and LVP by 26.09%, 37.90% and 37.15%, respectively. The results of Table 4 also confirm that the advantage of LDRP over the other methods grows as the number of retrieved images increases.
Table 4: Comparisons of LDRP with prior methods in terms of total average precision on the VisTex database.

Total Average Precision (%)
Order      Method           m=5    m=10   m=15   m=20   m=25   m=30   m=35   m=40
1st-order  LBP              98.41  93.92  88.40  80.49  74.16  64.87  57.43  51.47
           SLBP             89.31  74.68  63.49  55.88  50.18  45.37  41.41  38.19
           FLBP             94.56  82.93  73.46  66.05  60.48  55.31  51.03  47.40
           LTP              99.18  96.25  92.43  86.75  81.78  74.55  68.24  62.83
           LDP              98.57  93.93  87.70  77.51  69.88  60.97  54.12  48.74
           LTrP             97.27  84.78  72.78  62.89  55.67  49.01  44.01  39.93
           LVP(D=1)         97.54  89.84  79.40  67.51  59.24  52.28  46.93  42.77
           LVP(D=2)         97.97  92.11  84.03  74.75  67.42  59.19  52.71  47.48
           LVP(D=3)         97.39  90.23  81.61  72.65  65.56  57.99  51.79  46.80
           LDRP(P=5, k=4)   99.33  97.47  94.62  90.05  85.62  78.49  72.26  66.57
           LDRP(P=5, k=5)   99.59  97.87  94.76  90.40  85.80  78.34  71.93  66.16
           LDRP(P=6, k=4)   99.47  98.00  95.45  91.91  88.05  81.03  74.76  68.81
2nd-order  LDP              98.83  91.32  81.30  68.42  60.05  52.83  47.39  43.01
           LTrP             97.08  80.44  67.13  57.25  50.26  44.68  40.43  37.06
           LVP(D=1)         96.99  77.36  63.23  52.03  44.86  39.46  35.34  32.08
           LVP(D=2)         97.22  82.02  69.91  60.40  53.58  47.24  42.17  38.20
           LVP(D=3)         96.38  79.57  66.32  56.17  49.23  43.70  39.31  35.80
           LDRP(P=5, k=3)   99.10  96.47  92.64  86.69  81.38  74.03  67.58  62.04
           LDRP(P=5, k=4)   98.94  96.25  92.53  86.94  81.84  73.92  67.29  61.47
           LDRP(P=6, k=3)   98.81  95.75  91.66  86.09  80.82  74.34  68.62  63.51
3rd-order  LDP              98.04  87.13  73.10  59.03  50.06  43.68  39.06  35.31
           LTrP             96.50  74.66  58.69  47.22  39.91  34.93  31.26  28.47
           LVP(D=1)         95.94  69.07  53.32  42.94  36.42  31.84  28.51  25.94
           LVP(D=2)         95.68  72.66  58.20  47.97  41.17  36.13  32.56  29.68
           LVP(D=3)         95.69  72.15  57.06  46.30  39.37  34.32  30.51  27.55
           LDRP(P=4, k=3)   95.40  86.87  79.13  71.60  65.13  58.52  53.24  48.74
           LDRP(P=5, k=2)   95.88  87.38  79.91  73.25  67.16  60.73  55.56  51.21
           LDRP(P=5, k=3)   98.72  95.49  91.47  85.12  79.42  71.62  65.08  59.46
Fig. 15 gives comprehensive comparisons of LDRP with the other methods in terms of total average precision versus the number of retrieved images (m) and versus total average recall. The comparisons confirm that the performance of LDRP is significantly higher than that of the prior methods.
[Figure 13: Total average precision in terms of the number of adjacent pixels (P) and derivative order (n) for k = 4 and m = 20 on the VisTex database.]

[Figure 14: Comparisons of LDRP with prior methods in terms of total average precision on the VisTex database for m = 20.]

[Figure 15: Comparisons of LDRP with prior methods in terms of the total average precision on the VisTex database (a) versus the number of retrieved images (m) and (b) versus the total average recall.]
Table 5 lists the dimension of the feature vector for LDRP (n = 1, k = 4 and P = 5) and the prior methods. Although the feature vector dimension of LDRP is higher than the others, its performance in terms of total average precision is significantly better than that of the other methods.
Fig. 16 shows the total retrieval time of all methods. In the Brodatz database, our proposed method has a larger retrieval time than LTrP, whereas the opposite holds in the VisTex database. This is because the retrieval process consists of two steps, feature extraction and feature matching: compared with the other methods, our method is quicker in feature extraction but slower in feature matching, since its feature vector is high-dimensional. The Brodatz database has more images and thus needs more feature matching than VisTex, so the retrieval time on Brodatz is longer than on VisTex.

Table 5: Dimension of feature vector for LDRP (n = 1, k = 4 and P = 5) and prior methods.

Method    | LBP | SLBP | FLBP | LTP | LDP | LTrP | LVP | LDRP
Dimension | 36  | 256  | 256  | 72  | 144 | 468  | 144 | 3124

[Figure 16: Comparisons of LDRP with prior methods in terms of the total retrieval time (s) on the Brodatz and VisTex databases.]

5. Conclusion
In this paper, we presented a novel local pattern descriptor, the Local Derivative Radial Pattern (LDRP), which generates a powerful texture descriptor for CBIR systems. Unlike prior local patterns, the proposed LDRP is based on coding micropatterns along a line, and it effectively represents texture using multi-level coding of micropattern derivatives; these patterns therefore describe texture better and extract more meaningful information from the image. In addition, multi-level coding in different directions was used instead of binary coding, yielding higher precision in image retrieval. Moreover, a novel weighting scheme was presented to balance the effect of the different derivative levels. Finally, a similarity measure that is more robust against image rotation than previous ones was adopted. Experimental results on two large databases demonstrated that LDRP outperforms all prior local patterns.

References
[1] S. Feng, D. Xu, and X. Yang, Attention-driven salient edge(s) and region(s) extraction with application to CBIR, Signal Processing 90 (1) (2010) 1-15.
[2] M. Wang, Z.-L. Ye, Y. Wang, and S.-X. Wang, Dominant sets clustering for image retrieval, Signal Processing 88 (11) (2008) 2843-2849.
[3] S. Murala, R. Maheshwari, and R. Balasubramanian, Local tetra patterns: a new feature descriptor for content-based image retrieval, IEEE Transactions on Image Processing 21 (5) (2012) 2874-2886.
[4] L. Zhang, H. P. Shum, and L. Shao, Discriminative semantic subspace analysis for relevance feedback, IEEE Transactions on Image Processing 25 (3) (2016) 1275-1287.
[5] F. Long, H. Zhang, and D. D. Feng, Fundamentals of content-based image retrieval, in: Multimedia Information Retrieval and Management, Springer, 2003, pp. 1-26.
[6] X. Zhang, X. Zhao, Z. Li, J. Xia, R. Jain, and W. Chao, Social image tagging using graph-based reinforcement on multi-type interrelated objects, Signal Processing 93 (8) (2013) 2178-2189.
[7] T. Kato, Database architecture for content-based image retrieval, in: SPIE/IS&T 1992 Symposium on Electronic Imaging: Science and Technology (1992) 112-123.
[8] P. Androutsos, A. Kushki, K. N. Plataniotis, and A. N. Venetsanopoulos, Aggregation of color and shape features for hybrid query generation in content based visual information retrieval, Signal Processing 85 (2) (2005) 385-393.
[9] M. Subrahmanyam, R. Maheshwari, and R. Balasubramanian, Local maximum edge binary patterns: a new descriptor for image retrieval and object tracking, Signal Processing 92 (6) (2012) 1467-1479.
[10] J. Zhang and L. Ye, Local aggregation function learning based on support vector machines, Signal Processing 89 (11) (2009) 2291-2295.
[11] Y. Qian, R. Hui, and X. Gao, 3D CBIR with sparse coding for image-guided neurosurgery, Signal Processing 93 (6) (2013) 1673-1683.
[12] Z. He, X. You, and Y. Yuan, Texture image retrieval based on non-tensor product wavelet filter banks, Signal Processing 89 (8) (2009) 1501-1510.
[13] V. R. Rallabandi and V. S. Rallabandi, Rotation-invariant texture retrieval using wavelet-based hidden Markov trees, Signal Processing 88 (10) (2008) 2593-2598.
[14] H. Farsi and S. Mohamadzadeh, Colour and texture feature-based image retrieval by using Hadamard matrix in discrete wavelet transform, IET Image Processing 7 (3) (2013) 212-218.
[15] R. Jain, R. Kasturi, and B. G. Schunck, Machine Vision, McGraw-Hill, New York, 1995.
[16] H. Tamura, S. Mori, and T. Yamawaki, Textural features corresponding to visual perception, IEEE Transactions on Systems, Man, and Cybernetics 8 (6) (1978) 460-473.
[17] F. Liu and R. W. Picard, Periodicity, directionality, and randomness: Wold features for image modeling and retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence 18 (7) (1996) 722-733.
[18] Y. D. Chun, N. C. Kim, and I. H. Jang, Content-based image retrieval using multiresolution color and texture features, IEEE Transactions on Multimedia 10 (6) (2008) 1073-1084.
[19] M. N. Arani and H. Ghassemian, A hierarchical content-based image retrieval approach to assisting decision support in clinical dermatology, Iranian Journal of Electrical and Computer Engineering 9 (1) (2010) 23-33.
[20] Y. D. Chun, S. Y. Seo, and N. C. Kim, Image retrieval using BDIP and BVLC moments, IEEE Transactions on Circuits and Systems for Video Technology 31 (9) (2003) 951-957.
[21] S. Liao, M. W. Law, and A. C. Chung, Dominant local binary patterns for texture classification, IEEE Transactions on Image Processing 18 (5) (2009) 1107-1118.
[22] Z. Guo, L. Zhang, and D. Zhang, Rotation invariant texture classification using LBP variance (LBPV) with global matching, Pattern Recognition 43 (3) (2010) 706-719.
[23] Z. Guo, L. Zhang, and D. Zhang, A completed modeling of local binary pattern operator for texture classification, IEEE Transactions on Image Processing 19 (6) (2010) 1657-1663.
[24] S.-Z. Su, S.-Y. Chen, S.-Z. Li, S.-A. Li, and D.-J. Duh, Structured local binary Haar pattern for pixel-based graphics retrieval, IET Electronics Letters 46 (14) (2010) 996-998.
[25] T. Ahonen and M. Pietikainen, Soft histograms for local binary patterns, in: Proceedings of the Finnish Signal Processing Symposium (FINSIG 2007) 5 (2007) 1-4.
[26] D. K. Iakovidis, E. G. Keramidas, and D. Maroulis, Fuzzy local binary patterns for ultrasound texture characterization, in: International Conference on Image Analysis and Recognition, Berlin (2008) 750-759.
[27] N. Dalal and B. Triggs, Histograms of oriented gradients for human detection, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 1 (2005) 886-893.
[28] V. Chandrasekhar, G. Takacs, D. Chen, S. Tsai, R. Grzeszczuk, and B. Girod, CHoG: Compressed histogram of gradients, a low bit-rate feature descriptor, in: IEEE Conference on Computer Vision and Pattern Recognition (2009) 2504-2511.
[29] V. Chandrasekhar, Y. Reznik, G. Takacs, D. Chen, S. Tsai, R. Grzeszczuk, and B. Girod, Quantization schemes for low bitrate compressed histogram of gradients descriptors, in: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (2010) 33-40.
[30] X. Wang, T. X. Han, and S. Yan, An HOG-LBP human detector with partial occlusion handling, in: 2009 IEEE 12th International Conference on Computer Vision (2009) 32-39.
[31] X. Tan and B. Triggs, Enhanced local texture feature sets for face recognition under difficult lighting conditions, IEEE Transactions on Image Processing 19 (6) (2010) 1635-1650.
[32] B. Zhang, Y. Gao, S. Zhao, and J. Liu, Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor, IEEE Transactions on Image Processing 19 (2) (2010) 533-544.
[33] Z. Guo, Q. Li, J. You, D. Zhang, and W. Liu, Local directional derivative pattern for rotation invariant texture classification, Neural Computing and Applications 21 (8) (2012) 1893-1904.
[34] A. Oberoi, V. Bakshi, R. Sharma, and M. Singh, A framework for medical image retrieval using local tetra patterns, International Journal of Engineering and Technology 5 (1) (2013) 27-36.
[35] K.-C. Fan and T.-Y. Hung, A novel local pattern descriptor: local vector pattern in high-order derivative space for face recognition, IEEE Transactions on Image Processing 23 (7) (2014) 2877-2891.
[36] Y. Dong, D. Tao, X. Li, J. Ma, and J. Pu, Texture classification and retrieval using shearlets and linear regression, IEEE Transactions on Cybernetics 45 (3) (2015) 358-369.
[37] Y. Dong, D. Tao, and X. Li, Nonnegative multiresolution representation-based texture image classification, ACM Transactions on Intelligent Systems and Technology (TIST) 7 (1) (2015) 4.
[38] P. Brodatz, Textures: A Photographic Album for Artists and Designers, Dover, New York, 1966. http://multibandtexture.recherche.usherbrooke.ca/original_brodatz.html
[39] R. W. Picard and T. P. Minka, Vision texture for annotation, Multimedia Systems 3 (1) (1995). http://vismod.media.mit.edu/vismod/imagery/VisionTexture/Images/Reference/
[40] T. Ojala, M. Pietikainen, and T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (7) (2002) 971-987.
[41] P. Rajavel, Directional Hartley transform and content based image retrieval, Signal Processing 90 (4) (2010) 1267-1278.
[42] R. Davarzani, S. Mozaffari, and K. Yaghmaie, Scale- and rotation-invariant texture description with improved local binary pattern features, Signal Processing 111 (2015) 274-293.
[43] S. K. Vipparthi, S. Murala, A. B. Gonde, and Q. J. Wu, Local directional mask maximum edge patterns for image retrieval and face recognition, IET Computer Vision 10 (3) (2016) 182-192.
[44] J. J. d. M. S. Junior, P. C. Cortez, and A. R. Backes, Color texture classification using shortest paths in graphs, IEEE Transactions on Image Processing 23 (9) (2014) 3751-3761.
[45] J.-M. Guo, H. Prasetyo, H. Lee, and C.-C. Yao, Image retrieval using indexed histogram of Void-and-Cluster Block Truncation Coding, Signal Processing 123 (2016) 143-156.