Biologically Inspired Cognitive Architectures 18 (2016) 95–104
Biologically Inspired Cognitive Architectures journal homepage: www.elsevier.com/locate/bica
Research article
Retinal model-based visual perception: Applied for medical image processing
T. Rajalakshmi a,*, Shanthi Prince b
a Department of Biomedical Engineering, SRM University, Kattankulathur, 603 203, India
b Department of Electronics & Communication Engineering, SRM University, Kattankulathur, 603 203, India
Article info
Article history: Received 16 May 2016; Revised 27 September 2016; Accepted 28 September 2016
Keywords: Human visual system (HVS); Spatio-temporal filtering; Compression; Retinal layers; Quality metrics
Abstract
Image quality metrics based on a Human Visual System (HVS) model correlate strongly with evaluations of image quality and with human observer performance in visual recognition. Physiological modeling of the retina plays a vital role in the development of high-performance image processing methods for better visual perception. Image processing for medical diagnosis conventionally involves several steps: image preprocessing, image segmentation, feature extraction, image recognition, and interpretation. This work develops an HVS-based image processing approach that offers advantages over these conventional methods. The main aim is to develop a model of the retina, a complex neural structure that detects the incoming light signal and transforms it before transmitting it through the optic nerve. The retinal model comprises the photoreceptor, outer-plexiform, and inner-plexiform layers, which exhibit the properties of compression and spatiotemporal filtering in the processing of visual information. Spatial frequency is evaluated using the Discrete Cosine Transform (DCT), and the photoreceptor layer of the retina enhances contrast visibility in dark areas while maintaining it in bright areas. Contour contrast enhancement is achieved by modeling the outer-plexiform layer of the retina, and the parvo channel of the inner-plexiform layer is modeled to extract finer details of the image. Luminance and spatial and temporal frequencies were considered in developing the HVS-based retinal model. The proposed model is applied to a wide variety of medical images, and the simulated results show that the texture feature values of the processed image are higher than those of the original input image.
Further, the method is more flexible and enables easier practical implementation than generic medical image processing techniques. © 2016 Elsevier B.V. All rights reserved.
* Corresponding author. E-mail addresses: [email protected] (T. Rajalakshmi), shanthi.[email protected] (S. Prince). http://dx.doi.org/10.1016/j.bica.2016.09.005 2212-683X/© 2016 Elsevier B.V. All rights reserved.
Introduction
Imaging modalities such as X-ray, computed tomography, ultrasound, and magnetic resonance imaging are used to assess the condition of an organ or of the full body. Proper diagnosis and treatment are aided by monitoring the physiological condition over an observation period. To make diagnosis simpler and more accurate, the images obtained through these modalities are subjected to processing. Medical image processing techniques go through the following steps to diagnose disease and to distinguish normal from abnormal conditions: (i) The first step is image preprocessing, to filter noise and enhance image quality. (ii) The next step is image segmentation, where the region of interest is segmented using different segmentation algorithms. (iii) The third step is feature extraction, where different texture and statistical features are extracted to analyze the morphological behavior of the image. (iv) The final step is classification, where the image is classified as normal or abnormal by comparing the values of the extracted features. The software can be used for simulation to evaluate strategies and to perform planned treatments (Costin, 2014). To extract the finer details of the image, the acquired image is thus processed step by step: image pre-processing, image segmentation, feature extraction, image recognition, and interpretation. Algorithms are written for each of these steps, which makes the system more complex. The scope of these algorithms is fairly expansive, ranging from automatic extraction of the Region of Interest (ROI), as in segmentation, to improving the quality of the perceived image through image enhancement. Algorithm testing for image processing applications is carried out to check whether the particular
algorithm satisfies its specification with respect to criteria such as accuracy and robustness. Testing helps to analyze an algorithm both qualitatively and quantitatively. A performance metric is a meaningful and computable measure used to evaluate the performance of an algorithm quantitatively. For assessing image quality, no single quantitative metric correlates well with image quality as perceived by the human visual system. Image quality measurement is important in many image processing applications. Quality assessment methods reported in the literature draw on the characteristics of the human visual system (Wang, Bovik, Sheikh, & Simoncelli, 2004). Gao, Lu, Tao, and Li (2010) proposed an image quality assessment technique that uses a human visual system model to provide a better understanding of the image through two kinds of methods, namely bionic and engineering methods. The main limitation of their study was that the metrics were not extended to chromatic image quality assessment. Pei and Qiao (2010) proposed a cascade model of the retina to obtain the output image of the bipolar cell layer and the spike trains of the ganglion cells. The model was further extended to develop an artificial retina prosthesis. That work did not consider contrast gain, the translation equation between cell layers was not accurate, and the connectivity between cell layers was modeled while neglecting the effects between the cells. Hui, Xu-Dong, and Song (2010) proposed a retina model to simulate the complex structure of the retina, covering the main information processing pathway. Mantiuk, Myszkowski, and Seidel (2006) proposed a framework for image processing applications that works in the visual response space. The authors developed a tone mapping algorithm to produce sharper images, but its accuracy was limited.
Hafiz, Alnajjar, and Alnajjar (2010) proposed a model of the mammalian retina based on a dynamic mask inspired by the neuron connections of the retina, aimed at better robotic vision. Senan, Saadane, and Barba (2001) developed an HVS model for information coding applications. That study modeled the visual cortex, which performs high-level information processing, and did not concentrate on what happens at the retina level, i.e., low-level processing. In the present work, physiological modeling of the retina is carried out, including functionalities such as luminance, compression properties, and spatial and temporal frequencies, for better visual perception. Hérault and Durette (2007) developed a model of the retina for processing visual information that includes functionalities such as sampling, spatiotemporal filtering, nonlinearity, and color coding. The authors did not elaborate on the compression properties of the retina. In the proposed retinal model, perception at the various layers of the retina is studied on the basis of compression properties. Much work has been carried out to understand the functionalities of the HVS and to apply this knowledge in various image processing applications. A model for processing color was proposed by Herault (1996), who presented a detailed biological perspective of the interactions taking place in the different layers of the retina. These perspectives have helped in understanding the cellular interactions taking place in the retina and in designing filter models accordingly. Benoit, Caplier, Durette, and Herault (2010) proposed a model of the different layers of the retina and of the processing that occurs in the primary visual cortex of the brain. The image acquisition property of the photoreceptors was modeled on an enzyme relationship called the Michaelis–Menten relationship (Beaudot, 1994, 1996).
The subsequent layers have been modeled as low-pass spatiotemporal filters that interact to give band-pass spatiotemporal effects, retaining only the spatial frequencies in a particular band (Benoit et al., 2010). A study by Ravikumar and Rattan (2012) on the analysis of various quality metrics for medical image processing showed that when the contrast of an image is enhanced by increasing the
variance, the mean square error (MSE) increases and the peak signal-to-noise ratio (PSNR) decreases. This paper presents a comparative study of various quality metrics for medical image processing applications. The proposed work is based on Benoit et al. (2010) but is modified with regard to spatial and temporal frequencies. The development of such a retina model allows finer details of the image to be extracted. The purpose of this work is to address the importance of applying the HVS model to image processing. A physiologically based HVS model that accounts for image compression properties is incorporated to obtain good image quality at low bandwidth. The main aim of this study is to develop a mathematical model of the retinal layers to extract the finer details of the image, thereby reducing the complexity involved in generic image processing techniques. Retinal structure details are given in Section "Retinal structure", followed by mathematical modeling of retinal layers in Section "Mathematical modeling of retinal layers".
Retinal structure
The human retina contains about 200 million nerve cells and is less than a millimeter thick over most of its extent (Kaiser & Boynton, 1996). The retina detects the light falling on it, converts it into an equivalent electrical signal, performs initial processing of that signal, and finally sends the processed information through the optic nerve to the brain, where it is perceived as an image. In other words, the human retina acquires information from the outside world, samples it, compresses it, and sends it to the brain. The path of information flow from the light source to the optic nerve fibers follows a three-neuron chain that runs from the photoreceptor layer through the bipolar cells to the ganglion cells (Fig. 1).
The first layer of the retina is the photoreceptor layer, which is responsible for visual data acquisition; this layer is also associated with local logarithmic compression of the image luminance. The photoreceptor layer is in turn connected to the ganglion cell layer through a series of neurons. There are two main types of receptors in the photoreceptor layer: the cones, which are color sensitive and responsible for color vision, and the rods, which are even more sensitive than cones and are responsible for dim-light vision. Rods respond at low levels of illumination, giving rise to scotopic vision. Rods and cones, in general, produce a nonlinear response. The retinal cells are connected to each other for better visual perception, forming two main layers, namely the Outer-Plexiform Layer (OPL) and the Inner-Plexiform Layer (IPL). In the OPL, signals are transmitted from the photoreceptor layer to two kinds of cells, the bipolar cells and the horizontal cells, through a junction called the synaptic triad. Connections between cones and bipolar cells are of a one-to-one type in the fovea region; several bipolar cells may be connected to the same cone (Yang et al., 2004). If a cone is excitatory to a bipolar cell, it is also excitatory to a horizontal cell, and this horizontal cell is, in turn, inhibitory to the bipolar cell. In the IPL, the bipolar cells are connected to the ganglion cells and the amacrine cells; the axons of the ganglion cells constitute the optic nerve, and the amacrine cells play a role similar to that of the horizontal cells in the OPL.
Retinal layer based HVS model
An understanding of the human visual system plays a crucial role in the design of an image processing system. The block diagram of the proposed retinal layer model is shown in Fig. 2. The photoreceptor acts as the sensor of the human visual system and is responsible for converting photons into a nerve signal.
Fig. 1. Structure of retina (courtesy: http://www.cvrl.org).
Fig. 2. Retinal layers in HVS model.
This process is called phototransduction (Benoit et al., 2010). The response of the photoreceptor layer is then transmitted to the outer-plexiform layer. The HVS has been modeled as in Hérault and Durette (2007) and Beaudot (1994). In the proposed retinal layer model, three main layers are considered: the photoreceptor layer, the outer-plexiform layer, and the inner-plexiform layer. The photoreceptor layer enhances the contrast of the image for compression parameter values ranging from 0 to 1. The photoreceptor layer output is fed to the outer-plexiform layer, where the photoreceptor cell layer also interacts with the horizontal cell layer. The outer-plexiform layer is modeled considering the responses of the bipolar ON and OFF channels, thereby enhancing the contours. The response of the outer-plexiform layer is passed to the inner-plexiform layer, which provides the finer details of the image and information on motion for processing visual data. The output of the inner-plexiform layer has two channels, parvo and magno. The parvo channel is concerned with extracting the finer details of the image, whereas the magno channel is associated with motion analysis. Since only still images are considered in this study, only the parvo channel is used for processing visual information.
Mathematical modeling of retinal layers
As shown in Fig. 2, the main retinal layers involved in signal processing are the photoreceptor, outer-plexiform, and inner-plexiform layers. Mathematical models based on their functionalities are presented in this section.
Photoreceptor layer model
A photoreceptor cell is modeled using the cone transfer function given in Eq. (1) (Benoit et al., 2010).
\[ F_{ph}(f_s, f_t) = \frac{1}{1 + \beta_{ph} + 2\alpha_{ph}\,(1 - \cos(2\pi f_s)) + j 2\pi \tau_{ph} f_t} \tag{1} \]
where $F_{ph}$ denotes the cone transfer function, which is a function of spatial frequency ($f_s$) and temporal frequency ($f_t$); $\beta_{ph}$ is the gain of the photoreceptor, set to 0.7; and $\alpha_{ph}$ denotes the spatial cutoff frequency, set to 7 (Benoit et al., 2010). Spatial frequency ($f_s$) is computed by applying the discrete cosine transform (DCT) to the input image, and temporal frequency ($f_t$) is taken as the dc value only, since only static images are considered in this work (Rajalakshmi & Prince, 2014). The basic Michaelis–Menten relation is modified (Beaudot, 1996; Levin, 2014) to include a local adaptation effect and is normalized for a luminance range of $[0, V_{max}]$:
\[ A(p) = \frac{C(p)}{C(p) + C_o(p)}\,\bigl(V_{max} + C_o(p)\bigr) \tag{2} \]
\[ C_o(p) = S_o\,L(p) + V_{max}\,(1 - S_o) \tag{3} \]
Fig. 3 shows the functional block diagram of the photoreceptor layer. The compression parameter $C_o(p)$ depends on the local luminance $L(p)$, the static compression parameter $S_o$, and $V_{max}$, the highest pixel value in the image. The local luminance $L(p)$ is computed using the cone transfer function of Eq. (1). In the human visual system, the static compression parameter varies with the ambient light. In this study, $S_o$ is adjusted within the range (0, 1) to increase flexibility and make the system more accurate. $V_{max}$ is set to 255 because that is the highest pixel value in the input image. The adjusted luminance $A(p)$ of the photoreceptor layer, as shown in Fig. 3, depends on the current luminance $C(p)$ and on the compression parameter $C_o(p)$, which is in turn linearly linked to the local luminance $L(p)$ of the neighborhood of the photoreceptor.
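The adaptation of Eqs. (2) and (3) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' MATLAB implementation: the local luminance $L(p)$ is approximated here by a simple neighborhood mean instead of the cone transfer function of Eq. (1), and the function names, the `radius` parameter, and the zero-denominator guard are assumptions made for the example.

```python
def local_luminance(image, radius=1):
    """Neighborhood mean used as a stand-in for L(p) (assumption, not Eq. (1))."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the window around (r, c), clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


def photoreceptor_adaptation(image, s_o=0.9, v_max=255.0, radius=1):
    """Apply Eq. (3), Co = So*L + Vmax*(1 - So), then Eq. (2), A = C/(C + Co)*(Vmax + Co)."""
    lum = local_luminance(image, radius)
    adjusted = []
    for row_c, row_l in zip(image, lum):
        out_row = []
        for c, l in zip(row_c, row_l):
            c_o = s_o * l + v_max * (1.0 - s_o)
            denom = c + c_o
            # Guard (assumption): a fully dark pixel in a fully dark window stays dark.
            out_row.append(0.0 if denom == 0 else c / denom * (v_max + c_o))
        adjusted.append(out_row)
    return adjusted
```

With this form a pixel at $V_{max}$ is left unchanged, while darker pixels are raised by an amount that depends on $S_o$ and on the brightness of their neighborhood, which is the behavior the text attributes to the photoreceptor layer.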
Fig. 3. Functional block diagram of photoreceptor layer.
Outer-plexiform layer model
The outer-plexiform layer is modeled using a spatiotemporal filter whose transfer function is given by
\[ F_{OPL}(f_s) = F_{ph}(f_s)\,[1 - F_h(f_s)] \tag{4} \]
\[ F_h(f_s, f_t) = \frac{1}{1 + \beta_h + 2\alpha_h\,(1 - \cos(2\pi f_s)) + j 2\pi \tau_h f_t} \tag{5} \]
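To make the band-pass character of Eq. (4) concrete, the sketch below evaluates the transfer functions of Eqs. (1), (4), and (5) at a few spatial frequencies. The gains are set to zero and the temporal constants to 1 as stated for this layer, $\alpha_{ph} = 7$ is taken from the photoreceptor model, and $f_t = 0$ reflects the static-image case. The value of $\alpha_h$ is not given in the text, so the default used here is purely a placeholder, and the function names are hypothetical.

```python
import math


def low_pass(f_s, f_t=0.0, beta=0.0, alpha=7.0, tau=1.0):
    """Low-pass form shared by Eq. (1) and Eq. (5); returns a complex gain."""
    real = 1.0 + beta + 2.0 * alpha * (1.0 - math.cos(2.0 * math.pi * f_s))
    imag = 2.0 * math.pi * tau * f_t
    return 1.0 / complex(real, imag)


def f_opl(f_s, f_t=0.0, alpha_ph=7.0, alpha_h=7.0,
          beta_ph=0.0, beta_h=0.0, tau_ph=1.0, tau_h=1.0):
    """Eq. (4): F_OPL = F_ph * (1 - F_h). alpha_h is an assumed placeholder value."""
    f_ph = low_pass(f_s, f_t, beta_ph, alpha_ph, tau_ph)
    f_h = low_pass(f_s, f_t, beta_h, alpha_h, tau_h)
    return f_ph * (1.0 - f_h)


# Magnitude of the OPL response at a few normalized spatial frequencies (f_t = 0).
response = {f: abs(f_opl(f)) for f in (0.0, 0.05, 0.25, 0.5)}
```

Under these settings the response vanishes at $f_s = 0$ and is attenuated again at high spatial frequencies, so uniform regions are suppressed while mid-band structure such as contours is retained.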
As shown in Smirnakis, Berry, Warland, Bialek, and Meister (1997), the spatiotemporal filter is derived as the difference between a low-pass filter that models the photoreceptor network and a low-pass filter that models the horizontal cell network h of the retina. Fig. 4 shows the functional block diagram of the outer-plexiform layer. The difference between $F_{ph}$, the transfer function of the photoreceptor layer, and $F_h$, the transfer function of the horizontal cell layer, results in two main channels, namely Bipolar-ON and Bipolar-OFF. The response of the outer-plexiform layer is evaluated by finding the difference between the responses of the Bipolar-ON and Bipolar-OFF channels. $\beta_{ph}$ is the gain of the photoreceptor, and its value is set to zero. $\beta_h$ is the gain of the horizontal cell layer for extracting contour information, and its value is set to zero. $\tau_{ph}$ and $\tau_h$ are the temporal frequency constants of the photoreceptor and horizontal cell layers, respectively, and their values are set to 1.
Fig. 4. Functional block diagram of outer-plexiform layer.
Inner plexiform layer model
The contour- and contrast-enhanced image from the outer-plexiform layer is then passed to the inner-plexiform layer. Contour enhancement depends more on contours. Information from the bipolar cells is subdivided into two main channels, ON and OFF, and each channel is independently enhanced using a logarithmic transformation. Just as the photoreceptor layer is modeled using the Michaelis–Menten law, the retinal parvo channel is modeled in a similar manner in the proposed study (Benoit, Alleysson, Herault, & Le Callet, 2009).
Fig. 5. Parvo channel modeling.
Fig. 5 shows the block diagram of parvo channel modeling. A logarithmic transformation is applied to the Bipolar-ON and Bipolar-OFF images individually to obtain the parvo channel results. The logarithmic transformation is evaluated in Matlab using the formula given in Eq. (6).
\[ \text{Logarithmic transformation} = K \cdot \log\bigl(1 + \mathrm{double}(\text{Input image})\bigr) \tag{6} \]
where K is a constant. The input image is cast to double precision (Matlab's double) so that the logarithm operates on floating-point values. The natural logarithm is then applied as shown in the above equation. A logarithmic transformation expands the values of dark pixels in an image and compresses the dynamic range of the image. This, in turn, leads to equalization of image contrast. The difference between the two bipolar channels after logarithmic transformation gives the Parvo (ON-OFF) output. Since the incoming information is about contours, the parvo channel results in contour enhancement. In this study, only still images are considered, and only the parvo channel is used for processing visual information.
Conventional image processing vs. HVS based image processing
Fig. 6. Generic algorithm for processing image.
Any digital image processing algorithm follows the sequence shown in Fig. 6. A digital image is acquired by an imaging modality, which involves image sensing, sampling, and quantization. In the pre-processing step, the image is improved by filtering, which increases the chance of success of the subsequent processes. Pre-processing includes image enhancement, restoration, and morphological analysis, such as noise removal and image sharpening. Segmentation is the process of breaking an image down into its constituent parts, i.e., into a meaningful form that is easier to analyze. Some of the most common segmentation techniques are edge detection, compression-based methods, and boundary extraction. The objects and boundaries in an image are detected using image segmentation techniques (Sivakumar, Tamilselvi, Archana, Deepthi, & Priyadarshini, 2011). Feature extraction is the process of extracting features that yield quantitative information of interest or that are basic for differentiating one class of objects from another. It is used to analyze the texture of an image. Image textures are analyzed either by the statistical method or by the structural
method. Feature extraction gives information on various properties of an image, which leads to the next step of recognition and interpretation. Recognition is the process of assigning a label to an object based on the information provided by its descriptors (the feature extractor). Interpretation, also called classification, is the process of assigning meaning to an ensemble of recognized objects. Medical images are subjected to the above-mentioned processing steps before diagnosis. Carrying out these steps is time-consuming, and moreover, the segmentation algorithm used depends on the application. These processes introduce computational complexity in any imaging modality unit and need application-specific algorithms. As an alternative, a retinal model based Human Visual System for processing medical images is proposed, which works with less computational effort. The digital image acquired from any imaging modality is passed through the modeled retinal layers for processing. The features of the processed image are analyzed using the measures described in the next section and are found to be better.
Features and quality metrics
Feature estimation gives information on the visual properties of an image, either globally for the entire image or locally for a particular region of interest. In this work, the features energy and intensity are extracted over the entire image. Energy represents the sum of all pixel values in an image. Intensity measures the average pixel value in an image. Eq. (7) shows the generalized expression used to evaluate the total energy level of the signal. Energy measures the uniformity of an image. It is defined as the extent to which pixel pairs repeat; the energy value is higher when the pixel values are similar. Here $P(i, j)$ denotes the elements of the co-occurrence matrix, i.e., the probability of moving from a pixel with grey level i to a pixel with grey level j.
\[ \text{Energy} = \sum_{i}\sum_{j} P(i,j)^2 \tag{7} \]
Contrast is a measure of the intensity difference between a pixel and its neighbor over the entire image. Eq. (8) shows the general expression for extracting the contrast value of an image. In real-world visual perception, contrast is evaluated as the difference in color and brightness between one object and another within the same field of view.
\[ \text{Contrast} = \sum_{i,j} |i - j|^2\, P(i,j) \tag{8} \]
Entropy is a statistical measure of randomness that can be used to characterize the texture of the original input image. Eq. (9) shows the general expression of entropy, where i and j represent two different grey levels of an image and $P(i, j)$ is the number of co-appearances of i and j.
\[ \text{Entropy} = -\sum_{i,j} P(i,j)\, \log P(i,j) \tag{9} \]
Fig. 7a. Original X-ray image 1.
Table 1. Image quality metrics.
S. no. | Quality metric | Formula
1. | MSE (Mean square error) | $\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(x(i,j) - y(i,j)\bigr)^2$
2. | PSNR (Peak signal to noise ratio) | $10\log_{10}\dfrac{2^n - 1}{\sqrt{MSE}}$
3. | AD (Average difference) | $\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(x(i,j) - y(i,j)\bigr)$
4. | NCC (Normalized cross correlation) | $\dfrac{\sum_{i=1}^{M}\sum_{j=1}^{N} x(i,j)\,y(i,j)}{\sum_{i=1}^{M}\sum_{j=1}^{N} x(i,j)^2}$
5. | SC (Structural content) | $\dfrac{\sum_{i=1}^{M}\sum_{j=1}^{N} y(i,j)^2}{\sum_{i=1}^{M}\sum_{j=1}^{N} x(i,j)^2}$
6. | PMSE (Peak mean square error) | $\dfrac{1}{MN}\,\dfrac{\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl(x(i,j) - y(i,j)\bigr)^2}{\bigl(\max x(i,j)\bigr)^2}$
Note: x(i, j) represents the original image and y(i, j) represents the processed image; M and N denote the numbers of rows and columns of the image.
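The metrics of Table 1 are straightforward to compute once the original image x and the processed image y are available as arrays of equal size. The sketch below is one possible plain-Python rendering, with several assumptions: images are given as nested lists, the default bit depth is 8, and PSNR is written in its widely used form $10\log_{10}\bigl((2^n-1)^2/MSE\bigr)$. The function name and dictionary keys are illustrative only.

```python
import math


def quality_metrics(x, y, bits=8):
    """Full-reference metrics for original x and processed y (equal-sized 2-D lists).

    Assumes x is not identically zero, since NCC, SC, and PMSE divide by sums over x.
    """
    xs = [float(v) for row in x for v in row]
    ys = [float(v) for row in y for v in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("images must be non-empty and of equal size")
    n = len(xs)  # M * N pixels
    squared_error = sum((a - b) ** 2 for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    mse = squared_error / n
    peak = 2 ** bits - 1
    return {
        "MSE": mse,
        # Conventional PSNR definition (assumption); infinite for identical images.
        "PSNR": float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse),
        "AD": sum(a - b for a, b in zip(xs, ys)) / n,
        "NCC": sum(a * b for a, b in zip(xs, ys)) / sum_x2,
        "SC": sum(b * b for b in ys) / sum_x2,
        "PMSE": mse / max(xs) ** 2,
    }
```

For identical images this gives MSE = 0, AD = 0, and NCC = SC = 1, which is a quick sanity check before applying the function to real data.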
The autocorrelation feature of an image measures the fineness or roughness of its texture. The autocorrelation function of an image I(x, y) is given by Eq. (10):
\[ \text{Autocorrelation} = \frac{\sum_{u=0}^{N}\sum_{v=0}^{N} I(u,v)\, I(u+x,\, v+y)}{\sum_{u=0}^{N}\sum_{v=0}^{N} I^2(u,v)} \tag{10} \]
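As an illustration of Eqs. (7) to (10), the following sketch builds a normalized grey-level co-occurrence matrix and derives energy, contrast, entropy, and the autocorrelation function from it. Several choices here are assumptions rather than details from the text: the co-occurrence offset is the right-hand horizontal neighbor, the matrix is not symmetrized, entropy uses the natural logarithm, and products that fall outside the image in Eq. (10) are treated as zero. Pixel values are assumed to be integers already quantized to `levels` grey levels.

```python
import math


def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix P(i, j) for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                counts[image[r][c]][image[rr][cc]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Energy (Eq. 7), contrast (Eq. 8), and entropy (Eq. 9) of a normalized GLCM."""
    energy = sum(v * v for row in p for v in row)
    contrast = sum(abs(i - j) ** 2 * v
                   for i, row in enumerate(p) for j, v in enumerate(row))
    # Zero entries contribute nothing to the entropy and are skipped.
    entropy = -sum(v * math.log(v) for row in p for v in row if v > 0)
    return {"energy": energy, "contrast": contrast, "entropy": entropy}


def autocorrelation(image, x, y):
    """Eq. (10) for shift (x, y); assumes the image is not identically zero."""
    rows, cols = len(image), len(image[0])
    num = 0.0
    for u in range(rows):
        for v in range(cols):
            if 0 <= u + x < rows and 0 <= v + y < cols:
                num += image[u][v] * image[u + x][v + y]
    den = sum(val * val for row in image for val in row)
    return num / den


# Small 4-level test image used only for illustration.
sample = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
features = texture_features(glcm(sample, levels=4))
```

A real image would first be quantized to a small number of grey levels; the choice of offset and quantization affects the absolute feature values, so comparisons between original and processed images should use the same settings for both.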
The autocorrelation feature responds to the noise interference pattern I(u, v). Further, to evaluate the performance of the proposed model, quality metrics such as mean square error (MSE), peak signal-to-noise ratio (PSNR), normalized cross correlation, average difference, structural content, maximum difference, and peak mean square error were computed (Brooks, Zhao, & Pappas, 2008; Eskicioglu & Fisher, 1995). Table 1 lists the quality metrics and the corresponding mathematical formulas used to evaluate the quality of the processed image.
Results and discussion
Different types of medical images (X-ray, MRI, CT, and ultrasound) were processed with the proposed HVS model, and the performance was analyzed. About 20 images were analyzed in the study. The input digital image is passed through the HVS-based retinal model for processing. The image undergoes contrast enhancement in the photoreceptor layer. The results are analyzed for compression parameter values ranging from So = 0.1 to 0.9. In this study, the compression parameter value So = 0.9 is chosen as optimal because it gives better contrast enhancement. The contrast-enhanced image then passes through the outer-plexiform layer, where the contours are enhanced. Finally, the contour- and contrast-enhanced image passes through the inner-plexiform layer, whose output extracts the finer details of the image through the parvo channel. Simulation results are shown in the figures. The proposed retinal layer model was implemented in MATLAB. Figs. 7a–13a show the original images and Figs. 7b–13b show the processed images. Their corresponding feature metrics are tabulated in Table 2.
Fig. 7b. Processed X-ray image 1.
Fig. 8a. Original X-ray image 2.
Fig. 8b. Processed X-ray image 2.
Fig. 9a. Original MRI image 3.
Fig. 9b. Processed MRI image 3.
Fig. 10a. Original MRI image 4.
Fig. 10b. Processed MRI image 4.
Fig. 11a. Original CT image 5.
Fig. 11b. Processed CT image 5.
Fig. 12a. Original CT image 6.
Fig. 12b. Processed CT image 6.
Fig. 13a. Original ultrasound image 7.
Fig. 13b. Processed ultrasound image 7.
From Table 2 it can be seen that the processed image has improved autocorrelation values, meaning that the original and processed images are strongly correlated. The increase in energy and entropy values also strongly confirms that features are highlighted in the processed image. Overall, the processed image has shown around a 36% increase in contrast.
To gauge the improvement in image features after processing through the retinal model, texture features such as autocorrelation, contrast, energy, and entropy are extracted. Image textures are a set of metrics calculated in generic image processing techniques to quantify the perceived texture of an image. The numerical values of the image textures for the original and processed images are compared in Table 2. The tabulated results show that the energy and contrast values of the processed image are higher than those of the input image. Energy and contrast are the most significant parameters, and they in turn aid visual assessment. The proposed mathematical model provides a significant result in enhancing the image, thereby aiding in extracting finer details of the image.
Image quality metrics were evaluated for the original and processed images at compression parameter values So = 0.1 and So = 0.9, and the results are tabulated in Table 3 along with those of other image processing techniques, namely the directional median filtering method and the unsharp filter. The results show that the mean square error rises and the peak signal-to-noise ratio falls with an increase in the compression
Table 2. Feature values of the processed image.
Image | Autocorrelation (original) | Autocorrelation (processed) | Energy (original) | Energy (processed) | Entropy (original) | Entropy (processed) | Contrast (original) | Contrast (processed)
1. | 26.27 | 28.89 | 0.1198 | 0.1284 | 2.574 | 2.6305 | 0.532 | 0.7860
2. | 23.304 | 26.853 | 0.1372 | 0.1445 | 2.516 | 2.424 | 0.3156 | 0.3961
3. | 14.99 | 17.69 | 0.122 | 0.232 | 2.97 | 3.23 | 2.27 | 3.145
4. | 5.25 | 6.821 | 0.382 | 0.421 | 1.88 | 1.98 | 0.716 | 0.949
5. | 37.53 | 39.51 | 0.102 | 0.120 | 2.801 | 2.902 | 0.922 | 1.457
6. | 21.024 | 23.914 | 0.079 | 0.0976 | 3.063 | 3.135 | 1.744 | 1.853
7. | 11.87 | 15.43 | 0.121 | 0.197 | 2.712 | 3.134 | 0.749 | 1.936
Table 3. Image quality metrics for proposed Retinal model, Unsharp filter and Directional median filtering.
Quality metric | Image | Proposed retinal model, So = 0.1 | Proposed retinal model, So = 0.9 | Unsharp filter (Polesel, Ramponi, & Mathews, 2000) | Directional median filter (Chen & Zhang, 2009)
MSE | 1. | 219.2 | 406.6 | 311.13 | 2436
MSE | 2. | 242.5 | 872.9 | 835.64 | 283
MSE | 3. | 120.2 | 595.1 | 3404 | 1533
MSE | 4. | 734.6 | 609.4 | 9431 | 693.14
MSE | 5. | 205.2 | 1039 | 648.59 | 819.43
MSE | 6. | 148.7 | 299.9 | 1574 | 745.79
MSE | 7. | 200.2 | 845 | 1203 | 982
PSNR | 1. | 14.7 | 12.3 | 14 | 14.26
PSNR | 2. | 14.28 | 8.72 | 15 | 23
PSNR | 3. | 17.32 | 10.38 | 12.8 | 16.27
PSNR | 4. | 19.47 | 10.28 | 8.38 | 19.32
PSNR | 5. | 15.00 | 7.961 | 16 | 14.3
PSNR | 6. | 16.40 | 13.36 | 6.16 | 14.4
PSNR | 7. | 15.116 | 8.861 | 11.23 | 13.2
NCC | 1. | 1.08 | 1.10 | 1.01 | 0.87
NCC | 2. | 1.08 | 1.13 | 0.94 | 0.96
NCC | 3. | 1.04 | 1.11 | 0.95 | 0.95
NCC | 4. | 1.02 | 1.06 | 0.34 | 0.927
NCC | 5. | 1.06 | 1.13 | 1 | 0.98
NCC | 6. | 1.07 | 1.08 | 1.35 | 0.97
NCC | 7. | 1.06 | 1.12 | 1.07 | 0.96
AD | 1. | 25.28 | 34.89 | 2.87 | 7.87
AD | 2. | 29.69 | 57.14 | 5.42 | 3.45
AD | 3. | 16.48 | 38.96 | 5.096 | 2.38
AD | 4. | 11.47 | 45.77 | 39.85 | 1.52
AD | 5. | 27.27 | 63.57 | 0.232 | 0.499
AD | 6. | 21.33 | 28.54 | 82.26 | 0.191
AD | 7. | 23.63 | 53.48 | 30.54 | 20.34
SC | 1. | 0.82 | 0.77 | 0.96 | 0.922
SC | 2. | 0.82 | 0.68 | 1 | 1.06
SC | 3. | 0.89 | 0.77 | 0.85 | 0.981
SC | 4. | 0.94 | 0.78 | 1 | 0.963
SC | 5. | 0.84 | 0.66 | 0.8 | 0.08
SC | 6. | 0.85 | 0.82 | 0.39 | 0.94
SC | 7. | 0.84 | 0.68 | 0.52 | 0.86
PMSE | 1. | 0.13 | 0.18 | 0.06 | 0.178
PMSE | 2. | 0.16 | 0.29 | 0.18 | 0.749
PMSE | 3. | 0.09 | 0.28 | 0.369 | 0.207
PMSE | 4. | 0.07 | 0.28 | 0.804 | 0.247
PMSE | 5. | 0.15 | 0.33 | 0.075 | 0.066
PMSE | 6. | 0.04 | 0.13 | 0.864 | 0.112
PMSE | 7. | 0.14 | 0.313 | 0.944 | 0.845
parameter value. It is observed from Table 3 that most of the quality metrics change drastically with the compression parameter, except NCC and SC. MSE, PSNR, AD, and PMSE are therefore considered more significant, whereas NCC and SC have little significance in image compression techniques. Unlike other processing algorithms, the proposed retinal layer model offers the option of processing images at different compression parameter values. Table 3 compares the image quality metrics of the proposed retina model at extreme compression values against the unsharp filter and the directional median
filtering technique. It is observed that the MSE and PMSE values of most of the processed images are lower for the HVS model than for the other image processing methods, even at the higher compression parameter value (So = 0.9). The PSNR and NCC values are more or less equal to those of the compared algorithms. Comparable SC values among the compared algorithms indicate that, in terms of retaining structural information, the proposed model performs equally well. In general, it is well known in image processing that when an image is subjected to compression, it is prone to have an
increase in MSE value with a decrease in PSNR. The proposed mathematical model shows a similar result and is thus in concurrence with generic image processing techniques. However, its greatest advantage is that it can extract details from the image by varying the compression parameter over the range 0 to 1, at the cost of a compromise on noise. Perceptual studies that use mathematical modeling of the human visual system are limited in the literature. Senan et al. developed an HVS model for information coding applications. Their study modeled the visual cortex, which performs high-level information processing, and did not concentrate on what happens at the retina level, i.e., low-level processing (Senan et al., 2001). In the proposed mathematical model of the retina, only low-level (retinal-level) processing of information is targeted. Hérault and Durette developed a model of the retina for processing visual information that includes properties such as sampling, spatiotemporal filtering, nonlinearity, and color coding (Hérault & Durette, 2007). In that study, the authors did not concentrate much on the compression properties of the retina, and perceptual studies at each layer of the retina were not carried out. In our study, we have analyzed visual perception at the output of every layer of the retina based on the compression properties of the photoreceptor layer. Our results show that as the compression parameter value increases, the contrast and contour values of the image are enhanced. A study by Benoit et al. (2010) on modeling different layers of the retina for processing visual information includes the compression parameter together with luminance, spatial, and temporal properties. Our study is an extension of Benoit's model with changes in the spatial and temporal frequency values.
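The link between rising MSE and falling PSNR discussed above follows from how these metrics are defined. The following is a minimal Python sketch of the six metrics compared in Table 3, assuming the common definitions given by Eskicioglu and Fisher (1995). The exact formulas used by the authors are not shown in this section, so the function name `quality_metrics`, the `peak` parameter, and the sign convention for AD are assumptions made for illustration only.

```python
import numpy as np


def quality_metrics(original, processed, peak=255.0):
    """Full-reference quality metrics between two same-sized grayscale images.

    Assumed definitions (not confirmed by the paper): f is the original image,
    g is the processed image, and the mean is taken over all pixels.
    The original image is assumed to contain at least one non-zero pixel.
    """
    f = np.asarray(original, dtype=np.float64)
    g = np.asarray(processed, dtype=np.float64)
    if f.shape != g.shape:
        raise ValueError("images must have the same shape")

    diff = f - g
    mse = float(np.mean(diff ** 2))
    # PSNR is a decreasing function of MSE, so a larger MSE gives a lower PSNR.
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

    return {
        "MSE": mse,
        "PSNR": psnr,
        # Normalized cross-correlation (NK / NCC): sum(f*g) / sum(f^2)
        "NCC": float(np.sum(f * g) / np.sum(f ** 2)),
        # Average difference: signed mean of (f - g); sign convention assumed
        "AD": float(np.mean(diff)),
        # Structural content: sum(f^2) / sum(g^2)
        "SC": float(np.sum(f ** 2) / np.sum(g ** 2)),
        # Peak MSE: MSE normalized by the squared maximum of the original
        "PMSE": mse / float(np.max(f)) ** 2,
    }
```

Calling `quality_metrics(original, processed)` on an input image and its retina-model output would return one value per metric; repeating the call for outputs obtained at different compression parameter values is one way to reproduce the kind of comparison reported in the tables.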
In the proposed study, the spatial frequency was computed by applying the DCT to the original input image, and the temporal frequency was set to zero since we analyzed only still images. The results of our study showed similar observations. A study by Ravikumar and Rattan (2012) on the analysis of various quality metrics for medical image processing showed that on enhancing the contrast of an image by increasing the variance, the mean square error (MSE) increases and the peak signal to noise ratio (PSNR) decreases. Our study showed similar results: as the compression parameter value is increased, there is a significant increase in the contrast of the image, with a subsequent increase in MSE and decrease in PSNR.

Conclusion

The human visual system is a very powerful system. The proposed retinal layer model imitates some of the retinal functionalities, including its luminance and compression properties and its spatial and temporal frequencies, for visual processing. The proposed method was applied to a wide variety of images, and the results show that this model, which consists of the photoreceptor layer, outer plexiform layer, and inner plexiform layer, compresses the image, enhances contrast visibility in dark areas while maintaining it in bright areas, enhances the contour information, and aids in extracting finer details of the image. The results also show that the contrast and energy values of the processed image are higher than those of the original image. The image quality metrics of images processed through the proposed retinal model appear to show an improvement in comparison with image processing techniques such as the unsharp filter and the directional median filter. The method is also more flexible, which enables easier practical implementation compared with generic medical image processing techniques, which have to undergo many preprocessing steps before further analysis. Further, in the evaluation of quality metrics, it is observed that as the compression parameter value increases, the MSE value increases and the PSNR value decreases, which is the generally expected behavior. Hence, the proposed HVS model can be used as an alternative to existing generic image processing techniques and can be applied extensively in robotic vision or computer vision.

References

Beaudot, W. H. A. (1994). The neural information processing in the vertebrate retina: A melting pot of ideas for artificial vision. PhD Thesis in Computer Science. France: INPG.
Beaudot, W. H. A. (1996). Sensory coding in the vertebrate retina: Towards an adaptive control of visual sensitivity. Network: Computation in Neural Systems, 7(2), 317-3.
Benoit, A., Alleysson, D., Herault, J., & Le Callet, P. (2009). Spatio-temporal tone mapping operator based on a retina model. Lecture Notes in Computer Science, 5646, 12–22.
Benoit, A., Caplier, A., Durette, B., & Herault, J. (2010). Using human visual system modeling for bio-inspired low level image processing. Computer Vision and Image Understanding, 114, 758–773.
Brooks, A. C., Zhao, X., & Pappas, T. N. (2008). Structural similarity quality metrics in a coding context: Exploring the space of realistic distortions. IEEE Transactions on Image Processing, 17, 8.
Chen, Z., & Zhang, L. (2009). Multistage directional median filter. World Academy of Science, Engineering and Technology, 59.
Costin, H. (2014). Recent trends in medical image processing: Editorial (preface) for a special issue of computer. Computer Science Journal of Moldova, 2(65) (22).
Eskicioglu, A. M., & Fisher, P. S. (1995). Image quality measures and their performance. IEEE Transactions on Communications, 43, 2959–2965.
Gao, X., Lu, W., Tao, X., & Li, X. (2010). Image quality assessment and human visual system. Visual Communications and Image Processing, 7744, 77440Z.
Hafiz, A. R., Alnajjar, F., & Alnajjar, F. (2010). A novel dynamic edge detection inspired from mammalian retina toward better robot vision. World Automation Congress TSI Press.
Herault, J. (1996). A model of color processing in the retina of vertebrates: From photoreceptors to color opposition and color constancy phenomena. Current European Neurocomputing Research, 12, 113–129.
Hérault, J., & Durette, B. (2007). Modeling visual perception for image processing. In F. Sandoval et al. (Eds.), IWANN 2007, LNCS 4507 (pp. 662–675). Berlin, Heidelberg: Springer-Verlag.
http://www.cvrl.org.
Hui, W., Xu-Dong, G., & Qingsong, Z. (2010). Main retina information processing pathway modeling. In Proc. 9th IEEE int. conf. on cognitive informatics (pp. 318–324).
Kaiser, P. K., & Boynton, R. M. (1996). Human color vision (2nd ed.). Washington, DC: Optical Society of America.
Levin, L. A. (2014). Optic nerve. In P. L. Kaufman & A. Alm (Eds.), Adler’s physiology of the eye (10th ed., pp. 603–638).
Mantiuk, R., Myszkowski, K., & Seidel, H.-P. (2006). A perceptual framework for contrast processing of high dynamic range images. ACM Transactions on Applied Perception, 3, 286–308.
Pei, Z., & Qiao, Q. (2010). An approximate retina model with cascade structures. In Proc. of international conference on natural computation. 2009–2012.
Polesel, A., Ramponi, G., & Mathews, V. J. (2000). Image enhancement via adaptive unsharp masking. IEEE Transactions on Image Processing, 9(3), 505–510.
Rajalakshmi, T., & Prince, S. (2014). Contour-contrast enhancement based on retinal layer processing. In IEEE conference on devices, circuits and systems (ICDCS’14) (pp. 312).
Ravikumar, & Rattan, M. (2012). Analysis of various quality metrics for image processing. International Journal of Advanced Research in Computer Science and Software Engineering, 2(11).
Senan, H., Saadane, A., & Barba, D. (2001). Design and evaluation of an entirely psychovisual-based coding scheme. Journal of Visual Communication and Image Representation, 12(4), 401–421.
Sivakumar, R., Tamilselvi, R., Archana, N., Deepthi, N., & Priyadarshini, N. (2011). Classification and detection of retinal disease. In International conference on signal, image processing and applications (pp. 21).
Smirnakis, S. M., Berry, M. J., Warland, D. K., Bialek, W., & Meister, M. (1997). Adaptation of retinal processing to image contrast and spatial scale. Nature, 386(6620), 69–73.
Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13, 1–12.
Yang, S. J., Ro, Y. M., Nam, J., Hong, J., Choi, S. Y., & Lee, J. H. (2004). Improving visual accessibility for color vision deficiency. MPEG-21, ETRI Journal, 25(3), 195–202.