Accepted Manuscript

Underwater Image Processing Method for Fish Localization and Detection in Submarine Environment
Mohcine Boudhane, Benayad Nsiri

PII: S1047-3203(16)30084-0
DOI: http://dx.doi.org/10.1016/j.jvcir.2016.05.017
Reference: YJVCI 1768
To appear in: J. Vis. Commun. Image R.
Received Date: 4 August 2015
Revised Date: 27 May 2016
Accepted Date: 30 May 2016

Please cite this article as: M. Boudhane, B. Nsiri, Underwater Image Processing Method for Fish Localization and Detection in Submarine Environment, J. Vis. Commun. Image R. (2016), doi: http://dx.doi.org/10.1016/j.jvcir.2016.05.017
Underwater Image Processing Method for Fish Localization and Detection in Submarine Environment

Mohcine Boudhane (a), Benayad Nsiri (b)
(a) Faculty of Computer Science and Electrical Engineering, University of Applied Sciences, Grenzstr. 5, 24149 Kiel, Germany
(b) University Hassan 2, Faculty of Sciences, Ainchock B.P 5366 Maarif 20000, Casablanca, Morocco
Abstract

Object detection is an important process in image processing; it aims to detect instances of semantic objects of a certain class in digital images and videos. Object detection has applications in many areas of computer vision, such as underwater fish detection. In this paper we present a method for preprocessing and fish localization in underwater images. The method builds on Poisson-Gaussian theory, which accurately describes the noise present in a large variety of imaging systems. In the preprocessing step we denoise and restore the raw images. These images are then split into regions using the mean shift algorithm. For each region, a statistical estimation is carried out independently in order to combine regions into objects. The method is tested under different underwater conditions, and experimental results show that the proposed approach outperforms state-of-the-art methods.
© 2015 Published by Elsevier Ltd.
Keywords: Object detection, Image denoising, Scene understanding, Underwater image processing.
1. Introduction

Oceans cover most of the surface of the planet, roughly 70% of the Earth's surface. Detection in the marine environment has been an active research topic for many years, owing to the properties of the underwater medium and the limited human access to this environment. Many technologies have been developed to monitor and track the evolution of the marine environment, such as remotely operated vehicles (ROVs), systems targeting objects (STO), and autonomous underwater vehicles (AUVs) [1, 2, 3]. Today, most underwater detection and monitoring systems are based on cameras and the exploitation of image data. Computer vision and image processing have been particularly studied
in this context to develop robust and sophisticated algorithms for underwater research. Light absorption and scattering pose a bottleneck, because visibility underwater is only a few meters; even when clear water is considered, a visibility of about twenty meters is reported [4]. Recent works try to enhance underwater image quality and to reduce the noise level in order to detect and localize the objects in the images successfully. Some researchers propose filter-based methods for the reduction of undesirable noise [5, 6, 7]. Wavelet-based methods are proposed in [8] and [9]: in [8], the authors combine wavelet decomposition and a high-pass filter in order to remove back-scattering noise, while homomorphic filtering, anisotropic filtering and wavelet-based
thresholding are applied to reduce the additive noise in [9]. However, these wavelet-based methods cause unsharpness in the resulting image. In [10], the authors use a median filter to remove the noise, RGB color level stretching to enhance the quality of the image, and the dark channel prior to obtain the atmospheric light; this method only helps in the case of images with minor noise. Very noisy images were treated in [11] by means of bilateral filtering; the proposed solution gives good results, but the required processing time is very high. Statistical methods are proposed in [12, 13, 14]; these methods model the noise as a Poisson-Gaussian distribution and suppose that the image is independent from the noise. The authors show promising results at different noise levels. Besides noise, the absorption and scattering of light between the camera and the object degrade the quality of the captured images. The absorption of colors is non-uniform: red, for example, is absorbed more than blue, which leaves underwater images dominated by blue. This behavior increases the difficulty of identifying and detecting divers, fish and other objects in underwater images. The works in [15, 16, 17] apply regularization methods by means of laser technologies. Color polarization methods are proposed in [18, 19, 20], where the authors place a filter in front of the camera in order to make the color proportions in the captured images uniform. Combinations of laser-based technology and color polarization are proposed in [21, 22].
The challenge now is to create an efficient tool which is able to solve jointly the problems of noise, light absorption, and scattering effects. Our goal is to allow the submarine biologist to explore the underwater environment and analyze the behavior of different fish species. Several additional effects, such as rain, whirlpools, currents, salinity, temperature, waves and tides, make visibility even more difficult. Fig. 1 outlines the proposed approach; it is divided into four fundamental blocks: image denoising, image enhancement, estimation, and detection. The approach works without knowledge of the environment, in order to generalize the image preprocessing solution.

Fig. 1: Outline of the proposed approach.

The remainder of this paper is organized as follows. Section 2 describes the theoretical principle of the model under investigation; in this part, the denoising-enhancement process as well as the statistical estimation of the desired object are derived. In Section 3, experimental results and comparisons with other methods are shown. Section 4 concludes this work.

2. Theoretical principle

Notations:
v : a data vector.
αi : the mixture weights (1 ≤ i ≤ m).
µ, µi : mean element(s).
x, y : a strictly positive number and a real number, respectively.
η : a real scaling parameter.
Ci : covariance matrix.
zk : the realization of Z.
Z : series of observations.
m : number of Gaussians.
k : the mean of the PGM distribution.
RI,t : the observed random variable at time t and location I.
QI,t : the signal-dependent Poisson component.
NI,t : the signal-independent Gaussian component.
γ(zji) : the posterior probability for the jth observation and the ith Gaussian.
sl : the lth region.
Il : the lth region from the image I.
L : number of regions in the image.
Ni : the effective number of pixels assigned to the ith Gaussian.
j : observation index (1 ≤ j ≤ N, where N is the number of observations).

The goal of the proposed approach is to reduce the noise level as much as possible in order to enhance the quality of images in the submarine environment. There are principally two sources of noise. The first one comes from the capture itself; it usually depends
on the capture settings and the power of the device. The second is caused during transmission; a typical example is the information loss due to image compression. On the other hand, without compression, we would have to consume considerable temporal and material resources for the transmission of the images (Fig. 2).

Fig. 2: Sources of noise.

The Gaussian mixture is among the most popular models applied in statistics [23, 24]. It is a parametric probability density function represented as a weighted sum of m component Gaussian densities,

p(v/Ci, µi) = Σ_{i=0}^{m} αi f(v, Ci, µi),     (2.0.1)

where v is a data vector and the αi are the mixture weights (1 ≤ i ≤ m). Each component density is given by

f(v, Ci, µi) = (1/√(2πσ²)) exp( −(1/2) (v − µi)^T Ci^{−1} (v − µi) ),     (2.0.2)

with mean elements µi and covariance matrices Ci. The mixture weights satisfy the constraint

Σ_{i=0}^{m} αi = 1.

In [25] the authors build a statistical model based on a mixture of projected Gaussian distributions and wavelet-based algorithms, in which the expectation-maximization algorithm is used to accelerate the processing time of the algorithm. This method implicitly segments the image into regions of similar content.

The Poisson-Gaussian distribution is a statistical model formed by the combination of a Poisson distribution and a Gaussian distribution. In [12] the authors show that the noise produced in imaging devices can be modeled as a Poisson-Gaussian distribution; this combination yields a noise-removal method whose aim is to benefit from the properties of each distribution in image denoising. The Poisson component accounts for the signal-dependent uncertainty, while the Gaussian mixture component accounts for the other, signal-independent noise sources. The Poisson-Gaussian model can use the generalized Anscombe transform to stabilize the variance [12] and to ensure the precision of the denoising process; the authors use this technique to treat pictures with low intensity. In order to improve the perception of underwater images, we propose a new approach based on the Poisson-Gaussian model. We assume that the mean and the variance of the noise are not constant across the whole image. By means of the Poisson-Gaussian mixture, the denoising process is adapted to each region of the image. Furthermore, each Gaussian in the mixture takes a set of pixels (e.g. background or foreground pixels), which gives additional information to the next step of the processing.

Let X be a random process that follows a Poisson distribution with parameter x > 0; the probability mass function of X is given by

P_X(X = k) = e^{−x} x^k / k!,     (2.0.3)

and let Y be a random variable that follows a Gaussian distribution with variance σ² and mean µ; its probability density function is given by

P_Y(Y = y) = (1/√(2πσ²)) e^{−(y−µ)² / (2σ²)},     (2.0.4)

where k and y denote the realizations of the random variables X and Y, respectively.
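To make these building blocks concrete, the following minimal Python sketch (not part of the paper; it assumes NumPy and SciPy are available) evaluates the Poisson mass function (2.0.3), the Gaussian density (2.0.4), and a small one-dimensional mixture of the form (2.0.1). All parameter values are illustrative only.

```python
# Minimal sketch of equations (2.0.1)-(2.0.4); parameter values are illustrative.
import numpy as np
from scipy.stats import poisson, norm

x = 4.0                      # Poisson parameter (x > 0)
mu, sigma = 0.0, 1.5         # Gaussian mean and standard deviation

# (2.0.3): P_X(X = k) = e^{-x} x^k / k!
print(poisson.pmf(3, mu=x))

# (2.0.4): P_Y(Y = y) = (1 / sqrt(2*pi*sigma^2)) * exp(-(y - mu)^2 / (2*sigma^2))
print(norm.pdf(0.7, loc=mu, scale=sigma))

# (2.0.1): p(v) = sum_i alpha_i * N(v; mu_i, sigma_i), a one-dimensional mixture
alphas = np.array([0.6, 0.4])            # mixture weights, sum to 1
mus    = np.array([0.0, 3.0])            # component means
sigmas = np.array([1.0, 0.5])            # component standard deviations
v = 1.2
p_v = np.sum(alphas * norm.pdf(v, loc=mus, scale=sigmas))
print(p_v)
```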
Let Z = (Z1, Z2, · · · , ZN) be a series of observations forming a set of independent random variables, and let zk be a realization of Z, considered to be the measured noise intensity of the signal I. From [12], the Poisson-Gaussian distribution is defined by

p(z/y, σ) = Σ_{k=0}^{∞} ( e^{−x} x^k / k! ) (1/√(2πσ²)) e^{−(y−k)² / (2σ²)}.     (2.0.5)

The idea of this work is to generalize the Poisson-Gaussian distribution by means of several Gaussians. This new distribution, called the Poisson-Gaussian Mixture (PGM) distribution, is defined as

p(z/y, C) = Σ_{k=0}^{∞} ( e^{−x} x^k / k! ) Σ_{i=0}^{m} αi f(xi, Ci, µi),     (2.0.6)

where x is a strictly positive real number, m is the number of Gaussians, Ci is the covariance matrix of the ith Gaussian, and k and αi are the ith mean and mixture coefficient, respectively.

2.1. Noise model

Let Il = (x, y) be a location index corrupted with Poisson-Gaussian noise, for which we observe the realizations z. Each realization is indexed by a time index t. This framework leads to the following model:

(∀l ∈ {1, ..., L}) (∀t ∈ {1, ..., T})   RI,t = η QI,t + NI,t,     (2.1.1)

where η ∈ R is a scaling parameter and RI,t is the observed random variable at time t and location I. The noise model is represented with two mutually independent parts: the signal-dependent Poisson component QI,t and the signal-independent Gaussian component NI,t. In the above model, the following auxiliary random variables intervene:
• QI,t ∼ P(k),
• NI,t ∼ GMM(αi, µi, Ci), ∀i ∈ [1, m],
where m is the number of Gaussians, Ci is the covariance matrix of the ith Gaussian, and k and αi are the ith mean and mixture coefficient, respectively. The problem is to estimate αi, µi and Ci from the available observation vector Z = {Z1, Z2, · · · , ZN}, which is a realization of the random field RI,t.

2.2. Estimation of the image parameters

The parameters (αi, µi, Ci) are estimated by means of the Expectation-Maximization (EM) algorithm; the estimation can also be done by means of the maximum likelihood estimation algorithm [26]. Given training vectors and a Poisson-Gaussian mixture configuration, the goal is to estimate the parameters (αi, µi, Ci) of this distribution that best match the distribution of the training feature vectors. The aim of the EM algorithm is to maximize the likelihood function with respect to the parameters under investigation. The estimation can be divided into four steps:

1. Initialization step: µi, Ci, αi and the log-likelihood are initialized.

2. Expectation step: evaluate the posterior probabilities using the current parameter values,

γ(zji) = αi p(xj/µi, Ci) / Σ_{i=0}^{m} αi p(xj/µi, Ci),     (2.2.1)

where γ(zji) is the posterior probability for the jth observation and the ith cluster (ith Gaussian), and m is the total number of Gaussians in the mixture.

3. Maximization step: re-compute the parameters using the current posterior probabilities,

αi_new = Ni / N,     (2.2.2)

µi_new = (1/Ni) Σ_{j=0}^{N} γ(zji) xj,     (2.2.3)

Ci_new = (1/Ni) Σ_{j=0}^{N} γ(zji) (xj − µi_new)(xj − µi_new)^T,     (2.2.4)
where Ni denotes the effective number of pixels assigned to the cluster (Gaussian) i,

Ni = Σ_{j=0}^{N} γ(zji).     (2.2.5)

4. Evaluation step: evaluate the log-likelihood function,

ln p(xj/α, µ, C) = Σ_{j=0}^{N} ln { Σ_{i=0}^{m} αi p(xj/µi, Ci) },     (2.2.6)

where m is the number of Gaussians in the mixture. The last iteration of the algorithm is reached when the log-likelihood function and the parameters converge to a constant value (a minimal code sketch of these EM updates is given at the end of Section 2.3). Finally, the resulting filter model is convolved with the original image in order to reduce the noise. This estimation allows us to generate the desired filter, whose role is to eliminate the noise from the image by computing the convolution between the estimated filter and the original image. In the following, the segmentation step is described.

2.3. Segmentation

Several methods have been developed for segmenting images; the choice of the adequate technique depends on many factors (the segmentation of medical images differs from that of underwater images). In this work, we use the so-called mean-shift algorithm for the segmentation. In this phase, we obtain the segmented image (Fig. 3), which includes several regions. Each region is isolated and then treated separately in order to apply a statistical test to it. Fig. 4 shows this isolation applied to the segmented image. In the remainder of this section, the log-likelihood ratio test is carried out in order to measure the reliability of the estimate of the existence of the object. Then, the obtained value is verified and compared with the original data.

Fig. 3: Image segmentation via the mean shift algorithm.

Fig. 4: Image decomposition via the mean shift algorithm.
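As an illustration of Sections 2.2 and 2.3, the sketch below segments a frame with mean shift and then fits a small Gaussian mixture to each region. It is not the authors' implementation: OpenCV's pyrMeanShiftFiltering stands in for the mean-shift step, scikit-learn's GaussianMixture stands in for the Poisson-Gaussian mixture EM, and all parameter values (spatial/color radii, number of components) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): mean-shift segmentation followed by a
# per-region Gaussian-mixture fit, used as a stand-in for the Poisson-Gaussian
# mixture EM of Section 2.2. Assumes OpenCV and scikit-learn are installed.
import numpy as np
import cv2
from sklearn.mixture import GaussianMixture

# Synthetic 8-bit color frame standing in for an underwater capture.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)

# Mean-shift filtering smooths the image into regions of similar color
# (spatial radius sp and color radius sr are tuning parameters, not from the paper).
smoothed = cv2.pyrMeanShiftFiltering(frame, sp=10, sr=20)

# Crude region labels: threshold the smoothed image and take connected components.
gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
_, labels = cv2.connectedComponents((gray > gray.mean()).astype(np.uint8))

# For each region, fit a small Gaussian mixture to the pixel values. The paper's
# EM operates on a Poisson-Gaussian mixture; a plain GMM is used here only to
# illustrate the per-region estimation of (alpha_i, mu_i, C_i).
for region_id in np.unique(labels):
    pixels = frame[labels == region_id].reshape(-1, 3).astype(float)
    if len(pixels) < 10:
        continue
    gmm = GaussianMixture(n_components=2, covariance_type="full").fit(pixels)
    print(region_id, gmm.weights_)
```

In practice the per-region estimates (weights, means, covariances) would feed the filter construction and the log-likelihood ratio test described next.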
2.4. Statistics of Fish

The likelihood ratio method is considered by statisticians to be among the best methods of point estimation; it expresses how many times more likely a test result is under the true hypothesis than under the false one [27]. This method always gives a feasible result and can also be used with truncated data. The likelihood ratio (LR) is written as

LR = P(Test_true) / P(Test_false).     (2.4.1)

Let s = [s1, · · · , sL]^T denote the regions of a given image. We suppose that these regions follow a Poisson-Gaussian mixture distribution and, furthermore, that the segments are statistically independent. With this assumption, the proposed method takes the form of a log-likelihood ratio test,

LLR = p(s/H0) / p(s/H1),     (2.4.2)

where H0 and H1 are two hypotheses: H0 is the hypothesis of an existing object and H1 that of a non-existing one. To obtain the LLR value for the image I, we apply the following formula:

LLR = p(Il/H0) / p(Il/H1).     (2.4.3)

This formula is applied to each region; Fig. 5 shows its application in each region. Since the Il are independent, the formula takes the form

LLR = [ p(I1/H0) p(I2/H0) ... p(IL/H0) ] / [ p(I1/H1) p(I2/H1) ... p(IL/H1) ]     (2.4.4)
    = (0.236 · 0.401 · 0.756 ... 0.334) / (0.774 · 0.599 · 0.244 ... 0.676).     (2.4.5)

Fig. 5: Likelihood test process.

The value calculated in the last equation is interpreted as follows:
• If LLR > log(η) ⇒ H0 true; we do not reject H0.
• If LLR < log(η) ⇒ H1 true; we reject H0.
• If LLR ≈ log(η) ⇒ we cannot conclude.

In the case of color images, the result is the sum of the LLR values associated with each color component. For example, in the RGB space, it is recommended to use the following equation:

LLR = LLR_Red + LLR_Green + LLR_Blue.     (2.4.6)
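A minimal sketch of this decision rule follows (not the authors' code): it computes a per-region log-likelihood ratio from two hypothesised densities, sums the contributions of the three color channels as in (2.4.6), and compares the result with log(η). Plain Gaussian densities stand in for the Poisson-Gaussian mixture likelihoods, and the threshold and distribution parameters are illustrative assumptions.

```python
# Minimal sketch of the per-region log-likelihood ratio decision (Section 2.4).
# Plain Gaussian likelihoods stand in for the PGM densities p(.|H0) and p(.|H1).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
region = rng.normal(loc=120.0, scale=12.0, size=(40, 40, 3))  # synthetic RGB region

eta = 1.0                     # decision threshold (illustrative)
log_eta = np.log(eta)

llr_total = 0.0
for c in range(3):            # sum the LLR over Red, Green and Blue, eq. (2.4.6)
    pixels = region[:, :, c].ravel()
    # Hypothesised densities: H0 = "object" statistics, H1 = "background" statistics.
    log_p_h0 = norm.logpdf(pixels, loc=120.0, scale=12.0).sum()
    log_p_h1 = norm.logpdf(pixels, loc=80.0, scale=25.0).sum()
    llr_total += log_p_h0 - log_p_h1

if llr_total > log_eta:
    print("H0 retained: region labelled as object, LLR =", llr_total)
elif llr_total < log_eta:
    print("H0 rejected: region labelled as background, LLR =", llr_total)
else:
    print("No conclusion")
```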
3. Experimental results

In this section, our method is compared with conventional and state-of-the-art methods. The system is implemented on a standard PC (Intel Core i5-2520). Different images of different sizes have been used in the experiments. We conducted experiments to evaluate the proposed algorithm on degraded underwater images with unknown turbidity characteristics; these images present typical noise levels for underwater conditions.

3.1. Equipment used

The data was captured using the AUV Robbe 131, built at the University of Applied Sciences in Kiel, Germany (Fig. 6). This AUV is equipped with many sensors (sonar, cameras, navigation sensors). As optical sensors, the AUV uses two cameras. A forward-looking camera can detect buoys, which are used as objects of potential interest (OPIs), in different colors, and can recognize numbers printed on the OPIs. The second camera is oriented either towards the bottom or sideways and can be used to detect buoys, pipelines or OPIs in general below or next to the vehicle. Both cameras are identical, model IDS UI-1241LEC-HQ, and have very low power consumption and low noise (Fig. 7). Some key technical parameters are given in Table 1.

Fig. 6: AUV Robbe 131, created at the University of Applied Sciences Kiel in cooperation with the GEOMAR Helmholtz Centre for Ocean Research Kiel in 2013.

Fig. 7: AUV camera and pressure hull. The cameras are installed in compact cubical pressure hulls constructed from POM.

3.2. Lighting

Time-varying illumination patterns caused by the wave surface are a difficult problem to tackle, because the frequency of the waves is not constant in coastal areas; furthermore, the sun and cloud positions vary over time. In the experiments, we consider three cases: clear, dark and foggy lighting conditions with different noise levels.
3.3. Dataset

The system is tested on real underwater videos. Our AUV was tasked with recording data in the Baltic Sea, yielding more than 30000 frames (31542 frames). The original raw data coming from the acquisition device often has poor quality because of distortions such as noise, blurring, quantization, geometrical aberrations, etc. In this paper, we show the results of twelve captures of different complexity. Fig. 8.1, Fig. 8.2 and Fig. 8.3 show three different conditions of underwater images tested in the experiments.

Table 1: Technical parameters of the camera.
Power consumption:  65 mA @ 5 V (USB powered)
Interface:          USB 2.0
Frame rate:         5 FPS used (up to 25 FPS possible)
Resolution:         1280 x 1024 (width x height)
Opening angle:      126 (101 horizontal, 76 vertical)
Fig. 8.1: The data used in the experiments (clear condition).

Fig. 8.2: The data used in the experiments (complex background, i.e. algae).
Fig. 8.3: The data used in the experiments (dark and/or foggy conditions).

3.4. Experiment 1: experimental results

In this subsection, we present the results obtained with the proposed approach. Before presenting the results, we review the terminology used in the following numerical analysis. The Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR) are used to assess the quality of the reconstructed image. The MSE is defined by

MSE = (1/(MN)) Σ_{i=1}^{M} Σ_{j=1}^{N} (X_{i,j} − Y_{i,j})²,     (3.4.1)

where X_{i,j} and Y_{i,j} represent the original and the reconstructed images respectively, and M×N is the image size. The peak signal-to-noise ratio (PSNR), in dB, is calculated using

PSNR = 10 log10( 255² / MSE ).     (3.4.2)
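These two quality measures can be written directly in code. The short helper below (not from the paper) mirrors equations (3.4.1) and (3.4.2) for 8-bit images, where the peak value is 255.

```python
# MSE and PSNR as defined in (3.4.1) and (3.4.2), for 8-bit images (peak = 255).
import numpy as np

def mse(original: np.ndarray, reconstructed: np.ndarray) -> float:
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original: np.ndarray, reconstructed: np.ndarray) -> float:
    err = mse(original, reconstructed)
    return float("inf") if err == 0 else 10.0 * np.log10(255.0 ** 2 / err)
```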
All of our experiments consist of the same steps discussed before. Fig. 9 shows the different steps of image preprocessing. Fig. 9.b is corrupted with additive white Gaussian noise (σ = 15); Fig. 9.c and Fig. 9.d show the denoised and the enhanced image, respectively. In Fig. 9.d, we obtain an image that is clearer than the original one. Some numerical results are given in Table 2.
Fig. 9: Denoising process of our approach: (a) original image, (b) noisy image, (c) image denoised with the proposed approach, (d) correction by means of color filtering.

Table 2: PSNR (dB) for all test images.
σ      PSNR (noisy)    PSNR (our approach)
5      26.55           34.82
10     25.34           30.99
15     24.92           29.38
20     22.53           28.23
30     19.20           26.71
50     15.19           24.31

Fig. 10 presents the results of the estimation by the log-likelihood ratio test; it shows the objects estimated in the image. In Fig. 10.b several small regions are obtained, which include the estimated objects. After thresholding, the number of false regions is reduced and we obtain the new estimate of the object (see Fig. 10.c).
Fig. 10: Segmentation and fish detection process: (a) enhanced image, (b) automatic segmentation and statistical estimation, (c) segmentation and statistical estimation with threshold regularization.

3.5. Experiment: comparison with state-of-the-art methods

In this subsection, a comparison with conventional methods is made. First, we take raw images
(an image sequence) with additive noise. Then, we denoise the transformed image (assuming additive white Gaussian noise of unit variance) with either the non-local means filter (NLM filter) [28] or the block-matching and 3D filtering (BM3D) denoising algorithm [29]. The numerical results for different noise levels are shown in Table 4, which reports the PSNR values obtained after the denoising process. Each image is corrupted with additive white Gaussian noise at nine different power levels (σ = {5, 10, 15, 25, 40, 50, 70, 85, 100}), and the different algorithms are applied to the same noise realizations.

Table 3: Log-likelihood ratio test results.
Case                     Fig. 8 (a)    Fig. 8 (d)
Total regions            5             29
Object regions           1             28
Log-likelihood ratio     -0.24         403
Decision                 No            Yes

The proposed approach is tested on different images of the dataset. Table 4 presents the PSNR value for each denoising method, and these results are summarized in Fig. 13. The graphs in Fig. 13.(a), (b) and (c) show the PSNR values for different noise levels after preprocessing with the non-local means filter of Buades et al. [28], BM3D, and the proposed approach, respectively, in the clear, environmental and foggy conditions. The PSNR values of the proposed approach are comparatively higher for values of σ above 20. We can see that both Dabov et al. [29] and our approach reach a PSNR of more than 34 dB, and that between σ = 10 and σ = 20 Dabov et al. [29] performs nearly the same as our method. However, from σ = 15 its PSNR drops to around 30 dB, and from σ = 20 the NLM filter [28] offers better quality than Dabov et al. [29]. On the other hand, all curves decline gradually as the noise level σ increases. Between σ = 0 and σ = 10 the PSNR of the proposed approach is comparatively higher than that of the others. This implies that the greater the noise, the less sensitive our approach is compared with the others. Fig. 14 and Fig. 15 show the results for the foggy and the environmental conditions, respectively. Fig. 15 shows a visual example of an image denoised by the proposed approach and its comparison with the state-of-the-art methods. To give the reader
an indication of the typical computation times: for the 512x512 image in Fig. 14, the various denoising methods require approximately the following times: BM3D 4.5 s, NLM filter 11.5 s, and the proposed method 1.2 s. The compared algorithms were implemented in MATLAB. These results are obtained with an Intel Core i5-2520 processor running at 2.50 GHz.
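For reference, the evaluation protocol described above can be reproduced in a few lines: corrupt an image with white Gaussian noise at each σ, denoise it, and report the PSNR. The sketch below is not the authors' pipeline; OpenCV's fastNlMeansDenoisingColored is used as a convenient stand-in for the NLM filter of [28], the file name is hypothetical, and the σ values are those listed in the text.

```python
# Sketch of the noise-sweep evaluation protocol (not the authors' pipeline).
# cv2.fastNlMeansDenoisingColored stands in for the NLM filter [28].
import numpy as np
import cv2

def psnr(a, b):
    err = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if err == 0 else 10.0 * np.log10(255.0 ** 2 / err)

clean = cv2.imread("frame.png")          # hypothetical path to a dataset frame
if clean is None:                        # fall back to a synthetic frame so the sketch runs
    clean = np.full((120, 160, 3), 128, np.uint8)

rng = np.random.default_rng(0)
for sigma in (5, 10, 15, 25, 40, 50, 70, 85, 100):
    noisy = np.clip(clean + rng.normal(0, sigma, clean.shape), 0, 255).astype(np.uint8)
    denoised = cv2.fastNlMeansDenoisingColored(noisy, None, h=10, hColor=10)
    print(sigma, round(psnr(clean, noisy), 2), round(psnr(clean, denoised), 2))
```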
Table 4: Performance comparison (PSNR in dB) between the NLM filter [28], BM3D [29], and the proposed approach, under the clear, environmental and foggy conditions.

Condition:              ------- Clear -------    ---- Environment ----    ------- Foggy -------
Image     Peak/Noisy    NLM    BM3D   Proposed   NLM    BM3D   Proposed   NLM    BM3D   Proposed
Cap (1)   5/34.15       36.78  37.58  40.09      38.48  40.64  41.45      45.05  46.62  46.95
          10/28.13      33.11  33.71  34.20      34.62  36.62  36.51      41.13  44.47  45.63
          15/24.61      31.29  31.81  31.91      32.80  34.47  34.16      38.19  42.89  43.88
          25/20.17      29.23  29.91  29.56      30.70  32.17  33.41      34.19  40.71  41.05
          40/16.09      27.36  28.41  28.90      28.57  30.45  31.43      30.47  38.48  40.47
          50/14.15      26.42  27.80  28.09      27.47  29.87  30.54      28.64  37.97  39.27
          70/11.23      24.78  26.83  27.79      25.61  28.91  29.07      25.60  36.18  36.70
          85/9.54       23.56  26.27  26.97      24.64  28.32  28.22      23.66  35.05  35.02
          100/8.13      20.11  25.80  26.34      22.97  27.81  27.47      21.97  34.04  33.55
Cap (2)   5/34.15       40.00  41.98  42.44      37.21  38.76  40.77      44.81  46.60  48.68
          10/28.13      36.50  38.39  38.34      32.97  34.56  35.00      40.63  44.57  46.84
          15/24.61      34.36  36.25  36.12      30.39  32.18  32.22      37.51  43.04  44.81
          25/20.17      31.66  33.54  34.72      27.35  29.21  30.02      33.28  40.89  41.38
          40/16.09      29.16  31.08  32.58      24.28  26.34  28.41      29.23  38.68  39.92
          50/14.15      27.81  30.16  31.46      22.75  25.23  27.10      27.07  38.18  38.65
          70/11.23      25.32  28.46  29.75      20.38  23.55  24.99      24.16  36.47  35.55
          85/9.54       23.58  27.48  28.61      18.98  22.64  23.97      22.28  35.36  34.42
          100/8.13      22.52  26.67  27.68      17.75  21.39  22.61      20.86  34.32  31.85
Cap (3)   5/34.15       40.56  42.52  43.17      42.34  44.00  45.91      37.31  38.57  40.91
          10/28.13      37.26  38.84  38.28      38.41  40.50  42.34      34.54  35.41  37.22
          15/24.61      35.37  36.85  36.05      35.84  38.55  39.73      33.12  34.00  35.36
          25/20.17      32.63  34.45  35.70      32.28  36.06  36.78      31.06  32.45  33.62
          40/16.09      29.69  32.24  33.63      28.75  33.71  33.88      28.63  30.99  32.03
          50/14.15      28.35  31.45  32.39      27.08  32.84  32.58      27.48  30.39  31.06
          70/11.23      26.70  29.94  30.36      24.61  31.23  30.37      26.04  29.10  29.53
          85/9.54       25.79  29.06  29.10      23.11  30.28  29.08      25.27  28.32  28.37
          100/8.13      24.94  28.38  28.24      21.78  29.47  27.75      24.56  27.36  27.48
Cap (4)   5/34.15       38.23  40.23  41.18      39.12  40.94  42.03      44.41  46.44  46.83
          10/28.13      34.68  36.52  35.82      35.31  37.32  39.12      40.17  43.14  44.41
          15/24.61      32.56  34.32  34.19      33.24  35.32  36.50      37.15  41.08  40.90
          25/20.17      29.83  31.57  32.12      30.53  32.84  34.16      33.00  38.43  38.25
          40/16.09      27.29  28.99  30.76      27.59  30.47  31.75      28.97  35.83  36.64
          50/14.15      25.96  28.07  29.62      25.95  29.66  30.53      27.02  34.94  34.85
          70/11.23      23.64  26.48  27.87      23.11  28.04  28.49      24.11  33.22  33.70
          85/9.54       22.08  25.62  26.74      21.32  27.10  27.25      22.38  32.11  32.04
          100/8.13      20.71  24.92  25.84      19.79  26.29  25.95      20.88  31.02  30.63
Table 3 illustrates the log-likelihood ratio before and after applying the preprocessing. We can see that, before preprocessing, the log-likelihood ratio test is unable to detect enough object regions to predict the existence of the object: the input image is noisy and the system rejects the possibility of an object being present in the image. After preprocessing, however, the system detects more regions, which helps it reach the right decision. In conclusion, our approach performs well in terms of denoising and enhancement, which allows the log-likelihood ratio test to make a decision by estimating the existence of the object. Fig. 11 provides the results over all images. It indicates that the number of false positives (background detected as fish) and the number of undetected fish of the proposed approach are almost equal. Almost 94% of the pixels annotated as fish were automatically detected as fish, against 88% by BM3D [29].
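The pixel-level detection rates quoted above can be computed from binary masks as in the short sketch below (illustrative only; the masks here are random placeholders, not dataset annotations).

```python
# Pixel-level true-positive and false-positive rates from binary masks
# (annotation = manually labelled fish pixels, detection = automatic result).
import numpy as np

rng = np.random.default_rng(3)
annotation = rng.random((480, 640)) > 0.9     # hypothetical ground-truth fish mask
detection  = rng.random((480, 640)) > 0.9     # hypothetical detector output mask

tp = np.logical_and(detection, annotation).sum()
fp = np.logical_and(detection, ~annotation).sum()
tpr = tp / annotation.sum()                   # fraction of fish pixels detected
fpr = fp / (~annotation).sum()                # background pixels flagged as fish
print(round(tpr, 3), round(fpr, 3))
```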
Fig. 11: ROC curves of the automatic detection process. In blue: the proposed approach; in red: BM3D [29].

Fig. 12: Image denoising using the proposed approach. Left: noisy (σ = 50) grayscale underwater image; right: result of the proposed approach (PSNR 29.62 dB).

Fig. 13: (a) Performance comparison for images in the clear environment. In red: the proposed approach; in black: NLM filter [28]; in blue: BM3D [29].
Fig. 14: (a) Noisy image (σ = 50), (b) image denoised with the NLM filter [28] (PSNR = 28.35 dB), (c) image denoised with BM3D (PSNR = 31.54 dB), (d) image denoised with the proposed approach (PSNR = 32.39 dB). Fig. 16 in the appendix shows more results.
Fig. 15: Fragments of noisy (σ = 50) grayscale images and the corresponding estimates by the proposed approach. In each pair, the left image is the noisy input and the right image is the denoised result obtained with our method.
4. Conclusion

A great variety of automatic object detection and image denoising algorithms is currently known. We have proposed a new method for underwater image preprocessing. In the first step, the denoising process is defined based on a Poisson-Gaussian mixture distribution, and an image enhancement process is then defined. In the second step, a statistical estimation by the log-likelihood ratio test is performed, which treats each region of the segmented image independently. Despite the availability of automatic methods, algorithms that require user interaction are still commonly applied, because of practical problems with automatic registration and real-time operation, mainly when no information about the environment is available. The solution proposed in this paper can be used in different conditions without knowledge of the environment. In addition, the proposed approach can be adapted to various noise models, such as additive colored noise, non-Gaussian noise, etc., by modifying the calculation of some parameters.

References
[1] S. Veerachart, P. Patompak, N. Itthisek, A robust adaptive control algorithm for remotely operated underwater vehicle, in: Proc. SICE '13, Nagoya, Japan, 2013, pp. 655–660.
[2] D. Liang, Q. Huang, S. Jiang, H. Yao, G. W., Autonomic management for the next generation of autonomous underwater vehicles, in: Proc. IEEE ICIP '07, Southampton, UK, 2007, pp. 369–372.
[3] W. Kirkwood, AUV technology and application basics, in: Proc. MTS/IEEE OCEANS '08, Kobe, Japan, 2008, p. 15.
[4] M. Legris, K. Lebart, F. Fohanno, B. Zerr, Les capteurs d'imagerie en robotique sous-marine: tendances actuelles et futures, GRETSI, Saint Martin d'Hères, France, 2003.
[5] C. Chang, J. Hsiao, C. Hsieh, An adaptive median filter for image denoising, in: Proc. IEEE IITA '08, Qingdao, China, 2008, pp. 346–350.
[6] C. Wang, J. Zhang, Image denoising via clustering-based sparse representation over Wiener and Gaussian filters, in: Proc. IEEE S-CET '12, Qingdao, China, 2012, pp. 1–4.
[7] A. Nath, Image denoising algorithms: A comparative study of different filtration approaches used in image restoration, in: Proc. IEEE CSNT '13, Gwalior, India, 2013, pp. 157–163.
[8] P. Prabhakar, P. Kumar, Underwater image denoising using adaptive wavelet subband thresholding, in: Proc. IEEE ICSIP '10, Chennai, India, 2010, pp. 322–327.
[9] S. Feifei, Z. Xuemeng, W. Guoyu, An approach for underwater image denoising via wavelet decomposition and high-pass filter, in: Proc. IEEE ICICTA '11, Shenzhen, China, 2011, pp. 417–420.
[10] M. Donna, F. Kocak, M. Caimi, The current art of underwater imaging - with a glimpse of the past and vision of the future, Marine Technology Society Journal 39 (3) (2005) 5–26.
[11] M. Zhang, B. Gunturk, Multiresolution bilateral filtering for image denoising, IEEE Trans. Image Process. 17 (12) (2008) 2324–2333.
[12] M. Makitalo, A. Foi, Optimal inversion of the generalized Anscombe transformation for Poisson-Gaussian noise, IEEE Trans. Image Processing 22 (1) (2013) 91–103.
[13] A. Jezierska, E. Chouzenoux, J. Pesquet, H. Talbot, A primal-dual proximal splitting approach for restoring data corrupted with Poisson-Gaussian noise, IEEE Trans. Image Processing 22 (1) (2014) 91–103.
[14] A. Foi, Noise estimation and removal in MR imaging: the variance-stabilization approach, in: Proc. IEEE Biomedical Imaging Conference '11, Chicago, USA, 2011, pp. 1809–1814.
[15] J. Forand, G. Fournier, D. Bonnier, P. Pace, LUCIE: a laser underwater camera image enhancer, in: Proc. MTS/IEEE OCEANS '93, Victoria, 1993, pp. 187–190.
[16] Y. Shubin, P. Fuyuan, Laser underwater target detection based on Gabor transform, in: Proc. IEEE ICCSE '09, Nanning, China, 2009, pp. 95–97.
[17] B. Ouyang, F. Dalgleish, A. Vuorenkoski, W. Britton, B. Ramos, B. Metzger, Visualization and image enhancement for multistatic underwater laser line scan system using image-based rendering, IEEE Journal of Oceanic Engineering 38 (3) (2013) 566–580.
[18] L. Yongguo, S. Wang, Underwater polarization imaging technology, in: Proc. CLEO/PACIFIC RIM '09, Shanghai, China, 2009, pp. 1–2.
[19] V. Gruev, J. Van der Spiegel, N. Engheta, Advances in integrated polarization image sensors, in: Proc. IEEE/NIH LiSSA '09, Bethesda, USA, 2009, pp. 62–65.
[20] P. Chang, J. Flitton, K. Hopcraft, E. Jakeman, D. Jordan, J. G. Walker, Improving visibility depth in passive underwater imaging by use of polarization, IEEE Journal of Oceanic Engineering 42 (15) (2003) 2794–2803.
[21] L. Jin, L. Gao, L. Huo, W. Yuan, Y. Luo, A. Ho, C. Lin, Technique for false image correction in second harmonic generation microscopy by modulating laser polarization, in: Proc. International Symposium on Metamaterials '06, Hangzhou, China, 2006, pp. 72–75.
[22] B. Swartz, D. James, Laser range-gated underwater imaging including polarization discrimination, SPIE 42 (15) (1991) 42–56.
[23] S. Dasgupta, Learning mixtures of Gaussians, in: Proc. Fortieth Annual IEEE Symposium on Foundations of Computer Science (SFFCS '99), New York, USA, 1999, pp. 634–644.
[24] M. Jordan, J. Kleinberg, B. Scholkopf, Information Science and Statistics, 1st Edition, Springer, Cambridge, U.K., 2006.
[25] H. Rabbani, M. Vafadoost, I. Selesnick, Wavelet based image denoising with a mixture of Gaussian distributions with local parameters, in: Proc. ELMAR '06, Zadar, Croatia, 2006, pp. 85–88.
[26] F. Liu, Y. Wang, P. Wang, J. Huang, Two iterative algorithms for maximum likelihood estimation of Gaussian mixture parameters, in: Proc. IEEE ICNC '13, Shenyang, China, 2013, pp. 1454–1458.
[27] G. Rodriguez, Lecture Notes on Generalized Linear Models, review of likelihood theory, http://data.princeton.edu/wws509/notes/a1.pdf, [Online; accessed 3-June-2014] (2007).
[28] A. Buades, B. Coll, J.-M. Morel, Non-local means denoising, IPOL Journal - Image Processing On Line (2011).
[29] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3D transform-domain collaborative filtering, IEEE Transactions on Image Processing 16 (8) (2007) 2080–2095.
5. Appendix

5.1. Appendix 1: The log-likelihood ratio test

Let R1, ..., RL denote independent regions, where each region represents a part of the image. We suppose these regions follow a Poisson-Gaussian mixture distribution. Given the input image, a set of segments is formed, each consisting of a group of pixels that are statistically dependent; the segments themselves are then treated as statistically independent. With this assumption, the proposed method takes the form of a likelihood ratio test with threshold

η = P(object) / P(non-object),     (5.1.1)

where Test < η ⇒ non-object and Test > η ⇒ object.

Let H0 and H1 be two hypotheses: H0 is the hypothesis of existence of the object and H1 is the hypothesis of non-existence,

H0 : s[n] = o[n],   H1 : s[n] = I[n] − o[n],   n = 1, ..., L,     (5.1.2)

where n is the segment index. Let

I = (I[0], I[1], I[2], ..., I[L−1]),   o = (o[0], o[1], o[2], ..., o[L−1])     (5.1.3)

be the sets of segments, where I[i] denotes the regions of the image I, with Σ_{i=0}^{L} I[i] = I, and o[i] denotes the hypothesis of existence of the object in each segment. Here the I[n] are known, the o[n] follow a PGM(λ, αi, µi, Ci) distribution, and λ and C are known. To simplify the problem, we assume that o[n] follows a PGM distribution with one Gaussian and zero mean. Then, when the data of the ith region presents an object (s = o), the hypothesis H0 is defined by

H0 : s = o,   f(s) ∼ PGM(λ, 0, σ·J)  ⇒  f_H0(s) = ( λ^k / ((2πσ²)^{1/2} k!) ) exp( −(1/(2σ²)) (s^T s − λ) ),     (5.1.4)

and the hypothesis H1 is defined by

H1 : s = I − o,   f(s) ∼ PGM(λ, I, σ·J)  ⇒  f_H1(s) = ( λ^k / ((2πσ²)^{1/2} k!) ) exp( −(1/(2σ²)) ((I − s)^T (I − s) − λ) ).     (5.1.5)

Then:

H0 : f_H0(s) = ( λ^k / ((2πσ²)^{1/2} k!) ) exp( −(1/(2σ²)) (s^T s − λ) ),     (5.1.6)
H1 : f_H1(s) = ( λ^k / ((2πσ²)^{1/2} k!) ) exp( −(1/(2σ²)) (I^T I − I^T s − s^T I + s^T s) ).     (5.1.7)

Now we can apply the likelihood ratio:

LR(s) = f_H0(s) / f_H1(s).     (5.1.8)

Applying the previous equations, we obtain:

LR(s) = exp( −(1/(2σ²)) (I^T I − I^T s − s^T I + s^T s) − λ + λ )     (5.1.9)
      = exp( −(1/(2σ²)) (−I^T I + I^T s + s^T I) )     (5.1.10)
      = exp( −(1/(2σ²)) (2 I^T s − I^T I) ).     (5.1.11)

If the treated image is a gray-scale image, this is straightforward. Since we also deal with color images, the likelihood ratio is written as

LR(s) = LR_Red(s) · LR_Green(s) · LR_Blue(s).     (5.1.12)

We assume that P(object) = 1 − P(non-object), i.e.

η = P(H0) / P(H1).     (5.1.13)

Then, the log-likelihood ratio is

LLR = log(LR(s)) = − (2 I^T s − I^T I) / (2σ²).     (5.1.14)

This formula is applied to each region; Fig. 5 shows its application in each region. To obtain the LLR value for the image I, we apply the following formula:

LLR = p(Ii/H0) / p(Ii/H1).     (5.1.15)

Since the Ii are independent, this takes the form

LLR = [ p(I1/H0) p(I2/H0) ... p(Im/H0) ] / [ p(I1/H1) p(I2/H1) ... p(Im/H1) ]     (5.1.16)
    = (0.236 · 0.401 · 0.756 ... 0.334) / (0.774 · 0.599 · 0.244 ... 0.676).     (5.1.17)

The value calculated in the last equation is interpreted as follows:
• If LLR > log(γ) ⇒ H0 true; we do not reject H0.
• If LLR < log(γ) ⇒ H1 true; we reject H0.
• If LLR ≈ log(γ) ⇒ we cannot conclude.

In the case of color images, the result is the sum of the LLR values associated with each color,

LLR = Σ_{i=1}^{3} LLR_Color_i.

For example, in the RGB space, it is recommended to use the following equation:

log(LR(S)) = L_Red(S) + L_Green(S) + L_Blue(S).     (5.1.18)
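Under the simplifying assumptions of this appendix, the log-likelihood ratio (5.1.14) has a closed form that can be evaluated per region and summed over the color channels as in (5.1.18). The few lines below (illustrative only, not the authors' code; the region data is synthetic) do exactly that for vectorized region values.

```python
# Closed-form per-region LLR of eq. (5.1.14), summed over color channels (5.1.18).
import numpy as np

def region_llr(I: np.ndarray, s: np.ndarray, sigma: float) -> float:
    # I: reference values of the region, s: observed region values (both flattened).
    return -(2.0 * I @ s - I @ I) / (2.0 * sigma ** 2)

rng = np.random.default_rng(2)
I_region = rng.normal(100.0, 5.0, size=(300, 3))   # illustrative region, 3 channels
s_region = I_region + rng.normal(0.0, 3.0, size=I_region.shape)

llr = sum(region_llr(I_region[:, c], s_region[:, c], sigma=3.0) for c in range(3))
print(llr)
```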
5.2. Appendix 2: Performance comparison
Fig. 13: (b) Performance comparison for images with environmental problems. In red: the proposed approach; in black: NLM filter [28]; in blue: BM3D [29].
Fig. 13: (c) Performance comparison for images in foggy conditions. In red: the proposed approach; in black: NLM filter [28]; in blue: BM3D [29].
Highlights:
• A new underwater image preprocessing method for underwater detection is proposed.
• The method consists of three procedures: image denoising, image segmentation via the mean-shift algorithm, and a log-likelihood ratio test.
• A Poisson-Gaussian mixture algorithm is proposed for noise reduction.
• The log-likelihood ratio test is applied for robust fish detection.
• The experiments show that image quality improves after denoising, which in turn improves object detection performance; the numerical results outperform state-of-the-art methods.