Pattern Recognition Letters 1 (1983) 155-160
North-Holland Publishing Company
March 1983

Albedo estimation for scene segmentation

Chia-Hoang LEE and Azriel ROSENFELD
Computer Vision Laboratory, Computer Science Center, University of Maryland, College Park, MD 20742, U.S.A.
Abstract: Standard methods of image segmentation do not take into account the three-dimensional nature of the underlying scene. For example, histogram-based segmentation tacitly assumes that the image intensity is piecewise constant, and this is not true when the scene contains curved surfaces. This paper introduces a method of taking 3D information into account in the segmentation process. The image intensities are adjusted to compensate for the effects of estimated surface orientation; the adjusted intensities can be regarded as reflectivity estimates. When histogram-based segmentation is applied to these new values, the image is segmented into parts corresponding to surfaces of constant reflectivity in the scene.
Key words: Segmentation, albedo estimation.
1. Introduction

The purpose of this paper is to introduce a method of taking the three-dimensional structure of a scene into account when segmenting images of that scene using pixel classification methods. Most of the standard image segmentation techniques (thresholding, edge detection, region growing, etc.) tacitly assume that the image is piecewise constant, i.e., is composed of regions each having approximately constant intensity [1]. In thresholding, in particular, the histogram of image intensities is examined for the presence of peaks indicating highly populated intensity ranges. These intensity subpopulations are assumed to represent significant constant-intensity regions in the image, so that classifying the image pixels into such subpopulations segments the image into these regions. The assumption of a piecewise constant image is reasonable in some situations, e.g., if the scene consists of diffusely reflecting planar surfaces. On
Note: The support of the National Bureau of Standards under Grant NB80-NADA-1043 is gratefully acknowledged, as is the help of Janet Salzman in preparing this paper.
0167-8655/83/$03.00 © 1983 North-Holland
the other hand, if a surface has substantial curvature, there will be significant changes in intensity across its image, so that it will give rise to a smeared-out collection of intensities rather than to a histogram peak. Histogram-based segmentation of such an image will yield 'regions' that do not correspond to surfaces in the scene. Instead of assuming that image intensity is piecewise constant, one can regard it as piecewise linear (or, more generally, piecewise polynomial of some given degree); this is the basis for the 'facet model' used by Haralick [2]. This is certainly a more accurate representation, and is important in improving the performance of image segmentation by edge detection or region growing, since it models the nature of edges and of 'uniform' regions more correctly. On the other hand, it is not immediately useful in histogram-based segmentation, since linearly varying intensities are spread out over a range, and in a piecewise linear image, these ranges will often overlap, making it difficult to detect subpopulations in the histogram. This paper introduces a histogram-based method of image segmentation in which the image intensities are first adjusted to take the effects of surface orientation into account. If this can be done correctly, the resulting adjusted 'intensities'
now represent surface reflectivities, or more precisely, products of illumination times reflectivity; we shall refer to this quantity as 'albedo'. Peaks on the histogram of these albedos may be assumed to arise from significant visible surfaces of constant albedo in the scene. This provides a basis for segmenting the image into parts corresponding to these surfaces by detecting peaks on the adjusted histogram and classifying the pixels into the subpopulations represented by these peaks.

The image intensity at a point depends on the illumination, reflectivity, and surface orientation at the corresponding point in the scene. It will be assumed in this paper that the illumination comes from a distant point light source, and that the reflectivity is Lambertian; these are the assumptions most commonly used in image intensity analysis. Under these assumptions, the image intensity I at a given point P is given by I = λρ N·L, where λ and ρ are scalars representing the illumination intensity and reflectivity at P; N is the unit surface normal vector at P; and L is the unit vector at P in the direction of the light source. Thus I depends only on the product λρ (which we call the 'albedo') and on cos θ, the cosine of the angle between N and L. We shall show in this paper how to estimate cos θ at each point P of the given image. If we know cos θ, we can 'correct' the observed intensities by dividing them by cos θ: I/cos θ = λρ. If λ is approximately constant across the scene (recall that we assumed the light source was distant), this depends only on ρ, and allows us to detect histogram peaks corresponding to surfaces of uniform reflectivity (constant ρ). Note that in this paper we are not estimating the surface orientation relative to the viewer, but only its slant relative to the light source direction, as defined by cos θ.
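The Lambertian model and its inversion are simple enough to sketch in a few lines of code. The following Python fragment (the function name is ours, not from the paper) shows how two scene points with the same albedo λρ but different slants produce different image intensities, and how dividing each intensity by its cos θ recovers the common albedo value on which histogram peaks are based:

```python
import math

def lambertian_intensity(albedo, cos_theta):
    """Image intensity under a distant point source and Lambertian
    reflectance: I = (lambda * rho) * cos(theta).  The product
    lambda * rho is what the paper calls 'albedo'."""
    return albedo * max(cos_theta, 0.0)

# Two points with the same albedo but different slants give
# different intensities...
albedo = 200.0
i1 = lambertian_intensity(albedo, math.cos(math.radians(10)))
i2 = lambertian_intensity(albedo, math.cos(math.radians(60)))

# ...but dividing each intensity by its cos(theta) recovers the
# common albedo, so the corrected values form a single histogram peak.
a1 = i1 / math.cos(math.radians(10))
a2 = i2 / math.cos(math.radians(60))
```

Of course, the whole difficulty, addressed in Section 2, is that cos θ is not known and must itself be estimated from the intensities.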
Surface orientation (and light source direction) estimation could also be used as a basis for correcting the intensities; but such estimation cannot be done solely on the basis of local intensity values. (See the work of Horn and his students [3-5] on surface orientation estimation, and most recently the Ph.D. thesis of Pentland [6].) Another related idea is the work of Horn [7] on eliminating the effects of varying illumination from an image; but this work deals only with the case of a planar surface.
2. Method

Let I_P denote the image intensity at point P, let N_P denote the surface normal at P, and let θ_P denote the angle between N_P and the light source direction L, i.e., N_P·L = cos θ_P. Let Q and R be points lying near P, on each side of it, in the direction of the intensity gradient at P. Let θ_Q = θ_P + Δ₁, θ_R = θ_P − Δ₂, where Δ₁ and Δ₂ are small. We will first assume, for simplicity, that Δ₁ = Δ₂ = Δ.

Lemma 1. tan θ_P ≈ (I_R − I_P + ½Δ²I_P)/(ΔI_P).

Proof. We have
  I_R = λρ cos θ_R = λρ cos(θ_P − Δ)
      = λρ[cos θ_P cos Δ + sin θ_P sin Δ]
      ≈ λρ[cos θ_P (1 − ½Δ²) + Δ sin θ_P]
      = I_P(1 − ½Δ²) + λρΔ sin θ_P.
Hence
  [I_R − I_P(1 − ½Δ²)]/(ΔI_P) ≈ λρΔ sin θ_P / (Δλρ cos θ_P) = tan θ_P.

Lemma 2. Δ² ≈ (2I_P − I_R − I_Q)/I_P.

Proof. We have
  (2I_P − I_R − I_Q)/I_P = [2cos θ_P − cos(θ_P − Δ) − cos(θ_P + Δ)]/cos θ_P
                         = 2(1 − cos Δ) ≈ Δ².

The results in Lemmas 1 and 2 can be used to estimate θ_P, for each point P, in terms of I_P, I_Q, and I_R. We can then 'correct' I_P by dividing it by cos θ_P to obtain an estimated λρ. Note that λρ is just the intensity that we would observe if the light source were in the direction of the normal at P (for every P!). [In stereomapping, an orthophoto is a parallel-projection image, showing how the scene would look if the sensor were directly 'above' every point. Our corrected image can be regarded as a photometric analog of an orthophoto.] Note also that if P, Q and R lie on a planar surface we
have Δ = 0, and the formula in Lemma 1 breaks down; our method works only when Δ ≠ 0 (but is small). However, a plane will in any case yield a histogram peak, since it should have constant intensity, so that 'correction' is unnecessary for plane surfaces.

If we drop the assumption that Δ₁ = Δ₂, we can estimate the error in the result of Lemma 2 as follows.

Lemma 3. Let Δ be close to both Δ₁ and Δ₂, e.g., their average. Then
  (2I_P − I_R − I_Q)/I_P ≈ Δ² + (Δ₁ − Δ₂) tan θ_P.

Proof. We have
  (2I_P − I_R − I_Q)/I_P
    = [2cos θ_P − cos(θ_P − Δ₂) − cos(θ_P + Δ₁)]/cos θ_P
    = [2cos θ_P − cos((θ_P − Δ) − (Δ₂ − Δ)) − cos((θ_P + Δ) + (Δ₁ − Δ))]/cos θ_P
    ≈ [2cos θ_P − cos(θ_P − Δ) − (Δ₂ − Δ) sin(θ_P − Δ) − cos(θ_P + Δ) + (Δ₁ − Δ) sin(θ_P + Δ)]/cos θ_P
    ≈ Δ² + [(Δ₁ − Δ) sin(θ_P + Δ) − (Δ₂ − Δ) sin(θ_P − Δ)]/cos θ_P   (by Lemma 2)
    ≈ Δ² + [(Δ₁ − Δ)(sin θ_P + Δ cos θ_P) − (Δ₂ − Δ)(sin θ_P − Δ cos θ_P)]/cos θ_P
    ≈ Δ² + (Δ₁ − Δ₂) tan θ_P   (ignoring the term Δ(Δ₁ + Δ₂ − 2Δ), which vanishes when Δ is the average of Δ₁ and Δ₂).

From Lemma 3, we see that the error in the Lemma 2 result becomes large if Δ₁ − Δ₂ or tan θ_P is large. Thus if we use Lemmas 1 and 2 to estimate θ_P, the estimate will be bad near occluding edges of objects, where θ_P is changing rapidly and is close to 90°.

If the estimate is exact, and P, Q, R lie on a surface of constant reflectivity, after 'correction' they should all have the same intensity. If they do not, we can apply the method again (since we still have Δ ≠ 0), and iterate, as long as the intensity differences remain small (in our experiments, less than about 5 gray levels on a scale of 0-255). In practice, it was found to be beneficial to smooth the image using a 5×5 median filter after each iteration; the use of the median tends to avoid interactions across edges. To avoid bad effects near edges, the following scheme was used: In the interior of a smooth object we should have
  I_R − I_P ≈ λρΔ sin θ_P,
  I_Q − I_P ≈ −λρΔ sin θ_P.
Hence the ratio |I_R − I_P|/|I_Q − I_P| should be approximately 1. If it differs too greatly from 1 (e.g., lies outside the range [½, 2]), we examine the neighbors of P, pick the one for which the ratio is closest to 1, and use the new value at this neighbor as the new value at P.

3. Experiments

Our first two examples (Figs. 1 and 3) are synthetic images of spheres. We see that in three iterations the intensities of the spheres have become quite constant. Figs. 2 and 4 show the histograms at iterations 0, 1, 2, 3 for the two examples. The
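To make the iteration concrete, here is a one-dimensional Python sketch of a single correction pass. It is our own simplification of the scheme above, not the paper's implementation: a 1-D median filter stands in for the 5×5 one, and pixels failing the ratio test are simply left unchanged rather than replaced from the best neighbor.

```python
import math
import statistics

def median_smooth(row, k=5):
    """1-D analogue of the paper's 5x5 median smoothing."""
    half = k // 2
    out = list(row)
    for i in range(half, len(row) - half):
        out[i] = statistics.median(row[i - half:i + half + 1])
    return out

def correct_once(row):
    """One iteration: replace each interior pixel by an estimated
    I_P / cos(theta_P), guarded by the |I_R - I_P| / |I_Q - I_P| edge test."""
    out = list(row)
    for i in range(1, len(row) - 1):
        i_p, i_q, i_r = row[i], row[i - 1], row[i + 1]
        # Orient the triple so that R is the brighter side (theta_R = theta_P - delta).
        if i_q > i_r:
            i_q, i_r = i_r, i_q
        denom = abs(i_q - i_p)
        if denom == 0 or not (0.5 <= abs(i_r - i_p) / denom <= 2.0):
            continue                          # ratio far from 1: likely an edge
        d_sq = (2.0 * i_p - i_r - i_q) / i_p  # Lemma 2
        if d_sq <= 0.0:
            continue                          # flat patch: delta = 0, Lemma 1 breaks down
        d = math.sqrt(d_sq)
        tan_t = (i_r - i_p + 0.5 * d_sq * i_p) / (d * i_p)  # Lemma 1
        out[i] = i_p * math.sqrt(1.0 + tan_t * tan_t)       # I_P / cos(theta_P)
    return median_smooth(out)

# A Lambertian profile with albedo 100 and slant increasing from 20 to 60 degrees:
profile = [100.0 * math.cos(math.radians(20 + 2 * i)) for i in range(21)]
corrected = correct_once(profile)
```

On this synthetic profile the raw intensities range over tens of gray levels, while after a single pass the interior corrected values are all close to the true albedo of 100, which is exactly the sharpening of the histogram that the experiments in Section 3 exhibit.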
Fig. 1.
Fig. 2.
sphere histograms rapidly change from broad ramps to sharp spikes. Fig. 5 shows a real example: an image of three eggs. Here the reflectivities seem to be approximately Lambertian, but we used an extended light source; this appears to make little difference. Fig. 6 shows the histograms corresponding to the parts of Fig. 5. We see that initially the three eggs produce a single broad histogram peak, but after two iterations they have separated into three individual sharp peaks.

Fig. 3.

4. Concluding remarks
Fig. 4.

Fig. 5.

We have seen that by estimating the angle between the surface normal and the light source direction at each image pixel, we can 'correct' the image intensities so that they correspond more faithfully to surface reflectivity values. Subpopulations in the histogram of these corrected intensities thus correspond, at least roughly, to uniformly reflective surfaces in the scene, so that histogram-based segmentation can be used to extract such surfaces. This paper illustrates the benefits of using estimated 3D information about a scene as an aid in segmentation. Similar benefits should apply to other image analysis processes. The idea of using 3D surface shape estimation as an integral part of image segmentation and analysis is important when dealing with scenes containing curved surfaces, and should be investigated further.
Fig. 6.
References

[1] Rosenfeld, A. and L.S. Davis (1979). Image segmentation and image models. Proc. IEEE 67, 764-772.
[2] Haralick, R.M. and L. Watson (1981). A facet model for image data. Computer Graphics Image Processing 15, 113-124.
[3] Horn, B.K.P. (1975). Obtaining shape from shading information. In: P.H. Winston, ed., The Psychology of Computer Vision. McGraw-Hill, New York, pp. 115-155.
[4] Ikeuchi, K. and B.K.P. Horn (1981). Numerical shape from shading and occluding boundaries. Artificial Intelligence 17, 141-184.
[5] Woodham, R.J. (1981). Analyzing images of curved surfaces. Artificial Intelligence 17, 117-140.
[6] Pentland, A.P. (1982). The visual inference of shape: computation from local features. Ph.D. dissertation, Department of Psychology, Massachusetts Institute of Technology.
[7] Horn, B.K.P. (1974). Determining lightness from an image. Computer Graphics Image Processing 3, 277-299.