Hybrid shadow testing scheme for ray tracing
K S Eo and C M Kyung
The paper presents a new shadow testing acceleration scheme for ray tracing called hybrid shadow testing (HST), based on conditional switching between the conventional shadow testing method and Crow's shadow volume method, where the shadow polygons as well as the object polygons are registered onto the corresponding cells under the 3D space subdivision environment. Despite the preprocessing time needed for the generation and registration of the shadow polygons, the total shadow testing time of HST was approximately 50% of that of conventional shadow testing for several examples, while the total ray tracing time was typically reduced by 30%. This is due to the selective use of the shadow volume method, with a compromise between maximizing use of the shadow's spatial coherency and minimizing the computational overhead for checking ray intersections with the shadow polygons. A parameter Nth, denoting the critical number of shadow polygons between successive reflection points, was used as a guideline for switching the shadow testing scheme between the conventional method and the shadow volume method. A method for calculating Nth from statistical data such as the number of object polygons, average polygon size, and average peripheral length of the polygons was proposed, resulting in good agreement with the experimental results.
computer graphics, ray tracing, shadow testing
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, P.O. Box 150, Cheongryang, Seoul 131, South Korea

The principle of ray tracing has been widely used for realistic image synthesis in 3D computer graphics, as it uses a global illumination model instead of the local one used in the scan-line algorithm. However, ray tracing still suffers from extremely long computation times, most of which are taken up by shadow testing and ray-object intersection calculations. Until now, more effort has been put into reducing the ray-object intersection calculation time than into reducing the shadow testing time. There are two representative schemes for accelerating the ray-object intersection calculations, i.e. the bounding volume method 1-3 and the space subdivision method 4-6. The bounding volume method uses a hierarchy of bounding volumes or extents enclosing primitives in a scene to prune many unnecessary intersection checks at the early stages. The bounding volumes can be spheres, boxes or other simple volumes for which it is mathematically relatively simple to calculate the intersection point with a ray. The space subdivision method divides the whole object space into an octree structure 5 or into a 3D array of uniform cube cells 6. The ray visits the cells it pierces one by one, and is tested for intersection only against the objects registered in those cells. Dippe and Swensen proposed an adaptive subdivision scheme that changes the size of cells dynamically to balance the workload of each cell 4.

In this paper, an approach is described for reducing the CPU time for shadow testing during ray tracing, as shadow generation is very important for realistic image generation. Accelerating the shadow testing procedure is essential in reducing the overall ray tracing time, especially when there are multiple light sources, as it is not uncommon for shadow testing time to account for more than 50% of the total rendering time in conventional ray tracing. Shadow testing during ray tracing corresponds to determining the visibility of a reflection point from the light source. If any opaque object lies between the reflection point and the light source, the point is not visible from the light source, i.e. it is in shadow. In the conventional shadow testing scheme 1, a 'shadow ray' is fired from a reflection point towards each light source to determine the visibility of that reflection point from each light source (see Figure 1(a)). The shadow ray is advanced until it meets either a blocking opaque object (the point is in shadow) or a light source (the point is not in shadow). Although the program implementation is very easy, since it is basically identical to the ray-object intersection calculation, the shadow testing time is very long because of the many intersection calculation steps involved.
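The conventional test described above can be illustrated with a minimal Python sketch. It assumes triangular occluders and checks every occluder directly, so it ignores the cell-by-cell traversal of the space subdivision; the function names and the use of the Moller-Trumbore intersection test are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]
Triangle = Tuple[Vec, Vec, Vec]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_hits_triangle(origin: Vec, target: Vec, tri: Triangle,
                          eps: float = 1e-9) -> bool:
    """True if the open segment origin -> target crosses the triangle."""
    d = sub(target, origin)          # unnormalised: t in (0, 1) spans the segment
    e1 = sub(tri[1], tri[0])
    e2 = sub(tri[2], tri[0])
    h = cross(d, e2)
    a = dot(e1, h)
    if abs(a) < eps:                 # segment parallel to the triangle plane
        return False
    f = 1.0 / a
    s = sub(origin, tri[0])
    u = f * dot(s, h)
    if u < 0.0 or u > 1.0:
        return False
    q = cross(s, e1)
    v = f * dot(d, q)
    if v < 0.0 or u + v > 1.0:
        return False
    t = f * dot(e2, q)
    # Exclude both end points so the surface holding the reflection point
    # and the light position itself do not count as blockers.
    return eps < t < 1.0 - eps


def in_shadow_conventional(point: Vec, light: Vec,
                           occluders: Sequence[Triangle]) -> bool:
    """Fire a shadow ray from the reflection point towards the light."""
    return any(segment_hits_triangle(point, light, tri) for tri in occluders)
```

Every call tests the shadow ray against each candidate polygon until a blocker is found, which is the per-point cost that the schemes discussed in this paper try to reduce.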
Haines and Greenberg 7 proposed another shadow testing acceleration scheme, a 'light buffer', that is a cube surrounding each light source consisting of six (front, back, top, bottom, right, left) screens for each facet. This is constructed in a preprocessing step. Each pixel of the light buffer contains visibility information of low resolution in the direction of each pixel from the light source. Each pixel also contains an ordered list of all objects that intersect with the cone composed of the light source and the pixel rectangle. The shadow testing of a reflection point begins by looking at the pixel of the buffer pierced by the shadow ray. The blocking object list prestored in each pixel of the light buffer enables fast determination of the visibility of the
0010-44851(89/010038-11 $03.00 © 1989 (Butterworth & Co) Publishers Ltd
computer-aided design
Figure 1. 2D analogy of two shadow testing methods, the conventional and shadow volume methods, in a 3D spatial subdivision environment. (a) Conventional shadow testing method: a shadow ray is fired from the reflection point to test the shadow condition. As shadow ray SR1, fired from RP1 towards the light source, meets an object at point P, while the other shadow ray SR2, fired from RP2 towards the light source, meets no object, RP1 is determined to be in shadow and RP2 is not. (b) Shadow volume method: the intersection calculation between the ray and each shadow polygon is performed instead of shooting a shadow ray towards the light sources. The shadow volume is defined by the region bounded by front-facing and back-facing shadow polygons, each of which is in turn defined by the extension of the plane formed by the light source and the relevant silhouette edge. The shadow depth count, DC, is an integer value incremented by 1 when the ray intersects a front-facing shadow polygon and decremented by 1 when the ray intersects a back-facing shadow polygon. Spatial regions where DC is positive are in shadow, while those where DC is zero are not in shadow
reflection point from each light source. Therefore, the light buffer approach can be considered as subdividing the whole 3D space into a set of pyramids emanating from the light source position. This pyramidal subdivision has the drawback that spatial resolution decreases as the distance from the light source increases, which results in a longer computation time for the shadow calculation of a reflection point far from the light source. This is because the shadow testing time of a reflection point increases with the number of objects prestored in the corresponding pixel of the light buffer.

Yet another shadow calculation scheme, proposed by Crow, is based on the concept of shadow volumes in the context of the scan-line hidden surface algorithm 8. Shadow volumes are composed of shadow polygons, which are defined as the projections of the silhouette edges from each corresponding light source. There are two kinds of silhouette edges as defined by Bergeron 9:
• the extreme edges of the open polyhedrons
• the edges between two polygons that are facing towards and away from the light source
Although an edge is theoretically a silhouette edge only when it is visible from the light source, in this discussion, edges are considered as silhouette edges as long as they meet either of these two conditions, as it reduces
volume 21 number 1 january/february 1989
computation without any influence on the shadow condition. When a ray pierces a shadow polygon in ray tracing, the visibility from the light source may change. A term denoting the current shadow depth, the depth count (DC), is updated for each intersection of the ray with shadow polygons. When the ray pierces a front-facing shadow polygon, DC is incremented (+1), and it is decremented (-1) when the ray intersects a back-facing shadow polygon (see Figure 1(b)). If there are no shadow polygons in the trace path of a ray, the depth count does not change, thereby leaving the end point of the trace path in the same shadow condition as the starting point at no additional cost. This is due to the shadow's spatial coherency, which needs to be exploited in ray tracing to reduce the shadow testing time.

This paper describes an efficient shadow testing algorithm for ray tracing using the shadow volume concept; i.e. when a ray visits a cell, the intersections of the ray with the shadow polygons as well as the object polygons stored in the list are tested. The depth counts are updated as the ray is traced. When the reflection point of a ray on the surface of an object is found, the depth count value determines the visibility of a light source at that point. In this scheme, the generation of a shadow ray is unnecessary. However, despite the use of a shadow's spatial coherency, the ray tracing scheme with shadow volume was shown to be not necessarily superior to the conventional shadow
testing scheme according to experimental results given in this paper. This is attributed to the additional cost of calculating the intersection point of a ray with shadow polygons when a large number of shadow polygons are traversed by the ray, while in the conventional scheme the cost of intersection calculation with shadow polygons is zero. The so-called hybrid shadow testing (HST), which is a new shadow testing algorithm proposed in this paper, is based on conditional switching between the shadow volume method and the conventional shadow testing method under the uniform space subdivision environment, where the shadow polygons as well as the object polygons are registered onto the relevant 3D space subdivision cells. In the next section, the notation used in this paper is defined, and in the section which follows that, the CPU time behaviours of the shadow volume method and the conventional shadow testing method are analysed, and an expression for Nth, the critical number of the shadow polygons which are registered onto the cells pierced by the ray between two successive reflection points, is derived. A brief description of the proposed HST algorithm then follows and the experimental results are given.
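The depth count bookkeeping used by the shadow volume method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: shadow polygon normals are taken to point out of the shadow volume, so a polygon is treated as front-facing when its normal opposes the ray direction, and the function names are hypothetical.

```python
from typing import Iterable, Tuple

Vec = Tuple[float, float, float]


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def crossing_delta(ray_dir: Vec, outward_normal: Vec) -> int:
    """+1 for a front-facing shadow polygon, -1 for a back-facing one.

    Assumption: normals point out of the shadow volume, so a normal that
    opposes the ray direction means the ray is entering the volume.
    """
    facing = dot(ray_dir, outward_normal)
    if facing < 0.0:
        return +1
    if facing > 0.0:
        return -1
    return 0  # grazing crossing: leave the count unchanged


def trace_depth_count(dc: int, ray_dir: Vec,
                      crossed_normals: Iterable[Vec]) -> int:
    """Update the depth count DC over the shadow polygons a ray segment crosses."""
    for normal in crossed_normals:
        dc += crossing_delta(ray_dir, normal)
    return dc


def in_shadow(dc: int) -> bool:
    """A point is in shadow when its depth count is positive."""
    return dc > 0
```

When the list of crossed shadow polygons is empty, the depth count is returned unchanged, which is the zero-cost case that the shadow's spatial coherency makes possible.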
NOTATION
2DDDA  2D digital differential analyser
3DDDA  3D digital differential analyser
Aavg,p  average area of polygons
Ap  area of polygon
Ath  threshold parameter of the polygon area which is typically 3
Cs  cost of testing the intersection between a ray and a shadow polygon
Csv  shadow testing cost of the shadow volume scheme
Ci  ray initialization cost in 3DDDA
Cc  ray advancement cost in the uniform spatial subdivision environment
Ccv  cost of shadow testing in the conventional shadow testing scheme
Ccv1  cost incurred by the ray advancement in the conventional shadow testing process
Ccv2  cost incurred by the ray-object intersection calculations in the conventional shadow testing process
Co  cost of calculating the intersection point of a ray with an object polygon
DC  shadow depth count
HST  hybrid shadow testing
k  probability that at least one polygon among the polygons registered in a cell actually intersects the shadow ray segment in the conventional shadow testing process
l  length of a line segment
L  light vector
Lavg,p  average peripheral length of polygons
Lp  peripheral length of polygon
MBB  minimum bounding box
N  surface normal vector
Nc  number of cells pierced by a shadow ray segment between the reflection point and the light source
Ncl  total number of cells in the uniform spatial subdivision environment
No  total number of object polygons registered in all the cells pierced by the shadow ray segment between the reflection point and the light source
nor  average number of objects registered in a cell
Not  total number of object polygons in the whole environment
Nl  the number of cells pierced by a line segment of length l in 3DDDA
Np(Lp, Ap)  the number of cells in which a polygon whose peripheral length and area are Lp and Ap, respectively, is registered under the uniform spatial subdivision environment
Ns  total number of shadow polygons registered in the cells pierced by one ray segment
Nth  threshold number for switching between the conventional shadow testing and the shadow volume testing
Nth2  threshold number for switching between the coherence shadow testing and the shadow volume testing
p  average probability that an object polygon registered in a cell is really intersected by a ray visiting that cell
qi  probability that the conventional shadow testing process is finished after the ray's visit to the ith cell
SC  shadow condition, which is zero when the reflection point is visible from the light source and otherwise nonzero
α  average number of object polygons registered in a cell which must be tested for intersections with the ray visiting the cell
β  hit rate that denotes the probability that the object blocking a pixel from a light source also blocks the adjacent pixel from the light source
COST COMPARISON BETWEEN SHADOW VOLUME AND CONVENTIONAL SCHEMES

In this section, analytic expressions are derived for the shadow testing time in the conventional shadow testing method and in the shadow volume method. It is assumed that the whole object space is divided into uniform 3D space subdivision cells, and that each primitive cell holds a list of all polygons partially or fully overlapping that cell 6. A ray segment is defined as a line segment between two successive reflection points or between the eye position and the first reflection point; its end points are shown as P1 and P2 in Figure 2. The ray visits each cell pierced by the ray segment one by one from P1 to P2. In this implementation of ray tracing under uniform space subdivision, each ray segment is provided with a bucket for storing a polygon list. Even though some object polygons registered in a cell are not intersected by the ray or have an intersection point outside the cell, these polygons are still stored in the bucket so that they are not checked again for intersection with the same ray segment at following cells. This bucket is also used to avoid the erroneous detection of the ray's intersection with the polygon comprising the firing point of the same ray, which could occur due to the finite precision of floating point numbers. A similar procedure was used to check the intersection of the ray with the shadow polygons in this implementation of the ray tracer based on the shadow volume method.

Figure 2. Schematic illustration of two important numbers, Ns (total number of shadow polygons registered onto the cells pierced by the ray segment between P1 and P2) and No (total number of object polygons registered onto the cells pierced by the shadow ray shot from P2 towards the light source)

Shadow testing cost: shadow volume scheme

Before describing the proposed shadow testing algorithm called HST (hybrid shadow testing), it is helpful to compare the conventional scheme and the (pure) shadow volume scheme in terms of the CPU time requirement for shadow calculation. The shadow testing cost of the shadow volume method is considered first. The shadow testing cost of P2 in Figure 2 using the shadow volume method is denoted Csv, which can be expressed as follows

$$C_{sv} = C_s N_s \tag{1}$$

where Cs is the cost of testing the intersection between a ray and a shadow polygon, and Ns is the total number of shadow polygons registered onto the cells which were pierced by the ray segment, as expressed by

$$N_s = \sum_{i \in I} n_s(i) \tag{2}$$

In equation (2), $n_s(i)$ represents the number of shadow polygons registered in the ith cell and I denotes the set of the indices of the cells pierced by the current ray segment. Note that the actual number of shadow polygons pierced by the ray segment cannot be greater than Ns.

Shadow testing cost: conventional scheme

The shadow testing cost of the conventional shadow testing scheme, which is denoted as Ccv, can be represented as the sum of
• the cost incurred by the advancement of the shadow ray and
• the cost incurred by the ray-object intersection calculations between the shadow ray and each object registered in the cells pierced by the shadow ray.
The former cost is written as Ccv1, and the latter as Ccv2. Assume that object polygons in the scene are uniformly distributed. Then Ccv1 can be expressed as the sum of three components:

$$C_{cv1} = C_i + C_c \sum_{i=1}^{N_c} i\,q_i + N_c C_c (1-k)^{N_c} \tag{3}$$

These components denote, from the left, the initialization cost, the cost incurred when the point turns out to be in shadow, and the cost incurred when it is not in shadow. The meanings of Ci, Cc and Nc are as follows:
Ci  the cost for initializing ray parameters for 3DDDA (three-dimensional digital differential analyser), such as the direction vector, distance counters, and current cell index
Cc  the cost incurred by the advancement of the ray to the next cell in 3DDDA; that is, the cost incurred in updating the ray parameters to determine the next cell to be visited
Nc  the number of cells pierced by the shadow ray segment between the reflection point and the light source position

The second term on the right-hand side of equation (3) denotes the sum of cost contributions from each of the Nc cells, where $q_i$ is the probability that the shadow testing process is terminated at the ith cell in the list of Nc cells pierced by the shadow ray segment. This occurs when at least one of the object polygons registered in the ith cell is found to intersect the shadow ray. Before the third term on the right-hand side of equation (3) is considered, two other basic statistical parameters are defined, p and α. p denotes the average probability that a ray intersects an object polygon given that the cell onto which the object polygon is registered is pierced by the ray. α denotes the average number of object polygons registered within a cell that must be tested for intersections with the ray visiting the cell (the polygons already tested for intersection with the current ray segment in an earlier cell are not tested again at the current cell, due to the bucketing scheme mentioned previously). The average probability that the shadow testing is terminated in one cell, denoted k, can be represented as

$$k = 1 - (1-p)^{\alpha} \tag{4}$$

In other words, k represents the probability that at least one polygon among those polygons registered onto a cell actually intersects the shadow ray segment. Returning to the meaning of $q_i$ (the probability that the shadow testing process is terminated at the ith cell), it can easily be seen that it can be rewritten using k as

$$q_i = k(1-k)^{i-1} \tag{5}$$

Therefore, the second term in equation (3) can be expanded using k

$$C_c \sum_{i=1}^{N_c} i\,q_i = C_c \sum_{i=1}^{N_c} i\,k(1-k)^{i-1} = C_c\{k + 2k(1-k) + 3k(1-k)^2 + \dots + N_c k(1-k)^{N_c-1}\} \tag{6}$$

Finally, the third term in equation (3) is the product of $(1-k)^{N_c}$, the probability that none of the Nc cells has even a single polygon intersecting the shadow ray segment, and $N_c C_c$, the actual cost incurred in such a case. Inserting equation (6) into equation (3) and rearranging the terms using the relation in equation (4), the following equation is obtained

$$C_{cv1} = C_i + C_c k + 2C_c k(1-k) + 3C_c k(1-k)^2 + \dots + N_c C_c k(1-k)^{N_c-1} + N_c C_c (1-k)^{N_c} = C_i + C_c \frac{1-(1-k)^{N_c}}{k} = C_i + C_c \frac{1-(1-p)^{\alpha N_c}}{1-(1-p)^{\alpha}} \tag{7}$$

On the other hand, Ccv2, which denotes the ray-object intersection calculation cost, can be expressed in a similar way

$$C_{cv2} = C_o p + 2C_o p(1-p) + 3C_o p(1-p)^2 + \dots + N_o C_o p(1-p)^{N_o-1} + N_o C_o (1-p)^{N_o} = C_o \frac{1-(1-p)^{N_o}}{p} \tag{8}$$

where Co is the cost of the calculation of a ray-object intersection between the shadow ray and an object polygon, and No is the total number of object polygons registered in all the cells pierced by the shadow ray segment. Hence, Ccv becomes

$$C_{cv} = C_{cv1} + C_{cv2} = C_i + C_c \frac{1-(1-p)^{\alpha N_c}}{1-(1-p)^{\alpha}} + C_o \frac{1-(1-p)^{N_o}}{p} \tag{9}$$

According to the definition of α

$$N_o = \alpha N_c \tag{10}$$

Hence, equation (9) becomes

$$C_{cv} = C_i + \{1-(1-p)^{N_o}\}\left(\frac{C_c}{1-(1-p)^{\alpha}} + \frac{C_o}{p}\right) \tag{11}$$

Figure 3 shows the behaviour of Ccv against No as expressed by equation (11) when the machine-dependent cost parameters Ci, Cc, and Co are given the values of 2.5 ms, 0.2 ms, and 2.0 ms, respectively. In equation (11), α and p are scene-dependent statistical parameters. In Figure 3, Ccv is calculated as a function of No, while α is fixed at 0.2 and p assumes several values between 0 and 1. In the Appendix, the procedures for deriving p and α as functions of various statistical parameters, such as the number, average peripheral length and area of the object polygons, and the number of 3D space subdivision cells, are described. The cost parameters Cs, Ci, Cc, and Co in equations (1) and (11) can be measured because they depend only upon the procedures and machines used. At a reflection point, Nc, which is the number of cells pierced by the shadow ray segment, can be calculated simply using the method described in the Appendix. Then No can be obtained according to equation (10). It can be seen that Ccv in equation (11) (and also in Figure 3) increases monotonically as No increases, and decreases as p or α increases. As No increases, the term $\{1-(1-p)^{N_o}\}$ goes to 1, making Ccv independent of No. The y-axis intercept of the curve of Ccv shown in Figure 3 corresponds to the value of Ci, which reflects the initialization cost of the shadow ray.

Figure 3. Plot of Ccv, conventional shadow testing time, versus No, for p = 0.3, 0.5, 0.7 and 0.9, while α = 0.2 (Ci = 2.5 ms, Cc = 0.2 ms, Co = 2.0 ms). Horizontal axis: number of objects possibly intersecting the shadow ray (No)

HYBRID SHADOW TESTING (HST) SCHEME

As was mentioned in earlier sections, the shadow testing scheme using the shadow volume concept
becomes less attractive as the number of shadow polygons tested against the ray increases, due to the computational overhead of ray-shadow polygon intersection. At the other extreme, when the number of shadow polygons intersecting the ray is zero, the shadow volume scheme is expected to be much faster than the conventional scheme since there is no need to fire a shadow ray at all. This observation led to the consideration of a hybrid shadow testing (HST) scheme as a compromise between the conventional and the shadow volume methods. In the proposed HST scheme, the value of Ns serves as a guide for choosing between the two shadow testing methods at each reflection point. When the total number of shadow polygons registered onto the cells pierced by the current ray segment (Ns) is less than a certain switching threshold Nth, the shadow volume scheme should be more effective than the conventional scheme for shadow testing. At every reflection point, Ccv can be evaluated using the value of No in equation (11), where Ci, Cc and Co are machine-dependent parameters, and p and α are scene-dependent statistical parameters. The switching threshold Nth, which is the critical value of Ns for deciding between the conventional and shadow volume shadow testing schemes, is determined at a reflection point by equating the right-hand sides of equations (1) and (11), i.e.

$$N_{th} = \frac{1}{C_s}\left[C_i + \{1-(1-p)^{N_o}\}\left(\frac{C_c}{1-(1-p)^{\alpha}} + \frac{C_o}{p}\right)\right] \tag{12}$$

For example, in Figure 3, if No of a reflection point is 3 and p is 0.5, then Ccv becomes 7.3 ms and Nth of the reflection point becomes 3.65 when Cs is 2.0 ms. Therefore, in cases where Ns is less than or equal to 3, the shadow volume scheme is selected for testing the shadow on the reflection point. Otherwise, the conventional scheme is selected.
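The worked example above can be reproduced with a short Python sketch of equations (11) and (12). The function names are hypothetical; the term-by-term summation follows equations (3) and (8) and is included only as a numerical check of the closed form, with α taken as No/Nc as in equation (10).

```python
def conventional_cost(n_o: float, p: float, alpha: float,
                      c_i: float, c_c: float, c_o: float) -> float:
    """Closed-form cost Ccv of the conventional scheme, equation (11)."""
    k = 1.0 - (1.0 - p) ** alpha          # equation (4)
    blocked = 1.0 - (1.0 - p) ** n_o      # chance that some polygon blocks the ray
    return c_i + blocked * (c_c / k + c_o / p)


def switching_threshold(n_o: float, p: float, alpha: float,
                        c_i: float, c_c: float, c_o: float,
                        c_s: float) -> float:
    """Threshold Nth obtained by equating equations (1) and (11): equation (12)."""
    return conventional_cost(n_o, p, alpha, c_i, c_c, c_o) / c_s


def conventional_cost_by_summation(n_c: int, n_o: int, p: float,
                                   c_i: float, c_c: float, c_o: float) -> float:
    """Term-by-term evaluation of equations (3) and (8), as a cross-check."""
    alpha = n_o / n_c                     # equation (10)
    k = 1.0 - (1.0 - p) ** alpha
    cv1 = c_i + c_c * sum(i * k * (1.0 - k) ** (i - 1)
                          for i in range(1, n_c + 1))
    cv1 += n_c * c_c * (1.0 - k) ** n_c
    cv2 = sum(j * c_o * p * (1.0 - p) ** (j - 1) for j in range(1, n_o + 1))
    cv2 += n_o * c_o * (1.0 - p) ** n_o
    return cv1 + cv2


def choose_scheme(n_s: int, n_th: float) -> str:
    """Pick the shadow volume scheme when Ns does not exceed Nth."""
    return "shadow volume" if n_s <= n_th else "conventional"


if __name__ == "__main__":
    # Parameter values quoted for Figure 3, with Cs = 2.0 ms.
    cost = conventional_cost(3, 0.5, 0.2, c_i=2.5, c_c=0.2, c_o=2.0)
    n_th = switching_threshold(3, 0.5, 0.2, 2.5, 0.2, 2.0, c_s=2.0)
    print(round(cost, 2), round(n_th, 2))
```

With these inputs the closed form gives roughly 7.35 ms and a threshold of roughly 3.68, in line with the rounded figures of 7.3 and 3.65 quoted in the text, so Ns values up to 3 select the shadow volume scheme.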
In the proposed HST scheme, there are four different subschemes for determining the shadow condition of a reflection point, which are represented as Sustain, Self-shadow, Conventional, and Shadow volume. These are shown in Figure 4, which is a state diagram of the HST scheme. It is to be noted that the transition from shadow volume to conventional scheme is always allowed, while the reverse transition is allowed only when the shadow condition (SC) of the previous reflection point of the current ray is zero or the shadow depth count is valid. The reason is that the shadow volume scheme requires the maintenance of the shadow depth count, and when the visibility is false, the depth count value cannot always be correctly resumed because it can be any positive integer. Before resorting to actual shadow testing using the conventional or shadow volume method, overall shadow testing time can be significantly reduced by using the so-called ray segment shadow coherence. That is, if the shadow bucket contains no shadow polygon (N~ = 0), the shadow condition does not change and the previous shadow condition is simply sustained. The first condition to be checked is self-shadow, which filters out all
polygons facing away from the light source. It can be simply tested by taking the dot product of the light vector L with the surface normal N at the reflection point. Although this test, i.e. checking whether L·N > 0 or L·N < 0, is very simple, it has the drawback that the shadow depth count cannot be maintained, which makes it quite difficult to revert to the shadow volume method later.

Figure 4. State transition diagram of the HST scheme using four different subschemes for evaluating the shadow condition depending on the values of the control parameters L·N, Ns, SC and SDF, where Ns denotes the total number of shadow polygons registered onto the cells pierced by the ray segment, Nth the switching threshold, L the light vector at the reflection point, N the surface normal vector at the reflection point, SC the previous shadow condition (0: no shadow; 1: shadow), and SDF the shadow depth flag (1: shadow depth count value is valid; 0: not valid)

The pseudocode of the HST scheme is as follows.

For each light source
    if (Ns equals 0)                  /* Shadow bucket is empty. */
        (Null action);                /* Previous shadow condition is sustained. */
    else if (L·N < 0) {               /* L·N < 0 means self-shadow. */
        shadow_condition = 1;         /* Shadow_condition of 1 or 0 denotes that the reflection point is in shadow or not, respectively. */
        shadow_depth_flag = 0;        /* Shadow_depth_flag is 1 when the shadow depth count is valid and 0 otherwise. */
    }
    else if (Ns <= Nth and (shadow_condition equals 0 or shadow_depth_flag equals 1)) {
        shadow_condition = shadow_volume_test();
        shadow_depth_flag = 1;
    }
    else {
        shadow_condition = conventional_shadow_test();
        shadow_depth_flag = 0;
    }
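A runnable Python transcription of the selection logic above may help when checking the transitions of Figure 4. It is a sketch under assumptions: the two shadow tests are passed in as caller-supplied functions, the `ShadowState` container and the returned subscheme labels are hypothetical, and the default state (not in shadow, depth count valid) is an illustrative choice rather than something fixed by the paper.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ShadowState:
    """Per-light state carried along a ray (hypothetical container)."""
    shadow_condition: int = 0   # 1: in shadow, 0: not in shadow
    shadow_depth_flag: int = 1  # 1: shadow depth count is valid, 0: not valid


def hst_step(state: ShadowState, n_s: int, n_th: float, l_dot_n: float,
             shadow_volume_test: Callable[[], int],
             conventional_shadow_test: Callable[[], int]) -> str:
    """Apply one HST decision for one light source; return the subscheme used."""
    if n_s == 0:
        # Shadow bucket is empty: the previous shadow condition is sustained.
        return "sustain"
    if l_dot_n < 0.0:
        # Surface faces away from the light: self-shadow, depth count lost.
        state.shadow_condition = 1
        state.shadow_depth_flag = 0
        return "self-shadow"
    if n_s <= n_th and (state.shadow_condition == 0
                        or state.shadow_depth_flag == 1):
        state.shadow_condition = shadow_volume_test()
        state.shadow_depth_flag = 1
        return "shadow volume"
    state.shadow_condition = conventional_shadow_test()
    state.shadow_depth_flag = 0
    return "conventional"
```

The order of the checks follows the pseudocode, so an empty shadow bucket is handled before the self-shadow test. After a self-shadow result the depth count is invalid, and the next point falls back to the conventional test even when Ns is below the threshold.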
The initial shadow condition for a primary ray, whose starting point is the eye position, has been determined by casting a ray from the eye position to each light source. However, the computational overhead due to this is negligible since the initial shadow condition for each light source is required to be tested only once throughout the whole rendering process.
EXPERIMENTAL RESULTS AND DISCUSSION

Three different schemes for the shadow testing, i.e. the shadow volume scheme, the conventional scheme, and the proposed HST scheme, were implemented on a Sun 3/75 using C. The programs, including documentation, are about 4800, 3000, and 5000 lines long, respectively. Fujimoto's uniform space subdivision scheme was adopted, and this requires a preprocessing step for registering the object polygons and the shadow polygons onto the relevant cells. Determining the optimal cell resolution is a very difficult and arbitrary problem because scene complexity is not straightforward to parametrize. In this regard, Marsh 10 reported that a total number of cells of 50 times the total number of object polygons is quite adequate; this agrees well with these experimental results. In this implementation, 3DDDA, which is an extension of 2DDDA, is widely used in the uniform space subdivision 6,10 for registering the object and shadow polygons onto the cells and for the incremental determination of the sequence of visiting cells pierced by the ray 10. The shadow polygons are clipped with the scene MBB (minimum bounding box) before they are registered onto the spatial subdivision cells so that they can be dealt with like object polygons. As a result, after this preprocessing step each cell in the spatial subdivision contains two lists: an object polygon list and a shadow polygon list. The additional storage requirement for the shadow polygon lists, which is proportional to the number of light sources, was roughly the same, in this example, as the storage space for the object polygon lists for the case of one light source. Note that the shadow polygons typically have areas larger than those of the object polygons. The image data files generated on the Sun system were transferred to an IBM PC AT, where a graphic system with 8-bit pixel depth was installed. The output images are intentionally corrupted with Gaussian noise to smooth out the 'spurious contours' occurring due to the quantized intensity levels of each colour component.
Five example images are shown in Colour Plate 1. Test images for cost measurements were generated at a resolution of 512 x 480 without antialiasing, and the images presented in Colour Plate 1 were prepared at a resolution of 1024 x 768 with antialiasing using the adaptive super-sampling technique 11. In this implementation of the HST scheme, Nth is determined only once in the preprocessing step and used over the whole shadow testing to avoid the evaluation of Nth at every reflection point. The measured values of Cs, Ci, Cc, and Co used in equations (1) to (12) in the Sun 3/75 environment are shown in Table 1. As shown in the Appendix, α and p can be determined from such data as the average polygon area (Aavg,p), average peripheral length of polygons (Lavg,p), the number of registered objects per cell (nor), total number of objects (Not) and total number of cells (Ncl), all of which are measured in the preprocessing step, while the optimal value for Nth can be predicted from equation (12) using the p and α values thus obtained and in situ measurement of No. These values for each of the five example images are presented in Table 2. Figure 5 shows the measured CPU times for the shadow testing using the three shadow testing schemes, i.e. the conventional, shadow volume and hybrid shadow testing (HST) methods, for each of the five example
Table 1. Experimental values (in ms) for machine-dependent parameters

Items   Cs     Ci     Cc     Co
Time    2.00   2.54   0.23   1.88
images in Colour Plate 1, where the values of Nth for the HST scheme are externally supplied and varied. The predicted optimal values for Nth, as calculated by equation (12) and denoted by N°th,c, are in close agreement with their measured values, denoted as N°th,m, for all examples. The reason that the value of Nth for the image 'Glasses' is significantly larger than those of the other images is that this image is composed of relatively small polygons, making the values of p and α smaller and the value of Ccv greater. Table 3 summarizes the frequency of occurrence of the four subschemes within the HST scheme for each of the five example images in Colour Plate 1. The Sustain case accounts for 30% to 60% of the total shadow testings. The CPU time difference between the conventional shadow testing scheme and the HST scheme at Nth = 0 in Figure 5 is due to this exploitation of the 'ray segment shadow coherence' property. On the other hand, as Nth goes to infinity, the CPU time of the HST scheme approaches that of the pure shadow volume scheme. Table 4 presents the CPU times consumed in preprocessing, shadow testing and other procedures of the three shadow testing schemes for each of the five example images. Total shadow testing times consumed by the HST scheme are reduced by factors of 2.5, 2.2, 1.5, 1.8, and 1.3, compared to the conventional scheme, for the 'Glasses', 'Desk', 'Stars', 'Cards', and 'KAIST' examples, respectively.

Finally, the pixel shadow coherence 12, which denotes the fact that if a given reflection pixel point is in shadow, nearby pixel points are likely to be shadowed by the same object, is additionally used in the conventional and HST schemes. The 'hit rate' β is defined as the probability that the object blocking a pixel from a light source also blocks the adjacent pixel from the light source. When the pixel shadow coherence is used in the HST scheme for shadow testing, some modifications are necessary, i.e.
another switching between the coherence test (coherence test denotes the shadow testing of the reflection point against the polygon which blocked the previous pixel from a light source) and the shadow volume scheme is required in the shadow testings. Referring to Figure 2, if N s is greater than a certain threshold, say N,h2 (it was found out that for all examples tried, Nth2 N,h, the conventional method is used, and otherwise, the shadow volume scheme is used for the shadow testing given that the shadow depth count is maintained. Another switching threshold, Nth2is now considered.
computer-aided design
[Figure 5: five panels, (a) to (e), each plotting shadow testing time against the switching threshold, Nth, with curves for the conventional scheme, the shadow volume scheme and the hybrid scheme; N°th,c and N°th,m are marked on the threshold axis.]

Figure 5. Shadow testing times by HST scheme for various switching threshold values (Nth), compared with those of conventional and shadow volume approaches for five example images (a) Glasses, (b) Desk, (c) Stars, (d) Cards, and (e) KAIST (shown in Colour Plate 1). N°th,c denotes the calculated optimal value of Nth, and N°th,m denotes the measured optimal value of Nth
Table 2. Experimental values for various statistical parameters for each of the five test images from Colour Plate 1

Images    Aavg,p   Lavg,p   nor     Not    Ncl            α       p       Nth
Glasses   1.435    3.479    0.229   3781   74 x 32 x 67   0.115   0.276   7.042
Desk      2.702    6.656    0.332   523    39 x 22 x 32   0.166   0.430   4.546
Stars     7.285    12.316   0.734   113    22 x 14 x 19   0.367   0.703   2.924
Cards     4.304    4.568    0.343   157    27 x 14 x 23   0.172   0.630   3.432
KAIST     17.704   15.868   0.590   981    45 x 17 x 69   0.295   0.844   2.657
Table 3. Absolute numbers of occurrences (and relative fractions in parentheses) of each of the four subschemes in the proposed hybrid shadow testing scheme for the five test images of Colour Plate 1

Images    Sustain         Self-shadow     Shadow volume   Conventional    Total
Glasses   109 266 (47%)   19 050 (8%)     79 971 (34%)    24 279 (11%)    232 566 (100%)
Desk      306 847 (39%)   108 861 (14%)   169 977 (23%)   201 782 (24%)   787 467 (100%)
Stars     259 048 (30%)   55 152 (6%)     346 139 (40%)   207 746 (24%)   868 085 (100%)
Cards     609 782 (62%)   53 400 (5%)     78 808 (8%)     245 440 (25%)   987 430 (100%)
KAIST     204 471 (21%)   107 532 (11%)   208 535 (21%)   472 442 (47%)   992 980 (100%)

volume 21 number 1 january/february 1989
Table 4. Comparison of CPU times (min) on Sun 3/75 for shadow testing methods for example images in Colour Plate 1

Image     Scheme           Preprocessing*   Shadow testing   Others†        Total
Glasses   Conventional     2.00 (1%)        94.47 (55%)      75.78 (44%)    172.25 (100%)
          Shadow volume    6.50 (5%)        48.82 (37%)      75.78 (58%)    131.10 (100%)
          HST (Nth = 9)    6.50 (5%)        37.99 (32%)      75.78 (63%)    120.27 (100%)
Desk      Conventional     0.42 (0%)        170.83 (52%)     158.00 (48%)   329.25 (100%)
          Shadow volume    1.42 (0%)        174.37 (52%)     158.00 (48%)   333.79 (100%)
          HST (Nth = 4)    1.42 (1%)        79.52 (33%)      158.00 (66%)   238.94 (100%)
Stars     Conventional     0.08 (0%)        147.55 (61%)     94.00 (39%)    241.63 (100%)
          Shadow volume    0.52 (0%)        199.58 (68%)     94.00 (32%)    294.10 (100%)
          HST (Nth = 4)    0.52 (0%)        92.13 (50%)      94.00 (50%)    186.65 (100%)
Cards     Conventional     0.08 (0%)        182.05 (60%)     124.45 (40%)   306.58 (100%)
          Shadow volume    0.90 (0%)        417.57 (77%)     124.45 (33%)   542.92 (100%)
          HST (Nth = 3)    0.90 (0%)        97.88 (44%)      124.45 (56%)   223.23 (100%)
KAIST     Conventional     0.67 (0%)        335.35 (71%)     135.60 (29%)   471.62 (100%)
          Shadow volume    3.63 (1%)        321.10 (70%)     135.60 (29%)   460.33 (100%)
          HST (Nth = 5)    3.63 (1%)        252.05 (64%)     135.60 (35%)   391.28 (100%)

* Preprocessing includes file input, flattening object hierarchy, object registrations, and viewing transformations. In the shadow volume and HST schemes, extraction of silhouette edges and registrations of shadow polygons are also included
† Jobs required for finding reflection points, illumination calculations, and the file output
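The reduction factors quoted in the text can be rechecked from the shadow testing column of Table 4. The following is a small arithmetic sketch, not part of the original work; the dictionary layout and the function name `shadow_testing_speedup` are illustrative assumptions, and the times are simply the conventional and HST entries copied from the table.

```python
# Shadow testing times (min) from Table 4: (conventional, HST).
# The data layout and names are assumptions made for this sketch.
SHADOW_TESTING_MIN = {
    "Glasses": (94.47, 37.99),
    "Desk": (170.83, 79.52),
    "Stars": (147.55, 92.13),
    "Cards": (182.05, 97.88),
    "KAIST": (335.35, 252.05),
}


def shadow_testing_speedup(times):
    """Return conventional/HST shadow testing time ratios per image."""
    return {name: conv / hst for name, (conv, hst) in times.items()}


if __name__ == "__main__":
    for name, factor in shadow_testing_speedup(SHADOW_TESTING_MIN).items():
        print(f"{name}: reduced by a factor of {factor:.2f}")
```

The ratios land close to the factors quoted in the text; small differences for 'Stars' and 'Cards' look like rounding in the quoted figures.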
Table 5. CPU times (min) and hit rates (β) for shadow testing using pixel coherence

          Not using pixel coherence   Using pixel coherence    Related parameters
Images    Conventional   HST          Conventional   HST       Nth   Nth2   β
Glasses   97.47          37.99        64.90          29.53     9     1      64%
Desk      170.83         79.52        108.98         49.25     4     1      96%
Stars     147.55         92.13        99.77          61.68     4     1      93%
Cards     182.05         97.88        128.65         53.40     3     1      91%
KAIST     335.35         252.05       304.85         114.57    5     1      93%

Only the case where Ns is less than Nth need be considered, because Nth2 is assumed to be less than Nth. The cost of the shadow testing scheme which has the coherence test option on top of the earlier HST scheme is denoted as Ccs and is written as

$$C_{cs} = \beta C_o + (1 - \beta)(C_o + C_s N_s) \qquad (13)$$

where Co and Co + CsNs are the costs incurred when the coherence test is 'hit' and 'missed', respectively, which are multiplied by the probabilities of their occurrence, β and (1 - β), respectively. By equating the right-hand sides of equations (1) and (13) and setting Ns = Nth2, the expression for Nth2 can be obtained as follows

$$N_{th2} = \frac{C_o}{C_s} \cdot \frac{1}{\beta} \approx \frac{1}{\beta} \qquad (14)$$

since Co ≈ Cs in most cases. However, the value of Nth2 was fixed at 1 in this experiment, based on the following two observations. First, according to the experiments for the five example images, varying Nth2 over a wide range changed the shadow testing times by no more than 3%. Second, substituting the measured β values for the five example images, shown in Table 5, into equation (14) shows that the values of Nth2 are in the range of 1 to 1.5. The CPU times of the shadow testing process that includes the pixel shadow coherence test were shortened by an additional 23-55% compared to those of the scheme where it is not included (see Table 5).
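As a rough check of the claim that Nth2 falls between 1 and 1.5, the sketch below evaluates the cost of equation (13) and the threshold implied by it. It rests on two assumptions that are not confirmed by this section: that equation (1), which lies outside this section, gives the plain shadow volume cost as Cs·Ns, and that Co = Cs (both set to 1 here as an arbitrary unit). The function names are hypothetical.

```python
# Sketch under stated assumptions: equation (1) is taken to be C_s * N_s,
# and C_o = C_s = 1 (arbitrary unit). Hit rates are the beta column of Table 5.
HIT_RATES = {"Glasses": 0.64, "Desk": 0.96, "Stars": 0.93, "Cards": 0.91, "KAIST": 0.93}


def coherence_cost(beta, c_o, c_s, n_s):
    """Expected cost with the coherence test option, as in equation (13)."""
    return beta * c_o + (1.0 - beta) * (c_o + c_s * n_s)


def nth2(beta, c_o=1.0, c_s=1.0):
    """Threshold at which coherence_cost equals the assumed cost c_s * n_s."""
    return c_o / (beta * c_s)


if __name__ == "__main__":
    for name, beta in HIT_RATES.items():
        print(f"{name}: Nth2 is about {nth2(beta):.2f}")
```

Under these assumptions the five thresholds lie between roughly 1.0 and 1.6, which is in line with fixing Nth2 at 1.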
CONCLUSIONS

A new shadow testing algorithm for ray tracing, called hybrid shadow testing (HST), is proposed and implemented under the 3D space subdivision environment. The HST algorithm exploits various subschemes for fast determination of the shadow condition, i.e. 'Sustain' for exploiting the ray segment shadow coherence, 'Self-shadow' for fast screening of back-facing polygons, and the 'Shadow volume' and 'Conventional shadow testing' methods. An empirical formula for determining the optimal switching threshold Nth was proposed using analytically derived statistical parameters such as p and α. The experimentally obtained optimal values for Nth were generally in excellent agreement with those predicted from the analytic derivation, for various example images. The shadow testing time itself was reduced by a factor of approximately 2 compared to the conventional shadow testing scheme, while the total ray tracing time was reduced by about 30% for the five example images. Furthermore, by using the pixel shadow coherence, the CPU times for shadow testing were shortened by an additional 20-55%. Work continues on an efficient scheme to generate the silhouette curves and edges of curved objects such as spheres, cylinders, ellipsoids, and cones.

REFERENCES

1 Whitted, T 'An improved illumination model for shaded display' Commun. ACM Vol 23 No 6 (June 1980) pp 343-349
2 Rubin, S M and Whitted, T 'A 3-dimensional representation for fast rendering of complex scenes' Comput. Graph. (Proc. SIGGRAPH '79) Vol 14 No 3 (July 1979) pp 110-116
3 Kay, T and Kajiya, J 'Ray tracing complex scenes' Comput. Graph. (Proc. SIGGRAPH '86) Vol 20 No 4 (1986) pp 269-278
4 Dippe, M and Swensen, J 'An adaptive subdivision algorithm and parallel architecture for realistic image synthesis' Comput. Graph. (Proc. SIGGRAPH '84) Vol 18 No 3 (1984) pp 149-158
5 Glassner, A S 'Space subdivision for fast ray tracing' IEEE Comput. Graph. Appl. Vol 4 No 10 (October 1984) pp 15-22
6 Fujimoto, A, Tanaka, T and Iwata, K 'ARTS: accelerated ray-tracing system' IEEE Comput. Graph. Appl. Vol 6 No 4 (April 1986) pp 16-26
7 Haines, E A and Greenberg, D 'The light buffer: a shadow-testing accelerator' IEEE Comput. Graph. Appl. Vol 6 No 9 (September 1986) pp 6-16
8 Crow, F 'Shadow algorithm for computer graphics' Comput. Graph. (Proc. SIGGRAPH '77) Vol 11 No 2 (1977) pp 242-248
9 Bergeron, P 'A general version of Crow's shadow volumes' IEEE Comput. Graph. Appl. Vol 6 No 9 (September 1986) pp 17-28
10 Marsh, D M 'UgRay: an efficient ray-tracing renderer for Unigrafix' Report No UCB/CSD 87/360 Computer Science Division, University of California, Berkeley, CA, USA (May 1987)
11 Cook, R, Glassner, A, Haines, E, Hanrahan, P, Heckbert, P and Speer, L R 'Introduction to ray tracing' SIGGRAPH '87 Course #13
APPENDIX

A formula for p and α, whose meanings are given earlier, is derived in which they are expressed as functions of such statistical data as the average peripheral length and the average area of the polygons, the number of polygons, and the number of 3D space subdivision cells. The following is an analytic, and also empirical, approach for obtaining Np(Lp, Ap), which is, in turn, used for deriving expressions for p and α.

The average number of cells pierced by a line segment of length l in 3D is calculated before trying to calculate the number of cells onto which a polygon with peripheral length Lp and area Ap is registered, denoted by Np(Lp, Ap). Without loss of generality, it can be assumed that one end point of the line is located at the origin and the other end point is at a point (x, y, z) in the first octant. Note that the object coordinate system is normalized to the size of the cell in the subdivision. Then, according to the definition of 3DDDA mentioned in the results section, the number of cells Nl pierced by the line of length l becomes the Manhattan distance between the two end points, that is

$$N_l(x, y, z) = x + y + z \qquad (14)$$

where the length of the line l equals (x^2 + y^2 + z^2)^{1/2}. In the spherical coordinate system

$$N_l(\theta, \phi) = l\,\{\sin\theta\,(\cos\phi + \sin\phi) + \cos\theta\} \qquad (15)$$

If we assume the distribution of the orientation of line segments to be uniform, the expectation of Nl is a function of l only, which is given as an integration of Nl(θ, φ) over the surface of the octant of a sphere of radius l (see Figure 6), divided by the area πl²/2 of that surface

$$N_l(l) = \frac{2}{\pi l^2}\int_0^{\pi/2}\!\!\int_0^{\pi/2} N_l(\theta, \phi)\, l^2 \sin\theta \, d\theta \, d\phi = \frac{2l}{\pi}\int_0^{\pi/2}\!\!\int_0^{\pi/2} \{\sin\theta\,(\cos\phi + \sin\phi) + \cos\theta\}\sin\theta \, d\theta \, d\phi = 1.5\,l \qquad (16)$$

Figure 6. Scheme for integration of Nl(θ, φ) over the surface of the octant of a sphere with radius l

It is assumed that Np(Lp, Ap), the number of cells onto which a polygon with peripheral length Lp and area Ap is registered, can be considered as a sum of two terms, i.e. the number of cells pierced by the boundary line of the polygon and the number of cells pierced by the internal region of the polygon. This assumption generally holds fairly well, especially for polygons that have large areas

$$N_p(L_p, A_p) = N_l(L_p) + g(A_p) = 1.5L_p + g(A_p) \qquad (17)$$

Equation (17) is rearranged as

$$g(A_p) = N_p(L_p, A_p) - 1.5L_p \qquad (18)$$

The right-hand side of equation (18) was plotted against the area of the polygon (Ap) in Figure 7, using the values of Np(Lp, Ap) for a multitude of polygons having various values of Lp and Ap obtained from one set of our experimental data. We found that, for most experimental data, we can approximate g(Ap) as a linear function having a slope of 3, whereby Np(Lp, Ap) becomes

$$N_p(L_p, A_p) = 1.5L_p + 3A_p \qquad (19)$$

Figure 7. Plot of measured values of g(A) = Np(L, A) - 1.5L versus polygon area (A)
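The factor 1.5 in the expected number of cells pierced by a line segment can be verified numerically. The sketch below is an illustration only, not the authors' code: it averages the Manhattan-distance count l{sin θ (cos φ + sin φ) + cos θ} over the first octant of a sphere with a midpoint rule. The function names and the step count are assumptions.

```python
import math


def cells_pierced(length, theta, phi):
    """Manhattan distance between end points of a segment of given length and direction."""
    return length * (math.sin(theta) * (math.cos(phi) + math.sin(phi)) + math.cos(theta))


def expected_cells_pierced(length, steps=200):
    """Average cells_pierced over the first octant, weighting by sin(theta).

    The l^2 factor of the surface element cancels against the octant
    area (pi * l^2 / 2), so only the solid angle pi / 2 remains.
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        weight = math.sin(theta) * h * h
        for j in range(steps):
            phi = (j + 0.5) * h
            total += cells_pierced(length, theta, phi) * weight
    return total / (math.pi / 2.0)


if __name__ == "__main__":
    print(expected_cells_pierced(1.0))  # close to 1.5
```

The closed form follows from the two partial integrals π/2 and π/4, whose sum 3π/4 multiplied by 2l/π gives 1.5 l.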
Figure 8 shows the projection of a polygon and the subdivision cells as seen in the direction of the current ray segment. It is easily seen that the value of p should be unity for the internal cells, while it is less than one (1/2 for a sufficiently long peripheral line) for the cells on the periphery of the polygon. For a sufficiently large-area polygon, an intuitive expression for p can be written as in equation (20), where the premultiplying term is an adaptation factor for smooth fitting with the small-area polygons

$$p = \left(1 - e^{-A_p/A_{th}}\right)\left(1 - \frac{N_l(L_p)/2}{N_p(L_p, A_p)}\right) = \left(1 - e^{-A_p/A_{th}}\right)\frac{A_p + 0.25L_p}{A_p + 0.5L_p} \qquad (20)$$
where Ath is the threshold area, whose typical value was found to be about 3 (in units of the area of a facet of the 3D subdivision cell) from experimental measurements. Note that the premultiplying term (1 - exp(-Ap/Ath)) approaches unity as the area of the polygon Ap becomes sufficiently large. In the postmultiplying term of equation (20), Np(Lp, Ap) corresponds to the total number of cells onto which the polygon with area Ap and peripheral length Lp is registered, and Nl(Lp)/2 denotes half of the number of cells crossed by the bounding line segments of the polygon, i.e. those cells shown as dotted squares in the 2D analogy shown in Figure 8.

Figure 8. 2D projection of 3D uniform subdivision cells onto which a polygon is registered. Boundary line segments of the polygon are registered onto the cells denoted as dotted squares

Let α, the average number of object polygons registered within a cell which must be tested for intersections with the ray visiting the cell, be considered. The average number of objects registered within a cell, nor, can be calculated using the following formula

$$n_{or} = \left(\sum_{i=1}^{N_{ot}} N_p(L_{p,i}, A_{p,i})\right) \Big/ N_{cl} \qquad (21)$$

where Not is the total number of object polygons and Ncl is the total number of cells in the subdivision. Some object polygons within the current cell need not be tested for ray intersection, since they were already tested in the previous cells. In obtaining the expression for α, the value of nor, which is measurable in the preprocessing step, is divided by two to account for those object polygons registered in more than one cell. Thus α can be calculated using the following formula

$$\alpha = n_{or}/2 = \left(\sum_{i=1}^{N_{ot}} N_p(L_{p,i}, A_{p,i})\right) \Big/ 2N_{cl} \qquad (22)$$
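The expressions for p and α can be tried against the statistics reported in Table 2. The sketch below is only an illustration: it assumes that equation (20) is applied to the per-image average area and average peripheral length, that the threshold area Ath is 3, and that the two five-value columns of Table 2 are the average area (1.435, 2.702, ...) and the average peripheral length (3.479, 6.656, ...) in that order. The function and variable names are hypothetical.

```python
import math

A_TH = 3.0  # threshold area quoted in the text (cell-facet units)

# Per image: (average area, average peripheral length, n_or), read from Table 2.
# Which column is area and which is length is an assumption of this sketch.
TABLE2 = {
    "Glasses": (1.435, 3.479, 0.229),
    "Desk": (2.702, 6.656, 0.332),
    "Stars": (7.285, 12.316, 0.734),
    "Cards": (4.304, 4.568, 0.343),
    "KAIST": (17.704, 15.868, 0.590),
}


def cells_registered(l_p, a_p):
    """Number of cells a polygon is registered onto: 1.5 Lp + 3 Ap."""
    return 1.5 * l_p + 3.0 * a_p


def p_factor(l_p, a_p, a_th=A_TH):
    """Parameter p: small-area adaptation term times (1 - boundary share / 2)."""
    premultiplier = 1.0 - math.exp(-a_p / a_th)
    half_boundary_cells = 1.5 * l_p / 2.0
    return premultiplier * (1.0 - half_boundary_cells / cells_registered(l_p, a_p))


def alpha_factor(n_or):
    """Parameter alpha: half the average number of polygons registered per cell."""
    return n_or / 2.0


if __name__ == "__main__":
    for name, (a_avg, l_avg, n_or) in TABLE2.items():
        print(f"{name}: p = {p_factor(l_avg, a_avg):.3f}, alpha = {alpha_factor(n_or):.3f}")
```

With these inputs the computed values agree with the p and α columns of Table 2 to within rounding, which suggests the averages are indeed what the formulas are evaluated on.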