Comput. & Graphics Vol. 14, No. 1, pp. 101-115, 1990
0097-8493/90 $3.00 + .00 © 1990 Pergamon Press plc
Printed in Great Britain.
Short Technical Notes/Tutorials/Systems

AN OVERVIEW OF RENDERING TECHNIQUES

ALAN R. DENNIS
Department of Management Information Systems, College of Business and Public Administration, University of Arizona, Tucson, Arizona 85721

Abstract--The purpose of this paper is to present a basic introduction to computer graphics rendering techniques--techniques for the generation of realistic visual images. The paper has four major sections, reflecting the major areas of rendering: visible surface identification; anti-aliasing; lighting, shading and shadows; and texture. The development of rendering techniques is an active research area, particularly with respect to lighting models and texture. Some of the techniques presented in this paper will be superseded within a very short time. Nonetheless, this should still serve as a base from which researchers, practitioners, and students can explore those new developments.
1. INTRODUCTION

The purpose of this paper is to present a basic introduction to computer graphics rendering techniques. We presume that the reader has a basic understanding of general computer graphics concepts, such as the representation of objects with polygons and patches. Rendering is primarily concerned with generating visual images--transforming a definition of a set of objects into a picture. A significant portion of recent computer graphics research has been in the area of rendering, particularly lighting models. The objective of rendering is to produce the most realistic image possible[1]. Realism is, of course, more important in some computer graphics applications than in others. Engineering and modeling, for example, are more concerned with producing unambiguous images than images that are truly realistic.

This paper has four major sections, reflecting the major areas of rendering. The paper begins with an examination of visible surface identification techniques, followed by a brief discussion of the theory and methods of anti-aliasing. The next section presents an overview of lighting, shading and shadows, from early models to more recent ones, including ray tracing and radiosity methods. The final section discusses the generation of texture.
2. VISIBLE SURFACE IDENTIFICATION

The first step in generating an image is to identify which objects can be seen from the current position of the viewer. Objects outside this field of view are clipped, and objects inside the field of view must be examined to determine if part or all of them are hidden by other objects. While hidden surfaces can be indicated with dashed lines or more advanced techniques[e.g., 2], a more common approach is to remove them completely. Although many algorithms for visible surface identification have been developed, no one algorithm is best in all situations. All visible surface identification algorithms use some form of geometric sorting to identify visible and hidden surfaces[3]. In their classic 1974 paper, Sutherland, Sproull and Schumacker[3] separated visible surface techniques into two categories:* image space algorithms, which examine the projected image to identify visible surfaces, and object space algorithms, which examine the object definitions directly. With image space algorithms, the surfaces of each object are examined separately for each pixel on the display, to determine which surface is visible. For an image of n surfaces to be generated on a display with N pixels, the time required is proportional to O(nN). In contrast, object space algorithms compare each of the n surfaces to the remaining n - 1 surfaces to determine which are visible. The computational complexity is proportional to O(n^2).

* Two techniques[4, 5] fall into both categories.

2.1. Image space techniques
2.1.1. z-Buffer. This image space technique (also called the depth-buffer technique) is the simplest to implement, but requires additional memory. The frame buffer is used to store the generated image to be sent to the output device, and a second buffer of the same size (the z-buffer) is used to store depth information. The frame buffer is first initialized to the background of the image, and the depth buffer is initialized to the highest value on the z-axis (the depth axis). Then each pixel on each surface is generated, but before being stored in the frame buffer, the depth of that surface's pixel is compared with the current value in the depth buffer. If the generated pixel is closer to the viewer than the pixel currently in the frame buffer, it is stored in the frame buffer and the depth buffer is updated; otherwise, the pixel is ignored. A minimal sketch of this depth test appears below.

While this technique is simple, it has some drawbacks. Surfaces appear on the screen in the order they appear in the object database, which can be confusing to the user. However, double buffering can be used, so that the image is displayed on the screen only after it has been completely formed. This technique also requires a good deal of memory: a 512 × 512 display has over 250,000 pixels, and thus requires 256K bits, bytes, or words (depending on the amount of information stored for each pixel) for the frame buffer and another 256K for the depth buffer. To reduce the memory demands, the image can be partitioned and generated in stages--for example, four passes of 64K each.
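The following Python sketch illustrates only the core depth test described above. The scene representation (a list of surfaces, each able to report a depth and a color for every pixel it covers) is an assumption made for illustration and is not part of the original description.

```python
import numpy as np

def zbuffer_render(surfaces, width, height, background=(0.0, 0.0, 0.0), far=np.inf):
    """Render a list of surfaces using the z-buffer (depth-buffer) test.

    Each surface is assumed to provide rasterize(width, height), yielding
    (x, y, depth, color) samples for the pixels it covers; smaller depth
    values are closer to the viewer.
    """
    frame = np.empty((height, width, 3), dtype=float)
    frame[:] = background                    # frame buffer: background color
    depth = np.full((height, width), far)    # depth buffer: farthest z value

    for surface in surfaces:
        for x, y, z, color in surface.rasterize(width, height):
            if z < depth[y, x]:              # closer than what is stored?
                depth[y, x] = z              # update the depth buffer
                frame[y, x] = color          # overwrite the frame buffer
    return frame
```

With double buffering, the returned frame would be copied to the display only once the loop has finished, so partially formed images are never shown.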
Anti-aliasing is difficult, as all pixels are processed individually, and therefore smoothing the jagged edges of diagonal surfaces is complex. However, by modifying the z-buffer algorithm to examine sets of pixels, this problem can be reduced, although not eliminated[6]. More recently, Carpenter[7] described the use of an A-buffer to improve anti-aliasing that was successfully used to produce scenes in Star Trek II. The A-buffer is similar to the z-buffer, except that in cases where several surfaces overlap within one pixel, "fragments" of each surface in the pixel are stored in a "pixel struct[ure]" and later merged to produce a better image for that pixel. A competing technique was introduced by Catmull[8], who at the time worked for Lucasfilm along with Carpenter. For each pixel, a list of surfaces overlapping the pixel is built. Each pixel is examined independently (thus permitting parallel processing) to determine its final image.

2.1.2. Area-subdivision. Area-subdivision algorithms are image space techniques that follow a divide-and-conquer strategy to exploit the property of area coherence--smaller image areas are often coherent (i.e., contain only one surface), thus making visible surface identification trivial. The original area-subdivision algorithm developed by Warnock[9] first examines a window containing the entire display area. If surfaces overlap in this window, the window is examined to see if the image generation is simple, and, if so, the image is generated. If not, the window is divided into four squares, and this procedure is applied recursively to each resulting window. Subdivision ends when a window can be processed or cannot be further divided (e.g., it contains one pixel), which requires at most nine levels of subdivision for 512 × 512 resolution or 10 levels for 1024 × 1024. A variation on this technique is to adjust the boundaries of the windows to better match the surfaces in the image, rather than using the preset four squares. Weiler and Atherton[10] suggested first sorting the surfaces by depth, then applying the Warnock technique. Rather than using rectangular boundaries for the subdivisions, they used the edges of polygons or patches to determine the subwindows. While this greatly reduced the number of subdivisions required, it required additional time to determine the window boundaries.

2.1.3. Scan-line. Scan-line image space techniques[e.g., 11] create an image one scan-line at a time. First, an edge table for all edges of all surfaces is created and sorted into buckets based on the edges' smallest y-coordinates, with edges within each bucket sorted by x-coordinate. This table is then processed, causing the image to be built from top to bottom, left to right. As each line is processed, the surfaces intersecting that line are examined to determine which are visible. These techniques also take advantage of line coherence: once the visible surface has been determined, it will remain the same until an edge is crossed, and each line is likely to be very similar to the preceding line. The number of calculations can therefore be greatly reduced. Several variations to the basic process have been suggested. Hamlin and Gear[12] presented a technique
which can sometimes maintain depth coherence when the edge changes, by using recursive list processing on each scan. Crocker[13] proposed an algorithm that, as it moves to a new line, first identifies the depth of the surface nearest to the viewer and then compares that to the depth on the previous scan-line. No calculations need be performed unless the new surface is closer than that on the previous line. In another variation, Sechrest and Greenberg[14] used scan-lines defined by the objects in the image--rather than generating the image at preset horizontal lines, they used the surface boundaries to determine the size and number of scan-"strips." This removes unnecessary calculations for pixels in the same surface and, at the same time, reduces aliasing problems, as the image is generated by examining objects.

2.2. Object space techniques
2.2.1. Depth-sort. In this object space algorithm[4], all the surfaces of all objects are first sorted in order of depth, and then the image is generated, starting with the surface farthest from the viewer. This is often called the "painter's algorithm," as it is the same technique used by artists--paint the background first, then gradually add the foreground objects. Not all surfaces have a constant depth (or z-coordinate), and thus sorting the objects based on depth can be sufficiently complex to render the approach impractical. If two or more surfaces have overlapping depths, they are then compared on their minimum and maximum x and y coordinate bounds (called extents). If these do not overlap, the surfaces need not be rearranged in the sorted list. If they may still overlap, the surfaces are examined to see if one is entirely on one side of the other, as shown in Fig. 1, and, if so, they are sorted appropriately. If this is not the case, the edges of all surfaces are examined to see if they overlap, and if some do, one or more surfaces are divided into two or more surfaces, each of which is sorted separately. Once the surfaces have been sorted, they are painted onto the output device back to front. Some unnecessary image generation is performed, as hidden surfaces are generated and then obscured by other surfaces.
Fig. 1. Overlapping surfaces.
Obviously, this technique is only appropriate for output devices that permit new surfaces to obscure previously painted images occupying the same location. This technique cannot, for example, be used with film recorders: once the film has been exposed, it cannot be unexposed.

While not mentioned explicitly in any of the methods above, a simple preprocessing step that removes the back (hidden) surfaces of each object may greatly increase the speed of these algorithms in cases where hidden surfaces are to be removed. Of course, this technique cannot be applied in all cases, as some objects may be transparent, etc. In general, such an algorithm can be expected to reduce the number of surfaces to be examined by a factor of two, as, on average, half the surfaces of any object will be hidden by its other surfaces. In fact, each object could be examined separately first to remove all hidden surfaces, not just those easily identified[15]. There has been little recent research into hidden surface identification techniques, probably because the ones presently in use are well understood and adequately meet the needs of the graphics community.
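The sketch below illustrates two of the ideas above in simplified form: a back-face culling pass and the painter's-algorithm paint order. The polygon representation (lists of (x, y, z) vertices, larger z meaning farther away, with outward unit normals) and the draw_polygon routine are assumptions made for illustration, and the overlap tests and surface splitting required by the full depth-sort algorithm are omitted.

```python
def backface_cull(faces, view_dir):
    """Drop faces whose outward normal points away from the viewer.

    faces is a list of (vertices, normal) pairs; view_dir is the direction
    from the eye into the scene.  A face is front-facing when its normal
    has a negative dot product with the viewing direction.
    """
    return [vertices for vertices, normal in faces
            if sum(n * d for n, d in zip(normal, view_dir)) < 0.0]

def painters_algorithm(polygons, draw_polygon):
    """Paint polygons back to front after a simple depth sort.

    Each polygon is a list of (x, y, z) vertices; draw_polygon(p) rasterizes
    one polygon into the frame buffer, overwriting anything already there.
    """
    # Sort on the maximum (farthest) z of each polygon, farthest first.
    ordered = sorted(polygons, key=lambda p: max(v[2] for v in p), reverse=True)
    for poly in ordered:
        draw_polygon(poly)   # nearer polygons are drawn later, obscuring farther ones
```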
2.3. Curved surfaces
The above algorithms apply to objects defined with planar polygonal surfaces. While curved surfaces can be approximated by a set of many polygonal surfaces, they are usually represented by patches. Most algorithms[16-19] locate the intersections of patches by solving the resulting fourth-order equations by numerical approximation. Special-case algorithms have been developed to take advantage of the simpler requirements of spheres, as spheres form the basis of molecular modelling applications[20-23]. Two scan-line algorithms have been developed by Blinn et al.[24], which also rely on numerical approximation and therefore sometimes fail. Blinn et al.[24] and Lane and Carpenter[25] have also developed subdivision algorithms that repeatedly subdivide images until the windows contain areas that can be approximated by flat polygons.
2.4. Choosing an algorithm
Fig. 2 presents the estimated performance of the above algorithms in terms of the processing time required, as determined by Sutherland et al. in 1974, with adaptations by Foley and van Dam[26, p. 570]. The depth sort algorithm is best for a small number of surfaces (as the simple overlap tests usually succeed), but it falls down when many surfaces are present. The time required by the scan line and subdivision techniques also increases as the number of surfaces increases, but not as sharply. In contrast, the depth buffer technique provides relatively stable processing time, due to the tendency of the size of surfaces to decrease as their number increases. In summary, for most images the depth sort technique is best; however, for complex images, depth buffer techniques are preferred. Although these measurements are widely cited, Sutherland et al. pointed out that they were only estimates. Small differences between algorithms could be the result of the accuracy of the tests, and thus comparisons should be made only where differences of an order of magnitude are present.

The variations to the basic algorithms suggested since these tests were performed have all sought to reduce the time required to generate an image. For example, Crocker[13] demonstrated that under certain conditions his variant of the scan-line algorithm reduced processing time by almost 50% for complex images.
Fig. 2. Time comparisons for hidden surface removal algorithms: normalized time versus number of surfaces (100 to 100,000) for the depth sort, z-buffer, scan line, and subdivision algorithms.
As well, in many applications only a part of the image is changed at one time, and thus a system designed to take advantage of this needs only to generate a small number of surfaces each time[27]. Therefore, while these tests can be used as guidelines, they should not be considered absolute.

3. ANTI-ALIASING (SAMPLING AND FILTERING)

Some images contain annoying defects such as jagged edges or distortions of small objects, especially in areas of complex detail. These defects are called aliasing artifacts, as they are caused by generating an improper rendering--an "alias"--of the true image. The problem arises because a pixel is not infinitely small. A pixel covers a small area of the image--a small area that may contain two or more surfaces, as shown in Fig. 3. If we determine the contents of a pixel by examining only its center, aliasing may occur[28].

To better understand the attempted solutions to this problem, we must frame the problem in terms of signal processing theory. In signal processing, we have an input signal that we sample at time intervals. By examining this sample, we attempt to determine the value of the original signal. The rate of sampling determines the ability to reconstruct the original signal. If we sample at a greater frequency than the incoming signal, we can easily reconstruct the true value of the original signal. However, if we sample at a frequency below that of the original signal, we cannot properly reconstruct the original signal. Instead, we produce an alias signal.

This theory treats the image as a continuous signal which is sampled at intervals corresponding to the distance between picture elements [i.e., pixels]. The well-known "sampling theorem" states that the sampled picture cannot represent spatial frequencies [i.e., changes in the image across space] greater than 1 cycle [per] 2 picture elements. "Aliasing" refers to the result of sampling a signal containing frequencies higher than this limit[29, p. 542].

As we move across the image, aliasing occurs if the frequency of changes in the image is greater than the frequency of changes in our sampling (i.e., the changes in pixel boundary). In fact, we must sample the image (i.e., create pixels) at twice the rate of changes in the image. To prevent aliasing, each set of two pixels must contain at most one change in the image.
Fig. 3. The area of a pixel may contain several surfaces.
This is practically impossible, as there is a finite number of pixels on the output device but, in theory, an infinite number of objects to cause changes in the sampled image. Aliasing problems are therefore nearly inevitable.

The basic approach to anti-aliasing has two components: super-sampling and filtering. Super-sampling refers to the process of taking more samples than there are pixels. That is, we examine several points in the area covered by the pixel, and then generate the pixel based on this sample. This, of course, increases computation time. Aliasing is noticeably reduced, but not eliminated[30].

An important question is "How many samples is enough?". The two standard answers are "more is always better" and "we find that n is usually enough," where n is some integer between 4 and 256. Given that color values are stored with limited precision, it seems likely that the number of useful samples per pixel is also limited. [A second issue is that] if the sampling is done in some fixed pattern, then geometries always exist for which that particular sampling pattern generates a poor estimate of the integral [i.e., the combined image for that pixel] and unwanted artifacts are created. However, if the sampling is done randomly, this problem can be eliminated. Furthermore, a statistical test can be developed to determine when enough samples have been used[31, p. 61].

Thus, better images are generated in less time by adjusting the sampling rate within pixels to the frequency content of the pixels. Even if the sampling rate is not adjusted to pixel frequencies, sampling randomly determined points in the pixel, rather than predefined points, produces better images. Jittered sampling[32] uses a stratified sampling approach: the pixel is separated into predefined areas (hexagons proved better than rectangles), and a point within each of these areas is then randomly selected (using the uniform probability distribution) and sampled. This randomness changes the characteristics of the aliasing so that it becomes noise[33]. Noise is preferred to aliasing, as signal processing theory provides several techniques for dealing with noise to improve the image. A discussion of these techniques and the theory behind them is beyond the scope of this paper.

Filtering refers to the method by which the samples taken from one pixel are merged to produce one value for the entire pixel. In theory, it is possible to remove the high-frequency components of a pixel beyond some predetermined frequency without any aliasing by constructing a weighted average filter. However, such a filter is mathematically too complex to compute efficiently[34]. Several approximations have been developed. One approach is to calculate the visible areas of each surface within the pixel and calculate the average based on the relative areas covered[28]. Every part of the pixel receives the same weighting in the calculation of the average. There are, of course, many other types of filters. Rather than assign equal weight to all parts of the pixel, one can choose to weight areas differently. Triangular filters assign the greatest weight to the center of the pixel. Gaussian filters (i.e., a truncated normal distribution) provide more realistic images[30].
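Before turning to more elaborate filters, here is a minimal sketch of jittered (stratified) super-sampling combined with an equal-weight (box filter) average. The shade(x, y) routine, the square grid of strata (the cited work found hexagons better), and the grid size are assumptions made for illustration.

```python
import random

def jittered_pixel(shade, px, py, grid=4):
    """Estimate one pixel by stratified ("jittered") super-sampling.

    The pixel [px, px+1) x [py, py+1) is divided into grid x grid cells and
    one point is chosen uniformly at random inside each cell.  shade(x, y)
    returns an (r, g, b) tuple for an image point; the samples are merged
    with an equal-weight (box filter) average.
    """
    total = [0.0, 0.0, 0.0]
    for i in range(grid):
        for j in range(grid):
            x = px + (i + random.random()) / grid   # random point in cell (i, j)
            y = py + (j + random.random()) / grid
            for c, value in enumerate(shade(x, y)):
                total[c] += value
    n = grid * grid
    return tuple(t / n for t in total)
```

A weighted (e.g., triangular or Gaussian) filter would replace the equal-weight average with per-sample weights that fall off toward the pixel's edges.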
The sinc filter† provides the best results, but its infinite width makes calculation impractical[35]. Feibush et al.[36] introduced the use of a filtering function stored in a lookup table for faster computation, which is frequently used[34]. Abram et al.[37] used the lookup table method, but instead of storing a filter, they stored precomputed image intensity values. The contribution of a surface to the pixel is the product of its intensity, its area within the pixel, and the filtering value. Abram et al. defined a series of shapes that appear most commonly within a pixel and precomputed their filtered values, which were then stored in the table. Surfaces that did not fit the predefined shapes could usually be approximated by adding or subtracting the filtered values of several predefined shapes. Most complex calculations could therefore be avoided.

4. LIGHT, SHADING, AND SHADOWS

The next step in image generation is to shade the surfaces, taking into account lighting and object characteristics. The light reflected by objects falls into two categories: diffuse, which is scattered equally in all directions, and specular, which is reflected in only one direction. Diffuse light is generated by dull matte surfaces, while specular light is generated by shiny surfaces such as mirrors. We begin by examining how to determine the light at a point on the surface of an object, using the initial lighting models in computer graphics as examples.
4.1. Properties of light
4.1.1. Diffuse light. As diffuse light is reflected equally in all directions, the amount of light seen by the viewer is independent of the viewer's position. As shown below and in Fig. 4, the light reflected depends on the intensity of the light source (I_d), the angle θ between the light source and the surface normal (a vector perpendicular (90°) to the surface), and the reflective coefficient of the surface for diffuse light (k_d):

I = I_d k_d cos θ,   where 0 ≤ θ ≤ 90°.

If the vector from the surface to the light source (L) and the surface normal (N) are normalized, cos θ equals the dot product L·N, and the equation becomes
I = I_d k_d (L·N).

In most environments, ambient light is also present. Ambient light is diffuse light of uniform brightness caused by multiple reflections of light from many sources. To increase realism, an ambient component (I_a k_a) must be added to the equation.

† sinc(x) = sin(πx)/(πx)[35].
Fig. 4. Diffuse reflection.
As the intensity of ambient light is constant on all surfaces regardless of their orientation, the source intensity is often treated as a constant. More realistic radiosity models for ambient light have recently been developed and will be discussed below. Likewise, the distance from the light source to the object must be considered, although experience has shown that using the distance from the viewer to the object (r) plus some constant (k) produces better images[26]. The equation for the diffuse light at a point (considering ambient light and viewing distance) is therefore:
I = I_a k_a + I_d k_d (L·N)/(r + k).

As white light is composed of the three primary colors, we end up with three equations of the form above, one for each primary color.

4.1.2. Specular light. Specular light is observed on any shiny surface as a bright patch or "highlight" in the same color as the light illuminating the object. In contrast to diffuse light, specular light is reflected only in the direction for which the angles of incidence and reflection are equal. In Fig. 5, α would need to be zero for the viewer to see the highlight. However, most surfaces are not perfect reflectors, and thus light is reflected through a range of angles, although the intensity of the light drops sharply as α moves away from zero. Dull surfaces reflect specular light through a wider range of angles than shiny surfaces. The specular reflection also increases as the angle of incidence increases. Bui Tuong Phong[38] introduced one of the first reasonable approximations for specular light:

I = W(θ) cos^n α,

where θ is the angle of incidence and α is the angle between the viewing direction and the direction of reflection. The first component, W(θ), takes on a value between 0 and 1, depending on the exact characteristics of the surface being modelled. In practice, this component has been ignored or set to a constant[34]. The second component, cos^n α, reaches a maximum at α = 0 and falls off as α increases. The rate of change is determined by n, the shininess factor. For shiny surfaces with a narrower area of reflection, n is set to a high number (e.g., 200), while for dull surfaces, n is set to 1.
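The sketch below evaluates the ambient, diffuse, and specular terms just described at a single point, anticipating the combined equation given next. The small vector helpers and the choice of replacing W(θ) with a constant ks are illustrative assumptions.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong_intensity(N, L, V, Ia, Id, ka, kd, ks, n, r, k):
    """Evaluate I = Ia*ka + (Id/(r + k)) * (kd*(L.N) + ks*cos(alpha)**n).

    N: unit surface normal; L: unit vector toward the light source; V: unit
    vector toward the viewer.  W(theta) is approximated by the constant ks,
    as is common in practice.  r is the viewing distance used for
    attenuation and k an added constant.
    """
    diffuse = kd * max(dot(L, N), 0.0)
    # Mirror-reflection direction of L about N: R = 2(N.L)N - L
    R = tuple(2.0 * dot(N, L) * nc - lc for nc, lc in zip(N, L))
    cos_alpha = max(dot(normalize(R), V), 0.0)   # alpha: angle between R and V
    specular = ks * (cos_alpha ** n)
    return Ia * ka + (Id / (r + k)) * (diffuse + specular)
```

In a color image this evaluation is repeated for each of the three primary components, as noted above.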
Fig. 5. Specular reflection.
The combined equation for both diffuse and specular light becomes:

I = I_a k_a + [I_d/(r + k)][k_d (L·N) + W(θ) cos^n α].

Calculating the light at each pixel can be time consuming, so Phong also developed an efficient method for incremental evaluation along a scan-line. In contrast to the empirically derived Phong model, Blinn[39] proposed a more accurate model based on the physics literature[40]. Each surface is modelled as a collection of microscopic facets, which are perfect reflectors. The facets are generated by first breaking the surface into facets and then using a normal probability distribution function to determine the angle of each facet relative to the surface normal. To calculate the shading of the surface, each facet able to reflect light to the viewer is examined and its intensity determined. The Phong and Blinn models produce similar images, except when light hits the surface at sharp, grazing angles, in which case the Blinn model is more accurate[34]. One of the most realistic models in use today was introduced by Cook and Torrance[41], who realized that the color of the surface influences the reflected light. They modified the Blinn model to consider the wavelengths of the light.

4.2. Shading polygon meshes and patches
There are three basic ways to shade polygon meshes: constant shading, intensity interpolation shading, and normal-vector interpolation shading[26]. Constant shading, as the name implies, calculates one intensity value for the entire polygon. This works well provided that the angles to the light and to the viewer are constant across the surface. In other words, the source of light and the viewer must both be at infinity, and the surface must not be curved. If this is not the case, the results of constant shading are not realistic. Exacerbating this problem is the Mach band effect, first described in 1865 by E. Mach. Simply put, our eyes exaggerate the differences in intensity when two surfaces of different intensities meet at a common edge. Thus, while two surfaces may have almost the same shading, as we look across the image we perceive a harsh, unrealistic change in intensity between the two.
In 1971, Gouraud[42] introduced intensity interpolation shading (now called Gouraud shading), which significantly improves the realism of images. In this technique, the surface normal is calculated at each vertex by averaging the surface normals of all surfaces meeting at that vertex. Then the intensity at each vertex is determined using this vertex normal. Finally, the shading across the surface is determined by linear interpolation between the vertices (i.e., calculating the intensity at each pixel as the weighted average of the intensities of the vertices of the surface, based on the distance to each vertex). For colored surfaces, each of the three RGB color components is determined separately. Gouraud shading is not perfect; a surface with a rapid change in slope is not properly shaded. To adjust for this, Bui Tuong Phong[38] developed normal-vector interpolation shading. Rather than interpolating the intensity across the surface, the surface normal is interpolated (in much the same manner as the intensity was interpolated). Once the normal for each point is estimated, the intensity is calculated at that point. While this significantly improves the realism, it also greatly increases calculation costs--by a factor of 10 in one example[43]. To reduce the calculations required, Bui Tuong Phong and Crow[44] suggested using normal interpolation only in polygons where specular highlights are expected. Duff[45] showed that by combining the surface-normal interpolation equation with the reflection equation, calculation time could be reduced. More recently, Bishop and Weimer[43] demonstrated an algorithm that approximates Phong shading by using a Taylor series expansion of Phong's equation. They reported that the algorithm reduced the time for Phong shading by a factor of 5, to approximately twice the time required for Gouraud shading. Many of these same techniques are used to shade bicubic patches. For best results, however, the surface normal must be calculated at each pixel and then used to calculate the shading at that pixel[26]. This is often done by recursively subdividing the patch into smaller and smaller patches until each patch covers only one pixel [from Catmull, reported in 46]. This is very time-consuming, but effective.
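To make the contrast concrete, the sketch below shades one interior point of a triangle both ways, given interpolation weights for that point (e.g., barycentric weights that sum to 1): Gouraud shading blends the intensities already computed at the vertices, while Phong-style shading blends the vertex normals and only then evaluates the lighting model. The intensity() callback stands in for whichever illumination model is in use and is an assumption made for illustration.

```python
def gouraud_shade(vertex_intensities, weights):
    """Intensity interpolation: blend intensities computed at the vertices."""
    return sum(i * w for i, w in zip(vertex_intensities, weights))

def phong_shade(vertex_normals, weights, intensity):
    """Normal-vector interpolation: blend the normals, then light the point."""
    n = [sum(normal[c] * w for normal, w in zip(vertex_normals, weights))
         for c in range(3)]
    length = sum(c * c for c in n) ** 0.5
    n = tuple(c / length for c in n)       # renormalize the interpolated normal
    return intensity(n)                    # evaluate the lighting model here
```

The extra cost of Phong shading comes from that final per-point lighting evaluation, which is exactly what the approximations cited above try to cheapen.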
4.3. Shadows
Shadows play an important role in providing the appearance of reality. While they are not always required, they are sometimes essential. The addition of shadows to perspective views vastly enhances the depth perception in the display[30]. The complexity of generating shadows is directly related to the model of light sources used. If light is provided by a source outside the image (e.g., at infinity), the calculations become much simpler[34]. Shadow algorithms for specular light from point light sources are identical to hidden-surface algorithms! The hidden-surface algorithm determines which surfaces can be "seen" from the viewpoint, and the shadow algorithm determines which
surfaces can be "seen" from the light source. The surfaces that are visible both from the viewpoint and from the light source are not in shadow. Those that are visible from the viewpoint but not from the light source are in shadow. This logic can easily be extended to multiple light sources[26, p. 584]. Recent research on shadows has extended various hidden-line algorithms to better model specific aspects of shadows, or has examined effects caused by different assumptions about light source type and position; see [47-51].

4.4. Ray tracing
Three other approaches to lighting and shading have been suggested: ray tracing, radiosity, and wave theory. Ray tracing involves following a beam of light through the image as it is reflected and refracted by the objects it encounters, while the radiosity approach applies energy conservation theories from physics. These are both active research areas and will be discussed in more detail in the next sections. The third approach, wave theory[52], was similar to ray tracing, in that waves of light, rather than particles, were traced through the image. However, the images produced were disappointing, and computation time was very high[34]. No further research has used wave theory.

Ray tracing is a method for determining shading (particularly for specular light). There were early attempts at ray tracing (or ray casting, as it was called then[53, 54]), but it has become popular only in the last 10 years. Ray tracing works by tracing a ray of light as it travels from the eye of the viewer back through a pixel, then into and around the image. As the ray is traced through the image, a tree is constructed that contains information about its intersections with surfaces. Once the ray exits the image or reaches a light source, this tree is traversed and the intensity is calculated by examining the contribution of each surface. Ray tracing may also stop when the length of the ray's path reaches a predetermined attenuation distance[1]. Reflections and transparency are considered, which may cause a ray to be split into several rays, although some implementations are limited to two rays per initial ray[55].

The calculation at each point in the tree of objects has two parts[49]. The first determines if the intersection between the ray and the surface can be seen from each light source. If no surface blocks the light source, then the second step is to calculate the contribution of the intersection to the pixel, based on the characteristics of the surface and light source.

Ray tracing is versatile and can be used for more than just lighting and shading determinations. "First order" ray tracing can be used for hidden surface determination, as the first surface struck by the ray is the visible surface. Shadows can also be determined by firing rays at light sources.
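A skeletal recursive ray tracer in the spirit of that description is sketched below. Every name here (scene.intersect, scene.blocked, hit.ray_to, surface.shade, the reflected and refracted ray helpers, and the fixed depth limit standing in for an attenuation cutoff) is an assumed interface for illustration, not an API from the papers cited; colors are assumed to support addition and scaling (e.g., numpy arrays).

```python
def trace(ray, scene, lights, depth=0, max_depth=5):
    """Follow one ray into the scene and return its color contribution."""
    hit = scene.intersect(ray)             # nearest surface hit, or None
    if hit is None or depth > max_depth:
        return scene.background
    color = hit.surface.ambient
    for light in lights:
        shadow_ray = hit.ray_to(light)
        if not scene.blocked(shadow_ray):  # is the light visible from the hit point?
            color = color + hit.surface.shade(hit, light, ray)
    if hit.surface.reflectivity > 0:       # spawn a reflected ray
        color = color + hit.surface.reflectivity * trace(
            hit.reflected_ray(ray), scene, lights, depth + 1, max_depth)
    if hit.surface.transparency > 0:       # spawn a refracted ray
        color = color + hit.surface.transparency * trace(
            hit.refracted_ray(ray), scene, lights, depth + 1, max_depth)
    return color
```

Calling trace once per pixel (or several times per pixel when super-sampling) produces the image; the shadow tests inside the loop are exactly the intersection tests that dominate the running time, as discussed next.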
Ray tracing has been a major area of recent research for two reasons: first, the effects are important in modeling reality; and, second, ray tracing is very simple, both conceptually and algorithmically[34]. However, ray tracing also has drawbacks: it is computationally very expensive, as each pixel must be examined--typically requiring 250,000 to 1 million rays to be traced[34]. Computation times are often measured in CPU hours[1, 56, 57]. Determining the intersections is the most computationally expensive part of ray tracing, as every object in the environment must be examined to see if it blocks each light source from the intersection. This can absorb up to 75% of the total computation time for simple images[58], and it increases linearly with the number of objects and the number of rays[59]. Reducing computation time is an active research area and will be discussed in more detail below. A second drawback is anti-aliasing. Each pixel is considered separately--a process that can cause problems. More importantly, a ray of light is infinitesimally thin and therefore provides only a tiny sample of the image relative to the size of a pixel. To improve the quality of the image, it is common for several rays to be traced for each pixel (super-sampling), thereby obtaining a better sample from which to generate the pixel image[60]. Naturally, this dramatically increases computation time.

While most early research concentrated on finding ways to quickly calculate the intersections of rays with more and more complex objects (e.g., patches, planar cubic splines, etc.)[34], much recent research has attempted to reduce calculation time. One approach has been to apply more hardware processing power to the problem by using larger computers or vector processors[61, 62]. Other developers have subdivided the image and distributed its parts to several computers for parallel processing[63-67]. These approaches often require significant changes in the implementation of previous techniques.

4.4.1. Bounding volumes. Three algorithmic approaches to reducing computation time have been suggested. The first set of techniques built upon attempts to reduce computation time through the use of "bounding volumes." For complex images, 95% of computation time is used by shadow-testing and intersection-testing[49]. Bounding volume techniques attempt to reduce the number of intersection tests by surrounding complex objects with simpler imaginary objects, such as boxes or spheres, that require fewer or simpler intersection calculations. If the ray does not intersect the bounding volume, we can omit the multiple tests required for the more complex objects inside the volume[56, 58, 68, 69]. There is an important balance in using bounding volumes. The volume must fit tightly around the object, so that few calculations are wasted on rays that pierce the bounding volume but do not intersect the object; the bounding volume must therefore adapt to the contours of the object. On the other hand, the shape of the bounding volume must be simple, or else the calculations required to determine the intersections with the bounding volume will be as complex as those required for the object itself.
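A minimal ray-sphere rejection test of the kind used for bounding volumes is sketched below: if the ray misses the sphere enclosing an object, the object's own (more expensive) intersection tests can be skipped. The vector layout and the assumption of a unit-length ray direction are illustrative choices.

```python
import math

def ray_hits_sphere(origin, direction, center, radius):
    """Return True if a ray (with unit direction) intersects a bounding sphere."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - 4.0 * c          # quadratic with a == 1 for a unit direction
    if discriminant < 0.0:
        return False                        # the ray misses the bounding sphere
    root = math.sqrt(discriminant)
    # Accept a hit in front of the ray origin (or an origin inside the sphere).
    return (-b - root) / 2.0 > 0.0 or (-b + root) / 2.0 > 0.0
```

A ray that fails this test cannot reach anything inside the sphere, so every surface of the enclosed object can be skipped for that ray.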
This technique can be extended so that bounding volumes are themselves surrounded by a larger bounding volume, to provide further simplification of the image. In essence, the image is partitioned into smaller pieces. By creating larger bounding volumes around other bounding volumes and objects, the objects are organized into a hierarchical tree. Using this technique, one no longer needs to examine every object when searching for ray intersections; only certain branches of the tree need to be searched. Kay and Kajiya[70] partitioned the objects by surrounding each object with a set of planes, rather than the previously used three-dimensional objects such as spheres or cubes. This, combined with a more efficient search through the object tree, resulted in an implementation that was three times faster than an algorithm developed by Glassner (discussed below).

Many different trees of objects can be created for any given scene, depending on the method used to build the tree. Building the right tree is important: "The time required to render a simple image can easily vary by a factor of fifty due simply to the choice of different trees"[71, p. 15]. However, most trees will be within the same order of magnitude as the optimal tree in terms of time, simply because of probabilities--making the worst tree-building decision at every choice is quite unlikely. Goldsmith and Salmon[71] developed a heuristic to improve computation time. The tree is built so that each bounding volume contains as many objects as possible for the smallest increase in size. Performance is, to a large extent, determined by the order in which objects are entered into the tree. If objects are added to the tree in the order they were created by the user, some spatial coherence may exist, as the user probably created the objects in one area first, then moved on to a different area of the image. Some objects may exhibit very poor spatial coherence (e.g., the user later added objects) and therefore result in poor performance. By randomly shuffling the objects, the time required was made more consistent, albeit slightly longer for images with high spatial coherence.

4.4.2. Octrees. The second set of ray tracing techniques divides the image into smaller volumes and stores the contents of each volume. This approach builds a tree by volume, in contrast to the above approach, which builds a tree by object. Dippe and Swensen[72] divided the image into a set of subregions of approximately the same volume. Objects were then loaded into one or more subregions. As all regions did not contain the same number of objects or amount of complexity, the region boundaries were then reassigned to balance computational complexity. This balance was continuously monitored during processing and boundaries were reassigned on the fly. Interestingly enough, this technique was implemented on a set of computers operating in parallel--each computer was assigned a set of subregions. Ray information was shared between processors as rays entered and exited regions. In a similar approach, Glassner[73] divided the image into a set of small cubes ("voxels") and stored a list of all objects present in each cube. Rays entering a voxel either pierced an object or continued on into the next voxel. Voxels were organized and stored in
an octree (an 8-way tree), as cubes can easily be subdivided into a 2 × 2 × 2 set of 8 subcubes. The tree was built by subdividing voxels into smaller and smaller voxels until the number of objects in each voxel was below some maximum. Fujimura et al.[74], Matsumoto and Murakami[75], and Kaplan[76] independently developed other similar octree techniques that also partition the space of the image. These octree techniques reduced computation time by up to an order of magnitude over previous ray tracing techniques. Another advantage of the Glassner (and Kaplan) octree technique is that it considers objects in roughly the same order as the ray would encounter them[70].

Techniques based on octrees also have their drawbacks. As one object may lie in several nodes of the tree, it may be examined several times for the same ray[70]. Octrees also require an immense amount of memory--often several megabytes[71, 77]. Therefore, they can be impractical for some applications. Fujimoto et al.[78] extended the traditional two-dimensional algorithm for drawing lines and used it to approximate the path of a ray through the voxels in an image. This, combined with a different approach to coding the voxels for storage in the octree,* provided significantly reduced computation time: an image that was estimated to require 40 days to generate with standard ray tracing techniques required only 135 minutes with this technique. Fujimoto et al. claim that this technique is actually faster for rendering complex images than the techniques discussed above. Arvo and Kirk[79] determined that a ray, defined by a starting coordinate in 3D space and a 3D direction vector, can be modelled as a point in 5-space, with neighboring rays grouped into hyper-cubes in 5-space. The image space is divided into disjoint sets based on 5-D bounding volumes determined by the rays, so that all rays in the same set intersect approximately the same objects. The techniques used to trace rays are similar in concept to those described above--but in 5-space.

4.4.3. Ray coherence. The third set of ray tracing techniques uses ray coherence--the probability that rays close together will follow parallel paths through the image--and can be used in conjunction with bounding volume techniques. Heckbert and Hanrahan[80] suggested tracing clusters ("beams") of rays through the image, rather than individual rays. This not only reduced computation time, as one computed path could be used for several rays, but also assisted in anti-aliasing, especially for soft-edged (blurred-edged) objects and shadows. However, curved objects caused problems, as the individual rays in a beam follow different paths. As well, in complex environments there is less ray coherence: with more objects, adjacent rays are more likely to strike different objects and therefore not follow the same path. Time savings are thus reduced in complex images--the very images that require the most time to generate.
* The voxels are coded such that each bit in the 3-bit number represents a different x, y, or z axis.
A similar ray coherence technique was used by Joy and Bhetanabhotla[81], but with Newtonian methods to reduce the time required for intersection calculations. Amanatides[82] suggested generalizing the concept of a ray from a line to a cone. This reduced computation time by reducing the total number of rays that needed to be traced to just one per pixel. It also improved anti-aliasing, as area sampling rather than point sampling could be used. However, intersection calculations were made much more complex. Speer et al.[55] constructed tunnels, or cylindrical "safety regions," through which rays could pass without striking an object. However, the computation time for constructing and using these tunnels exceeded the benefits they provided.

One major problem with ray tracing is the difficulty of accurately modeling diffuse light. The reflection calculations required for specular light are few, as it is reflected in one direction. Diffuse light, by definition, is reflected in all directions. To properly trace diffuse light, it would be necessary to generate thousands of new rays each time a diffuse surface is encountered. Fuzzy images are therefore difficult to model properly. "Ray traced images are sharp because ray directions are determined from geometry"[33, p. 137]. When light strikes a dull surface, the reflected light does not follow ideal geometry; it is reflected according to the nature of the surface. Cook et al.[83] developed distributed ray tracing in an effort to extend ray tracing to fuzzy surfaces, diffuse light, and animation. They used super-sampling (several rays for each pixel), but randomly adjusted the exact angle of reflection for each ray to produce the effect of diffuse light (as proposed by Whitted[56]). They also included a similar technique, based on random sampling, for object blurring due to depth-of-field differences. Normal image generation techniques produce images that are equally sharp across all depths of field. In contrast, cameras (and our eyes) can only focus on one depth at a time; objects at other depths are blurred.

While this and other ray tracing methods made attempts to model diffuse light, virtually no attempt was made to model ambient light--the global "background" light produced by light reflected from all surfaces in the image. To properly calculate ambient light, it is necessary to calculate the light reflected by all surfaces, and the interdependencies between surfaces can make accurate estimates computationally intractable.

4.5. Radiosity models
A better model for ambient light, the radiosity model, was suggested by Goral et al.[84]. In this approach, a hypothetical enclosure is constructed around the image, which includes the surfaces existing in the image plus fictitious surfaces such as a window or the sky. All surfaces are assumed to be perfectly diffuse light sources or perfectly diffuse reflectors--reflectors that diffuse light equally in all directions. A "form factor," defined to be the fraction of light from one surface that
strikes another surface, can then be calculated using theory from thermal engineering for heat radiation in enclosures (the fundamental law of conservation of energy within closed systems[85, 86]). The total light energy leaving a patch must sum to the same amount that entered it, as the patch is assumed to be a perfectly diffuse reflector. Thus the fraction of light reflected in any direction can be determined by iteratively solving a set of simultaneous equations. In this manner, the actual light reflected from all surfaces in the image (or part of the image) that composes ambient light is calculated. Goral et al. were even able to model "color bleeding"‡--the spread of color from colored surfaces to lighter surfaces that occurs with diffuse light. The model is computationally complex, but as ambient light is constant regardless of the position of the viewer (i.e., view-independent), the shading can be calculated once for a given image and continue to be used even if the viewer moves to a new viewpoint.

This approach was extended by Cohen and Greenberg[87] for more complex environments that include occluded (hidden) surfaces. They replaced the hemisphere around each patch used by Goral et al. with a hemi-cube. This simplified the calculations and thereby enabled hidden surfaces to be considered. However, computation time was still large (though in the same range as ray tracing techniques), increasing with the square of the number of patches. Also, patches with a high variation in intensity (e.g., patches with shadows falling on part of them) were not modelled properly. Interpolating intensity across the patch, or subdividing the patch into a set of smaller patches, is possible but expensive[88]. Cohen et al.[89] addressed these issues by subdividing patches with high intensity variation into "elements." In the first "coarse" shading pass, the intensity of the patches is determined, ignoring any elements they may contain. Then each element in the patches with high intensity variation is examined, and its intensity is calculated based on the coarse shading results of the first pass. The images produced are similar to those produced by subdividing patches into a mesh of smaller patches, but computation time is reduced (by a factor of 3 in one example).

Nishita and Nakamae[90] suggest that outdoor ambient light can be modelled using these techniques by dividing the image into bands, then calculating a level of ambient light that remains constant in each band by examining the area of sky visible from the center of the band and the level of light reflected by objects in the image. To reduce calculations, small surfaces are ignored and large surfaces are assumed to be uniform. This was later enhanced to consider the effects of atmospheric particles[91].
‡ For example, hold a white piece of paper next to a bright red surface; the red color "bleeds" onto the paper.
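The description above amounts to the standard radiosity system B_i = E_i + ρ_i Σ_j F_ij B_j, where B_i is the radiosity of patch i, E_i its emission, ρ_i its diffuse reflectance, and F_ij the form factor from patch i to patch j. A minimal sketch of the iterative solution is shown below; computing the form factors themselves (e.g., with the hemi-cube) is the expensive step and is assumed to have been done already.

```python
def solve_radiosity(E, rho, F, iterations=50):
    """Iteratively solve B_i = E_i + rho_i * sum_j F_ij * B_j (Jacobi-style).

    E:   list of patch emissions (light sources have E > 0)
    rho: list of diffuse reflectances
    F:   form-factor matrix, F[i][j] = fraction of light leaving patch i
         that arrives at patch j
    """
    n = len(E)
    B = list(E)                            # initial guess: emitted light only
    for _ in range(iterations):
        B = [E[i] + rho[i] * sum(F[i][j] * B[j] for j in range(n))
             for i in range(n)]
    return B                               # view-independent patch radiosities
```

Because the solution is view-independent, it can be reused unchanged as the viewer moves, which is the property emphasized above.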
In 1986, Immel et al. developed a technique to extend the radiosity method to include surfaces that are not perfectly diffuse. The specular component was addressed for the first time, by generalizing the radiosity formulation to consider directional reflection characteristics. While there is an infinite number of directions from which light can arrive at a patch and, therefore, an infinite number of directions in which it can be reflected, the problem was simplified by defining a finite number of directions. In this case, 1000 directions were defined, resulting in an extremely large matrix of simultaneous equations. Fortunately, the matrix was very sparse and could be further pruned, as only a few patches were specular reflectors. Even so, computation time was two orders of magnitude more than for previous radiosity techniques. As the specular component can change quickly with the angle to the viewer, it can vary greatly across the same patch, thereby requiring a huge number of patches to achieve realism. Surfaces may need to be reduced to the size of a pixel for proper results, which is computationally impractical[92]. Wallace et al.[92] presented a two-pass technique that first uses radiosity techniques to determine global diffuse (ambient) lighting, then ray tracing techniques using the view position to calculate the specular component.

4.6. "The rendering equation"
In his 1986 paper entitled "The Rendering Equation," Kajiya[93] presented a new approach to lighting and shading that begins to integrate ray tracing and radiosity to produce one equation for both specular and diffuse light. The paper provides a mathematical proof that unites ray tracing and radiosity, based on a concept of balancing energy not unlike that of radiosity. A discussion of the proof is beyond the scope of this paper. However, the basic concept is intuitively simple. When a ray strikes a diffuse surface, it is reflected equally in all directions, so we would have to generate an infinite number of rays at each intersection of a ray and a diffuse surface. We could approximate this by generating some finite number of rays at each surface. Even if this finite number were 1000, we would still generate a huge number of rays to be traced through the image,
as at each intersection 1000 new rays would need to be added, as shown in Fig. 6. Most surfaces are not perfectly diffuse; they do not reflect all the light shone on them. Most surfaces are at most 50% diffuse reflectors. Light also attenuates as more surfaces are encountered. Thus, the contribution of each ray of light to the overall image decreases sharply as the number of surfaces intersected increases--for example, 50% with the first surface, 25% with the second, 12.5% with the third, and so on. As we follow the ray through the image, the number of rays increases sharply, while the contribution of each ray decreases sharply, as shown in Fig. 7. Kajiya proposed balancing this trade-off by tracing a randomly selected sample of the rays generated at each intersection, rather than tracing all of them. The size of the sample can be large initially, but decreases as the number of surfaces intersected increases, without seriously affecting the realism of the final image.
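As a sketch of that sampling strategy, the fragment below spawns a random set of diffuse bounce rays whose size shrinks with depth, reusing the trace() sketch from Section 4.4. The cosine-free hemisphere sampling, the halving schedule, and the helper names are illustrative assumptions, not the paper's prescription, and radiance is treated as a single value for brevity.

```python
def diffuse_bounce(hit, scene, lights, depth, samples0=64):
    """Estimate diffuse interreflection with fewer sample rays at each bounce."""
    samples = max(1, samples0 // (2 ** depth))     # halve the ray budget per level
    total = 0.0
    for _ in range(samples):
        direction = hit.random_hemisphere_direction()   # assumed helper
        ray = hit.spawn_ray(direction)                   # assumed helper
        total += trace(ray, scene, lights, depth + 1)
    return hit.surface.diffuse * total / samples
```

The deeper a ray gets, the less it contributes to the pixel, so spending fewer samples there costs little visible realism, which is the trade-off Fig. 7 illustrates.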
5. TEXTURE

"One of the most frequent criticisms of early synthesized raster images was the extreme smoothness of surfaces: They showed no textures, bumps, scratches, dirt or fingerprints. Realism demands complexity, or at least the appearance of complexity"[35, p. 56]. Texture can be defined as part of the object--a surface can be composed of thousands of tiny polygons or patches, each with different characteristics. When texture is created in this way, the texture is not a separate entity--no special techniques are needed, and rendering of the object is done as previously described. However, this is very computationally expensive. Another approach is texture mapping, where a separately defined texture is mapped (or "wallpapered") onto the image when needed. This is considerably less expensive than defining objects with thousands of microscopic surfaces and, when properly done, can achieve high levels of realism.

5.1. Texture mapping
Texture mapping begins with a definition of the texture.
Fig. 6. Ray tracing diffuse surfaces: the number of rays increases exponentially.
Fig. 7. Number of rays versus their importance.
The definition can be either a stored array or a mathematical function, and it can be one-dimensional, two-dimensional, or three-dimensional, although two-dimensional definitions appear to be most common[34]. The texture is first mapped onto the surface in 3D object space, by taking the coordinates from the surface and using them in the texture function to obtain the desired texture. Once the surface has been textured, it is then projected onto the screen. As added textures are likely to have many variations (a high frequency of changes), they are particularly susceptible to aliasing. There are three basic approaches to texture mapping: texture scanning, two-pass, and screen scanning[35].

Blinn[94] developed an early texture scanning technique called "bump mapping." As we have seen, reflections are controlled by the surface normal. Bump mapping works by perturbing (i.e., changing) the surface normal as the surface is generated in the image. These changes in the angle of the surface normal produce the appearance of a textured surface. The perturbations can be drawn from a stored texture table or generated by a mathematical function as the image is being generated. However, as the surface is not actually changed to produce the texture, the surface's silhouette will always betray its true shape[95]. Max[21] addressed this problem by providing a bump mapping function for shadows: a stored table was used to do the bump mapping for the surface, while a separate table based on the first was used to properly generate the shadows. Kajiya[96] extended the bump mapping technique by perturbing other characteristics in addition to the surface normal, to better model the directionality of surface features in nature such as hair, cloth, etc. This technique, called "horizon mapping," was further extended[97] to consider the attenuation of light reflected by rough surfaces. Each surface is divided into a set of triangular facets. Then, using the tables from horizon mapping, the amount of light actually reflected by each facet is estimated.
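A toy version of the bump-mapping idea is sketched below: the stored surface normal is perturbed by the gradient of a height ("bump") table before shading, so the geometry itself is never changed. The finite-difference gradient, the subtraction along the surface tangents, and the scale factor are simplifications of the published formulation, chosen only for illustration.

```python
def perturb_normal(normal, bump, u, v, tangent_u, tangent_v, scale=1.0):
    """Perturb a unit surface normal using a 2D bump (height) table.

    bump is a 2D array indexed as bump[u][v] (u, v assumed to be interior
    indices); tangent_u and tangent_v are unit vectors spanning the surface
    at the point being shaded.
    """
    du = bump[u + 1][v] - bump[u - 1][v]     # finite-difference height derivatives
    dv = bump[u][v + 1] - bump[u][v - 1]
    perturbed = [n - scale * (du * tu + dv * tv)
                 for n, tu, tv in zip(normal, tangent_u, tangent_v)]
    length = sum(c * c for c in perturbed) ** 0.5
    return tuple(c / length for c in perturbed)
```

Feeding the perturbed normal into the shading calculation yields the bumpy appearance, even though the silhouette of the surface remains smooth, as noted above.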
The second approach is the two-pass method originally proposed by Catmull and Smith[98]. The first pass maps the rows of an image, while the second pass maps the columns. This method also works better with affine images and is most appropriate for images that cannot be accessed in a random-access manner (e.g., the texture map is so large that it must be stored on an external device).

The third approach, screen scanning, is also called inverse mapping, as it inverts the "normal" direction of image generation (from objects to screen). For each pixel on the screen (or other output device), the corresponding objects are found, and then the appropriate texture is located. This is the preferred technique when the screen must be built sequentially (for a film recorder, for example), as the image is generated in pixel order. Screen scanning is currently the most popular technique.

Texture mapping requires subtlety, as the texture does not map directly to the screen unless there is an affine (linear) relationship between the surface and the screen. If the relationship is nonaffine, the texture may appear warped or twisted. Nonaffine texture mapping must also be done adaptively to prevent holes or overlaps in the texture. Likewise, it is hard to wrap a two-dimensional texture around a complex three-dimensional object: the texture must be stretched and compressed to fit the shape of the surface. Perspective projection further distorts the texture, especially on curved surfaces. Williams[99] suggested keeping tables of the average texture over square areas, with each table providing the proper texture at a different resolution. As the texture is distorted over the image (at curves, for example) and more texture is squeezed into each pixel, one merely selects the appropriate resolution table to generate the texture. Adjacent pixels textured from different tables are blended to avoid obvious differences in texture. However, tables based on square regions produce fuzzy pictures when the texture is distorted over curved surfaces. Crow[100] generalized Williams' approach by using a two-element table of rectangular textures. The two entries enable a rectangular texture area to be properly distorted to match the shape of the area over which the texture is being applied.

5.2. Procedural modeling
These techniques are not perfect, however. Consider a three-dimensional block of wood. The texture on each surface is related; the grain flows throughout the block. With the techniques above, texture is applied separately to each surface. While the computer-generated block has a "correct" set of surfaces, it may not look like a block of wood, as the surfaces are not consistent with each other[101]. To produce realistic three-dimensional images, it can be necessary to provide a three-dimensional texture. However, storing a three-dimensional texture array requires a large amount of space. To overcome this, Peachey[101] used a mathematical function ("that makes heavy use of a square root library function"[p. 285]) to produce texture. Perlin[102] also introduced three-dimensional procedural textures using stochastic (random) nonlinear functions. Procedural texture mapping obviously trades reduced storage space for an increase in processing time. However, the implementation provided by Peachey added only 18% to the total image generation time. Procedural textures are also generally difficult to anti-alias[34].
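A small procedural 3D texture in the spirit of the solid-texture work just described is sketched below: a smoothed value-noise lattice evaluated at object-space coordinates, so all faces of a block sample the same underlying "material." It is a generic illustration, not Peachey's or Perlin's actual functions, and the hash constants and ring count are arbitrary choices.

```python
import math

def _lattice_value(ix, iy, iz):
    """Deterministic pseudo-random value in [0, 1) at an integer lattice point."""
    h = math.sin(ix * 12.9898 + iy * 78.233 + iz * 37.719) * 43758.5453
    return h - math.floor(h)

def _smooth(t):
    return t * t * (3.0 - 2.0 * t)            # smoothstep interpolation weight

def value_noise(x, y, z):
    """Trilinearly interpolated value noise over the integer lattice."""
    ix, iy, iz = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = _smooth(x - ix), _smooth(y - iy), _smooth(z - iz)
    result = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((fx if dx else 1 - fx) *
                     (fy if dy else 1 - fy) *
                     (fz if dz else 1 - fz))
                result += w * _lattice_value(ix + dx, iy + dy, iz + dz)
    return result                              # roughly uniform in [0, 1)

def wood_texture(x, y, z, rings=8.0):
    """Ring-like "wood grain" pattern evaluated directly in 3D object space."""
    r = math.sqrt(x * x + y * y) * rings + value_noise(x, y, z)
    return r - math.floor(r)                   # fractional part gives the rings
```

Because the pattern is a function of (x, y, z) rather than of each face separately, the grain lines up across adjacent faces of the block, which is exactly the consistency problem surface texture mapping has.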
Procedural textures are also generally difficult to antialias [34].

5.2.1. Natural phenomena. Natural objects are among the most difficult to model [34]. While human-generated objects are fairly regular, natural ones usually have a high degree of irregularity and complexity that is hard to describe, and harder still to define and store in an object database. Traditional approaches to texture mapping can, of course, still be used [103-105], but are sometimes unrealistic. "A fundamental limitation of standard approaches is that objects are modeled at a predetermined, fixed scale regardless of its suitability for a particular viewing distance" [106, p. 372]. Objects modeled with coarse detail (i.e., a few large surfaces) do not look realistic when viewed close up; they require more detail to be realistic. In contrast, when viewed at a distance, objects modeled with fine detail (i.e., many small surfaces), produced at high computational expense, appear no more realistic than those modeled with coarse detail. The key is to strike a balance between the level of detail required for the appearance of reality and its computational cost [107]. Standard modeling techniques cannot do this, as detail is defined irrespective of the viewing distance.

An alternative is to use a procedure to generate detail for a specific surface or patch. The user (or system) selects the basic shape of the patch, and the procedure generates the detail. Initially, a minimum amount of detail can be provided; as the viewpoint moves closer to the patch, the procedure is called recursively to generate more detail as needed. The procedure can use either a "legitimate" mathematical model of the actual phenomenon or an empirically determined process, discovered by experimentation, that produces the appearance of reality (the latter appears more common). The process may be either deterministic or stochastic (random); however, the best models follow a random distribution that can be modeled statistically [108]. While such processes may not be "scientific" (although Mandelbrot would probably disagree, as we shall see), they create remarkably realistic images of natural phenomena, and they integrate well with more traditional graphics techniques. The first use of procedural modeling [106] arose from a series of works by Mandelbrot, culminating in The Fractal Geometry of Nature [108], in which Mandelbrot describes a new form of geometry: fractal geometry.
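The following C sketch shows one way such a procedure might decide how far to recurse, tying subdivision depth to a crude estimate of projected size; the thresholds and the halving rule are assumptions made for illustration and are not taken from the papers cited above.

    /* Choose a recursion depth for detail generation from a crude estimate of
       the patch's projected size; constants are assumptions for illustration. */
    int subdivision_depth(double patch_size, double distance_to_viewer)
    {
        double projected = patch_size / distance_to_viewer;  /* rough screen-space extent */
        int depth = 0;

        while (projected > 0.01 && depth < 10) {   /* assumed threshold and depth cap */
            projected *= 0.5;                      /* each level halves feature size  */
            depth++;
        }
        return depth;   /* the generating procedure recurses this many levels */
    }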
5.2.2. Fractal geometry.

Why is geometry often described as "cold" and "dry"? One reason lies in its inability to describe the shape of a cloud, a mountain, a coastline, or a tree. Clouds are not spheres, mountains are not cones, coastlines are not circles, and bark is not smooth, nor does lightning travel in a straight line. More generally, I claim that many patterns of Nature are so irregular and fragmented, that, compared to Euclid--a term used in this work to denote all of standard geometry--Nature exhibits not simply a higher degree but an altogether different level of complexity. The number of distinct scales of length of natural patterns is for all practical purposes infinite [108, p. 1].

Nature does not play by the rules of Euclidean geometry, of lines, polygons, and spheres. Whereas a Euclidean curve is an object whose length between two points can be measured, a fractal curve contains an infinite variety of detail at each point on the curve, such that we cannot measure its length. When we look at a fractal curve, we see fine detail; as we move closer, we see more detail and the length of the curve grows. Move closer still and more detail appears; the length grows still more (see Fig. 8). Thus the level of detail depends upon the closeness of the view and the viewer.

In 1980, Fournier and Fussell, and Carpenter, separately applied Mandelbrot's concepts and submitted papers to SIGGRAPH '80. Although they took different approaches, the papers were integrated and published as a single work in CACM [106]. To model terrain, Fournier and Fussell generated a two-dimensional table using a fractional Brownian motion model (a simple subcase of fractal geometry [108]). The fractional Brownian motion model produces a series of numbers that are the weighted moving average of a series of Gaussian (i.e., normally distributed) random numbers. Thus the numbers in the table are randomly generated, but related to adjacent numbers by a weighting parameter. This table was then used to generate the surface detail, by moving each point on the surface perpendicular to the surface (e.g., raising it above the surface) by the amount specified in the table. Oppenheimer [109] used this fractional Brownian motion technique to model trees. However, he found that if the parameter used in the Brownian motion formula was held constant, the tree became quite regular in appearance, like a fern. To better capture the irregularity of most trees, Oppenheimer used a randomly generated parameter in the Brownian motion calculation.

Carpenter [106] began with an object modeled as a series of triangles, a reasonably common model for real-world data that has been acquired automatically [110]. Each triangle can be subdivided into four smaller triangles by connecting the midpoints of the triangle's sides. By using a fractal algorithm to determine the "midpoints" to connect, the triangle can be recursively subdivided into a series of fractal triangles. This is faster than the fractional Brownian technique above, but results in a patch that is more self-similar (though fractal at the limit).
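A one-dimensional analogue of this subdivision idea can be sketched in a few lines of C (a sketch under assumed parameters, not Carpenter's implementation): each recursion halves an interval and displaces the midpoint height by a random offset whose magnitude shrinks with the interval.

    #include <stdlib.h>

    /* Uniform random offset in [-scale, scale]; Gaussian offsets would match the
       fractional Brownian motion model more closely. */
    static double rand_offset(double scale)
    {
        return scale * (2.0 * rand() / (double)RAND_MAX - 1.0);
    }

    /* Recursively displace midpoints of height[lo..hi]; the displacement scale is
       reduced by "roughness" at each level, giving a fractal-looking profile. */
    void midpoint_displace(double *height, int lo, int hi, double scale, double roughness)
    {
        int mid = (lo + hi) / 2;

        if (mid == lo || mid == hi)
            return;                                /* interval can no longer be split */
        height[mid] = 0.5 * (height[lo] + height[hi]) + rand_offset(scale);
        midpoint_displace(height, lo, mid, scale * roughness, roughness);
        midpoint_displace(height, mid, hi, scale * roughness, roughness);
    }

Setting the two end heights and calling the routine with a roughness of roughly 0.5 to 0.7 yields a convincingly mountainous profile; the triangle version displaces the three edge midpoints in the same way and recurses into the four sub-triangles.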
Fig. 8. The level of detail changes with viewpoint (panels: distant view, closer view).
While this approach can be generalized to work with polygons, the complexity of the required calculations [e.g., 111] has made triangle subdivision popular [34]. Unfortunately, fractal subdivision models are slow, which limits their usefulness in real-time applications such as flight simulators. Modeling terrain can require up to 1 million points to be generated [46], so time is a constant issue with fractals. Fractals also exhibit "creasing problems," the occurrence of slope discontinuities ("creases") along surface or patch boundaries [111]. Fractal models for specific objects are also difficult to create [112]. However, Demko et al. [112] have developed a prototype system that can be used to generate reasonable fractal models of regular two-dimensional objects. We have seen how fractals are generated by recursively applying a transformation to an object; Demko et al. apply this logic in reverse. They recursively shrink several copies of the object and place those copies over the original. If there is a good match, a fractal model can be generated by applying the same logic mathematically. Norton [113] described a system to generate fractals in three dimensions, but the system required large quantities of space and time to run. Fractals have also been used in "four dimensions," that is, three dimensions plus time: computer animation [46]. They can be very effective in modeling objects displaying complex motion, such as a leaf in the wind. It is not clear how well fractal techniques can be adapted to realistically model other natural phenomena [65]. For example, fractal clouds and oceans simply do not look realistic [111, 114]. Some natural objects can be modeled using large sets of basic polygons. For example, Schachter [115] modeled cumulus clouds with large sets of overlapping ellipses; however, the same approach produced unrealistic results when applied to contrails.

5.2.3. Other procedural modeling techniques. Several authors have used other stochastic procedural approaches (i.e., nonfractal approaches) to model natural phenomena. Reeves [116] used a procedural model to randomly generate a series of moving points ("particles") to model fire. The user defines several (sometimes overlapping) particle areas by specifying the mean number of particles, the particle lifetime, velocity, and color for each area. The model then generates thousands of particles for each area, producing the appearance of fire. As Reeves and Blau point out, one advantage these techniques have over the fractal approach is the ability to better model dynamic or changing environments, such as fire or a field of grass moving in the wind. The technique was extended to produce branches and leaves [117], and waves, surf, and spray [118, 119].
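A much-simplified emitter in the spirit of such particle systems is sketched below in C; the structure fields, the jitter ranges, and the fire-like colors are assumptions made for illustration rather than Reeves' implementation.

    #include <stdlib.h>

    typedef struct {
        float x, y, z;      /* position               */
        float vx, vy, vz;   /* velocity               */
        float life;         /* remaining lifetime (s) */
        float r, g, b;      /* color                  */
    } Particle;

    static float frand(float lo, float hi)
    {
        return lo + (hi - lo) * rand() / (float)RAND_MAX;
    }

    /* Spawn n particles near an emitter position with jittered velocity,
       lifetime, and fire-like color. */
    void emit_particles(Particle *p, int n, float ex, float ey, float ez)
    {
        int i;
        for (i = 0; i < n; i++) {
            p[i].x = ex + frand(-0.1f, 0.1f);
            p[i].y = ey;
            p[i].z = ez + frand(-0.1f, 0.1f);
            p[i].vx = frand(-0.2f, 0.2f);
            p[i].vy = frand(0.8f, 1.5f);        /* mostly upward, like flame */
            p[i].vz = frand(-0.2f, 0.2f);
            p[i].life = frand(0.5f, 1.5f);
            p[i].r = 1.0f;
            p[i].g = frand(0.3f, 0.7f);
            p[i].b = 0.0f;
        }
    }

    /* Advance all particles by dt; the caller respawns expired ones. */
    void update_particles(Particle *p, int n, float dt)
    {
        int i;
        for (i = 0; i < n; i++) {
            p[i].x += p[i].vx * dt;
            p[i].y += p[i].vy * dt;
            p[i].z += p[i].vz * dt;
            p[i].life -= dt;
        }
    }

In a full system, expired particles are respawned, particle counts are drawn from a distribution about the user-specified mean, and each particle is rendered as a small glowing point that fades over its lifetime.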
Other researchers [120, 121] have used formal, nonrandom procedural models developed by biologists to model plant growth [see 122]. These models (which Smith calls "graftals"), when applied successively, closely model the actual growing process of plants and trees. Unfortunately, the model for a specific class of objects is not always obvious and can be difficult to create and to compute quickly [112]. In contrast, Mastin et al. [114] model ocean waves using an equation developed in geophysics; the technique uses a nonlinear waveform model of energy transfer.

An interesting new approach to procedural modeling of natural textures has been developed by Gagalowicz and de Ma [123]. Their objective was to develop an approach easily generalizable across many textures, unlike the fractal approach, which works well only for certain types of natural phenomena. They first examined a series of natural textures microscopically and, through a long series of experiments, developed a set of statistical models to represent natural texture. Thus, to generate a new texture, they examine a sample of that texture and, based on past experimentation, can quickly generate a statistical model to approximate it. This technique is not yet complete (it has only been applied to black and white textures), but it does look promising.

6. CONCLUSION
This paper has provided a brief introduction to current graphics rendering techniques. Computer graphics is a rapidly changing field; the state of the art advances quickly. Many of the techniques presented here will be improved upon or superseded by other techniques within a very short time. However, this overview should still serve as a convenient foundation from which to explore those new developments.
Acknowledgement--This work was partially supported by the Social Sciences and Humanities Research Council of Canada.

REFERENCES
1. R. A. Hall and D. P. Greenberg, A testbed for realistic image synthesis. IEEE Comp. Graphics and Appl. 3(8), 10-19 (1983).
2. T. Kamada and K. Satoru, An enhanced treatment of hidden lines. ACM Trans. on Graphics 6(4), 308-323 (1987).
3. I. E. Sutherland, R. F. Sproull, and R. A. Schumacker, A characterization of ten hidden surface algorithms. Comp. Surveys 6(1), 1-55 (1974).
4. M. E. Newell, R. G. Newell, and T. L. Sancha, A new approach to the shaded picture problem. Proceedings of the ACM National Conference, 443-450 (1972).
5. R. A. Schumacker, B. Brand, M. Gilliland, and W. Sharp, Study for Applying Computer-Generated Images to Visual Simulation. AFHRL-TR-69-14, U.S. Air Force Human Resources Laboratory (September 1969).
6. L. Carpenter, A new hidden surface algorithm. Proceedings of NW76, Seattle (1976).
7. L. Carpenter, The A-buffer, an antialiased hidden surface method. SIGGRAPH'84, 103-108 (1984).
8. E. Catmull, An analytic visible surface algorithm for independent pixel processing. SIGGRAPH'84, 109-115 (1984).
9. J. E. Warnock, A hidden-surface algorithm for computer generated halftone pictures. Computer Science Department, University of Utah, TR 4-15 (1968).
10. K. Weiler and P. Atherton, Hidden surface removal using polygon area sorting. SIGGRAPH'77, 214-222 (1977).
11. C. Wylie, G. W. Romney, D. C. Evans, and A. C. Erdahl, Halftone perspective drawing by computer. FJCC 1967, 49-58 (1967).
12. G. Hamlin and C. Gear, Raster scan hidden surface algorithm techniques. SIGGRAPH'77, 206-213 (1977).
13. G. A. Crocker, Invisibility coherence for faster scan-line hidden surface algorithms. SIGGRAPH'84, 95-102 (1984).
14. S. Sechrest and D. P. Greenberg, A visible polygon reconstruction algorithm. ACM Trans. on Graphics 1(1), 25-42 (1982).
15. J. R. Rankin, Algorithmic hidden line processing algorithm. Comp. & Graphics 11(1), 11-19 (1987).
16. J. Levin, A parametric algorithm for drawing pictures of solid objects composed of quadric surfaces. CACM 19(10), 555-563 (1976).
17. R. Mahl, Visible surface algorithm for quadric patches. IEEE Trans. on Comp. C-21, 1-4 (1972).
18. R. A. Weiss, BE VISION, a package of IBM 7090 Fortran programs to draw orthographic combinations of planes and quadric surfaces. Journal of the ACM 13(2), 194 (1966).
19. P. Y. Woon and H. Freeman, A procedure for generating visible line projections of solids bounded by quadric surfaces. Proceedings of 1971 IFIP Congress, 1120-1125 (1971).
20. K. Knowlton and L. Cherry, ATOMS--a three-D opaque molecule system for color pictures of space-filling or ball-and-stick models. Comp. and Chemistry 1, 161-166 (1977).
21. N. Max, Shadows for bump-mapped surfaces. In Advanced Computer Graphics, T. L. Kunii (Ed.), Springer-Verlag, Tokyo, 145-156 (1986).
22. T. Porter, Spherical shading. SIGGRAPH'78, 282-285 (1978).
23. J. Staudhammer, On display of space filling atomic models in real-time. SIGGRAPH'78, 167-172 (1978).
24. J. F. Blinn, L. Carpenter, J. Lane, and T. Whitted, Scan line methods for displaying parametrically defined surfaces. CACM 23(1), 23-34 (1980).
25. J. Lane and L. Carpenter, A generalized scan line algorithm for the computer display of parametrically defined surfaces. Comp. Graphics and Image Processing 11, 290-297 (1979).
26. J. D. Foley and A. van Dam, Fundamentals of Interactive Computer Graphics. Addison-Wesley, Reading, MA (1982).
27. G. A. Crocker, Screen area coherence for interactive scan line display algorithms. IEEE Comp. Graphics and Appl., 10-17 (September 1987).
28. E. Catmull, Computer display of curved surfaces. Proceedings of the Conference on Computer Graphics, Pattern Recognition and Data Structures, 4-17 (1974).
29. J. F. Blinn and M. E. Newell, Texture and reflection in computer generated images. CACM 19(10), 542-547 (1976).
30. D. Greenberg, A. Marcus, A. H. Schmidt, and V. Gorter, The Computer Image. Addison-Wesley, Reading, MA (1982).
31. M. E. Lee, R. A. Redner, and S. P. Uselton, Statistically optimized sampling for distributed ray tracing. SIGGRAPH'85, 61-66 (1985).
32. M. A. Z. Dippe and E. H. Wold, Antialiasing through stochastic sampling. SIGGRAPH'85, 69-78 (1985).
33. R. L. Cook, Stochastic sampling in computer graphics. ACM Trans. on Graphics 5, 51-74 (1985).
34. J. Amanatides, Realism in computer graphics: A survey. IEEE Comp. Graphics and Appl. 7(1), 44-56 (January 1987).
35. P. S. Heckbert, A survey of texture mapping. IEEE Comp. Graphics and Appl. 6(11), 56-67 (1986).
36. E. A. Feibush, A. M. Levoy, and R. L. Cook, Synthetic texturing using digital filters. SIGGRAPH'80, 294-301 (1980).
37. A. G. Abram, L. Westover, and T. Whitted, Efficient alias-free rendering using bit-masks and look-up tables. SIGGRAPH'85, 53-59 (1985).
38. Bui-Tuong Phong, Illumination for computer generated pictures. CACM 18(6), 311-317 (1975).
39. J. Blinn, Models of light reflection for computer synthesized pictures. SIGGRAPH'77, 192-198 (1977).
40. K. Torrance and E. Sparrow, Theory for off-specular reflection from roughened surfaces. Journal of the Optical Society of America 57(9), 1105-1114 (1967).
41. R. L. Cook and K. L. Torrance, A reflectance model for computer graphics. ACM Trans. on Graphics 1(1), 7-24 (1982).
42. H. Gouraud, Continuous shading of curved surfaces. IEEE Trans. on Comp. C-20(6), 623-628 (June 1971).
43. G. Bishop and D. M. Weimer, Fast Phong shading. SIGGRAPH'86, 103-105 (1986).
44. Bui-Tuong Phong and F. C. Crow, Improved rendition of polygonal models of curved surfaces. Proceedings of the Second Japan-USA Computer Conference (1975).
45. T. Duff, Smoothly shaded renderings of polyhedral objects on raster displays. SIGGRAPH'79, 270-275 (1979).
46. N. Magnenat-Thalmann and D. Thalmann, Computer Animation: Theory and Practice. Springer-Verlag, Tokyo (1985).
47. P. Bergeron, A general version of Crow's shadow volumes. IEEE Comp. Graphics and Appl., 17-28 (September 1986).
48. L. S. Brotman and N. I. Badler, Generating soft shadows with a depth buffer algorithm. IEEE Comp. Graphics and Appl., 5-15 (October 1984).
49. E. Haines and D. P. Greenberg, The light buffer: A shadow testing accelerator. IEEE Comp. Graphics and Appl. 6(9) (1986).
50. T. Nishita and E. Nakamae, Half-tone representation of 3-D objects illuminated by area sources or polyhedron sources. Proceedings COMPSAC, 237-242 (November 1983).
51. T. Nishita, I. Okamura, and E. Nakamae, Shading models for point and linear sources. ACM Trans. on Graphics 4(2), 124-146 (1985).
52. H. P. Moravec, 3D graphics and the wave theory. SIGGRAPH'81, 289-296 (1981).
53. A. Appel, Some techniques for shading machine rendering of solids. Proceedings of SJCC, Washington, 37-45 (1968).
54. R. Goldstein and R. Nagel, 3-D visual simulation. Simulation, 25-31 (1971).
55. L. R. Speer, T. D. DeRose, and B. A. Barsky, A theoretical and empirical analysis of coherent ray tracing. In Computer Generated Images, N. Magnenat-Thalmann and D. Thalmann (Eds.), Springer-Verlag, Tokyo, 11-25 (1985).
56. T. Whitted, An improved illumination model for shaded display. CACM 23(6), 343-349 (1980).
57. T. Whitted, Processing requirements for hidden line surface elimination and realistic shading. IEEE COMPCON Digest of Papers, 245-250 (1982).
58. H. Weghorst, G. Hooper, and D. Greenberg, Improved computational methods for ray tracing. ACM Trans. on Graphics 3(1), 52-69 (1984).
59. J. T. Kajiya, Tutorial on ray tracing. SIGGRAPH'83 Tutorial Notes (1983).
60. S. Coquillart, An improvement of the ray tracing algorithm. EuroGraphics'85, 77-88 (1985).
61. N. L. Max, Vectorized procedural models for natural terrain: Waves and islands in the sunset. SIGGRAPH'81, 317-324 (1981).
62. D. J. Plunkett and M. J. Bailey, The vectorization of a ray tracing algorithm for improved execution speed. IEEE Comp. Graphics and Appl., 52-60 (August 1985).
63. C. Brown, Special purpose computer hardware for mechanical design systems. NCGA'81, 403-414 (1981).
64. J. G. Cleary, B. Wyvill, G. M. Birtwistle, and T. Vatti, Multiprocessor ray tracing. Graphics Forum 5(1), 3-12 (1983).
65. H. Deguchi, H. Nishimura, H. Yoshimura, T. Kawata, I. Shirakawa, and K. Omura, A parallel processing scheme for three-dimensional image creation. Proceedings of the International Symposium on Circuits and Systems, 1285-1288 (1984).
66. M. Ullner, Parallel Machines for Computer Graphics. Ph.D. Thesis, Cal Tech (1982).
67. P. Walker, The transputer. Byte 10(5), 219-235 (1985).
68. S. Roth, Ray casting for modelling solids. Comp. Graphics and Image Processing 18(2), 109-144 (1982).
69. S. M. Rubin and T. Whitted, A 3-dimensional representation for fast rendering of complex scenes. SIGGRAPH'80, 110-116 (1980).
70. T. L. Kay and J. T. Kajiya, Ray tracing complex scenes. SIGGRAPH'86, 269-278 (1986).
71. J. Goldsmith and J. Salmon, Automatic creation of object hierarchies for ray tracing. IEEE Comp. Graphics and Appl., 14-20 (1987).
72. M. Dippe and J. Swensen, An adaptive subdivision algorithm and parallel architecture for realistic image synthesis. SIGGRAPH'84, 149-158 (1984).
73. A. S. Glassner, Space subdivision for fast ray tracing. IEEE Comp. Graphics and Appl. 4(10), 15-22 (1984).
74. K. Fujimura, H. Toriya, K. Yamaguchi, and T. L. Kunii, An enhanced oct-tree data structure and operations for solid modelling. Technical Report 83-01, Dept. of IS, University of Tokyo (1983).
75. H. Matsumoto and K. Murakami, Ray tracing with oct-tree data structure. Proceedings of the 28th Information Processing Conference, Tokyo, 1535-1536 (1983).
76. M. Kaplan, Space tracing: A constant time ray tracer. State of the Art in Image Synthesis Tutorial, SIGGRAPH'85 (1985).
77. D. Meagher, Geometric modeling using octree encoding. Comp. Graphics and Image Processing 19, 127-147 (1982).
78. A. Fujimoto, T. Tanaka, and K. Iwata, ARTS: Accelerated Ray Tracing System. IEEE Comp. Graphics and Appl., 16-29 (1986).
79. J. Arvo and D. Kirk, Fast ray tracing by ray classification. SIGGRAPH'87, 55-63 (1987).
80. P. S. Heckbert and P. Hanrahan, Beam tracing polygonal objects. SIGGRAPH'84, 119-127 (1984).
81. K. I. Joy and M. N. Bhetanabhota, Ray tracing parametric surface patches utilizing numerical techniques and ray coherence. SIGGRAPH'86, 279-285 (1986).
82. J. Amanatides, Ray tracing with cones. SIGGRAPH'84, 129-135 (1984).
83. R. L. Cook, T. Porter, and L. Carpenter, Distributed ray tracing. SIGGRAPH'84, 137-145 (1984).
84. C. M. Goral, K. E. Torrance, and D. P. Greenberg, Modeling the interaction of light between diffuse surfaces. SIGGRAPH'84, 213-222 (1984).
85. R. Siegel and J. R. Howell, Thermal Radiation Heat Transfer. Hemisphere Publishing Corporation, Washington, D.C. (1981).
86. E. M. Sparrow and R. D. Cess, Radiation Heat Transfer. Hemisphere Publishing Corporation, Washington, D.C. (1978).
87. M. F. Cohen and D. P. Greenberg, The hemi-cube: A radiosity solution for complex environments. SIGGRAPH'85, 31-40 (1985).
88. T. Nishita and E. Nakamae, Continuous tone representation of three-dimensional objects taking account of shadows and interreflection. SIGGRAPH'85, 23-30 (1985).
89. M. F. Cohen, D. P. Greenberg, D. S. Immel, and P. J. Brock, An efficient radiosity approach for realistic image synthesis. IEEE Comp. Graphics and Appl., 26-35 (March 1986).
90. T. Nishita and E. Nakamae, Continuous tone representation of three-dimensional objects illuminated by skylight. SIGGRAPH'86, 125-132 (1986).
91. T. Nishita, Y. Miyawaki, and E. Nakamae, A shading model for atmospheric scattering considering luminous intensity distribution of light sources. SIGGRAPH'87, 303-310 (1987).
92. J. R. Wallace, M. F. Cohen, and D. P. Greenberg, A 2-pass solution to the rendering equation: A synthesis of ray tracing and radiosity methods. SIGGRAPH'87, 311-320 (1987).
93. J. T. Kajiya, The rendering equation. SIGGRAPH'86, 143-150 (1986).
94. J. F. Blinn, Simulation of wrinkled surfaces. SIGGRAPH'78, 286-292 (1978).
95. G. Lorig, Advanced image synthesis--shading. In Advances in Computer Graphics I, G. Enderle, M. Grave, and F. Lillehagen (Eds.), Springer-Verlag, Berlin, 441-456 (1986).
96. J. T. Kajiya, Anisotropic reflection models. SIGGRAPH'85, 15-21 (1985).
97. B. Cabral, N. Max, and R. Springmeyer, Bidirectional reflection functions from surface bump maps. SIGGRAPH'87, 273-281 (1987).
98. E. Catmull and A. R. Smith, 3-D transformations of images in scan line order. SIGGRAPH'80, 279-285 (1980).
99. L. Williams, Pyramidal parametrics. SIGGRAPH'83, 1-11 (1983).
100. F. C. Crow, Summed-area tables for texture mapping. SIGGRAPH'84, 207-212 (1984).
101. D. R. Peachey, Solid texturing of complex surfaces. SIGGRAPH'85, 279-286 (1985).
102. K. Perlin, An image synthesizer. SIGGRAPH'85, 287-296 (1985).
103. J. Bloomenthal, Modeling the mighty maple. SIGGRAPH'85, 305-311 (1985).
104. G. Y. Gardner, Simulation of natural scenes using textured quadric surfaces. SIGGRAPH'84, 11-20 (1984).
105. G. Y. Gardner, Visual simulation of clouds. SIGGRAPH'85, 297-303 (1985).
106. A. Fournier, D. Fussell, and L. Carpenter, Computer rendering of stochastic models. CACM 25(6), 371-384 (1982).
107. J. H. Clark, Hierarchical geometric models for visible surface algorithms. Comm. of the ACM 19(10), 547-554 (1976).
108. B. B. Mandelbrot, The Fractal Geometry of Nature (revised). Freeman, San Francisco (1983).
109. P. E. Oppenheimer, Real time design and animation of fractal plants and trees. SIGGRAPH'86, 55-64 (1986).
110. H. Fuchs, Z. M. Kedem, and S. P. Uselton, Optimal surface reconstruction from planar contours. Comm. of the ACM 20(10), 693-702 (1977).
111. G. S. P. Miller, The definition and rendering of terrain maps. SIGGRAPH'86, 39-48 (1986).
112. S. Demko, L. Hodges, and B. Naylor, Construction of fractal objects with iterated function systems. SIGGRAPH'85, 271-278 (1985).
113. A. Norton, Generation and display of fractals in 3-D. SIGGRAPH'82, 61-67 (1982).
114. G. A. Mastin, P. A. Watterberg, and J. F. Moreda, Fourier synthesis of ocean scenes. IEEE Comp. Graphics and Appl., 16-23 (March 1987).
115. B. J. Schachter, Generation of special effects. In Computer Image Generation, John Wiley, New York, 155-172 (1983).
116. W. T. Reeves, Particle systems--a technique for modeling a class of fuzzy objects. SIGGRAPH'83, 359-376 (1983).
117. W. T. Reeves and R. Blau, Approximate and probabilistic algorithms for shading and rendering structured particle systems. SIGGRAPH'85, 313-322 (1985).
118. A. Fournier and W. T. Reeves, A simple model of ocean waves. SIGGRAPH'86, 75-84 (1986).
119. D. R. Peachey, Modeling waves and surf. SIGGRAPH'86, 65-74 (1986).
120. M. Aono and T. L. Kunii, Botanical tree image generation. IEEE Comp. Graphics and Appl. 4(5), 10-34 (May 1984).
121. A. R. Smith, Plants, fractals, and formal languages. SIGGRAPH'84, 1-10 (1984).
122. A. Lindenmayer, Mathematical models for cellular interactions in development, Parts I and II. Journal of Theoretical Biology 18, 280-315 (1968).
123. A. Gagalowicz and S. de Ma, Model driven synthesis of natural textures for 3-D scenes. Comp. & Graphics 10(2), 161-170 (1986).