
Comput. & Graphics, Vol. 21, No. 3, pp. 329-334, 1997. © 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. 0097-8493/97 $17.00 + 0.00

Pergamon PII: S0097-8493(97)00010-1

Technical Section

GEOMETRIC TRANSFORMATIONS FOR DISPLAYING VIRTUAL OBJECTS ON STEREOSCOPIC DEVICES

MAURO CARROZZO 1,2,† and FRANCESCO LACQUANITI 1,2,3

1 Sez. Fisiologia Umana-CNR, Istituto Scientifico S. Lucia, via Ardeatina 306, 00179 Rome, Italy; e-mail: [email protected]
2 Istituto di Neuroscienze e Bioimmagini, Consiglio Nazionale delle Ricerche, Milan, Italy
3 Istituto di Fisiologia Umana, Università di Cagliari, Cagliari, Italy

Abstract—Undistorted through-the-window (i.e. behind the projection screen) stereograms can be generated by using the stereoscopic window projection technique on CRT-based flat projection screens. Widely used graphics libraries like Silicon Graphics GL afford the direct implementation of this technique. Physiological concerns may demand the display of the virtual object in front of the projection screen. Here we discuss the problems related to displaying front-screen undistorted stereograms using Silicon Graphics GL. We show that the spatial resolution achievable with front-screen stereograms is better than that typical of through-the-window stereograms. Furthermore, we characterize analytically the physical volume in which virtual objects can be displayed. © 1997 Elsevier Science Ltd

1. INTRODUCTION

One of the most attractive features offered by Virtual Reality is the ability to generate 3-D interactive environments [1, 2]. Geometrical transformations to be used in generating virtual environments on stereoscopic devices have been presented by different authors [3-6]. Orthostereoscopy can be achieved on time-multiplexed devices using flat projection screens. In particular, the stereoscopic window projection technique is adequate to generate orthoscopic through-the-window stereograms [3], and widely used software graphics libraries such as Silicon Graphics GL or OpenGL afford a direct way to implement this technique. The CRT display can be modeled as a flat rectangular surface defining a window through which we can view computer-generated object images. The stereogram is generated by projecting the objects in the scene onto the display surface for each eye. The image plane has the same position and orientation for both eye projections. The complete transform that generates the stereogram is organized as a modeling transform, a viewing transform, a projection transform and a screen coordinate transform [3]. The viewing transform is the only one that changes according to the half-image to be displayed.

In this paper we analyze the problems related to displaying undistorted virtual objects that are to be perceived in front of the projection screen (front-screen stereograms). To address this issue we provide the relationship between the graphics modeling coordinates and the world coordinates (used to describe a perceived virtual object). This relationship is obtained by composing the left-eye and right-eye transforms. We show the general form of the transformations implied by displaying stereograms using standard off-axis and on-axis perspective projections as implemented by Silicon Graphics GL. We show that off-axis projections lead to undistorted stereograms behind the projection screen only, while on-axis projections result in stereograms requiring geometrical corrections both in front of and behind the screen. Moreover, the transformation required to present undistorted front-screen stereograms by using the on-axis projection is computationally less expensive than that relative to off-axis based front-screen stereograms.

There are two main reasons why one may want to use the visual workspace in front of the projection screen, in addition to the more frequently utilized workspace behind the screen. First, we show here that the actual spatial resolution achievable in front-screen stereograms is better than that for stereograms presented behind the screen. Second, human factors may dictate the choice of the front-screen workspace. In this context, one should consider the existence of perceptual distortions of the human visual system. Depth and distance perception of 3-D objects tend to be fairly accurate for distances less than about 50 cm from the viewer, but they deteriorate substantially at longer distances: the farther the target, the greater the error [7, 8]. Thus, a visual target presented at 60 cm is typically undershot by as much as 10 cm [8]. Placing the monitor close to the viewer, so as to display the virtual object behind the screen without the described perceptual distortions, would lead to strong, uncomfortable accommodation of the eyes. Moreover, realistic manipulation of a virtual object using hand effector devices (e.g. the various types of hand trackers and data gloves) demands that the object be displayed in front of the screen anyway, so as to match object distance with hand distance relative to the operator. It is known, in fact, that human operators do not cope easily with conflicting sensory information about hand position in space, such as would result when vision of virtual manipulation locates the hand behind the screen while hand proprioception locates it in front of the screen [9].

† Author for correspondence.


2. BASIC WORLD GEOMETRY

Virtual reality on time-multiplexed stereoscopic devices involves the computation of the 2-D half-images of the object, and their alternate display on the monitor [5]. The two half-images are normally separated horizontally (stereo disparity), and are viewed by the user through LCD glasses shuttered in synchrony with the monitor refresh. Figure 1 depicts the geometric relations in the physical environment. The XYZ reference frame defines world coordinates, with the origin in the centre of the screen and the XY plane on its surface. D1 = (X_d1, Y_d, 0) is an arbitrary point belonging to the half-image viewed by the right eye (R) and D2 = (X_d2, Y_d, 0) is the corresponding point for the half-image viewed by the left eye (L). 2E is the interocular distance and M = (X_m, Y_m, Z_m) is the midpoint between the eyes. The eye coordinates are L = (X_m - E, Y_m, Z_m) and R = (X_m + E, Y_m, Z_m), i.e. the eyes are assumed to be at the same height¹. Note that in this setup D1 and D2 have the same Y-coordinate, as do L and R. Consequently, the line l1 through D1 and R and the line l2 through D2 and L always have an intersection point P, whose coordinates (X, Y, Z) are:

X = [(X_d1 - X_d2)(X_m + E) - 2E X_d1] / (X_d1 - X_d2 - 2E)    (1)

Y = [(X_d1 - X_d2) Y_m - 2E Y_d] / (X_d1 - X_d2 - 2E)    (2)

Z = (X_d1 - X_d2) Z_m / (X_d1 - X_d2 - 2E)    (3)

P represents the virtual image in 3-D perceived by the viewer. In Equation (3) we impose X_d1 - X_d2 - 2E < 0 to prevent P from falling behind the viewer (Z < Z_m). If X_d1 < X_d2 (negative horizontal parallax), then Z > 0 and the virtual image is perceived in front of the screen. If instead X_d1 > X_d2 (positive horizontal parallax), then Z < 0 and the virtual image is perceived behind the screen.

Fig. 1. Geometrical sketch of the virtual setup. P represents the location of the virtual point perceived when looking at the stereo pair composed by D1 and D2. The left and right eyes are located at L and R, respectively.

¹ Horizontal parallax is the main factor responsible for depth perception [10], if the eyes are at the same height. If the head is tilted away from the vertical, a suitable correction should be introduced in order to generate the parallax in the direction given by the position of the two eyes.
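As a numerical illustration of Equations (1)-(3), the following short Python sketch (ours, not part of the original paper; the function name and the sample values are arbitrary, lengths in millimetres) computes the perceived point P from a stereo pair and the eye midpoint.

```python
# Sketch of Equations (1)-(3): perceived point P from a stereo pair.
# Illustrative Python, not code from the paper; units are millimetres.

def perceived_point(Xd1, Xd2, Yd, Xm, Ym, Zm, E):
    """D1 = (Xd1, Yd, 0) is viewed by the right eye, D2 = (Xd2, Yd, 0) by the
    left eye; M = (Xm, Ym, Zm) is the midpoint between the eyes, 2E the
    interocular distance.  Requires Xd1 - Xd2 - 2E < 0."""
    den = Xd1 - Xd2 - 2.0 * E
    X = ((Xd1 - Xd2) * (Xm + E) - 2.0 * E * Xd1) / den  # Eq. (1)
    Y = ((Xd1 - Xd2) * Ym - 2.0 * E * Yd) / den         # Eq. (2)
    Z = (Xd1 - Xd2) * Zm / den                          # Eq. (3)
    return X, Y, Z

# Negative horizontal parallax (Xd1 < Xd2): the point is perceived in front of
# the screen (Z > 0); positive parallax would give Z < 0 (behind the screen).
print(perceived_point(Xd1=-10.0, Xd2=10.0, Yd=0.0, Xm=0.0, Ym=0.0, Zm=500.0, E=30.0))
```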

3. IMPLEMENTING THE WINDOW PROJECTION TECHNIQUE USING GL "WINDOW"

The two half-images are usually computed using standard perspective projection in conjunction with translations and pans [5]. The window projection (Fig. 2) affords a direct way to implement this technique. In this section we show the relation between world coordinates and graphics coordinates implied by this approach. In the following, uppercase and lowercase symbols denote world and graphics variables, respectively. We first derive a general transformation between graphics and world coordinates without setting any specific value for the graphics parameters ε and a: ε accounts for the horizontal off-centering of the window perspective projection with respect to the actual viewport on the display surface, and a represents a translation of the graphics coordinate reference frame in the x (i.e. horizontal) direction. We will then show how to set the values of ε, a, X_m and Y_m to obtain orthostereoscopy. We consider orthostereoscopy achieved when the transformation between graphics and world coordinates does not change distances and angles.

Fig. 2. Window projection to define an asymmetric viewing volume. The center of projection is located at the origin o. The viewing frustum is defined by the following six parameters: l, r (left and right clipping planes), t, b (top and bottom clipping planes), and n, f (near and far clipping planes). The u and v coordinates are functions of the other fixed parameters.

Let us assume 0 < n < f and consider a graphics point p = (x, y, z) with z < -n. The right and left half-images map p onto the display surface at

X_d1 = -(n/z)(x - a) - ε,   X_d2 = -(n/z)(x + a) + ε    (4)

and Y_d = -(n/z) y. Substituting these values in Equations (1)-(3) we obtain:

X = [(na - zε) X_m + E n x] / [na - (E + ε) z]    (5)

Y = [(na - zε) Y_m + E n y] / [na - (E + ε) z]    (6)

Z = (na - zε) Z_m / [na - (E + ε) z]    (7)

Equations (5)-(7) define the general transformation between the graphics coordinates (x, y, z), in which graphical objects are modeled, and the world coordinates (X, Y, Z), in which we can compute where the corresponding virtual objects are perceived. This is the transformation obtained when using the GL "window" perspective projection, and it applies to both front-screen and through-the-window (i.e. behind the screen) stereograms. Furthermore, there are no constraints on head position. The application of Equations (5)-(7) is computationally expensive; however, they can be simplified greatly by imposing a few restrictions on the head position and on the space where virtual objects can be displayed. In fact, in order to eliminate the nonlinear dependence of X, Y and Z on z, we must set ε = -E: this implies Z < 0 (because z < -n), i.e. the virtual point will be perceived beyond the screen. By setting X_m = 0 and Y_m = 0 (i.e. the head is constrained to be centered with respect to the screen), we get rid of the linear dependence of X and Y on z. Finally, to further simplify the equations, we can set a = E and n = Z_m, obtaining

X = x,   Y = y   and   Z = Z_m + z,    (8)

thus the virtual object is perceived undistorted, as it is modeled in graphics coordinates. A basic difference exists with respect to previous approaches to the problem. Southard [3] proposed the general form of two transformations between graphics and screen coordinates, N_right (to display the right half-image) and N_left (to generate the left half-image). In our case, the effect of applying such transforms is described by Equation (4)*: by substituting them into Equations (1)-(3) we composed the two transforms to obtain a single transformation describing the relationship between graphics and world (or visual) coordinates. The meaning of Equation (8) is that it is possible to identify the graphics coordinate system with the visual one (up to a translation) just beyond the screen, and only when the head is centered with respect to the projection screen. If these conditions are not respected, then assuming that the two coordinate systems coincide is no longer legitimate. Moreover, the shape of the geometrical distortions entailed by assuming that the two systems coincide is given by Equations (5)-(7).

* In our case the application of N_right to the graphics point (x, y, z) produces the point D1 = (X_d1, Y_d, 0) on the screen. Analogously, applying N_left to (x, y, z) results in D2 = (X_d2, Y_d, 0).
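The general transformation of Equations (5)-(7) and the orthostereoscopic special case of Equation (8) can be checked numerically. The sketch below is ours (Python; the function name and the sample values are not from the paper) and simply evaluates the formulas under the stated settings ε = -E, X_m = Y_m = 0, a = E, n = Z_m.

```python
# Sketch of Equations (5)-(7): where a graphics point (x, y, z), displayed with
# the GL "window" stereo technique, is perceived in world coordinates.
# Illustrative Python, not code from the paper.

def window_graphics_to_world(x, y, z, n, a, eps, Xm, Ym, Zm, E):
    den = n * a - (E + eps) * z
    X = ((n * a - z * eps) * Xm + E * n * x) / den   # Eq. (5)
    Y = ((n * a - z * eps) * Ym + E * n * y) / den   # Eq. (6)
    Z = (n * a - z * eps) * Zm / den                 # Eq. (7)
    return X, Y, Z

# Orthostereoscopic settings: eps = -E, Xm = Ym = 0, a = E, n = Zm.
# Equation (8) predicts X = x, Y = y, Z = Zm + z (behind the screen, z < -n).
E, Zm = 30.0, 500.0
x, y, z = 40.0, -25.0, -700.0
X, Y, Z = window_graphics_to_world(x, y, z, n=Zm, a=E, eps=-E, Xm=0.0, Ym=0.0, Zm=Zm, E=E)
assert abs(X - x) < 1e-9 and abs(Y - y) < 1e-9 and abs(Z - (Zm + z)) < 1e-9
```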

4. DISPLAYING VIRTUAL OBJECTS USING GL "PERSPECTIVE"

In this section we derive the general form (in the sense of the previous section) of the transformation between graphics and world coordinates related to the use of an on-axis perspective projection. We show that, in contrast to what happens when using an off-axis perspective, in this case such a transformation can never be reduced to a translation. However, from a computational standpoint, the equations describing the geometrical corrections needed to display undistorted front-screen stereograms are simpler than those obtained with the off-axis projection.

GL "perspective" implements a standard on-axis perspective projection. The main difference with respect to the GL "window" perspective projection is that the viewing volume must be symmetric with respect to the viewport. The xy plane is the plane Π of projection and c = (0, 0, d), d > 0, is the center of projection. The projection of a point p = (x, y, z) on Π is π_p = (x_p, y_p, 0), with

x_p = d x / (d - z),   y_p = d y / (d - z).

If the display area has height 2H and 2α = 2 tan⁻¹(H/d) is the aperture angle of the viewing frustum in the vertical direction, then the factor that scales to world coordinates is unitary, and we assume that this holds in both the X and Y directions. Thus, the mapping of the point p in graphics coordinates is (x_p, y_p, 0). We want to find the intersection point P = (X, Y, Z) given by the stereo pair composed by s1 = (x - a, y, z) and s2 = (x + a, y, z), whose projections on the screen are

x_d1 = [d/(d - z)](x - a),   x_d2 = [d/(d - z)](x + a),   y_d = [d/(d - z)] y;

note that a > 0 implies X_d1 < X_d2, so we have negative parallax (Z > 0), while a < 0 gives X_d1 > X_d2 and the parallax is positive (Z < 0).
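The on-axis projection of the shifted pair s1, s2 and the resulting sign of the parallax can be illustrated as follows (our own Python sketch with arbitrary sample values; it is not code from the paper).

```python
# On-axis GL "perspective" projection of the stereo pair s1 = (x - a, y, z) and
# s2 = (x + a, y, z); c = (0, 0, d) is the centre of projection and the screen
# is the plane z = 0.  Illustrative only.

def project_stereo_pair(x, y, z, a, d):
    s = d / (d - z)                         # perspective factor (d - z > 0)
    return s * (x - a), s * (x + a), s * y  # xd1, xd2, yd

xd1, xd2, yd = project_stereo_pair(x=0.0, y=0.0, z=-200.0, a=30.0, d=500.0)
assert xd1 < xd2          # a > 0 gives negative horizontal parallax

# With the head centred at M = (0, 0, Zm) and a = E, Equation (3) places the
# perceived point in front of the screen (here Z is about 208 mm):
E, Zm = 30.0, 500.0
D = xd1 - xd2
Z = Zm * D / (D - 2.0 * E)
assert Z > 0
```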


By substituting X_d1 = x_d1, X_d2 = x_d2 and Y_d = y_d in Equations (1)-(3), we can express the graphics coordinates as a function of the world coordinates of the perceived point P. Note that there is no way to eliminate the nonlinear dependence of x, y and z on Z by an appropriate choice of the graphics parameters. The only constraint on the graphics parameter a needed to display front-screen stereograms (Z > 0) is a > 0; thus we can always set a = E, so we get:

x = (Z_m / Z) X - X_m    (9)

y = (Z_m / Z) Y - Y_m    (10)

z = -d Z_m / Z + 2d    (11)

Equations (9)-(11) define the relation between world and graphics coordinates resulting when generating front-screen stereograms using the GL "perspective" projection. Note that the above transformation does not pose restrictions on head position and is equivalent to the composition of three primitive operators: first a perspective represented by the following matrix

| 1  0  0  0 |
| 0  1  0  0 |
| 0  0  0 -d |
| 0  0  1  0 |

then a uniform scaling by Z_m, and finally a translation by (-X_m, -Y_m, 2d). Note that Z = 0 represents a singular plane of the transformation.
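Equations (9)-(11) and their decomposition into perspective, uniform scaling and translation can be sketched as follows (illustrative Python, ours; the homogeneous-coordinate convention and the sample values are assumptions, not taken from the paper).

```python
# Sketch of Equations (9)-(11): graphics coordinates at which a point must be
# modeled so that it is perceived at the world position (X, Y, Z), with Z != 0.
# Illustrative Python, not code from the paper.

def world_to_graphics(X, Y, Z, Xm, Ym, Zm, d):
    x = Zm * X / Z - Xm             # Eq. (9)
    y = Zm * Y / Z - Ym             # Eq. (10)
    z = -d * Zm / Z + 2.0 * d       # Eq. (11)
    return x, y, z

# The same mapping as perspective matrix (column-vector convention assumed),
# uniform scaling by Zm and translation by (-Xm, -Ym, 2d):
def world_to_graphics_decomposed(X, Y, Z, Xm, Ym, Zm, d):
    px, py, pz = X / Z, Y / Z, -d / Z        # (X, Y, Z, 1) -> (X, Y, -d, Z), divide by w = Z
    sx, sy, sz = Zm * px, Zm * py, Zm * pz   # uniform scaling by Zm
    return sx - Xm, sy - Ym, sz + 2.0 * d    # translation by (-Xm, -Ym, 2d)

p1 = world_to_graphics(60.0, -40.0, 150.0, Xm=10.0, Ym=5.0, Zm=500.0, d=500.0)
p2 = world_to_graphics_decomposed(60.0, -40.0, 150.0, Xm=10.0, Ym=5.0, Zm=500.0, d=500.0)
assert all(abs(u - v) < 1e-9 for u, v in zip(p1, p2))
```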

5. USING THE TRANSFORMATIONS TO DISPLAY FRONT-SCREEN STEREOGRAMS

For any given object to be displayed, a set of points P_i = (X_i, Y_i, Z_i), i = 1, ..., N, in world space is chosen, corresponding to the vertices of the polygons that define the virtual object's surface. Once E and M = (X_m, Y_m, Z_m) are established, we can proceed to compute the two half-images. In the first step we perform the correction of the geometrical distortions: by using Equations (9)-(11), we transform the set of world points P_i = (X_i, Y_i, Z_i) into the graphics points p_i = (x_i, y_i, z_i), i = 1, ..., N, hence obtaining, from the desired virtual object, the corresponding graphics object to be rendered.
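The first step of this procedure can be verified numerically: correct the desired world vertices with Equations (9)-(11), project each corrected vertex as the stereo pair s1, s2 with a = E, and check with Equations (1)-(3) that the perceived point coincides with the desired one. The sketch below is ours (Python; head position and vertex values are arbitrary) and is not part of the original implementation.

```python
# Round-trip check of the front-screen display procedure (illustrative Python,
# not from the paper): world vertex -> corrected graphics vertex (Eqs 9-11)
# -> stereo pair projections (a = E) -> perceived point (Eqs 1-3).

E, Xm, Ym, Zm, d = 30.0, 15.0, -10.0, 500.0, 500.0

def correct(X, Y, Z):                       # Eqs (9)-(11)
    return Zm * X / Z - Xm, Zm * Y / Z - Ym, -d * Zm / Z + 2.0 * d

def project(x, y, z):                       # half-image points, a = E
    s = d / (d - z)
    return s * (x - E), s * (x + E), s * y  # Xd1, Xd2, Yd

def perceive(Xd1, Xd2, Yd):                 # Eqs (1)-(3)
    den = Xd1 - Xd2 - 2.0 * E
    return (((Xd1 - Xd2) * (Xm + E) - 2.0 * E * Xd1) / den,
            ((Xd1 - Xd2) * Ym - 2.0 * E * Yd) / den,
            (Xd1 - Xd2) * Zm / den)

world_vertices = [(20.0, 35.0, 80.0), (-40.0, 10.0, 150.0), (0.0, 0.0, 250.0)]
for P in world_vertices:
    Q = perceive(*project(*correct(*P)))
    assert all(abs(p - q) < 1e-6 for p, q in zip(P, Q))
```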

6. FRONT-SCREEN DISPLAYABLE VOLUME

In order to compute the boundaries, in world coordinates, of the volume where virtual objects can be displayed, we assume that (in graphics coordinates) the near clipping plane is located at distance n from the centre of projection c, while the far clipping plane is located at distance f, with f > n. This translates into the following inequality for z:

d - f ≤ z ≤ d - n;

then from Equation (11) we obtain

d Z_m / (d + f) ≤ Z ≤ d Z_m / (d + n),

and the inferior limit is always positive, that is, the whole displayable volume is in front of the screen. The maximum displayable range of x values is a function of z:

-Ar (d - z) tan(α) < x < Ar (d - z) tan(α),

where Ar = W/H is the aspect ratio of the window on the screen, and a similar relation holds for y:

-(d - z) tan(α) < y < (d - z) tan(α).

Using Equations (9) and (11) we obtain

-W + Z (W + X_m) / Z_m < X < W - Z (W - X_m) / Z_m.

In a similar way we compute the inequality for Y:

-H + Z (Y_m + H) / Z_m < Y < H - Z (H - Y_m) / Z_m.

The size of the displayable volume decreases linearly in both the X and Y directions when Z increases (i.e. moving toward the observer), by

R_x = -2W / Z_m   and   R_y = -2H / Z_m

for the X and Y directions, respectively (when X_m = 0 and Y_m = 0).
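These bounds are straightforward to evaluate; the sketch below (ours, in Python, with illustrative clipping distances that are not the ones used in the Conclusions) returns the admissible Z range and, for a given Z, the admissible X and Y intervals.

```python
# Sketch of the front-screen displayable volume.  Illustrative Python with
# assumed parameter values; n and f are measured from the centre of projection
# c = (0, 0, d), W and H are the screen half-width and half-height.

def z_range(d, Zm, n, f):
    # d*Zm/(d + f) <= Z <= d*Zm/(d + n): the lower limit is always positive,
    # so the whole displayable volume lies in front of the screen.
    return d * Zm / (d + f), d * Zm / (d + n)

def xy_range(Z, Zm, W, H, Xm=0.0, Ym=0.0):
    x_lo, x_hi = -W + Z * (W + Xm) / Zm, W - Z * (W - Xm) / Zm
    y_lo, y_hi = -H + Z * (Ym + H) / Zm, H - Z * (H - Ym) / Zm
    return (x_lo, x_hi), (y_lo, y_hi)

print(z_range(d=500.0, Zm=500.0, n=350.0, f=3000.0))   # about (71.4, 294.1) mm
print(xy_range(Z=150.0, Zm=500.0, W=175.0, H=140.0))   # X and Y ranges shrink as Z grows
```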

7. SPATIAL RESOLUTION

Stereoscopic voxels [11] define the uncertainty in locating a virtual point in space when displaying stereograms. Pixel pitch and shape, as well as the eye positions, determine the shape and size of the stereoscopic voxels. An analytical recursive characterization of the stereoscopic voxel sectional shape in the plane of view has been given [11]. However, by setting in Equations (1)-(3) appropriate values of X_d1, X_d2 and Y_d (derived from the pixel geometry and its location on the screen), it is possible to derive a non-recursive analytical description of stereoscopic voxels.

A cue about the stereoscopic voxel size as a function of position is given by the changes of X, Y and Z caused by a one-pixel shift of the image on the screen (see Fig. 3).

Fig. 3. Spatial resolution. When the stereo pair D1 D2 translates vertically by δ_y, the perceived point P shifts by ΔY. Shrinking the distance D between D1 and D2 by δ_x shifts P toward the screen by ΔZ.

If N_x is the horizontal resolution of the screen expressed in pixels, we define δ_x = 2W/N_x, and similarly we define δ_y = 2H/N_y. We assume that the two half-images are separated by a distance (parallax) D = X_d1 - X_d2 on the screen. Note that D < 0 results in Z > 0 and D > 0 results in Z < 0. If both half-images translate by δ_x in the X direction, we have a corresponding change in X:

ΔX = [2E / (2E - D)] δ_x,

and in a similar way we conclude that a translation by δ_y of both half-images in the Y direction produces a change in Y:

ΔY = [2E / (2E - D)] δ_y.

If D varies by δ_x, then we have a change in Z:

ΔZ = -2E δ_x Z_m / [(D + δ_x - 2E)(D - 2E)].

Since Z increases for decreasing values of D (Equation (3)), we can conclude that the discretization error due to the monitor's resolution decreases with increasing values of Z. Thus the resolution achievable with front-screen stereograms is better than that obtained by displaying stereograms behind the projection screen.
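The three increments can be computed directly from the on-screen parallax D, which in turn follows from Equation (3) for a desired depth Z. The following sketch is ours (Python; function names and sample depths are arbitrary) and is only meant to make the dependence on D explicit.

```python
# Sketch of the spatial-resolution estimates: changes of X, Y, Z for a one-pixel
# shift, as a function of the on-screen parallax D = Xd1 - Xd2.  Illustrative
# Python; delta_x and delta_y are the pixel pitches 2W/Nx and 2H/Ny.

def parallax_for_depth(Z, Zm, E):
    # Inverse of Eq. (3): D < 0 for front-screen points (0 < Z < Zm).
    return -2.0 * E * Z / (Zm - Z)

def resolution(D, Zm, E, delta_x, delta_y):
    dX = 2.0 * E * delta_x / (2.0 * E - D)
    dY = 2.0 * E * delta_y / (2.0 * E - D)
    dZ = 2.0 * E * delta_x * Zm / ((2.0 * E - D) * (2.0 * E - D - delta_x))  # |dZ|
    return dX, dY, dZ

# The discretization error shrinks as Z grows, i.e. front-screen stereograms
# are finer than behind-screen ones:
for Z in (100.0, 250.0):
    print(Z, resolution(parallax_for_depth(Z, 500.0, 30.0), 500.0, 30.0, 0.27, 0.57))
```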

8. CONCLUSIONS

The algorithms described have been tested and validated on a Silicon Graphics Crimson workstation equipped with Reality Engine graphics and StereoView. A 1280 × 1024 pixel, 19 in. monitor (H = 140 mm and W = 175 mm, Ar = 1.25) has been used. In the default stereo mode the actual vertical resolution is 492 pixels, hence δ_x = 0.27 mm and δ_y = 0.57 mm. When the viewer sits in front of the screen and Z_m = 500 mm, E = 30 mm (then α = 15.6°, d = 500), limiting the workspace depth range from 50 mm in front of the screen up to 200 mm from the viewer's eyes (i.e. 50 mm < Z < 300 mm) corresponds to n = 333 and f = 4000. When Z = 50 mm (i.e. 450 mm from the viewer's eyes, -D = 6.7 mm) the spatial resolution is about 0.24 mm for X, 0.51 mm for Y and 1.8 mm for Z. When Z = 300 mm (i.e. 200 mm from the viewer's eyes, -D = 90 mm) the spatial resolution is about 0.11 mm for X, 0.23 mm for Y and 0.36 mm for Z.

If color information is not relevant, the anaglyphic technique can be used. In this case the full screen resolution can be exploited using two different subpixels (e.g. red and blue subpixels): the GL "RGBwritemask" routine affords a direct implementation of this technique.

In each experiment, E is measured by means of an interpupillometer and the location of the nasion (M in Fig. 1) is measured by means of a head tracker. Alternatively, a head-holder can be used to restrain head position. The algorithm has been applied successfully to the study of human psychophysics of visuomanual coordination [12]: all subjects reported a comfortable view of the front-screen stereograms presented during the long-lasting experimental sessions.
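For reference, the resolution figures quoted above can be reproduced from the formulas of Section 7 alone; the short check below (ours, in Python, using the rounded pixel pitches quoted in the text) is not part of the original validation.

```python
# Check of the resolution figures quoted above (all lengths in millimetres).
W, H, Nx, Ny = 175.0, 140.0, 1280, 492
dx, dy = round(2.0 * W / Nx, 2), round(2.0 * H / Ny, 2)   # 0.27 and 0.57, as in the text
E, Zm = 30.0, 500.0

for Z in (50.0, 300.0):
    D = -2.0 * E * Z / (Zm - Z)                    # -D = 6.7 mm and 90 mm
    dX = 2.0 * E * dx / (2.0 * E - D)              # about 0.24 and 0.11
    dY = 2.0 * E * dy / (2.0 * E - D)              # about 0.51 and 0.23
    dZ = 2.0 * E * dx * Zm / ((2.0 * E - D) * (2.0 * E - D - dx))   # about 1.8 and 0.36
    print(Z, round(-D, 1), round(dX, 2), round(dY, 2), round(dZ, 2))
```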

REFERENCES

1. Foley, J. D., Interfacce per l'elaborazione avanzata. Le Scienze (Italian translation of Scientific American), 1987, 39, 78-89.
2. Eberts, R. E., User Interface Design. Prentice Hall, Englewood Cliffs, New Jersey, 1994.
3. Southard, D., Transformations for stereoscopic visual simulation. Computers & Graphics, 1992, 16, 401-410.
4. Southard, D., Viewing model for virtual environment displays. Journal of Electronic Imaging, 1995, 4, 413-420.
5. Hodges, L. F., Time-multiplexed stereoscopic computer graphics. IEEE Computer Graphics & Applications, 1992, 12, 20-30.
6. Robinett, W. and Holloway, R., The visual display transformation for virtual reality. Presence, 1995, 4, 1-23.
7. Collewijn, H. and Erkelens, C. J., Binocular eye movements and the perception of depth. In Eye Movements and their Role in Visual and Cognitive Processes, ed. E. Kowler. Elsevier, Amsterdam, 1990, pp. 213-261.
8. Soechting, J. F. and Flanders, M., Sensorimotor representations for pointing to targets in three-dimensional space. Journal of Neurophysiology, 1989, 62, 582-594.
9. Rossetti, Y., Desmurget, M. and Prablanc, C., Vectorial coding of movement: vision, proprioception or both? Journal of Neurophysiology, 1995, 74, 457-463.
10. Poggio, G. F. and Poggio, T., The analysis of stereopsis. Annual Review of Neuroscience, 1984, 7, 379-412.
11. Hodges, L. F. and Thorpe Davis, E., Geometric considerations for stereoscopic virtual environments. Presence, 1993, 2, 34-43.
12. Carrozzo, M. and Lacquaniti, F., A hybrid frame of reference for visuo-manual coordination. NeuroReport, 1994, 5, 453-456.