COMPUTER GRAPHICS AND IMAGE PROCESSING 5, 41-51 (1976)

Recording Motion Picture Sound Tracks Using a Computer Output Film Recorder

ROBERT M. LEE

Computer Graphics Group, Lawrence Livermore Laboratory, Livermore, California 94550

Communicated by H. Freeman
Received February 24, 1975

The use of computer graphics equipment to draw sound tracks directly on film is discussed. This method of recording enables the computer user to add sound to his movies using equipment and techniques with which he is familiar. It introduces problems from two sources, the incremental movement of the film and the detailed control over the recorded wave shapes that must be exercised by the programmer. Examples of some of these problems, with suggested solutions, are given.
Copyright © 1976 by Academic Press, Inc. All rights of reproduction in any form reserved.

1. INTRODUCTION

We have made several short motion pictures in which both the picture and the sound track were drawn using a computer output film recorder. This method of recording differs from the ordinary in that the sound is recorded while the film is standing still, a frame at a time. It gives the computer programmer complete control over what is recorded. The great advantage is that this method of recording sound enables computer users to generate sound using only the computer facilities they are already using to make motion pictures of their data.

It is surprising that sound is so infrequently used with computer graphic displays. We are experts at aural pattern recognition. It is not unusual for a musician to identify a composer, a specific composition, and/or a performer after hearing only a few bars of music taken from some random starting point in a composition. We recognize people's voices after hearing only a few words spoken over the telephone; we can distinguish and understand one voice in a room full of conversing people. There must be ways of using these abilities in analyzing data presented by a computer.

A simple use of sound would be to use a tone to mark the time of an important event. In a 15-sec movie of changing data there may be 1 sec of particular interest. The time at which that event occurs can be readily highlighted for the viewer by sounding a tone during, or just preceding, the critical time. Another use of sound would be to make the frequency of a tone a function of the velocity of an object. Accelerations of the object, which are generally poorly perceived visually, will be easily detected as changes in the frequency of the tone. Perhaps there are situations in which we can "connect" tones to several parameters and use the ear for pattern recognition.

To explore these ideas we have developed FORTRAN subroutines to enable our users to record sound on their movies. The problems discussed in the following paragraphs were encountered in developing these routines.

A sound track drawn using a computer output film recorder is shown in Fig. 1. A similar sound track, enlarged and drawn with a small spot so that the lines can be clearly seen, is shown in Fig. 2.

FIG. 1. Sound track drawn using a computer output film recorder.

FIG. 2. Sound track enlarged and drawn with a small spot so that the lines can be clearly seen.

2. VARIABLE AREA VS VARIABLE DENSITY

In motion picture recording two kinds of sound tracks are used, variable area and variable density. The sound track shown in Fig. 1 is variable area; Fig. 3 shows a variable density sound track [4].
FIG. 3. Variable density sound track.
Variable density recording has the advantages that it is more tolerant to faulty edge guiding and enters overload more gracefully than variable area. Variable area has the advantages that it gives sharper reproduction of transients and is less sensitive to variations in exposure and film processing. We have chosen to use variable area recording primarily because of its relative insensitivity to exposure and processing variations.

3. TYPES OF VARIABLE AREA SOUND TRACKS

There are three types of variable area sound tracks that appear of interest for our purposes: unilateral, bilateral, and double bilateral. These are shown in Figs. 1, 4, and 5 (the sound track shown in Fig. 1 is a bilateral, variable area, sound track). From our point of view these types of sound tracks vary in the expense of producing them and in their tolerance to edge guidance errors.

The cost of producing unilateral and single bilateral sound tracks is the same. Unless special equipment is built, the double bilateral track requires the generation and plotting of twice as many data as are required for the unilateral or single bilateral sound tracks. This is because two horizontal lines are required to draw the two tracks of the double bilateral sound track for each line required to draw either of the other two types of sound track. This difference will probably result in only a 10 or 15% increase in computation time but it will probably more than double the plot time required. If, as is often the case, the plot data are transmitted from computer to plotter via magnetic tape, then twice as much tape will be required for the double bilateral tracks.

FIG. 4. Unilateral variable area sound track.

FIG. 5. Double bilateral variable area sound track.

The double bilateral sound tracks are the least sensitive to inaccurate edge guidance, the unilateral sound tracks the most sensitive. Edge guidance accuracy is most likely to be a problem with inexpensive 8 or 16 mm home movie equipment. Most professional movies are recorded with two or more sound tracks but there is little extra cost involved, when conventional recording techniques are used, and the multiple track recordings have the advantage that they are less sensitive to uneven recording slit illumination and to uneven playback photocell sensitivity than single track recordings.

On the basis of these considerations we have settled on using bilateral sound tracks, as shown in Fig. 1. They cost less than double bilateral recordings and they are less sensitive to edge guidance inaccuracy than unilateral recordings.

4. VERTICAL VS HORIZONTAL PLOTTING
Since the sound tracks are generated a frame at a time it is possible to draw them with horizontal lines, lines drawn across the width of the sound track, or with vertical lines, lines running along the length of the sound track. Figures 1, 2, 4, and 5 were drawn with horizontal lines. Figure 6 shows an enlarged sound track drawn with vertical lines.

FIG. 6. Sound track drawn with vertical lines. Enlarged and drawn with a small spot, so the lines can be clearly seen.

Most of our sound track plotting is done using horizontal lines, primarily because the algorithms for generating them are a little simpler. This is worth considering because the algorithms are still subject to frequent change. The horizontal lines also require 5 or 10% less computer time, compared to vertical lines, in our experience.

The vertical lines have two advantages: they typically require about 10% less plot time on our plotter, and they sound better. Plotting using vertical lines tends to be faster than using horizontal lines, because the center of the sound track can often be drawn with lines running the full length of the frame, but the actual plot time for a given frame is highly content dependent. High-frequency, high-amplitude sounds will require more plot time using vertical lines. Low-frequency, low-amplitude sounds will require more plot time using horizontal lines.

5. JOINING FRAMES

Using our equipment, approximately 808 horizontal lines are required per frame to record the sound track. A gap of the width of one line between frames is clearly heard as a 24 Hz buzz. Overlapping adjacent frames will, unless the lines are bright enough to completely expose the emulsion, produce a bright line at the frame junctions which will also be heard, but not as loudly. Overlapping more than a very few lines produces a garbled effect which is strange to the ear and annoying.

Joining adjacent frames with an error less than the width of one line requires an accuracy better than one part in 800. To reduce the effect of slight errors in joining we have designed our program to produce a staggered junction. An example is shown in Fig. 7. In this example we have purposely used fewer lines than would be necessary to properly close the gap.

6. JOINING SINE WAVES ON SUCCESSIVE FRAMES; FREQUENCY ACCURACY
A program written to generate sine waves, or other wave shapes, must contain provisions for continuing the functions when crossing frame junctions. An example of a frame ending with the recorded sine wave at maximum amplitude joined to a frame in which the sine wave is started at zero amplitude is shown in Fig. 8. Such a discontinuity will be heard as an audible click or thump when the sound track is played. If it occurs at every frame junction it will produce a loud, 24 Hz, buzz similar to that produced by very bad frame joining.

FIG. 7. Staggered frame junction. Enlarged and drawn with fewer lines than would be required to properly close the gap.

FIG. 8. Sine wave discontinuity at the junction of two frames.

One way to record, for example, a sine wave is to use a sine function, advancing the angle slightly as the length of each horizontal line of the sound track is calculated and drawn. To ensure a continuous function it is only necessary to avoid resetting the value of the angle when one frame is completed and the next begun. For most applications, however, this method will use more computer time than is necessary.

Substantial savings in computer time can be made by computing and storing a set of values of the lengths of the sound track lines for one cycle of the frequency that is to be recorded. As the horizontal lines of the sound track are drawn, a pointer is moved along the list of values to select the next value. This pointer is reset to the beginning of the list every time the end of the cycle is reached, independently of where the frame junctions occur.

If a complex wave, made up of several components, is to be recorded, an extension of the same method can be used. A set of values of the lengths of the sound track lines is calculated and stored for one cycle of each component of the complex wave. Separate lists are maintained for each component, each with its own pointer. As the horizontal lines of the sound track are drawn the pointers are incremented and as each pointer reaches the end of its list it is reset to the beginning of its list.

Storing one cycle of the frequency that is to be recorded saves computer time but it limits the choices of frequencies to those that can be drawn with an integral number of equally spaced horizontal lines. The frequency of a sine wave can be computed from

Freq = (Frames per second) × (Lines per frame) / (Lines per cycle).

Using a frame rate of 24 frames per second, 808 lines per frame, and 20 lines per cycle gives

Freq = 24 × 808 / 20 = 969.6 Hz.

Changing to 21 lines per cycle gives

Freq = 24 × 808 / 21 = 923.43 Hz.

If a frequency of 945 Hz were required, the closest approach to it would be 923.43 Hz, an error of more than 2%. At these frequencies a person can detect frequency differences of less than 0.5%. The maximum error possible from storing one cycle of the frequency that is to be recorded is a direct function of the frequency and an inverse function of the number of lines used per frame.

An obvious way of reducing frequency error from this source is to store more than one cycle of the frequency to be recorded. The use of a list holding a maximum of 1000 values, in which are stored as many cycles as possible of the frequency to be recorded, will reduce the maximum frequency error from this source to be not more than 0.1%. This is adequate for music and probably adequate for most other applications.

7. JOINING SINE WAVES OF DIFFERENT FREQUENCIES

We have made the simplifying, but not necessary, restriction that frequencies can change only at the junction of two frames, not within a frame. This is convenient for generating sounds that are synchronized with the picture. But a difficulty arises when it is desired to generate a sound of changing frequency. If the new frequency is always generated, for example, starting at zero amplitude there may be a noticeable click or thump caused by the sine wave on the preceding frame ending at something other than zero amplitude. This is similar to the problem of joining successive frames of a single frequency discussed in the preceding section and illustrated in Fig. 8.

To minimize this source of noise we start a sine wave at zero amplitude if it is the beginning of the sound. When the frequency of a continuing sound is to be changed, the generating program notes the position of the pointer that is cycling through the list of values of the lengths of the sound track lines and calculates its relative angular position, that is, calculates where it is in the sine wave cycle.
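The list arithmetic of Section 6 and the pointer bookkeeping just described can be sketched together in a few lines of Python. This is an illustrative sketch only, not the FORTRAN routines used for the films. The function names (choose_list, relative_angle, pointer_for_angle) are hypothetical, and the search for the best-matching list length is an assumption standing in for the rule of storing as many cycles as possible. A list of `length` values holding `cycles` whole cycles records a frequency of (lines per second) × cycles / length, and the pointer's relative angular position is the fractional part of pointer × cycles / length.

```python
FRAMES_PER_SECOND = 24
LINES_PER_FRAME = 808
LINE_RATE = FRAMES_PER_SECOND * LINES_PER_FRAME  # 19392 lines per second


def choose_list(target_hz, max_values=1000):
    """Return (cycles, length, actual_hz) for the list closest to target_hz.

    Assumption: every list length up to max_values is tried and the best
    match is kept; the paper only says that as many cycles as possible are
    stored in a list of at most 1000 values.
    """
    best = None
    for length in range(2, max_values + 1):
        # Whole number of cycles that fits this length most closely.
        cycles = max(1, round(length * target_hz / LINE_RATE))
        actual = LINE_RATE * cycles / length
        if best is None or abs(actual - target_hz) < abs(best[2] - target_hz):
            best = (cycles, length, actual)
    return best


def relative_angle(pointer, cycles, length):
    """Fraction of a cycle (0 <= f < 1) at which the pointer sits."""
    return (pointer * cycles / length) % 1.0


def pointer_for_angle(angle, cycles, length):
    """Index in a new list whose relative angle is nearest to `angle`."""
    lines_per_cycle = length / cycles
    return round(angle * lines_per_cycle) % length


if __name__ == "__main__":
    print(LINE_RATE / 20, LINE_RATE / 21)   # single-cycle lists of 20 and 21 lines
    print(choose_list(945.0))                # multi-cycle list for a 945 Hz target
    angle = relative_angle(5, 1, 20)         # a quarter of the way through a cycle
    print(pointer_for_angle(angle, 1, 48))   # same point in a 48-line cycle
```

For a 945 Hz target the multi-cycle search lands within 0.1% of the requested frequency, whereas the single-cycle lists of 20 and 21 lines give 969.6 and 923.43 Hz. Rounding the new pointer to the nearest line leaves a residual phase error of at most half a line.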
When the generating program calculates and stores the list of values for the new frequency it sets the pointer at the same angular position as was stored. Thus, if the old sine wave was just beginning to go in the positive direction the new frequency sine wave will continue to go in the positive direction from the same point in the cycle. If the old sine wave was at its negative peak value the new sine wave will begin at its negative peak value. An example of two quite different frequencies joined together is shown in Fig. 9.

If successive frequency differences are small the ear will hear them as a gliding tone, like a toy whistle or a trombone. As the differences become larger the sound changes to that of a very smooth scale such as can be easily produced with a panpipe.

8. SOUND QUALITY

Our sound tracks are recorded on 35 mm film. The CRT raster is square and is adjusted to just fill the area between the sprocket holes. With our hardware and software it is convenient to address 1024 lines in the vertical direction and 4096 in the horizontal direction. This results in sound tracks written with 808 horizontal lines per frame, each line having a maximum length of 76 linewidths; that is, the sound waves are formed of quantized samples. Each horizontal line is a sample; the length of each line is adjustable in small increments.
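The sampling description above can be illustrated with a short Python sketch, offered under stated assumptions and not as the program used for the films. It fills one frame with quantized line lengths for a sine tone and evaluates the line rate, the R/2 bandwidth, and the quantizing limit discussed in this section. The function name line_lengths and the choice to draw zero signal as a half-length line are assumptions made for illustration.

```python
import math

FRAMES_PER_SECOND = 24
LINES_PER_FRAME = 808    # horizontal lines (samples) per frame
MAX_INCREMENTS = 76      # longest line, in linewidths


def line_lengths(freq_hz, n_lines=LINES_PER_FRAME, amplitude=1.0):
    """Quantized line lengths (0..MAX_INCREMENTS) for a sine starting at zero.

    Assumption: zero signal is drawn as a half-length line and full
    amplitude swings the length between 0 and MAX_INCREMENTS.
    """
    line_rate = FRAMES_PER_SECOND * LINES_PER_FRAME
    half = MAX_INCREMENTS / 2
    lengths = []
    for i in range(n_lines):
        value = half + amplitude * half * math.sin(
            2 * math.pi * freq_hz * i / line_rate)
        # Round to a whole increment and clamp to the drawable range.
        lengths.append(max(0, min(MAX_INCREMENTS, round(value))))
    return lengths


line_rate = FRAMES_PER_SECOND * LINES_PER_FRAME       # samples per second
bandwidth_hz = line_rate / 2                          # R/2 limit
snr_db = 20 * math.log10(MAX_INCREMENTS / 0.5)        # quantizing limit in dB

if __name__ == "__main__":
    frame = line_lengths(969.6)   # 20 lines per cycle
    print(line_rate, bandwidth_hz, round(snr_db, 1))
    print(frame[:21])
```

The decibel figure follows from taking 20 log10 of the amplitude ratio 76/0.5. Rounding each sample to a whole increment is what bounds the amplitude error at half an increment.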
FIG. 9. 252 Hz sine wave joined to 400 Hz sine wave.

The 808 lines, at 24 frames per second, pass the optical sound pickup at a rate of 19392 lines per second. The theoretical bandwidth of a continuous function approximated by samples is R/2, where R is the sampling rate. Thus, our recorded sound has a theoretical maximum bandwidth of 9696 Hz [3].

Each horizontal line can be drawn with from 0 to 76 increments. This introduces a quantizing error which limits the signal-to-noise ratio. With 76 increments the maximum error in signal amplitude is 0.5 increment and the maximum possible signal-to-noise ratio is 76/0.5 = 152, or approximately 44 dB.

In practice we find that the sound quality is comparable to that produced by a typical television receiver. The motion picture projector's playback system limits the lowest usable frequency to be about 150 Hz. The usable upper frequency is limited by noise and distortion to about 3500 Hz. We assume that the high frequency is limited by variations in film processing, varying spot intensity, CRT misalignment, the projector's playback system, etc., but we have made no attempt to evaluate these effects.

ACKNOWLEDGMENTS

I am indebted to Stephen Levine of Lawrence Livermore Laboratory for convincing me that computer-drawn sound tracks are practical and for his comments on early versions of this paper. I am indebted to Roy Keir of the University of Utah, who called to my attention the frequency error that results from storing only one cycle of the frequency to be recorded and who also modified my program so that it would draw sound tracks with vertical lines. This work was prepared under the auspices of the Atomic Energy Commission.

REFERENCES

1. N. McLaren, "Synchromy." Film, Learning Corporation of America, New York.
2. E. K. Tucker, Computer generated optical sound tracks, in Proceedings of the AFIPS Fall Joint Computer Conference, pp. 147-152, Anaheim, Calif., Dec. 1972.
3. M. V. Matthews, The Technology of Computer Music, MIT Press, Cambridge, Mass., 1969.
4. H. M. Tremaine, Audio Cyclopedia, Howard W. Sams, New York, 1973.