1280×720 HD
SMPTE ST 296, 1280 × 720 Progressive Image Sample Structure – Analog and Digital Representation and Analog Interface.
41
This chapter details the scanning, timing, and sync structure of 1280×720 video, also called 720p. The scanning and timing information in this chapter applies to all variants of 720p video, both analog and digital.

Scanning
24⁄1.001 ≈ 23.976; 30⁄1.001 ≈ 29.97; 60⁄1.001 ≈ 59.94
720p video represents stationary or moving two-dimensional images sampled temporally at a constant rate of 24⁄1.001, 24, 25, 30⁄1.001, 30, 50, 60⁄1.001, or 60 frames per second. The sampling rate is 74.25 MHz (modified by the ratio 1000⁄1001 in 720p59.94, 720p29.97, and 720p23.976). All of these systems have 750 total lines (LT). The number of samples per total line (STL) is adapted to achieve the desired frame rate. Table 41.1 below summarizes the scanning parameters. A frame comprises a total of 750 horizontal raster lines of equal duration, uniformly progressively scanned top to bottom and left to right, and numbered consecutively.

System        fS [MHz]        STL
720p60        74.25           1650
720p59.94     74.25⁄1.001     1650
720p50        74.25           1980
720p30        74.25           3300
720p29.97     74.25⁄1.001     3300
720p25        74.25           3960
720p24        74.25           4125
720p23.976    74.25⁄1.001     4125
Table 41.1 720p scanning parameters are summarized.
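The relationship behind Table 41.1 can be checked numerically: the frame rate equals the sampling rate divided by the product of STL and the 750 total lines. The Python sketch below is illustrative only. The names `SYSTEMS`, `FS`, `FS_1001`, and `frame_rate` are hypothetical, and exact rational arithmetic is a choice made here so that the 1000⁄1001 variants come out exactly.

```python
from fractions import Fraction

LINES_TOTAL = 750                       # L_T, total lines per frame
FS = Fraction(74_250_000)               # sampling rate in Hz
FS_1001 = FS * Fraction(1000, 1001)     # sampling rate modified by 1000/1001

# (system, sampling rate in Hz, samples per total line S_TL), from Table 41.1
SYSTEMS = [
    ("720p60", FS, 1650),
    ("720p59.94", FS_1001, 1650),
    ("720p50", FS, 1980),
    ("720p30", FS, 3300),
    ("720p29.97", FS_1001, 3300),
    ("720p25", FS, 3960),
    ("720p24", FS, 4125),
    ("720p23.976", FS_1001, 4125),
]


def frame_rate(fs_hz, s_tl, lines_total=LINES_TOTAL):
    """Return the frame rate in Hz as an exact fraction."""
    return Fraction(fs_hz) / (s_tl * lines_total)


for name, fs, s_tl in SYSTEMS:
    print(f"{name:<11} {float(frame_rate(fs, s_tl)):.3f} frames per second")
```

Each row of the table yields its nominal frame rate, which is one way to confirm that STL is the only scanning parameter that changes between systems sharing a sampling rate.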
Line number             Contents
1–5                     tri/BR (5 lines)
6                       Blanking
7–25 (19 lines)         Blanking/Ancillary
26–745 (720 lines)      Picture [Clean aperture 702 lines]
746–750 (5 lines)       Blanking
tri: Trilevel pulse. BR: Broad pulse. The vertical center of the picture is located midway between lines 385 and 386.
Table 41.2 1280×720 line assignment
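The line assignment of Table 41.2 maps directly onto a small lookup. The following Python sketch is a minimal illustration, not part of any standard. The function names `line_content` and `picture_row` are hypothetical, and the zero-based picture row index is a convention assumed here.

```python
LINES_TOTAL = 750          # total lines per frame
FIRST_PICTURE_LINE = 26    # top image row
LAST_PICTURE_LINE = 745    # bottom image row


def line_content(line):
    """Return the Table 41.2 contents label for a line number (1..750)."""
    if not 1 <= line <= LINES_TOTAL:
        raise ValueError(f"line {line} is outside 1..{LINES_TOTAL}")
    if line <= 5:
        return "tri/BR"                # trilevel sync plus broad pulse
    if line == 6:
        return "Blanking"
    if line < FIRST_PICTURE_LINE:
        return "Blanking/Ancillary"    # lines 7 through 25
    if line <= LAST_PICTURE_LINE:
        return "Picture"
    return "Blanking"                  # lines 746 through 750


def picture_row(line):
    """Return the zero-based picture row (0..719) for a picture line."""
    if line_content(line) != "Picture":
        raise ValueError(f"line {line} carries no picture")
    return line - FIRST_PICTURE_LINE


picture_lines = sum(line_content(n) == "Picture" for n in range(1, LINES_TOTAL + 1))
print(picture_lines)   # count of picture lines in one frame
```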
Lines are numbered starting at 1. Of the 750 total lines, 720 contain picture. Table 41.2 above shows the assignment of line numbers and their content.

For studio video, the tolerance on frame rate is normally ±10 ppm. In practice the tolerance applies to a master clock at a high frequency, but for purposes of computation and standards writing it is convenient to reference the tolerance to the frame rate.

At a digital interface, video information is identified by a timing reference signal (TRS) conveyed across the interface. (See SDI and HD-SDI sync, TRS, and ancillary data, on page 433.) The last active line of a frame is terminated by EAV where the V-bit becomes asserted. That EAV marks the start of line 746; line 1 of the next frame starts on the fifth following EAV.

Analog sync
0H precedes the first word of SAV by 256 clocks. 0H follows the first word of EAV by STL − 1280 − 260 clocks.
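The two offsets in the note above can be evaluated for any 720p system once STL is known. The Python sketch below is illustrative only. The constant and function names are hypothetical, the 74.25 MHz clock ignores the 1000⁄1001 variants, and the SAV length is simply inferred as the difference between the 260-clock and 256-clock figures.

```python
ACTIVE_SAMPLES = 1280    # samples per active line
OH_TO_ACTIVE = 260       # clocks from 0H to active sample 0
OH_TO_SAV = 256          # clocks from 0H to the first word of SAV
FS_HZ = 74.25e6          # sampling rate; assumes a non-1000/1001 system


def eav_to_next_0h(s_tl):
    """Clocks from the first word of EAV to the following 0H."""
    return s_tl - ACTIVE_SAMPLES - OH_TO_ACTIVE


def sav_length():
    """Number of words between the first word of SAV and active sample 0."""
    return OH_TO_ACTIVE - OH_TO_SAV


def clocks_to_us(clocks, fs_hz=FS_HZ):
    """Convert a count of reference clock intervals to microseconds."""
    return clocks / fs_hz * 1e6


for name, s_tl in [("720p60", 1650), ("720p50", 1980), ("720p24", 4125)]:
    gap = eav_to_next_0h(s_tl)
    print(f"{name}: EAV to 0H = {gap} clocks, line = {clocks_to_us(s_tl):.3f} us")
```

For 720p60 the gap works out to 1650 − 1280 − 260 = 110 clocks; the longer lines of the lower frame rates add all of their extra samples to this interval.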
Horizontal events are referenced to 0H, defined by the zero-crossing of trilevel sync. Digital samples and analog timing are related such that the first (zeroth) sample of active video follows the 0H instant by 260 reference clock intervals.

At an analog interface, each line commences with a trilevel sync pulse. Trilevel sync comprises a negative portion asserted to −300±6 mV during the 40 reference clock intervals preceding 0H, and a positive portion asserted to +300±6 mV during the 40 reference clock intervals after 0H. The risetime of each transition is 4±1.5 reference clock intervals.

Vertical sync in the analog domain is signaled by broad pulses, one each on lines 1 through 5. Each broad pulse is asserted to −300±6 mV, with timing identical to active video – that is, to the production aperture’s picture width. The risetime of each transition is 4±1.5 reference clock intervals. Line 1 can be detected as the first broad pulse of a frame – that is, by a line without a broad pulse followed by a line with one.

Lines 7 through 25 do not convey picture information. They may convey ancillary or other signals either related or unrelated to the picture.

Analog signal timing is defined by the digital standard; the digital sampling frequency defines reference time intervals used to define analog timing.

Figure 41.1 shows details of the sync structure; this waveform diagram is the analog of Table 41.2.

Picture center, aspect ratio, and blanking
The center of the picture is located midway between the central two of the 1280 active samples – that is, between samples 639 and 640 – and midway between the central two of the 720 picture lines – that is, between lines 385 and 386.

The aspect ratio is defined to be 16:9 with respect to the production aperture of 1280×720.

In Transition samples, on page 378, I mentioned that it is necessary to avoid, at the start of a line, an instantaneous transition from blanking to picture information. A clean aperture pixel array 1248 samples wide and 702 lines high, centered on the production aperture, should remain subjectively uncontaminated by edge transients.

R’G’B’ EOCF and primaries
Picture information is referenced to linear-light primary red, green, and blue (RGB) tristimulus values, represented in abstract terms in the range 0 (reference black) to +1 (reference white). Three nonlinear primary components R’, G’, and B’ are computed such that the intended image appearance is obtained on the reference display in the reference viewing conditions; see Reference display and viewing conditions, on page 427.
In the default power-up state of a camera, the nonlinear primary components are computed according to the opto-electronic conversion function of BT.709
DIGITAL VIDEO AND HD ALGORITHMS AND INTERFACES
[Figure 41.1 720p raster, vertical. Waveform diagram of the progressive-system frame around 0V, showing line 745 (bottom image row), lines 746–750, lines 1–5, lines 6–25, line 26 (top image row), and lines 27–745, with intervals labelled 30 H, 25 H, and 5 H.]
OECF, described on page 320; this process is loosely called gamma correction. The colorimetric properties of the primary estimates are supposed to conform to BT.709 primaries described on page 290: DTV transmission standards call for BT.709, and modern consumer displays use BT.709. However, as I write in 2011, nearly all 50 Hz program material is created and mastered using EBU primaries (see EBU Tech. 3213 primaries, Table 26.4 on page 293), and nearly all 60 Hz program material is created and mastered using SMPTE primaries (see SMPTE RP 145 primaries, Table 26.5 on page 293). I expect the situation in mastering to change upon the introduction of new studio display technologies such as OLEDs – but among content creators and broadcasters, old habits die hard.

Luma (Y’)
Luma is a weighted sum of nonlinear R’, G’, and B’ components according to the BT.709 luma coefficients:

$^{709}Y' = 0.2126\,R' + 0.7152\,G' + 0.0722\,B'$    Eq 41.1
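Eq 41.1 translates directly into code. The Python sketch below is a minimal illustration. The name `luma_709` is hypothetical, and the inputs are assumed to be nonlinear (gamma-corrected) R’G’B’ values in the abstract range 0 to 1, not linear-light tristimulus values.

```python
# BT.709 luma coefficients from Eq 41.1; they sum to unity.
KR, KG, KB = 0.2126, 0.7152, 0.0722


def luma_709(r_prime, g_prime, b_prime):
    """Return BT.709 luma Y' for nonlinear R'G'B' components in 0..1."""
    return KR * r_prime + KG * g_prime + KB * b_prime


print(luma_709(1.0, 1.0, 1.0))   # reference white
print(luma_709(0.0, 1.0, 0.0))   # pure green primary
```

Because the coefficients sum to unity, reference white maps to a luma of 1 apart from floating-point rounding.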
The luma component Y’, being a weighted sum of nonlinear R’G’B’ components, has no simple relationship with the CIE relative luminance (Y) used in colour science. The formulation of luma in HD differs from that of SD; see SD and HD luma chaos, on page 350. Video encoding specifications typically place no upper bound on luma bandwidth.

Component digital 4:2:2 interface
For details of the analog interface, see Component analog HD interface, on page 485.
Y’CBCR components are formed by scaling Y’, B’-Y’, and R’-Y’ components, as described in CBCR components for BT.709 HD on page 371. TRS is inserted as described in SDI and HD-SDI sync, TRS, and ancillary data on page 433. The HD-SDI interface is described in HD-SDI coding, on page 440. It is standard to subsample according to the 4:2:2 scheme (sketched in the third column of Figure 12.1, on page 124). Image quality wouldn’t suffer if subsampling were 4:2:0 – that is, if colour differences were subsampled vertically as well as horizontally – but this would be inconvenient for hardware design and interface.
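As a rough illustration of the two steps just described, the Python sketch below forms colour difference components and then subsamples them 4:2:2. Several things here are assumptions not stated in this section: the scale factors 2(1 − 0.0722) and 2(1 − 0.2126), which are the usual BT.709 convention for bringing each colour difference into the range ±0.5; the function names `ycbcr_709` and `subsample_422`; and plain decimation of chroma at even-numbered samples, which stands in for the lowpass filtering that practical equipment would apply first. Interface quantization is omitted.

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722   # BT.709 luma coefficients


def ycbcr_709(r_prime, g_prime, b_prime):
    """Return (Y', CB, CR) with Y' in 0..1 and CB, CR in -0.5..+0.5."""
    y = KR * r_prime + KG * g_prime + KB * b_prime
    cb = (b_prime - y) / (2.0 * (1.0 - KB))   # scaled B' - Y'
    cr = (r_prime - y) / (2.0 * (1.0 - KR))   # scaled R' - Y'
    return y, cb, cr


def subsample_422(row):
    """Convert a row of R'G'B' triples to full-rate Y' and half-rate CB, CR.

    Chroma is taken at even-numbered samples only (simple decimation,
    no prefilter), so each CB, CR pair is cosited with an even luma sample.
    """
    converted = [ycbcr_709(*pixel) for pixel in row]
    y = [c[0] for c in converted]
    cb = [c[1] for c in converted[0::2]]
    cr = [c[2] for c in converted[0::2]]
    return y, cb, cr


row = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
y, cb, cr = subsample_422(row)
print(len(y), len(cb), len(cr))   # luma count, then the two chroma counts
```

Horizontal-only subsampling halves the number of colour difference samples per line while leaving every line populated.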