DIGITAL SIGNAL PROCESSING 2, 146-156 (1992)
A Bi-level Coding Technique for Compressing Broadband Residue Sequences
Samuel D. Stearns,* Lizhe Tan,† and Neeraj Magotra†
*Sandia National Laboratories, Albuquerque, New Mexico 87185; and †Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, New Mexico 87131
I. INTRODUCTION
Many coding techniques produce residue sequences with Gaussian, Laplacian, or Gamma amplitude distributions. Such techniques occur in LPC (linear predictive coding) of speech, image, and seismic waveforms [1-4, 6]. We assume that the residue sequence is a white (memoryless) sequence of signed integers. For example, the residue sequence from a least-squares linear predictor for seismic waveform data [4, 6] is shown to be white Gaussian. A technique called bi-level sequence coding is developed in this paper for such applications.
II. BI-LEVEL SEQUENCE CODING
The following development of bi-level sequence coding is a modification of the development given in Ref. [4]. Bi-level sequence coding is a type of run-length coding [5]. It is binary in the sense that short sequences of residues are encoded using two different sample word sizes, or "levels." Let the sequence of integer residues be designated as $[ir(j);\ 0 \le j < K]$. We designate two sample word sizes (levels), $N_0$ and $N_1$ bits (magnitude plus sign), with
$$N_0 = \text{min. integer such that } 2^{N_0-1} > |ir(j)|_{\max}, \quad 0 \le j < K \tag{1}$$
and
$$0 \le N_1 < N_0, \tag{2}$$
where $|ir(j)|_{\max}$ is the maximum residue amplitude, and $K$ is the frame size of the data sequence. We also designate a number of bits ($\lambda$) for designating the length of a short sequence, $n$, along with a minimum sequence length, $\mu$, such that the short sequence length has the following $2^{\lambda} - 1$ possible values:
$$\mu \le \text{length}(n) < 2^{\lambda} + \mu - 1. \tag{3}$$
The parameters $\lambda$, $N_0$, and $N_1$ are required to specify the coding. We begin with the following example: let $\lambda = 4$, $N_0 = 5$, $N_1 = 3$, and $\mu = 3$. Consider the portion of a residue sequence shown in Fig. 1. In Fig. 1, residue samples are shown on the top line, and the number of bits is shown on the second line. On the third line, each level is either 0 or 1 depending on whether the corresponding word size exceeds the specified $N_1 = 3$. The short sequence, $\{2, 3, 0, -2, -1\}$, of residues at level 1 is shown on the fourth line and encoded on the last line. The underlined first four ($\lambda$) bits in the encoded short sequence specify the sequence length $n = 5$ as $n - \mu$ = binary 2 = 0010, as specified in rule 5 of the following coding procedure, with $\mu = 3$ in this example.
[FIG. 1. Simple example of encoding one short sequence. Rows: residue $ir(j)$; number of bits (magnitude + sign); level (0 or 1); short residue sequence; encoding in binary.]
A. Coding Rules
The specific rules for bi-level sequence coding are described as follows:
1. Each short residue sequence begins with a word ($\lambda$ bits long) which indicates the sequence length ($n$), and also indicates whether or not the sequence "continues," as described below.
2. The first short sequence is always at level 0, and consists of $\lambda$ length bits followed by $n$ samples, each having a word size of $N_0$ bits. Then, unless the sequence is designated as continuing, the next short sequence switches to level 1.
3. The previous step is essentially repeated for each short sequence. The length, $n$, is designated in the first $\lambda$ bits and, unless the sequence continues, the level switches for the next sequence. If the sequence continues, the level does not switch.
4. For level-0 sequences, $\mu = 1$ is fixed and the range of lengths is
$$1 \le n \le L = 2^{\lambda} - 1. \tag{4}$$
The "length" $L + 1 = 2^{\lambda}$ designates a continuing level-0 sequence (note that there are $2^{\lambda}$ states in all).
5. For level-1 sequences, $\mu \ge 1$ and the range of lengths ($n$), as in Eq. (3), is
$$\mu \le n \le L + \mu - 1. \tag{5}$$
The "length" $n = L + \mu = \mu + 2^{\lambda} - 1$ designates a continuing level-1 sequence (again, note that there are $2^{\lambda}$ states in all).
The minimum level-1 sequence length, $\mu$, is found as follows. A level-1 sequence, encoded as in the example in Fig. 1, requires $\lambda + nN_1$ bits. If instead the previous (or following) level-0 sequence were lengthened without continuation, $nN_0$ bits would be required. Thus, for a level-1 sequence to be profitable, $nN_0$ must exceed $\lambda + nN_1$, and so, if $\mu$ is the minimum value of $n$, then
$$\mu = \text{min. integer} > \frac{\lambda}{N_0 - N_1} = 1 + \mathrm{INT}\left[\frac{\lambda}{N_0 - N_1}\right], \tag{6}$$
where INT is the integer truncation function.
When the number of encoded residues reaches the frame size $K$, the coding procedure is terminated. Therefore, the encoded data frame consists of a "keyword" containing the parameters $K$, $N_0$, $N_1$, and $\lambda$, followed by the encoded bi-level sequences, as shown in Fig. 2. The parameters $K$, $N_0$, $N_1$, and $\lambda$ must be encoded with the residue sequence so that the decoder can recover the original sequence. In our application, we designate a keyword at the beginning of each data frame. Typical keyword contents are shown in Fig. 3.
[FIG. 2. The encoded data frame: keyword ($K$, $N_0$, $N_1$, $\lambda$), followed by the bi-level sequences encoding $ir(0), ir(1), \ldots, ir(K-1)$.]
[FIG. 3. Typical keyword contents for bi-level sequence coding.]
To illustrate the coding rules described above, consider the complete example of an encoded residue sequence shown in Fig. 4. Assume that the coding parameters illustrated on the first line are chosen to be $K = 35$, $N_0 = 6$, $N_1 = 3$, and $\lambda = 3$. The minimum length for the level-1 sequence, $\mu = 2$, is evaluated by Eq. (6). The residue values are shown in lines 3-6. The keyword in Fig. 3, requiring 30 bits, is encoded on line 8 in Fig. 4. The binary sequence produced by the bi-level encoder is shown on lines 10-15. According to the coding rules, the level-0 sequence on line 10 begins with the sequence count, 1 ($n$), represented by 0 ($n - 1$), coded as the underlined first 3 bits, followed by a single residue with 6 ($N_0$) bits. Then the sequence switches to the level-1 sequence on line 11. The level-1 sequence begins with the sequence count, 8 ($n$), represented by 6 ($n - \mu$), coded as the underlined first 3 bits, followed by 8 residues, each with 3 ($N_1$) bits.
[FIG. 4. Illustration of encoding example. Panels: encoding parameters ($K = 35$, $N_0 = 6$, $N_1 = 3$, $\lambda = 3$); residue sequence; keyword parameters; encoded short sequences.]
Next, lines 12-13 present a continuing level-0 sequence with length 10. The underlined first 3 bits on line 12 indicate the length 8 ($L + 1$), designating a continuing level-0 sequence in accordance with rule 4. These are followed by 7 ($L$) level-0 residues. Then line 13, the continuation of the level-0 sequence, begins with the sequence count, 3 ($n = 10 - 7$), represented by 2 ($n - 1$) and coded as the underlined first 3 bits, followed by 3 level-0 residues. Finally, lines 14-15 present a continuing level-1 sequence with length 16. The underlined first 3 bits on line 14 indicate the length 9 ($L + \mu$), which designates continuation in accordance with rule 5. These bits are followed by 8 ($L + \mu - 1$) level-1 residues. Line 15 is the continuation of the level-1 sequence with length 8 ($n = 16 - 8$). The underlined first 3 bits on line 15 indicate 8 ($L + \mu - 1$) level-1 residues. In this example, 186 bits are required to encode the keyword and the residue sequence: 30 keyword bits, plus 9, 27, 66, and 54 bits for the four sequences.
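The coding rules of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`min_level1_length`, `word`, `encode`, `decode`) are hypothetical, residues are written as sign-magnitude words as in Fig. 1, the keyword is omitted, and the treatment of a level-1 run that exceeds the maximum chunk by fewer than $\mu$ residues (the leftover is pushed into the following level-0 sequence) is an assumption that the text does not spell out.

```python
def min_level1_length(n0, n1, lam):
    """Eq. (6): smallest run length for which switching to level 1 saves bits."""
    return 1 + lam // (n0 - n1)

def word(r, nbits):
    """Sign-magnitude word: one sign bit, then nbits - 1 magnitude bits."""
    if nbits == 0:
        return ""
    mag = "".join(str((abs(r) >> b) & 1) for b in range(nbits - 2, -1, -1))
    return ("1" if r < 0 else "0") + mag

def encode(residues, n0, n1, lam):
    """Return the bi-level bit string for one frame (keyword not included)."""
    mu, L = min_level1_length(n0, n1, lam), 2 ** lam - 1
    K = len(residues)
    small = [abs(r) < 2 ** (n1 - 1) if n1 > 0 else r == 0 for r in residues]
    run = [0] * (K + 1)            # run[j]: length of the level-1 run starting at j
    for j in range(K - 1, -1, -1):
        run[j] = run[j + 1] + 1 if small[j] else 0
    bits, i = [], 0

    def emit(code, count, nbits):
        nonlocal i
        bits.append(format(code, "b").zfill(lam))
        bits.extend(word(r, nbits) for r in residues[i:i + count])
        i += count

    while i < K:
        # Level-0 sequence: at least one residue, up to the next run of >= mu level-1 residues.
        j = i + 1
        while j < K and run[j] < mu:
            j += 1
        n = j - i
        while n > L:               # rule 4: code L means "L residues, sequence continues"
            emit(L, L, n0)
            n -= L
        emit(n - 1, n, n0)
        if i >= K:
            break
        # Level-1 sequence (rule 5).
        n, cap = run[i], L + mu - 1
        while n > cap and n - cap >= mu:   # code L means "cap residues, sequence continues"
            emit(L, cap, n1)
            n -= cap
        take = min(n, cap)         # assumption: any leftover (< mu) joins the next level-0 sequence
        emit(take - mu, take, n1)
    return "".join(bits)

def decode(bits, K, n0, n1, lam):
    """Invert encode(); K, n0, n1, and lam would normally come from the keyword."""
    mu, L = min_level1_length(n0, n1, lam), 2 ** lam - 1
    out, pos, level = [], 0, 0
    while len(out) < K:
        code = int(bits[pos:pos + lam], 2)
        pos += lam
        nbits = n1 if level else n0
        if code == L:              # continuing: the level does not switch
            count, next_level = (L + mu - 1 if level else L), level
        else:
            count, next_level = code + (mu if level else 1), 1 - level
        for _ in range(count):
            w = bits[pos:pos + nbits]
            pos += nbits
            mag = int(w[1:], 2) if nbits > 1 else 0
            out.append(-mag if nbits and w[0] == "1" else mag)
        level = next_level
    return out
```

As a usage check, a 35-residue frame whose sequences have lengths 1, 8, 10, and 16 (the lengths of the Fig. 4 example; the residue values 20 and 1 are made up here) encodes to 156 bits with $N_0 = 6$, $N_1 = 3$, $\lambda = 3$, which together with the 30-bit keyword gives the 186 bits quoted above.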
B. Distribution of Sequence Lengths
In the keyword, $K$ and $N_0$ are fixed by given properties of the residue sequence. To select optimal values for $\lambda$ and $N_1$, we need to know how the level-0 and level-1 sequence lengths are distributed. First, recall that $N_0$ and $N_1$ include sign bits unless equal to zero. Let $P(i)$ represent the probability that a residue, $ir(j)$, has $i$ magnitude bits. Thus, assuming $N_1 \ge 2$, the expected fraction of residues at each level is given by
$$f_L = \Pr\{ir(j) \text{ is at level } L\},$$
$$f_1 = P(0) + \cdots + P(\max(0, N_1 - 1)),$$
$$f_0 = P(\max(1, N_1)) + \cdots + P(N_0 - 1). \tag{7}$$
We know that, since $P(0) + \cdots + P(N_0 - 1) = 1$ in any sequence $ir(0{:}K-1)$,
$$f_0 + f_1 = 1. \tag{8}$$
Using the assumption of whiteness, which causes the residues to be statistically independent, we locate within $ir(0{:}K-1)$, a long residue sequence, the start of a short sequence. We define
$$p(1, n) = \Pr\{\text{sequence is level-1 and has length } n\}. \tag{9}$$
Then one can obtain
$$p(1, n) = \begin{cases} 0, & n < \mu, \\ f_1^{\,n-\mu+1} f_0, & n \ge \mu. \end{cases} \tag{9a}$$
Note that the sequence must begin with level 1, may contain $(n - \mu)$ level-1 residues beyond the minimum length $\mu$, and must be followed by level 0. The sum of $p(1, n)$ over all possible sequence lengths is
$$\sum_{n=\mu}^{\infty} p(1, n) = f_0 f_1 \sum_{i=0}^{\infty} f_1^{\,i} = f_1. \tag{10}$$
On the other hand, a sequence of $n$ level-0 residues must be followed by $\mu$ level-1 residues and may contain short runs of level-1 residues, as in the example of a short level-0 sequence of length $n = 7$, with $\mu = 3$, shown in Fig. 5. This level-0 sequence begins and terminates with a level-0 residue, contains 2 short runs of level-1 residues, each with length $< \mu$, and is followed by $\mu$ level-1 residues.
[FIG. 5. Example of a level-0 sequence: 0100110111...]
In order to obtain the probability that a short sequence is a level-0 sequence of length $n$, we define $S(k, v, \mu)$ to be the number of different sequences of length $k$ having $v$ level-1 residues, with $v \le k$, and having fewer than $\mu$ consecutive level-1 residues. For example, with $\mu = 3$, $S(7, 5, 3) = 3$, and the sequences are 1011011, 1101011, and 1101101. A recursion formula, discovered by J. A. Davis of Sandia National Laboratories [4, 6], is derived in the Appendix. The formula is, for $v \le k$,
$$S(k, v, \mu) = \begin{cases} \binom{k}{v}, & k \ge 0 \text{ and } v < \mu, \\ 0, & k = v = \mu, \\ S(k-1, v, \mu) + S(k-1, v-1, \mu) - S(k-\mu-1, v-\mu, \mu), & \text{otherwise}, \end{cases} \tag{11}$$
where in the first line, $\binom{k}{v}$ is the binomial coefficient. We can use Eq. (11) to show that the probability of the occurrence of a sequence of length $k$ with $v$ level-1 residues and $k - v$ level-0 residues, and with fewer than $\mu$ consecutive level-1 residues, is
$$\begin{aligned} q(k, v, \mu) &= f_0^{\,k-v} f_1^{\,v}\, S(k, v, \mu) \\ &= f_0^{\,k-v} f_1^{\,v}\, S(k-1, v, \mu) + f_0^{\,k-v} f_1^{\,v}\, S(k-1, v-1, \mu) - f_0^{\,k-v} f_1^{\,v}\, S(k-\mu-1, v-\mu, \mu) \\ &= f_0\, q(k-1, v, \mu) + f_1\, q(k-1, v-1, \mu) - f_0 f_1^{\,\mu}\, q(k-\mu-1, v-\mu, \mu). \end{aligned} \tag{12}$$
As demonstrated in the example in Fig. 5, a level-0 sequence must begin and end with a level-0 residue, must be followed by $\mu$ level-1 residues, and may contain any of the $S(n-2, v, \mu)$ sequences described above. Thus, we define
$$p(0, n, v) = \Pr\{\text{sequence is level-0, has length } n, \text{ and has exactly } v \text{ internal level-1 residues}\}. \tag{13}$$
Using this definition together with Eq. (12), we obtain
$$p(0, n, v) = \begin{cases} f_0 f_1^{\,\mu}, & n = 1 \text{ and } v = 0, \\ f_0^{\,2} f_1^{\,\mu}\, q(n-2, v, \mu), & n > 1 \text{ and } v < \mu, \\ f_0\, p(0, n-1, v) + f_1\, p(0, n-1, v-1) - f_0 f_1^{\,\mu}\, p(0, n-\mu-1, v-\mu), & n > 1 \text{ and } v \ge \mu. \end{cases} \tag{13a}$$
In spite of not having a closed form of Eq. (13) at hand, we can still conjecture that, similar to Eq. (10),
$$p(0, 1, 0) + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} p(0, n, v) = f_0. \tag{14}$$
Equations (9a) and (13a) will be used to derive the average bits per residue in the next section.
C. Average Bits per Residue
Having derived equations for sequence lengths, we can now derive formulas for the average number of bits per residue as a function of $\lambda$ and $N_1$, the coding parameters described above. First, the number of bits needed to encode a level-1 sequence consisting of $n \ge \mu$ level-1's in accordance with rule 5 is
$$B(1, n) = \lambda + nN_1, \quad \mu \le n < \mu + L. \tag{15}$$
For $n \ge L + \mu$, the sequence must be continued, and there is a minimal integer $k$ for each length $n$ such that
$$k \ge \frac{n - \mu + 1}{L}, \quad \text{i.e.,} \quad k = 1 + \mathrm{INT}\left[\frac{n - \mu}{L}\right],$$
so that
$$B(1, n) = k\lambda + nN_1. \tag{16}$$
The number of bits required to encode a level-0 sequence with $n$ residues, $v$ of which are level-1's, is
$$B(0, n, v) = \begin{cases} \lambda + nN_0, & n \le L, \\ k\lambda + nN_0, & n > L, \end{cases} \qquad k = 1 + \mathrm{INT}\left[\frac{n - 1}{L}\right]. \tag{17}$$
The expected number of bits per sequence is expressed as
$$E[B] = B(0, 1, 0)\,p(0, 1, 0) + \sum_{n=\mu}^{\infty} B(1, n)\,p(1, n) + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} B(0, n, v)\,p(0, n, v). \tag{18}$$
Obviously from Eqs. (13) and (17), the first term in Eq. (18) is
$$B(0, 1, 0)\,p(0, 1, 0) = (\lambda + N_0)\, f_0 f_1^{\,\mu}. \tag{18a}$$
We can express the second term in Eq. (18) as
$$\begin{aligned} \sum_{n=\mu}^{\infty} B(1, n)\,p(1, n) &= \sum_{n=\mu}^{\infty} (k\lambda + nN_1)\, f_1^{\,n-\mu+1} f_0 \\ &= \lambda f_0 f_1 \left[\sum_{n=\mu}^{\mu+L-1} f_1^{\,n-\mu} + 2 \sum_{n=\mu+L}^{\mu+2L-1} f_1^{\,n-\mu} + \cdots \right] + N_1 f_0 f_1 \sum_{n=\mu}^{\infty} n f_1^{\,n-\mu} \\ &= \lambda f_0 f_1 \sum_{k=1}^{\infty} k \sum_{m=0}^{L-1} f_1^{\,m+(k-1)L} + N_1 f_0 f_1 \sum_{i=0}^{\infty} (i + \mu) f_1^{\,i} \\ &= \lambda f_1^{\,1-L} \left(1 - f_1^{\,L}\right) \sum_{k=1}^{\infty} k f_1^{\,kL} + N_1 f_0 f_1 \left[\frac{\mu}{1 - f_1} + \frac{f_1}{(1 - f_1)^2}\right] \\ &= \frac{\lambda f_1}{1 - f_1^{\,L}} + N_1 f_1 \left(\mu + \frac{f_1}{f_0}\right), \end{aligned} \tag{19}$$
and the third term in Eq. (18) is given by
$$\sum_{n=2}^{\infty} \sum_{v=0}^{n-2} B(0, n, v)\,p(0, n, v) = \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} (k\lambda + nN_0)\, f_0^{\,2} f_1^{\,\mu}\, q(n-2, v, \mu). \tag{20}$$
Similarly, the expected number of residues per sequence is derived as
$$\begin{aligned} E[n] &= 1 \cdot p(0, 1, 0) + \sum_{n=\mu}^{\infty} n\,p(1, n) + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} n\,p(0, n, v) \\ &= f_0 f_1^{\,\mu} + \sum_{n=\mu}^{\infty} n f_1^{\,n-\mu+1} f_0 + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} n f_0^{\,2} f_1^{\,\mu}\, q(n-2, v, \mu) \\ &= f_0 f_1^{\,\mu} + f_0 f_1 \sum_{n=0}^{\infty} (n + \mu) f_1^{\,n} + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} n f_0^{\,2} f_1^{\,\mu}\, q(n-2, v, \mu) \\ &= f_0 f_1^{\,\mu} + f_1 \left(\mu + \frac{f_1}{f_0}\right) + \sum_{n=2}^{\infty} \sum_{v=0}^{n-2} n f_0^{\,2} f_1^{\,\mu}\, q(n-2, v, \mu). \end{aligned} \tag{21}$$
Finally, the expected number of bits per residue is
$$\mathrm{EBPR} = \frac{E[B]}{E[n]}. \tag{22}$$
We will describe how the EBPR can be minimized with respect to the coding parameters ($N_1$, $\lambda$) in Section IV.
III. DISTRIBUTION OF SAMPLE WORD SIZE
In order to obtain the coding parameters ($N_1$, $\lambda$), we first need to investigate the distribution of sample word sizes. Since each integer residue, $ir(j)$, is treated as the integer nearest the corresponding continuous distribution value, $r$, the probability that a residue has integer value $ir$ is defined as
$$P(ir) = \Pr\{ir - \tfrac{1}{2} \le r < ir + \tfrac{1}{2}\}. \tag{23a}$$
Thus, $P(ir)$ can be evaluated by integrating the probability density of $r$ in the range from $(ir - \tfrac{1}{2})$ to $(ir + \tfrac{1}{2})$. We can modify Eq. (23a) to express the probability that a residue, $ir(j)$, has $i$ magnitude bits. This is the probability that $|r|$ is in the range from $(2^{i-1} - \tfrac{1}{2})$ to $(2^{i} - \tfrac{1}{2})$, i.e.,
$$P(i) = \Pr\{ir(j) \text{ has } i \text{ magnitude bits}\} = \Pr\{2^{i-1} - \tfrac{1}{2} \le |r| < 2^{i} - \tfrac{1}{2}\}. \tag{23b}$$
Examples of $P(i)$ for three distributions considered in this paper are listed below:
(a) Gaussian sequence:
$$P(i) = 2\left[\Phi\left(2^{i-\alpha} - \frac{1}{2^{\alpha+1}}\right) - \Phi\left(2^{i-\alpha-1} - \frac{1}{2^{\alpha+1}}\right)\right], \tag{24}$$
where
$$\Phi(r) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{r} e^{-x^2/2}\,dx \tag{25}$$
and $\alpha$ is an integer chosen such that the standard deviation of the residue sequence is expressed approximately as
$$\sigma_{ir} = 2^{\alpha}. \tag{26}$$
(b) Laplacian sequence:
$$P(i) = e^{\sqrt{2}/2^{\alpha+1}} \left[e^{-(1/\sqrt{2})\,2^{i-\alpha}} - e^{-\sqrt{2}\cdot 2^{i-\alpha}}\right], \tag{27}$$
where the Laplacian distribution used is expressed by
$$p(r) = \frac{1}{\sqrt{2}\cdot 2^{\alpha}}\, e^{-\sqrt{2}\,|r|/2^{\alpha}}. \tag{28}$$
(c) Gamma sequence:
$$P(i) = \left(\frac{3}{4\pi^2}\right)^{1/4} \int_{2^{i-\alpha-1} - 1/2^{\alpha+1}}^{2^{i-\alpha} - 1/2^{\alpha+1}} \frac{1}{\sqrt{x}}\, e^{-(\sqrt{3}/2)\,x}\,dx, \tag{29}$$
where the Gamma distribution used is
$$p(x) = \frac{\sqrt{\rho}}{2\sqrt{\pi |x|}}\, e^{-\rho |x|} \tag{30}$$
and $\rho$ is given as
$$\rho = \frac{\sqrt{3}}{2^{\alpha+1}}. \tag{31}$$
Plots of $P(i)$ versus $(i - \alpha)$ for these three cases are illustrated in Figs. 6a-6c.
[FIG. 6. (a) Probability that a residue from the Gaussian distribution has $i$ magnitude bits, plotted versus $i - \alpha$. (b) Probability that a residue from the Laplacian distribution has $i$ magnitude bits, plotted versus $i - \alpha$. (c) Probability that a residue from the Gamma distribution has $i$ magnitude bits, plotted versus $i - \alpha$.]
From the plots, one can observe that for the Gaussian and Laplacian distributions,
$$\Pr\{i - \alpha > 3\} = \Pr\{i \ge \alpha + 4\} \approx 0, \tag{32}$$
and for the Gamma distribution,
$$\Pr\{i - \alpha > 4\} = \Pr\{i \ge \alpha + 5\} \approx 0. \tag{33}$$
When $\alpha \ge 8$, the difference in these distributions can be considered to be negligible, as illustrated in Figs. 6a-6c. We assume $N_0 = \alpha + 4$ for the Gaussian and Laplacian sequences and $N_0 = \alpha + 5$ for the Gamma sequence for theoretical comparisons.
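The word-size probabilities of this section and the level fractions of Eq. (7) can be evaluated numerically. The following Python sketch is an illustration only: the function names are hypothetical, the Laplacian case uses Eq. (27) directly, and the Gaussian case evaluates Eq. (23b) with the error function from the standard library rather than any particular tabulated form.

```python
import math

def p_bits_laplacian(i, alpha):
    """Eq. (27): P(i) for a Laplacian residue with standard deviation 2**alpha."""
    root2 = math.sqrt(2.0)
    s = 2.0 ** (i - alpha)
    return math.exp(root2 / 2.0 ** (alpha + 1)) * (math.exp(-s / root2) - math.exp(-root2 * s))

def p_bits_gaussian(i, alpha):
    """Eq. (23b) evaluated for a zero-mean Gaussian with standard deviation 2**alpha."""
    scale = 2.0 ** alpha * math.sqrt(2.0)
    lo, hi = 2.0 ** (i - 1) - 0.5, 2.0 ** i - 0.5
    # Pr{|r| < x} = erf(x / (sigma * sqrt(2))) for a zero-mean Gaussian.
    return math.erf(hi / scale) - math.erf(lo / scale)

def level_fractions(p_bits, alpha, n0, n1):
    """Eq. (7): fractions (f0, f1) of residues coded at level 0 and at level 1."""
    f1 = sum(p_bits(i, alpha) for i in range(0, max(0, n1 - 1) + 1))
    f0 = sum(p_bits(i, alpha) for i in range(max(1, n1), n0))
    return f0, f1

if __name__ == "__main__":
    # Example setting: alpha = 8, N0 = alpha + 4, N1 = N0 - 2.
    for name, p in (("Laplacian", p_bits_laplacian), ("Gaussian", p_bits_gaussian)):
        f0, f1 = level_fractions(p, 8, 12, 10)
        print(name, f0, f1, f0 + f1)
```

Because $N_0$ is chosen so that $P(i) \approx 0$ for $i \ge N_0$, the two fractions sum to slightly less than one; the shortfall is the tail probability neglected in Eqs. (32) and (33).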
IV. CODING PARAMETERS
Minimization of the expected number of bits per residue, EBPR in Eq. (22), is difficult because of the nonlinearity of the EBPR function. We cannot usually obtain optimal coding parameters by minimizing EBPR during the encoding process in data compression, since massive computations would be required and an unacceptable time delay in coding each data frame would result. In order to obtain efficient coding, we assume that the residue sequence is very long and that $N_0$ and $\sigma_{ir} = 2^{\alpha}$ are known. We can tabulate the optimal ($\lambda$, $N_1$) pairs that minimize EBPR for reasonable values of $N_0$ and $\alpha$, as discussed in the preceding section.
For the Gaussian, Laplacian, and Gamma distributions, the corresponding probabilities of level-1 and level-0 sequences in Eq. (7) can be evaluated using Eqs. (24), (27), and (29), respectively. Since we do not have closed-form equations for $E[B]$ in Eq. (18) and $E[n]$ in Eq. (21), we make a practical choice ($J$) for "$\infty$" in Eqs. (20) and (21). Therefore, using a modified version of Eq. (14), we restrict the maximum value of $f_0$ such that
$$p(0, 1, 0) + \sum_{n=2}^{J} \sum_{v=0}^{n-2} p(0, n, v) \ge (1 - \delta)\, f_0. \tag{34}$$
Likewise, we restrict the maximum value of $f_1$ such that
$$\sum_{n=\mu}^{J} p(1, n) \ge (1 - \delta)\, f_1, \tag{35}$$
where $0 < \delta \ll 1$, and $J$ is the practical maximum sequence length. In other words, since the $p(\cdot)$ values are functions of $f_0$, as seen in Eq. (13a), $f_0$ must have at least a certain minimum value (which depends on $J$) in order for the conditions (34) and (35) to hold. Furthermore, from Eq. (14), at least $100 \times (1 - \delta)\%$ of the level-0 and level-1 sequences have length $n \le J$. To restate the constraint on $f_0$ in a more useful way, we define
$$y(f_0) = (1 - f_0)^{\mu} + \sum_{n=2}^{J} \sum_{v=0}^{n-2} f_0 (1 - f_0)^{\mu}\, q(n-2, v, \mu). \tag{35a}$$
Then the condition in Eq. (34) is equivalent to
$$y(f_0) \ge 1 - \delta. \tag{36}$$
[FIG. 7. Plots of the function $y(f_0)$ versus $f_0$.]
As seen in Fig. 7, this result constrains the maximum value of $f_0$. That is, $y(f_0)$ is monotonically nonincreasing; therefore $f_0$ must be less than a maximum value determined by $\delta$ and $\mu$. Likewise, Eq. (35) constrains the minimum value of $f_0$. Using the maximum sequence length, $J$, we modify Eq. (10) and obtain
$$\sum_{n=\mu}^{J} p(1, n) = \sum_{n=\mu}^{J} f_1^{\,n-\mu+1} f_0 = f_1 \left(1 - f_1^{\,J-\mu+1}\right). \tag{37}$$
Combining Eqs. (35) and (37), we have the constraint on the minimum value of $f_0$ as follows:
$$f_0 \ge 1 - \delta^{1/(J-\mu+1)}. \tag{38}$$
In order to produce examples of optimal coding parameter pairs ($\lambda$, $N_1$), we use $J = 512$ and $\delta = 0.01$, so $f_0$ and $f_1$ are restricted such that 99% of level-0 and level-1 sequences have length $n \le 512$. The resulting restrictions on $f_0$ are tabulated in Table 1. A reasonable range of $\mu$, from 1 to 9, is used to generate the table. The conditions in Eqs. (34) and (35), which are also equivalent to Eqs. (36) and (38) correspondingly, are used to obtain the values of $f_{0\min}$ and $f_{0\max}$ in the table.

TABLE 1
Constraints on the Probabilities f0min and f0max versus Parameter μ (Using J = 512, δ = 0.01)
μ    f0min    f0max
1    0.009    0.99
2    0.009    0.90
3    0.009    0.77
4    0.009    0.66
5    0.009    0.56
6    0.009    0.49
7    0.009    0.43
8    0.009    0.37
9    0.009    0.33

Having established the limit $J$, we can use Eqs. (18) through (21) to determine the optimal coding parameters that minimize EBPR in Eq. (22). The optimal coding parameters ($N_1$, $\lambda$) for the three sequence amplitude distributions are summarized in Tables 2a-2c.

TABLE 2a
Optimal Bi-level Coding Parameters N1 and λ for Gaussian Distribution
α        N0       N1        λ    EBPR
0        4        2         4    2.5888
1        5        3         5    3.4182
2        6        4         5    4.3420
3        7        5         6    5.3043
4        8        6         6    6.2866
5        9        7         6    7.2782
6        10       8         6    8.2741
7        11       9         6    9.2721
α ≥ 8    α + 4    N0 - 2    6    N0 - 1.7289

TABLE 2b
Optimal Bi-level Coding Parameters N1 and λ for Laplacian Distribution
α        N0       N1        λ    EBPR
0        4        2         4    2.5426
1        5        3         5    3.4350
2        6        4         5    4.3795
3        7        5         5    5.3563
4        8        6         5    6.3457
5        9        7         5    7.3406
6        10       8         5    8.3381
7        11       9         5    9.3369
α ≥ 8    α + 4    N0 - 2    5    N0 - 1.6637

TABLE 2c
Optimal Bi-level Coding Parameters N1 and λ for Gamma Distribution
α        N0       N1        λ    EBPR
0        5        2         4    2.5585
1        6        2         2    3.3535
2        7        3         2    4.1790
3        8        4         4    5.0953
4        9        5         4    6.0469
5        10       6         4    7.0239
6        11       7         4    8.0127
7        12       8         4    9.0072
α ≥ 8    α + 5    N0 - 4    4    N0 - 2.9955

Since the keyword requires 30 bits for compressing a sequence, the EBPR in Eq. (22) is modified as
$$\mathrm{EBPR} = \frac{30}{K} + \frac{E[B]}{E[n]}. \tag{39}$$
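The truncated sums of this section lend themselves to direct numerical evaluation. The Python sketch below is one possible way to do it, not the procedure used to generate the tables: all function names are hypothetical, and because $B(0, n, v)$ does not depend on $v$, the inner sum over $v$ of $q(n-2, v, \mu)$ is replaced by a small run-length dynamic program (`run_free_prob`) instead of the $S$ recursion. That substitution is a choice made here for numerical convenience.

```python
import math

def run_free_prob(k_max, f1, mu):
    """Q[k] = sum over v of q(k, v, mu): probability that k independent residues
    contain no run of mu or more level-1 residues."""
    f0 = 1.0 - f1
    state = [1.0] + [0.0] * (mu - 1)   # state[r]: trailing level-1 run has length r
    q_sum = [1.0]
    for _ in range(k_max):
        total = sum(state)
        state = [f0 * total] + [f1 * s for s in state[:-1]]
        q_sum.append(sum(state))
    return q_sum

def y(f0, mu, J=512):
    """Eq. (35a), with the sum over v collapsed into Q[n - 2]."""
    f1 = 1.0 - f0
    Q = run_free_prob(J, f1, mu)
    return f1 ** mu * (1.0 + f0 * sum(Q[n - 2] for n in range(2, J + 1)))

def f0_limits(mu, J=512, delta=0.01):
    """(f0min, f0max) from Eqs. (38) and (36); f0max is located by bisection,
    relying on y being monotonically nonincreasing."""
    f0_min = 1.0 - delta ** (1.0 / (J - mu + 1))
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if y(mid, mu, J) >= 1.0 - delta:
            lo = mid
        else:
            hi = mid
    return f0_min, lo

def ebpr(f0, n0, n1, lam, J=512, K=None):
    """Eq. (22), or Eq. (39) when a frame size K is given, with sums truncated at J.
    Requires 0 < f0 < 1 and n1 < n0."""
    f1, L = 1.0 - f0, 2 ** lam - 1
    mu = 1 + lam // (n0 - n1)                      # Eq. (6)
    Q = run_free_prob(J, f1, mu)
    # Single-residue level-0 term, Eq. (18a), and level-1 closed forms, Eqs. (19) and (21).
    e_bits = (lam + n0) * f0 * f1 ** mu
    e_len = f0 * f1 ** mu
    e_bits += lam * f1 / (1.0 - f1 ** L) + n1 * f1 * (mu + f1 / f0)
    e_len += f1 * (mu + f1 / f0)
    for n in range(2, J + 1):                      # Eqs. (20) and (21)
        p = f0 * f0 * f1 ** mu * Q[n - 2]
        k = 1 + (n - 1) // L                       # Eq. (17)
        e_bits += (k * lam + n * n0) * p
        e_len += n * p
    value = e_bits / e_len
    return value if K is None else 30.0 / K + value

if __name__ == "__main__":
    # Laplacian, large alpha, N1 = N0 - 2: f0 is approximately Pr{|r| >= 2 sigma}.
    f0 = math.exp(-2.0 * math.sqrt(2.0))
    print(ebpr(f0, n0=12, n1=10, lam=5))
    print([f0_limits(mu) for mu in (1, 2, 3)])
```

Under these assumptions the Laplacian case with $N_0 = 12$, $N_1 = 10$, $\lambda = 5$ should land close to the tabulated $N_0 - 1.6637$ of Table 2b, and `f0_limits` should give limits close to the entries of Table 1.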
V. PERFORMANCE
As described in Ref. [5], the entropy of the residue sequence implies the lower bound of EBPR, the expected number of bits per residue. In order to investigate theoretically the performance of the proposed coding scheme, we define the discrete-integer entropy as
$$H(ir) = \sum_{ir=-\infty}^{\infty} \left[\int_{ir-1/2}^{ir+1/2} p(r)\,dr\right] \times \log_2 \left[\frac{1}{\int_{ir-1/2}^{ir+1/2} p(r)\,dr}\right], \tag{40}$$
where $p(r)$ is the probability density function of a continuous residue, $r$, and $ir$ is the corresponding residue integer level. As discussed previously, $N_0$ is set equal to $(\alpha + 4)$ for the Gaussian and Laplacian sequences, and $(\alpha + 5)$ for the Gamma sequence. Three typical white sequences with 4000 data samples each and $\alpha = 3$ for the Gaussian, Laplacian, and Gamma distributions are shown in Figs. 8a, 9a, and 10a. The amplitude distributions of these sequences are shown in Figs. 8b, 9b, and 10b, in which the solid lines are the theoretical values shown in Eq. (23a) obtained from their continuous probability density functions, using the appropriate $\alpha$ values. The discrete-integer entropy $H(ir)$, the optimized EBPR in Eq. (39) with an assumed frame size of $K = 4000$, and $N_0$ obtained via $\alpha$ as described previously are plotted in Figs. 8c, 9c, and 10c. The experimental $H(ir)$, evaluated from the generated residue sequence; EBPR, the expected bits per residue from encoding the generated residue sequence; and $N_0$ versus $\alpha$ are also shown in Figs. 8d, 9d, and 10d. The results illustrate that the coding scheme for encoding the Gaussian sequence performs at the same level as that for the Laplacian sequence, and that more bits per residue can be compressed for the Gamma sequence. Figures 8c-8d, 9c-9d, and 10c-10d also demonstrate that the values of the expected bits per residue obtained using bi-level sequence coding are close to their optimal values.
[FIG. 8. (a) Generated Gaussian sequence with $N_0 = \alpha + 4$, where $\alpha = 3$. (b) Amplitude distribution of generated Gaussian sequence with $N_0 = \alpha + 4$, where $\alpha = 3$. (c) Performances of bi-level coding for encoding Gaussian sequences. (d) Performances of bi-level coding for encoding generated Gaussian sequences. Axes: sample number in (a), integer level $ir$ in (b), $\alpha$ in (c) and (d).]
[FIG. 9. (a) Generated Laplacian sequence with $N_0 = \alpha + 4$, where $\alpha = 3$. (b) Amplitude distribution of generated Laplacian sequence with $N_0 = \alpha + 4$, where $\alpha = 3$. (c) Performances of bi-level coding for encoding Laplacian sequences. (d) Performances of bi-level coding for encoding generated Laplacian sequences. Axes: sample number in (a), integer level $ir$ in (b), $\alpha$ in (c) and (d).]
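The discrete-integer entropy of Eq. (40) can be evaluated numerically for comparison with the tabulated EBPR values. The sketch below is illustrative only: the function names are hypothetical, the infinite sum is truncated at a multiple of the standard deviation (the `span` parameter, an assumption made here), and only the Laplacian and Gaussian cases are covered.

```python
import math

def laplace_cdf(x, sigma):
    """CDF of the zero-mean Laplacian density of Eq. (28) with standard deviation sigma."""
    b = sigma / math.sqrt(2.0)
    return 0.5 * math.exp(x / b) if x < 0 else 1.0 - 0.5 * math.exp(-x / b)

def gauss_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def discrete_entropy(cdf, sigma, span=40):
    """Eq. (40): entropy in bits of the residue rounded to the nearest integer.
    The sum is truncated at +/- span * sigma."""
    m = int(span * sigma)
    h = 0.0
    for ir in range(-m, m + 1):
        p = cdf(ir + 0.5, sigma) - cdf(ir - 0.5, sigma)   # P(ir), Eq. (23a)
        if p > 0.0:
            h -= p * math.log2(p)
    return h

if __name__ == "__main__":
    alpha = 3
    print("Laplacian H(ir):", discrete_entropy(laplace_cdf, 2.0 ** alpha))
    print("Gaussian  H(ir):", discrete_entropy(gauss_cdf, 2.0 ** alpha))
```

For $\alpha = 3$ the results can be set against the EBPR entries for $\alpha = 3$ in Tables 2a and 2b (5.3043 and 5.3563) to see how far the bi-level code sits above the entropy bound.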
VI. CONCLUSIONS
A type of run-length coding scheme called bi-level coding has been developed and investigated, and its simplicity of implementation has been demonstrated. The performance, in terms of expected bits per residue, of the bi-level technique in coding Gaussian, Laplacian, and Gamma sequences appears nearly optimal when compared to the entropy of the residue sequence. The coding scheme is robust and can be used without any restriction on the range of data values.
[FIG. 10. (a) Generated Gamma sequence with $N_0 = \alpha + 5$, where $\alpha = 3$. (b) Amplitude distribution of generated Gamma sequence with $N_0 = \alpha + 5$, where $\alpha = 3$. (c) Performances of bi-level coding for encoding Gamma sequences. (d) Performances of bi-level coding for encoding generated Gamma sequences. Axes: sample number in (a), integer level $ir$ in (b), $\alpha$ in (c) and (d).]
APPENDIX
Recall that the level-0 sequence under discussion has a total of $k$ level-0 residues (0's) and level-1 residues (1's), $v$ of which are 1's, and has fewer than $\mu$ successive 1's, which means $\mu$ is greater than the largest "run" of 1's. Assume that $k > v \ge \mu$, and consider the location ($i$) of the first 0 in the sequence, which must be constrained by $i \le \mu$. If $i = 1$, the sequence begins with 0 and there are $v$ 1's in the remaining sequence of length $k - 1$. The number of such sequences of length $k - 1$ with fewer than $\mu$ successive 1's is $S(k-1, v, \mu)$. If $i = 2$, the sequence begins with (1, 0) and there are $v - 1$ 1's in the remaining sequence of length $k - 2$, and so on, until $i = \mu$; thus, we have $S(k, v, \mu)$ given by
$$S(k, v, \mu) = \sum_{i=1}^{\mu} S(k-i, v-i+1, \mu). \tag{A1}$$
To obtain the recursive form, we rewrite Equation (A1) as
$$S(k, v, \mu) = S(k-1, v, \mu) + S(k-2, v-1, \mu) + \sum_{i=3}^{\mu} S(k-i, v-i+1, \mu). \tag{A2}$$
Applying Equation (A1) to $S(k-1, v-1, \mu)$ gives
$$\begin{aligned} S(k-1, v-1, \mu) &= \sum_{i=1}^{\mu} S(k-i-1, v-i, \mu) \\ &= S(k-2, v-1, \mu) + \sum_{i=2}^{\mu-1} S(k-i-1, v-i, \mu) + S(k-\mu-1, v-\mu, \mu) \\ &= S(k-2, v-1, \mu) + \sum_{i=3}^{\mu} S(k-i, v-i+1, \mu) + S(k-\mu-1, v-\mu, \mu). \end{aligned} \tag{A3}$$
Equation (A3) becomes
$$S(k-2, v-1, \mu) + \sum_{i=3}^{\mu} S(k-i, v-i+1, \mu) = S(k-1, v-1, \mu) - S(k-\mu-1, v-\mu, \mu). \tag{A4}$$
By substituting Equation (A4) into (A2), we obtain the result in Eq. (11):
$$S(k, v, \mu) = S(k-1, v, \mu) + S(k-1, v-1, \mu) - S(k-\mu-1, v-\mu, \mu). \tag{A5}$$
The startup values are given by
$$S(k, v, \mu) = \binom{k}{v}, \quad v < \mu, \tag{A5a}$$
$$S(\mu, \mu, \mu) = 0, \tag{A5b}$$
where $\binom{k}{v}$ is the binomial coefficient. Equations (A5a) and (A5b) follow because all sequences with $v < \mu$ (fewer than $\mu$ 1's) do not have $\mu$ consecutive 1's, and the sequence with length $k = \mu$ with $\mu$ 1's has $\mu$ consecutive 1's.
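The recursion (A5) with its startup values can be checked against direct enumeration for small arguments. The Python sketch below is illustrative: the function names are hypothetical, and the conventions that $S$ is zero for negative arguments or $v > k$, and that $S(0, 0, \mu) = 1$, are assumptions made here so that the recursion is defined for every call it generates.

```python
from functools import lru_cache
from itertools import product
from math import comb

@lru_cache(maxsize=None)
def S(k, v, mu):
    """Number of length-k binary sequences with v ones and no run of mu or more ones."""
    if k < 0 or v < 0 or v > k:
        return 0                   # convention assumed here for out-of-range arguments
    if v < mu:                     # (A5a): too few ones to form a run of length mu
        return comb(k, v)
    if k == mu:                    # (A5b): here v = k = mu, the all-ones sequence
        return 0
    return S(k - 1, v, mu) + S(k - 1, v - 1, mu) - S(k - mu - 1, v - mu, mu)   # (A5)

def S_brute(k, v, mu):
    """Direct enumeration, used only to cross-check the recursion."""
    count = 0
    for seq in product("01", repeat=k):
        s = "".join(seq)
        if s.count("1") == v and "1" * mu not in s:
            count += 1
    return count

if __name__ == "__main__":
    print(S(7, 5, 3), S_brute(7, 5, 3))   # the worked example of Section II.B
```

The example call corresponds to the three sequences 1011011, 1101011, and 1101101 listed in Section II.B.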
REFERENCES
1. Paez, M. D., and Glisson, T. H. Minimum mean-squared-error quantization in speech PCM and DPCM systems. IEEE Trans. Commun. COM-20 (Apr. 1972), 225-230.
2. Jayant, N. S., and Noll, P. Digital Coding of Waveforms. Prentice-Hall, New York, 1984, Chap. 10.
3. Spanias, A. S., Jonsson, S. B., and Stearns, S. D. Transform methods for seismic data compression. IEEE Trans. Geosci. Remote Sensing 29, 3 (May 1991), 407-416.
4. Stearns, S. D. Predictive data compression with exact recovery. Sandia National Labs., Albuquerque, NM, Sandia Rep. SAND 90-2583, UC-403, December 1990.
5. Ingels, F. M. Information and Coding Theory. International Textbook Company, Scranton, PA, 1971.
6. Tan, L. Theory and techniques for lossless waveform data compression. Ph.D. dissertation, Department of Electrical and Computer Engineering, University of New Mexico, May 1992.
SAMUEL D. STEARNS is a distinguished member of the Technical Staff at Sandia National Laboratories, Albuquerque, New Mexico. His principal research areas are digital signal processing and adaptive signal processing. Dr. Stearns received the B.S.E.E. degree from Stanford University in 1953, and the M.S.E.E. and D.Sc. degrees from the University of New Mexico in 1957 and 1962. He joined the Dikewood Corp., a consulting firm in Albuquerque, New Mexico, in 1960, where he eventually became Director of Research and Principal Scientist. Since 1971, he has been a technical staff member at Sandia National Laboratories and an adjunct professor at the University of New Mexico. His work at Sandia has been in the development of signal processing techniques with a variety of applications in field test operations, telemetry, safeguards, intrusion detection, seismic studies for treaty verification, development and production testing, etc. He was given the Distinguished Member of Technical Staff award in 1982. Dr. Stearns is a fellow of the IEEE. His fellow citation reads, "For contributions to education in digital and adaptive signal processing systems and algorithms." His IEEE activities include serving on the governing board of the Signal Processing Society, coediting a special joint issue of the ASSP and CAS transactions (July 1987), and chairing the 1986 Asilomar Conference on Signals, Systems, and Computers. Dr. Stearns has published a number of papers in signal processing, adaptive signal processing, and related areas. He is a coauthor of the following Prentice-Hall texts: Digital Signal Analysis, 2nd ed. (1990), Signal Processing Algorithms (1987), and Adaptive Signal Processing (1985). A new text, Signal Processing Algorithms in Fortran and C, is now in production.
LIZHE TAN was born on April 30, 1963, in China. He received his Ph.D. in electrical engineering from the University of New Mexico (1992); an M.S. in civil engineering from the University of New Mexico (1987); and a B.S. in civil engineering from Southeast University, Nanjing, China (1984). From 1988 to 1992 he was a teaching assistant and research assistant in the Department of Electrical and Computer Engineering at the University of New Mexico. His areas of interest include theory and techniques of lossless data compression, algorithms for seismic and speech signal processing, and adaptive signal processing. He is a member of Eta Kappa Nu, Tau Beta Pi, and Sigma Xi.
NEERAJ MAGOTRA was born on December 5, 1958. He received a Ph.D. in electrical engineering from the University of New Mexico, Albuquerque, New Mexico (1986); an M.S. in electrical engineering from Kansas State University, Manhattan, Kansas (1982); and a B.Tech. in electrical engineering from the Indian Institute of Technology, Bombay, India (1980). From 1987 to 1990 he had a joint appointment with Sandia National Laboratories and the Department of Electrical and Computer Engineering (ECE) at the University of New Mexico, and he is currently full time with ECE. Dr. Magotra has been involved in signal/image processing research in the areas of seismic, speech, and radar (synthetic array radar imaging) signal processing for the past 8 years. The research involved the theoretical design of algorithms as well as their real-time implementation on DSP chips. He has served as past associate editor of the IEEE Transactions on Signal Processing and served on the organizing committee of ICASSP 1990. He is a member of Sigma Xi, Phi Kappa Phi, Tau Beta Pi, the Institute of Electrical and Electronics Engineers (IEEE), and the Seismological Society of America. He has authored/coauthored 46 technical articles, including journal papers, conference papers, and technical reports.