Chapter 17
Writing on constrained memories

We start this chapter with a seemingly strange question: what are the worst coverings? More precisely, what is the worst way of shortening a linear code, as far as covering radius is concerned? In Section 17.1, we apply the results of this study to efficient overwriting on write-once memories (WOMs), describe the so-called coset-encoding method and give examples using the Golay and Hamming codes. In Sections 17.2 and 17.3, we define a general error model for these memories and design, in Section 17.4, single-error-correcting WOM-codes based on 2- and 3-error-correcting BCH codes. A nonlinear extension is presented in Section 17.5.
17.1 Worst case coverings and WOMs
We consider the following question: how may a "write-once memory" (WOM) be reused? That is, we have a storage medium, called WOM or n-WOM, consisting of n memory positions, or wits, each initially at "0". At any step, a wit can be irreversibly overwritten with a "1" (e.g., by a laser beam in some digital optical disks). We describe a method, called coset encoding, enabling many rewritings on a WOM.

The coset-encoding writing rule: An [n, k, d]R code C is used to encode n − k bits on a WOM as follows: every message s ∈ F^{n−k} is one-to-one associated to a coset of C in F^n, say x + C, having as its syndrome s. That is, s = H x^T = s(x), where H is a generator matrix for C^⊥, the [n, n − k, d^⊥] dual code of C. Encoding, or "writing", involves finding a minimum weight vector y with syndrome s, and
writing it on the WOM. This requires a complete decoding algorithm in the sense of error-correcting codes. Decoding, or "reading", is simply a syndrome computation calculating s from y. We present here an algebraic treatment, giving precise estimates of the "worst-case" behaviour of these WOM-codes.

For I ⊆ {1, 2, ..., n}, we write C(I) for the shortening of C on I. That is,

C(I) = {c ∈ C : c_i = 0 for all i ∈ I}.
Upon dropping these coordinates, we get a code of length n − |I|, dimension and minimum distance at least k − |I| and d, respectively, which we also call C(I). If H(C) = (h_1, h_2, ..., h_n) is a parity check matrix for C, where the h_i are column vectors of size n − k, then the set of columns {h_j : j ∈ {1, 2, ..., n} \ I} constitutes a parity check matrix for C(I). Recall that a linear [n, k, d] code C is maximal if it is not contained in a larger code with the same minimum distance. Clearly, if k = a[n, d], then C is maximal. We are now ready for a few lemmas.

Lemma 17.1.1 Writing r bits using a linear code C and the coset-encoding scheme is possible if and only if C^⊥ has dimension at least r. □
Lemma 17.1.2 (Supercode lemma) If C1 is properly included in C2, then R(C1) ≥ d(C2).

Proof. Pick any x ∈ C2 \ C1. Then d(x, C1) ≥ d(C2). See also Lemma 8.2.1. □
Lemma 17.1.3 If C is maximal, then R(C) ≤ d(C) − 1.

Proof. If R(C) ≥ d(C), take a vector x such that d(x, C) ≥ d(C). Then C ∪ {x} generates a linear code with minimum distance d(C), a contradiction. This result is also a consequence of Theorem 17.2.2. □

Actually, the property R(C) ≤ d(C) − 1 can be used for defining a maximal code, as we did in Section 2.1.

Lemma 17.1.4 The largest number of wits to be written to coset-encode n − k bits with an [n, k]R code C is R. □
Lemma 17.1.5 If an [n, k] code C is shortened on a set I of positions not containing the support of any nonzero codeword in its dual, then C(I) has dimension exactly k − |I|, thus enabling the writing of n − k bits.

Proof. For the dimension of C(I), notice that its dual must still have dimension n − k, since no nontrivial linear combination of rows of the matrix [h_j : j ∈ {1, 2, ..., n} \ I] can be 0. For the writing capacity, use Lemma 17.1.1. □
Lemma 17.1.6 Let C be an [n, k, d]R code. Then, any [n − i, k − i, d] shortened maximal code C' satisfies R(C') = d − 1.

Proof. By Lemma 17.1.3, R(C') ≤ d − 1. We prove the opposite inequality for i = 1; let us assume without loss of generality that the shortened coordinate is the first one, i.e.,

C' = {c' : (0|c') ∈ C}.

Let C1 = {0} ⊕ C'; C1 is an [n, k − 1, d] proper subcode of C. Let x = (1|y) be a codeword of C. Then d(x, C1) ≥ d, hence d(y, C') ≥ d − 1. The general case follows by successive shortenings. □

The rewriting rule: Let us define a rule for rewriting n − k bits on the WOM. After the first writing, y1 is stored in the WOM and represents s1, where

s1 = Σ_{i ∈ supp(y1)} h_i.

Let I1 = supp(y1). Encoding s2 at the second generation amounts to writing a y2 representing s2 + s1, not using wits already written, that is,

s2 + s1 = Σ_{i ∈ supp(y2)} h_i,

with supp(y2) ∩ I1 = ∅. This is clearly equivalent to encoding s2 + s1 with the shortened code C(I1), yielding y2 in F^n. Again, y2 is chosen of minimum weight, and the state of the WOM, y1 + y2, represents H(y1 + y2)^T = s2.

Let us illustrate the process with two examples, the first one having already been discussed in Section 1.2. Denote by {n, m, g} a WOM-code allowing g successive writings of m bits on an n-WOM. We define the efficiency ρ of a WOM-code as the number of bits written per wit in the worst possible case, i.e., ρ = mg/n.
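In code, one writing generation looks as follows. This is a minimal brute-force sketch (the function names are ours, and the exhaustive minimum-weight search is only meant for toy lengths):

```python
from itertools import combinations

def syndrome(H, y):
    """s(y) = H y^T over F_2; H is a tuple of rows, y a 0/1 tuple."""
    return tuple(sum(h[i] & y[i] for i in range(len(y))) % 2 for h in H)

def write(H, state, message):
    """Overwrite unused wits so that the new state has syndrome `message`.
    The added wits must carry message + s(state), i.e. we encode with the
    code shortened on the already-used positions (the rewriting rule)."""
    needed = tuple((a + b) % 2 for a, b in zip(message, syndrome(H, state)))
    free = [i for i in range(len(state)) if state[i] == 0]
    for w in range(len(free) + 1):            # minimum weight first
        for pos in combinations(free, w):
            y = tuple(1 if i in pos else 0 for i in range(len(state)))
            if syndrome(H, y) == needed:
                return tuple(a | b for a, b in zip(state, y))
    return None                               # message no longer writable

def read(H, state):
    return syndrome(H, state)
```

With a parity check matrix of a repetition code, two successive calls of write reproduce the two generations of a {3, 2, 2} WOM-code.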
Example 17.1.7 Only 3 wits are needed to write 2 bits twice. That is, there exists a {3, 2, 2} WOM-code. Here ρ = 4/3. This was proved in Section 1.2 (Example 1.2.6). Let us sketch the ideas once again with the new notations. Take for C the [3, 1, 3]1 repetition code. By Lemma 17.1.4, the first writing of 2 bits with C uses at most 1 wit, i.e., w(y1) ≤ 1. Thus I1 = supp(y1) contains no support of a nonzero codeword in the [3, 2, 2] dual code of C, and, by Lemma 17.1.5, C(I1) allows a second writing of 2 bits. Therefore, taking

H = ( 1 0 1 )
    ( 1 1 0 )

and, for instance, s1 = (1 1)^T, s1 is encoded by y1 = (1 0 0): H y1^T = s1. Now,

H(C(I1)) = ( 0 1 )
           ( 1 0 ),

denoted H'. To encode, for example, s2 = (1 0)^T, one encodes s1 + s2 = (0 1)^T with H', getting y2' = (1 0) and y2 = (0 1 0). Finally, y1 + y2 = (1 1 0) is the state of the WOM, representing H(y1 + y2)^T = s2. □
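The example can be checked exhaustively: whatever pair of 2-bit messages is requested, the two minimum-weight writings fit on the 3 wits. A small self-contained sketch (any parity check matrix of the [3, 1, 3] repetition code works):

```python
from itertools import combinations, product

H = ((1, 0, 1), (1, 1, 0))  # a parity check matrix of the [3,1,3] repetition code

def syn(y):
    return tuple(sum(h[i] & y[i] for i in range(3)) % 2 for h in H)

def min_weight_write(state, target):
    """Minimum-weight y on unused wits with H y^T = target."""
    free = [i for i in range(3) if state[i] == 0]
    for w in range(len(free) + 1):
        for pos in combinations(free, w):
            y = tuple(1 if i in pos else 0 for i in range(3))
            if syn(y) == target:
                return tuple(a | b for a, b in zip(state, y))
    return None

ok = True
for s1, s2 in product(product((0, 1), repeat=2), repeat=2):
    e1 = min_weight_write((0, 0, 0), s1)            # first generation
    t = tuple((a + b) % 2 for a, b in zip(s1, s2))  # encode s1 + s2
    e2 = min_weight_write(e1, t)                    # second generation
    ok = ok and e2 is not None and syn(e2) == s2
print(ok)  # True: every pair of 2-bit messages can be written
```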
Example 17.1.8 (The Golay {23, 11, 3} WOM-code.) We recall a few facts about the [23, 12, 7]3 Golay code C (cf. Section 11.1). The set of the weights of its codewords is W = {0, 7, 8, 11, 12, 15, 16, 23}. It contains its [23, 11, 8] dual code and possesses a four-fold transitive automorphism group M23. Thus, arbitrarily assigning binary values to at most 4 positions among the 23, there exists in C a codeword of any weight in W \ {0, 23} taking the chosen values on these positions. This property will be referred to as transitivity.

Now let us write on the WOM. The first generation uses at most 3 wits (by Lemma 17.1.4). By writing additional wits just before the second writing we can always assume that exactly 3 wits have been written; we can further assume, by transitivity, that they lie in the first 3 positions. Thus we are left with a [20, 9, 7] shortened Golay code which is maximal. Hence, by Lemmas 17.1.6, 17.1.4 and 17.1.1, we get

Theorem 17.1.9 Any three times shortened Golay code allows writing 11 bits, using at most 6 wits. □
Hence, for the first 2 generations, we need at most 9 wits. We now prove that the first 2 writings do not contain the support of any nonzero codeword of C^⊥, and thus a third writing is possible, by Lemma 17.1.5. This is guaranteed for |I| ≤ d^⊥ − 1 = 7, hence only the cases |I| = 8 or 9 must be considered.
Case |I| = 9. For this case to occur, the only possibility is to write 3 wits (y1) at the first generation and 6 wits (y2) at the second. By transitivity, one can assume that I = I1 ∪ I2, with I1 = {1, 2, 3} = supp(y1) and 4 ∈ I2 = supp(y2). Now suppose that there exists c^⊥ ∈ C^⊥ \ {0} such that supp(c^⊥) ⊆ I. The only possible weights in C^⊥ are 0, 8, 12, 16, so c^⊥ has weight 8 and I = supp(c^⊥) ∪ {j} for some position j. From C^⊥ ⊂ C it follows that c^⊥ ∈ C. Defining e_j by supp(e_j) = {j}, it follows that, with y = y1 + y2, s(y) = s(c^⊥ + e_j) = s(e_j). Now, we should distinguish between two possibilities for j:
1) If j ∈ I1, one can assume j = 1 and then find c ∈ C, w(c) = 7, c = 011... .
2) Similarly, if j ∈ I2, assume j = 4 and find c ∈ C, w(c) = 7, c = 1110... .
The previous expressions of c are valid by transitivity. Now the vector x = c + e_j has the same syndrome as y, is writable after the first generation (because it starts with 3 ones), and w(x) = w(y) − 1, contradicting the minimum weight writing hypothesis.

Case |I| = 8. It is simpler and can be dealt with along the same lines; we skip it. This completes the proof of the following.
Theorem 17.1.10 By coset encoding with the Golay code, 3 writings of 11 bits are possible on a 23-WOM. □

The efficiency of this WOM-code is ρ = 33/23 ≈ 1.435. The previous analysis can be straightforwardly extended to other codes, yielding

Theorem 17.1.11 Let C be an [n, k, d]R maximal code. If for some i, i ≤ d^⊥ − 1, its shortened versions of lengths at least n − i remain maximal and of minimum distance d, then at least g writings of n − k bits are guaranteed, with g = 2 + ⌊(i − R)/(d − 1)⌋. □

Corollary 17.1.12 A Hamming code of length 2^r − 1 yields a {2^r − 1, r, 2^{r−2} + 1} WOM-code.

Proof. Use the known fact that Hamming codes of length 2^r − 1 remain maximal when shortened at most 2^{r−1} − 1 times. □

The efficiency of this WOM-code is asymptotically 0.25 r. In fact, g can be increased to

g = 2^{r−2} + 2^{r−4} + 1     (17.1.13)

(for r > 4), by use of geometric arguments (see Notes), and the efficiency is then asymptotically improved to 5r/16 = 0.3125 r.
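For r = 3 the corollary promises a {7, 3, 3} WOM-code. A small game search (a sketch, not the book's proof) confirms the adaptive worst-case guarantee on the [7, 4, 3] Hamming code: whatever 3-bit messages an adversary requests, three successive writings always succeed.

```python
from itertools import combinations

# Columns of a parity check matrix of the [7,4,3] Hamming code:
# all nonzero vectors of F_2^3, encoded as the integers 1..7.
cols = list(range(1, 8))

def writes(used, target):
    """All ways to add unused positions whose columns XOR to target."""
    free = [i for i in range(7) if i not in used]
    for w in range(len(free) + 1):
        for pos in combinations(free, w):
            x = 0
            for i in pos:
                x ^= cols[i]
            if x == target:
                yield used | set(pos)

def wins(used, current, remaining):
    """Worst case: every next message must admit some write that keeps
    the remaining writings feasible (current = syndrome of the state)."""
    if remaining == 0:
        return True
    return all(
        any(wins(u, m, remaining - 1) for u in writes(used, m ^ current))
        for m in range(8)
    )

print(wins(set(), 0, 3))  # True: three writings of 3 bits always fit on 7 wits
```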
17.2 The error case
We adapt the previous methods in order to answer the following question: how can error-correcting WOM-codes be constructed? Let us describe the general principles underlying the encoding schemes we use (the following is a reformulation of the "coset-encoding" technique of the previous section in a more general setting). Let Π denote the set of wits of the WOM, identified with {1, 2, ..., n}. The set of messages we wish to write is identified with a subset M of a finite abelian group G (in practice G = F^m). We also index the wits of Π by a subset P of G with a one-to-one mapping σ of Π onto P. Let ε be the function Π → {0, 1} describing the state of the WOM, i.e., ε(π) = 0 when π is unused, and ε(π) = 1 when π is used, π ranging over Π; the function ε is identified with its image ε(Π) ∈ F^n.

Reading a message: The last message c ∈ M written on the WOM is read by computing c(ε) = Σ_{π : ε(π)=1} σ(π).

Writing a message: Given a state ε of the WOM, writing a message c ∈ M is done by finding a set W ⊆ Π of unused wits such that

Σ_{π ∈ W} σ(π) + Σ_{π : ε(π)=1} σ(π) = c.     (17.2.1)
Let C be a given [i, k, d] code. We denote by a[i, d, C] the maximal dimension of a linear code of length i and minimum distance d containing C. We set a[i, d] = a[i, d, {0}] (for i < d, one has a[i, d] = 0). Recall that an [i, k, d] code C for which k = a[i, d, C] is called maximal, see Section 17.1. Clearly,

k ≤ a[i, d, C] ≤ a[i, d].

Theorem 17.2.2 Any [n, k, d]R code C satisfies

a[R, d] + k ≤ a[n, d, C].     (17.2.3)

Proof. Let C be an [n, k, d]R code, and z a deep hole with respect to 0^n, i.e., such that d(z, C) = w(z) = R, with support, say, [1, R]. Define B, an [R, a[R, d], d] code, and consider a nonzero c' = b 0^{n−R} ∈ C' := B ⊕ 0^{n−R}. Since supp(c') ⊆ supp(z), d(c', C) = w(c') clearly holds (indeed, for every c ∈ C, d(c', c) ≥ d(z, c) − d(z, c') ≥ R − (R − w(c')) = w(c')). But w(c') = w(b) ≥ d, thus C + C' is an [n, k + a[R, d], d] code containing C. □
This theorem is used in Section 17.4 for providing upper bounds on R for shortened codes.
17.3 A model for correcting single errors
Suppose the group we use to construct our WOM-code is G = F^m; to every state ε of the WOM, we associate the set S(ε) ⊆ P corresponding to the unused wits, i.e., S(ε) = {σ(π) : ε(π) = 0}. So, in the initial state, S(0) = P. We write simply S when no confusion can arise. Let H(S) be any m × |S| matrix the columns of which are the elements of S (H(S) is defined modulo permutations of its columns). Then H(S) is the parity check matrix of a code that we denote by C(S) (cf. Section 16.1). We wish to correct Hamming type errors; more precisely, we say that a (single) error has occurred when the state ε(π) of a single position π is either written or read incorrectly. This means that the state of the WOM which is actually read differs from the desired state by an n-tuple of weight 1. Let E ⊆ F^n be the set of all the authorized states of the WOM (with no errors) that may occur during the history of the memory. Clearly, if we are to distinguish between two states ε1 ∈ E and ε2 ∈ E, given that a single error may occur, then we must have, as in classical coding problems,

d(ε1, ε2) ≥ 3.
c + p :/: c ' + p'. To achieve this when the size of the W O M is n - 2~ - 1, we choose a group IF r G - G1 • G2, where G1 The set P C_ G should verify: (P) the projection on the first coordinate, p r l 9 P ---, G1 is a one-to-one mapping between P and G1 \ (0}. The set M C_ G of messages should verify" (M) M C_ {0} • G2. W h e n a single error occurs, the message read on the W O M is c-t-p instead -
of c; p ope ty (M)
-
that
+ V) -
p opr
(P)
that prl(p) uniquely determines p, and hence c. Note that conditions (M) and (P) imply that the authorized states of the W O M are all the words of the Hamming code of length n (cf. Example 17.4.9).
Next, we want to maximize the number of times the WOM can be reused. To do this, we must find sets P and M satisfying (P) and (M) such that any reasonably "large" subset S of P generates M, and furthermore such that only "few" elements of S are required to generate an arbitrary message c E M. This is the object of the next section.
17.4 Single-error-correcting WOM-codes
Consider, for r ≥ 4, C = BCH(2, r) the 2-error-correcting [n = 2^r − 1, n − 2r, 5] BCH code (see Section 10.1). If α is a primitive element in F_{2^r}, then a parity check matrix of C is given by

H = ( 1 α   α^2 ... α^{n−1}    )
    ( 1 α^3 α^6 ... α^{3(n−1)} )
  = [h_1, h_2, ..., h_n],

where every element in H must be thought of as an r-tuple (column) of elements in F. Now take G1 = G2 = F^r, P = {h_i : 1 ≤ i ≤ n}, M = {0} × G2. Then conditions (P) and (M) are fulfilled. We use the following properties of C:

R(C) = 3.     (17.4.1)

d(C^⊥) ≥ 2^{r−1} − 2^{r/2}.     (17.4.2)
Combining (17.4.2) and Lemma 17.1.5, we see that any S ⊆ P with |S| ≥ 2^{r−1} + 2^{r/2} generates G. Now we use Theorem 17.2.2 to upperbound the covering radius of C(S).

Theorem 17.4.3 (i) If |S| ≥ (n + 1)/2 + (n + 1)^{1/2}, then R(C(S)) ≤ 9.
(ii) If, moreover, |S| > (√2/2)(n + 1), then R(C(S)) ≤ 7.
Proof. Let s = |S|. If D is any linear code with length s and minimum distance at least 5, the sphere-packing bound reads: |D| (1 + s + s(s − 1)/2) ≤ 2^s. This implies |D| s^2/2 ≤ 2^s, hence

a[s, 5] ≤ s − 2 log s + 1.     (17.4.4)

Suppose S satisfies |S| ≥ (n + 1)/2 + (n + 1)^{1/2}; then C(S) is an [s, s − 2r, ≥ 5] code (its parity check matrix H(S) has full rank). Set R = R(C(S)). Applying Theorem 17.2.2 to C(S) and using (17.4.4), we obtain:

a[R, 5] ≤ 2r − 2 log s + 1.     (17.4.5)

We have s > 2^{r−1}, so:

a[R, 5] < 3.     (17.4.6)

But a[10, 5] = 3 (see Table 2.3), so (17.4.6) implies R ≤ 9; this proves (i). If, besides, S verifies s > (√2/2)(n + 1), then log s > −1/2 + r, and (17.4.5) yields a[R, 5] < 2. Since a[8, 5] = 2 (see Table 2.3), this implies R ≤ 7. This proves (ii). □

Theorem 17.4.3 means that the above scheme yields single-error-correcting WOM-codes, with parameters {2^r − 1, r, g}, where, applying Lemmas 17.1.4 and 17.1.5 and straightforward averaging:
g = ( (1 − √2/2)/7 + (√2/2 − 0.5)/9 ) n + o(n) ≈ n/15.42 + o(n).
An estimate of their efficiency is: ρ ≈ r/15.42.

We now use 3-error-correcting BCH codes; for r > 4, let C = BCH(3, r) be a 3-error-correcting [n = 2^r − 1, n − 3r, 7] BCH code. For the sake of brevity, we only sketch this case, very similar to the previous one; the parity check matrix of C is now

H = ( 1 α   α^2    ... α^{n−1}    )
    ( 1 α^3 α^6    ... α^{3(n−1)} )
    ( 1 α^5 α^{10} ... α^{5(n−1)} )
  = [h_1, h_2, ..., h_n].

We set G1 = F^r, G2 = F^r × F^r, P = {h_i : 1 ≤ i ≤ n}, M = {0} × G2, thus again fulfilling conditions (P) and (M).
Theorem 17.4.7 (i) If |S| ≥ (n + 1)/2 + 2(n + 1)^{1/2}, then R(C(S)) ≤ 16.
(ii) If, moreover, |S| > 1.145 (n + 1)/2, then R(C(S)) ≤ 14.
(iii) If, moreover, |S| > 1.443 (n + 1)/2, then R(C(S)) ≤ 13.
(iv) If, moreover, |S| > 1.818 (n + 1)/2, then R(C(S)) ≤ 12.

Proof. Proceed as in Theorem 17.4.3. Note that now, if |S| ≥ (n + 1)/2 + 2(n + 1)^{1/2}, then C(S) is an [s, s − 3r, ≥ 7] code, so that instead of (17.4.5) we obtain:

a[R, 7] ≤ 3r − 3 log s + log 6.     (17.4.8)

Since s > 2^{r−1}, a[R, 7] < 6 and (i) follows from a[17, 7] = 6 (see Table 2.3). If s > (3/16)^{1/3}(n + 1) ≈ 1.145 (n + 1)/2, then a[R, 7] < 5 and (ii) follows from a[15, 7] = 5 (see Table 2.3). If s > (3/8)^{1/3}(n + 1) ≈ 1.443 (n + 1)/2, then a[R, 7] < 4 and (iii) follows from a[14, 7] = 4 (see Table 2.3). If s > (3/4)^{1/3}(n + 1) ≈ 1.818 (n + 1)/2, then a[R, 7] < 3 and (iv) follows from a[13, 7] = 3 (see Table 2.3). □
We get {2^r − 1, 2r, g} WOM-codes with g = n/26.9 + o(n) and an estimated efficiency ρ ≈ r/13.45. The following example is too small for the previous methods to display real efficiency, but is intended to be illustrative.

Example 17.4.9 Set C = BCH(2, 4). We are thus dealing with 4-bit messages, which we write on a 15-WOM using the lower part H2 of the parity check matrix
H = ( 1 0 0 0 1 0 0 1 1 0 1 0 1 1 1 )
    ( 0 1 0 0 1 1 0 1 0 1 1 1 1 0 0 )
    ( 0 0 1 0 0 1 1 0 1 0 1 1 1 1 0 )
    ( 0 0 0 1 0 0 1 1 0 1 0 1 1 1 1 )
    ( 1 0 0 0 1 1 0 0 0 1 1 0 0 0 1 )
    ( 0 0 0 1 1 0 0 0 1 1 0 0 0 1 1 )
    ( 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 )
    ( 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1 )

(the first four rows form the upper part H1, the last four the lower part H2; column i is h_i = (α^{i−1}, α^{3(i−1)})^T, written over the basis (1, α, α^2, α^3), with α a root of x^4 + x + 1).
Let Π = {1, 2, ..., 15}. We know by (17.4.1) that R(C) = 3, hence writing any c requires at most 3 wits. Suppose c = (0110)^T is to be written; then c is associated to σ = (0000 0110)^T, and we have σ = h_1 + h_2 + h_5, so that c is represented by writing on positions 1, 2 and 5. The state of the memory is now ε = (110010000000000). Note that H1 ε^T = 0, so that ε is a codeword of the Hamming code. At that stage, we are left with a set S of size 12, yielding a [12, 4, 5] code by (17.4.2). Thus any new message can be written on the memory using at most 7 wits, by Theorem 17.4.3 (ii).
Now suppose we wish to read the last message written on the WOM; say the state of the memory is ε = (011011110000100). We evaluate

σ' = h_2 + h_3 + h_5 + h_6 + h_7 + h_8 + h_13 = (1101 0100)^T.

Vector σ' having a nonzero first part means that an error has been made. Moreover, we see that (1101)^T is the first part of h_8, so that the error is on position 8. Evaluating σ' + h_8 = (0000 0111)^T, we read the message as (0111)^T. □
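The example can be replayed in code. This is a sketch under stated assumptions: we build GF(16) with the primitive polynomial x^4 + x + 1 and represent field elements as integers whose bit k is the coefficient of α^k; the book's printed matrix may use another basis ordering, but the computation is the same.

```python
# GF(16) with primitive polynomial x^4 + x + 1, so alpha = 2 and alpha^4 = alpha + 1.
def times_alpha(x):
    x <<= 1
    return x ^ 0b10011 if x & 0b10000 else x

alpha_pow = [1]
for _ in range(14):
    alpha_pow.append(times_alpha(alpha_pow[-1]))

# columns h_i = (alpha^(i-1), alpha^(3(i-1))), i = 1, ..., 15
h = [(alpha_pow[i % 15], alpha_pow[3 * i % 15]) for i in range(15)]

def read(state):
    """Syndrome of a WOM state: (Hamming part, message part)."""
    up = lo = 0
    for i, bit in enumerate(state):
        if bit:
            up ^= h[i][0]
            lo ^= h[i][1]
    return up, lo

# Writing c = 0110 (the field element alpha + alpha^2 = 6) on wits 1, 2, 5:
state = [1, 1, 0, 0, 1] + [0] * 10
up, lo = read(state)
print(up, lo)             # 0 6 : valid Hamming state, message 0110

# Reading with one error: e = (011011110000100), error on position 8
e = [0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
up, lo = read(e)
pos = alpha_pow.index(up) + 1   # nonzero upper part locates the error
lo_corrected = lo ^ h[pos - 1][1]
print(pos, lo_corrected)  # 8 14 : error on wit 8, corrected message 0111 (= 14)
```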
17.5 Nonlinear WOM-codes
We survey briefly nonlinear WOM-codes. Let us use the notation w(<v>^g) for the minimum length n of a WOM needed to update one of v possible values g times. The conjecture

w(<v>^g) = (1 + o(1)) max{g, g log v / log g}     (17.5.1)

is disproved with an easy counterexample. Observe that for every fixed positive a < 0.5,

w(<v>^{av}) ≥ 2av.

Indeed, if n < v − 1, then every updating requires at least 2 wits in the worst case.
The group Z_p can be used for coset encoding.

Theorem 17.5.2 Let p be a prime and S a nonempty subset of Z_p. Then

|S^∧t| ≥ min{p, t|S| − t^2 + 1},

where S^∧t denotes the set of sums of t distinct elements of S, and S^∧0 = {0} by convention. □

In particular,

s_{Z_p}(t) ≤ (p + t^2 − 1)/t,
where, for an abelian group G, the function s_G(t) (studied in the next chapter) denotes the smallest integer s such that, for any generating set S of G with |S| ≥ s, one has S^∧t = G. Coset encoding with Z_p then guarantees

g ≥ (1 + o(1)) p ( (1 − 1/2)(1/2) + (1/2 − 1/3)(1/3) + ... + (1/(i−1) − 1/i)(1/i) + ... )
  = (1 + o(1)) p (2 − π^2/6),

i.e.,

g ≥ (1 + o(1)) 0.35 p.     (17.5.3)
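Theorem 17.5.2 is easy to confirm by exhaustive search for small primes (a sketch; S^∧t is computed naively):

```python
from itertools import combinations

def sumset(S, t, p):
    """S^t: sums of t distinct elements of S, modulo p."""
    if t == 0:
        return {0}
    return {sum(c) % p for c in combinations(S, t)}

# exhaustive check of |S^t| >= min(p, t|S| - t^2 + 1) for small primes
for p in (5, 7, 11):
    for size in range(1, p + 1):
        for S in combinations(range(p), size):
            for t in range(size + 1):
                assert len(sumset(S, t, p)) >= min(p, t * size - t * t + 1)
print("bound verified for p in (5, 7, 11)")
```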
Note that, for large enough p and t, Theorem 17.5.2 only guarantees:

s_{Z_p}(t) ≤ (p/t)(1 + (t^2 − 1)/p) ≈ p/t.

Whereas, for large t and r, by Theorem 18.2.6,

s_{F^r}(t) ≤ |F^r| (t + 1)/2^t,

so linear WOM-codes should be asymptotically more efficient than Z_p-codes. The main problem is that the set of columns of H we are left with while updating is not guaranteed to generate F^r when |S| < 2^{r−1}, so, starting from the parity check matrix of a Hamming code, we could have to settle for writing fewer bits after g = 2^{r−2} + 1 generations. This has been improved to
g = 2^{r−2} + 2^{r−4} + 1,     (17.5.4)
by use of geometric considerations: remember (see Lemma 17.1.5) that H(C), the parity check matrix of a code C, remains of full rank if and only if the written wits do not contain the support of a nonzero codeword of C^⊥ (here, the codewords of C^⊥ are just complements of hyperplanes in PG(r − 1, 2)). In fact, from the point of view of the writing efficiency, a decrease of the rank is good, since the covering radius then drops (the code dimension increases). Of course, to implement such a system, one would need a few extra wits used as flags to warn the reader and future writer of the surviving rank.

If we are ready to relax the assumption that the writing rate be kept constant, then the WOM can be overwritten more. Namely, in Lemma 17.1.5, we have seen that C(I) has dimension k − |I| when I does not contain a support of a nonzero codeword in C^⊥. This is guaranteed if

|I| ≤ d^⊥ − 1.

However, if |I| ≥ d^⊥ and |I| is not too large, one can hope that the support of only one nonzero codeword of C^⊥ is contained in I, and thus r − 1 bits can still be written. The idea is captured by the following definition and result.
Definition 17.5.5 The i-th distance of C, denoted by d_i(C) or d_i, is the minimum size of the union of the supports of i linearly independent codewords.
Theorem 17.5.6 Shortening an [n, k, d] code C on at most s = d_i − 1 positions gives a code C' with parameters [n − s, k' ≤ k − s + i − 1, ≥ d], enabling the writing of at least n − k − (i − 1) bits by coset encoding. □
Let us not pursue further but conclude by noting that our knowledge of s_{F^r}(t) for small t's (see next chapter) already gives

g ≥ (1 + o(1)) Σ_{i=2}^∞ (1/i) ( s_{F^r}(i − 1) − s_{F^r}(i) )
  ≥ 2^r (1 + o(1)) ( (1/2)(1 − 1/2) + (1/3)(1/2 − 1/3) + (1/4)(1/3 − 1/4) + (1/5)(1/4 − 1/5) + (1/6)(1/5 − 1/17) + ... ).

This leads to an improvement on (17.5.3):
g ≥ 0.36 · 2^r (1 + o(1)).

The case g = 2. The following was first proved nonconstructively:

w(<v>^2) ~ 1.29 log v.     (17.5.7)

Let us show how to achieve this value semi-constructively (cf. Section 20.3), using linear codes. One can obtain, through a greedy algorithm, an [n, κn]R code C, with R = κn, on the Gilbert-Varshamov bound, with rate satisfying

κ = H^{−1}(1 − κ),

i.e., κ ≈ 0.227. After coset-encoding n(1 − κ) bits with C, we can always assume that exactly nκ wits have been used. For the second writing, we consider these used wits as defects: we now want to write n(1 − κ) bits on a memory of size n with s = nκ defects (in fact s asymmetric defects). We know that this is possible (cf. Notes on Section 18.7). The efficiency of this WOM-code is

ρ = 2(1 − κ) ≈ 1.546.     (17.5.8)
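The fixed point κ = H^{−1}(1 − κ) and the resulting efficiency are easy to compute numerically (a quick sketch):

```python
from math import log2

def H(x):
    """Binary entropy function."""
    return -x * log2(x) - (1 - x) * log2(1 - x)

# H(x) - (1 - x) is increasing on (0, 1/2) and changes sign there:
# solve H(kappa) = 1 - kappa by bisection.
lo, hi = 1e-9, 0.5
for _ in range(60):
    mid = (lo + hi) / 2
    if H(mid) < 1 - mid:
        lo = mid
    else:
        hi = mid
kappa = (lo + hi) / 2
print(round(kappa, 3), round(2 * (1 - kappa), 3))  # 0.227 1.546
```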
Table 17.1: Parameters of a few WOM-codes, linear or not.

  n       | m      | g             | ρ       | ρ(g, r) | ρ(g) | Comments
  --------|--------|---------------|---------|---------|------|------------
  3       | 2      | 2             | 1.33    | 1.33    | 1.54 | Ex. 17.1.7
  n       | 0.77 n | 2             | 1.55    | 1.55    | 1.55 | (17.5.8)
  23      | 11     | 3             | 1.43    | 1.65    | 1.94 | Th. 17.1.10
  8       | 3      | 4             | 1.50    | 1.50    | 2.24 | [153]
  15      | 4      | 6             | 1.60    | 1.71    | 2.5  | (17.1.13)
  16      | 4      | 7             | 1.75    | 1.87    | 2.8  | [153]
  2^r − 1 | r      | 5·2^{r−4} + 1 | ≈ 5r/16 |         |      | (17.1.13)
  7       | log 7  | 4             | 1.60    |         |      | [553]
  11      | log 11 | 5             | 1.57    |         |      | [76]
  15      | log 15 | 7             | 1.82    |         |      | [483]

ρ = mg/n: achieved efficiency through WOM-codes;
ρ(g, r): upper bound on the efficiency as a function of g and r [553];
ρ(g): upper bound on the efficiency as a function of g [553].
17.6 Notes

§17.1. WOMs have been introduced by Rivest and Shamir [553]. Section 17.1 follows Cohen, Godlewski and Merkx [153], where the coset-encoding writing rule is introduced. The supercode lemma is folklore. Example 17.1.7 is by Rivest and Shamir [553]. The properties of the [23, 12, 7]3 Golay code can be found in, e.g., MacWilliams and Sloane [464]. The case |I| = 8 in the proof of Theorem 17.1.10 is treated in [153].

§§17.2-17.4. The treatment of the error case, presented in Sections 17.2, 17.3 and 17.4, follows Zémor and Cohen [702]. Writing on the WOM requires a complete decoding algorithm for shortened BCH codes, which is simple for 2- and 3-error-correcting BCH codes. Reading and correcting errors is straightforward (it amounts to syndrome computation). Note that we have only estimated the efficiency of the BCH WOM-codes (it could be higher). It is not clear to us whether increasing the minimum distance can further raise the efficiency of the error-correcting WOM-code.

§17.5. The notation w(<v>^g) is introduced by Rivest and Shamir [553]. Their conjecture (17.5.1) is disproved by Alon, Nathanson and Ruzsa [19], who also prove Theorem 17.5.2. The construction yielding (17.5.4) is due to Godlewski [252]. See Wei [682] and Cohen, Litsyn and Zémor [163] for bounds on the i-th distance. The nonconstructive proof of (17.5.7) is by Rivest and Shamir [553]. Finally, let us mention that some cryptographic aspects of WOMs are studied by Cohen and Godlewski [152], [253].