Introns and exons


László Patthy, Hungarian Academy of Sciences, Budapest, Hungary

Analysis of the exon-intron structures of genes, and a survey of the evolutionary distribution of mosaic proteins, suggest that modularization of protein domains by intron insertions and their dispersal through exon-shuffling become significant only in higher eukaryotes. The appearance of this powerful evolutionary mechanism probably contributed significantly to the 'Big Bang' of metazoan radiation.

Current Opinion in Structural Biology 1994, 4:383-392

Introduction

The discovery of introns and the realization that intronic recombination could facilitate exon-shuffling [1] has led many molecular biologists to believe that exon-shuffling was significant from the beginning: all proteins were assembled by this mechanism and all introns should be viewed as relics of the original assembly process. More recent studies, however, have cast serious doubts on the validity of these assumptions (reviewed in [2,3,4•]) and one of the most important strongholds of this 'introns-old' hypothesis, that plant globins with three introns represent the ancestral form of the gene, has also come under attack [5•].

A number of pieces of evidence challenge the view of introns as relics of the original assembly process. First, it is now well established that the exon-intron pattern of genes is not static: introns are inserted into [6••,7••] as well as removed from genes; therefore present exon-intron structures do not necessarily reflect the original assembly process, even if the gene was produced by exon-shuffling. Second, there is evidence, from their phylogenetic distribution, that introns appeared and spread relatively late during evolution and are restricted to a rather small group of higher eukaryotes [8]. Third, even if ribozyme-like self-splicing introns were around when the first proteins were formed, they were practically unsuitable for exon-shuffling by intronic recombination [2,3,9]. As these self-splicing introns encode an essential function, the capacity to act as ribozymes, their sequence is not as tolerant to intronic recombination as are the spliceosomal pre-mRNA introns typical of vertebrates. Exon-shuffling by this mechanism could, therefore, not become significant until spliceosomal introns evolved from self-splicing introns.

Finally, only a limited fraction of exons are really valuable in exon-shuffling [2,3,9]. The splice junctions of the shuffled exon have to be phase-compatible with those of its new neighbours, otherwise a shift in the reading frame would obliterate the protein information of the shuffled exon, as well as that of the exons downstream from the inserted exon [9]. For this reason, only 'symmetrical' exons (exon-sets) of class 1-1, class 2-2 and class 0-0 (i.e. exons [exon-sets] flanked by introns of the same phase [phase 1, phase 2 or phase 0]) are suitable for exon duplication and exon insertion. The superiority of such 'symmetrical' exons is convincingly demonstrated by the dozens of class 1-1 modules used in the construction of clan 1 mosaic proteins (mosaic proteins assembled from class 1-1 modules) [3]. In contrast to these observations on mosaic proteins, the exon-intron structures of the genes for 'old' proteins (i.e. protein folds common to both prokaryotes and eukaryotes) do not support the claims that they evolved by exon-shuffling, as their hypothetical 'modules' do not conform to the rules of exon-shuffling described above [2,9].

Despite this progress in our understanding of the rules, mechanistic details and evolution of intronic recombination and exon-shuffling, some intriguing questions remain. Why does the domain organization of some mosaic proteins show little or no correlation with the exon-intron organization of their genes? If exon-shuffling is a relatively recent evolutionary mechanism, how were the class 1-1, class 2-2 and class 0-0 modules created? What is the explanation for the preponderance of class 1-1 modules and the paucity of class 0-0 and class 2-2 modules? Can we determine more precisely the time when creation of modules and modular assembly of proteins by exon-shuffling became significant? This review evaluates some recent developments that bear on these questions, based primarily on papers published in 1993.

Abbreviations

C1r, complement C1r; DISC, discoidin lectin; EGF, epidermal growth factor; FS, follistatin; G, growth factor; Ig, immunoglobulin; INH, Kunitz-type trypsin inhibitor; LDL, low density lipoprotein; LK, link protein; LN, C-type lectin; PP, pancreatic secretory polypeptide; WH, whey protein.

© Current Biology Ltd ISSN 0959-440X

Exon-intron organization of mosaic protein genes

In earlier reviews [2,3,9], I have shown that the genes for most mosaic proteins reflect the assembly process: dozens of proteins assembled from class 1-1 modules were shown to obey the rule that phase 1 introns are found at the boundaries separating the modules in their genes. Recent studies have described several additional examples that support the general validity of these rules [10•-12•,13]. The most spectacular examples are the genes of the selectin family, where the different module types are all encoded by discrete class 1-1 exons [14,15,16•].

In the case of some mosaic proteins the correlation between exon-intron structure and modular organization is less than perfect, illustrating the point that as time passes intron insertion and intron removal start to erode the original exon-intron organization of these proteins. For example, in the human perlecan gene [17] the low density lipoprotein (LDL)-receptor modules and the immunoglobulin (Ig) modules are still flanked by phase 1 introns, but the original introns are missing from the regions constructed from laminin A, laminin B and epidermal growth factor (EGF)-like modules.

Another interesting example is the case of the agrin gene [18•]. The genomic organization of the 5' region is a perfect example of a clan 1 mosaic gene: the nine follistatin modules, the two laminin B modules and the two agrin modules are all flanked by phase 1 introns [19•,20•]. On the other hand, in the 3' part of the agrin gene the expected phase 1 introns are absent from the boundaries separating the four EGF modules and the three laminin A modules [19•,20•]. The most plausible explanation for this unusual genomic organization of the agrin gene is that the 3' part is older and the original introns used for the modular assembly of the 3' part of the gene have already been eliminated.

A similar situation exists in the case of the gene for the LDL-receptor-related protein of Caenorhabditis elegans [21•]. Despite their significant evolutionary distance, the LDL-receptor-related proteins of nematodes and man have a nearly identical arrangement of LDL-receptor and EGF modules. The only major difference is that the LDL-receptor-related protein of Caenorhabditis elegans has 35 LDL modules, whereas there are 31 copies in the human protein. In C. elegans the extra four repeats form a cluster in the amino-terminal part of the molecule and are the only modules that are still encoded by separate class 1-1 exons. The most plausible interpretation is that this part is younger than the other parts of the molecule (it probably arose after the divergence of the ancestors of vertebrates and nematodes), so that its original structure had a greater chance of survival. Analysis of the exon-intron structures of members of the thrombospondin family also suggests that younger parts of mosaic protein genes are more likely to retain their original introns [22•].

There are cases where intron insertion and intron removal have almost completely eliminated the original exon-intron organization of a mosaic protein gene. Phase 1 introns are missing from all the boundaries of the class 1-1 thrombospondin, LDL-receptor, EGF and C7 modules of the complement C6 gene [23•]. The original phase 1 introns are found only at the boundaries of one of the complement B-modules. An obvious conclusion from such cases is that continual removal and insertion of introns starts to obliterate the original gene structures of mosaic proteins.

It follows from these observations that the absence of introns from some module boundaries of a mosaic protein does not exclude the possibility that it evolved by exon-shuffling. If a protein is composed of class 1-1 modules (i.e. modules that have been shown to be duplicated and inserted into new locations by intronic recombination), it may rightfully be assumed that it also arose by exon-shuffling, even if some of the 'original' introns are already missing from the gene encoding it.
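The comparison made throughout this section, between module boundaries in the protein and phase 1 introns in the gene, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the codon-index representation and the example coordinates are all hypothetical and are not taken from any of the genes discussed above. Real module boundaries are also less sharply defined than a single codon, so a practical version would need a tolerance window.

```python
def boundary_intron_report(module_boundaries, introns, expected_phase=1):
    """Report, for each module boundary, whether an intron of the expected phase sits there.

    module_boundaries: codon indices at which one module ends and the next begins.
    introns: (codon_index, phase) pairs describing the gene's introns.
    """
    phase_at_codon = {codon: phase for codon, phase in introns}
    return {b: phase_at_codon.get(b) == expected_phase for b in module_boundaries}


def fraction_retained(report):
    """Fraction of module boundaries that still carry an intron of the expected phase."""
    return sum(report.values()) / len(report)


# Hypothetical gene (invented coordinates): four module boundaries, four introns.
boundaries = [100, 170, 240, 310]
introns = [(100, 1), (170, 1), (205, 0), (310, 2)]

report = boundary_intron_report(boundaries, introns)
retained = fraction_retained(report)
print(report)
print(retained)
```

In this invented example two of the four boundaries retain a phase 1 intron; the intron at codon 205 lies inside a module and the one at codon 310 has the wrong phase, which is the kind of partial erosion described for perlecan and agrin.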

Creation of modules by intron insertion

The mosaic proteins assembled from class 1-1 modules have one common characteristic: they are young proteins that are unique to eukaryotes, and predominantly to multicellular animals [3]. In the past year or two several new members were added to the list of mosaic proteins that supports this generalization. They include contractile proteins of muscle [24], the extracellular domains of various receptor tyrosine kinases [25•-30•,31], receptor tyrosine phosphatases [32•], and a number of functionally diverse receptor proteins [33•,34•].

I have previously suggested that the reason why all the clearcut cases of exon-shuffling involve young proteins is that the exon-shuffling machinery (i.e. spliceosomal pre-mRNA introns and protomodules) appeared relatively late during evolution [2,3,9]. According to the 'modularization hypothesis' [3], modules suitable for exon-shuffling were created by the insertion of introns (of identical phase) into protein coding genes at positions corresponding to the amino- and carboxy-terminal boundaries of protein domains, converting them to protomodules (Fig. 1). The next stage in the development of the module is that it undergoes internal tandem duplications via recombination at these strategic introns. Since genes containing numerous tandem copies of symmetrical modules are prone to undergo tandem duplications, along with deletion and excision of modules [3,35], this stage may facilitate exon-shuffling: excision of symmetrical modules may provide a major source for mobile modules to be inserted elsewhere (Fig. 1).
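The three stages of modularization, and the reason why only symmetrical modules survive them, can be illustrated with a toy model of the reading frame. The sketch below is an assumption-laden illustration, not a description of any real gene: the `Exon` class, the helper functions and all exon lengths are hypothetical. The only biological rule it encodes is that the phase of the intron downstream of an exon equals the upstream phase plus the exon length, modulo 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Exon:
    name: str
    phase_5: int  # phase (0, 1 or 2) of the intron at the exon's 5' end
    length: int   # exon length in nucleotides (invented values below)

    @property
    def phase_3(self) -> int:
        """Phase of the intron at the 3' end, fixed by the 5' phase and the length."""
        return (self.phase_5 + self.length) % 3


def reads_in_frame(gene: List[Exon]) -> bool:
    """True if every exon is joined to its neighbour in the frame it was built for."""
    if gene[0].phase_5 != 0:  # assumption: the first exon starts at the start codon
        return False
    return all(up.phase_3 == down.phase_5 for up, down in zip(gene, gene[1:]))


def duplicate(gene: List[Exon], index: int) -> List[Exon]:
    """Stage 2: tandem duplication of the exon at `index`."""
    return gene[: index + 1] + [gene[index]] + gene[index + 1:]


def insert(gene: List[Exon], module: Exon, intron_index: int) -> List[Exon]:
    """Stage 3: insert `module` into the intron that follows exon `intron_index`."""
    return gene[: intron_index + 1] + [module] + gene[intron_index + 1:]


# Stage 1: a secreted protein whose fold A has acquired phase 1 introns on both sides.
signal = Exon("signal peptide", 0, 61)
fold_a = Exon("A", 1, 120)            # class 1-1: symmetrical
rest = Exon("rest", 1, 89)
stage1 = [signal, fold_a, rest]

stage2 = duplicate(stage1, 1)         # signal, A, A, rest

host_phase1 = [Exon("h1", 0, 100), Exon("h2", 1, 80)]  # host intron of phase 1
host_phase0 = [Exon("h1", 0, 99), Exon("h2", 0, 81)]   # host intron of phase 0

# An asymmetrical (class 1-2) exon for comparison.
asym_gene = [signal, Exon("B", 1, 121), Exon("rest2", 2, 88)]

print(reads_in_frame(stage1), reads_in_frame(stage2))
print(reads_in_frame(insert(host_phase1, fold_a, 0)))
print(reads_in_frame(insert(host_phase0, fold_a, 0)))
print(reads_in_frame(duplicate(asym_gene, 1)))
```

Under these assumptions the class 1-1 module can be duplicated and moved into a phase 1 intron without disturbing the frame, whereas insertion into a phase 0 intron, or duplication of the class 1-2 exon, breaks it.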

Fig. 1. Different stages in the conversion of a domain to a module. The figure illustrates the modularization of a protein possessing a secretory signal peptide domain. The boxes represent exons, the connecting lines indicate the introns. The exon-part encoding the secretory signal peptide domain is heavily shaded, the exon-parts encoding the autonomous protein fold (A) are lightly shaded. Stage 1: Insertion of introns of identical phase at the amino- and carboxy-terminal boundaries of the protein fold A. Stage 2: Tandem duplications of the symmetrical protomodule A via intronic recombination. Stage 3: Module A is transferred to new locations.

This scenario implies that the modules identified in mosaic proteins may have existed as independent proteins before becoming modules. Recent studies have shown that this is true for many modules and in the case of several module types all three stages of 'modularization' have been observed. In an earlier review [3], I showed that the protomodule stage, the tandem duplication stage, and the shuffling stage are all observed for the Kunitz-type trypsin inhibitor (INH) module, the C-type lectin (LN) module, the discoidin lectin (DISC) module, the link protein (LK) module and EGF-like growth factor (G) module. In the case of the pancreatic secretory polypeptide (PP) module only the first two stages were documented in the earlier review [3]. Now there is evidence for the shuffling of this module to other proteins [36,37]. Some new modules may now be added to the list of cases where the protomodule stage has also been found: the complement C1r (C1r) module [38•], the LDL-receptor module [39•] and the whey protein (WH) module [40,41•].

As shown previously [3], the case of the pancreatic secretory trypsin inhibitor domain is especially intriguing for another reason. Modularization of the ancestor of this domain has taken two different routes. Acquisition of phase 1 introns at its boundaries gave rise to the class 1-1 follistatin (FS) module, a module found previously in follistatin and osteonectin. Last year the follistatin module was found in nine copies in agrin in the company of the class 1-1 laminin B-, EGF-like and laminin A modules [19•,20•] and in testican in the company of a class 1-1 thyroglobulin module [20•,42•]. An independent modularization of the pancreatic secretory trypsin inhibitor domain ancestor occurred by the insertion of phase 0 introns at its boundaries, giving rise to the class 0-0 ovomucoid module [3]. In the case of this modularization route, only the first two stages have been documented: the class 0-0 ovomucoid module is present in several tandem copies in various members of the ovomucoid family, but no evidence has so far been obtained for the shuffling of this class 0-0 module to other proteins.

On the basis of the modularization hypothesis, it has been predicted that the bombardment of genes by intron insertions could convert any domain to a protomodule, thereby mobilizing it. I will now discuss another interesting example illustrating this point.

Modularization by three different routes: the globin module

Recent studies have revealed that the ancestral globin fold has made two major steps in becoming a module. Even more striking is the fact that its modularization exploited all three options: it gave rise to class 1-1, class 0-0 and class 2-2 modules.

Members of the globin family are present in both prokaryotes and eukaryotes, indicating that this protein fold arose during the early part of protein evolution. In some invertebrates there are globin-species that contain several tandem globin domains. Analysis of the gene structures of such internally duplicated globins revealed that intronic recombination was involved in their generation.

In the case of the clam, Barbatia reeveana, it was demonstrated that its two-domain intracellular globin arose by unequal crossing over between two identical, or very similar, genes for a single-domain globin [43]. Reconstruction of the crossing over event suggests that recombination took place in an intron close to the 5' boundary and a latent intron at the 3' boundary of the ancestral single-domain globin. The two introns were in phase 2 relative to the reading frame of the protein, giving rise to a class 2-2 protomodule. In the duplicated gene, a phase 2 intron lies between the two globin domains.

Some nematodes, such as Ascaris suum [44•] and Pseudoterranova decipiens [45•], possess extracellular globins with a secretory signal peptide and two tandem globin domains. In the genes for these proteins, phase 1 introns are found at the boundaries separating the two globin domains from each other and from the secretory signal peptide domain. These two-domain globins are clearly the product of an internal gene duplication event that occurred by recombination in phase 1 introns found at the carboxy- and amino-terminal boundaries of the globin fold (the latter lies between the signal peptide and globin domains). In summary, intron insertions have, in this case, converted the globin domain to a class 1-1 module.

In a third evolutionary line, the crustacea, a globin gene has again embarked on internal duplication. The brine shrimp, Artemia salina, has a polymeric globin with nine tandem globin-domains [46]. Analysis of the structure of its globin gene revealed that phase 0 introns are found at the boundaries separating the individual globin units [47•]. This observation suggests that in the case of crustacea modularization of the globin-fold occurred by the class 0-0 route.

Why are class 1-1 module types more numerous than class 0-0 and class 2-2 modules?

In principle, three symmetrical module-groups are possible: class 1-1, class 0-0, and class 2-2, depending on whether phase 1, phase 0 or phase 2 introns are found at both boundaries of a module [2,3,9]. With the rapid increase in the number of module types identified, it is more striking than ever that there is a convincing predominance of class 1-1 modules. Although more than two dozen class 1-1 modules (and a host of clan 1 mosaic proteins assembled from these) are known, only a few class 2-2 modules (e.g. serum albumin module, preproglucagon module, globin module) or class 0-0 modules (e.g. ovomucoid module, crystallin module, globin module) have been identified so far. Given the low number of class 2-2 and class 0-0 module types, it is no surprise that no clan 2 or clan 0 mosaic proteins assembled from different module-types have yet been found.

According to the modularization hypothesis, the question of this enigmatic preference for class 1-1 modules and clan 1 proteins may be formulated in the following way: why would domains have a greater chance of modularization as class 1-1 rather than as class 2-2 or class 0-0 modules? In other words: why would phase 1 intron insertion be preferred at the boundaries of a module?

Recent data on the mechanism of intron insertion have important implications for these questions, inasmuch as they suggest that intron insertion in the three different phases of a protein-coding gene may be non-random. First, it is now clear that group II introns (types of self-splicing introns) can transpose to new locations by a mechanism involving reversal of the splicing reaction, a mechanism that is also likely to operate for the insertion of perfect pre-mRNA introns [6••,7••]. Second, it is now known that splicing of spliceosomal introns requires numerous small nuclear (sn)RNAs which scrutinize pre-mRNA sequences for splice junctions and participate in the removal of the intron and ligation of flanking exons [48•,49••]. The snRNA U5 component of spliceosomes is essential for identifying the protosplice site (i.e. the short exon-sequences that flank introns). Recent studies have shown that U5 binds to these sites, thereby aligning the flanking exons for the ligation step [48•,49••]. Since U5 of the spliceosome interacts with protosplice sites of spliced RNA [49••], it is clear that reversal of the splicing reaction could target intron insertion at such protosplice sites.

In translated regions the base preferences of protosplice sites (AG/G) are also manifested in some amino acid preferences at exon-intron junctions. For example, phase 2 introns are most likely to split an arginine codon (AGG), whereas phase 1 introns are most likely to split codons for glycine. Biased amino acid composition of a given segment of a protein may thus be reflected in a biased phase-distribution of protosplice sites. As a consequence, intron insertion in different phases may also be biased in such a target region.
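The link between the AG/G protosplice site and the amino acids found at exon-intron junctions follows directly from the genetic code, and can be checked with a short Python sketch. The function name and its default arguments are hypothetical; the codon table is the standard genetic code, and the phase convention is the usual one (a phase 1 intron lies after the first base of a codon, a phase 2 intron after the second, a phase 0 intron between codons).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codon_residues(phase, upstream="AG", downstream="G"):
    """Amino acids whose codons are split by an intron of the given phase
    inserted at a protosplice site `upstream`|`downstream` (default AG|G)."""
    if phase == 0:
        # The intron lies between two codons, so no codon is split.
        return set()
    if phase == 1:
        # One base upstream of the intron, then one fixed base and one free base.
        codons = [upstream[-1] + downstream[0] + n for n in BASES]
    elif phase == 2:
        # Two bases upstream of the intron, one fixed base downstream.
        codons = [upstream[-2:] + downstream[0]]
    else:
        raise ValueError("phase must be 0, 1 or 2")
    return {CODON_TABLE[c] for c in codons}


for phase in (0, 1, 2):
    print(phase, sorted(split_codon_residues(phase)))
```

With the AG|G consensus, a phase 1 intron can only split GGN codons (glycine) and a phase 2 intron can only split AGG (arginine), in agreement with the preferences quoted above.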
I have already pointed out that in most of the cases where the various stages of modularization are known it is clear that the substrate was a protein possessing a secretory signal peptide. One of the introns initiating modularization had to be inserted at the boundary separating the secretory signal peptide from the mature protein domain (Fig. 1). The amino acid sequence at the boundary of signal peptide domains is far from random: small neutral residues (glycine, alanine) are known to abound in the proximity of the signal-sequence cleavage site. The non-random amino acid composition around such sites could thus bias intron insertion in different phases. Analysis of more than 150 genes possessing an intron at the boundary of their signal-peptide domains has shown that 92% of such 'signal peptide introns' were of phase 1, and only 8% were phase 0 (Patthy L, unpublished data). This observation suggests that intron insertion at the boundary of the signal peptide domain is strongly preferred in phase 1; therefore, modularization of exported proteins with secretory signal peptide domains is most likely to take the class 1-1 route.
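To give a feeling for how far a 92% share of phase 1 introns is from an unbiased distribution, the following Python sketch computes an exact binomial tail probability. It rests on assumptions that are not part of the data quoted above: the sample size is taken as exactly 150 (the text says only 'more than 150'), and the null model, in which each of the three phases is equally likely, is chosen purely for illustration.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n = 150                      # assumed sample size (the text says 'more than 150')
k = round(0.92 * n)          # number of phase 1 introns implied by 92%
p_null = 1 / 3               # illustrative null: all three phases equally likely

p_value = binomial_tail(n, k, p_null)
print(k, p_value)
```

Under this illustrative null model the tail probability is vanishingly small, so the conclusion does not depend on the exact sample size. A more realistic null would use the phase distribution of introns in general rather than equal thirds.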

Table 1. Mosaic proteins found in lower metazoa (a).

Species / mosaic protein                        Class 1-1 modules (b)     Reference (c)

Hydrozoans: Hydra vulgaris
  laminin B1                                    LMB                       [58]

Nematodes: Caenorhabditis elegans                                         [59•], [50•], [51•], [21•]
  lin-12                                        G, NT
  glp-1                                         G, NT
  unc-5                                         Ig, TSP
  laminin B homolog, unc-6                      LMB, LMD
  twitchin                                      Ig, FN3
  perlecan homolog unc-52                       LDL, LMB, LMC, Ig
  LDL-receptor related gene                     LDL, G

Molluscs: Aplysia californica
  NCAM-related adhesion molecule                Ig, FN3                   [60]

Annelids: Lumbricus terrestris
  hemoglobin linker chains                      LDL                       [61•]

Echinoderms: sea urchins                                                  [62], [63•], [52]
  speract receptor                              SC
  sperm membrane protein                        G
  metalloproteinase                             C1r, G
  EGF-related protein                           C1r, G

Arthropods                                                                [53•], [64•], [65], [66], [67,68•], [68•,69•], [70•], [71], [72•], [30•], [74•], [75], [76], [77], [78]
Arthropods: Drosophila melanogaster
  Notch                                         G, NT
  laminin A                                     LMA, LMB, LMC, LMD
  laminin B1                                    LMB, LMC, LMD
  laminin B2                                    LMB, LMC, LMD
  crumbs                                        G, LMA
  serrate                                       G
  delta                                         G
  fat                                           G, LMA
  slit                                          G, LMA, SLT
  hikaru genki                                  Ig, B
  tolloid                                       C1r, G
  sevenless                                     FN3
  receptor phosphotyrosine phosphatases         FN3, Ig
  neuroglian                                    FN3, Ig
  NCAM, etc.                                    FN3, Ig
  fasciclin II                                  Ig, FN3
  Dror, trk-related receptor tyrosine kinase    Ig, K
  Dtrk                                          Ig
  projectin                                     FN3, Ig
  kettin                                        Ig
  amalgam                                       Ig
Arthropods: Bombyx mori
  hemolin                                       Ig
Arthropods: grasshopper
  fasciclin II                                  Ig, FN3
Arthropods: Tachypleus tridentatus
  Limulus coagulation factor C                  B, LN, G

(a) Mosaic proteins identified in vertebrate species may be found in an earlier compilation [3]. Proteins consisting of a single module are excluded from this survey because their mere existence does not prove that they have already been used for exon-shuffling. Viruses and parasitic protists may have acquired modular proteins by horizontal transfer; therefore, their mosaic proteins [3] are not listed here.
(b) The abbreviations of class 1-1 modules are identical with those used earlier [3]. B, complement B-type; C1r, complement C1r; FN3, type III module of fibronectin; G, growth factor; Ig, immunoglobulin; K, kringle; LDL, LDL receptor; LN, C-type lectin; NT, Notch protein; SC, scavenger receptor; TSP, thrombospondin. New module abbreviations are: SLT, slit module; LMA, LMB, LMC, LMD, modules found in laminin A, B1 and B2.
(c) Only references not included in an earlier review [3] are shown. Where several proteins are listed for a group, the references are given for the group as a whole, in their original order.

The explosion of exon-shuffling occurred at the time of metazoan radiation

As pointed out above, different lines of argument support the conclusion that exon-shuffling is a relatively late development: it required spliceosomal introns and the accumulation of a critical mass of module types. To define more precisely the time at which modularization reached a critical point (permitting a burst of exon-shuffling), recent evidence on the evolutionary distribution of proteins produced by exon-shuffling will now be surveyed.

The structures of numerous mosaic proteins from most major groups of metazoa have recently been determined (Table 1). The fact that mosaic proteins (composed of class 1-1 modules familiar from vertebrate genes) have already been found in hydrozoa, nematodes, molluscs, arthropods, echinoderms and vertebrates indicates that modularization had taken place and that exon-shuffling was in full gear before the divergence of the major metazoan phyla (Fig. 2). There can be no doubt that the construction mechanism of these mosaic proteins was the same as that identified in vertebrate genes, even if some of the 'expected' introns are missing from their genes. There are several cases of invertebrate genes where the class 1-1 modules are still flanked by the original phase 1 introns [21•,50•,51•,52,53•].

Although a receptor protein kinase of the plant Arabidopsis thaliana was found to contain two tandem copies of an EGF-like domain [54•], there is little evidence for mosaic proteins and exon-shuffling in plants. It may be relevant in this respect that spliceosomal introns of plants seem to be less suitable for intronic recombination than those of vertebrates [55•].

Evolutionary distribution of mosaic proteins thus suggests that exon-shuffling became significant at the time of metazoan radiation, i.e. at a time when multicellularity called for a multitude of novel proteins to maintain the communication among different cells, organs and tissues, and there was a strong selective pressure to produce such extracellular proteins. It is noteworthy that most mosaic proteins are associated with, and are essential for, multicellularity. They are constituents of the extracellular matrix, membrane-associated proteins involved in cell-cell or cell-matrix interactions, or receptor proteins regulating cell-cell communications, such as receptor tyrosine kinases, receptor tyrosine phosphatases, growth factor receptors and growth factor precursors. We also know from the developmental biology of Caenorhabditis and Drosophila that most of the mosaic proteins listed in Table 1 control morphogenesis, differentiation processes or cell fate decisions, and thus determine the basic body plans of metazoa.

The temporal correlation between the explosion of exon-shuffling and the 'Big Bang' of metazoan radiation [56•,57•] thus seems to be more than just a coincidence. It could be argued that the availability of this powerful evolutionary mechanism could actually have contributed significantly to the burst of metazoan evolution.

Conclusions

Recent studies of the genes for mosaic proteins confirm the validity of the modular exchange principles outlined earlier [3,9]. Studies on the evolutionary origin of modules that are the substrates of exon-shuffling have also revealed how mobile modules may be created by insertion of introns into protein coding genes. As it can be shown that intron insertion at the boundary separating the secretory signal peptides from protein domains is preferred in phase 1, secreted proteins are most likely to be modularized as class 1-1 modules. Analysis of the evolutionary distribution of proteins assembled from modules by intronic recombination suggests that this evolutionary mechanism became significant at the time of the appearance of the first metazoa and might in fact have contributed to the explosive nature of metazoan radiation.

Fig. 2. Evidence for exon-shuffling in major groups of extant animals. An asterisk indicates that there is unquestionable molecular evidence for the presence of mosaic proteins produced by exon-shuffling (cf. Table 1). The question mark after protists is meant to indicate the possibility that the genes of mosaic proteins found in some parasitic protists [3] may have originated from their hosts. [Tree diagram. Terminal groups: vertebrates*, echinoderms*, arthropods*, annelids*, molluscs*, nematodes*, hydrozoans*. Other labels: deuterostomes, protostomes, coelenterata, sponges, protists (?).]

Acknowledgements

This work has been supported by grants from OTKA T1362 and OTKA T 5211.

12. *

Nolan KF, Kaluz S, Higgins JMG, G o u n d i s D, Reid KBM: Characterization of the H u m a n Properdin Gene. Btocbem J 1992, 287:291-297. Four o f the six t h r o m b o s p o n d i n m o d u l e s of properdin are coded for by discrete, symmetrical class 1-1 exons. 13.

van der Logt CPE, Reitsma PH, Bertina RM: Intron-Exon Organization of the H u m a n Gene Coding for the LipoproteinAssociated Coagulation Inhibitor: the Factor Xa Dependent Inhibitor of the Extrinsic Pathway of Coagulation. Biochemistry 1991, 30:1571-1577.

Papers of particular interest, published within the annual period o f review, have b e e n highlighted as: • of special interest •• of outstanding interest

14.

Collins T, Williams A, Johnston GI, Kim J, Eddy R, Shows T, Gimbrone MA Jr, Bevilacqua MP: Structure and Chromosomal Location of the Gene for Endothelial Leukocyte Adhesion Molecule 1. J Biol Chem 1991, 266:246(~2473.

1.

Gilbert W: W h y Genes in Pieces? Nature 1978, 271:501.

15.

2.

Patthy L: Exons - - Original Building Blocks of Proteins? Bioessays 1991, 13:187-192.

D o w b e n k o DJ, Diep A, Taylor BA, Lusis AJ, Lasky LA: Characterization of the Murine Homing Receptor Gene Reveals Correspondence Between Protein Domains and Coding Exons. Genomics 1991, 9:270-277.

3.

Patthy L: Modular Exchange Principles in Proteins. C u r t Optn S t m c t Btol 1991, 1:351-361.

References and recommended reading

4. Dibb NJ: W h y Do Genes Have Introns? FEBS Left 1993, • 325:135-139. This paper discusses s o m e of the controversies surrounding introns a n d exons a n d explores the possibility that alternative splicing might be the cause, rather t h a n a consequence, of split genes. 5. •

Dixon B, Pohajdak B: Did the Ancestral Globin Gene of Plants and Animals Contain Only Two Introns? Trends Btochem Sct 1992, 17:48(>-488. Globins have two introns in highly conserved positions that separate distinct structural e l e m e n t s of the globin fold. Based on Gilbert's hypothesis [1] a n d o n the structural features of the globin fold it was predicted that a third intron should be present in the central e x o n o f the gene. The fact that this 'central intron' was later found in plant globin g e n e s was considered to be proof that plant globins represent the ancestral form of the gene in which distinct 'modules' are encoded by distinct exons. This predictive success is considered by m a n y as the best evidence in support of Gilbert's hypothesis. Based on their n e w evidence, the authors convincingly demonstrate that the two-intron g e n e is the primordial eukaryotic form and that plants along with animals g a i n e d 'central introns' independently. 6. Belfort M: An Expanding Universe of Introns. Science 1993, •• 262:1009-11010. Summarizes recent a d v a n c e s in our understanding of intron mobility. O n e of the p a t h w a y s w h e r e b y all intron types may be transposed to heterologous sites within the g e n o m e involves splicing reversal a n d reverse transcription o f the reverse-spliced product. There is n o w evidence, both in vitro a n d in vivo, that group II introns can transpose to nonallelic sites by this m e c h a n i s m [7••]. 7. ••

Mueller MW, Allmaier M, Eskes R, Schweyen RJ: Transposition of Group II a l l in Yeast and Invasion of Mitochondrial Genes at N e w Locations. Nature 1993, 366:174-176. Provides the first evidence that transposition of a group II intron to nonallelic sites in vivo involves an RNA intermediate generated by reverse splicing.

8.

Palmer JD, Logsdon JM Jr: The Recent Origins of Introns. Curr Opfn Genet Dev 1991, 1:470--477.

9.

Patthy L: Intron-Dependent Evolution: Preferred Types of Exons and Introns. FEBS Lett 1987, 214:1-7.

10. •

Hillarp A, Pardo-Manuel F, Ruiz RR, de Cordoba SR, Dahlback B: T h e H u m a n C4b-Binding Protein ~-Chain Gene. J Btol Chem 1993, 268:15017-15023. Shows that the three c o m p l e m e n t B-type m o d u l e s constituting this protein are flanked by p h a s e 1 introns. 11. •

Schulz AS, Schleithof L, Faust M, Bartram CR, Janssen JWG: The G e n o m i c Structure of the H u m a n UFO Receptor. Oncogene 1993, 8:509-513. Each of the two Ig-like a n d the two fibronectin type III modules of this receptor tyrosine kinase are e n c o d e d by distinct class 1-1 exons.

16. •

Larigan JD, Tsang TC, Rumberger JM, Burns DK: Characterization of cDNA and Genomic Sequences Encoding Rabbit ELAM-I: Conservation of Structure and Functional Interactions with Leukocytes. DNA Cell Btol 1992, 11:149-162. The C-type lectin module, the EGF-module, a n d the five complement B-modules are each e n c o d e d by discrete class 1-1 exons, supporting the view that selectins evolved by e x o n shuffling. 17.

Cohen IR, Grassel S, Murdoch AD, Iozzo RV: Structural Characterization of the Complete Human Perlecan Gene and its Promoter. Proc Natl Acad Sci USA 1993, 90:10404-10408.

18. •

Rupp F, Ozcelik T, Linial M, Peterson K, Francke U, Scheller R: Structure and Chromosomal Localization of the Mammalian Agrin Gene. J Neurosci 1992, 12:3535-3544. The exon-intron structure of the agrin gene shows significant correspondence to the domain structure of the protein.

19. •

Patthy L, Nikolics K: Functions of Agrin and Agrin-Related Proteins. Trends Neurosci 1993, 16:76-81. Shows that nine tandem repeats of this modular protein correspond to follistatin modules (class 1-1 modules) and not ovomucoid modules (class 0-0 modules). In agreement with this conclusion, phase 1 introns were found at the boundaries separating the nine repeats in the agrin gene, linking them to class 1-1 laminin B-modules.

20. •

Patthy L, Nikolics K: Agrin-Like Proteins of the Neuromuscular Junction. Neurochem Int 1994, 24:301-316. In the amino-terminal two-thirds of agrin, there is a striking correlation between the modular organization of the protein and the exon-intron organization of the gene, whereas in the carboxy-terminal third there is no such correlation. It is suggested that the 5' part of the agrin gene was assembled more recently than the 3' part, explaining why the original exon-intron organization still persists in the amino-terminal, but not in the carboxy-terminal part.

21. •

Yochem J, Greenwald I: A Gene for a Low Density Lipoprotein-Related Protein in the Nematode Caenorhabditis elegans. Proc Natl Acad Sci USA 1993, 90:4572-4576. The nematode and human LDL-receptor-related proteins have a nearly identical number and arrangement of modules, and some introns of the C. elegans gene correspond to introns of the LDL-receptor gene.

22. •

Shingu T, Bornstein P: Characterization of the Mouse Thrombospondin 2 Gene. Genomics 1993, 16:78-84. As in the case of the thrombospondin 1 gene, the three TSP modules are encoded by discrete symmetrical class 1-1 exons, but in the other parts of this mosaic protein there is no clear correlation between the modular organization of the protein and exon-intron structure of the gene. Since the three TSP modules are present in thrombospondin 1 and 2 but absent from thrombospondin 3, it is probable that the ancestor of the TSP1 and TSP2 genes acquired the TSP module following its divergence from TSP3. This assumption might explain why the original introns still persist in this 'younger' part of the gene.

23. •

Hobart MJ, Fernie B, DiScipio RG: Structure of the Human C6 Gene. Biochemistry 1993, 32:6198-6205. Only one of the two complement B-modules of this complex mosaic protein is still flanked by phase 1 introns. In the other parts of the C6 gene introns do not correlate with boundaries of the various thrombospondin, LDL-receptor, complement C7- and EGF-like modules.

24.

Price MG, Gomer RH: Skelemin, a Cytoskeletal M-Disc Periphery Protein, Contains Motifs of Adhesion/Recognition and Intermediate Filament Proteins. J Biol Chem 1993, 268:21800-21810.

25. •

Ziegler SF, Bird TA, Schneringer JA, Schooley KA, Baum PR: Molecular Cloning and Characterization of a Novel Receptor Protein Tyrosine Kinase from Human Placenta. Oncogene 1993, 8:663-670. The extracellular domain of this receptor is constructed from an Ig-module, three EGF-modules and three fibronectin type III modules.

26. •

Dumont DJ, Gradwohl GJ, Fong GH, Auerbach R, Breitman ML: The Endothelial-Specific Receptor Tyrosine Kinase, tek, is a Member of a New Subfamily of Receptors. Oncogene 1993, 8:1293-1301. Shows that the extracellular domain of this receptor contains two Ig-modules, three EGF-modules and three fibronectin type III modules.

27. •

Iwama A, Hamaguchi I, Hashiyama M, Murayama Y, Yasunaga K, Suda T: Molecular Cloning and Characterization of Mouse TIE and TEK Receptor Tyrosine Kinase Genes and Their Expression in Hematopoietic Stem Cells. Biochem Biophys Res Commun 1993, 195:301-309. The extracellular domains of these receptors are constructed from an Ig-module, three EGF-modules and three fibronectin type III modules.

28. •

Masiakowski P, Carroll RD: A Novel Family of Cell Surface Receptors with Tyrosine Kinase-Like Domain. J Biol Chem 1992, 267:26181-26190. The extracellular domain of these human proteins contains an immunoglobulin and kringle module. These Ror tyrosine kinases provide the first evidence for the occurrence of a kringle module outside the trypsin family of serine proteases.

29. •

Jennings CGB, Dyer SM, Burden SJ: Muscle-Specific trk-Related Receptor with a Kringle Domain Defines a Distinct Class of Receptor Tyrosine Kinases. Proc Natl Acad Sci USA 1993, 90:2895-2899. Describes the extracellular domain of this protein from the electric ray as containing four immunoglobulin modules and a kringle module. It should be noted that the human homologue has only one Ig-module [28•], whereas the Drosophila protein lacks Ig-modules [30•].

30. •

Wilson C, Goberdhan DCI, Steller H: Dror, a Potential Neurotrophic Receptor Gene, Encodes a Drosophila Homolog of the Vertebrate Ror Family of Trk-Related Receptor Tyrosine Kinases. Proc Natl Acad Sci USA 1993, 90:7109-7113. The extracellular domain of this protein contains a kringle module, but it lacks the immunoglobulin modules that are present in its vertebrate homologues [28•,29•].

31.

Johnson JD, Edman JC, Rutter WJ: A Receptor Tyrosine Kinase Found in Breast Carcinoma Cells has an Extracellular Discoidin I-Like Domain. Proc Natl Acad Sci USA 1993, 90:5677-5681.

32. •

Jiang YP, Wang H, D'Eustachio P, Musacchio JM, Schlessinger J, Sap J: Cloning and Characterization of RPTP-κ, a New Member of the Receptor Protein Tyrosine Phosphatase Family with a Proteolytically Cleaved Cellular Adhesion Molecule-Like Extracellular Region. Mol Cell Biol 1993, 13:2942-2951. The extracellular domain of this receptor protein tyrosine phosphatase contains an A5 module, an Ig module and four fibronectin type III modules.

33. •

Falls DL, Rosen KM, Corfas G, Lane WS, Fischbach GD: ARIA, a Protein that Stimulates Acetylcholine Receptor Synthesis, is a Member of the Neu Ligand Family. Cell 1993, 72:801-815. The extracellular domain of this receptor contains an EGF and an immunoglobulin module.

34. •

Beckmann G, Bork P: An Adhesive Domain Detected in Functionally Diverse Receptors. Trends Biochem Sci 1993, 18:40-41. Evidence is presented that A5 protein, meprins and some receptor tyrosine phosphatases contain a novel type of module.

35.

Lackner C, Boerwinkle E, Leffert CC, Rahmig T, Hobbs HH: Molecular Basis of Apolipoprotein (a) Isoform Size Heterogeneity as Revealed by Pulse-Field Gel Electrophoresis. J Clin Invest 1991, 87:2153-2161.

36.

Bork P: A Trefoil Domain in the Major Rabbit Zona Pellucida Protein. Protein Sci 1993, 2:669-670.

37.

Hauser F, Hoffman W: P-Domains as Shuffled Cysteine-Rich Modules in Integumentary Mucin C.1 (FIM-C.1) from Xenopus laevis. Polydispersity and Genetic Polymorphism. J Biol Chem 1992, 267:24620-24624.

38. •

Bork P, Beckmann G: The CUB Domain. A Widespread Module in Developmentally Regulated Proteins. J Mol Biol 1993, 231:539-545. A survey of the C1r module family. The authors show that spermadhesins consist of a single C1r domain.

39. •

Bates P, Young JAT, Varmus HE: A Receptor for Subgroup A Rous Sarcoma Virus is Related to the Low Density Lipoprotein Receptor. Cell 1993, 74:1043-1051. The receptor consists of a single LDL-receptor module linked to a transmembrane segment. It should be noted that the LDL-module is flanked on both boundaries by phase 1 introns.

40.

Dear TN, Kefford RF: The WDNM1 Gene Product is a Novel Member of the 'Four-Disulphide Core' Family of Proteins. Biochem Biophys Res Commun 1991, 176:247-254.

41. •

Saheki T, Ito F, Hagiwara H, Saito Y, Kuroki J, Tachibana S, Hirose S: Primary Structure of the Human Elafin Precursor Preproelafin Deduced from the Nucleotide Sequence of its Gene and the Presence of Unique Repetitive Sequences in the Prosegment. Biochem Biophys Res Commun 1992, 185:240-245. The gene encodes a single whey protein (WH) module. It should be noted that there is a phase 1 intron at the 5' boundary of the WH domain and a latent phase 1 intron adjacent to the 3' end of the translated region.

42. •

Alliel PM, Perin JP, Jolles P, Bonnet FJ: Testican, a Multidomain Testicular Proteoglycan Resembling Modulators of Cell Social Behaviour. Eur J Biochem 1993, 214:347-350. It should be noted that the protein contains a class 1-1 follistatin [20] and a class 1-1 thyroglobulin module.

43.

Naito Y, Riggs CK, Vandergon TL, Riggs AF: Origin of a 'Bridge' Intron in the Gene for a Two-Domain Globin. Proc Natl Acad Sci USA 1991, 88:6672-6676.

44. •

Sherman DR, Kloek AP, Krishnan BR, Guinn B, Goldberg DE: Ascaris Hemoglobin Gene: Plant-Like Structure Reflects the Ancestral Globin Gene. Proc Natl Acad Sci USA 1992, 89:11696-11700. The two tandem globin domains have identical exon-intron structures consistent with a recent duplication event. It should be noted that phase 1 introns are found at the boundaries of the signal peptide and globin domains, suggesting that tandem duplication of the globin domain resulted from recombination in these introns.

45. •

Dixon B, Walker B, Kimmins W, Pohajdak B: A Nematode Hemoglobin Gene Contains an Intron Previously Thought to be Unique to Plants. J Mol Evol 1992, 35:131-136. This extracellular protein contains a secretory signal peptide and two tandem globin domains. As in the related Ascaris gene [44•], phase 1 introns are found at the boundaries of the signal peptide and globin domains.

46.

Manning AM, Trotman CNA, Tate WP: Evolution of a Polymeric Globin in the Brine Shrimp Artemia. Nature 1990, 348:653-656.

47. •

Pohajdak B, Dixon B: A Commentary on: 'Unexpected Intron Location in Non-Vertebrate Globin Genes' by Moens et al. (FEBS Lett 321 (1992) 105-109). FEBS Lett 1993, 320:281-283. A strong critique, pointing out some errors in the interpretation of the data presenting the exon-intron organization of the gene of the polymeric globin of Artemia. It is shown that all introns separating individual globin domains of this polymeric globin are phase 0.

48. •

Wise JA: Guides to the Heart of the Spliceosome. Science 1993, 262:1978-1979. The paper summarizes the most recent data on how various snRNAs participate in the recognition and removal of spliceosomal introns and ligation of flanking exons. In this process, U5 is the long sought agent responsible for anchoring the free 5' exon and aligning it with the first nucleotide of the 3' exon.

49. ••

Sontheimer EJ, Steitz JA: The U5 and U6 Small Nuclear RNAs as Active Site Components of the Spliceosomes. Science 1993, 262:1989-1996. Using site-specific cross-linking experiments the authors show that during splicing of spliceosomal introns U5 binds exon sequences at both the 5' and 3' splice sites and thus aligns the two exons for ligation. This function of U5 is relevant to the observation that exonic positions flanking spliceosomal introns (protosplice sites) show remarkable conservation.

50. •

Ishii N, Wadsworth WG, Stern BD, Culotti JG, Hedgecock EM: UNC-6, a Laminin-Related Protein, Guides Cell and Pioneer Axon Migrations in C. elegans. Neuron 1992, 9:873-881. The protein contains three laminin B (LMB) modules and a laminin D module. It is noteworthy that the laminin B modules are still flanked by phase 1 introns, consistent with the view that exon-shuffling of class 1-1 modules created this protein.

51. •

Rogalski TM, Williams BD, Mullen GP, Moerman DG: Products of the unc-52 Gene in Caenorhabditis elegans are Homologous to the Core Protein of the Mammalian Basement Membrane Heparan Sulfate Proteoglycan. Genes Dev 1993, 7:1471-1484. The module organization of this nematode protein is strikingly similar to that of mammalian perlecans, except that the nematode protein lacks the EGF- and laminin A modules. In the unc-52 gene, many of the original phase 1 introns are present at the module boundaries of its LDL, laminin B and Ig modules, proving that this protein was assembled by exon-shuffling. Alternative splicing of these interdomain introns can generate proteins with variable number of immunoglobulin and laminin B like modules.

52.

Delgadillo-Reynoso MG, Rollo DR, Hursh DA, Raff RA: Structural Analysis of the uEGF Gene in the Sea Urchin Strongylocentrotus purpuratus Reveals More Similarity to Vertebrate than to Invertebrate Genes with EGF-Like Repeats. J Mol Evol 1989, 29:314-327.

53. •

MacKrell AJ, Kusche-Gullberg M, Garrison K, Fessler JH: Novel Drosophila Laminin A Chain Reveals Structural Relationships Between Laminin Subunits. FASEB J 1993, 7:375-381. A striking difference between vertebrate and Drosophila laminin A chains is that in domain V there are six extra laminin B modules in the Drosophila protein, raising the possibility that these domains arose more recently than the rest of the protein. Significantly, phase 1 introns are still present at the boundaries of these 'young' modules.

54. •

Kohorn BD, Lane S, Smith TA: An Arabidopsis Serine/Threonine Kinase Homologue with an Epidermal Growth Factor Repeat Selected in Yeast for its Specificity for a Thylakoid Membrane Protein. Proc Natl Acad Sci USA 1992, 89:10989-10992. The extracellular domain of this plant protein kinase contains two tandem EGF-like modules. This observation raises the possibility that modularization of the EGF-domain could occur prior to the divergence of plants and animals.

55. •

Luehrsen KR, Walbot V: Insertion of Non-Intron Sequence into Maize Introns Interferes with Splicing. Nucleic Acids Res 1992, 20:5181-5187. The data presented in this paper suggest that in plants the intron is recognized in toto rather than as the simple assembly of splice junctions and a branch point.

56. •

Levinton JS: The Big Bang of Animal Evolution. Sci Am 1992, 266:52-59. As a result of an unmatched and mysterious burst of evolutionary creativity, different phyla of metazoa with different body plans appeared almost simultaneously during the Cambrian period.

57. •

Kerr RA: Evolution's Big Bang gets Even More Explosive. Science 1993, 261:1274-1275. Recent paleontological evidence suggests that the 'Cambrian explosion' is confined to a mere 5-10 million years of the Cambrian period.

58.

Sarras MP Jr, Yan L, Zhang X, Grens A, St John PL, Abrahamson DR: Cloning and Biological Function of a Primitive Laminin in Hydra. Mol Cell Biol 1992, 3:228a.

59. •

Leung-Hagesteijn C, Spence AM, Stern BD, Zhou Y, Su MW, Hedgecock EM, Culotti JG: UNC-5, a Transmembrane Protein with Immunoglobulin and Thrombospondin Type 1 Domains, Guides Cell and Pioneer Axon Migrations in C. elegans. Cell 1992, 71:289-299. The protein contains two immunoglobulin modules and two thrombospondin modules. Of the original phase 1 introns only the one separating the second Ig-module from the first TSP-module is still present in the gene.

60.

Mayford M, Barzilai A, Keller F, Schacher S, Kandel ER: Modulation of an NCAM-Related Adhesion Molecule with Long-Term Synaptic Plasticity in Aplysia. Science 1992, 256:638-644.

61. •

Suzuki T, Riggs AF: Linker Chain L1 of Earthworm Hemoglobin. Structure of Gene and Protein: Homology with Low Density Lipoprotein Receptor. J Biol Chem 1993, 268:13548-13555. The protein contains an LDL-receptor module. A phase 1 intron is found at the upstream boundary of this class 1-1 module.

62.

Mendoza LM, Nishioka D, Vacquier VD: A GPI-Anchored Sea Urchin Sperm Membrane Protein Containing EGF Domains is Related to Human Uromodulin. J Cell Biol 1993, 121:1291-1297.

63. •

Lepage T, Ghiglione C, Gache C: Spatial and Temporal Expression Pattern During Sea Urchin Embryogenesis of a Gene Coding for a Protease Homologous to the Human Protein BMP-1 and to the Product of the Drosophila Dorsal-Ventral Patterning Gene tolloid. Development 1992, 114:147-164. This metalloprotease contains an EGF module and two C1r modules.

64. •

Gow CH, Chang HY, Lih CJ, Chang TW, Hui CF: Analysis of the Drosophila Gene for the Laminin B1 Chain. DNA Cell Biol 1993, 12:573-587. The gene contains only one intron in its translated region. All original phase 1 introns are absent from the boundaries of modules.

65.

Chi HC, Juminaga D, Wang WY, Hui CF: Structure of the Drosophila Gene for the Laminin B2 Chain. DNA Cell Biol 1991, 10:451-466.

66.

Patthy L: Laminin A-Related Domains in crb Protein of Drosophila and their Possible Role in Epithelial Polarization. FEBS Lett 1991, 289:99-101.

67.

Mahoney PA, Weber U, Onofrechuk P, Biessmann ?, Bryant PJ, Goodman CS: The fat Tumor Suppressor Gene in Drosophila Encodes a Novel Member of the Cadherin Gene Superfamily. Cell 1991, 67:853-868.

68. •

Patthy L: A Family of Laminin-Related Proteins Controlling Ectodermal Differentiation in Drosophila. FEBS Lett 1992, 298:182-184. It is shown that laminin A related modules (LMA modules) are present in both fat and slit proteins, linked to their EGF-like modules.

69. •

Rothberg JM, Artavanis-Tsakonas S: Modularity of the Slit Protein. Characterization of a Conserved Carboxy-Terminal Sequence in Secreted Proteins and a Motif Implicated in Extracellular Protein Interactions. J Mol Biol 1992, 227:367-370. A motif of slit (the novel slit module) is shown to have homologues in a number of exported proteins.

70. •

Hoshino M, Matsuzaki F, Nabeshima YI, Hama C: Hikaru genki, a CNS-Specific Gene Identified by Abnormal Locomotion in Drosophila, Encodes a Novel Type of Protein. Neuron 1993, 10:395-407. This extracellular matrix protein contains an immunoglobulin-module and four tandem complement B-modules. It is noteworthy that one of the B-modules is still flanked by phase 1 introns.

71.

Shimell MJ, Ferguson EL, Childs SR, O'Connor MB: The Drosophila Dorsal-Ventral Patterning Gene tolloid is Related to Human Morphogenetic Protein 1. Cell 1991, 67:469-481.

72. •

Oon SH, Hong A, Yang X, Chia W: Alternative Splicing in a Novel Tyrosine Phosphatase Gene (DPTP4E) of Drosophila melanogaster Generates Two Large Receptor-Like Proteins which Differ in their Carboxyl Termini. J Biol Chem 1993, 268:23964-23971. These proteins contain eleven fibronectin type III modules in their extracellular part. The exon-intron organization no longer correlates with the module organization of the proteins.

73.

Grenningloh G, Rehm EJ, Goodman CS: Genetic Analysis of Growth Cone Guidance in Drosophila: Fasciclin II Functions as a Neuronal Recognition Molecule. Cell 1991, 67:45-57.

74. •

Pulido D, Campuzano S, Koda T, Modolell J, Barbacid M: Dtrk, a Drosophila Gene Related to the trk Family of Neurotrophin Receptors, Encodes a Novel Class of Neural Cell Adhesion Molecule. EMBO J 1992, 11:391-404. The extracellular domain of this receptor tyrosine kinase contains six tandem immunoglobulin modules.

75.

Ayme-Southgate A, Vigoreaux J, Benian G, Pardue ML: Drosophila has a Twitchin/Titin-Related Gene that Appears to Encode Projectin. Proc Natl Acad Sci USA 1991, 88:7973-7977.

76.

Lakey A, Labeit S, Gautel M, Ferguson C, Barlow DP, Leonard K, Bullard B: Kettin, a Large Modular Protein in the Z-Disc of Insect Muscles. EMBO J 1993, 12:2863-2871.

77.

Seeger MA, Haffley L, Kaufman TC: Characterization of Amalgam: a Member of the Immunoglobulin Superfamily from Drosophila. Cell 1988, 55:598-600.

78.

Sun SC, Lindstrom I, Boman HG, Faye I, Schmidt O: Hemolin: An Insect-Immune Protein Belonging to the Immunoglobulin Superfamily. Science 1990, 250:1729-1732.

L Patthy, Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, PO Box 7, Budapest H-1518, Hungary.