3 Legume Lectins: A Large Family of Homologous Proteins
A. DONNY STROSBERG, DOMINIQUE BUFFARD, MARK LAUWEREYS, AND ANDRE FORIERS
I. Common Structural Properties
II. Chemotaxonomy
III. Variability of Lectins in Single Plants
    A. Genetic Polymorphism
    B. Postsynthetic Modifications
    C. Species Polymorphism
IV. Complete Sequences
    A. Metal Binding Sites
    B. Hydrophobic Cavity
    C. Glycosylation Site
    D. Carbohydrate Binding Site
    E. Folding and Three-Dimensional Structure
V. Conclusion
References
Legume lectins, extensively studied for their carbohydrate-binding properties, are widely used in biology and medicine (Lis and Sharon, 1977). Their biological function in the plants from which they were isolated has been thoroughly discussed (Barondes, 1981; Goldstein and Etzler, 1983). A number of physiological roles have been attributed to them, including recognition of the nitrogen-fixing bacteria at the surface of roots, inhibition of growth of pathological organisms, and transport of sugars, hormones, or glycoproteins in plants. To gain further insight into the structure-function relationship of lectins, several groups of investigators have initiated sequence studies of

THE LECTINS: PROPERTIES, FUNCTIONS, AND APPLICATIONS IN BIOLOGY AND MEDICINE
Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
[Fig. 1 diagram: the family Leguminoseae divided into the subfamilies Caesalpinoideae, Mimosoideae, and Papilionaceae; tribes shown include Phaseoleae, Sophoreae, Loteae, Carageae, Diocleae, and Vicieae; genera shown include Bauhinia, Griffonia, Glycine, Sophora, Phaseolus, Lotus, Caragana, Arachis, Medicago, Crotalaria, Ulex, Onobrychis, Dolichos, and Canavalia; each species is labeled with its sugar specificity (GalNAc, Gal, L-fucose, Man/Glc, or GlcNAc).]
Fig. 1. Distribution of the leguminous plants from which lectins have been sequenced and that are compared in this report. The sugar-binding specificity is indicated. Canavalia ensiformis (con A) and C. gladiata have very recently been grouped together with Dioclea grandiflora into a separate tribe Diocleae.
3. Structural Homologies of Legume Lectins
these proteins. At the present time, seven complete amino acid sequences are available for comparison. In addition, partial sequences have been obtained for many other legume lectins. The distribution of the leguminous plants from which the lectins have been sequenced and whose primary structures are compared in this chapter is presented in Fig. 1.

We shall review here the available structural data on these carbohydrate-binding proteins and draw from their comparative analysis some conclusions on their evolution and postsynthetic fate.

I. COMMON STRUCTURAL PROPERTIES
All available evidence suggests that legume lectins are initially synthesized as single polypeptide chains of a molecular weight of about 30,000. After removal of a 20-residue hydrophobic leader sequence, this chain may be postsynthetically cleaved into a larger β- and a smaller α-subunit, with the possible loss of a few amino acids. The β- and α-subunits of pea, lentil, or fava bean lectins, for instance, or the single chain of soybean, peanut, or lotus lectins, appear to associate into dimers of about 50,000 Da and tetramers of 100,000 to 120,000 Da. So far, only one legume lectin, the lima bean agglutinin, has been reported to contain a disulfide bond, which links the two 31,000-Da subunits of the protein (Goldstein et al., 1983). In the other legume lectins, noncovalent interactions are involved in subunit binding. Variable degrees of glycosylation account for the dispersion of molecular weights as well as for most of the apparent polymorphism.

II. CHEMOTAXONOMY

The amino terminal sequences of seed lectins have been compared among plants from two of the three groups composing the Leguminoseae: the Caesalpinoideae and the Papilionaceae. From the data presented in Fig. 2, it appears that, as expected, the proteins within the same group are more homologous to each other than to those from other groups. Thus, in the first 25 positions, the Bauhinia purpurea lectin differs from the Griffonia simplicifolia I-B isolectin (Lamb and Goldstein, 1984) by 16 residues and from Crotalaria juncea lectin by 21. Introduction of a three-residue deletion after Ser19 in the Griffonia simplicifolia I-B sequence leads to a greater homology (15 common residues out of 27) between the Griffonia and Bauhinia lectins. Similarly, if a four-residue deletion is introduced in the Crotalaria juncea sequence after Leu8, then the Bauhinia lectin differs from Crotalaria juncea lectin by 15 residues and is still
[Fig. 2 alignment: N-terminal sequences of the lectins listed in the caption, numbered from position 1 and grouped by tribe (labels shown include Genisteae, Sophoreae, Carageae, Loteae, Trifolieae, and Hedysareae), set against residues 123 to 147 of concanavalin A.]
more homologous to the Griffonia simplicifolia I-B subunit than to the Crotalaria lectin. Similar extensive homologies are observed between the amino terminal sequences of lectins from the Papilionaceae group of plants and even more within the single tribes. Concanavalin A also displays homologous sequences, albeit in different portions of the molecule. From the available lectin sequences presented in Fig. 2, several conclusions may be drawn. Two residues, Phe6 and Phe11, are conserved in all the lectins compared, and Ser5 is found in all but three lectins: concanavalin A and two proteins that both belong to the Phaseoleae tribe. However, lectins from other Phaseoleae seeds also have Ser at position 5.

III. VARIABILITY OF LECTINS IN SINGLE PLANTS

Lectins of a single plant may seem functionally homogeneous by virtue of their binding to a particular monosaccharide, thus allowing their purification by affinity chromatography. However, electrophoretic analysis of the purified lectins may reveal extensive variations. As many as eight "isolectins" have been identified by isoelectric focusing of soybean agglutinin. The origin of these variations is complex and several explanations have been suggested. These include (a) genetic polymorphism, (b) postsynthetic modifications, including cleavages or glycosylation, and (c) species polymorphism.
Fig. 2. N-Terminal amino acid sequences of 25 lectins from leguminous plants, compared to a homologous portion 123 to 147 from concanavalin A. The protein from Crotalaria juncea was chosen as the model and homologies are indicated by a line. Brackets indicate a deletion introduced to maximize the homologies, and parentheses indicate unknown residues. References for the sequence data are as follows: for concanavalin A (the lectin of Canavalia ensiformis): Edelman et al. (1972), Hardman and Ainsworth (1972); for Crot. j. (Crotalaria juncea) and Lotus t. (tetragonolobus): Foriers et al. (1979); for Sophora j., Caragana a., Ulex e., Bauhinia p., Vicia graminea, and Medicago sat.: Lauwereys (1982); for PNA (peanut agglutinin, the lectin from Arachis hypogaea) and SBA (soybean agglutinin, the lectin from Glycine max): Foriers et al. (1977); for Phas. (Phaseolus) E. and Phas. R.: Miller et al. (1975); for Pinto III (another kind of Phaseolus vulgaris) variety of PHA: Lauwereys (1982). For the beta chains of the two-chain lectins the references are: for lentil (Lens culinaris): Foriers et al. (1981); for pea (Pisum sativum): Van Driessche et al. (1976); for V. cracca: Baumann et al. (1979); for favin: Cunningham et al. (1979); for V. sativa: Gebauer et al. (1979); for Lath. o. (Lathyrus odoratus) and Lath. s. (sativus): Kolberg et al. (1980), Kolberg and Sletten (1982); for Sainfoin I (the lectin from Onobrychis viciifolia): Young et al. (1982); for Griffonia simplicifolia (GS IA and GS IB): Goldstein et al. (1983), Lamb and Goldstein (1984); for Dolichos b. (biflorus): Etzler et al. (1977).
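The residue counts quoted in the Chemotaxonomy section (for example, 15 common residues out of 27 once a deletion is introduced) come from a position-by-position comparison of aligned sequences. A minimal Python sketch of such a tally follows. It assumes that a deletion opposite a residue counts as a difference and that unknown residues are skipped; neither convention is stated in the text. The function name, the `-` and `?` markers, and the two toy fragments are hypothetical and are not taken from Fig. 2.

```python
from typing import Sequence, Tuple

GAP = "-"      # stands in for the bracketed deletions of Fig. 2
UNKNOWN = "?"  # stands in for the parenthesized unknown residues


def compare_aligned(a: Sequence[str], b: Sequence[str]) -> Tuple[int, int]:
    """Return (identical, compared) for two pre-aligned residue lists.

    Assumed conventions: positions where either residue is unknown are
    skipped; a gap opposite a residue is compared and counts as a difference.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for x, y in zip(a, b):
        if UNKNOWN in (x, y):
            continue  # unknown residue: no information either way
        if x == GAP and y == GAP:
            continue  # gap in both sequences: nothing to compare
        compared += 1
        if x == y:
            identical += 1
    return identical, compared


# Hypothetical toy fragments for illustration only, NOT data from Fig. 2.
seq_a = ["Ser", "Phe", "Asn", "-", "Phe", "Thr", "?"]
seq_b = ["Ser", "Phe", "Lys", "Gly", "Phe", "Thr", "Ala"]

identical, compared = compare_aligned(seq_a, seq_b)
print(f"{identical} identical out of {compared} compared positions")
```

The number of differences is then `compared - identical`. Introducing a deletion changes both the alignment length and the number of matches, which is why the text reports counts before and after the deletion is introduced.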
A. Donny Strosberg et al.
Fig. 3. In vitro translation of Pisum sativum mRNA isolated from immature seeds. The translation was carried out in a reticulocyte lysate in the presence of [35S]methionine. (A) Analysis of total translation products by SDS polyacrylamide gradient gel (10-16% acrylamide). Lane 1: endogenous activity of the reticulocyte lysate; lanes 2-5: translation products of total poly(A)+ RNA isolated from immature pea seeds. (B) Immunoprecipitation of translation products. Total translation products of Pisum sativum RNA (from lanes 2-5) were immunoprecipitated with anti-lentil lectin antibodies and analyzed on a 20% acrylamide gel.
Fig. 3. (continued) Panel B, with the top of the gel and the 32K position marked.
A. Genetic Polymorphism

The genome of the plant may contain several genes, each encoding a lectin or at least a protein that cross-reacts immunologically with antibodies raised against a lectin purified from the same plant. Several examples have now been described. In Phaseolus vulgaris, two different chains, PHA-E and PHA-L, differing from each other by six residues at their amino terminal sequence (Fig. 2), associate to yield five different tetrameric isolectins, all present in the seeds. Likewise, from Vicia cracca seeds, Baumann et al. (1979) were able to affinity-purify two lectins, one specific for mannose and glucose and the other specific for N-acetylgalactosamine. Electrophoretic and sequence studies revealed a number of differences between the two proteins. The mannose/glucose-binding protein was composed of two chains, the N-acetylgalactosamine-binding protein of a single polypeptide. The N-terminal sequences of the two lectins are homologous, but differ by 11 substitutions out of the 25 positions examined (Fig. 2). Vodkin et al. (1983) and Goldberg et al. (1983) have recently cloned a cDNA specific for the soybean agglutinin coding sequence. Using this cDNA as a probe, they have identified two genes, Le1, which encodes the seed lectin, and Le2, homologous to Le1 but of unknown function. It is likely that the expression of the various genes is developmentally regulated: one gene may encode a lectin expressed in seeds and another may code for a protein active in roots or leaves.

B. Postsynthetic Modifications

Several groups have described seed lectins composed of subunits each containing a 15,000- to 17,000-Da "long" β-chain and a 5000- to 7000-Da "short" α-chain. To this category belong the pea, the lentil, and the fava bean lectins. Antibodies raised against these two-chain proteins immunoprecipitate a single polypeptide chain synthesized in an in vitro system containing mRNA from the seeds of the aforementioned plants.
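As a side check on the Phaseolus vulgaris example given under Genetic Polymorphism, the count of five tetrameric isolectins from two chain types can be reproduced by enumerating subunit compositions. The sketch below assumes that isolectins are distinguished only by how many E and L chains they contain, not by how the chains are arranged within the tetramer; the text does not spell this out. The function name and the single-letter labels are illustrative.

```python
from collections import Counter
from itertools import combinations_with_replacement
from typing import List, Sequence


def tetramer_compositions(chains: Sequence[str], size: int = 4) -> List[Counter]:
    """List the distinct subunit compositions of an oligomer of `size` chains.

    Arrangement within the oligomer is ignored (an assumption), so each
    composition is an unordered multiset of chain types.
    """
    return [Counter(combo) for combo in combinations_with_replacement(chains, size)]


# "E" and "L" stand for the PHA-E and PHA-L chains named in the text.
compositions = tetramer_compositions(["E", "L"])
for comp in compositions:
    print(f"E{comp['E']}L{comp['L']}")
print(len(compositions), "compositions")
```

Under that assumption the enumeration runs from four E chains to four L chains and yields five compositions, in agreement with the five isolectins reported.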
Immunoprecipitation of the pea lectin precursor is shown in Fig. 3. Small amounts of this precursor chain are copurified along with the two-chain lectins. Partial sequence studies using radiolabeled amino acid residues and HPLC peptide mapping suggest extensive if not complete homology between the proteins. The precursor chain is cleaved into the α- and β-chains, which remain strongly bound to each other by noncovalent interactions. Upon cleavage, small peptides may be lost.

Other types of alterations may occur after synthesis: deamidation of asparagine or glutamine side chains and partial or complete glycosylation. These modifications have been mentioned to explain the frequent appearance of multiple bands in protein isoelectric focusing, a phenomenon also observed in seed lectins.

C. Species Polymorphism

Legume plant genetics was pioneered by Mendel through his famous breeding experiments on peas. Since then, however, efforts to obtain rigorously homogeneous species have not been a major concern. The study of the genome of plants therefore constitutes a difficult task. When several homologous lectin genes are revealed by cross-hybridization studies, it still remains to be established whether these genes are present on the same chromosome. This can only be done by in situ hybridization to metaphase chromosomes or by chromosome "walking," an approach still requiring a major undertaking.

Isoelectric focusing has already revealed striking differences between the patterns obtained with lectins from various cultivars. For example, the analysis of peanut agglutinin (Fig. 4) revealed the existence of up to eight isolectins distributed into three related isolectin profiles, which are designated the V, S, and V2 types. From the cultivar "Pinto III," devoid
Fig. 4. PNA isolectin profiles. (A-F) Isolectin profiles obtained from several Arachis lines by Pueppke (1981). (A) V type from A. monticola 405933. (B) V2 type from A. hypogaea 288152. (C) S type from Spanish peanut (A. hypogaea). (D) Glabrata type from A. glabrata 231319. (E) Villosa type from A. villosa 330653. (F) Villosulicarpa type from Arachis sp. 344484. X = profile of PNA prepared by Lauwereys (1982).
of common Phaseolus vulgaris agglutinin, Pusztai et al. (1981) isolated a new type of seed lectin. The "Pinto III" seed lectin chain, although related to PHA-E, differs from it by 4 residues out of 23 when the amino terminal sequences are compared (Fig. 2). A screen of soybean cultivars in the soybean germplasm collection of the U.S. Department of Agriculture revealed several "Le-" lines that lacked detectable seed lectin protein (Pull et al., 1978), and this trait followed a simple recessive mode of inheritance (Orf et al., 1978). Vodkin et al. (1983) and Goldberg et al. (1983) have now shown that certain "Le-" cultivars possess an allelic form, le1, of the seed lectin gene Le1. This form differs from Le1 by six base substitutions and contains a 3.4-kb insertion element that interrupts the coding region of the gene, thus preventing transcription of the 5' lectin sequence.

IV. COMPLETE SEQUENCES

The complete primary structures of seven lectins have now been determined, and their comparison (Fig. 5) fully supports and extends the conclusions drawn from the amino terminal sequence studies.

The complete sequences of fava bean ["favin" (Cunningham et al., 1979)], lentil ["LL" (Foriers et al., 1981)], and sainfoin seed lectins ["SL" (Hapner et al., 1983)] have been obtained by classical methods using the protein sequencer and various proteolytic cleavages. These sequences were compared to that of concanavalin A previously determined by similar methods. We also consider in this comparison the sequences of the lectins from French bean (PHA), soybean (SBA), and pea, deduced from the nucleotide sequences of cDNA prepared by reverse transcription of mRNA [Hoffman et al. (1982) for Phaseolus, Vodkin et al. (1983) for soybean, and Higgins et al. (1983) for pea].
As predicted from the studies on the two-chain lectins and their single polypeptide chain precursors, it appears that all leguminous lectins are homologous throughout their sequence, provided one introduces appropriate deletions. To maximize the homology, the concanavalin A sequence has to be rearranged into two portions, one (positions 122 to 237 and 1 to 69) aligned with positions 1 to 185 of the soybean agglutinin and the other (positions 70 to 121) aligned with positions 186 to 238 of the

Fig. 5. Complete sequences of SBA, favin, LL, pea, SL, PHA, and con A. The identical residues have been boxed in. Empty brackets correspond to deletions. The empty space at the end of the LL β and the favin, as well as α-chains, may denote postsynthetic removal of small fragments. The references are given in the text.
[Fig. 5 alignment: complete sequences of SBA, favin, LL, pea, SL, PHA, and con A, in one-letter code.]
soybean agglutinin and also with the peanut and sainfoin lectins. In similar fashion, maximum homology is observed when the lectins from the lentil, the pea, and Vicia faba are aligned, in tandem, such that their α-chains correspond to residues 70 to 119 of concanavalin A and their β-chains begin at residue 120 of concanavalin A. Such relationships have been termed circular permutations of amino acid sequences (Fig. 6) (Foriers et al., 1978; Hopp et al., 1982; Hemperly et al., 1982; Hemperly and Cunningham, 1983; Olsen, 1983).

Examining particularly important sites on lectins, one is struck by the following similarities.

A. Metal Binding Sites

In concanavalin A the metal binding site is situated between positions 1 and 45. It appears that this region is highly conserved among the various lectins compared here. Thus, of the 45 residues, 16 are identical in all proteins but that from Phaseolus vulgaris. All the amino acids that interact directly with the metal ions, Ca²⁺ and Mn²⁺, in concanavalin A are among the identical residues (Glu8, Asp10, Asn14, Asp19, His24, and Ser34). Tyr12, also involved in the Ca²⁺ binding site, is replaced in all the proteins
Fig. 6. Diagrammatic representation of the circularly permuted sequence homology that relates concanavalin A to other leguminous lectins. [After Olsen (1983).]
by another aromatic residue, phenylalanine (cf. Roberts and Goldstein, 1984). The Phaseolus vulgaris lectin shows less homology in this sequence position, sharing only 11 amino acids with concanavalin A. In particular, only three amino acids (Glu8, Asp10, and Ser34) involved in the metal binding sites are present in the Phaseolus vulgaris sequence, and His24 is replaced by another basic residue, arginine.

B. Hydrophobic Cavity

Of utmost significance is the conservation in all the lectins studied of the amino acid residues that contribute to the three-dimensional structure of the well-characterized hydrophobic cavity of concanavalin A. Leu81, Val89, Phe111, Phe191, and Phe212 are found in homologous positions in the seven reported sequences. The other amino acids forming the hydrophobic cavity can be replaced by chemically homologous residues. For example, Tyr54, conserved in sainfoin, Phaseolus vulgaris, and soybean agglutinin, is substituted by Phe in favin and pea lectin (but is absent in lentil lectin); Leu85 is conserved in the sainfoin and soybean agglutinins but exchanged for Val in the other lectins. Val179 is replaced by Ile in all the other sequences.

The conservation of this hydrophobic cavity through evolution supports its essential role in the function of plant lectins, especially since it has been shown that concanavalin A can bind plant auxins (Edelman and Wang, 1978) and recently that several lectins may strongly interact with adenine and cytokinin via their hydrophobic binding site (Roberts and Goldstein, 1983a,b).

C. Glycosylation Site

Aside from concanavalin A and lentil lectin, the lectins presented in Fig. 5 are glycosylated. The favin β-chain contains a carbohydrate moiety attached to a characteristic Asn-X-Thr/Ser sequence at position 169.
This Asn169-Ala-Thr171 sequence is absent in lentil lectin and is replaced by Asn-Ser-Val in concanavalin A, which also lacks covalently bound carbohydrate, and by Asn140-Ser141-Ser142 in Phaseolus vulgaris, in which it has been predicted to be one of the glycosylation sites, the others being at positions 12 to 14, 65 to 67, 152 to 154, and 160 to 162 (Miller et al., 1975; Hoffman et al., 1982). However, the carbohydrate attachment site has been assigned as Asn18 in the sainfoin lectin (Kouchalakos et al., 1984) and Asn75 in soybean agglutinin (Becker et al., 1983). Thus the carbohydrate attachment position is not conserved among the glycosylated lectins. Hence the carbohydrate structure may not be involved in an important functional role.

D. Carbohydrate Binding Site

The amino acid residues that constitute the sugar binding site in concanavalin A (Hardman and Ainsworth, 1976; Becker et al., 1976) are poorly conserved in the other mannose/glucose-binding lectins. For instance, Asp208 is the only residue maintained in these proteins. This observation may be related to the difference in saccharide-binding affinities exhibited by these lectins (Horejsi et al., 1977; Goldstein and Hayes, 1978). Each subunit of the lima bean lectin contains a cysteinyl residue that has been shown to be essential for carbohydrate-binding activity. The peptide containing this residue is homologous to a region that in concanavalin A contains amino acids involved in metal binding sites and in carbohydrate contact (Goldstein et al., 1983; Roberts and Goldstein, 1984). This Cys residue may be aligned in the concanavalin A sequence with Tyr12, a residue also involved in the sugar binding site of this lectin. So far, the lima bean lectin is the only one reported to contain such a cysteinyl residue. More work is obviously needed to characterize the sugar binding sites of the various lectins.

E. Folding and Three-Dimensional Structure

The striking homologies between the sequences of the one- and two-chain lectins strongly suggest a close resemblance in the folding and three-dimensional structure of these proteins. Indeed, the secondary structure of concanavalin A based on x-ray crystallographic determination and the probable secondary structure of the lentil lectin predicted by two methods of calculation appeared essentially similar (Foriers et al., 1981). By computational methods, it appears that the legume lectins also share common three-dimensional structures (Olsen, 1983).
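The circular permutation that underlies these one-chain and two-chain homologies can be made concrete with the position ranges quoted at the start of this section: concanavalin A positions 122 to 237 followed by 1 to 69 are set against soybean agglutinin positions 1 to 185, and positions 70 to 121 against 186 to 238. The Python sketch below only rearranges position numbers; it uses the integers 1 to 237 as a stand-in for the concanavalin A residues rather than real sequence data, and the helper names are illustrative.

```python
from typing import List, Sequence, Tuple


def take(seq: Sequence[int], start: int, end: int) -> List[int]:
    """Return positions start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return list(seq[start - 1:end])


def rearrange(seq: Sequence[int], segments: Sequence[Tuple[int, int]]) -> List[int]:
    """Concatenate the given 1-based inclusive segments in the order listed."""
    out: List[int] = []
    for start, end in segments:
        out.extend(take(seq, start, end))
    return out


# Stand-in for concanavalin A: position numbers 1..237, not residue names.
# 237 is the highest concanavalin A position quoted in the text.
con_a = list(range(1, 238))

# Portion set against soybean agglutinin positions 1 to 185.
first_block = rearrange(con_a, [(122, 237), (1, 69)])
# Portion set against soybean agglutinin positions 186 to 238.
second_block = rearrange(con_a, [(70, 121)])

print(len(first_block), first_block[0], first_block[-1])
print(len(second_block), second_block[0], second_block[-1])
```

With these ranges the first block has 116 + 69 = 185 positions, which matches the 1 to 185 range quoted for soybean agglutinin. The second block has 52 residues against 53 positions (186 to 238), so that part of the alignment presumably contains at least one deletion on the concanavalin A side.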
V. CONCLUSION

In conclusion, lectin structures have been highly conserved in evolution, presumably to ensure the maintenance of important physiological functions yet to be characterized.
Introduction and expression of lectin genes in plants that do not produce these proteins should allow an investigation of their role in growth regulation and the interactions of the plant with microorganisms.

ACKNOWLEDGMENTS

The work reviewed here has been supported by the contract "Fixation de l'Azote" between the University of Paris VII, Institut Pasteur, INRA and Elf Aquitaine, EMC, CdF Chimie, and Rhone-Poulenc, and by grants from the CNRS (ATP "Interactions Plantes-Microorganismes"), and the Ministry of Education ("Biologie 1983").
REFERENCES

Barondes, S. H. (1981). Annu. Rev. Biochem. 50, 207-231.
Baumann, C., Rüdiger, H., and Strosberg, A. D. (1979). FEBS Lett. 102, 216-218.
Becker, J. W., Reeke, G. N., Cunningham, B. A., and Edelman, G. M. (1976). Nature (London) 259, 406-409.
Becker, J. W., Cunningham, B. A., and Hemperly, J. J. (1983). In "Chemical Taxonomy, Molecular Biology and Function of Plant Lectins" (I. J. Goldstein and M. E. Etzler, eds.), pp. 31-45. Alan R. Liss, Inc., New York.
Cunningham, B. A., Hemperly, J., Hopp, T. P., and Edelman, G. M. (1979). Proc. Natl. Acad. Sci. U.S.A. 76, 3218-3222.
Edelman, G. M., and Wang, J. L. (1978). J. Biol. Chem. 253, 3016-3022.
Edelman, G. M., Cunningham, B. A., Reeke, G. N., Jr., Becker, J. W., Waxdal, M. J., and Wang, J. L. (1972). Proc. Natl. Acad. Sci. U.S.A. 69, 2580-2584.
Etzler, M. E., Talbot, C. F., and Ziaya, P. R. (1977). FEBS Lett. 82, 39-41.
Foriers, A., Wuilmart, C., Sharon, N., and Strosberg, A. D. (1977). Biochem. Biophys. Res. Commun. 75, 980-986.
Foriers, A., De Neve, R., Kanarek, L., and Strosberg, A. D. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 1136-1139.
Foriers, A., De Neve, R., and Strosberg, A. D. (1979). Physiol. Veg. 17, 597-606.
Foriers, A., Lebrun, E., Van Rappenbusch, R., De Neve, R., and Strosberg, A. D. (1981). J. Biol. Chem. 256, 5550-5560.
Gebauer, G., Schütz, E., Schimpl, A., and Rüdiger, H. (1979). Hoppe-Seyler's Z. Physiol. Chem. 360, 1727-1735.
Goldberg, R. B., Hoschek, G., and Vodkin, L. O. (1983). Cell 33, 465-475.
Goldstein, I. J., and Etzler, M. E., eds. (1983). "Chemical Taxonomy, Molecular Biology, and Function of Plant Lectins." Alan R. Liss, Inc., New York.
Goldstein, I. J., and Hayes, C. E. (1978). Adv. Carbohydr. Chem. Biochem. 35, 127-340.
Goldstein, I. J., Lamb, J. E., Roberts, D. D., and Poola, I. (1983). In "Chemical Taxonomy, Molecular Biology and Function of Plant Lectins" (I. J. Goldstein and M. E. Etzler, eds.), pp. 21-29. Alan R. Liss, Inc., New York.
Hapner, K. D., Kouchalakos, R. N., and Bradshaw, R. A. (1983). In "Chemical Taxonomy, Molecular Biology and Function of Plant Lectins" (I. J. Goldstein and M. E. Etzler, eds.), pp. 255-258. Alan R. Liss, Inc., New York.
Hardman, K. D., and Ainsworth, C. F. (1972). Biochemistry 11, 4910-4919.
Hardman, K. D., and Ainsworth, C. F. (1976). Biochemistry 15, 1120-1128.
Hemperly, J. J., and Cunningham, B. A. (1983). Trends Biochem. Sci. 8, 100-102.
Hemperly, J. J., Becker, J. W., and Cunningham, B. A. (1982). In "Proteins in Biology and Medicine" (R. A. Bradshaw, C.-C. Liang, R. L. Hill, T.-C. Tsao, J. Tang, and C.-L. Tsou, eds.), pp. 395-409. Academic Press, New York.
Higgins, T. J. V., Chandler, P. M., Zurawski, G., Button, S. C., and Spencer, D. (1983). J. Biol. Chem. 258, 9544-9549.
Hoffman, L. M., Ma, Y., and Barker, R. F. (1982). Nucleic Acids Res. 10, 7819-7828.
Hopp, T. P., Hemperly, J. J., and Cunningham, B. A. (1982). J. Biol. Chem. 257, 4473-4483.
Horejsi, V., Ticha, M., and Kocourek, J. (1977). Biochim. Biophys. Acta 499, 290-300.
Kolberg, J., and Sletten, K. (1982). Biochim. Biophys. Acta 704, 26-30.
Kolberg, J., Michaelson, T. E., and Sletten, K. (1980). FEBS Lett. 117, 281-283.
Kouchalakos, R. N., Bates, O. J., Bradshaw, R. A., and Hapner, K. D. (1984). Biochemistry 23, 1824-1830.
Lamb, J. E., and Goldstein, I. J. (1984). Arch. Biochem. Biophys. 229, 15-26.
Lauwereys, M. (1982). Thesis, Free University of Brussels.
Lis, H., and Sharon, N. (1977). In "The Antigens" (M. Sela, ed.), Vol. 4, pp. 429-529. Academic Press, New York.
Miller, J. B., Hsu, R., Heinrikson, R., and Yachnin, S. (1975). Proc. Natl. Acad. Sci. U.S.A. 72, 1388-1391.
Olsen, K. W. (1983). Biochim. Biophys. Acta 743, 212-218.
Orf, J. H., Hymowitz, T., Pull, S. P., and Pueppke, S. G. (1978). Crop Sci. 18, 899-900.
Pueppke, S. G. (1981). Arch. Biochem. Biophys. 212, 254-261.
Pull, S. P., Pueppke, S. G., Hymowitz, T., and Orf, J. H. (1978). Science 200, 1277-1279.
Pusztai, A., Grant, G., and Stewart, J. C. (1981). Biochim. Biophys. Acta 671, 146-154.
Roberts, D. D., and Goldstein, I. J. (1983a). In "Chemical Taxonomy, Molecular Biology and Function of Plant Lectins" (I. J. Goldstein and M. E. Etzler, eds.), pp. 131-141. Alan R. Liss, Inc., New York.
Roberts, D. D., and Goldstein, I. J. (1983b). J. Biol. Chem. 258, 13820-13824.
Roberts, D. D., and Goldstein, I. J. (1984). J. Biol. Chem. 259, 909-914.
Van Driessche, E., Foriers, A., Strosberg, A. D., and Kanarek, L. (1976). FEBS Lett. 71, 220-223.
Vodkin, L. O., Rhodes, P. R., and Goldberg, R. B. (1983). Cell 34, 1023-1031.
Young, N. M., Williams, R. E., Roy, C., and Yaguchi, M. (1982). Can. J. Biochem. 60, 933-941.