Evaluation of Genetic Variation and Association in the Matrix Metalloproteinase 9 (MMP9) Gene in ESRD Patients
Shohei Hirakawa, MD, Ethan M. Lange, PhD, Carla J. Colicigno, BS, Barry I. Freedman, MD, Stephen S. Rich, PhD, and Donald W. Bowden, PhD
Background: Matrix metalloproteinases are Zn2+- and Ca2+-dependent endopeptidases secreted by many cells. Expression of the matrix metalloproteinase 9 (MMP9) gene is increased in a variety of renal diseases. Several genetic studies have associated MMP9 with end-stage renal disease (ESRD).
Methods: In this study, 2.2 kb of the promoter region and all 13 exons (3.3 kb) of MMP9 genomic DNA were scanned for polymorphisms. Genetic associations between MMP9 polymorphisms and renal disease were evaluated.
Results: Eleven single-nucleotide polymorphisms (SNPs; 4 promoter, 6 coding region, and 1 3′ untranslated region [UTR]) were identified in Caucasians and 19 SNPs (11 promoter, 8 coding region, 1 3′ UTR) were identified in African Americans. A previously identified highly polymorphic (CA)n repeat in the promoter region of MMP9 also was evaluated. We found 15 alleles in Caucasians and 14 alleles in African Americans. Allele frequencies, genotypes, and 3-marker haplotypes were compared between patient and control populations. Differences were not observed using single-locus analyses. Two haplotypes that included the (CA)n repeat allele in African-American patients with type 2 diabetic nephropathy (T2DM/ESRD) showed borderline significant differences. Dichotomizing the (CA)n repeat distribution showed that shorter alleles in Caucasian cases were associated with ESRD using an additive disease-predisposing model (P = 0.05). Analysis of the (CA)n repeat in expanded sets of subjects showed strong evidence for an association of shorter alleles with ESRD in Caucasians (P = 0.00012) and suggested a similar trend in African Americans with T2DM/ESRD (P = 0.086) and subjects without T2DM/ESRD (P = 0.047).
Conclusion: This comprehensive analysis of MMP9 and renal disease suggests a possible role for the (CA)n repeat in renal disease, consistent with previous reports. Am J Kidney Dis 42:133-142. © 2003 by the National Kidney Foundation, Inc. INDEX WORDS: Matrix metalloproteinase; renal disease; polymorphism; African Americans.
Matrix metalloproteinases are members of a family of Zn2+- and Ca2+-dependent endopeptidases secreted by many cells as proenzymes.1 The matrix metalloproteinase 9 (MMP9) gene is located in 20q12-q13.1 and has been linked to cardiovascular disease2 and tumor metastasis.3-5 MMP9 is secreted by human glomerular mesangial cells, and levels are increased in a variety of renal diseases, such as membranous nephropathy and renal fibrosis,6,7 suggesting that MMP9 may be associated with progressive nephropathy. Diabetic nephropathy is one of the most serious complications of diabetes. Recent linkage and association studies in Caucasian patients with type 2 diabetes and associated nephropathy suggest that at least 1 gene is located in 20q12-13.1.8 Because our studies were performed in patients with diabetes-associated renal disease, this region also may be linked with renal disease, and MMP9 might be one of the susceptibility genes for renal disease in type 2 diabetes. To investigate the association between MMP9 and type 2 diabetic nephropathy (T2DM/end-stage renal disease [ESRD]), we scanned 2.2 kb of the promoter region and the 13 exons (3.3 kb) of MMP9 for sequence variation and determined allele frequencies in Caucasian and African-American population samples.

MATERIALS AND METHODS
Patients
Identification, clinical characteristics, and recruitment of patients and controls have been described in detail.9 In brief, unrelated African-American and Caucasian case patients with T2DM/ESRD were dialysis dependent and had at least background diabetic retinopathy or a urinalysis with 3+ or greater protein excretion in the absence of other recognized causes of nephropathy. Diabetes mellitus was diagnosed in these individuals at least 5 years before the initiation of renal
From the Departments of Biochemistry, Public Health Sciences, and Internal Medicine; and Center for Human Genomics, Wake Forest University School of Medicine, Winston-Salem, NC. Received June 7, 2002; accepted in revised form January 21, 2003. Supported in part by grants no. RO1 DK53591 (D.W.B.), RO1 DK56289 (D.W.B.), and RO1 HL56266 (B.I.F.) from The National Institutes of Health. Address reprint requests to Donald W. Bowden, PhD, Department of Biochemistry, Wake Forest University School of Medicine, Medical Center Blvd, Winston-Salem, NC 27157. E-mail:
[email protected] © 2003 by the National Kidney Foundation, Inc. 0272-6386/03/4201-0014$30.00/0 doi:10.1016/S0272-6386(03)00416-5
American Journal of Kidney Diseases, Vol 42, No 1 (July), 2003: pp 133-142
replacement therapy. African-American cases with nondiabetic ESRD were unrelated dialysis-dependent individuals with hypertension-associated and chronic glomerular disease-associated ESRD. Unrelated African-American and Caucasian "healthy" controls were born in North Carolina and denied personal knowledge of renal disease or the presence of first-degree relatives with ESRD. Initially, we analyzed genomic DNA from 75 Caucasian patients with T2DM/ESRD, 95 Caucasian healthy controls, 91 African-American patients with T2DM/ESRD, 96 African-American patients with nondiabetic nephropathy (hypertensive nephropathy, chronic glomerular diseases, and disease of unknown origin), and 86 African-American healthy controls. This initial survey population had power equal to 0.80 to detect an association, when present, with underlying odds ratios equal to 3.25, 2.45, and 2.5 for markers with minor allele frequencies of 0.10, 0.30, and 0.50, respectively, in controls. Power to detect associations with an odds ratio of 2.0 was equal to 0.34, 0.58, and 0.58 for markers with minor allele frequencies of 0.10, 0.30, and 0.50, respectively, in controls. To evaluate the (CA)n repeat in greater detail, additional patients (see text) were recruited using the same criteria.
Polymorphism Scanning of MMP9
Polymerase chain reaction (PCR) primers for the 13 exons and the promoter region (2.2 kb) of MMP9 were designed and modified from previous reports10 to yield products of between 150 and 300 bp (Table 1). PCR products were analyzed by single-strand conformation polymorphism, restriction enzyme digestion, or denaturing high-performance liquid chromatography (DHPLC). All potential sequence differences were confirmed by direct DNA sequencing.
Single-Strand Conformation Polymorphism Analysis
PCR primer pairs were end-labeled with [γ-32P]deoxyadenosine triphosphate (ICN Radiochemicals, Irvine, CA). PCR amplification was performed in a total 10-µL volume containing 20 ng of genomic DNA; 50 mmol/L of potassium chloride; 100 mmol/L of tris(hydroxymethyl)aminomethane hydrochloride (Tris-HCl), pH 8.8; 1.2 mmol/L of magnesium chloride; 0.2 mmol/L of deoxyribonucleoside triphosphates (dNTPs); 4 pmol each of unlabeled primer; 0.9 mmol each of end-labeled primer; 8 nmol of spermidine; and 1.0 U of Taq polymerase. Products were denatured and analyzed by electrophoresis on native 0.5% MDE/5% glycerol (FMC Products, Rockland, MA) gels in 0.6× Tris borate EDTA at room temperature at 12 to 15 W for 15 to 24 hours. Gels were exposed overnight to X-ray film (Fuji, Stamford, CT) and checked for the presence of abnormal migration patterns.
DHPLC Analysis
PCR for DHPLC was performed in a total 30-µL volume containing 60 ng of genomic DNA, 12 pmol of each primer, 0.2 mmol/L of dNTPs, 50 mmol/L of potassium chloride, 100 mmol/L of Tris-HCl (pH 8.8), 1.5 mmol/L of magnesium chloride, and 1.0 U of Taq polymerase. DHPLC analysis was performed on an automated DHPLC instrument (Transgenomic Inc, San Jose, CA). Conditions for DHPLC analysis were calculated using WAVEmaker Utility Software (Transgenomic Inc) and tested at 4 to 5 different temperatures using 5-µL injections from 20 samples. After an optimum temperature was identified, 5-µL samples were analyzed.
Genotyping of Dinucleotide Repeats in the Promoter Region of MMP9
PCR was performed using primers as follows: MMP9 (CA)n-F, 5′-GTCTTGCCTGACTTGGCAGT-3′, and MMP9 (CA)n-R, 5′-GTTGTGGGGGCTTTAAGGAG-3′. The forward primer was labeled with fluorescent dye. (CA)n dinucleotide repeats were genotyped using an ABI model 377 automated DNA sequencer (Foster City, CA), as described in detail previously.11
Analysis
We performed tests of homogeneity of single-marker allele and genotype frequencies separately for the Caucasian (11 markers) and African-American (19 markers) populations using either Pearson's chi-square test or Fisher's exact test. Two-df tests were used for genotype association tests. When the expected number of homozygotes for a particular allele was less than 5, counts for rare homozygotes were combined with heterozygote counts and a single-df test was performed. We also performed a likelihood ratio test of homogeneity of haplotype frequencies for each combination of 3 consecutive markers. For this test, we estimated haplotype frequencies separately for cases and controls using the expectation-maximization algorithm, which accounts for missing genotype data and unknown haplotype phase. The likelihood ratio statistic compared the product of maximum likelihoods for the 2 samples with the maximum likelihood for the 2 samples analyzed jointly. Because of small estimated haplotype frequencies, we assessed statistical significance using a permutation test based on 10,000 random replicates. To construct these permuted samples, we randomly permuted affection status of cases and controls, keeping the marker data the same. For each permuted sample, we calculated the likelihood ratio test and estimated P as the proportion of permuted-data statistics greater than the observed test statistic.
The (CA)n repeat in the promoter region was analyzed in several different ways. Tests of homogeneity of allele frequencies were performed using Pearson's chi-square test for 2 × n contingency tables. Rare alleles (alleles with expected counts less than 5) were pooled in these analyses. To test whether the total number of (CA)n repeats was associated with affection status, we initially dichotomized the allele distribution at (CA)15 repeats [ie, (CA)n for n less than 15 was designated allele 1, and (CA)n for n of 15 or greater was designated allele 2].
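The permutation procedure described in this section can be sketched in a few lines. This is a minimal illustration, not the authors' software: the function names, the seed, and the toy data are hypothetical, and a simple allele frequency difference stands in for the haplotype likelihood ratio statistic, which would require an expectation-maximization step. The sketch also counts permuted statistics greater than or equal to the observed one, a slightly more conservative convention than the strict inequality described above.

```python
import random


def permutation_p_value(cases, controls, statistic, n_permutations=10_000, seed=0):
    """Monte Carlo permutation P value for a case-control test statistic.

    Marker data are kept fixed; only case/control labels are shuffled.
    """
    rng = random.Random(seed)
    observed = statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of affection status
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


def allele_frequency_difference(cases, controls):
    """Stand-in statistic (assumption): absolute difference in allele 1 frequency."""
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


# Hypothetical toy data: 1 = allele 1, 0 = allele 2
toy_cases = [1] * 40 + [0] * 10
toy_controls = [1] * 10 + [0] * 40
p_toy = permutation_p_value(
    toy_cases, toy_controls, allele_frequency_difference, n_permutations=2000
)
print(p_toy)
```

Any statistic that takes the two groups and returns a number, including a haplotype likelihood ratio, could be passed in place of `allele_frequency_difference`.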
The partition at 15 repeats was chosen a priori based on the bimodal distribution of (CA)n repeat lengths. Using this dichotomization, homogeneity of allele frequencies was tested using 1-df chi-square tests, and homogeneity of genotype frequencies was tested using 2-df chi-square tests. With the detection of a potential association between the dinucleotide repeat polymorphism and ESRD in both the Caucasian and African-American populations, an expanded set of DNA samples (see text) was genotyped and data were
Table 1. PCR Primer Sets for MMP9 Analysis
Primer* | Sequence (5′-3′) | Size (bp)/Amplified Sequence | Position
P15-F | GCTTCAGAGCCAGGCAGTTC | 300 promoter† | -2170 to -2150
P15-R | CTGGATTTCCATCCCGGCTC | | -1890 to -1870
P16-F | TGGTTCAGAGGTAAAGTGAC | 260 promoter | -1942 to -1922
P16-R | CTAGGATTACGGGCATGAGC | | -1702 to -1682
P13-F | GCCTGGCACATAGTAGGCCC | 281 promoter | -1753 to -1733
P13-R | CAGTGGCGTGATCTCGGCTC | | -1492 to -1472
P14-F | AGCTACTCGGGAGGCTGAGG | 231 promoter | -1553 to -1533
P14-R | CTTCCTAGCCAGCCGGCATC | | -1340 to -1322
P11-F | CAAATAGGGCTTTGAAGAAG | 240 promoter | -1379 to -1359
P11-R | TTTCTCAGCTTAGGATTCTT | | -1159 to -1139
P12-F | TAAGGGCTCCTATAGATTAT | 239 promoter | -1212 to -1192
P12-R | TTGAGCCCAGAAAGAAGGTC | | -933 to -973
P9-F | GTGACATAATCATGGCTCAC | 290 promoter | -1022 to -1002
P9-R | GAAGTGTTTCCTCAGCTGTA | | -752 to -732
P10-F | CCTCACATCAATTTAGGGAC | 268 promoter | -803 to -783
P10-R | CTTCCTCTCCCTGCTTCATC | | -554 to -535
P1-F | GAATTCCCCAGCCTTGCCTA | 240 promoter | -599 to -579
P2-R | CCAAGGGAAAGTGATGGAAG | | -379 to -359
P3-F | CTCAGGGAGTCTTCCATCAC | 258 promoter | -389 to -369
P4-R | GCAGCACCAGCATGAGAAAG | | -152 to -131
P5-F | CCCTGACCCCTGAGTCAGCA | 220 promoter | -139 to -119
P6-R | TGGGGGCAGCAAAGCAGCG | | +61 to +83
E1-F | CTGACCCCTGAGTCAGCAC | 252 exon 1 | -92 to -73
E1-R | TTGCCCACCTCTGCCAGC | | From +7 (beginning of intron 1)
E2-F | TGATCCACAGGAATACCTG | 253 exon 2 | From -1 (beginning of exon 2)
E2-R | CCCGGCTCACCAATAGGTG | | From +9 (beginning of intron 2)
E3-F | TACGCTACAGGATCCAAAAC | 169 exon 3 | From -11 (beginning of exon 3)
E3-R | ACGTTCTCACCCGCGACAC | | From +10 (beginning of intron 3)
E4-F | GTTTCTTCAGAGCACGGAGAC | 149 exon 4 | From -10 (beginning of exon 4)
E4-R | GAATCTAACCGACGCCCCT | | From +10 (beginning of intron 4)
E5-F | CCTCCTGCAGTGGTTCCAAC | 194 exon 5 | From -10 (beginning of exon 5)
E5-R | CCTCACTCACTCTCGCTGGG | | From +10 (beginning of intron 5)
E6-F | CTCGCCCCAGGACTCTACAC | 194 exon 6 | From -10 (beginning of exon 6)
E6-R | GTGGAGGTACCTCGGGTCGGG | | From +10 (beginning of intron 6)
E7-F | GTCTCTCCAGCTGACTCGAC | 197 exon 7 | From -10 (beginning of exon 7)
E7-R | CCACGCCTACCTTGGTCCGGG | | From +10 (beginning of intron 7)
E8-F | CTCCCTCCAGGATACAGTTTG | 176 exon 8 | From -10 (beginning of exon 8)
E8-R | CCTGCCTCACCATAGAGGTGCC | | From +10 (beginning of intron 8)
E9-F | CTCTTTTTAGGTCCTCGCCC | 300 exon 9 | From -10 (beginning of exon 9)
E9-R | GCCTCCTCACCCATCCTTGA | | From +10 (beginning of intron 9)
E10-F | GCTTTCTCAGGAAGTACTGGC | 160 exon 10 | From -10 (beginning of exon 10)
E10-R | GGTAACTAACCAGAGAAGAAG | | From +10 (beginning of intron 10)
E11-F | TCCCCTGCAGGGCGCCAGG | 171 exon 11 | From -10 (beginning of exon 11)
E11-R | CGGCGCTCACCTCCAGAGC | | From -10 (beginning of intron 11)
E12-F | CTGCCCGCAGGTTCGACGTG | 124 exon 12 | From -10 (beginning of exon 12)
E12-R | CAGCCCTCACCTTGGTACTG | | From +10 (beginning of intron 12)
E13-F | TCTCCTGCAGAGAAAGCC | 320 exon 13 | From -10 (beginning of exon 13)
E13-R | AAAGGTTAGAGAATCCAAG | | From +191 (stop codon)
*Primers were designed and modified from a previous report.2
†Promoter positions are numbered from the start of transcription.
analyzed in a similar manner for the dinucleotide repeat polymorphism, but in this case dichotomizing at all possible partitions between larger and smaller numbers of (CA)n repeats.
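The scan over all possible partitions can be illustrated with the Caucasian allele counts reported in Table 7. The sketch below is an assumption-laden illustration rather than the authors' analysis: it uses an uncorrected 1-df Pearson chi-square test at every cut point, the function and variable names are hypothetical, and the resulting P values need not match those in Table 7 exactly because the exact test used for each partition is not stated.

```python
import math

# Allele counts for the expanded Caucasian sample, copied from Table 7
REPEATS = [12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
CONTROLS = [1, 381, 20, 1, 0, 1, 4, 7, 115, 102, 41, 16, 1, 3, 0, 1]
CASES = [0, 397, 12, 0, 2, 0, 14, 13, 104, 120, 41, 3, 0, 0, 0, 0]


def chi_square_1df(a, b, c, d):
    """Uncorrected Pearson chi-square and P value for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return None  # an empty row or column makes the test undefined
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square with 1 df equals the two-sided normal tail
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def scan_cut_points(repeats, controls, cases):
    """Test (<= k repeats) versus (> k repeats) for every possible cut point k."""
    results = {}
    for k in repeats:
        ctrl_short = sum(n for r, n in zip(repeats, controls) if r <= k)
        case_short = sum(n for r, n in zip(repeats, cases) if r <= k)
        test = chi_square_1df(
            ctrl_short, sum(controls) - ctrl_short,
            case_short, sum(cases) - case_short,
        )
        if test is not None:
            results[k] = test
    return results


results = scan_cut_points(REPEATS, CONTROLS, CASES)
best_k = min(results, key=lambda k: results[k][1])
for k, (chi2, p_value) in results.items():
    print(f"<= {k} repeats: chi-square = {chi2:.2f}, P = {p_value:.5f}")
print("cut point with smallest P:", best_k)
```

Because the cut points are tested on the same data and are strongly correlated, the smallest P value from such a scan is optimistic and is best read as exploratory.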
To measure the strength of disequilibrium, we used Lewontin's D′.12 For 2-allele markers, D′ is the standardized disequilibrium value that takes the usual disequilibrium coefficient $D_{ij} = P(A_i B_j) - P(A_i)P(B_j)$ and divides it by its
Table 2. Polymorphisms in the MMP9 Gene (Allele Frequency)
Position | Variant | Caucasian T2DM | Caucasian Control | African American T2DM | African American Non-T2DM | African American Control
Promoter
-2119 bp | C→T | 0.98, 0.02 | 0.96, 0.04 | 0.98, 0.02 | — | —
-2078 bp | T→G | — | — | — | 0.99, 0.01 | 0.98, 0.02
-2075 bp | T→C | — | — | — | 0.99, 0.01 | —
-2027 bp | G→A | — | 0.99, 0.01 | 0.98, 0.02 | — | —
-2012 bp | C→T | — | — | 0.68, 0.32 | 0.71, 0.29 | 0.74, 0.28
-1989 bp | T→A | — | — | 0.99, 0.01 | — | —
-1979 bp | G→A | — | — | — | — | 0.99, 0.01
-1919 bp | C→T | 0.68, 0.32 | 0.68, 0.32 | 0.72, 0.28 | 0.74, 0.26 | 0.72, 0.28
-1562 bp | C→T | 0.86, 0.14 | 0.86, 0.14 | 0.90, 0.10 | 0.92, 0.08 | 0.90, 0.10
-633 bp | T→G | — | — | 0.68, 0.32 | 0.72, 0.28 | 0.74, 0.26
Coding region
Gly15Gly | C→T | — | — | 0.99, 0.01 | 0.98, 0.02 | 0.97, 0.03
Ala20Val | C→T | 0.99, 0.01 | 0.98, 0.02 | 0.98, 0.02 | — | —
Arg279Gln | G→A | 0.30, 0.70 | 0.35, 0.65 | 0.33, 0.67 | 0.32, 0.68 | 0.33, 0.67
Pro574Arg | C→G | 0.96, 0.04 | 0.94, 0.06 | 0.84, 0.16 | 0.84, 0.16 | 0.84, 0.16
Gly607Gly | A→C | 0.36, 0.64 | 0.42, 0.58 | 0.59, 0.41 | 0.55, 0.45 | 0.60, 0.40
Gln668Arg | A→G | 0.15, 0.85 | 0.15, 0.85 | 0.14, 0.86 | 0.16, 0.84 | 0.18, 0.82
Cys674Cys | C→T | — | — | 0.99, 0.01 | 0.99, 0.01 | 0.99, 0.01
Val694Val | G→A | 0.87, 0.13 | 0.83, 0.17 | 0.83, 0.17 | 0.89, 0.11 | 0.86, 0.14
3′ UTR
+6 bp | C→T | 0.40, 0.60 | 0.46, 0.54 | 0.79, 0.21 | 0.76, 0.24 | 0.76, 0.24
NOTE. Promoter polymorphisms are numbered from the start of transcription. Polymorphism in 3′ UTR is numbered from the stop codon.
Abbreviations: T2DM, type 2 diabetes; non-T2DM, subjects without diabetes but with renal failure.
maximum (if positive) or minimum (if negative) possible value. Because of the arbitrary nature of the sign of D′ (D′ ranges between -1 and 1), we report the absolute value of D′.
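As a worked illustration of this definition, the following sketch computes |D′| for a pair of 2-allele markers. The function name and argument names are hypothetical, and the haplotype frequency is assumed to be already known; in practice it would have to be estimated (for example, by expectation-maximization) because phase is not observed directly.

```python
def lewontin_d_prime(p_ab, p_a, p_b):
    """Absolute value of Lewontin's D' for two 2-allele markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1 and allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # usual disequilibrium coefficient
    if d >= 0:
        # largest value D can take given the allele frequencies
        d_bound = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        # magnitude of the smallest (most negative) value D can take
        d_bound = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_bound == 0:
        return 0.0  # monomorphic marker: D' is undefined; 0.0 is a placeholder (assumption)
    return abs(d) / d_bound


# Hypothetical frequencies: D = 0.10 - 0.3 * 0.2 = 0.04 and the bound is 0.14
print(round(lewontin_d_prime(0.10, 0.3, 0.2), 3))
```

Dividing by the bound that matches the sign of D is what keeps the value within the -1 to 1 range before the absolute value is taken.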
RESULTS
Sequence Variations in Caucasian and African-American Populations
We screened 75 patients with type 2 diabetic nephropathy and 95 healthy controls in a Caucasian population. Ten sequence variants in patients with type 2 diabetic nephropathy and 11 variants in healthy controls were identified (Table 2): 3 variants in the promoter, 6 variants in the coding region, and 1 variant in the 3′ untranslated region (UTR) in patients; and 5 variants in the promoter, 4 variants in the coding region, and 1 variant in the 3′ UTR in healthy controls. Four of the 6 coding region substitutions led to a change in amino acid codons: Ala20Val in exon 1, Arg279Gln in exon 6, Pro574Arg in exon 10, and Gln668Arg in exon 12. Minor allele frequencies ranged from 1% to 2% for the Ala20Val C to T polymorphism to 46% for the 3′ UTR +6 bp C to T polymorphism. We screened 91 patients with T2DM/ESRD, 96 patients without diabetes with nephropathy, and 86 healthy controls in an African-American population. Sixteen sequence variants in patients with T2DM/ESRD and 14 variants in healthy controls and patients without diabetes with renal disease were found (Table 2). A number of additional variants were observed in African Americans that were not observed in Caucasians; however, these variants, eg, the -2078 bp T to G single-nucleotide polymorphism (SNP), were rare in the African-American population. Seven variants were in the promoter; 8 variants, in the coding region; and 1 variant, in the 3′ UTR in patients with T2DM/ESRD. Six variants were in the promoter; 7 variants, in the coding region; and 1 variant, in the 3′ UTR in healthy controls and patients without diabetes with renal disease.
Fig 1. Genomic structure of the human MMP9 gene showing exon structure (boxed regions) and location of polymorphic DNA sequences genotyped in this study.
Four of the 8 coding region substitutions changed the amino acid sequence. These variants were seen in both African-American and Caucasian populations. A summary of variants observed in both Caucasian and African-American populations and their location in the MMP9 gene is shown in Fig 1.

Association Analysis of MMP9 SNPs With Renal Disease
Frequencies of alleles and genotypes in individual SNP markers were compared between patient and control populations (data not shown). We found no significant differences in allele or genotype frequencies between patients and controls in either racial group, but there were significant differences in allele frequencies between Caucasian and African-American populations (Table 3), assessed by Pearson's chi-square test or Fisher's exact test. Significantly different allele frequencies were observed with C-2119T (P < 0.01), C-2012T (P < 0.01), C-1562T (P < 0.05), and T-633G (P < 0.01) in the promoter region; Gly15Gly (P = 0.04), Pro574Arg (P < 0.01), and Gly607Gly (P < 0.01) in the coding region; and C+6bpT (P < 0.01) in the 3′ UTR.

Association Analysis of the (CA)n Repeat in the Promoter Region of MMP9 With Renal Disease in the Initial Case-Control Populations
Tests of homogeneity (using 2 × n tests) of the distribution of (CA)n repeat lengths (data not shown) initially showed no significant difference between cases and controls for either Caucasians or African Americans. Dichotomizing the (CA)n repeat distribution suggested significantly shorter alleles (for [CA]n, n < 15) in Caucasian cases (P = 0.05) versus Caucasian controls. Using this dichotomization, no difference was detected in the distribution of genotypes using a 2-df test (P = 0.13); however, a significant difference was detected when an additive disease-predisposing
Table 3. MMP9 SNP Allele Frequencies in Caucasian and African-American Populations
Position | Variant | Caucasian | African American
Promoter
-2119 bp | C→T | 0.97, 0.03 | 0.99, 0.01*
-2078 bp | T→G | — | 0.99, 0.01
-2075 bp | T→C | — | 0.996, 0.004
-2027 bp | G→A | 0.99, 0.01 | 0.99, 0.01
-2012 bp | C→T | — | 0.71, 0.29*
-1989 bp | T→A | — | 0.996, 0.004
-1979 bp | G→A | — | 0.998, 0.002
-1919 bp | C→T | 0.68, 0.32 | 0.73, 0.27
-1562 bp | C→T | 0.86, 0.14 | 0.91, 0.09†
-633 bp | T→G | — | 0.71, 0.29*
Coding region
Gly15Gly | C→T | — | 0.98, 0.02†
Ala20Val | C→T | 0.99, 0.01 | 0.99, 0.01
Arg279Gln | G→A | 0.33, 0.67 | 0.32, 0.68
Pro574Arg | C→G | 0.95, 0.05 | 0.84, 0.16*
Gly607Gly | A→C | 0.39, 0.61 | 0.58, 0.42*
Gln668Arg | A→G | 0.15, 0.85 | 0.16, 0.84
Cys674Cys | C→T | — | 0.99, 0.01
Val694Val | G→A | 0.85, 0.15 | 0.87, 0.13
3′ UTR
+6 bp | C→T | 0.44, 0.56 | 0.77, 0.23*
NOTE. Promoter polymorphisms are numbered from the start of transcription; polymorphism in 3′ UTR is numbered from the stop codon.
*P < 0.01 relative to Caucasian population.
†P < 0.05 relative to Caucasian population.
model was assumed (P = 0.05). No significant differences in allele or genotype frequencies were detected when this dichotomization strategy was applied to our African-American sample of cases and controls.

Haplotype Analysis of MMP9
None of the individual loci tested showed strong evidence of difference between cases and controls. Haplotype analyses can increase the power to detect differences. Consequently, we evaluated frequencies of haplotype combinations of 3 contiguous loci between patients and controls (T2DM versus controls in both populations; non-T2DM versus controls and T2DM versus non-T2DM in an African-American population). We did not find significant differences, but C-1562T-T-633G-(CA)n and T-633G-(CA)n-Gly15Gly haplotypes in both African-American patients with T2DM/ESRD and controls showed
borderline significant differences (P = 0.055 and P = 0.070; Table 4). A summary of haplotype frequencies listed in Table 5 shows that several haplotypes appear to differ in frequency between cases and controls, but the limited number of individuals involved in the analysis precludes defining specific risk haplotypes. We also evaluated other combinations (patterns other than 3 contiguous loci) of polymorphisms to determine whether other 3-marker haplotypes were significantly different between cases and controls. No additional evidence of association was observed with this analysis (data not shown).

Association Analysis of the MMP9 (CA)n in Expanded Populations of Caucasians and African Americans
With this second suggestion that the (CA)n repeat differs in frequency distribution between cases and controls, we analyzed the (CA)n repeat polymorphism in a greatly enlarged number of cases and controls for both African-American and Caucasian populations. This expanded set of samples consisted of 347 Caucasian controls, 352 Caucasian patients with T2DM/ESRD, 237 African-American controls, 283 African-American patients with T2DM/ESRD, and 287 African-American patients with ESRD without diabetes. Fifteen variants (between 12 and 28 repeats) in a Caucasian population and 14 variants (between 14 and 27 repeats) in an African-American population were found (Table 6). The most common allele in Caucasians contained 14 repeats (56% among all Caucasians tested, compared with 25% among all African Americans). In African Americans, the allele with 21 repeats was almost as common (23% of the total) as the 14-repeat allele. Results of the analysis are listed in Table 7. In Caucasians, there is a strong association (P = 0.00012) for increased risk for ESRD in individuals who have 23 or fewer repeats and evidence for association with 24 or fewer repeats (P = 0.03). The 24-repeat allele was significantly more common in controls than in cases (P = 0.002; Table 7), contributing to the overall pattern of association.

Table 4. Haplotype Association Analysis Between Patients and Controls in African-American and Caucasian Populations
Haplotypes | P, T2DM | P, Non-T2DM
African American
C-2119T-C-2012T-C-1919T | 0.133 | 0.794
C-2012T-C-1919T-C-1562T | 0.821 | 0.925
C-1919T-C-1562T-T-633G | 0.711 | 0.897
C-1562T-T-633G-(CA)n | 0.055 | 0.658
T-633G-(CA)n-Gly15Gly | 0.070 | 0.710
(CA)n-Gly15Gly-Ala20Val | 0.229 | 0.794
Gly15Gly-Ala20Val-Arg279Gln | 0.130 | 0.794
Ala20Val-Arg279Gln-Pro574Arg | 0.387 | 0.987
Arg279Gln-Pro574Arg-Gly607Gly | 0.980 | 0.847
Pro574Arg-Gly607Gly-Gln668Arg | 0.725 | 0.886
Gly607Gly-Gln668Arg-Cys674Cys | 0.794 | 0.708
Gln668Arg-Cys674Cys-Val694Val | 0.692 | 0.895
Cys674Cys-Val694Val-C+6bpT | 0.740 | 0.794
Caucasian
C-2119T-C-1919T-C-1562T | 0.488 |
C-1919T-C-1562T-(CA)n | 0.249 |
C-1562T-(CA)n-Ala20Val | 0.549 |
(CA)n-Ala20Val-Arg279Gln | 0.509 |
Ala20Val-Arg279Gln-Pro574Arg | 0.329 |
Arg279Gln-Pro574Arg-Gly607Gly | 0.852 |
Pro574Arg-Gly607Gly-Gln668Arg | 0.645 |
Gly607Gly-Gln668Arg-Val694Val | 0.119 |
Gln668Arg-Val694Val-C+6bpT | 0.194 |

Table 5. Frequencies of 3-Marker Haplotypes Including Microsatellite Polymorphism
Haplotype | Non-T2DM ESRD | T2DM ESRD | Controls
(C-1562T)-(T-633G)-(CAn)
1 1 14 | 0.233 | 0.179 | 0.231
1 1 20 | 0.038 | 0.068 | 0.058
1 1 21 | 0.084 | 0.142 | 0.102
1 1 22 | 0.126 | 0.042 | 0.067
1 1 23 | 0.078 | 0.107 | 0.101
1 2 21 | 0.127 | 0.160 | 0.101
1 2 23 | 0.064 | 0.024 | 0.046
2 1 20 | 0.042 | 0.060 | 0.044
Other | 0.208 | 0.218 | 0.250
(T-633G)-(CAn)-(Gly15Gly)
1 14 1 | 0.216 | 0.180 | 0.220
1 20 1 | 0.076 | 0.128 | 0.108
1 21 1 | 0.117 | 0.120 | 0.139
1 22 1 | 0.127 | 0.056 | 0.077
1 23 1 | 0.077 | 0.119 | 0.099
2 21 1 | 0.127 | 0.164 | 0.098
2 22 1 | 0.009 | 0.057 | 0.065
2 23 1 | 0.065 | 0.021 | 0.044
Other | 0.186 | 0.155 | 0.150
NOTE. Haplotypes with a frequency less than 0.04 are summed as combined frequencies in Other.
Abbreviation: Non-T2DM, subjects without diabetes but with renal failure.
Analysis of African-American data showed a similar, but less significant, trend for alleles with 22 or fewer repeats (P = 0.086) in subjects with T2DM/ESRD and 23 or fewer repeats (P = 0.074) and 21 or fewer repeats (P = 0.047) in subjects with ESRD without T2DM.

Linkage Disequilibrium in the MMP9 Gene
SNPs evaluated in MMP9 are in considerable linkage disequilibrium (LD), as might be expected for loci spread over the relatively short interval of 8.7 kb. Table 8 summarizes marker-to-marker |D′| values for markers with minor allele frequencies of 5% or greater. Interestingly, the average |D′| was 0.90 ± 0.096 in Caucasian and 0.78 ± 0.25 in African-American populations, consistent with most observations that LD is more limited in extent along the genome in African Americans than Caucasians.

Table 6. Allelic Frequencies of Dinucleotide Repeat Polymorphism in the Promoter Region of MMP9
No. of Repeats | Caucasian | African American
12 | 0.001 | <0.001
14 | 0.56 | 0.25
15 | 0.02 | 0.03
16 | 0.001 | 0.001
17 | 0.001 | 0.028
18 | 0.001 | 0.014
19 | 0.013 | 0.015
20 | 0.014 | 0.111
21 | 0.16 | 0.23
22 | 0.16 | 0.17
23 | 0.06 | 0.13
24 | 0.014 | 0.009
25 | 0.001 | 0.004
26 | 0.002 | 0.003
27 | <0.001 | 0.002
28 | 0.001 | <0.001

Table 7. Analysis of MMP9 Microsatellite Repeat in Expanded Sets of Cases and Controls
No. of Repeats | Caucasian Controls | Caucasian T2DM/ESRD Cases | P for ≤ k v > k | P for k v Others | African-American Controls | African-American T2DM/ESRD Cases | P for ≤ k v > k | African-American Non-T2DM/ESRD Cases | P for ≤ k v > k
12 | 1 | 0 | 0.5 | 0.50 | 0 | 0 | NA | 0 | NA
14 | 381 | 397 | 0.67 | 0.63 | 112 | 144 | 0.86 | 123 | 0.35
15 | 20 | 12 | 1.00 | 0.15 | 18 | 12 | 0.077 | 17 | 0.44
16 | 1 | 0 | 1.00 | 0.50 | 0 | 0 | 1 | 1 | 1
17 | 0 | 2 | 0.96 | 0.25 | 10 | 14 | 0.8 | 17 | 0.4
18 | 1 | 0 | 1.00 | 0.50 | 3 | 10 | 0.12 | 8 | 0.36
19 | 4 | 14 | 0.62 | 0.03 | 7 | 9 | 0.97 | 7 | 0.71
20 | 7 | 13 | 0.41 | 0.26 | 44 | 61 | 0.61 | 60 | 0.55
21 | 115 | 104 | 0.90 | 0.38 | 103 | 139 | 0.55 | 98 | 0.047
22 | 102 | 120 | 0.069 | 0.24 | 75 | 73 | 0.086 | 98 | 0.62
23 | 41 | 41 | 0.00012 | 1.00 | 50 | 69 | 0.61 | 82 | 0.074
24 | 16 | 3 | 0.03 | 0.002 | 4 | 3 | 0.71 | 6 | 1
25 | 1 | 0 | 0.06 | 0.50 | 2 | 3 | 1 | 1 | 0.59
26 | 3 | 0 | 0.50 | 0.12 | 0 | 2 | 0.51 | 2 | 0.5
27 | 0 | 0 | NA | NA | 0 | 1 | 1 | 2 | 0.5
28 | 1 | 0 | — | 0.50 | 0 | 0 | NA | 0 | NA
Abbreviations: k, number of repeats in allele; NA, not applicable.

Table 8. LD Between MMP9 Polymorphisms Calculated as |D′| Statistics
African American | C-2012T | C-1919T | T-633G | Arg279Gln | Pro574Arg | Gly607Gly | Gln668Arg | Val694Val
C-1919T | 1.00
T-633G | 0.99 | 1.00
Arg279Gln | 0.65 | 0.84 | 0.64
Pro574Arg | 0.88 | 0.84 | 0.87 | 1.00
Gly607Gly | 0.25 | 0.55 | 0.19 | 0.60 | 0.20
Gln668Arg | 0.46 | 0.43 | 0.43 | 0.98 | 1.00 | 1.00
Val694Val | 1.00 | 0.57 | 1.00 | 0.89 | 0.75 | 0.77 | 0.91
C+6bpT | 0.94 | 0.95 | 0.88 | 0.92 | 0.91 | 0.93 | 1.00 | 0.73
Caucasian | C-1919T | C-1562T | Arg279Gln | Gly607Gly | Gln668Arg | Val694Val
C-1562T | 0.87
Arg279Gln | 0.91 | 0.96
Gly607Gly | 0.93 | 0.95 | 0.98
Gln668Arg | 0.85 | 0.97 | 0.96 | 0.95
Val694Val | 0.67 | 0.84 | 0.81 | 0.72 | 0.88
C+6bpT | 0.96 | 1.00 | 1.00 | 0.96 | 0.95 | 0.74

DISCUSSION
We surveyed the MMP9 gene, including both promoter and coding region, for polymorphic DNA sequences. This study identified 6 previously reported polymorphisms: 2 polymorphisms in the promoter, 3 polymorphisms in the coding region, and 1 polymorphism in the 3′ UTR. In some cases, our results were different from those observed by Zhang et al.10 We did not find the T-1702A and C-861T polymorphisms in the promoter and Glu82Lys and Val571Val in the coding region.10 We observed 3 new polymorphisms (2 polymorphisms in the promoter and 1 polymorphism in the coding region) in Caucasians and 11 new polymorphisms (8 polymorphisms in the promoter and 3 polymorphisms in the coding region) in African Americans (Table 2). These differences may be associated with the study populations: North American Caucasians and African Americans in our study and Caucasians from the United Kingdom in the study of Zhang et al.10 Recent studies suggest that 2 polymorphisms
of the promoter region in the MMP9 gene are associated with MMP9 expression: the C-1562T SNP and (CA)n repeat variants in the promoter region.2,13-15 The C to T substitution at position -1562 in the promoter caused an increase in transcriptional activity in macrophages and was associated with severity of coronary atherosclerosis.2 This polymorphism also has been analyzed for association with atherosclerotic lesion area,16 coronary arterial stenosis,17 multiple sclerosis, and pulmonary emphysema.19 Results of these case-control studies are mixed, with associations seen with coronary atherosclerosis and atherosclerotic lesion area, but no association with severity of coronary arterial stenosis and multiple sclerosis. Our patient population has not been evaluated in detail for coronary atherosclerosis and pulmonary emphysema. Our analyses show no differences between patients with renal disease and controls for the C-1562T SNP. The (CA)n repeat variant in the promoter region is located 90 bp upstream from the transcriptional initiation site of MMP9 and is widely believed to be functionally important.20 Recent studies suggest (CA)n repeat length has an important role in transcriptional activity13 and is associated with diabetic nephropathy and intracranial aneurysm.14,15 These studies suggested that shorter (CA)n repeats have less MMP9 promoter activity compared with (CA)21 repeats. For example, Maeda et al14 observed an association with diabetic nephropathy in the Japanese population and suggested the (CA)21 repeat allele, found in 42% to 46% of the Japanese population, may be a protective allele for diabetic nephropathy in that population. Other studies using Caucasian populations have observed that the (CA)14 repeat was one of the most common alleles.15,18 Other common alleles are (CA)21 and (CA)22 repeats in Caucasians and (CA)20, (CA)21, (CA)22, and (CA)23 repeats in African Americans (Table 6).
Distribution of the (CA)n repeat allele frequencies in our study differs substantially from that in the Japanese population, and the distributions of (CA)n repeats in Caucasians and African Americans also differ significantly. In contrast to the Japanese population, the (CA)21 repeat is much less frequent in our populations (Tables 6 and 7), and we did not observe a protective effect. Evaluation of association between single polymorphisms and nephropathy in our study populations showed no evidence for association between MMP9 and nephropathy. We subsequently evaluated the possibility that multiple SNPs in combination might contribute to renal disease susceptibility. Three-locus haplotype analysis found that 2 haplotype combinations in the African-American population ([C-1562T]-[T-633G]-[CA]n and [T-633G]-[CA]n-[Gly15Gly]) show borderline significant differences between African-American diabetic nephropathy cases and African-American controls. Both haplotypes include the (CA)n repeat locus. In addition, we observed some evidence for an association of shorter (CA)n repeat alleles with ESRD when dichotomizing the (CA)n repeat distribution. These observations suggest that (CA)n repeat variants might be associated with ESRD susceptibility. To address this possibility in greater detail, we evaluated the (CA)n polymorphism in a significantly expanded population of cases and controls. In our study, the (CA)14 repeat was the most common allele in Caucasians and African Americans (56% and 25%, respectively; Table 6). We found statistically significant (P = 0.00012) evidence suggesting shorter alleles [n ≤ 23 (CA)n repeats] increase the risk for diabetic nephropathy in Caucasian populations. A similar, but statistically weak, trend was apparent when African-American data were dichotomized in a similar way. The association of shorter (CA)n repeats with ESRD is consistent with a model in which lower MMP9 expression could lead to mesangial matrix expansion and, ultimately, ESRD. In our study, we used cases drawn from patients undergoing dialysis. In our view, this is the ideal patient population for this type of study because these individuals have the greatest morbidity and mortality.
In conclusion, genetic variation in MMP9 was analyzed for association with ESRD in both Caucasian and African-American populations. We found no significant differences between patients and controls in either population for SNPs, but we found statistically significant evidence suggesting that shorter (CA)n repeats may be associated with diabetic ESRD in the
Caucasian population, consistent with earlier reports in a Japanese population. (CA)n repeat variants in the African-American population showed a possible similar trend. It seems appropriate to extend this study to Caucasian patients with nondiabetic ESRD. We created a panel of SNP and (CA)n repeat polymorphisms that will be useful for future studies of MMP9, especially in the African-American population.
ACKNOWLEDGMENT
The authors thank the physicians, patients, and staff of ESRD Network 6 (The Southeastern Kidney Council, Inc) treatment facilities for their assistance in collecting clinical information and blood samples.
REFERENCES
1. Nagase H, Woessner JF Jr: Matrix metalloproteinases. J Biol Chem 274:21491-21494, 1999
2. Zhang B, Ye S, Herrmann SM, et al: Functional polymorphism in the regulatory region of gelatinase B gene in relation to severity of coronary atherosclerosis. Circulation 99:1788-1794, 1999
3. Bernhard EJ, Gruber SB, Muschel RJ: Direct evidence linking expression of matrix metalloproteinase 9 (92-kDa gelatinase/collagenase) to the metastatic phenotype in transformed rat embryo cells. Proc Natl Acad Sci U S A 91:4293-4297, 1994
4. Hua J, Muschel RJ: Inhibition of matrix metalloproteinase 9 expression by a ribozyme blocks metastasis in a rat sarcoma model system. Cancer Res 56:5279-5284, 1996
5. Ye S: Polymorphism in matrix metalloproteinase gene promoters: Implication in regulation of gene expression and susceptibility of various diseases. Matrix Biol 19:623-629, 2000
6. McMillan JI, Riordan JW, Couser WG, Pollock AS, Lovett DH: Characterization of a glomerular epithelial cell metalloproteinase as matrix metalloproteinase-9 with enhanced expression in a model of membranous nephropathy. J Clin Invest 97:1094-1101, 1996
7. Ebihara I, Nakamura T, Shimada N, Koide H: Increased plasma metalloproteinase-9 concentrations precede development of microalbuminuria in non–insulin-dependent diabetes mellitus. Am J Kidney Dis 32:544-550, 1998
8. Bowden DW, Sale M, Howard TD, et al: Linkage of genetic markers on human chromosomes 20 and 12 to NIDDM in Caucasian sib pairs with a history of diabetic nephropathy. Diabetes 46:882-886, 1997
9. Price JA, Fossey SC, Sale MM, et al: Analysis of the HNF4 alpha gene in Caucasian type II diabetic nephropathic patients. Diabetologia 43:364-372, 2000
10. Zhang B, Henney A, Eriksson P, Hamsten A, Watkins H, Ye S: Genetic variation at the matrix metalloproteinase-9 locus on chromosome 20q12.2-13.1. Hum Genet 105:418-423, 1999
11. Yu H, Sale M, Rich SS, et al: Evaluation of markers on human chromosome 10, including the homologue of the rodent Rf-1 gene, for linkage to ESRD in black patients. Am J Kidney Dis 33:294-300, 1999
12. Lewontin RC: The interaction of selection and linkage. I. General considerations; Heterotic models. Genetics 49:49-67, 1964
13. Shimajiri S, Arima N, Tanimoto A, et al: Shortened microsatellite d(CA)21 sequence down-regulates promoter activity of matrix metalloproteinase 9 gene. FEBS Lett 455:70-74, 1999
14. Maeda S, Haneda M, Guo B, et al: Dinucleotide repeat polymorphism of matrix metalloproteinase-9 gene is associated with diabetic nephropathy. Kidney Int 60:1428-1434, 2001
15. Peters DG, Kassam A, St Jean PL, Yonas H, Ferrell RE: Functional polymorphism in the matrix metalloproteinase-9 promoter as a potential risk factor for intracranial aneurysm. Stroke 30:2612-2616, 1999
16. Pollanen PJ, Karhunen PJ, Mikkelsson J, et al: Coronary artery complicated lesion area is related to functional polymorphism of matrix metalloproteinase 9 gene: An autopsy study. Arterioscler Thromb Vasc Biol 21:1446-1450, 2001
17. Wang J, Warzecha D, Wilcken D, Wang XL: Polymorphism in the gelatinase B gene and the severity of coronary arterial stenosis. Clin Sci (Lond) 101:87-92, 2001
18. Nelissen I, Vandenbroeck K, Fiten P, et al: Polymorphism analysis suggests that the gelatinase B gene is not a susceptibility factor for multiple sclerosis. J Neuroimmunol 105:58-63, 2000
19. Minematsu N, Nakamura H, Tateno H, Nakajima T, Yamaguchi K: Genetic polymorphism in matrix metalloproteinase-9 and pulmonary emphysema. Biochem Biophys Res Commun 289:116-119, 2001
20. St Jean PL, Zhang XC, Hart BK, et al: Characterization of a dinucleotide repeat in the 92 kDa type IV collagenase gene (CLG4B), localization of CLG4B to chromosome 20 and the role of CLG4B in aortic aneurysmal disease. Ann Hum Genet 59:17-24, 1995