Experimental Approaches to the Human Renal Transcriptome
Jeffrey B. Hodgin, MD,* and Clemens D. Cohen, MD†,‡
Summary: The sum of RNA transcripts of a cell, organ structure, or organism can be referred to as the transcriptome. An increasing number of studies report on specific and common alterations in the renal transcriptome in human nephropathies. In this review, several challenges in transcriptomic analyses of the human kidney are discussed. These include ways to approach the heterogeneity of the kidney itself as well as the diversity of renal diseases. Conventional and upcoming techniques for transcriptional profiling of minute tissue samples are presented, including so-called next-generation sequencing and microRNA detection. Different tools to integrate transcriptomic data in a systematic context are discussed, along with the current challenge of combining such results with data sets from other integrative biology technologies.
Semin Nephrol 30:455-467 © 2010 Elsevier Inc. All rights reserved.
Keywords: Gene expression, functional analysis, microarray, renal biopsy, sequencing
Chromosomal DNA is partly transcribed to a variety of RNA molecules including coding messenger RNAs, noncoding regulatory RNAs, and additional RNAs such as ribosomal and transfer RNA. The sum of these RNA transcripts can be referred to as the transcriptome, in analogy to the genome as the entity of the genetic information. This implies that the transcriptome is far more dynamic than the genome, and varies extensively not only between organisms, but between cells, tissues, and physiological conditions in a single organism. It is obvious that studying the transcriptome in physiological and disease-associated conditions will help to elucidate transcripts and regulatory processes that are crucial for the specific biological state. In recent years it has
*Department of Pathology, University of Michigan, Ann Arbor, MI. †Division of Nephrology, University Hospital Zurich, Zurich, Switzerland. ‡Institute of Physiology with Zurich Center of Integrative Human Physiology, University of Zurich, Zurich, Switzerland. Supported in part by National Institutes of Health grant P30 DK081943, George M. O’Brien Kidney Research Core Center at the University of Michigan, and U54 DA021519 “National Center for Integrative Biomedical Informatics” (to J.B.H.) and the Else Kröner-Fresenius Foundation (A62/04) and the Swiss National Science Foundation (32-122439/1) (to C.D.C.). Address reprint requests to Clemens D. Cohen, MD, Institute of Physiology and Division of Nephrology, University and University Hospital of Zurich, Winterthurerstr. 190 (23-J-74), 8057 Zurich, Switzerland. E-mail: clemens.
[email protected] 0270-9295/ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.semnephrol.2010.07.003
Seminars in Nephrology, Vol 30, No 5, September 2010, pp 455-467
become scientifically evident and technically feasible to go beyond the measurement of messenger RNAs (mRNAs) and investigate noncoding RNAs and small RNAs, which are used for structural and regulatory purposes. Today transcriptomics aims not only to identify all transcripts of a cell but also to determine the structure of gene transcripts such as splicing patterns, to identify transcriptional and translational regulators, and to integrate the expression levels of each transcript into a biological context. In this article we focus on transcriptomic profiling of human renal tissue. Although these studies may be mainly descriptive in nature, the hope is that analyzing the transcriptome of normal and diseased kidneys will decipher mechanisms leading to renal disease in human beings and may identify therapeutic targets and markers of progression.1
HUMAN RENAL TISSUE FOR MOLECULAR ANALYSIS
Each healthy human kidney contains approximately 1 million nephrons.2 The nephron per se is a highly segmented and specialized structure with area- and function-specific transcriptomic variation.3 This variation has to be kept in mind when analyzing the transcriptome of renal samples. Renal biopsies provide key information for the diagnosis and therapeutic management of the individual patient with renal disease. The state of the art of renal pathology uses light microscopy, immunohistology for selected molecules, and electron microscopy. However, the present morphology-based analysis yields descriptive diagnostic categories that do not reflect the full heterogeneity of renal disease pathogenesis and gives limited prognostic information.4,5 Thus, additional functional insight into the pathogenesis is needed. Compared with other wide-ranging strategies, transcriptomics has the advantages of broad data capture and a high level of standardization.6 Nevertheless, the integration of transcriptomic data with clinical and histopathologic information for a true systems understanding of disease processes is a daunting challenge.7 Obvious limitations of transcriptomic analysis alone include multiple layers of regulation interposed between mRNA level alterations and downstream changes in protein function in the disease state. Gene expression profiling of renal disease to define the molecular pathogenesis and refine diagnostic categories requires a comprehensive collection of tissue specimens, potentially microdissection of functional nephron segments, and a databank with corresponding clinical parameters, follow-up data, and histopathologic reports. Renal tissue from autopsies or unaffected portions of tumor nephrectomies is of limited suitability because of the concern for significant gene expression alterations before tissue procurement. Standardized protocols to process renal biopsy material for comprehensive expression analysis and collection of detailed clinicopathologic information have been established (Fig. 1),8,9 although challenges arise from the minute amounts of tissue obtained by fine-needle renal biopsies and set apart for gene expression analysis.1 In addition, sample bias in renal biopsies is inevitable.
For example, in glomerular diseases of a focal nature, such as focal segmental glomerulosclerosis, the minimal sample needed to exclude focal disease present in fewer than 10% of the glomeruli with greater than 90% confidence is at least 20 glomeruli.10 These sample biases also will affect the results of gene expression analysis, but this
J.B. Hodgin and C.D. Cohen
can be overcome to a degree by analyzing a large population and obtaining a representative gene expression profile. In addition, knowledge of the histopathologic features from the routinely processed portions of the biopsy can inform the interpretation of expression analysis. As mentioned earlier, microdissection addresses the heterogeneity of the kidney, although other ways of tissue compartment analysis have been tested.11 Manual microdissection under a stereomicroscope is an effective approach that has allowed the early nephron segment-specific application of different novel techniques (reverse-transcription-polymerase chain reaction [RT-PCR],12 serial analysis of gene expression [SAGE]3, microarray13). A second approach to isolate specific nephron segments, such as glomeruli, is laser-capture microdissection.14,15 There are two basic methods. One system uses a pulsing infrared laser to heat a thermoplastic film in a specialized microfuge cap above the tissue section to form an adherent bridge. Lifting the thermoplastic cap separates targeted tissue from surrounding undisturbed tissue (Arcturus; now Applied Biosystems, Foster City, CA).16 Another system uses a laser source to cut around the desired areas within a tissue section and then pulse catapult the microdissected material into a collection tube (Zeiss PALM; Carl Zeiss Microimaging, Göttingen, Germany).17 Although the RNA yield obtained is much lower,15 a significant advantage to this approach over manual microdissection is that defined pathologic lesions can be isolated for gene expression analysis. Renal biopsy material already processed for routine diagnostic purposes can be used as a source of RNA for gene expression profiling. 
If the frozen portion of the renal biopsy processed for diagnostic immunofluorescence analysis is available, RNA can be isolated using standard protocols.16,18 However, the frozen biopsy portion often is stored in a solution (Michel’s/Zeus transport medium) before freezing that is optimized for immune-complex stability, but not for preserving RNA. Isolation of RNA from formalin-fixed, paraffin-embedded (FFPE) tissue for gene expression analysis is a highly attractive option because large archives already exist. In the past, RNA extraction from FFPE material has had limited success for global gene expression profiling in part owing to chemical alteration and fragmentation of nucleic acids,19 although quantitative RT-PCR has been used successfully for some time.15,20 New technologies, however, have been developed to allow successful global gene expression profiling of FFPE material, even after many years of storage.21,22
Human renal transcriptome
Figure 1. Study design for comprehensive gene expression analysis in renal disease (eg, the European Renal Complementary DNA Bank). A specimen of a routine kidney biopsy core is transferred into RNA preservative and shipped to the core facility. The tissue samples are microdissected into glomeruli and tubulointerstitial compartments. Gene expression analysis can be performed by real-time RT-PCR, high-density oligonucleotide complementary DNA array, and additional techniques presented in the text. Clinical characteristics and histopathology are collected in parallel and integrated with the gene expression data. Adapted from Cohen and Kretzler97 with kind permission of Springer Science+Business Media.
MOLECULAR APPROACHES TO TRANSCRIPTOMIC PROFILING OF HUMAN KIDNEY DISEASES
Several molecular techniques to investigate gene expression have been developed including Northern hybridization, ribonuclease protection assay, in situ hybridization, and competitive RT-PCR. The first two are not suitable for large-scale expression profiling of renal biopsy material owing to a requirement for considerable amounts of RNA. In situ hybridization is effective for localizing specific mRNA expression in cells and tissue, but is time consuming and has a limited ability to quantitate expression levels or assay multiple transcripts.5 Compared with nonamplifying methods of mRNA expression analysis, PCR is a powerful tool for detecting mRNA expression of multiple genes in a small amount of sample RNA, such as renal biopsies. Competitive RT-PCR quantitates a
message by comparing product signal intensity with a concentration curve generated by a synthetic competitor RNA sequence. Early studies showed the utility of competitive RT-PCR to detect differential regulation of type IV collagen α-chains and TGF-β1 expression in isolated glomeruli from biopsy material from patients with diabetic nephropathy (DN), membranous glomerulopathy (MGN), and various kinds of glomerulonephritis.23,24 However, because real-time RT-PCR (also known as quantitative RT-PCR) allows high-throughput application, and gives an accurate, reproducible, and rapid determination of mRNA expression levels, it has become the gold standard among all PCR-based gene expression analysis techniques. Real-time PCR enables the exact quantification of a target mRNA by determining the number of amplicons after each PCR cycle in a dynamic manner.25 Minute amounts of target mRNA can be quantified, such as those from single laser-microdissected glomerular cross-sections or even single podocytes.26 Quantitative RT-PCR also can be used to assay gene expression from archival FFPE renal biopsies.15,27 Internal reference RNAs, or “housekeepers,” with assumed stable levels are used to normalize the expression levels of the genes of interest to the amount of tissue analyzed. They must be chosen carefully, however, because regulation of these genes will confound the expression ratio of the gene of interest.20,28 Parallel gene expression analysis targeting multiple transcripts is needed to investigate the complex biological mechanisms in health and disease. Real-time RT-PCR is limited by the number of mRNAs that can be assayed simultaneously, although broader applications have been developed.29 Microarray technology has become a powerful tool to analyze the gene expression of tens of thousands of genes simultaneously. DNA microarrays are currently a well-established method for comprehensive gene expression profiling in medical renal disease, as indicated by the rapid growth of publications using this technology (Fig. 2).
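The housekeeper normalization described above for real-time RT-PCR can be illustrated with a small calculation. The sketch below uses the 2^-ΔΔCt approach, a common convention that is not named in the text and is introduced here only for illustration. It assumes close to 100% amplification efficiency for both the target and the reference transcript. The threshold cycle (Ct) values and the function name are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target_disease, ct_ref_disease,
                        ct_target_control, ct_ref_control):
    """Fold change of a target mRNA (disease vs control) by 2^-ddCt.

    Each argument is a list of replicate threshold cycles (Ct).
    Assumes ~100% PCR efficiency for target and reference (an assumption).
    """
    # Normalize the target to the housekeeper within each condition
    d_ct_disease = mean(ct_target_disease) - mean(ct_ref_disease)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the normalized values between conditions
    dd_ct = d_ct_disease - d_ct_control
    return 2.0 ** (-dd_ct)

# Hypothetical replicate Ct values (illustration only)
target_disease = [24.0, 24.5, 23.5]
target_control = [26.0, 26.5, 25.5]
ref_stable = [18.0, 18.5, 17.5]      # housekeeper unchanged by disease
ref_regulated = [17.0, 17.5, 16.5]   # housekeeper itself induced in disease

fold_stable = relative_expression(target_disease, ref_stable,
                                  target_control, ref_stable)
fold_shifted = relative_expression(target_disease, ref_regulated,
                                   target_control, ref_stable)
print("stable housekeeper:", fold_stable)
print("regulated housekeeper:", fold_shifted)
```

With identical target measurements, the apparent fold change differs between the two calls solely because the reference transcript is regulated in the second case, which is the confounding effect that makes the choice of housekeeper critical.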
Complementary DNA microarrays spot complementary DNA clones of 200 to 500 base pair length onto a glass slide. Hybridization of mRNAs can be detected by a two-color fluorescence detection system using target mRNAs labeled with one dye and control mRNAs with another. This results in a
Figure 2. Rapid growth of publications reporting the use of gene expression arrays in medical renal disease per year since 1996. Query for PubMed search was as follows: “gene expression array” OR “gene expression profiling” OR (microarrays AND gene) OR (microarray AND gene) NOT review NOT cancer AND (kidney OR renal).
fluorescence ratio of differential expression between sample and control. Oligonucleotide microarrays (such as Affymetrix GeneChip arrays, Santa Clara, CA) use an in situ DNA synthesis technique to create thousands of complementary single-stranded DNAs of 25 to 70 base pairs in length. Oligonucleotide arrays use a one-color biotin-labeling protocol, which permits absolute quantification of gene expression. Bead arrays (such as BeadArray; Illumina, San Diego, CA) are similar, but use small tagged silica beads as the solid support for attachment of oligonucleotide sequences. The minute quantity of RNA obtained from microdissected samples necessitates amplification before DNA array-based expression analysis.30 Because small discrepancies in amplification efficiency can result in significant distortions in the amplified product, a major concern is the linearity of the amplifications across mRNA species. Thus, rigorous quality controls are mandatory.1 Nevertheless, a high degree of reproducibility between identical samples before and after amplification has been shown, with satisfactory correlation.31 Microarrays used for profiling transcript levels typically contain one to several DNA oligonucleotide probes complementary to a specific region of the target gene.5 A limitation of this technique is the fixed sequence-specificity and the dependence on a priori knowledge of transcript sequences. The latter can be partly addressed by re-annotation of the probe sequences.32 High-density oligonucleotide-based whole-genome microarrays (tiling arrays), however, use shorter fragments designed to cover the entire genome, reaching much higher degrees of resolution. High-density microarrays can be used as a generic platform for numerous experimental approaches to decode the information in the genome, such as empiric annotation of the transcriptome, novel gene discovery, analysis of alternative splicing, and mapping regulatory DNA motifs using chromatin-immunoprecipitation.33 The disadvantages of the tiling array include high cost, large variation of probe characteristics, increased cross-hybridization, and difficult analysis. Exon arrays use a more focused analysis than tiling arrays by using multiple probes per target gene exon or exon junctions instead of extending across the whole genome. Compared with standard oligonucleotide arrays, exon arrays allow the interrogation of differentially expressed, alternatively spliced isoforms, including novel isoforms. However, the exon array primarily is restricted to type I deletions (cassette exons) and the exon boundaries must be known initially.34 Novel arrays also have been developed to detect noncoding RNAs such as miRNA,35,36 which also can be quantified by PCR-based platforms.37 The comprehensive profiling of gene expression in the kidney has shown promise as a novel molecular diagnostic tool and has provided novel insights into both physiological and pathogenetic mechanisms in various renal diseases. As a proof-of-principle that renal lesions can be categorized by a molecular approach, Henger et al38 used mRNA expression profiles from hydronephrotic and control kidneys to identify a subset of transcripts that enabled molecular stratification into controls and fibrotic and inflammatory renal lesions.
Examples of array-based analyses of the transcriptome in specific renal diseases include diabetic nephropathy,14,39,40 lupus nephritis,18 arterionephrosclerosis,41 focal segmental glomerulosclerosis,42,43 and renal transplantation.44,45 Another approach to study the transcriptome is SAGE, which compares sequences of concatenated 9- to 13-base pair sequence tags, corresponding to unique mRNAs in a sample of interest, with a genomic database.46 SAGE produces a comprehensive and quantitative profile of genes expressed at the time of analysis. The sequences do not need to be known a priori, allowing discovery of genes or gene variants. However, SAGE is less effective in evaluating low-expressing genes and more expensive than array-based analysis,47 thus limiting its applicability in large-scale gene expression analysis of diseased kidney versus normal. Nevertheless, in human kidney, SAGE analysis has been informative toward establishing the unique expression patterns of nephron segments.3,48-50 New transformative technologies in DNA sequencing have captured the imagination of biological scientists in recent years and already are having a major impact on genome-wide experiments.51,52 Compared with conventional, capillary-based dideoxynucleotide sequencing of DNA (Sanger sequencing), next-generation sequencing (also called deep sequencing or second-generation sequencing) allows vastly increased throughput and yield of data at a lower cost, enabling the design of ultradeep sequencing projects previously prohibitive because of their large size.52 Next-generation sequencing can do this by processing millions of sequence reads in parallel, requiring only one or two instrument runs to complete an experiment. Beyond applications to genome-wide DNA sequencing, such as mutation mapping or polymorphism discovery, next-generation sequencing technology can be applied to the investigation of the transcriptome (RNA-seq), DNA-protein interactions (chromatin immunoprecipitation [ChIP]-seq), epigenomic variation, the metagenome, noncoding RNA discovery, and others.51,52 Four platforms are currently commercially available for massively parallel DNA sequencing (Table 1): the Roche 454 FLX (Roche, Branford, CT), the Illumina Genome Analyzer, the Applied Biosystems Supported Oligonucleotide Ligation and Detection System (SOLiD), and the Helicos Heliscope (Heliscope, Cambridge, MA).
Each uses an elaborate collaboration of enzymology, chemistry, high-resolution optics, hardware, and software engineering.53 Sample preparation is highly streamlined; DNA fragments are obtained from a variety of sources, such as reverse transcription of mRNA or chromatin immunoprecipitation, and are ligated to specific adaptor oligos.
Table 1. Comparison of Next-Generation Sequencing Technologies
Platform | Amplification Method | Sequencing Chemistry | Read Length, bp | Run Time, d | Sequencing Throughput/Run, gigabase | Website
Roche 454 FLX | Emulsion PCR | Pyrosequencing | 400-500 | <1 | <1 | www.454.com
Illumina Genome Analyzer | Bridge PCR | Reversible terminator sequence-by-synthesis | 35-75 | 2-4 | ≤6 | www.illumina.com
ABI SOLiD | Emulsion PCR | Sequence-by-ligation | 25-75 | 6-8 | ≤20 | www.appliedbiosystems.com
HeliScope | None | Virtual terminator sequence-by-synthesis | 25-50 | <2 | ≤28 | www.helicosbio.com
NOTE. Currently available platforms for massively parallel sequencing are shown. All platforms differ with respect to amplification, chemistry, product specificity, and data output. See text for further information.
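As a rough way to compare the platforms in Table 1, the sketch below encodes the table's throughput column and estimates how many instrument runs a project of a given size would need at minimum. This is a back-of-envelope illustration, not a vendor specification: the "<" and "≤" entries are treated as upper bounds, the 10-gigabase project size is a hypothetical target, and all names in the code are illustrative.

```python
import math

# Upper-bound throughput per run (gigabases), taken from Table 1.
# "<1" and "≤" entries are stored as their bounding value (an assumption).
GB_PER_RUN_MAX = {
    "Roche 454 FLX": 1,
    "Illumina Genome Analyzer": 6,
    "ABI SOLiD": 20,
    "HeliScope": 28,
}

def min_runs_needed(platform, target_gb):
    """Smallest number of runs that could yield target_gb of sequence,
    assuming every run reaches the table's upper-bound throughput."""
    return math.ceil(target_gb / GB_PER_RUN_MAX[platform])

target_gb = 10  # hypothetical project size in gigabases
for name in GB_PER_RUN_MAX:
    print(name, min_runs_needed(name, target_gb))
```

Because the table gives upper bounds, the result is a lower limit on the number of runs; real yields per run would push the count higher.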
The presence of adapter sequences allows attachment to a solid surface and subsequent template amplification by PCR before sequencing. Roche 454 and Applied Biosystems SOLiD use emulsion droplets to isolate single DNA templates attached to beads for amplification in a microenvironment. Illumina Genome Analyzer uses bridge amplification in which DNA molecules are attached to a solid surface and amplified in situ to form clonal clusters.51,52 In contrast, the Helicos system eliminates the amplification step, directly sequencing single DNA molecules bound to a surface.54 Such single-molecule sequencing approaches are referred to as third-generation sequencing. The sequencing chemistry markedly differs between the available commercial systems, but each exploits light that is emitted when the correct base or oligonucleotide matches the template. Roche 454 uses a pyrosequencing approach in which chemiluminescence-based detection of each released pyrophosphate occurs upon incorporation of a nucleotide by DNA polymerase. The Illumina Genome Analyzer also uses sequence-by-synthesis, but uses four uniquely labeled fluorescent nucleotide analogs that are reversible sequence terminators, and special DNA polymerases for incorporation.55,56 In contrast to the polymerase-based approach, the SOLiD system uses sequence-by-ligation with successive rounds of dinucleotide hybridization and ligation events.57 The true single-molecule sequencing by Helicos Heliscope captures single-molecule DNA templates on a flow cell surface, and labeled nucleotides are added one at a time. The signals for incorporated nucleotides are then detected, whereas unincorporated labels are washed out.58 The result is hundreds of thousands to tens of millions of short reads (25-500 bases). A more in-depth discussion of the technical and methodologic aspects of these next-generation sequencers can be found elsewhere.51-53,59 After image and signal processing, reliable RNA-seq data rely on efficient de novo assembly or, more commonly, proper mapping of sequence reads to corresponding reference genomes (Fig. 3). This is dependent on a number of factors including the complexity of the reference genome, error rates, and the length of the short reads.52 Paired end sequencing, in which reads from both ends of complementary
Figure 3. Analysis of RNA-Seq data. After the RNA is fragmented, reverse-transcribed, and sequenced, the sequence reads are mapped onto a reference genome. The number of reads mapped to particular exons can be used to calculate the expression level of corresponding transcripts. *Indicates the existence of an alternatively spliced transcript lacking exon 2.
DNA fragments are produced, helps to reduce alignment ambiguities when mapping short reads to the genome. Next-generation sequencing technologies are capable of generating vast quantities of data that require robust alignment, assembly, and analysis tools to overcome the gap between data collection and meaningful biological interpretation.60 Several computational tools exist and more are being developed to keep up with the advancing methods of high-throughput sequencing (reviewed in Turner et al52 and Pepke et al61). Whole-transcriptome studies using next-generation technologies for several organisms have been accomplished, including human, mouse, yeast, plants, and others.59 RNA-seq data are highly quantitative, sensitive, reliable, and have several advantages over microarrays.62-64 Massively parallel sequencing does not infer transcript abundance from hybridization intensity, which can introduce data noise and interfere with reproducibility and cross-sample comparisons, but rather measures transcript abundance explicitly. The dynamic range is much broader, having been reported to be at least 5 orders of magnitude.65 The range is theoretically limited by the sequencing depth, defined as the total number of sequencing reads generated from a sample library. The more reads sequenced, the higher the chance of detecting rare transcripts. However, more depth increases run time and costs. Unlike standard microarray technology or SAGE, RNA-seq is not dependent on the organism’s known genome sequence, thus allowing detection of differential levels of novel transcripts.66 Furthermore, posttranscriptional regulation, a fundamental part of gene expression, can be detected. This includes not only alternative splicing, but also RNA editing and polyadenylation.
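Two points made above, that mapped read counts reflect transcript abundance and that sequencing depth limits the detection of rare transcripts, can be sketched numerically. The example below is an illustration under simple assumptions rather than a method taken from the text. It normalizes counts as reads per kilobase of transcript per million mapped reads (RPKM, a common convention introduced here), and it treats read sampling as a Poisson process to estimate the chance of observing at least one read from a rare transcript. Gene names and all numbers are hypothetical.

```python
import math

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def detection_probability(transcript_fraction, total_reads):
    """Chance of at least one read from a transcript that contributes
    `transcript_fraction` of all reads, under a Poisson sampling model
    (a simplifying assumption)."""
    expected_reads = transcript_fraction * total_reads
    return 1.0 - math.exp(-expected_reads)

# Hypothetical mapped read counts and transcript lengths
counts = {"GENE_A": 2000, "GENE_B": 50}
lengths_bp = {"GENE_A": 4000, "GENE_B": 1000}
total_mapped = 10_000_000

for gene in counts:
    print(gene, rpkm(counts[gene], lengths_bp[gene], total_mapped))

# A transcript contributing 1 in 10 million reads, at two depths
for depth in (1_000_000, 100_000_000):
    print(depth, detection_probability(1e-7, depth))
```

Under this model the probability of detecting the rare transcript rises steeply with depth, which is the trade-off against run time and cost noted in the text.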
In higher eukaryotes, alternative splicing is a key factor underlying increased cellular and functional complexity.67 However, the extent of human alternative splicing has not been investigated thoroughly because of the limited sensitivity and depth of coverage using conventional sequencing and microarray profiling methods. Recently, several laboratories have harnessed next-generation sequencing’s ability to globally sample every possible splice isoform
in a given tissue and have revealed much more complexity than previously estimated.65,66,68,69 These newer studies have revealed that more than 90% of multiexon genes undergo alternative splicing,68,69 most with minor isoform frequency of 15% or more, indicating substantial and biologically relevant levels of isoform expression.69 In the years to come, massively parallel sequencing technologies undoubtedly will drive additional exciting discoveries concerning the human transcriptome. At the moment, RNA-Seq studies on human kidney specimens are eagerly awaited.
ANALYSIS OF RENAL TRANSCRIPTOMIC DATA
Given the highly specific physiological function of nephron segments, the steady-state gene expression is of key interest. For human glomeruli, the site of initial damage in most human renal disease, more than five gene expression libraries are available.3,50,70-72 These libraries are being used to identify candidate genes involved in glomerulopathies. The dynamics of the transcriptome can be better integrated in biological processes when changes in transcript expression are observed under physiological or pathophysiological settings. Because the amount of raw data of a gene expression study easily may exceed the capability of the scientist for data integration, the data have to be transferred into other formats. Specific analytical tools have been developed to identify genes or transcripts differentially regulated in two or more conditions, after initial quality checks and definition of detection limits. One of the most widely used tools is significance analysis of microarrays.73 This analysis can be performed on standard microarray readouts and for single-probe analyses.32,74 The differences and advantages of both approaches are described in more detail by Nelson and Werner (p. 477) in this issue of Seminars in Nephrology. The read-out of such statistical analysis can be visualized as a heat map and cluster analysis.
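As a minimal sketch of the kind of cluster analysis mentioned above, the example below groups samples bottom-up by the similarity of their expression profiles and records each merge with its distance, which is the information a dendrogram displays as branch length. The choice of 1 minus the Pearson correlation as the distance and of average linkage as the merge rule are common options assumed here, not details given in the text. Sample names and expression values are invented.

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cluster_distance(c1, c2, dist):
    """Average linkage: mean pairwise distance between two clusters."""
    return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster, cluster, distance)."""
    dist = {frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters; short distances give short branches
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(pair[0], pair[1], dist))
        merges.append((c1, c2, cluster_distance(c1, c2, dist)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Invented expression values for four genes in four samples
profiles = {
    "control_1": [1.0, 2.0, 3.0, 4.0],
    "control_2": [1.2, 1.9, 3.1, 4.1],
    "disease_1": [4.0, 3.0, 2.0, 1.0],
    "disease_2": [3.8, 3.1, 1.9, 1.2],
}
merges = average_linkage(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

In this toy data set the two controls and the two disease samples pair up first, and the two pairs are joined last at a much larger distance.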
The application of cluster analysis was introduced into gene expression analysis by Eisen et al75 in 1998. Such cluster analyses help to bring order to an overwhelming set of observations. Most often the results are presented as cluster dendrograms, in
which samples with the most similar gene expression profiles are sorted together with the shortest branches. This presentation of gene expression data also became popular in nephrology-related studies such as the report of the first gene expression analyses on human transplant and native kidney biopsies.18,38,44,76,77 Such analyses are helpful to detect overall differences between given conditions, and they may be used to detect overall expression changes between selected gene sets.41,78 However, additional tools are needed to combine experimental expression data with established knowledge. A number of molecular biology databases and analysis and visualization tools have been developed, either freely available or by subscription, for molecular interaction networks. Some of the most common include Gene Ontology (www.geneontology.org), DAVID,79 Ingenuity Pathway Analysis (www.ingenuity.com), Genomatix (www.genomatix.de), KEGG (www.genome.jp/kegg), NCI/Nature Pathway Interaction Database,80 and others. Several of these pathway and functional annotation tools are presented by Perco and Oberbauer (p. 520) and Nelson and Werner (p. 477) in this issue of Seminars in Nephrology. In addition, bioinformatic tools using Bayesian integration of experimental data for knowledge discovery, instead of curation of knowledge from the literature, have been developed and are presented by Greene and Troyanskaya (p. 443) in this issue of Seminars in Nephrology. As a recent example of transcriptomic network analysis, Bhavnani et al81 extracted gene profiles from renal biopsies and used a bipartite network to visually represent the explicit relationships between 7 renal diseases and 747 differentially expressed genes (Fig. 4). The analysis revealed molecular signatures shared by two or more renal diseases, and molecular signatures unique to each disease, thus showing the utility of network analysis to rapidly understand complex relationships between diseases and regulated genes.
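The bipartite representation described above can be sketched with plain dictionaries: diseases form one set of nodes, genes the other, and an edge links a disease to each gene differentially expressed in it. The gene lists below are invented placeholders, not data from Bhavnani et al, and the helper names are hypothetical. The sketch only shows how unique and shared signatures follow from the number of diseases each gene is connected to.

```python
from collections import defaultdict

# Hypothetical disease -> differentially expressed genes (illustration only)
disease_genes = {
    "DN": {"GENE_1", "GENE_2", "GENE_3"},
    "FSGS": {"GENE_2", "GENE_4"},
    "SLE": {"GENE_2", "GENE_3", "GENE_5"},
}

def gene_to_diseases(edges):
    """Invert the disease -> genes mapping into gene -> diseases."""
    index = defaultdict(set)
    for disease, genes in edges.items():
        for gene in genes:
            index[gene].add(disease)
    return index

def signatures(edges):
    """Split genes into unique (one disease) and shared (two or more)."""
    index = gene_to_diseases(edges)
    unique = {g: next(iter(d)) for g, d in index.items() if len(d) == 1}
    shared = {g: sorted(d) for g, d in index.items() if len(d) > 1}
    return unique, shared

unique, shared = signatures(disease_genes)
# Degree of each disease node: the number of gene edges attached to it
degree = {disease: len(genes) for disease, genes in disease_genes.items()}

print("unique:", sorted(unique.items()))
print("shared:", sorted(shared.items()))
print("degree:", degree)
```

The degree of each disease node corresponds to the node size in a drawing of such a network, and the unique versus shared split corresponds to the edge coloring.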
Microarray data have a high reproducibility for a single platform. But several technical and biological aspects initially led to limited cross-platform comparability.82 Some of the underlying confounders have been identified and sufficiently addressed (eg, probe sequence, filtering, and normalization).83 Nevertheless, the comparison of gene expression profiles across platforms, and between species, is an ongoing challenge84 (see also Ju and Brosius [p. 512] in this issue of Seminars in Nephrology). Apart from technical issues, the increasing knowledge of the complexity of the transcriptome makes the data interpretation an ongoing challenge. For example, microRNAs (miRNAs), a rapidly evolving area of interest, are single-stranded RNAs of approximately 22 nucleotides in length that have important regulatory properties. Several hundred different miRNAs have been identified since the first description85 and they need special detection platforms or sequencing approaches as described earlier.86,87 miRNAs undergo posttranscriptional modifications by the enzymes Drosha and Dicer before they enter the RNA-induced silencing complex. Here, the specific miRNA binds to its target mRNA, leading to suppression of translation.88 Recently, it has been shown that most mammalian mRNAs are conserved targets of miRNAs.89 As for other elements of the transcriptome, a wide dynamic range of expression has been described for miRNAs under different physiological and disease-related conditions of the kidney.36,90,91 The relevance of miRNA processing for glomerular biology has been elegantly shown in cell-specific deletion of Dicer, one of the miRNA-processing enzymes.92-94 In the human kidney, two studies on allografts investigated the ability of miRNA expression profiles to predict renal status.37,94 The study of Anglicheau et al37 followed the profiling data successfully to potential target RNAs and relevant cellular components responsible for the changes observed in the miRNA levels.
However, much remains to be elucidated in miRNA biology and in miRNA analyses in a systems biology context, including novel miRNA transcripts, proper prediction of target mRNAs, and a more complete understanding of their regulatory nature.95,96 Similar challenges have to be faced when integrating standard gene expression profiling with proteomic and metabolomic data as biological readouts. Here, integrative approaches, as presented by Wagner et al (p. 487) and Wang et al (p. 500) in this issue, are needed.
In sum, several technical platforms exist to comprehensively determine the transcriptome of the human kidney. Although real-time RT-PCR-based and array-based studies have become numerous, the first studies using next-generation sequencing on this tissue are still anticipated. Together these studies will allow the quantitative measurement of known and unknown transcripts as well as coding and noncoding RNAs for a truly broad and robust characterization of the transcriptome. Arguably the most challenging task will be to put these data in a biologically meaningful context. Here, knowledge-based as well as unbiased analysis tools will be as important as the integration with other data sets such as proteomic, metabolomic, and clinical data; this will lead us to real systems nephrology.
Figure 4. A bipartite network showing the relationship between renal diseases (black nodes) and differentially regulated genes (white nodes) in renal biopsies. The size of the disease nodes is proportional to the number of edges (lines) that connect them to genes. Red edges connect genes associated with only one renal disease (unique molecular signature). Blue and green edges show genes connected to two or more diseases (shared molecular signature). The renal diseases include systemic lupus erythematosus (SLE), focal segmental glomerulosclerosis (FSGS), membranous glomerulopathy (MGN), IgA nephropathy (IgAN), diabetic nephropathy (DN), thin-membrane disease (TMD), and minimal change disease (MCD). Also included are biopsies from cadaveric transplant donor kidneys (CD). For more detail see article by Bhavnani et al.81
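The bipartite disease-gene network described in the caption of Figure 4 can be illustrated with a minimal sketch. It assumes only what the caption states: diseases and genes form two node sets, the size of a disease node reflects its number of edges, and an edge counts as unique or shared depending on whether its gene is linked to one disease or to several. The function names and the gene identifiers (GENE_A and so on) are hypothetical placeholders chosen for illustration; they are not data or code from Bhavnani et al.

```python
from collections import defaultdict


def build_bipartite(disease_to_genes):
    """Return (edges, disease_degree, gene_degree) for a disease-gene bipartite graph."""
    # One edge per (disease, differentially regulated gene) pair.
    edges = sorted(
        (disease, gene)
        for disease, genes in disease_to_genes.items()
        for gene in genes
    )
    disease_degree = defaultdict(int)
    gene_degree = defaultdict(int)
    for disease, gene in edges:
        disease_degree[disease] += 1  # would drive the drawn size of a disease node
        gene_degree[gene] += 1        # number of diseases in which the gene is regulated
    return edges, dict(disease_degree), dict(gene_degree)


def classify_edges(edges, gene_degree):
    """Label an edge 'unique' if its gene touches one disease, otherwise 'shared'."""
    return {
        (disease, gene): "unique" if gene_degree[gene] == 1 else "shared"
        for disease, gene in edges
    }


# Hypothetical toy input: placeholder gene identifiers, not results from any study.
toy = {
    "DN": {"GENE_A", "GENE_B", "GENE_C"},
    "FSGS": {"GENE_B", "GENE_D"},
    "MCD": {"GENE_B", "GENE_C"},
}

edges, disease_degree, gene_degree = build_bipartite(toy)
labels = classify_edges(edges, gene_degree)
```

In this toy input, GENE_A and GENE_D would belong to a unique molecular signature, whereas GENE_B and GENE_C would belong to a shared one. A real analysis would start from lists of differentially regulated genes per disease and would typically hand the edge list to a graph-layout tool for drawing.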
REFERENCES
1. Kretzler M, Cohen CD, Doran P, Henger A, Madden S, Grone EF, et al. Repuncturing the renal biopsy: strategies for molecular diagnosis in nephrology. J Am Soc Nephrol. 2002;13:1961-72.
2. Keller G, Zimmer G, Mall G, Ritz E, Amann K. Nephron number in patients with primary hypertension. N Engl J Med. 2003;348:101-8.
3. Chabardes-Garonne D, Mejean A, Aude JC, Cheval L, Di Stefano A, Gaillard MC, et al. A panoramic view of gene expression in the human kidney. Proc Natl Acad Sci U S A. 2003;100:13710-5.
4. Neusser MA, Lindenmeyer MT, Kretzler M, Cohen CD. Genomic analysis in nephrology—towards systems biology and systematic medicine? Nephrol Ther. 2008;4:306-11.
5. Yasuda Y, Cohen CD, Henger A, Kretzler M. Gene expression profiling analysis in nephrology: towards molecular definition of renal disease. Clin Exp Nephrol. 2006;10:91-8.
6. Morrison N, Cochrane G, Faruque N, Tatusova T, Tateno Y, Hancock D, et al. Concept of sample in OMICS technology. OMICS. 2006;10:127-37.
7. Sieberts SK, Schadt EE. Moving toward a system genetics view of disease. Mamm Genome. 2007;18:389-401.
8. Cohen CD, Frach K, Schlondorff D, Kretzler M. Quantitative gene expression analysis in renal biopsies: a novel protocol for a high-throughput multicenter application. Kidney Int. 2002;61:133-40.
9. Roos-van Groningen MC, Eikmans M, Baelde HJ, de Heer E, Bruijn JA. Improvement of extraction and processing of RNA from renal biopsies. Kidney Int. 2004;65:97-105.
10. Corwin HL, Schwartz MM, Lewis EJ. The importance of sample size in the interpretation of the renal biopsy. Am J Nephrol. 1988;8:85-9.
11. Disset A, Cheval L, Soutourina O, Duong Van Huyen JP, Li G, Genin C, et al. Tissue compartment analysis for biomarker discovery by gene expression profiling. PLoS One. 2009;4:e7779.
12. Peten EP, Striker LJ, Carome MA, Elliott SJ, Yang CW, Striker GE. The contribution of increased collagen synthesis to human glomerulosclerosis: a quantitative analysis of alpha 2IV collagen mRNA expression by competitive polymerase chain reaction. J Exp Med. 1992;176:1571-6.
13. Cohen CD, Klingenhoff A, Boucherot A, Nitsche A, Henger A, Brunner B, et al. Comparative promoter analysis allows de novo identification of specialized cell junction-associated proteins. Proc Natl Acad Sci U S A. 2006;103:5682-7.
14. Baelde HJ, Eikmans M, Lappin DW, Doran PP, Hohenadel D, Brinkkoetter PT, et al. Reduction of VEGF-A and CTGF expression in diabetic nephropathy is associated with podocyte loss. Kidney Int. 2007;71:637-45.
15. Cohen CD, Grone HJ, Grone EF, Nelson PJ, Schlondorff D, Kretzler M. Laser microdissection and gene expression analysis on formaldehyde-fixed archival tissue. Kidney Int. 2002;61:125-32.
16. Kohda Y, Murakami H, Moe OW, Star RA. Analysis of segmental renal gene expression by laser capture microdissection. Kidney Int. 2000;57:321-31.
17. Grone HJ, Cohen CD, Grone E, Schmidt C, Kretzler M, Schlondorff D, et al. Spatial and temporally restricted expression of chemokines and chemokine receptors in the developing human kidney. J Am Soc Nephrol. 2002;13:957-67.
18. Peterson KS, Huang JF, Zhu J, D’Agati V, Liu X, Miller N, et al. Characterization of heterogeneity in the molecular pathogenesis of lupus nephritis from transcriptional profiles of laser-captured glomeruli. J Clin Invest. 2004;113:1722-33.
19. Masuda N, Ohnishi T, Kawamoto S, Monden M, Okubo K. Analysis of chemical modification of RNA from formalin-fixed samples and optimization of molecular biology applications for such samples. Nucleic Acids Res. 1999;27:4436-43.
20. Schmid H, Cohen CD, Henger A, Irrgang S, Schlondorff D, Kretzler M. Validation of endogenous controls for gene expression analysis in microdissected human renal biopsies. Kidney Int. 2003;64:356-60.
21. Frank M, Doring C, Metzler D, Eckerle S, Hansmann ML. Global gene expression profiling of formalin-fixed paraffin-embedded tumor samples: a comparison to snap-frozen material using oligonucleotide microarrays. Virchows Arch. 2007;450:699-711.
22. Hoshida Y, Villanueva A, Kobayashi M, Peix J, Chiang DY, Camargo A, et al. Gene expression in fixed tissues and outcome in hepatocellular carcinoma. N Engl J Med. 2008;359:1995-2004.
23. Esposito C, Striker LJ, Patel A, Peten E, Liu ZH, Sakai H, et al. Molecular analysis of glomerular diseases in renal biopsies: initial results of a collaborative international study. The International Study Group for Molecular Study of Kidney Biopsies. Proc Assoc Am Physicians. 1996;108:209-17.
24. Iwano M, Akai Y, Fujii Y, Dohi Y, Matsumura N, Dohi K. Intraglomerular expression of transforming growth factor-beta 1 (TGF-beta 1) mRNA in patients with glomerulonephritis: quantitative analysis by competitive polymerase chain reaction. Clin Exp Immunol. 1994;97:309-14.
25. Heid CA, Stevens J, Livak KJ, Williams PM. Real time quantitative PCR. Genome Res. 1996;6:986-94.
26. Kretzler M, Teixeira VP, Unschuld PG, Cohen CD, Wanke R, Edenhofer I, et al. Integrin-linked kinase as a candidate downstream effector in proteinuria. FASEB J. 2001;15:1843-5.
27. Serinsoz E, Bock O, Kirsch T, Haller H, Lehmann U, Kreipe H, et al. Compartment-specific quantitative gene expression analysis after laser microdissection from archival renal allograft biopsies. Clin Nephrol. 2005;63:193-201.
28. Koop K, Bakker RC, Eikmans M, Baelde HJ, de Heer E, Paul LC, et al. Differentiation between chronic rejection and chronic cyclosporine toxicity by analysis of renal cortical mRNA. Kidney Int. 2004;66:2038-46.
29. Goulter AB, Harmer DW, Clark KL. Evaluation of low density array technology for quantitative parallel measurement of multiple genes in human tissue. BMC Genomics. 2006;7:34.
30. Ginsberg SD, Che S. Expression profile analysis within the human hippocampus: comparison of CA1 and CA3 pyramidal neurons. J Comp Neurol. 2005;487:107-18.
31. Ernst T, Hergenhahn M, Kenzelmann M, Cohen CD, Bonrouhi M, Weninger A, et al. Decrease and gain of gene expression are equally discriminatory markers for prostate carcinoma: a gene expression analysis on total and microdissected prostate tissue. Am J Pathol. 2002;160:2169-80.
32. Moll AG, Lindenmeyer MT, Kretzler M, Nelson PJ, Zimmer R, Cohen CD. Transcript-specific expression profiles derived from sequence-based analysis of standard microarrays. PLoS One. 2009;4:e4702.
33. Mockler TC, Chan S, Sundaresan A, Chen H, Jacobsen SE, Ecker JR. Applications of DNA tiling arrays for whole-genome analysis. Genomics. 2005;85:1-15.
34. Cuperlovic-Culf M, Belacel N, Culf AS, Ouellette RJ. Microarray analysis of alternative splicing. OMICS. 2006;10:344-57.
35. Sun Y, Koo S, White N, Peralta E, Esau C, Dean NM, et al. Development of a micro-array to detect human and mouse microRNAs and characterization of expression in human organs. Nucleic Acids Res. 2004;32:e188.
36. Tian Z, Greene AS, Pietrusz JL, Matus IR, Liang M. MicroRNA-target pairs in the rat kidney identified by microRNA microarray, proteomic, and bioinformatic analysis. Genome Res. 2008;18:404-11.
37. Anglicheau D, Sharma VK, Ding R, Hummel A, Snopkowski C, Dadhania D, et al. MicroRNA expression profiles predictive of human renal allograft status. Proc Natl Acad Sci U S A. 2009;106:5330-5.
38. Henger A, Kretzler M, Doran P, Bonrouhi M, Schmid H, Kiss E, et al. Gene expression fingerprints in human tubulointerstitial inflammation and fibrosis as prognostic markers of disease progression. Kidney Int. 2004;65:904-17.
39. Lindenmeyer MT, Kretzler M, Boucherot A, Berra S, Yasuda Y, Henger A, et al. Interstitial vascular rarefaction and reduced VEGF-A expression in human diabetic nephropathy. J Am Soc Nephrol. 2007;18:1765-76.
40. Schmid H, Cohen CD, Henger A, Schlondorff D, Kretzler M. Gene expression analysis in renal biopsies. Nephrol Dial Transplant. 2004;19:1347-51.
41. Neusser MA, Lindenmeyer MT, Moll AG, Segerer S, Edenhofer I, Sen K, et al. Human nephrosclerosis triggers a hypoxia-related glomerulopathy. Am J Pathol. 2010;176:594-607.
42. Bennett MR, Czech KA, Arend LJ, Witte DP, Devarajan P, Potter SS. Laser capture microdissection-microarray analysis of focal segmental glomerulosclerosis glomeruli. Nephron Exp Nephrol. 2007;107:e30-40.
43. Schwab K, Witte DP, Aronow BJ, Devarajan P, Potter SS, Patterson LT. Microarray analysis of focal segmental glomerulosclerosis. Am J Nephrol. 2004;24:438-47.
44. Sarwal M, Chua MS, Kambham N, Hsieh SC, Satterwhite T, Masek M, et al. Molecular heterogeneity in acute renal allograft rejection identified by DNA microarray profiling. N Engl J Med. 2003;349:125-38.
45. Bunnag S, Einecke G, Reeve J, Jhangri GS, Mueller TF, Sis B, et al. Molecular correlates of renal function in kidney transplant biopsies. J Am Soc Nephrol. 2009;20:1149-60.
46. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270:484-7.
47. Hayden PS, El-Meanawy A, Schelling JR, Sedor JR. DNA expression analysis: serial analysis of gene expression, microarrays and kidney disease. Curr Opin Nephrol Hypertens. 2003;12:407-14.
48. Schelling JR, El-Meanawy MA, Barathan S, Dodig T, Iyengar SK, Sedor JR. Generation of kidney transcriptomes using serial analysis of gene expression. Exp Nephrol. 2002;10:82-92.
49. Virlon B, Cheval L, Buhler JM, Billon E, Doucet A, Elalouf JM. Serial microanalysis of renal transcriptomes. Proc Natl Acad Sci U S A. 1999;96:15286-91.
50. Nystrom J, Fierlbeck W, Granqvist A, Kulak SC, Ballermann BJ. A human glomerular SAGE transcriptome database. BMC Nephrol. 2009;10:13.
51. Mardis ER. The impact of next-generation sequencing technology on genetics. Trends Genet. 2008;24:133-41.
52. Turner DJ, Keane TM, Sudbery I, Adams DJ. Next-generation sequencing of vertebrate experimental organisms. Mamm Genome. 2009;20:327-38.
53. Mardis ER. Next-generation DNA sequencing methods. Annu Rev Genomics Hum Genet. 2008;9:387-402.
54. Braslavsky I, Hebert B, Kartalov E, Quake SR. Sequence information can be obtained from single DNA molecules. Proc Natl Acad Sci U S A. 2003;100:3960-4.
55. Bentley DR. Whole-genome re-sequencing. Curr Opin Genet Dev. 2006;16:545-52.
56. Morozova O, Hirst M, Marra MA. Applications of new sequencing technologies for transcriptome analysis. Annu Rev Genomics Hum Genet. 2009;10:135-51.
57. Shendure J, Porreca GJ, Reppas NB, Lin X, McCutcheon JP, Rosenbaum AM, et al. Accurate multiplex polony sequencing of an evolved bacterial genome. Science. 2005;309:1728-32.
58. Milos P. Helicos BioSciences. Pharmacogenomics. 2008;9:477-80.
59. Marguerat S, Bahler J. RNA-seq: from technology to biology. Cell Mol Life Sci. 2010;67:569-79.
60. McPherson JD. Next-generation gap. Nat Methods. 2009;6 Suppl:S2-5.
61. Pepke S, Wold B, Mortazavi A. Computation for ChIP-seq and RNA-seq studies. Nat Methods. 2009;6 Suppl:S22-32.
62. Bloom JS, Khan Z, Kruglyak L, Singh M, Caudy AA. Measuring differential gene expression by short read sequencing: quantitative comparison to 2-channel gene expression microarrays. BMC Genomics. 2009;10:221.
63. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18:1509-17.
64. ’t Hoen PA, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RH, de Menezes RX, et al. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 2008;36:e141.
65. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5:621-8.
66. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008;321:956-60.
67. Ben-Dov C, Hartmann B, Lundgren J, Valcarcel J. Genome-wide analysis of alternative pre-mRNA splicing. J Biol Chem. 2008;283:1229-33.
68. Pan Q, Shai O, Lee LJ, Frey BJ, Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008;40:1413-5.
69. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, et al. Alternative isoform regulation in human tissue transcriptomes. Nature. 2008;456:470-6.
70. Cuellar LM, Fujinaka H, Yamamoto K, Miyamoto M, Tasaki M, Zhao L, et al. Identification and localization of novel genes preferentially expressed in human kidney glomerulus. Nephrology (Carlton). 2009;14:94-104.
71. Higgins JP, Wang L, Kambham N, Montgomery K, Mason V, Vogelmann SU, et al. Gene expression in the normal adult human kidney assessed by complementary DNA microarray. Mol Biol Cell. 2004;15:649-56.
72. Lindenmeyer MT, Eichinger F, Sen K, Anders HJ, Edenhofer I, Mattinzoli D, et al. Systematic analysis of a novel human renal glomerulus-enriched gene expression dataset. PLoS One. 2010;5:e11545.
73. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001;98:5116-21.
74. Cohen CD, Lindenmeyer MT, Eichinger F, Hahn A, Seifert M, Moll AG, et al. Improved elucidation of biological processes linked to diabetic nephropathy by single probe-based microarray data analysis. PLoS One. 2008;3:e2937.
75. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A. 1998;95:14863-8.
76. Akalin E, Hendrix RC, Polavarapu RG, Pearson TC, Neylan JF, Larsen CP, et al. Gene expression analysis in human renal allograft biopsy samples using high-density oligoarray technology. Transplantation. 2001;72:948-53.
77. Kainz A, Mitterbauer C, Hauser P, Schwarz C, Regele HM, Berlakovich G, et al. Alterations in gene expression in cadaveric vs. live donor kidneys suggest impaired tubular counterbalance of oxidative stress at implantation. Am J Transplant. 2004;4:1595-604.
78. Schmid H, Boucherot A, Yasuda Y, Henger A, Brunner B, Eichinger F, et al. Modular activation of nuclear factor-kappaB transcriptional programs in human diabetic nephropathy. Diabetes. 2006;55:2993-3003.
79. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4:44-57.
80. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T, et al. PID: the Pathway Interaction Database. Nucleic Acids Res. 2009;37(database issue):D674-9.
81. Bhavnani SK, Eichinger F, Martini S, Saxman P, Jagadish HV, Kretzler M. Network analysis of genes regulated in renal diseases: implications for a molecular-based classification. BMC Bioinformatics. 2009;10 Suppl 9:S3.
82. Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, Kohane IS. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics. 2002;18:405-12.
83. Yauk CL, Berndt ML. Review of the literature examining the correlation among DNA microarray technologies. Environ Mol Mutagen. 2007;48:380-94.
84. Kuhn A, Luthi-Carter R, Delorenzi M. Cross-species and cross-platform gene expression studies with the Bioconductor-compliant R package ‘annotationTools’. BMC Bioinformatics. 2008;9:26.
85. Lee RC, Feinbaum RL, Ambros V. The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell. 1993;75:843-54.
86. Wang WC, Lin FM, Chang WC, Lin KY, Huang HD, Lin NS. miRExpress: analyzing high-throughput sequencing data for profiling microRNA expression. BMC Bioinformatics. 2009;10:328.
87. Liu CG, Spizzo R, Calin GA, Croce CM. Expression profiling of microRNA using oligo DNA arrays. Methods. 2008;44:22-30.
88. Carthew RW, Sontheimer EJ. Origins and mechanisms of miRNAs and siRNAs. Cell. 2009;136:642-55.
89. Friedman RC, Farh KK, Burge CB, Bartel DP. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 2009;19:92-105.
90. Liang M, Liu Y, Mladinov D, Cowley AW Jr, Trivedi H, Fang Y, et al. MicroRNA: a new frontier in kidney and blood pressure research. Am J Physiol Renal Physiol. 2009;297:F553-8.
91. Pandey P, Brors B, Srivastava PK, Bott A, Boehn SN, Groene HJ, et al. Microarray-based approach identifies microRNAs and their target functional patterns in polycystic kidney disease. BMC Genomics. 2008;9:624.
92. Harvey SJ, Jarad G, Cunningham J, Goldberg S, Schermer B, Harfe BD, et al. Podocyte-specific deletion of dicer alters cytoskeletal dynamics and causes glomerular disease. J Am Soc Nephrol. 2008;19:2150-8.
93. Ho J, Ng KH, Rosen S, Dostal A, Gregory RI, Kreidberg JA. Podocyte-specific loss of functional microRNAs leads to rapid glomerular and tubular injury. J Am Soc Nephrol. 2008;19:2069-75.
94. Shi S, Yu L, Chiu C, Sun Y, Chen J, Khitrov G, et al. Podocyte-selective deletion of dicer induces proteinuria and glomerulosclerosis. J Am Soc Nephrol. 2008;19:2159-69.
95. Lee I, Ajay SS, Yook JI, Kim HS, Hong SH, Kim NH, et al. New class of microRNA targets containing simultaneous 5’-UTR and 3’-UTR interaction sites. Genome Res. 2009;19:1175-83.
96. Vasudevan S, Tong Y, Steitz JA. Switching from repression to activation: microRNAs can up-regulate translation. Science. 2007;318:1931-4.
97. Cohen CD, Kretzler M. Genexpressionsanalysen an Nierenbiopsien. Der Nephrologe. 2008;3:190-4.