DNA microarrays in parasitology: strengths and limitations


Review

TRENDS in Parasitology

Vol.19 No.10 October 2003

John C. Boothroyd1, Ira Blader1,2, Michael Cleary1 and Upinder Singh1,3

1 Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford CA 94305, USA
2 Department of Microbiology and Immunology, University of Oklahoma Health Sciences Center, Oklahoma City, OK 73190, USA
3 Department of Medicine, Division of Infectious Diseases, Stanford University School of Medicine, Stanford CA 94305, USA

Genome sequencing efforts have provided a wealth of new biological information that promises to have a major impact on our understanding of parasites. Microarrays provide one of the major high-throughput platforms by which this information can be exploited in the laboratory. Many excellent reviews and technique articles have recently been published on applying microarrays to organisms for which fully annotated genomes are at hand. However, many parasitologists work on organisms whose genomes have been only partially sequenced and where little, if any, annotation is available. The focus of this review is on how to use and apply microarrays to these situations.

DNA microarrays come in various forms, but all involve the same basic technology. A large number of unique nucleotide sequences are arrayed as microscopic spots and the abundance of sequences complementary to each in a given test sample is then determined by hybridization (the reader is encouraged to browse the large number of articles on microarrays published in the December 2002 issue of Nature Genetics). Probes typically consist of DNA, either genomic or complementary DNA (cDNA; used as a proxy for mRNA), labeled with fluorescent dyes (usually Cy3 and Cy5). Differences in abundance of a given sequence can then be assessed as a function of strain, time, developmental state or environmental/physiological condition. In this way, one can address changes or differences for a very large number of genes and draw conclusions about the biology of the system, all the while recognizing the limitations of the technology.

Microarray fabrication and experimental design

There are three major types of spotted arrays, defined by the nature of DNA arrayed: cDNA, genomic DNA or open reading frame-specific oligonucleotides (see Table 1) [1,2]. Oligonucleotide arrays are best suited to situations where the gene boundaries, including their exon/intron organization, are known.
Genomic DNA arrays require that introns and intergenic regions are not too common, whereas cDNA arrays have neither of these limitations. An advantage of genomic and cDNA microarrays is that they can be produced from libraries (cDNA or small, random fragments of genomic DNA) in which all inserts are cloned into a common vector such as pBluescript. As a result, only one pair of primers is required to amplify the inserts, which makes the generation of these microarrays relatively routine and inexpensive, albeit laborious. The quality of the resulting microarrays is dependent largely on the quality of the libraries used, including overall complexity and the length of the inserts. For example, non-normalized cDNA libraries will result in some genes being represented multiple times (an advantage for giving high confidence to the data) whereas others, corresponding to rare transcripts, will be completely absent. Another limitation of cDNA and genomic microarrays is that they are unable to discriminate between splice variants or gene family members that are closely related, whereas genomic libraries could contain segments that span two or more genes.

For organisms such as Plasmodium falciparum, whose genomes have been sequenced and annotated, long oligonucleotides (~70mers) can be prepared and spotted directly onto the microarrays* [3,4]. (We refer here to oligonucleotides synthesized by standard methods and then spotted onto glass slides. An alternative method of making microarrays, which we will not cover in this review, involves short oligonucleotides synthesized in situ, with a different sequence synthesized at each position, a technology most associated with companies such as Affymetrix [5].) Although more expensive than the other types, oligonucleotide microarrays have several advantages. First, oligonucleotide microarrays permit distinction between genes that are differentially spliced and/or are members of highly conserved gene families. Second, oligonucleotide microarrays allow discrimination between sense and antisense transcripts through hybridization with strand-specific probes generated by labeling either first or second strand cDNA. Third, oligonucleotide microarrays can be used to reveal polymorphisms, although achieving this level of specificity can be technically challenging. Fourth, the use of multiple, independent oligonucleotides for a given gene can increase the confidence in the result. Finally, spotting oligonucleotides directly onto microarrays avoids the laborious and error-prone step of polymerase chain reaction (PCR) amplification and amplicon purification.

Comparison methods

Virtually all microarray experiments are designed to compare one sample with others, to look for differences

Corresponding author: John C. Boothroyd ([email protected]).
* Bozdech et al. (2002) Genome wide gene expression characterization of Plasmodium falciparum asexual erythrocyte life cycle. Abstract from Molecular Parasitology Meeting held 22–26 September 2003 in Woods Hole, MA, USA.

http://parasites.trends.com
1471-4922/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.pt.2003.08.002


Table 1. Three major types of spotted arrays

                                         Complementary DNA         Genomic                  Oligonucleotide
Ease of generation (a)                   Easy                      Easy                     Easy
Cost (b)                                 Cheap                     Cheap                    Expensive
Specificity (c)                          Moderate                  Moderate                 High
Representation                           Moderate                  High                     High
Can develop without much sequence data   Yes                       Yes                      No
Distinguish sense from antisense         No                        No                       Yes
Easy to normalize                        Yes with common flank (d) Yes with common flank    Need (expensive) tag

a All three forms are relatively easy to generate with equipment that can be built or purchased for prices that are affordable to most large research universities.
b Generating the material to be spotted involves both labor and reagent costs. For many laboratories, the labor costs can be absorbed (e.g. through student and post-doctoral fellowships) whereas reagent budgets are severely limited. Oligonucleotides are therefore classified as expensive because they require a greater outlay for synthesis than is needed for the complementary DNA or genomic approaches.
c See text.
d If the sequences to be spotted are PCR-amplified from a plasmid library, all can be made with a common sequence at the two ends, which can be used to determine the molar quantity present in each spot on the array. For oligonucleotides, this requires synthesis of longer molecules or incorporation of some other chemical tag in the synthesis, which significantly increases the cost.

One of the challenges facing microarray users is the development of reagents, protocols, data management and statistical programs that allow the enormous amounts of data to be analyzed, and confidence in the results and conclusions to be high. An in-depth discussion of how this is done is beyond the scope of this review and we direct readers to several reviews for guidance in designing their experiments and analyzing their data [6–8]. Some of these issues do, however, warrant further discussion because they are especially challenging and pertinent to parasitologists.

in a gene’s polymorphisms and/or its transcript abundance [6–8]. There are two general designs, however, for such comparative experiments [7]. In Type I experiments, two experimental samples are labeled with different fluorescent dyes, and condition A is compared directly with condition B. After standardizing the average intensity for each dye over the entire microarray, the relative abundance of the sequence in question under the two conditions can be deduced from the ratio of fluorescence for the two dyes. Type I experiments require that all comparisons be pair-wise, but the amount of target sequence present in a spot will not generally affect the relative efficiency with which the two probes will hybridize and thus the ratio of their respective intensities. The major limitation of Type I experiments is that they do not permit comparison of data from multiple microarrays and experiments.

This led to the development of Type II experiments, which use a common reference probe labeled with one of the two dyes (see Fig. 1). In a given experiment, a fixed amount of the common reference probe, labeled with Cy5, is added to the experimental probes, labeled with Cy3. The common reference probe allows the amount of spotted DNA to be determined and normalized among many different microarrays. An important feature of the reference probe is that it should hybridize to at least the spots that will be hybridized by the experimental probes, if not to all of the spots on the microarray.

Currently, three types of common reference probes are generally used: cDNA, PCR amplicons and PCR-amplified polylinker [9–12]. The cDNA reference probe can be derived by pooling together aliquots of all experimental samples. Although this common reference probe should hybridize to all of the relevant genes on the microarray, it is sometimes not feasible when samples are limiting and new conditions demand preparation of a new reference pool.
To circumvent this limitation, cDNA can be generated from an extensive collection of cell lines and tissues. Whereas such material is commercially available for human, mouse and rat microarrays, there is no guarantee that parasite-induced host genes will be represented. Moreover, for parasite microarrays, obtaining large amounts of mRNA from all stages can be technically challenging. An alternative approach is to use PCR-amplified inserts corresponding to all of the genes present on the microarray, when such material is available. A third approach exploits the fact that, often, all of the spotted sequences have a small region in common (e.g. the polylinker into which the DNA inserts were cloned). In such cases, the common tag can be amplified (or synthesized) and used as a common reference probe. With this approach, the reference can be used not only to normalize signal intensities across multiple microarrays but also, because it hybridizes to each spot in proportion to the molar amount of target present, to determine the relative transcript abundance of the genes on a single microarray [10,11].

Reproducibility of the microarray

As with other techniques, reproducibility is partly a function of the quality of the reagents used. Thus, the DNA to be spotted, as well as the glass on which the microarray will be fabricated, must be meticulously prepared and carefully processed to minimize the level of background fluorescence [7]. Another common concern is having confidence that a spot on the microarray corresponds to the gene assigned to it. This problem was acutely highlighted when researchers using human cDNA microarrays commonly found mistakes in the annotation of the UNIGENE expressed sequence tag (EST) clones used to make the microarrays. The source of this error is still not clear, but it most probably arose from the difficulty of assembling, handling and managing a very large number of clones [13]. Thus, it is advisable to spot onto the microarray two or more independent clones for each gene. In addition, one should validate a microarray by directly sequencing random samples of the spotted material and checking the ‘register’ of the array using probes for specific genes.
Finally, for any gene of special interest, it is prudent to expand the data by independent means, such as northern blot or real-time reverse transcriptase PCR, before investing significant effort into its further analysis. Such methods have their own pitfalls and are not necessarily more reliable than well-performed microarray experiments; however, they can provide additional, valuable data (e.g. they can reveal alternative splicing that might not be detected on all arrays).

[Figure 1. Panels: (a) Type I; (b) Type II. Diagram labels: mock infection; infection; infection (condition A); infection (condition B); isolate RNA and label; isolate parasite RNA and label; mix with common reference probe; hybridize to microarray; hybridize to one microarray for each condition; upregulated; downregulated; no change; compare colors (= ratio of experimental:reference) under the two conditions A and B.]

Fig. 1. Comparison of Type I and II experimental design for microarray analysis. (a) Type I experiments are head-to-head comparisons in which the complementary DNA from two conditions are each labeled with a different dye and then mixed and hybridized to the array. The red:green ratio indicates the relative abundance of the transcript in the two conditions. (b) Type II experiments use a common reference probe labeled with one of the dyes and each experimental sample is labeled with the other. One array is used for each condition being analyzed and then, for each spot, the colors (ratio of the experimental:reference signals) are compared. For the indicated spot, red versus green indicates that that gene’s transcript is more abundant under condition A (left) than B (right); the remaining genes’ transcripts are equally abundant under the two conditions.
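The arithmetic behind the two designs in Fig. 1 can be illustrated with a minimal Python sketch. It assumes background-subtracted, strictly positive spot intensities and a simple global scaling of the two dye channels; real analyses typically use more elaborate normalization. The function names and the example intensities are hypothetical and are not taken from any of the cited studies.

```python
import math


def type1_log_ratios(cy5, cy3):
    """Type I design: both samples are hybridized to one array.

    Scale the Cy5 channel so that its mean intensity over the whole array
    matches that of the Cy3 channel, then return log2(Cy5/Cy3) per spot.
    """
    scale = (sum(cy3) / len(cy3)) / (sum(cy5) / len(cy5))
    return [math.log2((r * scale) / g) for r, g in zip(cy5, cy3)]


def type2_log_ratios(experimental, reference):
    """Type II design: log2(experimental/reference) per spot on one array."""
    return [math.log2(e / r) for e, r in zip(experimental, reference)]


def compare_via_reference(ratios_a, ratios_b):
    """log2(A/B) per spot, from two reference-based log ratios."""
    return [a - b for a, b in zip(ratios_a, ratios_b)]


# Hypothetical intensities for four spots on a single array (Type I).
type1 = type1_log_ratios(cy5=[200, 400, 800, 200], cy3=[100, 200, 100, 400])

# Hypothetical intensities for two spots on two separate arrays (Type II).
array_a = type2_log_ratios(experimental=[100, 400], reference=[100, 100])
array_b = type2_log_ratios(experimental=[50, 50], reference=[50, 50])
a_versus_b = compare_via_reference(array_a, array_b)
```

In the Type II example the two arrays carry different absolute signal, yet dividing by the shared reference on each array makes the conditions comparable; a positive value in `a_versus_b` indicates a transcript that is more abundant under condition A.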

Statistical analysis

One of the greatest hurdles to overcome when analyzing microarray data is to determine which changes observed with the microarrays are statistically significant. In general, two different approaches have been used. The first is to set arbitrary cutoffs, such as a two- or threefold difference between the two signals. An obvious limitation to this approach is that an arbitrary threshold gives no measure of statistical power, and could miss genes that are reproducibly modulated to a lesser degree but are of substantial biological importance. As an alternative to arbitrary thresholds, programs such as ‘significance analysis of microarrays’ (SAM) have been developed that identify statistically significant differences in expression levels based on repeated measurements of a given spot [14]. SAM allows the researcher to set parameters that influence the frequency of false positives and identifies differences in transcript abundance regardless of ‘fold-change’ values.

One challenge that parasitologists face, as well as others working with organisms whose genomes have yet to be sequenced and annotated, is applying these statistical algorithms to microarrays spotted with non-normalized cDNA libraries. Since conventional microarrays (human, mouse, yeast, Escherichia coli) usually contain at most two- or threefold redundancy, SAM and other algorithms were developed to consider each spot as a distinct gene [14]. In contrast to these microarrays, parasitologists who work with non-sequenced organisms often generate their microarrays using clones from non-normalized cDNA libraries in which highly expressed transcripts will be over-represented. For example, Toxoplasma gondii cDNA microarrays spotted with ~4400 ESTs from a non-normalized bradyzoite EST library contained highly abundant bradyzoite genes, such as BAG1 and SAG4, that were spotted 70 and 23 times, respectively [10]. To analyze all of these spots as a single gene, the data from the spots were grouped and the fraction of spots called significant by SAM was determined. Importantly, when genes that had previously been shown to be developmentally regulated were analyzed in this way, the vast majority of the spots containing the relevant ESTs were identified as differentially regulated by SAM [10].

Limiting samples

Whereas some stages of certain eukaryotic parasites, such as P. falciparum, T. gondii and Trypanosoma cruzi, can be grown in vitro to relatively high numbers, many stages cannot and, for some parasites (e.g. Plasmodium vivax), there are no reliable in vitro propagation methods. Moreover, techniques such as fluorescence-activated cell sorting, fine-needle aspiration biopsy and laser capture microdissection have greatly expanded our abilities to examine in vivo the interaction of a parasite directly with its host cell or its microenvironment within a specific tissue or organ [15]. Analysis of material that can only be obtained in small amounts has led to the development of microarray techniques that can be used with very small quantities of starting material while maintaining the molecular integrity of the sample (in this case, the relative amounts of each mRNA). Taking the lead from in situ hybridization and immunofluorescence protocols, reagents such as dendrimers, which are designed to attach a single cDNA molecule to hundreds of fluorescent molecules, and tyramide signal amplification, which is used in conjunction with biotin-streptavidin, have been successfully used to enhance the sensitivity of microarrays [16–18].

In addition to signal amplification, another widely used technique is linear amplification of RNA by creating cRNA with T7 RNA polymerase (T7Rpol) [19,20]. Here, the T7Rpol promoter sequence is added to the oligo(dT) primer used in the first strand reaction. Following second-strand synthesis, an in vitro transcription reaction is performed using T7Rpol. Although T7Rpol-based amplification is considered to be ‘linear’ (as opposed to PCR amplification, which is ‘exponential’), it is important to verify that the complexity and fidelity of the sample remain intact throughout the amplification. Using malaria high-density oligonucleotide microarrays, RNA amplification techniques were validated by monitoring the transcriptional changes of chromosome 2 that occur during intraerythrocytic development [3]. Both signal amplification and linear amplification have been used to analyze cancer biopsies and subpopulations of neurons, and are likely to prove to be powerful approaches for parasitologists. These techniques can also help to analyze changes in abundance for the rarest transcripts, which can otherwise be difficult to detect, even when the availability of parasite material is generally not considered limiting.

Microarray data publication

Because no single laboratory can study every pathogen, a major goal in this field is to standardize the experiments and reagents that make it possible to compare and correlate data between laboratories.
Although it is not practicable to compel groups to use, for example, a single cell line or infection protocol, it is conceivable and important that methods used to generate, hybridize, and analyze the microarrays are clear and specific, such that the experiments can be reproduced and the data utilized by others. Recent discussions by the Microarray Gene Expression Data Society (www.mged.org) have resulted in the development of the Minimum Information About a Microarray Experiment (MIAME) guidelines. The goal of these guidelines is to establish a standard in the microarray field that allows data to be unambiguously interpreted and verified. Because a large number of peer-reviewed journals require that all published microarray experiments follow these guidelines, we recommend reading the guidelines at http://www.mged.org/Workgroups/MIAME/miame.html and a checklist at http://www.mged.org/Workgroups/MIAME/miame_checklist.html.

What to do with microarrays?

The most common use of microarrays is to examine differences in transcript abundance as a function of any of several variables. These can include time in a given physiological condition, developmental stage, drug treatment, population density, infection, strain, etc. Importantly, it should be remembered that, for most experiments, microarrays are not the last experiment performed but, like genetic screens, they serve as a springboard to unraveling complex molecular and cellular pathways. The two big questions for researchers are: what type of experiments can be done and how does one go from having a mass of data on a large number of genes to understanding the biology of the system as a whole?

Comparative genomics

Parasitologists interested in questions of pathogenesis can use microarrays to address differences between two strains or between closely related species to determine, for example, what makes one virulent and another avirulent. Microarrays have facilitated our understanding that, for some bacterial pathogens, differences in virulence can be the result of the presence or absence of critical genes and/or a difference in their expression levels. Either one or both of these differences could serve as the basis for the vastly different clinical outcome of patients infected with Entamoeba histolytica, which causes severe disease, and Entamoeba dispar, which colonizes the gut without causing disease. Because these two organisms share 98% sequence identity, microarray-based, genome-wide comparisons could prove useful in identifying candidate virulence genes, based on which genes are uniquely present in E. histolytica versus E. dispar [21,22].

For some pathogens, successful transmission between hosts requires that the parasite become infectious before release. Vibrio cholerae collected from fecal samples of infected patients are more virulent than the same bacteria propagated in vitro. Transcriptional profiling of the bacteria directly from stool versus in vitro cultivation demonstrated that bacteria from human stool expressed genes that were apparently crucial for subsequent transmission [23]. Microarrays can also be utilized to compare the genomes of attenuated vaccine strains.
One caveat of using such vaccines is that continuous passage of the strain could result in the accumulation of point and deletion mutations that can have deleterious effects on its ability to elicit protective immune responses. For example, different daughter strains of attenuated Mycobacterium bovis strain, BCG, display various potencies in protecting individuals from infection with Mycobacterium tuberculosis. To investigate the basis for these differences, whole genome M. tuberculosis microarrays were probed with genomic DNA from M. tuberculosis and different BCG daughter strains. It was found that the BCG strains evolved and collected deletions in genes that could function to provide immunity against M. tuberculosis [24]. For parasites that have multiple clinical outcomes, an ultimate goal of genotype analysis would be the development of PCR or restriction fragment length polymorphism based assays that can readily distinguish the most virulent strains and allow for strain-appropriate therapy. Such goals are especially important in parasitology, in that genome-wide studies are often performed in developed countries with a low incidence of clinical disease and with laboratory adapted strains. Experiments with fresh, local isolates will be crucial to the successful exploitation of comparative genomics in identifying genotypes associated with unique epidemiological and clinical behaviors, as well as revealing evolutionary trends in parasite biology.
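At its simplest, a genomic comparison of the kind described in this section amounts to asking, for each spot, whether genomic DNA from a test strain hybridizes much more weakly than genomic DNA from a reference strain. The Python sketch below illustrates this under stated assumptions: intensities are already normalized and strictly positive, and the log2 cutoff of -1.0 is an arbitrary illustrative choice (arbitrary cutoffs carry the limitations noted under Statistical analysis, so replicate-based testing would be preferable in practice). The function name, gene names and values are hypothetical.

```python
import math


def call_absent_genes(test_signal, reference_signal, log2_cutoff=-1.0):
    """Return the genes whose log2(test/reference) is at or below the cutoff.

    Both arguments map gene name -> normalized hybridization intensity of
    labeled genomic DNA. The cutoff is an illustrative assumption only.
    """
    absent = []
    for gene, ref in reference_signal.items():
        log_ratio = math.log2(test_signal[gene] / ref)
        if log_ratio <= log2_cutoff:
            absent.append(gene)
    return sorted(absent)


# Hypothetical normalized intensities for three genes.
reference_strain = {"geneA": 1000.0, "geneB": 800.0, "geneC": 1200.0}
test_strain = {"geneA": 950.0, "geneB": 90.0, "geneC": 1100.0}

candidates = call_absent_genes(test_strain, reference_strain)
```

Genes flagged in this way are only candidates for deletion or strong divergence; a weak signal can also reflect sequence polymorphism or a poor spot, so independent confirmation (for example by PCR) would still be needed.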


Genetics

One of the most important tools for the geneticist is the availability of mutants in a given property of interest. Two questions inevitably follow: what is the basis of the mutation and how does it manifest in the change of phenotype? Genome-wide transcriptional profiling of a mutant strain compared with its parental wild-type strain can offer important clues about a gene’s function and the pathways in which it is involved without previous bias regarding which transcripts to examine. In bacteria, two-component systems such as Agr and Sar in Staphylococcus aureus and EvgAS in E. coli have been implicated in virulence, initially through largely unknown mechanisms [25,26]. Comparing the transcriptional profiles of the mutant and wild-type strains demonstrated that these two-component systems control the expression of genes that are likely to be crucial for survival in the host, thus providing important clues regarding the role of these factors in pathogenesis.

The transition from one physiological state to another is not usually controlled by a single gene, but rather by the coordination of multiple pathways. Collections of mutants are invaluable in dissecting and classifying these pathways. Even though many parasites are readily mutagenized, only a few have the genetic systems and reagents necessary to identify the mutated genes. Microarrays can be invaluable in analyzing a bank of such mutants and gaining insight about these pathways. As an example, performing epistasis experiments with several T. gondii mutants unable to differentiate between tachyzoites and bradyzoites, we were able to classify bradyzoite genes into four different functional groups based upon their expression patterns during differentiation [27].

Parasite development and life cycle

Transcriptional profiling of the changes in gene expression during development offers a new tool in the parasitologist’s arsenal to unravel how a parasite develops. The transcriptional profiles of P. falciparum grown in red blood cells revealed that glycolytic and other metabolic enzymes are coordinately upregulated when the parasites are undergoing high rates of growth. By contrast, late schizonts that are ready to undergo egress display reduced global transcriptional levels except for a subset of genes whose transcripts encode proteins involved in signaling [28]. Time-course analysis of gene expression following exposure to some developmental trigger (e.g. nutrient starvation or drug treatment) can identify early-response genes that are probably involved in initiating developmental changes as well as late-response genes that are likely to dictate the unique metabolic and physical properties of that developmental stage. For example, at a late stage during the transition of T. gondii tachyzoites to bradyzoites, a large number of genes encoding metabolic enzymes and bradyzoite secretory antigens were upregulated [10]. However, these data also highlight a major limitation of creating microarrays from a cDNA library. Because these microarrays were prepared from a mature bradyzoite cDNA library [29], spots representing some of the ‘early-response’ genes might not have been present.

One of the greatest challenges to this type of study involves obtaining pure preparations of the developmental stage of interest and generating sufficient quantities of RNA. Large-scale, in vitro culture methods and the ability to selectively isolate parasites based on stage-specific molecular markers and/or physical properties can often overcome these challenges. However, the number of manipulations that the parasites undergo between the time of reaching the stage of interest and RNA preparation must be kept to a minimum to avoid inducing changes in gene expression as a result of those manipulations.

Steady state versus de novo synthesis

To date, microarrays have been used almost exclusively to measure the abundance of a given transcript, rather than the actual transcription rate from the relevant gene. Abundance, of course, is a function of two competing forces, synthesis and decay, and so a rise in levels of a given mRNA can result from an increase in its synthesis and/or a decrease in its decay rate. Partly as a result of this, conventional transcriptional profiling experiments can be confounded by the presence of mRNA at the start of the experiment: if a given mRNA is highly abundant and has a long lifetime, it will be difficult to monitor a change in its synthesis rate, even if the transcription of the gene in question has been completely shut down by a given condition. In parasite systems, such as kinetoplastida, where post-transcriptional effects play a major role in modulating transcript levels, it can be especially hard to tease apart the relative contributions of synthesis and decay [30]. RNA decay can be addressed on a total-genome basis, using drugs to inhibit transcription (e.g. actinomycin D) and then measuring the relative decline in transcript abundance for all genes as a function of time. Such inhibitor-based studies are complicated by secondary effects of the drug and differences in the way a given mRNA decays.
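The interplay of synthesis and decay described above can be made concrete with a minimal sketch. It assumes the simplest possible model, which is not taken from the cited work: synthesis at a constant rate ks and first-order decay with rate constant kd, so that dA/dt = ks - kd·A. Once transcription is blocked (ks = 0), abundance falls as A(t) = A0·exp(-kd·t) and the half-life is ln(2)/kd. The Python function below estimates the half-life of one transcript from a time course by least-squares regression of ln(abundance) on time; the function name and the example values are hypothetical.

```python
import math


def decay_half_life(times, abundances):
    """Estimate mRNA half-life after a transcription block.

    Fits ln(abundance) = ln(A0) - kd * t by ordinary least squares and
    returns ln(2) / kd, in the same time units as `times`.
    Assumes first-order decay and strictly positive abundances.
    """
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx  # the slope equals -kd under the assumed model
    return math.log(2) / -slope


# Hypothetical time course: minutes after drug addition, relative abundance.
half_life = decay_half_life([0, 30, 60, 90], [1000.0, 500.0, 250.0, 125.0])
```

Under the same model, the steady-state abundance is ks/kd, which illustrates why a highly abundant, long-lived mRNA can mask a complete shutdown of transcription: with a small kd, the pre-existing pool declines only slowly.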
Once a gene of interest has been identified by the microarrays, it is important to adopt alternative approaches to unravel the mechanisms that regulate the abundance of its transcript. Such approaches include more conventional assays such as pulse/chase experiments using biosynthetic labeling to measure decay rates and/or nuclear run-on to measure transcriptional activity. An alternative approach that exists in theory, but has yet to be developed for routine use, is to perform genome-wide assays to directly measure mRNA synthesis over a specific period of time or condition. This could be accomplished, for example, by incubating cells with specifically tagged nucleosides and purifying the tagged mRNA (newly synthesized) from non-tagged mRNA (residual). This approach would be especially useful for looking at genes that are downregulated in a given condition.

Translational control

Translation is one of the most important levels at which gene expression is regulated. For example, global translation levels can decrease in response to stresses such as heat shock, starvation and hypoxia, whereas the translation rate of certain transcripts thought to be involved in responding to these stresses is increased [31,32]. Microarray analysis of total mRNA does not reveal whether mRNA for a given gene is being actively translated or not. It is possible, however, to derive some information on this by examining polysome profiles (from sucrose gradients) that separate transcripts based upon the number of ribosomes associated with a specific transcript as an indicator of its translational state [33,34]. Hence, here too arrays can provide important information on a crucial means for regulating gene expression. Apart from examining the translational state of each gene’s transcript, microarrays can also be used to determine which transcripts encode a secreted protein through the purification and microarray analysis of mRNA in membrane-bound polysomes [35]. Because secreted proteins are a major focus in vaccine design programs, this approach could provide these groups with many new candidates and complement bioinformatic and other approaches aimed at the same goal.

Host responses to infection

Most of the above discussion has focused on microarrays comprising parasite sequences. Of course, the essence of parasitology is the interaction between host and pathogen, and arrays consisting of host gene sequences are the perfect complement to the parasite gene arrays. For example, transcriptional profiling of the changes that occur within the host cell during infection, growth, latency and killing can provide key insights into how the parasite establishes and maintains infection. Reduced to its simplest level, examining how the host responds to infection addresses three major questions: (1) how does the host recognize that it has been infected with a specific parasite and initiate an appropriate immune response?; (2) how does the parasite stimulate these responses?; and (3) do the changes in host transcript abundance favor the host or the parasite?

For intimately associated pathogens, the use of infected host material brings with it a special problem, however: ensuring that the signal detected is not due to cross-hybridization between parasite transcripts and the host DNA spotted on the microarray.
For some highly conserved genes, the sequences present in the parasite and host genomes could be so similar that each hybridizes to the other (e.g. some ribosomal protein genes) [36]. In practice, however, such genes are relatively rare and the specificity of the hybridization is sufficient that the problem arises only in very few cases. These rare instances are easily identified in pilot experiments in which pure parasite genomic or cDNA is used as a probe on the host arrays and vice versa.

Recognition by the innate immune system of an infectious agent is a critical step in mounting an immune response against infection. Several groups have categorized the response of dendritic cells, macrophages and other innate immune cells to pathogenic organisms as well as to specific activators of the innate immune system such as lipopolysaccharide, double-stranded RNA, CpG methylated DNA (in vertebrate systems, cytosines found immediately 5′ of guanosines are frequently methylated, depending on whether the given gene is being transcribed or not) and mannan [37,38]. In many cases, these stimuli modulated common sets of genes that include those encoding chemokines, cytokines and other stress-related proteins. However, each stimulus did modulate its own unique subset of genes, suggesting that the innate immune system discriminates between infectious agents. In addition, some pathogens fail to trigger at least a subset of the common response genes. For example, the response of infected fibroblasts to two intracellular pathogens, T. gondii and T. cruzi, varies dramatically, with T. gondii stimulating a rapid and classic proinflammatory response and T. cruzi inducing much less change and in a very different repertoire of genes [39].

In terms of the biology of the system, host genes modulated during infection represent three functionally distinct classes: (1) genes that improve the survival of the host (‘pro-host’); (2) genes that promote the parasite’s growth (‘pro-parasite’); and (3) genes incidentally regulated as a consequence of modulating the first two classes (‘bystander’). Of course, pro-host genes that ameliorate extreme virulence can serve a crucial role for the effective transmission of any pathogen but, for the purposes of this discussion, we will consider only the short-term consequences of the interactions between a given host and given pathogen within it. Many pro-host genes (e.g. those encoding interferon-γ, tumor necrosis factor-α and other cytokines) have been previously identified, and microarrays have proven useful in the en masse characterization of their activation by numerous pathogens (see, for example, Refs [40–44]). Many more such genes, however, are likely to be identified by these sorts of analysis. On the pro-parasite side, very few genes have so far been definitively identified, but it can safely be assumed that the parasites have evolved to co-opt the host machinery to their own purposes and that changes in gene expression are likely to be one way to accomplish this. For example, intracellular parasites such as T. gondii, Theileria parva and Leishmania sp. are likely to upregulate anti-apoptotic genes as well as host metabolic genes necessary to satisfy their auxotrophies. Probably the greatest challenge for a biologist is determining the class (pro-host, pro-parasite or bystander) to which a gene belongs.
It is obviously not feasible to examine each of the hundreds of modulated genes in detail, and so careful choices will need to be made based on some knowledge of the gene’s function and, more importantly, the parasite’s biology. Clues will come from a variety of sources, including, for example, the type of host cells that respond in a similar way, the timing of the induction and the response to virulent versus non-virulent strains. Ultimately, the most useful information will come from analysis of mutant hosts (or host cell lines) that produce more or less of the given gene’s product. This knockout approach has been very powerful in analyzing the most obvious pro-host genes over the past two decades and will be crucial in the analysis of the less obvious genes identified from microarray experiments. Dissecting how the parasite effects changes on the host cell will require identifying the parasite factors that modulate host cell transcription and determining the host transcriptional and signaling pathways influenced by infection. Reporter assays based upon transcripts and pathways identified in the microarray experiments can be utilized to facilitate identification of these factors. Finally, it should also be noted that, while the individual bystander genes might not be directly relevant, they could be invaluable in providing clues to pro-parasite genes or pro-parasite transcriptional and signaling pathways through providing a list of genes that respond to the stimulus.
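The comparison of responses across stimuli described earlier in this section (a common set of modulated genes plus a subset unique to each stimulus) reduces to simple set operations once each experiment has produced a list of modulated genes. The following is a minimal sketch only: the function name, stimulus labels and gene identifiers are placeholders invented for illustration, and how a gene is called "modulated" in the first place is left to the upstream statistical analysis.

```python
def partition_responses(modulated):
    """Split modulated genes into a shared core and stimulus-specific subsets.

    modulated maps stimulus name -> iterable of gene IDs called as modulated.
    Returns (common, unique), where common is the set of genes modulated by
    every stimulus and unique maps each stimulus to genes seen with it alone.
    """
    gene_sets = {name: set(genes) for name, genes in modulated.items()}
    common = set.intersection(*gene_sets.values())
    unique = {}
    for name, genes in gene_sets.items():
        # Union of the genes modulated by all the other stimuli.
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        unique[name] = genes - others
    return common, unique


# Placeholder gene lists; these are not results from any cited study.
calls = {
    "stimulus_1": ["gene_A", "gene_B", "gene_C", "gene_D"],
    "stimulus_2": ["gene_A", "gene_B", "gene_E"],
    "stimulus_3": ["gene_A", "gene_B", "gene_C"],
}

common_genes, unique_genes = partition_responses(calls)
print(sorted(common_genes))
print({name: sorted(genes) for name, genes in unique_genes.items()})
```

Genes that fall into neither the common set nor any unique set (here, a gene shared by two of the three stimuli) are simply left unclassified by this sketch; a fuller analysis might report them as a separate category.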


Conclusion
Microarrays promise to be a tremendous addition to the tools available to the parasitologist interested in the complex interaction between host and pathogen. As with many new tools, however, their full utility will probably not be appreciated or developed for several more years to come. One of the most exciting advances to look forward to is the ability to integrate data across many experiments, from many laboratories, using arrays, proteomics, metabolomics, genetics and other broad approaches. Collectively, these advances will help us to develop a truly holistic view of the host–pathogen interaction. Anything that allows rapid insight into the myriad changes and crosstalk that occur on both sides of this equation can only help as we wrestle with the important diseases caused by these organisms.

Acknowledgements
We thank our many colleagues, in and out of our laboratories, who gave input into this review and the data on which it is based. The work from our laboratories mentioned in this review was supported by grants to: J.C.B. (NIH R37-AI214123, RO1-AI41014 and RO1-AI45057); I.B. (NIH F32-AI10478); M.C. (NIH CMB GM07276 and the University of California University-wide AIDS Research Program); and U.S. (NIH KO8-AI01453 and the Burroughs Wellcome Fund Career Development Award).

References
1 Rathod, P.K. et al. (2002) DNA microarrays for malaria. Trends Parasitol. 18, 39–45
2 Duggan, D.J. et al. (1999) Expression profiling using cDNA microarrays. Making and reading microarrays. Nat. Genet. 1 (Suppl), 10–14
3 Le Roch, K.G. et al. (2002) Monitoring the chromosome 2 intraerythrocytic transcriptome of Plasmodium falciparum using oligonucleotide arrays. Am. J. Trop. Med. Hyg. 67, 233–243
4 Volkman, S.K. et al. (2002) Excess polymorphisms in genes for membrane proteins in Plasmodium falciparum. Science 298, 216–218
5 Lockhart, D.J. et al. (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol. 14, 1675–1680
6 Miller, L.D. et al. (2002) Optimal gene expression analysis by microarrays. Cancer Cell 2, 353–361
7 Eisen, M.B. and Brown, P.O. (1999) DNA arrays for analysis of gene expression. Methods Enzymol. 303, 179–205
8 Richter, A. et al. (2002) Comparison of fluorescent tag DNA labeling methods used for expression analysis by DNA microarrays. Biotechniques 33, 620–630
9 Alizadeh, A.A. et al. (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503–511
10 Cleary, M.D. et al. (2002) Toxoplasma gondii asexual development: identification of developmentally regulated genes and distinct patterns of gene expression. Eukaryot. Cell 1, 329–340
11 Dudley, A.M. et al. (2002) Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range. Proc. Natl. Acad. Sci. U. S. A. 99, 7554–7559
12 Sterrenburg, E. et al. (2002) A common reference for cDNA microarray hybridizations. Nucleic Acids Res. 30, E116
13 Knight, J. (2001) When the chips are down. Nature 410, 860–861
14 Tusher, V.G. et al. (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. U. S. A. 98, 5116–5121
15 Mills, J.C. et al. (2001) DNA microarrays and beyond: completing the journey from tissue to cell. Nat. Cell Biol. 3, E175–E178
16 Stears, R.L. et al. (2000) A novel, sensitive detection system for high-density microarrays using dendrimer technology. Physiol. Genomics 3, 93–99
17 Karsten, S.L. et al. (2002) An evaluation of tyramide signal amplification and archived fixed and frozen tissue in microarray gene expression analysis. Nucleic Acids Res. 30, E4
18 Manduchi, E. et al. (2002) Comparison of different labeling methods for two-channel high-density microarray experiments. Physiol. Genomics 10, 169–179
19 Luo, L. et al. (1999) Gene expression profiles of laser-captured adjacent neuronal subtypes. Nat. Med. 5, 117–122
20 Gelder, R. et al. (1990) Amplified RNA synthesized from limited quantities of heterogeneous cDNA. Proc. Natl. Acad. Sci. U. S. A. 87, 1663–1667
21 Diamond, L.S. and Clark, C.G. (1993) A redescription of Entamoeba histolytica Schaudinn, 1903 (emended Walker, 1911) separating it from Entamoeba dispar Brumpt, 1925. J. Eukaryot. Microbiol. 40, 340–344
22 Clark, C.G. and Diamond, L.S. (1997) Intraspecific variation and phylogenetic relationships in the genus Entamoeba as revealed by riboprinting. J. Eukaryot. Microbiol. 44, 142–154
23 Merrell, D.S. et al. (2002) Host-induced epidemic spread of the cholera bacterium. Nature 417, 642–645
24 Behr, M.A. et al. (1999) Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science 284, 1520–1523
25 Masuda, N. and Church, G.M. (2002) Escherichia coli gene expression responsive to levels of the response regulator EvgA. J. Bacteriol. 184, 6225–6234
26 Dunman, P.M. et al. (2001) Transcription profiling-based identification of Staphylococcus aureus genes regulated by the agr and/or sarA loci. J. Bacteriol. 183, 7341–7353
27 Singh, U. et al. (2002) Genetic analysis of tachyzoite to bradyzoite differentiation mutants in Toxoplasma gondii reveals a hierarchy of gene induction. Mol. Microbiol. 44, 721–733
28 Mamoun, C.B. et al. (2001) Co-ordinated programme of gene expression during asexual intraerythrocytic development of the human malaria parasite Plasmodium falciparum revealed by microarray analysis. Mol. Microbiol. 39, 26–36
29 Manger, I.D. et al. (1998) Expressed sequence tag analysis of the bradyzoite stage of Toxoplasma gondii: identification of developmentally regulated genes. Infect. Immun. 66, 1632–1637
30 Clayton, C.E. (2002) Life without transcriptional control? From fly to man and back again. EMBO J. 21, 1881–1888
31 Gingras, A-C. et al. (1999) eIF4 initiation factors: effectors of mRNA recruitment to ribosomes and regulators of translation. Annu. Rev. Biochem. 68, 913–963
32 Kraggerud, S.M. et al. (1995) Regulation of protein-synthesis in human cells exposed to extreme hypoxia. Anticancer Res. 15, 683–686
33 Zong, Q. et al. (1999) Messenger RNA translation state: the second dimension of high-throughput expression screening. Proc. Natl. Acad. Sci. U. S. A. 96, 10632–10636
34 Kuhn, K.M. et al. (2001) Global and specific translational regulation in the genomic response of Saccharomyces cerevisiae to a rapid transfer from a fermentable to a nonfermentable carbon source. Mol. Cell. Biol. 21, 916–927
35 Diehn, M. et al. (2000) Large-scale identification of secreted and membrane-associated gene products using DNA microarrays. Nat. Genet. 25, 58–62
36 Blader, I.J. et al. (2001) Microarray analysis reveals previously unknown changes in Toxoplasma gondii-infected human cells. J. Biol. Chem. 276, 24223–24231
37 Huang, Q. et al. (2001) The plasticity of dendritic cell responses to pathogens and their components. Science 294, 870–875
38 Boldrick, J.C. et al. (2002) Stereotyped and specific gene expression programs in human innate immune responses to bacteria. Proc. Natl. Acad. Sci. U. S. A. 99, 972–977
39 de Avalos, S.V. et al. (2001) Immediate/early response to Trypanosoma cruzi infection involves minimal modulation of host cell transcription. J. Biol. Chem. 277, 639–644
40 Guillemin, K. et al. (2002) Cag pathogenicity island-specific responses of gastric epithelial cells to Helicobacter pylori infection. Proc. Natl. Acad. Sci. U. S. A. 99, 15136–15141
41 Geiss, G.K. et al. (2000) Large-scale monitoring of host cell gene expression during HIV-1 infection using cDNA microarrays. Virology 266, 8–16
42 Chang, Y. and Laimins, L. (2000) Microarray analysis identifies interferon-inducible genes and Stat1 as major transcriptional targets of human papillomavirus type 31. J. Virol. 74, 4174–4182
43 Cohen, P. et al. (2000) Monitoring cellular responses to Listeria monocytogenes with oligonucleotide arrays. J. Biol. Chem. 275, 11181–11190
44 Renne, R. et al. (2001) Modulation of cellular and viral gene expression by the latency-associated nuclear antigen of Kaposi’s sarcoma-associated herpesvirus. J. Virol. 75, 458–468