FEBS Letters 584 (2010) 4375–4378
journal homepage: www.FEBSLetters.org
Hypothesis
A conserved gene cluster as a putative functional unit in insect innate immunity Kálmán Somogyi a,⇑,1, Botond Sipos a,1, Zsolt Pénzes a,b, István Andó a a b
Institute of Genetics, Biological Research Center of the Hungarian Academy of Sciences, P.O. Box 1, H-6726 Temesvári krt. 62. Szeged, Hungary Department of Ecology, University of Szeged, H-6726 Közép fasor 52. Szeged, Hungary
a r t i c l e
i n f o
Article history: Received 9 August 2010 Revised 7 October 2010 Accepted 8 October 2010 Available online 14 October 2010 Edited by Takashi Gojobori
a b s t r a c t The Nimrod gene superfamily is an important component of the innate immune response. The majority of its member genes are located in close proximity within the Drosophila melanogaster genome and they lie in a larger conserved cluster (‘‘Nimrod cluster”), made up of non-related groups (families, superfamilies) of genes. This cluster has been a part of the Arthropod genomes for about 300–350 million years. The available data suggest that the Nimrod cluster is a functional module of the insect innate immune response. Ó 2010 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Keywords: Gene cluster Nimrod Centaurin gamma Ance Vajk Immune response
The Nimrod gene superfamily is a fundamental component of the innate immune response. For many Nimrod proteins, participation in defense mechanisms through involvement in the phagocytosis of either apoptotic cells, bacteria or both (see [1] for a summary; [2–4]) has been demonstrated in various organisms. Also, some Nimrod genes show up-regulation upon immune challenge of the Drosophila melanogaster in different in vivo and cell culture experiments [5,6]. Most of the genes in the Nimrod superfamily are located in close proximity on the 2nd chromosome of D. melanogaster and this clustering was also observed in the genomes of Anopheles gambiae (Diptera) and Apis mellifera (Hymenoptera) [2]. The examination of several arthropod genomes (D. melanogaster and A. gambiae (Diptera), Tribolium castaneum (Coleoptera), A. mellifera (Hymenoptera), Pediculus humanus (Phthiraptera) and Daphnia pulex (Cladocera)) revealed that these genes are part of a gene cluster (hereafter ‘‘Nimrod cluster”) conserved throughout a broad timescale (Fig. 1A). The cluster is composed of members of the following gene families: Centaurin gamma (CenG), Angiotensin converting enzyme (Ance), Nimrod A (NimA), Nimrod B (NimB)
⇑ Corresponding author. Address: Institute of Biochemistry, Biological Research Center of the Hungarian Academy of Sciences, P.O. Box 1, H-6726 Temesvári krt. 62. Szeged, Hungary. Fax: +36 62 433 506. E-mail addresses:
[email protected] (K. Somogyi),
[email protected] (B. Sipos),
[email protected] (Z. Pénzes),
[email protected] (I. Andó). 1 These authors contributed equally to this work.
and Nimrod C (NimC), and a yet uncharacterized group (most likely a superfamily) of genes (hereafter ‘‘Vajk genes”) (Fig. 1A and SI Table 1). According to a comparative analysis of 12 Drosophila genomes [7] the region containing most of the genes of the Nimrod cluster is part of one of the largest syntenic blocks. Of the analyzed genomes, the genome of P. humanus is the most distant from Drosophila where many properties of the complete cluster are still evident. Notably, here only a single Draper-type Nimrod gene (coding for a protein with a Draper-type domain architecture) was identified in the cluster (nimrod A). This cluster shows a composition which can be considered as an intermediate between the holometabolous insects and other non-insect species (Fig. 1A). In the latter genomes no Nimrod C or B genes were found. This suggests that, in concordance with the previously described hypothesis [1], a Draper-type gene may have been the first member of the superfamily and later local duplications followed by divergence led to the other (Nimrod C and B) genes. The genome of P. humanus is also the one most distant from D. melanogaster where we were able to identify a Vajk gene. This single Vajk gene, located within the intron of an Ance gene, might indicate an upper bound for the time of appearance of that group of genes (Fig. 1A). Centaurin gamma seems to be a somewhat optional member of the cluster. First, it is absent from the cluster in A. gambiae and lacks any ortholog in A. mellifera. Second, although in most of the sequenced Drosophila genomes the whole Nimrod cluster is intact [7], in Drosophila yakuba and Drosophila mojavensis it does not
0014-5793/$36.00 Ó 2010 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.febslet.2010.10.014
4376
K. Somogyi et al. / FEBS Letters 584 (2010) 4375–4378
Fig. 1. Genomic organization and some aspects of gene expression regulation in the neighbourhood of the Nimrod cluster. (A) Genomic organization in the various species. Genes are schematically shown on simplified chromosomal maps, colour-coded as follows: Centaurin gamma (cenG; black), Ance (blue), Vajk (green), Draper-type (red), Nimrod C-type (pink) and Nimrod B-type (orange). Other genes are shown in white. Only the genus names of the analyzed species are shown. In the case of Drosophila, the gene names are also indicated. The names of the chromosomes/genomic segments are indicated above the maps. Genes located outside the cluster are shown on the right. To emphasize the position of the Vajk genes, the large intron of ance-3 (and its orthologs) is represented. Note that the region containing one of the Tribolium clusters (on the Lg Unknown 24) is not yet mapped to any chromosome, so its position according to the other clusters is unknown. The divergence times are based on [30]. (B) Some aspects of gene expression regulation. The figure schematically represents a two megabase-long region around the Nimrod cluster (2L: 13490000-14490000 basepairs; 102 genes). Rectangles in the first lane represent the genes in their chromosomal order, colour-coded as in (A). Data presented in other lanes: conserved synthetic blocks across 12 Drosophila species [7]; clusters of co-expression [12]; clusters of genes with coordinated evolution in Drosophila species [13], overlapping merged windows obtained with different window sizes were fused; Class I insulator sites [16], the sector between two consecutive insulators overlapping the Nimrod gene cluster is highlighted; Polycomb protein binding domains [18]; and regions of Eve transcription factor binding-eve BRICK, the largest BRICK overlapping the cluster is shown [19].
include cenG (and also not ance and ance-2), providing hints of at least two independent chromosomal breakages upstream of ance-3. The similar gene composition across distantly related arthropod species indicates that the cluster remained mostly intact during the last 300–350 million years. We assessed the statistical significance of clustering by using the method of Raghupathy and Durand [8] – which also deals with the presence of gene families – in pairwise tests of the A. gambiae, A. mellifera and P. humanus clusters against the D. melanogaster cluster. After correcting for the multiple comparisons by the Holm’s method, all pairwise P-values were smaller than 0.001 (SI Text 1). The amount of time elapsed since the divergence of the most distant species (Fig. 1A) makes the neutral explanation of the conservation less plausible, which qualifies the cluster as a possibly important evolutionary unit. While the overall composition (presence and order of the genes) seems to be well conserved, the actual number and orientation of genes from individual gene families vary widely. This again indicates that this clustering cannot be simply explained by neutral factors (e.g. a chromosome region with an intrinsically low rate of rearrangements). A need for close proximity of common elements in the regulation of genes might drive the purifying selection maintaining the integrity of a gene cluster [9–11]. Indeed, experimental data suggest possible coordinated gene expression regulation acting in the Nimrod region of D. melanogaster.
In a comprehensive study Spellmann and Rubin analyzed transcription profiles based on experiments performed in 88 different conditions [12]. They identified groups of adjacent similarly expressed genes in the fruitfly genome. According to this robust dataset, many of the genes of the Nimrod cluster appear to be co-regulated (Fig. 1B). A subset of these co-regulated genes was also identified as a cluster of genes with coordinated evolution of gene expression levels (Fig. 1B) in a study on closely related Drosophila species [13]. These pieces of evidence clearly argue for the presence of common elements in the regulation of the clustered genes. Several authors (e.g. [9,12]) claim that regulation on chromatin level might be responsible for gene clustering. Boundary elements (insulators) are able to participate in the organization of the chromatin structure by promoting the formation of functional domains on the chromosome (for a review see [14]). Through inter-insulator contacts, boundary elements can create chromatin loops and so provide the opportunity for concerted regulation inside this domain [15]. Indeed, ‘‘Class I” insulators – identified by the binding of BEAF-32, CP190, or CTCF in Drosophila embryos – act as chromatin boundaries and separate differentially expressed genes [16]. In humans, chromosomal domains where the insulator protein CTCF is depleted tend to contain clusters of coordinately regulated genes; moreover, pairs of consecutive CTCF-binding sites often delimit clusters of functionally and/or phylogenetically related genes [17]. In the Nimrod region, a domain delimited by two sequential Class I insulators [16] considerably overlaps with the SpellmanRubin co-expression block and the Mezey-cluster (Fig. 1B). The
K. Somogyi et al. / FEBS Letters 584 (2010) 4375–4378
chromosomal distribution of the chromatin proteins – frequently modulated by boundary elements – might also indicate similar regulatory mechanisms operating on the corresponding genes. Notably, in the Nimrod region experimentally defined domains of chromatin protein binding were identified in Drosophila studies, performed on a cell line [18] or on embryos [19] (Fig. 1B). These data support that chromatin-level processes are behind the common regulation in this region. Co-regulation is often coupled to involvement of the concerned genes in a common biological process [9,20]. Our entire knowledge about the Nimrod superfamily unambiguously points to a role of these genes in the innate immune response. Indeed, immune genes were reported to tend to form clusters in the D. melanogaster genome [21,22]. Based on these considerations, we examined whether the other genes of the cluster fulfill immune-related functions. Ance (or ACE) genes encode proteases with broad substrate specificity and have been found in a wide range of animals from worms to vertebrates. In insects, they were described to have roles, among others, in the immune defense (reviewed by [23]), supported, for example, by high ACE enzymatic activity measured in the haemolymph of various Lepidoptera species [24] and in D. melanogaster [25]. The expression of Ance genes is significantly enhanced during immune response, as after stimulation by bacterial lipopolysaccharide in Locusta migratoria [26] or following a Wolbachia endosymbiotic infection in fruitfly [27]. In fruitflies selected for resistance against a bacterial pathogen Pseudomonas aeruginosa, the expression of ance-4 and ance-5 genes was significantly upregulated, similarly to many other Immune genes (including some Nimrod genes, as nimrod C1 and eater) [6]. Orthologs of the Drosophila centaurin gamma (cenG) were identified in various organisms, like Caenorhabditis elegans, Ciona intestinalis or humans (Flybase: http://flybase.org). The CenG gene encodes an intracellular multidomain protein, believed to be involved in cellular signalling [28]. According to the only available experimental evidence for CenG protein it was isolated from the phagosomes of D. melanogaster cells [29], suggesting a role in phagocytosis. In D. melanogaster we have found 3 Vajk genes (vajk-1,-2, and-3) in the large intron of ance-3 (Fig. 1A) (and a fourth one at a different genomic location: vajk-4), which encode proteins with similar sequence properties (SI Fig. 1). The increased expression of vajk-2 and -3 upon bacterial infection of fruitfly larvae [5] provides experimental support for the involvement of Vajk genes in the immune defense. The pieces of evidence listed above make appealing the hypothesis that the Nimrod cluster is a functional module composed of genes acting in the innate immune response. The long-term conservation indicates the importance of this unit, implying, that a study of the individual genes and the elucidation of the putative common process might contribute to our understanding of insect defense mechanisms. Our observations on Nimrod cluster also show that detailed analysis of selected conserved gene clusters might be a promising way to identify novel functional units of various biological processes. Acknowledgements This work was supported by the following sources: FEBS Longterm Fellowship to K.S.; joint doctoral fellowship of The National Centre for Scholarship Abroad of the Romanian Ministry of Education and Research and the Hungarian Scholarship Board to B.S.; Hungarian Scientific Research Fund grant (NK78024) to I.A. Henrik Gyurkovics gave useful suggestions, Gregory Jordan helped in editing the manuscript. We thank two anonymous referees for their comments.
4377
Appendix A. Supplementary data Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.febslet.2010.10.014. References [1] Somogyi, K., Sipos, B., Pénzes, Z., Kurucz, E., Zsámboki, J., Hultmark, D. and Andó, I. (2008) Evolution of genes and repeats in the Nimrod superfamily. Mol. Biol. Evol. 25, 2337–2347. [2] Kurucz, E., Márkus, R., Zsámboki, J., Folkl-Medzihradszky, K., Darula, Z., Vilmos, P., Udvardy, A., Krausz, I., Lukacsovich, T., Gateff, E., Zettervall, C.J., Hultmark, D. and Andó, I. (2007) Nimrod, a putative phagocytosis receptor with EGF repeats in Drosophila plasmatocytes. Curr. Biol. 17, 649–654. [3] Cuttell, L., Vaughan, A., Silva, E., Escaron, C.J., Lavine, M., Van Goethem, E., Eid, J.P., Quirin, M. and Franc, N.C. (2008) Undertaker, a Drosophila Junctophilin, links Draper-mediated phagocytosis and calcium homeostasis. Cell 135, 524– 534. [4] Kurant, E., Axelrod, S., Leaman, D. and Gaul, U. (2008) Six-microns-under acts upstream of Draper in the glial phagocytosis of apoptotic neurons. Cell 133, 498–509. [5] Irving, P., Ubeda, J.M., Doucet, D., Troxler, L., Lagueux, M., Zachary, D., Hoffmann, J.A., Hetru, C. and Meister, M. (2005) New insights into Drosophila larval haemocyte functions through genome-wide analysis. Cell. Microbiol. 7, 335–350. [6] Ye, Y.H., Chenoweth, S.F. and McGraw, E.A. (2009) Effective but costly, evolved mechanisms of defense against a virulent opportunistic pathogen in Drosophila melanogaster. PLoS Pathog. 5, e1000385. [7] Bhutkar, A., Schaeffer, S.W., Russo, S.M., Xu, M., Smith, T.F. and Gelbart, W.M. (2008) Chromosomal rearrangement inferred from comparisons of 12 Drosophila genomes. Genetics 179, 1657–1680. [8] Raghupathy, N. and Durand, D. (2009) Gene cluster statistics with gene families. Mol. Biol. Evol. 26, 957–968. [9] Hurst, L.D., Pál, C. and Lercher, M.J. (2004) The evolutionary dynamics of eukaryotic gene order. Nat. Rev. Genet. 5, 299–310. [10] de Wit, E. and van Steensel, B. (2009) Chromatin domains in higher eukaryotes: insights from genome-wide mapping studies. Chromosoma 118, 25–36. [11] Koonin, E.V. (2009) Evolution of genome architecture. Int. J. Biochem. Cell Biol. 41, 298–306. [12] Spellman, P.T. and Rubin, G.M. (2002) Evidence for large domains of similarly expressed genes in the Drosophila genome. J. Biol. 1, 5. [13] Mezey, J.G., Nuzhdin, S.V., Ye, F. and Jones, C.D. (2008) Coordinated evolution of co-expressed gene clusters in the Drosophila transcriptome. BMC Evol. Biol. 8, 2. [14] Dorman, E.R., Bushey, A.M. and Corces, VG. (2007) The role of insulator elements in large-scale chromatin structure in interphase. Semin. Cell Dev. Biol. 18, 682–690. [15] Wallace, J.A. and Felsenfeld, G. (2007) We gather together: insulators and genome organization. Curr. Opin. Genet. Dev. 17, 400–407. [16] Nègre, N., Brown, C.D., Shah, P.K., Kheradpour, P., Morrison, C.A., Henikoff, J.G., Feng, X., Ahmad, K., Russell, S., White, R.A., Stein, L., Henikoff, S., Kellis, M. and White, K.P. (2010) A comprehensive map of insulator elements for the Drosophila genome. PLoS Genet. 6, e1000814. [17] Kim, T.H., Abdullaev, Z.K., Smith, A.D., Ching, K.A., Loukinov, D.I., Green, R.D., Zhang, M.Q., Lobanenkov, V.V. and Ren, B. (2007) Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231– 1245. [18] Tolhuis, B., de Wit, E., Muijrers, I., Teunissen, H., Talhout, W., van Steensel, B. and van Lohuizen, M. (2006) Genome-wide profiling of PRC1 and PRC2 Polycomb chromatin binding in Drosophila melanogaster. Nat. Genet. 38, 694– 699. [19] de Wit, E., Braunschweig, U., Greil, F., Bussemaker, H.J. and van Steensel, B. (2008) Global chromatin domain organization of the Drosophila genome. PLoS Genet. 4, e1000045. [20] Lee, J.M. and Sonnhammer, E.L. (2003) Genomic gene clustering analysis of pathways in eukaryotes. Genome Res. 13, 875–882. [21] Khush, R.S. and Lemaitre, B. (2000) Genes that fight infection: what the Drosophila genome says about animal immunity. Trends Genet. 16, 442–449. [22] Wegner, K.M. (2008) Clustering of Drosophila melanogaster immune genes in interplay with recombination rate. PLoS One 3, e2835. [23] Macours, N., Poels, J., Hens, K., Francis, C. and Huybrechts, R. (2004) Structure, evolutionary conservation, and functions of angiotensin-and endothelinconverting enzymes. Int. Rev. Cytol. 239, 47–97. [24] Lemeire, E., Vanholme, B., Van Leeuwen, T., Van Camp, J. and Smagghe, G. (2008) Angiotensin-converting enzyme in Spodoptera littoralis: molecular characterization, expression and activity profile during development. Insect. Biochem. Mol. Biol. 38, 166–1675. [25] Siviter, R.J., Nachman, R.J., Dani, M.P., Keen, J.N., Shirras, A.D. and Isaac, R.E. (2002) Peptidyl dipeptidases (Ance and Acer) of Drosophila melanogaster: major differences in the substrate specificity of two homologs of human angiotensin I-converting enzyme. Peptides 23, 2025–2034. [26] Macours, N., Hens, K., Francis, C., De Loof, A. and Huybrechts, R. (2003) Molecular evidence for the expression of angiotensin converting enzyme in
4378
K. Somogyi et al. / FEBS Letters 584 (2010) 4375–4378
hemocytes of Locusta migratoria: stimulation by bacterial lipopolysaccharide challenge. J. Insect. Physiol. 49, 739–746. [27] Xi, Z., Gavotte, L., Xie, Y. and Dobson, S.L. (2008) Genome-wide analysis of the interaction between the endosymbiotic bacterium Wolbachia and its Drosophila host. BMC Genomics 9, 1. [28] Jackson, T.R., Kearns, B.G. and Theibert, A.B. (2000) Cytohesins and centaurins: mediators of PI 3-kinase-regulated Arf signaling. Trends Biochem. Sci. 25, 489–495.
[29] Stuart, L.M., Boulais, J., Charriere, G.M., Hennessy, E.J., Brunet, S., Jutras, I., Goyette, G., Rondeau, C., Letarte, S., Huang, H., Ye, P., Morales, F., Kocks, C., Bader, J.S., Desjardins, M. and Ezekowitz, R.A. (2007) A systems biology analysis of the Drosophila phagosome. Nature 445, 95–101. [30] Honeybee Genome Sequencing Consortium (2006) Insights into social insects from the genome of the honeybee Apis mellifera. Nature 443, 931–949.