Towards the elucidation of the regulatory network guiding the insulin producing cells’ differentiation

Towards the elucidation of the regulatory network guiding the insulin producing cells’ differentiation

Genomics 100 (2012) 212–221 Contents lists available at SciVerse ScienceDirect Genomics journal homepage: www.elsevier.com/locate/ygeno Towards the...

933KB Sizes 2 Downloads 25 Views

Genomics 100 (2012) 212–221

Contents lists available at SciVerse ScienceDirect

Genomics journal homepage: www.elsevier.com/locate/ygeno

Towards the elucidation of the regulatory network guiding the insulin producing cells’ differentiation Maria Kapasa a, b,⁎, Dimitrios Vlachakis a, Myrto Kostadima c, Georgia Sotiropoulou b, Sophia Kossida a,⁎ a b c

Bioinformatics & Medical Informatics Laboratory, Biomedical Research Foundation of the Academy of Athens, 11527, Athens, Greece Department of Pharmacy, School of Health Sciences, University of Patras, GR-26500 Rion-Patras, Greece European Molecular Biology Laboratory — European Bioinformatics Institute (EMBL–EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK

a r t i c l e

i n f o

Article history: Received 6 March 2012 Accepted 5 July 2012 Available online 20 July 2012 Keywords: Regulatory networks Insulin Stem cells NGN3 (NEUROG3) Gene expression

a b s t r a c t This study pertains to the regulatory network of neurogenin3 (NGN3, approved symbol: NEUROG3), the main regulator of insulin producing cells’ formation. In silico regulatory region analyses of known and novel targets of NGN3 revealed the presence of two variants of a regulatory module that appeared conserved at the most phylogenetically distant species with pancreas. Both variants of this module contained binding sites of six transcription factors implicated in pancreas development. Nevertheless, an additional factor was found only into the module of the down-regulated by NGN3 genes. Whole genome analyses confirmed the statistical significance of these regulatory modules. Investigation of protein–protein interactions among the factors bound into these sequences indicated the formation of alternative protein complexes resulting into the up- or down-regulation of the respective genes. Subsequently, an NGN3-guided regulatory network, was modeled, describing the interactions among the analyzed genes with their transcriptional regulators, leading into the differentiation of cells capable of producing insulin. © 2012 Elsevier Inc. All rights reserved.

1. Introduction Embryonic development can be considered as a perfectly orchestrated series of multiple cell movements and changes in gene expression, Abbreviations: NGN3 (NEUROG3), neurogenin3; ESCs, Embryonic Stem Cells; TFBS, Transcription Factor Binding Sites; TFs, Transcription Factors; AP4 (TFAP4), Activating enhancer-binding Protein 4 (Transcription Factor AP-4); WISP1, WNT-Induced Secreted Protein 1; CTGF, Connective Tissue Growth Factor; TSS, Transcription Start Site; HNF1 (HNF1A), Hepatocyte Nuclear Factor 1; HNF6 (ONECUT1), Hepatocyte Nuclear Factor 6 (ONE CUT homeobox 1); BRN2/5 (POU3F2/POU6F1), BRaiN-specific homeobox/POU domain proteins2/5 (POU class 3 homeobox 2/ POU class 6 homeobox 1); PDX1, Pancreatic and Duodenal homeoboX 1; ISL1, ISL LIM homeobox 1; LEF1, Lymphoid Enhancer-binding Factor 1; IPA, Ingenuity Pathway Analysis; CREBBP, CREB Binding Protein; CTNNB1, Β-CaΤeΝiΝ; HDAC1, Histone DeACetylase 1; MD, Molecular Dynamics; RMSd, Root Mean Square deviations; CSS, Core Similarity Score; MSS, Matrix Similarity Score; PWMs, Position Weight Matrices; GB, Generalized Born; SPC, Simple Point Charge; NVT, Number of atoms, Volume, Temperature; IA1, (INSM1) INSulinoMa Associated 1; NEUROD1, NEUROgenic Differentiation 1; NKX2-2, ΝΚ2 family of homeoproteins; PAX4, PAired boX gene 4; PARD6B, Par-6 PARtitioning Defective 6 homolog C. elegans; VCL, VinCuLin; CDC42, Cell Division Cycle 42 homolog; PARD3, Par-3 PARtitioning Defective3 homolog C. elegans; FGD4, FYVE, RhoGEF and PH Domain containing 4; RGS4, Regulator of G-protein Signaling 4; DCX, DoubleCortin; HUD (ELAVL4), Hu antigen D (Embryonic Lethal, Abnormal Vision, DrosophilaLike 4); ALDH1B1, ΑLdehyde DeHydrogenase 1 family, member B1; BHLHB2 (BHLHE40), Βasic Helix–Loop–Helix domain containing, class B, 2 (Basic Helix–Loop–Helix family, member E40); FZD2, FriZzleD homolog 2 Drosophila. ⁎ Corresponding authors. Fax: +30 210 65 97 545. E-mail addresses: [email protected] (M. Kapasa), [email protected] (D. Vlachakis), [email protected] (M. Kostadima), [email protected] (G. Sotiropoulou), [email protected] (S. Kossida). 0888-7543/$ – see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.ygeno.2012.07.002

progressively restricting the pluripotency of undifferentiated cells, through gradual determination of developmental potential, until the fully differentiated cells get committed to their terminal fate. Cellular identity is defined by the expression of cell or tissue specific genes. Therefore differentiation occurs by the exposition of the progenitor cells into extracellular signaling molecules that output instructions on the regulation of gene expression through complex regulatory pathways. In mammals, endodermal progenitor cells give birth to both exocrine and endocrine cells of pancreas. The endocrine cells are organized into the islets of Langerhans and are classified, according to the hormone that they produce, into α, β, δ, ε and pancreatic polypeptide cells. In an effort to develop cell-based therapies, experimental protocols are intended for differentiating embryonic stem cells (ESCs) into specific cell types. One of the foremost challenges of the cell therapy of diabetic individuals is to achieve the directed differentiation of cells capable of producing insulin. Towards the differentiation of ESC into β cells, as previous studies have already denoted, the NGN3 expression is crucial for the specification of the precursor cells that will be committed to become insulin producing cells [1]. This (NGN3) transcription factor induces its’ direct targets leading into the subsequent differential expression of additional transcription factors, which guide the differentiation of the β cells [1]. These factors interplay in various combinations simultaneously or not and finally the expression of the insulin gene is achieved [1]. In a reference study the NGN3 gene was selectively induced into endocrine pancreas progenitors differentiated from murine ESCs [1].

M. Kapasa et al. / Genomics 100 (2012) 212–221

Microarray gene expression experiments on the above mentioned cells revealed that, after NGN3 induction, some known targets of this factor along with other genes, that were considered to be putative NGN3 targets, altered their expression [1]. Genes selected from these microarray data were set at the focal point of extended in silico analyses [2], in order to identify the binding sites of NGN3 and further unravel the regulatory network that this factor coordinates. More precisely, the regulatory regions of genes, possibly regulated by NGN3 [2], were set for Transcription Factor Binding Sites (TFBS) prediction. Taking into account the fact that evolutionarily conserved sequences stand for experimentally identified regulatory elements, comparative genomics was applied in order to identify functionally important regions that represent TFBS. Subsequently, regulatory analyses were performed on the orthologs of the genes, considered as potential targets of NGN3, in several species including the most phylogenetically distant ones (fishes), which have pancreas [2]. The application of the above mentioned approach [2] has already led to the identification of a highly conserved regulatory region, in the orthologs of genes co-regulated by NGN3 and in those WNT5 paralogs implicated in pancreas development [2,3]. This region, for now on called “module”, was consisted of binding sites of six transcription factors (TFs) with established involvement in pancreas development, further including additional binding sites for the factor AP4 (TFAP4) (Activating enhancer-binding Protein 4 (Transcription Factor AP4)) only in the orthologs of the down-regulated by NGN3 genes WISP1 (WNT-Induced Secreted Protein 1) and CTGF (Connective Tissue Growth Factor) [2]. Given the consistency of this regulatory module discovery, which was not identified in the negative control genes [3], the presented study looked for it in the same set of orthologs of several other upor down-regulated genes upon NGN3 induction, that are presumable targets of NGN3. In order to clarify the importance of the presence of AP4 in the module and the statistical significance of the identification of the two alternative modules, the whole mouse genome was searched for TFBS for these seven factors. Thereafter, clusters of these binding sites containing the central regulatory factor (NGN3) were sought. In the sequel, statistical analysis was performed on the results of the regulatory region analyses of the entire mouse genome. Additionally, the occasional identification of the regulatory modules in the sequences of the genes, which altered their expression upon NGN3 induction, was investigated by comparing the upcoming whole genome analyses results with those of the selected from microarray data genes. Furthermore, the protein–protein interactions among the transcription factors of the regulatory modules were studied. Eventually, the genomic (cis–cis), proteomic (trans–trans) and combinations of both (cis–trans) interactions among the molecular players of β-cells’ specification were incorporated into a regulatory network that describes how through the transcriptional regulation of the analyzed genes, mainly guided by ΝGN3, the generation of cells capable of producing insulin takes place. 2. Results 2.1. Regulatory region analyses of selected from microarray data genes Taking into account the data from the aforementioned microarray experiments [1], genes that appeared to be up or down-regulated upon NGN3 induction, were selected for regulatory region analyses. The analyses included the orthologs of the genes only for those species covering the evolutionary distance from mammals to fishes. These genomic sequences were submitted for TFBS prediction (Data and methods) with the intension to verify the already attested binding sites of NGN3, for the genes that have been proved to be NGN3 regulated, and further unravel the respective ones in the genes supposed to be regulated by this factor. Namely, among these genes, IA1, NEUROD1,

213

NKX2-2 and PAX4 are known targets of NGN3 with previously documented [4] binding motifs of this factor inside their regulatory sequences, which our in silico approach managed to corroborate. Strikingly, the experimentally identified binding motifs of NGN3 as well as their relative positions in the sequences of IA1, NEUROD1, NKX2-2 and PAX4 were confirmed (Supplementary Material 1, Fig. S1). Moreover, applying a filtering procedure, extensively described in [2], binding motifs of additional TFs were predicted close to those of NGN3 in all the analyzed orthologs of these genes, constituting an evolutionary conserved cluster of TFBS. Supplementary Material 2 Fig. S2, indicatively depicts the conservation of this TFBS assemblage in the orthologs of NEUROD1. The computational regulatory element discovery through this approach had also led to the identification of the same group of TFBS in the orthologs of the genes WISP1 and CTGF, which are both down-regulated after NGN3 expression [2]. It is worth mentioning that in the group of the TFBS in the WISP1 and CTGF orthologs, binding sites for the AP4 factor were additionally found [2]. The current study revealed the presence of this set of TFBS in the same orthologs for several other genes, which according to the microarray data, are considered as potential NGN3 targets. The names of the genes that were submitted for regulatory region analyses in previous studies and in the current one as well as details concerning their biological function are indexed in Table 1. Remarkably, the presented in Table 1 genes are dissevered at two distinct groups according to their up (+) or down (−) regulation 48 h after NGN3 induction. Fig. 1 depicts the relative positions of the set of TFBS and its’ absolute distance from the transcription start site Table 1 The up and down-regulated genes after NGN3 induction that were analyzed and their biological roles. Gene

Up-regulation (+) Down-regulation (−)

Biological process

IA1 (INSM1) (INSulinoMa associated 1) NEUROD1 (NEUROgenic Differentiation 1) NKX2-2 (ΝΚ2 family of homeoproteins) PAX4 (PAired boX gene 4) CDC42 (Cell Division Cycle 42 homolog) PARD3 (Par-3 PARtitioning Defective3 homolog C. elegans) FGD4 (FYVE, RhoGEF and PH Domain containing 4) RGS4 (Regulator of G-protein Signaling 4) DCX (DoubleCortin)

+

Transcription factor

+

Transcription factor

+

Transcription factor

+ +

HUD (ELAVL4) (Hu antigen D) (Embryonic Lethal, Abnormal Vision, Drosophila- Like 4) ALDH1B1 (ΑLdehyde DeHydrogenase 1 family, member B1) WISP1 (WNT1 Inducible Signaling Pathway protein 1) CTGF (Connective Tissue Growth Factor) BHLHB2 (BHLHE40) (Βasic Helix–Loop–Helix domain containing, class B, 2) (Basic Helix–Loop–Helix family, member E40) FZD2 (FriZzleD homolog 2 Drosophila) PARD6B (Par-6 PARtitioning Defective 6 homolog C. elegans) VCL (VinCuLin)

+

Transcription factor Integrin-mediated signaling pathway Integrin-mediated signaling pathway Small GTP binding protein and regulator Small GTP binding protein and regulator Cytoskeletal/ membrane changes RNA binding processing

+

Metabolic process



Growth factor and receptor

− −

Integrin-mediated signaling pathway Metabolic process



Wnt signaling pathway



Integrin-mediated signaling pathway Integrin-mediated signaling pathway

+ + + +



214

M. Kapasa et al. / Genomics 100 (2012) 212–221

Fig. 1. The regulatory complex of TFs belonging to the families NEUR, PDX1, BRNF, HNF1, HNF6, LEFF and AP4R, presented at the murine orthologs of the analyzed genes. The relative positions of the TFBS are presented in scale and explained with the symbols at the right part of the picture. The presence of the AP4 factor in the down-regulated genes is indicated with the black cross. The margins of the conserved regulatory region are marked with the vertical black lines. The distance of the TF cluster from the TSS (white bold arrow) is counted with the ruler.

(TSS) in the murine orthologs of all the analyzed genes (Table 1). Notably, the binding motifs of 6 TFs belonging to the families: NEUR (NGN1/ 3- NeuroGeNin1/3), HNF1 (HNF1A-Hepatocyte Nuclear Factor 1), HNF6 (HNF6-Hepatocyte Nuclear Factor 6) (ONECUT1- ONE CUT homeobox 1), BRNF (BRN2/5 (POU3F2/POU6F1)- BRaiN-specific homeobox/POU domain proteins2/5 (POU class 3 homeobox 2/ POU class 6 homeobox 1)), PDX1 (PDX1-Pancreatic and Duodenal homeoboX 1 and ISL1-ISL LIM homeobox 1) and LEFF (LEF1- Lymphoid Enhancer-binding Factor 1) were found co localized in a narrow genomic region. Conjointly, AP4 factor (of AP4R family) was present only in the assembly of TFs in the orthologs of the down-regulated genes of Table 1. Despite the alternative order of the previously referred TFBS into this region and their variable distances from the TSS, their presence appeared to be conserved in all the orthologs checked for every gene included in Table 1. In the Table S3 of the Supplementary Material 3, the positions of the discovered cluster of TFBS relative to the TSS in the orthologs of all the analyzed genes are listed. Obviously, these TFs stand for a highly conserved regulatory module identified in all the orthologs, of species with pancreas, for several genes possibly implicated into the endocrine pancreas specification process.

AP4, whereas those containing the previously referred factors together with AP4 were characterized as clusters with AP4. The number of the clusters of TFs with and without AP4 that were identified at the M. musculus genome, together with the length of the whole genome's sequences in nucleotides is summarized in Table 2A. In the same Table 2A these numbers are compared with the respective ones concerning the selected analyzed genes, from the microarray data. It is worth pointing out that the modules containing AP4, in contrast to those without AP4, were rare. Indicatively, along the chromosome 11, which is the most gene rich chromosome of mouse, only 18 modules with AP4 were found. In Fig. S4, of the Supplementary Material 4, the modules residing inside the regions of known genes are indicated with bold arrows and the corresponding genes are labelled accordingly. On the other hand, the much more frequent sets of TFBS lacking AP4 along the same chromosome, were classified according to the annotation of the genomic region that they were located and the results are presented in the pie chart of the Fig. S5A of the Supplementary Material 5. Strikingly, according to the pie chart of Fig. S5A, the modules without AP4 are mapped in their majority at distal intergenic regions, while those with AP4 (Fig. S5B) are mostly located inside introns.

2.2. Whole genome analyses 2.3. Statistical analysis Every single nucleotide of the murine genome was analyzed with a code (Data and methods) developed in order to detect putative TFBS for the TFs of the families NEUR, PDX1, HNF1, HNF6, BRNF, LEFF and AP4R. In the sequel, the TFBS found in each chromosome were searched for clusters of TFBS for the factors belonging to the previously mentioned families. More precisely, two versions of another code scanned for TFBS of the under study TFs that were co-localized in genomic regions of maximum length 1000 bp (Data and methods) in every murine chromosome. Moreover, the usage of the two alternative versions of the second code, allowed the exploration of the presence of AP4 into each cluster of TFBS. Indeed, the modules containing the factors NGN3, PDX1, HNF1, HNF6, BRN4 and LEF1 were designated as clusters without

In order to ensure the statistical significance of the identification of the two variants of the regulatory module, Table 2A's results were subjected to statistical analysis. More specifically, linear correlation Table 2A The number of the identified modules with and without AP4 and the length in nucleotides of the respective analyzed sequences. Analysis

Nucleotides

Modules with AP4

Modules without AP4

Genome- wide Selected genes

2.654.911.517 810.000

668 60

86.324 180

M. Kapasa et al. / Genomics 100 (2012) 212–221 Table 2B Linear correlation of the number of modules with and without AP4 and the number of the nucleotides of the analyzed sequences. Dependent–independent variable

Prediction formula

Modules with AP4-nucleotides (N) Modules without AP4-nucleotides (N)

ΑP4 = 2.556 e−7 *Ν NOAP4 = 3.290 e

−5



P-value

Adjusted R square

0.0001

0.893

0.0001

0.968

method was applied in order to correlate the length of the analyzed sequences, in nucleotides, with the number of the clusters of TFs found within them. The analysis resulted into the predictive models, shown in Table 2B, together with the respective p-values and the adjusted R squares. The equations of Table 2B estimated the number of the clusters of TFs with and without AP4 (Dependent Variable) that was expected to be found in a sequence consisting of a specific number of nucleotides (N- Independent Variable). The relatively low values (2.556 e −7 and 3.290 e −5) of the linear correlation rates indicated that the clusters of the selected TFs were not expected more frequently, when the analyzed sequences were longer. Calculating the number of the regulatory modules, through the equations of Table 2B, that were expected to be found by chance and comparing them with those occurred by the analyses of the selected genes, revealed that both forms of the module, with and without AP4, were found more frequently in the putative targets of NGN3, than occasionally expected. Confirming that, the identification of these two distinct modules was far from random. Likely, the existence of these alternative clusters of TFs in the sequences of the selected genes is linked with their regulation by these TFs during the pancreas endocrine cells’ subtypes’ differentiation. 2.4. Investigation of protein–protein interactions through data mining from graphs By performing Ingenuity Pathway Analysis (IPA) (Data and methods) the up-to-date documented knowledge concerning the protein–protein interactions among the TFs of the regulatory modules was extracted. Specifically, setting the murine protein identities as queries, all the experimentally identified alternative or combinatorial trans-trans interactions were investigated, generating the suppositional bundle of TFs shown in Fig. 2, which also provides information about any possible activation of the interacting partners. The nodes of Fig. 2 stand for the proteins named inside them, while the lines indicate the relationships explained in the right part of the figure. Sharply, as Fig. 2 illustrates, NGN3 can directly interact with CREB Binding Protein (CREBBP) and with HNF1, which also interacts with CREBBP as well as

215

with Β-CaΤeΝiΝ (CTNNB1) [4–7]. CREBBP could also be characterized as a cross linker of the protein complex shown in Fig. 2 as it additionally interacts with the factor HNF6 [8]. Moreover, LEF1 binds Histone DeACetylase 1 (HDAC1), which further interacts with AP4 and PDX1 [9–11]. Presumably, these factors can form alternative protein complexes of the same regulatory module. 2.5. Structural study of protein–protein interactions Among the factors of the putative regulatory protein complex of Fig. 2, LEF1 and CTNNB1 were considered of major importance and their interactions with other proteins were studied thoroughly. The notability of LEF1 was arisen by the fact that LEF1 is present in the variants of the regulatory module found in both the up and the down-regulated genes. Moreover, LEF1 interacts with either HDAC1 and/or CTNNB1 by suppressing or inducing the transcription of its targets respectively [9]. On the other hand, CTNNB1 is a general TF, whose transfer into the nucleus and its’ interplay with LEF1 is controlled by the canonical WNT signaling pathway [12]. In detail, when this pathway is active the proteasome degradation of the cytosolic CTNNB1 is blocked, so that CTNNB1 is available to be transferred into the nucleus, where it binds CREBBP [12]. CREBBP is another general TF, acting at the same time as a histone acetyltransferase that plays crucial role in embryonic development via working as a transactivator of several TFs and as a regulator of chromatin acetylation state [12]. As shown in Fig. 2, CTNNB1 binds LEF1, while both of them can interact with HDAC1 [9]. In order to explore the interactions among alternative partners and examine the putative ternary complexes of these proteins, molecular docking of the solved structures or of the reliable models of these factors was performed using the ZDOCK program (Data and methods). The complexes of the CTNNB1 with either the histone acetyltransferase CREBBP or with the histone deacetylase HDAC1 are depicted in Figs. 3A and C respectively, validating previously published [9,12] interactions. When CTNNB1 is transferred into the nucleus, it facilitates the formation of the complex CREBBP/CTNNB1/LEF1 shown in Fig. 3B, a tripartite complex presumed by [9], which was thermodynamically confirmed by the herein presented in silico study. More precisely, in Figs. 3A, B and C the structural representations of the ZDOCK docking results (colored red) for the complexes of CTNNB1/CREBBP, CTNNB1/CREBBP/LEF1 and CTNNB1/HDAC1 respectively, are shown superimposed with the respective outputs of the 50 ns molecular dynamics (MD) simulations (coloured green). In order to quantify the conformational changes for the molecular complexes upon the MD simulations, each one was compared with its corresponding docking conformation, prior to MD simulations, by calculating the root mean square deviations (RMSd) between equivalent atoms. Noteworthy that large RMSd values are indicative of major conformational changes. However the Cα RMSd values for the complexes shown in Figs. 3A, B and C after the MD simulations

Fig. 2. The putative complex of the TFs found within the conserved regulatory region.

216

M. Kapasa et al. / Genomics 100 (2012) 212–221

Fig. 3. A. Superimposition of the CTNNB1/CREBBP complex as occurred by the ZDOCK docking result (red colour), with the respective output of the 50 ns molecular dynamics simulations (green color). B. Superimposition of the ZDOCK output for the complex CTNNB1/CREBBP/LEF1 (red color) with the respective complex that occurred after the 50 ns molecular dynamics simulations (green colour). C. The ZDOCK output for the complex CTNNB1/HDAC1 superimposed with the respective one obtained after the 50 ns molecular dynamics simulations.

were less than 0.28, 0.36 and 0.35 Å respectively, when compared to their equivalent docking structures. Therefore, with an average RMSd of 0.33 Å (before/after MD), it was concluded for all three complexes that their docking conformations remained stable throughout the exhaustive MD simulations. Upon the analysis of the full molecular dynamics simulations trajectory, depicted in Fig. 4, it was shown that these three complexes quickly (after ~5 ns) reached equilibrium and remained conformationally stable for the remaining simulation time (~up to 50 ns). Consequently, these complexes are characterized by extremely low energy values, ensuring the reliability of the presented 3D molecular systems. 2.6. Regulatory network construction Using the IPA platform's applications a holistic regulatory network was constructed that gave a putative explanation on how the previously described complex of TFs, could induce or suppress the expression of the analyzed genes in order to differentiate embryonic

pancreas progenitors into the β cell subtypes. More specifically, the selected genes form the microarray results were clustered as up or down-regulated ones according to their fold change expression values calculated 12, 24 and 48 h after NGN3 induction. Subsequently, the fold change values, corresponding to the three time points, and the genes’ identities, were submitted separately for the up-regulated and the down-regulated genes as distinct input datasets. Furthermore, a supplementary analysis was conducted for the fold change values, of both the up-regulated and the down-regulated genes, counted 48 h after NGN3 induction, All possible regulatory events containing the genes under question, the NGN3 and the fewest number of eligible molecules were algorithmically computed and were incorporated into different genomic circuits. These circuits were, consequently, integrated into a combinative network by including as much information as possible and lacking any redundant regulatory relationships. The final network, illustrated in Fig. 5, verified the induction of NGN3 known targets, NEUROD1 and NKX2-2, as well as the induction of NGN1. Furthermore it was shown that NGN3 together with HNF1 are necessary for the expression of PAX4 [4,13]. The proteins NGN3 and NKX2-2 are major participants of the differentiation process of pancreas progenitors into insulin producing cells through inducing the expression of insulin gene (Insulin) [14]. In addition, NEUROD1 controls the expression of ΙΑ, the protein of which in synergy with NEUROD1 and PDX1 up-regulate the expression of NGN3 [15]. At the same time, NEUROD1 increases the expression levels of PAX4, DCX and NKX2-2 [13,16]. The factor NKX2-2 binds and regulates the expression of PAX4 and further increases the activation levels of NGΝ1 and NGΝ2 proteins [14,17]. The latter one (NGN2) is crucial for the transcription of NKX2-2 as well as for the expression and activation of DCX [16,18,19]. Details on the intermediate molecules, like APP, associated with this network are presented in Table 3. APP is responsible for the up-regulation of HUD, which is another analyzed gene [20]. Moreover, upon the insulin's presence in the cells, APP protein is gathered [21] and contributes to the activation of the previously described complex of TFs, mainly through the regulation of the amount of β-catenin (CTNNB1) that is translocated into the nucleus and is available to be bound into the protein cluster of Fig. 2 [22]. The quantity of CTNNB1, that enters the nucleus, is also regulated by WNT5A [12]. When WNT5Α is expressed, it leads the cells to insulin production [23] and increases the activation levels of CDC42 [24]. Additionally, WNT5Α protein participates into the non-canonical WNT signaling pathway inhibiting the proteolysis of CTNNB1 by the canonical WNT signaling pathway and at the same time increases the concentration of Ca +2 inside the cells [12]. This last event has been linked with the capability of the cells to produce insulin [25]. The concentration of Ca +2 is also regulated by the protein encoded by FZD2, which interacts with WNT5 and must be suppressed, when Ca +2 concentration has to be kept in high levels [12]. It is noteworthy that the FZD2 down-regulation is in accordance with the VCL suppression, as these two proteins usually form a protein complex [26]. It is also worth pointing out that the inactivation of VCL has been correlated with the capability of the cells to express insulin, while the suppression of the VCL gene is affected by the retinoic acid, used at the experimental differentiation protocol of insulin producing cells [1,27]. As shown in Fig. 5, retinoic acid also induces CDC42, [28] the protein of which activates and enforces insulin secretion [29]. Moreover, through the activation of MAPK8, CDC42 leads to the acute suppression of the TNF factor [30,31]. So when CDC42 is present, TNF is unable to induce its’ targets VCL, CTGF and WISP1, which are subsequently suppressed [32–34]. Recurring to CDC42 protein, it can be characterized as a key molecule in the presented regulatory network, as it directly induces the expression of PARD3 and through intermediate molecules upregulates ALDH1B1. The increased activation of CDC42, as Fig. 5 illustrates, can be attributed to the induction of FGD4, the protein of which interacts with CDC42 and activates it, as well as the ΜΑPK8 protein [35]. Additionally, the activation of the

M. Kapasa et al. / Genomics 100 (2012) 212–221

217

Fig. 4. Molecular dynamics simulations trajectory for the complexes of Figs. 3A, B and C. Graphical representation of the energy values for the complexes of CTNNB1 with CREBBP and LEF1 (red line), CTNNB1 with CREBBP (green line) and CTNNB1 with HDAC1 (blue line).

NFKB complexes is a situation linked with insulin secreting cells [36], depends on CDC42 and is increased by F2 [37]. Finally, the activated NFKB complexes, under the guidance of F2, up-regulate ΑLDH1B1 [38]. However, the presence of AP4 in the regulatory module of the down-regulated genes (i.e. in PARD6B) activates TP53, which then suppresses the respective gene [39,40]. On the other hand, the activation of TP53 up-regulates the expression of LIF protein, leading into the

up-regulation of RGS4 [41,42]. Lastly, this network shows that LEF1 induces ΒRN2 transcription, the protein of which dimerizes with BRN4 [43,44]. The latter (BRN4) guides the cells’ glucagon production, which coexists with the insulin secretion at the primary states of differentiation of embryonic pancreas progenitors [1,2]. In conclusion, through the proper transcriptional regulation of the analyzed genes by the presented variants of the regulatory module,

Fig. 5. Regulatory network guided by NGN3, indicating the interactions among the TFs of the conserved regulatory module and the analyzed genes. The types of the presented interactions, the up or down-regulation of the analyzed genes and the intermediate molecules of the network are described with the symbols at the right part of the picture.

218

M. Kapasa et al. / Genomics 100 (2012) 212–221

Table 3 Information concerning the intermediate participants of the regulatory network shown in Fig. 5. Acronym Molecule NGN2 NGN1 APP

BRN2 NFKB TP53

F2

LIF TNF MAPK8

Biological Function

NeuroGeniN 2

Transcription regulator. Neural system development. NeuroGeniN 1 Transcription regulator. Neural system development. Amyloid beta (A4) Cell surface receptor. Precursor of a precursor protein transmembrane protein. After proteolysis turns into small peptides that induce transcription. Neural system development. (POU3F2) POU class Transcription regulator. Neural system 3 homeobox 2 development Nuclear factor kappa Protein complex. Regulation of transcription and B apoptosis. Tumor protein p53 Oncogene product. Control of cell cycle, apoptosis, DNA damage repair, cell aging and metabolism. DNA binding and induction of transcription. Regulation of neural system growth. Coagulation factor II Factor that after proteolysis becomes thrombin. (thrombin) Regulation of blood coagulation and circulation in neonates and embryos. Growth factor. Proteolysis of other proteins. Regulation of neural cell differentiation. Leukaemia Cytokine. Neural cell differentiation and inhibitory factor organogenesis of multicellular organisms. Tumor necrosis Cell cycle regulator. Regulation of cell factor proliferation, differentiation and apoptosis. Μitogen-activated Phosphorylation of specific protein-targets protein kinase 8

the directed differentiation of pancreas progenitor cells into insulin producing ones can be achieved.

3. Discussion Through the regulatory region analyses of genes known and putative targets of NGN3 [1,4] a genomic region, where particular TFs bind was identified. These factors belong to the TF families NEUR, PDX1, HNF1, HNF6, BRNF, LEFF and AP4R. Subsequently, it can be assumed that NGN3 (of NEUR family); the main regulator of endocrine pancreas cell subtypes’ specification [1], in association with the rest TFs of the cluster, constitute two variants of a regulatory module, which rules either the induction, or the suppression of the respective gene's transcription. Specifically, the group of the TFBS belonging to the 6 first TF families was found in both the up and down-regulated genes upon NGN3 induction, but only the modules of the down-regulated genes contained the factor AP4 of the last family. Consequently, it could be hypothesized that AP4 selectively binds at the cluster of the TFs NGN3PDX1-HNF1-HNF6-BRN4-LEF1 and converts this regulatory region into a silencer. It is worth mentioning that the AP4 factor has been documented to bind into an enhancer element of the genes specifically expressed in pancreas [45,46]. Likewise, all the other factors of this cluster are known to participate in the embryonic development of pancreas. Precisely, PDX1 and ISL1 regulate insulin gene expression and guide islet cell development, while HNF6 and HNF1 are both transcriptional activators of pancreas-specific genes [2]. Furthermore, the factors BRN2, BRN3, BRN4, and BRN5, classified into the BRNF family, are known regulators of gene expression that rules the mammalian embryonic development. Among the BRNF members, BRN4 must be primarily concerned with the activation of the glucagon gene [2]. Additionally, the only representative of the LEFF TF family, LEF1, participates into the canonical WNT signaling pathway, which is implicated into the pancreas organogenesis [12].

Yet, through the statistical analysis of the results occurring from the whole mouse genome analyses it was proven that the two types of clusters of TFBS, with and without AP4, were not randomly identified inside the regulatory regions of the selected analyzed genes. Conjointly, the fact that this region was found conserved for every gene analyzed, even at the most phylogenetically distant species that have pancreas, indicates its’ functionality in the regulation of expression of the analyzed genes during the morphogenesis of pancreas. Indeed, functional regulatory elements are usually highly conserved genomic regions, as the selective pressure acts on them and retains through evolution the binding motifs of the TFs with major importance for the regulation of the respective gene. Noticing that this region was identified far beyond the TSS for most of the analyzed genes (Supplementary Material 3, Table S3), it is estimated that, it acts as a distant regulatory element. Strikingly, the selective presence of the AP4 factor only in the group of TFs that was identified in the down-regulated genes, makes AP4 a focal molecule that inverts the modules’ functionality from an enhancer into a silencer one. Besides, the plausible protein bundle of Fig. 2, that transpired though the integration of all known protein–protein interactions, signifies that the TFs bound into the conserved genomic region, interact also with HDAC1, CREBBP and CTNNB1, synchronously or not in various combinations. It seems that the factors NGN3, PDX1, HNF1, HNF6, BRN4, AP4 and LEF1 stand for the core of different clusters that contain either the histone deacetylase HDAC1 or the histone acetylase CREBBP. Hence, the resulting alternative protein complexes suppress or induce the transcription of the corresponding genes respectively. Additionally, the mediator's CTNNB1 and the HDAC1's presence into these complexes are controlled by LEF1, while AP4 also recruits HDAC1 into the module. In detail, combining the information provided in Figs. 2, 3A and C it can be assumed that the putative regulatory module is consisted of the general TF CTNNB1, which can interact with both the CREBBP and the HDAC1. These results are in accordance with previous studies showing that when CTNNB1 is bound to LEF1 it induces the transcription of LEF1 dependent genes and disturbs the binding of LEF1 to HDAC1 [9]. Concerning HDAC1 it seems that it combines various roles beyond working as a histone deacetylase, as it also acts as a crosslinker of the factors PDX1 and AP4 with the rest TFs of the complex of Fig. 2 [10,11]. In conclusion, these findings confirm the indications that HDAC1 decreases the ability of LEF1 to function as a transcriptional activator [9]. Moreover, Fig. 3B depicts the complex of CTNNB1 with LEF1 and CREBBP, supporting the hypothesis [9] that an acytyltranferase is also present in the complex of CTNNB1 with LEF1. Therefore, it could be presumed that with this trans regulatory elements combination, a looser chromatin state is achieved via increasing the local chromatin acetylation levels; a situation generally linked with the induction of transcription. Based on the fact that AP4 was found only into the suppressed genes, it could be characterized as a co-repressor, attracting the deacetylase HDAC1 into the protein complex and leading into more compact chromatin structures and finally into the transcriptional repression of the genes. The latter conclusion confirms the assumption that an unknown co-repressor is also present in the complex of LEF1 with HDAC1 [9]. Through the induction or the suppression of the analyzed genes the embryonic pancreas progenitors were differentiated towards the direction of insulin producing cells. Indeed, focusing on the regulatory network of Fig. 5 it is worth noting that the down-regulation of BHLHB2, known to be retinoic acid dependent, is of major importance for the differentiation of pancreatic cells. In detail, BHLHB2 is expressed at the initial phases of differentiation of the endocrine pancreas progenitors, after treatment with the retinoic acid [47]. Additionally, the transcriptional regulation of CTGF is crucial for the formation of the pancreatic islets and for the determination of the endocrine pancreas cell subtypes in the right proportion. Indeed, CTGF is keenly expressed at the epithelium of the pancreatic buds and its’ expression falls at lower levels

M. Kapasa et al. / Genomics 100 (2012) 212–221

during the generation of the insulin producing cells, while is found totally suppressed after birth in the already differentiated β-cells of pancreas. It seems that CTGF controls the β cells’ subtypes’ proliferation during their differentiation process, as in knock-out, for this gene, experimental models, lower than normal proliferation rates of the cells have been documented [48]. On the other hand, concerning the up-regulated genes analyzed here, unpublished data indicated their implication into pancreas morphogenesis and in β cells’ differentiation. In detail, increased expression of ALDH1B1 has been found during the differentiation protocol of pancreas progenitors with selective induction of NGN3, as well as in the developing early pancreatic buds of mice embryos. The expression of ALDH1B1 depends on PDX1 and has been marked in progenitor cell populations of both the endocrine and the exocrine pancreas [manuscript under preparation]. Even more, CDC42 has been proved to determine the cell fate of embryonic stem cells in the early embryo, via controlling the suitable and sufficient microenvironment for the proper differentiation of pancreas progenitors. Moreover, CDC42 ensures the pancreas tubulogenesis initially through the creation of microlumen and then through the maintenance of the apical cell polarity [49]. Additionally, RGS4 has recently been proved to be regulated by NGN3 in endocrine progenitor cells of zebrafish (Danio rerio) and mice, as well as in pancreatic epithelial cells of murine embryos. Concerning this, loss of RGS4 leads to the deregulation of endocrine cell migration in early embryos of zebrafish and mice, resulting into the disturbance of the proper formation of pancreatic islets [50]. At the primary differentiation stages cells maintain their capability to secrete both insulin and glucagon [1,2], but as the differentiation process proceeds the insulin production predominates over glucagon production. The genes that are controlled by the identified variants of the regulatory module incite a set of intracellular changes, mainly guided by NGN3, where the antagonism between the canonical and non-canonical WNT signaling pathways is crucial. It is noteworthy that, LEF1 is both the core factor of the alternative regulatory complexes of TFs presented here and the main participant of the canonical WNT signaling pathway. Summarizing, LEF1 that was identified in the conserved regulatory module of all the analyzed genes forms alternative complexes with the previously mentioned TFs and through the antagonism of CREBBP with HDAC1 the local acetylation levels of the chromatin abet or prevent transcription. The selective suppression of transcription by the recruitment of HDAC1 into the protein cluster could be possibly attributed to the AP4 factor. This factor by residing its’ binding motifs exclusively inside the regulatory regions of the suppressed genes and by interacting with LEF1 and HDAC1, may assist the HDAC1 recruitment to the protein complex. Moreover, the 50 ns MD simulations revealed that all three complexes presented in Figs. 3(A,B,C) soon reached a conformational equilibrium, even though they entered the MD simulations with variable starting energies (Fig. 4). Taken together these observations illustrated the viability of the docking results of the ZDOCK experiment for all three complexes. Furthermore, the low RMSd values indicated that these complexes remained conformationally close to their initial docking poses upon the minimization and the MD simulation courses that followed, reflecting the structural reliability of the adapted docking interaction patterns. Conclusively, the presented study revealed details concerning the regulatory circuits that enable the pancreas progenitors to become insulin producing cells. Promising TFBS, in genes with main participation into this differentiation process, occurred, calling for experimental validation. Furthermore, hypothetical protein complexes came into being awaiting for biochemical affirmation or molecular dynamics simulation endorsement when the suitable crystal structures become available. Besides, the genome-wide distribution of the two types of the discovered regulatory module and the annotation of the respective residence regions may unfold new perspectives and provide insights about these regulatory elements.

219

4. Data and methods 4.1. Regulatory region analyses of selected from microarray data genes Gene expression profiling of embryonic pancreas progenitors, in which NGN3 had been induced, indicated a number of genes whose expression increased or decreased 12–48 h after NGN3 induction. These genes, known and putative targets of NGN3, were selected for regulatory region analyses, aiming at the verification of the experimentally identified binding sites of NGN3 in its’ known targets and at the revelation of the predicted ones in the rest of the genes. In order to enforce the robustness of the method used, for each selected gene ten orthologs of species that have pancreas were analyzed. Therewithal, the orthologs from organisms, normally distributed across the phylogenetic distance from mammals to fishes were identified with the Reciprocal Best Blast Hit method, using as reference sequences the respective murine proteins, as extracted from Ensembl database (http://www.ensembl.org/ index.html). The genomic regions for each one of the analyzed genes, extending 4500 bp upstream to 500 bp downstream of the TSS were also obtained by Ensembl database (http://www.ensembl.org/index. html) for the species shown in Supplementary Material 2, Fig. S2. The retrieved regulatory sequences were submitted to the MatInspector platform in Genomatix Database (http://www.genomatix.de/en/index. html) in order to identify putative TFBS for all the TFs of Matrix Library 7.1. The TFBS were primarily predicted setting as cut- off parameters the core similarity score (css) at 0.75, and selecting the “optimized” matrix similarity score (mss). In an effort to increase the results’ reliability, only the TFBS common to all the orthologs for every gene checked were accepted. Subsequently, the experimentally verified TFBS for NGN3 in the genes known targets were considered as references and their mss and css values were set as cut- off parameters for the reliable identification of the TFBS for the NEUR TF family (where NGN3 belongs). Setting these sites as mandatory elements, a detailed search for regions of maximum length 1000 bp, which also contained TFBS for additional factors, was performed. Towards this direction and based at an additional filtering method applied in a previous work [2], clusters of phylogenetically conserved TFBS for factors belonging to the families NEUR, PDX1, HNF1, HNF6, BRNF, LEFF and AP4R were identified.

4.2. Whole genome analyses Certain limitations of MatInspector, like the maximum number of matches that can be printed to output, led to the implementation of an in-house tool for the identification of putative TFBS in all the DNA sequences of an entire genome. Therefore, a tool called “TFBS_Identifier” was built in C in order to perform the regulatory analyses of the whole mouse genome sequences for the TF families of interest. The identification of putative TFBS was performed using available Position Weight Matrices (PWMs) and computing css and mss, between the PWM and the query sequence. More information about the algorithm can be found at MatInspector Publications (http://www.genomatix.de/en/ index.html). “TFBS_Identifier” was constructed to take as input the query sequence, the PWM and a threshold for each of the css and mss. The query sequences were those belonging to each of the 21 chromosomes of the murine genome (INSDC Genome Collections Accession GCF_000001635.18) corresponding to coding and non‐coding or intergenic regions that were extracted from the Ensembl Database (ftp://ftp.ensembl.org/pub/current_fasta/mus_musculus/dna). The PWM for the TF families NEUR, PDX1, HNF1, HNF6, BRNF, LEFF and AP4R were retrieved from the MatBase Database of the Genomatix Suite (http://www.genomatix.de/en/index.html). The cutoffs used for the css and mss were identical with those used at the analyses of the selected from microarray data genes. Specifically, for NEUR family the cutoff value of mss was set at 0,92, as this value verified the experimentally identified binding sites of NGN3.

220

M. Kapasa et al. / Genomics 100 (2012) 212–221

The output files of the “TFBS_Identifier” application were processed with another code named “Cluster_Detector”. This code, written in C++ read the TFBS files and, setting binding sites of NEUR TF family with mss greater than 0.92 as anchors, searched 500 bp both upstream and downstream of these points for TFBS of the previously mentioned TF families. Two alternative versions of “Cluster_Detector” were executed at Linux environment and lead into the identification of clusters of TFBS, representing the two variants of the module (with or without the AP4 factor). 4.3. Statistical analysis The Pearson correlation was applied for the statistical analysis of the two variants of the module that were identified in the selected from microarray data genes as well as in the sequences comprising the mouse genome, either fitting genes or intergenic regions. The analysis was performed with the SPSS 16.0 package considering a significance level at p b 0.05. Indeed, linear correlation was used to describe the relationship between the number of the two alternative types of the regulatory module and the number of the nucleotides of the analyzed sequences. It is noteworthy that, the total length of the analyzed genes’ sequences was calculated by multiplying the number of the selected genes (Table 1) with the number of the orthologs (10) checked for each one and with 4500 bp counting for each analyzed gene's sequence. It is also worth noting that the orthologs of the previously analyzed genes B-ACTIN (ACTB) and B-GLOBIN (HBB), which used as negative controls [3], were excluded from this calculation. On the other hand, the information concerning the length in nucleotides of the murine genome was provided by Ensembl Database (http://www.ensembl.org/index.html). 4.4. Regulatory network construction The putative complex of TFs and the regulatory network presented in Figs. 2 and 5 respectively were constructed through the applications of the Ingenuity Pathway Analysis (IPA) (http://www.ingenuity.com/). More precisely, the regulatory bundle of Fig. 2 was constructed by retrieving the documented knowledge concerning the protein–protein interactions among the TFs of the conserved regulatory module setting as queries the protein identities and searching for a suppositional assemblage containing all of them. On the other hand, the fold changes of expression of the analyzed genes, retrieved at 12, 24 and 48 h after NGN3 induction [1] were set as disparate input datasets in the IPA Platform and distinct analyses were performed for the up-regulated and the down-regulated genes. Moreover, a complementary analysis was also performed, setting as query the expression fold changes of both the induced and the suppressed genes, 48 h after NGN3 induction. The results of these analyses were combined into a final network depicted in Fig. 5, where all the regulatory events concerning the under question genes were embedded into the presented bunch of edges and lines, while the redundant ones were trimmed. For both approaches leading to Figs. 2 and 5, up to 3 eligible molecules were allowed to intervene among the query molecular players. It is noteworthy that, 3 intermediate molecules reflect a chance of 10 −3 of the respective edge to have occurred by chance. 4.5. Structural study of protein–protein interactions The structural study of trans–trans interactions was performed after retrieving the solved structures for the TFs of interest from Protein Data Bank (PDB) (http://www.pdb.org/pdb/home/home.do). Additionally, models through comparative modelling were constructed for the factors lacking solved structures, based at experimentally verified structures of proteins with which greater than 30% sequence similarity was shared. Comparative modelling was conducted with SWISS MODEL (http://swissmodel.expasy.org/) and the models were subsequently evaluated with ANOLEA (http://protein.bio.puc.cl/cardex/servers/

anolea/index.html). The experimentally identified structures and the constructed models were used for molecular docking with ZDOCK version 3.0, while RDOCK was utilized to minimize the ZDOCK molecular complex outputs and rank them according to their re-calculated binding free energies. Protein complexes were further subjected to an extensive energy minimization run using the Amber99 forcefield as it was implemented into the Gromacs, version 4.5.5, via a previously developed graphical interface for it. An implicit Generalized Born (GB) solvation was chosen at this stage, in an attempt to speed up the energy minimization process. In order to explore further the interaction space and binding potential of each docking conformation, the molecular complexes were subjected to unrestrained molecular dynamics simulations using the Gromacs suite, version 4.5.5. Molecular dynamics took place in a periodic environment, which was subsequently solvated with Simple Point Charge (SPC) water using the truncated octahedron box, extending to 7 Å from each molecule. Partial charges were applied and the molecular systems were neutralized with counter-ions as required. The temperature was set to 300 K, the pressure at 1 atm and the step size was set to 2 fs. The total run of each molecular complex was fifty nanoseconds (50 ns), using the Number of atoms, the Volume and the Temperature (NVT), that remain constant throughout the calculation, in a canonical environment. The results of the MD simulations were collected into a molecular trajectory database in order to be further analyzed. Principal component analysis was done using Pymol (DeLano, W. L. The PyMoL Molecular Graphics System (2002), http://www.pymol.org) and the Ca atom root-mean-square function of Deep-View. Analyses of the MD outputs and trajectories were therefore focused on structural deviations of each molecular system from its original docking conformation. The molecular systems were neither restrained nor constrained throughout the simulation, with the exception of the DNA coordinates for the complex shown in Fig. 3B, which were fixed in the molecular dynamics periodic three dimensional space. The MD final conformations were initially evaluated with a residue packing quality function built-in the Gromacs suite, depending on the number of buried non-polar side chain groups and on hydrogen bonding. Moreover, the suites Procheck and Verify3D were employed to evaluate the structural viability of each protein complex upon the MD simulations. Illustrations of the molecular systems were rendered with the aid of the Chimera suite. Supplementary data to this article can be found online at http:// dx.doi.org/10.1016/j.ygeno.2012.07.002. Acknowledgments We are grateful to D.D.S. M. Kalarrytis for thoroughly revising the final version of the manuscript. We also thank Dr. A. Kastania for kindly helping MK with the statistical analyses. References [1] I. Serafimidis, I. Rakatzi, V. Episkopou, M. Gouti, A. Gavalas, Novel effectors of directed and Ngn3-mediated differentiation of mouse embryonic stem cells into endocrine pancreas progenitors, Stem Cells 26 (2008) 3–16. [2] M. Kapasa, I. Serafimidis, A. Gavalas, S. Kossida, Identification of phylogenetically conserved enhancer elements implicated in pancreas development in the WISP1 and CTGF orthologs, Genomics 92 (2008) 301–308. [3] M. Kapasa, S. Arhondakis, S. Kossida, Phylogenetic and regulatory region analysis of Wnt5 genes reveals conservation of a regulatory module with putative implication in pancreas development, Biol. Direct 5 (2010) 49. [4] G. Mellitzer, S. Bonne, R.F. Luco, M. Van De Casteele, N. Lenne-Samuel, P. Collombat, A. Mansouri, J. Lee, M. Lan, D. Pipeleers, F.C. Nielsen, J. Ferrer, G. Gradwohl, H. Heimberg, IA1 is NGN3-dependent and essential for differentiation of the endocrine pancreas, EMBO J. 25 (2006) 1344–1352. [5] A.B. Vojtek, J. Taylor, S.L. DeRuiter, J.Y. Yu, C. Figueroa, R.P. Kwok, D.L. Turner, Akt regulates basic helix–loop–helix transcription factor-coactivator complex formation and activity during neuronal differentiation, Mol. Cell. Biol. 23 (2003) 4417–4427. [6] E. Soutoglou, G. Papafotiou, N. Katrakili, I. Talianidis, Transcriptional activation by hepatocyte nuclear factor-1 requires synergism between multiple coactivator proteins, J. Biol. Chem. 275 (2000) 12515–12520. [7] C. Fonte, J. Grenier, A. Trousson, A. Chauchereau, O. Lahuna, E.E. Baulieu, M. Schumacher, C. Massaad, Involvement of {beta}-catenin and unusual behavior

M. Kapasa et al. / Genomics 100 (2012) 212–221

[8]

[9]

[10] [11]

[12] [13]

[14]

[15]

[16]

[17]

[18]

[19] [20]

[21]

[22] [23]

[24]

[25]

[26]

[27]

[28] [29]

of CBP and p300 in glucocorticosteroid signaling in Schwann cells, Proc. Natl. Acad. Sci. U. S. A. 102 (2005) 14260–14265. F.M. Rausa, Y. Tan, R.H. Costa, Association between hepatocyte nuclear factor 6 (HNF-6) and FoxA2 DNA binding domains stimulates FoxA2 transcriptional activity but inhibits HNF-6 DNA binding, Mol. Cell. Biol. 23 (2003) 437–449. A.N. Billin, H. Thirlwell, D.E. Ayer, Beta-catenin-histone deacetylase interactions regulate the transition of LEF1 from a transcriptional repressor to an activator, Mol. Cell. Biol. 20 (2000) 6882–6890. K. Imai, T. Okamoto, Transcriptional repression of human immunodeficiency virus type 1 by AP-4, J. Biol. Chem. 281 (2006) 12495–12505. A.L. Mosley, S. Ozcan, The pancreatic duodenal homeobox-1 protein (Pdx-1) interacts with histone deacetylases Hdac-1 and Hdac-2 on low levels of glucose, J. Biol. Chem. 279 (2004) 54241–54247. H.J. Welters, R.N. Kulkarni, Wnt signaling: relevance to beta-cell biology and diabetes, Trends Endocrinol. Metab. 19 (2008) 349–355. Y. Heremans, M. Van De Casteele, P. in't Veld, G. Gradwohl, P. Serup, O. Madsen, D. Pipeleers, H. Heimberg, Recapitulation of embryonic neuroendocrine differentiation in adult human pancreatic duct cells expressing neurogenin 3, J. Cell Biol. 159 (2002) 303–312. M.A. Cissell, L. Zhao, L. Sussel, E. Henderson, R. Stein, Transcription factor occupancy of the insulin gene in vivo. Evidence for direct regulation by Nkx2.2, J. Biol. Chem. 278 (2003) 751–756. T. Zhang, H. Wang, N.A. Saunee, M.B. Breslin, M.S. Lan, Insulinoma-associated antigen-1 zinc-finger transcription factor promotes pancreatic duct cell trans-differentiation, Endocrinology 151 (2010) 2030–2039. W. Ge, F. He, K.J. Kim, B. Blanchi, V. Coskun, L. Nguyen, X. Wu, J. Zhao, J.I. Heng, K. Martinowich, J. Tao, H. Wu, D. Castro, M.M. Sobeih, G. Corfas, J.G. Gleeson, M.E. Greenberg, F. Guillemot, Y.E. Sun, Coupling of cell migration with neurogenesis by proneural bHLH factors, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 1319–1324. M. Sugimori, M. Nagao, N. Bertrand, C.M. Parras, F. Guillemot, M. Nakafuku, Combinatorial actions of patterning and HLH transcription factors in the spatiotemporal control of neurogenesis and gliogenesis in the developing spinal cord, Development 134 (2007) 1617–1629. R. Scardigli, C. Schuurmans, G. Gradwohl, F. Guillemot, Crossregulation between Neurogenin2 and pathways specifying neuronal identity in the spinal cord, Neuron 31 (2001) 203–217. R. Ayala, T. Shu, L.H. Tsai, Trekking across the brain: the journey of neuronal migration, Cell 128 (2007) 29–43. A. Parachikova, C.W. Cotman, Reduced CXCL12/CXCR4 results in impaired learning and is downregulated in a mouse model of Alzheimer disease, Neurobiol. Dis. 28 (2007) 143–153. L. Gasparini, G.K. Gouras, R. Wang, R.S. Gross, M.F. Beal, P. Greengard, H. Xu, Stimulation of beta-amyloid precursor protein trafficking by insulin reduces intraneuronal beta-amyloid and requires mitogen-activated protein kinase signaling, J. Neurosci. 21 (2001) 2561–2570. Y. Chen, A.M. Bodles, Amyloid precursor protein modulates beta-catenin degradation, J. Neuroinflammation 4 (2007) 29. T. Fujino, H. Asaba, M.J. Kang, Y. Ikeda, H. Sone, S. Takada, D.H. Kim, R.X. Ioka, M. Ono, H. Tomoyori, M. Okubo, T. Murase, A. Kamataki, J. Yamamoto, K. Magoori, S. Takahashi, Y. Miyamoto, H. Oishi, M. Nose, M. Okazaki, S. Usui, K. Imaizumi, M. Yanagisawa, J. Sakai, T.T. Yamamoto, Low-density lipoprotein receptor-related protein 5 (LRP5) is essential for normal cholesterol metabolism and glucose-induced insulin secretion, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 229–234. J. Dejmek, A. Safholm, C. Kamp Nielsen, T. Andersson, K. Leandersson, Wnt5a/Ca2+‐induced NFAT activity is counteracted by Wnt-5a/Yes-Cdc42casein kinase 1alpha signaling in human mammary epithelial cells, Mol. Cell. Biol. 26 (2006) 6024–6036. J.T. Deeney, J. Gromada, M. Hoy, H.L. Olsen, C.J. Rhodes, M. Prentki, P.O. Berggren, B.E. Corkey, Acute stimulation with long chain acyl-CoA enhances exocytosis in insulin-secreting cells (HIT T-15 and NMRI beta-cells), J. Biol. Chem. 275 (2000) 9363–9368. O. Wiggan, P.A. Hamel, Pax3 regulates morphogenetic cell behavior in vitro coincident with activation of a PCP/non-canonical Wnt-signaling cascade, J. Cell Sci. 115 (2002) 531–541. M. Kruger, I. Kratchmarova, B. Blagoev, Y.H. Tseng, C.R. Kahn, M. Mann, Dissection of the insulin signaling pathway via quantitative phosphoproteomics, Proc. Natl. Acad. Sci. U. S. A. 105 (2008) 2451–2456. H.J. Kim, R. Lotan, Identification of retinoid-modulated proteins in squamous carcinoma cells using high-throughput immunoblotting, Cancer Res. 64 (2004) 2439–2448. S. Daniel, M. Noda, R.A. Cerione, G.W. Sharp, A link between Cdc42 and syntaxin is involved in mastoparan-stimulated insulin release, Biochemistry 41 (2002) 9663–9671.

221

[30] C. Zheng, J. Xiang, T. Hunter, A. Lin, The JNKK2-JNK1 fusion protein acts as a constitutively active c-Jun kinase that stimulates c-Jun transcription activity, J. Biol. Chem. 274 (1999) 28966–28971. [31] T. Minamino, T. Yujiri, P.J. Papst, E.D. Chan, G.L. Johnson, N. Terada, MEKK1 suppresses oxidative stress-induced apoptosis of embryonic stem cell-derived cardiac myocytes, Proc. Natl. Acad. Sci. U. S. A. 96 (1999) 15127–15132. [32] T. Banno, A. Gazel, M. Blumenberg, Effects of tumor necrosis factor-alpha (TNF alpha) in epidermal keratinocytes revealed using global transcriptional profiling, J. Biol. Chem. 279 (2004) 32633–32642. [33] K. Venkatachalam, B. Venkatesan, A.J. Valente, P.C. Melby, S. Nandish, J.E. Reusch, R.A. Clark, B. Chandrasekar, WISP1, a pro-mitogenic, pro-survival factor, mediates tumor necrosis factor-alpha (TNF-alpha)-stimulated cardiac fibroblast proliferation but inhibits TNF-alpha-induced cardiomyocyte death, J. Biol. Chem. 284 (2009) 14414–14427. [34] L.A. Cooker, D. Peterson, J. Rambow, M.L. Riser, R.E. Riser, F. Najmabadi, D. Brigstock, B.L. Riser, TNF-alpha, but not IFN-gamma, regulates CCN2 (CTGF), collagen type I, and proliferation in mesangial cells: possible roles in the progression of renal fibrosis, Am. J. Physiol. Renal Physiol. 293 (2007) F157–F165. [35] M. Umikawa, H. Obaishi, H. Nakanishi, K. Satoh-Horikawa, K. Takahashi, I. Hotta, Y. Matsuura, Y. Takai, Association of frabin with the actin cytoskeleton is essential for microspike formation through activation of Cdc42 small G protein, J. Biol. Chem. 274 (1999) 25197–25200. [36] F. Bertrand, C. Desbois-Mouthon, A. Cadoret, C. Prunier, H. Robin, J. Capeau, A. Atfi, G. Cherqui, Insulin antiapoptotic signaling involves insulin activation of the nuclear factor kappaB-dependent survival genes encoding tumor necrosis factor receptor-associated factor 2 and manganese-superoxide dismutase, J. Biol. Chem. 274 (1999) 30596–30602. [37] M.S. Cammarano, A. Minden, Dbl and the Rho GTPases activate NF kappa B by I kappa B kinase (IKK)-dependent and IKK-independent pathways, J. Biol. Chem. 276 (2001) 25876–25882. [38] T. Minami, M. Miura, W.C. Aird, T. Kodama, Thrombin-induced autoinhibitory factor, Down syndrome critical region-1, attenuates NFAT-dependent vascular cell adhesion molecule-1 expression and inflammation in the endothelium, J. Biol. Chem. 281 (2006) 20503–20520. [39] Q. Huang, A. Raya, P. DeJesus, S.H. Chao, K.C. Quon, J.S. Caldwell, S.K. Chanda, J.C. Izpisua-Belmonte, P.G. Schultz, Identification of p53 regulators by genome-wide functional analysis, Proc. Natl. Acad. Sci. U. S. A. 101 (2004) 3456–3461. [40] T. Liu, C. Laurell, G. Selivanova, J. Lundeberg, P. Nilsson, K.G. Wiman, Hypoxia induces p53-dependent transactivation and Fas/CD95-dependent apoptosis, Cell Death Differ. 14 (2007) 411–421. [41] S.A. Amundson, K.T. Do, L. Vinikoor, C.A. Koch-Paiz, M.L. Bittner, J.M. Trent, P. Meltzer, A.J. Fornace Jr., Stress-specific signatures: expression profiling of p53 wild-type and -null human cells, Oncogene 24 (2005) 4572–4579. [42] R.A. Abbud, R. Kelleher, S. Melmed, Cell-specific pituitary gene expression profiles after treatment with leukemia inhibitory factor reveal novel modulators for proopiomelanocortin expression, Endocrinology 145 (2004) 867–880. [43] J. Goodall, S. Martinozzi, T.J. Dexter, D. Champeval, S. Carreira, L. Larue, C.R. Goding, Brn-2 expression controls melanoma proliferation and is directly regulated by beta-catenin, Mol. Cell. Biol. 24 (2004) 2915–2922. [44] K.F. Malik, H. Jaffe, J. Brady, W.S. Young III, The class III POU factor Brn-4 interacts with other class III POU factors and the heterogeneous nuclear ribonucleoprotein U, Brain Res. Mol. Brain Res. 45 (1997) 99–107. [45] C. Nelson, L.P. Shen, A. Meister, E. Fodor, W.J. Rutter, Pan: a transcriptional regulator that binds chymotrypsin, insulin, and AP-4 enhancer motifs, Genes Dev. 4 (1990) 1035–1043. [46] A.M. Boulet, C.R. Erwin, W.J. Rutter, Cell-specific enhancers in the rat exocrine pancreas, Proc. Natl. Acad. Sci. U. S. A. 83 (1986) 3599–3603. [47] M. Ostrom, K.A. Loffler, S. Edfalk, L. Selander, U. Dahl, C. Ricordi, J. Jeon, M. Correa-Medina, J. Diez, H. Edlund, Retinoic acid promotes the generation of pancreatic endocrine progenitor cells and their further differentiation into beta-cells, PLoS One 3 (2008) e2841. [48] L.A. Crawford, M.A. Guney, Y.A. Oh, R.A. Deyoung, D.M. Valenzuela, A.J. Murphy, G.D. Yancopoulos, K.M. Lyons, D.R. Brigstock, A. Economides, M. Gannon, Connective tissue growth factor (CTGF) inactivation leads to defects in islet cell lineage allocation and beta-cell proliferation during embryogenesis, Mol. Endocrinol. 23 (2009) 324–336. [49] G. Kesavan, F.W. Sand, T.U. Greiner, J.K. Johansson, S. Kobberup, X. Wu, C. Brakebusch, H. Semb, Cdc42-mediated tubulogenesis controls cell specification, Cell 139 (2009) 791–801. [50] I. Serafimidis, S. Heximer, D. Beis, A. Gavalas, G protein-coupled receptor signaling and sphingosine-1-phosphate play a phylogenetically conserved role in endocrine pancreas morphogenesis, Mol. Cell. Biol. 31 (2011) 4442–4453.