29339415	 Escherichia coli K1 strains are major causative agents of invasive disease of newborn infants. The age dependency of infection can be reproduced in neonatal rats. Colonization of the small intestine following oral administration of K1 bacteria leads rapidly to invasion of the blood circulation; bacteria that avoid capture by the mesenteric lymphatic system and evade antibacterial mechanisms in the blood may disseminate to cause organ-specific infections such as meningitis. Some E. coli K1 surface constituents, in particular the polysialic acid capsule, are known to contribute to invasive potential, but a comprehensive picture of the factors that determine the fully virulent phenotype has not emerged so far. We constructed a library and constituent sublibraries of ∼775,000 Tn5 transposon mutants of E. coli K1 strain A192PP and employed transposon-directed insertion site sequencing (TraDIS) to identify genes required for fitness for infection of 2-day-old rats. Transposon insertions were lacking in 357 genes following recovery on selective agar; these genes were considered essential for growth in nutrient-replete medium. Colonization of the midsection of the small intestine was facilitated by 167 E. coli K1 gene products. Restricted bacterial translocation across epithelial barriers precluded TraDIS analysis of gut-to-blood and blood-to-brain transits; 97 genes were required for survival in human serum. This study revealed that a large number of bacterial genes, many of which were not previously associated with systemic E. coli K1 infection, are required to realize full invasive potential. IMPORTANCE: Escherichia coli K1 strains cause life-threatening infections in newborn infants. They are acquired from the mother at birth and colonize the small intestine, from where they invade the blood and central nervous system.
It is difficult to obtain information from acutely ill patients that sheds light on physiological and bacterial factors determining invasive disease. Key aspects of naturally occurring age-dependent human infection can be reproduced in neonatal rats. Here, we employ transposon-directed insertion site sequencing to identify genes essential for the in vitro growth of E. coli K1 and genes that contribute to the colonization of susceptible rats. The presence of bottlenecks to invasion of the blood and cerebrospinal compartments precluded insertion site sequencing analysis, but we identified genes for survival in serum.
29091192	 Objectives: Polymyxins remain one of the last-resort drugs to treat infections caused by MDR Gram-negative pathogens. Here, we determined the mechanisms by which chromosomally encoded resistance to colistin and polymyxin B can arise in the MDR uropathogenic Escherichia coli ST131 reference strain EC958. Methods: Two complementary approaches, saturated transposon mutagenesis and spontaneous mutation induction with high concentrations of colistin and polymyxin B, were employed to select for mutations associated with resistance to polymyxins. Mutants were identified using transposon-directed insertion-site sequencing or Illumina WGS. A resistance phenotype was confirmed by MIC and further investigated using RT-PCR. Competitive growth assays were used to measure fitness cost. Results: A transposon insertion at nucleotide 41 of the pmrB gene (EC958pmrB41-Tn5) enhanced its transcript level, resulting in a 64- and 32-fold increased MIC of colistin and polymyxin B, respectively. Three spontaneous mutations, also located within the pmrB gene, conferred resistance to both colistin and polymyxin B with a corresponding increase in transcription of the pmrCAB genes. All three mutations incurred a fitness cost in the absence of colistin and polymyxin B. Conclusions: This study identified the pmrB gene as the main chromosomal target for induction of colistin and polymyxin B resistance in E. coli.
29066548	 Uropathogenic Escherichia coli (UPEC) is a major cause of urinary tract and bloodstream infections and possesses an array of virulence factors for colonization, survival, and persistence. One such factor is the polysaccharide K capsule. Among the different K capsule types, the K1 serotype is strongly associated with UPEC infection. In this study, we completely sequenced the K1 UPEC urosepsis strain PA45B and employed a novel combination of a lytic K1 capsule-specific phage, saturated Tn5 transposon mutagenesis, and high-throughput transposon-directed insertion site sequencing (TraDIS) to identify the complement of genes required for capsule production. Our analysis identified known genes involved in capsule biosynthesis, as well as two additional regulatory genes (mprA and lrhA) that we characterized at the molecular level. Mutation of mprA resulted in protection against K1 phage-mediated killing, a phenotype restored by complementation. We also identified a significantly increased unidirectional Tn5 insertion frequency upstream of the lrhA gene and showed that strong expression of LrhA induced by a constitutive Pcl promoter led to loss of capsule production. Further analysis revealed loss of MprA or overexpression of LrhA affected the transcription of capsule biosynthesis genes in PA45B and increased sensitivity to killing in whole blood. Similar phenotypes were also observed in UPEC strains UTI89 (K1) and CFT073 (K2), demonstrating that the effects were neither strain nor capsule type specific. Overall, this study defined the genome of a UPEC urosepsis isolate and identified and characterized two new regulatory factors that affect UPEC capsule production. IMPORTANCE: Urinary tract infections (UTIs) are among the most common bacterial infections in humans and are primarily caused by uropathogenic Escherichia coli (UPEC).
Many UPEC strains express a polysaccharide K capsule that provides protection against host innate immune factors and contributes to survival and persistence during infection. The K1 serotype is one example of a polysaccharide capsule type and is strongly associated with UPEC strains that cause UTIs, bloodstream infections, and meningitis. The number of UTIs caused by antibiotic-resistant UPEC is steadily increasing, highlighting the need to better understand factors (e.g., the capsule) that contribute to UPEC pathogenesis. This study describes the original  and novel application of lytic capsule-specific phage killing, saturated Tn5 transposon mutagenesis, and high-throughput transposon-directed insertion site sequencing to define the entire complement of genes required for capsule production in UPEC. Our comprehensive approach uncovered new genes involved in the regulation of this key virulence determinant.
28791299	 Increasing evidence that microRNAs (miRNAs) play important roles in the immune response against infectious agents suggests that miRNA might be exploitable as signatures of exposure to specific infectious agents. In order to identify potential early miRNA biomarkers of bacterial infections, human peripheral blood  mononuclear cells (hPBMCs) were exposed to two select agents, Burkholderia pseudomallei K96243 and Francisella tularensis SHU S4, as well as to the nonpathogenic control Escherichia coli DH5α. RNA samples were harvested at three  early time points, 30, 60, and 120 minutes postexposure, then sequenced. RNAseq analyses identified 87 miRNAs to be differentially expressed (DE) in a linear fashion. Of these, 31 miRNAs were tested using the miScript miRNA qPCR assay. Through RNAseq identification and qPCR validation, we identified differentially expressed miRNA species that may be involved in the early response to bacterial infections. Based upon its upregulation at early time points postexposure in two  different individuals, hsa-mir-30c-5p is a miRNA species that could be studied further as a potential biomarker for exposure to these gram-negative intracellular pathogens. Gene ontology functional analyses demonstrated that programmed cell death is the first ranking biological process associated with miRNAs that are upregulated in F. tularensis-exposed hPBMCs.
28649444	 Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. Several techniques are presently used to experimentally assay transcription factor-to-target relationships, providing important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental designs for learning that can solve gene regulatory networks in the future.
By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells.
28614372	 Infection with Shiga toxin (Stx) producing Escherichia coli O157:H7 can cause the potentially fatal complication hemolytic uremic syndrome, and currently only supportive therapy is available. Lack of suitable animal models has hindered study of this disease. Induced human intestinal organoids (iHIOs), generated by in vitro differentiation of pluripotent stem cells, represent differentiated human intestinal tissue. We show that iHIOs with addition of human neutrophils can model E. coli intestinal infection and innate cellular responses. Commensal and O157:H7 introduced into the iHIO lumen replicated rapidly achieving high numbers. Commensal E. coli did not cause damage, and were completely contained within the lumen, suggesting defenses, such as mucus production, can constrain non-pathogenic strains. Some O157:H7 initially co-localized with cellular actin.  Loss of actin and epithelial integrity was observed after 4 hours. O157:H7 grew as filaments, consistent with activation of the bacterial SOS stress response. SOS is induced by reactive oxygen species (ROS), and O157:H7 infection increased  ROS production. Transcriptional profiling (RNAseq) demonstrated that both commensal and O157:H7 upregulated genes associated with gastrointestinal maturation, while infection with O157:H7 upregulated inflammatory responses, including interleukin 8 (IL-8). IL-8 is associated with neutrophil recruitment, and infection with O157:H7 resulted in recruitment of human neutrophils into the  iHIO tissue.
28439033	 Upon oxygen limitation, the Bacillus subtilis ResE sensor kinase and its cognate ResD response regulator play primary roles in the transcriptional activation of genes functioning in anaerobic respiration. The nitric oxide (NO)-sensitive NsrR repressor controls transcription to support nitrate respiration. In addition, the ferric uptake repressor (Fur) can modulate transcription under anaerobic conditions. However, whether these controls are direct or indirect has been investigated only in a gene-specific manner. To gain a genomic view of anaerobic gene regulation, we determined the genome-wide in vivo DNA binding of ResD, NsrR, and Fur transcription factors (TFs) using in situ DNase I footprinting combined with chromatin affinity precipitation sequencing (ChAP-seq; genome footprinting by high-throughput sequencing [GeF-seq]). A significant number of sites were targets of ResD and NsrR, and a majority of them were also bound by Fur. The binding of multiple TFs to overlapping targets affected each individual TF's binding, which led to combinatorial transcriptional control. ResD bound to both the promoters and the coding regions of genes under its positive control. Other genes showing enrichment of ResD at only the promoter regions are targets of direct ResD-dependent repression or antirepression. The results support previous findings of ResD as an RNA polymerase (RNAP)-binding protein and indicated that ResD can associate with the transcription elongation complex. The data set allowed us to reexamine consensus sequence motifs of Fur, ResD, and NsrR and uncovered evidence that multiple TGW (where W is A or T) sequences surrounded by an A- and T-rich sequence are often found at sites where all three TFs competitively bind. IMPORTANCE: Bacteria encounter oxygen fluctuation in their natural environment as well as in host organisms. Hence, understanding how bacteria respond to oxygen limitation will impact environmental and human health.
ResD, NsrR, and Fur control transcription under anaerobic conditions. This work using in situ DNase I footprinting uncovered the genome-wide binding profile of the three transcription factors (TFs). Binding of the TFs is often competitive or cooperative depending on the promoters and the presence of other TFs, indicating  that transcriptional regulation by multiple TFs is much more complex than we originally thought. The results from this study provide a more complete picture of anaerobic gene regulation governed by ResD, NsrR, and Fur and contribute to our further understanding of anaerobic physiology.
28270101	 BACKGROUND: Avian pathogenic E. coli (APEC) can cause annual losses of millions of dollars in poultry because of mortality and produce contamination. Studies have verified that many immune-related genes undergo changes in alternative splicing (AS), along with nonsense-mediated decay (NMD), to regulate the immune system under different conditions. Therefore, the splicing profiles of primary lymphoid tissues with systemic APEC infection need to be comprehensively examined. RESULTS: Gene expression data from RNAseq were obtained for three different immune tissues (bone marrow, thymus, and bursa) from birds of three phenotypes (non-challenged, resistant, and susceptible) at two time points. Alternative 5' splice sites and exon skipping/inclusion were identified as the major alternative splicing events in avian primary immune organs under systemic APEC infection. In this study, we detected hundreds of differentially-expressed-transcript-containing genes (DETs) between birds of different phenotypes at 5 days post-infection (dpi). The DETs PSAP and STT3A, with NMD, have important functions under systemic APEC infection. The DETs CDC45, CDK1, RAG2, POLR1B, PSAP, and DNASE1L3, from the same transcription start sites (TSS), indicate that cell death, cell cycle, cellular function, and maintenance were predominant in the host under systemic APEC infection. CONCLUSIONS: With the use of RNAseq technology and bioinformatics tools, this study provides a portrait of the AS event and NMD in primary lymphoid tissues, which play critical roles in host homeostasis under systemic APEC infection. According to this study, AS plays a pivotal regulatory role in the immune response in chicken under systemic APEC infection via either NMD or alternative TSSs. This study elucidates the regulatory role of AS for the immune complex under systemic APEC infection.
28060822	 Mosquitoes host communities of microbes in their digestive tract that consist primarily of bacteria. We previously reported that Aedes aegypti larvae colonized by a native community of bacteria and gnotobiotic larvae colonized by only Escherichia coli develop very similarly into adults, whereas axenic larvae never  molt and die as first instars. In this study, we extended these findings by first comparing the growth and abundance of bacteria in conventional, gnotobiotic, and  axenic larvae during the first instar. Results showed that conventional and gnotobiotic larvae exhibited no differences in growth, timing of molting, or number of bacteria in their digestive tract. Axenic larvae in contrast grew minimally and never achieved the critical size associated with molting by conventional and gnotobiotic larvae. In the second part of the study we compared  patterns of gene expression in conventional, gnotobiotic and axenic larvae by conducting an RNAseq analysis of gut and nongut tissues (carcass) at 22 h post-hatching. Approximately 12% of Ae. aegypti transcripts were differentially expressed in axenic versus conventional or gnotobiotic larvae. However, this profile consisted primarily of transcripts in seven categories that included the  down-regulation of select peptidases in the gut and up-regulation of several genes in the gut and carcass with roles in amino acid transport, hormonal signaling, and metabolism. Overall, our results indicate that axenic larvae exhibit alterations in gene expression consistent with defects in acquisition and assimilation of nutrients required for growth.
28039131	 Human enteric pathogens, such as Salmonella spp. and verotoxigenic Escherichia coli, are increasingly recognized as causes of gastroenteritis outbreaks associated with the consumption of fruits and vegetables. Persistence in plants represents an important part of the life cycle of these pathogens. The identification of the full complement of Salmonella genes involved in the colonization of the model plant (tomato) was carried out using transposon insertion sequencing analysis. With this approach, 230,000 transposon insertions  were screened in tomato pericarps to identify loci with reduction in fitness, followed by validation of the screen results using competition assays of the isogenic mutants against the wild type. A comparison with studies in animals revealed a distinct plant-associated set of genes, which only partially overlaps  with the genes required to elicit disease in animals. De novo biosynthesis of amino acids was critical to persistence within tomatoes, while amino acid scavenging was prevalent in animal infections. Fitness reduction of the Salmonella amino acid synthesis mutants was generally more severe in the tomato rin mutant, which hyperaccumulates certain amino acids, suggesting that these nutrients remain unavailable to Salmonella spp. within plants. Salmonella lipopolysaccharide (LPS) was required for persistence in both animals and plants, exemplifying some shared pathogenesis-related mechanisms in animal and plant hosts. Similarly to phytopathogens, Salmonella spp. required biosynthesis of amino acids, LPS, and nucleotides to colonize tomatoes. 
Overall, however, it appears that while Salmonella shares some strategies with phytopathogens and taps into its animal virulence-related functions, colonization of tomatoes represents a distinct strategy, highlighting this pathogen's flexible metabolism. IMPORTANCE: Outbreaks of gastroenteritis caused by human pathogens have been increasingly associated with foods of plant origin, with tomatoes being one of the common culprits. Recent studies also suggest that these human pathogens can use plants as alternate hosts as a part of their life cycle. While dual (animal/plant) lifestyles of other members of the Enterobacteriaceae family are well known, the strategies with which Salmonella colonizes plants are only partially understood. Therefore, we undertook a high-throughput characterization of the functions required for Salmonella persistence within tomatoes. The results of this study were compared with what is known about genes required for Salmonella virulence in animals and interactions of plant pathogens with their hosts to determine whether Salmonella repurposes its virulence repertoire inside plants or whether it behaves more as a phytopathogen during plant colonization. Even though Salmonella utilized some of its virulence-related genes in tomatoes, plant colonization required a distinct set of functions.
27872077	 Plasmids of incompatibility group A/C (IncA/C) are becoming increasingly prevalent within pathogenic Enterobacteriaceae. They are associated with the dissemination of multiple clinically relevant resistance genes, including blaCMY and blaNDM. Current typing methods for IncA/C plasmids offer limited resolution. In this study, we present the complete sequence of a blaNDM-1-positive IncA/C plasmid, pMS6198A, isolated from a multidrug-resistant uropathogenic Escherichia coli strain. Hypersaturated transposon mutagenesis, coupled with transposon-directed insertion site sequencing (TraDIS), was employed to identify conserved genetic elements required for replication and maintenance of pMS6198A. Our analysis of TraDIS data identified roles for the replicon, including repA, a toxin-antitoxin system; two putative partitioning genes, parAB; and a putative gene, 053. Construction of mini-IncA/C plasmids and examination of their stability within E. coli confirmed that the region encompassing 053 contributes to the stable maintenance of IncA/C plasmids. Subsequently, the four major maintenance genes (repA, parAB, and 053) were used to construct a new plasmid multilocus sequence typing (PMLST) scheme for IncA/C plasmids. Application of this scheme to a database of 82 IncA/C plasmids identified 11 unique sequence types (STs), with two dominant STs. The majority of blaNDM-positive plasmids examined (15/17; 88%) fall into ST1, suggesting acquisition and subsequent expansion of this blaNDM-containing plasmid lineage. The IncA/C PMLST scheme represents a standardized tool to identify, track, and analyze the dissemination of important IncA/C plasmid lineages, particularly in the context of epidemiological studies.
27836995	 RNA sequencing studies have identified hundreds of non-coding RNAs in bacteria, including regulatory small RNA (sRNA). However, our understanding of sRNA function has lagged behind their identification due to a lack of tools for the high-throughput analysis of RNA-RNA interactions in bacteria. Here we demonstrate that in vivo sRNA-mRNA duplexes can be recovered using UV-crosslinking, ligation and sequencing of hybrids (CLASH). Many sRNAs recruit the endoribonuclease, RNase E, to facilitate processing of mRNAs. We were able to recover base-paired sRNA-mRNA duplexes in association with RNase E, allowing proximity-dependent ligation and sequencing of cognate sRNA-mRNA pairs as chimeric reads. We verified that this approach captures bona fide sRNA-mRNA interactions. Clustering analyses identified novel sRNA seed regions and sets of potentially co-regulated target mRNAs. We identified multiple mRNA targets for the pathotype-specific sRNA Esr41, which was shown to regulate colicin sensitivity and iron transport in E. coli. Numerous sRNA interactions were also identified with non-coding RNAs, including sRNAs and tRNAs, demonstrating the high complexity of the sRNA interactome.
27492287	 DNA of viral origin represents a ubiquitous element of bacterial genomes. Its integration into host regulatory circuits is a pivotal driver of microbial evolution but requires the stringent regulation of phage gene activity. In this study, we describe the nucleoid-associated protein CgpS, which represents an essential protein functioning as a xenogeneic silencer in the Gram-positive Corynebacterium glutamicum. CgpS is encoded by the cryptic prophage CGP3 of the C. glutamicum strain ATCC 13032 and was first identified by DNA affinity chromatography using an early phage promoter of CGP3. Genome-wide profiling of CgpS binding using chromatin affinity purification and sequencing (ChAP-Seq) revealed its association with AT-rich DNA elements, including the entire CGP3 prophage region (187 kbp), as well as several other elements acquired by horizontal gene transfer. Countersilencing of CgpS resulted in a significantly increased induction frequency of the CGP3 prophage. In contrast, a strain lacking the CGP3 prophage was not affected and displayed stable growth. In a bioinformatics approach, cgpS orthologs were identified primarily in actinobacterial genomes as well as several phage and prophage genomes. Sequence analysis of 618 orthologous proteins revealed a strong conservation of the secondary structure, supporting an ancient function of these xenogeneic silencers in phage-host interaction.
27466434	 Avian pathogenic Escherichia coli (APEC) can cause significant morbidity in chickens. The thymus provides the essential environment for T cell development; however, the thymus transcriptome has not been examined for gene expression in response to APEC infection. An improved understanding of the host genomic response to APEC infection could inform future breeding programs for disease resistance and APEC control. We therefore analyzed the transcriptome of the thymus of birds challenged with APEC, contrasting susceptible and resistant phenotypes. Thousands of genes were differentially expressed in birds of the 5-day post infection (dpi) challenged-susceptible group vs. 5 dpi non-challenged, in 5 dpi challenged-susceptible vs. 5 dpi challenged-resistant birds, as well as  in 5 dpi vs. one dpi challenged-susceptible birds. The Toll-like receptor signaling pathway was the major innate immune response for birds to respond to APEC infection. Moreover, lysosome and cell adhesion molecules pathways were common mechanisms for chicken response to APEC infection. The T-cell receptor signaling pathway, cell cycle, and p53 signaling pathways were significantly activated in resistant birds to resist APEC infection. These results provide a comprehensive assessment of global gene networks and biological functionalities of differentially expressed genes in the thymus under APEC infection. These findings provide novel insights into key molecular genetic mechanisms that differentiate host resistance from susceptibility in this primary lymphoid tissue, the thymus.
27424527	 Thermobifida fusca is a thermophilic actinobacterium. T. fusca muC obtained by adaptive evolution preferred yeast extract to ammonium sulfate for accumulating malic acid and ammonium sulfate for cell growth. We performed transcriptome analysis by RNAseq of T. fusca muC grown on Avicel and cellobiose with addition of ammonium sulfate or yeast extract, respectively. The transcriptional results indicate that ammonium sulfate induced the transcription of genes related to carbohydrate metabolism significantly more than yeast extract. Importantly, Tfu_2487, encoding histidine-containing protein (HPr), was not transcribed on yeast extract at all, while it was highly transcribed on ammonium sulfate. In order to understand the impact of HPr on malate production and cell growth of the muC strain, we deleted Tfu_2487 to obtain a mutant strain, muCΔ2487, which had a malate yield of 1.33 mole/mole-glucose equivalent, much higher than that on yeast extract. We then developed an E. coli-T. fusca shuttle plasmid for over-expressing HPr in muCΔ2487, a strain without HPr background, forming the muCΔ2487S strain. The muCΔ2487S strain had a much lower malate yield but faster cell growth than the muC strain. The results of both mutant strains confirmed that HPr was the key regulatory protein for T. fusca's metabolism on nitrogen sources.
27336699	 Our objective was to identify the biological response and the cross-talk between liver and mammary tissue after intramammary infection (IMI) with Escherichia coli (E. coli) using RNAseq technology. Sixteen cows were inoculated with live E. coli into one mammary quarter at ~4-6 weeks in lactation. For all cows, biopsies were performed at -144, 12 and 24 h relative to IMI in liver and at 24 h post-IMI in infected and non-infected (control) mammary quarters. For a subset of cows (n = 6), RNA was extracted from both liver and mammary tissue and sequenced using a 100 bp paired-end approach. Ingenuity Pathway Analysis and the Dynamic Impact Approach analysis of differentially expressed genes (overall effect False Discovery Rate ≤ 0.05) indicated that IMI induced an overall activation of inflammation at 12 h post-IMI and a strong inhibition of metabolism, especially related to lipid, glucose, and xenobiotics, at 24 h post-IMI in liver. In mammary tissue, the data indicated an overall induction of inflammatory response with little effect on metabolism at 24 h post-IMI. We identified a large number of up-stream regulators potentially involved in the response to IMI in both tissues, but a relatively small core network of transcription factors controlling the response to IMI in liver compared with a large network in mammary tissue. Transcriptomic results in liver and mammary tissue were supported by changes in inflammatory and metabolic mediators in blood and milk. The analysis of potential cross-talk between the two tissues during IMI uncovered a large communication from the mammary tissue to the liver to coordinate the inflammatory response but a relatively small communication from the liver to the mammary tissue. Our results indicate a strong induction of the inflammatory response in mammary tissue and impairment of liver metabolism 24 h post-IMI partly driven by the signaling from infected mammary tissue.
27298336	 R loops form when transcripts hybridize to homologous DNA on chromosomes, yielding a DNA:RNA hybrid and a displaced DNA single strand. R loops impact the genome of many organisms, regulating chromosome stability, gene expression, and DNA repair. Understanding the parameters dictating R-loop formation in vivo has been hampered by the limited quantitative and spatial resolution of current genomic strategies for mapping R loops. We report a novel whole-genome method, S1-DRIP-seq (S1 nuclease DNA:RNA immunoprecipitation with deep sequencing), for mapping hybrid-prone regions in budding yeast Saccharomyces cerevisiae. Using this methodology, we identified ∼800 hybrid-prone regions covering 8% of the genome. Given the pervasive transcription of the yeast genome, this result suggests that R-loop formation is dictated by characteristics of the DNA, RNA, and/or chromatin. We successfully identified two features highly predictive of hybrid formation: high transcription and long homopolymeric dA:dT tracts. These accounted for >60% of the hybrid regions found in the genome. We demonstrated that these two factors play a causal role in hybrid formation by genetic manipulation. Thus, the hybrid map generated by S1-DRIP-seq led to the identification of the first global genomic features causal for R-loop formation in yeast.
27004424	 BACKGROUND: Biofilm formation is an important survival strategy of Salmonella in all environments. By mutant screening, we showed a knock-out mutant of fabR, encoding a repressor of unsaturated fatty acid (UFA) biosynthesis, to have impaired biofilm formation. In order to unravel how this regulator impinges on Salmonella biofilm formation, we aimed at elucidating the S. Typhimurium FabR regulon. To this end, we applied a combinatorial high-throughput approach, combining ChIP-chip with transcriptomics. RESULTS: All the previously identified E. coli FabR transcriptional target genes (fabA, fabB and yqfA) were shown to be direct S. Typhimurium FabR targets as well. As we found a fabB overexpressing strain to partly mimic the biofilm defect of the fabR mutant, the effect of FabR on biofilms can be attributed at least partly to FabB, which plays a key role in UFA biosynthesis. Additionally, ChIP-chip identified a number of novel direct FabR targets (the intergenic regions between hpaR/hpaG and ddg/ydfZ) and as yet putative direct targets (i.a. genes involved in tRNA metabolism, ribosome synthesis and translation). Next to UFA biosynthesis, a number of these direct targets and other indirect targets identified by transcriptomics (e.g. ribosomal genes, ompA, ompC, ompX, osmB, osmC, sseI) could possibly contribute to the effect of FabR on biofilm formation. CONCLUSION: Overall, our results point to the importance of FabR and UFA biosynthesis in Salmonella biofilm formation and their role as potential targets for biofilm inhibitory strategies.
26862720	 The DNA adenine methyltransferase identification (DamID) assay is a powerful method to detect protein-DNA interactions both locally and genome-wide. It is an  alternative approach to chromatin immunoprecipitation (ChIP). An expressed fusion protein consisting of the protein of interest and the E. coli DNA adenine methyltransferase can methylate the adenine base in GATC motifs near the sites of protein-DNA interactions. Adenine-methylated DNA fragments can then be specifically amplified and detected. The original DamID assay detects the genomic locations of methylated DNA fragments by hybridization to DNA microarrays, which  is limited by the availability of microarrays and the density of predetermined probes. In this paper, we report the detailed protocol of integrating high throughput DNA sequencing into DamID (DamID-seq). The large number of short reads generated from DamID-seq enables detecting and localizing protein-DNA interactions genome-wide with high precision and sensitivity. We have used the DamID-seq assay to study genome-nuclear lamina (NL) interactions in mammalian cells, and have noticed that DamID-seq provides a high resolution and a wide dynamic range in detecting genome-NL interactions. The DamID-seq approach enables probing NL associations within gene structures and allows comparing genome-NL interaction maps with other functional genomic data, such as ChIP-seq and RNA-seq.
26706151	 Proper division site selection is crucial for the survival of all organisms. What still eludes us is how bacteria position their division site with high precision, and in tight coordination with chromosome replication and segregation. Until recently, the general belief, at least in the model organisms Bacillus subtilis and Escherichia coli, was that spatial regulation of division comes about by the  combined negative regulatory mechanisms of the Min system and nucleoid occlusion. However, as we review here, these two systems cannot be solely responsible for division site selection and we highlight additional regulatory mechanisms that are at play. In this review, we put forward evidence of how chromosome replication and segregation may have direct links with cell division in these bacteria and the benefit of recent advances in chromosome conformation capture techniques in providing important information about how these three processes mechanistically work together to achieve accurate generation of progenitor cells.
26537891	 BACKGROUND: FNR homologues constitute an important class of transcription factors that control a wide range of anaerobic physiological functions in a number of bacterial species. Since FNR homologues are some of the most pervasive transcription factors, an understanding of their involvement in regulating anaerobic gene expression in different species sheds light on evolutionary similarities and differences. To address this question, we used a combination of high throughput RNA-Seq and ChIP-Seq analysis to define the extent of the FnrL regulon in Rhodobacter capsulatus and related our results to those for FnrL in Rhodobacter sphaeroides and FNR in Escherichia coli. RESULTS: Our RNA-seq results show that FnrL affects the expression of 807 genes, which accounts for over 20% of the Rba. capsulatus genome. ChIP-seq results indicate that 42 of these genes are directly regulated by FnrL. Importantly, this includes genes involved in the synthesis of the anoxygenic photosystem. Similarly, FnrL in Rba. sphaeroides affects 24% of its genome; however, only 171 genes are differentially expressed in common between the two Rhodobacter species, suggesting significant divergence in regulation. CONCLUSIONS: We show that FnrL in Rba. capsulatus activates photosynthesis, while in Rba. sphaeroides FnrL regulation is reported to involve repression of the photosystem. This analysis highlights important differences in transcriptional control of photosynthetic events and other metabolic processes controlled by FnrL orthologues in closely related Rhodobacter species. Furthermore, we also show that the E. coli FNR regulon has limited transcriptional overlap with the FnrL regulons from either Rhodobacter species.
26483520	 An ability to sense and respond to changes in extracellular phosphate is critical for the survival of most bacteria. For Caulobacter crescentus, which typically lives in phosphate-limited environments, this process is especially crucial. Like many bacteria, Caulobacter responds to phosphate limitation through a conserved two-component signaling pathway called PhoR-PhoB, but the direct regulon of PhoB in this organism is unknown. Here we used chromatin immunoprecipitation-DNA sequencing (ChIP-Seq) to map the global binding patterns of the phosphate-responsive transcriptional regulator PhoB under phosphate-limited and -replete conditions. Combined with genome-wide expression profiling, our work demonstrates that PhoB is induced to regulate nearly 50 genes under phosphate-starved conditions. The PhoB regulon is comprised primarily of genes known or predicted to help Caulobacter scavenge for and import inorganic phosphate, including 15 different membrane transporters. We also investigated the regulatory role of PhoU, a widely conserved protein proposed to coordinate phosphate import with expression of the PhoB regulon by directly modulating the histidine kinase PhoR. However, our studies show that it likely does not play such a role in Caulobacter, as PhoU depletion has no significant effect on PhoB-dependent gene expression. Instead, cells lacking PhoU exhibit striking accumulation of large polyphosphate granules, suggesting that PhoU participates in controlling intracellular phosphate metabolism. IMPORTANCE: The transcription factor PhoB is widely conserved throughout the bacterial kingdom, where it helps organisms respond to phosphate limitation by driving the expression of a battery of genes. Most of what is known about PhoB and its target genes is derived from studies of Escherichia coli. Our work documents the PhoB regulon in Caulobacter crescentus, and comparison to the regulon in E. coli reveals significant differences, highlighting the evolutionary plasticity of transcriptional responses driven by highly conserved transcription factors. We also demonstrated that the conserved protein PhoU, which is implicated in bacterial persistence, does not regulate PhoB activity, as previously suggested. Instead, our results favor a model in which PhoU affects intracellular phosphate accumulation, possibly through the high-affinity phosphate transporter.
28348816	 Uropathogenic Escherichia coli (UPEC) is the causative agent of urinary tract infections. Nitric oxide (NO) is a toxic water-soluble gas that is encountered by UPEC in the urinary tract. Therefore, UPEC probably requires mechanisms to detoxify NO in the host environment. Thus far, flavohaemoglobin (Hmp), an NO denitrosylase, is the only demonstrated NO detoxification system in UPEC. Here we show that, in E. coli strain CFT073, the NADH-dependent NO reductase flavorubredoxin (FlRd) also plays a major role in NO scavenging. We generated a mutant that lacks all known and candidate NO detoxification pathways (Hmp, FlRd and the respiratory nitrite reductase, NrfA). When grown and assayed anaerobically, this mutant expresses an NO-inducible NO scavenging activity, pointing to the existence of a novel detoxification mechanism. Expression of this activity is inducible by both NO and nitrate, and the enzyme is membrane-associated. Genome-wide transcriptional profiling of UPEC grown under anaerobic conditions in the presence of nitrate (as a source of NO) highlighted various aspects of the response of the pathogen to nitrate and NO. Several virulence-associated genes are upregulated, suggesting that host-derived NO is a potential regulator of UPEC virulence. Chromatin immunoprecipitation and sequencing was used to evaluate the NsrR regulon in CFT073. We identified 49 NsrR binding sites in promoter regions in the CFT073 genome, 29 of which were not previously identified in E. coli K-12. NsrR may regulate some CFT073 genes that do not have homologues in E. coli K-12.
26389830	 The alternative sigma factor σE functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σE in Salmonella and E. coli has been examined using microarrays; however, a genome-wide analysis of σE-binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σE-binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σE in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σE-binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σE modulates transcription. By dissecting direct and indirect modes of σE-mediated regulation, we found that σE activates gene expression through recognition of both canonical and reversed consensus sequences. New σE-regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.
26131613	 Escherichia coli ST131 is a recently emerged and globally disseminated multidrug resistant clone associated with urinary tract and bloodstream infections in both community and clinical settings. The most common group of ST131 strains is defined by resistance to fluoroquinolones and possession of the type 1 fimbriae fimH30 allele. Here we provide an update on our recent work describing the global epidemiology of ST131. We review the phylogeny of ST131 based on whole genome sequence data and highlight the important role of recombination in the evolution of this clonal lineage. We also summarize our findings on the virulence of the ST131 reference strain EC958, and highlight the use of transposon directed insertion-site sequencing to define genes associated with serum resistance and essential features of its large antibiotic resistance plasmid pEC958.
26070154	 In bacteria the concurrence of DNA replication and transcription leads to potentially deleterious encounters between the two machineries, which can occur in either the head-on (lagging strand genes) or co-directional (leading strand genes) orientations. These conflicts lead to replication fork stalling and can destabilize the genome. Both eukaryotic and prokaryotic cells possess resolution factors that reduce the severity of these encounters. Though Escherichia coli accessory helicases have been implicated in the mitigation of head-on conflicts, direct evidence of these proteins mitigating co-directional conflicts is lacking. Furthermore, the endogenous chromosomal regions where these helicases act, and the mechanism of recruitment, have not been identified. We show that the essential Bacillus subtilis accessory helicase PcrA aids replication progression through protein coding genes of both head-on and co-directional orientations, as well as rRNA and tRNA genes. ChIP-Seq experiments show that co-directional conflicts at highly transcribed rRNA, tRNA, and head-on protein coding genes are major targets of PcrA activity on the chromosome. Partial depletion of PcrA renders cells extremely sensitive to head-on conflicts, linking the essential function of PcrA to conflict resolution. Furthermore, ablating PcrA's ATPase/helicase activity simultaneously increases its association with conflict regions, while incapacitating its ability to mitigate conflicts, and leads to cell death. In contrast, disruption of PcrA's C-terminal RNA polymerase interaction domain does not impact its ability to mitigate conflicts between replication and transcription, its association with conflict regions, or cell survival. Altogether, this work establishes PcrA as an essential factor involved in mitigating transcription-replication conflicts and identifies chromosomal regions where it routinely acts. As both conflicts and accessory helicases are found in all domains of life, these results are broadly relevant.
25875675	 Escherichia coli sequence type 131 (E. coli ST131) is a recently emerged and globally disseminated multidrug resistant clone associated with urinary tract and bloodstream infections. Plasmids represent a major vehicle for the carriage of antibiotic resistance genes in E. coli ST131. In this study, we determined the complete sequence and performed a comprehensive annotation of pEC958, an IncF plasmid from the E. coli ST131 reference strain EC958. Plasmid pEC958 is 135.6 kb in size, harbours two replicons (RepFIA and RepFII) and contains 12 antibiotic resistance genes (including the blaCTX-M-15 gene). We also carried out hyper-saturated transposon mutagenesis and multiplexed transposon directed insertion-site sequencing (TraDIS) to investigate the biology of pEC958. TraDIS data showed that while only the RepFII replicon was required for pEC958 replication, the RepFIA replicon contains genes essential for its partitioning. Thus, our data provide direct evidence that the RepFIA and RepFII replicons in pEC958 cooperate to ensure their stable inheritance. The gene encoding the antitoxin component (ccdA) of the post-segregational killing system CcdAB was also protected from mutagenesis, demonstrating that this system is active. Sequence comparison with a global collection of ST131 strains suggests that IncF represents the most common type of plasmid in this clone, and underscores the need to understand its evolution and contribution to the spread of antibiotic resistance genes in E. coli ST131.
25873626	 The cMonkey integrated biclustering algorithm identifies conditionally co-regulated modules of genes (biclusters). cMonkey integrates various orthogonal pieces of information which support evidence of gene co-regulation, and optimizes biclusters to be supported simultaneously by one or more of these prior constraints. The algorithm served as the cornerstone for constructing the first global, predictive Environmental Gene Regulatory Influence Network (EGRIN) model  for a free-living cell, and has now been applied to many more organisms. However, due to its computational inefficiencies, long run-time and complexity of various  input data types, cMonkey was not readily usable by the wider community. To address these primary concerns, we have significantly updated the cMonkey algorithm and refactored its implementation, improving its usability and extendibility. These improvements provide a fully functioning and user-friendly platform for building co-regulated gene modules and the tools necessary for their exploration and interpretation. We show, via three separate analyses of data for  E. coli, M. tuberculosis and H. sapiens, that the updated algorithm and inclusion of novel scoring functions for new data types (e.g. ChIP-seq and transcription factor over-expression [TFOE]) improve discovery of biologically informative co-regulated modules. The complete cMonkey2 software package, including source code, is available at https://github.com/baliga-lab/cmonkey2.
25757765	 Plants consist of many functionally specialized cell types, each with its own unique epigenome, transcriptome, and proteome. Characterization of these cell type-specific properties is essential to understanding cell fate specification and the responses of individual cell types to the environment. In this chapter we describe an approach to map chromatin features in specific cell types of Arabidopsis thaliana using nuclei purification from individual cell types with the INTACT method (isolation of nuclei tagged in specific cell types) followed by chromatin immunoprecipitation and high-throughput sequencing (ChIP-seq). The INTACT system employs two transgenes to generate affinity-labeled nuclei in the cell type of interest, and these tagged nuclei can then be selectively purified from tissue homogenates. The primary transgene encodes the nuclear tagging fusion protein (NTF), which consists of a nuclear envelope-targeting domain, the green fluorescent protein, and a biotin ligase recognition peptide, while the second transgene encodes the E. coli biotin ligase (BirA), which selectively biotinylates NTF. Expression of NTF and BirA in a specific cell type thus yields  nuclei that are coated with biotin and can be purified by virtue of their affinity for streptavidin-coated magnetic beads. Compared with the original INTACT nuclei purification protocol, the procedure presented here is greatly simplified and shortened. After nuclei purification, we provide detailed instructions for chromatin isolation, shearing, and immunoprecipitation. Finally, we present a low input ChIP-seq library preparation protocol based on the nano-ChIP-seq method of Adli and Bernstein, and we describe multiplex Illumina sequencing of these libraries to produce high quality, cell type-specific epigenome profiles at a relatively low cost. The procedures given here are optimized for Arabidopsis but should be easily adaptable to other plant species.
25089258	 CarD is an essential mycobacterial protein that binds the RNA polymerase (RNAP) and affects the transcriptional profile of Mycobacterium smegmatis and Mycobacterium tuberculosis (6). We predicted that CarD was directly regulating RNAP function, but our prior experiments had not determined at what stage of transcription CarD was functioning and at which genes CarD interacted with the RNAP. To begin to address these open questions, we performed chromatin immunoprecipitation sequencing (ChIP-seq) to survey the distribution of CarD throughout the M. smegmatis chromosome. The distributions of RNAP subunits β and σA were also profiled. We expected that RNAP β would be present throughout transcribed regions and RNAP σA would be predominantly enriched at promoters based on work in Escherichia coli (3); however, this had yet to be determined in mycobacteria. The ChIP-seq analyses revealed that CarD was never present on the genome in the absence of RNAP, was primarily associated with promoter regions, and was highly correlated with the distribution of RNAP σA. The colocalization of σA and CarD led us to propose that in vivo, CarD associates with RNAP initiation complexes at most promoters and is therefore a global regulator of transcription initiation. Here we describe in detail the data from the ChIP-seq experiments associated with the study published by Srivastava and colleagues in the Proceedings of the National Academy of Sciences in 2013 (5) as well as discuss the findings from this dataset in relation to both CarD and mycobacterial transcription as a whole. The ChIP-seq data have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE48164).
25085508	 BACKGROUND: Burkholderia pseudomallei is a facultative intracellular pathogen and the causative agent of melioidosis. A conserved type III secretion system (T3SS3) and type VI secretion system (T6SS1) are critical for intracellular survival and  growth. The T3SS3 and T6SS1 genes are coordinately and hierarchically regulated by a TetR-type regulator, BspR. A central transcriptional regulator of the BspR regulatory cascade, BsaN, activates a subset of T3SS3 and T6SS1 loci. RESULTS: To elucidate the scope of the BsaN regulon, we used RNAseq analysis to compare the transcriptomes of wild-type B. pseudomallei KHW and a bsaN deletion mutant. The 60 genes positively-regulated by BsaN include those that we had previously identified in addition to a polyketide biosynthesis locus and genes involved in amino acid biosynthesis. BsaN was also found to repress the transcription of 51 genes including flagellar motility loci and those encoding components of the T3SS3 apparatus. Using a promoter-lacZ fusion assay in E. coli, we show that BsaN together with the chaperone BicA directly control the expression of the T3SS3 translocon, effector and associated regulatory genes that are organized into at least five operons (BPSS1516-BPSS1552). Using a mutagenesis approach, a consensus regulatory motif in the promoter regions of BsaN-regulated  genes was shown to be essential for transcriptional activation. CONCLUSIONS: BsaN/BicA functions as a central regulator of key virulence clusters in B. pseudomallei within a more extensive network of genetic regulation. We propose that BsaN/BicA controls a gene expression program that facilitates the adaption and intracellular survival of the pathogen within eukaryotic hosts.
24743342	 DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high  GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of  RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013.
24650566	 Inferring gene regulatory networks from gene expression data at the whole-genome level is still an arduous challenge, especially in higher organisms where the number of genes is large but the number of experimental samples is small. It is reported that the accuracy of current methods at genome scale drops significantly from Escherichia coli to Saccharomyces cerevisiae due to the increase in the number of genes. This limits the applicability of current methods to more complex genomes, like human and mouse. The least absolute shrinkage and selection operator (LASSO) is widely used for gene regulatory network inference from gene expression profiles. However, the accuracy of LASSO on large genomes is not satisfactory. In this study, we apply two extended models of LASSO, the L0 and L1/2 regularization models, to infer gene regulatory networks from both high-throughput gene expression data and transcription factor binding data in mouse embryonic stem cells (mESCs). We find that both the L0 and L1/2 regularization models significantly outperform LASSO in network inference. Incorporating interactions between transcription factors and their targets remarkably improved the prediction accuracy. The current study demonstrates the efficiency and applicability of these two models for gene regulatory network inference from integrative omics data in large genomes. The applications of the two models will help biologists study the gene regulation of higher model organisms at a genome-wide scale.
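As a hedged illustration of the three penalties named in the abstract above: the abstract gives no formulas, so the following uses the standard textbook forms, and the symbols y (expression of a target gene), X (expression of candidate regulators), β (regulatory weights) and λ (regularization strength) are assumptions introduced here, not notation from the paper.

```latex
% Generic penalized regression; only the penalty P differs between models.
\hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \, P(\beta)

% LASSO (L1):        P(\beta) = \sum_j \lvert \beta_j \rvert
% L1/2:              P(\beta) = \sum_j \lvert \beta_j \rvert^{1/2}
% L0:                P(\beta) = \sum_j \mathbf{1}[\beta_j \neq 0]
```

Moving from L1 to L1/2 to L0 penalizes the number of nonzero weights more directly, which generally yields sparser networks.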
24565265	 BACKGROUND: Chromatin immunoprecipitation (ChIP) experiments are now the most comprehensive experimental approaches for mapping the binding of transcription factors (TFs) to their target genes. However, ChIP data alone is insufficient for identifying functional binding target genes of TFs for two reasons. First, there is an inherent high false positive/negative rate in ChIP-chip or ChIP-seq experiments. Second, binding signals in the ChIP data do not necessarily imply functionality. METHODS: It is known that ChIP-chip data and TF knockout (TFKO) data reveal complementary information on gene regulation. While ChIP-chip data can provide TF-gene binding pairs, TFKO data can provide TF-gene regulation pairs. Therefore, we propose a novel network approach for identifying functional TF-gene binding pairs by integrating the ChIP-chip data with the TFKO data. In our method, a TF-gene binding pair from the ChIP-chip data is regarded as functional if it also has a high-confidence curated TFKO TF-gene regulatory relation or a deduced hypostatic TF-gene regulatory relation. RESULTS AND CONCLUSIONS: We first validated our method on a gathered ground truth set. Then we applied our method to the ChIP-chip data to identify functional TF-gene binding pairs. The biological significance of our identified functional TF-gene binding pairs was shown by assessing their functional enrichment, the prevalence of protein-protein interaction, and expression coherence. Our results outperformed the results of three existing methods across all measures. Our identified functional targets of TFs also showed statistical significance over the randomly assigned TF-gene pairs. We also showed that our method is dataset independent and can be applied to ChIP-seq data and the E. coli genome. Finally, we provided an example showing the biological applicability of our notion.
24098145	 Escherichia coli ST131 is a globally disseminated, multidrug resistant clone responsible for a high proportion of urinary tract and bloodstream infections. The rapid emergence and successful spread of E. coli ST131 is strongly associated with antibiotic resistance; however, this phenotype alone is unlikely to explain its dominance amongst multidrug resistant uropathogens circulating worldwide in hospitals and the community. Thus, a greater understanding of the molecular mechanisms that underpin the fitness of E. coli ST131 is required. In this study, we employed hyper-saturated transposon mutagenesis in combination with multiplexed transposon directed insertion-site sequencing to define the essential genes required for in vitro growth and the serum resistome (i.e. genes required for resistance to human serum) of E. coli EC958, a representative of the predominant E. coli ST131 clonal lineage. We identified 315 essential genes in E. coli EC958, 231 (73%) of which were also essential in E. coli K-12. The serum resistome comprised 56 genes, the majority of which encode membrane proteins or factors involved in lipopolysaccharide (LPS) biosynthesis. Targeted mutagenesis confirmed a role in serum resistance for 46 (82%) of these genes. The murein lipoprotein Lpp, along with two lipid A-core biosynthesis enzymes WaaP and WaaG, were most strongly associated with serum resistance. While LPS was the main resistance mechanism defined for E. coli EC958 in serum, the enterobacterial common antigen and colanic acid also impacted on this phenotype. Our analysis also identified a novel function for two genes, hyxA and hyxR, as minor regulators of O-antigen chain length. This study offers novel insight into the genetic make-up of E. coli ST131, and provides a framework for future research on E. coli and other Gram-negative pathogens to define their essential gene repertoire and to dissect the molecular mechanisms that enable them to survive in the bloodstream and cause disease.
24053571	 BACKGROUND: Studies of protein association with DNA on a genome wide scale are possible through methods like ChIP-Chip or ChIP-Seq. Massive problems with false positive signals in our own experiments motivated us to revise the standard ChIP-Chip protocol. Analysis of chromosome wide binding of the alternative sigma factor σ³² in Escherichia coli with this new protocol resulted in detection of only a subset of binding sites found in a previous study by Wade and colleagues. We suggested that the remainder of binding sites detected in the previous study are likely to be false positives. In a recent article the Wade group claimed that our conclusion is wrong and that the disputed sites are genuine σ³² binding sites. They further claimed that the non-detection of these sites in our study was due to low data quality. RESULTS/DISCUSSION: We respond to the criticism of Wade and colleagues and discuss some general questions of ChIP-based studies. We outline why the quality of our data is sufficient to derive meaningful results. Specific points are: (i) the modifications we introduced into the standard ChIP-Chip protocol do not necessarily result in a low dynamic range, (ii) correlation between ChIP-Chip replicates should not be calculated based on the whole data set as done in transcript analysis, (iii) control experiments are essential for identifying false positives. Suggestions are made as to how ChIP-based methods could be further optimized and which alternative approaches can be used to strengthen conclusions. CONCLUSION: We appreciate the ongoing discussion about the ChIP-Chip method and hope that it helps other scientists to analyze and interpret their results. The modifications we introduced into the ChIP-Chip protocol are a first step towards reducing false positive signals but there is certainly potential for further optimization. The discussion about the σ³² binding sites in question highlights the need for alternative approaches and further investigation of appropriate methods for verification.
23865838	 BACKGROUND: Identification of transcription factor binding sites (also called 'motif discovery') in DNA sequences is a basic step in understanding genetic regulation. Although many successful programs have been developed, the problem is far from being solved on account of diversity in gene expression/regulation and the low specificity of binding sites. State-of-the-art algorithms have their own constraints (e.g., high time or space complexity for finding long motifs, low precision in identification of weak motifs, or the OOPS constraint: one occurrence of the motif instance per sequence) which limit their scope of application. RESULTS: In this paper, we present a novel and fast algorithm we call TFBSGroup. It is based on community detection from a graph and is used to discover long and weak (l, d) motifs under the ZOMOPS constraint (zero, one or multiple occurrence(s) of the motif instance(s) per sequence), where l is the length of a motif and d is the maximum number of mutations between a motif instance and the motif itself. Firstly, TFBSGroup transforms the (l, d) motif search in sequences to focus on the discovery of dense subgraphs within a graph. It identifies these subgraphs using a fast community detection method for obtaining coarse-grained candidate motifs. Next, it greedily refines these candidate motifs towards the true motif within their own communities. Empirical studies on synthetic (l, d) samples have shown that TFBSGroup is very efficient (e.g., it can find true (18, 6), (24, 8) motifs within 30 seconds). More importantly, the algorithm has succeeded in rapidly identifying motifs in a large data set of prokaryotic promoters generated from the Escherichia coli database RegulonDB. The algorithm has also accurately identified motifs in ChIP-seq data sets for 12 mouse transcription factors involved in ES cell pluripotency and self-renewal. CONCLUSIONS: Our novel heuristic algorithm, TFBSGroup, is able to quickly identify nearly exact matches for long and weak (l, d) motifs in DNA sequences under the ZOMOPS constraint. It is also capable of finding motifs in real applications. The source code for TFBSGroup can be obtained from http://bioinformatics.bioengr.uic.edu/TFBSGroup/.
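To make the (l, d) motif definition in the abstract above concrete, here is a minimal Python sketch. It is not the TFBSGroup algorithm (it uses no graph or community detection): it assumes the motif is already known and simply scans one sequence for every window of length l that differs from the motif in at most d positions. The function names `hamming` and `find_instances` are hypothetical, chosen for illustration only.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_instances(sequence: str, motif: str, d: int) -> list[int]:
    """Start positions of all (l, d) instances of `motif` in `sequence`.

    l is len(motif); a window counts as an instance when it has at most
    d mismatches. Returning a list allows zero, one or multiple hits per
    sequence, which is what the ZOMOPS constraint permits.
    """
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if hamming(sequence[i:i + l], motif) <= d
    ]


if __name__ == "__main__":
    # Hypothetical toy input, not data from the paper.
    print(find_instances("ACGTACGA", "ACGT", 1))
```

This brute-force scan costs O(n·l) per sequence and only verifies instances of a given motif; discovering an unknown motif is the harder problem the abstract addresses.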
23717649	 Fis, one of the most important nucleoid-associated proteins, functions as a global regulator of transcription in bacteria and has been comprehensively studied in Escherichia coli K12. Fis also influences the virulence of Salmonella enterica and pathogenic E. coli by regulating their virulence genes; however, the relevant mechanism is unclear. In this report, using combined RNA-seq and chromatin immunoprecipitation (ChIP)-seq technologies, we first identified 1646 Fis-regulated genes and 885 Fis-binding targets in S. enterica serovar Typhimurium, and found a Fis regulon different from that in E. coli. Fis has been reported to contribute to the invasion ability of S. enterica. By using cell infection assays, we found that it also enhances the intracellular replication ability of S. enterica within macrophages, which is of central importance for the pathogenesis of infections. Salmonella pathogenicity islands (SPI)-1 and SPI-2 are crucial for the invasion and survival of S. enterica in host cells. Using mutation and overexpression experiments, real-time PCR analysis, and electrophoretic mobility shift assays, we demonstrated that Fis regulates 63 of the 94 SPI-1 and SPI-2 genes by three regulatory modes: i) it binds to SPI regulators in the gene body or in upstream regions; ii) it binds to SPI genes directly to mediate transcriptional activation of themselves and downstream genes; iii) it binds to the gene encoding OmpR, which affects SPI gene expression by controlling the SPI regulators SsrA and HilD. Our results provide new insights into the impact of Fis on SPI genes and the pathogenicity of S. enterica.
23580539	 Accurate identification of the DNA-binding sites of transcription factors and other DNA-binding proteins on the genome is crucial to understanding their molecular interactions with DNA. Here, we describe a new method: Genome Footprinting by high-throughput sequencing (GeF-seq), which combines in vivo DNase I digestion of genomic DNA with ChIP coupled with high-throughput sequencing. We have determined the in vivo binding sites of a Bacillus subtilis global regulator, AbrB, using GeF-seq. This method shows that exact DNA-binding sequences, which were protected from in vivo DNase I digestion, were resolved at  a comparable resolution to that achieved by in vitro DNase I footprinting, and this was simply attained without the necessity of prediction by peak-calling programs. Moreover, DNase I digestion of the bacterial nucleoid resolved the closely positioned AbrB-binding sites, which had previously appeared as one peak  in ChAP-chip and ChAP-seq experiments. The high-resolution determination of AbrB-binding sites using GeF-seq enabled us to identify bipartite TGGNA motifs in 96% of the AbrB-binding sites. Interestingly, in a thousand binding sites with very low-binding intensities, single TGGNA motifs were also identified. Thus, GeF-seq is a powerful method to elucidate the molecular mechanism of target protein binding to its cognate DNA sequences.
23511241	 As the first, and usually rate-limiting, step of transcription initiation, bacterial RNA polymerase (RNAP) binds to double stranded DNA (dsDNA) and subsequently opens the two strands of DNA (the open complex formation). The rate  determining step in the open complex formation is opening of a short (6 bp) DNA called the -10 region, which interacts with RNAP in both dsDNA and single stranded (ssDNA) forms. Accordingly, formation of the open complex depends on (physically independent) domains of RNAP that interact with ssDNA and dsDNA, as well as on parameters of DNA melting and sequences of -10 regions. We here aim to understand how these different interactions are mutually related to ensure efficient open complex formation. To achieve this, we use a recently developed biophysical model of transcription initiation, which allows the calculation of the kinetic parameters of transcription initiation on the scale of whole genome.  We consequently investigate kinetic properties of sequences derived from all E. coli intergenic regions, and from more than 300 experimentally confirmed E. coli  σ(70) promoters. We find that interaction specificities of σ(70) DNA binding domains reduce the number of sequences where RNAP binds strongly, but forms the open complex too slowly to achieve functional transcription (so-called poised promoters). However, we find that, despite this reduction, there is still a significant number of such poised promoters in the intergenic regions, which may  provide a major source of false positives in genome-wide searches of transcription start sites. Furthermore, we surprisingly find that sequences of -10 regions of the functional promoters increase the extent of RNAP poising, which we interpret in terms of an extension of a recently proposed model of promoter recognition ('mix-and-match model') to kinetic parameters. 
Overall, our results allow a better understanding of the design of σ(70) DNA binding domains and promoter sequences, and place a fundamental limit on the accuracy of methods for promoter detection that are based on strong RNAP binding (e.g. ChIP-chip).
23470992	 Salmonella Typhi and Typhimurium diverged only ∼50 000 years ago, yet have very different host ranges and pathogenicity. Despite the availability of multiple whole-genome sequences, the genetic differences that have driven these changes in phenotype are only beginning to be understood. In this study, we use transposon-directed insertion-site sequencing to probe differences in gene requirements for competitive growth in rich media between these two closely related serovars. We identify a conserved core of 281 genes that are required for growth in both serovars, 228 of which are essential in Escherichia coli. We are able to identify active prophage elements through the requirement for their repressors. We also find distinct differences in requirements for genes involved  in cell surface structure biogenesis and iron utilization. Finally, we demonstrate that transposon-directed insertion-site sequencing is not only applicable to the protein-coding content of the cell but also has sufficient resolution to generate hypotheses regarding the functions of non-coding RNAs (ncRNAs) as well. We are able to assign probable functions to a number of cis-regulatory ncRNA elements, as well as to infer likely differences in trans-acting ncRNA regulatory networks.
23275538	 Nanobodies® are single-domain antibody fragments derived from camelid heavy-chain antibodies. Because of their small size, straightforward production in Escherichia coli, easy tailoring, high affinity, specificity, stability and solubility, nanobodies® have been exploited in various biotechnological applications. A major challenge in the post-genomics and post-proteomics era is the identification of regulatory networks involving nucleic acid-protein and protein-protein interactions. Here, we apply a nanobody® in chromatin immunoprecipitation followed by DNA microarray hybridization (ChIP-chip) for genome-wide identification of DNA-protein interactions. The Lrp-like regulator Ss-LrpB, arguably one of the best-studied specific transcription factors of the hyperthermophilic archaeon Sulfolobus solfataricus, was chosen for this proof-of-principle nanobody®-assisted ChIP. Three distinct Ss-LrpB-specific nanobodies®, each interacting with a different epitope, were generated for ChIP.  Genome-wide ChIP-chip with one of these nanobodies® identified the well-established Ss-LrpB binding sites and revealed several unknown target sequences. Furthermore, these ChIP-chip profiles revealed auxiliary operator sites in the open reading frame of Ss-lrpB. Our work introduces nanobodies® as a  novel class of affinity reagents for ChIP. Taking into account the unique characteristics of nanobodies®, in particular, their short generation time, nanobody®-based ChIP is expected to further streamline ChIP-chip and ChIP-Seq experiments, especially in organisms with no (or limited) possibility of genetic  manipulation.
23232715	 Cyclic AMP receptor protein (Crp) is a transcription regulator controlling diverse cellular processes in many bacteria. In Streptomyces coelicolor, it is well established that Crp plays a critical role in spore germination and colony development. Here, we demonstrate that Crp is a key regulator of secondary metabolism and antibiotic production in S. coelicolor and show that it may additionally coordinate precursor flux from primary to secondary metabolism. We found that crp deletion adversely affected the synthesis of three well-characterized antibiotics in S. coelicolor: actinorhodin (Act), undecylprodigiosin (Red), and calcium-dependent antibiotic (CDA). Using chromatin immunoprecipitation-microarray (ChIP-chip) assays, we determined that eight (out  of 22) secondary metabolic clusters encoded by S. coelicolor contained Crp-associated sites. We followed the effect of Crp induction using transcription profiling analyses and found secondary metabolic genes to be significantly affected: included in this Crp-dependent group were genes from six of the clusters identified in the ChIP-chip experiments. Overexpressing Crp in a panel of Streptomyces species led to enhanced antibiotic synthesis and new metabolite production, suggesting that Crp control over secondary metabolism is broadly conserved in the streptomycetes and that Crp overexpression could serve as a powerful tool for unlocking the chemical potential of these organisms. IMPORTANCE Streptomyces produces a remarkably diverse array of secondary metabolites, including many antibiotics. In recent years, genome sequencing has revealed that  these products represent only a small proportion of the total secondary metabolite potential of Streptomyces. There is, therefore, considerable interest  in discovering ways to stimulate the production of new metabolites. 
Here, we show that Crp (the classical regulator of carbon catabolite repression in Escherichia  coli) is a master regulator of secondary metabolism in Streptomyces. It binds to  eight of 22 secondary metabolic gene clusters in the Streptomyces coelicolor genome and directly affects the expression of six of these. Deletion of crp in S. coelicolor leads to dramatic reductions in antibiotic levels, while Crp overexpression enhances antibiotic production. We find that the antibiotic-stimulatory capacity of Crp extends to other streptomycetes, where its overexpression activates the production of "cryptic" metabolites that are not otherwise seen in the corresponding wild-type strain.
23190111	 OmpR is a multifunctional DNA binding regulator with orthologues in many enteric  bacteria that exhibits classical regulator activity as well as nucleoid-associated protein-like characteristics. In the enteric pathogen Salmonella enterica, using chromatin immunoprecipitation of OmpR:FLAG and nucleotide sequencing, 43 putative OmpR binding sites were identified in S. enterica serovar Typhi, 22 of which were associated with OmpR-regulated genes. Mutation of a sequence motif (TGTWACAW) that was associated with the putative OmpR binding sites abrogated binding of OmpR:6×His to the tviA upstream region. A core set of 31 orthologous genes were found to exhibit OmpR-dependent expression  in both S. Typhi and S. Typhimurium. S. Typhimurium-encoded orthologues of two divergently transcribed OmpR-regulated operons (SL1068-71 and SL1066-67) had a putative OmpR binding site in the inter-operon region in S. Typhi, and were characterized using in vitro and in vivo assays. These operons are widely distributed within S. enterica but absent from the closely related Escherichia coli. SL1066 and SL1067 were required for growth on N-acetylmuramic acid as a sole carbon source. SL1068-71 exhibited sequence similarity to sialic acid uptake systems and contributed to colonization of the ileum and caecum in the streptomycin-pretreated mouse model of colitis.
22923524	 Typical approaches for predicting transcription factor binding sites (TFBSs) involve use of a position-specific weight matrix (PWM) to statistically characterize the sequences of the known sites. Recently, an alternative physicochemical approach, called SiteSleuth, was proposed. In this approach, a linear support vector machine (SVM) classifier is trained to distinguish TFBSs from background sequences based on local chemical and structural features of DNA. SiteSleuth appears to generally perform better than PWM-based methods. Here, we improve the SiteSleuth approach by considering both new physicochemical features  and algorithmic modifications. New features are derived from Gibbs energies of amino acid-DNA interactions and hydroxyl radical cleavage profiles of DNA. Algorithmic modifications consist of inclusion of a feature selection step, use of a nonlinear kernel in the SVM classifier, and use of a consensus-based post-processing step for predictions. We also considered SVM classification based on letter features alone to distinguish performance gains from use of SVM-based models versus use of physicochemical features. The accuracy of each of the variant methods considered was assessed by cross validation using data available  in the RegulonDB database for 54 Escherichia coli TFs, as well as by experimental validation using published ChIP-chip data available for Fis and Lrp.
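The PWM baseline mentioned above can be sketched in a few lines of Python. This is an illustrative log-odds matrix, not the SiteSleuth method and not any specific published PWM tool; the function names, the pseudocount of 1, the uniform background of 0.25, and the example sites are all assumptions made for the sketch.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds position weight matrix from aligned known sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum of per-position log-odds scores for a window of matching width."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


if __name__ == "__main__":
    # Hypothetical aligned sites, used only to exercise the functions.
    pwm = build_pwm(["TGTAACAT", "TGTTACAA", "TGTAACAA"])
    print(best_hit(pwm, "GGGGTGTAACAAGGGG"))
```

A classifier-based approach such as the one described in the abstract would replace the per-position sum with a learned decision function over physicochemical features; the scanning loop in `best_hit` would stay essentially the same.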
22890136	 Two transcription termination mechanisms - intrinsic and Rho-dependent - have evolved in bacteria. The Rho factor occurs in most bacterial lineages, and has been hypothesized to play a global regulatory role. Genome-wide studies using microarray, 2D-gel electrophoresis and ChIP-chip provided evidence that Rho serves to silence transcription from horizontally acquired genes and prophages in Escherichia coli K-12, implicating the factor to be a part of the "cellular immune mechanism" protecting against deleterious phages and aberrant gene expression from acquired xenogenic DNA. We have investigated this model by adopting an alternate in silico approach and have extended the study to other species. Our analysis shows that several genomic islands across diverse phyla have under-representation of intrinsic terminators, similar to that experimentally observed in E. coli K-12. This implies that Rho-dependent termination is the predominant process operational in these islands and that silencing of foreign DNA is a conserved function of Rho. From the present analysis, it is evident that horizontally acquired islands have lost intrinsic terminators to facilitate Rho-dependent termination. These results underscore the importance of Rho as a conserved, genome-wide sentinel that regulates potentially toxic xenogenic DNA.
22555467	 Signature tagged mutagenesis is a genetic approach that was developed to identify novel bacterial virulence factors. It is a negative selection method in which unique identification tags allow analysis of pools of mutants in mixed populations. The approach is particularly well suited to functional genetic analysis of the gastrointestinal phase of infection in foodborne pathogens and has the capacity to guide the development of novel vaccines and therapeutics. In  this review we outline the technical principles underpinning signature-tagged mutagenesis as well as novel sequencing-based approaches for transposon mutant identification such as TraDIS (transposon directed insertion-site sequencing). We also provide an analysis of screens that have been performed in gastrointestinal  pathogens which are a global health concern (Escherichia coli, Listeria monocytogenes, Helicobacter pylori, Vibrio cholerae and Salmonella enterica). The identification of key virulence loci through the use of signature tagged mutagenesis in mice and relevant larger animal models is discussed.
22207717	 Gene expression is tightly regulated by transcription factors and cofactors that function by directly or indirectly interacting with DNA of the genome. Understanding how and where these proteins bind provides essential information to uncover genetic regulatory mechanisms. We have developed a new method to study DNA-protein interaction in vivo called DNA adenine methyltransferase (Dam)IP, which is based on fusing a protein of interest to a mutant form of Dam from Escherichia coli. We showed previously that DamIP can efficiently identify in vivo binding sites of Dam-tethered human estrogen receptor (hER)α. In the current study, we present the cistrome of hERα determined by DamIP and high throughput sequencing (DamIP-seq). The DamIP-seq-defined hERα cistrome identifies many new binding regions and overlaps with those determined by chromatin immunoprecipitation (ChIP)-chip or ChIP-seq. Elements uniquely identified by DamIP-seq include a class of elements that show low, but persistent, hERα binding when reexamined by conventional ChIP. In contrast, DamIP-seq fails to detect some elements with very transient hERα binding. The methyl-adenine modifications introduced by Dam are stable and do not decrease over 12 d. In summary, the current study provides both an alternate view of the hERα cistrome to further understand the mechanism of hERα-mediated transcription and a new tool to explore other transcriptional factors and cofactors that is very different from conventional ChIP.
21515770	 Bacterial Gre factors associate with RNA polymerase (RNAP) and stimulate intrinsic cleavage of the nascent transcript at the active site of RNAP. Biochemical and genetic studies to date have shown that Escherichia coli Gre factors prevent transcriptional arrest during elongation and enhance transcription fidelity. Furthermore, Gre factors participate in the stimulation of promoter escape and the suppression of promoter-proximal pausing during the beginning of RNA synthesis in E. coli. Although Gre factors are conserved in general bacteria, limited functional studies have been performed in bacteria other than E. coli. In this investigation, ChAP-chip analysis (chromatin affinity precipitation coupled with DNA microarray) was conducted to visualize the distribution of Bacillus subtilis GreA on the chromosome and to determine the effects of GreA inactivation on core RNAP trafficking. Our data show that GreA is uniformly distributed in the transcribed region from the promoter to coding region with core RNAP, and its inactivation induces RNAP accumulation at many promoter or promoter-proximal regions. Based on these findings, we propose that GreA would constantly associate with core RNAP during transcriptional initiation  and elongation and resolves its stalling at promoter or promoter-proximal regions, thus contributing to the even distribution of RNAP along the promoter and coding regions in B. subtilis cells.
21278291	 Massively parallel sequencing of transposon-flanking regions assigned the genotype and fitness score to 91% of Escherichia coli O157:H7 mutants previously screened in cattle by signature-tagged mutagenesis (STM). The method obviates the limitations of STM and markedly extends the functional annotation of the prototype E. coli O157:H7 genome without further animal use.
21124945	 An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF). Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in  which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider  each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the  TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and  a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm.  The main advantage of SiteSleuth is a lower false positive rate.
21051353	 Immuno-precipitation of protein-DNA complexes followed by microarray hybridization is a powerful and cost-effective technology for discovering protein-DNA binding events at the genome scale. It is still an unresolved challenge to comprehensively, accurately and sensitively extract binding event information from the produced data. We have developed a novel strategy composed of an information-preserving signal-smoothing procedure, higher order derivative  analysis and application of the principle of maximum entropy to address this challenge. Importantly, our method does not require any input parameters to be specified by the user. Using genome-scale binding data of two Escherichia coli global transcription regulators for which a relatively large number of experimentally supported sites are known, we show that ∼90% of known sites were resolved to within four probes, or ∼88 bp. Over half of the sites were resolved to within two probes, or ∼38 bp. Furthermore, we demonstrate that our strategy delivers significant quantitative and qualitative performance gains over available methods. Such accurate and sensitive binding site resolution has important consequences for accurately reconstructing transcriptional regulatory networks, for motif discovery, for furthering our understanding of local and non-local factors in protein-DNA interactions and for extending the usefulness horizon of the ChIP-chip platform.
20817769	 To obtain insight into the in vivo dynamics of RNA polymerase (RNAP) on the Bacillus subtilis genome, we analyzed the distribution of the σ(A) and β' subunits of RNAP and the NusA elongation factor on the genome in exponentially growing cells using chromatin affinity precipitation coupled with gene chip mapping (ChAP-chip). In contrast to Escherichia coli RNAP, which often accumulates at the promoter-proximal region, B. subtilis RNAP is evenly distributed from the promoter to the coding sequences. This finding suggests that, in general, B. subtilis RNAP recruited to the promoter promptly translocates away from the promoter to form the elongation complex and proceeds without intragenic transcription attenuation. We detected RNAP accumulation in the promoter-proximal regions of some genes, most of which can be identified as transcription attenuation systems in the leader region. Our findings suggest that the differences in RNAP behavior between E. coli and B. subtilis during initiation and elongation steps might result in distinct strategies for postinitiation control of transcription. The E. coli mechanism involves trapping at the promoter and promoter-proximal pausing of RNAP in addition to transcription attenuation, whereas transcription attenuation in leader sequences is mainly employed in B. subtilis.
20639326	 Histone-like protein H1 (H-NS) family proteins are nucleoid-associated proteins (NAPs) conserved among many bacterial species. The IncP-7 plasmid pCAR1 is transmissible among various Pseudomonas strains and carries a gene encoding the H-NS family protein, Pmr. Pseudomonas putida KT2440 is a host of pCAR1, which harbors five genes encoding the H-NS family proteins PP_1366 (TurA), PP_3765 (TurB), PP_0017 (TurC), PP_3693 (TurD), and PP_2947 (TurE). Quantitative reverse  transcription-PCR (qRT-PCR) demonstrated that the presence of pCAR1 does not affect the transcription of these five genes and that only pmr, turA, and turB were primarily transcribed in KT2440(pCAR1). In vitro pull-down assays revealed that Pmr strongly interacted with itself and with TurA, TurB, and TurE. Transcriptome comparisons of the pmr disruptant, KT2440, and KT2440(pCAR1) strains indicated that pmr disruption had greater effects on the host transcriptome than did pCAR1 carriage. The transcriptional levels of some genes that increased with pCAR1 carriage, such as the mexEF-oprN efflux pump genes and  parI, reverted with pmr disruption to levels in pCAR1-free KT2440. Transcriptional levels of putative horizontally acquired host genes were not altered by pCAR1 carriage but were altered by pmr disruption. Identification of genome-wide Pmr binding sites by ChAP-chip (chromatin affinity purification coupled with high-density tiling chip) analysis demonstrated that Pmr preferentially binds to horizontally acquired DNA regions. The Pmr binding sites  overlapped well with the location of the genes differentially transcribed following pmr disruption on both the plasmid and the chromosome. Our findings indicate that Pmr is a key factor in optimizing gene transcription on pCAR1 and the host chromosome.
20460455	 Deregulation of the Wnt/β-catenin signaling pathway is a hallmark of colon cancer. Mutations in the adenomatous polyposis coli (APC) gene occur in the vast  majority of colorectal cancers and are an initiating event in cellular transformation. Cells harboring mutant APC contain elevated levels of the β-catenin transcription coactivator in the nucleus which leads to abnormal expression of genes controlled by β-catenin/T-cell factor 4 (TCF4) complexes. Here, we use chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) to identify β-catenin binding regions in HCT116 human colon cancer cells. We localized 2168 β-catenin enriched regions using a concordance approach for integrating the output from multiple peak alignment algorithms. Motif discovery algorithms found a core TCF4 motif (T/A-T/A-C-A-A-A-G), an extended TCF4 motif (A/T/G-C/G-T/A-T/A-C-A-A-A-G) and an AP-1 motif (T-G-A-C/T-T-C-A) to be significantly represented in β-catenin enriched regions.  Furthermore, 417 regions contained both TCF4 and AP-1 motifs. Genes associated with TCF4 and AP-1 motifs bound β-catenin, TCF4 and c-Jun in vivo and were activated by Wnt signaling and serum growth factors. Our work provides evidence that Wnt/β-catenin and mitogen signaling pathways intersect directly to regulate  a defined set of target genes.
19843227	 StpA is a paralogue of the nucleoid-associated protein H-NS that is conserved in  a range of enteric bacteria and had no known function in Salmonella Typhimurium.  We show that 5% of the Salmonella genome is regulated by StpA, which contrasts with the situation in Escherichia coli where deletion of stpA only had minor effects on gene expression. The StpA-dependent genes of S. Typhimurium are a specific subset of the H-NS regulon that are predominantly under the positive control of sigma(38) (RpoS), CRP-cAMP and PhoP. Regulation by StpA varied with growth phase; StpA controlled sigma(38) levels at mid-exponential phase by preventing inappropriate activation of sigma(38) during rapid bacterial growth. In contrast, StpA only activated the CRP-cAMP regulon during late exponential phase. ChIP-chip analysis revealed that StpA binds to PhoP-dependent genes but not to most genes of the CRP-cAMP and sigma(38) regulons. In fact, StpA indirectly regulates sigma(38)-dependent genes by enhancing sigma(38) turnover by repressing the anti-adaptor protein rssC. We discovered that StpA is essential for the dynamic regulation of sigma(38) in response to increased glucose levels.  Our findings identify StpA as a novel growth phase-specific regulator that plays  an important physiological role by linking sigma(38) levels to nutrient availability.
18974181	 EcoCyc (http://EcoCyc.org) provides a comprehensive encyclopedia of Escherichia coli biology. EcoCyc integrates information about the genome, genes and gene products; the metabolic network; and the regulatory network of E. coli. Recent EcoCyc developments include a new initiative to represent and curate all types of E. coli regulatory processes such as attenuation and regulation by small RNAs. EcoCyc has started to curate Gene Ontology (GO) terms for E. coli and has made a  dataset of E. coli GO terms available through the GO Web site. The curation and visualization of electron transfer processes has been significantly improved. Other software and Web site enhancements include the addition of tracks to the EcoCyc genome browser, in particular a type of track designed for the display of  ChIP-chip datasets, and the development of a comparative genome browser. A new Genome Omics Viewer enables users to paint omics datasets onto the full E. coli genome for analysis. A new advanced query page guides users in interactively constructing complex database queries against EcoCyc. A Macintosh version of EcoCyc is now available. A series of Webinars is available to instruct users in the use of EcoCyc.
18697768	 MOTIVATION: Locating transcription factor binding sites (motifs) is a key step in understanding gene regulation. Based on Tompa's benchmark study, the performance of current de novo motif finders is far from satisfactory (with sensitivity ≤0.222 and precision ≤0.307). The same study also shows that no motif finder performs consistently well over all datasets. Hence, it is not clear which finder one should use for a given dataset. To address this issue, a class of algorithms called ensemble methods has been proposed. Though the existing ensemble methods overall perform better than stand-alone motif finders, the improvement gained is not substantial. Our study reveals that these methods do not fully exploit the information obtained from the results of individual finders, resulting in minor improvement in sensitivity and poor precision. RESULTS: In this article, we identify several key observations on how to utilize the results from individual finders and design a novel ensemble method, MotifVoter, to predict the motifs and binding sites. Evaluations on 186 datasets show that MotifVoter can locate more than 95% of the binding sites found by its component motif finders. In terms of sensitivity and precision, MotifVoter outperforms stand-alone motif finders and ensemble methods significantly on Tompa's benchmark, Escherichia coli, and ChIP-Chip datasets. MotifVoter is available online via a web server with several biologist-friendly features.
18460200	 BACKGROUND: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. RESULTS: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. CONCLUSION: Our approach requires neither accurate expression levels nor time series.
Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data are noisy.
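The idea of inferring inducer or repressor roles from signs of variation can be illustrated with a deliberately simplified Python sketch. It is not the authors' algorithm: it assumes one particular reading of the consistency rule (the variation of a target must be explained by at least one regulator, i.e. sign(target) = sign(regulator) × sign(edge)), and it only handles the easiest case, a target with a single regulator, where the edge sign is then forced. The function name, data layout, and gene names are hypothetical.

```python
def infer_edge_signs(edges, profiles):
    """Infer edge signs for targets that have exactly one regulator.

    edges:    iterable of (regulator, target) pairs.
    profiles: list of dicts mapping gene -> +1 (up) or -1 (down).
    Returns (inferred, conflicts): inferred maps an edge to +1 (inducer) or
    -1 (repressor); conflicts holds edges whose profiles disagree.
    """
    regulators = {}
    for src, tgt in edges:
        regulators.setdefault(tgt, []).append(src)

    inferred, conflicts = {}, set()
    for tgt, srcs in regulators.items():
        if len(srcs) != 1:
            continue  # several regulators: the sign is not forced in this sketch
        edge = (srcs[0], tgt)
        for profile in profiles:
            if edge[0] not in profile or tgt not in profile:
                continue  # unobserved in this experiment
            sign = profile[edge[0]] * profile[tgt]
            if inferred.setdefault(edge, sign) != sign:
                conflicts.add(edge)
    for edge in conflicts:
        del inferred[edge]
    return inferred, conflicts


if __name__ == "__main__":
    edges = [("tfA", "geneB"), ("tfA", "geneC"), ("tfD", "geneC")]
    profiles = [
        {"tfA": 1, "geneB": -1, "geneC": 1},
        {"tfA": -1, "geneB": 1},
    ]
    print(infer_edge_signs(edges, profiles))
```

Edges reported in `conflicts` correspond to the kind of inconsistency between model and data that the abstract mentions; targets with several regulators would need a constraint-solving step that this sketch omits.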