29339415	 Escherichia coli K1 strains are major causative agents of invasive disease of newborn infants. The age dependency of infection can be reproduced in neonatal rats. Colonization of the small intestine following oral administration of K1 bacteria leads rapidly to invasion of the blood circulation; bacteria that avoid capture by the mesenteric lymphatic system and evade antibacterial mechanisms in the blood may disseminate to cause organ-specific infections such as meningitis. Some E. coli K1 surface constituents, in particular the polysialic acid capsule, are known to contribute to invasive potential, but a comprehensive picture of the factors that determine the fully virulent phenotype has not emerged so far. We constructed a library and constituent sublibraries of ∼775,000 Tn5 transposon mutants of E. coli K1 strain A192PP and employed transposon-directed insertion site sequencing (TraDIS) to identify genes required for fitness for infection of 2-day-old rats. Transposon insertions were lacking in 357 genes following recovery on selective agar; these genes were considered essential for growth in nutrient-replete medium. Colonization of the midsection of the small intestine was facilitated by 167 E. coli K1 gene products. Restricted bacterial translocation across epithelial barriers precluded TraDIS analysis of gut-to-blood and blood-to-brain transits; 97 genes were required for survival in human serum. This study revealed that a large number of bacterial genes, many of which were not previously associated with systemic E. coli K1 infection, are required to realize full invasive potential. IMPORTANCE: Escherichia coli K1 strains cause life-threatening infections in newborn infants. They are acquired from the mother at birth and colonize the small intestine, from where they invade the blood and central nervous system. It is difficult to obtain information from acutely ill patients that sheds light on physiological and bacterial factors determining invasive disease. Key aspects of naturally occurring age-dependent human infection can be reproduced in neonatal rats. Here, we employ transposon-directed insertion site sequencing to identify genes essential for the in vitro growth of E. coli K1 and genes that contribute to the colonization of susceptible rats. The presence of bottlenecks to invasion of the blood and cerebrospinal compartments precluded insertion site sequencing analysis, but we identified genes for survival in serum. 
28791299	 Increasing evidence that microRNAs (miRNAs) play important roles in the immune response against infectious agents suggests that miRNA might be exploitable as signatures of exposure to specific infectious agents. In order to identify potential early miRNA biomarkers of bacterial infections, human peripheral blood  mononuclear cells (hPBMCs) were exposed to two select agents, Burkholderia pseudomallei K96243 and Francisella tularensis SHU S4, as well as to the nonpathogenic control Escherichia coli DH5α. RNA samples were harvested at three  early time points, 30, 60, and 120 minutes postexposure, then sequenced. RNAseq analyses identified 87 miRNAs to be differentially expressed (DE) in a linear fashion. Of these, 31 miRNAs were tested using the miScript miRNA qPCR assay. Through RNAseq identification and qPCR validation, we identified differentially expressed miRNA species that may be involved in the early response to bacterial infections. Based upon its upregulation at early time points postexposure in two  different individuals, hsa-mir-30c-5p is a miRNA species that could be studied further as a potential biomarker for exposure to these gram-negative intracellular pathogens. Gene ontology functional analyses demonstrated that programmed cell death is the first ranking biological process associated with miRNAs that are upregulated in F. tularensis-exposed hPBMCs. 
28649444	 Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. Several techniques are currently used to experimentally assay transcription factor-to-target relationships, providing important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network reconstruction using network and transcriptomic data), we show that the formalism used to build FRANK can to some extent be a reasonable model for gene regulatory networks in real cells. 
28614372	 Infection with Shiga toxin (Stx) producing Escherichia coli O157:H7 can cause the potentially fatal complication hemolytic uremic syndrome, and currently only supportive therapy is available. Lack of suitable animal models has hindered study of this disease. Induced human intestinal organoids (iHIOs), generated by in vitro differentiation of pluripotent stem cells, represent differentiated human intestinal tissue. We show that iHIOs with addition of human neutrophils can model E. coli intestinal infection and innate cellular responses. Commensal and O157:H7 introduced into the iHIO lumen replicated rapidly achieving high numbers. Commensal E. coli did not cause damage, and were completely contained within the lumen, suggesting defenses, such as mucus production, can constrain non-pathogenic strains. Some O157:H7 initially co-localized with cellular actin.  Loss of actin and epithelial integrity was observed after 4 hours. O157:H7 grew as filaments, consistent with activation of the bacterial SOS stress response. SOS is induced by reactive oxygen species (ROS), and O157:H7 infection increased  ROS production. Transcriptional profiling (RNAseq) demonstrated that both commensal and O157:H7 upregulated genes associated with gastrointestinal maturation, while infection with O157:H7 upregulated inflammatory responses, including interleukin 8 (IL-8). IL-8 is associated with neutrophil recruitment, and infection with O157:H7 resulted in recruitment of human neutrophils into the  iHIO tissue. 
28270101	 BACKGROUND: Avian pathogenic E. coli (APEC) can lead to a loss in millions of dollars in poultry annually because of mortality and produce contamination. Studies have verified that many immune-related genes undergo changes in alternative splicing (AS), along with nonsense mediated decay (NMD), to regulate  the immune system under different conditions. Therefore, the splicing profiles of primary lymphoid tissues with systemic APEC infection need to be comprehensively  examined. RESULTS: Gene expression in RNAseq data were obtained for three different immune  tissues (bone marrow, thymus, and bursa) from three phenotype birds (non-challenged, resistant, and susceptible birds) at two time points. Alternative 5' splice sites and exon skipping/inclusion were identified as the major alternative splicing events in avian primary immune organs under systemic APEC infection. In this study, we detected hundreds of differentially-expressed-transcript-containing genes (DETs) between different phenotype birds at 5 days post-infection (dpi). DETs, PSAP and STT3A, with NMD have important functions under systemic APEC infection. DETs, CDC45, CDK1, RAG2,  POLR1B, PSAP, and DNASE1L3, from the same transcription start sites (TSS) indicate that cell death, cell cycle, cellular function, and maintenance were predominant in host under systemic APEC. CONCLUSIONS: With the use of RNAseq technology and bioinformatics tools, this study provides a portrait of the AS event and NMD in primary lymphoid tissues, which play critical roles in host homeostasis under systemic APEC infection. According to this study, AS plays a pivotal regulatory role in the immune response in chicken under systemic APEC infection via either NMD or alternative TSSs. This study elucidates the regulatory role of AS for the immune complex under systemic APEC infection. 
28060822	 Mosquitoes host communities of microbes in their digestive tract that consist primarily of bacteria. We previously reported that Aedes aegypti larvae colonized by a native community of bacteria and gnotobiotic larvae colonized by only Escherichia coli develop very similarly into adults, whereas axenic larvae never  molt and die as first instars. In this study, we extended these findings by first comparing the growth and abundance of bacteria in conventional, gnotobiotic, and  axenic larvae during the first instar. Results showed that conventional and gnotobiotic larvae exhibited no differences in growth, timing of molting, or number of bacteria in their digestive tract. Axenic larvae in contrast grew minimally and never achieved the critical size associated with molting by conventional and gnotobiotic larvae. In the second part of the study we compared  patterns of gene expression in conventional, gnotobiotic and axenic larvae by conducting an RNAseq analysis of gut and nongut tissues (carcass) at 22 h post-hatching. Approximately 12% of Ae. aegypti transcripts were differentially expressed in axenic versus conventional or gnotobiotic larvae. However, this profile consisted primarily of transcripts in seven categories that included the  down-regulation of select peptidases in the gut and up-regulation of several genes in the gut and carcass with roles in amino acid transport, hormonal signaling, and metabolism. Overall, our results indicate that axenic larvae exhibit alterations in gene expression consistent with defects in acquisition and assimilation of nutrients required for growth. 
27872077	 Plasmids of incompatibility group A/C (IncA/C) are becoming increasingly prevalent within pathogenic Enterobacteriaceae. They are associated with the dissemination of multiple clinically relevant resistance genes, including blaCMY and blaNDM. Current typing methods for IncA/C plasmids offer limited resolution. In this study, we present the complete sequence of a blaNDM-1-positive IncA/C plasmid, pMS6198A, isolated from a multidrug-resistant uropathogenic Escherichia coli strain. Hypersaturated transposon mutagenesis, coupled with transposon-directed insertion site sequencing (TraDIS), was employed to identify conserved genetic elements required for replication and maintenance of pMS6198A. Our analysis of TraDIS data identified roles for the replicon, including repA, a toxin-antitoxin system; two putative partitioning genes, parAB; and a putative gene, 053. Construction of mini-IncA/C plasmids and examination of their stability within E. coli confirmed that the region encompassing 053 contributes to the stable maintenance of IncA/C plasmids. Subsequently, the four major maintenance genes (repA, parAB, and 053) were used to construct a new plasmid multilocus sequence typing (PMLST) scheme for IncA/C plasmids. Application of this scheme to a database of 82 IncA/C plasmids identified 11 unique sequence types (STs), with two dominant STs. The majority of blaNDM-positive plasmids examined (15/17; 88%) fall into ST1, suggesting acquisition and subsequent expansion of this blaNDM-containing plasmid lineage. The IncA/C PMLST scheme represents a standardized tool to identify, track, and analyze the dissemination of important IncA/C plasmid lineages, particularly in the context of epidemiological studies. 
27836995	 RNA sequencing studies have identified hundreds of non-coding RNAs in bacteria, including regulatory small RNA (sRNA). However, our understanding of sRNA function has lagged behind their identification due to a lack of tools for the high-throughput analysis of RNA-RNA interactions in bacteria. Here we demonstrate that in vivo sRNA-mRNA duplexes can be recovered using UV-crosslinking, ligation and sequencing of hybrids (CLASH). Many sRNAs recruit the endoribonuclease, RNase E, to facilitate processing of mRNAs. We were able to recover base-paired sRNA-mRNA duplexes in association with RNase E, allowing proximity-dependent ligation and sequencing of cognate sRNA-mRNA pairs as chimeric reads. We verified that this approach captures bona fide sRNA-mRNA interactions. Clustering analyses identified novel sRNA seed regions and sets of potentially co-regulated target mRNAs. We identified multiple mRNA targets for the pathotype-specific sRNA Esr41, which was shown to regulate colicin sensitivity and iron transport in E. coli. Numerous sRNA interactions were also identified with non-coding RNAs, including sRNAs and tRNAs, demonstrating the high complexity of the sRNA interactome. 
27466434	 Avian pathogenic Escherichia coli (APEC) can cause significant morbidity in chickens. The thymus provides the essential environment for T cell development; however, the thymus transcriptome has not been examined for gene expression in response to APEC infection. An improved understanding of the host genomic response to APEC infection could inform future breeding programs for disease resistance and APEC control. We therefore analyzed the transcriptome of the thymus of birds challenged with APEC, contrasting susceptible and resistant phenotypes. Thousands of genes were differentially expressed in birds of the 5-day post infection (dpi) challenged-susceptible group vs. 5 dpi non-challenged, in 5 dpi challenged-susceptible vs. 5 dpi challenged-resistant birds, as well as  in 5 dpi vs. one dpi challenged-susceptible birds. The Toll-like receptor signaling pathway was the major innate immune response for birds to respond to APEC infection. Moreover, lysosome and cell adhesion molecules pathways were common mechanisms for chicken response to APEC infection. The T-cell receptor signaling pathway, cell cycle, and p53 signaling pathways were significantly activated in resistant birds to resist APEC infection. These results provide a comprehensive assessment of global gene networks and biological functionalities of differentially expressed genes in the thymus under APEC infection. These findings provide novel insights into key molecular genetic mechanisms that differentiate host resistance from susceptibility in this primary lymphoid tissue, the thymus. 
27424527	 Thermobifida fusca is a thermophilic actinobacterium. T. fusca muC obtained by adaptive evolution preferred yeast extract to ammonium sulfate for accumulating malic acid and ammonium sulfate for cell growth. We did transcriptome analysis of T. fusca muC on Avicel and cellobiose with addition of ammonium sulfate or yeast  extract, respectively by RNAseq. The transcriptional results indicate that ammonium sulfate induced the transcriptions of the genes related to carbohydrate  metabolisms significantly more than yeast extract. Importantly, Tfu_2487, encoding histidine-containing protein (HPr), didn't transcribe on yeast extract at all, while it transcribed highly on ammonium sulfate. In order to understand the impact of HPr on malate production and cell growth of the muC strain, we deleted Tfu_2487 to get a mutant strain: muCΔ2487, which had 1.33 mole/mole-glucose equivalent malate yield, much higher than that on yeast extract. We then developed an E. coli-T. fusca shuttle plasmid for over-expressing HPr in muCΔ2487, a strain without HPr background, forming the muCΔ2487S strain. The muCΔ2487S strain had a much lower malate yield but faster cell growth than the muC strain. The results of both mutant strains confirmed that HPr was the key regulatory protein for T. fusca's metabolisms on nitrogen sources. 
27336699	 Our objective was to identify the biological response and the cross-talk between  liver and mammary tissue after intramammary infection (IMI) with Escherichia coli (E. coli) using RNAseq technology. Sixteen cows were inoculated with live E. coli into one mammary quarter at ~4-6 weeks in lactation. For all cows, biopsies were  performed at -144, 12 and 24 h relative to IMI in liver and at 24 h post-IMI in infected and non-infected (control) mammary quarters. For a subset of cows (n = 6), RNA was extracted from both liver and mammary tissue and sequenced using a 100 bp paired-end approach. Ingenuity Pathway Analysis and the Dynamic Impact Approach analysis of differentially expressed genes (overall effect False Discovery Rate≤0.05) indicated that IMI induced an overall activation of inflammation at 12 h post-IMI and a strong inhibition of metabolism, especially related to lipid, glucose, and xenobiotics at 24 h post-IMI in liver. The data indicated in mammary tissue an overall induction of inflammatory response with little effect on metabolism at 24 h post-IMI. We identified a large number of up-stream regulators potentially involved in the response to IMI in both tissues  but a relatively small core network of transcription factors controlling the response to IMI for liver whereas a large network in mammary tissue. Transcriptomic results in liver and mammary tissue were supported by changes in inflammatory and metabolic mediators in blood and milk. The analysis of potential cross-talk between the two tissues during IMI uncovered a large communication from the mammary tissue to the liver to coordinate the inflammatory response but  a relatively small communication from the liver to the mammary tissue. Our results indicate a strong induction of the inflammatory response in mammary tissue and impairment of liver metabolism 24h post-IMI partly driven by the signaling from infected mammary tissue. 
27298336	 R loops form when transcripts hybridize to homologous DNA on chromosomes, yielding a DNA:RNA hybrid and a displaced DNA single strand. R loops impact the genome of many organisms, regulating chromosome stability, gene expression, and DNA repair. Understanding the parameters dictating R-loop formation in vivo has been hampered by the limited quantitative and spatial resolution of current genomic strategies for mapping R loops. We report a novel whole-genome method, S1-DRIP-seq (S1 nuclease DNA:RNA immunoprecipitation with deep sequencing), for mapping hybrid-prone regions in budding yeast Saccharomyces cerevisiae. Using this methodology, we identified ∼800 hybrid-prone regions covering 8% of the genome. Given the pervasive transcription of the yeast genome, this result suggests that R-loop formation is dictated by characteristics of the DNA, RNA, and/or chromatin. We successfully identified two features highly predictive of hybrid formation: high transcription and long homopolymeric dA:dT tracts. These accounted for >60% of the hybrid regions found in the genome. We demonstrated that these two factors play a causal role in hybrid formation by genetic manipulation. Thus, the hybrid map generated by S1-DRIP-seq led to the identification of the first global genomic features causal for R-loop formation in yeast. 
27004424	 BACKGROUND: Biofilm formation is an important survival strategy of Salmonella in  all environments. By mutant screening, we showed a knock-out mutant of fabR, encoding a repressor of unsaturated fatty acid biosynthesis (UFA), to have impaired biofilm formation. In order to unravel how this regulator impinges on Salmonella biofilm formation, we aimed at elucidating the S. Typhimurium FabR regulon. Hereto, we applied a combinatorial high-throughput approach, combining ChIP-chip with transcriptomics. RESULTS: All the previously identified E. coli FabR transcriptional target genes  (fabA, fabB and yqfA) were shown to be direct S. Typhimurium FabR targets as well. As we found a fabB overexpressing strain to partly mimic the biofilm defect of the fabR mutant, the effect of FabR on biofilms can be attributed at least partly to FabB, which plays a key role in UFA biosynthesis. Additionally, ChIP-chip identified a number of novel direct FabR targets (the intergenic regions between hpaR/hpaG and ddg/ydfZ) and yet putative direct targets (i.a. genes involved in tRNA metabolism, ribosome synthesis and translation). Next to UFA biosynthesis, a number of these direct targets and other indirect targets identified by transcriptomics (e.g. ribosomal genes, ompA, ompC, ompX, osmB, osmC, sseI), could possibly contribute to the effect of FabR on biofilm formation. CONCLUSION: Overall, our results point at the importance of FabR and UFA biosynthesis in Salmonella biofilm formation and their role as potential targets  for biofilm inhibitory strategies. 
26706151	 Proper division site selection is crucial for the survival of all organisms. What still eludes us is how bacteria position their division site with high precision, and in tight coordination with chromosome replication and segregation. Until recently, the general belief, at least in the model organisms Bacillus subtilis and Escherichia coli, was that spatial regulation of division comes about by the  combined negative regulatory mechanisms of the Min system and nucleoid occlusion. However, as we review here, these two systems cannot be solely responsible for division site selection and we highlight additional regulatory mechanisms that are at play. In this review, we put forward evidence of how chromosome replication and segregation may have direct links with cell division in these bacteria and the benefit of recent advances in chromosome conformation capture techniques in providing important information about how these three processes mechanistically work together to achieve accurate generation of progenitor cells. 
26131613	 Escherichia coli ST131 is a recently emerged and globally disseminated multidrug resistant clone associated with urinary tract and bloodstream infections in both community and clinical settings. The most common group of ST131 strains is defined by resistance to fluoroquinolones and possession of the type 1 fimbriae fimH30 allele. Here we provide an update on our recent work describing the global epidemiology of ST131. We review the phylogeny of ST131 based on whole genome sequence data and highlight the important role of recombination in the evolution of this clonal lineage. We also summarize our findings on the virulence of the ST131 reference strain EC958, and highlight the use of transposon directed insertion-site sequencing to define genes associated with serum resistance and essential features of its large antibiotic resistance plasmid pEC958. 
25875675	 Escherichia coli sequence type 131 (E. coli ST131) is a recently emerged and globally disseminated multidrug resistant clone associated with urinary tract and bloodstream infections. Plasmids represent a major vehicle for the carriage of antibiotic resistance genes in E. coli ST131. In this study, we determined the complete sequence and performed a comprehensive annotation of pEC958, an IncF plasmid from the E. coli ST131 reference strain EC958. Plasmid pEC958 is 135.6 kb in size, harbours two replicons (RepFIA and RepFII) and contains 12 antibiotic resistance genes (including the blaCTX-M-15 gene). We also carried out hyper-saturated transposon mutagenesis and multiplexed transposon directed insertion-site sequencing (TraDIS) to investigate the biology of pEC958. TraDIS data showed that while only the RepFII replicon was required for pEC958 replication, the RepFIA replicon contains genes essential for its partitioning. Thus, our data provide direct evidence that the RepFIA and RepFII replicons in pEC958 cooperate to ensure their stable inheritance. The gene encoding the antitoxin component (ccdA) of the post-segregational killing system CcdAB was also protected from mutagenesis, demonstrating this system is active. Sequence comparison with a global collection of ST131 strains suggests that IncF represents the most common type of plasmid in this clone, and underscores the need to understand its evolution and contribution to the spread of antibiotic resistance genes in E. coli ST131. 
25873626	 The cMonkey integrated biclustering algorithm identifies conditionally co-regulated modules of genes (biclusters). cMonkey integrates various orthogonal pieces of information which support evidence of gene co-regulation, and optimizes biclusters to be supported simultaneously by one or more of these prior constraints. The algorithm served as the cornerstone for constructing the first global, predictive Environmental Gene Regulatory Influence Network (EGRIN) model  for a free-living cell, and has now been applied to many more organisms. However, due to its computational inefficiencies, long run-time and complexity of various  input data types, cMonkey was not readily usable by the wider community. To address these primary concerns, we have significantly updated the cMonkey algorithm and refactored its implementation, improving its usability and extendibility. These improvements provide a fully functioning and user-friendly platform for building co-regulated gene modules and the tools necessary for their exploration and interpretation. We show, via three separate analyses of data for  E. coli, M. tuberculosis and H. sapiens, that the updated algorithm and inclusion of novel scoring functions for new data types (e.g. ChIP-seq and transcription factor over-expression [TFOE]) improve discovery of biologically informative co-regulated modules. The complete cMonkey2 software package, including source code, is available at https://github.com/baliga-lab/cmonkey2. 
25757765	 Plants consist of many functionally specialized cell types, each with its own unique epigenome, transcriptome, and proteome. Characterization of these cell type-specific properties is essential to understanding cell fate specification and the responses of individual cell types to the environment. In this chapter we describe an approach to map chromatin features in specific cell types of Arabidopsis thaliana using nuclei purification from individual cell types with the INTACT method (isolation of nuclei tagged in specific cell types) followed by chromatin immunoprecipitation and high-throughput sequencing (ChIP-seq). The INTACT system employs two transgenes to generate affinity-labeled nuclei in the cell type of interest, and these tagged nuclei can then be selectively purified from tissue homogenates. The primary transgene encodes the nuclear tagging fusion protein (NTF), which consists of a nuclear envelope-targeting domain, the green fluorescent protein, and a biotin ligase recognition peptide, while the second transgene encodes the E. coli biotin ligase (BirA), which selectively biotinylates NTF. Expression of NTF and BirA in a specific cell type thus yields  nuclei that are coated with biotin and can be purified by virtue of their affinity for streptavidin-coated magnetic beads. Compared with the original INTACT nuclei purification protocol, the procedure presented here is greatly simplified and shortened. After nuclei purification, we provide detailed instructions for chromatin isolation, shearing, and immunoprecipitation. Finally, we present a low input ChIP-seq library preparation protocol based on the nano-ChIP-seq method of Adli and Bernstein, and we describe multiplex Illumina sequencing of these libraries to produce high quality, cell type-specific epigenome profiles at a relatively low cost. The procedures given here are optimized for Arabidopsis but should be easily adaptable to other plant species. 
25085508	 BACKGROUND: Burkholderia pseudomallei is a facultative intracellular pathogen and the causative agent of melioidosis. A conserved type III secretion system (T3SS3) and type VI secretion system (T6SS1) are critical for intracellular survival and  growth. The T3SS3 and T6SS1 genes are coordinately and hierarchically regulated by a TetR-type regulator, BspR. A central transcriptional regulator of the BspR regulatory cascade, BsaN, activates a subset of T3SS3 and T6SS1 loci. RESULTS: To elucidate the scope of the BsaN regulon, we used RNAseq analysis to compare the transcriptomes of wild-type B. pseudomallei KHW and a bsaN deletion mutant. The 60 genes positively-regulated by BsaN include those that we had previously identified in addition to a polyketide biosynthesis locus and genes involved in amino acid biosynthesis. BsaN was also found to repress the transcription of 51 genes including flagellar motility loci and those encoding components of the T3SS3 apparatus. Using a promoter-lacZ fusion assay in E. coli, we show that BsaN together with the chaperone BicA directly control the expression of the T3SS3 translocon, effector and associated regulatory genes that are organized into at least five operons (BPSS1516-BPSS1552). Using a mutagenesis approach, a consensus regulatory motif in the promoter regions of BsaN-regulated  genes was shown to be essential for transcriptional activation. CONCLUSIONS: BsaN/BicA functions as a central regulator of key virulence clusters in B. pseudomallei within a more extensive network of genetic regulation. We propose that BsaN/BicA controls a gene expression program that facilitates the adaption and intracellular survival of the pathogen within eukaryotic hosts. 
24743342	 DNA:RNA hybrid formation is emerging as a significant cause of genome instability in biological systems ranging from bacteria to mammals. Here we describe the genome-wide distribution of DNA:RNA hybrid prone loci in Saccharomyces cerevisiae by DNA:RNA immunoprecipitation (DRIP) followed by hybridization on tiling microarray. These profiles show that DNA:RNA hybrids preferentially accumulated at rDNA, Ty1 and Ty2 transposons, telomeric repeat regions and a subset of open reading frames (ORFs). The latter are generally highly transcribed and have high  GC content. Interestingly, significant DNA:RNA hybrid enrichment was also detected at genes associated with antisense transcripts. The expression of antisense-associated genes was also significantly altered upon overexpression of  RNase H, which degrades the RNA in hybrids. Finally, we uncover mutant-specific differences in the DRIP profiles of a Sen1 helicase mutant, RNase H deletion mutant and Hpr1 THO complex mutant compared to wild type, suggesting different roles for these proteins in DNA:RNA hybrid biology. Our profiles of DNA:RNA hybrid prone loci provide a resource for understanding the properties of hybrid-forming regions in vivo, extend our knowledge of hybrid-mitigating enzymes, and contribute to models of antisense-mediated gene regulation. A summary of this paper was presented at the 26th International Conference on Yeast Genetics and Molecular Biology, August 2013. 
24098145	 Escherichia coli ST131 is a globally disseminated, multidrug resistant clone responsible for a high proportion of urinary tract and bloodstream infections. The rapid emergence and successful spread of E. coli ST131 is strongly associated with antibiotic resistance; however, this phenotype alone is unlikely to explain its dominance amongst multidrug resistant uropathogens circulating worldwide in hospitals and the community. Thus, a greater understanding of the molecular mechanisms that underpin the fitness of E. coli ST131 is required. In this study, we employed hyper-saturated transposon mutagenesis in combination with multiplexed transposon directed insertion-site sequencing to define the essential genes required for in vitro growth and the serum resistome (i.e. genes required for resistance to human serum) of E. coli EC958, a representative of the predominant E. coli ST131 clonal lineage. We identified 315 essential genes in E. coli EC958, 231 (73%) of which were also essential in E. coli K-12. The serum resistome comprised 56 genes, the majority of which encode membrane proteins or factors involved in lipopolysaccharide (LPS) biosynthesis. Targeted mutagenesis confirmed a role in serum resistance for 46 (82%) of these genes. The murein lipoprotein Lpp, along with two lipid A-core biosynthesis enzymes WaaP and WaaG, were most strongly associated with serum resistance. While LPS was the main resistance mechanism defined for E. coli EC958 in serum, the enterobacterial common antigen and colanic acid also impacted on this phenotype. Our analysis also identified a novel function for two genes, hyxA and hyxR, as minor regulators of O-antigen chain length. This study offers novel insight into the genetic make-up of E. coli ST131, and provides a framework for future research on E. coli and other Gram-negative pathogens to define their essential gene repertoire and to dissect the molecular mechanisms that enable them to survive in the bloodstream and cause disease. 
23865838	 BACKGROUND: Identification of transcription factor binding sites (also called 'motif discovery') in DNA sequences is a basic step in understanding genetic regulation. Although many successful programs have been developed, the problem is far from being solved on account of diversity in gene expression/regulation and the low specificity of binding sites. State-of-the-art algorithms have their own  constraints (e.g., high time or space complexity for finding long motifs, low precision in identification of weak motifs, or the OOPS constraint: one occurrence of the motif instance per sequence) which limit their scope of application. RESULTS: In this paper, we present a novel and fast algorithm we call TFBSGroup.  It is based on community detection from a graph and is used to discover long and  weak (l,d) motifs under the ZOMOPS constraint (zero, one or multiple occurrence(s) of the motif instance(s) per sequence), where l is the length of a  motif and d is the maximum number of mutations between a motif instance and the motif itself. Firstly, TFBSGroup transforms the (l, d) motif search in sequences  to focus on the discovery of dense subgraphs within a graph. It identifies these  subgraphs using a fast community detection method for obtaining coarse-grained candidate motifs. Next, it greedily refines these candidate motifs towards the true motif within their own communities. Empirical studies on synthetic (l, d) samples have shown that TFBSGroup is very efficient (e.g., it can find true (18,  6), (24, 8) motifs within 30 seconds). More importantly, the algorithm has succeeded in rapidly identifying motifs in a large data set of prokaryotic promoters generated from the Escherichia coli database RegulonDB. The algorithm has also accurately identified motifs in ChIP-seq data sets for 12 mouse transcription factors involved in ES cell pluripotency and self-renewal. 
CONCLUSIONS: Our novel heuristic algorithm, TFBSGroup, is able to quickly identify nearly exact matches for long and weak (l, d) motifs in DNA sequences under the ZOMOPS constraint. It is also capable of finding motifs in real applications. The source code for TFBSGroup can be obtained from http://bioinformatics.bioengr.uic.edu/TFBSGroup/. 
23190111	 OmpR is a multifunctional DNA binding regulator with orthologues in many enteric  bacteria that exhibits classical regulator activity as well as nucleoid-associated protein-like characteristics. In the enteric pathogen Salmonella enterica, using chromatin immunoprecipitation of OmpR:FLAG and nucleotide sequencing, 43 putative OmpR binding sites were identified in S. enterica serovar Typhi, 22 of which were associated with OmpR-regulated genes. Mutation of a sequence motif (TGTWACAW) that was associated with the putative OmpR binding sites abrogated binding of OmpR:6×His to the tviA upstream region. A core set of 31 orthologous genes were found to exhibit OmpR-dependent expression  in both S. Typhi and S. Typhimurium. S. Typhimurium-encoded orthologues of two divergently transcribed OmpR-regulated operons (SL1068-71 and SL1066-67) had a putative OmpR binding site in the inter-operon region in S. Typhi, and were characterized using in vitro and in vivo assays. These operons are widely distributed within S. enterica but absent from the closely related Escherichia coli. SL1066 and SL1067 were required for growth on N-acetylmuramic acid as a sole carbon source. SL1068-71 exhibited sequence similarity to sialic acid uptake systems and contributed to colonization of the ileum and caecum in the streptomycin-pretreated mouse model of colitis. 
22923524	 Typical approaches for predicting transcription factor binding sites (TFBSs) involve use of a position-specific weight matrix (PWM) to statistically characterize the sequences of the known sites. Recently, an alternative physicochemical approach, called SiteSleuth, was proposed. In this approach, a linear support vector machine (SVM) classifier is trained to distinguish TFBSs from background sequences based on local chemical and structural features of DNA. SiteSleuth appears to generally perform better than PWM-based methods. Here, we improve the SiteSleuth approach by considering both new physicochemical features  and algorithmic modifications. New features are derived from Gibbs energies of amino acid-DNA interactions and hydroxyl radical cleavage profiles of DNA. Algorithmic modifications consist of inclusion of a feature selection step, use of a nonlinear kernel in the SVM classifier, and use of a consensus-based post-processing step for predictions. We also considered SVM classification based on letter features alone to distinguish performance gains from use of SVM-based models versus use of physicochemical features. The accuracy of each of the variant methods considered was assessed by cross validation using data available  in the RegulonDB database for 54 Escherichia coli TFs, as well as by experimental validation using published ChIP-chip data available for Fis and Lrp. 
22890136	 Two transcription termination mechanisms - intrinsic and Rho-dependent - have evolved in bacteria. The Rho factor occurs in most bacterial lineages, and has been hypothesized to play a global regulatory role. Genome-wide studies using microarray, 2D-gel electrophoresis and ChIP-chip provided evidence that Rho serves to silence transcription from horizontally acquired genes and prophages in Escherichia coli K-12, implicating the factor as part of the "cellular immune mechanism" protecting against deleterious phages and aberrant gene expression from acquired xenogenic DNA. We have investigated this model by adopting an alternative in silico approach and have extended the study to other species. Our analysis shows that several genomic islands across diverse phyla have under-representation of intrinsic terminators, similar to that experimentally observed in E. coli K-12. This implies that Rho-dependent termination is the predominant process operational in these islands and that silencing of foreign DNA is a conserved function of Rho. From the present analysis, it is evident that horizontally acquired islands have lost intrinsic terminators to facilitate Rho-dependent termination. These results underscore the importance of Rho as a conserved, genome-wide sentinel that regulates potentially toxic xenogenic DNA.
22555467	 Signature tagged mutagenesis is a genetic approach that was developed to identify novel bacterial virulence factors. It is a negative selection method in which unique identification tags allow analysis of pools of mutants in mixed populations. The approach is particularly well suited to functional genetic analysis of the gastrointestinal phase of infection in foodborne pathogens and has the capacity to guide the development of novel vaccines and therapeutics. In  this review we outline the technical principles underpinning signature-tagged mutagenesis as well as novel sequencing-based approaches for transposon mutant identification such as TraDIS (transposon directed insertion-site sequencing). We also provide an analysis of screens that have been performed in gastrointestinal  pathogens which are a global health concern (Escherichia coli, Listeria monocytogenes, Helicobacter pylori, Vibrio cholerae and Salmonella enterica). The identification of key virulence loci through the use of signature tagged mutagenesis in mice and relevant larger animal models is discussed. 
21515770	 Bacterial Gre factors associate with RNA polymerase (RNAP) and stimulate intrinsic cleavage of the nascent transcript at the active site of RNAP. Biochemical and genetic studies to date have shown that Escherichia coli Gre factors prevent transcriptional arrest during elongation and enhance transcription fidelity. Furthermore, Gre factors participate in the stimulation of promoter escape and the suppression of promoter-proximal pausing during the beginning of RNA synthesis in E. coli. Although Gre factors are generally conserved in bacteria, limited functional studies have been performed in bacteria other than E. coli. In this investigation, ChAP-chip analysis (chromatin affinity precipitation coupled with DNA microarray) was conducted to visualize the distribution of Bacillus subtilis GreA on the chromosome and to determine the effects of GreA inactivation on core RNAP trafficking. Our data show that GreA is uniformly distributed in the transcribed region from the promoter to coding region with core RNAP, and its inactivation induces RNAP accumulation at many promoter or promoter-proximal regions. Based on these findings, we propose that GreA constantly associates with core RNAP during transcriptional initiation and elongation and resolves its stalling at promoter or promoter-proximal regions, thus contributing to the even distribution of RNAP along the promoter and coding regions in B. subtilis cells.
21278291	 Massively parallel sequencing of transposon-flanking regions assigned the genotype and fitness score to 91% of Escherichia coli O157:H7 mutants previously  screened in cattle by signature-tagged mutagenesis (STM). The method obviates the limitations of STM and markedly extended the functional annotation of the prototype E. coli O157:H7 genome without further animal use. 
21124945	 An important step in understanding gene regulation is to identify the DNA binding sites recognized by each transcription factor (TF). Conventional approaches to prediction of TF binding sites involve the definition of consensus sequences or position-specific weight matrices and rely on statistical analysis of DNA sequences of known binding sites. Here, we present a method called SiteSleuth in  which DNA structure prediction, computational chemistry, and machine learning are applied to develop models for TF binding sites. In this approach, binary classifiers are trained to discriminate between true and false binding sites based on the sequence-specific chemical and structural features of DNA. These features are determined via molecular dynamics calculations in which we consider  each base in different local neighborhoods. For each of 54 TFs in Escherichia coli, for which at least five DNA binding sites are documented in RegulonDB, the  TF binding sites and portions of the non-coding genome sequence are mapped to feature vectors and used in training. According to cross-validation analysis and  a comparison of computational predictions against ChIP-chip data available for the TF Fis, SiteSleuth outperforms three conventional approaches: Match, MATRIX SEARCH, and the method of Berg and von Hippel. SiteSleuth also outperforms QPMEME, a method similar to SiteSleuth in that it involves a learning algorithm.  The main advantage of SiteSleuth is a lower false positive rate. 
21051353	 Immuno-precipitation of protein-DNA complexes followed by microarray hybridization is a powerful and cost-effective technology for discovering protein-DNA binding events at the genome scale. It is still an unresolved challenge to comprehensively, accurately and sensitively extract binding event information from the produced data. We have developed a novel strategy composed of an information-preserving signal-smoothing procedure, higher order derivative  analysis and application of the principle of maximum entropy to address this challenge. Importantly, our method does not require any input parameters to be specified by the user. Using genome-scale binding data of two Escherichia coli global transcription regulators for which a relatively large number of experimentally supported sites are known, we show that ∼90% of known sites were resolved to within four probes, or ∼88 bp. Over half of the sites were resolved to within two probes, or ∼38 bp. Furthermore, we demonstrate that our strategy delivers significant quantitative and qualitative performance gains over available methods. Such accurate and sensitive binding site resolution has important consequences for accurately reconstructing transcriptional regulatory networks, for motif discovery, for furthering our understanding of local and non-local factors in protein-DNA interactions and for extending the usefulness horizon of the ChIP-chip platform. 
20817769	 To obtain insight into the in vivo dynamics of RNA polymerase (RNAP) on the Bacillus subtilis genome, we analyzed the distribution of the σ(A) and β' subunits of RNAP and the NusA elongation factor on the genome in exponentially growing cells using chromatin affinity precipitation coupled with gene chip mapping (ChAP-chip). In contrast to Escherichia coli RNAP, which often accumulates at the promoter-proximal region, B. subtilis RNAP is evenly distributed from the promoter to the coding sequences. This finding suggests that, in general, B. subtilis RNAP recruited to the promoter promptly translocates away from the promoter to form the elongation complex and proceeds without intragenic transcription attenuation. We detected RNAP accumulation in the promoter-proximal regions of some genes, most of which can be identified as transcription attenuation systems in the leader region. Our findings suggest that the differences in RNAP behavior between E. coli and B. subtilis during initiation and elongation steps might result in distinct strategies for postinitiation control of transcription. The E. coli mechanism involves trapping at the promoter and promoter-proximal pausing of RNAP in addition to transcription attenuation, whereas transcription attenuation in leader sequences is mainly employed in B. subtilis.
20639326	 Histone-like protein H1 (H-NS) family proteins are nucleoid-associated proteins (NAPs) conserved among many bacterial species. The IncP-7 plasmid pCAR1 is transmissible among various Pseudomonas strains and carries a gene encoding the H-NS family protein, Pmr. Pseudomonas putida KT2440 is a host of pCAR1, which harbors five genes encoding the H-NS family proteins PP_1366 (TurA), PP_3765 (TurB), PP_0017 (TurC), PP_3693 (TurD), and PP_2947 (TurE). Quantitative reverse  transcription-PCR (qRT-PCR) demonstrated that the presence of pCAR1 does not affect the transcription of these five genes and that only pmr, turA, and turB were primarily transcribed in KT2440(pCAR1). In vitro pull-down assays revealed that Pmr strongly interacted with itself and with TurA, TurB, and TurE. Transcriptome comparisons of the pmr disruptant, KT2440, and KT2440(pCAR1) strains indicated that pmr disruption had greater effects on the host transcriptome than did pCAR1 carriage. The transcriptional levels of some genes that increased with pCAR1 carriage, such as the mexEF-oprN efflux pump genes and  parI, reverted with pmr disruption to levels in pCAR1-free KT2440. Transcriptional levels of putative horizontally acquired host genes were not altered by pCAR1 carriage but were altered by pmr disruption. Identification of genome-wide Pmr binding sites by ChAP-chip (chromatin affinity purification coupled with high-density tiling chip) analysis demonstrated that Pmr preferentially binds to horizontally acquired DNA regions. The Pmr binding sites  overlapped well with the location of the genes differentially transcribed following pmr disruption on both the plasmid and the chromosome. Our findings indicate that Pmr is a key factor in optimizing gene transcription on pCAR1 and the host chromosome. 
20460455	 Deregulation of the Wnt/β-catenin signaling pathway is a hallmark of colon cancer. Mutations in the adenomatous polyposis coli (APC) gene occur in the vast  majority of colorectal cancers and are an initiating event in cellular transformation. Cells harboring mutant APC contain elevated levels of the β-catenin transcription coactivator in the nucleus which leads to abnormal expression of genes controlled by β-catenin/T-cell factor 4 (TCF4) complexes. Here, we use chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) to identify β-catenin binding regions in HCT116 human colon cancer cells. We localized 2168 β-catenin enriched regions using a concordance approach for integrating the output from multiple peak alignment algorithms. Motif discovery algorithms found a core TCF4 motif (T/A-T/A-C-A-A-A-G), an extended TCF4 motif (A/T/G-C/G-T/A-T/A-C-A-A-A-G) and an AP-1 motif (T-G-A-C/T-T-C-A) to be significantly represented in β-catenin enriched regions.  Furthermore, 417 regions contained both TCF4 and AP-1 motifs. Genes associated with TCF4 and AP-1 motifs bound β-catenin, TCF4 and c-Jun in vivo and were activated by Wnt signaling and serum growth factors. Our work provides evidence that Wnt/β-catenin and mitogen signaling pathways intersect directly to regulate  a defined set of target genes. 
18974181	 EcoCyc (http://EcoCyc.org) provides a comprehensive encyclopedia of Escherichia coli biology. EcoCyc integrates information about the genome, genes and gene products; the metabolic network; and the regulatory network of E. coli. Recent EcoCyc developments include a new initiative to represent and curate all types of E. coli regulatory processes such as attenuation and regulation by small RNAs. EcoCyc has started to curate Gene Ontology (GO) terms for E. coli and has made a  dataset of E. coli GO terms available through the GO Web site. The curation and visualization of electron transfer processes has been significantly improved. Other software and Web site enhancements include the addition of tracks to the EcoCyc genome browser, in particular a type of track designed for the display of  ChIP-chip datasets, and the development of a comparative genome browser. A new Genome Omics Viewer enables users to paint omics datasets onto the full E. coli genome for analysis. A new advanced query page guides users in interactively constructing complex database queries against EcoCyc. A Macintosh version of EcoCyc is now available. A series of Webinars is available to instruct users in the use of EcoCyc. 
18697768	 MOTIVATION: Locating transcription factor binding sites (motifs) is a key step in understanding gene regulation. Based on Tompa's benchmark study, the performance of current de novo motif finders is far from satisfactory (with sensitivity ≤0.222 and precision ≤0.307). The same study also shows that no motif finder performs consistently well over all datasets. Hence, it is not clear which finder one should use for a given dataset. To address this issue, a class of algorithms called ensemble methods have been proposed. Though the existing ensemble methods overall perform better than stand-alone motif finders, the improvement gained is not substantial. Our study reveals that these methods do not fully exploit the information obtained from the results of individual finders, resulting in minor improvement in sensitivity and poor precision. RESULTS: In this article, we identify several key observations on how to utilize the results from individual finders and design a novel ensemble method, MotifVoter, to predict the motifs and binding sites. Evaluations on 186 datasets show that MotifVoter can locate more than 95% of the binding sites found by its component motif finders. In terms of sensitivity and precision, MotifVoter outperforms stand-alone motif finders and ensemble methods significantly on Tompa's benchmark, Escherichia coli, and ChIP-Chip datasets. MotifVoter is available online via a web server with several biologist-friendly features.
18460200	 BACKGROUND: Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles, in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. RESULTS: We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions), by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. CONCLUSION: Our approach does not require accurate expression levels or time series.
Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments are enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data is noisy.