29476659 In Escherichia coli, one sigma factor recognizes the majority of promoters, and six "alternative" sigma factors recognize specific subsets of promoters. The alternative sigma factor FliA (σ28) recognizes promoters upstream of many flagellar genes. We previously showed that most E. coli FliA binding sites are located inside genes. However, it was unclear whether these intragenic binding sites represent active promoters. Here, we construct and assay transcriptional promoter-lacZ fusions for all 52 putative FliA promoters previously identified by ChIP-seq. These experiments, coupled with integrative analysis of published genome-scale transcriptional datasets, strongly suggest that most intragenic FliA binding sites are active promoters that transcribe highly unstable RNAs. Additionally, we show that widespread intragenic FliA-dependent transcription may be a conserved phenomenon, but that specific promoters are not themselves conserved. We conclude that intragenic FliA-dependent promoters and the resulting RNAs are unlikely to have important regulatory functions. Nonetheless, one intragenic FliA promoter is broadly conserved, and constrains evolution of the overlapping protein-coding gene. Thus, our data indicate that intragenic regulatory elements can influence bacterial protein evolution, and suggest that the impact of intragenic regulatory sequences on genome evolution should be considered more broadly.
29468196 The RNA polymerase (RNAP) of Escherichia coli K-12 is a complex enzyme consisting of the core enzyme with the subunit structure α2ββ'ω and one of the σ subunits with promoter recognition properties. The smallest subunit, omega (the rpoZ gene product), participates in subunit assembly by supporting the folding of the largest subunit, β', but its functional role remains unresolved except for its involvement in ppGpp binding and the stringent response. As an initial approach for elucidation of its functional role, we performed in this study ChIP-chip (chromatin immunoprecipitation with microarray technology) analysis of wild-type and rpoZ-defective mutant strains. The altered distribution of RpoZ-defective RNAP was identified mostly within open reading frames, in particular, of the genes inside prophages. For the genes that exhibited increased or decreased distribution of RpoZ-defective RNAP, the level of transcripts increased or decreased, respectively, as detected by reverse transcription-quantitative PCR (qRT-PCR). In parallel, we analyzed, using genomic SELEX (systematic evolution of ligands by exponential enrichment), the distribution of constitutive promoters that are recognized by RNAP RpoD holoenzyme alone and of the general silencer H-NS within prophages. Since all 10 prophages in E. coli K-12 carry only a small number of promoters, the altered occupancy of RpoZ-defective RNAP and of transcripts might represent transcription initiated from as-yet-unidentified host promoters. The genes that exhibited transcription enhanced by RpoZ-defective RNAP are located in the regions of low-level H-NS binding. By using phenotype microarray (PM) assay, alterations of some phenotypes were detected for the rpoZ-deleted mutant, indicating the involvement of RpoZ in regulation of some genes. Possible mechanisms of altered distribution of RNAP inside prophages are discussed.
IMPORTANCE The 91-amino-acid-residue small-subunit omega (the rpoZ gene product) of Escherichia coli RNA polymerase plays a structural role in the formation of RNA polymerase (RNAP) as a chaperone in folding the largest subunit (β', of 1,407 residues in length), but except for binding of the stringent signal ppGpp, little is known of its role in the control of RNAP function. After analysis of genomewide distribution of wild-type and RpoZ-defective RNAP by the ChIP-chip method, we found alteration of the RpoZ-defective RNAP inside open reading frames, in particular, of the genes within prophages. For a set of the genes that exhibited altered occupancy of the RpoZ-defective RNAP, transcription was found to be altered as observed by qRT-PCR assay. All the observations here described indicate the involvement of RpoZ in recognition of some of the prophage genes. This study advances understanding of not only the regulatory role of omega subunit in the functions of RNAP but also the regulatory interplay between prophages and the host E. coli for adjustment of cellular physiology to a variety of environments in nature.
29463657 Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE: Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli, we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature.
We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data.
29433444 BACKGROUND: Due to the DNA triplet code, it is possible that the sequences of two or more protein-coding genes overlap to a large degree. However, such non-trivial overlaps are usually excluded by genome annotation pipelines and, thus, only a few overlapping gene pairs have been described in bacteria. In contrast, transcriptome and translatome sequencing reveals many signals originated from the antisense strand of annotated genes, of which we analyzed an example gene pair in more detail. RESULTS: A small open reading frame of Escherichia coli O157:H7 strain Sakai (EHEC), designated laoB (L-arginine responsive overlapping gene), is embedded in reading frame -2 in the antisense strand of ECs5115, encoding a CadC-like transcriptional regulator. This overlapping gene shows evidence of transcription and translation in Luria-Bertani (LB) and brain-heart infusion (BHI) medium based on RNA sequencing (RNAseq) and ribosomal-footprint sequencing (RIBOseq). The transcriptional start site is 289 base pairs (bp) upstream of the start codon and transcription termination is 155 bp downstream of the stop codon. Overexpression of LaoB fused to an enhanced green fluorescent protein (EGFP) reporter was possible. The sequence upstream of the transcriptional start site displayed strong promoter activity under different conditions, whereas promoter activity was significantly decreased in the presence of L-arginine. A strand-specific translationally arrested mutant of laoB provided a significant growth advantage in competitive growth experiments in the presence of L-arginine compared to the wild type, which returned to wild type level after complementation of laoB in trans. A phylostratigraphic analysis indicated that the novel gene is restricted to the Escherichia/Shigella clade and might have originated recently by overprinting leading to the expression of part of the antisense strand of ECs5115. 
CONCLUSIONS: Here, we present evidence of a novel small protein-coding gene laoB encoded in the antisense frame -2 of the annotated gene ECs5115. Clearly, laoB is evolutionarily young and it originated in the Escherichia/Shigella clade by overprinting, a process which may cause the de novo evolution of bacterial genes like laoB.
29394395 Two major transcriptional regulators of carbon metabolism in bacteria are Cra and CRP. CRP is considered to be the main mediator of catabolite repression. Unlike for CRP, in vivo DNA binding information of Cra is scarce. Here we generate and integrate ChIP-exo and RNA-seq data to identify 39 binding sites for Cra and 97 regulon genes that are regulated by Cra in Escherichia coli. An integrated metabolic-regulatory network was formed by including experimentally-derived regulatory information and a genome-scale metabolic network reconstruction. Applying analysis methods of systems biology to this integrated network showed that Cra enables optimal bacterial growth on poor carbon sources by redirecting and repressing glycolysis flux, by activating the glyoxylate shunt pathway, and by activating the respiratory pathway. In these regulatory mechanisms, the overriding regulatory activity of Cra over CRP is fundamental. Thus, elucidation of interacting transcriptional regulation of core carbon metabolism in bacteria by two key transcription factors was possible by combining genome-wide experimental measurement and simulation with a genome-scale metabolic model.
29150605 CsrA is a post-transcriptional regulatory protein that is widely distributed among bacteria. This protein influences bacterial lifestyle decisions by binding to the 5' untranslated and/or early coding regions of mRNA targets, causing changes in translation initiation, RNA stability, and/or transcription elongation. Here, we assess the contribution of CsrA to gene expression in Escherichia coli on a global scale. UV crosslinking immunoprecipitation and sequencing (CLIP-seq) identify RNAs that interact directly with CsrA in vivo, while ribosome profiling and RNA-seq uncover the impact of CsrA on translation, RNA abundance, and RNA stability. This combination of approaches reveals unprecedented detail about the regulatory role of CsrA, including novel binding targets and physiological roles, such as in envelope function and iron homeostasis. Our findings highlight the integration of CsrA throughout the E. coli regulatory network, where it orchestrates vast effects on gene expression.
28911122 ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high resolution mapping of transcription factor binding sites. Although many aspects of the ChIP-exo data analysis are similar to those of ChIP-seq, these high throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features is facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple, new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data.
28902868 In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability expressed by the ribosomal coverage value. This led to discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set.
28842878 The advent of Chromatin Immunoprecipitation sequencing (ChIP-Seq) has allowed the identification of genomic regions bound by a DNA binding protein in vivo on a genome-wide scale. The impact of the DNA binding protein on gene expression can be addressed using transcriptome experiments in appropriate genetic settings. Overlaying these two sources of data enables us to dissect the direct and indirect effects of a DNA binding protein on gene expression. Application of these techniques to Nucleoid Associated Proteins (NAPs) and Global Transcription Factors (GTFs) has underscored the complex relationship between DNA-protein interactions and gene expression change, highlighting the role of combinatorial control. Here, we demonstrate the usage of ChIP-Seq to infer binding properties and transcriptional effects of NAPs such as Fis and H-NS, and the GTF CRP in the model organism Escherichia coli K-12 MG1655 (E. coli).
28489862 Uropathogenic Escherichia coli (UPEC) is the cause of ~75% of all urinary tract infections (UTIs) and is increasingly associated with multidrug resistance. This includes UPEC strains from the recently emerged and globally disseminated sequence type 131 (ST131), which is now the dominant fluoroquinolone-resistant UPEC clone worldwide. Most ST131 strains are motile and produce H4-type flagella. Here, we applied a combination of saturated Tn5 mutagenesis and transposon directed insertion site sequencing (TraDIS) as a high throughput genetic screen and identified 30 genes associated with enhanced motility of the reference ST131 strain EC958. This included 12 genes that repress motility of E. coli K-12, four of which (lrhA, ihfA, ydiV, lrp) were confirmed in EC958. Other genes represented novel factors that impact motility, and we focused our investigation on characterisation of the mprA, hemK and yjeA genes. Mutation of each of these genes in EC958 led to increased transcription of flagellar genes (flhD and fliC), increased expression of the FliC flagellin, enhanced flagella synthesis and a hyper-motile phenotype. Complementation restored all of these properties to wild-type level. We also identified Tn5 insertions in several intergenic regions (IGRs) on the EC958 chromosome that were associated with enhanced motility; this included flhDC and EC958_1546. In both of these cases, the Tn5 insertions were associated with increased transcription of the downstream gene(s), which resulted in enhanced motility. The EC958_1546 gene encodes a phage protein with similarity to esterase/deacetylase enzymes involved in the hydrolysis of sialic acid derivatives found in human mucus. We showed that over-expression of EC958_1546 led to enhanced motility of EC958 as well as the UPEC strains CFT073 and UTI89, demonstrating its activity affects the motility of different UPEC strains. 
Overall, this study has identified and characterised a number of novel factors associated with enhanced UPEC motility.
28245801 BACKGROUND: While NGS allows rapid global detection of transcripts, it remains difficult to distinguish ncRNAs from short mRNAs. To detect potentially translated RNAs, we developed an improved protocol for bacterial ribosomal footprinting (RIBOseq). This allowed distinguishing ncRNA from mRNA in EHEC. A high ratio of ribosomal footprints per transcript (ribosomal coverage value, RCV) is expected to indicate a translated RNA, while a low RCV should point to a non-translated RNA. RESULTS: Based on their low RCV, 150 novel non-translated EHEC transcripts were identified as putative ncRNAs, representing both antisense and intergenic transcripts, 74 of which had expressed homologs in E. coli MG1655. Bioinformatics analysis predicted statistically significant target regulons for 15 of the intergenic transcripts; experimental analysis revealed 4-fold or higher differential expression of 46 novel ncRNAs in different growth media. Out of 329 annotated EHEC ncRNAs, 52 showed an RCV similar to protein-coding genes. Of those, 16 had RIBOseq patterns matching annotated genes in other Enterobacteriaceae, and 11 seem to possess a Shine-Dalgarno sequence, suggesting that such ncRNAs may encode small proteins instead of being solely non-coding. To support that the RIBOseq signals reflect translation, we tested the ribosomal-footprint covered ORF of ryhB and found a phenotype for the encoded peptide in iron-limiting conditions. CONCLUSION: Determination of the RCV is a useful approach for a rapid first-step differentiation between bacterial ncRNAs and small mRNAs. Further, many known ncRNAs may encode proteins as well.
28061857 BACKGROUND: Enteric Escherichia coli survives the highly acidic environment of the stomach through multiple acid resistance (AR) mechanisms. The most effective system, AR2, decarboxylates externally-derived glutamate to remove cytoplasmic protons and excrete GABA. The first described system, AR1, does not require an external amino acid. Its mechanism has not been determined. The regulation of the multiple AR systems and their coordination with broader cellular metabolism has not been fully explored. RESULTS: We utilized a combination of ChIP-Seq and gene expression analysis to experimentally map the regulatory interactions of four TFs: nac, ntrC, ompR, and csiR. Our data identified all previously in vivo confirmed direct interactions and revealed several others previously inferred from gene expression data. Our data demonstrate that nac and csiR directly modulate AR, and lead to a regulatory network model in which all four TFs participate in coordinating acid resistance, glutamate metabolism, and nitrogen metabolism. This model predicts a novel mechanism for AR1 by which the decarboxylation enzymes of AR2 are used with internally derived glutamate. This hypothesis makes several testable predictions that we confirmed experimentally. CONCLUSIONS: Our data suggest that the regulatory network underlying AR is complex and deeply interconnected with the regulation of GABA and glutamate metabolism and nitrogen metabolism. These connections underlie an experimentally validated model of AR1 in which the decarboxylation enzymes of AR2 are used with internally derived glutamate.
27900321 The regulatory protein, GalR, is known for controlling transcription of genes related to D-galactose metabolism in Escherichia coli. Here, using a combination of experimental and bioinformatic approaches, we identify novel GalR binding sites upstream of several genes whose function is not directly related to D-galactose metabolism. Moreover, we do not observe regulation of these genes by GalR under standard growth conditions. Thus, our data indicate a broader regulatory role for GalR, and suggest that regulation by GalR is modulated by other factors. Surprisingly, we detect regulation of 158 transcripts by GalR, with few regulated genes being associated with a nearby GalR binding site. Based on our earlier observation of long-range interactions between distally bound GalR dimers, we propose that GalR indirectly regulates the transcription of many genes by inducing large-scale restructuring of the chromosome.
27492737 Conjugation plays an important role in the horizontal movement of DNA between bacterial species and even genera. Large conjugative plasmids in Gram-negative bacteria are associated with multi-drug resistance and have been implicated in the spread of these phenotypes to pathogenic organisms. A/C plasmids often carry genes that confer resistance to multiple classes of antibiotics. Recently, transcription factors were characterized that regulate A/C conjugation. In this work, we expanded the regulon of the negative regulator Acr2. We developed an A/C variant, pARK01, by precise removal of resistance genes carried by the plasmid in order to make it more genetically tractable. Using pARK01, we conducted RNA-Seq and ChAP-Seq experiments to characterize the regulon of Acr2, an H-NS-like protein. We found that Acr2 binds several loci on the plasmid. We showed, in vitro, that Acr2 can bind specific promoter regions directly and identified key amino acids that are important for this binding. This study further characterizes Acr2 and suggests its role in modulating gene expression of multiple plasmid and chromosomal loci.
26911138 BACKGROUND: Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). RESULTS: Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. CONCLUSIONS: These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.
26789284 Bacteria can acquire new traits through horizontal gene transfer. Inappropriate expression of transferred genes, however, can disrupt the physiology of the host bacteria. To reduce this risk, Escherichia coli expresses the nucleoid-associated protein, H-NS, which preferentially binds to horizontally transferred genes to control their expression. Once expression is optimized, the horizontally transferred genes may actually contribute to E. coli survival in new habitats. Therefore, we investigated whether and how H-NS contributes to this optimization process. A comparison of H-NS binding profiles on common chromosomal segments of three E. coli strains belonging to different phylogenetic groups indicated that the positions of H-NS-bound regions have been conserved in E. coli strains. The sequences of the H-NS-bound regions appear to have diverged more so than H-NS-unbound regions only when H-NS-bound regions are located upstream or in coding regions of genes. Because these regions generally contain regulatory elements for gene expression, sequence divergence in these regions may be associated with alteration of gene expression. Indeed, nucleotide substitutions in H-NS-bound regions of the ybdO promoter and coding regions have diversified the potential for H-NS-independent negative regulation among E. coli strains. The ybdO expression in these strains was still negatively regulated by H-NS, which reduced the effect of H-NS-independent regulation under normal growth conditions. Hence, we propose that, during E. coli evolution, the conservation of H-NS binding sites resulted in the diversification of the regulation of horizontally transferred genes, which may have facilitated E. coli adaptation to new ecological niches.
26673755 The two-component signal transduction system BarA-UvrY of Escherichia coli and its orthologs globally regulate metabolism, motility, biofilm formation, stress resistance, virulence of pathogens and quorum sensing by activating the transcription of genes for regulatory sRNAs, e.g. CsrB and CsrC in E. coli. These sRNAs act by sequestering the RNA binding protein CsrA (RsmA) away from lower affinity mRNA targets. In this study, we used ChIP-exo to identify, at single nucleotide resolution, genomic sites for UvrY (SirA) binding in E. coli and Salmonella enterica. The csrB and csrC genes were the strongest targets of crosslinking, which required UvrY phosphorylation by the BarA sensor kinase. Crosslinking occurred at two sites, an inverted repeat sequence far upstream of the promoter and a site near the -35 sequence. DNase I footprinting revealed specific binding of UvrY in vitro only to the upstream site, indicative of additional binding requirements and/or indirect binding to the downstream site. Additional genes, including cspA, encoding the cold-shock RNA-binding protein CspA, showed weaker crosslinking and modest or negligible regulation by UvrY. We conclude that the global effects of UvrY/SirA on gene expression are primarily mediated by activating csrB and csrC transcription. We also used in vivo crosslinking and other experimental approaches to reveal new features of csrB/csrC regulation by the DeaD and SrmB RNA helicases, IHF, ppGpp and DksA. Finally, the phylogenetic distribution of BarA-UvrY was analyzed and found to be uniquely characteristic of γ-Proteobacteria and strongly anti-correlated with fliW, which encodes a protein that binds to CsrA and antagonizes its activity in Bacillus subtilis. We propose that BarA-UvrY and orthologous TCS transcribe sRNA antagonists of CsrA throughout the γ-Proteobacteria, but rarely or never perform this function in other species.
26670385 Iron, a major protein cofactor, is essential for most organisms. Despite the well-known effects of O2 on the oxidation state and solubility of iron, the impact of O2 on cellular iron homeostasis is not well understood. Here we report that in Escherichia coli K-12, the lack of O2 dramatically changes expression of genes controlled by the global regulators of iron homeostasis, the transcription factor Fur and the small RNA RyhB. Using chromatin immunoprecipitation sequencing (ChIP-seq), we found anaerobic conditions promote Fur binding to more locations across the genome. However, by expression profiling, we discovered that the major effect of anaerobiosis was to increase the magnitude of Fur regulation, leading to increased expression of iron storage proteins and decreased expression of most iron uptake pathways and several Mn-binding proteins. This change in the pattern of gene expression also correlated with an unanticipated decrease in Mn in anaerobic cells. Changes in the genes posttranscriptionally regulated by RyhB under aerobic and anaerobic conditions could be attributed to O2-dependent changes in transcription of the target genes: aerobic RyhB targets were enriched in iron-containing proteins associated with aerobic energy metabolism, whereas anaerobic RyhB targets were enriched in iron-containing anaerobic respiratory functions. Overall, these studies showed that anaerobiosis has a larger impact on iron homeostasis than previously anticipated, both by expanding the number of direct Fur target genes and the magnitude of their regulation and by altering the expression of genes predicted to be posttranscriptionally regulated by the small RNA RyhB under iron-limiting conditions. IMPORTANCE: Microbes and host cells engage in an "arms race" for iron, an essential nutrient that is often scarce in the environment.
Studies of iron homeostasis have been key to understanding the control of iron acquisition and the downstream pathways that enable microbes to compete for this valuable resource. Here we report that O2 availability affects the gene expression programs of two Escherichia coli master regulators that function in iron homeostasis: the transcription factor Fur and the small RNA regulator RyhB. Fur appeared to be more active under anaerobic conditions, suggesting a change in the set point for iron homeostasis. RyhB preferentially targeted iron-containing proteins of respiration-linked pathways, which are differentially expressed under aerobic and anaerobic conditions. Such findings may be relevant to the success of bacteria within their hosts since zones of reduced O2 may actually reduce bacterial iron demands, making it easier to win the arms race for iron.
26307168 Repeated extragenic palindromes (REPs) in the enterobacterial genomes are usually composed of individual palindromic units separated by linker sequences. A total of 355 annotated REPs are distributed along the Escherichia coli genome. RNA sequencing (RNAseq) analysis showed that almost 80% of the REPs in E. coli are transcribed. The DNA sequence of REP325 showed that it is a cluster of six repeats, each with two palindromic units capable of forming cruciform structures in supercoiled DNA. Here, we report that components of the REP325 element and at least one of its RNA products play a role in bacterial nucleoid DNA condensation. These RNAs not only are present in the purified nucleoid but also bind to the bacterial nucleoid-associated HU protein, as revealed by RNA IP followed by microarray analysis (RIP-Chip) assays. Deletion of REP325 resulted in a dramatic increase of the nucleoid size as observed using transmission electron microscopy (TEM), and expression of one of the REP325 RNAs, nucleoid-associated noncoding RNA 4 (naRNA4), from a plasmid restored the wild-type condensed structure. Independently, chromosome conformation capture (3C) analysis demonstrated physical connections among various REP elements around the chromosome. These connections are dependent in some way upon the presence of HU and the REP325 element; deletion of HU genes and/or the REP325 element removed the connections. Finally, naRNA4 together with HU condensed DNA in vitro by connecting REP325 or other DNA sequences that contain cruciform structures in a pairwise manner as observed by atomic force microscopy (AFM). On the basis of our results, we propose molecular models to explain connections of remote cruciform structures mediated by HU and naRNA4. IMPORTANCE: Nucleoid organization in bacteria is being studied extensively, and several models have been proposed. However, the molecular nature of the structural organization is not well understood.
Here we characterized the role of a novel nucleoid-associated noncoding RNA, naRNA4, in nucleoid structures both in vivo and in vitro. We propose models to explain how naRNA4 together with nucleoid-associated protein HU connects remote DNA elements for nucleoid condensation. We present the first evidence of a noncoding RNA together with a nucleoid-associated protein directly condensing nucleoid DNA.
26125937 Adherent-invasive Escherichia coli (AIEC) strains are detected more frequently within mucosal lesions of patients with Crohn's disease (CD). The AIEC phenotype consists of adherence and invasion of intestinal epithelial cells and survival within macrophages of these bacteria in vitro. Our aim was to identify candidate transcripts that distinguish AIEC from non-invasive E. coli (NIEC) strains and might be useful for rapid and accurate identification of AIEC by culture-independent technology. We performed comparative RNA-Sequence (RNASeq) analysis using AIEC strain LF82 and NIEC strain HS during exponential and stationary growth. Differential expression analysis of coding sequences (CDS) homologous to both strains demonstrated 224 and 241 genes with increased and decreased expression, respectively, in LF82 relative to HS. Transition metal transport and siderophore metabolism related pathway genes were up-regulated, while glycogen metabolic and oxidation-reduction related pathway genes were down-regulated, in LF82. Chemotaxis related transcripts were up-regulated in LF82 during the exponential phase, but flagellum-dependent motility pathway genes were down-regulated in LF82 during the stationary phase. CDS that mapped only to the LF82 genome accounted for 747 genes. We applied an in silico subtractive genomics approach to identify CDS specific to AIEC by incorporating the genomes of 10 other previously phenotyped NIEC. From this analysis, 166 CDS mapped to the LF82 genome and lacked homology to any of the 11 human NIEC strains. We compared these CDS across 13 AIEC, but none were homologous in each. Four LF82 gene loci belonging to clustered regularly interspaced short palindromic repeats region (CRISPR)--CRISPR-associated (Cas) genes were identified in 4 to 6 AIEC and absent from all non-pathogenic bacteria. As previously reported, AIEC strains were enriched for pdu operon genes. One CDS, encoding an excisionase, was shared by 9 AIEC strains. 
Reverse transcription quantitative polymerase chain reaction assays for 6 genes were conducted on fecal and ileal RNA samples from 22 patients with inflammatory bowel disease (IBD) and 32 patients without IBD (non-IBD). The expression of Cas loci was detected in a higher proportion of CD than non-IBD fecal and ileal RNA samples (p < 0.05). These results support a comparative genomic/transcriptomic approach towards identifying candidate AIEC signature transcripts.
26020590 In bacteria, selective promoter recognition by RNA polymerase is achieved by its association with σ factors, accessory subunits able to direct RNA polymerase "core enzyme" (E) to different promoter sequences. Using Chromatin Immunoprecipitation-sequencing (ChIP-seq), we searched for promoters bound by the σ(S)-associated RNA polymerase form (Eσ(S)) during transition from exponential to stationary phase. We identified 63 binding sites for Eσ(S) overlapping known or putative promoters, often located upstream of genes (encoding either ORFs or non-coding RNAs) showing at least some degree of dependence on the σ(S)-encoding rpoS gene. Eσ(S) binding did not always correlate with an increase in transcription level, suggesting that, at some σ(S)-dependent promoters, Eσ(S) might remain poised in a pre-initiation state upon binding. A large fraction of Eσ(S)-binding sites corresponded to promoters recognized by RNA polymerase associated with σ(70) or other σ factors, suggesting a considerable overlap in promoter recognition between different forms of RNA polymerase. In particular, Eσ(S) appears to contribute significantly to transcription of genes encoding proteins involved in LPS biosynthesis and in cell surface composition. Finally, our results highlight a direct role of Eσ(S) in the regulation of non-coding RNAs, such as OmrA/B, RyeA/B and SibC.
25735747 DNA-binding motifs that are recognized by transcription factors (TFs) have been well studied; however, challenges remain in determining the in vivo architecture of TF-DNA complexes on a genome-scale. Here, we determined the in vivo architecture of Escherichia coli arginine repressor (ArgR)-DNA complexes using high-throughput sequencing of exonuclease-treated chromatin-immunoprecipitated DNA (ChIP-exo). The ChIP-exo has a unique peak-pair pattern indicating 5' and 3' ends of ArgR-binding region. We identified 62 ArgR-binding loci, which were classified into three groups, comprising single, double and triple peak-pairs. Each peak-pair has a unique 93 base pair (bp)-long (±2 bp) ArgR-binding sequence containing two ARG boxes (39 bp) and residual sequences. Moreover, the three ArgR-binding modes defined by the position of the two ARG boxes indicate that DNA bends centered between the pair of ARG boxes facilitate the non-specific contacts between ArgR subunits and the residual sequences. Additionally, our approach may also reveal other fundamental structural features of TF-DNA interactions that have implications for studying genome-scale transcriptional regulatory networks.
25275371 Flagellar synthesis is a highly regulated process in all motile bacteria. In Escherichia coli and related species, the transcription factor FlhDC is the master regulator of a multi-tiered transcription network. FlhDC activates transcription of a number of genes, including some flagellar genes and the gene encoding the alternative Sigma factor FliA. Genes whose expression is required late in flagellar assembly are primarily transcribed by FliA, imparting temporal regulation of transcription and coupling expression to flagellar assembly. In this study, we use ChIP-seq and RNA-seq to comprehensively map the E. coli FlhDC and FliA regulons. We define a surprisingly restricted FlhDC regulon, including two novel regulated targets and two binding sites not associated with detectable regulation of surrounding genes. In contrast, we greatly expand the known FliA regulon. Surprisingly, 30 of the 52 FliA binding sites are located inside genes. Two of these intragenic promoters are associated with detectable noncoding RNAs, while the others either produce highly unstable RNAs or are inactive under these conditions. Together, our data redefine the E. coli flagellar regulatory network, and provide new insight into the temporal orchestration of gene expression that coordinates the flagellar assembly process.
25049088 To further an improved understanding of the mechanisms used by bacterial cells to survive extreme exposure to ionizing radiation (IR), we broadly screened nonessential Escherichia coli genes for those involved in IR resistance by using transposon-directed insertion sequencing (TraDIS). Forty-six genes were identified, most of which become essential upon heavy IR exposure. Most of these were subjected to direct validation. The results reinforced the notion that survival after high doses of ionizing radiation does not depend on a single mechanism or process, but instead is multifaceted. Many identified genes affect either DNA repair or the cellular response to oxidative damage. However, contributions by genes involved in cell wall structure/function, cell division, and intermediary metabolism were also evident. About half of the identified genes have not previously been associated with IR resistance or recovery from IR exposure, including eight genes of unknown function.
24927582 The molecular mechanisms of ethanol toxicity and tolerance in bacteria, although important for biotechnology and bioenergy applications, remain incompletely understood. Genetic studies have identified potential cellular targets for ethanol and have revealed multiple mechanisms of tolerance, but it remains difficult to separate the direct and indirect effects of ethanol. We used adaptive evolution to generate spontaneous ethanol-tolerant strains of Escherichia coli, and then characterized mechanisms of toxicity and resistance using genome-scale DNAseq, RNAseq, and ribosome profiling coupled with specific assays of ribosome and RNA polymerase function. Evolved alleles of metJ, rho, and rpsQ recapitulated most of the observed ethanol tolerance, implicating translation and transcription as key processes affected by ethanol. Ethanol induced miscoding errors during protein synthesis, from which the evolved rpsQ allele protected cells by increasing ribosome accuracy. Ribosome profiling and RNAseq analyses established that ethanol negatively affects transcriptional and translational processivity. Ethanol-stressed cells exhibited ribosomal stalling at internal AUG codons, which may be ameliorated by the adaptive inactivation of the MetJ repressor of methionine biosynthesis genes. Ethanol also caused aberrant intragenic transcription termination for mRNAs with low ribosome density, which was reduced in a strain with the adaptive rho mutation. Furthermore, ethanol inhibited transcript elongation by RNA polymerase in vitro. We propose that ethanol-induced inhibition and uncoupling of mRNA and protein synthesis through direct effects on ribosomes and RNA polymerase conformations are major contributors to ethanol toxicity in E. coli, and that adaptive mutations in metJ, rho, and rpsQ help protect these central dogma processes in the presence of ethanol.
24272778 Escherichia coli AraC is a well-described transcription activator of genes involved in arabinose metabolism. Using complementary genomic approaches, chromatin immunoprecipitation (ChIP)-chip, and transcription profiling, we identify direct regulatory targets of AraC, including five novel target genes: ytfQ, ydeN, ydeM, ygeA, and polB. Strikingly, only ytfQ has an established connection to arabinose metabolism, suggesting that AraC has a broader function than previously described. We demonstrate arabinose-dependent repression of ydeNM by AraC, in contrast to the well-described arabinose-dependent activation of other target genes. We also demonstrate unexpected read-through of transcription at the Rho-independent terminators downstream of araD and araE, leading to significant increases in the expression of polB and ygeA, respectively. AraC is highly conserved in the related species Salmonella enterica. We use ChIP sequencing (ChIP-seq) and RNA sequencing (RNA-seq) to map the AraC regulon in S. enterica. A comparison of the E. coli and S. enterica AraC regulons, coupled with a bioinformatic analysis of other related species, reveals a conserved regulatory network across the family Enterobacteriaceae comprised of 10 genes associated with arabinose transport and metabolism.
24146625 Despite the importance of maintaining redox homeostasis for cellular viability, how cells control redox balance globally is poorly understood. Here we provide new mechanistic insight into how the balance between reduced and oxidized electron carriers is regulated at the level of gene expression by mapping the regulon of the response regulator ArcA from Escherichia coli, which responds to the quinone/quinol redox couple via its membrane-bound sensor kinase, ArcB. Our genome-wide analysis reveals that ArcA reprograms metabolism under anaerobic conditions such that carbon oxidation pathways that recycle redox carriers via respiration are transcriptionally repressed by ArcA. We propose that this strategy favors use of catabolic pathways that recycle redox carriers via fermentation akin to lactate production in mammalian cells. Unexpectedly, bioinformatic analysis of the sequences bound by ArcA in ChIP-seq revealed that most ArcA binding sites contain additional direct repeat elements beyond the two required for binding an ArcA dimer. DNase I footprinting assays suggest that non-canonical arrangements of cis-regulatory modules dictate both the length and concentration-sensitive occupancy of DNA sites. We propose that this plasticity in ArcA binding site architecture provides both an efficient means of encoding binding sites for ArcA, σ(70)-RNAP and perhaps other transcription factors within the same narrow sequence space and an effective mechanism for global control of carbon metabolism to maintain redox homeostasis.
24146601 Chromatin immunoprecipitation followed by high throughput sequencing (ChIP-Seq) has been successfully used for genome-wide profiling of transcription factor binding sites, histone modifications, and nucleosome occupancy in many model organisms and humans. Because the compact genomes of prokaryotes harbor many binding sites separated by only a few base pairs, applications of ChIP-Seq in this domain have not reached their full potential. Applications in prokaryotic genomes are further hampered by the fact that well-studied data analysis methods for ChIP-Seq do not provide the resolution required for deciphering the locations of nearby binding events. We generated single-end tag (SET) and paired-end tag (PET) ChIP-Seq data for σ⁷⁰ factor in Escherichia coli (E. coli). Direct comparison of these datasets revealed that although the PET assay enables higher resolution identification of binding events, standard ChIP-Seq analysis methods are not equipped to utilize PET-specific features of the data. To address this problem, we developed dPeak as a high resolution binding site identification (deconvolution) algorithm. dPeak implements a probabilistic model that accurately describes the ChIP-Seq data generation process for both the SET and PET assays. For SET data, dPeak outperforms or performs comparably to the state-of-the-art high-resolution ChIP-Seq peak deconvolution algorithms such as PICS, GPS, and GEM. When coupled with PET data, dPeak significantly outperforms SET-based analysis with any of the current state-of-the-art methods. Experimental validations of a subset of dPeak predictions from σ⁷⁰ PET ChIP-Seq data indicate that dPeak can estimate locations of binding events with as high as 2 to 21 bp resolution. Applications of dPeak to σ⁷⁰ ChIP-Seq data in E. coli under aerobic and anaerobic conditions reveal closely located promoters that are differentially occupied and further illustrate the importance of high resolution analysis of ChIP-Seq data.
23818864 FNR is a well-studied global regulator of anaerobiosis, which is widely conserved across bacteria. Despite the importance of FNR and anaerobiosis in microbial lifestyles, the factors that influence its function on a genome-wide scale are poorly understood. Here, we report a functional genomic analysis of FNR action. We find that FNR occupancy at many target sites is strongly influenced by nucleoid-associated proteins (NAPs) that restrict access to many FNR binding sites. At a genome-wide level, only a subset of predicted FNR binding sites were bound under anaerobic fermentative conditions and many appeared to be masked by the NAPs H-NS, IHF and Fis. Similar assays in cells lacking H-NS and its paralog StpA showed increased FNR occupancy at sites bound by H-NS in WT strains, indicating that large regions of the genome are not readily accessible for FNR binding. Genome accessibility may also explain our finding that genome-wide FNR occupancy did not correlate with the match to consensus at binding sites, suggesting that significant variation in ChIP signal was attributable to cross-linking or immunoprecipitation efficiency rather than differences in binding affinities for FNR sites. Correlation of FNR ChIP-seq peaks with transcriptomic data showed that less than half of the FNR-regulated operons could be attributed to direct FNR binding. Conversely, FNR bound some promoters without regulating expression presumably requiring changes in activity of condition-specific transcription factors. Such combinatorial regulation may allow Escherichia coli to respond rapidly to environmental changes and confer an ecological advantage in the anaerobic but nutrient-fluctuating environment of the mammalian gut.
23632166 To fit within the confines of the cell, bacterial chromosomes are highly condensed into a structure called the nucleoid. Despite the high degree of compaction in the nucleoid, the genome remains accessible to essential biological processes, such as replication and transcription. Here, we present the first high-resolution chromosome conformation capture-based molecular analysis of the spatial organization of the Escherichia coli nucleoid during rapid growth in rich medium and following an induced amino acid starvation that promotes the stringent response. Our analyses identify the presence of origin and terminus domains in exponentially growing cells. Moreover, we observe an increased number of interactions within the origin domain and significant clustering of SeqA-binding sequences, suggesting a role for SeqA in clustering of newly replicated chromosomes. By contrast, 'histone-like' protein (i.e. Fis, IHF and H-NS) -binding sites did not cluster, and their role in global nucleoid organization does not manifest through the mediation of chromosomal contacts. Finally, genes that were downregulated after induction of the stringent response were spatially clustered, indicating that transcription in E. coli occurs at transcription foci.
23586855 BACKGROUND: ChIP-chip and ChIP-seq are widely used methods to map protein-DNA interactions on a genomic scale in vivo. Waldminghaus and Skarstad recently reported, in this journal, a modified method for ChIP-chip. Based on a comparison of our previously-published ChIP-chip data for Escherichia coli σ32 with their own data, Waldminghaus and Skarstad concluded that many of the σ32 targets identified in our earlier work are false positives. In particular, we identified many non-canonical σ32 targets that are located inside genes or are associated with genes that show no detectable regulation by σ32. Waldminghaus and Skarstad propose that such non-canonical sites are artifacts, identified due to flaws in the standard ChIP methodology. Waldminghaus and Skarstad suggest specific changes to the standard ChIP procedure that reportedly eliminate the claimed artifacts. RESULTS: We reanalyzed our published ChIP-chip datasets for σ32 and the datasets generated by Waldminghaus and Skarstad to assess data quality and reproducibility. We also performed targeted ChIP/qPCR for σ32 and an unrelated transcription factor, AraC, using the standard ChIP method and the modified ChIP method proposed by Waldminghaus and Skarstad. Furthermore, we determined the association of core RNA polymerase with disputed σ32 promoters, with and without overexpression of σ32. 
We show that (i) our published σ32 ChIP-chip datasets have a consistently higher dynamic range than those of Waldminghaus and Skarstad, (ii) our published σ32 ChIP-chip datasets are highly reproducible, whereas those of Waldminghaus and Skarstad are not, (iii) non-canonical σ32 target regions are enriched in a σ32 ChIP in a heat shock-dependent manner, regardless of the ChIP method used, (iv) association of core RNA polymerase with some disputed σ32 target genes is induced by overexpression of σ32, (v) σ32 targets disputed by Waldminghaus and Skarstad are predominantly those that are most weakly bound, and (vi) the modifications to the ChIP method proposed by Waldminghaus and Skarstad reduce enrichment of all protein-bound genomic regions. CONCLUSIONS: The modifications to the ChIP-chip method suggested by Waldminghaus and Skarstad reduce rather than increase the quality of ChIP data. Hence, the non-canonical σ32 targets identified in our previous study are likely to be genuine. We propose that the failure of Waldminghaus and Skarstad to identify many of these σ32 targets is due predominantly to the lower data quality in their study. We conclude that surprising ChIP-chip results are not artifacts to be ignored, but rather indications that our understanding of DNA-binding proteins is incomplete.
23071782 The phosphate starvation response in bacteria has been studied extensively for the past few decades and the phosphate-limiting signal is known to be mediated via the PhoBR two-component system. However, the global DNA binding profile of the response regulator PhoB and the PhoB downstream responses are currently unclear. In this study, chromatin immunoprecipitation for PhoB was combined with high-density tiling array (ChIP-chip) as well as gene expression microarray to reveal the first global downstream responses of the response regulator PhoB in E. coli. Based on our ChIP-chip experimental data, forty-three binding sites were identified throughout the genome and the known PhoB binding pattern was updated by identifying the conserved pattern from these sites. From the gene expression microarray data analysis, 287 differentially expressed genes were identified in the presence of PhoB activity. By comparing the results obtained from our ChIP-chip and microarray experiments, we were also able to identify genes that were directly or indirectly affected through PhoB regulation. Nineteen out of these 287 differentially expressed genes were identified as the genes directly regulated by PhoB. Seven of the 19 directly regulated genes (including phoB) are transcriptional regulators. These transcriptional regulators then further pass the signal of phosphate starvation down to the remaining differentially expressed genes. Our results unveiled the genome-wide binding profile of PhoB and the downstream responses under phosphate starvation. We also present the hierarchical structure of the phosphate sensing regulatory network. The data suggest that PhoB plays protective roles in membrane integrity and oxidative stress reduction during phosphate starvation.
22180530 IHF and HU are two heterodimeric nucleoid-associated proteins (NAP) that belong to the same protein family but interact differently with the DNA. IHF is a sequence-specific DNA-binding protein that bends the DNA by over 160°. HU is the most conserved NAP, which binds non-specifically to duplex DNA with a particular preference for targeting nicked and bent DNA. Despite their importance, the in vivo interactions of the two proteins to the DNA remain to be described at a high resolution and on a genome-wide scale. Further, the effects of these proteins on gene expression on a global scale remain contentious. Finally, the contrast between the functions of the homo- and heterodimeric forms of proteins deserves the attention of further study. Here we present a genome-scale study of HU- and IHF binding to the Escherichia coli K12 chromosome using ChIP-seq. We also perform microarray analysis of gene expression in single- and double-deletion mutants of each protein to identify their regulons. The sequence-specific binding profile of IHF encompasses ∼30% of all operons, though the expression of <10% of these is affected by its deletion suggesting combinatorial control or a molecular backup. The binding profile for HU is reflective of relatively non-specific binding to the chromosome, however, with a preference for A/T-rich DNA. The HU regulon comprises highly conserved genes including those that are essential and possibly supercoiling sensitive. Finally, by performing ChIP-seq experiments, where possible, of each subunit of IHF and HU in the absence of the other subunit, we define genome-wide maps of DNA binding of the proteins in their hetero- and homodimeric forms.
22082910 Although metabolic networks have been reconstructed on a genome scale, the corresponding reconstruction and integration of governing transcriptional regulatory networks has not been fully achieved. Here we reconstruct such an integrated network for amino acid metabolism in Escherichia coli. Analysis of ChIP-chip and gene expression data for the transcription factors ArgR, Lrp and TrpR showed that 19 out of 20 amino acid biosynthetic pathways are either directly or indirectly controlled by these regulators. Classifying the regulated genes into three functional categories of transport, biosynthesis and metabolism leads to the elucidation of regulatory motifs that constitute the integrated network's basic building blocks. The regulatory logic of these motifs was determined on the basis of relationships between transcription factor binding and changes in the amount of transcript in response to exogenous amino acids. Remarkably, the resulting logic shows how amino acids are differentiated as signaling and nutrient molecules, revealing the overarching regulatory principles of the amino acid stimulon.
21572102 The PurR transcription factor plays a critical role in transcriptional regulation of purine metabolism in enterobacteria. Here, we elucidate the role of PurR under exogenous adenine stimulation at the genome-scale using high-resolution chromatin immunoprecipitation (ChIP)-chip and gene expression data obtained under in vivo conditions. Analysis of microarray data revealed that adenine stimulation led to changes in transcript level of about 10% of Escherichia coli genes, including the purine biosynthesis pathway. The E. coli strain lacking the purR gene showed that a total of 56 genes are affected by the deletion. From the ChIP-chip analysis, we determined that over 73% of genes directly regulated by PurR were enriched in the biosynthesis, utilization and transport of purine and pyrimidine nucleotides, and 20% of them were functionally unknown. Compared to the functional diversity of the regulon of the other general transcription factors in E. coli, the functions and size of the PurR regulon are limited.
21097887 Nucleoid-associated proteins (NAPs) are global regulators of gene expression in Escherichia coli, which affect DNA conformation by bending, wrapping and bridging the DNA. Two of these, H-NS and Fis, bind to specific DNA sequences and structures. Because of their importance to global gene expression, the binding of these NAPs to the DNA was previously investigated on a genome-wide scale using ChIP-chip. However, variation in their binding profiles across the growth phase and the genome-scale nature of their impact on gene expression remain poorly understood. Here, we present a genome-scale investigation of H-NS and Fis binding to the E. coli chromosome using chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-seq). By performing our experiments at multiple time-points during growth in rich media, we show that the binding regions of the two proteins are mutually exclusive under our experimental conditions. H-NS binds to significantly longer tracts of DNA than Fis, consistent with the linear spread of H-NS binding from high- to surrounding lower-affinity sites; the length of binding regions is associated with the degree of transcriptional repression imposed by H-NS. For Fis, a majority of binding events do not lead to differential expression of the proximal gene; however, it has a significant indirect effect on gene expression partly through its effects on the expression of other transcription factors. We propose that direct transcriptional regulation by Fis is associated with the interaction of tandem arrays of Fis molecules with the DNA and possible DNA bending, particularly at operon-upstream regions. Our study serves as a proof-of-principle for the use of ChIP-seq for global DNA-binding proteins in bacteria, which should become significantly more economical and feasible with the development of multiplexing techniques.
19706412 The transcription termination factor Rho is a global regulator of RNA polymerase (RNAP). Although individual Rho-dependent terminators have been studied extensively, less is known about the sites of RNAP regulation by Rho on a genome-wide scale. Using chromatin immunoprecipitation and microarrays (ChIP-chip), we examined changes in the distribution of Escherichia coli RNAP in response to the Rho-specific inhibitor bicyclomycin (BCM). We found approximately 200 Rho-terminated loci that were divided evenly into 2 classes: intergenic (at the ends of genes) and intragenic (within genes). The intergenic class contained noncoding RNAs such as small RNAs (sRNAs) and transfer RNAs (tRNAs), establishing a previously unappreciated role of Rho in termination of stable RNA synthesis. The intragenic class of terminators included a previously uncharacterized set of short antisense transcripts, as judged by a shift in the distribution of RNAP in BCM-treated cells that was opposite to the direction of the corresponding gene. These Rho-terminated antisense transcripts point to a role of noncoding transcription in E. coli gene regulation that may resemble the ubiquitous noncoding transcription recently found to play myriad roles in eukaryotic gene regulation.
19647521 Protein-DNA interactions are fundamental to core biological processes, including transcription, DNA replication, and chromosomal organization. We have developed in vivo protein occupancy display (IPOD), a technology that reveals protein occupancy across an entire bacterial chromosome at the resolution of individual binding sites. Application to Escherichia coli reveals thousands of protein occupancy peaks, highly enriched within and in close proximity to noncoding regulatory regions. In addition, we discovered extensive (>1 kilobase) protein occupancy domains (EPODs), some of which are localized to highly expressed genes, enriched in RNA-polymerase occupancy. However, the majority are localized to transcriptionally silent loci dominated by conserved hypothetical ORFs. These regions are highly enriched in both predicted and experimentally determined binding sites of nucleoid proteins and exhibit extreme biophysical characteristics such as high intrinsic curvature. Our observations implicate these transcriptionally silent EPODs as the elusive organizing centers, long proposed to topologically isolate chromosomal domains.
19052235 Broad-acting transcription factors (TFs) in bacteria form regulons. Here, we present a 4-step method to fully reconstruct the leucine-responsive protein (Lrp) regulon in Escherichia coli K-12 MG1655 that regulates nitrogen metabolism. Step 1 is composed of obtaining high-resolution ChIP-chip data for Lrp, the RNA polymerase and expression profiles under multiple environmental conditions. We identified 138 unique and reproducible Lrp-binding regions and classified their binding state under different conditions. In the second step, the analysis of these data revealed 6 distinct regulatory modes for individual ORFs. In the third step, we used the functional assignment of the regulated ORFs to reconstruct 4 types of regulatory network motifs around the metabolites that are affected by the corresponding gene products. In the fourth step, we determined how leucine, as a signaling molecule, shifts the regulatory motifs for particular metabolites. The physiological structure that emerges shows that the regulatory motifs for different amino acids fall into the traditional classification of amino acid families, thus elucidating the structure and physiological functions of the Lrp regulon. The same procedure can be applied to other broad-acting TFs, opening the way to full bottom-up reconstruction of the transcriptional regulatory network in bacterial cells.
18370100 Interactions between cis-acting elements and proteins play a key role in transcriptional regulation of all known organisms. To better understand these interactions, researchers developed a method that couples chromatin immunoprecipitation with microarrays (also known as ChIP-chip), which is capable of providing a whole-genome map of protein-DNA interactions. This versatile and high-throughput strategy is initiated by formaldehyde-mediated cross-linking of DNA and proteins, followed by cell lysis, DNA fragmentation, and immunopurification. The immunoprecipitated DNA fragments are then purified from the proteins by reverse-cross-linking followed by amplification, labeling, and hybridization to a whole-genome tiling microarray against a reference sample. The enriched signals obtained from the microarray then are normalized by the reference sample and used to generate the whole-genome map of protein-DNA interactions. The protocol described here has been used for discovering the genomewide distribution of RNA polymerase and several transcription factors of Escherichia coli.
18340041 We determined the genome-wide distribution of the nucleoid-associated protein Fis in Escherichia coli using chromatin immunoprecipitation coupled with high-resolution whole genome-tiling microarrays. We identified 894 Fis-associated regions across the E. coli genome. A significant number of these binding sites were found within open reading frames (33%) and between divergently transcribed transcripts (5%). Analysis indicates that A-tracts and AT-tracts are an important signal for preferred Fis-binding sites, and that A(6)-tracts in particular constitute a high-affinity signal that dictates Fis phasing in stretches of DNA containing multiple and variably spaced A-tracts and AT-tracts. Furthermore, we find evidence for an average of two Fis-binding regions per supercoiling domain in the chromosome of exponentially growing cells. Transcriptome analysis shows that approximately 21% of genes are affected by the deletion of fis; however, the changes in magnitude are small. To address differential Fis binding under growth environment perturbation, ChIP-chip analysis was performed using cells grown under aerobic and anaerobic growth conditions. Interestingly, the Fis-binding regions are almost identical in aerobic and anaerobic growth conditions, indicating that the E. coli genome topology mediated by Fis is superficially identical in the two conditions. These novel results provide new insight into how Fis modulates DNA topology at a genome scale and thus advance our understanding of the architectural bases of the E. coli nucleoid.