Under-representation of intrinsic terminators across bacterial genomic islands: Rho
Horizontal gene transfer Intrinsic termination Hairpin RNA polymerase Pathogenicity 
1. Introduction
Transcription involves synthesis of RNA by RNA polymerase ( RNAP ) on a DNA template and is functionally divided into initiation , elongation and termination ( von Hippel , 1998 ) . 
The last step i.e. termination involves stopping of elongation , release of the RNA and dissociation of the RNAP machinery ( Richardson and Greenblatt , 1996 ) . 
In bacteria , termination functions by two mechanisms -- intrinsic and factor-dependent ( Peters et al. , 2011 ; Santangelo and Artsimovitch , 2011 ) . 
At intrinsic terminators ( ITs ) , termination is effected by the sequence and structural features of the hairpin and the U-trail of the nascent RNA ( Epshtein et al. , 2007 ) . 
In contrast , factor-dependent termination predominantly involves the Rho protein which shows little preference for any specific sequence or structure on the RNA and the template DNA for its activity ( Ciampi , 2006 ; Richardson , 2002 ) . 
Rho seems to be the major termination factor for genes that do not have an IT downstream . 
It has been speculated that several genes are likely targets for Rho in vivo , although only a few have been characterized ( Ciampi , 2006 ) . 
Historically , termination has received relatively less attention than the first two steps of transcription . 
In the recent post-genomics era , its regulatory importance in the context of the whole cell is being understood ( Cardinale et al. , 2008 ; Peters et al. , 2009 ) . 
Abbreviations : GI , genomic islands ; IT , intrinsic terminator ; HGT , horizontal gene transfer ; RNAP , RNA polymerase . 
Corresponding author at : Department of Microbiology and Cell Biology , Indian Institute of Science , Bangalore-560012 , India . 
Tel. : +91 80 22932598 ; fax : +91 80 23600668 . 
E-mail address : vraj@mcbl.iisc.ernet.in ( V. Nagaraja ) . 
An outcome of the large-scale sequencing and annotation of genomes is the `` pangenome '' concept . 
It is now understood that horizontal gene transfer has played a pivotal role in the evolution of prokaryotes ( Boto , 2010 ; Boyd et al. , 2009 ; Juhas et al. , 2009 ; Ochman et al. , 2000 ) . 
In a nutshell , horizontal gene transfer ( HGT ) is the acquisition of DNA from the environment and its integration into the genome of the recipient species . 
The genes would be inherited by the daughter cells , even though they were not transmitted `` vertically '' . 
Such genes or gene clusters ( henceforth generically referred to as genomic islands , GIs ) code for various proteins with myriad functions . 
Their acquisition can result in `` quantum leaps '' by bacterial genomes ( Boto , 2010 ; Nakamura et al. , 2004 ) . 
However , uncoordinated expression of recently acquired genes , or expression of toxic proteins from bacteriophages ( Canchaya et al. , 2004 ; Casjens , 2003 ) , can have disastrous effects on the cellular homeostasis of the host . 
Hence , after entering a genome , most GIs are repressed by silencing mechanisms that act at different stages of the gene expression process ( Navarre et al. , 2007 ) . 
Although mechanisms that control initiation and repression of transcription in GIs have been studied , the importance of transcription termination at genomic islands was noticed only recently . 
In Escherichia coli , the global regulatory role of Rho has emerged from two studies using either microarray or ChIP-chip approaches ( Cardinale et al. , 2008 ; Peters et al. , 2009 ) . 
Such studies have unambiguously shown that , in E. coli , Rho-dependent termination is important in suppressing aberrant expression from genomic islands , including prophages . 
In this manuscript , we propose that the suppression of transcription in GIs could be a universally conserved function of Rho . 
We show that Rho-dependent termination indeed seems to play a similar role in regulating expression at GIs across diverse bacterial phyla . 
Furthermore , based on the experimental understanding of the mechanism of interaction of Rho with the nascent RNA and RNAP ( Dutta et al. , 2008 ) , we suggest that the lower density of ITs in GIs could actually facilitate Rho-dependent termination . 
2. Materials and methods
The program GeSTer can recognize both canonical and non-canonical intrinsic terminators . 
The mode of action of GeSTer has been described earlier ( Mitra et al. , 2009 , 2010 ; Unniraman et al. , 2002 ) . 
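To make the notion of a canonical intrinsic terminator concrete, the following is a minimal toy sketch that scans a DNA sequence for a perfectly paired stem-loop followed by a T-rich tract (the DNA counterpart of the U-trail). It is not GeSTer's algorithm: GeSTer also recognizes non-canonical terminators, and the function name, the fixed stem length, the loop range and the T-tract thresholds used here are all illustrative assumptions.

```python
# Toy illustration only; NOT the GeSTer algorithm.
# All thresholds below are hypothetical choices made for this sketch.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_terminators(seq, stem_len=6, loop_range=(3, 8),
                               tail_len=8, min_t=5):
    """List (start, stem, loop_length, tail) for perfect hairpins followed by a T-rich tail.

    A hit requires a stem of exactly `stem_len` bases whose reverse complement
    occurs after a loop of allowed length, immediately followed by a window of
    `tail_len` bases containing at least `min_t` thymines.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - stem_len + 1):
        left = seq[i:i + stem_len]
        for loop_len in range(loop_range[0], loop_range[1] + 1):
            r_start = i + stem_len + loop_len
            right = seq[r_start:r_start + stem_len]
            tail = seq[r_start + stem_len:r_start + stem_len + tail_len]
            if len(right) < stem_len or len(tail) < tail_len:
                continue  # ran off the end of the sequence
            if right == reverse_complement(left) and tail.count("T") >= min_t:
                hits.append((i, left, loop_len, tail))
    return hits


# Made-up sequence: GC-rich stem, 4-nt loop, matching stem, then eight Ts.
example = "CC" + "GCCCGC" + "GAAA" + "GCGGGC" + "TTTTTTTT" + "AC"
hits = find_candidate_terminators(example)
print(hits)
```

A realistic predictor would score imperfect stems by folding free energy and would search both strands downstream of each stop codon; the sketch only shows the hairpin-plus-U-trail idea described in the Introduction.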
All genome sequences were downloaded in their GenBank format from NCBI ( ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/ ) . 
Sequence information about GIs was obtained from available literature , and from the NCBI lists for individual genomes . 
Once GeSTer had identified all the ITs for a genome , we analyzed the intrinsic terminator content of the GIs of that genome . 
For a given GI , the density of ITs ( DIT ) was calculated as the [ ( number of ITs identified ) / ( number of genes ) ] × 100 . 
Similarly , the genomic DIT = [ ( number of ITs identified in genome ) / ( number of genes in genome ) ] × 100 . 
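The two DIT definitions above amount to a simple ratio. As a minimal sketch, the calculation can be written as follows; the function names are hypothetical, and the numbers are the O-island counts and genomic DIT reported in Section 3.2 (135 ITs for 616 genes, genomic DIT of 36.6 %).

```python
def density_of_its(num_its: int, num_genes: int) -> float:
    """Density of intrinsic terminators (DIT), as a percentage of genes."""
    if num_genes <= 0:
        raise ValueError("num_genes must be positive")
    return 100.0 * num_its / num_genes


def relative_to_genome(gi_dit: float, genomic_dit: float) -> float:
    """Express the DIT of a genomic island as a percentage of the genomic DIT."""
    if genomic_dit <= 0:
        raise ValueError("genomic_dit must be positive")
    return 100.0 * gi_dit / genomic_dit


# Counts quoted in Section 3.2 for the 11 O-islands of E. coli O157:H7 EDL933.
oi_dit = density_of_its(135, 616)
print(f"O-island DIT: {oi_dit:.1f} %")
print(f"Relative to genomic DIT of 36.6 %: {relative_to_genome(oi_dit, 36.6):.0f} %")
```

Expressing the island value relative to the genomic value is what allows species with very different absolute IT abundance to be compared on the same footing.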
To ascertain which transcription units ( multigenic operon or single-gene ) had an IT downstream , the gene at the 3 ′ end of a multigenic operon was identified from the DOOR database , and the GeSTer results for that genome were analyzed to see if that 3 ′ terminal gene had an IT after its stop codon . 
The HGT ( IT/TU ) % was calculated as [ ( number of ITs in the GI ) / ( number of transcription units in the GI ) ] × 100 . 
The total number of transcription units in the genome was calculated from the genome-specific statistics available at the DOOR site ( http://csbl1.bmb.uga.edu/OperonDB_10142009/displayspecies.php ) . 
Similarly , the genomic ( IT/TU ) % was calculated as [ ( number of ITs in the genome ) / ( number of transcription units in the genome ) ] × 100 . 
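The transcription-unit analysis can be sketched as below. This is an illustration under stated assumptions rather than the pipeline actually used: each transcription unit is represented as a tuple of gene identifiers ordered 5 ′ to 3 ′ (as one might derive from DOOR), the set of genes with a downstream IT stands in for GeSTer output, and only an IT after the 3 ′ terminal gene of a unit is counted. All gene names are made up.

```python
from typing import Iterable, Sequence, Set


def it_per_tu_percent(transcription_units: Iterable[Sequence[str]],
                      genes_with_downstream_it: Set[str]) -> float:
    """Percentage of transcription units whose 3'-terminal gene has an IT downstream.

    Each unit lists its genes in 5' to 3' order; a single-gene unit has one element.
    """
    units = [tu for tu in transcription_units if tu]
    if not units:
        raise ValueError("no transcription units supplied")
    terminated = sum(1 for tu in units if tu[-1] in genes_with_downstream_it)
    return 100.0 * terminated / len(units)


# Hypothetical island with four transcription units.
island_tus = [("g1", "g2", "g3"), ("g4",), ("g5", "g6"), ("g7",)]
its_after = {"g2", "g4"}  # g2 is internal to its operon, so only g4 counts
island_percent = it_per_tu_percent(island_tus, its_after)
print(island_percent)
```

Counting per transcription unit rather than per gene removes the effect of operon size, which is the concern addressed later in Section 3.6.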
3. Results and discussion
3.1. Rationale for the experimental design
A salient result of the microarray studies in E. coli K-12 is that , when Rho action was inhibited by the antibiotic Bicyclomycin , the transcription of several GIs ( known as K-islands in E. coli K-12 MG1655 ) significantly increased ( Cardinale et al. , 2008 ) . 
These studies also revealed an under-representation of ITs in the same K-islands . 
In yet another study , treatment of E. coli K-12 MG1655 with a sublethal dose of Bicyclomycin followed by ChIP-chip analysis revealed several regions on the chromosome where RNAP could localize only in the presence of Bicyclomycin ( Peters et al. , 2009 ) . 
The inference was that Bicyclomycin specifically inhibited Rho in these cells , thus allowing RNAP to transcribe into regions where Rho would have caused termination in absence of the antibiotic ( Peters et al. , 2009 ) . 
These genomic regions , named Bicyclomycin Sensitive Regions ( BSRs ) , are thus sites where Rho-dependent termination would normally occur . 
The study identified 23 BSRs which were downstream of K-12-specific genes ( belonging to K-islands ) or prophage DNA . 
We analyzed the IT profile of these BSRs and found that they have an under-representation of ITs and hairpins . 
Of the 23 BSRs that are downstream of the GIs , there was not a single IT or even a stable hairpin-forming sequence in 16 ( 70 % ) of them ( Supplementary Table S1 ) . 
Thus , ITs are under-represented in those regions of E. coli genome where Rho is functioning . 
In fact , the scarcity of ITs seems to have been compensated by the action of Rho ( Cardinale et al. , 2008 ) . 
Hence , Rho is most likely to terminate transcription at the ends of genes where ITs are absent , as intrinsic and factor-dependent ( predominantly Rho-dependent ) termination are the only mechanisms of termination known in bacteria . 
This would mean that the DIT of the GIs of any genome could be an indicator of Rho activity at such genomic islands . 
In other words , if the DIT of GI ( s ) is lower than the DIT of the whole genome , then Rho-dependent termination is probably an important mode of regulation in these GI ( s ) . 
Hence , we selected representative genomes from different phyla and classes , for which information about GIs was available , and analyzed their IT profiles using the GeSTer algorithm , which detects both canonical and non-canonical ITs ( Mitra et al. , 2009 , 2010 ; Unniraman et al. , 2002 ) . 
If the assumption that GIs across bacteria have extensive Rho-dependent termination is correct , we should observe a consistent trend of decreased presence of ITs in GIs in different species . 
Our sample included well characterized prophages , cryptic phages and other kinds of GIs . 
3.2. GIs of other E. coli strains are poor in ITs
The evidence for the importance of Rho-dependent termination in GIs of E. coli comes primarily from experiments in E. coli K-12 MG1655 . 
In particular , the paucity of ITs in GIs was shown only for the K-islands of E. coli K-12 ( Blattner et al. , 1997 ; Cardinale et al. , 2008 ) . 
At first , we ensured that the results reported for the K-islands of E. coli K-12 could also be obtained using GeSTer . 
Tabulation of the ITs in 42 K-islands ( Cardinale et al. , 2008 ) showed that there was indeed a ~ 50 % reduction in DIT . 
DIT in these GIs was only 21.9 % as compared to the whole genomic DIT of E. coli K-12 of 41.7 % . 
Thus , although we had used a different algorithm , these results were consistent with the previous study . 
Next , we considered another `` model '' strain , E. coli O157 : H7 EDL933 , which also houses several GIs , collectively called O-islands ( OIs ) . 
As with E. coli K-12 , the OIs of this genome also show enhanced transcription after bicyclomycin treatment ( Cardinale et al. , 2008 ) . 
Hence , the IT profile of 11 OIs -- OI-7 , 8 , 9 , 35 , 36 , 43 , 44 , 45 , 47 , 48 and 50 ( consisting of a total of 616 genes , i.e. 11.8 % of the genome ) -- of E. coli O157 : H7 EDL933 was analyzed . 
The major criterion for selecting these OIs was that they all were relatively large GIs . 
The largest among them , OI-43 , encodes 106 genes while the smallest , OI-35 , contains 15 genes . 
Additionally , in order to assess the regions annotated as resident phages , we selected a prophage ( OI-45 ) and four representative cryptic phages . 
Out of 616 genes from these 11 O-islands , only 135 genes have an IT immediately downstream . 
Thus , as observed in E. coli K-12 , the number of ITs is distinctly lower ( DIT = 21.9 % ) in these islands as compared to the genomic DIT of 36.6 % ( Fig. 1A ) . 
A closer examination into the IT profiles of the individual OIs showed that large stretches of genes are devoid of any ITs . 
Also , as reported in E. coli K-12 , we note that many genes occur in series on the same strand and most of these genes , including the gene at the 3 ′ end of the series , often lack ITs ( Cardinale et al. , 2008 ) . 
If these serial gene clusters are operons , then it seems likely that they lack an IT downstream . 
In addition , ITs are absent for most of the genes that are at the 5 ′ or 3 ′ ends of the OIs . 
Lack of identifiable ITs hints at the possibility that Rho-dependent termination is probably the major termination mechanism in these OIs . 
The genomes of two other strains of E. coli -- enteropathogenic E. coli 234869 ( Iguchi et al. , 2009 ) and uropathogenic E. coli CFT073 ( Lloyd et al. , 2007 ) -- code for several experimentally characterized pathogenicity islands . 
The total number of GI genes identified in E. coli 234869 is 493 . 
Besides prophages , these GIs also include the LEE island that has been implicated in virulence . 
Similarly , the CFT073 strain houses the well-known islands -- PAI-II , PAI-III and PAI-CFT073-serX -- that encode a total of 299 genes ( Lloyd et al. , 2007 , 2009 ) . 
The DITs of these islands show a similar decrease in the abundance of ITs . 
The DITs of the islands were 19.9 % and 20.1 % for strains 234869 and CFT073 respectively , i.e. between 50 and 58 % of the genomic values ( Fig. 1B , Supplementary Fig. 1A ) . 
A detailed analysis of two islands -- PAI-II from strain CFT073 and LEE from strain 234869 -- for the presence of ITs in relation to the genomic organization reaffirms these observations . 
PAI-II has 74 genes and can be considered divisible into 17 gene clusters ( Fig. 1C ) . 
Of these , ITs are totally absent in case of 11 clusters while only three clusters have ITs at the ends . 
The IT profile of the LEE island is analogous . 
Of the nine gene clusters in the LEE island , six clusters have no IT at all ( Supplementary Fig S2 ) . 
This includes the clusters that are at the two ends of the LEE island , suggesting that Rho-dependent termination may also prevent read-through transcription into and out of the island . 
Thus , both the PAI-II and LEE islands are poor in ITs , with skewed distribution suggesting that Rho-dependent termination is a significant regulator of these GIs . 
Thus , it seems that GIs across various E. coli strains are poor in ITs , and , as experiments have shown in E. coli K-12 , are most likely `` hotspots '' for Rho-dependent termination . 
3.3. GIs in other γ-proteobacteria have a dearth of ITs
Salmonella enterica serovar Typhimurium ( LT2 isolate ) has a genome that encodes several prophages , pathogenicity islands and phage remnants . 
We considered two pathogenicity islands , SPI-I ( Lostroh and Lee , 2001 ) and SPI-II ( Hensel et al. , 1997 ) , four prophages and a phage remnant region ( 4422192 -- 4438335 bp ) . 
Similar to the results described above with the E. coli strains , both SPI-I and SPI-II showed very low DIT of 10.4 % and 6.8 % respectively , compared to the genomic DIT of 37.2 % ( Fig. 2A ) . 
Analysis of four representative prophage regions -- Gifsy-1 , Gifsy-2 , Fels-1 and Fels-2 also showed that the DIT values are consistently lower -- between 40 and 60 % of the genomic value . 
To ascertain that the paucity of ITs was observable only within the GIs , and was not a general feature of that part of the genome , we resorted to a `` neighboring region '' approach . 
We analyzed the DIT ( Density of Intrinsic Terminators ) value of genomic stretches immediately adjacent to a GI . 
The stretch considered was very similar in total number of genes to the GI , but was part of the `` core genome '' . 
Thus , it served as a `` control experiment '' for the in silico analysis . 
The abundance of ITs in a 42-gene stretch ( STM2644-2693 ) ( DIT = 42.8 % ) was in sharp contrast to the neighboring 46-gene Fels-2 prophage , which has a DIT of 17.4 % . 
Such `` neighborhood analysis '' revealed similar trends in other genomes . 
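The `` neighboring region '' control can be sketched as a comparison between the island and an equally sized window of core genes immediately upstream. The representation below (an ordered list of gene identifiers, a set of genes with an IT downstream, and half-open index ranges) is an assumption made for illustration, and the toy data are invented.

```python
from typing import List, Set, Tuple


def window_dit(genes: List[str], its: Set[str], start: int, end: int) -> float:
    """DIT (%) for genes[start:end], counting genes that have an IT downstream."""
    window = genes[start:end]
    if not window:
        raise ValueError("empty window")
    return 100.0 * sum(g in its for g in window) / len(window)


def island_vs_upstream(genes: List[str], its: Set[str],
                       gi_start: int, gi_end: int) -> Tuple[float, float]:
    """Return (island DIT, DIT of the same-sized window just upstream of the island)."""
    size = gi_end - gi_start
    ctrl_start = max(0, gi_start - size)  # truncated if the island is near the start
    return (window_dit(genes, its, gi_start, gi_end),
            window_dit(genes, its, ctrl_start, gi_start))


# Invented example: 20 genes in chromosomal order, island = genes 10..19.
genes = [f"g{i}" for i in range(20)]
its = {"g1", "g3", "g5", "g7", "g12"}
island_dit, control_dit = island_vs_upstream(genes, its, 10, 20)
print(island_dit, control_dit)
```

Because the control window sits in the same chromosomal neighborhood, a lower island DIT cannot be attributed to a regional property of that part of the genome.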
Another γ-proteobacterium , Pseudomonas aeruginosa PA-14 , harbors the large PAP-1 island ( Battle et al. , 2009 ) , shown to be important for virulence . 
GeSTer analysis showed that the PAP-1 island has only 11 ITs although it consists of 114 genes . 
This means that its DIT is only ~ 34 % of the genomic value . 
A closer inspection of the PAP-1 island ( Fig. 3A ) indicated the absence of ITs at either end of the island . 
Also , only two of the 16 gene-clusters in the PAP-1 island are probably terminated with an IT . 
Thus , Rho seems to be the major effector of transcription termination in this GI . 
The results are in congruence with the initial report in E. coli K-12 and suggest that Rho-dependent termination is indeed a strong regulator at GIs in γ-proteobacteria . 
3.4. IT profiles of α-, β- and ε-proteobacteria species
Next , extending the study beyond γ-proteobacteria , the genomic islands of three representative proteobacteria -- Bordetella petrii ( β-proteobacteria ) , Helicobacter pylori ( ε-proteobacteria ) and Brucella melitensis ( α-proteobacteria ) -- were analyzed . 
B. petrii , an environmental Bordetella species ( Lechner et al. , 2009 ) , has 7 large genomic islands ( GI-1 to GI-7 ) and 2 prophages , encoding a total of 1150 genes . 
In line with the previous results , the GIs of B. petrii also have fewer ITs ( ~ 43 % of the genomic average ) . 
If a `` control '' region of 90 genes just upstream of GI-7 ( encodes 87 genes ) is considered , it has a DIT of 33.3 % , in sharp contrast to GI-7 's DIT of 11.5 % ( Fig. 2B ) . 
Also , the two prophage regions in B. petrii have the lowest DIT . 
H. pylori 26695 strain has a DIT of 14.7 % ( Mitra et al. , 2009 ) . 
However , the DIT of the 27-gene encoding cag island ( Blomstergren et al. , 2004 ) is 7.4 % i.e. 50 % of the genomic value ( Figs. 2C , 3B ) . 
It is noteworthy that although absolute values of ITs are lower in this species when compared to others analyzed , the trend of lower DIT in GIs is consistent across distant species . 
In contrast , a `` control '' region of 35 genes immediately upstream of cag island showed a DIT of 13.9 % , very close to the genomic average . 
In the case of the pathogenic α-proteobacterium B. melitensis , a comparison between the genomic DIT ( 27.9 % ) and that of the genomic islands ( 16 % ) ( Supplementary Fig. 1B ) revealed the same trend observed in other proteobacterial species . 
3.5. ITs in the genomic islands in other bacterial phyla
Several actinobacteria genomes sequenced so far also have their share of GIs . 
A functional Rho homologue has been reported for Micrococcus luteus ( Nowatzke et al. , 1997 ) , Streptomyces lividans and Mycobacterium tuberculosis ( Kalarickal et al. , 2009 ) . 
Thus , it is possible that Rho could play a similar `` silencing of xenogenic DNA '' role in actinobacteria . 
For the present analysis , three genomes were selected -- M. tuberculosis , Mycobacterium abscessus and Corynebacterium diphtheriae . 
GIs have been recently identified in the M. tuberculosis H37Rv genome ( Becq et al. , 2007 ) . 
The search identified only 36 ITs immediately downstream of the 454 GIs ' genes in M. tuberculosis H37Rv ( Supplementary Fig. 1C ) . 
Also , several `` large islands '' , notably , Rv739-750 , Rv2954-2961 , Rv3081-3089 , Rv3108-3227 and Rv298-303 did not have a single flanking or internal IT . 
The islands Rv0057-0080 ( Fig. 3C ) and Rv0595-0614 had several gene clusters with no IT at their 3 ′ ends . 
The experimentally characterized genomic island Rv0986-0988 ( Rosas-Magallanes et al. , 2006 ) had no IT either . 
Thus , even for a bacterium which has a distinctly low abundance of ITs ( DIT = 11.9 % ) , the GIs , which comprise ~ 10 % of the genome , show further decrease in the IT content ( DIT = 7.7 % ) . 
In M. abscessus ( Ripoll et al. , 2009 ) , a rapidly growing mycobacterium that causes pulmonary and soft-tissue infections , a similar pattern is observed ( Fig. 2D ) . 
Furthermore , if the GIs of M. abscessus are divided into prophage and non-prophage regions , then the three prophages show a further reduction in the number of ITs . 
In contrast , a `` control '' region consisting of a similar number of genes ( MAB0198-0220 ) immediately upstream of the prophage MAB0221-0242 has a much larger number of ITs . 
An analogous situation is seen in the case of the non-mycobacterial actinomycete , C. diphtheriae 13129 ( Cerdeno-Tarraga et al. , 2003 ) . 
The two known prophages of C. diphtheriae are the poorest with respect to the number of ITs ( Supplementary Fig. 1D ) . 
3.6. Fewer operons within GIs have an IT downstream
It can be argued that the differences in DIT between GIs and `` core '' regions of a given genome are a function of their operonic content . 
To check this possible scenario , we used the DOOR ( Mao et al. , 2009 ) , which is considered a reliable database for operon prediction ( Brouwer et al. , 2008 ) . 
For any genome , the complete set of transcription units ( TU ) , that includes both multigenic operons and single-gene TU , can be obtained from DOOR . 
This data allowed us to ascertain the number of TUs in some of the GIs analyzed from diverse species and also how many of those TUs have an IT downstream . 
Although the results in this case are obtained from two prediction systems , DOOR and GeSTer are among the most reliable tools available , and so errors are likely to be minimal . 
The results show that fewer TUs belonging to GIs have an IT downstream , as compared to the genomic estimates ( Supplementary Table 2 ) . 
The lack of ITs at the 3 ′ ends of many TUs in these GIs indicates that these TUs employ Rho-dependent termination . 
3.7. ITs in the GIs of Bacillus subtilis and Staphylococcus aureus , two species with low expression of Rho
B. subtilis , a firmicute , has been shown experimentally to have very low intracellular levels of Rho , constituting about 0.004 % of the total cellular soluble protein ( Ingham et al. , 1999 ) . 
In contrast , the level of Rho is ~ 0.15 % of the total protein in E. coli , and even higher levels of Rho expression are seen in mycobacteria ( Mitra et al. , unpublished observations ) . 
The non-essential nature of Rho in B. subtilis has been demonstrated by the fact that B. subtilis grows well in the presence of Bicyclomycin , the specific inhibitor of Rho , although the antibiotic inhibits all known Rho homologues , including B. subtilis Rho , in vitro . 
Moreover , in other firmicutes such as S. aureus and Streptococcus species , Rho is non-essential ( Washburn et al. , 2001 ) or has been lost ( Mitra et al. , 2009 ) , illustrating the limited importance of Rho in firmicutes . 
Not surprisingly , the firmicutes are the species with the highest incidence of ITs ( de Hoon et al. , 2005 ; Mitra et al. , 2009 ) . 
Since Rho action seems to be predominant wherever there is a lack of ITs , as a corollary , in species with decreased levels of Rho , not only `` core genome '' regions but also genomic islands would employ a larger number of ITs for regulation of gene expression . 
Indeed , searching the GIs of B. subtilis 168 genome ( Westers et al. , 2003 ) with GeSTer confirms this hypothesis . 
The DIT for GIs is 32.7 % while the genomic DIT is 41.3 % . 
Similarly , the genomic DIT of S. aureus is 33.3 % , while that of three representative GIs ( including a prophage and TSST-pathogenicity island ) is 25 % . 
Thus of all the species analyzed , the GIs of B. subtilis and S. aureus have the highest number of ITs . 
It is most likely that the overall increased dependence of B. subtilis and S. aureus on intrinsic termination is mirrored in their GIs , as there is insufficient Rho for efficient inter- and intragenic termination . 
Such species could be employing other mechanisms , such as restriction -- modification ( R-M ) systems or nucleoid-associated proteins ( NAPs ) , to silence spurious expression of GIs . 
For example , the sequences of GIs often have a GC content that is lower than that of the host genome , allowing NAPs to selectively silence such regions ( Gordon et al. , 2010 ; Navarre et al. , 2006 ) . 
3.8. Dearth of ITs in GIs may facilitate Rho-dependent termination
A simple explanation for these observations , consistent with experimental data , is that these GIs are `` hotspots '' for Rho-dependent termination ( Cardinale et al. , 2008 ; Peters et al. , 2009 ) . 
A trans factor such as Rho is probably advantageous over cis-acting intrinsic termination in the context of GIs . 
Rho initiates termination by first loading onto a stretch of nascent RNA called the rut ( Rho utilization ) site ( Richardson , 2003 ; Richardson and Richardson , 1996 ) . 
In the few Rho-dependent terminators that have been experimentally characterized ( Ciampi , 2006 ) the rut site is C-rich but has no consensus sequence . 
Thus , C-rich sequences cannot be termed specific sites , and not all C-rich sequences are Rho sites . 
Moreover , Rho also binds to other RNAs , and the recent genome-wide studies on Rho have not identified even a degenerate consensus sequence at Rho-dependent termination sites . 
In fact , the lack of a conserved sequence in the rut site could well enhance Rho 's ability to carry out both intergenic and intragenic termination of any gene , provided it has access to a sufficient length of naked RNA ( Ciampi , 2006 ; Faus and Richardson , 1990 ; Gowrishankar and Harinarayanan , 2004 ) ( Fig. 4 ) . 
Additionally , many of these xenogenic genes have a relatively poor codon adaptation index . 
Hence , it is likely that the leading ribosome actually lags far behind the transcribing RNAP allowing Rho to bind to the naked RNA in between the RNAP and the ribosome and cause termination ( Richardson , 2006 ) ( Fig. 4 ) . 
Thus , Rho is uniquely suited to be the primary mediator for prematurely terminating transcription of genes encoded in GIs . 
Since Rho is functioning , there is no evolutionary constraint in favor of ITs at these GIs . 
However , there are two `` limitations '' to Rho 's mechanism , and both are based on its limited ability to bypass a hairpin structure on the transcript . 
An intervening double stranded RNA stem can prevent E. coli Rho 's ability to translocate along the RNA towards the RNAP ( Steinmetz et al. , 1990 ) . 
Additionally , Rho cannot terminate RNAP when the latter has been paused by a class I pause hairpin ( Dutta et al. , 2008 ) . 
The exact mechanism of how a pause hairpin inhibits Rho-dependent termination is unclear . 
However , it has also been shown recently that Rho employs an allosteric mechanism to cause termination ( Epshtein et al. , 2010 ) . 
Rho interacts with the lid and other domains in the exit channel region of RNAP to transmit an inhibitory signal to the active center of RNAP . 
β ′ domains extending from the lid and other neighboring parts of the β ′ clamp probably mediate signals from Rho ( or Rho-RNA complex ) to the catalytic site . 
The pause hairpin formed in the exit channel uses the β and β ′ domains contacted by Rho to transmit a pause-inducing signal to the active site of RNAP ( Toulokhonov and Landick , 2003 ; Toulokhonov et al. , 2001 , 2007 ) . 
Thus , when Rho encounters RNAP paused at a hairpin , it is most likely that the RNAP domains that Rho would have used to transduce a terminating signal are either occluded by the hairpin or , alternatively , are in a conformation that is unresponsive to the factor ( Supplementary Fig S3 ) . 
The paused RNAP would , however , resume elongation after a specific time . 
But , the already formed hairpin that is now extruded from the RNAP exit channel could still impede translocation of Rho . 
Thus , presence of sequences which have potential to form hairpins would effectively reduce the efficiency of Rho-dependent termination and increase the probability of RNAP completing the transcription of toxic or unnecessary genes of GIs . 
Since Rho-dependent termination seems to be a GI-silencing mechanism it is possible that there could be progressive selection against hairpin-encoding sequences to facilitate Rho action . 
In other words , Rho 's silencing action at the various GIs may be facilitated by the lack of structured RNA moieties like intrinsic terminators and hairpins which makes the RNA unstructured and more suitable as a substrate for Rho . 
Moreover , since these regions are not part of the core genome it is easier to select against them . 
However , selection against hairpins by substitution or deletion is also likely to delete a significant fraction of the ITs since all of them consist of a hairpin . 
In this scenario , their removal is not detrimental as these stretches of GIs are now regions where there is efficient Rho-dependent termination . 
In effect , Rho would have functionally replaced ITs in these regions . 
Two pieces of evidence -- both of which focus on the mutual exclusiveness of Rho-dependent terminators and ITs/hairpins -- seem to corroborate the above model . 
Firstly , as described earlier , the Bicyclomycin-sensitive regions ( BSRs ) of the E. coli K-12 MG1655 genome ( Peters et al. , 2009 ) have an under-representation of ITs and potential hairpins , especially the BSRs that are downstream of K-12-specific and prophage DNA ( Supplementary Table S1 ) . 
Secondly , there is an inverse correlation between genomic GC content and the prevalence of ITs in any genome . 
Additionally , Rho seems to become more indispensable in species as genomic GC content increases . 
Since experiments have shown that Rho action may be facilitated by the lack of hairpins ( Dutta et al. , 2008 ) , it is possible that genomes which predominantly rely on Rho for termination could have an overall under-representation of hairpins -- both ITs and pause hairpins -- in the regions downstream of the genes . 
This would happen in both GIs and in core genomic regions , as Rho is a global regulator . 
To assess this , we determined the total number of hairpins for a sample of bacterial genomes ( n = 27 ) and computed the genomic ( hairpins/genes ) ratio for these genomes ( Fig. 5 ) . 
The results show that as the genomic GC content increases , the genomic ( hairpins/genes ) value tends to decrease . 
The results are in harmony with the fact that most bacteria which lack Rho have AT-rich genomes ( e.g. , mycoplasma , many streptococci ) ( Mitra et al. , 2009 ) , while Rho seems to be indispensable in species with high GC content ( e.g. , Caulobacter crescentus , M. luteus , M. tuberculosis , Streptomyces sp. ) . 
In other words , in bacteria where Rho-dependent termination is more important , the absolute number of stable hairpins in intergenic regions decreases across the entire genome , possibly to favor Rho-dependent termination . 
Such a situation would also be consistent with the lack of ITs in GIs across different genomes . 
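The relationship between genomic GC content and the ( hairpins/genes ) ratio discussed above can be quantified with a correlation coefficient. The sketch below uses a Pearson coefficient as one possible choice; the statistic, the function names and the five data points are all assumptions made for illustration and are not the values plotted in Fig. 5.

```python
from math import sqrt
from typing import Sequence


def gc_content(seq: str) -> float:
    """GC content (%) of a DNA sequence, ignoring characters other than A/C/G/T."""
    s = seq.upper()
    acgt = sum(s.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T")
    return 100.0 * (s.count("G") + s.count("C")) / acgt


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return sxy / sqrt(sxx * syy)


# Made-up genomes: GC content (%) and hairpins-per-gene ratio.
gc_percent = [30.0, 40.0, 50.0, 60.0, 70.0]
hairpins_per_gene = [0.9, 0.8, 0.6, 0.5, 0.3]
r = pearson_r(gc_percent, hairpins_per_gene)
print(f"r = {r:.3f}")
```

A strongly negative coefficient on real data would correspond to the trend described in the text, with hairpins becoming rarer as GC content rises.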
3.9. Intergenic , but not intragenic , Rho-dependent termination could function efficiently in expressing GIs
As mentioned earlier , Rho can effect intragenic termination within the coding region of a poorly translated gene because transcription -- translation coupling is inefficient , allowing the factor to access the nascent RNA and RNAP ( Fig. 4 ) . 
However , if a GI gene that has been silenced in the past by intragenic Rho-dependent termination is now incorporated into the cellular machinery , selection is likely to ensure that the gene 's codon adaptive index is similar to that of the cell . 
In that case , transcription -- translation coupling would now function efficiently preventing intragenic termination by Rho , and ensure gene expression . 
However , Rho-dependent termination would still continue to be the preferred mode of termination , once the stop codon has been crossed ( Fig. 5 ) . 
Thus , Rho is likely to be the default mode of intergenic termination for most GIs , irrespective of whether they are silenced or expressed across species . 
This could explain why ITs are rare even in genomic islands that are known to express and carry out defined functions such as SPI-1 , SPI-2 , LEE , PAI-II , and cag . 
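The codon adaptation index invoked in this argument is conventionally computed as the geometric mean of per-codon relative adaptiveness weights, where each weight is the frequency of a codon divided by that of the most-used synonymous codon in a reference set of highly expressed host genes. The sketch below illustrates only that final step; the weight table and the two short sequences are invented, and codons absent from the table are simply skipped.

```python
from math import exp, log
from typing import Dict


def codon_adaptation_index(cds: str, weights: Dict[str, float]) -> float:
    """Geometric mean of relative adaptiveness weights over the codons of a CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[c] for c in codons if weights.get(c, 0.0) > 0.0]
    if not scored:
        raise ValueError("no codon with a positive weight")
    return exp(sum(log(w) for w in scored) / len(scored))


# Hypothetical weights for two amino acids (Ala: GCT/GCC, Lys: AAA/AAG).
toy_weights = {"GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}
well_adapted = codon_adaptation_index("GCTAAA", toy_weights)
poorly_adapted = codon_adaptation_index("GCCAAG", toy_weights)
print(well_adapted, poorly_adapted)
```

In the model discussed above, a low index for a recently acquired gene would imply slow translation and hence a longer stretch of ribosome-free RNA available to Rho.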
4. Conclusion
Once a genomic island gets integrated into a genome , multiple checkpoints are likely to ensure that expression from its genes is silenced or stringently regulated to prevent any toxicity . 
Cis factors like ITs are of limited effectiveness in such situations as they can only function when sequences that encode them are `` strategically '' inserted into the GI . 
In contrast , a trans factor like the Rho protein is more effective in bringing about termination as it has fewer sequence constraints and can effectively sense uncoupling of transcription and translation . 
Hence , Rho-dependent termination is likely to be more effective in regulating transcription from any xenogenic DNA that enters the genome . 
Consequently , in stretches of the genome where there is active Rho-dependent termination ( such as GIs ) , ITs not only become functionally redundant , but experimental evidence hints that they may also hinder efficient Rho-dependent termination . 
Hence , over evolutionary timescales , these regions could undergo a selection against such RNA hairpins . 
Since our analysis is a snapshot in an evolutionary time-scale , a uniform decrease of ITs in GIs across different phyla is unlikely to be observed for various reasons . 
Both coding regions and non-coding regulatory elements of GIs are likely to be subjected to differential selection pressures . 
Individual GIs could have initially `` entered '' the host genome with different cohorts of ITs at different time points and varied time spans would have elapsed since their genomic integration . 
However , the genome analysis across diverse species reinforces the experimental evidence that Rho is indeed an important genome sentinel along with restriction -- modification systems , Nucleoid Associated Proteins , transcription repressors and other factors ( proteins , small RNAs ) that act at different stages to silence expression of foreign DNA . 
Rho 's ability to interact with RNA without exquisite sequence specificity , coupled with its ability to translocate along the RNA and interact with RNAP , makes it a versatile component of the cellular `` immunity surveillance '' mechanism . 
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2012.07.064 . 
Acknowledgments
V. N. is a recipient of the J. C. Bose fellowship of the Department of Science and Technology , Government of India . 
The work is supported by the Centre of Excellence for Mycobacterial Research Grant , Government of India .