Retrospective Application of Transposon-Directed Insertion Site Sequencing to a Library of Signature-Tagged Mini-Tn5Km2 Mutants of Escherichia coli O157:H7 Screened in Cattle†
Sabine E. Eckert,1‡ Francis Dziva,2‡ Roy R. Chaudhuri,3‡ Gemma C. Langridge,1‡ Daniel J. Turner,1§ Derek J. Pickard,1 Duncan J. Maskell,3 Nicholas R. Thomson,1 and Mark P. Stevens4*
The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom1; Enteric Bacterial Pathogens Laboratory, Institute for Animal Health, Compton, Berkshire RG20 7NN, United Kingdom2; Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, United Kingdom3; and Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Bush Farm Road, Roslin, Midlothian EH25 9RG, United Kingdom4
Enterohemorrhagic Escherichia coli (EHEC) strains comprise a subset of Shiga toxin-producing E. coli strains that cause acute enteritis in humans (2).
Infections may be complicated by severe sequelae and are frequently acquired via contact with ruminant feces.
The molecular mechanisms underlying colonization of the ruminant intestines by EHEC are incompletely understood.
Previously, we screened a library of 1,900 EHEC O157:H7 mutants for their ability to colonize bovine intestines by signature-tagged mutagenesis (STM) (6).
STM relies on a panel of transposons harboring unique oligonucleotide tags.
The tags can be detected by amplification and hybridization, enabling the composition of complex pools to be analyzed before and after inoculation of animals.
Mutants that are negatively selected in vivo relative to the inoculum are inferred to lack a gene required for colonization or survival, which can be identified by isolation and sequencing of transposon-flanking regions (16).
Our analysis focused on the prototype E. coli O157:H7 strain EDL933, for which the chromosome and plasmid sequences are known (1, 18).
Of the 1,900 signature-tagged mutants screened, 101 were underrepresented in pools recovered from feces 5 days postinoculation of calves (6).
The transposon insertion site could be mapped in 79 such mutants, identifying 59 different genes influencing colonization (6).
Thirteen attenuating mutations were mapped to the locus of enterocyte effacement (LEE), which encodes a type III secretion system (T3SS) required for the formation of "attaching and effacing" lesions.
* Corresponding author. Mailing address: Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Bush Farm Road, Roslin, Midlothian EH25 9RG, United Kingdom. Phone: 44 131 527 4200. Fax: 44 131 440 0434. E-mail: Mark.Stevens@roslin.ed.ac.uk.
§ Present address: Oxford Nanopore Technologies, 4 Robert Robinson Way, Magdalen Science Park, Oxford OX4 4GA, United Kingdom.
‡ Contributed equally to the study.
† Supplemental material for this article may be found at http://jb.asm.org/.
Published ahead of print on 28 January 2011.
The role of T3SS components in intestinal colonization was subsequently confirmed with defined mutants (6, 17) and by screening of 480 signature-tagged mutants of EHEC O26:H- in calves (27).
STM also detected attenuating mutations in genes encoding secreted substrates of the T3SS (espD, map, and nleD) (6).
Though STM has provided valuable insights into the genetic basis of virulence of microbes, it is limited by the number of unique tags and the effort required to construct libraries and map attenuating mutations.
Moreover, only negatively selected mutants tend to be investigated, and subjective judgments are required to compare signal intensities relative to the input and coscreened mutants.
Functional annotation of the E. coli O157:H7 genome in reservoir hosts is further hindered by the cost of using large animals at a high level of disease containment.
Recently, several protocols have been described that permit the simultaneous assignment of the genotype and fitness score for mutants screened in pools.
Transposon-directed insertion-site sequencing (TraDIS) exploits Illumina sequencing to obtain the sequence flanking each transposon insertion (11).
The massively parallel nature of such sequencing permits comparison of the number of specific reads derived from inocula and output pools recovered from animals, providing a numerical measure of the extent to which mutants were selected in vivo.
TraDIS obviates the need to construct and array uniquely tagged mutants and to subclone and sequence attenuating mutations, yielding substantial time and cost savings.
TraDIS-like methods have defined the essential gene complement of Salmonella enterica serovar Typhi (11) and Streptococcus pneumoniae (28) and have identified genes influencing Haemophilus influenzae pathogenesis (7) and survival of the gut symbiont Bacteroides thetaiotaomicron (8).
We retrospectively applied TraDIS to assign the genotype and fitness score of EDL933 mutants previously screened in calves.
This required the massively parallel sequencing of transposon-flanking regions in the input and output pools of EDL933 mini-Tn5Km2 mutants obtained by Dziva et al. (6), as schematically shown in Fig. 1.
Adequate genomic DNA was retrieved for 19 of the mutant pools screened, comprising a total of 1,805 mutants.
Genomic DNA from each input and output sample was quantified with a Nanodrop ND-1000 spectrophotometer (Thermo Fisher, Loughborough, United Kingdom).
Equal amounts (1 μg) from all input and all output samples were pooled, and input and output pools were separately fragmented by ultrasonication with a Covaris adaptive focused acoustics instrument, to an average of 200 bp (19).
Fragment libraries were prepared with the Illumina paired-end DNA sample preparation kit (PE-102-1001; Illumina, Little Chesterford, United Kingdom), according to the manufacturer's instructions, and quantified on an Agilent DNA1000 chip (Agilent, South Queensferry, United Kingdom).
To form double-strand adapters, oligonucleotides Ind_Ad_T (5′-ACACTCTTTCCCTACACGACGCTCTTCCGATC*T-3′, where the asterisk represents phosphorothioate) and Ind_Ad_B (5′-pGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTC-3′) were annealed.
The input and output DNA was ligated to the double-strand adapters and then quantified by quantitative PCR (qPCR) using the primers Ad_T_qPCR1 (5′-CTTTCCCTACACGACGCTCTTC-3′) and Ad_B_qPCR2 (5′-ATTCCTGCTGAACCGCTCTTC-3′) and SYBR green (Applied Biosystems, Warrington, United Kingdom).
Two hundred nanograms of adapter-ligated fragments was used to specifically amplify transposon insertion sites.
Twenty-four cycles of PCR were performed with transposon-specific forward primer MiniTn5-P5-3pr-3 (5′-AATGATACGGCGACCACCGAGATCTACACCTAGGCTGCGGCTGCACTTGTG-3′), which contains the Illumina P5 end for attachment to the flow cell, and reverse primer RInV3.3 (5′-CAAGCAGAAGACGGCATACGAGATCGGTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′, containing the Illumina P7 end).
PCR products were size separated on an agarose gel, and fragments of 350 to 450 bp were excised and recovered with QiaExII gel extraction columns (Qiagen, Crawley, United Kingdom) following the manufacturer's instructions, but without heating (19).
DNA was eluted in 30 μl of elution buffer and quantified by qPCR with standards of known concentration, using primers Syb_FP5 (5′-ATGATACGGCGACCACCGAG-3′) and Syb_RP7 (5′-CAAGCAGAAGACGGCATACGAG-3′) (19).
The DNA fragment libraries were sequenced for 37 cycles according to the manufacturer's instructions on single-end flow cells by an Illumina GAII sequencer, using the custom sequencing primer MiniTn5-3pr-seq3 (5′-TAGGCTGCGGCTGCACTTGTGTA-3′), which binds 10 bp from the transposon end.
There were 12.6 and 13.3 million reads obtained for the input and output pools, respectively (European Nucleotide Archive accession no. ERP000368).
Totals of 12.1 million (96.3%) of the input reads and 12.4 million (93.7%) of the output reads contained perfect matches to the 3′ end of mini-Tn5Km2 (3), and these reads were included in downstream analyses.
Transposon-derived sequence was removed from each read with a custom Perl script available from the authors.
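As an illustration only (the authors' Perl script is not reproduced here), the filtering and trimming step can be sketched in Python. The sketch assumes that each read begins with the transposon-derived bases; the 10-base sequence `TN_END` is a placeholder rather than the actual mini-Tn5Km2 end, and the function name is hypothetical.

```python
# Placeholder 10-base transposon end: NOT the real mini-Tn5Km2 sequence.
TN_END = "AGATCTGATC"


def strip_transposon(reads, tn_end=TN_END):
    """Keep reads that start with a perfect match to tn_end and
    return only the genomic sequence that follows it."""
    flanks = []
    for read in reads:
        read = read.upper()
        # Require an exact prefix match and at least one flanking base.
        if read.startswith(tn_end) and len(read) > len(tn_end):
            flanks.append(read[len(tn_end):])
    return flanks


# Toy reads: the second carries a mismatch in the transposon end and is dropped.
reads = ["AGATCTGATCGGCATTACG", "AGATCTGTTCGGCATTACG", "agatctgatcTTTT"]
flanks = strip_transposon(reads)
retained_fraction = len(flanks) / len(reads)
```

Reads without a perfect match are discarded rather than corrected, mirroring the decision described above to carry only perfectly matching reads into downstream analyses.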
The remainder of each sequence read was mapped to the EDL933 chromosome and pO157 with NovoAlign (Novocraft Technologies Sdn Bhd, Selangor, Malaysia).
Totals of 9.9 million input reads (78.4%) and 10.7 million output reads (80.6%) were mapped to unique positions in the EDL933 genome.
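Once reads are mapped, the quantity of interest is the number of uniquely mapped reads at each insertion position. A minimal Python sketch of that tally follows; the tuple layout `(replicon, position, strand, n_hits)` is a hypothetical simplification and not NovoAlign's output format.

```python
from collections import Counter


def count_insertions(alignments):
    """Tally uniquely mapped reads per insertion site.

    alignments: iterable of (replicon, position, strand, n_hits) tuples,
    where n_hits is the number of genomic locations the read matched.
    """
    counts = Counter()
    for replicon, position, strand, n_hits in alignments:
        if n_hits == 1:  # discard reads that map to more than one position
            counts[(replicon, position, strand)] += 1
    return counts


alignments = [
    ("chromosome", 1200, "+", 1),
    ("chromosome", 1200, "+", 1),
    ("pO157", 530, "-", 1),
    ("chromosome", 88000, "+", 3),  # multi-mapping read, ignored
]
site_counts = count_insertions(alignments)
```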
Subsequent analyses were performed with R, version 2.8.0 (R Foundation for Statistical Computing, Vienna, Austria).
To quantify changes in the number of reads arising from specific insertions between the input and output, we adopted an approach suggested for RNA-Seq data analysis (15).
The number of reads at each insertion location (x) was treated as a proportion of the total number of mapped reads (n), and a variance-stabilizing arcsine-root transformation was applied, converting each value of x to $\sqrt{n}\,\arcsin\sqrt{x/n}$.
The transformed output values were divided by the equivalent input values to determine the fold change.
To avoid infinite values derived from taking the log of 0, sequence counts of 0 were replaced with an arbitrary value of 0.5.
Log2 fold change values were calculated to represent the difference in abundance of each mutant in the output pools relative to the input and provide a measure of fitness.
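The fitness calculation can be sketched as follows. This is a Python illustration rather than the authors' R code, and it assumes the arcsine-root transformation takes the form sqrt(n) * arcsin(sqrt(x/n)); the function names and example counts are hypothetical.

```python
import math


def arcsine_root(x, n):
    """Arcsine-root transform of a count x out of n mapped reads
    (assumed form: sqrt(n) * asin(sqrt(x / n)))."""
    return math.sqrt(n) * math.asin(math.sqrt(x / n))


def log2_fold_change(x_in, n_in, x_out, n_out, pseudo=0.5):
    """Log2 ratio of transformed output to transformed input counts.
    Counts of 0 are replaced by `pseudo` so the logarithm stays finite."""
    x_in = x_in if x_in > 0 else pseudo
    x_out = x_out if x_out > 0 else pseudo
    return math.log2(arcsine_root(x_out, n_out) / arcsine_root(x_in, n_in))


# Hypothetical counts for three insertions in pools of one million mapped reads.
unchanged = log2_fold_change(400, 1_000_000, 400, 1_000_000)
depleted = log2_fold_change(400, 1_000_000, 20, 1_000_000)
lost = log2_fold_change(400, 1_000_000, 0, 1_000_000)
```

An insertion with the same share of reads in both pools scores 0, and progressively stronger depletion gives progressively more negative scores.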
In our experience, TraDIS may overpredict the number of insertion sites due to a low-level background signal derived from incorrectly mapped or chimeric reads.
To distinguish genuine inserts from this background signal, predicted insertion sites with fewer than 2^5 (i.e., 32) mapped reads were removed from the data set (see Fig. S1 in the supplemental material).
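The background filter amounts to a simple threshold on read counts. The Python sketch below assumes the threshold is applied to the mapped read count recorded for each predicted site (the text does not say whether input, output, or combined counts were used); the site keys and counts are invented for illustration.

```python
MIN_READS = 2 ** 5  # 32 mapped reads


def filter_sites(site_counts, min_reads=MIN_READS):
    """Drop predicted insertion sites supported by fewer than min_reads reads."""
    return {site: n for site, n in site_counts.items() if n >= min_reads}


site_counts = {
    ("chromosome", 1200): 31,   # below threshold, treated as background
    ("chromosome", 45210): 32,  # exactly at threshold, kept
    ("pO157", 530): 500,
}
genuine = filter_sites(site_counts)
```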
Of the 1,805 EDL933 mutants screened, TraDIS unambiguously assigned the insertion site and fitness score for 1,645, representing 855 different genes.
Importantly, we assigned the genotype and fitness scores to 91.1% of the mutants analyzed, whereas previously we identified the insertion site in only 4.2% of mutants owing to the constraints of STM (6).
Insertions were in general well distributed, although there are AT-rich regions where insertions are overrepresented (Fig. 2), as may be expected because mini-Tn5Km2 preferentially inserts at TA dinucleotides.
Table S1 in the supplemental material lists the insertion site and log2 fold change relative to input for each mutation.
Figure S2 in the supplemental material shows a histogram of log2 fold change values obtained for all the mutants.
This distribution was modeled by fitting a bimodal normal distribution using the R package mixdist (13) (Fig. S2).
This model represents the mutants as a mixture of two distinct populations.
Most of the mutants show no attenuation, with no clear change in abundance relative to the input pool and a normal distribution of log2 fold change values with a mean close to 0.
Attenuated mutants show lower log2 fold change values, with a mean of approximately -3.
The model suggests that a log2 fold change of -1 (equivalent to a 2-fold decrease in the abundance of the mutant in the output pool relative to the input) is a suitable cutoff value to identify most of the attenuated mutants while restricting the number of false positives to an acceptable level.
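To illustrate what fitting a two-component normal mixture involves, the sketch below uses a plain expectation-maximization loop in Python. It is a stand-in for the R package mixdist, not a reimplementation of it, and the simulated scores (means of 0 and -3 with arbitrary spreads and group sizes) are invented for demonstration only.

```python
import math
import random


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, n_iter=100):
    """Fit a two-component normal mixture by EM.
    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean."""
    lo, hi = min(values), max(values)
    spread = hi - lo
    # Crude starting values: means at the quarter points of the data range.
    params = [[0.5, lo + 0.25 * spread, spread / 4],
              [0.5, lo + 0.75 * spread, spread / 4]]
    for _ in range(n_iter):
        # E step: probability that each value belongs to each component.
        resp = []
        for x in values:
            p = [w * normal_pdf(x, mu, sd) for w, mu, sd in params]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            rk = [r[k] for r in resp]
            nk = sum(rk)
            mu = sum(r * x for r, x in zip(rk, values)) / nk
            var = sum(r * (x - mu) ** 2 for r, x in zip(rk, values)) / nk
            params[k] = [nk / len(values), mu, max(math.sqrt(var), 1e-6)]
    return sorted((tuple(p) for p in params), key=lambda p: p[1])


# Simulated log2 fold changes: a large unaffected group and a smaller attenuated one.
random.seed(1)
values = ([random.gauss(0.0, 0.5) for _ in range(2000)]
          + [random.gauss(-3.0, 0.8) for _ in range(500)])
low, high = fit_two_normals(values)
```

With the two fitted components in hand, a cutoff can be judged by how much of each component falls below it, which is the trade-off between sensitivity and false positives described above.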
Seventy-two insertions were detected by both STM and TraDIS, 86.1% of which were negatively selected in both cases and 72.2% of which showed a log2 fold change of -1 or lower by TraDIS (see Table S1 in the supplemental material).
Though STM screening of EDL933 mutants in calves identified 13 attenuating mutations in LEE genes (6) and was considered exhaustive at the time, TraDIS identified 54 insertions in the LEE in 21 different genes.
By TraDIS, all LEE mutants were negatively selected, except those with insertions in rorf1 or the region between ler and espG (Fig. 3).
Insertions in the LEE-flanking regions were not attenuating.
Mutations in predicted T3SS structural components were strongly negatively selected, with the exception of a single insertion in a gene of unknown function (rorf8).
Several LEE genes were disrupted many times, producing comparable fitness scores.
Variance in the scores for a given gene may reflect differences in competition dynamics in the pools in which the mutants were screened.
TraDIS found 5 attenuating mutations in eae, encoding intimin, and 3 mutations in tir, encoding the translocated intimin receptor.
These were missed by STM, even though intimin and Tir play key roles in intestinal colonization of cattle by E. coli O157:H7 (22, 29).
TraDIS also identified mutations in 29 of the 39 type III secreted effectors of E. coli O157:H7 verified by Tobe et al. (26) (see Table S2 in the supplemental material).
Mutants with insertions in several LEE-encoded effectors (EspF, EspB, Tir, Map, EspH, and EspZ) were all negatively selected, consistent with the role of such effectors in intestinal persistence of Citrobacter rodentium in mice (4) and E. coli O157:H7 in rabbits (20).
Of the non-LEE-encoded effectors, several appeared to play little or no role (e.g., NleG, NleH, EspY1, and EspY4) (Table S2), whereas mutations in the genes coding for the others were attenuating.
Among the latter was z1829, encoding EspK, an effector missed by STM but which influences persistence of EHEC in calves (27, 30).
Though several effector phenotypes have been independently verified, we caution that some attenuating mutations identified by STM could not be reproduced when mutants were tested in isolation (e.g., map) (6) or by coinfection with the parent strain (e.g., nleD) (14), possibly due to the distinct selection pressure exerted by combining 95 mutants during the library screen.
Analysis of signature-tagged mutants of EHEC O26:H- in calves indicated that the cytotoxins EspP and enterohemolysin may promote intestinal colonization (27).
Though mutants with defects in these genes were not detected in the EDL933 STM screen (6), TraDIS revealed that several such mutants were represented in the library and were generally negatively selected in calves.
Three of four EDL933 espP mutants were attenuated by TraDIS (see Table S1 in the supplemental material), consistent with the modest attenuation of a defined espP mutant in calves (5).
Nine of 11 mutants with defects in the enterohemolysin (EHEC-hly) operon were negatively selected by TraDIS, supporting the attenuation of an ehxA mutant of EHEC O26:H- in calves (27).
EhxA appears not to play a significant role in rectal colonization in steers (22); however, that study involved rectal application of the mutant to ruminant steers, without passage through the intestines.
Eight insertions were detected in l7031/tagA, which encodes a zinc metalloprotease that cleaves C1-esterase inhibitor (StcE) (12), promotes adherence (9), and modulates neutrophil function (25).
StcE mutants were generally underrepresented in calves, as were mutants with insertions in the EtpCD type II secretion system required for StcE secretion, consistent with the role of this system in colonization of rabbits (10).
Seventeen mutations were detected in the gene encoding the large clostridial toxin homolog L7095/ToxB, though only 7 were negatively selected by greater than 1 log2 fold change.
This relatively weak phenotype is consistent with the phenotype of a defined E. coli O157:H7 toxB mutant in calves (24).
Other genes carried by pO157 that were missed by STM but putatively linked to colonization by TraDIS include katP (catalase-peroxidase), l7029/msbB (lipid A myristoyl transferase), and a gene of the linked ecf operon (l7026).
TraDIS faithfully reproduced the fitness defect of mutants detected by STM that are impaired in O-antigen biosynthesis (e.g., manC, per, wbdP, and wzy), consistent with the phenotype of an E. coli O157:H7 perosamine synthetase (per) mutant in steers (23).
It also identified other attenuating mutations missed by STM that affect this process, as well as other pathways implicated in bacterial survival in vivo, such as aromatic amino acid biosynthesis (aroA) and iron storage (ftn).
Of further interest, TraDIS identified an attenuating mutation in the catalytic subunit of Shiga toxin 1 (stx1A).
Previously, STM identified an attenuating mutation downstream of the toxin genes in prophage CP-933V but upstream of those involved in bacterial lysis.
The attenuation of the stx1A mutant supports the finding that Stx1 promotes intestinal colonization of mice by E. coli O157:H7 (21).
In common with other methods for screening pools of random mutants, TraDIS describes single gene-phenotype relationships and does not account for functional redundancy.
Rarely, mutants may also contain more than one transposon insertion, harbor a secondary mutation of another kind, or possess polar insertions affecting the expression of nearby genes.
These limitations impose a formal requirement to confirm mutant phenotypes via the evaluation of nonpolar mutant and repaired or trans-complemented strains.
The number of mutants that can be simultaneously screened will also be constrained by the requirement to obtain an output pool of an adequate size at a time postinoculation sufficient for attenuation to be evident.
It is estimated that if 100 mutants are screened, the output pool must comprise at least 10,000 colonies in order to state with 95% confidence that specific mutants are absent due to attenuation as opposed to chance (6).
Moreover, at high pool complexities, stochastic loss of mutants may occur if the number of mutants exceeds a "bottleneck" above which individual mutants in the population no longer have an equal chance of establishing themselves in the host.
Such limitations are balanced by the ability of massively parallel sequencing of mutant libraries to derive vastly richer functional annotation of pathogen genomes than can be obtained by earlier methods.
In conclusion, TraDIS validated and substantially extended our analysis of signature-tagged E. coli O157:H7 mutants in cattle.
It described the genotype and fitness score for 91.1% of mutants screened, unlocking hundreds of novel phenotypes with no further animal use.
It represents a significant advance toward the principles of reduction, refinement, and replacement of animals in research and is relatively inexpensive to apply de novo or retrospectively.
The procedures described herein relate to transposons that have been extensively used in other microbes (reviewed in reference 16) and can therefore be widely applied to derive quantitative data for functional annotation of microbial genomes.
We gratefully acknowledge the support of DEFRA (grant OZ0707), the BBSRC (grants D017556 and D017947), and the Wellcome Trust