Altered tRNA characteristics and 3′ maturation in
ABSTRACT
Translational efficiency is controlled by tRNAs and other genome-encoded mechanisms .
In organelles , translational processes are dramatically altered because of genome shrinkage and horizontal acquisition of gene products .
The influence of genome reduction on translation in endosymbionts is largely unknown .
Here , we investigate whether divergent lineages of Buchnera aphidicola , the reduced-genome bacterial endosymbiont of aphids , possess altered translational features compared with their free-living relative , Escherichia coli .
Our RNAseq data support the hypothesis that translation is less optimal in Buchnera than in E. coli .
We observed a specific , convergent pattern of tRNA loss in Buchnera and other endosymbionts that have undergone genome shrinkage .
Furthermore , many modified nucleoside pathways that are important for E. coli translation are lost in Buchnera .
Additionally , Buchnera 's A+T compositional bias has resulted in reduced tRNA thermostability , and may have altered aminoacyl-tRNA synthetase recognition sites .
Buchnera tRNA genes are shorter than those of E. coli , as the majority no longer has a genome-encoded 3′ CCA ; however , all the expressed , shortened tRNAs undergo 3′ CCA maturation .
Moreover , expression of tRNA isoacceptors was not correlated with the usage of corresponding codons .
Overall , our data suggest that endosymbiont genome evolution alters tRNA characteristics that are known to influence translational efficiency in their free-living relative .
INTRODUCTION
In the final step of protein synthesis , mRNA sequences must be accurately and efficiently translated into amino acid proteins .
Reliable and efficient translation depends critically on tRNA , which must exhibit specificity in aminoacylation , and correct pairing of the anticodon with its codon on the mRNA .
The robust nature of the genetic code and numerous genome-encoded mechanisms promote translational accuracy ( 1,2 ) , thus preventing deleterious events such as the reassignment of codons that can alter the function of thousands of genes .
Nevertheless , tRNAs and the genetic code sometimes do change , especially in genomes undergoing size reduction as exemplified by mitochondria and plastids ( 1,3,4 ) .
These organelle genomes , which are derived from genomes of symbiotic bacteria ( 5,6 ) , exhibit the most extreme cases of architectural alterations such as an increase in molecular evolutionary rate , inability to recombine , and massive gene loss that sometimes leads to tRNA loss and changes in the genetic code ( 7 ) .
Organelles encode a limited set of proteins and rely on other co-occurring genomes for enzymes and tRNAs ( 3,8,9 ) .
The reduced genomes of some bacterial endosymbionts exhibit similar but less extreme alterations in genome sequence compared with organelles ( 10 ) .
However , unlike organelles , most endosymbionts are still autonomous in the sense that they possess their own core genetic machinery ( 11 -- 13 ) , including the conventional bacterial structure of tRNAs ( 14,15 ) .
Most endosymbionts retain the universal genetic code , but exceptions do exist among the tiniest genomes , in which UGA is sometimes recoded from Stop to Trp ( 16,17 ) .
In contrast , organelle tRNAs and their translational machinery are highly divergent from those of most bacteria ( 1,3,8,18 ) .
The question still remains as to how endosymbiont tRNAs and translational mechanisms differ from those of ancestral free-living genomes that are not reduced .
Overall , we hypothesize that the process of genome shrinkage in endosymbionts results in a reduction of translational efficiency and integrity resembling a transitional stage between free-living ancestors and organelles .
Present day genomic features of bacterial endosymbionts result from their ancient transition from a free-living lifestyle to an obligate intracellular association ( 10 ) .
Many bacteria that replicate strictly in host intracellular environments possess reduced genomes with sequences that are A+T biased relative to those of their free-living ancestors ( 10,19,20 ) .
One such bacterium demonstrating these genomic shifts is Buchnera aphidicola , an obligate unculturable endosymbiont of aphids ( 21 ) .
Buchnera has coevolved with its aphid hosts for 200-250 million years ( 21 ) , during which its genome shrank to only 416 -- 652 kbp depending on the lineage ( 22 -- 27 ) .
Based on previous gene expression and genomic studies in Buchnera , genome reduction and accelerated sequence evolution have resulted in changes that are hypothesized to lower the efficiency and accuracy of transcription and translation ( 28 -- 31 ) compared with free-living relatives .
We predict that Buchnera will also exhibit less optimal tRNA features .
Presently , transcribed tRNAs and associated transcriptional mechanisms , which are key components of efficient and accurate translation , have not been extensively examined in Buchnera or any other bacterial endosymbiont .
Comprehensive characterization of transcribed endosymbiont tRNAs has previously been difficult largely because of the inability to isolate unculturable symbiont tRNAs free of host contamination .
However , analysis of tRNAs beyond the level of DNA-encoded genes can reveal the nature of tRNA maturation , including the diversity of posttranscriptional processing that may occur .
Taking advantage of new methodologies in high-throughput RNA sequencing ( directional RNAseq ) , and the availability of several divergent Buchnera genomes ( 23,25,27 ) , we investigated how genome reduction and A+T richness affect tRNA evolution in this model endosymbiont .
This comparative framework provides us with an understanding of the conservation of tRNA sequences that influence specificity in aminoacylation and secondary structure as well as conservation of nucleoside modification pathways that influence anticodon -- codon base pairing ( 1,2,32 ) .
From these data , we were able to address how Buchnera tRNAs and associated transcriptional fidelity mechanisms are altered relative to those of free-living relatives , exemplified by Escherichia coli .
Additionally , because numerous reduced endosymbiont genomes have recently been sequenced ( 10 ) , we investigated whether a pattern of tRNA loss was present among reduced endosymbiont genomes .
MATERIALS AND METHODS
Sample preparation
Four aphid species , Acyrthosiphon pisum ( strains LSR1 and 5A ) , Acyrthosiphon kondoi ( strain Ak ) , Schizaphis graminum ( strain Sg ) and Uroleucon ambrosiae ( strain UA002 , referred to as Ua ) , were reared in the same growth chamber at 20 °C .
A. pisum was reared on seedlings of Vicia faba , A. kondoi on Medicago sativa , U. ambrosiae on Tithonia mexicana and S. graminum on Hordeum vulgare .
For each aphid strain , B. aphidicola cells were filtered from 3 g of mixed age aphids .
Filtration was done according to the study by Moran et al. ( 33 ) , with modifications as follows .
First , modified buffer A ( 34 ) was used instead of PBS .
Also , after the 1000 rpm centrifugation step , the pellet was resuspended and used for subsequent filtration steps instead of the supernatant .
After the last centrifugation step , supernatant and the protein layer were discarded and the pellet was immediately immersed in Ambion TRI Reagent Solution .
For RNA extraction , a protocol similar to that of Hansen and Moran ( 34 ) was used , except that , after step 5 , the protocol in appendix A of Qiagen 's miRNeasy Mini Handbook was used to enrich for miRNA ( i.e. RNA < 200 bp ) .
RNA was DNase treated , and quality and quantity were checked as in Hansen and Moran ( 34 ) .
All filtration and extraction materials were treated with RNase AWAY ( Molecular BioProducts , Inc , CA , USA ) , and all solutions were RNase free .
RNA sequencing, read processing, mapping, expression and identification
The Yale Keck sequencing center carried out library preparation and sequencing of Buchnera tRNA for all five aphid strains .
Briefly , for tRNA library preparation , the Illumina mRNA directional sequencing protocol was followed starting at the phosphatase treatment step .
RNA < 200 bp was directionally sequenced , one lane per sample , with 35 bp Illumina reads .
CLC Genomics Workbench ( Aarhus , Denmark ) was used for read processing and mapping .
For all reads , small RNA adapters were trimmed and reads with ambiguous nucleotides were removed .
Trimmed reads were then mapped to corresponding Buchnera genomes ( Table 1 ; 23,25,27 ) with CLC Genomics Workbench short read local alignment mapping using the default settings for short reads .
All Buchnera taxa used in this study possess similar genome sizes ( Ap-5A = 642 122 bp ; Ak = 641 794 bp ; Ua = 615 380 bp ; Sg = 641 454 bp ) .
tRNA reads that mapped sense and anti-sense relative to the tRNA gene were converted into Reads Per Kilobase of exon model per million mapped reads ( RPKM ) .
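As a minimal sketch of the RPKM normalization named above, the calculation can be written as follows. The function name and the example numbers are hypothetical and are not taken from the study; the formula itself is the standard definition (reads, divided by gene length in kilobases, divided by millions of mapped reads).

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of gene model per Million mapped reads.

    RPKM = reads / (length_bp / 1e3) / (total_mapped / 1e6)
         = reads * 1e9 / (length_bp * total_mapped)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical values for illustration only (not data from the paper):
# 1500 reads on a 75 bp tRNA gene in a library of 2 million mapped reads.
example_value = rpkm(read_count=1500, gene_length_bp=75, total_mapped_reads=2_000_000)
```

Because tRNA genes are short, the per-kilobase scaling inflates their RPKM values relative to typical protein-coding genes, which is worth keeping in mind when reading the expression values reported later.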
Coverage per base pair was calculated using custom perl scripts and Microsoft Excel and was viewed in Artemis 13.0 ( 35 ) to visualize sense and anti-sense tRNA coverage .
For each Buchnera strain , tRNA genes were annotated using genome annotations in NCBI , tRNAscan-SE 1.21 ( 14,15 ) and Artemis 13.0 ( 35 ) to verify whether 3′ CCA was encoded in the genome .
tRNA 3′ CCA maturation occurs in all organisms and is essential for charging tRNAs with amino acids .
To identify 3′ CCA maturation , the last 20 bp at the 3′ end of annotated tRNAs were retrieved from all high quality raw reads .
Reads that perfectly matched the last 20 bp were binned into the following three categories : ( i ) reads match the 3′ tRNA end and no more nucleotides are processed , ( ii ) reads match the 3′ tRNA end plus additional non-CCA nucleotides are transcribed and ( iii ) reads match the 3′ tRNA end plus CCA is added by maturation .
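The three-way binning described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' pipeline: the function names and category labels are hypothetical, matching is done by exact substring search, and any extension that does not begin with a complete CCA (including a partial "C" or "CC" at the end of a read) is counted as a non-CCA extension.

```python
from collections import Counter


def classify_3prime_read(read, gene_tail):
    """Classify one read by what follows the annotated 3' end of a tRNA.

    gene_tail is the last 20 bp of the annotated tRNA gene (assumed exact match).
    Returns None when the read does not contain the tail.
    """
    start = read.rfind(gene_tail)
    if start == -1:
        return None
    extension = read[start + len(gene_tail):]
    if not extension:
        return "end_only"            # category (i): nothing beyond the 3' end
    if extension.startswith("CCA"):
        return "cca_added"           # category (iii): CCA follows the 3' end
    return "other_extension"         # category (ii): non-CCA nucleotides follow


def bin_reads(reads, gene_tail):
    """Count reads in each category, ignoring reads that lack the tail."""
    labels = (classify_3prime_read(read, gene_tail) for read in reads)
    return Counter(label for label in labels if label is not None)
```

For a gene that does not encode 3′ CCA, a large "cca_added" count relative to "end_only" would be consistent with posttranscriptional CCA addition.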
To analyse A+T richness in Buchnera and E. coli CDS and tRNA genes the program EMBOSS ( 36 ) was used .
To calculate codon usage of 50 highly expressed Buchnera genes ( 37 ) , E-cai ( 38 ) was used .
After consensus RNAseq reads corresponding to tRNA genes were mapped and assembled , tRNA species were identified with tRNAscan-SE 1.21 , with E. coli homology Blast searches ( 39 ) , and by verifying the presence of signature identity elements relative to E. coli ( 32 ) .
Survey of tRNA complements in small genomes
The last comprehensive survey of tRNA genes from bacteria was conducted in 2002 and only included the endosymbiont genome of Buchnera strain APS ( 40 ) .
Because several smaller endosymbiont genomes have been sequenced since 2002 , we surveyed several more genomes that varied drastically in genome size and phylogenetic placement .
The tRNAscan-SE Genomic tRNA database ( 41 ) was used to characterize the presence of tRNA gene isoacceptors ( i.e. a tRNA species that binds to one or more codons for a particular amino acid residue ) in 16 genomes .
High throughput detection of modified nucleoside bases in tRNAs
During library preparation , some modified bases cause the reverse transcriptase to either fall off at the modified position , and/or to incorporate a ` mismatch ' relative to the reference genome sequence ( 42,43 ) .
To detect modified bases and potential posttranscriptional processing , we screened for mismatches in tRNA reads relative to the reference tRNA gene similar to Iida et al. ( 42 ) and Findeiß et al. ( 43 ) .
After mapping only the sense tRNA reads in CLC ( using the same mapping parameters as discussed earlier ) , we ran CLC single-nucleotide polymorphism ( SNP ) analyses to detect mismatches .
Threshold criteria for counting a mismatch were established by identifying conserved mismatches in both Ap-5A and Ap-LSR1 ( two different strains from the same aphid species ) .
These two strains shared 38 mismatches for which the mismatch rate was more than 1 % per base ( i.e. above Illumina 's expected error rate per base ) and the alternative variant count was at least eight reads .
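The two thresholds stated above (mismatch rate above 1 % per base and at least eight reads supporting the alternative base) can be expressed as a simple filter. The sketch below is an assumption-laden illustration rather than the CLC SNP analysis itself: the function names and the input layout, a dictionary keyed by ( tRNA , position ) with reference and alternative read counts, are hypothetical.

```python
def is_candidate_mismatch(ref_count, alt_count, min_rate=0.01, min_alt_reads=8):
    """Apply both thresholds: mismatch rate > 1 % and >= 8 alternative reads."""
    coverage = ref_count + alt_count
    if coverage == 0:
        return False
    return (alt_count / coverage) > min_rate and alt_count >= min_alt_reads


def shared_mismatches(calls_a, calls_b):
    """Positions passing the filter in both strains.

    Each argument maps (tRNA name, position) -> (ref_count, alt_count).
    """
    def passing(calls):
        return {site for site, (ref, alt) in calls.items()
                if is_candidate_mismatch(ref, alt)}

    return passing(calls_a) & passing(calls_b)
```

Requiring a site to pass in two strains of the same aphid species, as the text describes, reduces the chance that a sequencing artifact in a single library is counted as a modification signal.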
This mismatch criterion was then used to detect mismatches in other strains for a total of four divergent Buchnera taxa ( Ap , Ak , Ua and Sg ) .
Predicted tRNA-modified bases and their pathways for each Buchnera tRNA were obtained from E. coli homologs using both http://modomics.genesilico.pl/pathways/ ( 44 ) and http://www.ecocyc.org/ ( 45 ) .
Divergent Buchnera genomes ( 23,25,27 ) were searched for modification pathway enzymes using E. coli homologs using Blastp ( 39 ) .
Infernal ( 46 ) was used to generate tRNA sequence and secondary structure alignments among Buchnera strains and E. coli .
The covariance model , RF00005.cm , was used , which accounts for tRNA secondary structure constraints .
Using Infernal output , 4sale ( 47 ) was used to compute pairwise compensatory substitution tables from stems for all tRNAs among Buchnera strains and E. coli .
Stability of tRNA secondary structure was measured as Delta G ( ΔG ) , the change in Gibbs free energy ( in units of kcal/mol ) .
Thus , the more negative ΔG is , the more thermodynamically stable the tRNA secondary structure .
ΔG was computed for tRNAs of each strain individually using RNAalifold ( 48,49 ) with folding constraints generated by tRNAscan-SE 1.21 ( 14,15 ) .
All raw sense and anti-sense tRNA data were submitted to NCBI Genbank under SRA submission SRA049863.3 , under Bioproject numbers ( i ) PRJNA82811 , ( ii ) PRJNA82809 , ( iii ) PRJNA82797 , ( iv ) PRJNA82793 and ( v ) PRJNA82789 .
All paired-sample t-tests ( percent guanine-cytosine ( % GC ) , tRNA length ) , correlation ( pairwise RPKM comparisons ) and regression ( codon usage and tRNA expression ) statistics were carried out using IBM SPSS Statistics for Mac , standard version 19.0 ( 2010 ; New York , USA ) .
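For readers who want to see what a paired-sample t-test computes, a minimal sketch follows. It is not the SPSS procedure used in the study; the function name is hypothetical, and the sketch returns only the t statistic and degrees of freedom (no P value), using the standard formula t = mean(d) / ( sd(d) / sqrt(n) ) on the per-pair differences d.

```python
import math
import statistics


def paired_t_statistic(values_a, values_b):
    """Return (t, degrees_of_freedom) for paired observations.

    For example, values_a and values_b could hold %GC of homologous tRNA
    genes in two genomes, listed in the same order.
    """
    if len(values_a) != len(values_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(values_a, values_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

The pairing matters here because each Buchnera tRNA is compared with its own E. coli homolog, so between-tRNA variation does not enter the test.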
RESULTS
All Buchnera tRNAs are transcribed
For all Buchnera genomes , tRNA genes occur in the same genomic positions ( Figure 1 ) .
Based on tScan and blastn detection of homology with E. coli the same 32 tRNA genes and 29 anticodon types are conserved across Buchnera taxa ( Figure 1 ) .
As expected , directional RNAseq reads map primarily in the sense direction of tRNA genes , with antisense reads averaging less than 1 % of the sense reads ( Table 1 , Figure 1 ) .
All Buchnera tRNA genes are expressed in the sense direction , but some lack antisense expression , depending on strain , and sense expression is always higher than antisense expression ( except for Phe GAA in Ak and Sg ) ( Figure 1 ) .
tRNA sense expression is positively correlated across divergent Buchnera taxa ( Table 2 ) .
The level of antisense expression is highly correlated across all Buchnera taxa , but the correlation is less for Buchnera-Sg , the most divergent taxon ( Table 2 ) .
Transcriptional start sites and coverage curves for antisense RNAs varied widely across Buchnera taxa .
Nevertheless , conserved 5′ transcriptional start sites and coverage curves were identified for several antisense RNAs that occurred on or near tRNA genes for all five Buchnera taxa ( Supplementary Table S1 ) .
Conservation of tRNA identity elements in Buchnera
Recognition of tRNAs by tRNA synthetases is essential to the fidelity of translation .
Each aminoacyl-tRNA synthetase ( aaRS ) must recognize multiple tRNA isoacceptors ( i.e. different tRNA species that bind to alternative codons for the same amino acid residue ) but discriminate against others .
This recognition is dependent on tRNA identity elements , consisting of evolutionarily conserved bases at specific positions of tRNAs ( Giegé et al. 1998 ) .
Based on RNAseq data from all taxa , unmodified identity elements for each tRNA are identical to those in E. coli except for base substitutions in CysGCA ( G15 to U15 ; A13 to G13 ) , SerGGA ( G73 to A73 ) , SerGCT ( variable loop 1 bp shorter , except in Sg ) and AlaGGC ( G20 to U20 , 5A and Ua only ; G20 to C20 , Sg and Ak only ) .
Based on blastp analyses , all 20 cognate aaRS are encoded within each Buchnera genome .
In contrast to E. coli and most other organisms with nonreduced genomes , Buchnera does not encode multiple tRNA genes with matching anticodons , except for three tRNA genes encoding the anticodon CAU .
Two of these genes encode either an initiation or elongation Met tRNA based on tRNA identity elements and homology ( Table 3 ) .
The other tRNA gene encoding a CAU anticodon possesses homology and identity elements corresponding to the IleLAU anticodon ( Table 3 ) .
* Bold = significant ( P < 0.01 ) ; unbold = nonsignificant ( P > 0.05 ) .
N = 32 .
Selective loss of tRNA isoacceptors from Buchnera genomes
Numerous tRNA isoacceptors are present in E. coli but missing from all Buchnera strains .
Many Buchnera tRNA isoacceptors that belong to 4-codon family boxes and to two-codon families ( 5′-NNR codon type ) have been lost from Buchnera genomes ( Table 3 ) .
5′-CNN anticodons were preferentially lost in family boxes corresponding to Leu , Gly , Ser , Thr and Pro .
Only one family box , corresponding to Pro , lost both 5′-CNN and 5′-GNN anticodons .
For two-codon families , a 5′-CNN anticodon was lost from Gln ( and Leu and Arg for 6-codon families ) , relative to E. coli ( Table 3 ) .
Based on Watson and Crick base-pairing and revised wobble rules ( 50,51 ) , all tRNA isoacceptors encoded and expressed in Buchnera can base pair with the 61 possible codons ( Table 3 ) , which are all still encoded in Buchnera 's protein-coding genes at variable frequencies .
The pattern of tRNA gene isoacceptor loss was examined in 16 bacterial taxa representing a wide range of genome sizes and phylogenetic associations , including some with extremely reduced genomes ( Figure 2 ) .
Reduced genomes show common patterns of retention of particular anticodons .
For family box codons , 5′-CNN anticodons followed by 5′-GNN anticodons are consistently eliminated from the small genomes .
For 5′-NNR two-box codons , 5′-CNN anticodons are eliminated .
In the most reduced genomes , only 5′-UNN anticodons remain for both family box and two-box codons .
Unmodified 5′-U anticodons can wobble and pair with all four base combinations for family box codons ( 50 ) .
Therefore , for 5′-NNR two-box codons , the 5′-U of anticodons must be modified to prevent mistranslation of neighboring two-box ( NNY ) codons ( 1,2 ) ( e.g. an unmodified 5′-U in a Gln 5′-UUG anticodon can mispair with His codons 5′-CAU and CAC [ Table 3 ] ) .
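The wobble reasoning above can be made concrete with a small sketch that lists the codons an anticodon could read. The pairing table is a deliberate simplification assumed for illustration (Crick-style wobble for G , C , A and inosine , plus the four-way pairing of unmodified U34 described in the text); it is not the full set of revised wobble rules cited as ( 50,51 ). The key "U*" is a hypothetical label for a modified U34 that is restricted to NNR codons.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases readable by each anticodon wobble base (N34).
# Simplified assumption for illustration, not a complete rule set.
WOBBLE = {
    "C": "G",
    "G": "CU",
    "A": "U",
    "I": "UCA",    # inosine
    "U": "AGUC",   # unmodified U34, as described in the text
    "U*": "AG",    # hypothetical label: modified U34 restricted to NNR codons
}


def decoded_codons(anticodon, modified_u34=False):
    """List codons (5'->3') readable by an anticodon written 5'->3' (N34 N35 N36)."""
    n34, n35, n36 = anticodon
    wobble_key = "U*" if (n34 == "U" and modified_u34) else n34
    first = COMPLEMENT[n36]    # codon position 1 pairs with anticodon N36
    second = COMPLEMENT[n35]   # codon position 2 pairs with anticodon N35
    return sorted(first + second + third for third in WOBBLE[wobble_key])
```

Under these assumptions , decoded_codons("UUG") includes the His codons CAU and CAC alongside the Gln codons CAA and CAG , whereas decoded_codons("UUG", modified_u34=True) returns only CAA and CAG , which is the mistranslation risk the paragraph describes.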
Based on E. coli tRNA homologs , 26 different types of nucleoside modifications are predicted to occur in Buchnera tRNAs ( Table 4 , Supplementary Table S2 , Supplementary Dataset 1 ) .
Nine of these modifications are important for the efficiency and fidelity of protein synthesis and occur in N34 tRNA positions ( wobble ) of E. coli ( Table 4 ) .
We expect five of these N34 modifications to be retained to code for all cognate codon pairs and prevent mistranslation of other amino acids ( e.g. mnm5U , mnm5s2U , cmnm5Um , I and k2C ) .
An inosine ( I ) modification is important in E. coli because 5′-A from anticodon ArgACG is modified into I , which can wobble and pair with Arg codons CGA , CGU and CGC ( 55 ) .
Lysidine ( k2C ) is an important modification in E. coli because 5′-C from anticodon IleCAU is modified into k2C ( L ) , which pairs with Ile codon AUA ( instead of the Met codon AUG ) ( 59 ) .
Other expected N34 modifications ( mnm5U , mnm5s2U and cmnm5Um ) are important for modifying anticodon 5′-U for NNR two-codon boxes , thus preventing mistranslation ( 1,2 ) .
Based on Buchnera genome annotations , entire pathways are only present for expected wobble bases I , k2C and cmnm5Um ( Table 4 ) ; however , some pathways are only missing the last enzyme in a pathway ( e.g. mnm5U and mnm5s2U ) , and/or are still unknown in E. coli .
High throughput mismatch evidence ( see ` Materials and Methods ' section ) shared by multiple taxa supports the presence of a modified nucleoside at 5′-A from anticodon ArgACG in all Buchnera strains .
These data support the presence of an inosine modification in all taxa .
For example , we found a high frequency of anticodon 5′-ACG transcribed as 5′-GCG , where the frequency of 5′-G/A at this wobble base position was Ap-5A = 61/39 % ; Ap-LSR1 = 70/30 % ; Ak = 69/31 % ; Ua = 27/73 % ; and Sg = 72/28 % .
Presence of transcripts containing a 5′-G for the ArgACG anticodon is strong indirect evidence for an inosine modification .
During reverse transcription , the modified nucleoside inosine base pairs with C residues , and therefore ` G ' is found in the consensus cDNA sequence instead of ` A ' ( 60 ) .
Conserved high throughput mismatch evidence for Ap-5A , Ap-LSR1 and Ak also supports the presence of a modified base at N34 for LysTTT , suggesting that mnm5s2U is present in these strains even though the E. coli version of the pathway appears incomplete in Buchnera .
Error evidence was not detected for other expected modified wobble positions relative to E. coli , even though full pathways are retained in the genome ( Table 4 ) .
Other tRNA modifications that are very important for the fidelity of protein synthesis are N37 modifications .
N37 modifications are known to stabilize weak A : U and U : A base pairing between N36 of the anticodon and N1 of the codon ( 1,2,51 ) .
Based on in vitro experiments , N37 modifications are known to increase the interaction of the codon with the anticodon , preventing miscoding of amino acids and frameshifts ( 52,53,61 -- 63 ) .
In turn , to maintain efficient translation , we expect these modifications to be retained .
Based on modifications for the homologous tRNAs in E. coli , seven important N37 modified nucleosides are predicted in Buchnera .
Among Buchnera genomes , four N37 nucleoside pathways are retained , two are missing , and one has an unknown pathway in E. coli ( Table 4 ) .
High throughput mismatch evidence supports the presence of a modified base at N37 for PheGAA , ProTGG , LeuGAG and LeuTAG and thus suggests that ms2i6A , m1G , xG and xG , respectively , are present in all taxa .
However , no mismatch was detected in Sg for LeuGAG .
The tRNA modifications at positions other than N34 and N37 that are supported by mismatch evidence are shown in Supplementary Table S2 and Supplementary Dataset 1 .
Mismatch evidence was also found at positions at which E. coli does not process modified nucleosides , suggesting the presence of new modified nucleoside sites and/or RNA editing of mature tRNAs ( Supplementary Dataset 1 ) .
Collectively , all mismatch frequencies ( with the exception of ArgACG ) were dominated by the reference sequence base at a frequency of 90-99 % relative to mismatches for all taxa .
Mismatches were primarily not changes to a single nucleotide base , but were composed of three different bases other than the reference base .
No relationship between codon frequencies and tRNA expression
In many species , tRNA abundances are positively correlated with codon usage for highly expressed genes ( 64,65 ) .
Anticodons of highly expressed tRNAs correspond to codons that are used frequently in these genes , thus improving the efficiency of translation ( 64,65 ) .
Based on Watson and Crick and revised wobble base-pairing rules ( 50,51 ) , each Buchnera isoacceptor was paired with its corresponding codon pair .
Met CAU , the only duplicate anticodon coding for the same codon , was excluded from analysis .
The relationship between percent average codon usage of highly expressed genes and corresponding tRNA isoacceptor expression was examined for each Buchnera strain .
No significant relationship was found between average codon usage of 50 highly expressed genes in Ap-5A ( on leading and lagging strands ) and cognate tRNA isoacceptor sense expression ( Figure 3 ) .
No significant relationship was found on examining the relationship between highly expressed Buchnera genes ( four chaperones and 54 ribosomal proteins ) and cognate tRNA isoacceptor sense expression for all taxa ( Supplementary Figure S1A and S1B ) .
Examination of codon usage and tRNA expression scatterplots reveals that most tRNA isoacceptors , regardless of codon usage , are expressed at similar levels ( e.g. for Ap-5A in RPKM , the 75th percentile = 843 950 , median = 309 742 and max = 4 407 138 ; Figure 3 ) .
TrpCCA is the most highly expressed isoacceptor in all taxa ( except Ua ) , even though the corresponding codon occurs at low frequency ( Figure 3 and Supplementary Figure S1 ) .
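A test of association between codon usage and isoacceptor expression can be sketched with a Pearson correlation coefficient, whose square equals the R² of a simple linear regression of one variable on the other. This is a generic illustration with a hypothetical function name, not the SPSS analysis used in the study, and it reports no P value.

```python
import math


def pearson_r(x_values, y_values):
    """Pearson correlation between two equal-length numeric sequences.

    For example, x_values could be percent codon usage and y_values the
    RPKM of the cognate tRNA isoacceptor, listed in the same order.
    """
    n = len(x_values)
    if n != len(y_values) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    syy = sum((y - mean_y) ** 2 for y in y_values)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

An r close to zero across isoacceptors would correspond to the pattern reported here, in which expression is similar regardless of how often the cognate codons are used.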
Buchnera tRNAs maintain secondary structure with compensatory base substitutions
As expected , Buchnera CDS are significantly more A+T rich relative to CDS of E. coli [ Figure 4 ( c ) ] .
Within each Buchnera genome , tRNA genes are 2.2-fold more G+C rich relative to CDS , indicating that selection conserves higher % G+C in tRNA genes .
Nevertheless , Buchnera tRNA genes are significantly more A+T rich than homologs in E. coli [ Figure 4 ( c ) ] .
Stability of tRNA secondary structure can decrease with a reduction in % GC , especially in stem structures .
Because Buchnera tRNAs are more A+T rich than those of E. coli [ Figure 4 ( c ) ] , we measured the stability of Buchnera tRNA secondary structure .
ΔG was significantly more negative in E. coli tRNAs relative to homologs in Buchnera for all strains , indicating that Buchnera tRNAs have reduced stability in vitro [ Figure 4 ( b ) ] .
Whether they have reduced stability in vivo , where stabilizing proteins may play a role , remains to be tested .
The two tRNAs with the weakest secondary structure in all Buchnera relative to E. coli were ValGAG and TrpCCA ; both tRNAs possess numerous compensatory and single base substitutions in the stem regions [ Figure 4 ( a ) ] .
Buchnera tRNAs are more A+T biased and display weaker secondary structure than those of E. coli ( Figure 4 ) .
However , a high frequency of compensatory base substitutions is expected in the stem regions as a mechanism for maintaining functionality of these essential molecules .
Relative to E. coli , a total of 37 -- 42 compensatory base substitutions were found in Buchnera tRNA stem regions ( Table 5 ) .
Many of these compensatory substitutions were C/G to T/A directional changes ( Table 5 ) .
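To illustrate what counts as a compensatory substitution in a stem, the sketch below compares two aligned tRNA sequences at a list of paired stem positions. It is not the 4sale implementation: the function name is hypothetical, sequences are assumed to be gap-free RNA strings of equal length, and a substitution is called compensatory only when both partners of a pair differ between the sequences and pairing (Watson-Crick or G-U) is kept in both.

```python
# Allowed stem pairs: Watson-Crick plus G-U wobble (assumption for this sketch).
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_compensatory(seq_a, seq_b, stem_pairs):
    """Count stem pairs where both bases changed but pairing is maintained.

    stem_pairs is a list of (i, j) zero-based index pairs that base pair
    in the shared secondary structure.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for i, j in stem_pairs:
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_paired = pair_a in VALID_PAIRS and pair_b in VALID_PAIRS
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_paired and both_changed:
            count += 1
    return count
```

A G-C pair replaced by an A-U pair at the same stem position is counted once, which matches the C/G to T/A directional changes noted above when sequences are written in the RNA alphabet.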
Buchnera tRNA gene shrinkage and compensatory 3′ maturation
Genome reduction primarily reflects loss of coding genes , as reduction in gene length is minor ( < 1 % , 37 ) , and gene packing is similar for bacterial genomes of different sizes ( 66 ) .
However , Buchnera tRNA genes are often shorter in length than their homologs in E. coli [ Figure 5 ( a ) ] .
The difference in length is typically 3 bp and mostly reflects the loss of encoded 3′ CCA in the Buchnera tRNA genes .
At the 3′ end of tRNAs , CCA is required for amino acid activation , and must either be encoded in the tRNA gene or added during tRNA maturation by the CCA-adding enzyme .
Although E. coli and other close relatives of Buchnera such as Vibrio and Pseudomonas spp. all encode 3′ CCA in all tRNA genes except that for selenocysteine , only half of Buchnera tRNA genes encode 3′ CCA [ 14-17 depending on strain , Figure 5 ( b ) ] .
The remaining Buchnera tRNA genes have lost the 3′ encoded CCA .
Our analysis of directional RNAseq reads indicates that the mature transcripts of these genes possess a CCA at the 3′ end [ Figure 5 ( b ) ] , implying CCA-addition .
Some Buchnera tRNA genes with 3′ CCA encoded also displayed 3′ CCA maturation [ Figure 5 ( b ) ] , resulting in double or triple CCA at the 3′ end of tRNAs .
Recently , it was shown that tRNAs with dual 3′ CCA are targeted for degradation ( 67 ) .
More specifically , if a tRNA has 5′ Gs on bp 1 and 2 , and its acceptor stem is structurally unstable , then the CCA-adding enzyme marks the unstable tRNA by adding dual 3′ CCAs , targeting it for degradation by RNase R ( 67 ) .
Such degradation also seems possible in Buchnera strains , which encode both the CCA-adding enzyme and RNase R .
Thus , we examined all Buchnera tRNAs with dual and triple 3′ CCA maturation .
First , we noted that all E. coli tRNAs with a 5′ G at the 1st and 2nd base position encode dual or triple CCA on the 3′ end of the tRNA gene [ Figure 5 ( c ) ] .
Based on tRNAscan-SE 1.21 , the penultimate CCA is always incorporated into the 3′ acceptor stem , exposing a single 3′ CCA for activation .
Most Buchnera tRNAs that display dual or triple 3′ CCA maturation still retain a 5′ G at the 1st and 2nd bases and are homologs to dual or triple 3′ CCA encoded E. coli tRNAs [ Figure 5 ( c ) ] .
Three strain-specific tRNAs with dual 3′ CCA maturation do not have E. coli homologs with dual CCAs encoded .
These Buchnera tRNAs also do not encode 5′ Gs at the 1st and 2nd base .
All Buchnera tRNAs with dual or triple 3′ CCA maturation incorporate the 2nd to last CCA into the 3′ acceptor stem as in E. coli , except for one case , tRNA LeuTAA in Ak [ Figure 5 ( c ) ] .
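The distinction between single , dual and triple 3′ CCA , and between encoded and added CCA , can be sketched by counting consecutive CCA repeats at the end of a sequence. The function names are hypothetical and the sketch assumes the gene and transcript sequences are given 5′ to 3′ and differ only by nucleotides appended at the 3′ end.

```python
def count_terminal_cca(sequence):
    """Number of consecutive CCA triplets at the 3' end (0, 1, 2, 3, ...)."""
    count = 0
    while sequence.endswith("CCA"):
        count += 1
        sequence = sequence[:-3]
    return count


def cca_added_by_maturation(gene_sequence, transcript_sequence):
    """Terminal CCA repeats in the transcript beyond those encoded in the gene."""
    return count_terminal_cca(transcript_sequence) - count_terminal_cca(gene_sequence)
```

In this scheme a gene ending in one encoded CCA whose transcript ends in CCACCA would be reported as having one CCA added by maturation , corresponding to the dual 3′ CCA case discussed above.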
DISCUSSION
The efficiency and fidelity of translation are reinforced by many mechanisms encoded in genomes .
In reduced genomes , mutation rates are typically high , and selection becomes less effective in maintaining translational mechanisms .
In this study , we found that bacterial endosymbiont lineages ( Buchnera ) that experience relaxed selection display less optimal tRNA characteristics relative to those of their free-living relative E. coli .
Gene loss and A+T mutational bias in Buchnera have led to the loss of tRNA isoacceptors and loss of modified base pathways , the reduction of tRNA gene length , and the accumulation of base substitutions and indels ( insertions / deletions ) in tRNA sequences that weaken tRNA secondary structure and possibly aminoacyl-tRNA synthetase recognition .
These tRNA characteristics are conserved across four Buchnera lineages spanning 70 million years of divergence and may result in reduced translational efficiency and fidelity relative to their ancestors .
However , we did detect compensatory base substitutions in Buchnera tRNAs , which are expected to maintain secondary structure of tRNA stem regions .
Additionally , RNAseq reads reveal novel 3′ maturation processes that compensate for tRNA gene length reduction .
Divergent Buchnera taxa in this study encode and express the same 32 tRNA genes composed of 32 different isoacceptor types ( Figure 1 ) .
In turn , no duplication of tRNA gene isoacceptors was found .
Based on a survey of 50 eukaryotic , eubacterial , and archaeal genomes , low tRNA gene redundancy ( i.e. only one or two gene copies of a particular isoacceptor ) was only found in all archaeans and several bacterial genomes , and was approximately correlated with genome size ( 40 ) .
In Buchnera , because of modified wobble rules ( 50,51 ) , all mature tRNAs expressed can theoretically base pair with the 61 possible codons ( Table 3 , Figure 1 ) , which are all still encoded in Buchnera CDS .
One special Buchnera isoacceptor that has been identified previously in Buchnera-Ap (taxa type strain APS) is tRNA-Ile(CAU) (40), in which the 5′-C of the anticodon is modified into lysidine by the enzyme TilS in E. coli (55), which all Buchnera strains still encode.
This special Ile(CAU) isoacceptor codes for Ile instead of Met due to a wobble modification, and is ubiquitous in Eubacteria and Archaea (40).
During genome reduction, Buchnera has preferentially lost 5′-CNN and, to a lesser extent, 5′-GNN anticodons in family boxes, and 5′-CNN anticodons from two-codon NNR families (Table 3).
This pattern of tRNA isoacceptor loss is common for many bacteria with reduced genomes ( Figure 2 ) , and is most likely related to gene deletion processes .
Selective loss of these specific isoacceptors in family boxes and NNR two-codon families in Eubacteria was observed in previous studies (1,40,68,69), but was attributed to A+T sequence bias rather than to deletion processes (1,70).
We hypothesize that genome reduction , which is correlated with A+T bias , is the most likely explanation for this pattern of tRNA isoacceptor loss .
First, the potential for wobble in codon-anticodon base pairing implies that some tRNA isoacceptors are not essential for pairing with corresponding codons (e.g. 5′-CNN, 5′-GNN anticodons) and can be eliminated through mutation and deletion.
Second, due to wobble rules, 5′-GNN anticodons followed by 5′-UNN anticodons are the most promiscuous isoacceptors when pairing with cognate codons; thus, it is not surprising that 5′-UNN is always retained in family box and two-box NNR codons in the most reduced genomes.
In turn, 5′-UNN anticodons are probably retained because of their ability to recognize alternative codons rather than because of the high frequency of cognate codons in A+T rich CDS.
Typically, in bacteria and eukaryotes, 5′-CNN and 5′-GNN anticodons of family boxes and 5′-CNN anticodons from two-codon families, along with 5′-U anticodon modifications extending wobble, are maintained by selection because they increase the efficiency of translation (1,71).
We predict that the loss of tRNA isoacceptors in Buchnera as well as other endosymbionts potentially results in less efficient translation .
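The wobble argument above can be made concrete with a minimal sketch that lists the codons a given anticodon could read. The pairing tables below are assumptions for illustration only (the classic Crick wobble rules, plus a relaxed variant in which an unmodified 5′-U reads all four third-position bases); they are not the modified wobble rules of Table 3, and the function name `codons_read` is hypothetical.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# ASSUMPTION: classic Crick wobble pairing for the 5' anticodon base (N34).
# Key = anticodon base 34; value = codon third-position bases it can read.
CRICK_WOBBLE = {"C": "G", "G": "CU", "U": "AG", "A": "U", "I": "UCA"}

# ASSUMPTION: relaxed rule in which an unmodified 5'-U reads all four bases.
RELAXED_WOBBLE = dict(CRICK_WOBBLE, U="ACGU")


def codons_read(anticodon, wobble=CRICK_WOBBLE):
    """Return the set of codons (5'->3') that an anticodon (5'->3') can pair with."""
    n34, n35, n36 = anticodon.upper().replace("T", "U")
    first = COMPLEMENT[n36]   # codon position 1 pairs with anticodon position 36
    second = COMPLEMENT[n35]  # codon position 2 pairs with anticodon position 35
    # Codon position 3 pairs with the wobble base N34.
    return {first + second + third for third in wobble[n34]}


if __name__ == "__main__":
    # Alanine family box: compare the reach of CGC, GGC and UGC anticodons.
    for anticodon in ("CGC", "GGC", "UGC"):
        print(anticodon, sorted(codons_read(anticodon)),
              sorted(codons_read(anticodon, RELAXED_WOBBLE)))
```

Under these assumed rules, a 5′-CNN anticodon reads a single codon, whereas 5′-GNN and 5′-UNN anticodons each read at least two, which is the sense in which the 5′-CNN isoacceptors are the most dispensable.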
Numerous unmodified nucleotides at specific nucleotide positions on tRNA isoacceptors are conserved phylogenetically and are known to play crucial roles in defining tRNA specificity for aminoacylation (32,72).
These conserved nucleotides are called identity elements and are required for proper recognition by the cognate aaRS in addition to playing roles as deterrents to false recognition ( 32 ) .
Our results reveal that most Buchnera tRNAs have maintained identity elements homologous to those in E. coli, with the exceptions of Cys(GCA), Ser(GGA), Ser(GCT) and Ala(GGC).
In E. coli tRNA-Cys, the identity elements G15·G48 form an unusual tertiary base pair called a Levitt pair (73).
Additionally , the E. coli identity elements A13 · A22 are important in determining the structure of G15 · G48 ( 74 ) .
Collectively , these E. coli identity elements are required for CysRS recognition due to their role in RNA tertiary structure ( 73 ) .
In all Buchnera taxa, tRNA-Cys G15·G48 has mutated to U15·G48 and A13·A22 has mutated to G13·A22.
Hou et al. (73) found that when G15·G48 is mutated to U15·G48, its backbone configuration is similar to that of the wild-type tRNA-Cys; however, only partial aminoacylation (46.2%) occurs relative to the wild type.
How both types of changes in identity elements together affect tertiary structure is unknown.
In Buchnera tRNA-Ala(GGC), the identity element G20 is mutated to U20 in strains 5A and Ua and to C20 in Ak and Sg.
In E. coli tRNA-Ala(VGC), these same base changes were shown to result in 6 and 50 reductions in alanine charging activity, respectively, relative to native tRNA-Ala(VGC) (75).
Buchnera Ala(UGC) does not possess this mutation.
Potentially, if this mutation is deleterious for Ala(GGC) recognition, Ala(UGC) can wobble to all four alternative codons of the alanine family box.
Interestingly, the smallest sequenced genome of Buchnera, from the host Cinara cedri, retains the same tRNA isoacceptors and aaRSs as other Buchnera taxa examined in this study; however, the tRNA-Ala(GGC) gene has been lost, resulting in a total of only 31 tRNA genes.
In Buchnera tRNA-Ser(GGA), the identity element G73 (the discriminator base) has mutated to A73.
Generally, a mutation in the discriminator base is known to result in the loss of cognate aminoacyl-tRNA synthetase recognition; however, Shimizu et al. (76) demonstrated that substituting any of the four bases at the discriminator position of E. coli Ser tRNA resulted in the same level of aminoacylation.
Nevertheless , G73 in Ser tRNA is phylogenetically conserved ( 72 ) and has been shown to play minor roles in SerRS discrimination ( 77,78 ) .
Additionally , in E. coli Ser tRNA , the variable region plays a very important role as an identity element ( 77,79 ) .
In all Buchnera taxa except Sg, the variable region of the Ser(GCT) isoacceptor is 1 bp shorter than that of the E. coli Ser(GCT) isoacceptor.
In summary, it is unknown how all these mutated identity elements affect Buchnera translation, but the same mutations in E. coli are known to significantly reduce the efficiency of aminoacylation.
In addition to requiring specificity in aminoacylation , reliable and efficient translation requires the anticodon to correctly pair with its codon .
Modified nucleosides of tRNAs are essential mechanisms reinforcing translational fidelity and efficiency, especially at the wobble position (N34) and at the position immediately 3′-adjacent to the anticodon (N37) (1,2,51).
Based on E. coli tRNA homologs , we expect 16 different types of modified bases to be present in the remaining 32 Buchnera tRNAs , for both N34 and N37 positions .
In E. coli , 13 of these modified base pathways are known and Buchnera encodes complete pathways for six of these ( Table 4 ) .
All Buchnera taxa have lost enzymes responsible for synthesizing the N37 modified bases m2A and m6A, which are important in stabilizing 5′-NNC/G anticodons (2) (Table 4).
Enzymes that synthesize the N37 modification m6t6A are conserved in only half of the Buchnera taxa; this enzyme is known to slightly increase the efficiency of base pairing of the anticodon of Thr(GGU) to the codon ACC in E. coli (54).
All N37 modified base pathways important for preventing frameshifts and stabilizing A : U and U : A at the wobble position of the anticodon and the first position of the codon were retained in all Buchnera taxa ( Table 4 ) .
These mechanisms may be essential for the fidelity of translation , especially for A+T rich genomes .
Modified nucleosides at the wobble base position ( N34 ) of the anticodon are important for encoding the right amino acid , extending or restricting wobble , increasing the efficiency of base pairing and preventing frameshifts ( 2,53,55,56,58,59 ) .
Buchnera taxa all encode the enzyme TilS that is essential for the synthesis of the modified base lysidine , and is important for encoding the amino acid Ile instead of Met ( 59 ) .
All Buchnera taxa also encode the core enzymes MnmE and MnmG that are important for synthesizing the modified bases mnm5U, mnm5s2U and cmnm5Um, which restrict 5′-U wobble in NNR two-box codons, including Arg and Leu (Table 4).
All of these pathways are complete except for MnmC, which is involved in the last step for both modified bases, mnm5U and mnm5s2U. However, RNAseq mismatch evidence supports the presence of a modified base at the expected position of mnm5s2U (Table 4).
Interestingly, the genes encoding MnmA, MnmE, MnmG, and IscS or SufS, but not MnmC, are retained in several tiny endosymbiont genomes (10).
Conservation of these enzymes in reduced genomes indicates that these enzymes or derivatives are important for the production of the modified bases mnm5U, cmnm5Um, and especially mnm5s2U, which is essential for preventing frameshifts and restricting wobble in NNR two-codon boxes (Glu, Lys and Gln), thereby preventing the miscoding of amino acids.
For incomplete pathways producing the modified bases mnm5U and mnm5s2U, either a derivative may be synthesized and/or the insect host may import MnmC.
For example , the pea aphid , A. pisum expresses its mnmC homolog ( XP_003245837 ) in both its body and in the specialized aphid cells ( bacteriocytes ) that contain Buchnera cells ( 34 ) .
Another key enzyme that is retained in Buchnera is TadA , which is responsible for synthesizing inosine in E. coli ( 55 ) .
This wobble modification is present on Arg(ACG) in many bacteria and can wobble to three alternative codons of Arg (2,59).
RNAseq mismatch evidence highly supports this modification, as inosine is recognized as G during the reverse transcription process (60), and therefore we were able to measure a high frequency of modified Arg(ACG) transcripts from all Buchnera taxa.
Unfortunately, other modified bases do not appear to be recognized as specific bases and in turn incorporate different frequencies of any of the four bases during reverse transcription of modified transcripts (42,43).
Collectively, RNAseq evidence supported the presence of five modified bases, four for which the pathways are known and present (or nearly present for mnm5s2U) and one for which the pathway is unknown (Table 4).
If Buchnera tRNAs can be isolated without host contamination , modified base presence and identity can be confirmed .
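The mismatch reasoning above can be sketched in a few lines: because inosine is read as G during reverse transcription, an A34-to-I34 modification should appear as a high fraction of G calls at a position where the gene encodes A. The sketch below assumes that the base calls covering one tRNA position have already been extracted from the alignments; the function names and the 0.5 threshold are hypothetical and are not the procedure or cutoff used in this study.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def mismatch_profile(observed_bases, reference_base):
    """Summarise the base calls at one tRNA position across aligned reads."""
    counts = Counter(b.upper() for b in observed_bases if b.upper() in VALID_BASES)
    total = sum(counts.values())
    if total == 0:
        return {"coverage": 0, "mismatch_fraction": 0.0, "fractions": {}}
    fractions = {base: counts[base] / total for base in "ACGT"}
    # Fraction of reads whose call differs from the genome-encoded base.
    mismatch = (total - counts[reference_base.upper()]) / total
    return {"coverage": total, "mismatch_fraction": mismatch, "fractions": fractions}


def looks_like_inosine(profile, min_g_fraction=0.5):
    """ASSUMPTION: call A-to-I when G calls reach an arbitrary threshold."""
    return profile["fractions"].get("G", 0.0) >= min_g_fraction


if __name__ == "__main__":
    # Toy calls at a position encoded as A (not real data).
    toy_calls = ["G"] * 8 + ["A"] * 2
    profile = mismatch_profile(toy_calls, "A")
    print(profile["coverage"], profile["mismatch_fraction"], looks_like_inosine(profile))
```

For other modifications, which incorporate variable frequencies of any of the four bases, only `mismatch_fraction` is informative; the identity of the modified base cannot be inferred from such a profile.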
In many bacterial species , tRNA abundances are positively correlated with codon usage for highly expressed genes , thus increasing translational efficiency ( 64,65 ) .
In addition to analysing specific tRNA characteristics that influence the accuracy and efficiency of translation , we examined whether codon usage correlates with tRNA expression .
We found that tRNA sense expression is highly correlated across Buchnera taxa ( Table 2 ) , and many tRNA isoacceptors are expressed at similar levels within taxa ( Figure 3 , Supplementary Figure S1 ) .
A previous microarray study suggested that tRNA expression and codon usage of 50 highly expressed genes in Buchnera-Ap were positively correlated ( 37 ) , but the relationship was weak and expression of sense and antisense tRNAs were not distinguished , possibly confounding results .
Our directional RNAseq data show no relationship between tRNA expression and codon usage , for the same set of highly expressed genes in Buchnera-Ap under similar conditions ( Figure 3 ) .
Furthermore , no relationship was detectable in three other Buchnera taxa ( Supplementary Figure S1 ) .
Collectively , these results suggest that selection is not maintaining codon bias for highly expressed proteins .
Interestingly, Trp(CCA) is the most highly expressed isoacceptor in all Buchnera taxa (except Ua) and has very low codon usage.
In all Buchnera examined, isoacceptor Trp(CCA) displays one of the least stable secondary structures relative to its E. coli homolog; potentially, Trp(CCA) is highly expressed to compensate for low aminoacylation efficiency related to numerous base substitutions that weaken its secondary structure [Figure 4(a)].
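A comparison of tRNA expression with codon usage of this kind can be sketched as a rank correlation between two paired vectors, one value per isoacceptor. The statistic used in this study is not restated in this section, so the Spearman coefficient below is a stand-in chosen for illustration, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the expression level of each isoacceptor and `y` the summed usage of the codons it reads in the highly expressed genes; a coefficient near zero corresponds to the absence of a relationship reported above.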
In this study, we found that Buchnera tRNAs have maintained high %GC relative to Buchnera CDS; however, these tRNAs are more A+T rich and less stable relative to homologs in E. coli (Figure 4).
These results are consistent with previous findings ( 28 ) showing that 16S rRNAs of Buchnera and other endosymbiont species are more A+T rich and less stable than those of free-living relatives .
Similarly , mitochondrial tRNAs from animals are more A+T rich and less stable than nuclear tRNAs ( 80 ) .
Collectively , these results suggest that the accumulation of deleterious mutations can lead to less stable secondary structures of essential RNAs involved in translation .
Some selection for stabilization is also evident as numerous compensatory base substitutions have been fixed in the stem regions of both rRNAs ( 28 ) and tRNAs ( Figure 5 ) .
Alternatively , E. coli tRNAs may possess higher % GC because its optimal growth temperature is higher than that of Buchnera ( 81 ) , thus favoring higher % GC for increased thermal stability .
During genome reduction, 72–78% of Buchnera tRNA genes among all taxa have deleted 3 bp, due to the loss of the 3′-encoded CCA [Figure 5(a)].
Nevertheless, we found that all mature Buchnera tRNAs process 3′ CCA, and therefore they all have potential for amino acid activation [Figure 5(b)].
In all Buchnera taxa, six to eight mature tRNAs process dual or triple 3′ CCA [Figure 5(b)].
These characteristics, in addition to 5′ G at the first and second positions and instability of the acceptor stem, result in tRNA degradation (67).
Interestingly, these tRNAs in Buchnera and E. coli transcribe 5′ G at the first and second base positions and process dual or triple 3′ CCA [Figure 5(c)].
In these mature tRNAs in both E. coli and Buchnera, the second-to-last 3′ CCA is always incorporated into the 3′ acceptor stem.
Potentially, the retention of encoded 5′ G at N1 and N2 and the conservation of dual and triple 3′ CCA maturation in these tRNAs [Figure 5(c)] are essential to maintain the correct secondary structure and to police unstable tRNAs via the tRNA degradation pathway.
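The two 3′-end properties discussed above can be illustrated with a short sketch: whether a tRNA gene still encodes its terminal CCA, and how many CCA triplets a mature transcript carries at its 3′ end. The sketch assumes that "dual or triple 3′ CCA" refers to tandem CCA triplets at the 3′ terminus, and the function names and toy sequences are hypothetical.

```python
def encodes_cca(gene_seq):
    """True if the tRNA gene sequence (5'->3', DNA or RNA) ends in CCA."""
    return gene_seq.upper().replace("U", "T").endswith("CCA")


def count_terminal_cca(transcript):
    """Count consecutive CCA triplets at the 3' end of a mature transcript.

    ASSUMPTION: dual/triple CCA means tandem repeats (CCACCA, CCACCACCA).
    """
    seq = transcript.upper().replace("U", "T")
    count = 0
    while seq.endswith("CCA"):
        count += 1
        seq = seq[:-3]  # strip one triplet and test the new 3' end
    return count


if __name__ == "__main__":
    # Toy sequences only: a gene lacking encoded CCA and a dual-CCA transcript.
    toy_gene = "GGGA"
    toy_transcript = "GGGACCACCA"
    print(encodes_cca(toy_gene), count_terminal_cca(toy_transcript))
```

A gene for which `encodes_cca` is false but whose mature transcript has a terminal CCA count of one or more would correspond to the post-transcriptional 3′ CCA maturation described here.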
In conclusion , our observations of altered tRNA characteristics are consistent with the hypothesis that translational fidelity is lower in Buchnera compared with free-living relatives as represented by E. coli .
First , Buchnera genome reduction has resulted in the loss of specific tRNA isoacceptors and modified nucleoside pathways that may reduce translational efficiency and fidelity .
Second, Buchnera's A+T mutational bias and reduced selection have resulted in the reduction of tRNA stability in vitro and in specific tRNA base substitutions that may alter the efficiency of aaRS recognition.
Moreover , reduced translational efficiency was supported by the lack of relationship between codon usage of highly expressed genes and cognate tRNA isoacceptor expression .
Nevertheless , purifying selection appears to be strong enough in Buchnera genomes to maintain high % GC of tRNA genes relative to CDS .
Also, 3′ CCA maturation of shortened tRNA genes and numerous compensatory base substitutions in tRNA stems help maintain tRNA secondary structure and function.
Consequently, we predict that the translational efficiency and fidelity evident in Buchnera are in an intermediate state between free-living bacteria and organelles.
ACCESSION NUMBERS
All raw sense and anti-sense tRNA data were submitted to NCBI GenBank under SRA Submission SRA049863.3, under BioProject numbers: (i) PRJNA82811, (ii) PRJNA82809, (iii) PRJNA82797, (iv) PRJNA82793 and (v) PRJNA82789.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online : Supplementary Tables 1 and 2 , Supplementary Figure 1 and Supplementary Dataset 1 .
ACKNOWLEDGEMENTS
The authors thank Kim Hammond for rearing aphids and Dieter Söll , Jiqiang Ling , Patrick O'Donoghue and Markus Englert for helpful discussions and feedback on tRNA data .
They also thank Yogeshwar Kelkar, Rahul Raghavan and Patrick Degnan for helpful comments on the manuscript, and four anonymous reviewers for their helpful comments and suggestions.