GSE79880-GSM2407481-GPL18006-GPL18956-GPL21317-GPL21690-GPL21691-GPL22710-GPL22711-PMID:27997543.tsv
"genotype/variation: wild type" "characteristics_ch1.1"
"strain: K12" "characteristics_ch1.2"
"Library strategy: REC-Seq" "data_processing.1"
"genome build: NC_000913.3" "data_processing.2"
"To remove contaminating sequences, the reads were split according to the HinfI consensus motif (5’-G^ANTC-3’) considered as a barcode sequence using fastx_toolkit (http://hannonlab.cshl.edu/fastx_toolkit/) (fastx_barcode_splitter.pl --bcfile barcodelist.txt --bol --exact). Most of the reads (more than 90%) were rejected, and the reads kept were remapped to the reference genomes with bwa mem and samtools to generate a sorted bam file. The bam file was further filtered to remove low mapping quality reads (keeping AS >= 45) and split by orientation (alignmentFlag 0 or 16) with bamtools. The reads were counted at 5' positions using Bedtools (bedtools genomecov -d -5). Both orientation count files were combined into a bed file at each identified 5’-GANTC-3’ motif (where reverse counts >=1 at position N+1 and forward counts >=1 at position N-1) using a homemade Perl script. The HinfI positions in the bed file were associated with the closest gene using Bedtools closest and the gff3 file of the reference genomes. The final bed file was converted to an MS Excel sheet (S1 and S2 Tables) with a homemade script. For the MboI-based REC-Seq, the strategy was identical except that a different adaptor was used for ligation after cleavage and the MboI consensus motif (5’-^GATC-3’) was used as the barcode for filtering of V. cholerae O1 biovar El Tor and E. coli K12 Ec100D gDNA mapped onto the MG1655 genome." "data_processing.3"
"Supplementary_files_format_and_content: tab delimited text files, with feature annotation, REC_seq score, gene name and description" "data_processing.4"
"For REC-Seq (restriction enzyme cleavage–sequencing), 1 µg of genomic DNA from C. crescentus NA1000 and S. meliloti Rm2011 was cleaved with HinfI, a blocked (5’-biotinylated) specific adaptor was ligated to the ends, and the ligated fragments were then sheared to an average size of 150-400 bp (Fasteris SA, Geneva, CH). Illumina adaptors were then ligated to the sheared ends, followed by deep sequencing on an Illumina HiSeq sequencer, and the 50 bp single-end reads were quality controlled with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/)." "extract_protocol_ch1.1"
"DNA libraries were prepared for sequencing using standard Illumina protocols by FASTERIS SA, Switzerland" "extract_protocol_ch1.2"
"Caulobacter crescentus and derivatives were grown at 30°C in PYE (Peptone yeast extract) or LB" "growth_protocol_ch1.1"
"Escherichia coli str. K-12 substr. MG1655" "organism_ch1.1"
"Escherichia coli MG1655" "source_name_ch1.1"
"GHA507" "title.1"
"OTHER" "library_strategy.1"
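
The data_processing.3 field describes combining forward and reverse 5' read counts at each 5’-GANTC-3’ motif with a script that is not included in this record. The sketch below is one possible reading of that step, not the authors' code. It assumes that "N" refers to the central base of the motif, that per-base counts are held in plain lists using 0-based positions, and that the function name call_hinfi_sites and its min_count parameter are hypothetical names introduced here for illustration.

```python
import re

# HinfI recognition motif 5'-GANTC-3' (N = any base).
MOTIF = re.compile(r"GA[ACGT]TC")


def call_hinfi_sites(genome, fwd_counts, rev_counts, min_count=1):
    """Return (start, end, fwd, rev) for each GANTC motif supported by reads.

    genome      -- reference sequence as a string
    fwd_counts  -- per-base 5' read-start counts, forward strand (0-based)
    rev_counts  -- per-base 5' read-start counts, reverse strand (0-based)
    min_count   -- hypothetical threshold; the record states >= 1
    """
    sites = []
    for match in MOTIF.finditer(genome.upper()):
        n_pos = match.start() + 2        # assumed: index of the central N base
        fwd = fwd_counts[n_pos - 1]      # forward 5' ends expected at N-1
        rev = rev_counts[n_pos + 1]      # reverse 5' ends expected at N+1
        if fwd >= min_count and rev >= min_count:
            # BED-style half-open interval covering the 5-base motif
            sites.append((match.start(), match.end(), fwd, rev))
    return sites


# Toy example: two motifs, only the first has support on both strands.
genome = "TTGACTCAAGATTCGG"
fwd = [0] * len(genome)
rev = [0] * len(genome)
fwd[3] = 5    # forward read starts at N-1 of the first motif
rev[5] = 2    # reverse read starts at N+1 of the first motif
rev[12] = 4   # second motif has reverse support only
sites = call_hinfi_sites(genome, fwd, rev)
```

In this toy input the second motif is dropped because it lacks forward-strand support, which mirrors the requirement in the record that both orientations show at least one read start.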