Kevin Meza Landeros

Update README.md

Showing 1 changed file with 13 additions and 2 deletions
@@ -19,8 +19,19 @@ Thats why, our hypothesis is that a predictive model can determine the GCs of th
19 19
20 20
21 ## Methodology 21 ## Methodology
22 - 1. GEO files download 22 + 1. GEO files download
23 - 2. Obtaining SOFT files and its transformation to an XML format 23 + GEO files from all Enterobacteria were downloaded to a server and organized into 4 directories (each containing many _GSE00000000_ folders):
24 + - Binding_exp
25 + - Binding_HT
26 + - Function_ex
27 + - Function_HT
28 + Each of the _GSE00000000_ folders contains a compressed file (GSE00000_family.soft.gz) that must be extracted.
29 +
30 + 2. Obtaining SOFT files and their transformation to an XML format
31 + A script goes through every _GSE00000000_ folder and unzips the _"GSE00000_family.soft.gz"_ files to obtain _"GSE00000_family.soft"_ files.
32 + These are all saved in another directory, preserving the structure of the 4 parent directories.
33 + Then another script transforms the SOFT files into XML files.
34 +
24 3. Tagging the GC within the XML files 35 3. Tagging the GC within the XML files
25 36
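Steps 1 and 2 above could be sketched roughly as follows. This is a minimal illustration, not the scripts used in this repository: the function names `extract_soft_files` and `soft_to_xml`, the assumption that each extracted file keeps its `<category>/<GSE folder>/` location, and the XML element names (`soft`, `entity`, `attribute`) are all hypothetical. The sketch only relies on the SOFT convention that entity lines start with `^` and attribute lines start with `!`.

```python
import gzip
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

# The four top-level directories named in step 1.
CATEGORIES = ["Binding_exp", "Binding_HT", "Function_ex", "Function_HT"]


def extract_soft_files(src_root, dst_root):
    """Decompress every *_family.soft.gz under src_root into dst_root.

    Assumption: the output mirrors the <category>/<GSE folder>/ layout.
    Returns the list of paths that were written.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    written = []
    for category in CATEGORIES:
        category_dir = src_root / category
        if not category_dir.is_dir():
            continue  # tolerate a missing category directory
        # Sorted so the processing order is reproducible.
        for gz_path in sorted(category_dir.glob("GSE*/*_family.soft.gz")):
            out_dir = dst_root / category / gz_path.parent.name
            out_dir.mkdir(parents=True, exist_ok=True)
            out_path = out_dir / gz_path.name[: -len(".gz")]
            with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(out_path)
    return written


def soft_to_xml(soft_path, xml_path):
    """Convert one SOFT file into a simple XML tree (hypothetical schema)."""
    root = ET.Element("soft")
    current = root
    with open(soft_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith("^"):
                # Entity line, e.g. "^SERIES = GSE...": open a new element.
                kind, _, value = line[1:].partition("=")
                current = ET.SubElement(
                    root, "entity", type=kind.strip(), id=value.strip()
                )
            elif line.startswith("!"):
                # Attribute line, e.g. "!Series_title = ...".
                name, _, value = line[1:].partition("=")
                attr = ET.SubElement(current, "attribute", name=name.strip())
                attr.text = value.strip()
            # "#" column descriptions and data-table rows are skipped here.
    ET.ElementTree(root).write(xml_path, encoding="utf-8", xml_declaration=True)
```

A call such as `extract_soft_files("geo_download", "geo_soft")` followed by `soft_to_xml(path, path.with_suffix(".xml"))` for each returned path would cover both steps; the directory names here are placeholders.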
26 ## Prerequisites 37 ## Prerequisites
......