Kevin Meza Landeros

Update README.md

Showing 1 changed file with 13 additions and 3 deletions
1 -# "Automatic Extraction of Growth Conditions (GC) from the Gene Expression Omnibus (GEO)". 1 +# "Automatic Extraction of Growth Conditions (GCs) from the Gene Expression Omnibus (GEO)".
2 -Project to extract in an automatic way, the growth conditions of all enterobacteria within the GEO using "Conditional Random Fields " (CRFs). 2 +Project to extract in an automatic way the growth conditions of all enterobacteria within the GEO using "Conditional Random Fields " (CRFs).
3 3
4 ## Prerequisites 4 ## Prerequisites
5 ### Programming languages 5 ### Programming languages
6 - - Python (version 2.7, version 3) 6 + - Python (version 2.7, version 3.7)
7 - Bash 7 - Bash
8 8
9 ## Folder content 9 ## Folder content
...@@ -27,8 +27,18 @@ Project to extract in an automatic way, the growth conditions of all enterobacte ...@@ -27,8 +27,18 @@ Project to extract in an automatic way, the growth conditions of all enterobacte
27 **CoreNLP** 27 **CoreNLP**
28 - bin 28 - bin
29 1. get-raw-sentences.sh 29 1. get-raw-sentences.sh
30 + _Script that **extracts the GCs** from the file: "tagged-xml-data" and adds the phrase: "PGCGROWTHCONDITIONS" to all lines._
30 2. single_run.sh 31 2. single_run.sh
32 + _Script that **runs** the script: "corenlp.sh" with the desired parameters._
31 - input 33 - input
32 1. raw-metadata-senteneces.txt 34 1. raw-metadata-senteneces.txt
35 + _Resulting file from "get-raw-sentences.sh". **Contains all the GCs.**_
33 - output 36 - output
34 1. raw-metadata-senteneces.txt.conll 37 1. raw-metadata-senteneces.txt.conll
38 + _This file contains **all the words of all the GCs**, each tagged with its **"lemma" & "POS"**._
39 +
40 +**data-sets**
41 + - report-manually-tagged-gcs
42 + _Contains the extracted GCs of all the samples for each series._
43 + - tagged-xml-data
44 + _Contains the **original xml-tagged files** from which the GCs will be extracted._
...\ No newline at end of file ...\ No newline at end of file