Update README.md

Ignacio Arroyo Fernández
Commit 80dc62e8291dde9399664616b83ee2482fd65694 (1 parent: 4c27cbd2)
Showing 1 changed file with 5 additions and 4 deletions
README.md
--- a/README.md
+++ b/README.md
@@ -14,6 +14,7 @@ The main method follows the next pipeline:
 ### Prediction mode
 - Parse abstracts from a unique input file
 - Transform abstracts into their TFIDF sparse representations
+- Transform TFIDF representations into their 200-dimensional SVD approximation
 - Predict useless/useful papers by means of their abstracts using pretrained Support Vector Machines
 # Usage
@@ -21,7 +22,7 @@ The main method follows the next pipeline:
 For filtering unknown abstracts run
 ```bash
-$ python filter_abstracts.py --input data/test_abstracts.txt
+$ python filter_abstracts_binClass.py --input data/test_abstracts.txt
 ```
 The predictions are stored by default in `filter_output/`, unless a different directory is specified with the `--out` option. The default names of the files containing the predictions are
@@ -36,10 +37,10 @@ The format of each file is:
 <PMID> \t <text of the abstract>
 ``` 
-For training a new model set the list of parameters at `model_params.conf` and then run
+For training a new model set the list of parameters at `model_params_binClass.conf` and then run
 ```bash
-$ python filter_abstracts.py --classA data/ecoli_abstracts/not_useful_abstracts.txt --classB data/ecoli_abstracts/useful_abstracts.txt
+$ python filter_abstracts_binClass.py --classA data/ecoli_abstracts/not_useful_abstracts.txt --classB data/ecoli_abstracts/useful_abstracts.txt
 ```
-where `--classA` and `--classA` are used to specify input training files. In this example `data/ecoli_abstracts/useful_abstracts.txt` is the training files containing abstracts of papers reporting experimental data (the desired or useful class for us).
+where `--classA` and `--classB` specify the input training files (`--classB` takes the useful papers). In this example, `data/ecoli_abstracts/useful_abstracts.txt` is the training file containing abstracts of papers reporting experimental data (the desired, or useful, class for us).
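
The `<PMID> \t <text of the abstract>` layout shown in the third hunk can be read with a few lines of Python. This is only a minimal sketch, not the repository's own parser: the function name `parse_abstracts` is hypothetical, and it assumes one record per line, a single tab between the PMID and the text, and optional spaces around that tab.

```python
import io


def parse_abstracts(lines):
    """Yield (pmid, abstract) pairs from tab-separated lines.

    Hypothetical helper: assumes one "<PMID> \t <text>" record per line.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first tab only, so tabs inside the abstract survive.
        pmid, sep, text = line.partition("\t")
        if not sep:
            raise ValueError(f"missing tab separator: {line!r}")
        yield pmid.strip(), text.strip()


# Made-up sample records, used only to illustrate the layout.
sample = io.StringIO(
    "12345 \t First abstract text.\n"
    "\n"
    "67890\tSecond abstract,\twith a tab.\n"
)
records = list(parse_abstracts(sample))
```

To read a real file, pass an open file object instead of the in-memory sample, for example `parse_abstracts(open("data/test_abstracts.txt", encoding="utf-8"))`; the encoding is likewise an assumption.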