Carlos-Francisco Méndez-Cruz / sentence-simplification
Authored by Carlos-Francisco Méndez-Cruz, 2018-02-22 00:22:56 -0600
Commit dd8564cb516901f82771ed1e2c862a6b8a291e49 (1 parent: 8222e220)
Update README.md
Showing 1 changed file with 32 additions and 23 deletions
README.md
# Automatic extraction of gene - disease events
## Input data sets (original corpus given by Yalbi Balderas)
# Sentence simplification with iSimp and Daniel Gutiérrez's algorithm
## Directories
### iSimp
```Shell
\input-data-sets
/isimp_v2
```
## Dictionaries
### Original given by Yalbi Balderas
### Temporal iSimp files with constructs
```Shell
\dictionaries-original
/iSimp_sentences
```
## Terminological resources (dictionaries for entity recognition)
### Final simplified sentences
```Shell
\terminologicalResources
/algorithm_sentences
```
## JSON dictionaries (Gene, disease, effect and GO)
### Cleaned sentences
```Shell
\dictionaries-json
format/sanitized_sentences
```
## Example sentences
### Example sentences with different tags (non redundant)
### Separated sentences one per file
```Shell
\example-sentences
format/split_sentences
```
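The `format/split_sentences` directory holds one sentence per file, but this README does not say how the split is produced. The following is a minimal sketch only, assuming the input file has one sentence per line; the function name `split_sentences` and the `sentence_<n>.txt` naming scheme are hypothetical and not taken from the repository.

```python
from pathlib import Path


def split_sentences(input_file, output_dir, stem="sentence"):
    """Write each non-empty line of input_file to its own file in output_dir.

    Assumption: the input already has one sentence per line.
    The naming scheme <stem>_<index>.txt is hypothetical.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    text = Path(input_file).read_text(encoding="utf-8")
    sentences = [line.strip() for line in text.splitlines() if line.strip()]

    # Zero-pad the index so that the files sort in sentence order.
    width = max(len(str(len(sentences))), 1)

    paths = []
    for index, sentence in enumerate(sentences, start=1):
        target = out / f"{stem}_{index:0{width}d}.txt"
        target.write_text(sentence + "\n", encoding="utf-8")
        paths.append(target)
    return paths
```

Zero-padding the index keeps the files in sentence order when a later step lists the directory alphabetically.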
## Corpus
### Set of sentence-split abstracts and full articles
## Scripts
### Clean sentences for iSimp
```Shell
Usage: ./format/regex.py <input_file_path> <output_file_path>
./format/regex.py ./input-sentences/input-sentences.txt ./format/sanitized_sentences/input-sentences.txt
```
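The README gives only the command-line shape of `regex.py`, not the patterns it applies. As a rough sketch, assuming that cleaning means dropping control characters and collapsing whitespace (both assumptions; the names `sanitize_sentence` and `sanitize_file` are hypothetical), a cleaner taking the same two arguments might look like this:

```python
import re

# Hypothetical patterns: the real regex.py may apply different rules.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
_WHITESPACE = re.compile(r"\s+")


def sanitize_sentence(line):
    """Drop control characters, then collapse runs of whitespace."""
    line = _CONTROL_CHARS.sub("", line)
    return _WHITESPACE.sub(" ", line).strip()


def sanitize_file(input_file_path, output_file_path):
    """Write one cleaned sentence per line; return how many were written."""
    written = 0
    with open(input_file_path, encoding="utf-8") as src, \
            open(output_file_path, "w", encoding="utf-8") as dst:
        for line in src:
            cleaned = sanitize_sentence(line)
            if cleaned:  # skip lines that are empty after cleaning
                dst.write(cleaned + "\n")
                written += 1
    return written
```

Calling `sanitize_file("./input-sentences/input-sentences.txt", "./format/sanitized_sentences/input-sentences.txt")` would mirror the usage line above.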
### Main shell script for sentence simplification
```Shell
\corpora
./sentence-simplification-main.sh
Usage: ./sentence-simplification-main.sh <input_path> <output_file_path>
./sentence-simplification/sentence-simplification-main.sh ./format/split_sentences ./algorithm_sentences/filename.txt
<input_path> Path to the directory of cleaned, separated sentences, one per file.
<output_file_path> Path and filename. The filename is used as the base name for the files of simplified sentences, each with an index in its name.
Requirements: sentences must be cleaned and separated one per file.
It calls simplifier.py.
```
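The driver behaviour described above (read one sentence per file, write indexed output files derived from `<output_file_path>`) can be sketched in Python. This is an illustration under stated assumptions, not the contents of `sentence-simplification-main.sh`: the `filename_<index>.txt` scheme, the one-output-file-per-input-file layout, and the `simplify` callback standing in for `simplifier.py` are all hypothetical.

```python
from pathlib import Path


def indexed_output_path(output_file_path, index):
    """Insert an index before the extension (assumed naming scheme)."""
    path = Path(output_file_path)
    return path.with_name(f"{path.stem}_{index}{path.suffix}")


def simplify_all(input_path, output_file_path, simplify):
    """Run `simplify` on every sentence file found in input_path.

    `simplify` is a placeholder for the real simplifier: it takes one
    sentence and returns a list of simplified sentences.
    Assumption: one indexed output file is written per input file.
    """
    written = []
    sentence_files = sorted(p for p in Path(input_path).iterdir() if p.is_file())
    for index, sentence_file in enumerate(sentence_files, start=1):
        sentence = sentence_file.read_text(encoding="utf-8").strip()
        target = indexed_output_path(output_file_path, index)
        target.parent.mkdir(parents=True, exist_ok=True)
        # One simplified sentence per line in the output file.
        lines = [s + "\n" for s in simplify(sentence)]
        target.write_text("".join(lines), encoding="utf-8")
        written.append(target)
    return written
```

Passing the simplifier in as a function keeps the sketch independent of how `simplifier.py` is actually invoked, which the README does not specify.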
## scripts
### Scripts used to tag and preprocess text
### Python script for sentence simplification
```Shell
\scripts
```
\ No newline at end of file
simplifier.py
It is called by sentence-simplification-main.sh
```