Carlos-Francisco Méndez-Cruz / spanish-maya-nahuatl-morphological-analysis
Authored by Carlos-Francisco Méndez-Cruz, 2018-09-02 22:54:33 -0500
Commit 72d1542c8326e3761ebef7df565b5d5bb0dd00be
1 parent 8031d48d
README
Showing 1 changed file with 7 additions and 7 deletions
README.md
-# Automatic analysis of morphological units: segmentation and clustering of Spanish, Maya and Nahuatl
-## Carlos-Francisco Méndez-Cruz and Ignacio Arroyo-Fernández
+## Automatic analysis of morphological units: segmentation and clustering of Spanish, Maya and Nahuatl
+### Carlos-Francisco Méndez-Cruz and Ignacio Arroyo-Fernández
This repository presents the results of two automatic morphological
analyses for Spanish, Nahuatl and Maya.
...
@@ -18,23 +18,23 @@ conclude that the word embeddings represented the contextual
information necessary to differentiate them from morphs with
lexical-semantic content.
# Directory description
### Corpora
`\corpora`
## Corpora
Only a sample of the documents used in our study is included.
Complete versions must be requested by e-mail (see **Contact**).
-## Segmentation
+### Segmentation
Segmented corpus for each language.
Maya and Nahuatl were segmented using _Morfessor CatMap_
(http://www.cis.hut.fi/projects/morpho/).
Spanish was segmented by the authors.
-## Clustering
+### Clustering
Clusters of morphs for each language:
500 groups for Maya and Nahuatl, 1000 groups for Spanish.
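
To illustrate what grouping morphs into a fixed number of clusters can look like, here is a minimal sketch. The README does not state which clustering algorithm was used, so plain k-means is an assumption made here for illustration only. The morphs, their 2-dimensional vectors, the `kmeans` helper and its evenly spaced initialization are all hypothetical; only the idea of a fixed cluster count per language (500 or 1000) comes from the text above.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups with plain k-means (Lloyd's algorithm).

    Assumption: k-means stands in for the unspecified clustering method.
    """
    # Simplified deterministic start: pick k evenly spaced input vectors.
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical toy data: morph -> 2-dimensional embedding (not from the study).
morph_vectors = {
    "-s": [0.9, 0.1],
    "-es": [1.0, 0.2],
    "-mos": [0.8, 0.0],
    "cas": [0.1, 0.9],
    "perr": [0.0, 1.0],
    "gat": [0.2, 0.8],
}

morphs = list(morph_vectors)
labels = kmeans([morph_vectors[m] for m in morphs], k=2)

# Collect the morphs that share a cluster label.
clusters = {}
for morph, label in zip(morphs, labels):
    clusters.setdefault(label, []).append(morph)
print(clusters)
```

In this toy setup the affix-like morphs and the stem-like morphs end up in separate groups. A real run would use the embeddings learned from the segmented corpora and a much larger `k`.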
-## Contact
+### Contact
Carlos Méndez (cmendezc at ccg dot unam dot mx)
Center for Genomic Sciences, UNAM, Mexico
...