# MONOGENIC DISEASES
Human Genomics Project
Team:
- Garcia Flores Fernanda Renee
- Meza Landeros Kevin Emmanuel
- Schafer Juarez Badillo Alejandra Nicole
- Zeferino Garcia Karla
Here we present all the data and scripts used to address one of the most relevant current questions in the field of human health:
What proportion of diseases is caused by alterations in coding versus non-coding regions?
## Folder content
- Graphs

  Contains plots that show the proportion of coding and non-coding sequences in monogenic diseases.

  - Grafica1.png
  - Grafica2.png
- alignments

  Contains the files that result from aligning the sequences of the genes of interest (those that cause a monogenic disease) to the human genome.

  - sequences_aligned_A.bam
  - sequences_aligned_A.sam
  - sequences_aligned_A_sort.bam
- data
  - DISEASES DB

    Stores one of the databases used for the project: a file with all the information on the monogenic diseases it contains.

    - human_disease_textmining_full.tsv
    - merge_list_monogenic_diseases.tsv (list of genes from "merge_monogenic_diseases.tsv")
    - merge_monogenic_diseases.tsv
  - Ensembl

    Holds the following information about human genes: Gene start (bp); Gene end (bp); Gene type; Gene name; Strand; Protein stable ID.

    - mart_export_v2.txt
  - Homo_sapiens

    Includes the human genome sequence and its annotation.

    - Homo_sapiens.GRCh38.100.gff3.gz (annotation)
    - Homo_sapiens.GRCh38.dna.alt.fa.gz (sequence)
  - OMIM

    Contains a file with information about different heritable conditions, filtered to keep only the entries that correspond to monogenic diseases.

    - gene_filtered_phenENS.txt
- scripts

  Contains the scripts used throughout this project.

  - CambioCol.R
  - ObtencionSecuencias.R
  - ObtenciondeAllData.R
  - alineamiento.sh
  - get_monogenic_disease_data_DISEASES.sh
  - get_monogenic_disease_data_OMIM.sh
  - mapeo.R
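
As a rough illustration of how the project question could be approached from the Ensembl export described above, here is a minimal Python sketch. It is not the method used by the scripts in this repository (those are written in R and shell). It assumes a tab-separated file whose headers match the fields listed for `mart_export_v2.txt`, and it treats the Ensembl biotype `protein_coding` as "coding" and every other gene type as "non-coding", which is a gene-level simplification. The function name `coding_proportions`, the gene names and the sample rows are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt laid out like the fields listed for mart_export_v2.txt.
# The rows below are made up for illustration only.
SAMPLE_EXPORT = (
    "Gene start (bp)\tGene end (bp)\tGene type\tGene name\tStrand\tProtein stable ID\n"
    "1000\t5000\tprotein_coding\tGENE_A\t1\tPROT_A1\n"
    "1000\t5000\tprotein_coding\tGENE_A\t1\tPROT_A2\n"
    "7000\t9000\tprotein_coding\tGENE_B\t-1\tPROT_B1\n"
    "12000\t12500\tlncRNA\tGENE_C\t1\t\n"
    "15000\t15080\tmiRNA\tGENE_D\t-1\t\n"
    "20000\t26000\tprotein_coding\tGENE_E\t1\tPROT_E1\n"
)


def coding_proportions(handle, disease_genes):
    """Return the fraction of disease genes that are protein-coding vs. not.

    handle: open text file (or file-like object) with the tab-separated export.
    disease_genes: set of gene names taken from a monogenic disease list.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    gene_types = {}
    for row in reader:
        name = row["Gene name"]
        if name in disease_genes:
            # A gene can appear once per protein ID, so record it only once.
            gene_types.setdefault(name, row["Gene type"])

    counts = Counter(
        "coding" if gene_type == "protein_coding" else "non_coding"
        for gene_type in gene_types.values()
    )
    total = sum(counts.values())
    if total == 0:
        return {"coding": 0.0, "non_coding": 0.0}
    return {key: counts[key] / total for key in ("coding", "non_coding")}


if __name__ == "__main__":
    genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
    print(coding_proportions(io.StringIO(SAMPLE_EXPORT), genes))
```

With the real data, the `disease_genes` set would presumably come from a gene list such as `merge_list_monogenic_diseases.tsv`, and the file handle from `open("mart_export_v2.txt", newline="")`. Both uses are assumptions about the file contents, not something documented here.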