Carlos-Francisco Méndez-Cruz / deep-learning-workshop
Authored by Carlos-Francisco Méndez-Cruz, 2019-05-08 13:34:01 -0500
Commit d60f92ab4d4720f3ba72d7a0eed71b2b077e6a44
1 parent 052e1ceb
Deep Learning Workshop
Showing 1 changed file with 4 additions and 5 deletions
data-sets/get-hga-training-test-py27.py
...
...
@@ -86,18 +86,17 @@ if __name__ == "__main__":
     print("Max exon length: {}".format(max_exon_length))
     print("Max utr length: {}".format(max_utr_length))
     quit()
     # Fill sequence with X char to get max length
     # One-hot-encoding of sequences
     for sequence, label in zip(sequences, labels):
         if label == "exon":
             if len(sequence) < max_exon_length:
-                sequence.ljust(max_exon_length + len(sequence), 'X')
+                sequence_adjust = sequence.ljust(max_exon_length + len(sequence), 'X')
         elif label == "utr":
             if len(sequence) < max_utr_length:
-                sequence.ljust(max_utr_length + len(sequence), 'X')
-        integer_encoded = integer_encoder.fit_transform(list(sequence))
+                sequence_adjust = sequence.ljust(max_utr_length + len(sequence), 'X')
+        print("Length sequence_adjust: {}".format(len(sequence_adjust)))
+        integer_encoded = integer_encoder.fit_transform(list(sequence_adjust))
         integer_encoded = np.array(integer_encoded).reshape(-1, 1)
         one_hot_encoded = one_hot_encoder.fit_transform(integer_encoded)
         input_features.append(one_hot_encoded.toarray())
...
...
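A note on the padding in this hunk, with a minimal sketch. `str.ljust(width, fillchar)` takes the total target width, not the number of characters to add. So `sequence.ljust(max_exon_length + len(sequence), 'X')` returns a string of length `max_exon_length + len(sequence)`, which differs from one sequence to the next. Also, as the hunk reads, `sequence_adjust` is assigned only when the sequence is shorter than the maximum, so a sequence already at the maximum length would reuse the value from the previous iteration (or raise `NameError` on the first one). The sketch below shows one way to pad every sequence to a common length and one-hot encode it against a fixed alphabet. It uses only the standard library rather than the `integer_encoder` and `one_hot_encoder` objects from the script, whose definitions are not shown in the diff. The names `ALPHABET`, `pad_sequence` and `one_hot_encode` are hypothetical, and the five-letter alphabet is an assumption about the data.

```python
# Hypothetical helpers, not part of the committed script.
# Assumption: sequences contain only A, C, G, T, plus the 'X' pad character.
ALPHABET = "ACGTX"


def pad_sequence(sequence, target_length, pad_char="X"):
    """Right-pad sequence to exactly target_length characters.

    str.ljust takes the total width and returns the string unchanged
    when it is already that long or longer, so no length check is needed.
    """
    return sequence.ljust(target_length, pad_char)


def one_hot_encode(sequence, alphabet=ALPHABET):
    """Return one row per character, with a single 1 in that character's column."""
    index = {char: i for i, char in enumerate(alphabet)}
    rows = []
    for char in sequence:
        row = [0] * len(alphabet)
        row[index[char]] = 1  # KeyError if char is outside the alphabet
        rows.append(row)
    return rows


sequences = ["ACG", "TTGCA"]
max_length = max(len(s) for s in sequences)
padded = [pad_sequence(s, max_length) for s in sequences]
encoded = [one_hot_encode(s) for s in padded]
```

Because the alphabet is fixed up front, every encoded sequence has the same number of columns in the same order. If `integer_encoder` is re-fitted on each sequence, as the `fit_transform` call suggests, the column layout may instead depend on which characters happen to occur in that sequence.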