Protein classification by MOTIFIND
Neural Network
unknown protein ---(n-gram encoding*)---> N-gram vector
N-gram vector ---(term weights from training sequences)---> motif input vector ---> NN
N-gram vector ---> global input vector ---> NN
NN: feed-forward, back-propagation network (input, hidden, output layer)
Result: family yes or no
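The network step above can be sketched as a plain forward pass. This is only an illustration under stated assumptions: the helper names (init_layer, forward, predict_family), the sigmoid activation, the 0.5 threshold and the concatenation of the motif and global vectors into one input are all hypothetical choices, not details given in these notes. Training by back-propagation is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights; each row holds n_in weights plus a trailing bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
        for row in weights
    ]


def forward(network, inputs):
    """Feed the input through the hidden layer and then the output layer."""
    activations = inputs
    for weights in network:
        activations = layer_forward(weights, activations)
    return activations


def predict_family(network, motif_vector, global_vector, threshold=0.5):
    """Assumption: both input vectors are concatenated into one input layer."""
    score = forward(network, motif_vector + global_vector)[0]
    return score >= threshold, score


rng = random.Random(0)
# Toy sizes: 3 motif + 3 global inputs, 4 hidden units, 1 output unit.
network = [init_layer(6, 4, rng), init_layer(4, 1, rng)]
is_member, score = predict_family(network, [0.1, 0.2, 0.3], [0.4, 0.5, 0.6])
print("family yes" if is_member else "family no", round(score, 3))
```

With untrained random weights the answer is meaningless; the point is only the data flow from the two input vectors to a single yes/no output.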
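* N-gram encoding:
20^2 [+ 20^3] + 6^4 n-gram counts, i.e. 400 [+ 8000] + 1296 entries per vector (the bracketed term is optional), then SVD reduces each vector to 100 entries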
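A minimal sketch of the counting part of this encoding, reading 20^2 as amino-acid bigrams and 6^4 as 4-grams over a six-letter reduced alphabet. The six-group mapping below is an assumption (the notes do not say which grouping was used), and the function names are hypothetical. The optional 20^3 trigram block and the SVD reduction to 100 entries are left out.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues

# Assumed six-group reduced alphabet; the notes do not name the grouping.
EXCHANGE_GROUPS = {
    "a": "HRK",
    "b": "DENQ",
    "c": "C",
    "d": "STPAG",
    "e": "MILV",
    "f": "FYW",
}
TO_GROUP = {aa: g for g, members in EXCHANGE_GROUPS.items() for aa in members}


def ngram_index(alphabet, n):
    """Map every possible n-gram over `alphabet` to a fixed vector position."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def count_ngrams(sequence, index, n):
    """Count overlapping n-grams; windows with unknown symbols are skipped."""
    counts = [0] * len(index)
    for i in range(len(sequence) - n + 1):
        pos = index.get(sequence[i:i + n])
        if pos is not None:
            counts[pos] += 1
    return counts


BIGRAMS = ngram_index(AMINO_ACIDS, 2)                    # 20^2 = 400 positions
GROUP_4GRAMS = ngram_index(sorted(EXCHANGE_GROUPS), 4)   # 6^4 = 1296 positions


def encode(sequence):
    """Concatenate bigram counts and reduced-alphabet 4-gram counts."""
    sequence = sequence.upper()
    reduced = "".join(TO_GROUP.get(aa, "?") for aa in sequence)
    return count_ngrams(sequence, BIGRAMS, 2) + count_ngrams(reduced, GROUP_4GRAMS, 4)


vector = encode("MKVLAAGIVG")
print(len(vector))  # 400 + 1296 = 1696
```

Every sequence maps to a vector of the same length regardless of its own length, which is what lets a fixed-size network input be built from it.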