Protein classification by MOTIFIND

Neural Network

unknown protein ---(n-gram encoding*)---> N-gram vector ---(training sequences: term weight)---> input vector (motif)  ---> NN
                                                         ---> input vector (global) ----------------------------------> NN

NN: input, hidden, and output layers; feed-forward, back-propagation network

 

Result: member of the family, yes or no
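The network stage above can be sketched as a small feed-forward net with one hidden layer, trained by back-propagation to give a yes/no family decision. This is only an illustration under assumptions: the class name TinyNet, the layer sizes, the learning rate, the toy vectors, and the choice to concatenate the motif and global input vectors are all made up here and are not taken from the notes.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """Hypothetical one-hidden-layer feed-forward net with a single output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]  # input plus bias
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]        # hidden activations plus bias
        y = sigmoid(sum(w * v for w, v in zip(self.w2, hb)))
        return xb, hb, y

    def predict(self, x):
        return self.forward(x)[2]

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update for squared error on a single example."""
        xb, hb, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)
        # Hidden deltas are computed from the output weights before they change.
        delta_hidden = [delta_out * self.w2[j] * hb[j] * (1.0 - hb[j])
                        for j in range(len(self.w1))]
        for j in range(len(self.w2)):
            self.w2[j] -= lr * delta_out * hb[j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                row[i] -= lr * delta_hidden[j] * xb[i]


# Toy data (invented): a motif vector and a global vector per sequence.
motif_vecs = [[1.0, 0.9], [0.8, 1.0], [0.1, 0.0], [0.0, 0.2]]
global_vecs = [[0.2, 0.1], [0.3, 0.2], [0.2, 0.3], [0.1, 0.1]]
labels = [1, 1, 0, 0]  # 1 = family member, 0 = not a member

net = TinyNet(n_in=4, n_hidden=3)
for _ in range(2000):
    for m, g, t in zip(motif_vecs, global_vecs, labels):
        net.train_step(m + g, t)  # assumption: both vectors are concatenated

predictions = [net.predict(m + g) for m, g in zip(motif_vecs, global_vecs)]
is_member = [p > 0.5 for p in predictions]
```

Thresholding the single output at 0.5 is one simple way to turn the network score into the yes/no answer; the cut-off is a free choice in this sketch.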

* N-gram encoding:

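20^2 [+ 20^3] + 6^4 = 400 [+ 8000] + 1296 n-gram counts (2-grams and, optionally, 3-grams over the 20 amino acids, plus 4-grams over a 6-letter alphabet), then SVD to 100 entries/vector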
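As a rough illustration of the counting step, the sketch below builds the 20^2 [+ 20^3] + 6^4 count vector for one sequence. The notes give only the alphabet sizes, so the 6-group partition of the amino acids used here is an assumption, and the function names are hypothetical. The SVD reduction to 100 entries is not shown; it would be computed over a matrix of such vectors from the training sequences, for example with a library routine such as numpy.linalg.svd.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20-letter alphabet

# Assumed 6-letter reduced alphabet: the notes only state that it has 6 letters.
GROUPS = {
    "a": "HRK",
    "b": "DENQ",
    "c": "C",
    "d": "STPAG",
    "e": "MILV",
    "f": "FYW",
}
TO_GROUP = {aa: g for g, members in GROUPS.items() for aa in members}


def ngram_index(alphabet, n):
    """Map every possible n-gram over the alphabet to a vector position."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def count_ngrams(seq, index, n):
    """Count overlapping n-grams; n-grams with unknown letters are skipped."""
    vec = [0] * len(index)
    for i in range(len(seq) - n + 1):
        pos = index.get(seq[i:i + n])
        if pos is not None:
            vec[pos] += 1
    return vec


def encode(seq, use_trigrams=False):
    """Concatenate 2-gram, optional 3-gram, and reduced-alphabet 4-gram counts."""
    seq = seq.upper()
    reduced = "".join(TO_GROUP.get(c, "?") for c in seq)
    parts = [count_ngrams(seq, ngram_index(AMINO_ACIDS, 2), 2)]
    if use_trigrams:
        parts.append(count_ngrams(seq, ngram_index(AMINO_ACIDS, 3), 3))
    parts.append(count_ngrams(reduced, ngram_index("abcdef", 4), 4))
    return [x for part in parts for x in part]


vec_short = encode("ACDEFGHIK")
vec_long = encode("ACDEFGHIK", use_trigrams=True)
```

Without 3-grams the vector has 400 + 1296 = 1696 entries; with them it has 9696, which is why a reduction step such as SVD is useful before the vector is fed to the network.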