- reference
Adapting sequence-to-sequence models to a particular type of problem.
Ensembling
$P(E \mid F) = \lambda P_1(E \mid F) + (1 - \lambda) P_2(E \mid F)$
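A minimal sketch of the linear interpolation above, assuming each model exposes its prediction as a dictionary from word to probability. The function name `interpolate` and the dictionary representation are illustrative assumptions, and applying the mixture to one next-word distribution at a time is also an assumption about how the formula is used during decoding.

```python
def interpolate(p1, p2, lam):
    """Mix two probability distributions: lam * p1 + (1 - lam) * p2.

    p1, p2: dicts mapping word -> probability (hypothetical representation).
    lam: interpolation weight in [0, 1], the lambda of the formula.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # A word missing from one model is treated as having probability 0 there.
    vocab = set(p1) | set(p2)
    return {w: lam * p1.get(w, 0.0) + (1.0 - lam) * p2.get(w, 0.0) for w in vocab}


if __name__ == "__main__":
    model_1 = {"a": 0.7, "b": 0.3}
    model_2 = {"a": 0.2, "b": 0.5, "c": 0.3}
    print(interpolate(model_1, model_2, lam=0.6))
```

Because both inputs sum to 1 and the weights sum to 1, the mixture is again a valid distribution. The weight `lam` would typically be tuned on held-out data.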
Multi-task Learning
- Multi-task Loss Functions
$\ell(C_1, C_2) = \lambda_1 \ell_1(C_1) + \lambda_2 \ell_2(C_2)$
- Task Labels
- adding domain-specific features
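The two ideas in this section can be sketched as follows. `multitask_loss` is the weighted sum of per-task losses computed on corpora $C_1$ and $C_2$, and `add_task_label` shows one common way to supply a task or domain label: prepending a tag token to the input. Both function names and the `<2...>` tag format are hypothetical choices for illustration, not something the notes specify.

```python
def multitask_loss(loss_1, loss_2, lam_1=1.0, lam_2=1.0):
    """Weighted sum of two task losses: lam_1 * l1(C1) + lam_2 * l2(C2).

    loss_1, loss_2: scalar losses already computed on each task's data.
    lam_1, lam_2: task weights, the lambdas of the formula.
    """
    return lam_1 * loss_1 + lam_2 * loss_2


def add_task_label(tokens, label):
    """Prepend a tag token naming the task or domain (assumed tag format)."""
    return ["<2" + label + ">"] + list(tokens)


if __name__ == "__main__":
    print(multitask_loss(2.0, 4.0, lam_1=0.5, lam_2=0.25))
    print(add_task_label(["hello", "world"], "medical"))
```

With the weighted loss, a single set of shared parameters is trained on both tasks, and the weights control how much each task contributes to the gradient. With task labels, one model can be trained on mixed data while still being told which task or domain each example belongs to.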
Transfer Learning
- Continued Training
- Data Selection
- log-likelihood differential
$\mathrm{diff}(E) = \log P_{\mathrm{in}}(E) - \log P_{\mathrm{gen}}(E)$ (in: in-domain data, gen: general-domain data)
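A minimal sketch of data selection with the log-likelihood differential. To stay self-contained it stands in add-one-smoothed unigram language models for $P_{\mathrm{in}}$ and $P_{\mathrm{gen}}$; that model choice, the whitespace tokenization, and all function names are assumptions made for illustration. Any language model that assigns a log-probability to a sentence could be substituted.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count words over whitespace-tokenized sentences."""
    counts = Counter(w for s in sentences for w in s.split())
    return counts, sum(counts.values())


def log_prob(sentence, model, vocab_size):
    """Sentence log-probability under an add-one-smoothed unigram model."""
    counts, total = model
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size)) for w in sentence.split()
    )


def select_by_differential(candidates, in_domain, general, k):
    """Return the k candidates with the highest log P_in(E) - log P_gen(E)."""
    in_model = train_unigram(in_domain)
    gen_model = train_unigram(general)
    # Shared vocabulary so both models smooth over the same set of words.
    vocab = set(in_model[0]) | set(gen_model[0])
    vocab |= {w for s in candidates for w in s.split()}
    size = len(vocab)

    def diff(sentence):
        return log_prob(sentence, in_model, size) - log_prob(sentence, gen_model, size)

    return sorted(candidates, key=diff, reverse=True)[:k]


if __name__ == "__main__":
    in_domain = ["the patient received a dose", "the dose was reduced"]
    general = ["the market closed higher", "the team won the game"]
    pool = ["the patient dose was high", "the team closed the game"]
    print(select_by_differential(pool, in_domain, general, k=1))
```

Sentences with a high differential look more like the in-domain data than the general-domain data, so they are the ones kept for training.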