CMU 11-731 (MT & Seq2Seq) Advanced Topics: Adaptation Methods

Methods for adapting sequence-to-sequence models to a particular type of problem:

  • Ensembling

    $$ P(E \mid F) = \lambda P_1(E \mid F) + (1-\lambda) P_2(E \mid F) $$

  • Multi-task Learning

    • Multi-task Loss Functions
      $$ l(C_1, C_2) = \lambda_1 l_1(C_1) + \lambda_2 l_2(C_2) $$
    • Task Labels
      • adding domain-specific features
  • Transfer Learning

    • Continued Training
    • Data Selection
      • log-likelihood differential
        $$ \mathrm{diff}(E) = \log P_{in}(E) - \log P_{gen}(E) $$ (in: in-domain data, gen: general-domain data)
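
The three formulas in the outline above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: all function names are hypothetical, and the add-one smoothed unigram language model is an assumption standing in for whatever in-domain and general-domain models would actually be used. Keeping the `k` highest-scoring sentences is likewise one assumed way of applying the differential.

```python
import math


def ensemble_prob(p1, p2, lam):
    """Linear interpolation: lam * P_1(E|F) + (1 - lam) * P_2(E|F)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p1 + (1.0 - lam) * p2


def multitask_loss(loss1, loss2, lam1, lam2):
    """Weighted sum of two task losses: lam1 * l_1 + lam2 * l_2."""
    return lam1 * loss1 + lam2 * loss2


def train_unigram(corpus):
    """Count whitespace-separated tokens (assumed toy language model)."""
    counts = {}
    for sentence in corpus:
        for word in sentence.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def log_prob(sentence, counts, vocab_size):
    """Add-one smoothed unigram log-probability of a sentence."""
    total = sum(counts.values())
    return sum(
        math.log((counts.get(word, 0) + 1) / (total + vocab_size))
        for word in sentence.split()
    )


def select_by_differential(candidates, in_corpus, gen_corpus, k):
    """Keep the k candidates with the highest log P_in(E) - log P_gen(E)."""
    in_counts = train_unigram(in_corpus)
    gen_counts = train_unigram(gen_corpus)
    # Shared vocabulary, plus one slot for unseen words.
    vocab_size = len(set(in_counts) | set(gen_counts)) + 1

    def diff(sentence):
        return (log_prob(sentence, in_counts, vocab_size)
                - log_prob(sentence, gen_counts, vocab_size))

    return sorted(candidates, key=diff, reverse=True)[:k]


if __name__ == "__main__":
    in_domain = ["the patient received a dose", "the dose was reduced"]
    general = ["the cat sat on the mat", "stocks fell on monday"]
    pool = ["the patient dose was high", "the cat fell on monday"]
    print(select_by_differential(pool, in_domain, general, k=1))
```

A sentence with a large positive differential is more probable under the in-domain model than under the general-domain model, so it is a natural candidate to add to the adaptation data.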