CMU 11-731 (MT & Seq2Seq) Applications 1: Monolingual Sequence-to-Sequence Problems

Paraphrase Generation

  • re-wording sentences into other sentences with the same content but different surface features

  • e.g.

    • query expansion for information retrieval
    • improve robustness of machine translation to lexical variations
  • difficulty

    • task definition
      • strictly, bidirectional entailment; in practice, paraphrases are only mostly bidirectional entailment
    • paucity of data
      • dataset
        • Quora question pair dataset
        • MSCOCO captions dataset
      • methods
        • distributional similarity (words that appear in similar contexts tend to be similar)
          • key idea: characterize a word by the empty “slots” (contexts) it can fill
          • major problem
            • hard to distinguish between distributionally similar but semantically different words
            • extremely sensitive to data sparsity
              • solution: use bilingual data to learn monolingual paraphrases
        • bilingual pivoting
    • evaluate the generated paraphrases
      • PINC (complements BLEU: measures n-gram dissimilarity of the paraphrase from the original input, so higher means more novel wording; typically reported alongside a BLEU-style adequacy score)
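The bilingual-pivoting idea above can be sketched directly: a phrase e2 is scored as a paraphrase of e1 by marginalizing over the foreign pivot phrases f that both translate through, p(e2 | e1) = Σ_f p(e2 | f) · p(f | e1). A minimal sketch, assuming hand-written toy phrase tables (the phrases and probabilities are illustrative, not learned from data):

```python
from collections import defaultdict

# p(f | e): English phrase -> {foreign phrase: probability} (toy numbers)
p_f_given_e = {
    "thrown into jail": {"festgenommen": 0.6, "inhaftiert": 0.4},
}
# p(e | f): foreign phrase -> {English phrase: probability} (toy numbers)
p_e_given_f = {
    "festgenommen": {"arrested": 0.7, "thrown into jail": 0.3},
    "inhaftiert": {"imprisoned": 0.8, "thrown into jail": 0.2},
}

def paraphrase_probs(e1):
    """Score candidate paraphrases of e1 by pivoting through foreign phrases."""
    scores = defaultdict(float)
    for f, p_f in p_f_given_e.get(e1, {}).items():
        for e2, p_e in p_e_given_f.get(f, {}).items():
            if e2 != e1:  # a phrase is not its own paraphrase
                scores[e2] += p_e * p_f
    return dict(scores)

print(paraphrase_probs("thrown into jail"))
# arrested ≈ 0.42, imprisoned ≈ 0.32
```

A real phrase table (e.g. from a phrase-based MT system) contains millions of entries, and the pivoting sum typically runs over many foreign phrases per input.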
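PINC itself can be computed from n-gram overlap alone: for each n it takes the fraction of candidate n-grams *not* found in the source, then averages over n. A minimal sketch, assuming whitespace tokenization and n-grams up to 4:

```python
def ngrams(tokens, n):
    """Set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def pinc(source, candidate, max_n=4):
    """PINC: mean fraction of candidate n-grams NOT present in the source."""
    src, cand = source.split(), candidate.split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        if not cand_ngrams:  # candidate shorter than n
            continue
        overlap = len(cand_ngrams & ngrams(src, n))
        scores.append(1 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0
```

Copying the input verbatim scores 0, a completely reworded output scores 1; hence PINC is reported together with an adequacy metric, since dissimilarity alone does not guarantee meaning preservation.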

Style Transformation

the same semantic content, but with a different style or register

  • Text Simplification (for second language reading comprehension)
  • Register Conversion (“Register” is the type of language used in a particular setting)
  • Personal Style Conversion
  • Demographics-level Conversion

  • method

    • simplest method: collect a large parallel corpus and train a supervised model
    • tailor phrase-based translation models to the task of style transformation

Summarization

most salient information

  • Sentence Compression
  • Single-document Summarization
  • Multi-document Summarization

extractive summarization vs. abstractive summarization

  • removing irrelevant content
    • deleting words:
      • tree-based methods
      • formulate as a constrained optimization problem: delete words to reach a fixed-length summary while maximizing the amount of relevant content retained
    • sequence-to-sequence transduction problem
      • tree substitution grammars + copy words + control the length of the summary
      • attentional neural networks + copy words + control the length of the summary
  • evaluation
    • the amount of recall of important information that can be achieved within the limited summary length (e.g. ROUGE)
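The constrained-optimization view of extraction can be sketched with a greedy stand-in: repeatedly add the unit (here, a sentence) with the best marginal relevance per word until the length budget is exhausted. The relevance score (document word frequency) and the greedy heuristic are simplifying assumptions; published systems often solve the selection exactly as an integer linear program:

```python
from collections import Counter

def greedy_summary(sentences, budget):
    """Greedily pick sentences maximizing new relevant words per word, under a word budget."""
    # Toy relevance: a word is as relevant as its frequency in the document.
    doc_freq = Counter(w for s in sentences for w in s.split())
    chosen, covered, length = [], set(), 0
    remaining = list(sentences)
    while remaining:
        def gain(s):
            new = set(s.split()) - covered  # only not-yet-covered words count
            return sum(doc_freq[w] for w in new) / max(len(s.split()), 1)
        best = max(remaining, key=gain)
        cost = len(best.split())
        if length + cost <= budget and gain(best) > 0:
            chosen.append(best)
            covered |= set(best.split())
            length += cost
        remaining.remove(best)  # either selected or too long/redundant; drop it
    return chosen
```

The same budgeted-coverage structure applies at the word level for sentence compression: delete words subject to a length constraint while keeping the high-relevance ones.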
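ROUGE-1 recall, the simplest member of the ROUGE family, is clipped unigram recall against a reference summary; a minimal sketch, assuming whitespace tokenization:

```python
from collections import Counter

def rouge1_recall(reference, summary):
    """ROUGE-1 recall: fraction of reference unigrams recovered by the summary (counts clipped)."""
    ref, hyp = Counter(reference.split()), Counter(summary.split())
    overlap = sum(min(count, hyp[w]) for w, count in ref.items())
    return overlap / sum(ref.values())
```

Because ROUGE is recall-oriented, it is only meaningful under the fixed summary-length constraint mentioned above; without a length limit, copying the whole document would score perfectly.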