Tuesday, November 13, 2018. 12:00PM. GHC 6115.
Kirstin Early -- Towards a general-purpose text representation for mail
Abstract: Machine learning on email offers many ways to help users save time and accomplish their goals -- e.g., spam classification, reply prediction, and message categorization. However, building a separate model for each task is inefficient because each model computes its own internal representation of messages. A general-purpose representation could eliminate this redundant computation and serve many downstream tasks. We train encoder-decoder neural networks on self-supervised mail tasks and use the encoder output of these networks as the representation of new mail messages. Simple models for downstream tasks can then be trained on these representations. We illustrate this method on a pilot task of RSVP classification and find that the general-purpose representation performs similarly to a model built specifically for this task.
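
Illustrative sketch (not from the talk): the pipeline described in the abstract can be pictured roughly as below. The abstract does not specify the architecture, the self-supervised tasks, or the features, so everything here is an assumption made for illustration: a linear encoder-decoder trained to reconstruct bag-of-words vectors stands in for the self-supervised mail tasks, a logistic regression stands in for the "simple model" for RSVP classification, and all function names and toy messages are hypothetical.

```python
import numpy as np

def bag_of_words(messages, vocab):
    """Binary word-presence vectors (an assumed, very crude input featurization)."""
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(messages), len(vocab)))
    for row, msg in enumerate(messages):
        for word in msg.lower().split():
            if word in index:
                X[row, index[word]] = 1.0
    return X

def reconstruction_loss(X, W_enc, W_dec):
    """Mean squared error between the input and its decoded reconstruction."""
    return float(np.mean((X @ W_enc @ W_dec - X) ** 2))

def train_autoencoder(X, dim, epochs=500, lr=0.05, seed=0):
    """Self-supervised stand-in task: reconstruct X through a dim-sized bottleneck."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, dim))
    W_dec = rng.normal(scale=0.1, size=(dim, d))
    for _ in range(epochs):
        Z = X @ W_enc                      # encoder output
        err = Z @ W_dec - X                # reconstruction error
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ (err @ W_dec.T) / n
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec

def encode(X, W_enc):
    """The general-purpose representation is the encoder output."""
    return X @ W_enc

def train_logistic(Z, y, epochs=500, lr=0.5):
    """Simple downstream model trained on the fixed representations."""
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * Z.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def predict(Z, w, b):
    return (Z @ w + b > 0).astype(int)

# Hypothetical toy mail; label 1 marks an RSVP-style message.
messages = [
    "yes i will attend the party",
    "sorry i cannot attend the party",
    "count me in for dinner",
    "quarterly report attached for review",
    "your invoice is attached",
    "meeting notes attached for review",
]
labels = np.array([1, 1, 1, 0, 0, 0], dtype=float)
vocab = sorted({w for m in messages for w in m.split()})

X = bag_of_words(messages, vocab)
W_enc, W_dec = train_autoencoder(X, dim=3)   # trained without any labels
Z = encode(X, W_enc)                         # reusable across downstream tasks
w, b = train_logistic(Z, labels)             # only this step sees RSVP labels
print(predict(Z, w, b))
```

The point of the design is that the encoder is trained once without task labels, and each downstream task (RSVP classification here) only fits a small model on top of the same encoded vectors.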