11-697: Introduction to Question Answering with Large Language Models
Instructors: Eric Nyberg and Teruko Mitamura
Course Description: This course is designed to be
accessible to Master's and advanced undergraduate students who seek
the basic skills necessary to implement practical Question
Answering (QA) applications using Large Language Models (LLMs) in
specific information domains. The syllabus includes learning
materials on the core concepts of QA and LLMs, and on how they are
applied in closed commercial systems (e.g., ChatGPT) as well as
open systems (e.g., Llama, T5). Students complete a set of hands-on
exercises in Python that develop skills in applying LLMs to
various open-source QA datasets. The course is also a
prerequisite for 11-797 Question Answering (an advanced
project-oriented course).
Prerequisite Knowledge: A course in Statistics and Probability and at least
intermediate Python programming skills
Course Goals: Students acquire basic knowledge of QA approaches and tasks,
including Data Analysis, Solution Design, Metrics, Evaluation and Error Analysis.
Grading:
- Quizzes = 40% (8 x 5%)
- Homework Tasks = 50% (5 x 10%)
- Attendance/Class Participation = 10%
Outline of Learning Materials:
- Foundations (Course Prerequisites, Definitions, Concepts, etc.)
- A First Example: LLMs for QA (e.g., ChatGPT)
- What are LLMs? How can LLMs be incorporated into QA systems?
- What happens when ChatGPT is evaluated as a QA system?
- History: a survey of tasks, domains, methods
- Tasks: factoid, list factoid, summary, yes/no, etc.
- Domains: media collections, Q/A datasets, languages & distributions
- Classic Methods and Pipelines: retrieval-based, NLP-based; multi-strategy architecture
- Modern Methods: neural models, LLMs; neural architectures
- Elements of the QA Task
- Data Design (Structure, content, distribution, labeling of questions, answer
contexts, answer candidates)
- Module Design (abstraction, implementation, parameterization)
- Pipeline Design (abstraction, implementation, parameterization)
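To give a flavor of the module- and pipeline-design topics above, here is a minimal sketch of a QA pipeline built from swappable modules. All names and the toy retriever/reader are purely illustrative (hypothetical), not a required course API:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Question:
    text: str


@dataclass
class Candidate:
    answer: str
    score: float


# A "module" is any callable mapping one intermediate representation to the next.
Retriever = Callable[[Question], List[str]]                 # question -> contexts
Reader = Callable[[Question, List[str]], List[Candidate]]   # contexts -> candidates


def make_pipeline(retrieve: Retriever, read: Reader):
    """Compose modules into a pipeline; swapping a module reconfigures the system."""
    def pipeline(question: Question) -> Candidate:
        contexts = retrieve(question)
        candidates = read(question, contexts)
        # Select the highest-scoring answer candidate.
        return max(candidates, key=lambda c: c.score)
    return pipeline


# Toy stand-in modules (hypothetical):
def keyword_retriever(q: Question) -> List[str]:
    corpus = ["Pittsburgh is home to CMU.", "Paris is the capital of France."]
    # Return documents that share any word with the question.
    return [doc for doc in corpus
            if any(w in doc.lower() for w in q.text.lower().split())]


def whole_context_reader(q: Question, contexts: List[str]) -> List[Candidate]:
    # Trivially treat each retrieved context as an answer candidate.
    return [Candidate(answer=c, score=float(len(c))) for c in contexts] or [Candidate("", 0.0)]


qa = make_pipeline(keyword_retriever, whole_context_reader)
```

The point of the abstraction is that a neural retriever or an LLM-based reader can replace the toy modules without changing the pipeline code.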
- Task Curation & Evaluation
- Curation (Dataset design, sourcing, preliminary analysis, bias)
- Evaluation (Metrics & significance, overlap / error analysis & prioritization)
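As an illustration of the evaluation topics above, here is a minimal sketch of two metrics widely used for extractive QA: exact match and token-level F1, in the style of the SQuAD evaluation script. The normalization rules (lowercasing, stripping punctuation and articles) follow that common convention and are not prescribed by the syllabus:

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace
    (the normalization commonly used in SQuAD-style QA evaluation)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between predicted and gold answer strings."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower", "eiffel tower")` is 1 after normalization, while a partially overlapping prediction earns partial F1 credit rather than zero.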
- Neural Nets for QA
- Neural Networks and Neural Language Models
- RNNs and LSTMs
- Transformers and Pretrained Language Models
- Fine-tuning and Masked Language Models
- Multi-Hop QA
- Conversational QA
- Multimodal QA
- Generative LLMs
- Wrap-Up
- Discuss learning objectives / outcomes, material covered, feedback
- Discuss open challenges, possible project topics for 11-797