11-697: Introduction to Question Answering with Large Language Models
Instructors: Eric Nyberg and Teruko Mitamura
Course Description: This course is designed to be
accessible to Master's and advanced undergraduate students who seek
the basic skills necessary to implement practical Question
Answering (QA) applications using Large Language Models (LLMs) in
specific information domains. The syllabus includes learning
materials on the core concepts of QA and LLMs, and on how they are
applied in closed commercial systems (e.g., ChatGPT) as well as
open systems (e.g., Llama, T5). Students complete a set of hands-on
exercises in Python that develop skills in applying LLMs to
various open-source QA datasets. The course is also a
prerequisite for 11-797 Question Answering (an advanced
project-oriented course).
Prerequisite Knowledge: A course in Statistics and Probability and at least
intermediate Python programming skills
Course Goals: Students acquire basic knowledge of QA approaches and tasks,
including Data Analysis, Solution Design, Metrics, Evaluation and Error Analysis.
Grading:
- Quizzes = 40% (8 x 5%)
- Homework Tasks = 50% (5 x 10%)
- Attendance/Class Participation = 10%
Outline of Learning Materials:
- Foundations (Course Prerequisites, Definitions, Concepts, etc.)
- A First Example: LLMs for QA (e.g., ChatGPT)
- What are LLMs? How can LLMs be incorporated into QA systems?
- What happens when ChatGPT is evaluated as a QA system?
- History: a survey of tasks, domains, methods
- Tasks: factoid, list factoid, summary, yes/no, etc.
- Domains: media collections, Q/A datasets, languages & distributions
- Classic Methods and Pipelines: retrieval-based, NLP-based; multi-strategy architecture
- Modern Methods: neural models, LLMs; neural architectures
- Elements of the QA Task
- Data Design (Structure, content, distribution, labeling of questions, answer
contexts, answer candidates)
- Module Design (abstraction, implementation, parameterization)
- Pipeline Design (abstraction, implementation, parameterization)
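To give a flavor of the module- and pipeline-design topics above, here is a minimal sketch of a QA pipeline built from swappable modules. All names and the toy retriever/reader are purely illustrative (hypothetical), not a required course API:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Question:
    text: str


@dataclass
class Candidate:
    answer: str
    score: float


# A "module" is any callable mapping one intermediate representation to the next.
Retriever = Callable[[Question], List[str]]                 # question -> contexts
Reader = Callable[[Question, List[str]], List[Candidate]]   # contexts -> candidates


def make_pipeline(retrieve: Retriever, read: Reader):
    """Compose modules into a pipeline; swapping a module reconfigures the system."""
    def pipeline(question: Question) -> Candidate:
        contexts = retrieve(question)
        candidates = read(question, contexts)
        # Select the highest-scoring answer candidate.
        return max(candidates, key=lambda c: c.score)
    return pipeline


# Toy stand-in modules (hypothetical):
def keyword_retriever(q: Question) -> List[str]:
    corpus = ["Pittsburgh is home to CMU.", "Paris is the capital of France."]
    # Return documents that share any word with the question.
    return [doc for doc in corpus
            if any(w in doc.lower() for w in q.text.lower().split())]


def whole_context_reader(q: Question, contexts: List[str]) -> List[Candidate]:
    # Trivially treat each retrieved context as an answer candidate.
    return [Candidate(answer=c, score=float(len(c))) for c in contexts] or [Candidate("", 0.0)]


qa = make_pipeline(keyword_retriever, whole_context_reader)
```

The point of the abstraction is that a neural retriever or an LLM-based reader can replace the toy modules without changing the pipeline code.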
- Task Curation & Evaluation
- Curation (Dataset design, sourcing, preliminary analysis, bias)
- Evaluation (Metrics & significance, overlap / error analysis & prioritization)
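As an illustration of the evaluation topics above, here is a minimal sketch of two metrics widely used for extractive QA: exact match and token-level F1, in the style of the SQuAD evaluation script. The normalization rules (lowercasing, stripping punctuation and articles) follow that common convention and are not prescribed by the syllabus:

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace
    (the normalization commonly used in SQuAD-style QA evaluation)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between predicted and gold answer strings."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower", "eiffel tower")` is 1 after normalization, while a partially overlapping prediction earns partial F1 credit rather than zero.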
- Neural Nets for QA
- Neural Networks and Neural Language Models
- RNNs and LSTMs
- Transformers and Pretrained Language Models
- Fine-tuning and Masked Language Models
- Multi-Hop QA
- Conversational QA
- Multimodal QA
- Generative LLMs
- Wrap-Up
- Discuss learning objectives / outcomes, material covered, feedback
- Discuss open challenges, possible project topics for 11-797