A course about the design and evaluation of prompts for language models
Associate Professor
Software and Societal Systems Department
https://www.cs.cmu.edu/~breaux
Tech's hottest new job: AI whisperer. No coding required. Washington Post, Feb 25, 2023.
By Drew Harwell
"'Prompt engineers' are being hired for their skill in getting AI systems to produce exactly what they want. And they make pretty good money." (Read Article)
Microsoft releases Phi-2, "a 2.7 billion-parameter language model that demonstrates outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with less than 13 billion parameters. On complex benchmarks Phi-2 matches or outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation." (Read Article)
Language Models are Few-Shot Learners. NeurIPS 2020
By Brown et al., OpenAI
"... For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic." (Read Article)
Calibrate Before Use: Improving Few-Shot Performance of Language Models. Proceedings of Machine Learning Research, 2021
By Zhao et al., Berkeley, UMD and UCI
"Prompt[s] that contain a few training examples... can be unstable: the choice of prompt format, training examples, and even the order of the training examples can cause accuracy to vary from near chance to near state-of-the-art." (Read Article)
Copyright © 2022–present, Travis D. Breaux