15-494/694 Cognitive Robotics Lab 9: Large Language Models
I. Software Update and Initial Setup
At the beginning of every lab you should update your copy of the
cozmo-tools package. Do this:
$ cd ~/cozmo-tools
$ git pull
II. Play Semantris
Play the Semantris game in "Blocks" mode. Note: it's more fun with
sound enabled.
How does Semantris know which words are related? It uses word
embeddings and computes dot products to measure similarity.
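To make that concrete, here is a minimal sketch of dot-product
similarity over word vectors. The toy embeddings dictionary and its
3-dimensional vectors are invented for illustration; real models use
hundreds of dimensions, and Semantris's actual embeddings are not
public.

import numpy as np

# Toy 3-D vectors, invented for illustration; real embeddings
# have hundreds of dimensions.
embeddings = {
    "water":  np.array([0.8, 0.1, 0.3]),
    "ocean":  np.array([0.7, 0.2, 0.4]),
    "guitar": np.array([0.1, 0.9, 0.2]),
}

def similarity(w1, w2):
    # Cosine similarity: dot product of the normalized vectors.
    v1, v2 = embeddings[w1], embeddings[w2]
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

print(similarity("water", "ocean"))   # high: related words
print(similarity("water", "guitar"))  # low: unrelated words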
Start a new game and take a screenshot of the initial state. You will
use this screenshot in the next step.
III. Experiment With Word Embeddings
- Run the WordEmbeddingDemo.
- Try hovering over a word in the 3D plot to see the closest words.
- You can add new words to the 3D plot by typing them in the text
box below. Try that.
- Press the "Clear all" button to erase all the words from the display.
- Examine these slides to see
how we can use the demo to explore the kind of matching that
Semantris does.
- Pick six words from your Semantris screenshot. Type in one word
at a time to add it to the display. After you add a word, its dot is
red. Click on one of the six slots on the right side of the screen to
load the word into that slot. Continue until all six words have been
loaded.
- Pick one of the six words as your target word. Think of a
one-word prompt you could use to reach that target. Add the prompt
word to the display by typing it in the text box.
- Click on the newly added prompt word to turn it from red to
black. Then click on it again to turn it back to red and display the
similarity measures to the six words in the slots. Did you hit your
target?
- Take a screenshot showing the similarity lines.
IV. Question Answering With BERT
The BERT model has been publicly released by Google, and is
distributed in a convenient form by Hugging Face. In this part of the
lab you will run BERT on your workstation (using the GPU) to perform
extractive question answering.
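As a preview of what Lab9a.py is doing, here is a minimal sketch of
extractive question answering with the Hugging Face transformers
library. The model checkpoint and the toy context string here are
illustrative; Lab9a.py may differ in its details.

from transformers import pipeline

# Any extractive-QA checkpoint works here; this one is a common default.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

context = ("cube1 is visible and is 200 mm away. "
           "cube2 is sideways. cube3 is not visible.")

# The model extracts a span of the context as its answer.
result = qa(question="Is cube1 visible?", context=context)
print(result["answer"], result["score"])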
- Make a lab9 directory.
- Download Lab9a.py.
- Read the source code.
- Run Lab9a.py and try the following queries. Which ones work,
and which ones don't? Make a record of the responses (you can
just paste them into a file).
- Is cube1 visible?
- Is cube2 visible?
- What is cube1's orientation?
- What is sideways?
- What cube is sideways?
- What cubes are visible?
- What isn't visible?
- Is cube1 delicious?
- How many cubes are there?
- What is cube1?
- What is cube2?
- What is the distance to cube3?
V. GPT-3
In this section you will experiment with GPT-3, which is much more
powerful than BERT. Instead of downloading the model, you will use
the OpenAI GPT-3 API.
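For reference, a completion request looks roughly like the sketch
below. This assumes the pre-1.0 openai Python package; the model
name and prompt are illustrative, and Lab9b.py/Lab9c.py are the
authoritative versions.

import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="text-davinci-003",   # illustrative model choice
    prompt="cube1 is visible and 200 mm away.\nQ: Is cube1 visible?\nA:",
    max_tokens=50,
    temperature=0,              # deterministic answers
)
print(response.choices[0].text.strip())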
- Download Lab9b.py and Lab9c.py.
- Read the source code.
- Enter the following shell command:
$ export OPENAI_API_KEY=key_you_received_in_email
- Run Lab9b.py and examine the result.
- Run Lab9c.py and try the same queries you used with BERT. Make a record of the results.
- Compare the answers you got from BERT with the answers you got from GPT-3. What do you conclude?
- Are GPT-3's distance calculations accurate?
VI. Homework (Solo): GPT-3.5 and Cozmo
Do this part by yourself, not in a team.
OpenAI has released GPT-3.5, which is set up to do chat completion
rather than simple completion. The main difference is that in chat
completion the input is structured as a sequence of messages instead
of one giant string. Read the
documentation here
for details. This doesn't make much difference for the simple
question answering task we've been exploring, but the quality of the
results may be better.
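For comparison with the completion call in Lab9c, a chat completion
request looks roughly like this sketch (again assuming the pre-1.0
openai package; the messages are illustrative):

import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Chat completion takes a list of role-tagged messages rather than
# one prompt string.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You answer questions about Cozmo's world state."},
        {"role": "user",
         "content": "cube1 is visible and 200 mm away. Is cube1 visible?"},
    ],
    temperature=0,
)
print(response.choices[0].message.content)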
- Write a program TestChat.py as a modified version of Lab9c that
uses GPT-3.5.
- Compare the quality of the answers from TestChat.py to what you
got from Lab9a and Lab9c.
- Read this page from OpenAI on prompt engineering.
- Write a program CozmoChat.fsm that accepts queries using the
"tm" command in simple_cli, and uses GPT-3.5 to answer the query.
To form the prompt for the query, your program should examine
Cozmo's world map to determine what it knows about cubes, walls,
doorways, and faces, and put these together into a string (a sketch
follows this list). Note that to get cube orientation you must use
wcube1 instead of cube1, since this is a feature of cozmo-tools,
not the SDK.
- Develop a set of questions you can ask Cozmo to demonstrate the strengths and
weaknesses of GPT-3.5. For example, since it knows the cube locations, can you ask
which cube is closest to cube1, or what is the distance between cube1 and cube2?
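Here is a hedged sketch of the prompt-assembly step. The attribute
names (robot.world_map.objects, obj.is_visible, obj.x, obj.y) are
placeholders, not a confirmed cozmo-tools interface; consult the
cozmo-tools source for the real world-map fields, and remember to
read orientation from wcube1.

def describe_world(robot):
    # Collect one sentence per world-map object; attribute names
    # here are hypothetical stand-ins for the real cozmo-tools API.
    lines = []
    for obj in robot.world_map.objects.values():
        vis = "visible" if obj.is_visible else "not visible"
        lines.append("%s is %s at (%.0f, %.0f) mm." %
                     (obj, vis, obj.x, obj.y))
    return " ".join(lines)

# The resulting string becomes context for the chat query:
#   messages = [{"role": "system", "content": describe_world(robot)},
#               {"role": "user",   "content": query}]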
Hand In
Hand in the following:
- Your Semantris and WordEmbeddingDemo screenshots.
- Your observations comparing results from Lab9a, Lab9c, and
TestChat.
- Your source code for TestChat.py and CozmoChat.fsm.
- Your own questions for and results from CozmoChat.fsm.