15-494/694 Cognitive Robotics Lab 9: Large Language Models
I. Software Update and Initial Setup
At the beginning of every lab you should update your copy of the
cozmo-tools package. Do this:
$ cd ~/cozmo-tools
$ git pull
II. Play Semantris
Play the Semantris game in "Blocks" mode. Note: it's more fun with sound enabled.
How does Semantris know which words are related? It uses word
embeddings and computes dot products to measure similarity.
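The matching idea can be sketched in a few lines. The 4-dimensional vectors below are made up for illustration; real systems like Semantris use learned embeddings with hundreds of dimensions.

```python
import numpy as np

# Toy embedding table: made-up 4-d vectors for illustration only.
# Real word embeddings are learned and have hundreds of dimensions.
embeddings = {
    "cat":   np.array([0.9, 0.8, 0.1, 0.0]),
    "dog":   np.array([0.8, 0.9, 0.2, 0.1]),
    "piano": np.array([0.1, 0.0, 0.9, 0.8]),
}

def similarity(w1, w2):
    """Cosine similarity: the dot product of the two embeddings,
    normalized by their lengths."""
    a, b = embeddings[w1], embeddings[w2]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity("cat", "dog"))    # high: related words
print(similarity("cat", "piano"))  # low: unrelated words
```

When you type a prompt word, the game computes this kind of similarity between the prompt and every word on the screen, and the best match wins.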
Start a new game and take a screenshot of the initial state. You will
use this screenshot in the next step.
III. Experiment With Word Embeddings
- Run the WordEmbeddingDemo.
- Try hovering over a word in the 3D plot to see the closest words.
- You can add new words to the 3D plot by typing them in the text
box below. Try that.
- Press the "Clear all" button to erase all the words from the display.
- Examine these slides to see
how we can use the demo to explore the kind of matching that
Semantris does.
- Pick six words from your Semantris screenshot. Type in one word
at a time to add it to the display. After adding the word, its dot is
red. Click on one of the six slots on the right side of the screen to
load the word into that slot. Continue until all six words have been
loaded.
- Pick one of the six words as your target word. Think of a
one-word prompt you could use to reach that target. Add the prompt word
to the display by typing it in the text box.
- Click on the newly added prompt word to turn it from red to
black. Then click on it again to turn it back to red and display the
similarity measures to the six words in the slots. Did you hit your
target?
- Take a screenshot showing the similarity lines.
IV. Question Answering With BERT
The BERT model has been publicly released by Google, and is
distributed in a convenient form by Hugging Face. In this part of the
lab you will run BERT on your workstation (using the GPU) to perform
extractive question answering.
- Make a lab9 directory.
- Download Lab9a.py.
- Read the source code.
- Run Lab9a.py. The first time you run it, it will have to
download a large weight file, so be patient. Try the following
queries. Which ones work, and which ones don't? Make a record of
the responses (you can just paste them into a file).
- Is cube1 visible?
- Is cube2 visible?
- What is cube1's orientation?
- What is sideways?
- What cube is sideways?
- What cubes are visible?
- What isn't visible?
- Is cube1 delicious?
- How many cubes are there?
- What is cube1?
- What is cube2?
- What is the distance to cube3?
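For reference, extractive question answering with Hugging Face follows the pattern sketched below. The context string and model name here are illustrative assumptions, not the actual contents of Lab9a.py, which you should consult directly.

```python
def make_context():
    """Build an illustrative world-state description for BERT to read.
    This wording is an assumption; Lab9a.py constructs its own context."""
    return ("Cube1 is visible and is 150 mm away. Cube1 is sideways. "
            "Cube2 is not visible. Cube3 is 230 mm away.")

if __name__ == "__main__":
    # Requires the transformers package; the first run downloads
    # a large weight file, so be patient.
    from transformers import pipeline
    qa = pipeline("question-answering",
                  model="distilbert-base-cased-distilled-squad")
    result = qa(question="Is cube1 visible?", context=make_context())
    print(result["answer"], result["score"])
```

Because the model can only extract a span of text from the context, yes/no questions often get odd answers; this is exactly the kind of weakness the queries above are meant to probe.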
V. GPT-4
In this section you will experiment with GPT-4, which is much more
powerful than BERT. Instead of downloading the model, you will use the
OpenAI GPT-4 API.
- Download Lab9b.py and Lab9c.py.
- Read the source code.
- Enter the following shell command:
export OPENAI_API_KEY=key_you_received_in_email
- Run Lab9b.py and examine the result.
- Run Lab9c.py and try the same queries you used with BERT. Make a record of the results.
- Compare the answers you got from BERT with the answers you got from GPT-4. What do you conclude?
- Are GPT-4's distance calculations accurate?
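When judging GPT-4's distance answers, it helps to have the ground truth. Assuming you can read cube coordinates from the world map (the coordinates below are made up), the Euclidean distance is easy to compute yourself:

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points in mm."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hypothetical cube positions in mm, standing in for world-map values.
cube1 = (100.0, 50.0)
cube3 = (220.0, -40.0)

print(round(distance(cube1, cube3), 1))  # -> 150.0
```

Comparing GPT-4's stated distances against a calculation like this will tell you whether it is actually doing the arithmetic or just producing plausible-sounding numbers.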
VI. Homework (Solo): GPT-4 and Cozmo
Do this part by yourself, not in a team.
Read this page from OpenAI on prompt engineering.
Write a program CozmoChat.fsm that accepts queries using the "tm"
command in simple_cli, and uses GPT-4 to answer the query. To form
the context for the query your program should examine Cozmo's world
map to determine what it knows about cubes, walls, doorways, and
faces, and put these together into a string. Note that to get cube
orientation you must use wcube1 instead of cube1, since this is a
feature of cozmo-tools, not the SDK.
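One way to assemble the context string is sketched below. The dictionary keys ("visible", "orientation") are placeholders, not real cozmo-tools attribute names; check the world map code for the actual API, and remember to read orientation through wcube1.

```python
def describe_world(objects):
    """Turn a list of world-map entries into a context string for GPT-4.
    Each entry here is a dict; the keys are illustrative placeholders,
    not the real cozmo-tools attribute names."""
    sentences = []
    for obj in objects:
        vis = "visible" if obj["visible"] else "not visible"
        sentences.append(f"{obj['name']} is {vis}.")
        if obj.get("orientation"):
            sentences.append(f"{obj['name']} is {obj['orientation']}.")
    return " ".join(sentences)

# Hypothetical world-map snapshot:
world = [
    {"name": "cube1", "visible": True, "orientation": "sideways"},
    {"name": "cube2", "visible": False, "orientation": None},
]
print(describe_world(world))
```

In CozmoChat.fsm you would build this string from the live world map on every query and prepend it to the user's question.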
Develop a set of questions you can ask Cozmo to demonstrate the
strengths and weaknesses of GPT-4. For example:
- Is cube2 sideways?
- How many cubes are there?
- What is the distance between cube1 and cube2?
- Which cube is closest to cube1?
The code in Lab9c.py treats each query as a new conversation. We can
make CozmoChat behave more like ChatGPT by cumulatively growing the
context: append each new query and response to the messages list
passed in the API call. (The user's queries are marked with role
"user", while GPT-4's responses should be marked with role
"assistant".) This will allow you to have interactions like:
- Please remember that all cubes are 45 mm on a side.
- How big is cube1?
- What is the volume of cube3?
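The cumulative context can be maintained as sketched below. The ask_gpt4 stub is a hypothetical stand-in for the actual OpenAI API call in Lab9c.py; only the messages-list bookkeeping is the point here.

```python
messages = [{"role": "system",
             "content": "You are Cozmo, a small robot. Answer from the world map."}]

def ask_gpt4(messages):
    """Hypothetical stand-in for the OpenAI chat API call made in
    Lab9c.py; in the real program this would send the whole messages
    list to GPT-4 and return the model's reply text."""
    return "OK, I'll remember that."

def chat(query):
    """Append the user query, get a reply, and append it too, so the
    whole conversation is resent as context on the next call."""
    messages.append({"role": "user", "content": query})
    reply = ask_gpt4(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

chat("Please remember that all cubes are 45 mm on a side.")
chat("How big is cube1?")
print(len(messages))  # system message plus two query/response pairs
```

Because the full list is resent on each call, GPT-4 can use the earlier "45 mm" fact when answering the later size and volume questions.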
Hand In
Hand in the following:
- Your Semantris and WordEmbeddingDemo screenshots.
- Your observations comparing results from Lab9a vs. Lab9c.
- Your source code for CozmoChat.fsm.
- Your own questions for and results from CozmoChat.fsm.