Homework Assignment #4 Sound Editing

Due November 15

Use the Microsoft Sound Recorder (or another tool, see below) to record 5 different sounds (noises, music, speech) and save each of them at 3 different sampling rates (high, medium, low), preferably in more than one sound file format. To get full credit for this assignment, download a sound editor and use it to create some effects such as echo, reverberation, repetition, fade, or splicing two sounds together. You can find a free sound editor at http://www.goldwave.com. On the web page you hand in, describe what GoldWave lets you do and illustrate with 3 or more examples. Cool Edit from www.syntrillium.com/cooledit is an alternative sound editor.

Do you notice degradation in different formats or at different sampling rates? What are the file size requirements for each?
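
As a rough guide to the size question, the data size of an uncompressed (PCM) recording is the sampling rate times the bytes per sample times the number of channels times the duration. The sketch below works this out for three settings. The specific rates, bit depths, and the 10-second mono clip are illustrative assumptions, not values required by the assignment, and the file header is not counted.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of the raw audio data for uncompressed PCM (file header not included)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


# Hypothetical "high", "medium", and "low" settings: (sampling rate in Hz, bits per sample).
settings = {
    "high": (44100, 16),
    "medium": (22050, 16),
    "low": (11025, 8),
}

# Assume a 10-second mono clip for every setting.
sizes = {
    name: pcm_size_bytes(rate, bits, channels=1, seconds=10)
    for name, (rate, bits) in settings.items()
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / 1024:.0f} KB)")
```

Comparing these numbers with the sizes of the files you actually saved shows how much a given format compresses the data.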

What about the effects of lossy compression? Hand in a URL with the sounds and a description of your findings.

Alternatively, you can install a copy of a speech recognition system on your laptop (see Alex Hauptmann for the code) and try it out. You should hand in a URL describing your experiences with installing the package and training the system. How long did training take? Try saying a few sentences such as "The quick brown fox jumped over the lazy dogs" and "My name is xxx and I am a student in the multimedia e-commerce class." Also try a sentence that includes the name of your TA (Rong Jin), as well as a sentence from this homework assignment or anything else you want to try. What did the system recognize? If you were chewing gum or had just eaten something, how well did it recognize those sentences then? Was it better after you trained the system some more? You will get full credit if you install the system and give us the recognition results for five sentences, once under normal conditions and once under any one of these conditions: eating, chewing, drinking, standing far away from the microphone, standing outdoors, or having the radio on.

Or, if you are really ambitious, try creating a small application using the CMU Sphinx OCX toolkit. This requires that you have VB installed on your laptop and involves some minimal programming in VB/VC++/VJ++. See Alex Hauptmann for details. If you get anything to work there at all, you get full credit.

Email alex@cs.cmu.edu a URL which points to your text and sound files.

Homework Assignment #5 Digitizing Video

You will be responsible for digitizing at least ten sequences of video of an interview with yourself as the person being interviewed. We suggest you pair up and have one person ask the questions. Sample questions include "how old are you?", "what is your name?", "what are your hobbies?", "where were you born?", and "what do you like to eat?" Include "generic" answers such as "no", "yes", "how dare you ask that!" and "that question is not within the scope of this interview." You should then store these digital clips on the MSEC server. These clips will become part of the synthetic video interview, which you will create in a future homework assignment.

We can provide a video camera (either 8mm or digital video) or you may use your own video equipment.

All video (VHS, 8mm, or digital) will have to be encoded using the facilities in the Informedia Lab (WeH5304). You can digitize into either MPEG-1 format (higher bit rate, higher quality, slower download) or RealNetworks G2 format (lower bit rate, lower quality, faster download).

This homework is due together with the complete synthetic interview (Assignment #8). You will need pieces of it for homework assignments 6 and 7, though.

Homework Assignment #6 Creating Files in Synchronized Multimedia Integration Language (SMIL) Format

Due November 24

Take your interview clips and, for each clip, compile a list of 5 questions that could be asked in order to obtain that answer clip. For example, "how old are you", "when were you born", "what is your birthday", "tell me what your age is", and "what is your chronological age" would all fit with the interview answer "I am 15 years old."

For this assignment it will help if you pay attention in class when we discuss the SMIL format. If you need further help and examples, you can download the free Basic RealProducer from RealNetworks at http://www.real.com or http://proforma.real.com/mario/tools/producer.html?src=producer&wp=699tools

Create a SMIL file that uses the text of the answer and possibly the text of one of the question forms, to synchronize and scroll the text with the interview video clip.
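
To give a feel for what such a file contains, here is a minimal sketch that generates a SMIL 1.0 document playing a video clip and a text stream in parallel. The file names (alx01.rm, alx01.rt), the region names, and the pixel dimensions are assumptions made for illustration. The scrolling itself is assumed to be defined in the separate text file (for example a RealText file), which is not shown here.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src, text_src):
    """Return a SMIL 1.0 document that plays a video and a text stream together."""
    smil = ET.Element("smil")

    # The head holds the layout: one region for the video, one below it for the text.
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout", width="320", height="300")
    ET.SubElement(layout, "region", id="video_region",
                  left="0", top="0", width="320", height="240")
    ET.SubElement(layout, "region", id="text_region",
                  left="0", top="240", width="320", height="60")

    # <par> starts its children at the same time, which keeps clip and text in step.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="video_region")
    ET.SubElement(par, "textstream", src=text_src, region="text_region")

    return ET.tostring(smil, encoding="unicode")


# Hypothetical file names for one interview clip and its answer text.
smil_text = build_smil("alx01.rm", "alx01.rt")
print(smil_text)
```

Saving the printed text to a file with a .smi extension gives a starting point that you can then edit by hand and test in your SMIL player.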

For extra credit, add some of the digitized images from assignment #2, or other multimedia data (sounds, music) into appropriate places in the SMIL files. This may include material you located in assignment #1 but can also include newly acquired or created material.

Email christel@cs.cmu.edu a pointer to your SMIL files. Specify which SMIL player (G2, Internet Explorer, etc.) can be used.

Homework Assignment #7 Synthetic Interview Text Preparation

Due November 29

The basic synthetic interview at http://euro.ecom.cmu.edu/~zak/cgi-bin/syntheticInterview.cgi is an example prototype of the interview you will try to create for homework #7 and #8.
In this phase, take the list of questions you generated for homework #5 and #6 and convert them into the form
fileID questionText
where the fileID is the name of the clip that you digitized and the question text is one of the questions that this clip answers. E.g. in the example URL above, the file looks like this:
alx01 how old are you
alx01 what is your age
alx01 when were you born
alx01 where were you born
alx02 in what city were you born
alx02 where are you from
alx03 what is your sign
alx03 what is your zodiac sign
alx03 what is your zodiac
alx03 what is your astrological sign
ignoreRest xyzzy

Compile the list of questions from Homework Assignment #6 into this type of file. An example file is at http://euro.ecom.cmu.edu/~zak/html/alx.ClipsWithText
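
The file format described above can be read with a few lines of code. The following is a minimal sketch in Python (not the Perl template itself) that groups the question texts by fileID. It assumes that a line whose first word is ignoreRest tells the reader to stop, which is a guess based on the example; check the template code for the actual behavior.

```python
def parse_clip_lines(lines):
    """Group question texts by fileID from lines of the form 'fileID questionText'."""
    clips = {}
    for line in lines:
        parts = line.split(None, 1)  # split off the first word only
        if not parts:
            continue  # skip blank lines
        if parts[0] == "ignoreRest":
            break  # assumption: this marker means "stop reading here"
        if len(parts) < 2:
            continue  # a fileID with no question text carries no information
        file_id, question = parts
        clips.setdefault(file_id, []).append(question.strip())
    return clips


sample = """alx01 how old are you
alx01 what is your age
alx02 where are you from
ignoreRest xyzzy
this line is never read
""".splitlines()

clips = parse_clip_lines(sample)
print(clips)
```

Running a check like this on your own file is a quick way to catch lines with a missing fileID or a misspelled clip name before you plug the file into the interview.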
Using the template Perl code provided at http://euro.ecom.cmu.edu/~zak/html/syntheticInterview.txt, create a typed synthetic interview. Modify the clipFile file name to point to your file that contains the clip IDs with the question text. Put the Perl script in your cgi-bin directory with a .cgi extension so that the web server can execute it; the clip file should live in your html directory.
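
To see why several phrasings per clip help, consider how a typed question might be matched to a clip. The sketch below picks the clip whose stored question shares the most words with what the user typed. This word-overlap rule is an assumption made for illustration and is not necessarily how the template Perl code matches questions; the function name best_clip and the sample data are hypothetical.

```python
import re


def words(text):
    """Lowercase a question and return its set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))


def best_clip(typed_question, clips):
    """Return the fileID whose stored question shares the most words with the input."""
    typed_words = words(typed_question)
    best_id, best_score = None, 0
    for file_id, questions in clips.items():
        for question in questions:
            score = len(typed_words & words(question))
            if score > best_score:
                best_id, best_score = file_id, score
    return best_id  # None if no stored question shares any word


# Hypothetical clip table in the same shape as the clip file (fileID -> questions).
clips = {
    "alx01": ["how old are you", "what is your age"],
    "alx02": ["where are you from"],
    "alx03": ["what is your zodiac sign"],
}

answer = best_clip("What is your age?", clips)
print(answer)
```

With a rule like this, the more question forms you list for a clip, the more likely a visitor's wording is to land on the right answer.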

Feel free to modify the template code in any way you want to make the interview more appealing. This could include adding "advertising", better layout and graphics, background images and sounds, flash animations, etc.

If you don't have an account on euro.ecom.cmu.edu, see Alex to set up an alternative. Email alex@cs.cmu.edu a URL which points to your text "interview". (The video portion doesn't have to work until the next homework.)

Homework Assignment #8 Synthetic Interview Integration

Due December 3

Using the template Perl code from HW #7 provided at http://euro.ecom.cmu.edu/~zak/html/syntheticInterview.txt, replace the clip identifiers with the names of your own interview video files. Each answer should now play the corresponding video clip instead of showing only text. In the example code, the video is stored as simple .rm video files in the /~zak/html/ directory with the names alx01.rm, alx02.rm, and alx03.rm. The Perl CGI program calls the video files through the corresponding .smi (SMIL) files alx01.smi, alx02.smi, etc. You should use your own .smi files for this homework.
Modify the template code in any way you want to make the interview more appealing. This could include adding "advertising", better layout and graphics, background images and sounds, flash animations, etc.

Email alex@cs.cmu.edu AND christel@cs.cmu.edu a URL which points to your integrated "synthetic interview" and suggest (or program) 3 ways in which this interface could be improved.

Homework Assignment #9 Image Similarity Searching

Due December 10

Download the IBM QBIC image matching system from http://wwwqbic.almaden.ibm.com/
Try it on the images that you have collected for earlier assignments. If you can't do that, evaluate the capabilities of QBIC using the sites they list as using their system.

Write up a short description of what you found: how well does it work with your images, and what patterns of success do you see?