Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Fall 2019 - C. Faloutsos
PROJECT INFORMATION - non-default projects
0. Reminder
As mentioned, this page is for Ph.D. students who choose to do a
non-default project. Again, the list of non-default projects is here,
internal to CMU.
1. First Deliverable: Phase 1 - The Proposal
Once you have selected a non-default topic, you should do some
background reading so that you can describe, in some detail, what
you expect to accomplish. For example, if you decide to implement an
algorithm to spot fake reviews on Amazon, you will have to carefully
read the papers that propose earlier such methods, pinpoint their
weaknesses, and explain how your approach will address these
weaknesses. Once you have read up on your topic, you will be ready to
write your proposal.
The proposal should describe what you plan to do for your
project. It should describe the problem that you will be
addressing, how you plan to address it, what tools (e.g., yacc,
Postgres, Hadoop) you will need for your work, what you
expect to produce as a result of your work, and anything else that
you think the instructor should know to evaluate your plans. You
should also describe what portion of the project each partner will
be doing.
Your proposal should be approximately 6-8 pages long, typed (e.g.,
LaTeX/PDF/MS Word), in 12pt font, neat, and with pictures if they
seem useful (idraw, xfig, and TikZ are good choices). Also, the
proposal should be self-contained: you should briefly review the key
ideas in the references, and describe clearly the alternatives that
you will be examining.
Important points - check-list:
- Grading scheme: 60% for the survey; 30% for innovation; 10% for
  the plan of activities.
- Use the LaTeX template and follow its outline.
- Please provide a plan of activities and time estimates, per group
  member.
- Attribution: list which group member did (or will do) what.
- Your survey should have at least 3 papers or book
chapters per group member (outside of the reading list).
- Short papers, such as PNAS, Nature, or Science papers, count as
  0.5 each.
- Copying the abstracts of the papers is prohibited; it constitutes
  plagiarism.
- Discuss each paper in about 1 page, and describe:
  - (a) the problem definition (input/output),
  - (b) the main idea,
  - (c) why (or why not) it will be useful for your project, and
  - (d) its potential shortcomings, which you will try to improve
    upon.
- Clear problem definition: for the non-default projects, give a
  precise problem definition, as in the LaTeX template above.
- Check grammar and syntax (small penalty for each typo /
grammar error).
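The survey-size rule in the check-list (at least 3 papers or book chapters per group member, with short papers counting as 0.5) is simple arithmetic, and a minimal sketch of it follows. The function names and the `is_short` flag are hypothetical, chosen only for illustration; whether a given paper counts as "short" remains the instructor's call.

```python
def survey_credit(papers):
    """Total survey credit for a list of (title, is_short) pairs.

    Assumption: a full paper or book chapter counts as 1.0 and a
    short paper (e.g., PNAS, Nature, Science) counts as 0.5.
    """
    return sum(0.5 if is_short else 1.0 for _title, is_short in papers)


def meets_survey_minimum(papers, group_size, per_member=3):
    """True if the survey reaches per_member credits for every group member."""
    return survey_credit(papers) >= per_member * group_size


if __name__ == "__main__":
    # A two-person group needs 2 * 3 = 6 credits.
    papers = [
        ("full paper 1", False),
        ("full paper 2", False),
        ("full paper 3", False),
        ("full paper 4", False),
        ("full paper 5", False),
        ("short paper 1", True),
        ("short paper 2", True),
    ]
    print(survey_credit(papers), meets_survey_minimum(papers, group_size=2))
```

In this example, five full papers and two short ones give exactly the 6 credits a two-person group needs.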
Reminders:
- Keep the graded Phase 1 report, and attach it to your Phase 2 and
  Phase 3 submissions.
2. Second Deliverable: Phase 2 - The Progress Report
This should be a 10-15 page report that serves as a checkpoint. It
should consist of the same sections as your final report
(introduction, survey, etc.), with a few sections "under
construction". Specifically, the introduction and survey sections
should be in their final form; the section on the proposed method
should be almost finished; the sections on the experiments and
conclusions should contain whatever results you have obtained, as
well as placeholders for the results you plan or hope to obtain.
Grading scheme for the progress report:
- 70% for proposed method (should be almost finished)
- 25% for the design of upcoming experiments
- 5% for plan of activities (in an appendix, please show the old
one and the revised one, along with the activities of each group
member)
- Attach your graded Phase 1 report.
- Clear list of innovations: give a list of the best 2-4 ideas that
  your approach exhibits.
Reminders, for all projects:
- Again: keep the graded Phase 2 report, and attach it to your
  Phase 3 submission.
3. Third Deliverable: Phase 3 - The Final Report and Poster
The grade of the final phase of the project will have the
following components:
- writeup: here you describe the novelties of your approach and your
  discoveries/insights/experiments. Your final report is expected to
  be 20-30 pages long, treating the agreed topic in depth.
- software: packaging, documentation, and portability. The goal is
  to provide enough material so that other people can use it and
  continue your work.
- poster presentation: the poster of each group will consist of nine
  pages (e.g., use PowerPoint/OpenOffice to create those 9 pages).
3.1. Grading Scheme for Final Report and Poster
- Writeup
  - [2%] Introduction - Motivation
  - [3%] Problem definition
  - [5%] Survey
  - Proposed method
    - [10%] Intuition - why should it be better than the state of
      the art?
    - [35%] Description of its algorithms
  - Experiments
    - [5%] Description of your testbed; list of questions your
      experiments are designed to answer
    - [25%] Details of the experiments; observations (as many as
      you can!)
  - [5%] Conclusions
- Software (testing, packaging and documentation) [5%]
- Poster presentation [5%]
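As a sanity check, the weights listed above sum to 100%. The sketch below encodes them and computes a weighted total from per-component fractions. The dictionary keys and the `final_score` function are hypothetical names introduced for illustration, and the linear weighted sum is an assumption about how the components combine, not the official grading procedure.

```python
# Weights (in percent) copied from the grading scheme above.
FINAL_WEIGHTS = {
    "introduction_motivation": 2,
    "problem_definition": 3,
    "survey": 5,
    "method_intuition": 10,
    "method_algorithms": 35,
    "experiments_testbed": 5,
    "experiments_details": 25,
    "conclusions": 5,
    "software": 5,
    "poster": 5,
}


def final_score(fractions, weights=FINAL_WEIGHTS):
    """Weighted total in percent.

    fractions maps a component name to the fraction earned, in [0, 1].
    Components that are not listed count as 0 (an assumption).
    """
    return sum(w * fractions.get(name, 0.0) for name, w in weights.items())


if __name__ == "__main__":
    assert sum(FINAL_WEIGHTS.values()) == 100
    full_marks = {name: 1.0 for name in FINAL_WEIGHTS}
    print(final_score(full_marks))
```

Note that the proposed method and the experiments together carry 75% of the Phase 3 grade.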
3.2. Specifications for packaging of software:
Please create a tar file, like this sample package (unpack it with
gunzip, then tar xvf).
Check-list:
- After un-tar-ing, the command "make" should compile your system,
  install it if necessary, and run a small demo on a sample input
  file (included in your package).
- It should have a README file, corresponding to the "user's
  manual": this file should describe the package in a few
  paragraphs, as well as how to install it and how to use it.
- It should have a directory DOC, with your writeup and your foils
  (in your favorite form: LaTeX, PDF, PowerPoint, MS Word).
- "make paper.pdf" should create the corresponding version of your
  writeup (skip this step if you use MS Word).
- "make clean" should eliminate all the derived files (*.o,
  *.class, *.aux, etc.).
- "make all.tar" (or "make all.zip") should create a tar/zip file,
  ready for distribution.
- Please make sure that your package includes only the absolutely
  necessary set of files!
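Part of this check-list can be verified mechanically before submission. The sketch below is one possible pre-submission check, not an official grading script: it looks for a README file, a DOC directory, and a file named exactly "Makefile" (an assumption; make also accepts other file names), and then uses a deliberately naive line-based scan to see whether "clean" and "all.tar" or "all.zip" appear as explicit targets. The function names are hypothetical. It does not run make, so it cannot confirm that the default target builds the system and runs the demo, and it skips "paper.pdf" because that target is optional for MS Word users.

```python
import re
from pathlib import Path

ARCHIVE_TARGETS = ("all.tar", "all.zip")


def makefile_targets(text):
    """Naively collect explicit rule targets from Makefile text.

    Matches lines of the form "name ...: prerequisites". Pattern rules,
    includes, and variable-generated targets are not handled.
    """
    targets = set()
    for line in text.splitlines():
        # A rule line starts in column 0 and has ':' not followed by '='.
        match = re.match(r"^([A-Za-z0-9_.\-][^:=#\t]*):(?!=)", line)
        if match:
            targets.update(match.group(1).split())
    return targets


def check_package(root):
    """Return a list of problems found in the unpacked package at root."""
    root = Path(root)
    problems = []
    if not (root / "README").is_file():
        problems.append("missing README")
    if not (root / "DOC").is_dir():
        problems.append("missing DOC directory")
    makefile = root / "Makefile"
    if not makefile.is_file():
        problems.append("missing Makefile")
        return problems
    targets = makefile_targets(makefile.read_text())
    if "clean" not in targets:
        problems.append("Makefile has no 'clean' target")
    if not any(name in targets for name in ARCHIVE_TARGETS):
        problems.append("Makefile has no 'all.tar' or 'all.zip' target")
    return problems


if __name__ == "__main__":
    for problem in check_package("."):
        print(problem)
```

Run from the top of the unpacked package, the script prints one line per problem found and prints nothing if these checks pass.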
3.3. Final project report - what to hand in:
On the due date announced on the schedule,
- bring a hard copy of the writeup to class,
- bring hard copies of the graded Phase 2 and Phase 1 reports, and
- e-mail your tar/zip file before class. If the file is too large
  for e-mail, contact the instructor and/or put your large datasets
  in Dropbox.
3.4. Poster session
- When: all projects will present a poster on the last day of
  classes, Fri, Dec. 6, 1pm - 5pm.
- What: the poster of each group will consist of nine pages (e.g.,
  use PowerPoint/OpenOffice to create those 9 pages).
- Who: at least one project member, or a significant, pre-arranged
  subset of the group, should be present during the poster hours.
- How: we will provide Scotch tape to post your pages on the wall.
- Demo: it is optional but encouraged. If you do give a demo, please
  bring your own laptop (and everything else necessary: Ethernet
  cable, power adaptors, etc.).
- Who will attend: the session will be open to everybody (SCS, CIT,
  etc.).
Created: Sept. 16, 2019, by Christos Faloutsos