Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Spring 2017 - C. Faloutsos
GRADING SCHEME - default projects
updated: 4/19 (see yellow highlights)
1. First Deliverable: Phase 1 -
The Literature survey
Your survey should be approximately 6-8 pages long,
typed (eg., latex/pdf/msword), 12pt size font, neat,
and with pictures if they seem useful (`idraw', `xfig', 'tikz', are good
choices).
Important points - check-list:
- Grading scheme: 90% for the survey; 10% for
plan of activities - who-does-what
- Use the latex template -
follow its outline.
- Attribution: list which group member did (or will do) what
- Your survey should have at least 3 papers or book
chapters per group member (outside of the course reading list).
- Short papers, like PNAS, Nature, Science papers, count as
0.5.
- Copying the abstract of the papers is obviously prohibited,
constituting plagiarism.
- For each paper, discuss it in about 1 page, and describe (**clarification, 2/12/2017**)
- a) the problem definition that the paper is addressing
- b) a summary of the main idea of the paper (in your own words - cutting-and-pasting
text from the paper or any other source, is plagiarism)
- c) why, or why not, this paper is useful for your project
- d) list of shortcomings, that you think that future research could address.
- Check grammar and syntax (small penalty for each typo / grammar error).
Reminders:
- Keep the graded Phase 1 report, and attach it to your phase 2 and phase 3
submissions
2. Second Deliverable: Phase 2 -
The Progress Report
This should be a 10-15 page long report, and it serves as a
check-point. It should consist of the same sections as your final
report (introduction, survey, etc), with a few sections `under
construction'. Specifically, the introduction and survey sections
should be in their final form; the section on the implementation
should be finished; the sections on the experiments and
conclusions will have whatever results you may have obtained, as well
as `place-holders' for the results you plan/hope to obtain.
Grading scheme for the project report:
- 70% for implementation (should be finished)
- 25% for the design of upcoming experiments
- 5% for plan of activities (in an appendix, please show the old
one and the revised one, along with the activities of each group
member)
- Attach your graded
phase 1 report
- Give a list of unit-tests you have tried for your software
Reminders, for all projects:
- Again: Keep the graded
Phase 2 report, and attach
it to your phase 3 submission.
3. Third deliverable: Phase 3 -
The Final Report
The grade of the final phase of the project will have the
following components:
- writeup: there, you would describe your discoveries/insights/experiments. Your final
report is expected to be a 20-30 pages.
- software: packaging, documentation, and portability. The
goal is to provide enough material, so that other people can use it
and continue your work.
3.1. Grading Scheme for Final Report
- FYI: Tasks 3, 4, 5 refer to the project description (see that, for more details).
- Reminder: in phase3, make sure you address our feedback, from phase1 and phase2.
3.1.1. Two-member teams:
- Writeup
- [2%] Introduction - Motivation
- [3%] Problem definition
- [5%] Survey
- Implementation - Task 3: optimization
- [20%] For T3.1 ('copy' method)
- [20%] For T3.2 ('mark' method
- [10%] For an optimization of your own (describe idea; give experimental results)
- Experiments - Task 4: Evaluation
- [10%] For T4.1: report info about the top k=5 blocks for each of the two datasets
- [0%] (optional) For T4.2: describe briefly whether you think they are suspicious or not
- [10%] for T4.3: give the ROC curves and AUC for each curve
- Experiments - Task 5: Anomaly detection
- [5%] For T5.1: report info about the top k=5 blocks for each dataset
- [0%] (optional) For T5.2: describe briefly whether/which ones are suspicious.
- [5%] Conclusions
- Software (testing, packaging and documentation) [10%]
3.1.2.Single-member teams:
As above, with reduced load for the following tasks:
- For Task4.1, do only the 'AirForce TCP Dump' dataset
- For Task4.3, again, only 'AirForce'; use 'geometric average' for density, and 'maximum cardinality' for dimension selection.
- For Task5.1, do only the 'wikipedia' dataset, with 'geometric average' and 'maximum-cardinality'.
3.2. Specifications for packaging of software:
Please create a tar-file, like this
sample package ( use gunzip ; tar xvf).
Check-list:
- after un-tar-ing, the command 'make' should
compile your system, install it if necessary and run a small demo
on a sample input file (included in your package)
- it should have a README file, corresponding to the
`user's manual': This file should describe the package in a
few paragraphs, as well as how to install it and how to use
it.
- it should have a directory DOC, with your writeup, and
your foils (in your favorite form: latex, pdf, powerpoint,
ms-word)
- 'make paper.pdf' should create the
corresponding version of your writeup (skip this step, if you use
ms-word)
- `make clean' should eliminate all the
derived files (*.o, *.class, *.aux, etc)
- `make all.tar' (or 'make all.zip') should
create a tar/zip-file, ready for distribution.
- please make sure that your package includes only the
absolutely necessary set of files!
3.3. Final project report - what to hand in:
On the announced due date on the schedule,
- bring a hard copy of the writeup in class,
- a (hard copy) of the graded phase-2 report
and graded phase-1 report and
- upload on black-board your tar/zip-file
before class.
- Only one team-member need to upload the file.
- If the file is too large (it shouldn't - omit the datafiles), contact
the instructor.
Created: Feb. 12, 2017; last modified: April 19, 2017 (items in yellow highlight), by Christos Faloutsos.