Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Fall 2012 - C. Faloutsos
Final exam study guide
Preliminaries:
- No aids allowed, except
- two, standard 8'' x 11.5'' pages, with your notes (you may use
both sides) and
- pocket calculators: strongly
recommended (logarithms)
- Duration: 3 hours
- The exam will be comprehensive,
with more emphasis on
the material after the midterm
Reminders:
- Time and place: Thu. December
13 5:30p.m.‐8:30p.m. DH A302, as announced at the 'hub':
http://www.cmu.edu/hub/docs/final-exams.pdf
- Office hours by
instructor:
- the usual Wed
Dec. 5, 2-3pm
- extra: Mon Dec. 10, 11-12noon
- extra: Tue Dec. 11, 12-2pm
- Several of the links are internal
to CMU.
Material to be examined
As in the original reading list, with a few papers deleted.
Specifically, the material consists of all
the lecture foils (including the guest lectures), and all the papers/chapters
below:
1. Foils:
Everything we covered in the foils (see their pdf, from the
course schedule page)
2. Multimedia Indexing
- Primary key access methods
- Secondary key and spatial access methods
- Jon
Louis Bentley,
Multidimensional binary search trees used for associative
searching, Comm. of the ACM (CACM), Volume 18 ,
Issue 9, pp. 509-517, (September 1975)
- A. Guttman
R-Trees: a Dynamic Index Structure for Spatial
Searching, Proc. ACM SIGMOD, June 1984, pp. 47-57, Boston,
Mass.
- J. Orenstein,
Spatial Query Processing in an Object-Oriented Database
System, Proc. ACM SIGMOD, May, 1986, pp. 326-336,
Washington D.C.
- Ibrahim Kamel and Christos Faloutsos,
Hilbert R-tree: An improved R-tree using fractals
Proc.
of VLDB Conference, Santiago, Chile, Sept. 12-15, 1994, pp.
500-509.
- Roberto F. Santos Filho, Agma Traina, Caetano Traina Jr., and
Christos Faloutsos:
Similarity search without tears: the OMNI family of
all-purpose
access methods ICDE, Heidelberg, Germany, April 2-6
2001.
- MM Textbook, chapters 4 and 5.
- Fractals
- Christos Faloutsos and Ibrahim Kamel,
Beyond Uniformity and Independence: Analysis of R-trees
Using
the Concept of Fractal Dimension, Proc. ACM
SIGACT-SIGMOD-SIGART PODS, May 1994, pp. 4-13, Minneapolis,
MN.
- Alberto Belussi and Christos Faloutsos, Estimating
the Selectivity of Spatial Queries Using the `Correlation' Fractal
Dimension Proc. of VLDB, p. 299-310, 1995.
- Power laws, lognormals etc: M. E. J. Newman, Power laws, Pareto
distributions and Zipf's law Contemporary Physics 46, 323-351
(2005) (local
pdf copy)
- Text and LSI
- MM Textbook, chapter 6
- Peter W. Foltz and Susan T. Dumais,
Personalized Information Delivery: an Analysis of
Information
Filtering Methods, Comm. of ACM (CACM), 35, 12, Dec. 1992,
pp. 51-60.
- SVD: In PTVF ch. 2.6; MM
Textbook Appendix D
- PageRank: Sergey Brin, Lawrence Page The Anatomy of a
Large-Scale Hypertextual Web Search Engine (1998) (local
pdf)
Also, [GM textbook section 15.4]
- HITS: Jon M. Kleinberg Authoritative Sources in a
Hyperlinked Environment JACM, 46,5 (1999) (local
pdf) (also [GM textbook, section 15.3]
- Tensors: Tamara G. Kolda and Brett W. Bader.
Tensor decompositions and applications. Technical Report
SAND2007-6702, Sandia National Laboratories, Albuquerque, NM and
Livermore, CA, November 2007 (local
pdf copy ) (Also, [GM textbook, ch 16]).
- ICA: Jia-Yu Pan, Christos Faloutsos, Masafumi Hamamoto and
Hiroyuki Kitagawa: AutoSplit: Fast and Scalable Discovery of
Hidden Variables in Stream and Multimedia Databases, PAKDD,
Sydney, Australia, May 2004.
- Time sequences
- DSP and image databases
- Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jon Ashley,
Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee,
Dragutin Petkovic, David Steele and Peter Yanker
Query by Image and Video Content: the QBIC System
IEEE
Computer 28, 9, Sep. 1995, pp. 23-32.
- Journal
of Intelligent Inf. Systems, 3, 3/4, pp. 231-262, 1994 An
earlier, more technical version of the IEEE Computer '95
paper.
- FastMap: MM Textbook chapter 11; Also in:
C.
Faloutsos and K.I. Lin
FastMap: A Fast Algorithm for Indexing, Data-Mining and
Visualization of Traditional and Multimedia Datasets ACM
SIGMOD 95, pp. 163-174
.
- DFT/DCT: In PTVF ch. 12.1, 12.3, 12.4; in
MM Textbook Appendix B.
- Wavelets: In PTVF ch. 13.10; in MM Textbook Appendix C
- Karhunen-Loeve: in MM Textbook Appendix D.
- JPEG: Gregory K. Wallace,
The JPEG Still Picture Compression Standard, CACM,
34,
4, April 1991, pp. 31-44
- MPEG: D. Le Gall,
MPEG: a Video Compression Standard for Multimedia
Applications CACM, 34, 4, April 1991, pp. 46-58
- Fractal compression: M.F. Barnsley and A.D. Sloan,
A Better Way to Compress Images, BYTE, Jan. 1988,
pp.
215-223.
- MM Textbook, chapter 9
3. Data mining
- Graph mining and social networks:
- Patterns in graphs [GM textbook, ch 3-8]
- R. Albert, H. Jeong, and A.-L. Barabási,
Diameter of the World Wide Web Nature,
401, 130-131 (1999).
- Réka Albert and Albert-László
Barabási
Statistical mechanics of complex networks, Reviews of
Modern Physics, 74, 47 (2002).
- Jure Leskovec, Jon Kleinberg, Christos Faloutsos Graphs
over Time: Densification Laws, Shrinking Diameters and Possible
Explanations, KDD 2005, Chicago, IL, USA, 2005.
- Graph generators: RMAT and Kronecker
[GM textbook, ch 12-14]
- Community detection [GM textbook, ch 17]
- Time series forecasting
- Data Mining in Databases:
Reminders: Required texts
Recommended text
- [HK] Jiawei Han and
Micheline Kamber, Data Mining: Concepts
and Techniques, Morgan Kaufmann, 2000 (1st edition); 2006 (2nd
edition).
- [PTVF] William H. Press Saul A.
Teukolsky William T. Vetterling Brian P. Flannery Numerical
Recipes in C Cambridge University Press, 1992, 2nd Edition.
On-line evaluation copy
-
Undergraduate DB textbook, for
those who took a db class too long ago:
Last modified: Dec. 3, 2012, by Christos Faloutsos