Carnegie Mellon University
15-826 Multimedia Databases and Data Mining
Spring 2007 - C. Faloutsos
Reading List
NOTICE:
Several of the links are internal to CMU.
Required text
Recommended text
- [HK] Jiawei Han and
Micheline Kamber, Data Mining: Concepts
and Techniques, Morgan Kaufmann, 2000.
- [PTVF] William H. Press, Saul A.
Teukolsky, William T. Vetterling and Brian P. Flannery, Numerical
Recipes in C, Cambridge University Press, 1992, 2nd Edition.
On-line evaluation copy
- Undergraduate DB textbook, for
those who took a DB class too long ago:
- Raghu Ramakrishnan, Johannes Gehrke, "Database Management
Systems," McGraw-Hill 2002 (3rd ed).
Foils:
In pdf, from the course schedule
page.
A. Multimedia Indexing
- Primary key access methods
- Secondary key and spatial access methods
- A. Guttman
R-Trees: a Dynamic Index Structure for Spatial
Searching, Proc. ACM SIGMOD, June 1984, pp. 47-57, Boston,
Mass.
- J. Orenstein,
Spatial Query Processing in an Object-Oriented Database
System, Proc. ACM SIGMOD, May, 1986, pp. 326-336,
Washington, D.C.
- Textbook, chapters 4 and 5.
- Fractals
- Ibrahim Kamel and Christos Faloutsos,
Hilbert R-tree: An improved R-tree using fractals, Proc.
of VLDB Conference, Santiago, Chile, Sept. 12-15, 1994, pp.
500-509.
- Christos Faloutsos and Ibrahim Kamel,
Beyond Uniformity and Independence: Analysis of R-trees Using
the Concept of Fractal Dimension, Proc. ACM
SIGACT-SIGMOD-SIGART PODS, May 1994, pp. 4-13, Minneapolis,
MN.
- Text and LSI
- Time sequences
- DSP and image databases
- Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jon Ashley,
Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee,
Dragutin Petkovic, David Steele and Peter Yanker,
Query by Image and Video Content: the QBIC System, IEEE
Computer 28, 9, Sep. 1995, pp. 23-32.
- Journal
of Intelligent Inf. Systems, 3, 3/4, pp. 231-262, 1994. (An
earlier, more technical version of the IEEE Computer '95
paper.)
- FastMap: Textbook chapter 11; also in: C.
Faloutsos and K.I. Lin, FastMap: A Fast Algorithm for Indexing,
Data-Mining and Visualization of Traditional and Multimedia
Datasets, ACM SIGMOD 95, pp. 163-174.
- DFT/DCT: In PTVF ch. 12.1, 12.3, 12.4; in
Textbook Appendix B.
- Wavelets: In PTVF ch. 13.10; in Textbook Appendix C.
- Karhunen-Loeve: in Textbook Appendix D.
- JPEG: Gregory K. Wallace,
The JPEG Still Picture Compression Standard, CACM, 34,
4, April 1991, pp. 31-44.
- MPEG: D. Le Gall,
MPEG: a Video Compression Standard for Multimedia
Applications, CACM, 34, 4, April 1991, pp. 46-58.
- Fractal compression: M.F. Barnsley and A.D. Sloan,
A Better Way to Compress Images, BYTE, Jan. 1988, pp.
215-223.
- Textbook, chapter 9
B. Data mining
- Graph mining and social networks:
- Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos,
On Power-Law Relationships of the Internet Topology,
SIGCOMM 1999.
- R. Albert, H. Jeong, and A.-L. Barabási,
Diameter of the World Wide Web, Nature,
401, 130-131 (1999).
- Réka Albert and Albert-László
Barabási,
Statistical mechanics of complex networks, Reviews of
Modern Physics, 74, 47 (2002).
- Jure Leskovec, Jon Kleinberg and Christos Faloutsos, Graphs
over Time: Densification Laws, Shrinking Diameters and Possible
Explanations, KDD 2005, Chicago, IL, USA.
- D. Chakrabarti and C. Faloutsos, Graph Mining: Laws, Generators and
Algorithms, in ACM Computing Surveys, 38(1), 2006
(pdf
draft, internal to CMU)
- Statistics background: In PTVF pp. 620-621
and ch. 14.4-14.5.
- AI background / Classification
- [HK] chapter 7.3
- Rakesh Agrawal, Sakti Ghosh, Tomasz Imielinski, Bala Iyer and
Arun Swami,
An Interval Classifier for Database Mining Applications,
VLDB Conf. Proc., Vancouver, BC, Canada, Aug. 1992, pp.
560-573.
- M. Mehta, R. Agrawal and J. Rissanen, SLIQ:
A Fast Scalable Classifier for Data Mining, Proc. of the
Fifth Int'l Conference on Extending Database Technology, Avignon,
France, March 1996.
- Data Mining in Databases:
- Data warehouses, OLAP and DataCubes: [HK], ch. 2.
- Data reduction: [HK] chapter 3.4
- Association Rules:
- Cluster analysis: [HK] chapter 8.
- Miscellaneous (ICA, approximate counting)
- Jia-Yu Pan, Christos Faloutsos, Masafumi Hamamoto and Hiroyuki
Kitagawa,
AutoSplit: Fast and Scalable Discovery of Hidden Variables in
Stream and Multimedia Databases, PAKDD, Sydney, Australia,
May 2004.
- Christopher Palmer, Phillip Gibbons and Christos Faloutsos,
ANF: A Fast and Scalable Tool for Data Mining in Massive
Graphs, KDD 2002, Edmonton, Alberta, Canada, July 2002.
- Aristides Gionis, Dimitrios Gunopulos and Nikos Koudas,
Efficient and Tunable Similar Set Retrieval, ACM
SIGMOD, Santa Barbara, California, May 21-24, 2001.
- Phillip B. Gibbons and Yossi Matias,
New sampling-based summary statistics for improving approximate
query answers, ACM
SIGMOD, pp. 331-342, Seattle, Washington, 1998.
RECOMMENDED OPTIONAL READING
Additional, optional citations that may be useful for
your project:
Multimedia indexing
- Spatial access methods:
- N. Beckmann, H.-P. Kriegel, R. Schneider and B. Seeger, The
R*-Tree: an Efficient and Robust Access Method for Points and
Rectangles, ACM SIGMOD, May 1990, pp. 322-331, Atlantic City, NJ.
(Deferred splitting in R-trees)
- Fractals
- B. Mandelbrot, Fractal Geometry of Nature, W.H. Freeman,
1977. (The classic book on fractals).
- Manfred Schroeder, Fractals, Chaos, Power Laws: Minutes From
an Infinite Paradise, W.H. Freeman and Company, 1991. (An
excellent introduction to fractals)
Data mining
- Time sequences
- George E.P. Box, Gwilym M. Jenkins and Gregory C. Reinsel,
Time Series Analysis: Forecasting and Control, Prentice Hall,
1994 (3rd Edition). (Time series forecasting - the classic
approach. It also has the algorithms for linear predictive
coding.)
- Andreas S. Weigend and Neil A. Gershenfeld, Time Series
Prediction: Forecasting the Future and Understanding the Past,
Addison Wesley, 1994. (Time series forecasting: non-linear/chaotic
approaches)
- Spiros Papadimitriou, Jimeng Sun and Christos Faloutsos,
Streaming Pattern Discovery in Multiple Time-Series, VLDB 2005,
Trondheim, Norway.
- Graph mining:
- Tom Mitchell,
Machine Learning, McGraw Hill, 1997.
- John Ross Quinlan, C4.5: Programs for Machine Learning,
Morgan Kaufmann Publishers Inc., 1993. (Introduction to data
mining, with source code)
Last modified: 01/13/2007, by Christos Faloutsos