TITLE: Mining large graphs and streams using matrix and tensor tools

INSTRUCTORS:
Christos Faloutsos, CMU
Tammy Kolda, Sandia National Laboratories
Jimeng Sun, CMU

INTENDED DURATION: 3 hours

DESCRIPTION - OBJECTIVES
How can we find patterns in sensor streams (e.g., a sequence of temperatures, water-pollutant measurements, or machine-room measurements)? How can we mine an internet traffic graph as it evolves over time? How can we make the process incremental? We review the state of the art in four related fields: (a) numerical analysis and linear algebra, (b) multi-linear/tensor analysis, (c) graph mining, and (d) stream mining. We will present both the theoretical results and algorithms, and case studies on several real applications.

CONTENT AND OUTLINE
Note to evaluators: We are asking for a 3-hour duration, to include all parts. However, we could easily create a 2-hour version by omitting Part II below.

[Part I. Core]
Data model - Fundamental concepts (3'-5')
- Time series
- Matrices
- Tensors
Matrix analysis (12'-30')
- SVD, PCA and eigen-decomposition
- PageRank, HITS
- Independent Component Analysis (ICA)
- Sparse decompositions: CUR
- Semi-discrete decomposition (SDD)
- Co-clustering
Tensor analysis (30'-35')
- Intro (15')
- PARAFAC (5')
- Tucker model (5'-10')
    Tucker 1 and PCA; Tucker 2 and Tensor PCA; Tucker 3 and High-order SVD (HO-SVD)
- Other models (5'-10')
    Combination of PARAFAC and Tucker
    DEDICOM

[Part II. Extension - optional]
Nonnegativity (15'-20')
- Nonnegative matrix factorization
- Nonnegative tensor factorization
Missing values (15'-20')
- Matrices
- Tensors
Stream mining (10'-15')
- Incremental PCA
- Dynamic tensor analysis
- Window-based tensor analysis

[Part III. Practice]
Software (10')
- Intro
- Issues
    Scalability, Accuracy, Sparsity
Case studies (10'-15')
- Sensor networks, machine monitoring
- Internet forensic computing
- Social network analysis
- Web graph study

WHO SHOULD ATTEND
Researchers who want to get up to speed with the major tools in stream mining and graph mining.
Also, practitioners who want a concise, intuitive overview of the state of the art.

RELATED PREVIOUS TUTORIALS (if any)
None

ABOUT THE INSTRUCTORS
Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), seven "best paper" awards, and several teaching awards. He has served as a member of the executive committee of SIGKDD; he has published over 140 refereed articles and one monograph, and holds five patents. His research interests include data mining for streams and networks, fractals, indexing for multimedia and bio-informatics databases, and performance.

Tamara Kolda ............

Jimeng Sun ............