Tutorial 3: Data Mining the Internet
Speakers: Michalis and Christos Faloutsos
Date: Monday, March 31, 2003
Time: 9am - 5pm
Abstract:
In this tutorial, we address two questions:
what do we know about the Internet? And, how can you learn more about it?
First, we present the state of the art of what we know about modeling
and simulating the Internet. Second, we present cutting-edge techniques
for furthering our understanding of the network.
The motivation is that, despite significant research efforts, we know very
little about the Internet. Furthermore, most network researchers are unaware
of the wealth of analysis tools from the areas of data mining and statistics.
Data analysis based on averages, standard deviation, and Poisson processes
has exhausted its capabilities.
We present two scenarios that illustrate the two main thrusts of this tutorial.
- Scenario one (i.e., what): You want to simulate your new protocol. What
  topology should you use? What is the distribution of sources and
  destinations? What is the traffic intensity of each connection? What kind
  of background traffic should you use?
- Scenario two (i.e., how): You just obtained a large set of measurements of
  round-trip delays among several node pairs over a few hours. How can you
  characterize the data? How do you compare the delays between different
  end-points? How do you cluster "similar" round-trip behavior? How can you
  identify abnormal behavior such as a Distributed Denial of Service (DDoS)
  attack?
In a nutshell, the main goal of this tutorial is to present what we know about
modeling the Internet, and how we can learn more. The tutorial intends
to bridge the gap between network researchers and data mining researchers.
Foils:
Part 1 and Part 2
Presenter Biographies:
The instructors have been collaborating for 4 years, with multiple joint
papers. This joint work has been a fusion of the two research areas of
the collaborators: networks and data mining. The work has focused on Internet
modeling using advanced data mining techniques and has led to discoveries
that would not have been feasible otherwise.
Michalis Faloutsos received the B.Sc. degree in Electrical Engineering (1993)
from the National Technical University of Athens, Greece and the M.Sc.
and Ph.D. degrees in Computer Science from the University of Toronto, Canada
(1999). He is currently an assistant professor at the University of California,
Riverside. He has received the CAREER award from NSF (2000), and two major
DARPA grants. He has co-authored with Christos and Petros Faloutsos the
highly cited paper "On Powerlaws and the Internet Topology" (SIGCOMM '99),
which renewed the interest of the community in modeling the Internet topology.
His interests include Internet measurements, multicast protocols, real-time
communications, and wireless networks.
Christos Faloutsos received the B.Sc. degree in Electrical Engineering (1981)
from the National Technical University of Athens, Greece and the M.Sc.
and Ph.D. degrees in Computer Science from the University of Toronto, Canada.
He is currently a professor at Carnegie Mellon University. He has received
the Presidential Young Investigator Award from the National Science Foundation
(1989), several "best paper" awards (SIGMOD 94, VLDB 97, KDD01 (runner-up),
Performance 2002 (best student paper)), and four teaching awards. He has
published over 120 refereed articles and one monograph, and holds four patents.
His research interests include data mining, network analysis, and indexing
in relational and multimedia databases.