Sparse boolean matrix multiplication, data mining, and hashing
Feb 9, 2011
ABSTRACT:
Boolean matrix multiplication, i.e., multiplication over the (OR, AND)
semiring, is known to exhibit deep connections to a number of
fundamental computational problems. However, the sparse case where
most input (and/or output) entries are 0 is less well-studied. In this
talk I will argue that this special case is interesting because it
lies at the heart of some fundamental problems in databases and data
mining. I will then present the key theoretical ideas in three recent
results (ICDT '09, RANDOM '10, IPDPS '11) that look into different
aspects of this problem: New complexity upper bounds in RAM and I/O
models, estimation of the sparseness of a matrix product, and
parallel computation. The two last results exploit hashing techniques,
including a variation of cuckoo hashing that allows efficient parallel
set intersections.
Speaker: Rasmus Pagh, IT University of Copenhagen.
Joint work with Rasmus Resen Amossen and Andrea Campagna.
Speaker: Rasmus Pagh, IT University of Copenhagen.
Joint work with Rasmus Resen Amossen and Andrea Campagna.