News
What is Ligra? Try it!
Ligra is a lightweight graph processing
framework for shared memory. It is particularly suited for
implementing parallel graph traversal algorithms where only a subset
of the vertices are processed in an iteration. The project was
motivated by the fact that the largest publicly available real-world
graphs all fit in shared memory. When graphs fit in shared-memory,
processing them using Ligra can give performance improvements of up to
orders of magnitude compared to distributed-memory graph processing
systems.
Ligra supports two data types, one representing
a
graph, and another for representing subsets of the vertices,
which is referred to as
vertexSubsets. Other than constructors
and size queries, the interface supplies only two functions, one for
mapping over vertices (
vertexMap) and the other for mapping
over edges (
edgeMap). Since a vertexSubset is a subset of
vertices, the vertexMap can be used to map over any subset of the
original vertices, and hence its utility in traversal algorithms---or
more generally in any algorithm in which only (possibly small) subsets
of the graph are processed on each round. The edgeMap also processes
a subset of the edges, which is specified using a vertexSubset to
indicate the valid sources, and a Boolean function to indicate the
valid targets of each edge.
For example, a parallel breadth-first search (BFS) algorithm
is implemented using vertexSubsets and
edgeMap:
The BFS uses a Parents array (initialized all to -1, except
for the root
r where Parents[
r] =
r) in which each vertex will
point to its parent in a BFS tree. As with standard parallel versions
of BFS, on each step
i (starting at 0) the algorithm maintains
a frontier of all vertices reachable from the root
r
in
i steps. Initially a vertexSubset containing just the root
vertex is created to represent the frontier. Using edgeMap, each step
checks the neighbors of the frontier to see which have not been
visited, updates those to point to their parent in the frontier, and
adds them to the next frontier. The user-supplied
function
Update atomically checks to see if a vertex has been
visited using a compare-and-swap and returns true if not previously
visited (Parents[
i] == -1). The
Cond function tells edgeMap to consider only target vertices
which have not been visited. The edgeMap function returns a new vertex
set containing the target vertices for which
Update returns true,
i.e., all the vertices in the next
frontier. The BFS completes when the frontier is empty and hence no
more vertices are reachable.
Ligra uses an optimization for edgeMap which switches between
read-based and write-based processing of frontier vertices based on
the size of the frontier. This optimization was introduced for BFS
by
Beamer, Asanovic and Patterson, and we generalize it to
a variety of other graph algorithms besides BFS.
Ligra+ is an extension to Ligra that supports graph
compression. The interface of Ligra+ is exactly the same as Ligra's
interface, so code written in Ligra can run in Ligra+ without
modification. Ligra+ currently supports three compression
schemes---byte codes, byte codes with run-length encoding, and nibble
codes. We show that using graph compression, we are able to reduce the
space usage of graph algorithms and improve their parallel performance
at the same time.
Try it
The source code for Ligra and Ligra+ is publicly
available
on Github. Implementations for breadth-first
search, approximate betweenness centrality, graph radii estimation,
connected components, PageRank and Bellman-Ford single-source shortest
paths are currently available. Please contact
Julian
Shun if you experience any problems running it. Feedback will be
appreciated.
We plan to add more algorithms implemented using Ligra. We welcome
others to develop their own applications in Ligra, and would be happy
to add some of them to the Github repository. Please send your
implementations to
Julian Shun.
Resources