David Walker
Princeton University
PADS/ML: A Functional Data Description Language
Abstract:
Massive amounts of useful data are stored and processed in ad hoc formats for
which common tools like parsers and pretty printers do not exist.
Traditional data management systems provide rich infrastructure for
processing well-behaved data, but are of little use when dealing with
data in ad hoc formats. To address the challenges of ad hoc data,
we have designed PADS/ML, a declarative data description language for
the ML family of languages. PADS/ML is based on the ML type
structure and features polymorphic, dependent, recursive datatypes for
describing ad hoc data.
In this talk, we will describe the design, implementation and semantics of PADS/ML data
descriptions. The design exploits the elegance of ML's datatypes
and the power of its module system. The implementation has been
done in O'Caml. Our compilation strategy uses a "types as modules"
paradigm that pushes up against the limits of ML's advanced module
system and poses practical challenges for module system
designers. The semantics are based on an extension of our
previous work on the Data Description Calculus [POPL 06] with
type-parameterized types, and we have proven the resulting system
type-correct: generated parsers return data of the expected type.
This is joint work with Yitzhak Mandelbaum, Kathleen Fisher, Mary Fernandez and Artem Gleyzer.
- - - - - - - -
Host: Karl Crary
- - - - - - - -
Tuesday, May 30, 2006*
3:30 p.m.
Wean Hall 8220
Principles of Programming Seminars
*NOTE: NOT USUAL DAY