- D. Balcan, M. S. Lewicki. Point Coding: Sparse Image Representation with
Adaptive Shiftable-Kernel Dictionaries. In SPARS 2009.[pdf]
[abs]
This paper addresses the problem of adaptively
deriving optimally sparse image representations, using an dictionary
composed of shiftable kernels. Algorithmic advantages of
our solution make possible the computation of an approximately
shift-invariant adaptive image representation. Learned kernels
can have different sizes and adapt to different scales. Coefficient
extraction uses a fast implementation of Matching Pursuit with
essentially logarithmic cost per iteration. Dictionary update is
performed by solving a structured least-squares problem either
by algebraic characterization of pseudoinverses of structured
matrices, or by superfast interpolation methods. Kernels learned
from natural images display expected 2D Gabor aspect (localization
in orientation and frequency), as well as other structures
commonly occurring in images (e.g., curved edges, or cross
patterns), while when applied to newspaper text images, kernels
tend to reproduce printed symbols or groups thereof.
- D. Balcan, M. S. Lewicki. Adaptive coding of images via
Multiresolution ICA. In IEEE ICASSP, Taipei, Taiwan, 2009.[pdf]
[abs]
Multiresolution
(MR) representations have been very successful in
image encoding, due to both their algorithmic performance and coding
efficiency. However these transforms are fixed, suggesting that
coding efficiency could be further improved if a multiresolution code
could be adapted to a specific signal class. Among adaptive coding
methods, independent component analysis (ICA) provides the
best linear code by finding a linear transform with maximally independent
coefficients, given a specific signal distribution. This technique,
however, scales poorly with the dimensionality of the data,
and has been ill-suited for large-scale image coding. We propose
a hybrid method (multi-resolution ICA) which derives an ICA basis
for each subband space produced by a given MR transform over
the image class. We find that this method produces a significantly
more efficient code compared to the MR transform alone. We provide
both quantitative and qualitative assessments of coding performance,
and illustrate improvement over standard (i.e., non-adaptive)
wavelet-based representations such as that used in JPEG2000.
- D. Balcan, A. Sandryhaila, J. Gross, and M. Püschel. Alternatives
to the Discrete Fourier Transform. In IEEE ICASSP, Las Vegas, NV,
USA, 2008.[pdf]
[abs]
It is
well-known that the discrete Fourier transform (DFT) of a finite
length discrete-time signal samples the discrete-time Fourier
transform of the same signal at equidistant points on the unit circle.
Hence, as the signal length goes to infinity, the DFT approaches
the DTFT. Associated with the DFT are circular convolution and a
periodic signal extension. In this paper we identify a large class
of alternatives to the DFT using the theory of polynomial algebras.
Each of these Fourier transforms approaches the DTFT just as the
DFT does, but has its own signal extension and notion of convolution.
Further, these Fourier transforms have Vandermonde structure,
which enables their fast computation. We provide a few experimental
examples that confirm our theoretical results.
- E. Doi, D. C. Balcan, and M. S. Lewicki. Robust coding over
noisy overcomplete channels, IEEE Trans. Image Proc., vol. 2, no.
16, Feb. 2007, pp. 442-452.[pdf]
[abs]
We
address the problem of robust coding in which the
signal information should be preserved in spite of intrinsic noise
in the representation. We present a theoretical analysis for 1- and
2-D cases and characterize the optimal linear encoder and decoder
in the mean-squared error sense. Our analysis allows for an arbitrary
number of coding units, thus including both under- and
over-complete representations, and provides insights into optimal
coding strategies. In particular, we show how the form of the code
adapts to the number of coding units and to different data and
noise conditions in order to achieve robustness. We also present
numerical solutions of robust coding for high-dimensional image
data, demonstrating that these codes are substantially more robust
than other linear image coding methods such as PCA, ICA, and
wavelets.
- J. Rosca, T. Gerkmann, D.C. Balcan. Statistical Inference
of Missing Speech Data in the ICA Domain.
In IEEE ICASSP, Toulouse, France, 2006.[pdf]
[abs]
We
address the problem of speech estimation as statistical estimation
with “missing” data in the independent component analysis (ICA) domain.
Missing components are substituted by values drawn from “similar” data
in
a multi-faceted ICA representation of the complete data. The paper
presents
the algorithm for the inference of missing data in the case of a fixed
pattern
of missing data. We apply our approach to the problem of bandwidth
extension,
or where speech is degraded by a fixed filtering process and show the
capability of the algorithm to reconstruct fine missing details of the
original
data with little artifacts. The evaluation is done using objective
distortion
measures on speech samples from the NTT database.
- D.C. Balcan, J. Rosca. Independent Component Analysis for
Speech Enhancement with Missing TF Content.
In ICA, Charleston, SC, USA, 2006.[pdf]
[abs]
We
address the problem of Speech Enhancement in a setting
where parts of the time-frequency content of the speech signal are
missing.
In telephony, speech is band-limited and the goal is to reconstruct a
wide-band version of the observed data. Quite differently, in Blind
Source
Separation scenarios, information about a source can be masked by noise
or other sources. These masked components are “gaps” or missing source
values to be “filled in”. We propose a framework for unitary treatment
of
these problems, which is based on a relatively simple “spectrum
restoration”
procedure. The main idea is to use Independent Component Analysis
as an adaptive, data-driven, linear representation of the signal in the
speech frame space, and then apply a vector-quantization-based matching
procedure to reconstruct each frame. We analyze the performance of
the reconstruction with objective quality measures such as log-spectral
distortion and Itakura-Saito distance.
- E. Doi, D.C. Balcan, M.S. Lewicki. A Theoretical Analysis
of Robust Coding over Noisy Overcomplete Channels.
In NIPS, Vancouver, BC, Canada, 2005.[pdf]
[abs]
Biological
sensory systems are faced with the problem of encoding a
high-fidelity sensory signal with a population of noisy, low-fidelity
neurons.
This problem can be expressed in information theoretic terms as
coding and transmitting a multi-dimensional, analog signal over a set
of
noisy channels. Previously, we have shown that robust, overcomplete
codes can be learned by minimizing the reconstruction error with a
constraint
on the channel capacity. Here, we present a theoretical analysis
that characterizes the optimal linear coder and decoder for one- and
twodimensional
data. The analysis allows for an arbitrary number of coding
units, thus including both under- and over-complete representations,
and
provides a number of important insights into optimal coding strategies.
In particular, we show how the form of the code adapts to the number
of coding units and to different data and noise conditions to achieve
robustness.
We also report numerical solutions for robust coding of highdimensional
image data and show that these codes are substantially more
robust compared against other image codes such as ICA and wavelets.