SCS Student Seminar Series
All SSS talks can be held either in person or over Zoom.
Spring 2022 Schedule
Spring 2022 note: Feb 18, 25, and 28 are unavailable.
Mon, Jan 10 |
GHC 6501 |
|
available |
Fri, Jan 14 |
GHC 6501 |
|
available |
Mon, Jan 17 |
GHC 6501 |
|
available |
Fri, Jan 21 |
Remote |
|
DriftSurf: Stable-State / Reactive-State Learning under Concept Drift
Presented by Ellango Jothimurugesan
https://cmu.zoom.us/j/93961739346?pwd=amJjWFppbjJ0dVRGMWp1enZpajFiZz09
When learning from streaming data, a change in the data distribution, also known as concept drift, can render a previously-learned model inaccurate and require training a new model. We present an adaptive learning algorithm that extends previous drift-detection-based methods by incorporating drift detection into a broader stable-state/reactive-state process. The advantage of our approach is that we can use aggressive drift detection in the stable state to achieve a high detection rate, while mitigating the false positive rate of standalone drift detection via a reactive state that reacts quickly to true drifts and eliminates most false positives. Our theoretical analysis shows that the risk of the algorithm is competitive with that of an algorithm with oracle knowledge of when (abrupt) drifts occur. Experiments on synthetic and real datasets with concept drifts confirm our theoretical analysis.
This talk is in partial fulfillment of the speaking requirement.
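The stable-state/reactive-state process described above can be sketched as follows. This is a hypothetical illustration that tracks only per-batch error rates, not the authors' DriftSurf implementation; the class name, parameters, and thresholds are all invented:

```python
# Hypothetical sketch of a stable-state/reactive-state drift process.
# In the stable state, an aggressive detector alarms on any noticeable error
# increase; the reactive state then races the old model against a fresh one
# to filter out false alarms.
class StableReactiveLearner:
    def __init__(self, detector_threshold=0.1, reactive_len=4):
        self.state = "stable"
        self.best_error = None          # best error observed in the stable state
        self.reactive_left = 0
        self.old_wins = 0
        self.new_wins = 0
        self.threshold = detector_threshold
        self.reactive_len = reactive_len

    def observe(self, old_model_error, new_model_error=None):
        """Feed per-batch error rates; returns the state (or the verdict
        when a reactive phase ends)."""
        if self.state == "stable":
            if self.best_error is None or old_model_error < self.best_error:
                self.best_error = old_model_error
            # Aggressive detection: alarm on any noticeable degradation.
            if old_model_error > self.best_error + self.threshold:
                self.state = "reactive"
                self.reactive_left = self.reactive_len
                self.old_wins = self.new_wins = 0
        else:
            # Reactive state: compare the old model against a freshly
            # trained one on each incoming batch.
            if old_model_error <= new_model_error:
                self.old_wins += 1
            else:
                self.new_wins += 1
            self.reactive_left -= 1
            if self.reactive_left == 0:
                switched = self.new_wins > self.old_wins
                self.state = "stable"
                self.best_error = None
                return "switched" if switched else "false positive"
        return self.state
```

A true drift makes the new model win the reactive phase ("switched"); after a spurious alarm the old model keeps winning and the alarm is dismissed.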
|
Fri, Jan 24 |
Remote |
|
Training Provably Robust Neural Networks
Presented by Klas Leino
https://cmu.zoom.us/j/93404279609?pwd=a0xROWt0ZzJ2UG5wOHcxOGR2M3BiQT09
Deep networks have extensively been shown to be vulnerable to maliciously-perturbed inputs, termed "adversarial examples," through which an attacker can cause the network to make arbitrary mistakes. This raises concerns for neural networks deployed in the wild, especially in safety-critical settings, e.g., in autonomous vehicles. In turn, this has motivated a volume of work on practical defenses; among these, this talk will focus primarily on rigorous defenses that provide provable guarantees of a property termed "local robustness," which precludes an important class of adversarial examples. Specifically, we will cover an elegant and effective defense that modifies the architecture of a neural network to naturally provide provable guarantees of local robustness without introducing overhead to the network's inferences. Finally, we will discuss some of the limitations of local robustness, demonstrating that in some contexts it is too stringent (we propose some natural relaxations), while in other ways it is insufficient to capture certain realistic forms of attacks.
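As a toy illustration of the Lipschitz-margin idea underlying such certificates, applied here to a plain linear classifier rather than the architecture from the talk (the `certify` function and its arguments are invented for this sketch):

```python
import numpy as np

# Toy local-robustness certificate for a LINEAR classifier: under any L2
# perturbation of norm <= eps, the gap between class j's and class i's logits
# can shrink by at most ||W_j - W_i|| * eps (Cauchy-Schwarz). If every gap
# exceeds that bound, the prediction provably cannot change.
def certify(W, b, x, eps):
    """Return True if the predicted class at x is provably stable
    under all L2 perturbations of norm <= eps."""
    logits = W @ x + b
    j = int(np.argmax(logits))
    for i in range(len(logits)):
        if i == j:
            continue
        margin = logits[j] - logits[i]
        if margin <= np.linalg.norm(W[j] - W[i]) * eps:
            return False                # gap could close: no certificate
    return True
```

The same margin test, generalized via a Lipschitz bound on the whole network, is what makes certification essentially free at inference time.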
|
Fri, Jan 28 |
Remote |
|
Considering Process in Algorithmic Bias Detection and Mitigation
Presented by Emily Black
https://cmu.zoom.us/j/94694048366?pwd=Zjl2VHcxdHFiK0VleDhjY0tIbTVCdz09
Artificial Intelligence (AI) systems now affect important decisions in people's lives, from the news articles they read to whether or not they receive a loan. While the use of AI may bring great accuracy and efficiency to these decisions, recent news and research reports have shown that AI models can act unfairly: from gender bias in hiring models to racial bias in recidivism prediction systems.
In this talk, I will discuss methods for understanding fairness issues in AI by considering the process by which models arrive at their decisions. This approach contrasts with a large portion of the AI fairness literature, which focuses on studying model outcomes alone. First, I will show how considering a model's end-to-end decision process allows us to expand our understanding of unfair behavior, such as in my work demonstrating how model instability can lead to unfairness by making important decisions rely on arbitrary modeling choices (e.g., whether or not a person is granted a loan may depend on whether some unrelated person happened to be in the training set). Second, I will discuss how considering process can help us find bias mitigation techniques that avoid a tradeoff between predictive utility and fairness, with a case study from my collaboration with the Stanford RegLab investigating tax auditing practices.
This talk is in partial fulfillment of the speaking requirement.
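The instability phenomenon mentioned above can be demonstrated with a deliberately simple, hypothetical model: a nearest-centroid "loan" classifier on one feature. Nothing here comes from the talk itself; the data and the flip point are constructed for illustration:

```python
import numpy as np

# A tiny nearest-centroid classifier: predict the class whose training
# centroid is closer to the applicant.
def predict(train_pts, train_labels, x):
    train_pts = np.asarray(train_pts, dtype=float)
    train_labels = np.asarray(train_labels)
    c0 = train_pts[train_labels == 0].mean(axis=0)
    c1 = train_pts[train_labels == 1].mean(axis=0)
    return 1 if np.linalg.norm(x - c1) < np.linalg.norm(x - c0) else 0

pts    = [[0.0], [1.0], [3.0], [10.0]]   # the last "approved" person is an outlier
labels = [0, 0, 1, 1]                    # 0 = denied, 1 = approved
x = np.array([2.5])                      # an unrelated new applicant

# Whether x is approved depends entirely on whether the outlier happened
# to be in the training set -- an arbitrary modeling circumstance.
with_outlier    = predict(pts, labels, x)          # class-1 centroid at 6.5
without_outlier = predict(pts[:3], labels[:3], x)  # class-1 centroid at 3.0
```

Removing one unrelated training point moves the decision boundary past the applicant, flipping denial into approval.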
|
Mon, Jan 31 |
GHC 6501 |
|
available |
Wed, Feb 02 |
Remote |
|
Back to Futures
Presented by Klaas Pruiksma
Concurrent (and parallel) programming offers a way to increase performance without relying on continual improvements to single-threaded performance. A wide variety of approaches to concurrent programming exist, but they are often dealt with in an ad hoc manner, making reasoning about the behavior of concurrent programs notoriously difficult. We present a formal explanation of futures, a particular concurrent programming construct, with foundations in logic. The close ties between this language's type system and logic allow for simpler reasoning about programs, and incidentally provide a better understanding of linear futures, which have been proposed in the context of algorithm design and analysis but never formally explored. Moreover, by adding a construct to enforce sequentiality, we recover a language that looks much like a standard functional language augmented with futures.
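For readers unfamiliar with the construct: a future is a handle to a computation that runs concurrently and is synchronized on demand. A minimal sketch using Python's standard library (the talk's setting is a logic-based calculus, not Python):

```python
from concurrent.futures import ThreadPoolExecutor

def fib(n):
    # A deliberately slow recursive computation to run in the background.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

with ThreadPoolExecutor() as pool:
    # Spawning a future returns immediately; the work proceeds concurrently.
    f1 = pool.submit(fib, 20)
    f2 = pool.submit(fib, 21)
    # Touching a future synchronizes: .result() blocks until the value exists.
    total = f1.result() + f2.result()
```

The type-theoretic treatment in the talk makes exactly this spawn/synchronize discipline part of the language's logic rather than a library convention.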
|
Fri, Feb 04 |
GHC 6501 |
|
available |
Mon, Feb 07 |
GHC 6501 |
|
available |
Fri, Feb 11 |
GHC 6501 |
|
available |
Mon, Feb 14 |
GHC 6501 |
|
available |
Mon, Feb 21 |
GHC 6501 |
|
available |
Fri, Mar 04 |
GHC 6501 |
|
available |
Mon, Mar 07 |
GHC 6501 |
|
unavailable |
Fri, Mar 11 |
GHC 6501 |
|
available |
Mon, Mar 14 |
GHC 6501 |
|
available |
Fri, Mar 18 |
GHC 6501 |
|
available |
Mon, Mar 21 |
GHC 6501 |
|
available |
Tue, Mar 22 |
GHC 6501 |
|
Enabling Context-sensitive Bandwidth-Efficient Visual Search on Edge Data
Presented by Shilpa Anna George
The ability to execute ad hoc search queries on freshly-captured visual data at the edge is valuable. A key challenge is the bandwidth-efficient assembly of large training sets for customizing the deep neural networks (DNNs) that express the queries.
In this talk, I will present Hawk, an interactive labeling system in which learning is transparent, recursive, and scalable. Its key insight is that even a weak model, learned on the fly, can be used to enrich a live sensor stream by pruning away irrelevant data. Over many rounds, Hawk's tight integration of data collection, inferencing, selective transmission, labeling, and training converges to an accurate model.
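A hypothetical sketch of one round of such a loop, with a trivially weak "model" that scores stream items by proximity to already-labeled positives. All names, the scoring rule, and the labeling oracle are invented for illustration; they are not Hawk's actual components:

```python
# Toy round of an interactive labeling loop: a weak model ranks a live
# stream, only the top items are transmitted (saving bandwidth), those get
# labeled, and the labels feed the next round's model.

def weak_score(item, positives):
    """Toy 'model': score items by closeness to the mean labeled positive."""
    if not positives:
        return 1.0                      # no model yet: everything is a candidate
    mean = sum(positives) / len(positives)
    return 1.0 / (1.0 + abs(item - mean))

def hawk_round(stream, labeled_pos, budget):
    """Rank the stream with the current weak model and transmit only the
    top-`budget` items; prune (drop) the rest at the edge."""
    ranked = sorted(stream, key=lambda it: weak_score(it, labeled_pos),
                    reverse=True)
    transmitted = ranked[:budget]
    # Labeling oracle (here: true positives are items near 5.0); the new
    # labels are folded back in to retrain the model for the next round.
    new_pos = [it for it in transmitted if abs(it - 5.0) < 1.0]
    return labeled_pos + new_pos, transmitted
```

Even this crude score concentrates the transmission budget on relevant items, which is the bandwidth-saving effect the abstract describes.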
|
Fri, Mar 25 |
GHC 6501 |
|
available |
Mon, Mar 28 |
GHC 6501 |
|
available |
Fri, Apr 01 |
GHC 6501 |
|
available |
Mon, Apr 04 |
GHC 6501 |
|
available |
Fri, Apr 08 |
GHC 6501 |
|
Two Studies on Peer Review: Citation Bias and arXiv Bias
Presented by Ivan Stelmakh
In this talk, we will discuss two studies (involving several thousand papers and reviewers) that scrutinize certain aspects of the peer-review process. In the first study, we investigate whether citing potential reviewers can boost the acceptance chances of a paper. For this, we intervene in the paper-reviewer assignment procedure and carefully analyze the outcomes of the review process. In the second study, we compare the pros and cons of releasing a paper on arXiv before it gets published in a conference or a journal. For this, we use surveys to estimate the expected visibility a paper gets on arXiv and the probability of the reviewer breaking the double-blind reviewing policy.
|
Mon, Apr 11 |
Remote |
|
Multi-point Queries on Concurrent Data Structures
Presented by Yuanhao Wei
https://cmu.zoom.us/j/92977908123?pwd=M2svSzQ0UXZxS1M3aUt2Y0lkM2ZOUT09
Most work on concurrent data structures has focused on supporting single-point operations, such as insert, delete, and lookup, but many applications require supporting these alongside multi-point queries, such as searching for ranges of keys, finding the first key that matches some criteria, or checking if a collection of keys are all present. In this presentation, I'll cover a general technique for adding linearizable multi-point queries to existing concurrent data structures. This technique maintains the time bounds and progress properties (e.g. lock-freedom/wait-freedom) of the original data structure. Furthermore, multi-point queries written with this technique are wait-free and take time proportional to their sequential complexity plus a contention term representing the number of update operations concurrent with the query.
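As a much-simplified, lock-based stand-in for the idea of a consistent multi-point query (the talk's technique is lock-free and preserves wait-freedom; this seqlock-style sketch is invented only to show why a query must validate against concurrent updates):

```python
import threading

# A set that supports a linearizable range query by optimistic validation:
# read a version number, scan, and retry if any update slipped in between.
class VersionedSet:
    def __init__(self):
        self._keys = set()
        self._version = 0
        self._lock = threading.Lock()

    def insert(self, k):
        with self._lock:
            self._keys.add(k)
            self._version += 1

    def range_query(self, lo, hi):
        # Retry until a full scan completes with no concurrent update.
        # (Note: unlike the wait-free queries in the talk, this can starve
        # under heavy update traffic -- it is only a motivating sketch.)
        while True:
            v = self._version
            snapshot = sorted(k for k in list(self._keys) if lo <= k <= hi)
            if self._version == v:
                return snapshot
```

The point of the validation step is linearizability: without it, a scan overlapping an insert could return a key set that never existed at any single moment.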
|
Fri, Apr 15 |
Remote |
|
Improved quantum data analysis
Presented by Costin Badescu
Let p denote an unknown probability distribution on the set of integers [d] = {1, 2, ..., d} and let E1, ..., Em denote events, i.e., subsets of [d]. Consider the following problem: how many samples from p are necessary and sufficient to estimate the probabilities p(E1), ..., p(Em) to accuracy ε, as a function of m, d, and ε? The answer is known to be Θ(log(m)/ε²) samples.
The quantum version of this problem, where the distribution p is replaced by a quantum state and the events are replaced by binary quantum measurements, is known as "shadow tomography." Finding the sample complexity of shadow tomography is an open problem; currently, the best known lower bound matches the classical lower bound of Ω(log(m)/ε²), and the best known upper bound is O(log²(m)·log(d)/ε⁴).
In this talk, I will introduce the shadow tomography problem, explain its significance for quantum learning and estimation, and sketch the main ideas behind joint work with Ryan O'Donnell that improved the upper bound on the sample complexity of shadow tomography to O(log²(m)·log(d)/ε⁴).
No prior knowledge of quantum theory is needed.
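For context, the classical Θ(log(m)/ε²) upper bound follows from estimating each p(E_i) by its empirical frequency over n samples and combining Hoeffding's inequality with a union bound over the m events:

```latex
\Pr\bigl[\,|\hat p(E_i) - p(E_i)| > \varepsilon\,\bigr] \le 2e^{-2n\varepsilon^2}
\quad\Longrightarrow\quad
\Pr\bigl[\,\exists\, i:\ |\hat p(E_i) - p(E_i)| > \varepsilon\,\bigr] \le 2m\,e^{-2n\varepsilon^2}.
```

Setting the right-hand side to a constant failure probability δ gives n ≥ ln(2m/δ)/(2ε²), i.e., n = O(log(m)/ε²); a matching lower bound makes this tight.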
|
Mon, Apr 18 |
GHC 6501 |
|
available |
Fri, Apr 22 |
GHC 6501 |
|
available |
Mon, Apr 25 |
Remote |
|
Convertible Codes: Efficient Conversion of Coded Data in Distributed Storage
Presented by Francisco Jose Maturana Sanguineti
https://cmu.zoom.us/j/97615026876?pwd=VmViSTBrK2sxeFc0bkNHc0pGL0djQT09
Erasure codes are typically used in large-scale distributed storage systems to provide reliability against failures. In this setting, a set of k data chunks is encoded using an [n, k] code to generate n chunks that are then stored on different nodes. Recent work shows that the failure rates of storage devices vary significantly over time, and that changing the rate of the code (via a change in the parameters n and k) in response to such variations provides significant reduction in storage overhead. However, when using traditional codes, the overhead on CPU, disk IO, and network bandwidth of realizing such a change in code rate is prohibitively high. Motivated by this application, we present a new framework to formalize the notion of code conversion, i.e., the process of converting data encoded with an [n^I, k^I] code into data encoded with an [n^F, k^F] code while maintaining desired decodability properties. As part of this framework, we also introduce convertible codes, a new class of code pairs that allow for code conversions in a resource-efficient manner.
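A toy special case, invented here rather than taken from the talk, shows why conversion can be cheap when the code is linear: merging two single-parity [3, 2] stripes into one [5, 4] stripe only requires combining the two existing parities, with no need to re-read the data chunks:

```python
from functools import reduce

# Single-parity erasure coding: the parity chunk is the XOR of the data
# chunks, so any one lost chunk can be reconstructed from the others.
def encode_parity(data_chunks):
    return reduce(lambda a, b: a ^ b, data_chunks)

stripe1 = [0b1010, 0b0110]          # k = 2 data chunks of the first [3, 2] stripe
stripe2 = [0b1111, 0b0001]          # k = 2 data chunks of the second [3, 2] stripe
p1 = encode_parity(stripe1)
p2 = encode_parity(stripe2)

# Naive conversion to a [5, 4] stripe: re-read all 4 data chunks, re-encode.
p_naive = encode_parity(stripe1 + stripe2)
# Efficient conversion: XOR is linear, so the new parity is just p1 ^ p2 --
# the conversion reads 2 parity chunks instead of 4 data chunks.
p_fast = p1 ^ p2
```

Convertible codes generalize this kind of saving to arbitrary [n, k] MDS code pairs, where the efficient conversions are far less obvious than in this XOR toy case.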
|
Fri, Apr 29 |
GHC 6501 |
|
Optimal Resource Allocation for Parallelizable Jobs
Presented by Ben Berg
Modern computer systems allow resources to be dynamically allocated to parallelizable jobs. When a job is parallelized across many servers or cores, it will complete more quickly. However, jobs typically receive diminishing returns from being allocated additional resources. Hence, given a fixed number of cores, it is not obvious how to dynamically allocate cores to a stream of incoming jobs in order to minimize the overall mean response time across jobs.
For example, an optimal allocation policy must favor shorter jobs, but favoring any single job too heavily can cause the system to operate very inefficiently. Additionally, an optimal policy must decide how much, if at all, to favor more parallelizable jobs over less parallelizable jobs. In a variety of settings, we show how to derive an optimal allocation policy which minimizes mean response time. We then show that policies inspired by our theoretical results can be implemented in a modern database to reduce mean response time by a factor of 2.
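A tiny numerical illustration of why diminishing returns matter. The concave speedup function s(k) = k^0.5 and all numbers here are invented; the talk derives optimal policies in far more general settings:

```python
# Jobs run at speed s(k) = k**p on k cores, with p < 1 modeling
# diminishing returns from parallelism.
def response_times(sizes, alloc, p=0.5):
    """Completion time of each job under a fixed core split, assuming the
    jobs run simultaneously (ignoring reallocation at departures -- a
    deliberate simplification)."""
    return [size / (k ** p) for size, k in zip(sizes, alloc)]

sizes = [1.0, 1.0]                       # two equal-size jobs, 16 cores total
even  = response_times(sizes, [8, 8])    # each finishes in 1/sqrt(8) ~ 0.354
hoard = response_times(sizes, [15, 1])   # 1/sqrt(15) ~ 0.258, and 1.0

mean_even  = sum(even) / len(even)       # ~ 0.354
mean_hoard = sum(hoard) / len(hoard)     # ~ 0.629
```

Hoarding cores on one job barely speeds it up (concavity) while starving the other, so the even split has far lower mean response time; the interesting cases in the talk arise when jobs differ in size and parallelizability.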
|
Mon, May 02 |
GHC 6501 |
|
Learning-based Streaming Codes are Approximately Optimal for Variable-Size Frames
Presented by Michael Rudow
Providing a high quality-of-service for live communication is a pervasive challenge plagued by bursts of packet losses during transmission. Streaming codes are designed explicitly to recover lost packets in real-time for such low-latency streaming communication settings using minimal bandwidth. Applications such as live video streaming involve transmitting messages whose sizes vary over time due to compressing frames prior to transmission. The variability demands extra redundancy because consecutive large frames can be lost in a burst, leaving less bandwidth remaining for data. Mitigating the adverse effects of variability necessitates spreading the messages over multiple future packets. However, the optimal strategy for spreading depends on the sizes of future messages, which are not available due to streaming codes operating in an “online” setting. Algebraic coding techniques alone are hence insufficient for designing optimal streaming codes. We present a learning-based approach that combines machine learning with algebraic coding techniques to design the first approximately optimal streaming codes for a range of parameter regimes suitable for practical applications.
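To make the "spreading over future packets" idea concrete, here is a deliberately naive scheme invented for illustration (not the talk's construction): each packet echoes the message sent d packets earlier, so a burst of up to d losses is recovered within delay d. This doubles the bandwidth; real streaming codes use algebraic parities precisely to avoid that cost:

```python
# Toy delayed-repetition streaming scheme: packet i carries (m_i, m_{i-d}).
def transmit(messages, d=3):
    past = [None] * d + messages         # past[i] is the message sent d slots ago
    return [(m, past[i]) for i, m in enumerate(messages)]

def receive(packets, lost, d=3):
    """Decode the stream given the set of lost packet indices; a burst of
    up to d consecutive losses is repaired within delay d."""
    decoded = [None] * len(packets)
    for i, (m, old) in enumerate(packets):
        if i in lost:
            continue                     # this packet never arrived
        decoded[i] = m
        if old is not None and decoded[i - d] is None:
            decoded[i - d] = old         # recover m_{i-d} from the echo
    return decoded
```

The learning-based codes in the talk face the harder version of this problem, where message sizes vary and the right way to spread each message depends on future sizes the encoder cannot see.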
|
Fri, May 06 |
GHC 6501 |
|
available |
Mon, May 09 |
GHC 6501 |
|
available |
Fri, May 13 |
GHC 6501 |
|
available |
Mon, May 16 |
GHC 6501 |
|
available |
Fri, May 20 |
GHC 6501 |
|
available |
Mon, Aug 29 |
GHC 6501 |
|
available |
Fri, Sep 02 |
GHC 6501 |
|
available |
Mon, Sep 05 |
GHC 6501 |
|
available |
Fri, Sep 09 |
GHC 6501 |
|
available |
Mon, Sep 12 |
GHC 6501 |
|
available |
Fri, Sep 16 |
GHC 6501 |
|
available |
Mon, Sep 19 |
GHC 6501 |
|
available |
Fri, Sep 23 |
GHC 6501 |
|
available |
Mon, Sep 26 |
GHC 6501 |
|
available |
Fri, Sep 30 |
GHC 6501 |
|
available |
Mon, Oct 03 |
GHC 6501 |
|
available |
Fri, Oct 07 |
GHC 6501 |
|
available |
Mon, Oct 10 |
GHC 6501 |
|
available |
Fri, Oct 14 |
GHC 6501 |
|
available |
Mon, Oct 17 |
GHC 6501 |
|
available |
Fri, Oct 21 |
GHC 6501 |
|
unavailable |
Mon, Oct 24 |
GHC 6501 |
|
available |
Fri, Oct 28 |
GHC 6501 |
|
available |
Mon, Oct 31 |
GHC 6501 |
|
available |
Fri, Nov 04 |
GHC 6501 |
|
available |
Mon, Nov 07 |
GHC 6501 |
|
available |
Fri, Nov 11 |
GHC 6501 |
|
available |
Mon, Nov 14 |
GHC 6501 |
|
available |
Fri, Nov 18 |
GHC 6501 |
|
available |
Mon, Nov 21 |
GHC 6501 |
|
available |
Fri, Nov 25 |
GHC 6501 |
|
available |
Mon, Nov 28 |
GHC 6501 |
|
available |
Fri, Dec 02 |
GHC 6501 |
|
available |
Mon, Dec 05 |
GHC 6501 |
|
available |
Fri, Dec 09 |
GHC 6501 |
|
available |
Mon, Dec 12 |
GHC 6501 |
|
available |
Fri, Dec 16 |
GHC 6501 |
|
available |
Mon, Dec 19 |
GHC 6501 |
|
available |
Fri, Dec 23 |
GHC 6501 |
|
available |
General Info
The Student Seminar Series is an informal seminar
for SCS graduate students to communicate ideas.
Each meeting starts at noon and lasts 1 hour. Lunch is
provided by the Computer Science Department (thanks to
Debbie Cavlovich!). At each meeting, a different student
speaker will give an informal, 40-minute talk about their research,
followed by questions/suggestions/brainstorming. We try to attract
people with a diverse set of interests, and encourage speakers to
present at a very general, accessible level.
So why are we doing this, and why take part? In the best-case
scenario, this will lead to some interesting cross-disciplinary work
among people in different fields, and people may get some new ideas
about their research. In the worst-case scenario, a few people will
practice their public speaking and the rest will get together for a
free lunch.
Previous years
2021Fall,
2021Spring,
2020Fall,
2020Spring,
2019Fall,
2019Spring,
2018Fall,
2018Spring,
2017Fall,
2017Spring,
2016Fall,
2016Spring,
2015Fall,
2015Spring,
2014Fall,
2014Spring,
2013Fall,
2013Spring,
2012Fall,
2012Spring,
2011Fall,
2011Spring,
2010Fall,
2010Spring,
2009Fall,
2009Spring,
2008Fall,
2008Spring,
2007Fall,
2007Spring,
2006Fall,
2006Spring,
2005Fall,
2005Spring,
2004Fall,
2004Spring,
2003Fall,
2003Spring,
2002Fall,
2002Spring,
2001Fall,
2001Spring,
2000Fall,
2000Spring,
1999Fall,
1999Spring,
1998Fall,
1998Spring,
SSS Coordinators
Juncheng Yang, CSD