Relevant Paper(s): N/A
Abstract: My students and I often find ourselves acting as "subject matter experts" who need to create models for big data computer graphics and video analysis applications. Yet it is frustrating that a capable grad student, armed with a large unlabeled image/video collection, a palette of modern pre-trained models, and an idea of what novel object or event they want to detect, still requires days to weeks to create good models for their task. In this talk, I will discuss challenges we've faced carrying out the iterative process of data curation, model training, and model validation for the specific case of rare events and categories in image and video collections (such as professional broadcast sports and cable TV). Our ultimate goal (not yet achieved) is to create training techniques and data selection interfaces that enable interactive, grad-student-in-the-loop workflows in which the human expert works concurrently with massive amounts of parallel processing to continuously perform cycles of data acquisition, training, and validation.
Bio: Kayvon Fatahalian is an Assistant Professor in the Computer Science Department at Stanford University. His lab works on visual computing systems projects, including high-performance rendering for RL, large-scale video analytics, programming systems for video data mining, and compilation techniques for optimizing image processing pipelines. In all these efforts, the goal is to enable rapid development of applications that involve image and video processing at scale.