Alumni Profile: Kubica's Books Explore Computer Science the Fun Way

Susie Cribbs | Tuesday, January 17, 2023

SCS alum Jeremy Kubica hopes to teach people with computer programming experience more theoretical aspects of the field through amusing examples in his new book, "Data Structures the Fun Way."

Jeremy Kubica likes coffee, but probably not as much as you would think from the title of his latest book, "Data Structures the Fun Way: An Amusing Adventure With Coffee-Filled Examples." The truth is that coffee provides a host of fun cases that explain the inner workings of data structures, the specialized formats for organizing, processing, retrieving and storing data so that it can be accessed efficiently.

"Originally I was looking at a broad range of examples, and I kept finding good ones in terms of coffee," said Kubica, who earned his Ph.D. in 2005 from the School of Computer Science's (SCS) Auton Lab. "One of the more absurd parts of the book is, assuming you have an entire pantry filled with thousands and thousands of coffee blends and you want to find that one type of coffee you want right now, how do you do it very quickly? Especially first thing in the morning when you know you just want the coffee?" (The answer: structuring your storage to make the search easy.)
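As a rough illustration of that answer, here is a minimal sketch in Python. It is not taken from Kubica's book; the blend names and the helper functions `build_pantry` and `find_blend` are hypothetical. It assumes the pantry is sorted by name once, so that each later lookup can use binary search instead of checking every shelf.

```python
from bisect import bisect_left


def build_pantry(blends):
    """Sort the blend names once so later lookups can use binary search."""
    return sorted(blends)


def find_blend(pantry, name):
    """Return the index of name in the sorted pantry, or -1 if it is absent."""
    i = bisect_left(pantry, name)  # leftmost position where name could sit
    if i < len(pantry) and pantry[i] == name:
        return i
    return -1


# Hypothetical blend names, used only for illustration.
pantry = build_pantry(["Sumatra", "House Blend", "Kona", "Espresso Roast"])
print(find_blend(pantry, "Kona"))   # index of "Kona" in the sorted pantry
print(find_blend(pantry, "Decaf"))  # -1, because "Decaf" is not stocked
```

With thousands of blends, a sorted pantry needs only a dozen or so comparisons per search, while an unsorted one may require looking at every blend.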

Kubica's book, published this past November by No Starch Press, aims to teach people with computer programming experience more theoretical aspects of the field. "Maybe it's high school students who have taken AP computer science, or people who have learned to program through bootcamps and online tutorials," he said. "Anybody who knows the basics and can program but hasn't really looked at how to use data structures daily and what they really mean."

Kubica is no stranger to distilling difficult concepts into amusing examples. It's been his hobby since high school, when he would craft short stories to help friends struggling with AP chemistry and physics. While he got away from writing during college and grad school, he eventually picked it back up and published a series of books that use fiction to explore topics in computer science. The first, "Computational Fairy Tales," grew out of his blog of the same name and was followed by "Best Practices of Spell Design" and "The CS Detective."

"Data Structures the Fun Way" marks Kubica's first foray into nonfiction intended for a slightly more sophisticated audience, and he certainly has the professional chops to inform his writing. While a student at CMU, Kubica studied data structures with Andrew Moore — an SCS professor who would go on to lead the school as dean from 2014 to 2018. Specifically, Kubica's thesis research investigated how data structures can speed up data mining for astronomical surveys.

"I was looking at the problem of, given a bunch of images of the night sky, how do we tell which of the dots there are actually asteroids and connect enough together to estimate an initial velocity and start using traditional algorithms to track them," he said. "It was essentially very early stage asteroid discovery."

Kubica completed his Ph.D. work around the time that Google opened its Pittsburgh office, and a former labmate recruited him to the office, which, coincidentally, Moore led. He spent 15 years at Google Pittsburgh: 12 using machine learning to solve problems in the company's ads-quality efforts, and three in the Cloud AI group, focused on solving novel machine learning problems for external customers.

Less than a year ago, Kubica left Google Pittsburgh to undertake a project closely related to the Ph.D. thesis that started it all: applying machine learning to astronomical surveys. Kubica returned to CMU last April to lead the LINCC Frameworks engineering team, a part of the Legacy Survey of Space and Time (LSST) Interdisciplinary Network for Collaboration and Computing (LINCC) effort, with CMU Physics Professor Rachel Mandelbaum and the University of Washington's Andy Connolly (who served on Kubica's thesis committee). The multiyear collaboration will create new software platforms to analyze large astronomical datasets generated by the upcoming LSST that will be carried out by the Vera C. Rubin Observatory in northern Chile.

"The way science traditionally has worked, and this has changed recently with some projects and open-source collaborations, has been that you get funding for these large telescopes, the infrastructure and the data-processing pipelines to produce catalogs and processed images. But then each individual lab ends up writing some of its own analysis algorithms," Kubica said. "Our team brings in software engineers with industry experience to apply an industry development model to astronomical software. Our goal is to build out open-source, general purpose, scalable software that can work on this stream of data coming from Rubin that will help universities across the world."

Kubica hopes the team's efforts enable science in powerful ways and support discoveries that people all over the world will see. He also has big dreams for "Data Structures the Fun Way."

"This book applies to a broad audience, so I have some pretty high hopes for it. I wouldn't be able to make this a full-time job, though," he said with a laugh. "This is something I do because it's a lot of fun, and I honestly hope that a couple of students will read it and say, 'Wow, this is really interesting' or 'I'm looking at this in a different way. I understand it now.'"

Kubica notes that CMU definitely inspired his side gig as an author. 

"One of the things that I saw during my time at CMU was teachers explaining things in novel ways, and that really resonated with me. I think that was a big factor for this particular book. CMU is where I gained an appreciation for the power and impact of data structures," he said. "I learned them in undergrad, but it was when I got to graduate school that I started trying to develop novel algorithms on them. I started to see where they break down and understand them at a different level. And that's what drove this book. Because I really want to try to capture some of that understanding and share it with other people."

And while Kubica counts his grad school morning routine of entering Newell-Simon Hall, chatting with colleagues and getting coffee in the kitchenette among his favorite memories, don't be concerned about his coffee consumption.

"I have one cup a day," he said. "Maybe two."

For More Information

Aaron Aupperlee | 412-268-9068 | aaupperlee@cmu.edu