Hafeezul Rahman
Flipkart India Private Limited; Bangalore, India; August 17, 2018.
As a research intern, I developed a software tool that tracks the provenance of every entity in an R script directly at runtime. Provided along with it is a visualization extension for viewing the data provenance graphs, which can be used for debugging R scripts and comparing different R script execution flows.
Anonymization of information (specifically, personally identifiable information) is a serious concern, yet it comes at a cost: anonymized data cannot be used for data analytics to the fullest extent possible. I worked on developing data anonymization techniques to deploy a practical anonymizing data analytics system that tries to provide rich analytics while ensuring user anonymity. The technology underpinning the system includes trusted computing, query sandboxing, and anonymization mechanisms, all built on a scalable MapReduce framework coupled with PostgreSQL.
Traffic impact prediction refers to predicting, in real time, the short- and long-term traffic resulting from an impact on a road segment. I worked on designing and implementing the Cooperative Contextual Bandit algorithm in such a framework to study traffic conditions when an accident (or any other traffic impact, such as a roadblock) happens on LA streets.
Graduate Student
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
Email: hmohamma@cs.cmu.edu
Publications
Invited Talk
Research Internships
End to End Data Provenance Capture
May 2016 - July 2016
Harvard John A. Paulson School of Engineering and Applied Sciences, Cambridge, MA, USA
Practical Anonymizing Data Analytics
May 2015 - July 2015
Max Planck Institute for Software Systems, Kaiserslautern, Germany
Data Mining & Machine Learning in Traffic Prediction
May 2014 - July 2014
University of Southern California, Los Angeles, USA