Ph.D., Computer Science Department
Carnegie Mellon University
I'm now at Google Research. I received a Ph.D. from the Computer Science Department in the School of Computer Science at Carnegie Mellon University. My Ph.D. adviser was William Cohen.
I am interested in the intersection of Natural Language Processing, Information Retrieval, and Machine Learning. My research experience includes the following topics:
grounded language learning, learning semantic relations, topic models, mining software repositories and software-focused corpora, bootstrapping on biomedical ontologies, knowledge base population, bootstrap learning and semantic drift, seed set refinement, text alignment with Hidden Markov Models, social media analysis, and computational biology.
Before coming to CMU, I got my M.Sc. and B.Sc. degrees in the Computer Science and Computational Biology program at the School of Computer Science and Engineering of The Hebrew University of Jerusalem. During that time, I did research at the Furman Lab (Dept. of Molecular Genetics and Biotechnology), and my adviser was Prof. Ora Schueler-Furman. In this group, we used computational methods to understand protein-protein interactions from a structural bioinformatics perspective. More specifically, we made predictions of the structural changes that take place in proteins during docking.
Apart from doing research, I had a chance to get some great industry experience working for IBM, Facebook, and Google.
You can find my full research and work history in my CV.
Two of the things I enjoy most are hiking and traveling around the world. So far one of my favorite hiking locations has been New Zealand, and I plan to return! I have an awesome husband, who was also a CSD PhD student at CMU.
PhD Thesis: Grounded Knowledge Bases for Scientific Domains
Dana Movshovitz-Attias, August 2015
Committee: William Cohen, Tom Mitchell, Roni Rosenfeld, Alon Halevy
[pdf]
[Thesis oral presentation]
KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts
Dana Movshovitz-Attias and William Cohen, 2015, Association for Computational Linguistics (ACL)
[pdf]
[data]
[ACL presentation]
[bibtex]
Discovering Subsumption Relationships for Web-Based Ontologies
Dana Movshovitz-Attias, Steven Euijong Whang, Natalya Noy, and Alon Halevy, 2015,
Proc. 18th International Workshop on the Web and Databases (WebDB) at ACM SIGMOD
Winner of the WebDB Best Paper Award.
[pdf]
[WebDB presentation]
[bibtex]
Grounded Discovery of Coordinate Term Relationships between Software Entities
Dana Movshovitz-Attias and William Cohen, 2015, arXiv preprint arXiv:1505.00277
[pdf]
[arXiv link]
[bibtex]
Natural Language Models for Predicting Programming Comments
Dana Movshovitz-Attias and William Cohen, 2013, Association for Computational Linguistics (ACL)
[pdf]
[corpus]
[code (as Eclipse plugin)]
[ACL presentation]
[bibtex]
Analysis of the Reputation System and User Contributions on a Question Answering Website: StackOverflow
Dana Movshovitz-Attias*, Yair Movshovitz-Attias*, Peter Steenkiste and Christos Faloutsos, 2013, ASONAM
[pdf]
[bibtex]
Alignment-HMM-based Extraction of Abbreviations from Biomedical Text
Dana Movshovitz-Attias and William Cohen, 2012, BioNLP in NAACL
[pdf]
[github code (within the second-string package)]
[code description and downloadable data]
[abbreviations extracted from PubMed]
[BioNLP presentation]
[bibtex]
Bootstrapping Biomedical Ontologies for Scientific Text using NELL
Dana Movshovitz-Attias and William Cohen, 2012, BioNLP in NAACL
[pdf]
[tech report]
[BioNLP presentation]
[bibtex]
Detection of Peptide‐Binding Sites on Protein Surfaces: The First Step Towards the Modeling and Targeting of Peptide‐Mediated Interactions
Assaf Lavi, Chi Ho Ngan, Dana Movshovitz‐Attias, Tanggis Bohnuud, Christine Yueh, Dmitri Beglov, Ora Schueler‐Furman, Dima Kozakov, 2013, Proteins: Structure, Function and Bioinformatics
[pdf]
[bibtex]
Can Self-Inhibitory Peptides Be Derived from the Interfaces of Globular Protein-Protein Interactions?
Nir London, Barak Raveh, Dana Movshovitz-Attias and Ora Schueler-Furman, 2010, Proteins: Structure, Function and Bioinformatics
[pubmed]
[bibtex]
On The Use of Structural Templates for High-Resolution Docking
Dana Movshovitz-Attias, Nir London and Ora Schueler-Furman, 2010, Proteins: Structure, Function and Bioinformatics
[pdf]
[pubmed]
[bibtex]
Poster presented at the 11th Israeli Bioinformatics Symposium at Tel-Aviv University, Israel, 4/2008.
The Structural Basis of Peptide-Protein Binding Strategies
Nir London, Dana Movshovitz-Attias and Ora Schueler-Furman, 2010, Structure
[pdf]
[pubmed]
[bibtex]
Poster presented at the 12th Israeli Bioinformatics Symposium at Weizmann Institute, Israel, 4/2009.
Code, tools, and research-related data.
If you have questions about this content, or if there is other data you would like to use, please contact me at: dma [at] cs.cmu.edu
Dataset based on StackOverflow that was used to train the KB-LDA model from our ACL2015 paper.
This Eclipse plugin enables word completion within comments based on an n-gram model trained on multiple open-source Java projects and data from StackOverflow. Word completion works in a similar way to the code completion tools built into standard code editors. While writing a comment, you will be prompted with suggestions based on the implementation of the class you are currently commenting.
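As a rough illustration of the idea (not the plugin's actual code), here is a minimal Python sketch of n-gram word completion. It assumes a plain bigram model with raw frequency ranking and a tiny made-up corpus; the function names `train_bigram_model` and `suggest` are hypothetical, and the real plugin is trained on far larger data and also takes the class being commented into account.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # "<s>" marks the start of a comment so the first word has a context.
        tokens = ["<s>"] + sentence.lower().split()
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts


def suggest(counts, previous_word, prefix="", k=3):
    """Return up to k words that most often follow previous_word and start with prefix."""
    candidates = counts.get(previous_word.lower(), Counter())
    matches = [(w, c) for w, c in candidates.items() if w.startswith(prefix.lower())]
    # Most frequent first; ties are broken alphabetically for stable output.
    matches.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in matches[:k]]


# Toy training data standing in for comments from real projects (illustrative only).
corpus = [
    "returns the number of elements in this list",
    "returns the element at the specified position",
    "removes the element at the specified index",
]
model = train_bigram_model(corpus)

# The user has typed "... the e" inside a comment.
print(suggest(model, "the", prefix="e"))  # → ['element']
```

A longer context (trigrams or more) and smoothing for unseen word pairs would be natural extensions of this sketch.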
This is an abbreviation extractor based on a Hidden Markov Model. With this code you can extract abbreviations and their definitions from a text corpus. The Abbreviation Alignment HMM code is a part of the second-string open source package.
com.wcohen.ss.AbbreviationAlignment is an implementation of the abbreviation alignment metric.
com.wcohen.ss.expt.ExtractAbbreviations is a utility for extracting abbreviations from a text corpus using our method.
Courses and TA experience