This page is outdated. Please check my current page instead.
Vitor R. Carvalho
contact: gmail.com@vitordecarvalho
I'm a Lead Research Scientist/Manager at Snapchat Research. I live in San Diego, CA.
My PhD thesis advisor was the ingenious William W. Cohen. I have worked at Microsoft, Qualcomm Research, Inome and Ericsson R&D.
I'm interested in applied research interfacing Machine Learning, Natural Language Processing, Information Retrieval, Text Mining, Data Mining and AI.
Writing and Publications:
A few recent papers:
IJCAI 2017, Exploring Personalized Neural Conversational Models
GIS 2012 (GIBDA), Geocoding Billions of Addresses: Toward a Spatial Record Linkage System with Big Data
NAACL 2012, The Intelius Nickname Collection: Quantitative Analyses from Billions of Public Records (the data is available through the LDC)
VLDB-2011 (QDB), The Case for Cost-Sensitive and Easy-To-Interpret Models in Industrial Record Linkage
ECIR-2011, An Analysis of Time-Instability in Web Search Results
SIGIR Forum 2011, Crowdsourcing for Search and Data Mining
CIKM 2010, Online Stratified Sampling: Evaluating Classifiers at Web-Scale
SIGIR CSE-2010, Proceedings of the SIGIR 2010 Workshop on Crowdsourcing for Search Evaluation
SIGIR 2010 FGSIR Workshop, Online Feature Selection for Information Retrieval
SIGIR 2010, Exploring Reductions in Long Web Queries
SIGIR 2010, Predicting Query Performance on the Web
SIGIR 2009, Reducing Long Queries Using Query Quality Predictors
CEAS 2009, Information Leaks and Suggestions: a Case Study using Mozilla Thunderbird
CIKM 2008, Suppressing Outliers in Pairwise Preference Ranking
AAAI WS-08-04, Proceedings of the AAAI 2008 EMAIL Workshop
SIGIR-2008 LR4IR, A Meta-Learning Approach for Robust Rank Learning
AAAI-2008 EMAIL Workshop, CutOnce - Recipient Recommendation and Leak Detection in Action
ECIR-2008, Ranking Users for Intelligent Message Addressing
WSDM-2008, Fast Learning of Document Ranking Functions with the Committee Perceptron
Some older publications you may be looking for:
All older publications
All Publications:
Software:
Ciranda - Java package for email-speech-act prediction
Jangada - Java package for extraction of signatures (sig files) and reply-to (quotes) lines in email messages
Cut Once - A Mozilla Thunderbird plug-in for email information leak prevention and recipient recommendation
I contribute to Minorthird, a package for text learning, classification, extraction and annotations
Datasets:
Other Stuff (may be outdated):
My "academic lineage" tracing back all the way to Leibniz, James Clerk Maxwell, Poisson, Lagrange, Bernoulli and Euler (compiled by William Cohen)
I've recently organized the Workshop on Crowdsourcing for Search and Data Mining at ACM WSDM 2011 with Matt Lease and Emine Yilmaz.
I organized the SIGIR 2010 Workshop on "Crowdsourcing for Search Evaluation" with Matt Lease and Emine Yilmaz.
A few recent program committees: SIGIR-2011, AAAI HCOMP 2011, NextMail-2011, CEAS-2010, NAACL-ACL 2010 Young Investigators, EMNLP-09, IEEE CEC-09 (E3C), CEAS-09, IJCAI-09, ICML-09, COLING/ACL-06, AAAI-07, CEAS-05-06-07-08, WWW-08
I organized, with Mark Dredze and Tessa Lau, EMAIL-2008: the AAAI-2008 Workshop on Enhanced Messaging
Check out our new IR group blog... Probably Irrelevant
I used to organize the CMU Machine Learning Lunch and the CMU Information Retrieval Discussion Series
I was a TA in the Machine Learning (10-601) course during Fall 2007
I was the TA for the Information Extraction course (MLD 10-707 and LTI 11-748) during Spring 2007
Alma Matres: CMU-SCS-LTI, UNICAMP-FEEC, UFPE, Colegio Diocesano - Teresina
When in Pittsburgh, check out our radio program on WRCT (88.3 FM)