lewis_evaluating_1991 inproceedings Evaluating text categorization 312--318 1991 Proceedings of Speech and Natural Language Workshop Morgan Kaufmann 1991 scholkopf_new_2000 book New Support Vector Algorithms 1207--1245 12 2000 {MIT} Press 2000 popescu_extracting_2005 inproceedings Consumers are often forced to wade through many on-line reviews in order to make an informed product choice. This paper introduces Opine, an unsupervised information-extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products. Compared to previous work, Opine achieves 22\% higher precision (with only 3\% lower recall) on the feature extraction task. Opine's novel use of relaxation labeling for finding the semantic orientation of words in context leads to strong performance on the tasks of finding opinion phrases and their polarity. Extracting product features and opinions from reviews information extraction opinion mining product features sentiment analysis Vancouver, British Columbia, Canada 339--346 2005 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing http://portal.acm.org/citation.cfm?id=1220575.1220618 Association for Computational Linguistics 2005 etzioni_unsupervised_2005 article Artificial Intelligence Unsupervised named-entity extraction from the Web: An experimental study machine learning ner unsupervised web 91--134 165 2005 1 2005 al-mubaid_context-based_2005 inproceedings This paper presents a new context-based method for automatic detection and extraction of similar and related words from texts. Finding similar words is a very important task for many {NLP} applications including anaphora resolution, document retrieval, text segmentation, and text summarization. Here we use word similarity to improve search quality for search engines in (general and) specific domains.
Our method is based on rules for extracting the words in the neighborhood of a target word, then connecting this with the surroundings of other occurrences of the same word in the (training) text corpus. This is an on-going work, and is still under extensive testing. The preliminary results, however, are promising and encouraging more work in this direction. Context-based similar words detection and its application in specialized search engines information retrieval similarity detection synonymy San Diego, California, {USA} 260--262 1-58113-894-6 2005 Proceedings of the 10th international conference on Intelligent user interfaces 10.1145/1040830.1040890 http://portal.acm.org/citation.cfm?id=1040890 {ACM} 2005 widdows_semantic_2008 inproceedings Semantic Vectors: A Scalable Open Source Package and Online Technology Management Application open source semantics wordspace 2008 Proceedings of the sixth international conference on Language Resources and Evaluation {(LREC} 2008) 2008 turney_measuring_2005 inproceedings Measuring Semantic Similarity by Latent Relational Analysis latent relational analysis machine learning semantic similarity semantics 1136 19 2005 International Joint Conference on Artificial Intelligence Lawrence Erlbaum Associates Ltd 2005 turney_similarity_2006 article cs/0608100 There are at least two kinds of similarity. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason:stone is analogous to the pair carpenter:wood. This paper introduces Latent Relational Analysis {(LRA),} a method for measuring relational similarity. 
{LRA} has potential applications in many areas, including information extraction, word sense disambiguation, and information retrieval. Recently the Vector Space Model {(VSM)} of information retrieval has been adapted to measuring relational similarity, achieving a score of 47\% on a collection of 374 college-level multiple-choice word analogy questions. In the {VSM} approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. {LRA} extends the {VSM} approach in three ways: (1) the patterns are derived automatically from the corpus, (2) the Singular Value Decomposition {(SVD)} is used to smooth the frequency data, and (3) automatically generated synonyms are used to explore variations of the word pairs. {LRA} achieves 56\% on the 374 analogy questions, statistically equivalent to the average human score of 57\%. On the related problem of classifying semantic relations, {LRA} achieves similar gains over the {VSM.} Similarity of Semantic Relations semantic similarity semantics August 2006 Computational Linguistics, (2006), 32(3), 379-416 http://arxiv.org/abs/cs/0608100 2006-08 landauer_solution_1997 article Psychological Review A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge clustering latent semantic analysis linear algebra machine learning 211--240 104 1997 1997 berry_using_1995 article {SIAM} Review Using linear algebra for intelligent information retrieval 573--595 37 1995 10.1.1.34.9579 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.9579 1995 chamberlain_phrase_2008 inproceedings Phrase Detectives: A Web-based Collaborative Annotation Game anaphora resolution games human computation phrases 2008 Proceedings of the International Conference on Semantic Systems {(I-Semantics} '08), Graz.
Forthcoming 2008 aickelin_affinity_2004 article Proceedings of the 5th International Conference on Recent Advances in Soft Computing {(RASC} 2004), Nottingham, {UK} We combine Artificial Immune Systems {'AIS'} technology with Collaborative Filtering {'CF'} and use it to build a movie recommendation system. We already know that Artificial Immune Systems work well as movie recommenders from previous work by Cayzer and Aickelin. Here our aim is to investigate the effect of different affinity measure algorithms for the {AIS.} Two different affinity measures, Kendall's Tau and Weighted Kappa, are used to calculate the correlation coefficients for the movie recommender. We compare the results with those published previously and show that Weighted Kappa is more suitable than others for movie problems. We also show that {AIS} are generally robust movie recommenders and that, as long as a suitable affinity measure is chosen, results are good. On Affinity Measures for Artificial Immune System Movie Recommenders Computer Science - Artificial Intelligence Computer Science - Computers and Society Computer Science - Neural and Evolutionary Computing 2004 http://arxiv.org/abs/0801.4307 2004 hockenmaier_creatingccgbank_2006 article Proc. of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics Creating a {CCGbank} and a wide-coverage {CCG} lexicon for German 505--512 2006 2006 kay_what_1984 article American Anthropologist What Is the {Sapir-Whorf} Hypothesis?
linguistics sapir-whorf 65--79 86 1984 1 1984 liu_syntactic_2005 article Proceedings of {ACL} Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization Syntactic Features for Evaluation of Machine Translation machine translation mt eval syntax 2005 2005 hu_mining_2004 inproceedings Mining Opinion Features in Customer Reviews opinion mining sentiment analysis 755--760 2004 {AAAI} '04 2004 zhuang_movie_2006 inproceedings With the flourish of the Web, online review is becoming a more and more useful and important information resource for people. As a result, automatic review mining and summarization has become a hot research topic recently. Different from traditional text summarization, review mining and summarization aims at extracting the features on which the reviewers express their opinions and determining whether the opinions are positive or negative. In this paper, we focus on a specific domain - movie review. A multi-knowledge based approach is proposed, which integrates {WordNet,} statistical analysis and movie knowledge. The experimental results show the effectiveness of the proposed approach in movie review mining and summarization. 
Movie review mining and summarization review mining summarization Arlington, Virginia, {USA} 43--50 1-59593-433-2 2006 10.1145/1183614.1183625 http://portal.acm.org/citation.cfm?id=1183614.1183625 {ACM} 2006 lavie_significance_2004 article Proceedings of the 6th Conference of the Association for Machine Translation in the Americas The Significance of Recall in Automatic Metrics for {MT} Evaluation machine translation mt eval recall 2004 2004 mehay_bleuatre:_2007 article Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation {(TMI)} {BLEUATRE:} Flattening Syntactic Dependencies for {MT} Evaluation dependency grammar machine translation mt eval syntax 2007 2007 kamps_using_2004 article Proceedings of the 4th International Conference on Language Resources and Evaluation {(LREC} 2004) Using {WordNet} to measure semantic orientation of adjectives 1115--1118 4 2004 2004 hu_mining_2004-1 inproceedings Mining and Summarizing Customer Reviews opinion mining sentiment analysis 2004 Proceedings of the Tenth {ACM} {SIGKDD} International Conference on Knowledge Discovery and Data Mining 2004 gimenez_iqmt:framework_2006 article Proceedings of the 5th {LREC} {IQMT:} A Framework for Automatic Machine Translation Evaluation machine translation mt eval 2006 2006 amig_mt_2006 article Proceedings of the {COLING/ACL} on Main conference poster sessions {MT} evaluation: human-like vs. human acceptable machine translation mt eval 17--24 2006 2006 owczarzak_labelled_2007 article Proceedings of the Second Workshop on Statistical Machine Translation Labelled Dependencies in Machine Translation Evaluation papineni_bleu:method_2001 inproceedings Human evaluations of machine translation are extensive but expensive. Human evaluations can take months to finish and involve human labor that can not be reused.
We propose a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run. We present this method as an automated understudy to skilled human judges which substitutes for them when there is need for quick or frequent evaluations. {BLEU:} a method for automatic evaluation of machine translation bleu machine translation mt eval Philadelphia, Pennsylvania 311--318 2001 http://portal.acm.org/citation.cfm?id=1073083.1073135 Association for Computational Linguistics 2001 matsumoto_sentiment_2005 inbook Document sentiment classification is a task to classify a document according to the positive or negative polarity of its opinion (favorable or unfavorable). We propose using syntactic relations between words in sentences for document sentiment classification. Specifically, we use text mining techniques to extract frequent word sub-sequences and dependency sub-trees from sentences in a document dataset and use them as features of support vector machines. In experiments on movie review datasets, our classifiers obtained the best results yet published using these data. Sentiment Classification Using Word Sub-sequences and Dependency Sub-trees machine learning opinion mining sentiment classification svms 301--311 2005 Advances in Knowledge Discovery and Data Mining http://dx.doi.org/10.1007/11430919_37 2005 albrecht_re-examination_2007 article Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics {(ACL-2007)} A Re-examination of Machine Learning Approaches for {Sentence-Level} {MT} Evaluation machine learning machine translation mt svms 2007 2007 dave_miningpeanut_2003 inproceedings The web contains a wealth of product reviews, but sifting through them is a daunting task. 
Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). We begin by identifying the unique properties of this problem and develop a method for automatically distinguishing between positive and negative reviews. Our classifier draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation. The best methods work as well as or better than traditional machine learning. When operating on individual sentences collected from web searches, performance is limited due to noise and ambiguity. But in the context of a complete web-based tool and aided by a simple method for grouping sentences into attributes, the results are qualitatively quite useful. Mining the peanut gallery: opinion extraction and semantic classification of product reviews opinion mining sentiment analysis Budapest, Hungary 519--528 1-58113-680-3 2003 Proceedings of the 12th international conference on World Wide Web 10.1145/775152.775226 http://portal.acm.org/citation.cfm?id=775152.775226&type=series {ACM} 2003 pang_sentimental_2004 article Proceedings of the {ACL} A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts opinion mining sentiment analysis 271--278 2004 2004 lavrenko_relevance_2001 inproceedings We explore the relation between classical probabilistic models of information retrieval and the emerging language modeling approaches. It has long been recognized that the primary obstacle to effective performance of classical models is the need to estimate a relevance model: probabilities of words in the relevant class. We propose a novel technique for estimating these probabilities using the query alone.
We demonstrate that our technique can produce highly accurate relevance models, addressing important notions of synonymy and polysemy. Our experiments show relevance models outperforming baseline language modeling systems on {TREC} retrieval and {TDT} tracking tasks. The main contribution of this work is an effective formal method for estimating a relevance model with no training data. Relevance based language models information retrieval language models relevance New Orleans, Louisiana, United States 120--127 1-58113-331-6 2001 Proceedings of the 24th annual international {ACM} {SIGIR} conference on Research and development in information retrieval 10.1145/383952.383972 http://portal.acm.org/citation.cfm?id=383972 {ACM} 2001 lee_opinion_2008 inproceedings As people leave on the Web their opinions on products and services they have used, it has become important to develop methods of (semi-)automatically classifying and gauging them. The task of analyzing such data, collectively called customer feedback data, is known as opinion mining. Opinion mining consists of several steps, and multiple techniques have been proposed for each step. In this paper, we survey and analyze various techniques that have been developed for the key tasks of opinion mining. On the basis of our survey and analysis of the techniques, we provide an overall picture of what is involved in developing a software system for opinion mining. 
Opinion mining of customer feedback data on the web opinion mining opinion summarization sentiment analysis Suwon, Korea 230--235 978-1-59593-993-7 2008 Proceedings of the 2nd international conference on Ubiquitous information management and communication 10.1145/1352793.1352842 http://portal.acm.org/citation.cfm?id=1352793.1352842 {ACM} 2008 koehn_manual_2006 article Proceedings of the Workshop on Statistical Machine Translation Manual and automatic evaluation of machine translation between European languages machine translation mt eval 102--121 2006 2006 liu_integrating_1998 article Knowledge Discovery and Data Mining Integrating Classification and Association Rule Mining association rules machine learning 80--86 1998 1998 combinations_heterogeneous_???? article Heterogeneous Automatic {MT} Evaluation Through {Non-Parametric} Metric Combinations machine translation mt eval lavie_meteor:automatic_2007 article Proceedings of Workshop on Statistical Machine Translation {(WMT)} at the 45th Annual Meeting of the Association of Computational Linguistics {(ACL-07)} Meteor: An Automatic Metric for {MT} Evaluation with High Levels of Correlation with Human Judgments machine translation meteor mt eval 2007 2007 fahrni_old_2008 inproceedings Old Wine or Warm Beer: {Target-Specific} Sentiment Analysis of Adjectives opinion mining sentiment analysis University of Aberdeen, Aberdeen, Scotland 2008 Symposium on Affective Language in Human and Machine 2008 bell_scalable_????
article Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights owczarzak_dependency-based_2007 article Proceedings of {SSST,} {NAACL-HLT/AMTA} Workshop on Syntax and Structure in Statistical Translation {Dependency-Based} Automatic Evaluation for Machine Translation dependency grammar machine translation mt eval 2007 2007 chklovski_buildingsense_2002 inproceedings Open Mind Word Expert is an implemented active learning system for collecting word sense tagging from the general public over the Web. It is available at http://teach-computers.org. We expect the system to yield a large volume of high-quality training data at a much lower cost than the traditional method of hiring lexicographers. We thus propose a Senseval-3 lexical sample activity where the training data is collected via Open Mind Word Expert. If successful, the collection process can be extended to create the definitive corpus of word sense information. Building a sense tagged corpus with open mind word expert human computation wsd 116--122 2002 http://portal.acm.org/citation.cfm?id=1118692&dl=GUIDE, Association for Computational Linguistics 2002 pang_thumbs_2002 inproceedings We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed {(Naive} Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging. 
Thumbs up?: sentiment classification using machine learning techniques 79--86 2002 http://portal.acm.org/citation.cfm?id=1118704&dl=GUIDE, Association for Computational Linguistics 2002 esuli_sentiwordnet:publicly_2006 inproceedings {SentiWordNet:} A publicly available lexical resource for opinion mining opinion mining sentiment analysis wordnet 417--422 2006 Proceedings of {LREC} 2006 vilar_human_2007 article Proceedings of the Second Workshop on Statistical Machine Translation Human Evaluation of Machine Translation Through Binary System Comparisons pennock_computational_???? article Computational Aspects of Prediction Markets corston-oliver_machine_2001 article Proceedings of the 39th Annual Meeting on Association for Computational Linguistics A machine learning approach to the automatic evaluation of machine translation machine learning machine translation mt eval 148--155 2001 2001 kulesza_learning_2004 article Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation A learning approach to improving sentence-level {MT} evaluation machine learning machine translation mt eval svms 2004 2004 kauchak_paraphrasing_2006 article Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics Paraphrasing for automatic evaluation machine translation mt eval paraphrases 455--462 2006 2006 russo-lassner_paraphrase-based_2005 techreport A Paraphrase-based Approach to Machine Translation Evaluation machine translation mt eval paraphrases 2005 Technical Report {LAMPTR-125/CS-TR-4754/UMIACS-TR-2005-57,} University of Maryland, College Park, {MD} 2005 manning_introduction_2008 book Introduction to Information Retrieval information retrieval 2008 Cambridge University Press 2008 esuli_determining_2006 article In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics {(EACL} '06)
Determining term subjectivity and term orientation for opinion mining opinion mining sentiment analysis 2006 10.1.1.60.8645 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.60.8645 2006 salakhutdinov_restricted_2007 inproceedings Most of the existing approaches to collaborative filtering cannot handle very large data sets. In this paper we show how a class of two-layer undirected graphical models, called Restricted Boltzmann Machines {(RBM's),} can be used to model tabular data, such as user's ratings of movies. We present efficient learning and inference procedures for this class of models and demonstrate that {RBM's} can be successfully applied to the Netflix data set, containing over 100 million user/movie ratings. We also show that {RBM's} slightly outperform carefully-tuned {SVD} models. When the predictions of multiple {RBM} models and multiple {SVD} models are linearly combined, we achieve an error rate that is well over 6\% better than the score of Netflix's own system. Restricted Boltzmann machines for collaborative filtering Corvalis, Oregon 791--798 978-1-59593-793-3 2007 10.1145/1273496.1273596 http://portal.acm.org/citation.cfm?id=1273496.1273596 {ACM} 2007 hewavitharana_cmu_2005 article The 2005 International Workshop on Spoken Language Translation The {CMU} statistical machine translation system for {IWSLT2005} 2005 2005 joachims_optimizing_2002 article Proceedings of the eighth {ACM} {SIGKDD} international conference on Knowledge discovery and data mining Optimizing search engines using clickthrough data clickthrough machine learning search engines svms 133--142 2002 2002 albrecht_regression_2007 article Proceedings of {ACL} Regression for {Sentence-Level} {MT} Evaluation with Pseudo References machine learning machine translation mt eval svms 2007 2007 collins_convolution_2002 article Advances in Neural Information Processing Systems Convolution kernels for natural language kernels machine learning nlp svms 625--632 14 2002 2002
agrawal_fast_1994 article Proc. 20th Int. Conf. Very Large Data Bases, {VLDB} Fast algorithms for mining association rules 487--499 1215 1994 1994 si_flexible_2003 article In Proceedings of {ICML} Flexible mixture model for collaborative filtering collaborative filtering {EM} algorithm hmm machine learning recommender systems 704--711 2003 10.1.1.3.901 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.901 2003 mullen_sentiment_2004 inproceedings Sentiment analysis using support vector machines with diverse information sources machine learning opinion mining sentiment analysis svms 412--418 2004 Proceedings of {EMNLP} 2004 lin_orange:method_2004 article Proceedings of {COLING} {ORANGE:} a Method for Evaluating Automatic Evaluation Metrics for Machine Translation machine translation mt eval 501--507 2004 2004 mendes_adjectives_2006 inproceedings Adjectives in {WordNet} 2006 Proceedings of the {GWA} 2006 -- Global {WordNet} Association Conference 2006 ding_holistic_2008 inproceedings One of the important types of information on the Web is the opinions expressed in the user generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews. This problem has many applications, e.g., opinion mining, summarization and search. Most existing techniques utilize a list of opinion (bearing) words (also called opinion lexicon) for the purpose. Opinion words are words that express desirable (e.g., great, amazing, etc.) or undesirable (e.g., bad, poor, etc) states. These approaches, however, all have some major shortcomings. In this paper, we propose a holistic lexicon-based approach to solving the problem by exploiting external evidences and linguistic conventions of natural language expressions.
This approach allows the system to handle opinion words that are context dependent, which cause major difficulties for existing algorithms. It also deals with many special words, phrases and language constructs which have impacts on opinions based on their linguistic patterns. It also has an effective function for aggregating multiple conflicting opinion words in a sentence. A system, called Opinion Observer, based on the proposed technique has been implemented. Experimental results using a benchmark product review data set and some additional reviews show that the proposed technique is highly effective. It outperforms existing methods significantly A holistic lexicon-based approach to opinion mining context dependent opinions Palo Alto, California, {USA} 231--240 978-1-59593-927-9 2008 10.1145/1341531.1341561 http://portal.acm.org/citation.cfm?id=1341531.1341561 {ACM} 2008 banerjee_meteor:automatic_2005 article Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization {METEOR:} An Automatic Metric for {MT} Evaluation with Improved Correlation with Human Judgments machine translation meteor mt eval 2005 2005 lee_empirical_2002 inproceedings In this paper, we evaluate a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on {SENSEVAL-2} and {SENSEVAL-1} data. Our knowledge sources include the part-of-speech of neighboring words, single words in the surrounding context, local collocations, and syntactic relations. The learning algorithms evaluated include Support Vector Machines {(SVM),} Naive Bayes, {AdaBoost,} and decision tree algorithms. We present empirical results showing the relative contribution of the component knowledge sources and the different learning algorithms. In particular, using all of these knowledge sources and {SVM} (i.e., a single learning algorithm) achieves accuracy higher than the best official scores on both {SENSEVAL-2} and {SENSEVAL-1} test data. 
An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation 41--48 2002 Proceedings of the {ACL-02} conference on Empirical methods in natural language processing - Volume 10 http://portal.acm.org/citation.cfm?id=1118693.1118699 Association for Computational Linguistics 2002 turian_evaluation_2003 article Machine Translation Summit {IX} Evaluation of Machine Translation and its Evaluation machine translation mt eval 2 100 2003 2003 eguchi_sentiment_2006 inproceedings Sentiment retrieval using generative models generative models machine learning opinion mining sentiment analysis 345--354 2006 2006 esuli_determiningsemantic_2005 inproceedings Sentiment classification is a recent subdiscipline of text classification which is concerned not with the topic a document is about, but with the opinion it expresses. It has a rich set of applications, ranging from tracking users' opinions about products or about political candidates as expressed in online forums, to customer relationship management. Functional to the extraction of opinions from text is the determination of the orientation of ``subjective'' terms contained in text, i.e. the determination of whether a term that carries opinionated content has a positive or a negative connotation. In this paper we present a new method for determining the orientation of subjective terms. The method is based on the quantitative analysis of the glosses of such terms, i.e. the definitions that these terms are given in on-line dictionaries, and on the use of the resulting term representations for semi-supervised term classification. The method we present outperforms all known methods when tested on the recognized standard benchmarks for this task. 
Determining the semantic orientation of terms through gloss classification opinion mining semantic orientation sentiment analysis Bremen, Germany 617--624 1-59593-140-6 2005 Proceedings of the 14th {ACM} international conference on Information and knowledge management 10.1145/1099554.1099713 http://portal.acm.org/citation.cfm?id=1099554.1099713 {ACM} 2005 owczarzak_contextual_2006 article Proceedings of the {HLT-NAACL} 2006 Workshop on Statistical Machine Translation Contextual {Bitext-Derived} Paraphrases in Automatic {MT} Evaluation machine translation mt eval paraphrases 86--93 2006 2006 clark_parsingwsj_2004 article Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics Parsing the {WSJ} using {CCG} and log-linear models ccg log-linear models parsing 2004 2004 cahill_long-distance_2004 article Proceedings of {ACL} {Long-Distance} Dependency Resolution in Automatically Acquired {Wide-Coverage} {PCFG-Based} {LFG} Approximations lfg linguistics parsing 320--327 4 2004 2004 liu_source-language_2007 article Proceedings of {NAACL} {HLT} {Source-Language} Features and Maximum Correlation Training for Machine Translation Evaluation machine translation mt eval 41--48 2007 2007 brown_automatic_2005 inproceedings In the {REAP} system, users are automatically provided with texts to read targeted to their individual reading levels. To find appropriate texts, the user's vocabulary knowledge must be assessed. We describe an approach to automatically generating questions for vocabulary assessment. Traditionally, these assessments have been hand-written. Using data from {WordNet,} we generate 6 types of vocabulary questions. They can have several forms, including wordbank and multiple-choice. We present experimental results that suggest that these automatically-generated questions give a measure of vocabulary skill that correlates well with subject performance on independently developed human-written questions.
In addition, strong correlations with standardized vocabulary tests point to the validity of our approach to automatic assessment of word knowledge. Automatic question generation for vocabulary assessment {CALL} question generation Vancouver, British Columbia, Canada 819--826 2005 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing http://portal.acm.org/citation.cfm?id=1220678 Association for Computational Linguistics 2005 nielsen_question_2008 inproceedings Question Generation: Proposed Challenge Tasks and Their Evaluation question generation Arlington, Virginia, {USA} 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 3 tasks: (1) concept selection, (2) question type determination, (3) question construction 2008 vanderwende_importance_2008 inproceedings The Importance of Being Important: Question Generation question generation Arlington, Virginia, {USA} 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 nielsen_taxonomy_2008 inproceedings A Taxonomy of Questions for Question Generation Arlington, Virginia, {USA} 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 hoshino_realtime_2005 inproceedings A realtime multiple-choice question generation for language testing: A preliminary study question generation 2005 Proceedings of the {ACL} 2005 The Second Workshop on Building Educational Applications Using Natural Language Processing 2005 chen_fast:automatic_2006 inproceedings {FAST:} an automatic generation system for grammar tests question generation 1--4 2006 Proceedings of the {COLING/ACL} on Interactive presentation sessions Association for Computational
Linguistics Morristown, {NJ,} {USA} 2006 ignatova_generating_2008 inproceedings Generating high quality questions from low quality questions Arlington, Virginia, {USA} 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 gates_generating_2008 inproceedings Generating loop-back strategy questions from expository texts 2008 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 prasad_discourse-based_2008 inproceedings A discourse-based approach to generating why-questions from texts Arlington, Virginia, {USA} 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 rus_evaluation_2007 inproceedings Evaluation in Natural Language Generation: The Question Generation Task 20--21 2007 Proceedings of the Workshop on Shared Tasks and Comparative Evaluation in Natural Language Generation 2007 smith_question_2008 inproceedings Question Generation as a Competitive Undergraduate Course Project 2008 Workshop on the Question Generation Shared Task and Evaluation Challenge 2008 kunichika_automated_2003 inproceedings Automated question generation methods for intelligent English learning systems and its evaluation 2--5 2003 Proceedings of {ICCE2004} 2003 beulen_automatic_1998 inproceedings Decision tree based state tying uses so-called phonetic questions to assign triphone states to reasonable acoustic models. These phonetic questions are in fact phonetic categories such as vowels, plosives or fricatives. The assumption behind this is that context phonemes which belong to the same phonetic class have a similar influence on the pronunciation of a phoneme. For a new phoneme set, which has to be used, for example, when switching to a different corpus, a phonetic expert is needed to define proper phonetic questions. In this paper a new method is presented which automatically defines good phonetic questions for a phoneme set.
This method uses the intermediate clusters from a phoneme clustering algorithm which are reduced to an appropriate number afterwards. Recognition results on the Wall Street Journal data for within-word and across-word phoneme models show competitive performance of the automatically generated questions with our best handcrafted question set Automatic question generation for decision tree based state tying acoustic models acoustic signal processing across-word phoneme model automatic question generation decision theory decision tree based state tying fricatives matrix algebra pattern classification performance phoneme clustering algorithm phoneme set phonetic categories phonetic expert phonetic questions plosives pronunciation recognition results speech recognition trees (mathematics) triphone states vowels Wall Street Journal data within-word phoneme model 805--808 vol.2 1520-6149 2 1998 Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 {IEEE} International Conference on {10.1109/ICASSP.1998.675387} 1998 titov_joint_???? article Urbana A Joint Model of Text and Aspect Ratings for Sentiment Summarization 61801 51 kyoomarsi_optimizing_2008 inproceedings Optimizing Text Summarization Based on Fuzzy Logic fuzzy logic summarization 2008 http://www2.computer.org/portal/web/csdl/doi/10.1109/ICIS.2008.46 2008 blei_latent_2003 article The Journal of Machine Learning Research We describe latent Dirichlet allocation {(LDA),} a generative probabilistic model for collections of discrete data such as text corpora. {LDA} is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document.
We present efficient approximate inference techniques based on variational methods and an {EM} algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic {LSI} model. Latent dirichlet allocation clustering document modeling latent dirichlet allocation 993--1022 3 2003 http://portal.acm.org/citation.cfm?id=944919.944937 2003 griffiths_finding_2004 inproceedings A first step in identifying the content of a document is determining which topics that document addresses. We describe a generative model for documents, introduced by Blei, Ng, and Jordan {[Blei,} D. M., Ng, A. Y. \& Jordan, M. I. (2003) J. Machine Learn. Res. 3, 993-1022], in which each document is generated by choosing a distribution over topics and then choosing each word in the document from a topic selected according to this distribution. We then present a Markov chain Monte Carlo algorithm for inference in this model. We use this algorithm to analyze abstracts from {PNAS} by using Bayesian model selection to establish the number of topics. We show that the extracted topics capture meaningful structure in the data, consistent with the class designations provided by the authors of the articles, and outline further applications of this analysis, including identifying ``hot topics'' by examining temporal dynamics and tagging abstracts to illustrate semantic content. Finding scientific topics April 101 2004 Proceedings of the National Academy of Sciences http://www.pnas.org/content/101/suppl.1/5228.abstract?ck=nck 2004-04 blei_correlated_2007 article Annals of Applied Statistics A {CORRELATED} {TOPIC} {MODEL} {OF} {SCIENCE} 17--35 1 2007 1 2007 government_of_canada_uniform_????
misc A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations {\textbar} Publications {\textbar} {NRC-IIT} http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-50398_e.html zhang_avoiding_2008 inproceedings The primary premise upon which {top-N} recommender systems operate is that similar users are likely to have similar tastes with regard to their product choices. For this reason, recommender algorithms depend deeply on similarity metrics to build the recommendation lists for end-users. Avoiding monotony: improving the diversity of recommendation lists accuracy diversity metrics novelty recommender system Lausanne, Switzerland 123--130 978-1-60558-093-7 2008 Proceedings of the 2008 {ACM} conference on Recommender systems 10.1145/1454008.1454030 http://portal.acm.org/citation.cfm?doid=1454008.1454030 {ACM} 2008 kaser_tag-cloud_2007 article cs/0703109 Tag clouds provide an aggregate of tag-usage statistics. They are typically sent as in-line {HTML} to browsers. However, display mechanisms suited for ordinary text are not ideal for tags, because font sizes may vary widely on a line. As well, the typical layout does not account for relationships that may be known between tags. This paper presents models and algorithms to improve the display of tag clouds that consist of in-line {HTML,} as well as algorithms that use nested tables to achieve a more general 2-dimensional layout in which tag relationships are considered. The first algorithms leverage prior work in typesetting and rectangle packing, whereas the second group of algorithms leverage prior work in Electronic Design Automation. Experiments show our algorithms can be efficiently implemented and perform well.
{Tag-Cloud} Drawing: Algorithms for Cloud Visualization Computer Science - Data Structures and Algorithms March 2007 http://arxiv.org/abs/cs/0703109 2007-03 theobald_spotsigs:_2008 inproceedings Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present {SpotSigs,} a new algorithm for extracting and matching signatures for near duplicate detection in large Web crawls. Our spot signatures are designed to favor natural-language portions of Web pages over advertisements and navigational bars. {SpotSigs:} robust and efficient near duplicate detection in large web collections high-dimensional similarity search inverted index pruning optimal partitioning stopword signatures Singapore, Singapore 563--570 978-1-60558-164-4 2008 Proceedings of the 31st annual international {ACM} {SIGIR} conference on Research and development in information retrieval 10.1145/1390334.1390431 http://portal.acm.org/citation.cfm?id=1390431 {ACM} 2008 broder_syntactic_1997 article Computer Networks and {ISDN} Systems Syntactic clustering of the Web 1157--1166 29 1997 8-13 1997 wang_topics_2006 inproceedings This paper presents an {LDA-style} topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work that relies on Markov assumptions or discretization of time, here each topic is associated with a continuous distribution over timestamps, and for each generated document, the mixture distribution over topics is influenced by both word co-occurrences and the document's timestamp. Thus, the meaning of a particular topic can be relied upon as constant, but the topics' occurrence and correlations change significantly over time. We present results on nine months of personal email, 17 years of {NIPS} research papers and over 200 years of presidential state-of-the-union addresses, showing improved topics, better timestamp prediction, and interpretable trends. 
Topics over time: a {non-Markov} continuous-time model of topical trends graphical models temporal analysis topic modeling Philadelphia, {PA,} {USA} 424--433 1-59593-339-5 2006 Proceedings of the 12th {ACM} {SIGKDD} international conference on Knowledge discovery and data mining 10.1145/1150402.1150450 http://portal.acm.org/citation.cfm?id=1150402.1150450 {ACM} 2006 lemire_diversity_2008 unpublished Online communities have become a crucial ingredient of e-business. Supporting open social networks builds strong brands and provides lasting value to the consumer. One function of the community is to recommend new products and services. Open social networks tend to be resilient, adaptive, and broad, but simplistic recommender systems can be 'gamed' by members seeking to promote certain products or services. We argue that the gaming is not the failure of the open social network, but rather of the function used by the recommender. To increase the quality and resilience of recommender systems, and provide the user with genuine and novel discoveries, we have to foster diversity, instead of closing down the social networks. Fortunately, software increases the broadcast capacity of each individual, making dense open social networks possible. Numerically, we show that dense social networks encourage diversity. In business terms, dense social networks support a long tail. Diversity in open social networks 2008 Tech Report http://www.daniel-lemire.com/fr/abstracts/DIVERSITY2008.html 2008 bargiela_towardtheory_2008 article Fuzzy Systems, {IEEE} Transactions on Human-centered information processing has been pioneered by Zadeh through his introduction of the concept of fuzzy sets in the mid 1960s. The insights that were afforded through this formalism have led to the development of the granular computing {(GrC)} paradigm in the late 1990s.
Subsequent research has highlighted the fact that many founding principles of {GrC} have, in fact, been adopted in other information-processing paradigms and, indeed, in the context of various scientific methodologies. This study expands on our earlier research exploring the foundations of {GrC} and casting it as a structured combination of algorithmic and non-algorithmic information processing that mimics human, intelligent synthesis of knowledge from information. Toward a Theory of Granular Computing for {Human-Centered} Information Processing 1063-6706 320--330 16 2008 2 {10.1109/TFUZZ.2007.905912} 2008 lin_data_???? article Data Mining, Rough Sets and Granular Computing zadeh_fuzzy_????
article {FUZZY} {SETS} {AND} {INFORMATION} {GRANULARITY} liu_arsa:sentiment-aware_2007 inproceedings Due to its high popularity, Weblogs (or blogs in short) present a wealth of information that can be very helpful in assessing the general public's sentiments and opinions. In this paper, we study the problem of mining sentiment information from blogs and investigate ways to use such information for predicting product sales performance. Based on an analysis of the complex nature of sentiments, we propose Sentiment {PLSA} {(S-PLSA),} in which a blog entry is viewed as a document generated by a number of hidden sentiment factors. Training an {S-PLSA} model on the blog data enables us to obtain a succinct summary of the sentiment information embedded in the blogs. We then present {ARSA,} an autoregressive sentiment-aware model, to utilize the sentiment information captured by {S-PLSA} for predicting product sales performance. Extensive experiments were conducted on a movie data set. We compare {ARSA} with alternative models that do not take into account the sentiment information, as well as a model with a different feature selection method. Experiments confirm the effectiveness and superiority of the proposed approach. {ARSA:} a sentiment-aware model for predicting sales performance using blogs autoregressive model blog sentiment mining Amsterdam, The Netherlands 607--614 978-1-59593-597-7 2007 Proceedings of the 30th annual international {ACM} {SIGIR} conference on Research and development in information retrieval 10.1145/1277741.1277845 http://portal.acm.org/citation.cfm?id=1277845 {ACM} 2007 piche_trend_1995 inproceedings Double moving averages are commonly used to identify trends within capital markets. In this paper, a novel analysis technique is presented which is based upon plotting the returns associated with thousands of different double moving average trading rules for a time series. 
The resulting plots are useful for gaining an understanding of the trends contained within the time series. The analysis technique, which is referred to as trend visualization, is also useful for selecting the appropriate parameters for double moving average trading systems. In this paper, the technique is used to investigate trends within the foreign currency spot market Trend visualization capital markets data visualisation double moving average trading systems financial data processing foreign currency spot market foreign exchange trading moving average processes plotting returns time series trading rules trend visualization 146--150 1995 Computational Intelligence for Financial Engineering, 1995. Proceedings of the {IEEE/IAFE} 1995 {10.1109/CIFER.1995.495268} 1995 beineke_exploration_2003 inproceedings An exploration of sentiment summarization 12--15 2003 Proceedings of {AAAI} 2003 whitelaw_using_2005 inproceedings Little work to date in sentiment analysis (classifying texts by `positive' or `negative' orientation) has attempted to use fine-grained semantic distinctions in features used for classification. We present a new method for sentiment classification based on extracting and analyzing appraisal groups such as ``very good'' or ``not terribly funny''. An appraisal group is represented as a set of attribute values in several task-independent semantic taxonomies, based on Appraisal Theory. Semi-automated methods were used to build a lexicon of appraising adjectives and their modifiers. We classify movie reviews using features based upon these taxonomies combined with standard ``bag-of-words'' features, and report state-of-the-art accuracy of 90.2\%. In addition, we find that some types of appraisal appear to be more significant for sentiment classification than others.
Using appraisal groups for sentiment analysis appraisal theory opinion mining review classification sentiment analysis shallow parsing text classification Bremen, Germany 625--631 1-59593-140-6 2005 Proceedings of the 14th {ACM} international conference on Information and knowledge management 10.1145/1099554.1099714 http://portal.acm.org/citation.cfm?id=1099714&dl=GUIDE {ACM} 2005 zhuang_movie_2006-1 inproceedings With the flourish of the Web, online review is becoming a more and more useful and important information resource for people. As a result, automatic review mining and summarization has become a hot research topic recently. Different from traditional text summarization, review mining and summarization aims at extracting the features on which the reviewers express their opinions and determining whether the opinions are positive or negative. In this paper, we focus on a specific domain - movie review. A multi-knowledge based approach is proposed, which integrates {WordNet,} statistical analysis and movie knowledge. The experimental results show the effectiveness of the proposed approach in movie review mining and summarization. Movie review mining and summarization Arlington, Virginia, {USA} 43--50 1-59593-433-2 2006 Proceedings of the 15th {ACM} international conference on Information and knowledge management 10.1145/1183614.1183625 http://portal.acm.org/citation.cfm?id=1183614.1183625 {ACM} 2006
mihalcea_corpus-based_2006 inproceedings A corpus-based approach to finding happiness 2006 Proceedings of the {AAAI} Spring Symposium on Computational Approaches to Weblogs 2006 wiebe_word_2006 inproceedings Word Sense and Subjectivity 1065 44 2006 {ANNUAL} {MEETING-ASSOCIATION} {FOR} {COMPUTATIONAL} {LINGUISTICS} 2006 lacoste-julien_disclda:_???? article {DiscLDA:} Discriminative Learning for Dimensionality Reduction and Classification stonebraker_c-store:column-oriented_2005 inproceedings This paper presents the design of a read-optimized relational {DBMS} that contrasts sharply with most current systems, which are write-optimized.
Among the many differences in its design are: storage of data by column rather than by row, careful coding and packing of objects into storage including main memory during query processing, storing an overlapping collection of column-oriented projections, rather than the current fare of tables and indexes, a non-traditional implementation of transactions which includes high availability and snapshot isolation for read-only transactions, and the extensive use of bitmap indexes to complement B-tree structures. We present preliminary performance data on a subset of {TPC-H} and show that the system we are building, {C-Store,} is substantially faster than popular commercial products. Hence, the architecture looks very encouraging. C-store: a column-oriented {DBMS} Trondheim, Norway 553--564 1-59593-154-6 2005 Proceedings of the 31st international conference on Very large data bases http://portal.acm.org/citation.cfm?id=1083658 {VLDB} Endowment 2005 harizopoulos_oltp_2008 inproceedings Online Transaction Processing {(OLTP)} databases include a suite of features - disk-resident B-trees and heap files, locking-based concurrency control, support for multi-threading - that were optimized for computer technology of the late 1970's. Advances in modern processors, memories, and networks mean that today's computers are vastly different from those of 30 years ago, such that many {OLTP} databases will now fit in main memory, and most {OLTP} transactions can be processed in milliseconds or less. Yet database architecture has changed little.
{OLTP} through the looking glass, and what we found there dbms architecture main memory transaction processing oltp online transaction processing Vancouver, Canada 981--992 978-1-60558-102-6 2008 Proceedings of the 2008 {ACM} {SIGMOD} international conference on Management of data 10.1145/1376616.1376713 http://portal.acm.org/citation.cfm?id=1376713 {ACM} 2008 kallman_h-store:high-performance_2008 inproceedings {H-Store:} A {High-Performance,} Distributed Main Memory Transaction Processing System 1496--1499 2008 Proceedings of {VLDB} 2008 kukich_technique_1992 article {ACM} Comput. Surv. Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem, a variety of general and application-specific spelling correction techniques have been developed. Some of them were based on detailed studies of spelling error patterns. In response to the third problem, a few experiments using natural-language-processing tools or statistical-language models have been carried out. This article surveys documented findings on spelling error patterns, provides descriptions of various nonword detection and isolated-word error correction techniques, reviews the state of the art of context-dependent word correction techniques, and discusses research issues related to all three areas of automatic error correction in text.
Technique for automatically correcting words in text context-dependent spelling correction grammar checking n-gram analysis natural-language-processing models neural net classifiers optical character recognition (ocr) spell checking spelling error detection spelling error patterns statistical-language models word recognition and correction 377--439 24 1992 4 10.1145/146370.146380 http://portal.acm.org/citation.cfm?doid=146370.146380 1992 otero_contextual_2007 inbook Spelling correction is commonly a critical task for a variety of {NLP} tools. Some systems assist users by offering a set of possible corrections for a given misspelt word. An automatic spelling correction system would be able to choose only one or, at least, to rank them according to a certain criterion. We present a dynamic framework which allows us to combine spelling correction and {Part-of-Speech} tagging tasks in an efficient way. The result is a system capable of ranking the set of possible corrections taking the context of the erroneous words into account. Contextual Spelling Correction 290--296 2007 Computer Aided Systems Theory -- {EUROCAST} 2007 http://dx.doi.org/10.1007/978-3-540-75867-9_37 2007 hirst_correcting_2005 article Natural Language Engineering Correcting real-word spelling errors by restoring lexical cohesion 87--111 11 2005 01 2005 wilcox-ohearn_real-word_2008 inbook The trigram-based noisy-channel model of real-word spelling-error correction that was presented by Mays, Damerau, and Mercer in 1991 has never been adequately evaluated or compared with other methods. We analyze the advantages and limitations of the method, and present a new evaluation that enables a meaningful comparison with the {WordNet-based} method of Hirst and Budanitsky. The trigram method is found to be superior, even on content words. We then show that optimizing over sentences gives better results than variants of the algorithm that optimize over fixed-length windows.
{Real-Word} Spelling Correction with Trigrams: A Reconsideration of the Mays, Damerau, and Mercer Model 605--616 2008 Computational Linguistics and Intelligent Text Processing http://dx.doi.org/10.1007/978-3-540-78135-6_52 2008 fossati_mixed_2007 inbook This paper addresses the problem of real-word spell checking, i.e., the detection and correction of typos that result in real words of the target language. This paper proposes a methodology based on a mixed trigrams language model. The model has been implemented, trained, and tested with data from the Penn Treebank. The approach has been evaluated in terms of hit rate, false positive rate, and coverage. The experiments show promising results with respect to the hit rates of both detection and correction, even though the false positive rate is still high. A Mixed Trigrams Approach for Context Sensitive Spell Checking 623--633 2007 Computational Linguistics and Intelligent Text Processing http://dx.doi.org/10.1007/978-3-540-70939-8_55 2007 oflazer_error-tolerant_1996 article Computational Linguistics Error-tolerant finite-state recognition with applications to morphological analysis and spelling correction 73--89 22 1996 1 1996 brill_improved_2000 inproceedings An improved error model for noisy channel spelling correction 286--293 2000 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics Association for Computational Linguistics Morristown, {NJ,} {USA} 2000 church_probability_???? article Probability Scoring for Spelling Correction mays_context_1991 article Information processing \& management Context based spelling correction 517--522 27 1991 5 1991 camelin_opinion_2006 inproceedings Opinion Mining in a Telephone Survey Corpus 2006 Ninth International Conference on Spoken Language Processing {ISCA} 2006
Computational Linguistics and other assorted topics
A collection of papers by Jason Adams