This application (XlingRetEval.cpp) runs cross-lingual retrieval experiments.
Parameters are:
sourceIndex
: The complete name of the index for the source language collection. This provides the background model for the source language.
targetIndex
: The complete name of the index for the target language collection. This is the collection that is searched.
textQuery
: The query text stream, in the source language.
XLlambda
: The smoothing parameter for mixing P(t|D) and P(s|GS).
XLbeta
: The Jelinek-Mercer lambda for estimating P(t|D).
sourceBackgroundModel
: One of "term" or "doc". If term, the background model for the source language is estimated as tf(s)/|V|. If doc, the background model for the source language is estimated as df(s)/sum_w_in_V df(w). Default is term.
targetBackgroundModel
: One of "term" or "doc". If term, the background model for the target language is estimated as tf(t)/|V|. If doc, the background model for the target language is estimated as df(t)/sum_w_in_V df(w). Default is term.
resultFile
: The result file.
resultFormat
: Whether the result format should be the TREC format (i.e., six-column) or a simple three-column format <queryID, docID, score>. String value, either trec for TREC format or 3col for three-column format. Default: TREC format.
resultCount
: the number of documents to return for each query
feedbackDocCount
: The number of docs to use for pseudo-feedback (0 means no feedback).
feedbackTermCount
: the number of terms to add to a query when doing feedback.
Simple KL parameters:
smoothSupportFile
: The name of the smoothing support file.
smoothMethod
: One of the four: jelinikmercer or jm for Jelinek-Mercer; dirichletprior or dir for Dirichlet prior; absolutediscount or ad for absolute discounting; twostage or 2s for two-stage.
smoothStrategy
: Either interpolate for interpolate or backoff for backoff.
adjustedScoreMethod
: Which type of score to output, one of:
JelinekMercerLambda
: The collection model weight in the JM interpolation method. Default: 0.5
DirichletPrior
: The prior parameter in the Dirichlet prior smoothing method. Default: 1000
discountDelta
: The delta (discounting constant) in the absolute discounting method. Default: 0.7.
queryUpdateMethod
: Feedback method, one of: relevancemodel1 or rm1 for relevance model 1; relevancemodel2 or rm2 for relevance model 2.
feedbackCoefficient
: the coefficient of the feedback model for interpolation. The value is in [0,1], with 0 meaning using only the original model (thus no updating/feedback) and 1 meaning using only the feedback model (thus ignoring the original model).
feedbackTermCount
: Truncate the feedback model to no more than a given number of words/terms.
feedbackProbThresh
: Truncate the feedback model to include only words with a probability higher than this threshold. Default value: 0.001.
feedbackProbSumThresh
: Truncate the feedback model until the sum of the probability of the included words reaches this threshold. Default value: 1.
feedbackTermCount, feedbackProbThresh, and feedbackProbSumThresh work conjunctively to control the truncation, i.e., the truncated model must satisfy all three constraints.
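One way to read the conjunctive truncation and the feedbackCoefficient interpolation is sketched below. The function names, the TermProb type, and the order in which the three constraints are checked are assumptions made for illustration. The actual toolkit code may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using TermProb = std::pair<std::string, double>;  // (term, probability)

// Keep the highest-probability terms while all three constraints hold.
// Hypothetical helper, not part of the Lemur API.
std::vector<TermProb> truncateFeedbackModel(std::vector<TermProb> model,
                                            std::size_t termCount,
                                            double probThresh,
                                            double probSumThresh) {
    // Visit terms from most to least probable.
    std::sort(model.begin(), model.end(),
              [](const TermProb& a, const TermProb& b) {
                  return a.second > b.second;
              });
    std::vector<TermProb> kept;
    double sum = 0.0;
    for (const auto& tp : model) {
        if (kept.size() >= termCount) break;  // feedbackTermCount
        if (tp.second <= probThresh) break;   // feedbackProbThresh
        if (sum >= probSumThresh) break;      // feedbackProbSumThresh
        kept.push_back(tp);
        sum += tp.second;
    }
    return kept;
}

// feedbackCoefficient: 0 keeps only the original model,
// 1 keeps only the feedback model.
double interpolateProb(double pOriginal, double pFeedback, double coeff) {
    return (1.0 - coeff) * pOriginal + coeff * pFeedback;
}
```

Under this reading, the loop stops at the first term that violates any of the limits, so the strictest of the three settings determines the size of the truncated model.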