#include <SimpleKLDocModel.hpp>
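Inheritance diagram for SimpleKLDocModel: DocumentRep <- SimpleKLDocModel <- JelinekMercerDocModel, DirichletPriorDocModel, AbsoluteDiscountDocModel, TwoStageDocModel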
Public Methods
SimpleKLDocModel (DOCID_T docID, const UnigramLM &collectLM, int dl=1, const double *prMass=NULL, SimpleKLParameter::SmoothStrategy strat=SimpleKLParameter::INTERPOLATE)
~SimpleKLDocModel ()
virtual double termWeight (TERMID_T termID, const DocInfo *info) const
    term weighting function, weight(w) = p_seen(w)/p_unseen(w)
virtual double scoreConstant () const
    doc-specific constant term in the scoring formula
virtual double unseenCoeff () const =0
    a(d), the coefficient applied to the collection model for unseen words
virtual double seenProb (double termFreq, TERMID_T termID) const =0
    p(w|d) for a word w seen in the document
Protected Attributes
const UnigramLM & refLM
const double * docPrMass
SimpleKLParameter::SmoothStrategy strategy
Abstract interface of the document representation for a smoothed document unigram model.
Adapts a smoothed document language model interface to the DocumentRep interface.
The smoothed model is defined piecewise:
    p(w|d) = q(w|d)         if w is seen in d
    p(w|d) = a(d) * Pc(w)   if w is unseen in d
where a(d) controls the probability mass allocated to all unseen words and Pc(w) is the collection language model.
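The seen/unseen split can be illustrated with a minimal, self-contained sketch. It does not use the Lemur classes: `ToyCollectionLM` and `ToyJMDocModel` are hypothetical stand-ins, and Jelinek-Mercer interpolation is assumed here only as one possible choice of q(w|d) and a(d).

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for the collection language model Pc(w).
// This is NOT the Lemur UnigramLM class.
struct ToyCollectionLM {
  std::unordered_map<int, double> probs;
  double prob(int termID) const {
    auto it = probs.find(termID);
    return it == probs.end() ? 0.0 : it->second;
  }
};

// Hypothetical document model. Jelinek-Mercer interpolation is assumed:
//   q(w|d) = (1 - lambda) * c(w,d) / |d| + lambda * Pc(w),   a(d) = lambda
struct ToyJMDocModel {
  const ToyCollectionLM &coll;
  double docLength;  // |d|
  double lambda;     // weight on the collection model

  // q(w|d): probability of a word that occurs in the document.
  double seenProb(double termFreq, int termID) const {
    assert(docLength > 0.0 && lambda > 0.0 && lambda < 1.0);
    return (1.0 - lambda) * termFreq / docLength + lambda * coll.prob(termID);
  }

  // a(d): scales Pc(w) for words that do not occur in the document.
  double unseenCoeff() const { return lambda; }

  // p(w|d): the piecewise definition, selected by the term frequency.
  double prob(double termFreq, int termID) const {
    return termFreq > 0.0 ? seenProb(termFreq, termID)
                          : unseenCoeff() * coll.prob(termID);
  }
};
```

With a three-word vocabulary where two words occur in the document, the seen and unseen probabilities produced by this sketch sum to one, which is the role a(d) plays in the definition.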
virtual double scoreConstant () const
    doc-specific constant term in the scoring formula
    Implements DocumentRep.
virtual double seenProb (double termFreq, TERMID_T termID) const =0
    p(w|d) for a word w seen in the document
    Implemented in JelinekMercerDocModel, DirichletPriorDocModel, AbsoluteDiscountDocModel, and TwoStageDocModel.
virtual double termWeight (TERMID_T termID, const DocInfo *info) const
    term weighting function, weight(w) = p_seen(w)/p_unseen(w)
    Implements DocumentRep.
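To show why a ratio weight plus a document-specific constant is enough for scoring, the sketch below compares a direct query log-likelihood with its decomposed form. It assumes a log-likelihood scoring formula, which this page does not spell out, and it assumes Jelinek-Mercer smoothing with a(d) = lambda. `ToyQueryTerm` and all function names are hypothetical and are not part of Lemur.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical per-term record for one query/document pair.
struct ToyQueryTerm {
  double queryCount;  // c(w, q)
  double docCount;    // c(w, d); 0 means w is unseen in the document
  double collProb;    // Pc(w)
};

// Assumed smoothing (Jelinek-Mercer): q(w|d) for a seen word.
double seenProbJM(double docCount, double collProb, double docLength, double lambda) {
  return (1.0 - lambda) * docCount / docLength + lambda * collProb;
}

// Direct form: sum over all query terms of c(w,q) * log p(w|d).
double directLogLikelihood(const std::vector<ToyQueryTerm> &query,
                           double docLength, double lambda) {
  assert(docLength > 0.0 && lambda > 0.0 && lambda < 1.0);
  double sum = 0.0;
  for (const ToyQueryTerm &t : query) {
    double p = t.docCount > 0.0
                   ? seenProbJM(t.docCount, t.collProb, docLength, lambda)
                   : lambda * t.collProb;  // a(d) * Pc(w)
    sum += t.queryCount * std::log(p);
  }
  return sum;
}

// Decomposed form: only matched terms need a per-term weight
//   weight(w) = p_seen(w) / p_unseen(w) = q(w|d) / (a(d) * Pc(w)),
// plus a doc-specific constant |q| * log a(d) and a doc-independent part.
double decomposedLogLikelihood(const std::vector<ToyQueryTerm> &query,
                               double docLength, double lambda) {
  assert(docLength > 0.0 && lambda > 0.0 && lambda < 1.0);
  double matched = 0.0, queryLength = 0.0, collectionPart = 0.0;
  for (const ToyQueryTerm &t : query) {
    queryLength += t.queryCount;
    collectionPart += t.queryCount * std::log(t.collProb);
    if (t.docCount > 0.0) {
      double weight = seenProbJM(t.docCount, t.collProb, docLength, lambda) /
                      (lambda * t.collProb);
      matched += t.queryCount * std::log(weight);
    }
  }
  return matched + queryLength * std::log(lambda) + collectionPart;
}
```

Both functions return the same value for any input, so a retrieval loop only has to visit terms shared by the query and the document. The `collectionPart` term is the same for every document and can be dropped when only the ranking matters.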
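virtual double unseenCoeff () const =0
    a(d), the coefficient applied to the collection model for unseen words
    Implemented in JelinekMercerDocModel, DirichletPriorDocModel, AbsoluteDiscountDocModel, and TwoStageDocModel.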
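As an illustration of how different smoothing methods supply a(d) and the matching seen probability, here is a sketch using the textbook forms of Dirichlet prior smoothing and absolute discounting. The free functions and parameter names are hypothetical. They are not the Lemur subclasses listed above, whose implementations may differ in detail.

```cpp
#include <algorithm>
#include <cassert>

// Dirichlet prior smoothing (textbook form), with prior weight mu:
//   a(d)   = mu / (|d| + mu)
//   q(w|d) = (c(w,d) + mu * Pc(w)) / (|d| + mu)
double dirichletUnseenCoeff(double docLength, double mu) {
  assert(mu > 0.0 && docLength >= 0.0);
  return mu / (docLength + mu);
}

double dirichletSeenProb(double termFreq, double collProb,
                         double docLength, double mu) {
  assert(mu > 0.0 && docLength >= 0.0);
  return (termFreq + mu * collProb) / (docLength + mu);
}

// Absolute discounting (textbook form), with discount delta in [0, 1]
// and uniqueTerms = number of distinct words in the document:
//   a(d)   = delta * uniqueTerms / |d|
//   q(w|d) = max(c(w,d) - delta, 0) / |d| + a(d) * Pc(w)
double absDiscountUnseenCoeff(double uniqueTerms, double docLength, double delta) {
  assert(docLength > 0.0 && delta >= 0.0 && delta <= 1.0);
  return delta * uniqueTerms / docLength;
}

double absDiscountSeenProb(double termFreq, double collProb, double docLength,
                           double uniqueTerms, double delta) {
  return std::max(termFreq - delta, 0.0) / docLength +
         absDiscountUnseenCoeff(uniqueTerms, docLength, delta) * collProb;
}
```

In both methods a(d) depends on the document length, so unlike a fixed interpolation weight it has to be computed per document.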