
TextQueryRetMethod Class Reference

#include <TextQueryRetMethod.hpp>

Inheritance: TextQueryRetMethod derives from RetrievalMethod and is the base class of CORIRetMethod, CosSimRetMethod, OkapiRetMethod, SimpleKLRetMethod, and TFIDFRetMethod.

Public Methods

 TextQueryRetMethod (const Index &ind, ScoreAccumulator &accumulator)
virtual ~TextQueryRetMethod ()
virtual TextQueryRep * computeTextQueryRep (const TermQuery &qry)=0
 compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)

virtual TextQueryRep * computeTextQueryRep (DOCID_T docid)
 compute a query rep for an existing doc

virtual QueryRep * computeQueryRep (const Query &qry)
 overriding abstract class method

virtual double scoreDoc (const QueryRep &qry, DOCID_T docID)
 overriding abstract class method

virtual void scoreCollection (const QueryRep &qry, IndexedRealVector &results)
 overriding abstract class method with a general efficient inverted index scoring procedure

virtual void scoreCollection (DOCID_T docid, IndexedRealVector &results)
 add support for scoring an existing document against the collection

virtual DocumentRep * computeDocRep (DOCID_T docID)=0
 compute the doc representation (caller responsible for deleting the memory of the generated new instance)

virtual ScoreFunction * scoreFunc ()=0
 return the scoring function pointer

virtual void updateQuery (QueryRep &qryRep, const DocIDSet &relDocs)
 update the query

virtual void updateTextQuery (TextQueryRep &qryRep, const DocIDSet &relDocs)=0
 Modify/update the query representation based on a set of (presumably) relevant documents.

virtual void scoreInvertedIndex (const QueryRep &qryRep, IndexedRealVector &scores, bool scoreAll=false)
 Efficient scoring with the inverted index.

virtual double scoreDocVector (const TextQueryRep &qry, DOCID_T docID, FreqVector &docVector)
virtual double scoreDocPassages (const TermQuery &qRep, DOCID_T docID, PassageScoreVector &scores, int psgSize, int overlap)
 Score a query for each passage of a document.


Protected Attributes

ScoreAccumulator & scAcc
DocumentRep ** docReps
 cache document reps.

bool cacheDocReps
 whether or not to cache document representations

int docRepsSize
 number of documents plus 1, the size of the docReps array.


Detailed Description

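A text query retrieval method is determined by specifying the following elements: a method to compute the query representation (computeTextQueryRep), a method to compute the document representation (computeDocRep), the scoring function (scoreFunc), and a method to update the query representation based on a set of (presumably) relevant documents (updateTextQuery).
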
Given a query q =(q1,q2,...,qN) and a document d=(d1,d2,...,dN), where q1,...,qN and d1,...,dN are terms, TextQueryRetMethod assumes the following general scoring function:

s(q,d) = g(w(q1,d1,q,d) + ... + w(qN,dN,q,d),q,d)
That is, the score of a document d against a query q is a function g of the accumulated weight w for each matched term.

The score is thus determined by two functions g and w; both may depend on the whole query or document. The function w gives the weight of each matched term, while the function g makes it possible to perform any further transformation of the sum of the weight of all matched terms based on the "summary" information of a query or a document (e.g., document length).

TextQueryRep, DocumentRep, and ScoreFunction are designed to support this general scoring function in the following way:

A ScoreFunction is responsible for defining the two functions g and w. A TextQueryRep provides any information required for scoring from the query side (e.g., query term frequency). Similarly, a DocumentRep provides any information required for scoring from the document side. Furthermore, a TextQueryRep supports iteration over all query terms, allowing easy accumulation of weights over matched terms. The weighting function w and score adjustment function g typically assume and depend on some particular information and representation of the query and document, so a specific ScoreFunction (for a specific retrieval method) only works for some specific TextQueryRep and DocumentRep that are appropriate for the specific retrieval method.


Constructor & Destructor Documentation

TextQueryRetMethod::TextQueryRetMethod (const Index & ind, ScoreAccumulator & accumulator)
 

Create the retrieval method. If cacheDocReps is true, allocate DocumentRep cache array.

virtual TextQueryRetMethod::~TextQueryRetMethod () [inline, virtual]
 

Destroy the object. If cacheDocReps is true, delete the DocumentRep cache array.
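The caching behavior described for the constructor and destructor might look roughly like the sketch below. It is an assumption-laden illustration: DocRepSketch and CachingScorer are hypothetical names, smart pointers replace the raw DocumentRep** array, and the extra slot is assumed to exist because document IDs start at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for DocumentRep.
struct DocRepSketch { int docID; };

class CachingScorer {
public:
    // Allocates docCount + 1 slots only when caching is enabled
    // (slot 0 is unused under the 1-based ID assumption).
    CachingScorer(int docCount, bool cacheDocReps)
        : cacheDocReps_(cacheDocReps),
          docReps_(cacheDocReps ? static_cast<std::size_t>(docCount) + 1 : 0) {}

    // With caching on, a representation is built at most once per document.
    // With caching off, a fresh one is built on every call.
    std::shared_ptr<DocRepSketch> docRep(int docID) {
        if (!cacheDocReps_) return compute(docID);
        assert(docID >= 1 && static_cast<std::size_t>(docID) < docReps_.size());
        std::shared_ptr<DocRepSketch>& slot = docReps_[static_cast<std::size_t>(docID)];
        if (!slot) slot = compute(docID);
        return slot;
    }

    int computeCalls() const { return computeCalls_; }

    // No explicit destructor: the vector releases all cached entries.

private:
    std::shared_ptr<DocRepSketch> compute(int docID) {
        ++computeCalls_;
        return std::make_shared<DocRepSketch>(DocRepSketch{docID});
    }

    bool cacheDocReps_;
    std::vector<std::shared_ptr<DocRepSketch>> docReps_;
    int computeCalls_ = 0;
};
```

Caching trades memory (one slot per document) for avoiding repeated construction when the same documents are scored across many queries.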


Member Function Documentation

virtual DocumentRep* TextQueryRetMethod::computeDocRep (DOCID_T docID) [pure virtual]
 

compute the doc representation (caller responsible for deleting the memory of the generated new instance)

Implemented in CORIRetMethod, CosSimRetMethod, OkapiRetMethod, SimpleKLRetMethod, and TFIDFRetMethod.
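Because the caller owns the returned instance, one way to honor the contract without a manual delete is to hand the pointer to std::unique_ptr immediately. The sketch below uses a hypothetical RepSketch type and computeRepSketch() factory in place of DocumentRep and computeDocRep(); only the ownership pattern is the point.

```cpp
#include <memory>

// Hypothetical stand-in for DocumentRep; counts live instances so leaks are visible.
struct RepSketch {
    static int live;
    RepSketch() { ++live; }
    ~RepSketch() { --live; }
};
int RepSketch::live = 0;

// Hypothetical factory following the same contract as computeDocRep():
// it returns a new instance and the caller must delete it.
RepSketch* computeRepSketch() { return new RepSketch(); }

// Takes ownership right away, so the instance is deleted on every exit path.
int liveWhileInUse() {
    std::unique_ptr<RepSketch> rep(computeRepSketch());
    return RepSketch::live;  // counts the instance held by rep
}
```

The same pattern applies to the pointers returned by computeTextQueryRep, which carries the same caller-deletes note.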

QueryRep * TextQueryRetMethod::computeQueryRep (const Query & qry) [inline, virtual]
 

overriding abstract class method

Implements RetrievalMethod.

virtual TextQueryRep* TextQueryRetMethod::computeTextQueryRep (DOCID_T docid) [inline, virtual]
 

compute a query rep for an existing doc

Reimplemented in CosSimRetMethod.

virtual TextQueryRep* TextQueryRetMethod::computeTextQueryRep (const TermQuery & qry) [pure virtual]
 

compute the query representation for a text query (caller responsible for deleting the memory of the generated new instance)

Implemented in CORIRetMethod, CosSimRetMethod, OkapiRetMethod, SimpleKLRetMethod, and TFIDFRetMethod.

void TextQueryRetMethod::scoreCollection (DOCID_T docid, IndexedRealVector & results) [virtual]
 

add support for scoring an existing document against the collection

void TextQueryRetMethod::scoreCollection (const QueryRep & qry, IndexedRealVector & results) [virtual]
 

overriding abstract class method with a general efficient inverted index scoring procedure

Reimplemented from RetrievalMethod.

Reimplemented in CORIRetMethod.

double TextQueryRetMethod::scoreDoc (const QueryRep & qry, DOCID_T docID) [virtual]
 

overriding abstract class method

Implements RetrievalMethod.

double TextQueryRetMethod::scoreDocPassages (const TermQuery & qRep, DOCID_T docID, PassageScoreVector & scores, int psgSize, int overlap) [virtual]
 

Score a query for each passage of a document.

Parameters:
qRep  the TextQuery to score.
docID  the document to score.
scores  accumulator for the passage scores, in passage order.
psgSize  the number of tokens for sliding window.
overlap  the number of tokens to overlap in each passage.
Returns:
the maximum score over the passages.
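To make the psgSize and overlap parameters concrete, here is a minimal sliding-window sketch. It assumes that consecutive passages start psgSize - overlap tokens apart and that the last passage may be shorter; the function name, the token vector, and the toy per-passage score (a count of query-term occurrences) are all hypothetical and stand in for the real scoring machinery.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not the Lemur implementation.
// Appends one score per passage to `scores` (in passage order) and
// returns the maximum passage score, or 0.0 for an empty document.
double scoreDocPassagesSketch(const std::set<std::string>& queryTerms,
                              const std::vector<std::string>& docTokens,
                              std::vector<double>& scores,
                              int psgSize, int overlap) {
    assert(psgSize > 0 && overlap >= 0 && overlap < psgSize);
    const std::size_t size = static_cast<std::size_t>(psgSize);
    const std::size_t step = static_cast<std::size_t>(psgSize - overlap);
    double best = 0.0;
    for (std::size_t begin = 0; begin < docTokens.size(); begin += step) {
        const std::size_t end = std::min(begin + size, docTokens.size());
        double s = 0.0;  // toy score: matched tokens inside the window
        for (std::size_t i = begin; i < end; ++i) {
            if (queryTerms.count(docTokens[i]) > 0) s += 1.0;
        }
        scores.push_back(s);
        best = std::max(best, s);
        if (end == docTokens.size()) break;  // window reached the end of the document
    }
    return best;
}
```

With psgSize = 3 and overlap = 1, a seven-token document yields windows starting at tokens 0, 2, and 4, so each window shares one token with its predecessor.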

double TextQueryRetMethod::scoreDocVector (const TextQueryRep & qry, DOCID_T docID, FreqVector & docVector) [virtual]
 

virtual ScoreFunction* TextQueryRetMethod::scoreFunc () [pure virtual]
 

return the scoring function pointer

Implemented in CORIRetMethod, CosSimRetMethod, OkapiRetMethod, SimpleKLRetMethod, and TFIDFRetMethod.

void TextQueryRetMethod::scoreInvertedIndex (const QueryRep & qryRep, IndexedRealVector & scores, bool scoreAll = false) [virtual]
 

Efficient scoring with the inverted index.

A general scoring procedure shared by many different models (assuming "sortedScores" has memory allocated).
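The general idea of term-at-a-time scoring with an inverted index and a score accumulator can be sketched as follows. Everything in the sketch namespace is hypothetical: the posting layout, the toy term weight, and in particular the reading of scoreAll (assumed here to mean that documents matching no query term are reported as well) are assumptions for illustration, not the Lemur implementation.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical types, NOT the Lemur class definitions

struct Posting { int docID; int tf; };
using InvertedIndex = std::map<std::string, std::vector<Posting>>;
using WeightedQuery = std::vector<std::pair<std::string, double>>;
using Results = std::vector<std::pair<int, double>>;  // (docID, score)

Results scoreInvertedIndex(const InvertedIndex& index,
                           const WeightedQuery& query,
                           int docCount, bool scoreAll = false) {
    // One accumulator slot per document; IDs are assumed to run from 1 to docCount.
    const std::size_t slots = static_cast<std::size_t>(docCount) + 1;
    std::vector<double> acc(slots, 0.0);
    std::vector<bool> matched(slots, false);

    // Visit only the postings of query terms, never the full collection.
    for (const auto& qt : query) {
        auto it = index.find(qt.first);
        if (it == index.end()) continue;
        for (const Posting& p : it->second) {
            const std::size_t d = static_cast<std::size_t>(p.docID);
            acc[d] += qt.second * p.tf;  // toy w: query weight times document tf
            matched[d] = true;
        }
    }

    // Emit results; a score adjustment g would be applied to acc[d] here.
    Results results;
    for (int d = 1; d <= docCount; ++d) {
        const std::size_t i = static_cast<std::size_t>(d);
        if (scoreAll || matched[i]) results.emplace_back(d, acc[i]);
    }
    return results;
}

}  // namespace sketch
```

The cost of the main loop is proportional to the total length of the postings lists for the query terms, which is why this procedure is usually much cheaper than scoring every document individually.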

virtual void TextQueryRetMethod::updateQuery (QueryRep & qryRep, const DocIDSet & relDocs) [inline, virtual]
 

update the query

Implements RetrievalMethod.

virtual void TextQueryRetMethod::updateTextQuery (TextQueryRep & qryRep, const DocIDSet & relDocs) [pure virtual]
 

Modify/update the query representation based on a set of (presumably) relevant documents.

Implemented in CORIRetMethod, CosSimRetMethod, OkapiRetMethod, SimpleKLRetMethod, and TFIDFRetMethod.


Member Data Documentation

bool TextQueryRetMethod::cacheDocReps [protected]
 

whether or not to cache document representations

DocumentRep** TextQueryRetMethod::docReps [protected]
 

cache document reps.

int TextQueryRetMethod::docRepsSize [protected]
 

number of documents plus 1, the size of the docReps array.

ScoreAccumulator& TextQueryRetMethod::scAcc [protected]
 


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:58 2004 for Lemur Toolkit by doxygen 1.2.18