
PassageRep Class Reference

#include <PassageRep.hpp>

Inherits DocumentRep.

Public Methods

 PassageRep (DocumentRep &dRep, int d, int p, int o)
 Fixed-size window passage with overlap.

PassageRep::iterator begin ()
PassageRep::iterator end ()
void setEnd (int s, int e, int dl)
 Update the end and length values.

int passageTF (TERMID_T tid, MatchInfo *matches) const
 Term frequency of a term within the current passage.

int getStart () const
 Start of the current passage.

int getEnd () const
 End + 1 of the current passage.

virtual double termWeight (TERMID_T termID, const DocInfo *info) const
 Delegate call to termWeight of the encapsulated DocumentRep.

virtual double scoreConstant () const
 Delegate call to scoreConstant of the encapsulated DocumentRep.


Protected Attributes

DocumentRep & docRep
 DocumentRep for the whole document. Calls to termWeight and scoreConstant are delegated to it.

int psgSize
 Size of the passage, in number of tokens.

int overlap
 Number of tokens to overlap when advancing the passage window.

int docEnd
 Length of the whole document.

int start
 Index of the start of the current passage.

int pEnd
 Index of the end of the current passage.


Detailed Description

Passage representation for a document. Supports iteration over passages of a fixed window size, with consecutive windows overlapping by K terms. Encapsulates the DocumentRep for the whole document and modifies its docLength attribute. Calls to termWeight and scoreConstant are delegated to the encapsulated DocumentRep.

Note: TFIDFRetMethod with BM25 tf weighting and OkapiRetMethod will not compute correct scores for passages, because their formulas use the average document length of the collection. The difference should be small.
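
To illustrate the note on BM25 tf weighting, here is a minimal sketch of the textbook BM25 term-frequency component. It is not taken from the Lemur source: the function name bm25Tf and the default values of k1 and b are assumptions, and the toolkit's own formula may differ in detail. The sketch only shows why a length that differs from the collection average shifts the weight.

```cpp
#include <cassert>

// Textbook BM25 term-frequency component. This is an assumption made for
// illustration; Lemur's exact formula and its k1 and b defaults may differ.
//   rawTf      raw count of the term
//   length     length used for normalization (document or passage)
//   avgLength  average document length in the collection
double bm25Tf(double rawTf, double length, double avgLength,
              double k1 = 1.2, double b = 0.75) {
    assert(avgLength > 0.0);
    // Length normalization: equals k1 when length == avgLength.
    const double norm = k1 * (1.0 - b + b * length / avgLength);
    return rawTf * (k1 + 1.0) / (rawTf + norm);
}
```

In this sketch, if a passage length is supplied as length while avgLength is still computed over whole documents, the same raw count yields a different weight than it would for the full document. That is the kind of mismatch the note describes.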


Constructor & Destructor Documentation

PassageRep::PassageRep (DocumentRep & dRep, int d, int p, int o) [inline]
 

Fixed-size window passage with overlap.

Parameters:
dRep  DocumentRep for the document as returned by computeDocRep.
d  length of whole document.
p  size of passage in terms of tokens.
o  number of tokens to overlap.
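
As a rough illustration of how the d, p, and o parameters interact, the sketch below lists the token ranges a fixed-size window with overlap could cover. The helper passageWindows is hypothetical and not part of the Lemur API. It assumes each window starts p - o tokens after the previous one and that the last window is truncated at the document end; the actual iterator may handle the final window differently.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the Lemur API: returns the [start, end)
// token ranges of fixed-size passages over a document of docLen tokens.
// Assumes each window starts (psgSize - overlap) tokens after the previous one.
std::vector<std::pair<int, int>> passageWindows(int docLen, int psgSize, int overlap) {
    assert(docLen >= 0 && psgSize > 0 && overlap >= 0 && overlap < psgSize);
    std::vector<std::pair<int, int>> windows;
    const int step = psgSize - overlap;
    for (int start = 0; start < docLen; start += step) {
        // end is one past the last token, truncated at the document end
        const int end = std::min(start + psgSize, docLen);
        windows.emplace_back(start, end);
        if (end == docLen) break;  // the final window reached the document end
    }
    return windows;
}
```

Under these assumptions, a 10-token document with p = 4 and o = 2 yields the ranges [0, 4), [2, 6), [4, 8), and [6, 10).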


Member Function Documentation

PassageRep::iterator PassageRep::begin () [inline]
 

PassageRep::iterator PassageRep::end () [inline]
 

int PassageRep::getEnd () const [inline]
 

End + 1 of the current passage.

int PassageRep::getStart () const [inline]
 

Start of the current passage.

int PassageRep::passageTF (TERMID_T tid, MatchInfo * matches) const [inline]
 

Term frequency of a term within the current passage.

Parameters:
tid  the term id to count.
matches  the term matches returned by MatchInfo::getMatches for the document. This list is used for efficiency, as it is shorter than the whole TermInfoList for the document.
Returns:
the frequency of a term within the current passage.
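
The idea of counting a term only inside the current window can be sketched as follows. The TermMatch struct and countInPassage function are hypothetical stand-ins, not Lemur's MatchInfo layout or the actual passageTF body. The sketch assumes each match records a term id and a token position, and that the passage covers the half-open range from start to end (with end being one past the last token, as getEnd reports).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one entry of the match list: a term id and the
// token position where it occurred. This is not Lemur's MatchInfo layout.
struct TermMatch {
    int termId;
    int position;
};

// Counts the occurrences of termId whose position lies in [start, end).
int countInPassage(const std::vector<TermMatch>& matches, int termId,
                   int start, int end) {
    assert(start <= end);
    int tf = 0;
    for (const TermMatch& m : matches) {
        if (m.termId == termId && m.position >= start && m.position < end) {
            ++tf;
        }
    }
    return tf;
}
```

Scanning only the match list keeps the loop proportional to the number of matched terms rather than to the full term list of the document, which is the efficiency argument given for the matches parameter.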

virtual double PassageRep::scoreConstant () const [inline, virtual]
 

Delegate call to scoreConstant of the encapsulated DocumentRep.

Implements DocumentRep.

void PassageRep::setEnd (int s, int e, int dl) [inline]
 

Update the end and length values.

virtual double PassageRep::termWeight (TERMID_T termID, const DocInfo * info) const [inline, virtual]
 

Delegate call to termWeight of the encapsulated DocumentRep.

Implements DocumentRep.
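
The delegation described for termWeight and scoreConstant follows a common wrapper pattern, sketched below with simplified, hypothetical types. Rep, WholeDocRep, and WindowRep are illustrative names only, and the signatures are reduced compared with the real DocumentRep interface.

```cpp
#include <cassert>

// Hypothetical, simplified interface standing in for DocumentRep.
class Rep {
public:
    virtual ~Rep() = default;
    virtual double termWeight(int termId) const = 0;
    virtual double scoreConstant() const = 0;
};

// Hypothetical whole-document representation with placeholder weights.
class WholeDocRep : public Rep {
public:
    double termWeight(int termId) const override {
        assert(termId >= 0);
        return 0.5 * termId;  // placeholder value, for illustration only
    }
    double scoreConstant() const override { return 1.0; }
};

// Wrapper that holds a reference to another Rep and forwards both calls,
// mirroring how the passage delegates to the encapsulated representation.
class WindowRep : public Rep {
public:
    explicit WindowRep(Rep& inner) : inner_(inner) {}
    double termWeight(int termId) const override { return inner_.termWeight(termId); }
    double scoreConstant() const override { return inner_.scoreConstant(); }
private:
    Rep& inner_;  // must outlive the wrapper, since only a reference is held
};
```

Because the wrapper stores a reference, as the docRep member does, the wrapped object has to outlive the wrapper.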


Member Data Documentation

int PassageRep::docEnd [protected]
 

Length of the whole document.

DocumentRep& PassageRep::docRep [protected]
 

DocumentRep for the whole document. Calls to termWeight and scoreConstant are delegated to it.

int PassageRep::overlap [protected]
 

Number of tokens to overlap when advancing the passage window.

int PassageRep::pEnd [protected]
 

Index of the end of the current passage.

int PassageRep::psgSize [protected]
 

Size of the passage, in number of tokens.

int PassageRep::start [protected]
 

Index of the start of the current passage.


The documentation for this class was generated from the following file:
PassageRep.hpp

Generated on Wed Nov 3 12:59:50 2004 for Lemur Toolkit by doxygen 1.2.18