
InvIndex Class Reference

#include <InvIndex.hpp>

Inherits Index.

Inherited by InvFPIndex.

Public Methods

 InvIndex ()
 InvIndex (const string &indexName)
 ~InvIndex ()
void setMesgStream (ostream *lemStream)
 Set the message stream.

Open index
bool open (const string &indexName)
 Open previously created Index with given prefix, return true if opened successfully.

Spelling and index conversion
TERMID_T term (const TERM_T &word) const
 Convert a term spelling to a termID.

const TERM_T term (TERMID_T termID) const
 Convert a termID to its spelling.

DOCID_T document (const EXDOCID_T &docIDStr) const
 Convert a spelling to docID.

const EXDOCID_T document (DOCID_T docID) const
 Convert a docID to its spelling.
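
The two-way conversion between spellings and IDs can be pictured as a pair of lookup structures, much like the terms array and termtable map listed under Protected Attributes. The following is a minimal sketch and not Lemur code: ToyTermMap, its add method, and the convention that ID 0 stands for an unknown term are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the term half of an index; NOT Lemur code.
// It mirrors two attributes named on this page: an ID -> spelling array
// (terms) and a spelling -> ID map (termtable).
class ToyTermMap {
public:
    ToyTermMap() {
        // Assumption: ID 0 is reserved for out-of-vocabulary terms.
        terms_.push_back("[OOV]");
    }

    // Register a spelling and return its ID (the existing ID if already known).
    int add(const std::string& word) {
        std::map<std::string, int>::const_iterator it = termtable_.find(word);
        if (it != termtable_.end()) return it->second;
        int id = static_cast<int>(terms_.size());
        terms_.push_back(word);
        termtable_[word] = id;
        return id;
    }

    // Spelling -> ID; unknown spellings map to 0.
    int term(const std::string& word) const {
        std::map<std::string, int>::const_iterator it = termtable_.find(word);
        return it == termtable_.end() ? 0 : it->second;
    }

    // ID -> spelling; out-of-range IDs map to the OOV spelling.
    const std::string& term(int termID) const {
        if (termID < 0 || termID >= static_cast<int>(terms_.size())) return terms_[0];
        return terms_[termID];
    }

private:
    std::vector<std::string> terms_;        // indexed by term ID
    std::map<std::string, int> termtable_;  // spelling -> term ID
};

// Example use; the asserts document the values this toy class produces.
inline void toyTermMapExample() {
    ToyTermMap m;
    int id = m.add("retrieval");
    assert(m.term("retrieval") == id);  // spelling -> ID
    assert(m.term(id) == "retrieval");  // ID -> spelling
    assert(m.term("unseen") == 0);      // unknown spelling
}
```

Keeping both directions in memory trades space for constant or logarithmic lookups in either direction; the same pattern would apply to document IDs and their external spellings.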

const DocumentManager * docManager (DOCID_T docID) const

Summary counts
COUNT_T docCount () const
 Total count (i.e., number) of documents in collection.

COUNT_T termCountUnique () const
 Total count of unique terms in collection.

COUNT_T termCount (TERMID_T termID) const
 Total number of occurrences of a term in the collection.

COUNT_T termCount () const
 Total number of term occurrences in the collection.

float docLengthAvg () const
 Average document length.

COUNT_T docCount (TERMID_T termID) const
 Number of documents that contain a given term.

COUNT_T docLength (DOCID_T docID) const
 Number of terms in a document, including stop words.

virtual COUNT_T docLengthCounted (DOCID_T docID) const
 Total count of terms in given document, not including stop words.
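
To make the distinction between these counts concrete, here is a small sketch that computes each of them over a toy in-memory collection. It is not how InvIndex stores or computes anything: ToyStats is a hypothetical type, terms are plain strings rather than term IDs, and the average length is assumed to be total term occurrences divided by the number of documents.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy collection: each document is a list of term spellings.
struct ToyStats {
    std::vector<std::vector<std::string> > docs;

    // Number of documents in the collection.
    std::size_t docCount() const { return docs.size(); }

    // Total number of term occurrences in the collection.
    std::size_t termCount() const {
        std::size_t total = 0;
        for (std::size_t d = 0; d < docs.size(); ++d) total += docs[d].size();
        return total;
    }

    // Number of distinct terms in the collection.
    std::size_t termCountUnique() const {
        std::set<std::string> seen;
        for (std::size_t d = 0; d < docs.size(); ++d)
            seen.insert(docs[d].begin(), docs[d].end());
        return seen.size();
    }

    // Occurrences of one term across the whole collection.
    std::size_t termCount(const std::string& word) const {
        std::size_t total = 0;
        for (std::size_t d = 0; d < docs.size(); ++d)
            total += static_cast<std::size_t>(
                std::count(docs[d].begin(), docs[d].end(), word));
        return total;
    }

    // Number of documents that contain the term at least once.
    std::size_t docCount(const std::string& word) const {
        std::size_t n = 0;
        for (std::size_t d = 0; d < docs.size(); ++d)
            if (std::find(docs[d].begin(), docs[d].end(), word) != docs[d].end()) ++n;
        return n;
    }

    // Number of terms in one document (position in docs, starting at 0).
    std::size_t docLength(std::size_t doc) const { return docs.at(doc).size(); }

    // Assumption: average length = total occurrences / number of documents.
    float docLengthAvg() const {
        if (docs.empty()) return 0.0f;
        return static_cast<float>(termCount()) / static_cast<float>(docCount());
    }
};

// Example use; the asserts document the values for this toy collection.
inline void toyStatsExample() {
    ToyStats s;
    s.docs.push_back(std::vector<std::string>{"a", "b", "a"});
    s.docs.push_back(std::vector<std::string>{"b", "c"});
    assert(s.termCount("a") == 2);  // "a" occurs twice overall
    assert(s.docCount("a") == 1);   // but in only one document
}
```

The example at the end shows why termCount(termID) and docCount(termID) differ: a term repeated inside one document raises the first count but not the second.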

Index entry access
DocInfoList * docInfoList (TERMID_T termID) const
 Document entries in a term index.
See also:
DocList, InvFPDocList


TermInfoList * termInfoList (DOCID_T docID) const
 Word entries in a document index (bag of words).
See also:
TermList
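
The two access paths above (documents for a term, terms for a document) correspond to an inverted list and a forward bag-of-words list. The sketch below illustrates that idea with plain standard containers. It does not use or reproduce Lemur's DocInfoList or TermInfoList classes; ToyInvertedIndex, Postings, Bag, and addOccurrence are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical in-memory structures; NOT Lemur's DocInfoList/TermInfoList.
typedef int TermId;
typedef int DocId;
typedef std::vector<std::pair<DocId, int> > Postings;  // (docID, count in doc)
typedef std::map<TermId, int> Bag;                      // termID -> count in doc

struct ToyInvertedIndex {
    std::map<TermId, Postings> inverted;  // term -> documents containing it
    std::map<DocId, Bag> forward;         // document -> bag of words

    // Record one occurrence of a term in a document. Documents are assumed
    // to be added one after another, so the last posting for a term is
    // always the document currently being processed.
    void addOccurrence(DocId doc, TermId term) {
        Postings& p = inverted[term];
        if (!p.empty() && p.back().first == doc) {
            ++p.back().second;
        } else {
            p.push_back(std::make_pair(doc, 1));
        }
        ++forward[doc][term];
    }

    // Analogue of docInfoList(termID): empty list for an unseen term.
    Postings docInfoList(TermId term) const {
        std::map<TermId, Postings>::const_iterator it = inverted.find(term);
        if (it == inverted.end()) return Postings();
        return it->second;
    }

    // Analogue of termInfoList(docID): empty bag for an unseen document.
    Bag termInfoList(DocId doc) const {
        std::map<DocId, Bag>::const_iterator it = forward.find(doc);
        if (it == forward.end()) return Bag();
        return it->second;
    }
};

// Example use; the asserts document the values for this toy index.
inline void toyInvertedIndexExample() {
    ToyInvertedIndex idx;
    idx.addOccurrence(1, 5);
    idx.addOccurrence(1, 5);
    idx.addOccurrence(2, 5);
    Postings p = idx.docInfoList(5);
    assert(p.size() == 2);     // term 5 appears in two documents
    assert(p[0].second == 2);  // twice in document 1
}
```

In this toy version the length of a term's postings list plays the role of docCount(termID), and the sum of its counts plays the role of termCount(termID).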



Protected Methods

bool fullToc (const string &fileName)
 Read in the full table of contents (TOC).

bool indexLookup ()
 Read in the index lookup table.

bool invFileIDs ()
 Read in the inverted index filenames map.

bool docMgrIDs ()
 Read in the map of document manager internal and external IDs.

bool dtLookup ()
 Read in the dt index lookup table in format ver1.9 (and up?).

bool dtLookup_ver1 ()
 Read in the dt index lookup table in a format older than ver1.9.

bool dtFileIDs ()
 Read in the dt index filenames map.

bool termIDs ()
 Read in the termID to term spelling map.

bool docIDs ()
 Read in the docID to document spelling map.


Protected Attributes

LOC_T * counts
string * names
float aveDocLen
inv_entry * lookup
dt_entry * dtlookup
int dtloaded
TERM_T * terms
EXDOCID_T * docnames
string * dtfiles
ifstream * dtfstreams
string * invfiles
ifstream * invfstreams
vector< DocumentManager * > docmgrs
map< TERM_T, TERMID_T, less< TERM_T > > termtable
map< EXDOCID_T, DOCID_T, less< EXDOCID_T > > doctable
ostream * msgstream

Constructor & Destructor Documentation

InvIndex::InvIndex ()

InvIndex::InvIndex (const string & indexName)

InvIndex::~InvIndex ()


Member Function Documentation

COUNT_T InvIndex::docCount (TERMID_T termID) const [virtual]

Number of documents that contain a given term.

Implements Index.

COUNT_T InvIndex::docCount () const [inline, virtual]

Total count (i.e., number) of documents in collection.

Implements Index.

bool InvIndex::docIDs () [protected]

Read in the docID to document spelling map.

DocInfoList * InvIndex::docInfoList (TERMID_T termID) const [virtual]

Document entries in a term index.

See also:
DocList, InvFPDocList

Implements Index.

Reimplemented in InvFPIndex.

COUNT_T InvIndex::docLength (DOCID_T docID) const [virtual]

Number of terms in a document, including stop words.

Implements Index.

float InvIndex::docLengthAvg () [virtual]

Average document length.

Implements Index.

COUNT_T InvIndex::docLengthCounted (DOCID_T docID) const [virtual]

Total count of terms in given document, not including stop words.

Reimplemented in InvFPIndex.

const DocumentManager * InvIndex::docManager (DOCID_T docID) const [virtual]

Return the document manager used to get at the source of the document with this document ID.

Reimplemented from Index.

bool InvIndex::docMgrIDs () [protected]

Read in the map of document manager internal and external IDs.

const EXDOCID_T InvIndex::document (DOCID_T docID) const [virtual]

Convert a docID to its spelling.

Implements Index.

DOCID_T InvIndex::document (const EXDOCID_T & docIDStr) const [virtual]

Convert a spelling to docID.

Implements Index.

bool InvIndex::dtFileIDs () [protected]

Read in the dt index filenames map.

bool InvIndex::dtLookup () [protected]

Read in the dt index lookup table in format ver1.9 (and up?).

bool InvIndex::dtLookup_ver1 () [protected]

Read in the dt index lookup table in a format older than ver1.9.

bool InvIndex::fullToc (const string & fileName) [protected]

Read in the full table of contents (TOC).

bool InvIndex::indexLookup () [protected]

Read in the index lookup table.

bool InvIndex::invFileIDs () [protected]

Read in the inverted index filenames map.

bool InvIndex::open (const string & indexName) [virtual]

Open previously created Index with given prefix, return true if opened successfully.

Implements Index.

void InvIndex::setMesgStream (ostream * lemStream)

Set the message stream.

const TERM_T InvIndex::term (TERMID_T termID) const [virtual]

Convert a termID to its spelling.

Implements Index.

TERMID_T InvIndex::term (const TERM_T & word) const [virtual]

Convert a term spelling to a termID.

Implements Index.

COUNT_T InvIndex::termCount () const [inline, virtual]

Total number of term occurrences in the collection.

Implements Index.

COUNT_T InvIndex::termCount (TERMID_T termID) const [virtual]

Total number of occurrences of a term in the collection.

Implements Index.

COUNT_T InvIndex::termCountUnique () const [inline, virtual]

Total count of unique terms in collection.

Implements Index.

bool InvIndex::termIDs () [protected]

Read in the termID to term spelling map.

TermInfoList * InvIndex::termInfoList (DOCID_T docID) const [virtual]

Word entries in a document index (bag of words).

See also:
TermList

Implements Index.

Reimplemented in InvFPIndex.


Member Data Documentation

float InvIndex::aveDocLen [protected]
 

LOC_T* InvIndex::counts [protected]
 

vector<DocumentManager*> InvIndex::docmgrs [protected]
 

EXDOCID_T* InvIndex::docnames [protected]
 

map<EXDOCID_T, DOCID_T, less<EXDOCID_T> > InvIndex::doctable [protected]
 

string* InvIndex::dtfiles [protected]
 

ifstream* InvIndex::dtfstreams [protected]
 

int InvIndex::dtloaded [protected]
 

dt_entry* InvIndex::dtlookup [protected]
 

string* InvIndex::invfiles [protected]
 

ifstream* InvIndex::invfstreams [protected]
 

inv_entry* InvIndex::lookup [protected]
 

ostream* InvIndex::msgstream [protected]
 

string* InvIndex::names [protected]
 

TERM_T* InvIndex::terms [protected]
 

map<TERM_T, TERMID_T, less<TERM_T> > InvIndex::termtable [protected]
 


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:41 2004 for Lemur Toolkit by doxygen 1.2.18