PDict Class Reference

Probabilistic dictionary using Keyfile for data storage.

#include <PDict.hpp>

Public Methods

 PDict ()
 default constructor

 ~PDict ()
 clean up

DictEntryVector * getTranslations (const string &term, DictEntryFilter *filter=NULL) const
 Get dictionary entries (translations) for a term.

int numTranslations (const string &term, DictEntryFilter *filter=NULL) const
 Get the number of dictionary entries (translations) for a term.

int getNumPairs () const
 Get the total size of the dictionary.

int getSourceCount () const
 Get the number of unique terms in the source vocabulary.

int getTargetCount () const
 Get the number of unique terms in the target vocabulary.

const string & getName () const
 Get the name of the dictionary.

bool isUsingCounts () const
 Check whether the dictionary uses counts rather than probabilities.

void setUsingCounts (bool val)
 Set the flag for using counts or probabilities.

void add (const string &source, DictEntry &value, double(*compose)(double, double)=NULL)
 Add an entry for a term.

void remove (const string &source, DictEntry &value)
 Remove an entry for a term.

void remove (const string &source)
 Remove all entries for a term.

void write (const string &outputName, const string &delim)
 Output dictionary as plain text, separator delimited values.

bool read (const string &dictName, const string &delim, bool counts=false)
 Input a dictionary from plain text, separator delimited values.

bool open (const string &dictName)
 Open an existing probabilistic dictionary.

bool create (const string &dictName)
 Create a new, empty probabilistic dictionary.

void close ()
 Close the dictionary. Flushes all buffers and closes all files.

void normalize ()
 Normalize the probabilities of entries to sum to one. Normalizes all entries, updating the dictionary.

void startIteration ()
 Initialize for iteration over all keys.

DictEntryVector * nextTranslations (string &term, DictEntryFilter *filter=NULL) const
 Get the next key's dictionary entries (translations).


Detailed Description

Probabilistic dictionary using Keyfile for data storage.


Constructor & Destructor Documentation

PDict::PDict ()

default constructor

PDict::~PDict ()

clean up


Member Function Documentation

void PDict::add (const string &source, DictEntry &value, double(*compose)(double, double)=NULL)

Add an entry for a term.

Parameters:
source  the key for the entry
value  the value to add
compose  the function used to combine this entry's probability/frequency value with that of an existing entry, if one is already in the dictionary. Default is to sum. Replaces the entry for value if one exists.
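
The effect of compose can be illustrated without the Lemur library. The following is a minimal sketch, assuming a simplified in-memory table in place of the Keyfile-backed store. ToyDict, addEntry, sumCompose, and maxCompose are hypothetical names and are not part of PDict; the argument order (existing value first, new value second) is also an assumption. The sketch shows only the documented behavior: when an entry for the same pair exists, the two values are combined by compose (summed by default) and the stored value is replaced.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the dictionary: (source, target) -> value,
// where the value is a probability or a count.
typedef std::map<std::pair<std::string, std::string>, double> ToyDict;

// Both functions have the documented compose type: double(*)(double, double).
double sumCompose(double existing, double incoming) { return existing + incoming; }
double maxCompose(double existing, double incoming) {
  return existing > incoming ? existing : incoming;
}

// Add a value for (source, target). If an entry already exists, combine the
// two values with compose (summing when compose is NULL) and replace it.
void addEntry(ToyDict &dict, const std::string &source,
              const std::string &target, double value,
              double (*compose)(double, double) = NULL) {
  assert(value >= 0.0);  // probabilities and counts are non-negative
  std::pair<std::string, std::string> key(source, target);
  ToyDict::iterator it = dict.find(key);
  if (it == dict.end()) {
    dict[key] = value;  // first entry for this pair: store as-is
    return;
  }
  double (*combine)(double, double) = compose ? compose : sumCompose;
  it->second = combine(it->second, value);
}
```

Adding 2.0 and then 3.0 for the same pair with no compose function leaves a single entry with value 5.0. Passing maxCompose instead keeps the larger of the stored and the new value.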

void PDict::close ()

Close the dictionary. Flushes all buffers and closes all files.

bool PDict::create (const string &dictName)

Create a new, empty probabilistic dictionary.

Parameters:
dictName  the dictionary file to create.
Returns:
true if created successfully, otherwise false.

const string & PDict::getName () const [inline]

Get the name of the dictionary.

Returns:
The name of the dictionary

int PDict::getNumPairs () const

Get the total size of the dictionary.

Returns:
Total number of pairs in the dictionary

int PDict::getSourceCount () const

Get the number of unique terms in the source vocabulary.

Returns:
Total number of unique source term entries in the dictionary

int PDict::getTargetCount () const

Get the number of unique terms in the target vocabulary.

Returns:
Total number of unique target term entries in the dictionary

DictEntryVector * PDict::getTranslations (const string &term, DictEntryFilter *filter=NULL) const

Get dictionary entries (translations) for a term.

Parameters:
term  the term to look up.
filter  the filter to apply to the entries. If unspecified, defaults to NULL.
Returns:
Pointer to the vector of dictionary entries for the term. The caller is responsible for deleting it.

bool PDict::isUsingCounts () const [inline]

Check whether the dictionary uses counts rather than probabilities.

Returns:
true if the dictionary contains frequencies, otherwise false.

DictEntryVector * PDict::nextTranslations (string &term, DictEntryFilter *filter=NULL) const

Get the next key's dictionary entries (translations).

Parameters:
term  set to the term for this entry.
filter  the filter to apply to the entries. If unspecified, defaults to NULL.
Returns:
Pointer to the vector of dictionary entries for the term. The caller is responsible for deleting it. Returns NULL at end of file.
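
The iteration contract described here (call startIteration, then call nextTranslations until it returns NULL, deleting each returned vector) can be sketched as follows. This is a minimal illustration, assuming an in-memory stand-in rather than the real class. ToyEntry, ToyEntryVector, ToyPDict, and countAllEntries are hypothetical names; the real DictEntry and DictEntryVector types are declared by the Lemur Toolkit and are not reproduced, and the filter argument is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for DictEntry and DictEntryVector.
struct ToyEntry {
  std::string target;
  double prob;
};
typedef std::vector<ToyEntry> ToyEntryVector;

// Mimics only the documented iteration contract of PDict.
class ToyPDict {
public:
  ToyPDict() : pos_(table_.end()) {}
  void add(const std::string &source, const std::string &target, double prob) {
    assert(prob >= 0.0);
    ToyEntry e;
    e.target = target;
    e.prob = prob;
    table_[source].push_back(e);
  }
  void startIteration() { pos_ = table_.begin(); }
  // Sets term to the current key and returns a heap-allocated copy of its
  // entries, or NULL once every key has been visited.
  ToyEntryVector *nextTranslations(std::string &term) {
    if (pos_ == table_.end()) return NULL;
    term = pos_->first;
    ToyEntryVector *result = new ToyEntryVector(pos_->second);
    ++pos_;
    return result;
  }
private:
  std::map<std::string, ToyEntryVector> table_;
  std::map<std::string, ToyEntryVector>::iterator pos_;
};

// Visit every key, count its entries, and free each returned vector.
std::size_t countAllEntries(ToyPDict &dict) {
  std::size_t total = 0;
  std::string term;
  ToyEntryVector *entries;
  dict.startIteration();
  while ((entries = dict.nextTranslations(term)) != NULL) {
    total += entries->size();
    delete entries;  // the caller owns the returned vector
  }
  return total;
}
```

Because each call hands ownership of a new vector to the caller, the delete inside the loop is what prevents a leak. Calling startIteration again restarts the traversal from the first key.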

void PDict::normalize ()

Normalize the probabilities of entries to sum to one. Normalizes all entries, updating the dictionary.
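
As a rough illustration of what normalizing to sum to one means, the sketch below rescales the values stored under each source term. It assumes that normalization is applied per source term, which the description does not state explicitly, and it uses an in-memory table in place of the Keyfile store. ToyEntry, ToyTable, and normalizeTable are hypothetical names, not part of PDict.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary entry: target term plus its value.
struct ToyEntry {
  std::string target;
  double prob;
};
// Hypothetical stand-in for the dictionary: source term -> its entries.
typedef std::map<std::string, std::vector<ToyEntry> > ToyTable;

// Rescale each term's entries so that their values sum to one.
// Terms whose values sum to zero are left unchanged to avoid dividing by zero.
void normalizeTable(ToyTable &table) {
  for (ToyTable::iterator it = table.begin(); it != table.end(); ++it) {
    std::vector<ToyEntry> &entries = it->second;
    double total = 0.0;
    for (std::size_t i = 0; i < entries.size(); ++i) {
      assert(entries[i].prob >= 0.0);  // counts and probabilities are non-negative
      total += entries[i].prob;
    }
    if (total <= 0.0) continue;
    for (std::size_t i = 0; i < entries.size(); ++i) {
      entries[i].prob /= total;
    }
  }
}
```

For a term whose two entries hold the counts 3 and 1, the values become 0.75 and 0.25 after the call.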

int PDict::numTranslations (const string &term, DictEntryFilter *filter=NULL) const

Get the number of dictionary entries (translations) for a term.

Parameters:
term  the term to look up.
filter  the filter to apply to the entries. If unspecified, defaults to NULL.
Returns:
Number of dictionary entries for the term after filtering.

bool PDict::open (const string &dictName)

Open an existing probabilistic dictionary.

Parameters:
dictName  the dictionary file to open.
Returns:
true if opened successfully, otherwise false.

bool PDict::read (const string &dictName, const string &delim, bool counts=false)

Input a dictionary from plain text, separator delimited values.

Parameters:
dictName  the file to read
delim  the delimiter to use.
counts  true if the input file contains frequencies. Default is false.
Returns:
true if created successfully, otherwise false. NB: single-character delimiter? Escaping of the delimiter in source/target terms?
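
The escaping concern in the note above can be made concrete with a small sketch of delimiter splitting. This is not the parser used by PDict; splitFields is a hypothetical helper, and the three-field sample line mentioned afterwards is an assumed layout, since the actual field order of the text format is not documented here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split line on every occurrence of delim. No escape handling is attempted,
// so a delimiter that occurs inside a term produces an extra field.
std::vector<std::string> splitFields(const std::string &line,
                                     const std::string &delim) {
  assert(!delim.empty());  // an empty delimiter would never advance
  std::vector<std::string> fields;
  std::string::size_type start = 0;
  std::string::size_type pos;
  while ((pos = line.find(delim, start)) != std::string::npos) {
    fields.push_back(line.substr(start, pos - start));
    start = pos + delim.size();
  }
  fields.push_back(line.substr(start));  // text after the last delimiter
  return fields;
}
```

With ";" as the delimiter, the line "chat;cat;0.9" splits into three fields. A line such as "a;b;cat;0.9", where the first term itself contains the delimiter, splits into four, which is why terms containing the delimiter would need to be escaped.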

void PDict::remove (const string &source)

Remove all entries for a term.

Parameters:
source  the key for the entry

void PDict::remove (const string &source, DictEntry &value)

Remove an entry for a term.

Parameters:
source  the key for the entry
value  the value to delete

void PDict::setUsingCounts (bool val) [inline]

Set the flag for using counts or probabilities.

Parameters:
val  true if the dictionary contains frequencies, otherwise false.

void PDict::startIteration () [inline]

Initialize for iteration over all keys.

void PDict::write (const string &outputName, const string &delim)

Output dictionary as plain text, separator delimited values.

Parameters:
outputName  the name of the file to write to.
delim  the delimiter to use. NB: single-character delimiter? Escaping of the delimiter in source/target terms?


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 12:59:50 2004 for Lemur Toolkit by doxygen 1.2.18