java.lang.Object
  |
  +--EDU.gatech.cc.is.learning.i_ReinforcementLearner_id
        |
        +--EDU.cmu.cs.coral.learning.i_PriQLearner_id
An object that learns to select among several actions based on a reward, using the Prioritized Sweeping technique of Moore.
The module learns to select a discrete output based on the current state and a continuous reinforcement input. The "i" at the start and at the end of the class name indicates that the class takes an integer as input and returns an integer as output. The "d" indicates that the reinforcement input is a double (i.e. a continuous value).
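The Prioritized Sweeping idea named above can be illustrated with a small, self-contained sketch. This is not the code of this class: `SweepSketch`, its constants, and the four-state toy model are hypothetical, and the model is assumed to be known and deterministic, whereas a learner would have to estimate it from experience. The sketch keeps a priority queue of states ordered by how much a neighbouring value just changed, and always backs up the most urgent state first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of prioritized sweeping over a known deterministic model. */
class SweepSketch {
    static final double GAMMA = 0.9;       // discount factor (assumed)
    static final double THRESHOLD = 1e-6;  // smallest change worth propagating (assumed)

    /** Full backup of state s: best one-step reward plus discounted successor value. */
    static double backup(int s, int[][] next, double[][] reward, double[] v) {
        double best = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < next[s].length; a++) {
            best = Math.max(best, reward[s][a] + GAMMA * v[next[s][a]]);
        }
        return best;
    }

    /** next[s][a] is the successor state, reward[s][a] the immediate reward. */
    static double[] sweep(int[][] next, double[][] reward, int maxBackups) {
        int n = next.length;
        double[] v = new double[n];

        // preds.get(t) lists every state that has an action leading to t.
        List<List<Integer>> preds = new ArrayList<>();
        for (int s = 0; s < n; s++) preds.add(new ArrayList<>());
        for (int s = 0; s < n; s++) {
            for (int t : next[s]) {
                if (!preds.get(t).contains(s)) preds.get(t).add(s);
            }
        }

        // Queue entries are {priority, state}; the largest priority is polled first.
        PriorityQueue<double[]> queue =
            new PriorityQueue<>((x, y) -> Double.compare(y[0], x[0]));
        for (int s = 0; s < n; s++) {
            double error = Math.abs(backup(s, next, reward, v) - v[s]);
            if (error > THRESHOLD) queue.add(new double[] {error, s});
        }

        for (int k = 0; k < maxBackups && !queue.isEmpty(); k++) {
            int s = (int) queue.poll()[1];
            double old = v[s];
            v[s] = backup(s, next, reward, v);
            double change = Math.abs(v[s] - old);
            if (change > THRESHOLD) {
                // A change here can affect every predecessor, so schedule them.
                for (int pred : preds.get(s)) {
                    queue.add(new double[] {GAMMA * change, pred});
                }
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // Toy chain: action 0 moves right, action 1 stays; entering state 3 pays 1.
        int[][] next = {{1, 0}, {2, 1}, {3, 2}, {3, 3}};
        double[][] reward = {{0, 0}, {0, 0}, {1, 0}, {0, 0}};
        double[] v = sweep(next, reward, 100);
        for (int s = 0; s < v.length; s++) System.out.println("V(" + s + ") = " + v[s]);
    }
}
```

In this toy chain only a handful of backups are needed, because work is spent only on states whose successors actually changed. Whether the class's `changeQueue` field is used in exactly this way is not stated on this page.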
Copyright (c)2000 Tucker Balch
Inner Class Summary | |
protected class |
i_PriQLearner_id.state
|
Field Summary | |
protected PriorityQueue |
changeQueue
|
protected int |
criteria
|
static int |
DISCOUNTED
Used to indicate the learner uses discounted rewards. |
protected int |
numactions
|
protected i_PriQLearner_id.state[] |
states
|
Fields inherited from class EDU.gatech.cc.is.learning.i_ReinforcementLearner_id |
logging,
numactions,
numstates,
policyfilename |
Constructor Summary | |
i_PriQLearner_id(int numstatesin,
int numactionsin)
Instantiate a Prioritized Sweeping learner using default parameters. |
|
i_PriQLearner_id(int numstatesin,
int numactionsin,
int criteriain)
Instantiate a Prioritized Sweeping learner using default parameters. |
|
i_PriQLearner_id(int numstatesin,
int numactionsin,
int criteriain,
long seedin)
Instantiate a Prioritized Sweeping learner using default parameters. |
Method Summary | |
void |
endTrial(double Vn,
double rn)
Called when the current trial ends. |
double |
getAvgReward()
Report the average reward per step in the trial. |
int |
getPolicyChanges()
Report the number of policy changes in the trial. |
int |
getQueries()
Report the number of queries in the trial. |
int |
initTrial(int s)
Called to initialize for a new trial. |
int |
query(int yn,
double rn)
Select an output based on the state and reward. |
void |
readPolicy()
Read the policy from a file. |
void |
savePolicy()
Write the policy to a file. |
void |
saveProfile(java.lang.String profile_filename)
Write the policy profile to a file. |
void |
setGamma(double g)
Set gamma for the Q-learner. |
void |
setRandomRate(double r)
Set the random rate for the Q-learner. |
void |
setRandomRateDecay(double r)
Set the random decay for the Q-learner. |
java.lang.String |
toString()
Generate a String that describes the current state of the learner. |
protected void |
updateState(i_PriQLearner_id.state st)
|
Methods inherited from class EDU.gatech.cc.is.learning.i_ReinforcementLearner_id |
log,
loggingOff,
loggingOn,
loggingOn,
setPolicyFileName |
Methods inherited from class java.lang.Object |
clone,
equals,
finalize,
getClass,
hashCode,
notify,
notifyAll,
wait,
wait,
wait |
Field Detail |
public static final int DISCOUNTED
protected int criteria
protected i_PriQLearner_id.state[] states
protected PriorityQueue changeQueue
protected int numactions
Constructor Detail |
public i_PriQLearner_id(int numstatesin, int numactionsin, int criteriain, long seedin)
numstates - int, the number of states the system could be in.
numactions - int, the number of actions or outputs to select from.
criteria - int, should be DISCOUNTED or AVERAGE.
seed - long, the seed.
public i_PriQLearner_id(int numstatesin, int numactionsin, int criteriain)
numstates - int, the number of states the system could be in.
numactions - int, the number of actions or outputs to select from.
criteria - int, should be DISCOUNTED or AVERAGE.
public i_PriQLearner_id(int numstatesin, int numactionsin)
numstates - int, the number of states the system could be in.
numactions - int, the number of actions or outputs to select from.
Method Detail |
public void setGamma(double g)
g - double, the new value for gamma (0 < g < 1).
public void setRandomRate(double r)
r - double, the new value for random rate (0 < r < 1).
public void setRandomRateDecay(double r)
r - double, the new value for random decay (0 < r < 1).
public java.lang.String toString()
protected void updateState(i_PriQLearner_id.state st)
public int query(int yn, double rn)
statein - int, the current state.
rewardin - double, reward for the last output, positive numbers are "good."
public void endTrial(double Vn, double rn)
Vn - double, the value of the absorbing state.
reward - double, the reward for the last output.
public int initTrial(int s)
statein - int, the current state.
public double getAvgReward()
public int getQueries()
public int getPolicyChanges()
public void readPolicy() throws java.io.IOException
filename - String, the name of the file to read from.
public void savePolicy() throws java.io.IOException
filename - String, the name of the file to write to.
public void saveProfile(java.lang.String profile_filename) throws java.io.IOException
filename - String, the name of the file to write to.
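The call pattern implied by initTrial, query, and endTrial can be sketched with a stand-in class. The sketch below is not this class and does not use Prioritized Sweeping: `SketchLearner` is a hypothetical name, it uses plain one-step discounted Q-learning with a fixed random rate, and its default values, learning rate, and `value` accessor are assumptions made for illustration. Only the method names, argument types, and the (0, 1) ranges for gamma and the random rate are taken from this page.

```java
import java.util.Random;

/** Hypothetical stand-in that mimics the documented call pattern only. */
class SketchLearner {
    private final double[][] q;       // q[state][action]
    private final Random rng;
    private double gamma = 0.8;       // discount factor (assumed default)
    private double alpha = 0.2;       // learning rate (assumed, not on this page)
    private double randomRate = 0.1;  // chance of a random action (assumed default)
    private int lastState;
    private int lastAction;

    SketchLearner(int numStates, int numActions, long seed) {
        q = new double[numStates][numActions];
        rng = new Random(seed);
    }

    void setGamma(double g) {
        if (g <= 0 || g >= 1) throw new IllegalArgumentException("need 0 < g < 1");
        gamma = g;
    }

    void setRandomRate(double r) {
        if (r <= 0 || r >= 1) throw new IllegalArgumentException("need 0 < r < 1");
        randomRate = r;
    }

    /** Start a trial in state s and return the first action. */
    int initTrial(int s) {
        lastState = s;
        lastAction = choose(s);
        return lastAction;
    }

    /** Report the new state and the reward for the last action; returns the next action. */
    int query(int state, double reward) {
        update(reward + gamma * max(q[state]));
        lastState = state;
        lastAction = choose(state);
        return lastAction;
    }

    /** End the trial: vn is the value of the absorbing state, reward is for the last action. */
    void endTrial(double vn, double reward) {
        update(reward + gamma * vn);
    }

    double value(int state, int action) {
        return q[state][action];
    }

    private void update(double target) {
        q[lastState][lastAction] += alpha * (target - q[lastState][lastAction]);
    }

    private int choose(int s) {
        if (rng.nextDouble() < randomRate) return rng.nextInt(q[s].length);
        int best = 0;
        for (int a = 1; a < q[s].length; a++) {
            if (q[s][a] > q[s][best]) best = a;
        }
        return best;
    }

    private static double max(double[] values) {
        double m = values[0];
        for (double x : values) m = Math.max(m, x);
        return m;
    }
}

class SketchLearnerDemo {
    /** Toy task with a single state: action 1 pays 1.0, action 0 pays 0.0. */
    static SketchLearner train(int steps) {
        SketchLearner learner = new SketchLearner(1, 2, 42L);
        int action = learner.initTrial(0);
        for (int i = 0; i < steps; i++) {
            double reward = (action == 1) ? 1.0 : 0.0;
            action = learner.query(0, reward);
        }
        learner.endTrial(0.0, (action == 1) ? 1.0 : 0.0);
        return learner;
    }

    public static void main(String[] args) {
        SketchLearner learner = train(2000);
        System.out.println("Q(0,0) = " + learner.value(0, 0));
        System.out.println("Q(0,1) = " + learner.value(0, 1));
    }
}
```

The loop shows the intended ordering: one initTrial call, then one query per step that passes the reward earned by the previous action, then a single endTrial. After training on the toy task the stand-in values the rewarded action above the unrewarded one.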