Rivest06

grsnc.binb
Class Rivest06

java.lang.Object
  |
  +--java.util.Observable
        |
        +--lnsc.page.AbstractObservableAgent
              |
              +--grsnc.binb.Rivest06

All Implemented Interfaces: Agent, java.io.Serializable

public class Rivest06
extends AbstractObservableAgent

Title: BG Math Model (Francois Rivest, 10 Mar 2006)
Description: Agent based on a mathematical model of the basal ganglia
Copyright: Copyright (c) 2004
Company: UdeM
Note: Eligibility traces are bounded between -1 and 1. (Actors still untraced)

Summary. In this model, the critic uses the standard TD formula, while the actor uses a natural gradient that yields a biological three-synaptic update rule. Although the critic is not entirely biologically plausible, it uses the same equations as the Suri&Schultz1999Model. It would be interesting to find a biologically plausible equivalent formula.

Implementation details:

Si(t) is stimulus i at time t, as given by the StateRepresentation
r(t) is the primary reward
Unless the state is final, the value passed to returnReward is processed in the next call to requestAction. If the state is final, it is processed in episodeTerminated.
Critic:
- Wik are the weights between stimulus Si(t) and reward prediction Pk(t)
- Pk(t) = sum over i of Wik*Si(t) is reward prediction k
- P(t) = sum(Pk(t))
- e(t) = r(t) + gamma*P(t) - P(t-1)
- eti(t) = lambda*eti(t-1) + Si(t-1)
- gamma = .98 (discounting factor)
- learning rule: Wik(t) = Wik(t-1) + etac*e(t)*eti(t)
- initialisation: Wik(t=0) > 0
Actor:
- Wij is weights between stimuli Si(t) and action Aj
- Aj(t) = sum over i of Wij*Si(t) is actor activity j
- Aj'(t) = 1 if Aj(t) > 0 and Aj(t) > Al(t) for all l not equal to j (winner-take-all)
- learning rule: Wij(t) = Wij(t-1) + etaa*e(t)*Aj'(t-1)*Si(t-1)
- initialisation: Wij(t=0) > 0
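
The critic and actor rules above can be sketched as a single learning step. This is a minimal stand-alone illustration, not the Rivest06 source: the class name TdActorCriticSketch, its fields, and the start/step methods are hypothetical, the value of lambda is not given in these notes and is passed in as a parameter, one learning rate is shared by actor and critic (as the constructor's newLearningRate suggests), and all weights start at the same positive constant. The trace is clamped to [-1, 1] following the note on eligibility traces.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch of the update rules listed above; not the Rivest06 source. */
class TdActorCriticSketch {
    static final double GAMMA = 0.98;  // discounting factor from the notes
    final double lambda;               // trace decay; its value is not given in the notes
    final double eta;                  // shared actor and critic learning rate (etac = etaa)
    final double[][] wCritic;          // Wik, indexed [stimulus i][critic k]
    final double[][] wActor;           // Wij, indexed [stimulus i][actor j]
    final double[] trace;              // eti, one eligibility trace per stimulus
    double[] prevStimuli;              // Si(t-1)
    double prevPrediction;             // P(t-1)
    int prevAction = -1;               // j with Aj'(t-1) = 1, or -1 when no actor won

    TdActorCriticSketch(int nStimuli, int nCritics, int nActors,
                        double eta, double lambda, double initWeight) {
        this.eta = eta;
        this.lambda = lambda;
        wCritic = new double[nStimuli][nCritics];
        wActor = new double[nStimuli][nActors];
        trace = new double[nStimuli];
        for (double[] row : wCritic) Arrays.fill(row, initWeight); // Wik(t=0) > 0
        for (double[] row : wActor) Arrays.fill(row, initWeight);  // Wij(t=0) > 0
    }

    /** P(t) = sum over k of Pk(t), with Pk(t) = sum over i of Wik*Si(t). */
    double predict(double[] s) {
        double p = 0.0;
        for (int i = 0; i < s.length; i++)
            for (double w : wCritic[i]) p += w * s[i];
        return p;
    }

    /** Winner-take-all: index of the strictly largest positive Aj(t), or -1 if there is none. */
    int selectAction(double[] s) {
        int n = wActor[0].length;
        double[] a = new double[n];
        for (int i = 0; i < s.length; i++)
            for (int j = 0; j < n; j++) a[j] += wActor[i][j] * s[i];
        int best = -1;
        boolean tie = false;
        for (int j = 0; j < n; j++) {
            if (best < 0 || a[j] > a[best]) { best = j; tie = false; }
            else if (a[j] == a[best]) tie = true;
        }
        return (best >= 0 && !tie && a[best] > 0) ? best : -1;
    }

    /** Fills the "previous" stimuli, prediction and action from the first state. */
    void start(double[] s) {
        Arrays.fill(trace, 0.0);
        prevStimuli = s.clone();
        prevPrediction = predict(s);
        prevAction = selectAction(s);
    }

    /** One learning step for stimuli S(t) and reward r(t); returns the TD error e(t). */
    double step(double[] s, double r) {
        double p = predict(s);
        double e = r + GAMMA * p - prevPrediction;            // e(t) = r(t) + gamma*P(t) - P(t-1)
        for (int i = 0; i < trace.length; i++) {
            double raw = lambda * trace[i] + prevStimuli[i];  // eti(t) = lambda*eti(t-1) + Si(t-1)
            trace[i] = Math.max(-1.0, Math.min(1.0, raw));    // bounded between -1 and 1
            for (int k = 0; k < wCritic[i].length; k++)
                wCritic[i][k] += eta * e * trace[i];          // critic learning rule
            if (prevAction >= 0)
                wActor[i][prevAction] += eta * e * prevStimuli[i]; // actor rule; Aj'(t-1) = 1 for the winner only
        }
        prevStimuli = s.clone();
        prevPrediction = p;
        prevAction = selectAction(s);
        return e;
    }

    public static void main(String[] args) {
        TdActorCriticSketch m = new TdActorCriticSketch(2, 1, 2, 0.1, 0.5, 0.1);
        m.wActor[0][1] = 0.5; // break the initial tie so that actor 1 wins on stimulus 0
        m.start(new double[] {1, 0});
        System.out.println("e(t) = " + m.step(new double[] {0, 1}, 1.0));
    }
}
```

Because the winner must be strictly larger than every other actor, identical initial weights produce no winning action; the sketch breaks the tie by hand, whereas a real run would presumably rely on varied initial weights.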
Assumptions:

newEpisode & first requestAction have the same state
requestAction & next returnReward have the same state
last returnReward and endEpisode have the same state
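
The call order implied by these assumptions, together with the deferred reward handling described under the implementation details, can be sketched as follows. This is a hypothetical stand-in, not the lnsc.page Agent interface: states are plain strings, actions are ints, and the class name DeferredRewardAgentSketch and the runEpisode driver are invented for illustration. It reads the second assumption as "the state passed to returnReward is the one passed to the following requestAction", and it treats endEpisode from the method summary as the place where a final reward is processed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for the agent call protocol; states are plain strings here. */
class DeferredRewardAgentSketch {
    final List<String> log = new ArrayList<>(); // records where each reward was processed
    private String lastState;                   // state seen by the most recent call
    private Double pendingReward;               // saved by returnReward, not yet processed

    void newEpisode(String newState) {
        lastState = newState;
        pendingReward = null;
    }

    int requestAction(String currentState) {
        requireSame(currentState);              // same state as newEpisode or the last returnReward
        if (pendingReward != null) process("requestAction");
        return 0;                               // placeholder action
    }

    void returnReward(String resultState, double reward) {
        pendingReward = reward;                 // only saved here; processing is deferred
        lastState = resultState;
    }

    void endEpisode(String finalState) {
        requireSame(finalState);                // same state as the last returnReward
        if (pendingReward != null) process("endEpisode");
    }

    private void process(String where) {
        log.add("reward " + pendingReward + " processed in " + where);
        pendingReward = null;
    }

    private void requireSame(String state) {
        if (!Objects.equals(state, lastState))
            throw new IllegalStateException("expected state " + lastState + " but got " + state);
    }

    /** Drives one episode; states has one more entry than rewards. */
    static DeferredRewardAgentSketch runEpisode(String[] states, double[] rewards) {
        DeferredRewardAgentSketch agent = new DeferredRewardAgentSketch();
        agent.newEpisode(states[0]);
        for (int t = 0; t < rewards.length; t++) {
            agent.requestAction(states[t]);
            agent.returnReward(states[t + 1], rewards[t]);
        }
        agent.endEpisode(states[states.length - 1]);
        return agent;
    }

    public static void main(String[] args) {
        DeferredRewardAgentSketch agent =
            runEpisode(new String[] {"s0", "s1", "s2"}, new double[] {0.0, 1.0});
        agent.log.forEach(System.out::println);
    }
}
```

In this sketch a driver that violates one of the assumptions fails fast with an IllegalStateException; whether Rivest06 itself checks them is not stated here.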

See Also:
Serialized Form

Field Summary

static java.lang.String ACTION
static java.lang.String ACTORS
static java.lang.String ACTORS_WEIGHTS
static java.lang.String ACTORS_WEIGHTS_CHANGE
static java.lang.String CRITICS
static java.lang.String CRITICS_WEIGHTS
static java.lang.String CRITICS_WEIGHTS_CHANGE
static java.lang.String DOPAMINE
static java.lang.String PREDICTION
static java.lang.String REWARD
static java.lang.String STIMULUS

Constructor Summary

Rivest06(int newActorCount, int newCriticCount, StateRepresentation newStateRep, double newLearningRate, double newInitWeightFactor)
          Constructs an agent based on Francois Rivest's May 17 BG Math Model.

Method Summary

void endEpisode(State finalState)
          Complete processContext.

void newEpisode(State newState)
          Starts by filling previous stimuli, prediction and action.

Action requestAction(State currentState)
          Computes actors and critics activities with no-reward given at time t.

void returnReward(State resultState, double reward)
          Save reward.

DataSet toDataSet()
          Similar to the toString method, but returns the state content in the form of a DataSet.

java.lang.String toString()


Methods inherited from class lnsc.page.AbstractObservableAgent

getEvalMode, isAdaptive, isEvaluable, notifyObservers, setEvalMode

Methods inherited from class java.util.Observable

addObserver, countObservers, deleteObserver, deleteObservers, hasChanged, notifyObservers

Methods inherited from class java.lang.Object

equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Detail

ACTION

public static final java.lang.String ACTION

See Also:
Constant Field Values

ACTORS

public static final java.lang.String ACTORS

See Also:
Constant Field Values

ACTORS_WEIGHTS

public static final java.lang.String ACTORS_WEIGHTS

See Also:
Constant Field Values

ACTORS_WEIGHTS_CHANGE

public static final java.lang.String ACTORS_WEIGHTS_CHANGE

See Also:
Constant Field Values

CRITICS

public static final java.lang.String CRITICS

See Also:
Constant Field Values

CRITICS_WEIGHTS

public static final java.lang.String CRITICS_WEIGHTS

See Also:
Constant Field Values

CRITICS_WEIGHTS_CHANGE

public static final java.lang.String CRITICS_WEIGHTS_CHANGE

See Also:
Constant Field Values

DOPAMINE

public static final java.lang.String DOPAMINE

See Also:
Constant Field Values

PREDICTION

public static final java.lang.String PREDICTION

See Also:
Constant Field Values

REWARD

public static final java.lang.String REWARD

See Also:
Constant Field Values

STIMULUS

public static final java.lang.String STIMULUS

See Also:
Constant Field Values

Constructor Detail

Rivest06

public Rivest06(int newActorCount, int newCriticCount, StateRepresentation newStateRep, double newLearningRate, double newInitWeightFactor)

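Constructs an agent based on Francois Rivest's May 17 BG Math Model.

Parameters:
newActorCount - Number of actor neurons.
newCriticCount - Number of critic neurons.
newStateRep - State representation that provides the stimuli Si(t).
newLearningRate - Actor & Critic learning rates.
newInitWeightFactor - Initialization weight factor.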

Method Detail

endEpisode

public void endEpisode(State finalState)

Complete processContext.

Parameters:
finalState - Final state of the episode.

newEpisode

public void newEpisode(State newState)

Starts by filling previous stimuli, prediction and action.

Parameters:
newState - First state of the episode.

requestAction

public Action requestAction(State currentState)

Computes actors and critics activities with no-reward given at time t.

Parameters:
currentState - The current state of the agent.
Returns:
The action to be done.

returnReward

public void returnReward(State resultState, double reward)

Save reward.

Parameters:
resultState - Resulting state from last action.
reward - Resulting reward from last action.

toDataSet

public DataSet toDataSet()

Description copied from interface: Agent

Similar to the toString method, but returns the state content in the form of a DataSet. Can be null.

Specified by:
toDataSet in interface Agent
Overrides:
toDataSet in class AbstractObservableAgent

Returns:
A DataSet containing a description of the State.

toString

public java.lang.String toString()

Overrides:
toString in class java.lang.Object
