java.lang.Object | +--lnsc.pmvf.AbstractFunctionalUnit2 | +--lnsc.lstm.FastLSTMMemoryBlock
Memory blocks for a Long Short-Term Memory (LSTM) network.
Implements the LSTM memory block of Gers, Schraudolph & Schmidhuber 2002 (Journal of Machine Learning Research 3:115-143). See also Gers, Schmidhuber & Cummins 2000 (Neural Computation 12(10):2451-2471) and Hochreiter & Schmidhuber 1997 (Neural Computation 9(8):1735-1780).
This memory block contains a given number of memory cells, one input gate, one forget gate and one (optional) output gate. It also includes all weights coming from a possible input layer, recurrent weights from its own output, recurrent weights from other blocks, and local peephole weights. The input of the block is the (unweighted) vector of all activations connecting to it. The output of the block is given by the memory cell outputs, augmented with the input, forget and output gate activations (in that order). Because weights are internal parameters, only derivatives with respect to parameters are available.
The squashing functions g() and h() and the gate functions must be provided; each must be single-input, single-output, differentiable and stateless.
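As a rough illustration of the structure described above, the forward pass of one block can be sketched as follows. This is not the code of this class: the class name MemoryBlockSketch, the weight layout, and the choice of logistic gates with tanh for g() and h() are assumptions made for the example. The peephole arrangement (input and forget gates see the previous cell states, the output gate sees the updated ones) follows the Gers, Schraudolph & Schmidhuber 2002 description.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical, simplified memory block; NOT the lnsc.lstm implementation. */
final class MemoryBlockSketch {
    final int nCell;
    final double[] wIn, wFgt, wOut;   // gate weights over the input vector
    final double[][] wCell;           // [cell][input] weights into each cell
    final double[] pIn, pFgt, pOut;   // peephole weights, one per cell
    final double[] state;             // internal cell states s_c(t)

    // Assumed function choices: logistic gates, tanh for g() and h().
    final DoubleUnaryOperator gate = v -> 1.0 / (1.0 + Math.exp(-v));
    final DoubleUnaryOperator g = Math::tanh;
    final DoubleUnaryOperator h = Math::tanh;

    MemoryBlockSketch(int nInput, int nCell) {
        this.nCell = nCell;
        wIn = new double[nInput];
        wFgt = new double[nInput];
        wOut = new double[nInput];
        wCell = new double[nCell][nInput];
        pIn = new double[nCell];
        pFgt = new double[nCell];
        pOut = new double[nCell];
        state = new double[nCell];
    }

    static double dot(double[] w, double[] x) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) sum += w[i] * x[i];
        return sum;
    }

    /** One time step; returns [z_1..z_nCell, z_in, z_fgt, z_out]. */
    double[] step(double[] input) {
        // Input and forget gates peek at the previous cell states.
        double netIn = dot(wIn, input);
        double netFgt = dot(wFgt, input);
        for (int c = 0; c < nCell; c++) {
            netIn += pIn[c] * state[c];
            netFgt += pFgt[c] * state[c];
        }
        double zIn = gate.applyAsDouble(netIn);
        double zFgt = gate.applyAsDouble(netFgt);

        // Cell state update: forget part of the old state, admit gated new input.
        for (int c = 0; c < nCell; c++) {
            state[c] = zFgt * state[c] + zIn * g.applyAsDouble(dot(wCell[c], input));
        }

        // Output gate peeks at the updated cell states.
        double netOut = dot(wOut, input);
        for (int c = 0; c < nCell; c++) netOut += pOut[c] * state[c];
        double zOut = gate.applyAsDouble(netOut);

        double[] out = new double[nCell + 3];
        for (int c = 0; c < nCell; c++) out[c] = zOut * h.applyAsDouble(state[c]);
        out[nCell] = zIn;
        out[nCell + 1] = zFgt;
        out[nCell + 2] = zOut;
        return out;
    }
}
```

With all weights at zero, every gate evaluates to 0.5 and the cell outputs stay at 0, which makes a convenient sanity check before any weights are set.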
The input vector for a block should look like [1, x_1(t), ..., x_n_in(t), y_1(t-1), ..., y_n_rec(t-1)], where x_i(t) is a current input and y_j(t-1) is the activation of a hidden unit having a recurrent link into the block. Note that the first input must be the constant value 1 (BiasUnit).
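A minimal sketch of assembling an input vector in that layout is shown below. The helper class BlockInput and its build method are hypothetical names introduced for illustration; they are not part of the lnsc packages.

```java
/** Hypothetical helper; not part of the lnsc API. */
final class BlockInput {
    /**
     * Builds [1, x_1(t), ..., x_n_in(t), y_1(t-1), ..., y_n_rec(t-1)].
     *
     * @param x     current external inputs x_i(t)
     * @param yPrev previous activations y_j(t-1) of recurrently linked units
     */
    static double[] build(double[] x, double[] yPrev) {
        double[] v = new double[1 + x.length + yPrev.length];
        v[0] = 1.0;                                           // constant bias input
        System.arraycopy(x, 0, v, 1, x.length);               // current inputs
        System.arraycopy(yPrev, 0, v, 1 + x.length, yPrev.length); // recurrent inputs
        return v;
    }
}
```

For example, two external inputs and one recurrent activation yield a vector of length 4 whose first element is the bias.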
The output vector for a block looks like [z_1(t), ..., z_n_cell(t), z_in(t), z_fgt(t), z_out(t)].
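Reading such an output vector back into its parts might look like the sketch below. BlockOutput is a hypothetical view class, and the sketch assumes the output gate is present, so that the last three entries are always the input, forget and output gate activations.

```java
import java.util.Arrays;

/** Hypothetical view over a block output vector; not part of the lnsc API. */
final class BlockOutput {
    final double[] cells; // z_1(t) .. z_n_cell(t)
    final double in;      // z_in(t)
    final double fgt;     // z_fgt(t)
    final double out;     // z_out(t)

    BlockOutput(double[] z) {
        if (z.length < 3) {
            throw new IllegalArgumentException("expected at least the three gate outputs");
        }
        int nCell = z.length - 3;            // assumes all three gates are present
        cells = Arrays.copyOfRange(z, 0, nCell);
        in = z[nCell];
        fgt = z[nCell + 1];
        out = z[nCell + 2];
    }
}
```

Only the cells part would normally be forwarded to the next layer; the gate activations are mostly useful for inspection.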
Parameters are all the weights mentioned above. Derivatives with respect to parameters represent the derivative of each output z_k(t) with respect to each weight, based on the papers' formulas. Note that the derivatives of the gate outputs z_in(t), z_fgt(t) and z_out(t) with respect to parameters are zeroed, since the derivatives for the gate weights pass through the z_k(t) paths and because these gate outputs are usually not connected to the next (output) layer. (Even when they are, this is not considered a source of error signal.)
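The zeroing convention can be pictured with a short sketch. It assumes, purely for illustration, that the parameter derivatives are held in a matrix indexed as [output][parameter]; the actual layout used by ProcessPatternResult2 is not stated here, and GateDerivativeMask is a hypothetical name.

```java
import java.util.Arrays;

/** Hypothetical illustration; the [output][parameter] layout is an assumption. */
final class GateDerivativeMask {
    /**
     * Zeroes the rows belonging to the gate outputs z_in, z_fgt and z_out,
     * i.e. every row after the first nCell memory cell rows.
     */
    static void zeroGateRows(double[][] dOutputDParam, int nCell) {
        for (int k = nCell; k < dOutputDParam.length; k++) {
            Arrays.fill(dOutputDParam[k], 0.0);
        }
    }
}
```

The rows for the memory cell outputs are left untouched, since those are the paths through which the gate weights receive their gradient.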
The memory cells' internal states are available for read-out through the LSTMDataNames.LSTM_INTERNAL_STATES keyword.
Cloning is done through serialization; transient states are therefore not carried over to clones (i.e. they are reset in clones).
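The effect of cloning through serialization on transient fields can be seen in a small standalone sketch. SerialCloneDemo and its nested Unit class are hypothetical and unrelated to the lnsc classes; only the standard java.io serialization streams are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical demo of clone-by-serialization; not the lnsc implementation. */
final class SerialCloneDemo {
    static final class Unit implements Serializable {
        private static final long serialVersionUID = 1L;

        double[] weights = {0.1, 0.2};      // serialized, so copied into clones
        transient double[] state = {1.0};   // transient, so NOT copied into clones

        /** Deep copy made by writing this object to memory and reading it back. */
        Unit deepClone() {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(this);
                }
                try (ObjectInputStream in = new ObjectInputStream(
                        new ByteArrayInputStream(bytes.toByteArray()))) {
                    return (Unit) in.readObject();
                }
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("clone by serialization failed", e);
            }
        }
    }
}
```

In this sketch the clone gets its own copy of weights, while its transient state field comes back as null, because deserialization skips transient fields and does not rerun field initializers. A class relying on this scheme has to rebuild such state itself, for instance lazily or in a reset method.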
Avoid using this class directly or deriving from it unless you really know what you are doing. Use the factories instead whenever possible.
See Also: FastLSTMNetwork, AbstractLSTMFactory, LSTMFactory, LSTMDataNames.LSTM_INTERNAL_STATES, Serialized Form
Nested Class Summary |
Nested classes inherited from class lnsc.pmvf.FunctionalUnit2 |
FunctionalUnit2.ProcessPatternResult2 |
Nested classes inherited from class lnsc.FunctionalUnit |
FunctionalUnit.ProcessPatternResult |
Field Summary |
Fields inherited from interface lnsc.FunctionalUnit |
EMPTY_PATTERN |
Method Summary |
java.lang.Object | clone() |
double[] | getParameters() | Gets a copy of the parameters as a vector. |
FunctionalUnit2.ProcessPatternResult2 | processPattern(double[] inputPattern, boolean computeDerivative, boolean computeSecondDerivative, boolean computeParameterDerivative, boolean computeParameterSecondDerivative, java.lang.String[] recordList) | Processes an input pattern and returns its output pattern and derivatives (if requested). |
void | reset() | Resets the internal transient state of non-stateless functions. |
void | setParameters(double[] parameters) | Sets the parameter values to those of a given vector. |
java.lang.String | toString() |
Methods inherited from class lnsc.pmvf.AbstractFunctionalUnit2 |
getInputCount, getOutputCount, getParameterCount, isDifferentiable, isParameterDifferentiable, isParameterTwiceDifferentiable, isStateless, isTwiceDifferentiable, processDataSet, processPattern |
Methods inherited from class java.lang.Object |
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Method Detail |
public java.lang.Object clone()
Specified by: clone in interface FunctionalUnit
Overrides: clone in class AbstractFunctionalUnit2
public double[] getParameters()
FunctionalUnit2
public FunctionalUnit2.ProcessPatternResult2 processPattern(double[] inputPattern, boolean computeDerivative, boolean computeSecondDerivative, boolean computeParameterDerivative, boolean computeParameterSecondDerivative, java.lang.String[] recordList)
FunctionalUnit2
Specified by: processPattern in interface FunctionalUnit2
Overrides: processPattern in class AbstractFunctionalUnit2
Parameters:
inputPattern - The input pattern.
computeDerivative - Must be true if the derivative should be computed.
computeSecondDerivative - Must be true if the second derivative should be computed.
computeParameterDerivative - Must be true if the derivative with respect to the parameters should be computed.
computeParameterSecondDerivative - Must be true if the second derivative with respect to the parameters should be computed.
recordList - Extra data to be recorded.
public void reset()
FunctionalUnit
Specified by: reset in interface FunctionalUnit
Overrides: reset in class AbstractFunctionalUnit2
public void setParameters(double[] parameters)
FunctionalUnit2
public java.lang.String toString()
Overrides: toString in class AbstractFunctionalUnit2