databionics.esom.train
Class SOM

java.lang.Object
  extended by databionics.esom.train.SOM
Direct Known Subclasses:
Batch2SOM, BatchSOM, HybridBatchSOM, KBatchSOM, OnlineSOM, SlowBatchSOM

public abstract class SOM
extends java.lang.Object

Abstract base class for a self organizing map training algorithm. It is a container for

The basic training algorithm is implemented in train(). It invokes a number of callback methods that subclasses override to implement different variants of the algorithm. Some callbacks already have a default implementation; an overriding method should call that implementation via super unless you know what you are doing.
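The callback structure described above can be sketched as follows. This is a minimal illustration, not the ESOM source: the class names TrainingLoopSketch and LoggingVariant, the trace list, and the order in which train() calls the callbacks are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the callback structure; not the ESOM source.
// The call order inside train() is an assumption.
abstract class TrainingLoopSketch {
    private final List<String> trace = new ArrayList<>();
    protected int epochs = 2; // number of training epochs
    protected int epoch;      // current epoch, 0..epochs-1
    protected int rows = 3;   // number of data patterns

    public void train() {
        for (epoch = 0; epoch < epochs && !stop(); epoch++) {
            beforeEpoch();
            for (int row = 0; row < rows; row++) {
                beforeSearch(row);
                update(row); // variant-specific step
            }
            afterEpoch();
        }
    }

    // Callbacks with a default implementation.
    public void beforeEpoch() { trace.add("beforeEpoch " + epoch); }
    public void beforeSearch(int row) { /* empty by default */ }
    public void afterEpoch() { trace.add("afterEpoch " + epoch); }
    public boolean stop() { return false; }

    // Callback that every variant has to supply.
    public abstract void update(int row);

    protected void record(String message) { trace.add(message); }
    public List<String> getTrace() { return trace; }
}

class LoggingVariant extends TrainingLoopSketch {
    @Override
    public void update(int row) { record("update " + row); }

    @Override
    public void afterEpoch() {
        super.afterEpoch(); // keep the default behaviour
        record("extra work " + epoch);
    }
}
```

Calling new LoggingVariant().train() runs the default afterEpoch() first and the variant's extra work second, which is the reason for routing overrides through super.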


Field Summary
protected  cern.colt.list.IntArrayList bestmatches
          indices of best matching neurons for data points
protected  BmSearch bmSearch
          the search system
protected  BmSearchStat bmStat
          the bm search statistic
protected  boolean bmStatNeeded
           
protected  boolean center
          flag for centering the map
protected  ClsFile classes
           
protected  cern.colt.list.IntArrayList count
           
 cern.colt.matrix.DoubleMatrix2D data
          matrix with training data
protected  Descriptives descriptives
          statistics about data
protected  cern.colt.function.ThresholdVectorVectorFunction distanceFunction
          distance function in data space
protected  cern.colt.matrix.DoubleMatrix1D distances
          current distances between bestmatches and datapoints
protected  int epoch
          current epoch number 0..epochs-1, only used internally
protected  int epochs
          number of training epochs
protected  Grid grid
          grid of weight vectors
protected  java.lang.String initMethod
          kind of initialising
protected  cern.colt.list.IntArrayList keys
          list with unique keys
protected static org.apache.log4j.Logger log
          interface to log4j system
protected  LrnFile lrn
           
protected  Neighborhood neighborhood
          neighborhood kernel function
protected  cern.colt.matrix.DoubleMatrix1D neuron
          current neuron, only used internally
protected  java.text.NumberFormat nf
          number format for messages
protected  boolean offline
           
protected  cern.colt.list.IntArrayList oldBestmatches
          indices of best matching neurons for data points of last epoch
protected  boolean online
           
protected  cern.colt.matrix.DoubleMatrix1D pattern
          current data pattern, only used internally
protected  cern.colt.list.IntArrayList permutation
          current permutation of the data vectors, only used internally
protected  int permutations
          number of possible permutations of the data vectors, only used internally
protected  boolean permute
          flag whether to permute data vectors before each epoch
protected  double qerror
          current quantization error
protected  int radius
          current neighborhood radius, only used internally
protected  Cooling radiusCooling
          cooling strategy of neighborhood radius
protected  cern.jet.random.engine.RandomEngine random
          random number generator, only used internally
protected  double rate
          current learning rate, only used internally
protected  Cooling rateCooling
          cooling strategy of learning rate
protected  int saveEpoch
          number of epochs after which a bm and wts file will be saved
protected  boolean saveEpoches
          Boolean for saving during epochs
protected  java.lang.String savePrefix
          String with name of files during epochs
protected  boolean saveUMatrix
          Boolean for saving U-Matrix during epochs
protected  cern.colt.matrix.DoubleMatrix2D view
          possibly permuted view on training data, only used internally
 
Constructor Summary
SOM()
          Standard constructor.
 
Method Summary
 void afterEpoch()
          Things to be done after each epoch.
 void afterUpdate(int index, int row)
          Things to be done after updating a weight vector.
 void beforeEpoch()
          Things to be done before each epoch.
 void beforeSearch(int row)
          Things to be done before searching for a best match.
 void beforeUpdate(int index)
          Things to be done before updating a weight vector. Empty by default.
 double calcQerror()
           
protected  void centerMap()
          center map at neuron with highest density
 void cool()
          Cool down the parameters
 cern.colt.list.IntArrayList getBestMatches()
          Get the indices of best matching neurons for data points. Available after calling train().
 BmSearch getBmSearch()
           
 cern.colt.matrix.DoubleMatrix2D getData()
          Get the matrix with the training data.
 Descriptives getDescriptives()
          Get the statistics about the training data.
 double getDistance(cern.colt.matrix.DoubleMatrix1D data, int index)
          Get the distance between a dataVector and a neuron.
 double getDistance(int row, int index)
          Get the distance between a data row and a neuron.
 cern.colt.function.ThresholdVectorVectorFunction getDistanceFunction()
          Get the distance function in data space
 cern.colt.matrix.DoubleMatrix1D getDistances()
           
 int getEpochs()
          Get the number of training epochs.
 Grid getGrid()
          Get the grid of weight vectors.
 cern.colt.list.IntArrayList getKeys()
          Get the keys for the data patterns.
 Neighborhood getNeighborhood()
          Get the neighborhood kernel function.
 java.text.NumberFormat getNumberFormat()
          Get the number format
 cern.colt.list.IntArrayList getOldBestmatches()
           
 cern.colt.matrix.DoubleMatrix1D getPattern(int row)
          Access to a data pattern by row number
 cern.colt.list.IntArrayList getPermutation()
          Get the current permutation
 boolean getPermute()
          Get the flag whether to permute data vectors before each epoch.
 int getRadius()
           
 Cooling getRadiusCooling()
          Get the cooling strategy of the neighborhood radius.
 Cooling getRateCooling()
          Get the cooling strategy of learning rate.
 void init()
          Initialize the grid of weight vectors with the current initialization method.
 boolean isBmStatNeeded()
           
 boolean isCenter()
           
 boolean loadCls(java.lang.String filename)
           
 boolean loadData(java.lang.String filename)
          Load the training data and keys from a *.lrn file
 void saveBestMatches(java.lang.String filename)
          Save indices of best matching neurons to a *.bm file. Available after calling train().
 void setBestMatches(BMFile bms)
          Set the indices of best matching neurons for data points
 void setBestMatches(cern.colt.list.IntArrayList bms)
          Set the indices of best matching neurons for data points
 void setBmSearch(BmSearch bmSearch)
           
 void setBmStatNeeded(boolean bmStatNeeded)
           
 void setCenter(boolean center)
           
 void setData(cern.colt.matrix.DoubleMatrix2D v)
          Set the matrix with the training data, patterns in rows.
 void setDistanceFunction(cern.colt.function.ThresholdVectorVectorFunction v)
          Set the distance function in data space
 void setDistances(cern.colt.matrix.DoubleMatrix1D distances)
           
 void setEpochs(int v)
          Set the number of training epochs.
 void setGrid(Grid v)
          Set the grid of weight vectors.
 void setInit(java.lang.String initMethod)
          Set the method of initialization for the weights
 void setKeys(cern.colt.list.IntArrayList v)
          Set the list of unique keys for the data patterns.
 void setNeighborhood(Neighborhood v)
          Set the neighborhood kernel function.
 void setNumberFormat(java.text.NumberFormat v)
          Set the number format
 void setOldBestmatches(cern.colt.list.IntArrayList oldBestmatches)
           
 void setOnline(boolean online)
           
 void setPermute(boolean v)
          Set the flag whether to permute data vectors before each epoch.
 void setRadiusCooling(Cooling v)
          Set the cooling strategy of the neighborhood radius.
 void setRateCooling(Cooling v)
          Set the cooling strategy of learning rate.
 void setSaveEpoch(int e)
          Set the number of epochs after which a *.bm and a *.wts file are saved
 void setSaveEpochBoolean()
          Sets the flag that enables saving during epochs to true.
 void setSavePrefix(java.lang.String name)
          Set the prefix for the filename used to regularly save *.bm, *.wts and *.umx files
 void setSaveUMatrix(boolean flag)
          set the flag for saving a U-Matrix every nth epoch
 boolean stop()
          Stopping criterion
 void train()
          General training algorithm for the SOM.
abstract  void update(cern.colt.matrix.DoubleMatrix1D vector, int bm, int pos)
          Things to be done after finding a best match, e.g. updating the neuron in online learning.
 void updateNeighborhood(cern.colt.matrix.DoubleMatrix1D vector, int bm)
          Update a neuron and its neighborhood.
 void updateNeuron(int index, cern.colt.matrix.DoubleMatrix1D vector, double weight)
          Update a single weight vector.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
interface to log4j system


bestmatches

protected cern.colt.list.IntArrayList bestmatches
indices of best matching neurons for data points


oldBestmatches

protected cern.colt.list.IntArrayList oldBestmatches
indices of best matching neurons for data points of last epoch


data

public cern.colt.matrix.DoubleMatrix2D data
matrix with training data


descriptives

protected Descriptives descriptives
statistics about data


view

protected cern.colt.matrix.DoubleMatrix2D view
possibly permuted view on training data, only used internally


keys

protected cern.colt.list.IntArrayList keys
list with unique keys


distanceFunction

protected cern.colt.function.ThresholdVectorVectorFunction distanceFunction
distance function in data space


epoch

protected int epoch
current epoch number 0..epochs-1, only used internally


epochs

protected int epochs
number of training epochs


grid

protected Grid grid
grid of weight vectors


neighborhood

protected Neighborhood neighborhood
neighborhood kernel function


neuron

protected cern.colt.matrix.DoubleMatrix1D neuron
current neuron, only used internally


distances

protected cern.colt.matrix.DoubleMatrix1D distances
current distances between bestmatches and datapoints


nf

protected java.text.NumberFormat nf
number format for messages


pattern

protected cern.colt.matrix.DoubleMatrix1D pattern
current data pattern, only used internally


permutation

protected cern.colt.list.IntArrayList permutation
current permutation of the data vectors, only used internally


permutations

protected int permutations
number of possible permutations of the data vectors, only used internally


permute

protected boolean permute
flag whether to permute data vectors before each epoch


radius

protected int radius
current neighborhood radius, only used internally


radiusCooling

protected Cooling radiusCooling
cooling strategy of neighborhood radius


random

protected cern.jet.random.engine.RandomEngine random
random number generator, only used internally


rate

protected double rate
current learning rate, only used internally


qerror

protected double qerror
current quantization error


rateCooling

protected Cooling rateCooling
cooling strategy of learning rate


saveEpoch

protected int saveEpoch
number of epochs after which a bm and wts file will be saved


initMethod

protected java.lang.String initMethod
kind of initialising


savePrefix

protected java.lang.String savePrefix
String with name of files during epochs


saveEpoches

protected boolean saveEpoches
Boolean for saving during epochs


saveUMatrix

protected boolean saveUMatrix
Boolean for saving U-Matrix during epochs


center

protected boolean center
flag for centering the map


bmSearch

protected BmSearch bmSearch
the search system


bmStat

protected BmSearchStat bmStat
the bm search statistic


bmStatNeeded

protected boolean bmStatNeeded

online

protected boolean online

offline

protected boolean offline

classes

protected ClsFile classes

count

protected cern.colt.list.IntArrayList count

lrn

protected LrnFile lrn

Constructor Detail

SOM

public SOM()
Standard constructor. Sets default values and initializes the random number generator.

Method Detail

init

public void init()
Initialize the grid of weight vectors with the current initialization method.


train

public void train()
General training algorithm for the SOM.


beforeEpoch

public void beforeEpoch()
Things to be done before each epoch. Cools down the parameters, prints some info, and optionally permutes the data patterns.
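As an illustration of the optional permutation step, the sketch below builds a random ordering of row indices with a Fisher-Yates shuffle. It is an assumption about how such a step could look, not the ESOM code: it uses java.util.Random and plain arrays instead of the Colt RandomEngine and IntArrayList used by this class, and the name PermutationSketch is hypothetical.

```java
import java.util.Random;

// Hypothetical helper; not part of the ESOM API.
class PermutationSketch {
    // Returns the indices 0..n-1 in random order (Fisher-Yates shuffle).
    static int[] permute(int n, Random random) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        for (int i = n - 1; i > 0; i--) {
            int j = random.nextInt(i + 1); // 0 <= j <= i
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }
}
```

Presenting the rows in order[0], order[1], ... during an epoch visits every data pattern exactly once, in a different sequence each time the permutation is redrawn.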


beforeSearch

public void beforeSearch(int row)
Things to be done before searching for a best match. Empty by default.

Parameters:
row - Row of the data pattern

beforeUpdate

public void beforeUpdate(int index)
Things to be done before updating a weight vector. Empty by default.

Parameters:
index - Index of neuron to be updated

update

public abstract void update(cern.colt.matrix.DoubleMatrix1D vector,
                            int bm,
                            int pos)
Things to be done after finding a best match, e.g. updating the neuron in online learning. To be implemented in subclasses.

Parameters:
vector - vector of data
bm - index of bestmatching neuron

updateNeighborhood

public void updateNeighborhood(cern.colt.matrix.DoubleMatrix1D vector,
                               int bm)
Update a neuron and its neighborhood.

Parameters:
vector - vector of data
bm - index of bestmatching neuron
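Which neurons count as the neighborhood, and how strongly each is moved, depends on the Neighborhood kernel and the current radius, neither of which is specified on this page. The sketch below assumes a rectangular, non-toroidal grid with row-major neuron indices and a cone-shaped kernel; the class and method names are hypothetical.

```java
// Hypothetical helper; the grid layout and kernel shape are assumptions.
class NeighborhoodSketch {
    // Returns one kernel weight per neuron: 1.0 at the best match,
    // decreasing linearly with grid distance, 0.0 outside the radius.
    static double[] kernelWeights(int rows, int columns, int bm, int radius) {
        double[] weights = new double[rows * columns];
        int bmRow = bm / columns;
        int bmCol = bm % columns;
        for (int index = 0; index < weights.length; index++) {
            int dr = index / columns - bmRow;
            int dc = index % columns - bmCol;
            double gridDistance = Math.sqrt(dr * dr + dc * dc);
            weights[index] = gridDistance <= radius
                    ? 1.0 - gridDistance / (radius + 1.0)
                    : 0.0;
        }
        return weights;
    }
}
```

Each non-zero entry would then be passed as the weight argument when updating the corresponding neuron.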

updateNeuron

public void updateNeuron(int index,
                         cern.colt.matrix.DoubleMatrix1D vector,
                         double weight)
Update a single weight vector.

Parameters:
index - Index of neuron to be updated
vector - Vector of the data pattern
weight - Weight of neighborhood kernel assigned by distance on grid
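For orientation, the textbook SOM update moves a weight vector towards the data vector by a fraction given by the learning rate times the kernel weight. The sketch below shows that rule on plain arrays; whether ESOM applies exactly this formula is an assumption, and the explicit rate parameter stands in for the rate field of this class.

```java
// Hypothetical helper showing the standard SOM update rule.
class UpdateNeuronSketch {
    // neuron <- neuron + rate * weight * (vector - neuron), component-wise.
    static void updateNeuron(double[] neuron, double[] vector,
                             double weight, double rate) {
        for (int k = 0; k < neuron.length; k++) {
            neuron[k] += rate * weight * (vector[k] - neuron[k]);
        }
    }
}
```

With rate * weight equal to 1 the neuron jumps onto the data vector; with 0 it stays where it is.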

afterUpdate

public void afterUpdate(int index,
                        int row)
Things to be done after updating a weight vector. Empty by default.

Parameters:
index - Index of neuron just updated

afterEpoch

public void afterEpoch()
Things to be done after each epoch. Saves *.bm and *.wts files every n-th epoch.


cool

public void cool()
Cool down the parameters
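The parameters being cooled are the learning rate and the neighborhood radius, each governed by its own Cooling strategy. The Cooling type is not described on this page, so the following is only a sketch of one plausible strategy, a linear decay from a start value to an end value; the class LinearCoolingSketch and its methods are hypothetical.

```java
// Hypothetical cooling strategy; not the ESOM Cooling interface.
class LinearCoolingSketch {
    private final double start;
    private final double end;
    private final int epochs;

    LinearCoolingSketch(double start, double end, int epochs) {
        this.start = start;
        this.end = end;
        this.epochs = epochs;
    }

    // Value for the given epoch (0..epochs-1), interpolated linearly.
    double valueAt(int epoch) {
        if (epochs <= 1) {
            return start;
        }
        double fraction = (double) epoch / (epochs - 1);
        return start + fraction * (end - start);
    }

    // Integer variant, e.g. for a radius measured in grid cells.
    int roundedValueAt(int epoch) {
        return (int) Math.round(valueAt(epoch));
    }
}
```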


stop

public boolean stop()
Stopping criterion


centerMap

protected void centerMap()
center map at neuron with highest density


getBestMatches

public cern.colt.list.IntArrayList getBestMatches()
Get the indices of best matching neurons for data points. Available after calling train().

Returns:
list of best matches.

setBestMatches

public void setBestMatches(cern.colt.list.IntArrayList bms)
Set the indices of best matching neurons for data points


setBestMatches

public void setBestMatches(BMFile bms)
Set the indices of best matching neurons for data points


saveBestMatches

public void saveBestMatches(java.lang.String filename)
Save indices of best matching neurons to a *.bm file. Available after calling train().


getDescriptives

public Descriptives getDescriptives()
Get the statistics about the training data.

Returns:
value of descriptives.

getData

public cern.colt.matrix.DoubleMatrix2D getData()
Get the matrix with the training data.

Returns:
value of data.

setData

public void setData(cern.colt.matrix.DoubleMatrix2D v)
Set the matrix with the training data, patterns in rows. Also updates the number of permutations and initializes array for best matching indices.

Parameters:
v - Value to assign to data.

loadData

public boolean loadData(java.lang.String filename)
Load the training data and keys from a *.lrn file

Parameters:
filename - Name of the file

loadCls

public boolean loadCls(java.lang.String filename)

getDistance

public double getDistance(int row,
                          int index)
Get the distance between a data row and a neuron.

Parameters:
row - Row of data pattern
index - Index of neuron
Returns:
value of distance.

getDistance

public double getDistance(cern.colt.matrix.DoubleMatrix1D data,
                          int index)
Get the distance between a dataVector and a neuron.

Parameters:
data - DataVector
index - Index of neuron
Returns:
value of distance.
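Both getDistance overloads compare a data vector with the weight vector of one neuron using the configured distance function. The sketch below assumes a Euclidean distance and plain arrays in place of the Colt matrix types; the distance function actually in use is whatever was passed to setDistanceFunction, and the names here are hypothetical.

```java
// Hypothetical helper; Euclidean distance is an assumption.
class DistanceSketch {
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Counterpart of getDistance(row, index): look up the row, then compare.
    static double getDistance(double[][] data, double[][] weights,
                              int row, int index) {
        return euclidean(data[row], weights[index]);
    }
}
```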

getDistanceFunction

public cern.colt.function.ThresholdVectorVectorFunction getDistanceFunction()
Get the distance function in data space

Returns:
value of distanceFunction.

setDistanceFunction

public void setDistanceFunction(cern.colt.function.ThresholdVectorVectorFunction v)
Set the distance function in data space

Parameters:
v - Value to assign to distanceFunction.

calcQerror

public double calcQerror()
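This method carries no description. By the usual definition, the quantization error of a map is the average distance between each data point and the weight vector of its best matching neuron. The sketch below computes that quantity under two assumptions that this page does not confirm: that the mean (rather than the sum) is used, and that the distance is Euclidean. All names are hypothetical.

```java
// Hypothetical helper; mean Euclidean distance is an assumption.
class QuantizationErrorSketch {
    static double calcQerror(double[][] data, double[][] weights,
                             int[] bestmatches) {
        if (data.length == 0) {
            return 0.0;
        }
        double total = 0.0;
        for (int row = 0; row < data.length; row++) {
            double[] w = weights[bestmatches[row]];
            double squared = 0.0;
            for (int k = 0; k < w.length; k++) {
                double diff = data[row][k] - w[k];
                squared += diff * diff;
            }
            total += Math.sqrt(squared);
        }
        return total / data.length;
    }
}
```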

getEpochs

public int getEpochs()
Get the number of training epochs.

Returns:
value of epochs.

setEpochs

public void setEpochs(int v)
Set the number of training epochs.

Parameters:
v - Value to assign to epochs.

getGrid

public Grid getGrid()
Get the grid of weight vectors.

Returns:
value of grid.

setGrid

public void setGrid(Grid v)
Set the grid of weight vectors.

Parameters:
v - Value to assign to grid.

getKeys

public cern.colt.list.IntArrayList getKeys()
Get the keys for the data patterns.

Returns:
value of keys.

setKeys

public void setKeys(cern.colt.list.IntArrayList v)
Set the list of unique keys for the data patterns.

Parameters:
v - Value to assign to keys.

getNeighborhood

public Neighborhood getNeighborhood()
Get the neighborhood kernel function.

Returns:
value of neighborhood.

setNeighborhood

public void setNeighborhood(Neighborhood v)
Set the neighborhood kernel function.

Parameters:
v - Value to assign to neighborhood.

getNumberFormat

public java.text.NumberFormat getNumberFormat()
Get the number format

Returns:
value of number format

setNumberFormat

public void setNumberFormat(java.text.NumberFormat v)
Set the number format

Parameters:
v - value of number format

getPattern

public cern.colt.matrix.DoubleMatrix1D getPattern(int row)
Access to a data pattern by row number

Parameters:
row - Row of data pattern

getPermute

public boolean getPermute()
Get the flag whether to permute data vectors before each epoch.

Returns:
value of permute.

setPermute

public void setPermute(boolean v)
Set the flag whether to permute data vectors before each epoch.

Parameters:
v - Value to assign to permute.

getPermutation

public cern.colt.list.IntArrayList getPermutation()
Get the current permutation

Returns:
value of permutation

getRadiusCooling

public Cooling getRadiusCooling()
Get the cooling strategy of the neighborhood radius.

Returns:
value of radiusCooling.

setRadiusCooling

public void setRadiusCooling(Cooling v)
Set the cooling strategy of the neighborhood radius.

Parameters:
v - Value to assign to radiusCooling.

getRateCooling

public Cooling getRateCooling()
Get the cooling strategy of learning rate.

Returns:
value of rateCooling.

setRateCooling

public void setRateCooling(Cooling v)
Set the cooling strategy of learning rate.

Parameters:
v - Value to assign to rateCooling.

setSaveEpoch

public void setSaveEpoch(int e)
Set the number of epochs after which a *.bm and a *.wts file are saved

Parameters:
e - number of epochs

setSaveEpochBoolean

public void setSaveEpochBoolean()
Sets the flag that enables saving during epochs to true.


setInit

public void setInit(java.lang.String initMethod)
Set the method of initialization for the weights

Parameters:
initMethod - name of the method; allowed names are: norm_mean_2std, uni_mean_2std, uni_min_max, pca
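The method names are not explained further here. Reading uni_min_max as "uniformly distributed between the minimum and maximum of each data column" is an assumption based on the name alone; under that reading, an initialization could be sketched as follows. The class UniMinMaxInitSketch is hypothetical and uses java.util.Random rather than the Colt RandomEngine.

```java
import java.util.Random;

// Hypothetical helper; the interpretation of "uni_min_max" is an assumption.
class UniMinMaxInitSketch {
    // Draws each weight component uniformly from [min, max) of its column.
    // Expects at least one data row.
    static double[][] init(double[][] data, int neurons, Random random) {
        int dim = data[0].length;
        double[] min = new double[dim];
        double[] max = new double[dim];
        for (int k = 0; k < dim; k++) {
            min[k] = Double.POSITIVE_INFINITY;
            max[k] = Double.NEGATIVE_INFINITY;
        }
        for (double[] row : data) {
            for (int k = 0; k < dim; k++) {
                min[k] = Math.min(min[k], row[k]);
                max[k] = Math.max(max[k], row[k]);
            }
        }
        double[][] weights = new double[neurons][dim];
        for (int i = 0; i < neurons; i++) {
            for (int k = 0; k < dim; k++) {
                weights[i][k] = min[k] + random.nextDouble() * (max[k] - min[k]);
            }
        }
        return weights;
    }
}
```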

setSavePrefix

public void setSavePrefix(java.lang.String name)
Set the prefix for the filename used to regularly save *.bm, *.wts and *.umx files

Parameters:
name - String with name of file

setSaveUMatrix

public void setSaveUMatrix(boolean flag)
set the flag for saving a U-Matrix every nth epoch

Parameters:
flag - true for save

isCenter

public boolean isCenter()
Returns:
Returns the center.

setCenter

public void setCenter(boolean center)
Parameters:
center - The center to set.

getDistances

public cern.colt.matrix.DoubleMatrix1D getDistances()
Returns:
Returns the distances.

setDistances

public void setDistances(cern.colt.matrix.DoubleMatrix1D distances)
Parameters:
distances - The distances to set.

getBmSearch

public BmSearch getBmSearch()
Returns:
Returns the bmSearch.

setBmSearch

public void setBmSearch(BmSearch bmSearch)
Parameters:
bmSearch - The bmSearch to set.

getOldBestmatches

public cern.colt.list.IntArrayList getOldBestmatches()
Returns:
Returns the oldBestmatches.

setOldBestmatches

public void setOldBestmatches(cern.colt.list.IntArrayList oldBestmatches)
Parameters:
oldBestmatches - The oldBestmatches to set.

getRadius

public int getRadius()
Returns:
Returns the radius.

isBmStatNeeded

public boolean isBmStatNeeded()
Returns:
Returns the bmStatNeeded.

setBmStatNeeded

public void setBmStatNeeded(boolean bmStatNeeded)
Parameters:
bmStatNeeded - The bmStatNeeded to set.

setOnline

public void setOnline(boolean online)
Parameters:
online - The online to set.


Copyright © 2005-2006 Databionics Research Group. All Rights Reserved.