Optimizer#

Fortnet provides several algorithms for iteratively optimizing the weight and bias parameters of the network during training. Depending on the problem and dataset, the choice of optimizer can strongly affect convergence and the overall training behavior. Currently, the following optimizers are available:

Steepest Descent (SD) [11]
Conjugate Gradient (CG) [12]
Fast Inertial Relaxation Engine (FIRE) [13]
Limited-Memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) [14]

General Optimizer Settings#

Some parameters of the Training block apply to all optimizers listed above. The table below lists these entries:

Setting	Type	Default	Note
NIterations	Integer	Huge()	Max. number of training iterations
Threshold	Float	Tiny()	Gradient termination criterion
NPrintout	Integer	10	Standard output print interval
NSaveNet	Integer	100	Netstat output save interval
MinDisplacement	Float	1e-06	Min. displacement in parameters
MaxDisplacement	Float	1e+04	Max. displacement in parameters
Shuffle	Logical	No	Randomly shuffle order of gradient calculations
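
To illustrate how the two termination settings and the print interval might interact, here is a minimal Python sketch of a generic training loop. It is not Fortnet's implementation: the function names (run_training, grad_fn, step_fn) are hypothetical, and it assumes that Threshold is compared against the largest absolute gradient component, which the table above does not specify.

```python
def run_training(grad_fn, step_fn, params, n_iterations=10000,
                 threshold=1e-8, n_printout=10):
    """Generic loop: stop on a small gradient or an exhausted iteration budget.

    grad_fn(params) -> list of gradient components (hypothetical callback)
    step_fn(params, grad) -> updated parameters (hypothetical callback)
    """
    for iteration in range(1, n_iterations + 1):
        grad = grad_fn(params)
        # Assumption: the criterion uses the largest absolute component.
        grad_max = max(abs(g) for g in grad)
        if iteration % n_printout == 0:
            print(f"iter {iteration:6d}  max|grad| = {grad_max:.3e}")
        if grad_max < threshold:
            return params, iteration, True   # gradient criterion met
        params = step_fn(params, grad)
    return params, n_iterations, False       # iteration budget exhausted


# Toy problem: minimize sum(p**2), whose gradient is 2*p.
final, n_done, converged = run_training(
    grad_fn=lambda p: [2.0 * x for x in p],
    step_fn=lambda p, g: [x - 0.1 * gi for x, gi in zip(p, g)],
    params=[1.0, -2.0],
)
```

In this sketch the loop ends as soon as either criterion is met, so a very small Threshold combined with a large NIterations effectively leaves the iteration budget as the only stopping condition.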

Optimizer-Specific Settings#

In addition to the universal parameters, there are also optimizer-specific options. These are the subject of the following sections.

Steepest Descent#

Example Training block in HSD format for the fortnet_in.hsd user input:

Training = SD {
  Threshold = 1e-08
  NIterations = 10000
  NPrintout = 10
  NSaveNet = 100
  MinDisplacement = 1e-10
  MaxDisplacement = 5e-02
  LearningRate = 0.01
  Shuffle = No
}

Optimizer-specific settings:

Setting	Type	Default	Note
LearningRate	Float	0.01	Uniform scaling factor applied to all gradient components
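
The role of LearningRate can be sketched in a few lines of Python. This is an illustration of a textbook steepest descent update, not Fortnet's code: the function sd_step is hypothetical, and the way MaxDisplacement limits the step (rescaling the whole step so that its largest component does not exceed the limit) is an assumption.

```python
def sd_step(params, grad, learning_rate=0.01, max_displacement=5e-2):
    """One steepest descent update: move against the gradient."""
    # Every gradient component is scaled by the same learning rate.
    step = [-learning_rate * g for g in grad]
    # Assumption: rescale the step if its largest component is too big.
    largest = max(abs(s) for s in step)
    if largest > max_displacement:
        scale = max_displacement / largest
        step = [s * scale for s in step]
    return [p + s for p, s in zip(params, step)]


# A small gradient gives an unclamped step, a large one gets rescaled.
small = sd_step([1.0], [2.0])
large = sd_step([0.0], [100.0])
```

Under these assumptions, a larger LearningRate speeds up progress only until the displacement limit takes over, at which point the step length is set by MaxDisplacement alone.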

Conjugate Gradient#

Example Training block in HSD format for the fortnet_in.hsd user input:

Training = CG {
  Threshold = 1e-08
  NIterations = 10000
  NPrintout = 10
  NSaveNet = 100
  MinDisplacement = 1e-10
  MaxDisplacement = 5e-02
  Shuffle = No
}

Currently, there are no specific parameters for the conjugate gradient method.

FIRE#

Example Training block in HSD format for the fortnet_in.hsd user input:

Training = FIRE {
  Threshold = 1e-08
  NIterations = 10000
  NPrintout = 10
  NSaveNet = 100
  MinDisplacement = 1e-10
  MaxDisplacement = 5e-02
  Shuffle = No
}

Currently, there are no specific parameters for the FIRE method.

L-BFGS#

Example Training block in HSD format for the fortnet_in.hsd user input:

Training = LBFGS {
  Threshold = 1e-08
  NIterations = 10000
  NPrintout = 10
  NSaveNet = 100
  MinDisplacement = 1e-10
  MaxDisplacement = 5e-02
  MaxForQNDisplacement = No
  LineMin = Yes
  Memory = 1000
  Shuffle = No
}

Optimizer-specific settings:

Setting	Type	Default	Note
MaxForQNDisplacement	Logical	No	Consider max. step for quasi-Newton direction
LineMin	Logical	Yes	Use a line search
Memory	Integer	1000	Number of past iterations to store
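
To show what the Memory setting controls conceptually, the following Python sketch implements the standard L-BFGS two-loop recursion on a bounded history of past steps. It is a textbook version, not Fortnet's implementation; the names dot and lbfgs_direction are hypothetical, and it assumes that each stored iteration contributes one pair of parameter change s and gradient change y.

```python
from collections import deque


def dot(a, b):
    """Inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, history):
    """Two-loop recursion: approximate -H^{-1} * grad from (s, y) pairs.

    history holds (s, y) pairs, oldest first, where s is a parameter
    change and y the corresponding gradient change.
    """
    q = list(grad)
    alphas = []
    # First loop: newest pair to oldest pair.
    for s, y in reversed(history):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        alphas.append((alpha, rho))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Initial Hessian scaling taken from the most recent pair.
    if history:
        s, y = history[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest pair.
    for (s, y), (alpha, rho) in zip(history, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


# A deque with maxlen drops the oldest pair once the memory is full.
memory = 1000
history = deque(maxlen=memory)
history.append(([1.0], [4.0]))  # s and y for f(x) = 2 * x**2
direction = lbfgs_direction([8.0], history)
```

With an empty history the routine falls back to the plain negative gradient. A larger memory lets the approximation use more curvature information, at the cost of storing and processing more pairs per iteration.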
