Pipeline of Berkeley Caffe

yjxiong


Training

Overall Process

The entry point of the trainer is in /tools/train_net.cpp. It does the following:

  1. Set up logging and parameters
  2. Initialize the SGD solver, or resume from a solver snapshot
  3. Call the method SGDSolver::Solve() to train the net.

Initializing the Solver

The constructor Solver::Solver() constructs the solver and the network object inside it. It parses the parameter file (a protobuf) identified by the given file name, then calls the initialization method.

The method Solver::Init() initializes the solver using the parameters specified in the protobuf class SolverParameter. It performs the following steps:

  1. Initialize the random number generator seed.
  2. Set the training parameters of the network object.

The solvers are defined in /include/caffe/solver.hpp.
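
The construction sequence described above can be sketched as follows. This is a minimal illustration, not Caffe's actual code: ToySolverParameter, ToyNet, and ToySolver are hypothetical stand-ins, the parameters are passed in already parsed instead of being read from a protobuf file, and the field names are assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <random>
#include <string>

// Hypothetical stand-in for the SolverParameter protobuf message.
struct ToySolverParameter {
  std::string train_net;          // path of the network definition file
  std::int64_t random_seed = -1;  // negative: keep the generator's default seed
  float base_lr = 0.01f;          // example training parameter
};

// Hypothetical network object; it only remembers what it was built from.
class ToyNet {
 public:
  explicit ToyNet(const std::string& definition) : definition_(definition) {}
  const std::string& definition() const { return definition_; }
  void set_learning_rate(float lr) { learning_rate_ = lr; }
  float learning_rate() const { return learning_rate_; }

 private:
  std::string definition_;
  float learning_rate_ = 0.0f;
};

class ToySolver {
 public:
  // The constructor receives the parsed parameters and delegates to Init().
  explicit ToySolver(const ToySolverParameter& param) { Init(param); }

  void Init(const ToySolverParameter& param) {
    assert(!param.train_net.empty());  // a network definition must be named
    param_ = param;
    // Step 1: seed the random number generator if a seed was given.
    if (param_.random_seed >= 0) {
      rng_.seed(static_cast<std::mt19937::result_type>(param_.random_seed));
    }
    // Build the network and keep it alive through a shared_ptr member.
    net_ = std::make_shared<ToyNet>(param_.train_net);
    // Step 2: pass the training parameters on to the network object.
    net_->set_learning_rate(param_.base_lr);
  }

  std::shared_ptr<ToyNet> net() const { return net_; }
  std::mt19937& rng() { return rng_; }

 private:
  ToySolverParameter param_;
  std::mt19937 rng_;
  std::shared_ptr<ToyNet> net_;
};
```

Two ToySolver objects built from the same parameters with a non-negative seed draw the same random sequence, which is the point of seeding in the first step.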

Inside the Solver

SGDSolver is a subclass of Solver, and Solver defines the method Solve(). The Solver::Solve() method runs the following steps:

  1. Initialization and parameter setup
  2. Call Solver::PreSolve() to do some pre-processing
  3. Do a test pass to identify memory errors
  4. Repeat for the configured number of iterations:
    • Call Net::ForwardBackward() on the net object to get the results and gradients
    • Call Solver::ComputeUpdateValue() to get the weight updates
    • Update the network
    • If necessary, take a snapshot by calling Solver::Snapshot()
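
The loop above can be sketched with a toy problem. The code below is only an illustration under stated assumptions: ToyNet holds a single scalar weight with loss 0.5 * (weight - target)^2, ToySolver and its fields (learning_rate, max_iter, snapshot_interval) are hypothetical names, the initial test pass is omitted, and a "snapshot" merely increments a counter.

```cpp
#include <cassert>

// Hypothetical one-weight network: loss = 0.5 * (weight - target)^2.
struct ToyNet {
  float weight = 0.0f;
  float target = 1.0f;
  float diff = 0.0f;  // gradient after ForwardBackward(), update after the solver scales it

  // Runs the forward and backward pass together and returns the loss.
  float ForwardBackward() {
    const float err = weight - target;
    diff = err;  // d(loss)/d(weight)
    return 0.5f * err * err;
  }

  // Applies the update currently stored in diff.
  void Update() { weight -= diff; }
};

struct ToySolver {
  ToyNet net;
  float learning_rate = 0.1f;
  int max_iter = 100;
  int snapshot_interval = 25;
  int snapshots_taken = 0;

  void PreSolve() {}                                        // nothing to prepare here
  void ComputeUpdateValue() { net.diff *= learning_rate; }  // gradient -> update
  void Snapshot() { ++snapshots_taken; }                    // stand-in for writing a file

  // Returns the loss seen in the last iteration.
  float Solve() {
    assert(max_iter >= 0);
    PreSolve();
    float loss = 0.0f;
    for (int iter = 1; iter <= max_iter; ++iter) {
      loss = net.ForwardBackward();  // results and gradients
      ComputeUpdateValue();          // weight updates
      net.Update();                  // update the network
      if (snapshot_interval > 0 && iter % snapshot_interval == 0) {
        Snapshot();
      }
    }
    return loss;
  }
};
```

With the default values, each iteration shrinks the error by a factor of 0.9, so the weight converges toward the target and four snapshots are taken.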

As an extension of the Solver class, the class SGDSolver, which is the one actually used in the Caffe project, overrides 4 methods:

  1. PreSolve(): save the historical weights to the blobs
  2. ComputeUpdateValue(): add weight decay to the updates if necessary
  3. SnapshotSolverState()
  4. RestoreSolverState(): recover the solver state from saved history
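
A minimal sketch of how a subclass might fill in these four hooks is shown below. The class names, the use of std::vector in place of blobs, and the exact update rule (momentum history plus L2 weight decay) are assumptions made for illustration; they are not taken from the Caffe sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical base class exposing only the four hooks listed above.
class ToySolverBase {
 public:
  virtual ~ToySolverBase() = default;
  virtual void PreSolve() = 0;
  virtual void ComputeUpdateValue() = 0;
  virtual std::vector<float> SnapshotSolverState() const = 0;
  virtual void RestoreSolverState(const std::vector<float>& state) = 0;
};

class ToySGDSolver : public ToySolverBase {
 public:
  std::vector<float> data;  // weights
  std::vector<float> diff;  // gradients on entry, updates on exit
  float lr = 0.1f;
  float momentum = 0.9f;
  float weight_decay = 0.0f;

  // Allocate one history entry per weight.
  void PreSolve() override { history_.assign(data.size(), 0.0f); }

  // Assumed rule: history = momentum * history + lr * (grad + decay * weight).
  void ComputeUpdateValue() override {
    assert(history_.size() == data.size() && diff.size() == data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
      const float grad = diff[i] + weight_decay * data[i];
      history_[i] = momentum * history_[i] + lr * grad;
      diff[i] = history_[i];
    }
  }

  // The solver state in this sketch is just the history.
  std::vector<float> SnapshotSolverState() const override { return history_; }
  void RestoreSolverState(const std::vector<float>& state) override {
    history_ = state;
  }

 private:
  std::vector<float> history_;
};
```

Restoring a saved history into a fresh solver lets training continue with the same momentum term it had when the snapshot was taken.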

The Network Object

The network object is the core of both training and testing. Its definition is in /include/caffe/net.hpp. It exposes the method Net::ForwardBackward() to solvers, and every training pass is conducted by calling this method. Net::Update() is then called to update the network weights.

When the solver is constructed, the constructor of the Net class is also called, and the resulting Net object is held by a shared_ptr member of the Solver class.

Initializing the Network

The constructor Net::Net() parses parameters from an input file and calls the method Net::Init().

Net::Init() builds all the layers and sets up their connections.
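
One way to picture "building the layers and setting up their connections" is to resolve each layer's input and output names to shared blob indices. The sketch below assumes layers are described by a hypothetical ToyLayerParam with named bottoms (inputs) and tops (outputs); the real Net::Init() does considerably more, and none of these names come from the Caffe sources.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-layer description.
struct ToyLayerParam {
  std::string name;
  std::vector<std::string> bottoms;  // names of input blobs
  std::vector<std::string> tops;     // names of output blobs
};

class ToyNet {
 public:
  std::vector<std::string> blob_names;
  std::vector<std::string> layer_names;
  std::vector<std::vector<int> > bottom_ids;  // per layer: indices into blob_names
  std::vector<std::vector<int> > top_ids;

  void Init(const std::vector<std::string>& inputs,
            const std::vector<ToyLayerParam>& layer_params) {
    std::map<std::string, int> blob_index;
    // Network inputs are the first blobs.
    for (const std::string& name : inputs) {
      blob_index[name] = AddBlob(name);
    }
    for (const ToyLayerParam& param : layer_params) {
      // Every bottom must already exist as an input or an earlier top.
      std::vector<int> bottoms;
      for (const std::string& name : param.bottoms) {
        const auto it = blob_index.find(name);
        if (it == blob_index.end()) {
          throw std::runtime_error("unknown bottom blob: " + name);
        }
        bottoms.push_back(it->second);
      }
      // A top reuses an existing blob of the same name, else creates one.
      std::vector<int> tops;
      for (const std::string& name : param.tops) {
        const auto it = blob_index.find(name);
        const int id = (it != blob_index.end()) ? it->second : AddBlob(name);
        blob_index[name] = id;
        tops.push_back(id);
      }
      layer_names.push_back(param.name);
      bottom_ids.push_back(bottoms);
      top_ids.push_back(tops);
    }
    assert(bottom_ids.size() == layer_names.size());
  }

 private:
  int AddBlob(const std::string& name) {
    blob_names.push_back(name);
    return static_cast<int>(blob_names.size()) - 1;
  }
};
```

After Init(), a layer's bottom index equals the top index of the layer that produced that blob, which is what "connection" means in this sketch.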

Training the Network

Training is carried out by Net::ForwardBackward() and Net::Update(). In each pass, the gradients and updates are calculated, and the weights of the network are updated accordingly.

Net::ForwardPrefilled() calls each layer's Forward() method in turn to compute the forward-pass outputs, and saves the accumulated loss through the pointer argument loss.

Net::Backward() calls each layer's Backward() method in turn, from the last layer to the first, to compute the gradients and raw updates.

Net::Update() updates the weights on the GPU or CPU.
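
The interplay of these three methods can be sketched with a chain of scalar layers. Everything here is a simplification under stated assumptions: ScaleLayer (y = w * x) and ToyNet are hypothetical, the loss is fixed to 0.5 * output^2, and Update() subtracts the raw gradient without any solver scaling.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar layer: y = w * x.
struct ScaleLayer {
  float w = 1.0f;
  float w_diff = 0.0f;  // d(loss)/d(w), filled in by Backward()
  float input = 0.0f;   // remembered from the forward pass

  float Forward(float x) {
    input = x;
    return w * x;
  }

  // Takes d(loss)/d(y), stores d(loss)/d(w), returns d(loss)/d(x).
  float Backward(float top_diff) {
    w_diff = top_diff * input;
    return top_diff * w;
  }
};

struct ToyNet {
  std::vector<ScaleLayer> layers;
  float input = 1.0f;  // the "prefilled" input
  float output = 0.0f;

  // Runs the layers in order; writes loss = 0.5 * output^2 through the pointer.
  float ForwardPrefilled(float* loss) {
    float x = input;
    for (ScaleLayer& layer : layers) {
      x = layer.Forward(x);
    }
    output = x;
    if (loss != nullptr) {
      *loss = 0.5f * output * output;
    }
    return output;
  }

  // Runs the layers in reverse order, passing the gradient downwards.
  void Backward() {
    assert(!layers.empty());
    float diff = output;  // d(loss)/d(output) for the loss above
    for (std::size_t i = layers.size(); i-- > 0;) {
      diff = layers[i].Backward(diff);
    }
  }

  float ForwardBackward() {
    float loss = 0.0f;
    ForwardPrefilled(&loss);
    Backward();
    return loss;
  }

  // Applies the stored updates to the weights.
  void Update() {
    for (ScaleLayer& layer : layers) {
      layer.w -= layer.w_diff;
    }
  }
};
```

Note that Backward() relies on values cached during the forward pass (input and output), which is why the forward pass must run first.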

The Layer Object

The network consists of a set of layers, which are not necessarily arranged as a simple stack. The base class Layer is defined in /include/caffe/layer.hpp. Every layer exposes two methods, Layer::Forward() and Layer::Backward(). Internally, Forward() calls Forward_cpu() or Forward_gpu() depending on whether GPU training is selected, and Backward() does the same with Backward_cpu() and Backward_gpu().

Thus every type of layer has to implement Forward_cpu() and Forward_gpu(). If the layer supports backward propagation, it also needs to implement Backward_cpu() and Backward_gpu().
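
The dispatch pattern described above can be sketched as a base class with public non-virtual entry points and protected virtual implementations. This is an illustration only: the Mode argument, the std::vector signatures, and ToyReLULayer are assumptions for the example, and since no GPU kernel is available here the "GPU" path simply reuses the CPU code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Mode { CPU, GPU };  // hypothetical switch, passed explicitly here

class ToyLayer {
 public:
  virtual ~ToyLayer() = default;

  // Public entry point: picks the CPU or GPU implementation.
  void Forward(Mode mode, const std::vector<float>& bottom,
               std::vector<float>* top) {
    assert(top != nullptr);
    if (mode == Mode::GPU) {
      Forward_gpu(bottom, top);
    } else {
      Forward_cpu(bottom, top);
    }
  }

  void Backward(Mode mode, const std::vector<float>& top_diff,
                const std::vector<float>& bottom,
                std::vector<float>* bottom_diff) {
    assert(bottom_diff != nullptr);
    if (mode == Mode::GPU) {
      Backward_gpu(top_diff, bottom, bottom_diff);
    } else {
      Backward_cpu(top_diff, bottom, bottom_diff);
    }
  }

 protected:
  // Every layer type must provide both forward implementations.
  virtual void Forward_cpu(const std::vector<float>& bottom,
                           std::vector<float>* top) = 0;
  virtual void Forward_gpu(const std::vector<float>& bottom,
                           std::vector<float>* top) = 0;
  // Layers without backward propagation keep these no-op defaults.
  virtual void Backward_cpu(const std::vector<float>&,
                            const std::vector<float>&, std::vector<float>*) {}
  virtual void Backward_gpu(const std::vector<float>&,
                            const std::vector<float>&, std::vector<float>*) {}
};

class ToyReLULayer : public ToyLayer {
 protected:
  void Forward_cpu(const std::vector<float>& bottom,
                   std::vector<float>* top) override {
    top->resize(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) {
      (*top)[i] = std::max(bottom[i], 0.0f);
    }
  }
  // No real GPU kernel in this sketch: fall back to the CPU code.
  void Forward_gpu(const std::vector<float>& bottom,
                   std::vector<float>* top) override {
    Forward_cpu(bottom, top);
  }
  void Backward_cpu(const std::vector<float>& top_diff,
                    const std::vector<float>& bottom,
                    std::vector<float>* bottom_diff) override {
    assert(top_diff.size() == bottom.size());
    bottom_diff->resize(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) {
      (*bottom_diff)[i] = bottom[i] > 0.0f ? top_diff[i] : 0.0f;
    }
  }
  void Backward_gpu(const std::vector<float>& top_diff,
                    const std::vector<float>& bottom,
                    std::vector<float>* bottom_diff) override {
    Backward_cpu(top_diff, bottom, bottom_diff);
  }
};
```

Callers only ever see Forward() and Backward(), so a new layer type plugs in by overriding the protected methods without touching the calling code.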

This gives us the top-down pipeline of Caffe.

Note: a set of layers used in computer vision tasks is defined in /include/caffe/vision_layers.hpp.
