API - Cost¶
To keep TensorLayer simple, we minimize the number of cost functions as much as we can, and we encourage you to use TensorFlow's own functions instead. For example, you can implement L1, L2 and sum regularization with tf.contrib.layers.l1_regularizer, tf.contrib.layers.l2_regularizer and tf.contrib.layers.sum_regularizer; see the TensorFlow API documentation.
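For instance, a minimal sketch of how these regularizers are called (assuming TensorFlow 1.x with tf.contrib available; W is a hypothetical weight variable):
import tensorflow as tf

W = tf.get_variable('W', shape=[784, 800])

l1_penalty = tf.contrib.layers.l1_regularizer(0.001)(W)   # L1 penalty on W
l2_penalty = tf.contrib.layers.l2_regularizer(0.001)(W)   # L2 penalty on W
# sum_regularizer combines several regularizers into a single one
l1_l2_penalty = tf.contrib.layers.sum_regularizer(
    [tf.contrib.layers.l1_regularizer(0.001),
     tf.contrib.layers.l2_regularizer(0.001)])(W)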
Custom cost function¶
TensorLayer provides a simple way to create your own cost function. Take the MLP below as an example.
network = tl.layers.InputLayer(x, name='input_layer')
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu1')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu2')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
network = tl.layers.DenseLayer(network, n_units=10, act=tl.activation.identity, name='output_layer')
The network parameters will be [W1, b1, W2, b2, W_out, b_out]. You can then apply L2 regularization to the weight matrices of the first two layers as follows.
y = network.outputs
cost = tl.cost.cross_entropy(y, y_)
cost = cost + tf.contrib.layers.l2_regularizer(0.001)(network.all_params[0]) \
            + tf.contrib.layers.l2_regularizer(0.001)(network.all_params[2])
Regularization of Weights¶
After initializing the variables, the information about the network parameters can be inspected with network.print_params().
sess.run(tf.initialize_all_variables())
network.print_params()
param 0: (784, 800) (mean: -0.000000, median: 0.000004 std: 0.035524)
param 1: (800,) (mean: 0.000000, median: 0.000000 std: 0.000000)
param 2: (800, 800) (mean: 0.000029, median: 0.000031 std: 0.035378)
param 3: (800,) (mean: 0.000000, median: 0.000000 std: 0.000000)
param 4: (800, 10) (mean: 0.000673, median: 0.000763 std: 0.049373)
param 5: (10,) (mean: 0.000000, median: 0.000000 std: 0.000000)
num of params: 1276810
The output of the network is network.outputs, so the cross-entropy can be defined as shown below. To regularize the weights, network.all_params contains all parameters of the network. In this case, network.all_params = [W1, b1, W2, b2, W_out, b_out], corresponding to param 0, 1 … 5 shown by network.print_params(). Max-norm regularization on W1 and W2 can then be performed as follows.
y = network.outputs
# Alternatively, you can use tl.cost.cross_entropy(y, y_) instead.
cross_entropy = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=y_))
cost = cross_entropy
cost = cost + tl.cost.maxnorm_regularizer(1.0)(network.all_params[0]) \
            + tl.cost.maxnorm_regularizer(1.0)(network.all_params[2])
In addition, all of TensorFlow's regularizers, such as tf.contrib.layers.l2_regularizer, can be used with TensorLayer.
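For example, a short sketch (under the same setup as above) that adds an L2 penalty to every weight matrix of the MLP, assuming the weight matrices sit at the even indices of network.all_params as shown by network.print_params():
l2 = tf.contrib.layers.l2_regularizer(0.001)
for W in network.all_params[::2]:   # W1, W2, W_out; biases are at the odd indices
    cost = cost + l2(W)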
Regularization of Activation outputs¶
The instance method network.print_layers() prints the outputs of all layers in order. To regularize activation outputs, you can use network.all_layers, which contains the outputs of all layers. For example, if you want to apply an L1 penalty to the activations of the first hidden layer, simply add tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1]) to the cost function, as in the sketch after the layer listing below.
network.print_layers()
layer 0: Tensor("dropout/mul_1:0", shape=(?, 784), dtype=float32)
layer 1: Tensor("Relu:0", shape=(?, 800), dtype=float32)
layer 2: Tensor("dropout_1/mul_1:0", shape=(?, 800), dtype=float32)
layer 3: Tensor("Relu_1:0", shape=(?, 800), dtype=float32)
layer 4: Tensor("dropout_2/mul_1:0", shape=(?, 800), dtype=float32)
layer 5: Tensor("add_2:0", shape=(?, 10), dtype=float32)
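Putting this together, a minimal sketch of adding such an L1 activation penalty (lambda_l1 is a hypothetical penalty coefficient):
lambda_l1 = 0.001  # hypothetical L1 penalty coefficient
y = network.outputs
cost = tl.cost.cross_entropy(y, y_)
# network.all_layers[1] is the output of the first hidden layer (layer 1 above)
cost = cost + tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1])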
cross_entropy(output, target) | Returns the TensorFlow expression of cross-entropy of two distributions; implements softmax internally.
mean_squared_error(output, target) | Returns the TensorFlow expression of mean-squared-error of two distributions.
cross_entropy_seq(output, target, …) | Returns the expression of cross-entropy of two sequences; implements softmax internally.
li_regularizer(scale) | Li regularization removes the neurons of the previous layer; "i" represents inputs.
lo_regularizer(scale) | Lo regularization removes the neurons of the current layer; "o" represents outputs.
maxnorm_regularizer([scale]) | Max-norm regularization returns a function that can be used to apply max-norm regularization to weights.
maxnorm_o_regularizer(scale) | Max-norm output regularization removes the neurons of the current layer.
maxnorm_i_regularizer(scale) | Max-norm input regularization removes the neurons of the previous layer.
Cost functions¶
tensorlayer.cost.cross_entropy(output, target)[source]¶
Returns the TensorFlow expression of cross-entropy of two distributions; implements softmax internally.
Parameters:
- output : TensorFlow variable
  A distribution with shape: [None, n_feature].
- target : TensorFlow variable
  A distribution with shape: [None, n_feature].
Notes
About cross-entropy: wiki.
The code is borrowed from: here.
Examples
>>> ce = tl.cost.cross_entropy(y_logits, y_target_logits)
tensorlayer.cost.mean_squared_error(output, target)[source]¶
Returns the TensorFlow expression of mean-squared-error of two distributions.
Parameters:
- output : TensorFlow variable
  A distribution with shape: [None, n_feature].
- target : TensorFlow variable
  A distribution with shape: [None, n_feature].
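A minimal usage sketch (assuming output is network.outputs and y_ is a placeholder of the same shape):
>>> mse = tl.cost.mean_squared_error(network.outputs, y_)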
tensorlayer.cost.cross_entropy_seq(output, target, batch_size, num_steps)[source]¶
Returns the expression of cross-entropy of two sequences; implements softmax internally.
Parameters:
- output : TensorFlow variable
  2D tensor [batch_size*num_steps, n_units of output layer]
- target : TensorFlow variable
  2D tensor [batch_size, num_steps], needs to be reshaped.
- batch_size : int
  RNN batch_size, number of concurrent processes.
- num_steps : int
  sequence length
Examples
>>> # see the PTB tutorial for more details
>>> input_data = tf.placeholder(tf.int32, [batch_size, num_steps])
>>> targets = tf.placeholder(tf.int32, [batch_size, num_steps])
>>> cost = tl.cost.cross_entropy_seq(network.outputs, targets, batch_size, num_steps)
Regularization functions¶
tensorlayer.cost.li_regularizer(scale)[source]¶
Li regularization removes the neurons of the previous layer; "i" represents inputs.
Returns a function that can be used to apply group Li regularization to weights.
The implementation follows TensorFlow contrib.
Parameters:
- scale : float
  A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns:
- A function with signature `li(weights, name=None)` that applies Li regularization.
Raises:
- ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
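A minimal usage sketch (the scale value 0.001 is illustrative; the penalty is added to an existing cost):
>>> cost = cost + tl.cost.li_regularizer(0.001)(network.all_params[0])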
tensorlayer.cost.lo_regularizer(scale)[source]¶
Lo regularization removes the neurons of the current layer; "o" represents outputs.
Returns a function that can be used to apply group Lo regularization to weights.
The implementation follows TensorFlow contrib.
Parameters:
- scale : float
  A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns:
- A function with signature `lo(weights, name=None)` that applies Lo regularization.
Raises:
- ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_regularizer(scale=1.0)[source]¶
Max-norm regularization returns a function that can be used to apply max-norm regularization to weights. About max-norm: wiki.
The implementation follows TensorFlow contrib.
Parameters:
- scale : float
  A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns:
- A function with signature `mn(weights, name=None)` that applies max-norm regularization.
Raises:
- ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_o_regularizer(scale)[source]¶
Max-norm output regularization removes the neurons of the current layer.
Returns a function that can be used to apply max-norm regularization to each column of the weight matrix.
The implementation follows TensorFlow contrib.
Parameters:
- scale : float
  A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns:
- A function with signature `mn_o(weights, name=None)` that applies max-norm output regularization.
Raises:
- ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
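A minimal usage sketch (the scale value is illustrative; the penalty is applied to the columns of the first weight matrix):
>>> cost = cost + tl.cost.maxnorm_o_regularizer(1.0)(network.all_params[0])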
tensorlayer.cost.maxnorm_i_regularizer(scale)[source]¶
Max-norm input regularization removes the neurons of the previous layer.
Returns a function that can be used to apply max-norm regularization to each row of the weight matrix.
The implementation follows TensorFlow contrib.
Parameters:
- scale : float
  A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns:
- A function with signature `mn_i(weights, name=None)` that applies max-norm input regularization.
Raises:
- ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.