Made loss layers output the gradients by assigning them to the output rather
than adding them.  This way, the gradient buffer can be used as scratch space
during the loss computation.
commit 5f5c46f49e
parent e2a67dec4c
Author: Davis King
Date:   2015-11-21 10:42:39 -05:00

2 changed files with 4 additions and 4 deletions
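To illustrate the point of the change, here is a minimal self-contained sketch
(the names compute_toy_loss, scores, labels, and grad are illustrative, not
dlib's actual API): because the final gradient is written with = instead of +=,
the loop is free to stash an intermediate value in the gradient buffer early in
each iteration, and the caller never needs to zero the buffer beforehand.

#include <cstddef>
#include <vector>

// Hinge-style loss over +1/-1 labels.  Since the gradient is assigned rather
// than accumulated, grad doubles as scratch space during the computation.
double compute_toy_loss(
    const std::vector<double>& scores,
    const std::vector<double>& labels,  // each entry is +1 or -1
    std::vector<double>& grad           // same size as scores
)
{
    const double scale = 1.0/scores.size();
    double loss = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        grad[i] = labels[i]*scores[i];   // scratch: temporarily holds y*f(x)
        const double temp = 1 - grad[i];
        if (temp > 0)
        {
            loss += scale*temp;
            grad[i] = -scale*labels[i];  // final gradient overwrites the scratch value
        }
        else
        {
            // Assign everywhere.  Under the old += contract this element would
            // instead have relied on the caller pre-zeroing the buffer.
            grad[i] = 0;
        }
    }
    return loss;
}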


@@ -77,7 +77,7 @@ namespace dlib
                 if (temp > 0)
                 {
                     loss += scale*temp;
-                    g[i] += -scale*y;
+                    g[i] = -scale*y;
                 }
             }
             return loss;


@@ -110,9 +110,9 @@ namespace dlib
             of sub matches the expected labels given by truth. Let's write the loss
             function as L(input_tensor, truth, sub).
             - Then compute_loss() computes the gradient of L() with respect to the
-              outputs in sub. Specifically, compute_loss() adds the gradients into sub
-              by performing the following tensor additions, for all valid i:
-                - layer<i>(sub).get_gradient_input() += the gradient of
+              outputs in sub. Specifically, compute_loss() assigns the gradients into
+              sub by performing the following tensor assignments, for all valid i:
+                - layer<i>(sub).get_gradient_input() = the gradient of
                   L(input_tensor,truth,sub) with respect to layer<i>(sub).get_output().
             - returns L(input_tensor,truth,sub)
     !*/
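The updated contract can be modeled with a toy example.  In this sketch,
toy_subnet and its members are stand-ins for the real subnetwork types named in
the spec above, not dlib's actual classes: whatever get_gradient_input() held
before compute_loss() runs is irrelevant, because every element is assigned.

#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for the subnetwork objects referenced by the spec.
struct toy_subnet
{
    std::vector<double> output;          // plays the role of get_output()
    std::vector<double> gradient_input;  // plays the role of get_gradient_input()
};

// A squared-error loss obeying the assignment contract: it writes, never adds.
double compute_loss(const std::vector<double>& truth, toy_subnet& sub)
{
    double loss = 0;
    for (std::size_t i = 0; i < sub.output.size(); ++i)
    {
        const double err = sub.output[i] - truth[i];
        loss += 0.5*err*err;
        sub.gradient_input[i] = err;  // '=' per the contract, not '+='
    }
    return loss;
}

int main()
{
    toy_subnet sub;
    sub.output         = {1.0, -2.0};
    sub.gradient_input = {99.0, 99.0};  // leftover garbage from a prior iteration
    compute_loss({0.5, -1.0}, sub);
    // The garbage was overwritten, not accumulated into:
    assert(sub.gradient_input[0] == 0.5);
    assert(sub.gradient_input[1] == -1.0);
    return 0;
}

Under the old += contract the caller would have had to zero gradient_input
before every call; the assignment contract drops that requirement and is what
lets the buffer serve as scratch space, as the commit message notes.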