mirror of https://github.com/davisking/dlib.git
Made loss layers output the gradients by assigning them to the output rather
than adding them. This way, the gradient buffer can be used as scratch space during the loss computation.
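
One way to picture the scratch-space benefit described above is a softmax cross-entropy loss that builds its intermediate probabilities directly in the gradient buffer. The following is a minimal standalone sketch over std::vector, not dlib code; the name `softmax_loss_assign` and its signature are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of dlib): softmax cross-entropy for one sample.
// `grad` is overwritten, never accumulated into. It first serves as scratch
// space for the exponentials, then ends up holding dLoss/dLogits.
inline float softmax_loss_assign(const std::vector<float>& logits,
                                 std::size_t label,
                                 std::vector<float>& grad)
{
    assert(!logits.empty());
    assert(label < logits.size());
    assert(grad.size() == logits.size());

    // Subtract the maximum logit for numerical stability.
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    // Scratch phase: store the unnormalized exponentials in grad.
    float sum = 0;
    for (std::size_t i = 0; i < logits.size(); ++i)
    {
        grad[i] = std::exp(logits[i] - max_logit);
        sum += grad[i];
    }

    // Normalize in place so grad now holds the softmax probabilities.
    for (std::size_t i = 0; i < grad.size(); ++i)
        grad[i] /= sum;

    const float loss = -std::log(grad[label]);

    // Final phase: the gradient of the loss is (probabilities - one_hot(label)).
    grad[label] -= 1;
    return loss;
}
```

With accumulate semantics the same function would need a separate temporary for the probabilities, since writing them into the buffer would destroy whatever gradient was already stored there.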
parent e2a67dec4c
commit 5f5c46f49e
@@ -77,7 +77,7 @@ namespace dlib
         if (temp > 0)
         {
             loss += scale*temp;
-            g[i] += -scale*y;
+            g[i] = -scale*y;
         }
     }
     return loss;
@@ -110,9 +110,9 @@ namespace dlib
           of sub matches the expected labels given by truth. Let's write the loss
           function as L(input_tensor, truth, sub).
         - Then compute_loss() computes the gradient of L() with respect to the
-          outputs in sub. Specifically, compute_loss() adds the gradients into sub
-          by performing the following tensor additions, for all valid i:
-            - layer<i>(sub).get_gradient_input() += the gradient of
+          outputs in sub. Specifically, compute_loss() assigns the gradients into
+          sub by performing the following tensor assignments, for all valid i:
+            - layer<i>(sub).get_gradient_input() = the gradient of
               L(input_tensor,truth,sub) with respect to layer<i>(sub).get_output().
         - returns L(input_tensor,truth,sub)
 !*/
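
The practical effect of the assignment contract can be illustrated with a small standalone hinge loss modeled on the first hunk. This is a sketch under stated assumptions, not the dlib implementation: `hinge_loss_assign` is a hypothetical name, plain std::vector stands in for dlib tensors, and the explicit zero written for samples that already satisfy the margin is a choice made in this sketch that does not appear in the hunk shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a binary hinge loss (not dlib code).
// out:   network outputs, one per sample
// truth: labels, each +1 or -1
// g:     gradient buffer, overwritten with dLoss/dOut
inline float hinge_loss_assign(const std::vector<float>& out,
                               const std::vector<float>& truth,
                               std::vector<float>& g)
{
    assert(!out.empty());
    assert(truth.size() == out.size());
    assert(g.size() == out.size());

    const float scale = 1.0f / static_cast<float>(out.size());
    float loss = 0;
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        const float y = truth[i];
        const float temp = 1 - y*out[i];
        if (temp > 0)
        {
            loss += scale*temp;
            g[i] = -scale*y;   // assigned, as in the new version of the hunk
        }
        else
        {
            // Assumption of this sketch: since the buffer is no longer
            // expected to start at zero, clear entries with no loss.
            g[i] = 0;
        }
    }
    return loss;
}
```

Calling this function twice on the same buffer leaves the same gradient both times, whereas the earlier `+=` form would have doubled it unless the caller zeroed the buffer between calls.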