Adding Mish activation function (#1938)

* Adding Mish activation function (see the definition sketch after this list)

* Bug fixed

* Added test for Mish

* Removed unwanted comments

* Simplified calculation and removed comments

* Kernel added and gradient computation simplified

* Gradient simplified

* Corrected gradient calculations (see the gradient sketch after this list)

* Compute the output directly when the input is greater than 8

* Minor correction

* Remove unnecessary pgrad for Mish

* Removed CUDNN calls

* Add standalone CUDA implementation of the Mish activation function

* Fix in-place gradient in the CUDA version; refactor a little

* Swap delta and omega

* Need to have src (=x) (and not dest) available for Mish

* Add a test case that makes sure that cuda::mish and cpu::mish return the same results (see the CUDA sketch after this list)

* Minor tweaking to keep the previous behaviour

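For reference, Mish is defined as mish(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). Below is a minimal C++ sketch of that definition, including the large-input shortcut mentioned in the commits (for x > 8, tanh(softplus(x)) has already saturated to 1, so returning x directly is safe and also avoids overflow in exp(x)). The function name and exact cutoff handling are illustrative, not necessarily identical to dlib's code.

```cpp
#include <cmath>

// Minimal sketch of Mish, mish(x) = x * tanh(softplus(x)), with the
// large-input shortcut from the commits above: for x > 8, softplus(x) is
// essentially x and tanh of it is essentially 1, so mish(x) is essentially x.
// The function name and exact cutoff handling are illustrative only.
inline float mish(float x)
{
    if (x > 8.0f)
        return x;                                    // saturated region
    const float softplus = std::log1p(std::exp(x));  // ln(1 + e^x)
    return x * std::tanh(softplus);
}
```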
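The simplified gradient referenced in the commits can be written in the commonly cited closed form mish'(x) = e^x * omega / delta^2, with omega = 4(x + 1) + 4e^{2x} + e^{3x} + e^x(4x + 6) and delta = e^{2x} + 2e^x + 2. The sketch below uses those variable names, echoing the "Swap delta and omega" commit; whether this is the exact expression used in the patch is an assumption.

```cpp
#include <cmath>

// Hedged sketch of the simplified Mish gradient, using the closed form
//     mish'(x) = e^x * omega / delta^2,
//     omega = 4(x + 1) + 4e^{2x} + e^{3x} + e^x (4x + 6),
//     delta = e^{2x} + 2e^x + 2.
// The names "delta" and "omega" echo the commit messages; this is not
// necessarily the exact expression used in the patch.
inline float mish_gradient(float x)
{
    const float e  = std::exp(x);
    const float e2 = e * e;
    const float e3 = e2 * e;
    const float omega = 4.0f * (x + 1.0f) + 4.0f * e2 + e3 + e * (4.0f * x + 6.0f);
    const float delta = e2 + 2.0f * e + 2.0f;
    return e * omega / (delta * delta);
}
```

As a quick sanity check, this gives mish'(0) = 1 * 15 / 25 = 0.6, which matches the known slope of Mish at the origin.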
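Finally, a rough sketch of the standalone CUDA kernel idea and of a CPU-vs-GPU consistency check in the spirit of the cuda::mish vs cpu::mish test. The kernel name, launch configuration, input range, and comparison logic are assumptions for illustration; dlib's actual implementation and test harness differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Elementwise Mish kernel: dest[i] = src[i] * tanh(softplus(src[i])).
__global__ void mish_kernel(float* dest, const float* src, size_t n)
{
    const size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        const float x  = src[i];
        const float sp = log1pf(expf(x));   // softplus(x)
        dest[i] = x * tanhf(sp);
    }
}

int main()
{
    const size_t n = 1024;
    std::vector<float> src(n), gpu_out(n);
    for (size_t i = 0; i < n; ++i)
        src[i] = -10.0f + 20.0f * i / n;    // sample inputs in [-10, 10)

    float *d_src = nullptr, *d_dest = nullptr;
    cudaMalloc(&d_src, n * sizeof(float));
    cudaMalloc(&d_dest, n * sizeof(float));
    cudaMemcpy(d_src, src.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    mish_kernel<<<(n + 255) / 256, 256>>>(d_dest, d_src, n);
    cudaMemcpy(gpu_out.data(), d_dest, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Compare against a CPU reference, analogous in spirit to the
    // cuda::mish vs cpu::mish test mentioned above.
    float max_err = 0.0f;
    for (size_t i = 0; i < n; ++i)
    {
        const float cpu = src[i] * std::tanh(std::log1p(std::exp(src[i])));
        max_err = std::max(max_err, std::fabs(cpu - gpu_out[i]));
    }
    std::printf("max |cpu - gpu| = %g\n", max_err);

    cudaFree(d_src);
    cudaFree(d_dest);
    return 0;
}
```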
Co-authored-by: Juha Reunanen <juha.reunanen@tomaattinen.com>
Authored by Manjunath Bhat on 2020-01-15 16:34:02 +05:30; committed by Davis E. King
parent a82bf1536e
commit d766f5e82e