view docs/summary_of_gradient_descent.txt @ 53:673a295fd09c
[documentation] cache coursera notes
author | Jeff Hammel <k0scist@gmail.com> |
---|---|
date | Sun, 24 Sep 2017 14:42:56 -0700 |
# Summary of Gradient Descent

For a two-layer network. The `[]`s denote the layer number, `'` denotes prime (the derivative), and `T` denotes transpose. `*` is the element-wise product.

## Scalar implementation

Gradients for a single training example `x`:

```
dz[2] = a[2] - y
dW[2] = dz[2]a[1]T
db[2] = dz[2]
dz[1] = W[2]Tdz[2] * g[1]'(z[1])
dW[1] = dz[1]xT
db[1] = dz[1]
```

## Vectorized Implementation

Gradients averaged over `m` training examples stacked as the columns of `X`:

```
dZ[2] = A[2] - Y
dW[2] = (1/m)dZ[2]A[1]T
db[2] = (1/m)*np.sum(dZ[2], axis=1, keepdims=True)
dZ[1] = W[2]TdZ[2] * g[1]'(Z[1])
dW[1] = (1/m)dZ[1]XT
db[1] = (1/m)*np.sum(dZ[1], axis=1, keepdims=True)
```
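
The vectorized formulas can be sketched in NumPy roughly as follows. This is a minimal illustration, not the notes' own code. It assumes a `tanh` hidden activation (so `g[1]'(Z[1]) = 1 - A[1]**2`), a sigmoid output unit, and a cross-entropy cost, which is the setting in which `dZ[2] = A[2] - Y` holds. The function names (`forward`, `cost`, `gradients`) and the `params` dictionary layout are hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def forward(X, params):
    """Forward pass. X has shape (n_x, m): one training example per column."""
    Z1 = params["W1"] @ X + params["b1"]
    A1 = np.tanh(Z1)               # assumed hidden activation g[1]
    Z2 = params["W2"] @ A1 + params["b2"]
    A2 = sigmoid(Z2)               # assumed output activation
    return A1, A2


def cost(A2, Y):
    """Cross-entropy cost averaged over the m examples (assumed loss)."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)


def gradients(X, Y, params):
    """Vectorized backward pass for the two-layer network."""
    m = X.shape[1]
    A1, A2 = forward(X, params)

    dZ2 = A2 - Y
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m

    # For tanh, g[1]'(Z1) = 1 - tanh(Z1)**2 = 1 - A1**2; `*` is element-wise.
    dZ1 = (params["W2"].T @ dZ2) * (1 - A1 ** 2)
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m

    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}


# Usage: 3 input features, 4 hidden units, 1 output unit, 5 examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = (rng.random((1, 5)) > 0.5).astype(float)
params = {
    "W1": rng.normal(size=(4, 3)) * 0.1,
    "b1": np.zeros((4, 1)),
    "W2": rng.normal(size=(1, 4)) * 0.1,
    "b2": np.zeros((1, 1)),
}
grads = gradients(X, Y, params)
```

Each gradient has the same shape as the parameter it belongs to. `keepdims=True` keeps `db[1]` and `db[2]` as column vectors, so they broadcast across examples the same way `b[1]` and `b[2]` do in the forward pass.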