comparison docs/summary_of_gradient_descent.txt @ 53:673a295fd09c
[documentation] cache coursera notes
author | Jeff Hammel <k0scist@gmail.com> |
---|---|
date | Sun, 24 Sep 2017 14:42:56 -0700 |
52:0b3daccfc36c | 53:673a295fd09c |
---|---|
# Summary of Gradient Descent

Backpropagation gradients for a two-layer network. The `[]`s denote the layer number,
`'` denotes prime (the derivative), and `T` denotes transpose. `*` is the elementwise product.

## Scalar implementation

```
dz[2] = a[2] - y
dW[2] = dz[2] a[1]T
db[2] = dz[2]
dz[1] = W[2]T dz[2] * g[1]'(z[1])
dW[1] = dz[1] xT
db[1] = dz[1]
```
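
The single-example equations above can be sketched in NumPy as follows. This is a minimal illustration, not code from the notes: the function name `backprop_single`, the choice of `tanh` for `g[1]`, and the sigmoid output with cross-entropy loss (which is what makes `dz[2] = a[2] - y` hold) are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, assumed here as the output activation."""
    return 1.0 / (1.0 + np.exp(-z))


def backprop_single(x, y, W1, b1, W2, b2):
    """Gradients for one example (hypothetical helper).

    Assumed shapes: x (n_x, 1), W1 (n_h, n_x), b1 (n_h, 1),
    W2 (1, n_h), b2 (1, 1); y is a scalar label in {0, 1}.
    """
    # Forward pass, with g[1] = tanh (an assumption).
    z1 = W1 @ x + b1
    a1 = np.tanh(z1)
    z2 = W2 @ a1 + b2
    a2 = sigmoid(z2)

    # Backward pass, line for line with the equations above.
    dz2 = a2 - y
    dW2 = dz2 @ a1.T
    db2 = dz2
    dz1 = (W2.T @ dz2) * (1.0 - a1 ** 2)  # tanh'(z1) = 1 - tanh(z1)^2
    dW1 = dz1 @ x.T
    db1 = dz1
    return dW1, db1, dW2, db2
```

Each gradient has the same shape as the parameter it belongs to, which is a quick sanity check when wiring this up.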


## Vectorized Implementation

For `m` training examples stacked as the columns of `X`, `Y`, `Z`, and `A`:

```
dZ[2] = A[2] - Y
dW[2] = (1/m) dZ[2] A[1]T
db[2] = (1/m)*np.sum(dZ[2], axis=1, keepdims=True)
dZ[1] = W[2]T dZ[2] * g[1]'(Z[1])
dW[1] = (1/m) dZ[1] XT
db[1] = (1/m)*np.sum(dZ[1], axis=1, keepdims=True)
```
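
A runnable NumPy sketch of the vectorized equations, under stated assumptions: examples are the columns of `X`, `g[1]` is `tanh`, and the output layer is a sigmoid trained with cross-entropy loss. The function name `backprop_vectorized` is hypothetical, and the `dW1` line follows from the scalar form `dW[1] = dz[1]xT` averaged over the `m` examples.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, assumed here as the output activation."""
    return 1.0 / (1.0 + np.exp(-z))


def backprop_vectorized(X, Y, W1, b1, W2, b2):
    """Gradients averaged over m examples (hypothetical helper).

    Assumed shapes: X (n_x, m), Y (1, m), W1 (n_h, n_x), b1 (n_h, 1),
    W2 (1, n_h), b2 (1, 1).
    """
    m = X.shape[1]

    # Forward pass; b1 and b2 broadcast across the m columns.
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)

    # Backward pass, following the vectorized equations.
    dZ2 = A2 - Y
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)  # tanh'(Z1) = 1 - tanh(Z1)^2
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2
```

`keepdims=True` keeps `db1` and `db2` as column vectors of shape `(n, 1)` rather than flat arrays of shape `(n,)`, so they match `b1` and `b2` in a gradient descent update.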