annotate docs/matrix.txt @ 55:0908b6cd3217
[regression] add better cost function for sigmoids
author | Jeff Hammel <k0scist@gmail.com> |
---|---|
date | Sun, 24 Sep 2017 15:30:15 -0700 |
parents | 857a606783e1 |
children |
35:37a9fb876f54 [documentation] add notes for matrices + vectorization, Jeff Hammel <k0scist@gmail.com>
    [ |  |      |  ]
X = [ x1 x2 ... xm ] = A0
    [ |  |      |  ]
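
The layout above can be sketched in NumPy by stacking the training examples as columns. The three example vectors and their values are made up for illustration; only the column-per-example convention comes from the notes.

```python
import numpy as np

# Three hypothetical training examples, each with n0 = 2 features.
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 4.0])
x3 = np.array([5.0, 6.0])

# Stack the examples as columns: X has shape (n0, m) = (2, 3).
X = np.column_stack([x1, x2, x3])
A0 = X  # the input is treated as the "activation" of layer 0

print(X.shape)   # (2, 3)
print(X[:, 0])   # [1. 2.]  (the first column is x1)
```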
36:433c475f42db [documentation] more matrix notes, Jeff Hammel <k0scist@gmail.com>, parent 35

Z1 = W1 X + b1

A1 = sigmoid(Z1)

Z2 = W2 A1 + b2
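
A minimal sketch of these three steps in NumPy, vectorized over all m examples. The function name `forward`, the layer sizes, the layer-1 weight matrix name `W1`, and the sigmoid on the output layer (`A2`) are assumptions for illustration; the notes only give the equations.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass for all m examples at once.

    Shapes (n0 inputs, n1 hidden units, n2 outputs, m examples):
    X (n0, m), W1 (n1, n0), b1 (n1, 1), W2 (n2, n1), b2 (n2, 1).
    """
    Z1 = W1 @ X + b1      # (n1, m); b1 broadcasts across the m columns
    A1 = sigmoid(Z1)
    Z2 = W2 @ A1 + b2     # (n2, m)
    A2 = sigmoid(Z2)      # assumption: sigmoid output layer
    return Z1, A1, Z2, A2

# Hypothetical sizes and random data, only to exercise the shapes.
rng = np.random.default_rng(0)
n0, n1, n2, m = 3, 4, 1, 5
X = rng.normal(size=(n0, m))
W1, b1 = rng.normal(size=(n1, n0)), np.zeros((n1, 1))
W2, b2 = rng.normal(size=(n2, n1)), np.zeros((n2, 1))

Z1, A1, Z2, A2 = forward(X, W1, b1, W2, b2)
print(Z1.shape, A2.shape)  # (4, 5) (1, 5)
```

Keeping the biases as (n, 1) columns lets NumPy broadcasting add them to every example without an explicit loop.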

     [ --- ]
W1 = [ --- ]
     [ --- ]

Each row of W1 holds the weights of one hidden unit, so `W1 x1` is a
column vector with n1 entries, where `x1` is the first training example.
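
A short check of this column-by-column view, assuming W1 has shape (n1, n0) and X has shape (n0, m). The sizes and random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n0, n1, m = 3, 4, 5
W1 = rng.normal(size=(n1, n0))   # one row per hidden unit
X = rng.normal(size=(n0, m))     # one column per training example

x1 = X[:, [0]]                   # first example, kept as an (n0, 1) column
col = W1 @ x1                    # (n1, 1) column vector

# Column 0 of W1 @ X equals W1 applied to the first example alone.
assert np.allclose((W1 @ X)[:, [0]], col)
print(col.shape)  # (4, 1)
```

Indexing with `[0]` rather than `0` keeps the example two-dimensional, so the result stays a column instead of collapsing to a 1-D array.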
44:857a606783e1 [documentation] notes + stubs on gradient descent, Jeff Hammel <k0scist@gmail.com>, parent 37

Y = [ y1 y2 ... ym ]

For a two-layer network:

dZ2 = A2 - Y

dW2 = (1/m) dZ2 A1'

db2 = (1/m) np.sum(dZ2, axis=1, keepdims=True)
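
The output-layer gradients can be tried on a tiny example. The values below are made up, and the weight gradient is called `dW2` here to match `W2`. The last line shows why `keepdims=True` matters: without it the sum loses its column axis and no longer has the shape of `b2`.

```python
import numpy as np

m = 4
A2 = np.array([[0.9, 0.2, 0.6, 0.4]])   # made-up predictions, shape (1, m)
Y = np.array([[1.0, 0.0, 1.0, 0.0]])    # labels, shape (1, m)
A1 = np.arange(12, dtype=float).reshape(3, m) / 10.0  # made-up hidden activations, (n1=3, m)

dZ2 = A2 - Y                                            # (1, m)
dW2 = (1.0 / m) * (dZ2 @ A1.T)                          # (1, n1), same shape as W2
db2 = (1.0 / m) * np.sum(dZ2, axis=1, keepdims=True)    # (1, 1), same shape as b2

print(dW2.shape, db2.shape)        # (1, 3) (1, 1)
print(np.sum(dZ2, axis=1).shape)   # (1,)  the column axis is dropped without keepdims
```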

dZ1 = W2' dZ2 * g1'(Z1)
: W2' dZ2 : an (n1, m) matrix
: * : element-wise product
: g1'(Z1) : derivative of the layer-1 activation g1, applied element-wise
  (here ' is a derivative, not a transpose)
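
A sketch of this step that keeps the matrix product and the element-wise product apart. It assumes g1 is the sigmoid used for A1 and multiplies by its derivative, s * (1 - s) with s = sigmoid(Z1); `sigmoid_prime` is a helper name introduced here, and the sizes and random values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the logistic function, element-wise."""
    s = sigmoid(z)
    return s * (1.0 - s)

rng = np.random.default_rng(2)
n1, n2, m = 4, 1, 5
W2 = rng.normal(size=(n2, n1))
dZ2 = rng.normal(size=(n2, m))
Z1 = rng.normal(size=(n1, m))

back = W2.T @ dZ2                 # matrix product, shape (n1, m)
dZ1 = back * sigmoid_prime(Z1)    # element-wise product, shape (n1, m)
print(back.shape, dZ1.shape)      # (4, 5) (4, 5)
```

In NumPy, `@` is the matrix product and `*` is element-wise, which mirrors the two operators in the formula.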

dW1 = (1/m) dZ1 X'

db1 = (1/m) np.sum(dZ1, axis=1, keepdims=True)
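
Putting the six gradient formulas together, the sketch below runs one backward pass and compares a single entry of dW1 against a central-difference estimate. Several things here are assumptions rather than part of the notes: sigmoid activations in both layers, a mean binary cross-entropy cost (the cost for which dZ2 = A2 - Y holds with a sigmoid output), and the helper names `forward`, `cost`, and `backward`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    A1 = sigmoid(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    return A1, A2

def cost(A2, Y):
    """Mean binary cross-entropy (an assumption; the notes do not state the cost)."""
    m = Y.shape[1]
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

def backward(X, Y, W2, A1, A2):
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = (dZ2 @ A1.T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)   # sigmoid'(Z1) = A1 * (1 - A1)
    dW1 = (dZ1 @ X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Hypothetical sizes and random data.
rng = np.random.default_rng(3)
n0, n1, m = 3, 4, 6
X = rng.normal(size=(n0, m))
Y = (rng.random(size=(1, m)) > 0.5).astype(float)
W1, b1 = rng.normal(size=(n1, n0)), np.zeros((n1, 1))
W2, b2 = rng.normal(size=(1, n1)), np.zeros((1, 1))

A1, A2 = forward(X, W1, b1, W2, b2)
dW1, db1, dW2, db2 = backward(X, Y, W2, A1, A2)

# Central-difference estimate of d(cost)/d(W1[0, 0]).
eps = 1e-6
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 0] += eps
Wm[0, 0] -= eps
numeric = (cost(forward(X, Wp, b1, W2, b2)[1], Y)
           - cost(forward(X, Wm, b1, W2, b2)[1], Y)) / (2 * eps)

# True when the analytic and numeric gradients agree.
print(abs(numeric - dW1[0, 0]) < 1e-6)
```

Each gradient has the same shape as the parameter it updates, which is a quick sanity check before running gradient descent.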