annotate tvii/logistic_regression.py @ 28:77f68c241b37
[logistic regression] propagate

author    Jeff Hammel <k0scist@gmail.com>
date      Mon, 04 Sep 2017 11:53:23 -0700
parents   c52d8173b056
children  ae0c345ea09d
"""
z = w'x + b
a = sigmoid(z)
L(a,y) = -(y*log(a) + (1-y)*log(1-a))

    [ |  |  | ]
X = [x1 x2 x3]
    [ |  |  | ]

[z1 z2 z3 ... zm] = w'*X + [b b ... b] = [w'*x1+b  w'*x2+b  ...  w'*xm+b]
"""

import numpy as np
from .sigmoid import sigmoid
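
# Note on the vectorized form in the module docstring (explanatory aside, not
# original code): with w of shape (nx, 1), X of shape (nx, m), and a scalar b,
# Z = np.dot(w.T, X) + b relies on numpy broadcasting to add b to every column,
# giving Z of shape (1, m) -- one z value per training example.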


def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation:

    Forward Propagation:
    - You get X
    - You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m)})$
    - You calculate the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right)$

    Here are the two formulas you will be using:

    $$ \frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}$$
    $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}$$

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation; np.log() and np.dot() are useful here.
    """

    # FORWARD PROPAGATION (FROM X TO COST)
    cost = cost_function(w, b, X, Y)  # compute cost

    # BACKWARD PROPAGATION (TO FIND GRADIENT)
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)  # compute activation
    dw = (1./m)*np.dot(X, (A - Y).T)
    db = (1./m)*np.sum(A - Y)

    # sanity check
    assert(A.shape[1] == m)
    assert(dw.shape == w.shape), "dw.shape is {}; w.shape is {}".format(dw.shape, w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    # return gradients
    grads = {"dw": dw,
             "db": db}
    return grads, cost
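
# Illustrative usage of propagate() (a minimal sketch; the array values are
# invented for demonstration and are not part of the original module):
#
#     w = np.zeros((2, 1))
#     b = 0.0
#     X = np.array([[1., 2., -1.], [3., 4., -3.2]])  # 2 features, 3 examples
#     Y = np.array([[1, 0, 1]])
#     grads, cost = propagate(w, b, X, Y)
#     # grads["dw"] has shape (2, 1), grads["db"] is a scalar, cost is a float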


def cost_function(w, b, X, Y):
    """
    Cost function for binary classification:

        yhat = sigmoid(w.T*x + b)

    Interpret yhat as the probability that y = 1.

    Loss function:

        -(y log(yhat) + (1 - y) log(1 - yhat))
    """

    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = np.sum(Y*np.log(A) + (1 - Y)*np.log(1 - A))
    return (-1./m)*cost
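
# For intuition (hypothetical behavior, not part of the original module): when
# the model classifies the examples correctly and confidently, A is close to Y,
# each log term is near zero, and cost_function returns a small positive value;
# a confidently wrong prediction makes the corresponding log term large.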


def logistic_regression(_):
    """the slow way"""
    J = 0
    dw1 = 0
    dw2 = 0
    db = 0
    raise NotImplementedError('TODO')


def logistic_regression(nx):
    dw = np.zeros(nx)
    # TODO
    # z = np.dot(w.T, x) + b  # "broadcasting"
    raise NotImplementedError('TODO')

    # derivatives:
    # dz1 = a1 - y1 ; dz2 = a2 - y2 ; ....
    # dZ = [ dz1 dz2 ... dzm ]
    # Z = w'X + b = np.dot(w', X) + b
    # A = sigmoid(Z)
    # dZ = A - Y
    # dw = (1./m)*X*dZ'
    # db = (1./m)*np.sum(dZ)
    # w -= alpha*dw
    # b -= alpha*db
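

# A minimal gradient-descent sketch built on propagate(); illustrative only --
# the name `optimize` and the `num_iterations`/`learning_rate` parameters are
# assumptions, not part of the original module. It simply applies the update
# rule sketched in the comments above (w -= alpha*dw, b -= alpha*db).
def optimize(w, b, X, Y, num_iterations=100, learning_rate=0.01):
    """repeatedly apply the gradient-descent update using propagate()"""
    costs = []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)   # forward + backward pass
        w = w - learning_rate * grads["dw"]   # update weights
        b = b - learning_rate * grads["db"]   # update bias
        costs.append(cost)
    return w, b, costs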