annotate tvii/logistic_regression.py @ 28:77f68c241b37

[logistic regression] propagate
author Jeff Hammel <k0scist@gmail.com>
date Mon, 04 Sep 2017 11:53:23 -0700
parents c52d8173b056
children ae0c345ea09d

"""
z = w'x + b
a = sigmoid(z)
L(a,y) = -(y*log(a) + (1-y)*log(1-a))

    [ |  |  |  ]
X = [ x1 x2 x3 ]
    [ |  |  |  ]

[z1 z2 z3 ... zm] = w'*X + [b b ... b] = [w'*x1+b  w'*x2+b  ...  w'*xm+b]
"""


import numpy as np
from .sigmoid import sigmoid
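
# A minimal sketch of the vectorization described in the module docstring,
# with assumed toy shapes and values (not part of the original notes): one
# np.dot plus scalar broadcasting computes all m pre-activations at once.
#
#   w = np.array([[0.5], [-0.5]])            # (nx, 1) with nx = 2
#   X = np.array([[1., 2., 3.],
#                 [4., 5., 6.]])             # (nx, m) with m = 3
#   Z = np.dot(w.T, X) + 1.0                 # (1, m); b broadcasts over columns
#   # Z == [[-0.5, -0.5, -0.5]]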


def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation.

    Forward Propagation:
    - You get X
    - You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m)})$
    - You calculate the cost function:
      $J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right]$

    Here are the two formulas you will be using:

    $$ \frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}$$
    $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}$$

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation, using np.log() and np.dot()
    """

    # FORWARD PROPAGATION (FROM X TO COST)
    cost = cost_function(w, b, X, Y)  # compute cost

    # BACKWARD PROPAGATION (TO FIND GRADIENT)
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)  # compute activation
    dw = (1./m)*np.dot(X, (A - Y).T)
    db = (1./m)*np.sum(A - Y)

    # sanity check
    assert A.shape[1] == m
    assert dw.shape == w.shape, "dw.shape is {}; w.shape is {}".format(dw.shape, w.shape)
    assert db.dtype == float
    cost = np.squeeze(cost)
    assert cost.shape == ()

    # return gradients
    grads = {"dw": dw,
             "db": db}
    return grads, cost
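
# Usage sketch for propagate() (toy values assumed for illustration, not
# from the original notes): with zero weights every activation is 0.5, so
# the cost starts at log(2) ~= 0.693 regardless of the labels.
#
#   w, b = np.zeros((2, 1)), 0.
#   X = np.array([[1., -2.], [3., 0.5]])     # two features, two examples
#   Y = np.array([[1., 0.]])
#   grads, cost = propagate(w, b, X, Y)
#   grads["dw"].shape                        # (2, 1), same as w
#   float(cost)                              # ~0.6931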


def cost_function(w, b, X, Y):
    """
    Cost function for binary classification:

    yhat = sigmoid(w.T*x + b)

    interpret yhat as the probability that y=1

    Loss function:
    L(yhat, y) = -(y*log(yhat) + (1 - y)*log(1 - yhat))
    """

    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = np.sum(Y*np.log(A) + (1 - Y)*np.log(1 - A))
    return (-1./m)*cost
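
# Note (an addition, not in the original notes): np.log(A) returns -inf when
# saturation drives an activation to exactly 0.0 or 1.0 in float64; a common
# guard is to clip the activations first, e.g.:
#
#   A = np.clip(A, 1e-15, 1 - 1e-15)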


def logistic_regression_slow(_):
    """the slow way (renamed to avoid shadowing by the stub below)"""
    J = 0
    dw1 = 0
    dw2 = 0
    db = 0
    raise NotImplementedError('TODO')
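
# A sketch of the per-example loop the "slow way" stub is driving at (the
# names x1, x2, y and the two-feature restriction are assumptions):
#
#   for i in range(m):
#       z = w1*x1[i] + w2*x2[i] + b
#       a = sigmoid(z)
#       J += -(y[i]*np.log(a) + (1 - y[i])*np.log(1 - a))
#       dz = a - y[i]                        # dL/dz for example i
#       dw1 += x1[i]*dz
#       dw2 += x2[i]*dz
#       db += dz
#   J, dw1, dw2, db = J/m, dw1/m, dw2/m, db/m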


def logistic_regression(nx):
    dw = np.zeros(nx)
    # TODO
    # z = np.dot(w.T, x) + b  # broadcasting
    raise NotImplementedError('TODO')

# derivatives:
# dz1 = a1 - y1 ; dz2 = a2 - y2 ; ...
# dZ = [ dz1 dz2 ... dzm ]
# Z = w'X + b = np.dot(w.T, X) + b
# A = sigmoid(Z)
# dZ = A - Y
# dw = (1./m)*X*dZ' = (1./m)*np.dot(X, dZ.T)
# db = (1./m)*np.sum(dZ)
# w -= alpha*dw
# b -= alpha*db
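

# A gradient-descent sketch tying the notes above together; the name
# `optimize` and the `alpha`/`num_iterations` parameters are assumptions,
# not part of this module yet.
def optimize(w, b, X, Y, num_iterations=100, alpha=0.01):
    """repeatedly call propagate() and take one gradient step per iteration"""
    costs = []
    for _ in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        w = w - alpha*grads["dw"]            # w -= alpha*dw
        b = b - alpha*grads["db"]            # b -= alpha*db
        costs.append(cost)
    return w, b, costs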