HW1 TA hours
TAs
ntu.mlta@gmail.com
Outline
Simple linear regression using gradient descent (with adagrad)
How to extract features
[Figure: layout of the raw training data — each day (2014/1/1, 2014/1/2, 2014/1/3, ...) occupies 18 rows (one per feature) by 24 columns (one per hour)]
How to extract features
(Pseudo code)
1. Declare an 18-dimensional vector (Data), one entry per feature
2. for i_th row in training data:
3.     Data[i_th row % 18].append(every element in i_th row)
4. (while doing so, you can also convert the RAINFALL feature's NR entries to 0)
Data then becomes an 18 × 5760 vector (480 hours for each of the 12 months).
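A minimal Python/NumPy sketch of this step, assuming the raw file is named train.csv, is Big5-encoded with one header row, and that the first three columns of each row hold the date, station, and feature name (these details are assumptions; the slides only describe the 18-rows-per-day layout):

(Python sketch)
import csv
import numpy as np

data = [[] for _ in range(18)]                         # one list per feature
with open('train.csv', 'r', encoding='big5') as f:     # assumed file name and encoding
    rows = list(csv.reader(f))[1:]                     # skip the header row
for i, row in enumerate(rows):
    for value in row[3:]:                              # first 3 columns: date, station, feature name
        data[i % 18].append(0.0 if value == 'NR' else float(value))   # NR (no rain) -> 0
data = np.array(data)                                  # shape: (18, 24 * number_of_days)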
How to extract features
[Figure: Data concatenated month by month — January data, February data, March data, ...; each month is an 18 × 480 block, and every 10 consecutive hours is taken as one training sample]
How to extract features
(Pseudo code)
1. Declare train_x to store the first 9 hours of data, and train_y to record the 10th hour's PM2.5 value
2. for i = January, February, ...:
3.     for every window of 10 consecutive hours:
4.         train_x.append(all data from the first 9 hours)
5.         train_y.append(the PM2.5 value of the 10th hour)
6. Add a bias term to every entry of train_x
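A sketch of this window extraction, reusing the data array from the previous sketch and assuming 12 months of 480 hours each and that PM2.5 sits at row index 9 (the feature ordering is an assumption):

(Python sketch)
train_x, train_y = [], []
for month in range(12):                                # 12 months, 480 hours each
    start = month * 480
    for hour in range(480 - 9):                        # windows must stay within one month
        window = data[:, start + hour : start + hour + 9]          # 18 features x 9 hours
        train_x.append(np.concatenate(([1.0], window.flatten())))  # prepend the bias term
        train_y.append(data[9, start + hour + 9])      # 10th hour's PM2.5 (row index 9 assumed)
train_x = np.array(train_x)                            # shape: (#samples, 1 + 18*9)
train_y = np.array(train_y)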
Implementing linear regression
(Pseudo code)
1. Declare the weight vector, an initial learning rate, and # of iterations
2. for i_th iteration:
3.     y' = dot product of train_x and the weight vector
4.     L = y' - train_y
5.     gra = 2*np.dot(train_x.T, L)
6.     weight vector -= learning rate * gra
In formula form: step 3 computes y' = train_x · w, step 4 the residual L = y' − train_y (a p-dimensional vector, one entry per training sample), and step 5 the gradient gra = 2 · train_xᵀ · L.
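A runnable sketch of this loop, reusing train_x and train_y from above; the learning rate and iteration count are placeholder values, not recommendations from the slides:

(Python sketch)
w = np.zeros(train_x.shape[1])           # one weight per feature, including the bias
lr = 1e-6                                # placeholder learning rate
for _ in range(10000):                   # placeholder iteration count
    y_hat = train_x.dot(w)               # step 3: predictions
    L = y_hat - train_y                  # step 4: residuals
    gra = 2 * train_x.T.dot(L)           # step 5: gradient of the squared error
    w -= lr * gra                        # step 6: plain gradient-descent update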
Adagrad
Implementing linear regression
(Pseudo code)
1. Declare the weight vector, an initial learning rate, and # of iterations
   Declare prev_gra to accumulate the gradient of every iteration
2. for i_th iteration:
3.     y' = dot product of train_x and the weight vector
4.     L = y' - train_y
5.     gra = 2*np.dot(train_x.T, L)
       prev_gra += gra**2
       ada = np.sqrt(prev_gra)
6.     weight vector -= learning rate * gra / ada
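The same loop with the Adagrad update; the epsilon term is an assumption added here to avoid dividing by zero on the first iteration, and the hyperparameters are again placeholders:

(Python sketch)
w = np.zeros(train_x.shape[1])
lr = 1.0                                 # Adagrad tolerates a larger initial learning rate
prev_gra = np.zeros(train_x.shape[1])    # running sum of squared gradients
eps = 1e-8                               # assumption: not in the slides
for _ in range(10000):
    y_hat = train_x.dot(w)
    L = y_hat - train_y
    gra = 2 * train_x.T.dot(L)
    prev_gra += gra ** 2
    ada = np.sqrt(prev_gra) + eps
    w -= lr * gra / ada                  # per-dimension learning-rate scaling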
Predicting PM2.5
(Pseudo code)
1. read test_x.csv
2. for every 18 rows:
3.     test_x.append([1])
4.     test_x.append(the 9 hours of data)
5. test_y = np.dot(test_x, weight vector)
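A sketch of the prediction step, reusing the weight vector w from the sketch above and assuming test_x.csv has no header and each row holds an id, a feature name, and 9 hourly values (the column layout and encoding are assumptions):

(Python sketch)
import csv
import numpy as np

with open('test_x.csv', 'r', encoding='big5') as f:   # file name from the slide; encoding assumed
    rows = list(csv.reader(f))

test_x = []
for i in range(0, len(rows), 18):        # every 18 rows is one test instance
    block = []
    for row in rows[i:i + 18]:
        block.extend(0.0 if v == 'NR' else float(v) for v in row[2:])   # 9 hourly values per feature
    test_x.append([1.0] + block)         # prepend the bias term, matching train_x
test_x = np.array(test_x)

test_y = test_x.dot(w)                   # predicted PM2.5 for each test instance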