Adaboost (Adaptive Boosting) Homework

Question 1. Use 3 weak learners (decision stumps) to design an AdaBoost classifier trained on the following data (with Play Baseball Today as the target attribute), showing calculations worked out by hand. Provide the final decision function for the classifier.

| Number of Previous Days of Rain | Play Baseball Today |
|---|---|
| 0 | Yes |
| 1 | Yes |
| 2 | Yes |
| 3 | No |
| 4 | No |
| 5 | No |
| 6 | Yes |
| 7 | Yes |
| 8 | Yes |
| 9 | No |

Solution:

AdaBoost is a boosting algorithm that combines multiple weak learners into a single strong learner. Each weak learner here is a decision stump, i.e., a one-level decision tree that splits on a single threshold.
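
As a concrete illustration, a decision stump over this data can be written in a few lines of Python (a sketch only; the function name and the ±1 label encoding are my own choices, not part of the assignment):

```python
def stump_predict(x, threshold):
    """One-level decision tree: predict +1 ("Yes") if x <= threshold,
    otherwise -1 ("No")."""
    return 1 if x <= threshold else -1

print(stump_predict(1, 2))   # 1  -> Yes
print(stump_predict(5, 2))   # -1 -> No
```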

1. For the first weak learner: (1) Initialize the weights of all instances to 1/10. (2) To find the best decision stump (i.e., the threshold that minimizes the weighted error), I test each threshold from 0 to 9, where the stump predicts Yes when the Number of Previous Days of Rain is at most the threshold and No otherwise. The weighted error is the sum of the weights of all instances that the stump misclassifies:

$$\epsilon = \sum_{i=1}^{n} w_i \, I(y_i \neq h(x_i))$$

where $w_i$ is the weight of instance $i$, $y_i$ is the true label of instance $i$, $h(x_i)$ is the predicted label of instance $i$, and $I(y_i \neq h(x_i))$ is the indicator function that returns 1 if $y_i \neq h(x_i)$ and 0 otherwise. The resulting errors (reproduced by the sketch after the table) are:

| Threshold | Weighted Error |
|---|---|
| 0 | 0.5 |
| 1 | 0.4 |
| 2 | 0.3 |
| 3 | 0.4 |
| 4 | 0.5 |
| 5 | 0.6 |
| 6 | 0.5 |
| 7 | 0.4 |
| 8 | 0.3 |
| 9 | 0.4 |
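
The table above can be reproduced with a short Python sketch (same ±1 encoding as before; the variable names are my own):

```python
X = list(range(10))                       # Number of Previous Days of Rain
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]    # Play Baseball Today: Yes=+1, No=-1
w = [0.1] * 10                            # initial weights, 1/10 each

# Weighted error of the stump "x <= t -> Yes" for every candidate threshold t.
for t in range(10):
    h = [1 if x <= t else -1 for x in X]
    epsilon = sum(wi for wi, yi, hi in zip(w, y, h) if yi != hi)
    print(f"threshold {t}: weighted error {epsilon:.1f}")
```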

(3) Thresholds 2 and 8 tie for the minimum weighted error of 0.3. I choose 2, the first of the two, as the first threshold.

(4) The decision stump is: if Number of Previous Days of Rain <= 2, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value (performance of the stump) for the decision stump. The math formula is:

$$\alpha = \frac{1}{2}\ln\left(\frac{1-\epsilon}{\epsilon}\right)$$

where $\epsilon$ is the weighted error of the decision stump.

$$\alpha = \frac{1}{2}\ln\left(\frac{1-0.3}{0.3}\right) = 0.4237$$

(6) Update the weights of all instances. The math formula is:

$$w_i \leftarrow w_i \, e^{-\alpha y_i h(x_i)}$$

where $w_i$ is the weight of instance $i$, $y_i$ is the true label, $h(x_i)$ is the predicted label, and $\alpha$ is the performance of the decision stump. Correctly classified instances have $y_i h(x_i) = 1$, so their weights are multiplied by $e^{-\alpha}$ and shrink; misclassified instances have $y_i h(x_i) = -1$, so their weights are multiplied by $e^{\alpha}$ and grow.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated Weight |
|---|---|---|---|
| 0 | Yes | 0.1 | 0.06546 |
| 1 | Yes | 0.1 | 0.06546 |
| 2 | Yes | 0.1 | 0.06546 |
| 3 | No | 0.1 | 0.06546 |
| 4 | No | 0.1 | 0.06546 |
| 5 | No | 0.1 | 0.06546 |
| 6 | Yes | 0.1 | 0.1528 |
| 7 | Yes | 0.1 | 0.1528 |
| 8 | Yes | 0.1 | 0.1528 |
| 9 | No | 0.1 | 0.06546 |

(7) Normalize the weights of all instances. The math formula is:

$$w_i \leftarrow \frac{w_i}{\sum_{i=1}^{n} w_i}$$

where $w_i$ is the weight of instance $i$.

| Number of Previous Days of Rain | Play Baseball Today | Normalized Weight |
|---|---|---|
| 0 | Yes | 0.0714 |
| 1 | Yes | 0.0714 |
| 2 | Yes | 0.0714 |
| 3 | No | 0.0714 |
| 4 | No | 0.0714 |
| 5 | No | 0.0714 |
| 6 | Yes | 0.1667 |
| 7 | Yes | 0.1667 |
| 8 | Yes | 0.1667 |
| 9 | No | 0.0714 |

The misclassified instances now carry higher weights than the correctly classified ones, so the next weak learner focuses on them.
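
Steps (5) through (7) can be checked with a few more lines of Python (again a sketch under the same ±1 encoding; it reproduces the 0.0714 and 0.1667 values in the table above):

```python
import math

X = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]    # Yes = +1, No = -1
w = [0.1] * 10

epsilon = 0.3                                      # error of the first stump
alpha = 0.5 * math.log((1 - epsilon) / epsilon)    # 0.4237

h = [1 if x <= 2 else -1 for x in X]               # first stump: threshold 2
w = [wi * math.exp(-alpha * yi * hi) for wi, yi, hi in zip(w, y, h)]
total = sum(w)
w = [wi / total for wi in w]                       # normalize to sum to 1
print([round(wi, 4) for wi in w])
# [0.0714, 0.0714, 0.0714, 0.0714, 0.0714, 0.0714,
#  0.1667, 0.1667, 0.1667, 0.0714]
```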

2. For the second weak learner: (1) Start from the weights as updated and normalized after the first weak learner. (2) Calculate the weighted error for each threshold from 0 to 9:

| Threshold | Weighted Error |
|---|---|
| 0 | 0.642857 |
| 1 | 0.571429 |
| 2 | 0.500000 |
| 3 | 0.571429 |
| 4 | 0.642857 |
| 5 | 0.714286 |
| 6 | 0.547619 |
| 7 | 0.380952 |
| 8 | 0.214286 |
| 9 | 0.285714 |

(3) The threshold with the minimum weighted error (0.214286) is 8, so 8 is the second threshold.

(4) The decision stump is: if Number of Previous Days of Rain <= 8, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value.

$$\alpha = \frac{1}{2}\ln\left(\frac{1-0.214286}{0.214286}\right) = 0.650$$

(6) Update and normalize the weights of all instances.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated Weight |
|---|---|---|---|
| 0 | Yes | 0.0714 | 0.045455 |
| 1 | Yes | 0.0714 | 0.045455 |
| 2 | Yes | 0.0714 | 0.045455 |
| 3 | No | 0.0714 | 0.166667 |
| 4 | No | 0.0714 | 0.166667 |
| 5 | No | 0.0714 | 0.166667 |
| 6 | Yes | 0.1667 | 0.106061 |
| 7 | Yes | 0.1667 | 0.106061 |
| 8 | Yes | 0.1667 | 0.106061 |
| 9 | No | 0.0714 | 0.045455 |

3. For the third weak learner: (1) Start from the weights as updated and normalized after the second weak learner.

(2) Calculate the weighted error for each threshold from 0 to 9.

| Threshold | Weighted Error |
|---|---|
| 0 | 0.409091 |
| 1 | 0.363636 |
| 2 | 0.318182 |
| 3 | 0.484848 |
| 4 | 0.651515 |
| 5 | 0.818182 |
| 6 | 0.712121 |
| 7 | 0.606061 |
| 8 | 0.500000 |
| 9 | 0.545455 |

(3) The threshold with the minimum weighted error (0.318182) is 2, so 2 is the third threshold (the same stump as the first weak learner).

(4) The decision stump is: if Number of Previous Days of Rain <= 2, then Play Baseball Today = Yes, else Play Baseball Today = No.

(5) Calculate the Alpha value.

$$\alpha = \frac{1}{2}\ln\left(\frac{1-0.318182}{0.318182}\right) = 0.381$$

(6) Update and normalize the weights of all instances.

| Number of Previous Days of Rain | Play Baseball Today | Weight | Updated Weight |
|---|---|---|---|
| 0 | Yes | 0.045455 | 0.033333 |
| 1 | Yes | 0.045455 | 0.033333 |
| 2 | Yes | 0.045455 | 0.033333 |
| 3 | No | 0.166667 | 0.122222 |
| 4 | No | 0.166667 | 0.122222 |
| 5 | No | 0.166667 | 0.122222 |
| 6 | Yes | 0.106061 | 0.166667 |
| 7 | Yes | 0.106061 | 0.166667 |
| 8 | Yes | 0.106061 | 0.166667 |
| 9 | No | 0.045455 | 0.033333 |

4. In summary: (1) The final weight table:

| Data Point | Initial Weight | Weight after 1st Weak Learner | Weight after 2nd Weak Learner | Weight after 3rd Weak Learner |
|---|---|---|---|---|
| 0 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 1 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 2 | 0.1 | 0.071429 | 0.045455 | 0.033333 |
| 3 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 4 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 5 | 0.1 | 0.071429 | 0.166667 | 0.122222 |
| 6 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 7 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 8 | 0.1 | 0.166667 | 0.106061 | 0.166667 |
| 9 | 0.1 | 0.071429 | 0.045455 | 0.033333 |

(2) Alphas for the weak learners:

- 1st decision stump (threshold = 2): $\alpha_1 = 0.424$
- 2nd decision stump (threshold = 8): $\alpha_2 = 0.650$
- 3rd decision stump (threshold = 2): $\alpha_3 = 0.381$
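
The whole three-round procedure condenses into one loop; the sketch below (same ±1 encoding and naming conventions as the earlier snippets) reproduces the thresholds and alpha values worked out by hand:

```python
import math

X = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]    # Yes = +1, No = -1
w = [0.1] * 10                            # initial weights

for round_no in range(1, 4):
    # Weighted error of the stump "x <= t -> Yes" for every threshold t.
    errors = []
    for t in range(10):
        h = [1 if x <= t else -1 for x in X]
        errors.append(sum(wi for wi, yi, hi in zip(w, y, h) if yi != hi))

    # Ties are broken by the first (smallest) threshold, as done by hand.
    t_best = errors.index(min(errors))
    epsilon = errors[t_best]
    alpha = 0.5 * math.log((1 - epsilon) / epsilon)
    print(f"round {round_no}: threshold={t_best}, alpha={alpha:.3f}")

    # Re-weight: shrink correct instances, grow misclassified ones, normalize.
    h = [1 if x <= t_best else -1 for x in X]
    w = [wi * math.exp(-alpha * yi * hi) for wi, yi, hi in zip(w, y, h)]
    total = sum(w)
    w = [wi / total for wi in w]
# round 1: threshold=2, alpha=0.424
# round 2: threshold=8, alpha=0.650
# round 3: threshold=2, alpha=0.381
```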

(3) Final Decision Function:

$$f(x) = \operatorname{sign}\left(\sum_{t=1}^{T} \alpha_t h_t(x)\right)$$

where $T$ is the number of weak learners, $\alpha_t$ is the performance of the $t$-th weak learner, and $h_t(x)$ is the prediction of the $t$-th weak learner, encoded as $+1$ for Yes and $-1$ for No.

$$f(x) = \operatorname{sign}\big(0.424\, h_1(x) + 0.650\, h_2(x) + 0.381\, h_3(x)\big)$$

The sign of this weighted sum gives the final classification, where $h_1$ and $h_3$ predict Yes ($+1$) when $x \le 2$ and $h_2$ predicts Yes when $x \le 8$.
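
As executable code (a sketch following the encoding used in the snippets above):

```python
def h(x, threshold):
    return 1 if x <= threshold else -1    # +1 = Yes, -1 = No

def f(x):
    vote = 0.424 * h(x, 2) + 0.650 * h(x, 8) + 0.381 * h(x, 2)
    return "Yes" if vote > 0 else "No"    # sign of the weighted vote

print(f(1))   # Yes
print(f(9))   # No
```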