Consider the above table of 6 observations. Values of X1 and X2 are used to predict the outcome Y. When Y is 1 the outcome is considered to be positive.
Suppose that, while building a decision tree, X2 is chosen as the splitting feature and 2.5 as the splitting value, so that observations with X2 less than 2.5 go to the left node. What will the Gini impurity of the split data be?
Given options :
ANS: Option = 1/2
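The Gini impurity of a split is the weighted average of the Gini impurities of the two child nodes, where each node is weighted by the fraction of the 6 observations it receives. Observations with X2 less than 2.5 go to the left node and the rest go to the right node. For a node in which a fraction p of the observations has Y = 1, the impurity is 1 - p^2 - (1 - p)^2. The impurity of the split is then (n_left/6) * G_left + (n_right/6) * G_right. For a binary outcome, a node's impurity can never exceed 0.5, and it reaches 0.5 only when p = 1/2. A split impurity of 1/2 therefore means that each child node that receives observations holds equal numbers of Y = 1 and Y = 0 observations: each such node has impurity 1 - (1/2)^2 - (1/2)^2 = 0.5, so the weighted average is also 0.5. This is the largest value the Gini impurity can take here, i.e. 1/2, and hence that would be the answer.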
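The calculation can be checked with a short Python sketch. The function names gini and split_gini are made up for this illustration, and the x2 and y values are hypothetical data, not the table from the question, which is not reproduced here. They are chosen only so that each child node ends up evenly split between Y = 1 and Y = 0.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0  # an empty node contributes nothing to the weighted sum
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Weighted Gini impurity after splitting on values < threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical data (NOT the table from the question): 6 observations.
x2 = [1, 2, 3, 3, 4, 4]
y = [1, 0, 1, 0, 1, 0]

# Left node gets 2 observations (one positive), right node gets 4 (two positive),
# so both children have impurity 0.5 and the weighted result is about 0.5.
print(split_gini(x2, y, 2.5))
```

To check the actual answer, replace x2 and y with the X2 and Y columns of the table and keep the threshold at 2.5.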