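- Speaker: 李詠玄, Ph.D. candidate (Department of Statistics, National Chengchi University)
- Title: Classification Strategies for Time-Constrained Cost-Sensitive Decision Tree Induction with Missing Values
- Time: June 3, 2019 (Monday), 2:00 PM
- Venue: Room 050101, 逸仙樓, National Chengchi University
Abstract: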
The induction of cost-sensitive decision trees is one of the most extensively investigated issues in the study of classification. Among these studies, the algorithm recently developed by Chen, Wu, and Tang (2016) generates a time-constrained minimal-cost tree and is the first to build a cost-sensitive tree within a time limit. Their experiments show that the algorithm performs highly satisfactorily. However, real data often contain missing values. Therefore, I extend time-constrained cost-sensitive tree induction to handle missing values at the same time, employing two methods to deal with incomplete data. The first is the active feature acquisition (AFA) approach, and the second is model-based imputation. AFA acquires the true values of the missing features, at a cost that must be taken into account in the tree induction process. Imputing the missing values from the available data, by contrast, is a more statistical strategy that may require little cost and time, but it can lead to misclassification. The proposed strategies incorporate AFA and imputation into time-constrained cost-sensitive tree induction for different scenarios. A simulation study and a real-world data analysis are conducted to examine the performance of the proposed algorithm.
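
The trade-off between acquiring a missing value and imputing it can be illustrated with a small Python sketch. This is not the algorithm presented in the talk; the `MissingValueOption` class, the `expected_cost` and `choose_option` helpers, and all numbers are hypothetical, and the cost model (handling cost plus error rate times misclassification cost) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MissingValueOption:
    """A hypothetical way of handling one missing feature value."""
    name: str
    handling_cost: float  # cost paid to acquire or impute the value
    handling_time: float  # time spent acquiring or imputing the value
    error_rate: float     # assumed misclassification probability afterwards


def expected_cost(option: MissingValueOption, misclassification_cost: float) -> float:
    # Assumed cost model: handling cost plus expected misclassification cost.
    return option.handling_cost + option.error_rate * misclassification_cost


def choose_option(
    options: List[MissingValueOption],
    misclassification_cost: float,
    time_budget: float,
) -> Optional[MissingValueOption]:
    # Keep only the options that fit within the time limit.
    feasible = [o for o in options if o.handling_time <= time_budget]
    if not feasible:
        return None
    # Among the feasible options, pick the one with the lowest expected cost.
    return min(feasible, key=lambda o: expected_cost(o, misclassification_cost))


# Illustrative numbers only: acquisition is expensive and slow but accurate,
# imputation is cheap and fast but more error-prone.
afa = MissingValueOption("acquire", handling_cost=20.0, handling_time=5.0, error_rate=0.05)
imputation = MissingValueOption("impute", handling_cost=1.0, handling_time=1.0, error_rate=0.25)

loose = choose_option([afa, imputation], misclassification_cost=100.0, time_budget=10.0)
tight = choose_option([afa, imputation], misclassification_cost=100.0, time_budget=2.0)
```

With these made-up numbers, acquisition has the lower expected cost when the time budget allows it, while a tight time budget leaves imputation as the only feasible choice. This is one way to picture why different scenarios may call for different strategies.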