模式识别与人工智能
模式識彆與人工智能
모식식별여인공지능
Moshi Shibie yu Rengong Zhineng
2014年
2期
134-140
,共7页
有序分类%有序决策树%非平衡割点
有序分類%有序決策樹%非平衡割點
유서분류%유서결책수%비평형할점
Ordinal Classification%Ordinal Decision Tree%Unstable Cut-Point
由于基于排序熵的有序决策树在扩展属性选取时,需计算每个条件属性的每个割点处的排序互信息,并通过对比这些排序互信息的大小来确定最大值(最大值对应的属性为扩展属性),计算复杂度较高。针对此问题,文中将割点分为平衡割点和非平衡割点两部分,建立一个数学模型,从理论上证明排序互信息最大值不会在平衡割点处达到,而只能在非平衡割点处达到。这说明在计算排序互信息时只需遍历非平衡割点,而无需再计算平衡割点处的值,从而使决策树构建的计算效率得到较大程度提高。数值实验验证此结果。
由于基于排序熵的有序決策樹在擴展屬性選取時,需計算每箇條件屬性的每箇割點處的排序互信息,併通過對比這些排序互信息的大小來確定最大值(最大值對應的屬性為擴展屬性),計算複雜度較高。針對此問題,文中將割點分為平衡割點和非平衡割點兩部分,建立一箇數學模型,從理論上證明排序互信息最大值不會在平衡割點處達到,而隻能在非平衡割點處達到。這說明在計算排序互信息時隻需遍歷非平衡割點,而無需再計算平衡割點處的值,從而使決策樹構建的計算效率得到較大程度提高。數值實驗驗證此結果。
유우기우배서적적유서결책수재확전속성선취시,수계산매개조건속성적매개할점처적배서호신식,병통과대비저사배서호신식적대소래학정최대치(최대치대응적속성위확전속성),계산복잡도교고。침대차문제,문중장할점분위평형할점화비평형할점량부분,건립일개수학모형,종이론상증명배서호신식최대치불회재평형할점처체도,이지능재비평형할점처체도。저설명재계산배서호신식시지수편력비평형할점,이무수재계산평형할점처적치,종이사결책수구건적계산효솔득도교대정도제고。수치실험험증차결과。
When the expanded attributes are selected for decision tree learning based on rank entropy, computing the rank mutual information of every single cut for each of the continuous-valued attributes is required to get the expanded attribute by comparing the values of rank mutual information. Therefore, the computational complexity is high. Aiming at this problem, cut-points are divided into stable and unstable cut-points and a mathematical model is established in this paper. The proposed model theoretically proves that the rank mutual information function achieves its maximum not at stable cut-points, but at unstable cut-points. The result means that in the algorithm only traversing the unstable cut-points is required instead of computing the values of the stable cut-points. Thus, the computational efficiency of building decision trees is greatly improved, which is confirmed by the numerical experimental results.