计算机应用
COMPUTER APPLICATION
2015
Issue z1
pp. 152-155
4 pages in total
Yu Xiang (余翔)%Bai Youliang (白友良)%Li Cheng (李成)%Zhao Nan (赵楠)
multi-dimensional sequential clustering method%data classification%stratigraphic division%rock mass quality classification%spore-pollen grain-size zonation
To address the limitations of empirical methods for the ordered classification of multi-index geological data, a multi-dimensional sequential clustering method based on Fisher optimal segmentation is proposed. The indices of the sample data are first normalized, and a loss function built on the category diameter is defined over the normalized indices to identify the optimal segmentation, i.e. the one that minimizes the loss function. Adjacent categories are then merged, again so that the loss function stays minimal, which yields a hierarchical clustering that preserves the sample order. The method was applied to three data sets: stratigraphic division of borehole logs, rock mass quality classification, and spore-pollen grain-size zonation. The results show that, when the number of segments is not known in advance, the optimal number can be determined from the extreme points of the derivative curve of the loss function, and that the classifications are consistent with other studies. The higher the correlation among the indices of a sample, the smaller the difference between the multi-index classification and a single-index classification. The method provides a mathematical basis for the quantitative analysis of multivariate geological data and can also be applied in other fields.
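The segmentation step described in the abstract can be illustrated with a minimal Python sketch. It uses the textbook dynamic program for Fisher optimal partition of ordered samples and is not necessarily the authors' exact procedure. Several details are assumptions made here for illustration: min-max scaling as the normalization, the within-segment sum of squared deviations as the category diameter, and the function names `normalize`, `diameters` and `fisher_segment`, none of which come from the paper.

```python
import numpy as np


def normalize(X):
    """Min-max scale each index (column) to [0, 1]; an assumed normalization."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant columns map to zero instead of dividing by 0
    return (X - X.min(axis=0)) / span


def diameters(X):
    """D[i, j]: sum of squared deviations from the mean of samples i..j (inclusive)."""
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            seg = X[i:j + 1]
            D[i, j] = ((seg - seg.mean(axis=0)) ** 2).sum()
    return D


def fisher_segment(X, k_max):
    """Optimal ordered partitions for k = 1..k_max (requires k_max <= len(X)).

    Returns (loss, bounds): loss[k] is the minimum total diameter for k
    segments, bounds[k] lists the start index of each of the k segments.
    """
    X = normalize(X)
    n = len(X)
    D = diameters(X)
    # L[k, j]: minimum loss when the first j samples are split into k segments
    L = np.full((k_max + 1, n + 1), np.inf)
    cut = np.zeros((k_max + 1, n + 1), dtype=int)
    L[0, 0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            for s in range(k - 1, j):  # last segment covers samples s..j-1
                cand = L[k - 1, s] + D[s, j - 1]
                if cand < L[k, j]:
                    L[k, j] = cand
                    cut[k, j] = s
    loss, bounds = {}, {}
    for k in range(1, k_max + 1):
        starts, j = [], n
        for kk in range(k, 0, -1):  # walk the stored cut points backwards
            j = cut[kk, j]
            starts.append(int(j))
        loss[k] = float(L[k, n])
        bounds[k] = starts[::-1]
    return loss, bounds
```

Calling `fisher_segment(X, k_max)` on an array with one row per ordered sample and one column per index returns the loss for every segment count up to `k_max`. A discrete stand-in for the derivative curve mentioned in the abstract could then be obtained with `np.diff([loss[k] for k in sorted(loss)])`; how its extreme points are read to pick the segment count is left to the paper.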