运筹与管理
OPERATIONS RESEARCH AND MANAGEMENT SCIENCE
2015
Issue 2
pp. 201-207
7 pages
credit scoring%class imbalance%transfer learning%group method of data handling
Customer credit scoring is an important part of the daily business of banks and other financial companies. Defaulting customers usually make up a small minority of the customer population, while customers who repay on time make up the majority; this is the class imbalance problem commonly seen in customer credit scoring. Existing credit scoring methods cannot effectively handle the class imbalance caused by the scarcity of minority-class samples. This study introduces transfer learning to integrate information from inside and outside the system and thereby address that problem. To use minority-class sample information from outside the system more efficiently, a new transfer learning model is constructed: it takes the ensemble-based transfer bagging model as its foundation and uses two-stage sampling and the group method of data handling (GMDH) to improve the generation of its base models and its ensemble strategy, respectively. Empirical results on credit card customer data from a commercial bank in Chongqing show that, compared with methods commonly used in customer credit scoring, the new model better handles the effect of class imbalance under absolute scarcity and, in particular, predicts the minority of defaulting customers more accurately.