计算机工程
COMPUTER ENGINEERING
2015
Issue 2
pp. 7-11, 16
6 pages in total
崔竞松%路昊宇%郭迟%何松
OpenStack cloud platform%load balancing%event-driven mechanism%high availability%virtualization%cloud computing
To address the slow troubleshooting of faults on virtualized cloud platforms and to guarantee the continuity of cloud services, this paper designs and implements a virtualization fault detection and recovery system on the open-source cloud platform OpenStack. The system consists of a GUI layer, a scheduling layer, a logic layer, and a functional layer. It is built around an event-driven mechanism: information passed through the system is treated as events and processed in chronological order. Its main components are a perception module, a policy module, and an execution module, which call the OpenStack API and the Libvirt API to interact with the virtual machine management layer. The resulting fault detection and recovery framework covers information acquisition, analysis and processing, and fault recovery: by monitoring the cloud platform's runtime environment in real time, it obtains state parameters, analyzes them against established policies, formulates countermeasures, and recovers from faults automatically. Experimental results show that the system can monitor a cloud platform in real time and recover from faults automatically without agents, enhancing the security of the cloud environment and improving the availability of the cloud platform.
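The perception, policy, and execution flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Event`, `EventBus`, `make_policy`, `make_executor`, and `demo`, the `cpu_limit` threshold, and the sample VM states are all hypothetical. The executor only records the chosen action; in a real deployment it would presumably call the OpenStack API or the Libvirt API instead.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    """An event ordered by timestamp, then by insertion sequence."""
    timestamp: float
    seq: int
    kind: str = field(compare=False)
    payload: dict = field(compare=False)


class EventBus:
    """Hypothetical event queue that dispatches events in time order."""

    def __init__(self):
        self._queue = []
        self._counter = itertools.count()
        self._handlers = {}

    def subscribe(self, kind, handler):
        self._handlers.setdefault(kind, []).append(handler)

    def publish(self, timestamp, kind, payload):
        event = Event(timestamp, next(self._counter), kind, payload)
        heapq.heappush(self._queue, event)

    def run(self):
        # Pop the earliest event first; handlers may publish follow-up events.
        while self._queue:
            event = heapq.heappop(self._queue)
            for handler in self._handlers.get(event.kind, []):
                handler(event)


def make_policy(bus, cpu_limit=0.95):
    """Policy module (assumed rules): map a VM state sample to an action."""
    def on_state(event):
        sample = event.payload
        if sample["state"] == "crashed":
            action = "reboot"
        elif sample["cpu"] > cpu_limit:
            action = "migrate"
        else:
            return  # healthy VM, nothing to do
        bus.publish(event.timestamp, "action",
                    {"vm": sample["vm"], "action": action})
    return on_state


def make_executor(log):
    """Execution module stand-in: records actions instead of calling real APIs."""
    def on_action(event):
        log.append((event.payload["vm"], event.payload["action"]))
    return on_action


def demo():
    bus = EventBus()
    log = []
    bus.subscribe("state", make_policy(bus))
    bus.subscribe("action", make_executor(log))
    # Perception module stand-in: samples arrive out of order.
    bus.publish(2.0, "state", {"vm": "vm-b", "state": "running", "cpu": 0.99})
    bus.publish(1.0, "state", {"vm": "vm-a", "state": "crashed", "cpu": 0.0})
    bus.publish(3.0, "state", {"vm": "vm-c", "state": "running", "cpu": 0.20})
    bus.run()
    return log
```

Calling `demo()` returns the recovery actions in timestamp order, even though the samples were published out of order. The priority queue is one simple way to get the time-ordered processing the abstract mentions; the paper may use a different mechanism.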