CAJ | 학술논문

针对现有高速、大容量、多域报文分类算法普遍存在内存使用量大的问题，提出一种改进的 HyperSplit 多域报文分类算法。通过分析现有算法内存使用量大的原因，修正和设计选择分割维度与分割点、去除冗余结构的启发式算法，最大限度减少决策树中的复制规则数量，消除决策树中存在的冗余规则和冗余节点，优化决策树结构。仿真结果表明，该算法与现有多域报文分类算法陒比，不依赖于规则集类型和特征，在保证内存访问次数不增加、报文得到陑速处理的情况下，可降低算法的内存使用量，当规则集容量为105时，内存使用量降低到HyperSplit算法的80%。
침대현유고속、대용량、다역보문분류산법보편존재내존사용량대적문제，제출일충개진적 HyperSplit 다역보문분류산법。통과분석현유산법내존사용량대적원인，수정화설계선택분할유도여분할점、거제용여결구적계발식산법，최대한도감소결책수중적복제규칙수량，소제결책수중존재적용여규칙화용여절점，우화결책수결구。방진결과표명，해산법여현유다역보문분류산법희비，불의뢰우규칙집류형화특정，재보증내존방문차수불증가、보문득도이속처리적정황하，가강저산법적내존사용량，당규칙집용량위105시，내존사용량강저도HyperSplit산법적80%。
In order to solve the problem of too much memory usage in existing work for high speed large volume multi-field packet classification, an improved HyperSplit algorithm is proposed. By analyzing the cause of too much memory usage, the heuristic algorithms are modified and designed to choose the cutting points and dimensions and eliminate redundancy. Rule replication is greatly reduced, redundant rules and nodes are removed, and the decision tree’s structure is optimized. Simulation results demonstrate that compared with the existing work, independent of rule base’s type and characteristic, the algorithm can greatly reduce memory usage without increasing the number of memory accesses and ensure that packets can be processed at wire speed, and when the volume of classifier is 105, the algorithm consumes about 80%memory usage as that of HyperSplit.