CAJ | Academic paper

To meet the demand for sorting big data, a task-driven parallel sorting algorithm is proposed. The algorithm combines a task-driven design, AIO (Asynchronous Input/Output), and a double-buffer mechanism to make full use of system resources, and it optimizes quicksort by constructing equivalent sort keys. In the implementation, tasks are processed by multiple threads, and the degree of parallelism is controlled through the number of threads. Used together, these techniques bring big-data sorting performance close to the theoretical limit; when CPU (Central Processing Unit) resources are plentiful, asynchronous compression can even push performance beyond that limit. The implemented system completes a full sort of more than 500 Gbyte of disk data in 2000 s. Applying the same idea to database design would separate connections from threads, allowing a database to support more connections and therefore a higher degree of concurrency.
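The abstract gives no implementation details, so the following is only a minimal sketch of how the named ideas could fit together, not the authors' method. It assumes that an "equivalent sort key" means a byte string whose bytewise order matches the order of the original composite key, which is one plausible reading. The names `equivalent_key`, `sort_chunk`, and `parallel_sort`, the record layout (a signed 32-bit integer plus a string), and the in-memory chunks standing in for disk blocks are all hypothetical. AIO, double buffering, and compression are left out.

```python
import heapq
import struct
from concurrent.futures import ThreadPoolExecutor


def equivalent_key(record):
    """Encode (signed 32-bit int, str) as bytes that sort like the tuple.

    Assumption: this is what an "equivalent sort key" could look like.
    """
    number, name = record
    # Shift the signed range onto the unsigned range so that the
    # big-endian byte representation sorts in numeric order.
    return struct.pack(">I", number + 2**31) + name.encode("utf-8")


def sort_chunk(chunk):
    """One task: sort a single chunk by its precomputed byte key."""
    keyed = [(equivalent_key(record), record) for record in chunk]
    keyed.sort(key=lambda pair: pair[0])
    return keyed


def parallel_sort(records, chunk_size=4, workers=2):
    """Split the input into tasks, sort them on a pool, then merge the runs."""
    chunks = [records[i:i + chunk_size]
              for i in range(0, len(records), chunk_size)]
    # The pool size is the single knob that bounds the degree of parallelism.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(sort_chunk, chunks))
    # k-way merge of the sorted runs, comparing only the byte keys.
    merged = heapq.merge(*runs, key=lambda pair: pair[0])
    return [record for _, record in merged]


if __name__ == "__main__":
    data = [(3, "b"), (-1, "z"), (3, "a"), (0, ""), (-5, "k")]
    print(parallel_sort(data, chunk_size=2, workers=3))
```

In CPython the global interpreter lock keeps these threads from sorting in parallel on multiple cores, so the sketch shows only the task structure and the thread-count control, not the performance effect. For scale, the reported figure of more than 500 Gbyte in 2000 s works out to an end-to-end rate of at least 0.25 Gbyte/s.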