华南师范大学学报(自然科学版)
Journal of South China Normal University (Natural Science Edition)
2001, Issue 2, pp. 84-88 (5 pages)
Keywords: encoding; text compression; LZ algorithm; bit; bitwise AND operation
In this paper, the Chinese characters of GB2312-80 are mapped to short integers ranging from 0 to 6767. Each of these integers is stored in a two-byte cell whose 3 high-order bits, called redundant bits, are always zero. Deleting the redundant bits and repacking the remaining bits yields the compressed text. The method is simple, fast, easy to implement, and applicable to any text made up of GB2312-80 Chinese characters.
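The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how a character is mapped to its integer, so the mapping below is an assumption: it uses the row and column of the character's EUC-encoded GB2312 bytes (72 hanzi rows of 94 cells, giving 0 to 6767). The function names `hanzi_to_index`, `index_to_hanzi`, `pack` and `unpack` are hypothetical and are not taken from the paper.

```python
FIRST_LEAD = 0xB0   # lead byte of the first hanzi row (row 16) in GB2312-80
LAST_LEAD = 0xF7    # lead byte of the last hanzi row (row 87)
FIRST_TRAIL = 0xA1  # trail byte of the first cell in a row
LAST_TRAIL = 0xFE   # trail byte of the last cell in a row
COLS = 94           # cells per row
BITS = 13           # 6767 < 2**13, so the top 3 of 16 bits are always zero
MASK = (1 << BITS) - 1


def hanzi_to_index(ch: str) -> int:
    """Map one GB2312-80 hanzi to an integer in 0..6767 (assumed mapping)."""
    raw = ch.encode("gb2312")
    if len(raw) != 2:
        raise ValueError(f"not a two-byte GB2312 character: {ch!r}")
    lead, trail = raw
    if not (FIRST_LEAD <= lead <= LAST_LEAD and FIRST_TRAIL <= trail <= LAST_TRAIL):
        raise ValueError(f"not a GB2312 hanzi: {ch!r}")
    return (lead - FIRST_LEAD) * COLS + (trail - FIRST_TRAIL)


def index_to_hanzi(idx: int) -> str:
    """Inverse of hanzi_to_index."""
    raw = bytes([FIRST_LEAD + idx // COLS, FIRST_TRAIL + idx % COLS])
    return raw.decode("gb2312")


def pack(text: str) -> bytes:
    """Drop the 3 redundant bits of every character and pack 13-bit codes."""
    acc, nbits, out = 0, 0, bytearray()
    for ch in text:
        acc = (acc << BITS) | hanzi_to_index(ch)
        nbits += BITS
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # bitwise AND keeps only the unwritten bits
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)  # zero-pad the final byte
    return bytes(out)


def unpack(data: bytes) -> str:
    """Rebuild the text; padding is under 8 bits, so it never forms a code."""
    acc, nbits, chars = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= BITS:
            nbits -= BITS
            chars.append(index_to_hanzi((acc >> nbits) & MASK))
        acc &= (1 << nbits) - 1
    return "".join(chars)
```

Under these assumptions each character takes 13 bits instead of 16, so a text of pure GB2312-80 hanzi shrinks to roughly 13/16 of its original size. For example, `pack("汉字压缩")` produces 7 bytes where the GB2312 encoding needs 8, and `unpack` restores the original string.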