Computer Engineering and Applications (计算机工程与应用)
2013, Issue 8, pp. 146-150 (5 pages)
Keywords: Tibetan syllable; sorting rules for modern Tibetan characters and dictionaries; ISO/IEC 10646 (Tibetan); Tibetan sorting
The letters that make up a Tibetan syllable follow a fixed order, and ISO/IEC 10646 (Tibetan) assigns a sorting code to every Tibetan character. Because of the structural complexity of Tibetan syllables, however, Tibetan text cannot be sorted directly by the written order of a syllable's letters, nor can these sorting codes be applied directly. This paper proposes a Tibetan sorting algorithm based on ISO/IEC 10646 (Tibetan). The main idea is as follows: read the Tibetan syllables from the text and convert each one into a one-dimensional string of letters; identify the base letter, rearrange the component letters of the syllable, and insert a blank character at every position where a component is absent; sort the resulting syllable strings with quicksort; finally, restore the component letters to their original order, remove the blank characters, and output the result.
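The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and it rests on several assumptions: syllables arrive already split into named component slots (so reading the text and recognizing the base letter are taken as done), letters are written in Wylie-style Latin transliteration rather than as ISO/IEC 10646 code points, weights come from the traditional alphabet order instead of the standard's sorting codes, and the comparison order of the slots (`KEY_ORDER`) is a placeholder, since the abstract does not state it. All names (`WRITTEN`, `KEY_ORDER`, `to_sort_form`, and so on) are hypothetical.

```python
# Slots of a syllable in written order (hypothetical names).
WRITTEN = ("prefix", "superscript", "base", "subscript",
           "vowel", "suffix", "second_suffix")
# Assumed comparison order: base letter first. Real dictionary rules
# are more involved; this order is only a placeholder.
KEY_ORDER = ("base", "prefix", "superscript", "subscript",
             "vowel", "suffix", "second_suffix")
BLANK = " "  # filler for an absent component

# Traditional Tibetan alphabet order, in Wylie-style transliteration.
CONSONANTS = ["k", "kh", "g", "ng", "c", "ch", "j", "ny", "t", "th",
              "d", "n", "p", "ph", "b", "m", "ts", "tsh", "dz", "w",
              "zh", "z", "'", "y", "r", "l", "sh", "s", "h", "a"]
VOWELS = ["i", "u", "e", "o"]  # absent vowel = inherent "a", sorts first


def to_sort_form(syllable):
    """Reorder components base-first and pad absent slots with BLANK."""
    return tuple(syllable.get(slot, BLANK) for slot in KEY_ORDER)


def weights(form):
    """Map each letter to a rank; BLANK gets 0 so it sorts first."""
    result = []
    for slot, letter in zip(KEY_ORDER, form):
        if letter == BLANK:
            result.append(0)
        else:
            table = VOWELS if slot == "vowel" else CONSONANTS
            result.append(table.index(letter) + 1)
    return result


def quicksort(forms):
    """Plain (not in-place) quicksort over padded sort forms."""
    if len(forms) <= 1:
        return list(forms)
    pivot = weights(forms[len(forms) // 2])
    less = [f for f in forms if weights(f) < pivot]
    equal = [f for f in forms if weights(f) == pivot]
    greater = [f for f in forms if weights(f) > pivot]
    return quicksort(less) + equal + quicksort(greater)


def to_written(form):
    """Restore written order and drop the BLANK fillers."""
    slots = dict(zip(KEY_ORDER, form))
    return "".join(slots[s] for s in WRITTEN if slots[s] != BLANK)


def sort_syllables(syllables):
    return [to_written(f)
            for f in quicksort([to_sort_form(s) for s in syllables])]


sample = [
    {"prefix": "b", "base": "k"},
    {"base": "g"},
    {"base": "k"},
    {"prefix": "d", "base": "g"},
    {"base": "k", "vowel": "i"},
]
print(sort_syllables(sample))
```

The output strings are the bare concatenation of the component letters. In this sample the syllables group by base letter (`k` before `g`) even when a prefix such as `b` or `d` is written first, which is the effect that sorting on the written letter sequence alone cannot produce.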