化学学报
化學學報
화학학보
ACTA CHIMICA SINICA
2001年
2期
241-246
,共6页
王利莎%袁身刚%欧阳政%郑崇直
王利莎%袁身剛%歐暘政%鄭崇直
왕리사%원신강%구양정%정숭직
拓扑距离%宽度优先搜索%新Morgan算法%B-树算法
拓撲距離%寬度優先搜索%新Morgan算法%B-樹算法
탁복거리%관도우선수색%신Morgan산법%B-수산법
介绍了目标化合物析分系统中所用到的三个重要算法.它们是:最短拓扑距离的求解、析分过程结束的判别以及合成树的构建.分子结构中任意两个原子之间最短拓扑距离的求解是建立在采用队列数据结构的宽度优先搜索算法基础上的.析分过程结束的判别是由在新Morgan算法基础上产生的化合物的唯一编码和B-树两种算法构成.前者是为了将前体与原料库中每个化合物是否同构的复杂问题简化为码之间的比较问题;后者是一种高效文件组织方式,将代表原料库中化合物的唯一编码作为检索键来组织建库,从而实现对原料库的快速查询.合成树采用的是链表存储方式,每一个结点由六个域组成,且在建树和画树的过程中,均用前序遍历.这些算法是实现析分系统的基础,因此它们的正确设计与高效实现就显得尤为重要.
介紹瞭目標化閤物析分繫統中所用到的三箇重要算法.它們是:最短拓撲距離的求解、析分過程結束的判彆以及閤成樹的構建.分子結構中任意兩箇原子之間最短拓撲距離的求解是建立在採用隊列數據結構的寬度優先搜索算法基礎上的.析分過程結束的判彆是由在新Morgan算法基礎上產生的化閤物的唯一編碼和B-樹兩種算法構成.前者是為瞭將前體與原料庫中每箇化閤物是否同構的複雜問題簡化為碼之間的比較問題;後者是一種高效文件組織方式,將代錶原料庫中化閤物的唯一編碼作為檢索鍵來組織建庫,從而實現對原料庫的快速查詢.閤成樹採用的是鏈錶存儲方式,每一箇結點由六箇域組成,且在建樹和畫樹的過程中,均用前序遍歷.這些算法是實現析分繫統的基礎,因此它們的正確設計與高效實現就顯得尤為重要.
개소료목표화합물석분계통중소용도적삼개중요산법.타문시:최단탁복거리적구해、석분과정결속적판별이급합성수적구건.분자결구중임의량개원자지간최단탁복거리적구해시건립재채용대렬수거결구적관도우선수색산법기출상적.석분과정결속적판별시유재신Morgan산법기출상산생적화합물적유일편마화B-수량충산법구성.전자시위료장전체여원료고중매개화합물시부동구적복잡문제간화위마지간적비교문제;후자시일충고효문건조직방식,장대표원료고중화합물적유일편마작위검색건래조직건고,종이실현대원료고적쾌속사순.합성수채용적시련표존저방식,매일개결점유륙개역조성,차재건수화화수적과정중,균용전서편력.저사산법시실현석분계통적기출,인차타문적정학설계여고효실현취현득우위중요.
This article presents the three most important algorithms used in the Target Parsing System. They are the shortest topological distance between two atoms, the determination of the termination of the parsing process and the construction of the synthesis tree. The shortest topological distance between two atoms in a structure was designed according to the widely used breadth first search algorithm which uses a data structure called Queue. The determination of the termination of the parsing process consists of the canonicalisation algorithm based on the new Morgan algorithm and the B-tree algorithm. The former is used to convert the complicated isomorphic problem into the comparison between two unique codes. The latter is one of efficient file organizations, in which the unique code is used to represent the compound in the material data base. All these codes are organized in a B-tree. Thus, for a given compound it is very fast to know if it is contained in the data base by searching in the B-tree. The construction of the synthesis tree adopts data structure of list. Each node of the list consists of six rields. Both the generation and drawing of the synthesis tree use the preorder technique. Since these algorithms constitute the core of the Target Parsing System, their proper design and implementation is extremely important.