CAJ | Academic paper

As data volumes keep growing, fast and accurate indexing algorithms have become essential for information retrieval. To address this need, this paper proposes a hashing-based indexing method built on subspace learning. The subspace is learned from a labeled subset of the data. To preserve locality among samples that share a semantic label, intra-class scatter is measured by the distances between nearest-neighbor samples. To make the resulting codes more discriminative, inter-class scatter is measured by the distances between the centers of samples with different semantic labels. After the constraint in the formulation is relaxed, the subspace is learned in a manner similar to linear discriminant analysis, and the subspace vectors serve as the projection vectors of the hash functions. The biases are then computed from the learned projections, which completes the hash functions. Code-discriminability and locality-preservation experiments on the MNIST and CIFAR-10 datasets compare the method with related approaches. The results indicate that the proposed method compares favorably and is effective at retrieving semantically similar neighbors.
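The abstract gives no formulas, so the following NumPy code is only a rough sketch of how such a pipeline could look, not the authors' implementation. It assumes that intra-class scatter is built from differences between each sample and its `k` nearest same-label neighbors, that inter-class scatter is built from pairwise differences between class centers, that the relaxed problem is solved as an LDA-style generalized eigenproblem, and that each bias centers its projection at zero. The function names `learn_hash_functions` and `encode`, and the parameters `k` and `reg`, are hypothetical.

```python
import numpy as np


def learn_hash_functions(X, y, n_bits, k=5, reg=1e-3):
    """Sketch: learn projections W (d x n_bits) and biases b from labeled rows of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    classes = np.unique(y)

    # Intra-class scatter (assumed form): outer products of the differences
    # between each sample and its k nearest neighbors with the same label.
    Sw = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        kk = min(k, len(Xc) - 1)
        if kk < 1:
            continue  # a single-sample class has no same-label neighbors
        sq = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
        np.fill_diagonal(sq, np.inf)  # a sample is not its own neighbor
        nn = np.argsort(sq, axis=1)[:, :kk]
        diffs = (Xc[:, None, :] - Xc[nn]).reshape(-1, d)
        Sw += diffs.T @ diffs

    # Inter-class scatter (assumed form): outer products of the pairwise
    # differences between class centers.
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    cdiffs = (centers[:, None, :] - centers[None, :, :]).reshape(-1, d)
    Sb = cdiffs.T @ cdiffs

    # LDA-style relaxation: maximize tr(W^T Sb W) subject to W^T Sw W = I.
    # Solved as a generalized eigenproblem via Cholesky whitening of Sw.
    Sw += reg * np.eye(d)  # ridge term keeps Sw positive definite
    L = np.linalg.cholesky(Sw)
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ Sb @ Linv.T)
    top = np.argsort(vals)[::-1][:n_bits]
    W = Linv.T @ vecs[:, top]

    # Bias (assumption, not stated in the abstract): shift each projection
    # so that its mean over the training data is zero.
    b = -(X @ W).mean(axis=0)
    return W, b


def encode(X, W, b):
    """Binary codes: one bit per hash function, set where w^T x + b > 0."""
    return (np.asarray(X, dtype=float) @ W + b > 0).astype(np.uint8)
```

Because the inter-class scatter in this sketch has rank at most one less than the number of classes, only that many projections carry class information; a longer code would need some additional mechanism that the abstract does not describe.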