软件学报
JOURNAL OF SOFTWARE
2013
Issue 5
pp. 1295-1304
10 pages in total
庄凌%庄越挺%吴江琴%叶振超%吴飞
image retrieval%text%semantics%sparse canonical correlation analysis%visual word
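A key problem in semantic image retrieval is finding the association between low-level image features and semantics. Because text is an effective means of expressing semantics, this paper proposes studying the relationship between the text and image modalities in order to build an effective model that reflects the latent semantic association between them. With this model, a retrieval intent can be expressed in natural language (text sentences), and the relevant images are then retrieved. The model is based on sparse canonical correlation analysis (sparse CCA) and is trained in the following steps: first, a text semantic space is constructed with latent semantic analysis; next, the images corresponding to the text are represented as bags of visual words; finally, the Sparse CCA algorithm finds a semantic correlation space that maps between text semantics and the visual words of images. Using a sparse correlation analysis method improves the interpretability of the model and keeps retrieval results stable. The experimental results verify the effectiveness of the Sparse CCA method and confirm the feasibility of the proposed semantic image retrieval method.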
A key issue in semantic-based image retrieval is how to bridge the semantic gap between the low-level features of an image and high-level semantics, which can be expressed effectively by means of free text. The cross-modal relationship between text and image is studied by modeling the semantic correlation between them. Based on this model, an approach to image retrieval is proposed in which images are retrieved according to the meaning of the query text rather than its keywords. First, an algorithm for solving sparse canonical correlation analysis (CCA) is designed. Then a semantic space is learned from a text corpus by latent semantic analysis, and images are represented as bags of visual words. After that, a semantic correlation space is constructed, which makes the mapping between the visual words of an image and high-level semantics explicit. The proposed method solves CCA in a sparse framework in order to make the result more interpretable and stable. Experimental results demonstrate that Sparse CCA outperforms CCA in this context and substantiate the feasibility of the proposed approach to image retrieval.
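The abstract does not specify how the sparse CCA problem is solved, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a soft-thresholded alternating (power-iteration style) update on the text-image cross-covariance, which is one common way to obtain sparse canonical vectors. The text vectors stand in for latent semantic analysis coordinates and the image vectors for visual word histograms. All function names (`sparse_cca`, `rank_images`, `make_toy_data`), the thresholding fractions, and the toy data are hypothetical.

```python
import math
import random


def soft_threshold(vec, frac):
    """L1-style shrinkage: subtract frac * max|entry| from each magnitude."""
    lam = frac * max(abs(x) for x in vec)
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in vec]


def normalize(vec):
    """Scale to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def center_columns(rows):
    """Subtract each column's mean so cross-products act as covariances."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def sparse_cca(text_feats, image_feats, frac_u=0.5, frac_v=0.5, iters=50):
    """Return one pair of sparse weight vectors (u for text, v for images)."""
    X, Y = center_columns(text_feats), center_columns(image_feats)
    p, q = len(X[0]), len(Y[0])
    # Cross-covariance up to a constant: C[i][j] = sum_s X[s][i] * Y[s][j].
    C = [[sum(x[i] * y[j] for x, y in zip(X, Y)) for j in range(q)]
         for i in range(p)]
    v = normalize([1.0] * q)
    u = [0.0] * p
    for _ in range(iters):
        # Alternate: update u given v, then v given u, shrinking each time.
        cv = [sum(C[i][j] * v[j] for j in range(q)) for i in range(p)]
        u = normalize(soft_threshold(cv, frac_u))
        ctu = [sum(C[i][j] * u[i] for i in range(p)) for j in range(q)]
        v = normalize(soft_threshold(ctu, frac_v))
    return u, v


def rank_images(query_text_vec, image_vecs, u, v):
    """Rank image indices by agreement of projections in the shared space.

    Assumes the inputs are already centered like the training data.
    """
    q = sum(a * b for a, b in zip(query_text_vec, u))
    scores = [q * sum(a * b for a, b in zip(img, v)) for img in image_vecs]
    return sorted(range(len(image_vecs)), key=lambda i: scores[i], reverse=True)


def make_toy_data(n=200, seed=0):
    """Toy pairs: text topic 0 and visual word 1 share a latent factor z."""
    rng = random.Random(seed)
    text, images = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        text.append([z] + [0.1 * rng.gauss(0, 1) for _ in range(2)])
        images.append([0.1 * rng.gauss(0, 1), z]
                      + [0.1 * rng.gauss(0, 1) for _ in range(2)])
    return text, images


if __name__ == "__main__":
    text, images = make_toy_data()
    u, v = sparse_cca(text, images)
    print("text weights:", u)
    print("visual word weights:", v)
```

On the toy data, the weight vectors concentrate on the one text dimension and the one visual word that actually co-vary, which illustrates why a sparse solution is easier to interpret than dense CCA weights. A fuller version would extract several component pairs and account for within-modality covariance, which this sketch ignores.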