-
Publication number: KR101266504B1
Publication date: 2013-05-24
Application number: KR1020120006633
Application date: 2012-01-20
Applicant: 성균관대학교산학협력단
CPC classification number: G06F17/30705, G06F17/15, G06F17/277
Abstract: PURPOSE: A method for extracting topic words from a set of documents using richness is provided to weight and rank candidate topic words by comparing their document coverage. CONSTITUTION: A user terminal extracts candidate topic words from a set of documents using a term extraction algorithm (S101). The user terminal groups the documents of the set according to the topic word each one relates to (S102). The user terminal queries an online search target set with each topic word and obtains the related documents as search results (S103). The user terminal clusters the search results with a clustering algorithm (S104). The resulting clusters are treated as subtopics of each topic word (S105). The user terminal calculates the similarity between the clusters corresponding to the subtopics of each topic word and the documents grouped under that topic word (S106). [Reference numerals] (AA) Start; (BB) End; (S101) Extract one or more candidate topic words from a set of documents; (S102) Group the documents of the document set by each extracted topic word; (S103) Query an online search target set with each extracted topic word to retrieve related documents; (S104) Generate clusters by clustering the retrieved related documents; (S105) Treat the clusters generated for each topic word as subtopics of that topic word; (S106) Calculate the similarity between the clusters corresponding to the subtopics of each topic word and the documents grouped under that topic word; (S107) For each topic word, match each document to its most similar cluster; (S108) Calculate richness scores from the number of matched clusters and the similarity values to those clusters; (S109) Rank the topic words by richness score and select a predetermined number (N) of the highest-scoring topic words as the final topic words
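Steps S106 to S109 of the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not specify the similarity measure or the richness formula, so the sketch assumes bag-of-words cosine similarity and scores each topic word as the sum, over its matched clusters, of the best document-to-cluster similarity. The function names (`bag_of_words`, `cosine`, `richness`, `select_topic_words`) and the sample data are hypothetical, and the candidate extraction, search, and clustering of S101 to S105 are taken as given inputs.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with their counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def richness(grouped_docs, clusters):
    """S106-S108 for one topic word.

    grouped_docs: documents of the set grouped under the word (S102).
    clusters: search-result clusters treated as subtopics (S104-S105),
              each given as a list of document strings.
    """
    # Represent each cluster by the summed term counts of its members.
    centroids = [sum((bag_of_words(d) for d in c), Counter()) for c in clusters]
    best = {}  # cluster index -> highest similarity among documents matched to it
    for doc in grouped_docs:
        vec = bag_of_words(doc)
        sims = [cosine(vec, c) for c in centroids]         # S106
        if not sims or max(sims) == 0.0:
            continue  # no cluster resembles this document
        idx = max(range(len(sims)), key=sims.__getitem__)  # S107
        best[idx] = max(best.get(idx, 0.0), sims[idx])
    # S108 (assumed formula): each matched cluster contributes its best similarity.
    return sum(best.values())


def select_topic_words(candidates, n):
    """S109: rank words by richness and keep the top n.

    candidates maps each word to a (grouped_docs, clusters) pair.
    """
    scores = {w: richness(docs, cl) for w, (docs, cl) in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical sample data: "jaguar" matches two subtopic clusters, "engine" one.
candidates = {
    "jaguar": (
        ["jaguar car engine", "jaguar cat jungle"],
        [["car engine speed", "car dealer"], ["cat jungle animal"]],
    ),
    "engine": (
        ["jaguar car engine"],
        [["car engine speed"]],
    ),
}
print(select_topic_words(candidates, 1))  # → ['jaguar']
```

Under this assumed formula, a word whose documents spread across several subtopic clusters outscores a word whose documents all fall into one cluster, which is one plausible reading of "richness" as coverage.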
-
2.
Publication number: KR1020120110799A
Publication date: 2012-10-10
Application number: KR1020110028911
Application date: 2011-03-30
Applicant: 성균관대학교산학협력단
CPC classification number: G06F9/45533, G06F9/445, G06F9/45558, G06F17/3007
Abstract: PURPOSE: A pattern diversification system and a pattern diversification method using an intelligent agent are provided to apply an emotion model that generates a variety of emotions. CONSTITUTION: An input unit (10) receives IVA (Intelligent Virtual Agent) behavior data according to a user's input. An IVA (20) classifies the IVA behavior data received from the input unit. The IVA extracts from the IVA behavior data an IVA feature value stored in a storage unit (30). The IVA generates a rule based on a learning result. [Reference numerals] (10) Input unit; (20) Virtual IVA; (30) Storage unit; (40) Output unit; (AA) VA feature value; (BB) VA behavior data
-