Information retrieval system, method, and program
    1.
    Invention patent
    Status: Granted

    Publication number: JP2009211263A

    Publication date: 2009-09-17

    Application number: JP2008051871

    Filing date: 2008-03-03

    CPC classification number: G06F17/30684 G06F17/30625

    Abstract: PROBLEM TO BE SOLVED: To provide a technology for retrieving, at high speed, documents that match a dependency pattern from a large volume of text documents. SOLUTION: An index creation part creates an index that, for each word appearing as a node in the tree of a syntactic analysis result, yields by sequential access the array of that node's appearance information (document ID, position on the tree). A query input part receives a query from a user or an external application. The query consists of a retrieval pattern, a pivot (the node used as the reference for extending the retrieval pattern), the maximum depth difference allowed when retrieving extended nodes from the pivot, the maximum number of extended nodes to be presented in order of frequency, and a flag designating whether to retrieve higher-rank nodes. The index reading part obtains the appearance information array of the pivot at the places that match the retrieval pattern. The retrieval proceeds until it reaches any node connecting the route to the pivot. COPYRIGHT: (C)2009,JPO&INPIT
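
The index and pivot extension described in this abstract can be illustrated with a minimal sketch. It assumes a toy tree format (a list of `(word, parent_index)` pairs per document), a retrieval pattern limited to "pivot word with a given direct child", and extension toward lower nodes only; the names `build_index`, `match_pivots`, and `extend` are hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict

# Each parsed document is a list of (word, parent_index) pairs; the root has parent -1.
# This toy representation is an assumption made for the sketch.
DOCS = {
    1: [("bought", -1), ("customer", 0), ("printer", 0), ("new", 2)],
    2: [("bought", -1), ("customer", 0), ("laptop", 0), ("used", 2)],
    3: [("returned", -1), ("customer", 0), ("printer", 0)],
}

def build_index(docs):
    """Map each word to a sorted array of (doc_id, node_position) postings."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for pos, (word, _parent) in enumerate(docs[doc_id]):
            index[word].append((doc_id, pos))
    return index

def match_pivots(docs, index, pivot_word, child_word):
    """Return pivot postings whose node has a direct child labelled child_word."""
    hits = []
    child_postings = index.get(child_word, [])
    for doc_id, pos in index.get(pivot_word, []):  # sequential scan of the postings
        if any(d == doc_id and docs[d][c][1] == pos for d, c in child_postings):
            hits.append((doc_id, pos))
    return hits

def extend(docs, hits, max_depth, top_n, exclude=()):
    """Count words found at most max_depth levels below each matched pivot."""
    counts = Counter()
    for doc_id, pivot_pos in hits:
        tree = docs[doc_id]
        for pos, (word, _parent) in enumerate(tree):
            depth, node = 0, pos
            # Walk up toward the root until the pivot (or the root) is reached.
            while node != -1 and node != pivot_pos:
                node, depth = tree[node][1], depth + 1
            if node == pivot_pos and 0 < depth <= max_depth and word not in exclude:
                counts[word] += 1
    return counts.most_common(top_n)

index = build_index(DOCS)
hits = match_pivots(DOCS, index, "bought", "customer")
result = extend(DOCS, hits, max_depth=1, top_n=5, exclude={"customer"})
```

Here `result` lists the words that most often fill the remaining slot under "bought" when "customer" is also present. Raising `max_depth` to 2 would also count modifiers such as "new" and "used".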

    Computer system for creating term dictionary with named entities or terminologies included in text data, and method and computer program therefor
    2.
    Invention patent
    Status: Granted

    Publication number: JP2010157178A

    Publication date: 2010-07-15

    Application number: JP2009000192

    Filing date: 2009-01-05

    CPC classification number: G06F17/2755 G06F17/2735 G06F17/278

    Abstract: PROBLEM TO BE SOLVED: To find words to be registered from newly added text without omission, and to work efficiently when constructing a term dictionary of word categories. SOLUTION: A computer system includes: a morphological analysis unit which acquires token sequence data by performing morphological analysis on text data; a category distinguishing unit which classifies the respective tokens of the token sequence data by using a category dictionary to extract uncategorized words; an uncategorized-word comparing unit which compares each extracted uncategorized word with an uncategorized-word comparison rule and extracts the words matching the rule as registration candidate words; a token-sequence comparing unit which compares token sequences of the token sequence data with a token-sequence comparison rule and extracts the matching token sequences as registration candidate words; and a permission unit which lets a user select whether to register the registration candidate words in the category dictionary. COPYRIGHT: (C)2010,JPO&INPIT
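
The pipeline of units in this abstract can be sketched roughly as follows. The sketch substitutes a regular-expression tokenizer for real morphological analysis, and the dictionary entries, the common-word list, and both comparison rules are invented for illustration only.

```python
import re

CATEGORY_DICT = {"aspirin": "DRUG", "headache": "SYMPTOM"}  # hypothetical entries
COMMON_WORDS = {"the", "patient", "took", "an", "and", "for", "a", "tablet"}

# Assumed uncategorized-word rule: product-code-like tokens such as "zx-200".
WORD_RULE = re.compile(r"[a-z]+-\d+")

def sequence_rule(tagged, i):
    """Assumed token-sequence rule: a DRUG token followed by a dosage-form word."""
    if (i + 1 < len(tagged) and tagged[i][1] == "DRUG"
            and tagged[i + 1][0] in {"tablet", "capsule"}):
        return f"{tagged[i][0]} {tagged[i + 1][0]}"
    return None

def tokenize(text):
    # Stand-in for morphological analysis: lowercase word tokens, hyphens kept.
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def registration_candidates(text, category_dict):
    tagged = [(t, category_dict.get(t)) for t in tokenize(text)]
    # Category distinguishing: tokens with no category that are not common words.
    uncategorized = [t for t, cat in tagged if cat is None and t not in COMMON_WORDS]
    # Uncategorized-word comparison.
    candidates = [t for t in uncategorized if WORD_RULE.fullmatch(t)]
    # Token-sequence comparison.
    for i in range(len(tagged)):
        phrase = sequence_rule(tagged, i)
        if phrase:
            candidates.append(phrase)
    return candidates

def register(candidates, approve, category, category_dict):
    """Permission step: only candidates the user approves enter the dictionary."""
    for word in candidates:
        if approve(word):
            category_dict[word] = category

text = "The patient took an aspirin tablet and ZX-200 for headache"
candidates = registration_candidates(text, CATEGORY_DICT)
```

In a real system the `approve` callback would be an interactive prompt. Passing a function keeps the sketch testable.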

    Index preparation system, method and program for database
    3.
    Invention patent
    Status: Granted

    Publication number: JP2009003541A

    Publication date: 2009-01-08

    Application number: JP2007161524

    Filing date: 2007-06-19

    CPC classification number: G06F17/30616

    Abstract: PROBLEM TO BE SOLVED: To quickly prepare the index of a large-scale database regardless of the limited capacity of main storage.
    SOLUTION: A document set is divided into subsets that have no sections in common. The keywords appearing in each subset are grouped by the remainder obtained by dividing the hash value of each keyword by a fixed integer, and an index file is created for each group. The index files that were prepared for the individual subsets and share the same group number are then merged, producing one integrated index file per group number. At this point there are only as many index files as there are group numbers, and no index covering the whole document set exists yet. Finally, these per-group index files are merged into a single index file corresponding to the whole document set.
    COPYRIGHT: (C)2009,JPO&INPIT
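
The two-stage merge can be sketched in a few lines. In this illustration in-memory dictionaries stand in for index files, CRC32 stands in for the unspecified hash function, and `NUM_GROUPS` and all function names are assumptions.

```python
import zlib
from collections import defaultdict

NUM_GROUPS = 4  # the "fixed integer"; 4 is an arbitrary choice for the sketch

def group_of(keyword):
    # A stable hash (CRC32) so group assignment is identical across runs.
    return zlib.crc32(keyword.encode("utf-8")) % NUM_GROUPS

def index_subset(subset):
    """Build one small index per group for a disjoint subset {doc_id: text}."""
    groups = [defaultdict(list) for _ in range(NUM_GROUPS)]
    for doc_id in sorted(subset):
        for keyword in sorted(set(subset[doc_id].split())):
            groups[group_of(keyword)][keyword].append(doc_id)
    return groups

def merge(indexes):
    """Merge several keyword -> postings maps into one."""
    merged = defaultdict(list)
    for index in indexes:
        for keyword, postings in index.items():
            merged[keyword].extend(postings)
    return {k: sorted(v) for k, v in merged.items()}

def build_full_index(subsets):
    per_subset = [index_subset(s) for s in subsets]
    # Stage 1: merge the indexes that share a group number across subsets.
    per_group = [merge(groups[g] for groups in per_subset) for g in range(NUM_GROUPS)]
    # Stage 2: merge the per-group indexes; a keyword lives in exactly one group.
    return merge(per_group)

subsets = [{1: "apple banana", 2: "banana cherry"}, {3: "apple cherry"}]
full_index = build_full_index(subsets)
```

Because a keyword always hashes to the same group, each stage-1 merge only has to hold the postings of one group at a time, which is what keeps the memory requirement bounded.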

    Retrieval system, retrieval method, reporting system, reporting method, and program
    4.
    Invention patent
    Status: Granted

    Publication number: JP2006031194A

    Publication date: 2006-02-02

    Application number: JP2004206567

    Filing date: 2004-07-13

    Abstract: PROBLEM TO BE SOLVED: To retrieve document data while appropriately reflecting the contents of a retrieving sentence, and to appropriately detect the occurrence of problems from document data that is added sequentially. SOLUTION: A retrieval system retrieves, from a plurality of document data, the document data that includes the contents of a retrieving sentence. The retrieval system includes: a document database which stores a plurality of document data; a concept database which stores a plurality of concepts in a hierarchical structure; a document data concept extraction part which extracts the document concepts corresponding to each document data based on the keywords included in it; a retrieving sentence concept extraction part which extracts retrieving sentence concepts based on the keywords included in the retrieving sentence; a concept retrieval part which retrieves, from among the plurality of document data, the document data for which a retrieving sentence concept is a concept in an upper or lower hierarchy of a document concept; and a retrieval result output part which outputs the document data retrieved by the concept retrieval part as the document data including the contents designated by the retrieving sentence. COPYRIGHT: (C)2006,JPO&NCIPI
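
A small sketch may help picture the concept-based matching. The hierarchy, the keyword-to-concept table, and the decision to treat identical concepts as a match are all assumptions made for the example; the abstract itself only mentions upper and lower hierarchy.

```python
# Hypothetical concept hierarchy: child concept -> parent concept.
PARENT = {"inkjet": "printer", "laser": "printer", "printer": "device", "laptop": "device"}

# Hypothetical keyword -> concept table.
KEYWORD_TO_CONCEPT = {
    "inkjet": "inkjet", "laserjet": "laser", "printer": "printer",
    "notebook": "laptop", "device": "device",
}

def ancestors(concept):
    """All concepts above the given one in the hierarchy."""
    found = []
    while concept in PARENT:
        concept = PARENT[concept]
        found.append(concept)
    return found

def related(a, b):
    # Match when one concept is above or below the other (equality is an assumption).
    return a == b or a in ancestors(b) or b in ancestors(a)

def concepts_of(text):
    return {KEYWORD_TO_CONCEPT[w] for w in text.lower().split() if w in KEYWORD_TO_CONCEPT}

def search(query, documents):
    query_concepts = concepts_of(query)
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(related(q, d) for q in query_concepts for d in concepts_of(text))
    ]

documents = {
    1: "inkjet nozzle clogged",
    2: "notebook battery swollen",
    3: "device will not power on",
}
matches = search("printer problem", documents)
```

Document 1 matches through a lower concept ("inkjet" is below "printer") and document 3 through an upper one ("device" is above "printer"), while document 2 sits on a sibling branch and is not returned.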

    Apparatus, method, and program for acquiring evaluation information of object of interest
    5.
    Invention patent
    Status: Granted

    Publication number: JP2009223745A

    Publication date: 2009-10-01

    Application number: JP2008069112

    Filing date: 2008-03-18

    Abstract: PROBLEM TO BE SOLVED: To provide a technology for acquiring evaluation information of an object in a virtual-reality space. SOLUTION: An evaluation acquisition apparatus includes: a first conversation data acquiring section for acquiring neighboring conversation data composed of conversations which are made around an object of interest to be evaluated; a second conversation data acquiring section for acquiring wide-range conversation data composed of conversations made in an area in a virtual-reality space wider than a neighboring area of the object of interest; and an acquiring section for specifying expressions frequently appearing in the neighboring conversation data by use of the neighboring conversation data and the wide-range conversation data to acquire the expressions as the evaluation information of the object of interest. COPYRIGHT: (C)2010,JPO&INPIT
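
One plausible way to find expressions that are frequent near the object but not everywhere else is to compare relative frequencies. The abstract does not state the scoring rule, so the ratio with add-one smoothing below, the word-level tokenization, and the `min_count` cut-off are assumptions.

```python
from collections import Counter

def word_counts(conversations):
    counts = Counter()
    for line in conversations:
        counts.update(line.lower().split())
    return counts

def characteristic_expressions(nearby, wide, top_n=3, min_count=2):
    """Rank words by how much more often they occur nearby than in the wide area."""
    near_counts, wide_counts = word_counts(nearby), word_counts(wide)
    near_total, wide_total = sum(near_counts.values()), sum(wide_counts.values())
    scores = {}
    for word, count in near_counts.items():
        if count < min_count:
            continue  # ignore words seen too rarely to be informative
        near_rate = count / near_total
        wide_rate = (wide_counts[word] + 1) / (wide_total + 1)  # add-one smoothing
        scores[word] = near_rate / wide_rate
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

nearby = ["this statue is cool", "so cool", "the statue looks cool"]
wide = nearby + [
    "where is the shop", "this is the plaza",
    "is the event today", "the music is loud",
]
evaluation = characteristic_expressions(nearby, wide)
```

Common words such as "is" and "the" are frequent in both data sets, so they score low or are filtered out, whereas "cool" stands out around the object of interest.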

    Method and device for evaluating trend analysis system
    6.
    Invention patent
    Status: Granted

    Publication number: JP2008146319A

    Publication date: 2008-06-26

    Application number: JP2006332192

    Filing date: 2006-12-08

    CPC classification number: G06F17/30731 G06Q10/0639

    Abstract: PROBLEM TO BE SOLVED: To provide a method and a system for evaluating a trend analysis system. SOLUTION: The device for evaluating a trend analysis system, as a first embodiment, comprises: an acceptable value input part for receiving acceptable values of false positives, in which irrelevant data is judged as relevant, and acceptable values of false negatives, in which relevant data is judged as irrelevant; and an accuracy rate calculation part for calculating the accuracy rate of the system. The accuracy rate calculation part comprises: a weight determination part which reads, from a storage device, accuracy data correctly showing whether the data of an existing data aggregate stored there are related, and which uses the accuracy data to determine a weight for the number of false positives of the system and a weight for the number of false negatives from the acceptable values of the false positives and of the false negatives; and a calculation part for calculating the accuracy rate of the system from the number and the weight of the false positives, the number and the weight of the false negatives, and the total number of the data. COPYRIGHT: (C)2008,JPO&INPIT
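
The abstract gives neither the weight formula nor the accuracy-rate formula, so the following is only a guess at the general shape of such a calculation. It assumes that weights are inversely proportional to the acceptable values and that the accuracy rate is one minus the weighted error count divided by the total; both rules and all names are hypothetical.

```python
def error_weights(acceptable_fp, acceptable_fn):
    """Assumed rule: the less tolerable an error type is, the heavier its weight."""
    inv_fp, inv_fn = 1.0 / acceptable_fp, 1.0 / acceptable_fn
    scale = 2.0 / (inv_fp + inv_fn)  # normalise so the two weights average to 1
    return inv_fp * scale, inv_fn * scale

def weighted_accuracy(predicted, truth, acceptable_fp, acceptable_fn):
    """Assumed formula: 1 - (w_fp * FP + w_fn * FN) / N."""
    w_fp, w_fn = error_weights(acceptable_fp, acceptable_fn)
    false_pos = sum(1 for p, t in zip(predicted, truth) if p and not t)
    false_neg = sum(1 for p, t in zip(predicted, truth) if t and not p)
    return 1.0 - (w_fp * false_pos + w_fn * false_neg) / len(truth)

# truth plays the role of the stored accuracy data; predicted is the system output.
truth = [True, False, False, False]
predicted = [True, True, False, False]
rate_equal = weighted_accuracy(predicted, truth, acceptable_fp=0.2, acceptable_fn=0.2)
rate_strict_fp = weighted_accuracy(predicted, truth, acceptable_fp=0.1, acceptable_fn=0.3)
```

With equal tolerances the result is the ordinary accuracy. Tightening the tolerance for false positives makes the single false positive count for more, so the rate drops.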

    Device, method and program for visualizing boolean expression
    7.
    Invention patent
    Status: Granted

    Publication number: JP2008097467A

    Publication date: 2008-04-24

    Application number: JP2006280612

    Filing date: 2006-10-13

    CPC classification number: G06F17/30967

    Abstract: PROBLEM TO BE SOLVED: To provide a device, a method and a program for visualizing a Boolean expression so that it is easy to see which conditions are added and which are excluded.
    SOLUTION: A Boolean expression to be visualized is input in the form of a binary tree in which each leaf node represents an operand of the Boolean expression and every other node represents an operator. The input binary tree is converted into a two-dimensional nested expression composed of a plurality of areas, and a drawing expression for visualization is created from the nested expression and displayed. When the Boolean expression is given as a character string, the character string is first converted into a binary tree.
    COPYRIGHT: (C)2008,JPO&INPIT
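
The conversion from a binary tree to nested areas might look like the sketch below. It assumes a tuple encoding `(operator, left, right)` with a binary `NOT` meaning "left, excluding right", merges chains of the same associative operator into one area, and prints indented text in place of a real two-dimensional drawing. None of these choices come from the patent.

```python
ASSOCIATIVE = {"AND", "OR"}

def to_nested(node):
    """Convert a binary tree into nested areas, merging chains of one operator."""
    if isinstance(node, str):
        return node  # leaf: an operand
    op, left, right = node
    children = []
    for child in (to_nested(left), to_nested(right)):
        if isinstance(child, dict) and child["op"] == op and op in ASSOCIATIVE:
            children.extend(child["children"])  # flatten (a AND b) AND c
        else:
            children.append(child)
    return {"op": op, "children": children}

def render(area, indent=0):
    """Produce a text stand-in for the drawing: one bracketed block per area."""
    pad = "  " * indent
    if isinstance(area, str):
        return [pad + area]
    lines = [f"{pad}[{area['op']}"]
    for child in area["children"]:
        lines.extend(render(child, indent + 1))
    lines.append(pad + "]")
    return lines

# (printer AND jam AND (paper OR tray)) NOT recall
tree = ("NOT", ("AND", ("AND", "printer", "jam"), ("OR", "paper", "tray")), "recall")
nested = to_nested(tree)
drawing = "\n".join(render(nested))
```

Printing `drawing` shows the excluded operand ("recall") at the same nesting level as the block of added conditions, which is the kind of at-a-glance distinction the abstract aims for.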

    Method for classifying text units on the basis of contrasting evaluations, and computer program product and computer therefor

    Publication number: DE112013002187T5

    Publication date: 2015-01-08

    Application number: DE112013002187

    Filing date: 2013-03-29

    Applicant: IBM

    Abstract: To analyze review text units efficiently with limited time and resources, a technique is provided for efficiently extracting, from a large number of review text units, the specific text units that someone (an analyst) should refer to. A method for extracting specific text units from a plurality of text units by a computer includes the following steps: first, evaluating the amount of positive expressions and the amount of negative expressions in each of the text units; second, scoring each of the text units on the basis of a plurality of scoring functions, wherein at least certain scoring functions among the plurality use the amount of positive expressions and the amount of negative expressions as variables; and extracting a text unit whose scoring result has a higher score in preference to a text unit with a lower score, wherein the scoring results being compared are based on the same scoring function from the plurality of scoring functions.
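
As a rough illustration of scoring with several functions of the positive and negative amounts, the sketch below uses tiny hand-made word lists and three invented scoring functions. Texts are ranked separately under each function, so scores from different functions are never compared with one another.

```python
# Hypothetical sentiment lexicons; a real system would use a proper resource.
POSITIVE = {"great", "good", "love"}
NEGATIVE = {"bad", "broken", "slow"}

def polarity_amounts(text):
    """Return (amount of positive expressions, amount of negative expressions)."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words), sum(w in NEGATIVE for w in words)

# Assumed scoring functions; each takes the two amounts as variables.
SCORING_FUNCTIONS = {
    "strongly_positive": lambda pos, neg: pos - neg,
    "strongly_negative": lambda pos, neg: neg - pos,
    "mixed": lambda pos, neg: min(pos, neg),
}

def extract(texts, per_function=1):
    """For each scoring function, pick the texts with the highest score."""
    amounts = [polarity_amounts(t) for t in texts]
    selected = {}
    for name, score in SCORING_FUNCTIONS.items():
        ranked = sorted(range(len(texts)), key=lambda i: score(*amounts[i]), reverse=True)
        selected[name] = [texts[i] for i in ranked[:per_function]]
    return selected

reviews = [
    "great screen good battery",
    "slow and broken",
    "love the screen but slow and bad speaker",
    "arrived on tuesday",
]
picked = extract(reviews)
```

The "mixed" function surfaces the review that contains both praise and complaints, which neither of the one-sided functions would rank first.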

    Retrieval system, retrieval method, reporting system, reporting method, and program
    9.
    Invention patent
    Status: Granted

    Publication number: JP2010211821A

    Publication date: 2010-09-24

    Application number: JP2010111458

    Filing date: 2010-05-13

    Abstract: PROBLEM TO BE SOLVED: To retrieve document data while appropriately reflecting the content of a retrieving sentence, and to appropriately detect the occurrence of a problem in document data that is added sequentially. SOLUTION: This retrieval system retrieves, from a plurality of document data, the document data including the content of the retrieving sentence. It includes a document database for storing the plurality of document data; a concept database for storing a plurality of concepts in a hierarchical structure; a document data concept extraction part for extracting the document concept corresponding to each document data, based on a keyword included in it; a retrieving sentence concept extraction part for extracting a retrieving sentence concept, based on a keyword included in the retrieving sentence; a concept retrieval part for retrieving, out of the plurality of document data, the document data for which the retrieving sentence concept is in an upper or lower hierarchy of the document concept; and a retrieval result output part for outputting the document data retrieved by the concept retrieval part as the document data including the content designated by the retrieving sentence. COPYRIGHT: (C)2010,JPO&INPIT

    Character string processing method and device, and program
    10.
    Invention patent
    Status: Granted

    Publication number: JP2007172404A

    Publication date: 2007-07-05

    Application number: JP2005370970

    Filing date: 2005-12-22

    CPC classification number: G06F17/276

    Abstract: PROBLEM TO BE SOLVED: To provide a method for efficient document masking. SOLUTION: As a first mode, this method has the steps of: decomposing a character string inside a document into partial character strings; calculating, for each partial character string, a score that includes its appearance frequency; presenting the scores and the partial character strings to a user; determining the partial character strings selected by the user; storing the selected partial character strings as a safe character string list; and replacing every partial character string in the document that is not in the safe character string list with a prescribed replacement character string. COPYRIGHT: (C)2007,JPO&INPIT
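
The steps above can be sketched as a short allow-list masking routine. The sketch treats words as the partial character strings, uses raw frequency as the score, and simulates the user's selection with a frequency threshold; these are simplifying assumptions, and the function names are hypothetical.

```python
import re
from collections import Counter

def split_tokens(text):
    # Assumption: words stand in for the "partial character strings".
    return re.findall(r"\w+", text.lower())

def score_substrings(documents):
    """Score each partial string by its appearance frequency across documents."""
    counts = Counter()
    for doc in documents:
        counts.update(split_tokens(doc))
    return counts.most_common()

def mask(document, safe_list, replacement="***"):
    """Replace every token that is not on the safe list."""
    def keep_or_mask(match):
        token = match.group(0)
        return token if token.lower() in safe_list else replacement
    return re.sub(r"\w+", keep_or_mask, document)

documents = [
    "Customer Tanaka reported a printer error",
    "Customer Sato reported a scanner error",
]
scores = score_substrings(documents)
# Stand-in for the user's choice: accept every token seen at least twice.
safe_list = {token for token, count in scores if count >= 2}
masked = mask(documents[0], safe_list)
```

Masking everything except an approved list means a rare string such as a personal name is hidden by default, even if nobody anticipated it.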
