DATA PROCESSING METHOD, DATA PROCESSING SYSTEM, AND PROGRAM
    1.
    Invention publication
    DATA PROCESSING METHOD, DATA PROCESSING SYSTEM, AND PROGRAM (under examination, published)

    Publication number: EP1429258A4

    Publication date: 2007-08-29

    Application number: EP02746128

    Filing date: 2002-07-19

    Applicant: IBM

    Abstract: A support system and a method for efficiently generating synonym candidates when compiling a thesaurus used in text mining are disclosed. A synonym candidate acquiring device (130) creates, for each author, an author synonym candidate set containing candidates similar to an input word from that author's data (110), and creates a whole synonym candidate set containing candidates similar to the input word from the whole data (120). A synonym candidate judging device (150) receives the created synonym candidate sets (140) and evaluates the synonym candidates of the whole data (120). During the evaluation, a word that matches the first-ranked word in an author's synonym candidate set is given the status “absolute”, and a word that matches a word ranked second or lower is given the status “negative”.
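
The judging step described in this abstract can be illustrated with a minimal Python sketch. The function name `judge_candidates`, the data layout, and the sample words are hypothetical. The rule that an "absolute" status is never overwritten by "negative" when authors disagree is an assumption; the abstract does not say how such conflicts are resolved.

```python
def judge_candidates(whole_candidates, author_candidates):
    """Attach a status to each word of the whole-data candidate set.

    whole_candidates: candidate words obtained from the whole data.
    author_candidates: dict mapping an author to that author's ranked
        candidate list (best candidate first).
    """
    status = {word: None for word in whole_candidates}
    for ranked in author_candidates.values():
        for rank, word in enumerate(ranked):
            if word not in status:
                continue  # only words of the whole candidate set are judged
            if rank == 0:
                # First place for some author: mark as "absolute".
                status[word] = "absolute"
            elif status[word] is None:
                # Second place or lower: mark as "negative".
                # Assumption: an existing "absolute" is never downgraded.
                status[word] = "negative"
    return status


whole = ["client", "customer", "cust", "user", "buyer"]
per_author = {
    "author_a": ["customer", "cust"],
    "author_b": ["client", "user"],
}
result = judge_candidates(whole, per_author)
for word, word_status in result.items():
    print(word, word_status)
```

Words that no author set mentions (here "buyer") keep the status `None`, so they remain open for manual review.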

    METHOD AND SYSTEM FOR PROCESSING DOCUMENT AND MEDIUM

    Publication number: JP2002032770A

    Publication date: 2002-01-31

    Application number: JP2000190335

    Filing date: 2000-06-23

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To extract meaningful text blocks from a document with an arbitrary layout, such as tables, itemized lists, and multicolumn composition. SOLUTION: A document laid out with blanks, etc., is inputted and symbols associated with the spatial coordinates of the document are acquired. Runs of characters of the same type are extracted from the symbols, and tokens and spaces are generated. A stream is generated from spaces continuing in the column direction, and text blocks are generated from the streams and the tokens. Links between text blocks are generated and defined as a document graph. The validity of each connection (link) between text blocks in the document graph is evaluated by using a language model, and when the connection is valid, the text blocks are merged.
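
A rough Python sketch of the first two steps (tokens and spaces, then streams) follows. It is a simplification under stated assumptions: character "type" is reduced to blank versus non-blank, a "stream" is approximated as a column that is blank on every line, and the names `tokens_and_spaces` and `blank_columns` are hypothetical. Block building, the document graph, and the language-model check are not shown.

```python
from itertools import groupby


def tokens_and_spaces(line):
    """Split one line into runs of blank and non-blank characters.

    Returns (kind, start_column, end_column_exclusive, text) tuples.
    """
    runs = []
    col = 0
    for is_space, group in groupby(line, key=str.isspace):
        text = "".join(group)
        kind = "space" if is_space else "token"
        runs.append((kind, col, col + len(text), text))
        col += len(text)
    return runs


def blank_columns(lines):
    """Columns that are blank on every line: a crude stand-in for a stream."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    return [c for c in range(width) if all(row[c] == " " for row in padded)]


page = [
    "Name   Price",
    "Apple  100",
    "Pear   80",
]
print(tokens_and_spaces(page[1]))
print(blank_columns(page))  # columns 5 and 6 separate the two table columns
```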

    Computer system and method for outputting term in second language paired with term in first language to be translated, and computer program
    3.
    Invention patent
    Computer system and method for outputting term in second language paired with term in first language to be translated, and computer program (granted)

    Publication number: JP2010055298A

    Publication date: 2010-03-11

    Application number: JP2008218444

    Filing date: 2008-08-27

    Abstract: PROBLEM TO BE SOLVED: To provide a means for performing text mining of document data written in a language other than mother language or familiar language, and for satisfying a request for retrieval. SOLUTION: This computer system outputting a term in a second language paired with a term in a first language to be translated includes: a first extraction part for extracting a co-occurrence term which co-occurs with the term of the first language from a corpus of the first language; an output part for outputting a word translated in the second language corresponding to at least one of the extracted co-occurrence terms; a second extraction part for extracting translation candidates which co-occur with at least one of the output words in the second language from a corpus of the second language corresponding to the corpus of the first language; a weighting part for weighting each of the extracted translation candidates; and a generation part for optimizing the weight, and for generating the list of the translation pairs for the term in the first language according to the optimized weight. COPYRIGHT: (C)2010,JPO&INPIT
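
The extraction and weighting parts named in this abstract can be sketched in Python as follows. This is only an illustration: sentence-level co-occurrence, whitespace tokenization, the `seed_dictionary` argument, and the toy English and German corpora are all assumptions. The weight here is a raw co-occurrence count; the weight optimization mentioned in the abstract is not specified there and is not implemented.

```python
from collections import Counter


def cooccurring_terms(term, corpus):
    """Count the words that share a sentence with `term`."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if term in words:
            counts.update(w for w in words if w != term)
    return counts


def translation_candidates(term, corpus_l1, corpus_l2, seed_dictionary):
    # First extraction part: co-occurrence terms in the first language.
    context_l1 = cooccurring_terms(term, corpus_l1)
    # Output part: translate the co-occurrence terms the dictionary covers.
    context_l2 = sorted(
        {seed_dictionary[w] for w in context_l1 if w in seed_dictionary}
    )
    # Second extraction part and weighting part: second-language words that
    # co-occur with the translated terms, weighted by raw count (assumption).
    weights = Counter()
    for translated in context_l2:
        weights.update(cooccurring_terms(translated, corpus_l2))
    return weights.most_common()


corpus_en = ["brake pedal soft", "brake squeal loud"]
corpus_de = ["Bremse Pedal weich", "Bremse Quietschen laut", "Motor Quietschen"]
seed = {"pedal": "Pedal", "squeal": "Quietschen"}

candidates = translation_candidates("brake", corpus_en, corpus_de, seed)
print(candidates[0])  # ('Bremse', 2)
```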

    Expression detection system, expression detection method and program
    4.
    Invention patent
    Expression detection system, expression detection method and program (granted)

    Publication number: JP2006146567A

    Publication date: 2006-06-08

    Application number: JP2004335906

    Filing date: 2004-11-19

    Abstract: PROBLEM TO BE SOLVED: To suitably detect liking expressions indicating people's liking for a commodity or the like. SOLUTION: An expression detection system detects liking expressions, which indicate an evaluator's liking for a specific evaluation target, from texts in which evaluations of that target are described. The system stores a plurality of such texts in association with the attributes of the respective texts; extracts from each text the evaluation expressions indicating the evaluation of the target; judges whether each extracted evaluation expression has positive polarity, indicating a positive evaluation of the target, or negative polarity, indicating a negative evaluation; receives as input the attribute of the texts in which liking expressions are to be detected; detects the evaluation expressions found in the texts having the input attribute as liking expressions; and outputs the liking expressions in association with the frequency with which each was judged to have positive or negative polarity in the texts having that attribute. COPYRIGHT: (C)2006,JPO&NCIPI
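
The final detection and output steps can be sketched in Python as below. The sketch assumes that extraction and polarity judgement have already produced `(attribute, expression, polarity)` records; that record layout, the function name `liking_expressions`, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def liking_expressions(records, attribute):
    """Tally polarity judgements per expression for one text attribute.

    records: iterable of (attribute, expression, polarity) tuples, where
        polarity is "positive" or "negative".
    """
    tallies = defaultdict(Counter)
    for text_attribute, expression, polarity in records:
        if text_attribute == attribute:
            tallies[expression][polarity] += 1
    return {expression: dict(counts) for expression, counts in tallies.items()}


records = [
    ("age 20-29", "compact", "positive"),
    ("age 20-29", "compact", "positive"),
    ("age 20-29", "heavy", "negative"),
    ("age 60+", "compact", "negative"),
]
print(liking_expressions(records, "age 20-29"))
# {'compact': {'positive': 2}, 'heavy': {'negative': 1}}
```

Keeping the tally per attribute lets the same expression come out with a different polarity profile for different groups, as "compact" does in the sample data.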

    METHOD AND SYSTEM FOR PROCESSING NATURAL LANGUAGE

    Publication number: JPH08147299A

    Publication date: 1996-06-07

    Application number: JP28308794

    Filing date: 1994-11-17

    Applicant: IBM JAPAN

    Inventor: NASUKAWA TETSUYA

    Abstract: PURPOSE: To enable syntax analysis with a certain degree of accuracy for any sentence, in syntax analysis processing that depends on grammatical knowledge, by reanalyzing a sentence that cannot be analyzed using the word strings of sentences in the same context for which syntax analysis succeeded. CONSTITUTION: A morpheme analysis block 104 divides a sentence supplied from an input block 102 into words and analyzes the part of speech or inflection of each word while referring to a dictionary, and a syntax analysis block 106 produces information consisting of a tree structure based on the output of the morpheme analysis block 104. Further, a context analysis block 108 holds the context information of the entire set of input sentences and applies that context information to any sentence for which the syntax analysis block 106 outputs an unsuitable analysis result, namely a grammatically unsuitable sentence that conventional syntax analysis depending on grammatical knowledge cannot analyze, so as to provide as accurate an analysis result as possible. Thus, the accuracy of the natural language processing system is improved and syntax analysis is enabled with a certain degree of accuracy.

Future technology trend prediction support device, method, program and method for providing future technology trend prediction support service
    6.
    Invention patent
    Future technology trend prediction support device, method, program and method for providing future technology trend prediction support service (granted)

    Publication number: JP2008282222A

    Publication date: 2008-11-20

    Application number: JP2007126046

    Filing date: 2007-05-10

    CPC classification number: G06Q30/02 G06Q10/00 G06Q30/0202

    Abstract: PROBLEM TO BE SOLVED: To provide information which is helpful for predicting technology trends by analyzing the subject description part and effect description part of a technical document. SOLUTION: This technology trend prediction support device is provided with: a description part extraction part for extracting a subject description part and an effect description part from a technical document; a technical expression extraction part for extracting technical expressions expressing matters to be achieved by the technology from the subject description part and the effect description part; an influence degree decision part for deciding the degree of influence on business of the matters expressed by the extracted technical expressions; a naming part for naming the extracted technical expressions; and a technology map creation part for creating a technology map. The created technology map has axes for the time needed to achieve the technology and for the degree of influence on business, and the names of the extracted technical expressions are displayed at the corresponding coordinates. COPYRIGHT: (C)2009,JPO&INPIT

    Expression extractor, expression extraction method, program, and recording medium
    7.
    Invention patent
    Expression extractor, expression extraction method, program, and recording medium (granted)

    Publication number: JP2005235014A

    Publication date: 2005-09-02

    Application number: JP2004045342

    Filing date: 2004-02-20

    CPC classification number: G06F17/2785

    Abstract: PROBLEM TO BE SOLVED: To provide an expression extraction device which extracts, from a text, evaluation expressions showing evaluations of an evaluation object and properly determines their polarity. SOLUTION: The expression extraction device, which extracts evaluation expressions from a text describing evaluations of a specific evaluation object, is provided with: a registered expression storage part for registering an evaluation expression having a predetermined polarity as a registered expression; an expression extraction part for extracting a plurality of evaluation expressions and conjunction expressions from the text; a registered expression detection part for detecting, out of the plurality of evaluation expressions, an evaluation expression that includes a registered expression stored in the registered expression storage part; and a polarity determination part. The polarity determination part determines that the following have the same polarity as the registered expression: an evaluation expression joined to the evaluation expression including the registered expression by a copulative conjunction expression, and a series of evaluation expressions that are joined neither to that evaluation expression nor to each other by any adversative or copulative conjunction expression. COPYRIGHT: (C)2005,JPO&NCIPI
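
One possible reading of the polarity determination part is sketched below in Python. The data layout (a list of expressions plus one link per adjacent pair), the function name `propagate_polarity`, and the sample words are hypothetical. Polarity spreads across copulative links and across pairs with no conjunction. Adversative links are treated only as a barrier here, because the abstract does not state what polarity follows them.

```python
def propagate_polarity(expressions, links, registered):
    """Spread the polarity of registered expressions to their neighbours.

    expressions: evaluation expressions in text order.
    links: links[i] describes how expressions[i] and expressions[i + 1]
        are joined: "copulative", "adversative", or None (no conjunction).
    registered: dict mapping a registered expression to its polarity.
    """
    polarity = [registered.get(expression) for expression in expressions]
    changed = True
    while changed:
        changed = False
        for i, link in enumerate(links):
            if link == "adversative":
                continue  # assumption: nothing is inferred across "but"
            left, right = polarity[i], polarity[i + 1]
            if left is not None and right is None:
                polarity[i + 1] = left
                changed = True
            elif right is not None and left is None:
                polarity[i] = right
                changed = True
    return list(zip(expressions, polarity))


expressions = ["sharp", "vivid", "bulky", "slow"]
links = ["copulative", "adversative", None]
registered = {"sharp": "positive", "slow": "negative"}
print(propagate_polarity(expressions, links, registered))
```

In this sample, "vivid" inherits "positive" from "sharp" through the copulative link, and "bulky" inherits "negative" from "slow" because no conjunction separates them.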

    METHOD AND DEVICE FOR EXTRACTING KNOWLEDGE FROM ENORMOUS DOCUMENT DATA AND MEDIUM

    Publication number: JP2001084250A

    Publication date: 2001-03-30

    Application number: JP23967499

    Filing date: 1999-08-26

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To automatically extract documents satisfying a pattern from an enormous amount of documents, to extract useful knowledge, and to reduce the time required for a response, by generating a field-dependent dictionary from document data, generating a syntax tree that takes modification into account by means of a language analysis device, and extracting and outputting frequently appearing patterns by means of a pattern extraction device. SOLUTION: A language feature analysis device generates an analysis-dependent dictionary. A language analysis device needs a field-dependent dictionary to be prepared in order to obtain attributes suited to the data to be analyzed, and words having a specified attribute are to be generated for each field. The language feature analysis device checks the words against actual data and registers them in the field-dependent dictionary. A pattern extraction device obtains frequently appearing patterns from the document data that has been structure-analyzed by the device, and takes out the original documents whose syntax matches a pattern. A frequently-appearing pattern device displays the documents having the detected frequently appearing pattern and the syntax trees matched with it.
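
The pattern extraction step can be sketched in Python as follows. The sketch assumes that the language analysis has already reduced each document's syntax tree to a set of `(head, modifier)` pairs and that a pattern is a single such pair. Both are simplifications, and the names `frequent_patterns` and `min_support` are hypothetical.

```python
from collections import defaultdict


def frequent_patterns(parsed_documents, min_support):
    """Return the patterns found in at least `min_support` documents.

    parsed_documents: dict mapping a document id to the set of
        (head, modifier) pairs taken from its syntax trees.
    The result maps each frequent pattern to the ids of the matching
    documents, so the original documents can be retrieved.
    """
    support = defaultdict(set)
    for doc_id, pairs in parsed_documents.items():
        for pair in pairs:
            support[pair].add(doc_id)
    return {
        pair: sorted(doc_ids)
        for pair, doc_ids in support.items()
        if len(doc_ids) >= min_support
    }


parsed = {
    "doc1": {("freeze", "screen"), ("restart", "pc")},
    "doc2": {("freeze", "screen")},
    "doc3": {("restart", "pc"), ("freeze", "screen")},
    "doc4": {("replace", "battery")},
}
patterns = frequent_patterns(parsed, min_support=2)
for pair, doc_ids in sorted(patterns.items()):
    print(pair, doc_ids)
```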

    TRANSLATION METHOD AND SYSTEM
    9.
    Invention patent

    Publication number: JPH11184855A

    Publication date: 1999-07-09

    Application number: JP35438697

    Filing date: 1997-12-24

    Applicant: IBM JAPAN

    Abstract: PROBLEM TO BE SOLVED: To improve the accuracy of translated-word selection in machine translation without lowering processing efficiency, by using plural types of dictionaries, including a context dictionary, when a word that is not defined in a compound word dictionary is translated in a sentence. SOLUTION: Each input sentence is taken out from the head (110), and the compound words corresponding to the word string composing a single input sentence are retrieved from a compound word dictionary. When each of the words that do not correspond to a compound word is translated (120), its translation is decided by a context dictionary and the translated word is obtained based on that translation (130). The translated word undergoes translation result registration processing. It is then checked whether the object word is stored as a header in a translation result recording buffer. If the object word is stored in the buffer, it is checked whether its translated word is also stored in the buffer, and all words that do not correspond to compound words are processed (140, 160). Then all words are translated (170). When it is decided that a full sentence has been translated, the sentence is registered (180 to 195) after undergoing retranslation effect evaluation processing.
