-
Publication No.: KR101309839B1
Publication date: 2013-09-23
Application No.: KR1020090118298
Application date: 2009-12-02
Applicant: 한국전자통신연구원
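Abstract: The present invention relates to a rule-based parsing apparatus and method that use statistical information. A rule-based parsing method using statistical information according to one embodiment comprises: parsing an input sentence by applying syntactic rules to it; calculating a rule weight for each rule applied to the input sentence, using the rule probability assigned to the rule and a lexical dependency weight calculated from lexical statistical information; calculating the weight of each parse tree by multiplying the rule weights calculated for the rules used in that tree, and selecting the parse tree with the highest weight; and outputting the selected parse tree.
As described above, the present invention enables parsing that combines the efficiency of the rule-based approach with the strong ambiguity resolution of the statistics-based approach.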
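The tree-selection step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the rule probability and the lexical dependency weight are combined, so the plain product in `rule_weight` is an assumption, and all names and numbers below are hypothetical.

```python
from math import prod

def rule_weight(rule_prob, lexical_weight):
    # Assumption: the two factors are combined by multiplication;
    # the abstract only says both are "used" to compute the rule weight.
    return rule_prob * lexical_weight

def tree_weight(rules):
    # Tree weight = product of the weights of all rules used in the tree
    # (this part follows the abstract directly).
    return prod(rule_weight(p, w) for p, w in rules)

def select_best_tree(candidates):
    # Return the name of the candidate parse tree with the highest weight.
    return max(candidates, key=lambda name: tree_weight(candidates[name]))

# Hypothetical candidates: each tree is a list of
# (rule probability, lexical dependency weight) pairs.
candidates = {
    "tree_a": [(0.5, 0.5), (0.5, 1.0)],
    "tree_b": [(0.25, 2.0), (0.5, 1.0)],
}
best = select_best_tree(candidates)
```

In this toy example the lexical dependency weight lifts `tree_b` above `tree_a` even though one of its rules has a lower rule probability, which is the kind of disambiguation the abstract attributes to the statistical information.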
Keywords: language processing, syntactic analysis, statistical information, rule-based
-
Publication No.: KR1020130022473A
Publication date: 2013-03-07
Application No.: KR1020110084729
Application date: 2011-08-24
Applicant: 한국전자통신연구원
IPC: G06F17/28
CPC classification number: G06F17/2818 , G06F17/277 , G06F17/2795
Abstract: PURPOSE: A target word selection method for a dialogue-type automatic translation device is provided to improve the accuracy of target word selection in dialogue-type automatic translation. CONSTITUTION: A reference sentence range is set for selecting the target word of a specific word in the provided dialogue sentences (209). A clue for selecting the target word is defined within the reference sentence range (211). The target word selection probability for words corresponding to the clue is obtained from a co-occurrence probability information database (213). The target word of the specific word is selected based on the target word selection probability (217). [Reference numerals] (201) Setting a dynamic context window for a plurality of input sentences according to predetermined criteria; (203) Changing the structure of the plurality of input sentences; (205) Ambiguous vocabulary exists?; (207) Changing into a vocabulary in a dictionary; (209) Setting a reference context range for selecting the target word of the ambiguous vocabulary; (211) Selecting a clue in the reference context range; (213) Obtaining a target word and probability information corresponding to the selected clue from the co-occurrence probability information DB; (215) Storing the selected clue and the probability information in the dynamic context window; (217) Selecting the target word using the clue and the probability information; (AA) Start; (BB) No; (CC) Yes; (DD) End
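Steps (209) to (217) of this record can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the table contents, the whitespace tokenization, the fixed `window` size, and the choice to sum probabilities across clues are all hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical co-occurrence probability table:
# (ambiguous source word, clue word) -> {candidate target word: probability}
COOCCURRENCE_DB = {
    ("bank", "money"): {"bank_financial": 0.75, "bank_riverside": 0.25},
    ("bank", "river"): {"bank_financial": 0.25, "bank_riverside": 0.75},
}

def select_target_word(word, sentences, window=2):
    # (209) Reference context range: the last `window` dialogue sentences
    # (assumption: a fixed-size window stands in for the dynamic one).
    context = sentences[-window:]
    scores = defaultdict(float)
    for sentence in context:
        # (211) Every token in the range is tried as a clue.
        for clue in sentence.lower().split():
            # (213) Look up target words and probabilities for this clue.
            for target, p in COOCCURRENCE_DB.get((word, clue), {}).items():
                # Assumption: evidence from several clues is summed.
                scores[target] += p
    if not scores:
        return None  # no clue found in the reference range
    # (217) Pick the candidate with the highest accumulated score.
    return max(scores, key=scores.get)

river_choice = select_target_word(
    "bank", ["I walked along the river", "I sat on the bank"])
money_choice = select_target_word(
    "bank", ["I need money", "Where is the bank"])
no_clue_choice = select_target_word("bank", ["Hello there"])
```

Here the earlier sentence in the dialogue supplies the clue that the sentence containing the ambiguous word lacks, which is the motivation for looking beyond the current sentence.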
-
Publication No.: KR1020120089502A
Publication date: 2012-08-13
Application No.: KR1020100125870
Application date: 2010-12-09
Applicant: 한국전자통신연구원
CPC classification number: G06F17/289 , G06F17/2809 , G06F17/27 , G06F17/271 , G06F17/2755 , G06F17/28 , G06F17/2854
Abstract: PURPOSE: A method and apparatus for generating a translation knowledge server are provided to acquire translation knowledge and apply it to a translation engine. CONSTITUTION: A data analysis unit (103) performs morphological analysis and syntax analysis on initial translation knowledge data collected by a data collecting unit, and outputs the analyzed data. A translation knowledge learning unit (105) determines a target word for each domain according to predetermined domain information, determines a domain through automatic learning clustering, and learns translation knowledge in real time.
-
Publication No.: KR1020120088032A
Publication date: 2012-08-08
Application No.: KR1020100101419
Application date: 2010-10-18
Applicant: 한국전자통신연구원
IPC: G06F17/28
CPC classification number: G06F17/28 , G06F17/2818 , G06F17/2854 , G06F17/2872 , G06F17/289
Abstract: PURPOSE: A method and apparatus for automatically extracting and verifying translation knowledge in real time are provided to build the translation knowledge of an automatic translation system automatically and to reduce the cost of building automatic translation knowledge dictionaries. CONSTITUTION: A parallel corpus collecting unit (100) collects parallel text from the web, removes tags such as HTML tags, and generates a parallel corpus. A translation knowledge extracting block (110) extracts predetermined translation knowledge. A translation knowledge evaluating block (130) evaluates the extracted translation knowledge and converts it into a first evaluation element and a second evaluation element.
-
Publication No.: KR1020110067276A
Publication date: 2011-06-22
Application No.: KR1020090123802
Application date: 2009-12-14
Applicant: 한국전자통신연구원
CPC classification number: G06F17/278 , G06F17/2755 , G06F17/2863
Abstract: PURPOSE: A Korean language analyzing apparatus is provided to improve the efficiency of noun phrase structure analysis by determining noun phrase analysis candidates on the basis of noun vocabulary patterns and verb phrase patterns. CONSTITUTION: A verb phrase/vocabulary pattern building block (102) builds verb phrase patterns and vocabulary patterns using sentences from the domain to be analyzed. An analysis candidate generating block (104) generates noun phrase analysis candidates by searching the noun phrase to be analyzed for a noun that can act as a declinable word. A priority determining block (108) computes a preference from the established vocabulary patterns and determines the priority of the noun phrase analysis candidates. An analysis candidate selecting block (110) selects one noun phrase analysis candidate in consideration of the preference and the restrictions between nouns.
-
Publication No.: KR1020110062867A
Publication date: 2011-06-10
Application No.: KR1020090119720
Application date: 2009-12-04
Applicant: 한국전자통신연구원
CPC classification number: G06F17/30539 , G06F17/2735 , G06F17/2755 , G06F17/2863
Abstract: PURPOSE: An apparatus and method for constructing a source language-object language term list are provided to build the term list by mining search results through an intermediate language. CONSTITUTION: A document search unit (105) collects search results that include the source language. A data mining unit (107) extracts source language-intermediate language parallel sentences or words. A morpheme analysis unit (109) analyzes the extracted source language-intermediate language parallel sentences. A word alignment unit (111) aligns the parallel sentences. A term generator (113) generates a source language term list.
-
Publication No.: KR1020110057632A
Publication date: 2011-06-01
Application No.: KR1020090114108
Application date: 2009-11-24
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2863 , G06F17/218 , G06F17/271 , G06F17/2785
Abstract: PURPOSE: A Chinese translation apparatus and method using syntax analysis information are provided to determine the sentence pattern of a Chinese sentence from the syntax analysis information of the tagged sentence. CONSTITUTION: An input unit (102) segments and tags a Chinese sentence and analyzes its syntax. A sentence pattern determining unit (104) applies a sentence pattern decision rule to the syntax analysis result. A transform unit (112) converts the Chinese sentence according to the sentence type. A terminative ending generating unit (114) generates a terminative ending production rule corresponding to the sentence type.
-
Publication No.: KR1020110057495A
Publication date: 2011-06-01
Application No.: KR1020090113923
Application date: 2009-11-24
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A method and device for segmenting Chinese text are provided to improve the accuracy and performance of phrase-rule-based Chinese phrase analysis by performing phrase segmentation based on Chinese contextual information. CONSTITUTION: A segmentation parameter setting unit (100) sets segmentation parameters for a morphologically analyzed Chinese sentence. A segment location estimating unit (102) estimates the positions where segmentation is possible and the maximum number of segments for the Chinese sentence for which the segmentation parameters are set.
-
Publication No.: KR1020110028123A
Publication date: 2011-03-17
Application No.: KR1020090086064
Application date: 2009-09-11
Applicant: 한국전자통신연구원
CPC classification number: G06F17/289 , G06K9/2081 , G06K2209/01
Abstract: PURPOSE: An automatic translation apparatus and method that use user interaction on a mobile device to correct text errors are provided to improve the quality of automatic translation results and to correct text recognition errors through user feedback. CONSTITUTION: A text recognition controller (103) designates a target character string area in a digital image through user interaction and performs character recognition through OCR (Optical Character Recognition). A text transmission controller (105) transmits the text character string. An automatic translation controller (107) generates a translation of the character string by using an automatic translation knowledge database.
-
Publication No.: KR1020110027361A
Publication date: 2011-03-16
Application No.: KR1020090085422
Application date: 2009-09-10
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2836
Abstract: PURPOSE: A TM-based automatic translation system and method capable of increasing TM coverage are provided to improve translation quality by changing a TM consisting of character strings into a configured TM. CONSTITUTION: A TM building module (106) converts a language pattern into a partial translation pattern and registers the partial translation pattern in a TM database. A partial combination translation module (20) analyzes the structure of a language pattern with reference to the TM database and searches for partial translation patterns. The partial combination translation module combines the partial translation patterns and outputs a translation corresponding to the input sentence.
-