-
Publication No.: KR101467826B1
Publication date: 2014-12-03
Application No.: KR1020100125864
Filing date: 2010-12-09
Applicant: 한국전자통신연구원
IPC: G06F17/28
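Abstract: Disclosed are a syntactic analysis method and apparatus that analyze syntactic relations in an automatic translation system by extracting lexical co-occurrence information using the target language. The apparatus may comprise: a receiving unit that receives a first syntactic analysis result and a second syntactic analysis result; a co-occurrence information extraction unit that uses the first syntactic analysis result to extract the target-language equivalent of a lexeme with a strong tendency toward long-distance dependency (LDLex, Long Distance Lexeme), uses the second syntactic analysis result to extract the target-language governor of that equivalent, and uses the first syntactic analysis result to extract, from the input sentence to be translated, the governor corresponding to the target-language governor; and a storage unit that stores the target-language governor and the governor extracted from the input sentence in association with the lexeme with a strong dependency tendency. Because lexical co-occurrence information can thus be extracted accurately using the target language, syntactic relations can also be analyzed accurately.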
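The extraction chain in this abstract (long-distance lexeme, then its target-language equivalent, then the target-language governor, then the matching source-side governor) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the two syntactic analysis results are represented, so the word-alignment dictionary, the head map, the function name `extract_cooccurrence`, and all tokens are hypothetical.

```python
from collections import defaultdict

def extract_cooccurrence(ldlex, alignment, target_heads):
    """Return (target governor, source governor) for a long-distance lexeme.

    alignment    -- hypothetical map: source word -> target-language equivalent
    target_heads -- hypothetical map: target word -> its governor (head) in the
                    target-language dependency parse
    Returns None when any step of the chain cannot be resolved.
    """
    target_word = alignment.get(ldlex)             # step 1: target equivalent
    if target_word is None:
        return None
    target_governor = target_heads.get(target_word)  # step 2: target governor
    if target_governor is None:
        return None
    # step 3: map the target governor back to a word of the input sentence
    reverse_alignment = {tgt: src for src, tgt in alignment.items()}
    source_governor = reverse_alignment.get(target_governor)
    if source_governor is None:
        return None
    return (target_governor, source_governor)

# Storage keyed by the long-distance lexeme, as the storage unit is described.
store = defaultdict(set)

# Toy data (all tokens are placeholders, not taken from the patent).
alignment = {"S_report": "report", "S_submitted": "submitted", "S_yesterday": "yesterday"}
target_heads = {"report": "submitted", "yesterday": "submitted"}

pair = extract_cooccurrence("S_report", alignment, target_heads)
if pair is not None:
    store["S_report"].add(pair)
```

A one-to-one alignment is assumed here for brevity; a real alignment may be many-to-many, in which case the reverse lookup would need to return several candidates.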
-
Publication No.: KR1020140049149A
Publication date: 2014-04-25
Application No.: KR1020120114534
Filing date: 2012-10-16
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2705, G06F17/2809
Abstract: The present invention relates to a device and a method for generating translations of complex sentences. The proposed translation generating device comprises: a corpus analysis unit that splits the source text and the initial translation contained in a corpus into sentence units and analyzes each sentence unit; a short sentence information extraction unit that, based on the analysis of the source text, extracts information about the predicate of each short sentence (clause) of a complex sentence in the source text and, based on the analysis of the initial translation, extracts grammar and word-class information for each sentence of the complex sentence in the initial translation; and a final translation generating unit that generates a final translation from the initial translation using the predicate information and the grammar and word-class information. The invention improves translation performance for complex sentences. [Reference numerals] (510) Morpheme analysis unit; (520) Construction analysis unit; (531) Short sentence qualification extraction unit; (532) Optimal pattern selecting unit; (533) Complex sentence translation generation unit; (540) Short sentence unit translation generation unit; (550) Morpheme generation unit; (AA) Primitive language input sentence; (BB) Object language translation sentence
-
Publication No.: KR101309839B1
Publication date: 2013-09-23
Application No.: KR1020090118298
Filing date: 2009-12-02
Applicant: 한국전자통신연구원
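Abstract: The present invention relates to a rule-based syntactic analysis apparatus and method using statistical information. A rule-based syntactic analysis method using statistical information according to an embodiment of the invention comprises: performing syntactic analysis by applying syntax rules to an input sentence; calculating a rule weight for each rule applied to the input sentence, using the rule probability assigned to the rule and a lexical dependency weight calculated from lexical statistical information; calculating the weight of each syntax tree by multiplying the rule weights calculated for the rules used in that tree, and selecting the syntax tree with the highest weight; and outputting the selected syntax tree.
As described above, the invention enables syntactic analysis that combines the efficiency of the rule-based approach with the strong ambiguity-resolution performance of the statistics-based approach.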
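The tree-selection step described above can be illustrated with a short sketch. The abstract states that each tree's weight is the product of its rule weights, but it does not state how a rule probability and a lexical dependency weight are combined into a rule weight; multiplying them is an assumption made here, and the function names and numbers are hypothetical.

```python
from math import prod

def rule_weight(rule_probability, lexical_dependency_weight):
    # Assumption: the two factors are combined by multiplication.
    # The abstract only says both are used to compute the rule weight.
    return rule_probability * lexical_dependency_weight

def tree_weight(rules):
    # Per the abstract: multiply the rule weights of all rules used in the tree.
    return prod(rule_weight(p, w) for p, w in rules)

def select_best_tree(candidates):
    # Pick the candidate syntax tree with the highest weight.
    return max(candidates, key=lambda name: tree_weight(candidates[name]))

# Toy candidates: each tree is a list of
# (rule probability, lexical dependency weight) pairs.
candidates = {
    "tree_a": [(0.5, 0.5), (0.5, 1.0)],
    "tree_b": [(0.5, 1.0), (0.5, 1.0)],
}

best = select_best_tree(candidates)
```

For long sentences a product of many small weights can underflow, so a practical implementation would probably sum log-weights instead; that substitution does not change which tree is selected.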
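Keywords: language processing, syntactic analysis, statistical information, rule-based
-
Publication No.: KR1020120089502A
Publication date: 2012-08-13
Application No.: KR1020100125870
Filing date: 2010-12-09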
Applicant: 한국전자통신연구원
CPC classification number: G06F17/289, G06F17/2809, G06F17/27, G06F17/271, G06F17/2755, G06F17/28, G06F17/2854
Abstract: PURPOSE: A method and an apparatus for generating a translation knowledge server are provided to acquire translation knowledge and apply the acquired knowledge to a translation engine. CONSTITUTION: A data analysis unit (103) performs morphological analysis and syntactic analysis on the initial translation knowledge data collected by a data collecting unit and outputs the analyzed data. A translation knowledge learning unit (105) determines the target word for each domain according to predetermined domain information, determines the domain through automatic learning clustering, and learns translation knowledge in real time.
-
Publication No.: KR1020120088032A
Publication date: 2012-08-08
Application No.: KR1020100101419
Filing date: 2010-10-18
Applicant: 한국전자통신연구원
IPC: G06F17/28
CPC classification number: G06F17/28, G06F17/2818, G06F17/2854, G06F17/2872, G06F17/289
Abstract: PURPOSE: A method and an apparatus for automatically extracting and verifying translation knowledge in real time are provided to build the translation knowledge of an automatic translation system automatically and to reduce the cost of building automatic translation knowledge dictionaries. CONSTITUTION: A parallel corpus collecting unit (100) collects parallel text on the web, removes tags such as HTML, and generates the parallel corpus. A translation knowledge extracting block (110) extracts predetermined translation knowledge. A translation knowledge evaluating block (130) evaluates the extracted translation knowledge and converts it into a first evaluation element and a second evaluation element.
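The tag-removal step performed by the parallel corpus collecting unit can be sketched with Python's standard `html.parser` module. The abstract does not say how the patent implements this step; the class name `TextExtractor`, the helper functions, and the sample pages below are hypothetical and only illustrate the idea of turning two web pages into one plain-text sentence pair.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects only the text content of an HTML fragment, dropping tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Join the text pieces and normalise runs of whitespace.
    return " ".join("".join(parser.parts).split())

def make_parallel_pair(source_html, target_html):
    # One entry of a parallel corpus: (source text, target text).
    return (strip_tags(source_html), strip_tags(target_html))

# Toy pages (hypothetical content).
pair = make_parallel_pair(
    "<p>Hello <b>world</b></p>",
    "<div><span>Bonjour</span> le monde</div>",
)
```

This sketch keeps all text nodes, including any inside `<script>` or `<style>` elements; a real collector would likely filter those out and align sentences before storing pairs.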
-
Publication No.: KR1020110067276A
Publication date: 2011-06-22
Application No.: KR1020090123802
Filing date: 2009-12-14
Applicant: 한국전자통신연구원
CPC classification number: G06F17/278, G06F17/2755, G06F17/2863
Abstract: PURPOSE: A Korean language analyzing apparatus is provided to improve the efficiency of noun phrase structure analysis by determining noun phrase analysis candidates on the basis of noun vocabulary patterns and verb phrase patterns. CONSTITUTION: A verb phrase/vocabulary pattern building block (102) builds verb phrase patterns and vocabulary patterns using the sentences of the domain to be analyzed. An analysis candidate generating block (104) generates noun phrase analysis candidates by searching the noun phrase to be analyzed for a noun that can function as a declinable word (predicate). A priority determining block (108) computes a preference from the built vocabulary patterns and determines the priority of the noun phrase analysis candidates. An analysis candidate selecting block (110) selects one noun phrase analysis candidate in consideration of the preference and the constraints between nouns.
-
Publication No.: KR1020110062867A
Publication date: 2011-06-10
Application No.: KR1020090119720
Filing date: 2009-12-04
Applicant: 한국전자통신연구원
CPC classification number: G06F17/30539, G06F17/2735, G06F17/2755, G06F17/2863
Abstract: PURPOSE: An apparatus and a method for constructing a source language-target language term list are provided to build the list by mining search results through an intermediate language. CONSTITUTION: A document search unit (105) collects search results that include the source language. A data mining unit (107) extracts source language-intermediate language parallel sentences or words. A morpheme analysis unit (109) analyzes the extracted source language-intermediate language parallel sentences. A word alignment unit (111) aligns the parallel sentences. A term generator (113) generates a source-language term list.
-
Publication No.: KR1020110057632A
Publication date: 2011-06-01
Application No.: KR1020090114108
Filing date: 2009-11-24
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2863, G06F17/218, G06F17/271, G06F17/2785
Abstract: PURPOSE: A Chinese translation apparatus and method using syntactic analysis information are provided to determine the sentence pattern of a Chinese sentence from the syntactic analysis information of the tagged sentence. CONSTITUTION: An input unit (102) segments and tags a Chinese sentence and analyzes its syntax. A sentence pattern determining unit (104) applies a sentence pattern decision rule to the result of the sentence analysis. A transform unit (112) converts the Chinese sentence according to the sentence type. A terminative ending generating unit (114) generates a terminative-ending production rule corresponding to the sentence type.
-
Publication No.: KR1020110057495A
Publication date: 2011-06-01
Application No.: KR1020090113923
Filing date: 2009-11-24
Applicant: 한국전자통신연구원
Abstract: PURPOSE: A method and a device for segmenting Chinese text are provided to improve the accuracy and performance of rule-based Chinese phrase analysis by performing phrase segmentation based on Chinese contextual information. CONSTITUTION: A segmentation parameter setting unit (100) sets a segmentation parameter for a morphologically analyzed Chinese sentence. A segment location estimating unit (102) estimates the positions where segmentation is possible and the maximum number of segments for the Chinese sentence for which the segmentation parameter has been set.
-
Publication No.: KR1020110028123A
Publication date: 2011-03-17
Application No.: KR1020090086064
Filing date: 2009-09-11
Applicant: 한국전자통신연구원
CPC classification number: G06F17/289, G06K9/2081, G06K2209/01
Abstract: PURPOSE: An automatic translation apparatus and method using user interaction on a mobile device, which correct errors in recognized text, are provided to improve the quality of the automatic translation result and to correct text recognition errors through user feedback. CONSTITUTION: A text recognition controller (103) designates a target character string area in a digital image through user interaction and performs character recognition through OCR (Optical Character Recognition). A text transmission controller (105) transmits the text character string. An automatic translation controller (107) generates a translation of the character string using an automatic translation knowledge database.