-
Publication number: KR1020140059877A
Publication date: 2014-05-19
Application number: KR1020120125764
Application date: 2012-11-08
Applicant: 한국전자통신연구원
CPC classification number: G06F17/20 , G06F17/2705 , G06F17/2725
Abstract: The present invention relates to an apparatus and method for improving Chinese word segmentation performance. More particularly, to reduce the unregistered-word errors and ambiguity errors that frequently occur in a Chinese word segmenter, it corrects Chinese segmentation errors by automatically recognizing accurate word boundaries from the sentence in the other language of a parallel corpus, for example English or Korean, in which word boundaries are clear. Confirming a segmenter's errors normally consumes a great deal of manpower and time; the invention overcomes this limitation by continuously extracting, through the parallel corpus, the unregistered-word errors and ambiguity errors that are difficult to handle when segmenting a Chinese sentence, and by storing the corrected segmentation information.
-
Publication number: KR101263332B1
Publication date: 2013-05-20
Application number: KR1020090086064
Application date: 2009-09-11
Applicant: 한국전자통신연구원
CPC classification number: G06F17/289 , G06K9/2081 , G06K2209/01
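Abstract: The present invention relates to an automatic translation apparatus and method using user interaction on a mobile device. When a character-string region to be automatically translated is designated through a user interface on a still image captured by a mobile device that includes a camera lens, the designated region is converted into digitized text by character recognition, recognition errors in the converted text are corrected directly through user feedback, and the corrected text string is then automatically translated. As a result, translation targets such as restaurant menus, road signs, various foreign books, and manuals for foreign-brand products can be translated conveniently on a portable mobile device both at home and abroad, and minimizing character recognition errors through interaction with the user yields high-quality automatic translation.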
-
Publication number: KR1020130026799A
Publication date: 2013-03-14
Application number: KR1020110090194
Application date: 2011-09-06
Applicant: 한국전자통신연구원
IPC: G06F17/28
CPC classification number: G06F17/2854 , G06F17/271 , G06F17/2863
Abstract: PURPOSE: A translation engine tuning device and method are provided to improve the translation performance of a translation engine by evaluating translated sentences against a field-specific parallel corpus and editing translation information based on the evaluation result. CONSTITUTION: An example search unit (111) searches documents translated by a translation engine (120), or documents on a network, for sentences to be used in tuning the translation engine. The example search unit groups the retrieved sentences by related field and matches them with correct translations produced by a user in order to generate a parallel corpus. A user authentication unit (112) determines whether the user is qualified to tune the translation engine. A sentence selecting unit (113) selects sentences from the parallel corpus as the original sentences to be tuned and transmits them to a translation unit (121) of the translation engine. [Reference numerals] (110) Translation engine tuning device; (111) Example search unit; (112) User authentication unit; (113) Sentence selecting unit; (114) Sentence evaluating unit; (115) Information editing unit; (120) Translation engine; (121) Translation unit; (122) Translation database
-
Publication number: KR1020120072112A
Publication date: 2012-07-03
Application number: KR1020100133909
Application date: 2010-12-23
Applicant: 한국전자통신연구원
CPC classification number: G06F17/2854
Abstract: PURPOSE: An apparatus and a method for measuring the translation reliability of a multi-language model are provided, in which the reliability of an automatic translation result is calculated from language model probabilities. CONSTITUTION: A vocabulary-based determining unit (110) calculates, from a machine translation result, the probability of the translation under a vocabulary-information language model. A content-word-based determining unit (130) calculates, from the machine translation result, the probability of the translation under a content-information language model. A reliability calculating unit (140) calculates the translation reliability of the translation using the language model probabilities.
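The abstract names two language-model probabilities and a unit that combines them, but does not say how they are combined. The following is a minimal sketch under loud assumptions: both models are stood in for by unigram probability tables, each score is a length-normalized log-probability, and the combination is a weighted geometric mean. The names `avg_log_prob`, `translation_reliability`, `floor`, and `weight` are hypothetical and do not come from the patent.

```python
import math


def avg_log_prob(tokens, unigram_probs, floor=1e-6):
    """Length-normalized log-probability of tokens under a unigram table.

    Unknown tokens receive the small probability `floor` (an assumption).
    """
    if not tokens:
        return math.log(floor)
    total = sum(math.log(unigram_probs.get(t, floor)) for t in tokens)
    return total / len(tokens)


def translation_reliability(tokens, vocab_lm, content_lm, content_words, weight=0.5):
    """Combine a vocabulary-level and a content-word-level LM score.

    Assumed combination: weighted geometric mean of the two per-token
    probabilities, which stays in (0, 1] when all table entries are <= 1.
    """
    content_tokens = [t for t in tokens if t in content_words]
    vocab_score = avg_log_prob(tokens, vocab_lm)            # role of unit (110)
    content_score = avg_log_prob(content_tokens, content_lm)  # role of unit (130)
    combined = weight * vocab_score + (1.0 - weight) * content_score
    return math.exp(combined)                                # role of unit (140)
```

With toy tables, a translation made of known words scores higher than one containing words neither model has seen, which is the behaviour a reliability measure of this kind is meant to show.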
-
Publication number: KR1020120037842A
Publication date: 2012-04-20
Application number: KR1020100099547
Application date: 2010-10-12
Applicant: 한국전자통신연구원
CPC classification number: G09B19/06 , G06F17/289
Abstract: PURPOSE: A method and an apparatus for learning translation knowledge at the phrase level are provided so that translation knowledge can be extended reliably and accurately, because a user corrects the translation knowledge noun phrase by noun phrase. CONSTITUTION: Syntax analysis is performed on the source-language side of a bilingual corpus (110). The bilingual corpus is aligned word by word (120). Target-language noun phrase candidates are extracted from the word-aligned bilingual corpus (130). Candidates for building noun phrase translation knowledge are selected by filtering the target-language noun phrase candidates (140). Noun phrase translation knowledge is collected based on the machine translation result of the source language (160). Noun phrase pattern candidates are selected by searching a pattern database for generalized patterns (180). The result of the user's correction or selection is saved in the corresponding database (200).
-
Publication number: KR101120038B1
Publication date: 2012-03-23
Application number: KR1020080131776
Application date: 2008-12-22
Applicant: 한국전자통신연구원
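Abstract: The present invention selects neologisms by prioritizing them according to whether each can represent a topic. Unlike conventional methods, which extract all neologisms of every type, select neologisms using the entire corpus, or have a person review each selected candidate by hand, the invention extracts keywords for the neologism candidates, detects and tracks the topics of the extracted keywords, determines a priority according to whether a candidate can serve as a representative term for its topic, and then selects neologisms in that priority order, so that neologism selection can be performed effectively.
Keywords: automatic translation system, neologism
-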
Publication number: KR101092361B1
Publication date: 2011-12-09
Application number: KR1020080131775
Application date: 2008-12-22
Applicant: 한국전자통신연구원
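Abstract: The present invention relates to a technique for generating target words in an automatic translation system whose target language is Chinese. In situations where accurate translation is difficult to achieve through statistics-based semantic disambiguation or syntactic transfer, the target word is determined according to the position at which the adverb is generated, using patterns and knowledge dictionaries for Chinese adverb generation such as multiple-adverb transfer patterns, adverb-predicate transfer patterns, a Chinese adverb negation-form position dictionary, and a Chinese compound verb dictionary. Target words corresponding to Chinese adverbs can thereby be generated effectively in an automatic translation system whose target language is Chinese.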
-
Publication number: KR1020110066359A
Publication date: 2011-06-17
Application number: KR1020090122977
Application date: 2009-12-11
Applicant: 한국전자통신연구원
CPC classification number: G06F17/277 , G06F17/218
Abstract: PURPOSE: An apparatus and method for extracting vocabulary patterns that include a syntactic node are provided to effectively extract vocabulary patterns suited to grammar units, by extracting such patterns from massive text using statistical and linguistic means. CONSTITUTION: Sentences that exceed a frequency threshold are removed from a text document (202). Sentences that do not exceed the frequency threshold are tagged (203). Vocabulary patterns are generated from the tagged sentences, and the frequency of each pattern is calculated (204). The vocabulary patterns are filtered. An example sentence is added for each vocabulary pattern (206). The vocabulary patterns are output in order of priority (207).
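The numbered steps in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the tagger is a hypothetical dictionary lookup (`toy_tag`), a "syntactic node" is approximated by replacing any word whose tag is in `node_tags` with a placeholder such as `<N>`, patterns are fixed-length windows, the filter keeps patterns that mix at least one node with at least one literal word and meet a minimum frequency, and priority is taken to be frequency. None of these choices, names, or thresholds are given in the abstract.

```python
from collections import Counter


def toy_tag(sentence, lexicon):
    """Hypothetical stand-in for a real tagger: dictionary lookup, default tag 'X'."""
    return [(word, lexicon.get(word, "X")) for word in sentence.split()]


def extract_patterns(sentences, lexicon, node_tags=frozenset({"N"}),
                     max_sentence_freq=2, min_pattern_freq=2, n=3):
    # Step 202: drop sentences repeated more often than the threshold.
    sentence_counts = Counter(sentences)
    kept = [s for s, c in sentence_counts.items() if c <= max_sentence_freq]

    freq = Counter()
    example = {}
    for sentence in kept:
        tagged = toy_tag(sentence, lexicon)              # step 203: tagging
        for i in range(len(tagged) - n + 1):             # step 204: patterns
            window = tagged[i:i + n]
            is_node = [tag in node_tags for _, tag in window]
            # Keep only windows mixing syntactic nodes and literal words.
            if not any(is_node) or all(is_node):
                continue
            pattern = tuple(f"<{tag}>" if node else word
                            for (word, tag), node in zip(window, is_node))
            freq[pattern] += 1
            example.setdefault(pattern, sentence)        # step 206: example

    # Filtering by frequency, then step 207: output by (assumed) priority.
    results = [(p, c, example[p]) for p, c in freq.items() if c >= min_pattern_freq]
    results.sort(key=lambda r: (-r[1], r[0]))
    return results
```

For instance, given the sentences "the price of oil rose" and "the cost of gas fell" with all four nouns tagged `N`, the sketch yields the shared patterns `<N> of <N>` and `the <N> of`, each with one example sentence attached.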
-
Publication number: KR1020110057631A
Publication date: 2011-06-01
Application number: KR1020090114107
Application date: 2009-11-24
Applicant: 한국전자통신연구원
CPC classification number: G06F17/277 , G06F17/2755 , G06F17/2785
Abstract: PURPOSE: A compound noun decision apparatus and method are provided to determine the scope of a compound noun by determining the semantic relation between nouns according to the semantic relations within a sentence. CONSTITUTION: A noun range recognition unit (102) selects the nouns to be bound into a phrase according to the morpheme analysis result of the input sentence. A semantic relation decision unit (106) uses semantic constraint conditions to analyze the semantic relation between noun and verb. A noun range decision unit (110) determines the phrase range according to the decision result.
-
Publication number: KR100975044B1
Publication date: 2010-08-11
Application number: KR1020080104184
Application date: 2008-10-23
Applicant: 한국전자통신연구원
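Abstract: The present invention relates to a sentence-constituent restoration apparatus and method that analyze the internal semantic structure of a compound noun made up of several nouns in sequence, convert it into sentence form without distorting the meaning, and automatically restore the sentence constituents omitted from the noun sequence, thereby generating a sentence free of semantic distortion. The invention makes it possible to generate expressions in other forms, based on the analyzed semantic structure of the compound noun, without distorting the meaning.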