-
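Publication No.: KR1020110066466A
Publication date: 2011-06-17
Application No.: KR1020090123135
Application date: 2009-12-11
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)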
CPC classification number: G06F17/2881 , G06F17/2836
Abstract: PURPOSE: A foreign language writing service method and system are provided that help a learner improve foreign language ability while composing in the foreign language, using an automatic translation function and an error estimation/correction function. CONSTITUTION: A language input unit (100) accepts a learner's sentence in which the foreign language and the native language are mixed. A native language translation unit (102) recognizes the native-language parts of the mixed sentence and translates them. A combined sentence completion unit (104) combines the native-language translation result with the foreign-language part of the mixed sentence. A combined sentence output unit (106) outputs the combined sentence. An error estimation unit (108) estimates errors in the combined sentence and outputs the estimation result through the combined sentence output unit.
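The translate-and-combine flow in this abstract can be illustrated with a minimal Python sketch. It assumes Korean is the native language and English the foreign one; the toy lexicon, the regular expression, and the function names are hypothetical stand-ins, not taken from the patent, and a real system would call a machine translation engine instead of a dictionary lookup.

```python
import re

# Hypothetical toy lexicon standing in for the native language translation unit (102).
TOY_LEXICON = {"도서관": "the library", "어제": "yesterday"}

# One or more Hangul-syllable words separated by whitespace
# (assumption: Korean is the native language).
HANGUL_RUN = re.compile(r"[\uac00-\ud7a3]+(?:\s+[\uac00-\ud7a3]+)*")


def translate_native(span: str) -> str:
    """Translate a native-language span word by word; unknown words are kept and flagged."""
    return " ".join(TOY_LEXICON.get(word, f"<{word}?>") for word in span.split())


def complete_sentence(mixed: str) -> str:
    """Replace every native-language span with its translation and keep the rest unchanged."""
    return HANGUL_RUN.sub(lambda m: translate_native(m.group(0)), mixed)


print(complete_sentence("I went to 도서관 어제"))  # I went to the library yesterday
```

Words missing from the lexicon come back wrapped as `<word?>`, which gives a later error estimation step something concrete to flag.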
-
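Publication No.: KR1020110057583A
Publication date: 2011-06-01
Application No.: KR1020090114046
Application date: 2009-11-24
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)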
CPC classification number: G06F17/2863 , G06F17/271 , G06F17/2755 , G06F17/2785 , G06F17/2809 , G06F17/30616 , G06F17/30625
Abstract: PURPOSE: A method and apparatus for analyzing and generating predicates for Chinese-Korean machine translation are provided to generate accurate Korean sentences by distinguishing past and present tenses. CONSTITUTION: A Chinese morpheme analyzing unit (101) segments the Chinese words in a Chinese sentence. A Chinese morpheme analyzing unit (102) analyzes the Chinese sentence and generates a Chinese syntactic tree. A Chinese/Korean selecting unit (103) converts the Chinese syntactic tree into a Korean syntactic tree. A Korean generating unit (105) rearranges the Korean words by using the Korean syntactic tree.
-
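Publication No.: KR1020110050296A
Publication date: 2011-05-13
Application No.: KR1020090107214
Application date: 2009-11-06
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)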
CPC classification number: G06F17/2827 , G06F17/30876
Abstract: PURPOSE: A parallel corpus extraction system and method are provided that automatically extract a parallel corpus from web documents at predetermined intervals, reducing the cost of building a parallel corpus database. CONSTITUTION: A text extractor (130) extracts text from a translated document and an original document (20). A paragraph extractor (140) extracts translated paragraphs, and extracts original paragraphs from the text of the original document. A sentence extractor (150) extracts translated sentences and original sentences. A corpus extractor (160) extracts the parallel corpus.
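The text, paragraph, sentence, and corpus stages described here can be sketched in Python as follows. The abstract does not say how paragraphs and sentences are aligned, so this sketch assumes the simplest possible rule: units are paired by position, and any unit whose counts disagree is skipped. All function names are hypothetical.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]


def split_sentences(paragraph: str) -> list[str]:
    """Split a paragraph after sentence-final punctuation (a deliberately naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def extract_parallel_corpus(original: str, translated: str) -> list[tuple[str, str]]:
    """Pair sentences by position; skip any unit whose counts disagree (assumed rule)."""
    orig_paras = split_paragraphs(original)
    trans_paras = split_paragraphs(translated)
    if len(orig_paras) != len(trans_paras):
        return []
    pairs = []
    for o_par, t_par in zip(orig_paras, trans_paras):
        o_sents, t_sents = split_sentences(o_par), split_sentences(t_par)
        if len(o_sents) == len(t_sents):
            pairs.extend(zip(o_sents, t_sents))
    return pairs


original = "Hello world. How are you?\n\nBye."
translated = "안녕 세상. 잘 지내니?\n\n안녕히."
for pair in extract_parallel_corpus(original, translated):
    print(pair)
```

Skipping mismatched units trades recall for precision, which suits a corpus that is rebuilt periodically from fresh web documents.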
-
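Publication No.: KR1020100073181A
Publication date: 2010-07-01
Application No.: KR1020080131775
Application date: 2008-12-22
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)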
CPC classification number: G06F17/2863 , G06F17/2765 , G06F17/289
Abstract: PURPOSE: A substitute adverb generating device and method are provided that, in automatic translation into Chinese, determine where an adverb is generated within a predicate phrase and select the substitute word for that position, thereby resolving semantic ambiguity. CONSTITUTION: A multi-adverb pattern applying unit (102) recognizes multiple adverbs within an input predicate phrase and selects the substitute multi-adverb. A negative type recognizing and generating unit (108) recognizes a negative adverb and selects the substitute negative adverb. An adverb-predicate pattern applying unit (112) determines the adverb generation position through an adverb-predicate pattern and selects the substitute adverb.
-
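Publication No.: KR1020100072388A
Publication date: 2010-07-01
Application No.: KR1020080130781
Application date: 2008-12-22
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)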
IPC: G06F17/40
CPC classification number: G06F17/3089 , G06F17/2705
Abstract: PURPOSE: A translation word extraction device is provided that collects web news and extracts translation words for neologisms or foreign words appearing within brackets and quotation marks in the collected news, so that a translation dictionary can be built from the extracted translation words. CONSTITUTION: A web news collection unit (102) collects RSS (Really Simple Syndication) news lists in real time and extracts the web news corresponding to the collected lists. A translation word extractor (104) splits the sentences of the extracted web news at brackets and quotation marks, extracts the word boundary corresponding to a bracket using an LCS (Longest Common Substring) algorithm, and extracts a translation pair according to the extracted word boundary.
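The abstract names the longest common substring algorithm but does not say which strings it compares. The Python sketch below shows a standard dynamic-programming implementation and one hypothetical way to use it: when two sentences carry the same bracketed gloss, the longest common substring of the text before each bracket is taken as the term boundary. That pairing heuristic, the regular expression, and the function names are assumptions made for illustration, not the patented method.

```python
import re

# Text before a bracket (group 1) and the bracketed gloss (group 2).
BRACKET = re.compile(r"([^()]*)\(([^()]+)\)")


def longest_common_substring(a: str, b: str) -> str:
    """Standard O(len(a) * len(b)) dynamic program over common-suffix lengths."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def extract_pair(sent_a: str, sent_b: str):
    """Return (term, gloss) if both sentences share a bracketed gloss, else None."""
    ma, mb = BRACKET.search(sent_a), BRACKET.search(sent_b)
    if not ma or not mb or ma.group(2) != mb.group(2):
        return None
    term = longest_common_substring(ma.group(1), mb.group(1)).strip()
    return (term, ma.group(2)) if term else None


print(extract_pair("새로 나온 넷북(netbook)은 싸다", "올해 넷북(netbook) 판매가 늘었다"))
```

One limitation of this heuristic is that the longest shared substring is not guaranteed to sit directly before the bracket, so a production system would need an additional adjacency check.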
-
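Publication No.: KR1020100072384A
Publication date: 2010-07-01
Application No.: KR1020080130777
Application date: 2008-12-22
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)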
CPC classification number: G06F17/2705 , G06F17/18 , G06F17/2755 , G06F17/2863 , G06F17/289
Abstract: PURPOSE: A method and device for generating Korean connectives for Chinese-Korean machine translation are provided to generate connectives that are explicit from the Korean point of view and to produce Korean sentences, thereby yielding high-quality Korean output. CONSTITUTION: A Chinese morpheme analyzer (101) selects the optimal morphological analysis for each Chinese word in a Chinese input sentence. A Chinese syntax analyzer generates a Chinese syntactic tree. A Chinese-Korean converter (103) converts the Chinese syntactic tree into a Korean syntactic tree. If the logical connection between clauses is not explicitly marked, a connective ending determining unit (104) generates the inter-clause connective ending by using a connective knowledge DB (106). A Korean generator (105) generates natural Korean.
-
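Publication No.: KR100956794B1
Publication date: 2010-05-11
Application No.: KR1020080084626
Application date: 2008-08-28
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)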
IPC: G06F17/28
CPC classification number: G06F17/2827 , G06F17/2775
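Abstract: The present invention relates to a translation apparatus that applies multi-level verb phrase patterns, and to an application method and an extraction method for it. Applying a multi-level verb phrase pattern matching technique improves translation performance, and a high-performance machine translation apparatus can be built by applying the verb phrase patterns in multiple levels and extracting the multi-level patterns automatically. The apparatus and its application and extraction methods are used for conversion from a source language to a target language, and can resolve the ambiguity that arises between different languages in lexical aspects and in structural aspects such as word order.
Keywords: verb phrase pattern, multi-level, matching, application
-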
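Publication No.: KR100911621B1
Publication date: 2009-08-12
Application No.: KR1020070133677
Application date: 2007-12-18
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)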
IPC: G06F17/28
CPC classification number: G06F17/2872 , G06F17/2818 , G06F17/2863
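Abstract: The present invention relates to a hybrid automatic translation technique that combines the advantages of Pattern Based Machine Translation (PBMT) with those of Statistical Machine Translation (SMT). The invention comprises the steps of: generating a morphological analysis result for a Korean sentence using a morphological analyzer; generating a syntactic analysis result using a parser that takes the morphological analysis result as input; correcting the analysis result of the source text using a source-side translation manager; performing sentence segmentation within the source-side translation manager; performing sentence pattern matching within the source-side translation manager; performing paraphrasing within the source-side translation manager; generating a translation result in a PBMT generator; calling an SMT translation result from the PBMT generator; generating a translation result in the SMT using the corrected source analysis result; generating a final translation result in a target-side translation manager; generating final target sentence candidates in a target sentence synthesizer using the PBMT and SMT translation results; and evaluating the candidates generated by the target sentence synthesizer and selecting the most appropriate one. According to the invention, first, Korean sentences can be segmented accurately; second, segmentation improves translation speed; third, segmentation improves translation performance; fourth, paraphrasing the input sentence improves analysis and translation performance; and fifth, developing a target sentence selector ultimately yields better translation results.
Keywords: statistical machine translation, pattern-based machine translation, paraphrasing, sentence segmentation, target sentence selection
-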
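Publication No.: KR100886687B1
Publication date: 2009-03-04
Application No.: KR1020070129360
Application date: 2007-12-12
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)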
CPC classification number: G06F17/2863 , G06F17/218 , G06F17/2247 , G06F17/2715 , G06F17/2755 , G06F17/277
Abstract: A method and an apparatus for automatically detecting unregistered words in Chinese are provided, which extract unregistered words from a web document to be translated by using HTML tag information, statistical information, monosyllable token information, and the like. On receiving a web document that contains Chinese sentences, a removing unit (102) removes its HTML tags, and a tag classification unit (104) classifies each sentence in the document according to whether meta tag processing or general tag processing applies. An extracting unit (106) using general tags includes a monosyllable-based extracting module (116), which extracts unregistered words on the basis of monosyllable tokens, and a verb-based extracting module (118), which extracts unregistered verbs consisting of 4 syllables. An extracting unit (108) using meta tags extracts unregistered words by using the words included in the meta tag information. A morpheme analyzing unit (110) analyzes morphemes and outputs the results, and a radix-based extracting module (114) extracts unregistered words based on radixes by using those results.
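The tag removal and monosyllable-based extraction steps can be approximated in a few lines of Python. The sketch assumes a common heuristic that the abstract does not spell out: a word segmenter that lacks a word in its dictionary tends to break it into single-character tokens, so a run of two or more such tokens is a candidate unregistered word. The regular expression, the minimum run length, and the function names are hypothetical.

```python
import re

# Naive tag pattern, adequate for a sketch; real HTML calls for a proper parser.
TAG = re.compile(r"<[^>]+>")


def strip_tags(html: str) -> str:
    """Replace every HTML tag with a space."""
    return TAG.sub(" ", html)


def monosyllable_candidates(tokens: list[str], min_run: int = 2) -> list[str]:
    """Join runs of at least `min_run` consecutive one-character tokens into candidates."""
    candidates, run = [], []
    for tok in tokens + [""]:  # the empty sentinel flushes the final run
        if len(tok) == 1:
            run.append(tok)
        else:
            if len(run) >= min_run:
                candidates.append("".join(run))
            run = []
    return candidates


# Tokens as a dictionary-based segmenter might emit them (hand-written example input).
print(monosyllable_candidates(["今天", "奥", "巴", "马", "访问", "北京"]))  # ['奥巴马']
```

Isolated single-character tokens are ignored because many Chinese function words are legitimately one character long.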
-
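Publication No.: KR1020080052318A
Publication date: 2008-06-11
Application No.: KR1020070093689
Application date: 2007-09-14
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)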
CPC classification number: G06F17/289 , G06F17/2785
Abstract: A method and an apparatus for selecting a translated word in machine translation are provided to obtain more natural translation quality by applying a translated word selection scheme, and to make it easier and cheaper to get the desired information from documents written in foreign languages. The method comprises the following steps. A co-occurring word class information applying unit checks, by using a co-occurring word class information database, whether the target word for a word of the source sentence can be determined as the translated word (S300). If the target word cannot be determined from the co-occurring word class information database (S301), a meaning determining unit resolves the semantic ambiguity of the target word (S302). If the meaning determining unit resolves the ambiguity, or if a target noun has more than 2 translated words that share the same meaning code, a translated word determining unit selects the optimal translated word by using a target language local context statistics database built from a target language corpus, in order to select a more precise translated word (S303, S304).
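The staged selection in steps S300 to S304 can be sketched in Python as a lookup followed by a statistical fallback. Every table below is a hypothetical toy stand-in for the databases named in the abstract, and the meaning determination step (S302) is omitted for brevity, so this shows only the first and last stages.

```python
from collections import Counter

# Hypothetical stand-ins for the co-occurring word class information database.
WORD_CLASS = {"먹다": "EAT_VERB"}
COOC_CLASS_DB = {("배", "EAT_VERB"): "pear"}

# Hypothetical candidate translations and target-language local context counts.
CANDIDATES = {"배": ["ship", "boat", "pear", "stomach"]}
LOCAL_CONTEXT_COUNTS = Counter({("ship", "sail"): 40, ("boat", "sail"): 25})


def select_translation(source_word, source_context, target_context):
    """Pick a translated word: class lookup first, then local context statistics."""
    # Stage 1 (S300): try the co-occurring word class database.
    for ctx_word in source_context:
        key = (source_word, WORD_CLASS.get(ctx_word))
        if key in COOC_CLASS_DB:
            return COOC_CLASS_DB[key]
    # Fallback (S303, S304): score candidates by co-occurrence with target context words.
    candidates = CANDIDATES.get(source_word, [])
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda cand: sum(LOCAL_CONTEXT_COUNTS[(cand, t)] for t in target_context),
    )


print(select_translation("배", ["먹다"], []))        # pear
print(select_translation("배", ["타다"], ["sail"]))  # ship
```

Placing the class lookup first keeps the statistical model as a fallback, so hand-built knowledge decides whenever it applies.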