Abstract:
PURPOSE: A translation engine device for translating a source language into an object language, and a translation method therefor, are provided to handle dialogue sentences in multiple domain environments and to accurately translate the source language input by a user into the object language. CONSTITUTION: A mapping table(408) stores object language clusters mapped to source language clusters. A DTST(Direct Translation Sentence Table) direct translation unit(401) directly translates those sentences in the input source language that can be translated directly. A pre-processing module(402) performs morpheme analysis on the source language sentence, preserves its kernel words, hides the remaining portions, and simplifies the sentence structure. A clustering unit(404) divides the source language sentence into clusters. A mapping unit(405) determines the object language cluster mapped to each source language cluster using the mapping table(408). A post-processing and generating unit(406) reorders the object language clusters and restores the object language output as a complete sentence.
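A minimal sketch of this cluster-and-map translation flow is given below. The table contents, tokenization, cluster size, and reordering rule are illustrative assumptions; the abstract does not fix an implementation.

```python
# Hypothetical sketch of the DTST -> pre-processing -> clustering -> mapping ->
# generation pipeline described above. All data are toy examples.

DTST = {"thank you.": "감사합니다."}                       # direct-translation sentence table
MAPPING_TABLE = {                                          # source cluster -> object cluster
    ("i", "want", "to"): "싶습니다",
    ("book", "a", "room"): "방을 예약하고",
}

def translate(source: str) -> str:
    # 1. DTST direct translation for sentences that can be translated as a whole.
    if source.lower() in DTST:
        return DTST[source.lower()]

    # 2. Pre-processing: stand-in for morpheme analysis that keeps kernel words
    #    and simplifies the sentence structure.
    tokens = source.lower().rstrip(".?!").split()

    # 3. Clustering: split the source sentence into clusters (fixed size here).
    clusters = [tuple(tokens[i:i + 3]) for i in range(0, len(tokens), 3)]

    # 4. Mapping: choose the object-language cluster for each source cluster.
    mapped = [MAPPING_TABLE.get(c, " ".join(c)) for c in clusters]

    # 5. Post-processing/generation: reorder the clusters (reversed here for the
    #    object language's word order) and restore a complete sentence.
    return " ".join(reversed(mapped)) + "."

print(translate("I want to book a room."))   # -> "방을 예약하고 싶습니다."
```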
Abstract:
A break prediction method reflecting both static and dynamic features, a text-to-speech system based on the method, and a method therefor are provided. The method combines a CART(Classification And Regression Tree) model of the static features with an HMM(Hidden Markov Model) of the dynamic features to generate a new break prediction model, and predicts, through the generated model, the break strength that best matches the meaning of the sentence. Text data are extracted from a text corpus(S210). Morphological analysis is performed on the extracted text data(S230). Feature parameters are extracted from the morphological analysis result(S240). The extracted text data are recorded as speech, and training data are constructed(S250). CART modeling is performed on the basis of the training data, and observation probabilities are calculated(S260). HMM modeling is performed on the basis of the training data, and transition probabilities are calculated(S270). A break prediction model is generated on the basis of the observation probabilities and the transition probabilities(S280). When a sentence is input, the break strength for the sentence is predicted through the break prediction model.
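As a sketch of how the two probability sources can be combined at prediction time, the following assumes Viterbi decoding over per-word observation probabilities (the CART side) and break-to-break transition probabilities (the HMM side). The break-strength labels and all probability values are invented for illustration; the abstract does not specify the decoding procedure.

```python
# Hypothetical sketch: decode a break-strength sequence by combining CART-style
# observation probabilities with HMM-style transition probabilities (Viterbi).

import math

STATES = ["no_break", "minor_break", "major_break"]

# P(break | word features), one distribution per word position (CART output stand-in).
observation = [
    {"no_break": 0.7, "minor_break": 0.2, "major_break": 0.1},
    {"no_break": 0.2, "minor_break": 0.5, "major_break": 0.3},
    {"no_break": 0.1, "minor_break": 0.3, "major_break": 0.6},
]

# P(break_t | break_{t-1}) estimated from training data (HMM transition stand-in).
transition = {
    "no_break":    {"no_break": 0.5, "minor_break": 0.4, "major_break": 0.1},
    "minor_break": {"no_break": 0.4, "minor_break": 0.4, "major_break": 0.2},
    "major_break": {"no_break": 0.6, "minor_break": 0.3, "major_break": 0.1},
}

def viterbi(obs):
    # Initialise with the first word's observation probabilities (log domain).
    score = {s: math.log(obs[0][s]) for s in STATES}
    back = []
    for o in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in STATES:
            best = max(prev, key=lambda p: prev[p] + math.log(transition[p][s]))
            score[s] = prev[best] + math.log(transition[best][s]) + math.log(o[s])
            pointers[s] = best
        back.append(pointers)
    # Trace back the best break-strength sequence.
    last = max(score, key=score.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

print(viterbi(observation))   # e.g. ['no_break', 'minor_break', 'major_break']
```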
Abstract:
The present invention relates to a method for predicting sentence-final intonation and to a speech synthesis method and system based on it. A sentence-final intonation prediction model is generated using the final ending of the sentence, which has the closest correlation with sentence-final intonation, and the generated model is used to produce the sentence-final intonation that best matches the meaning of an input conversational sentence, so that more natural conversational synthesized speech can be generated. Keywords: Text-to-Speech system, sentence-final intonation, modality, prosody, speech act, sentence type
Abstract:
A method for predicting sentence-final intonation, and a text-to-speech method and system based on it, are provided. The sentence-final intonation is generated using the final ending of the sentence, which has the strongest correlation with sentence-final intonation, so that the intonation best suited to the meaning of a conversational sentence is produced and more natural conversational synthesized speech is generated. The method for predicting a sentence-final intonation comprises the following steps: extracting text data from a conversational text corpus in consideration of the distribution of final ending types; performing sentence-final intonation tagging on the extracted text data based on a sentence-final intonation tag set defined by sentence-final intonation type(S520); performing modality tagging on the extracted text data based on a modality tagging table in which the final endings of sentences are classified by meaning(S530); performing speech-act tagging and sentence-type tagging on the extracted text data(S540); constructing training data from the text data for which the sentence-final intonation, modality, speech-act, and sentence-type tagging are completed; generating a sentence-final intonation prediction model by a statistical method based on the training data; and, when a conversational sentence is input, predicting the sentence-final intonation for the sentence through the prediction model and performing sentence-final intonation tagging(S550,S560).
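The abstract leaves the "statistical method" open; the following sketch assumes a simple conditional-frequency model that predicts an intonation tag from the tagged features. The tag sets, feature names, and training rows are illustrative only.

```python
# Hypothetical sketch: predict a sentence-final intonation tag from the final
# ending, modality, speech act, and sentence type by conditional frequency.

from collections import Counter, defaultdict

# Training rows: ((final_ending, modality, speech_act, sentence_type), intonation_tag)
training_data = [
    (("-ㅂ니까", "question", "request-info", "interrogative"), "H%"),   # rising
    (("-ㅂ니다", "statement", "inform", "declarative"), "L%"),          # falling
    (("-지요", "confirmation", "confirm", "interrogative"), "LH%"),     # fall-rise
    (("-ㅂ니까", "question", "request-info", "interrogative"), "H%"),
]

# Count intonation tags per feature tuple.
counts = defaultdict(Counter)
for features, tag in training_data:
    counts[features][tag] += 1

def predict_intonation(features, default="L%"):
    # Pick the most frequent tag seen with these features; back off to a default.
    if features in counts:
        return counts[features].most_common(1)[0][0]
    return default

print(predict_intonation(("-ㅂ니까", "question", "request-info", "interrogative")))  # H%
```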
Abstract:
The present invention relates to a dialogue-style speech synthesis system and method using speech act information. For expressions in a dialog text whose intonation needs to differ according to the context of the dialogue, tagging that distinguishes the intonation is performed using speech act information extracted from the utterance sentences of the two interlocutors. During speech synthesis, a speech signal with the intonation matching the tag is extracted from the speech database and used for synthesis, so that natural and varied intonation fitting the flow of the dialogue is realized. The interactive aspect of the dialogue can thus be expressed more vividly, and an improvement in the naturalness of the dialogue speech can be expected. Keywords: Dialog-style Text-to-Speech system, dialog text, speech synthesis, context, speech act, intonation
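A minimal sketch of this mechanism follows: the speech act of the utterance selects an intonation tag for a context-dependent expression, and the tag selects the matching recording from the speech database. All tag names, the example expression, and the database entries are hypothetical.

```python
# Hypothetical sketch of speech-act-driven intonation tagging and unit selection.

# Speech act of the utterance -> intonation tag for the ambiguous expression "네".
SPEECH_ACT_TO_TAG = {
    "answer-yes": "fall",      # plain affirmative answer
    "back-channel": "rise",    # acknowledging the other speaker
    "ask-again": "rise-fall",  # asking the interlocutor to repeat
}

# Speech database: (text, intonation tag) -> recorded unit (file name here).
SPEECH_DB = {
    ("네", "fall"): "ne_fall.wav",
    ("네", "rise"): "ne_rise.wav",
    ("네", "rise-fall"): "ne_risefall.wav",
}

def synthesize(text: str, speech_act: str) -> str:
    # 1. Tag the expression with the intonation implied by the speech act.
    tag = SPEECH_ACT_TO_TAG.get(speech_act, "fall")
    # 2. Select the unit with the matching intonation from the speech database.
    return SPEECH_DB[(text, tag)]

print(synthesize("네", "back-channel"))   # -> ne_rise.wav
```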
Abstract:
A device and a method for synthesizing speech are provided to realize voices of various styles with the voice database of a single voice actor, thereby vividly expressing conversational speech. Intimacy levels are defined(S10). Voice data recorded from text constructed for each intimacy level are stored(S20). At least one of the sentence-final intonation contour pattern, the intonation pattern of the primary intonation phrase in the sentence, and the pitch mean of the sentence is statistically modeled for each set of voice data to extract prosodic characteristics for each intimacy level(S30). Prosody models for each intimacy level are generated based on the extracted prosodic characteristics(S40).
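The following sketch illustrates the modeling step under simple assumptions: for each intimacy level, the pitch mean and the dominant sentence-final contour are collected from the recordings to form a per-level prosody model. The levels, feature values, and model structure are invented for illustration.

```python
# Hypothetical sketch: build a per-intimacy-level prosody model from recordings.

from collections import Counter
from statistics import mean

# Recorded utterances: intimacy level -> list of (pitch mean in Hz, final contour pattern).
recordings = {
    "formal":   [(190, "L%"), (185, "L%"), (200, "HL%")],
    "friendly": [(230, "H%"), (225, "LH%"), (240, "H%")],
}

def build_prosody_models(data):
    models = {}
    for level, utts in data.items():
        pitches = [p for p, _ in utts]
        contours = Counter(c for _, c in utts)
        models[level] = {
            "pitch_mean": mean(pitches),                       # average F0 for the level
            "final_contour": contours.most_common(1)[0][0],    # dominant final contour
        }
    return models

print(build_prosody_models(recordings))
```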