Abstract:
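The present invention relates to a document summarization apparatus and method that use the domain information of an ontology, and aims to provide an apparatus and method for summarizing documents effectively using that domain information. To this end, a document summarization method according to an embodiment of the present invention includes: automatically building, by a document summarization apparatus, an ontology based on documents input for ontology construction; extracting, by the document summarization apparatus, nouns through morphological analysis of a document input for summarization; extracting, by the document summarization apparatus, the domains of the nouns from the ontology; and generating, by the document summarization apparatus, a summary composed of the domains. The invention thereby has the advantage of effectively generating a summary of a document using the domain information of the ontology. Keywords: ontology, domain, document summarization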
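The noun-to-domain lookup described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `ONTOLOGY` dictionary stands in for the automatically built ontology, `extract_nouns` replaces real morphological analysis with a simple dictionary filter, and ranking domains by frequency with a `top_k` cutoff is one possible reading of "a summary composed of the domains", not the patented procedure.

```python
from collections import Counter

# Hypothetical toy ontology mapping nouns to domains. The abstract describes
# building the ontology automatically; here it is hard-coded for illustration.
ONTOLOGY = {
    "striker": "sports",
    "goal": "sports",
    "league": "sports",
    "stock": "finance",
    "dividend": "finance",
}


def extract_nouns(text):
    """Stand-in for morphological analysis: keep tokens known to the ontology."""
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    return [token for token in tokens if token in ONTOLOGY]


def summarize_by_domain(text, top_k=2):
    """Return the domains of the extracted nouns, most frequent first."""
    domain_counts = Counter(ONTOLOGY[noun] for noun in extract_nouns(text))
    return [domain for domain, _ in domain_counts.most_common(top_k)]


if __name__ == "__main__":
    document = "The striker scored a goal in the league. The stock fell."
    print(summarize_by_domain(document))
```

For Korean input, the dictionary filter would need to be replaced by an actual morphological analyzer that returns the nouns of each sentence.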
Abstract:
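The present invention relates to a method for extracting information from structured documents, structure being a characteristic of HTML (Hyper Text Mark-up Language) web documents, and in particular to a semi-automatic structural information extraction technique for improving domain adaptability. The invention is based on machine learning in order to minimize user intervention, and includes a two-stage feature learning method in which the learning models are trained separately per domain and per site in order to improve adaptability across sites within a domain. According to the invention, even when only a small portion of a web site's data is manually tagged for training, the attributes of that web site can be extracted automatically in large quantities. Because a two-stage learning model is used, information learned from one site can also be applied to other sites in the same domain, which removes the burden of building new resources such as extraction patterns every time the site changes and thereby improves adaptability across sites within the same domain. Keywords: structural information, wrapper, XHTML (eXtensible Hyper Text Mark-up Language), DOM (Document Object Model)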
Abstract:
A method and an apparatus for retrieving multimedia content are provided that correctly analyze the meaning of a user's query during retrieval and thereby correctly retrieve the multimedia content corresponding to the query. A user's query is represented using a pointer that points to a specific region of an MPEG-7 document and a reference that refers to the pointer (10). The meaning of the query represented using the pointer and the reference is analyzed (20). Multimedia content corresponding to the query is retrieved according to the analysis result (30).
Abstract:
PURPOSE: A method for selectively realizing prosody for a specific form in a Korean dialogue text-to-speech system is provided, so that prosody suited to the dialogue context or the sentence type can be realized in varied ways by selectively extracting the corresponding speech segment from a synthesis unit DB. CONSTITUTION: When a pre-processed Korean dialogue sentence is input, speech act tagging is performed on the input sentence (S20). It is determined whether the speech-act-tagged input sentence contains a specific element whose prosody is to be selectively realized (S30). If it does, the specific element is tagged using a work tagging table according to the speech act information of the sentences preceding and following the one that contains the specific element (S40). If the input sentence contains no specific element of the same form whose prosody is to be selectively realized, it is determined whether the input sentence contains a question-type ending whose prosody is to be selectively realized (S50). If it does, the question-type ending is tagged using a question-type ending tagging table according to the question type (S60). If the input sentence contains no question-type ending, the text tagged for the specific element is output (S70). Once the tagged text for the specific element is output, the speech segment matching the tag is extracted from the synthesis unit DB (S80). The extracted speech segments are joined with the other speech segments to generate the synthesized dialogue speech (S90).
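The branching in steps S20 to S80 can be sketched in Python as follows. This is only an illustration of the control flow under stated assumptions: the element `ne`, the tag names, the table contents, the segment identifiers, and the punctuation-based speech act tagger are all hypothetical placeholders, and English sentences stand in for Korean dialogue.

```python
# All table contents are hypothetical placeholders, not data from the abstract.
SPECIFIC_ELEMENTS = {"ne"}  # assumed element whose prosody (metre) varies by context
WH_WORDS = {"who", "what", "where", "when", "why", "how"}

# S40: tag chosen from the speech acts of the preceding and following sentences.
ELEMENT_TAG_TABLE = {
    ("yn_question", "inform"): "ne/affirm",
    ("inform", "inform"): "ne/backchannel",
}
# S60: tag chosen from the question type.
QUESTION_ENDING_TABLE = {"yn_question": "ending/yn", "wh_question": "ending/wh"}
# S80: synthesis unit DB keyed by tag.
SYNTHESIS_UNIT_DB = {"ne/affirm": "seg_ne_01", "ne/backchannel": "seg_ne_02"}


def tag_speech_act(sentence):
    """S20: toy speech act tagger based on punctuation and the first word."""
    if not sentence.strip().endswith("?"):
        return "inform"
    first_word = sentence.split()[0].strip(".,?!").lower()
    return "wh_question" if first_word in WH_WORDS else "yn_question"


def tag_prosody(prev_sentence, sentence, next_sentence):
    """S30 to S60: return a prosody tag for the sentence."""
    words = [word.strip(".,?!").lower() for word in sentence.split()]
    if any(word in SPECIFIC_ELEMENTS for word in words):  # S30
        key = (tag_speech_act(prev_sentence), tag_speech_act(next_sentence))
        return ELEMENT_TAG_TABLE.get(key, "ne/default")  # S40
    speech_act = tag_speech_act(sentence)
    if speech_act in QUESTION_ENDING_TABLE:  # S50
        return QUESTION_ENDING_TABLE[speech_act]  # S60
    return "neutral"


def select_segment(tag):
    """S80: look up the speech segment that matches the tag."""
    return SYNTHESIS_UNIT_DB.get(tag, "seg_default")


if __name__ == "__main__":
    tag = tag_prosody("Are you coming?", "Ne.", "I will leave at five.")
    print(tag, select_segment(tag))
```

Keying the element table on the pair of neighbouring speech acts mirrors the abstract's statement that the tag depends on both the preceding and the following sentence.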
Abstract:
PURPOSE: A translation engine device for translating a source language into an object language, and a translation method therefor, are provided so that dialogue sentences can be used in several domain environments and the source language input by a user is accurately translated into the object language. CONSTITUTION: A mapping table (408) stores clusters of the object language mapped to clusters of the source language. A DTST (Direct Translation Sentence Table) direct translation unit (401) directly translates any input source language sentence that can be translated directly. A pre-processing module (402) performs morpheme analysis on the source language sentence, keeps its kernel words, hides the remaining portions, and thereby simplifies the structure of the sentence. A clustering unit (404) divides the source language sentence into clusters. A mapping unit (405) uses the mapping table (408) to determine the object language cluster mapped to each source language cluster. A post-processing and generating unit (406) reorders the object language clusters and restores them into a complete object language sentence.
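The pipeline of direct lookup, clustering, mapping, and reordering can be sketched in Python as below. The sketch rests on several assumptions that are not in the abstract: English as the source language and Korean as the object language, the tiny `DIRECT_TABLE` and `MAPPING_TABLE` contents, the role labels used for reordering, and greedy longest-match clustering. Pre-processing is reduced to normalization, so the hiding of non-kernel words is not modeled.

```python
# Hypothetical tables; a real system would hold many entries per domain.
DIRECT_TABLE = {"thank you": "감사합니다"}  # stands in for the DTST (401)
MAPPING_TABLE = {  # stands in for the mapping table (408)
    "i": ("나는", "subject"),
    "eat": ("먹는다", "verb"),
    "an apple": ("사과를", "object"),
}
TARGET_ORDER = {"subject": 0, "object": 1, "verb": 2}  # assumed object language order


def preprocess(sentence):
    """Simplified pre-processing (402): lowercase and strip end punctuation."""
    return sentence.lower().strip(" .!?")


def cluster(sentence):
    """Clustering (404): greedy longest match against the mapping table keys."""
    words = sentence.split()
    clusters, start = [], 0
    while start < len(words):
        for end in range(len(words), start, -1):
            candidate = " ".join(words[start:end])
            if candidate in MAPPING_TABLE:
                clusters.append(candidate)
                start = end
                break
        else:
            raise KeyError(f"no cluster found for {words[start]!r}")
    return clusters


def translate(sentence):
    """Direct translation (401), then mapping (405) and reordering (406)."""
    normalized = preprocess(sentence)
    if normalized in DIRECT_TABLE:
        return DIRECT_TABLE[normalized]
    mapped = [MAPPING_TABLE[source_cluster] for source_cluster in cluster(normalized)]
    mapped.sort(key=lambda pair: TARGET_ORDER[pair[1]])
    return " ".join(text for text, _ in mapped)


if __name__ == "__main__":
    print(translate("Thank you!"))
    print(translate("I eat an apple."))
```

Checking the direct table first keeps fixed expressions out of the cluster path, which matches the role the abstract gives to the direct translation unit.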
Abstract:
PURPOSE: A method and a device for generating a translated sentence using a word-level statistical method are provided to generate high-quality translated sentences at high speed by building and using an order information database extracted from a large target language corpus through the statistical method. CONSTITUTION: A training module (110) statistically extracts order information from the target language corpus and stores it. A morpheme analyzer (121) receives an original language sentence and analyzes its morphemes. A parameterizer (123) parameterizes the words of a first part of speech in the morpheme-divided original language sentence, hides the words of a second part of speech, and forms a morpheme-tagged sentence. A word arranger (125) receives the tagged sentence and replaces each morpheme with a target language word from a translation dictionary database (130). A recovery part (127) restores the original words of the parameterized part of speech, inserts them into the replaced target language sentence, and recovers the hidden words. A post-processor (129) removes the tags and outputs the original language sentence and the translated sentence based on the generation information.
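A rough Python sketch of the parameterize, replace, reorder, and recover flow is given below. It is an interpretation under stated assumptions, not the patented method: order information is modeled as counts of part-of-speech tag sequences in a tagged target corpus, nouns (`N`) are the parameterized class, determiners (`DET`) are the hidden class, the toy corpus and dictionary are invented, and hidden words are dropped rather than recovered because the assumed target language (Korean) has no articles.

```python
from collections import Counter
from itertools import permutations


def train_order_model(tagged_target_corpus):
    """Training (110): count the tag sequences seen in the target corpus."""
    return Counter(tuple(tag for _, tag in sentence) for sentence in tagged_target_corpus)


def best_order(tags, order_model):
    """Pick the permutation of tags seen most often in the target corpus."""
    candidates = sorted(set(permutations(tags)))  # sorted for deterministic ties
    return max(candidates, key=lambda candidate: order_model[candidate])


def translate(tagged_source, dictionary, order_model,
              param_tags=("N",), hidden_tags=("DET",)):
    """Parameterize, replace, reorder, then restore the parameterized words."""
    slots, skeleton = {}, []
    for word, tag in tagged_source:
        if tag in hidden_tags:  # hidden words are dropped in this sketch
            continue
        if tag in param_tags:  # parameterizer (123): replace with a placeholder
            name = f"{tag}{len(slots) + 1}"
            slots[name] = word
            skeleton.append((name, tag))
        else:  # word arranger (125): dictionary replacement
            skeleton.append((dictionary[word], tag))
    order = best_order([tag for _, tag in skeleton], order_model)
    items_by_tag = {}
    for item, tag in skeleton:
        items_by_tag.setdefault(tag, []).append(item)
    arranged = [items_by_tag[tag].pop(0) for tag in order]
    # Recovery (127): fill each placeholder with the translated original word.
    return " ".join(dictionary[slots[item]] if item in slots else item for item in arranged)


if __name__ == "__main__":
    corpus = [
        [("나는", "N"), ("사과를", "N"), ("먹는다", "V")],
        [("그는", "N"), ("물을", "N"), ("마신다", "V")],
    ]
    model = train_order_model(corpus)
    source = [("I", "N"), ("eat", "V"), ("an", "DET"), ("apple", "N")]
    lexicon = {"I": "나는", "eat": "먹는다", "apple": "사과를"}
    print(translate(source, lexicon, model))
```

Parameterizing the nouns keeps the reordering step working on a short tag skeleton, which is one plausible reason the abstract separates parameterization from dictionary replacement.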