-
Publication No.: KR100404320B1
Publication date: 2003-11-01
Application No.: KR1020000080992
Filing date: 2000-12-23
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
IPC: G06F17/30
Abstract: PURPOSE: A method for automatically indexing sentences is provided that increases the efficiency of index-term extraction by automatically indexing standardized Korean and English sentences with a pushdown automaton (PDA). CONSTITUTION: Sentences in a document are recognized (201). Stop words are registered in a stop-word database (203). The stop words registered in the database are extracted, and arbitrary characters are then substituted for them (202). The substituted sentence is read unit by unit; if an error occurs, an error message is transmitted. Index terms are extracted from the sentence by a PDA indexing engine (205). When processing of a sentence is complete, the next sentence in the document is recognized (204). The PDA stack is divided into a state stack, an index stack, and a symbol stack.
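The steps in the abstract (sentence recognition, stop-word substitution, PDA-based extraction, error reporting) can be pictured with a minimal Python sketch. It is not the patented method: the abstract does not disclose the grammar or the transition rules, so the placeholder character `#`, the state names, the rule "each run of tokens between placeholders is one index term", the "empty sentence" error, and every function name below are assumptions made for illustration only.

```python
import re

PLACEHOLDER = "#"  # assumed "arbitrary character" substituted for stop words


def split_sentences(document):
    """Steps 201/204: recognize the sentences of a document in order."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def substitute_stop_words(sentence, stop_words):
    """Step 202: replace every registered stop word with the placeholder."""
    tokens = re.findall(r"\w+", sentence.lower())
    return [PLACEHOLDER if t in stop_words else t for t in tokens]


def extract_index_terms(tokens):
    """Step 205: toy pushdown pass using state, symbol, and index stacks."""
    if not tokens:
        # Hypothetical error condition; the abstract does not name one.
        raise ValueError("no tokens to index")
    state_stack = ["START"]
    symbol_stack = []
    index_stack = []

    def reduce_term():
        # Pop the collected symbols and push them as a single index term.
        if symbol_stack:
            index_stack.append(" ".join(symbol_stack))
            symbol_stack.clear()
        if state_stack[-1] == "IN_TERM":
            state_stack.pop()

    for token in tokens:
        if token == PLACEHOLDER:
            reduce_term()
        else:
            if state_stack[-1] != "IN_TERM":
                state_stack.append("IN_TERM")
            symbol_stack.append(token)
    reduce_term()  # flush a term that ends the sentence
    return index_stack


def index_document(document, stop_words):
    """Run all steps; collect index terms and error messages separately."""
    terms, errors = [], []
    for sentence in split_sentences(document):
        try:
            tokens = substitute_stop_words(sentence, stop_words)
            terms.extend(extract_index_terms(tokens))
        except ValueError as exc:
            errors.append(f"{sentence!r}: {exc}")
    return terms, errors


if __name__ == "__main__":
    stop_words = {"the", "is", "a", "of", "are"}  # stand-in for database 203
    text = ("The index engine is a pushdown automaton. "
            "Stop words of the sentence are replaced.")
    print(index_document(text, stop_words))
```

Keeping three separate stacks mirrors the state/index/symbol split named in the abstract; a real engine would presumably drive them from language-specific transition rules rather than the single placeholder rule assumed here.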
-
Publication No.: KR1020020051596A
Publication date: 2002-06-29
Application No.: KR1020000080992
Filing date: 2000-12-23
Applicant: 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
IPC: G06F17/30
Abstract: PURPOSE: A method for automatically indexing sentences is provided that increases the efficiency of index-term extraction by automatically indexing standardized Korean and English sentences with a pushdown automaton (PDA). CONSTITUTION: Sentences in a document are recognized (201). Stop words are registered in a stop-word database (203). The stop words registered in the database are extracted, and arbitrary characters are then substituted for them (202). The substituted sentence is read unit by unit; if an error occurs, an error message is transmitted. Index terms are extracted from the sentence by a PDA indexing engine (205). When processing of a sentence is complete, the next sentence in the document is recognized (204). The PDA stack is divided into a state stack, an index stack, and a symbol stack.
-