Invention Grant
US08463598B2 Word detection (in force)

Word detection
Abstract:
Methods, systems, and apparatus, including computer program products, are provided in which data from web documents are partitioned into a training corpus and a development corpus. First word probabilities for words are determined from the training corpus, and second word probabilities for the same words are determined from the development corpus. Uncertainty values based on the word probabilities for the two corpora are compared, and new words are identified based on the comparison.
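
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: the abstract does not say how the corpora are split, how probabilities are estimated, or how the uncertainty values are defined and compared. The sketch assumes whitespace tokenization, relative-frequency probabilities, an entropy-style term -p * log2(p) as the uncertainty value, and a fixed gap threshold. The names `word_probabilities`, `uncertainty`, `detect_new_words`, `split`, and `min_gap` are hypothetical.

```python
import math
from collections import Counter


def word_probabilities(tokens):
    """Relative frequency of each word in a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def uncertainty(p):
    """Entropy-style term -p * log2(p); an assumed stand-in for the uncertainty value."""
    return -p * math.log2(p) if p > 0 else 0.0


def detect_new_words(documents, split=0.5, min_gap=0.05):
    """Flag words whose uncertainty value is higher in the development corpus."""
    # Partition the documents into a training corpus and a development corpus.
    cut = int(len(documents) * split)
    training = [w for doc in documents[:cut] for w in doc.split()]
    development = [w for doc in documents[cut:] for w in doc.split()]

    # First word probabilities (training) and second word probabilities (development).
    p_train = word_probabilities(training)
    p_dev = word_probabilities(development)

    # Compare the uncertainty values; the threshold rule is an assumption.
    new_words = []
    for word, p in p_dev.items():
        gap = uncertainty(p) - uncertainty(p_train.get(word, 0.0))
        if gap > min_gap:
            new_words.append(word)
    return sorted(new_words)
```

Calling `detect_new_words(["the cat sat", "the dog sat", "the blog sat", "the blog ran"])` treats the first two strings as the training corpus and the last two as the development corpus, and flags the words that appear only in the latter.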