Invention Grant
US07966174B1 Automatic clustering of tokens from a corpus for grammar acquisition
In force
- Patent Title: Automatic clustering of tokens from a corpus for grammar acquisition
- Application No.: US12030935
- Application Date: 2008-02-14
- Publication No.: US07966174B1
- Publication Date: 2011-06-21
- Inventor: Srinivas Bangalore, Giuseppe Riccardi
- Applicant: Srinivas Bangalore, Giuseppe Riccardi
- Applicant Address: US GA Atlanta
- Assignee: AT&T Intellectual Property II, L.P.
- Current Assignee: AT&T Intellectual Property II, L.P.
- Current Assignee Address: US GA Atlanta
- Main IPC: G06F17/27
- IPC: G06F17/27

Abstract:
A system for recognizing patterns is disclosed. Grammar learning from a corpus includes, for the other non-context words, generating frequency vectors for each non-context token in a corpus based upon counted occurrences of a predetermined relationship of the non-context tokens to identified context tokens. Clusters are grown from the frequency vectors according to a lexical correlation or a cluster tree among the non-context tokens. The cluster tree is used for pattern recognition.
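
The steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: context tokens are taken to be the most frequent tokens, the "predetermined relationship" is taken to be adjacency (one position to the left or right), the lexical correlation is approximated by a Manhattan distance between normalized frequency vectors, and clusters are grown by greedy pairwise merging. The function names and the toy corpus are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequency_vectors(sentences, num_context=2, offsets=(-1, 1)):
    """Count, for each non-context token, how often each context token
    occurs at each relative offset from it."""
    counts = Counter(tok for s in sentences for tok in s)
    # Assumption: the most frequent tokens serve as context tokens.
    context = [tok for tok, _ in counts.most_common(num_context)]
    # One vector dimension per (context token, offset) pair.
    index = {}
    for c in context:
        for o in offsets:
            index[(c, o)] = len(index)
    vectors = {}
    for s in sentences:
        for pos, tok in enumerate(s):
            if tok in context:
                continue
            vec = vectors.setdefault(tok, [0] * len(index))
            for o in offsets:
                j = pos + o
                if 0 <= j < len(s) and s[j] in context:
                    vec[index[(s[j], o)]] += 1
    return context, vectors


def distance(u, v):
    """Manhattan distance between vectors normalized to unit sum
    (an assumed stand-in for the lexical correlation measure)."""
    su, sv = sum(u) or 1, sum(v) or 1
    return sum(abs(a / su - b / sv) for a, b in zip(u, v))


def grow_cluster_tree(vectors):
    """Greedily merge the two closest clusters until one tree remains.
    A tree is either a token (leaf) or a pair of subtrees."""
    clusters = sorted(vectors.items())
    if not clusters:
        return None
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: distance(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (tree_a, vec_a), (tree_b, vec_b) = clusters[i], clusters[j]
        # The merged cluster's vector is the sum of its members' counts.
        merged = ((tree_a, tree_b), [a + b for a, b in zip(vec_a, vec_b)])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Toy corpus (hypothetical), already split into tokens.
corpus = [
    "i want to fly to boston".split(),
    "i want to fly to denver".split(),
    "i need to fly to dallas".split(),
    "i need to drive to boston".split(),
]
context, vectors = frequency_vectors(corpus)
tree = grow_cluster_tree(vectors)
print(context)
print(tree)
```

In this toy corpus, tokens that appear in the same positions relative to the context tokens (for example "want" and "need", or "fly" and "drive") receive proportional frequency vectors and are therefore merged early, which is the kind of grouping a cluster tree is meant to expose.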