Invention Grant
- Patent Title: Tokenization platform
- Patent Title (中): 令牌平台
- Application No.: US13572825
- Application Date: 2012-08-13
- Publication No.: US09195738B2
- Publication Date: 2015-11-24
- Inventor: Jignashu Parikh
- Applicant: Jignashu Parikh
- Applicant Address: US CA Sunnyvale
- Assignee: YAHOO! INC.
- Current Assignee: YAHOO! INC.
- Current Assignee Address: US CA Sunnyvale
- Agency: Pillsbury Winthrop Shaw Pittman LLP
- Main IPC: G06F17/27
- IPC: G06F17/27 ; G06F17/30

Abstract:
A tokenization platform and method are described for accurately tokenizing character strings, including but not limited to non-delimited character strings of the type commonly used in Internet domain names and computer filenames, to accurately identify words and phrases occurring therein. In one embodiment, a phased tokenization approach is used in which the final phase is a lexical analysis-based tokenization using a dictionary. The dictionary may be advantageously created and updated based upon one or more query logs associated with respective information retrieval systems, thereby ensuring that the dictionary accurately reflects currently-used terminology and captures alternative spellings and presentations of words and phrases submitted by users.
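
The phased approach outlined in the abstract might be sketched roughly as follows. This is an illustrative sketch only, not the patented method: the function names (`build_dictionary`, `segment`, `tokenize`), the query-log format (a list of query strings), the delimiter-splitting first phase, and the "fewest dictionary words" selection rule are all assumptions made for the example.

```python
import re
from collections import Counter


def build_dictionary(query_log):
    """Count the whitespace-separated terms seen in a query log.

    Assumption: the log is an iterable of raw query strings.
    """
    counts = Counter()
    for query in query_log:
        counts.update(query.lower().split())
    return counts


def segment(chunk, dictionary):
    """Split a non-delimited chunk into dictionary words.

    Dynamic programming over prefixes: best[i] holds the segmentation of
    chunk[:i] that uses the fewest dictionary words, or None if chunk[:i]
    cannot be covered by dictionary words. If the whole chunk cannot be
    covered, it is returned unchanged as a single token.
    """
    n = len(chunk)
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and chunk[j:i] in dictionary:
                candidate = best[j] + [chunk[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n] if best[n] is not None else [chunk]


def tokenize(text, dictionary):
    """Two-phase tokenization (the phases are assumed for illustration).

    Phase 1: split on explicit delimiters (anything that is not a-z or 0-9).
    Phase 2: dictionary-based segmentation of each remaining chunk.
    """
    chunks = [c for c in re.split(r"[^a-z0-9]+", text.lower()) if c]
    tokens = []
    for chunk in chunks:
        tokens.extend(segment(chunk, dictionary))
    return tokens


# Hypothetical query log used to seed the dictionary.
query_log = ["cheap flights", "new york hotels", "flight status"]
dictionary = build_dictionary(query_log)

print(tokenize("cheapflights.com", dictionary))
print(tokenize("NewYork_Hotels.txt", dictionary))
```

With this toy log, "cheapflights" is split into "cheap" and "flights", while "com" and "txt" pass through whole because they never appear in the log. Rebuilding the dictionary from newer logs would change which splits are found, which is the updating behavior the abstract attributes to the query-log-derived dictionary.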
Public/Granted literature:
- US20120310630A1 TOKENIZATION PLATFORM (Public/Granted date: 2012-12-06)