Invention Grant
- Patent Title: System and method for identifying website verticals
- Patent Title (中): 用于识别网站纵向的系统和方法
- Application No.: US14180273
- Application Date: 2014-02-13
- Publication No.: US09330168B1
- Publication Date: 2016-05-03
- Inventor: Robert Brown, Tapan Kamdar, Ryan Kirkish, Wei-Cheng Lai, Jeff McLellan
- Applicant: Go Daddy Operating Company, LLC
- Applicant Address: US AZ Scottsdale
- Assignee: Go Daddy Operating Company, LLC
- Current Assignee: Go Daddy Operating Company, LLC
- Current Assignee Address: US AZ Scottsdale
- Agency: Quarles & Brady LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Systems and methods for the categorization of websites are presented. A website is categorized using one or a combination of its domain name and its web page content. The domain name is tokenized, and the tokens compared to categories in a category structure to determine probabilities that the token belongs to each category. Combinations of tokens are similarly compared to the categories. A category may be determined with reference to a vector space in which a training set of websites having known categories is converted according to a methodology into reference vectors containing keyword frequencies. A target website is converted to a target vector using the same methodology, and a distance score of the target vector to each reference vector is calculated. The website represented by the target vector is assigned the category of the reference vector having the lowest distance score.
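The vector-space step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name a distance metric or a keyword list, so Euclidean distance, the `VOCABULARY` list, the `TRAINING_SET` samples, and the function names `to_vector`, `distance`, and `categorize` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword list and training data; the abstract does not specify either.
VOCABULARY = ["pizza", "menu", "lawyer", "legal"]
TRAINING_SET = [
    ("pizza menu pizza delivery menu", "restaurant"),
    ("lawyer legal advice lawyer", "legal services"),
]


def to_vector(text, vocabulary):
    """Convert text into a vector of relative keyword frequencies."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[term] / total for term in vocabulary]


def distance(a, b):
    """Euclidean distance between two vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def categorize(target_text, training_set, vocabulary):
    """Return the category of the reference vector closest to the target."""
    target = to_vector(target_text, vocabulary)
    best_category = None
    best_score = math.inf
    for reference_text, category in training_set:
        # Reference and target use the same conversion, as the abstract requires.
        score = distance(target, to_vector(reference_text, vocabulary))
        if score < best_score:
            best_score = score
            best_category = category
    return best_category


print(categorize("best pizza in town see our menu", TRAINING_SET, VOCABULARY))
```

The key constraint taken from the abstract is that reference and target vectors are built with the same methodology, so their distance scores are comparable; the target then inherits the category of the lowest-scoring reference.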