Invention Grant
- Patent Title: Content pattern based automatic document classification
- Application No.: US15713445
- Application Date: 2017-09-22
- Publication No.: US10713306B2
- Publication Date: 2020-07-14
- Inventor: Daran Cai, Nakul Garg, Michael Dobrzynski, Wei-Qiang Guo, Amit Khanna, Ning Xu
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Liang IP, PLLC
- Main IPC: G06F17/00
- IPC: G06F17/00; G06F16/93; G06F16/35; G06F16/31

Abstract:
Computer systems, devices, and associated methods of content pattern based automatic document classification are disclosed herein. In one embodiment, a method includes receiving a document and a sequence of words corresponding to a document class having a class label from a network storage. The method also includes determining a longest common subsequence of words between the words in the document and the sequence of words and calculating a similarity percentage between the document and the sequence of words based on the determined longest common subsequence. When the calculated similarity percentage is above a threshold, the class label corresponding to the document class is automatically applied to the received document in the network storage.
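The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not define how the similarity percentage is derived from the longest common subsequence, so this sketch assumes it is the LCS length divided by the number of words in the class's word sequence. The function names, the whitespace tokenization, and the default threshold of 80 are all hypothetical.

```python
from typing import Optional, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Return the length of the longest common subsequence of two word lists."""
    # Classic dynamic programming, keeping only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for word_a in a:
        curr = [0]
        for j, word_b in enumerate(b, start=1):
            if word_a == word_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity_percentage(document_words: Sequence[str],
                          pattern_words: Sequence[str]) -> float:
    """Assumed definition: share of the class pattern found, in order, in the document."""
    if not pattern_words:
        return 0.0
    return 100.0 * lcs_length(document_words, pattern_words) / len(pattern_words)


def classify(text: str, pattern: str, label: str,
             threshold: float = 80.0) -> Optional[str]:
    """Return the class label when similarity is above the (hypothetical) threshold."""
    # Assumption: lowercase whitespace tokenization stands in for real word extraction.
    score = similarity_percentage(text.lower().split(), pattern.lower().split())
    return label if score > threshold else None
```

For example, `classify("This Non-Disclosure Agreement is made between Acme and Bob", "non-disclosure agreement is made between", "NDA")` returns `"NDA"`, because every word of the pattern appears in the document in the same order. A document that matches only part of the pattern falls below the threshold and receives no label.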
Public/Granted literature
- US20190095439A1 CONTENT PATTERN BASED AUTOMATIC DOCUMENT CLASSIFICATION, Public/Granted day: 2019-03-28