Heuristic identification of shared substrings between text documents

Invention Grant

US12061637B2 Heuristic identification of shared substrings between text documents (Active)

Patent Title: Heuristic identification of shared substrings between text documents
Application No.: US17942174

Application Date: 2022-09-11
Publication No.: US12061637B2

Publication Date: 2024-08-13
Inventor: Mary Elizabeth Wahl, Amanda Leah Mercier, George Taylor Corbett
Applicant: Microsoft Technology Licensing, LLC
Applicant Address: US WA Redmond
Assignee: Microsoft Technology Licensing, LLC
Current Assignee: Microsoft Technology Licensing, LLC
Current Assignee Address: US WA Redmond
Agency: Calfee, Halter & Griswold LLP
Main IPC: G06F16/33
IPC: G06F16/33; G06F16/31; G06F16/338

Abstract:

Technologies for document evaluation and identification of shared textual substrings between documents are described herein. Documents are evaluated and organized according to textual elements within the documents. A suffix index is generated from a reference document. The suffix index is used to identify common substrings of text within query documents using variable evaluation windows within the query documents. Indications of overlapping textual information between the reference document and query documents are generated as an output.
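
The abstract does not spell out how the suffix index or the variable evaluation windows are implemented. As a rough illustration only, the sketch below assumes the suffix index is a plain sorted suffix array and that a window starts at a minimum width and is widened greedily while its text still occurs in the reference document. The function names and the `min_len` parameter are hypothetical and do not come from the patent.

```python
def build_suffix_index(reference: str) -> list[int]:
    """Return the start offsets of all suffixes of `reference`, sorted lexicographically."""
    # Naive construction (assumption: the reference is small enough for this to be fine).
    return sorted(range(len(reference)), key=lambda i: reference[i:])


def occurs_in_reference(reference: str, index: list[int], window: str) -> bool:
    """Binary-search the suffix index for a suffix that starts with `window`."""
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        # Compare only the first len(window) characters of the suffix.
        if reference[index[mid]:index[mid] + len(window)] < window:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(index) and reference.startswith(window, index[lo])


def shared_substrings(reference: str, query: str, min_len: int = 4) -> list[tuple[int, str]]:
    """Return (query_offset, substring) pairs for windows of `query` found in `reference`."""
    index = build_suffix_index(reference)
    matches = []
    start = 0
    while start + min_len <= len(query):
        end = start + min_len
        if not occurs_in_reference(reference, index, query[start:end]):
            start += 1  # nothing shared at the minimum width; slide the window by one
            continue
        # Widen the window while the longer text is still present in the reference.
        while end < len(query) and occurs_in_reference(reference, index, query[start:end + 1]):
            end += 1
        matches.append((start, query[start:end]))
        start = end  # resume scanning after the reported overlap
    return matches


if __name__ == "__main__":
    ref = "the quick brown fox jumps"
    qry = "a quick brown dog jumps high"
    for offset, text in shared_substrings(ref, qry, min_len=5):
        print(offset, repr(text))
```

The index is built once per reference document and reused for every window lookup. A production system would likely replace the naive sort with a linear-time suffix array or suffix tree construction, but the lookup logic would stay the same in spirit.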

Public/Granted literature

US20240086442A1 HEURISTIC IDENTIFICATION OF SHARED SUBSTRINGS BETWEEN TEXT DOCUMENTS Public/Granted day: 2024-03-14

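IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/30	. Unstructured textual data (document management systems: G06F 16/93)
G06F16/33	.. Querying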