Matching strings in a large relational database

Invention Grant

US11238104B2 Matching strings in a large relational database (Active)


Patent Title: Matching strings in a large relational database
Application No.: US16659506

Application Date: 2019-10-21
Publication No.: US11238104B2

Publication Date: 2022-02-01
Inventor: Mohammadreza Barouni Ebrahimi, Samaneh Bayat, Obidul Islam
Applicant: International Business Machines Corporation
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Law Office of Jim Boice
Main IPC: G06F16/00
IPC: G06F16/00; G06F16/903; G06F16/33; G06F16/901


Abstract:

A computer-implemented method identifies strings of data from a database. One or more processors receive data as an input string. The processor(s) generate a first binary code using a binary locality sensitive hashing of k-grams in the input string, where the binary locality sensitive hashing on the k-grams in the input string is derived from a first set of bi-grams in the input string, a second set of bi-grams in the input string, and a quantity of intersecting bi-grams from the first set of bi-grams and the second set of bi-grams. In response to receiving a search request for a particular string, the processor(s) generate a second binary code using a binary locality sensitive hashing on the particular string, and search a database in a query process. The processor(s) then rank and return a set of similar strings found in the database.
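The abstract names three ingredients: bi-gram sets, the count of intersecting bi-grams, and a binary code used to search and rank. The Python sketch below illustrates that general idea only; it is not the patented method. It assumes character bi-grams, Jaccard similarity as the way the intersection count is used, and a SimHash-style 64-bit code compared by Hamming distance. All function names (`bigrams`, `jaccard`, `binary_code`, `hamming`, `search`) are hypothetical.

```python
import hashlib

BITS = 64  # assumed code length; the abstract does not specify one


def bigrams(s):
    """Return the set of lowercase character bi-grams in s."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a, b):
    """Intersecting bi-grams divided by the union of both bi-gram sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def binary_code(s):
    """SimHash-style code (an assumption): each bi-gram votes on every bit."""
    votes = [0] * BITS
    for g in bigrams(s):
        # First 8 bytes of MD5 give a deterministic 64-bit hash per bi-gram.
        h = int.from_bytes(hashlib.md5(g.encode("utf-8")).digest()[:8], "big")
        for i in range(BITS):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if votes[i] > 0)


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query, strings, top=3):
    """Rank stored strings by Hamming distance, then by Jaccard similarity."""
    q = binary_code(query)
    ranked = sorted(
        strings,
        key=lambda s: (hamming(q, binary_code(s)), -jaccard(query, s)),
    )
    return ranked[:top]
```

In a real database the binary codes would be computed once at insert time and stored in an indexed column, so that a query only hashes the search string and compares codes. Strings that share most of their bi-grams tend to receive codes that differ in few bits, which is what makes the Hamming ranking a plausible stand-in for direct string comparison.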

Public/Granted literature

US20200050639A1 MATCHING STRINGS IN A LARGE RELATIONAL DATABASE (Public/Granted day: 2020-02-13)


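IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models are classified in G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor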