Invention Grant
- Patent Title: Matching strings in a large relational database
- Application No.: US16659506
- Application Date: 2019-10-21
- Publication No.: US11238104B2
- Publication Date: 2022-02-01
- Inventor: Mohammadreza Barouni Ebrahimi, Samaneh Bayat, Obidul Islam
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Law Office of Jim Boice
- Main IPC: G06F16/00
- IPC: G06F16/00; G06F16/903; G06F16/33; G06F16/901

Abstract:
A computer-implemented method identifies strings of data from a database. One or more processors receive data as an input string. The processor(s) generate a first binary code using a binary locality sensitive hashing of k-grams in the input string, where the binary locality sensitive hashing on the k-grams in the input string is derived from a first set of bi-grams in the input string, a second set of bi-grams in the input string, and a quantity of intersecting bi-grams from the first set of bi-grams and the second set of bi-grams. In response to receiving a search request for a particular string, the processor(s) generate a second binary code using a binary locality sensitive hashing on the particular string, and search a database in a query process. The processor(s) then rank and return a set of similar strings found in the database.
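
The flow in the abstract (hash bi-grams to a binary code, hash the query the same way, search, then rank) can be illustrated with a minimal sketch. The abstract does not specify the hash construction, so the code below swaps in 1-bit min-wise hashing over character bi-grams as a stand-in and ranks candidates by Jaccard similarity, which is computed from the two bi-gram sets and the size of their intersection. All function names, the 64-bit code length, and the `max_distance` threshold are hypothetical choices for illustration, not details taken from the patent.

```python
import hashlib


def bigrams(s):
    """Return the set of character bi-grams of a lower-cased string."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a, b):
    """Similarity from two bi-gram sets and their intersection count."""
    inter = len(a & b)
    union = len(a) + len(b) - inter
    return inter / union if union else 1.0


def _hash(seed, gram):
    # Deterministic seeded 64-bit hash (assumption: any stable hash family would do).
    digest = hashlib.blake2b(
        gram.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def binary_code(s, n_bits=64):
    """1-bit min-wise hashing: one bit per seeded hash function."""
    grams = bigrams(s)
    code = 0
    if not grams:
        return code
    for seed in range(n_bits):
        smallest = min(_hash(seed, g) for g in grams)
        code |= (smallest & 1) << seed  # keep only the lowest bit of the minimum
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def build_index(strings, n_bits=64):
    """Precompute a binary code for every stored string."""
    return {s: binary_code(s, n_bits) for s in strings}


def search(query, index, max_distance=24, top=5):
    """Filter by Hamming distance on the codes, then rank by Jaccard similarity."""
    q_code, q_grams = binary_code(query), bigrams(query)
    hits = [
        (jaccard(q_grams, bigrams(s)), s)
        for s, code in index.items()
        if hamming(q_code, code) <= max_distance
    ]
    hits.sort(key=lambda t: (-t[0], t[1]))
    return [s for _, s in hits[:top]]


index = build_index([
    "International Business Machines",
    "Internal Business Machine",
    "Law Office of Jim Boice",
])
results = search("international business machines", index)
```

In a relational database the codes would typically be stored in an indexed column so that the Hamming-distance filter avoids scanning every row; the linear scan here is only to keep the sketch short.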
Public/Granted literature
- US20200050639A1 MATCHING STRINGS IN A LARGE RELATIONAL DATABASE, Public/Granted day: 2020-02-13