Compressing, storing and searching sequence data

Invention Grant

US10777304B2 Compressing, storing and searching sequence data (in force)

Patent Title: Compressing, storing and searching sequence data
Application No.: US15657359

Application Date: 2017-07-24
Publication No.: US10777304B2

Publication Date: 2020-09-15
Inventor: Michael H. Baym, Bonnie Berger Leighton, Po-Ru Loh
Applicant: Michael H. Baym, Bonnie Berger Leighton, Po-Ru Loh
Agent: David H. Judson
Main IPC: G16B50/00
IPC: G16B50/00; H03M7/30

Abstract:

The redundancy in genomic sequence data is exploited by compressing sequence data in such a way as to allow direct computation on the compressed data using methods that are referred to herein as “compressive” algorithms. This approach reduces the task of computing on many similar genomes to only slightly more than that of operating on just one. In this approach, the redundancy among genomes is translated into computational acceleration by storing genomes in a compressed format that respects the structure of similarities and differences important to analysis. Specifically, these differences are the nucleotide substitutions, insertions, deletions, and rearrangements introduced by evolution. Once such a compressed library has been created, analysis is performed on it in time proportional to its compressed size, rather than having to reconstruct the full data set every time one wishes to query it.
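The idea in the abstract can be illustrated with a toy sketch. This is not the patented method: it assumes equal-length genomes that differ from a single reference only by substitutions, and every name in it (`CompressedLibrary`, `compress`, `search`, `decompress`) is hypothetical. Each genome is stored as a map of its differences from the reference, and an exact-match search scans the reference once, then rechecks only the windows that overlap a stored difference.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CompressedLibrary:
    """Toy compressed store: one reference plus per-genome substitutions."""
    reference: str
    # genome name -> {position: substituted base}; hypothetical layout
    edits: dict[str, dict[int, str]] = field(default_factory=dict)


def compress(reference: str, genomes: dict[str, str]) -> CompressedLibrary:
    """Store each genome as its differences from the reference."""
    lib = CompressedLibrary(reference)
    for name, seq in genomes.items():
        if len(seq) != len(reference):
            # Assumption of this sketch: no insertions or deletions.
            raise ValueError("this sketch handles substitutions only")
        lib.edits[name] = {
            i: b for i, (a, b) in enumerate(zip(reference, seq)) if a != b
        }
    return lib


def decompress(lib: CompressedLibrary, name: str) -> str:
    """Rebuild one genome; used here only to check the search results."""
    subs = lib.edits[name]
    return "".join(subs.get(i, base) for i, base in enumerate(lib.reference))


def find_all(text: str, pattern: str) -> set[int]:
    """Return every start position of pattern in text (overlaps allowed)."""
    hits, start = set(), text.find(pattern)
    while start != -1:
        hits.add(start)
        start = text.find(pattern, start + 1)
    return hits


def search(lib: CompressedLibrary, pattern: str) -> dict[str, list[int]]:
    """Exact-match search without reconstructing any genome in full."""
    m, n = len(pattern), len(lib.reference)
    ref_hits = find_all(lib.reference, pattern)  # done once for all genomes
    results = {}
    for name, subs in lib.edits.items():
        # Window start positions whose span covers at least one edit.
        touched = set()
        for pos in subs:
            touched.update(range(max(0, pos - m + 1), min(pos, n - m) + 1))
        # Untouched windows are identical to the reference, so reuse its hits.
        hits = ref_hits - touched
        for start in touched:
            window = "".join(
                subs.get(i, lib.reference[i]) for i in range(start, start + m)
            )
            if window == pattern:
                hits.add(start)
        results[name] = sorted(hits)
    return results


if __name__ == "__main__":
    library = compress(
        "ACGTACGTAC",
        {"g1": "ACGTACGTAC", "g2": "ACGAACGTAC"},
    )
    print(search(library, "ACGT"))
```

In this sketch the per-genome cost depends on the number of stored differences and the pattern length rather than on the genome length, which is the sense in which work tracks the compressed size. Handling insertions, deletions, rearrangements, and approximate matching would need a richer representation than the substitution map assumed here.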

Public/Granted literature

US20170323052A1 Compressing, storing and searching sequence data (publication date: 2017-11-09)

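IPC classification:

G	Physics
G16	Information and communication technology [ICT] specially adapted for specific application fields
G16B	Bioinformatics, i.e. ICT specially adapted for genetic or protein-related data processing in computational molecular biology
G16B50/00	ICT programming tools or database systems specially adapted for bioinformatics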