Invention Grant
- Patent Title: Gene sequencing data compression preprocessing, compression and decompression method, system, and computer-readable medium
- Application No.: US16618404
- Application Date: 2018-09-18
- Publication No.: US11551785B2
- Publication Date: 2023-01-10
- Inventor: Zhuo Song, Gen Li, Pengxia Liu, Zhenguo Wang, Bolun Feng
- Applicant: GENETALKS BIO-TECH (CHANGSHA) CO., LTD.
- Applicant Address: CN Hunan
- Assignee: GENETALKS BIO-TECH (CHANGSHA) CO., LTD.
- Current Assignee: GENETALKS BIO-TECH (CHANGSHA) CO., LTD.
- Current Assignee Address: CN Hunan
- Agency: JCIP Global Inc.
- Priority: CN201710982649.1 20171020, CN201710982666.5 20171020, CN201710982696.6 20171020
- International Application: PCT/CN2018/106192 WO 20180918
- International Announcement: WO2019/076177 WO 20190425
- Main IPC: G06F16/22
- IPC: G06F16/22; G16B30/00; G06F16/2455; G06F16/174; G16B20/00

Abstract:
The present invention discloses a gene sequencing data compression preprocessing method, compression and decompression methods, a system, and a computer-readable medium. The preprocessing method comprises: obtaining reference genome data; and obtaining a mapping relationship between each short string K-mer and a prediction character c, yielding a prediction data model P1 that contains every short string K-mer in the positive strand and negative strand of the reference genome together with the prediction character c in the corresponding adjacent position. The compression and decompression methods perform compression and decompression on the basis of the prediction data model P1. The system is a computer system including a program for executing the foregoing methods. The computer-readable medium includes a computer program for executing the foregoing methods. The invention is oriented towards lossless compression of gene sequencing data and provides fully effective information for a high-performance lossless compression and decompression algorithm for gene sequencing data.
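
The preprocessing step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`reverse_complement`, `build_prediction_model`), the use of the reverse complement as the negative strand, and the rule of keeping the most frequent following character when a K-mer occurs more than once are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter, defaultdict
from typing import Dict

# Complement table for the four standard bases; other symbols (e.g. N)
# pass through unchanged. This handling is an assumption of the sketch.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the negative strand, read 5' to 3' (assumed convention)."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_prediction_model(reference: str, k: int) -> Dict[str, str]:
    """Map each K-mer on both strands to a predicted next character c.

    Hypothetical tie-breaking rule: when a K-mer is followed by different
    characters at different positions, keep the most frequent one.
    """
    counts = defaultdict(Counter)
    for strand in (reference, reverse_complement(reference)):
        # Stop k characters before the end so every K-mer has a successor.
        for i in range(len(strand) - k):
            kmer = strand[i:i + k]
            counts[kmer][strand[i + k]] += 1
    return {kmer: c.most_common(1)[0][0] for kmer, c in counts.items()}


if __name__ == "__main__":
    model = build_prediction_model("ACGTAC", k=3)
    for kmer in sorted(model):
        print(kmer, "->", model[kmer])
```

A compressor built on such a model would presumably encode only the positions where a read's actual next character differs from the prediction, which is why the same model must be available at decompression time.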