COMPUTER IMPLEMENTED METHOD FOR INDEXING REFERENCE GENOME
    1.
    Invention application
    COMPUTER IMPLEMENTED METHOD FOR INDEXING REFERENCE GENOME (Under examination - Published)

    Publication number: WO2010104608A3

    Publication date: 2010-12-16

    Application number: PCT/US2010000772

    Filing date: 2010-03-15

    Inventor: ROTH, Chantal

    CPC classification number: G06F19/22

    Abstract: A method for indexing a reference genome is provided. The method includes selecting a reference genome to index, calculating a first minimum index region size, assigning a first position number to a first index region of the reference genome, assigning a second position number to a second index region of the reference genome, and storing the association of the first and second position numbers to index regions in a hash table. The size of the first index region can be greater than or equal to the first minimum index region size. The second index region can overlap with at least one base included in the first index region. The first minimum index region size can be calculated based on the reference genome size. In yet other embodiments of the present teachings, a method for mapping a sequence read to a reference genome is provided wherein a sequence read is compared to the index regions stored in the indexing hash table, and the sequence read is mapped to and aligned against a location on the reference genome. Systems configured to carry out the methods are also provided.
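    The indexing steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the minimum index region size, so the rule used here (the smallest k for which 4**k is at least the genome length) is only an assumption, and the names min_index_region_size and build_index are hypothetical. Using a step smaller than the region size makes each index region overlap the previous one by at least one base.

```python
from collections import defaultdict


def min_index_region_size(genome_length):
    """Assumed rule (not from the patent): smallest k with 4**k >= genome_length."""
    k = 1
    while 4 ** k < genome_length:
        k += 1
    return k


def build_index(reference, step=1):
    """Map each index region (a substring of length k) to its position numbers.

    A position number here is simply the 0-based start offset of the region.
    """
    k = min_index_region_size(len(reference))
    if not 1 <= step < k:
        # step < k guarantees that consecutive regions share at least one base
        raise ValueError("step must be at least 1 and smaller than the region size")
    index = defaultdict(list)  # the hash table: region sequence -> position numbers
    for pos in range(0, len(reference) - k + 1, step):
        index[reference[pos:pos + k]].append(pos)
    return k, dict(index)


if __name__ == "__main__":
    k, index = build_index("ACGTACGTTGCA")
    print(k, index)
```

    A region that occurs more than once in the reference simply accumulates several position numbers under the same hash key.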


    COMPUTER IMPLEMENTED METHOD FOR INDEXING REFERENCE GENOME
    2.
    Invention application
    COMPUTER IMPLEMENTED METHOD FOR INDEXING REFERENCE GENOME (Under examination - Published)

    Publication number: WO2010104608A2

    Publication date: 2010-09-16

    Application number: PCT/US2010/000772

    Filing date: 2010-03-15

    Inventor: ROTH, Chantal

    CPC classification number: G06F19/22

    Abstract: A method for indexing a reference genome is provided. The method includes selecting a reference genome to index, calculating a first minimum index region size, assigning a first position number to a first index region of the reference genome, assigning a second position number to a second index region of the reference genome, and storing the association of the first and second position numbers to index regions in a hash table. The size of the first index region can be greater than or equal to the first minimum index region size. The second index region can overlap with at least one base included in the first index region. The first minimum index region size can be calculated based on the reference genome size. In yet other embodiments of the present teachings, a method for mapping a sequence read to a reference genome is provided wherein a sequence read is compared to the index regions stored in the indexing hash table, and the sequence read is mapped to and aligned against a location on the reference genome. Systems configured to carry out the methods are also provided.
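    The read-mapping embodiment mentioned at the end of the abstract might look roughly like the following Python sketch. It is not the patented procedure: the fixed region size k, the voting over candidate start positions, the ungapped mismatch count, and the names index_regions and map_read are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def index_regions(reference, k):
    """Hash table from every overlapping length-k region to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return (start, mismatches) for the best ungapped placement, or None."""
    votes = Counter()
    # Compare each length-k piece of the read with the stored index regions.
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset  # implied start of the whole read
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1
    best = None
    # Check candidates against the reference, most-voted first.
    for start, _ in votes.most_common():
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (start, mismatches)
    return best


if __name__ == "__main__":
    reference = "TTGACCGTAAGCTAGGCATC"
    index = index_regions(reference, 4)
    print(map_read("GTAAGCTA", reference, index, 4))
```

    A production aligner would typically allow gaps and score alignments; the mismatch count here only stands in for that alignment step.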

