T-CELL RECEPTOR REPERTOIRE SELECTION PREDICTION WITH PHYSICAL MODEL AUGMENTED PSEUDO-LABELING

    Publication No.: WO2023069667A1

    Publication date: 2023-04-27

    Application No.: PCT/US2022/047346

    Filing date: 2022-10-21

    Abstract: Systems and methods for predicting T-Cell receptor (TCR)-peptide interaction, including training a deep learning model for the prediction of TCR-peptide interaction by determining a multiple sequence alignment (MSA) for TCR-peptide pair sequences from a dataset of TCR-peptide pair sequences using a sequence analyzer, building TCR structures and peptide structures using the MSA and corresponding structures from a Protein Data Bank (PDB) using a MODELLER, and generating an extended TCR-peptide training dataset based on docking energy scores determined by docking peptides to TCRs using physical modeling based on the TCR structures and peptide structures built using the MODELLER. TCR-peptide pairs are classified and labeled as positive or negative pairs using pseudo-labels based on the docking energy scores, and the deep learning model is iteratively retrained based on the extended TCR-peptide training dataset and the pseudo-labels until convergence.
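The pseudo-labeling step described in this abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the two thresholds, and the convention that a lower docking energy indicates more favorable binding are all assumptions made for illustration, and the docking and retraining steps themselves are left out.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[str, str]  # (TCR sequence, peptide sequence)


def pseudo_label(
    docking_energy: Dict[Pair, float],
    positive_below: float,
    negative_above: float,
) -> Dict[Pair, Optional[int]]:
    """Map docking energies to pseudo-labels: 1, 0, or None (left unlabeled).

    Both thresholds are hypothetical parameters, not values from the abstract.
    """
    if positive_below > negative_above:
        raise ValueError("positive_below must not exceed negative_above")
    labels: Dict[Pair, Optional[int]] = {}
    for pair, energy in docking_energy.items():
        if energy <= positive_below:
            labels[pair] = 1      # assumed: low energy means favorable binding
        elif energy >= negative_above:
            labels[pair] = 0      # assumed: high energy means no binding
        else:
            labels[pair] = None   # ambiguous score: keep out of the training set
    return labels


def extend_training_set(
    labeled: Dict[Pair, int],
    pseudo: Dict[Pair, Optional[int]],
) -> Dict[Pair, int]:
    """Add confidently pseudo-labeled pairs; never overwrite an existing label."""
    extended = dict(labeled)
    for pair, label in pseudo.items():
        if label is not None and pair not in extended:
            extended[pair] = label
    return extended
```

In an iterative scheme of the kind the abstract outlines, a model would be retrained on the output of `extend_training_set` and the cycle repeated until the labels or the loss stop changing.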

    LEARNING WEIGHTED-AVERAGE NEIGHBOR EMBEDDINGS

    Publication No.: WO2021062052A1

    Publication date: 2021-04-01

    Application No.: PCT/US2020/052577

    Filing date: 2020-09-24

    Abstract: Aspects of the present disclosure describe improving neural network robustness through neighborhood preserving layers and learning weighted-average neighbor embeddings. A method of training a neural network comprises modifying gradient backpropagation of weighted-average neighbor layer into input domain entries. The present disclosure may adapt certain manifold representation techniques to an online setting that advantageously affords practical real world benefits including uses in machine learning application for training neural networks in applications desiring dimension reduction, interpretability, smoothness, and acting as a form of regularization providing benefit against adversarial attack.

    METHODS AND SYSTEMS FOR QUICK AND EFFICIENT DATA MANAGEMENT AND/OR PROCESSING

    Publication No.: WO2008070484A3

    Publication date: 2008-06-12

    Application No.: PCT/US2007/085686

    Filing date: 2007-11-28

    Abstract: System(s) and method(s) are provided for data management and data processing. For example, various embodiments may include systems and methods relating to relatively larger groups of data being selected with comparable or better performing selection results (e.g. high data redundancy elimination and/or average chunk size). In various embodiments, the system(s) and method(s) may include, for example a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Various embodiments may include a first standard or typical data grouping, blocking, or chunking technique and/or data group, block or chunk combining technique and/or a data group, block, or chunk splitting technique. Exemplary system(s) and method(s) may relate to data hashing and/or data elimination. Embodiments may include a look-ahead buffer and determine whether to emit small chunks or large chunks based on characteristics of underlying data and/or particular application of the invention (e.g. for backup).

    CONTENT-LEVEL ANOMALY DETECTOR FOR SYSTEMS WITH LIMITED MEMORY

    Publication No.: WO2018231424A1

    Publication date: 2018-12-20

    Application No.: PCT/US2018/033335

    Filing date: 2018-05-18

    Abstract: Systems and methods for implementing content-level anomaly detection for devices having limited memory are provided. At least one log content model is generated (130) based on training log content of training logs obtained from one or more sources associated with the computer system. The at least one log content model is transformed (140) into at least one modified log content model to limit memory usage. Anomaly detection is performed (170) for testing log content of testing logs obtained from one or more sources associated with the computer system based on the at least one modified log content model. In response to the anomaly detection identifying one or more anomalies associated with the testing log content, the one or more anomalies are output (170).

    SYSTEM AND METHOD FOR NETWORK BANDWIDTH AWARE DISTRIBUTED LEARNING
    Type: Invention application
    Status: Under examination (published)

    Publication No.: WO2017058348A1

    Publication date: 2017-04-06

    Application No.: PCT/US2016/044352

    Filing date: 2016-07-28

    CPC classification number: G06N99/005 G06F9/46 H04L67/10

    Abstract: A machine learning method includes connecting machines in a data-center using a network aware model consistency for stochastic applications; ensuring a communication graph of all machines in the data-center is connected; propagating all updates uniformly across the cluster without update; and preferring connections to a machine with first network throughput over machines with second network throughput smaller than the first network throughput.
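One way to picture the two constraints in this abstract, a connected communication graph and a preference for higher-throughput peers, is the Python sketch below. It is only an illustration under stated assumptions: throughput is modeled as a single number per machine, connectivity is guaranteed by a ring, and `build_comm_graph`, `is_connected`, and `extra_links` are hypothetical names that do not come from the patent.

```python
from typing import Dict, List, Set


def build_comm_graph(
    throughput: Dict[str, float], extra_links: int = 1
) -> Dict[str, Set[str]]:
    """Return an undirected peer graph: a ring plus links to fast peers."""
    nodes: List[str] = sorted(throughput)
    graph: Dict[str, Set[str]] = {n: set() for n in nodes}
    if len(nodes) < 2:
        return graph
    # Ring links guarantee that the communication graph is connected.
    for i, node in enumerate(nodes):
        nxt = nodes[(i + 1) % len(nodes)]
        graph[node].add(nxt)
        graph[nxt].add(node)
    # Each machine then adds up to `extra_links` links, trying the
    # highest-throughput peers first and skipping existing links.
    by_speed = sorted(nodes, key=lambda n: throughput[n], reverse=True)
    for node in nodes:
        added = 0
        for peer in by_speed:
            if added == extra_links:
                break
            if peer != node and peer not in graph[node]:
                graph[node].add(peer)
                graph[peer].add(node)
                added += 1
    return graph


def is_connected(graph: Dict[str, Set[str]]) -> bool:
    """Depth-first search check that every machine is reachable."""
    if not graph:
        return True
    start = next(iter(graph))
    seen, stack = {start}, [start]
    while stack:
        for peer in graph[stack.pop()]:
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return len(seen) == len(graph)
```

The ring is a deliberately simple choice; any spanning structure would satisfy the connectivity requirement, and a real system would likely measure throughput per link rather than per machine.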

    METHODS AND SYSTEMS FOR DATA MANAGEMENT USING MULTIPLE SELECTION CRITERIA
    Type: Invention application
    Status: Under examination (published)

    Publication No.: WO2008067226A1

    Publication date: 2008-06-05

    Application No.: PCT/US2007/085357

    Filing date: 2007-11-21

    CPC classification number: G06F17/30159

    Abstract: Systems and methods for data management and data processing are provided. Embodiments may include systems and methods relating to fast data selection with reasonably high quality results, and may include a faster data selection function and a slower data selection function. Various embodiments may include systems and methods relating to data hashing and/or data redundancy identification and elimination for a data set or a string of data. In some embodiments, a first selection function is used to pre-select boundary points or data blocks/windows from a data set or data stream, and a second selection function is used to refine the boundary points or data blocks/windows. The second selection function may be better at determining the best places for boundary points or data blocks/windows in the data set or data stream. In various embodiments, data may be processed by a first, faster hash function and a second, slower, more discriminating hash function.
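The two-stage selection idea in this abstract can be sketched in Python as follows. This is an assumption-laden illustration, not the patented method: the fast pass is a rolling byte sum, the slower and more discriminating pass is SHA-1 over the same window, and the window size, bit masks, and function names are all chosen here for the example.

```python
import hashlib
from typing import List


def candidate_boundaries(data: bytes, window: int = 8, mask: int = 0x0F) -> List[int]:
    """Fast first pass: a rolling byte sum over a sliding window."""
    if len(data) < window:
        return []
    rolling = sum(data[:window])
    candidates = []
    for end in range(window, len(data) + 1):
        # `rolling` is the sum of data[end - window:end] at this point.
        if rolling & mask == 0:
            candidates.append(end)  # boundary falls just before index `end`
        if end < len(data):
            rolling += data[end] - data[end - window]
    return candidates


def refine_boundaries(
    data: bytes, candidates: List[int], window: int = 8, mask: int = 0x03
) -> List[int]:
    """Slower second pass: keep candidates whose SHA-1 digest also matches."""
    refined = []
    for end in candidates:
        digest = hashlib.sha1(data[end - window:end]).digest()
        if digest[-1] & mask == 0:
            refined.append(end)
    return refined


def split(data: bytes, boundaries: List[int]) -> List[bytes]:
    """Cut `data` at the given boundary offsets, keeping any trailing bytes."""
    chunks, start = [], 0
    for end in boundaries:
        if end > start:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Because the expensive hash runs only on positions the cheap pass has already accepted, most of the input is touched by the fast function alone, which is the trade-off the abstract points to.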

    METHODS AND SYSTEMS FOR QUICK AND EFFICIENT DATA MANAGEMENT AND/OR PROCESSING
    Type: Invention publication
    Status: Granted

    Publication No.: EP2089800A2

    Publication date: 2009-08-19

    Application No.: EP07868875.1

    Filing date: 2007-11-28

    Abstract: System(s) and method(s) are provided for data management and data processing. For example, various embodiments may include systems and methods relating to relatively larger groups of data being selected with comparable or better performing selection results (e.g. high data redundancy elimination and/or average chunk size). In various embodiments, the system(s) and method(s) may include, for example a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Various embodiments may include a first standard or typical data grouping, blocking, or chunking technique and/or data group, block or chunk combining technique and/or a data group, block, or chunk splitting technique. Exemplary system(s) and method(s) may relate to data hashing and/or data elimination. Embodiments may include a look-ahead buffer and determine whether to emit small chunks or large chunks based on characteristics of underlying data and/or particular application of the invention (e.g. for backup).

    METHODS AND SYSTEMS FOR DATA MANAGEMENT USING MULTIPLE SELECTION CRITERIA
    Type: Invention publication
    Status: Under examination (published)

    Publication No.: EP2087418A1

    Publication date: 2009-08-12

    Application No.: EP07854739.5

    Filing date: 2007-11-21

    CPC classification number: G06F17/30159

    Abstract: Systems and methods for data management and data processing are provided. Embodiments may include systems and methods relating to fast data selection with reasonably high quality results, and may include a faster data selection function and a slower data selection function. Various embodiments may include systems and methods relating to data hashing and/or data redundancy identification and elimination for a data set or a string of data. In some embodiments, a first selection function is used to pre-select boundary points or data blocks/windows from a data set or data stream, and a second selection function is used to refine the boundary points or data blocks/windows. The second selection function may be better at determining the best places for boundary points or data blocks/windows in the data set or data stream. In various embodiments, data may be processed by a first, faster hash function and a second, slower, more discriminating hash function.
