SELF-SUPERVISED SEQUENTIAL VARIATIONAL AUTOENCODER FOR DISENTANGLED DATA GENERATION

    Publication No.: WO2021096739A1

    Publication Date: 2021-05-20

    Application No.: PCT/US2020/058857

    Application Date: 2020-11-04

    Abstract: A computer-implemented method is provided for disentangled data generation. The method includes accessing (410), by a variational autoencoder, a plurality of supervision signals. The method further includes accessing (420), by the variational autoencoder, a plurality of auxiliary tasks that utilize the supervision signals as reward signals to learn a disentangled representation. The method also includes training (430) the variational autoencoder to disentangle a sequential data input into a time-invariant factor and a time-varying factor using a self-supervised training approach which is based on outputs of the auxiliary tasks obtained by using the supervision signals to accomplish the plurality of auxiliary tasks.
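
    As a rough illustration of the split described above, the following is a minimal sketch, not the patented architecture: the LSTM encoder, layer sizes, Gaussian priors, and the plain reconstruction-plus-KL loss are assumptions, and the self-supervised auxiliary-task losses are only marked with a comment.

    import torch
    import torch.nn as nn

    class DisentangledSeqVAE(nn.Module):
        def __init__(self, x_dim=64, h_dim=128, zf_dim=16, zt_dim=16):
            super().__init__()
            self.rnn = nn.LSTM(x_dim, h_dim, batch_first=True)
            self.static_head = nn.Linear(h_dim, 2 * zf_dim)    # mu, logvar of time-invariant z_f
            self.dynamic_head = nn.Linear(h_dim, 2 * zt_dim)   # mu, logvar of time-varying z_t per step
            self.decoder = nn.Sequential(
                nn.Linear(zf_dim + zt_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

        @staticmethod
        def reparameterize(mu, logvar):
            return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

        def forward(self, x):                                  # x: (batch, time, x_dim)
            h, _ = self.rnn(x)                                 # per-step hidden states
            zf_mu, zf_logvar = self.static_head(h[:, -1]).chunk(2, dim=-1)
            zt_mu, zt_logvar = self.dynamic_head(h).chunk(2, dim=-1)
            z_f = self.reparameterize(zf_mu, zf_logvar)        # time-invariant factor
            z_t = self.reparameterize(zt_mu, zt_logvar)        # time-varying factor
            z = torch.cat([z_f.unsqueeze(1).expand(-1, x.size(1), -1), z_t], dim=-1)
            x_hat = self.decoder(z)
            kl_f = -0.5 * (1 + zf_logvar - zf_mu.pow(2) - zf_logvar.exp()).sum(-1).mean()
            kl_t = -0.5 * (1 + zt_logvar - zt_mu.pow(2) - zt_logvar.exp()).sum(-1).sum(-1).mean()
            recon = ((x - x_hat) ** 2).sum(-1).sum(-1).mean()
            return x_hat, recon + kl_f + kl_t                  # auxiliary-task losses would be added here

    model = DisentangledSeqVAE()
    x = torch.randn(8, 20, 64)                                 # toy batch: 8 sequences of 20 steps
    x_hat, loss = model(x)
    loss.backward()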

    T-CELL RECEPTOR REPERTOIRE SELECTION PREDICTION WITH PHYSICAL MODEL AUGMENTED PSEUDO-LABELING

    Publication No.: WO2023069667A1

    Publication Date: 2023-04-27

    Application No.: PCT/US2022/047346

    Application Date: 2022-10-21

    Abstract: Systems and methods for predicting T-Cell receptor (TCR)-peptide interaction, including training a deep learning model for the prediction of TCR-peptide interaction by determining a multiple sequence alignment (MSA) for TCR-peptide pair sequences from a dataset of TCR-peptide pair sequences using a sequence analyzer, building TCR structures and peptide structures using the MSA and corresponding structures from the Protein Data Bank (PDB) using MODELLER, and generating an extended TCR-peptide training dataset based on docking energy scores determined by docking peptides to TCRs using physical modeling based on the TCR structures and peptide structures built using MODELLER. TCR-peptide pairs are classified and labeled as positive or negative pairs using pseudo-labels based on the docking energy scores, and the deep learning model is iteratively retrained based on the extended TCR-peptide training dataset and the pseudo-labels until convergence.
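
    The following is a hedged sketch of the pseudo-labeling loop only. The docking_energy and featurize functions are placeholders (a real pipeline would call a docking tool and the deep model), and ENERGY_THRESHOLD, the logistic-regression scorer, and the fixed three iterations are assumptions rather than values from the patent.

    import numpy as np

    rng = np.random.default_rng(0)
    ENERGY_THRESHOLD = -5.0          # assumed cutoff: lower docking energy means binding

    def featurize(pairs):
        """Toy featurization of TCR-peptide pairs (hash-based placeholder)."""
        return np.array([[hash((t, p, k)) % 97 / 96.0 for k in range(8)] for t, p in pairs])

    def docking_energy(pairs):
        """Placeholder for physical docking; a real system would call a docking tool."""
        return rng.normal(loc=-5.0, scale=2.0, size=len(pairs))

    def train(X, y, epochs=200, lr=0.5):
        """Minimal logistic-regression stand-in for the deep model."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-X @ w))
            w -= lr * X.T @ (p - y) / len(y)
        return w

    labeled = [(f"TCR{i}", f"PEP{i}") for i in range(32)]          # seed dataset
    y_labeled = rng.integers(0, 2, size=len(labeled)).astype(float)
    unlabeled = [(f"TCR{i}", f"PEP{i}") for i in range(32, 96)]

    X_lab = featurize(labeled)
    for it in range(3):                                            # iterate toward convergence
        # 1. Pseudo-label unlabeled pairs from docking energy scores.
        energies = docking_energy(unlabeled)
        pseudo_y = (energies < ENERGY_THRESHOLD).astype(float)     # positive if energy is low
        # 2. Extend the training set and retrain the scorer.
        X = np.vstack([X_lab, featurize(unlabeled)])
        y = np.concatenate([y_labeled, pseudo_y])
        w = train(X, y)
        print(f"iteration {it}: {int(pseudo_y.sum())} pseudo-positive pairs")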

    PEPTIDE BASED VACCINE GENERATION SYSTEM WITH DUAL PROJECTION GENERATIVE ADVERSARIAL NETWORKS

    Publication No.: WO2022216584A1

    Publication Date: 2022-10-13

    Application No.: PCT/US2022/023264

    Application Date: 2022-04-04

    Abstract: A method is provided for generating new peptides that bind to MHC proteins. The method includes training (430), by a processor device, a Generative Adversarial Network (GAN) having a generator and a discriminator only on a set of binding peptide sequences, given training data including the set of binding peptide sequences and a set of non-binding peptide sequences. A GAN training objective includes the discriminator being iteratively updated to distinguish generated peptide sequences from sampled binding peptide sequences as fake or real, and the generator being iteratively updated to fool the discriminator. The training includes optimizing (440) the GAN training objective while learning two projection vectors for the binding class with two cross-entropy losses: a first loss discriminating binding peptide sequences in the training data from non-binding peptide sequences in the training data, and a second loss discriminating generated binding peptide sequences from non-binding peptide sequences in the training data.
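
    A minimal sketch of the dual-projection idea follows, using continuous toy "peptide features" instead of discrete amino-acid sequences; the layer sizes, optimizers, and equal loss weighting are assumptions. The discriminator keeps two extra projection vectors (v1, v2) for the binding class, each trained with its own binary cross-entropy loss as in the abstract.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    dim, z_dim = 32, 16
    G = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, dim))       # generator
    feat = nn.Sequential(nn.Linear(dim, 64), nn.ReLU())                          # shared discriminator body
    adv_head = nn.Linear(64, 1)                                                  # real-vs-fake head
    v1 = nn.Linear(64, 1)   # projection 1: real binding vs. real non-binding
    v2 = nn.Linear(64, 1)   # projection 2: generated binding vs. real non-binding

    opt_d = torch.optim.Adam([*feat.parameters(), *adv_head.parameters(),
                              *v1.parameters(), *v2.parameters()], lr=2e-4)
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

    binding = torch.randn(64, dim) + 1.0          # toy stand-ins for binding peptide features
    non_binding = torch.randn(64, dim) - 1.0      # toy stand-ins for non-binding peptide features

    for step in range(200):
        z = torch.randn(64, z_dim)
        fake = G(z)
        # Discriminator update: real-vs-fake loss plus the two projection losses.
        h_bind, h_non, h_fake = feat(binding), feat(non_binding), feat(fake.detach())
        d_adv = F.binary_cross_entropy_with_logits(adv_head(h_bind), torch.ones(64, 1)) + \
                F.binary_cross_entropy_with_logits(adv_head(h_fake), torch.zeros(64, 1))
        d_p1 = F.binary_cross_entropy_with_logits(v1(h_bind), torch.ones(64, 1)) + \
               F.binary_cross_entropy_with_logits(v1(h_non), torch.zeros(64, 1))
        d_p2 = F.binary_cross_entropy_with_logits(v2(h_fake), torch.ones(64, 1)) + \
               F.binary_cross_entropy_with_logits(v2(h_non), torch.zeros(64, 1))
        opt_d.zero_grad(); (d_adv + d_p1 + d_p2).backward(); opt_d.step()
        # Generator update: fool the real-vs-fake head.
        g_loss = F.binary_cross_entropy_with_logits(adv_head(feat(G(z))), torch.ones(64, 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()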

    ACCELERATING DEEP NEURAL NETWORK TRAINING WITH INCONSISTENT STOCHASTIC GRADIENT DESCENT
    Invention application (pending, published)

    Publication No.: WO2017136802A1

    Publication Date: 2017-08-10

    Application No.: PCT/US2017/016637

    Application Date: 2017-02-06

    CPC classification number: G06N3/08 G06N3/04 G06N3/0454 G06N3/084

    Abstract: Aspects of the present disclosure describe techniques for training a convolutional neural network using an inconsistent stochastic gradient descent (ISGD) algorithm. The training effort for the batches used by the ISGD algorithm is dynamically adjusted according to a determined loss for a given training batch; batches are classified into two sub-states, well-trained or under-trained. The ISGD algorithm provides more iterations for under-trained batches while reducing iterations for well-trained ones.
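
    The following is an illustrative sketch of the idea, not the patented algorithm: a batch whose loss sits well above a running average is treated as under-trained and receives extra gradient steps. The 1.25x threshold rule, the exponential moving average, the extra_iters count, and the tiny model are assumptions.

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    X = torch.randn(512, 20)                      # toy dataset
    y = (X[:, 0] > 0).long()
    batches = [(X[i:i + 64], y[i:i + 64]) for i in range(0, 512, 64)]

    running_mean, momentum, extra_iters = None, 0.9, 3
    for epoch in range(5):
        for xb, yb in batches:
            loss = loss_fn(model(xb), yb)
            opt.zero_grad(); loss.backward(); opt.step()
            # Track an exponential moving average of the batch losses.
            running_mean = loss.item() if running_mean is None else \
                momentum * running_mean + (1 - momentum) * loss.item()
            # Under-trained batch: loss noticeably above average, so give it extra iterations.
            if loss.item() > 1.25 * running_mean:
                for _ in range(extra_iters):
                    loss = loss_fn(model(xb), yb)
                    opt.zero_grad(); loss.backward(); opt.step()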

    HIERARCHICAL WORD EMBEDDING SYSTEM
    Invention application

    Publication No.: WO2022216935A1

    Publication Date: 2022-10-13

    Application No.: PCT/US2022/023840

    Application Date: 2022-04-07

    Abstract: Systems and methods for matching job descriptions with job applicants are provided. The method includes allocating each of one or more job applicants' curricula vitae (CVs) into sections (320); applying max-pooled word embedding (330) to each section of the applicants' CVs; using concatenated max-pooling and average-pooling (340) to compose the section embeddings into an applicant CV representation; allocating each of one or more job position descriptions into specified sections (220); applying max-pooled word embedding (230) to each section of the job position descriptions; using concatenated max-pooling and average-pooling (240) to compose the section embeddings into a job representation; calculating a cosine similarity (250, 350) between each of the job representations and each of the CV representations to perform job-to-applicant matching; and presenting an ordered list of the one or more job applicants (360) or an ordered list of the one or more job position descriptions (260) to a user.
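
    A small sketch of the matching pipeline follows, using random toy word vectors; the word_vec table, the two-section documents, and the embedding dimension are placeholders, not the system's actual models.

    import numpy as np

    rng = np.random.default_rng(0)
    DIM = 16
    vocab = {}                                               # word -> toy random vector

    def word_vec(w):
        if w not in vocab:
            vocab[w] = rng.normal(size=DIM)
        return vocab[w]

    def section_embedding(words):
        """Max-pooled word embedding for one section (330/230 in the abstract)."""
        vecs = np.stack([word_vec(w) for w in words])
        return vecs.max(axis=0)

    def document_embedding(sections):
        """Concatenated max-pooling and average-pooling over section embeddings (340/240)."""
        sec = np.stack([section_embedding(s) for s in sections])
        return np.concatenate([sec.max(axis=0), sec.mean(axis=0)])

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    cv = {"skills": "python machine learning nlp".split(),
          "experience": "built search ranking systems".split()}
    job = {"requirements": "python nlp experience".split(),
           "duties": "build ranking systems".split()}

    cv_repr = document_embedding(list(cv.values()))
    job_repr = document_embedding(list(job.values()))
    print("job-to-applicant similarity:", round(cosine(job_repr, cv_repr), 3))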

    CONTROLLED TEXT GENERATION WITH SUPERVISED REPRESENTATION DISENTANGLEMENT AND MUTUAL INFORMATION MINIMIZATION

    Publication No.: WO2021119074A1

    Publication Date: 2021-06-17

    Application No.: PCT/US2020/063926

    Application Date: 2020-12-09

    Abstract: A computer-implemented method is provided for disentangled data generation. The method includes accessing (310), by a bidirectional Long Short-Term Memory (LSTM) with a multi-head attention mechanism, a dataset including a plurality of pairs, each formed from a given one of a plurality of input text structures and a given one of a plurality of style labels for the plurality of input text structures. The method further includes training (320) the bidirectional LSTM as an encoder to disentangle a sequential text input into disentangled representations comprising a content embedding and a style embedding based on a subset of the dataset. The method also includes training (350) a unidirectional LSTM as a decoder to generate a next text structure prediction for the sequential text input based on previously generated text structure information and a current word, from a disentangled representation with the content embedding and the style embedding.
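
    The following is a minimal sketch of the encoder/decoder split described above; the vocabulary size, layer sizes, the mean-pooled self-attention, and the single cross-entropy loss are assumptions, and the supervised disentanglement and mutual-information-minimization losses are omitted.

    import torch
    import torch.nn as nn

    VOCAB, EMB, HID, CONTENT, STYLE = 1000, 64, 128, 32, 8

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.bilstm = nn.LSTM(EMB, HID, batch_first=True, bidirectional=True)
            self.attn = nn.MultiheadAttention(2 * HID, num_heads=4, batch_first=True)
            self.to_content = nn.Linear(2 * HID, CONTENT)
            self.to_style = nn.Linear(2 * HID, STYLE)

        def forward(self, tokens):                       # tokens: (batch, time)
            h, _ = self.bilstm(self.emb(tokens))
            a, _ = self.attn(h, h, h)                    # multi-head self-attention over the sequence
            pooled = a.mean(dim=1)
            return self.to_content(pooled), self.to_style(pooled)

    class Decoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(VOCAB, EMB)
            self.lstm = nn.LSTM(EMB + CONTENT + STYLE, HID, batch_first=True)
            self.out = nn.Linear(HID, VOCAB)

        def forward(self, tokens, content, style):
            z = torch.cat([content, style], dim=-1)      # disentangled representation
            z = z.unsqueeze(1).expand(-1, tokens.size(1), -1)
            h, _ = self.lstm(torch.cat([self.emb(tokens), z], dim=-1))
            return self.out(h)                           # next-token logits at every step

    enc, dec = Encoder(), Decoder()
    tokens = torch.randint(0, VOCAB, (4, 12))            # toy batch of token ids
    content, style = enc(tokens)
    logits = dec(tokens[:, :-1], content, style)         # predict tokens[:, 1:]
    loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
    loss.backward()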

    A MOBILE PHONE WITH SYSTEM FAILURE PREDICTION USING LONG SHORT-TERM MEMORY NEURAL NETWORKS
    Invention application (pending, published)

    Publication No.: WO2017177018A1

    Publication Date: 2017-10-12

    Application No.: PCT/US2017/026377

    Application Date: 2017-04-06

    Abstract: Mobile phones and methods for mobile phone failure prediction include receiving respective log files from one or more mobile phone components, including at least one user application. The log files have heterogeneous formats. A likelihood of failure of one or more mobile phone components is determined based on the received log files by clustering the plurality of log files according to structural log patterns and determining feature representations of the log files based on the log clusters. A user is alerted to a potential failure if the likelihood of component failure exceeds a first threshold. An automatic system control action is performed if the likelihood of component failure exceeds a second threshold.
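
    A brief sketch of the log-clustering and two-threshold policy follows. The structural_pattern rule, the toy failure_likelihood scorer, and the threshold values are illustrative assumptions; per the title, the actual system feeds the learned feature representations to a long short-term memory model, which this sketch replaces with a simple ratio.

    import re
    from collections import Counter

    ALERT_THRESHOLD, ACTION_THRESHOLD = 0.5, 0.8        # assumed first and second thresholds

    def structural_pattern(line):
        """Collapse numbers and hex so heterogeneous log lines with the same structure cluster together."""
        return re.sub(r"0x[0-9a-f]+|\d+", "<*>", line.lower())

    def featurize(log_lines):
        """Feature representation: counts of structural log patterns (the log clusters)."""
        return Counter(structural_pattern(l) for l in log_lines)

    def failure_likelihood(features):
        """Placeholder scorer: fraction of error-like clusters (the patent uses a learned model)."""
        errors = sum(c for p, c in features.items() if "error" in p or "fail" in p)
        total = sum(features.values())
        return errors / total if total else 0.0

    logs = [
        "app: request 4123 completed in 35 ms",
        "radio: error 17 while attaching to cell 0x3fa2",
        "app: error 99 rendering frame 771",
        "kernel: battery at 41 percent",
    ]
    likelihood = failure_likelihood(featurize(logs))
    if likelihood > ACTION_THRESHOLD:
        print("automatic system control action")        # second threshold exceeded
    elif likelihood > ALERT_THRESHOLD:
        print("alert user to potential failure")        # first threshold exceeded
    else:
        print(f"nominal (likelihood={likelihood:.2f})")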

    A PEPTIDE SEARCH SYSTEM FOR IMMUNOTHERAPY
    Invention application

    Publication No.: WO2023038834A1

    Publication Date: 2023-03-16

    Application No.: PCT/US2022/042172

    Application Date: 2022-08-31

    Abstract: A system for binding peptide search for immunotherapy is presented. The system includes employing (101) a deep neural network to predict a peptide presentation given Major Histocompatibility Complex allele sequences and peptide sequences, training (103) a Variational Autoencoder (VAE) to reconstruct peptides by converting the peptide sequences into continuous embedding vectors, running (105) a Monte Carlo Tree Search to generate a first set of positive peptide vaccine candidates, running (105) a Bayesian Optimization search with the trained VAE and a Backpropagation search with the trained VAE to generate a second set of positive peptide vaccine candidates, using (107) a sampling from a Position Weight Matrix (sPWM) to generate a third set of positive peptide vaccine candidates, screening and merging (109) the first, second, and third sets of positive peptide vaccine candidates, and outputting (111) qualified peptides for immunotherapy from the screened and merged sets of positive peptide vaccine candidates.
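
    The sketch below covers only two of the listed steps: sampling candidate peptides from a position weight matrix (sPWM, step 107) and screening and merging candidate sets (step 109). The PWM values, peptide length, presentation_score placeholder, and cutoff are toy assumptions, and the MCTS, Bayesian Optimization, and Backpropagation searches are represented by stand-in candidate sets.

    import numpy as np

    rng = np.random.default_rng(0)
    AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")
    LENGTH = 9

    # Toy position weight matrix: one probability row per peptide position.
    pwm = rng.dirichlet(np.ones(len(AMINO_ACIDS)), size=LENGTH)

    def sample_from_pwm(n):
        """Draw peptides position by position according to the PWM (sPWM, step 107)."""
        return ["".join(rng.choice(AMINO_ACIDS, p=pwm[pos]) for pos in range(LENGTH))
                for _ in range(n)]

    def presentation_score(peptide):
        """Placeholder for the deep presentation predictor of step 101."""
        return sum(ord(c) for c in peptide) % 100 / 100.0

    def screen_and_merge(candidate_sets, cutoff=0.5):
        """Merge candidate sets, deduplicate, and keep peptides above the score cutoff (step 109)."""
        merged = set().union(*candidate_sets)
        return sorted(p for p in merged if presentation_score(p) >= cutoff)

    set_mcts = sample_from_pwm(5)        # stand-in for the MCTS candidates
    set_search = sample_from_pwm(5)      # stand-in for the BO / backpropagation candidates
    set_pwm = sample_from_pwm(5)
    qualified = screen_and_merge([set_mcts, set_search, set_pwm])
    print(qualified[:3])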

    LEARNING ORTHOGONAL FACTORIZATION IN GAN LATENT SPACE

    Publication No.: WO2022169681A1

    Publication Date: 2022-08-11

    Application No.: PCT/US2022/014211

    Application Date: 2022-01-28

    Abstract: A method for learning disentangled representations of videos is presented. The method includes feeding (1001) each frame of video data into an encoder to produce a sequence of visual features, passing (1003) the sequence of visual features through a deep convolutional network to obtain a posterior of a dynamic latent variable and a posterior of a static latent variable, sampling (1005) static and dynamic representations from the posterior of the static latent variable and the posterior of the dynamic latent variable, respectively, concatenating (1007) the static and dynamic representations to be fed into a decoder to generate reconstructed sequences, and applying (1009) three regularizers to the dynamic and static latent variables to trigger representation disentanglement. To facilitate the disentangled sequential representation learning, orthogonal factorization in generative adversarial network (GAN) latent space is leveraged to pre-train a generator as a decoder in the method.
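
    A shape-level sketch of the pipeline follows; the per-frame encoder, the one-layer temporal convolution, and the decoder are toy stand-ins, and the three regularizers, the KL terms, and the pretrained GAN generator used as the decoder are omitted.

    import torch
    import torch.nn as nn

    B, T, FRAME, FEAT, ZS, ZD = 4, 8, 3 * 32 * 32, 128, 16, 16

    frame_encoder = nn.Linear(FRAME, FEAT)                      # per-frame visual features (step 1001)
    temporal = nn.Conv1d(FEAT, FEAT, kernel_size=3, padding=1)  # stand-in for the deep conv net (step 1003)
    static_head = nn.Linear(FEAT, 2 * ZS)                       # posterior of the static latent variable
    dynamic_head = nn.Linear(FEAT, 2 * ZD)                      # posterior of the dynamic latent per frame
    decoder = nn.Sequential(nn.Linear(ZS + ZD, FEAT), nn.ReLU(), nn.Linear(FEAT, FRAME))

    def sample(mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    video = torch.randn(B, T, FRAME)                            # toy flattened video frames
    feats = frame_encoder(video)                                # (B, T, FEAT)
    h = temporal(feats.transpose(1, 2)).transpose(1, 2)         # temporal context, (B, T, FEAT)

    zs_mu, zs_logvar = static_head(h.mean(dim=1)).chunk(2, dim=-1)   # one static code per video
    zd_mu, zd_logvar = dynamic_head(h).chunk(2, dim=-1)              # one dynamic code per frame
    z_static = sample(zs_mu, zs_logvar)                              # step 1005
    z_dynamic = sample(zd_mu, zd_logvar)

    z = torch.cat([z_static.unsqueeze(1).expand(-1, T, -1), z_dynamic], dim=-1)  # step 1007
    recon = decoder(z)                                          # reconstructed frame sequence
    loss = ((recon - video) ** 2).mean()                        # KL terms and the three regularizers would be added here
    loss.backward()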
