Abstract:
Systems and methods for predicting T-cell receptor (TCR)-peptide interaction are provided, including training a deep learning model for the prediction of TCR-peptide interaction. Training includes determining a multiple sequence alignment (MSA) for TCR-peptide pair sequences from a dataset of TCR-peptide pair sequences using a sequence analyzer, building TCR structures and peptide structures using the MSA and corresponding structures from a Protein Data Bank (PDB) using a MODELLER, and generating an extended TCR-peptide training dataset based on docking energy scores determined by docking peptides to TCRs using physical modeling based on the TCR structures and peptide structures built using the MODELLER. TCR-peptide pairs are classified and labeled as positive or negative pairs using pseudo-labels based on the docking energy scores, and the deep learning model is iteratively retrained based on the extended TCR-peptide training dataset and the pseudo-labels until convergence.
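The pseudo-labeling step can be illustrated with a minimal sketch. It assumes that a lower docking energy indicates a more favorable pair and that two thresholds separate positive from negative pairs; the function name `pseudo_label`, the thresholds `pos_max` and `neg_min`, and the placeholder identifiers are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (TCR identifier, peptide identifier)


def pseudo_label(
    energies: Dict[Pair, float], pos_max: float, neg_min: float
) -> Dict[Pair, int]:
    """Assign pseudo-labels from docking energy scores.

    Assumption: lower energy means a more favorable docking pose.
    Pairs at or below pos_max are labeled positive (1), pairs at or
    above neg_min are labeled negative (0), and pairs in between are
    left unlabeled so they do not enter the extended training set.
    """
    labels: Dict[Pair, int] = {}
    for pair, energy in energies.items():
        if energy <= pos_max:
            labels[pair] = 1
        elif energy >= neg_min:
            labels[pair] = 0
    return labels
```

In an iterative scheme, the labeled pairs would be added to the training set, the model retrained, and the process repeated until the predictions stop changing.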
Abstract:
Aspects of the present disclosure describe improving neural network robustness through neighborhood-preserving layers and learning weighted-average neighbor embeddings. A method of training a neural network comprises modifying gradient backpropagation of a weighted-average neighbor layer into input domain entries. The present disclosure may adapt certain manifold representation techniques to an online setting, which affords practical benefits in machine learning applications that train neural networks where dimension reduction, interpretability, and smoothness are desired, and may act as a form of regularization that provides a benefit against adversarial attacks.
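A weighted-average neighbor embedding can be sketched as follows. This is only an illustration of the forward computation: the choice of the k nearest anchors, the exponential distance weighting, and the names `neighbor_embedding`, `anchors`, and `temperature` are assumptions, and the modified backpropagation described above is not shown.

```python
import math
from typing import List, Sequence


def neighbor_embedding(
    x: Sequence[float],
    anchors: Sequence[Sequence[float]],
    embeddings: Sequence[Sequence[float]],
    k: int = 2,
    temperature: float = 1.0,
) -> List[float]:
    """Embed x as a weighted average of its k nearest anchors' embeddings.

    Weights decay exponentially with squared distance (an assumed scheme).
    """
    # Squared Euclidean distance from x to every anchor point.
    d2 = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in anchors]
    nearest = sorted(range(len(anchors)), key=d2.__getitem__)[:k]
    # Subtract the smallest distance before exponentiating for stability.
    d_min = d2[nearest[0]]
    weights = [math.exp(-(d2[i] - d_min) / temperature) for i in nearest]
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * embeddings[i][j] for w, i in zip(weights, nearest)) / total
        for j in range(dim)
    ]
```

Because the output is a convex combination of stored embeddings, small input perturbations can only move it within the span of nearby neighbors, which is one intuition for the smoothness mentioned in the abstract.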
Abstract:
System(s) and method(s) are provided for data management and data processing. For example, various embodiments may include systems and methods in which relatively larger groups of data are selected with comparable or better selection results (e.g., high data redundancy elimination and/or average chunk size). In various embodiments, the system(s) and method(s) may include, for example, a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Various embodiments may include a first standard or typical data grouping, blocking, or chunking technique together with a data group, block, or chunk combining technique and/or a data group, block, or chunk splitting technique. Exemplary system(s) and method(s) may relate to data hashing and/or data elimination. Embodiments may include a look-ahead buffer and may determine whether to emit small chunks or large chunks based on characteristics of the underlying data and/or the particular application of the invention (e.g., for backup).
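One way to picture a chunk combining technique with a look-ahead buffer is sketched below. The rule used here (combine a full window of small chunks into one large chunk only when none of them has been seen before) is an assumption chosen for illustration, as are the names `emit_chunks`, `seen`, and `lookahead`; the disclosure does not prescribe this rule.

```python
import hashlib
from typing import List, Set


def emit_chunks(
    small_chunks: List[bytes], seen: Set[bytes], lookahead: int = 4
) -> List[bytes]:
    """Emit large chunks over new data and small chunks near duplicates.

    `seen` holds SHA-256 digests of small chunks already stored.
    """
    out: List[bytes] = []
    i = 0
    while i < len(small_chunks):
        window = small_chunks[i:i + lookahead]  # look-ahead buffer
        all_new = len(window) == lookahead and all(
            hashlib.sha256(c).digest() not in seen for c in window
        )
        if all_new:
            # No duplicate in sight: combine the window into one large chunk.
            out.append(b"".join(window))
            i += lookahead
        else:
            # A duplicate is nearby: keep fine granularity.
            out.append(small_chunks[i])
            i += 1
    return out
```

Emitting large chunks over long runs of new data raises the average chunk size, while small chunks near known data preserve the ability to detect duplicates.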
Abstract:
Systems and methods for implementing content-level anomaly detection for devices having limited memory are provided. At least one log content model is generated (130) based on training log content of training logs obtained from one or more sources associated with the computer system. The at least one log content model is transformed (140) into at least one modified log content model to limit memory usage. Anomaly detection is performed (170) for testing log content of testing logs obtained from one or more sources associated with the computer system based on the at least one modified log content model. In response to the anomaly detection identifying one or more anomalies associated with the testing log content, the one or more anomalies are output (170).
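The idea of transforming a log content model to limit memory can be sketched as follows. The abstract does not say how the model is represented or transformed; this sketch assumes a model that is simply the set of field values seen in training and a transformation that replaces each value with a short fixed-size fingerprint. All function names and the fingerprint size are hypothetical.

```python
import hashlib
from typing import Iterable, List, Set


def fingerprint(value: str, nbytes: int = 8) -> bytes:
    """Fixed-size digest that stands in for an arbitrarily long value."""
    return hashlib.blake2b(value.encode("utf-8"), digest_size=nbytes).digest()


def build_model(training_values: Iterable[str]) -> Set[str]:
    """Assumed content model: the set of values observed during training."""
    return set(training_values)


def compact_model(model: Set[str], nbytes: int = 8) -> Set[bytes]:
    """Modified model: keep only fingerprints to bound memory per entry."""
    return {fingerprint(v, nbytes) for v in model}


def detect(values: Iterable[str], compact: Set[bytes], nbytes: int = 8) -> List[str]:
    """Report testing values whose fingerprint is absent from the model."""
    return [v for v in values if fingerprint(v, nbytes) not in compact]
```

The trade-off in this sketch is that shorter fingerprints save more memory but raise the chance that an unseen value collides with a known one and goes unreported.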
Abstract:
A machine learning method includes connecting machines in a data center using network-aware model consistency for stochastic applications; ensuring that a communication graph of all machines in the data center is connected; propagating all updates uniformly across the cluster without update; and preferring connections to a machine with a first network throughput over machines with a second network throughput smaller than the first network throughput.
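Two of the steps above, checking that the communication graph is connected and preferring higher-throughput peers, can be sketched as follows. The graph representation, the breadth-first search, and the names `is_connected` and `preferred_peers` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def is_connected(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> bool:
    """Return True if every machine is reachable from every other one."""
    adj: Dict[str, set] = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over undirected links
        n = queue.popleft()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return len(seen) == len(adj)


def preferred_peers(throughput: Dict[str, float], k: int) -> List[str]:
    """Pick the k peers with the highest measured throughput."""
    return sorted(throughput, key=throughput.get, reverse=True)[:k]
```

A scheduler built on this sketch would choose peers by throughput first and then add links as needed until `is_connected` holds.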
Abstract:
Systems and methods for data management and data processing are provided. Embodiments may include systems and methods relating to fast data selection with reasonably high quality results, and may include a faster data selection function and a slower data selection function. Various embodiments may include systems and methods relating to data hashing and/or data redundancy identification and elimination for a data set or a string of data. In some embodiments, a first selection function is used to pre-select boundary points or data blocks/windows from a data set or data stream, and a second selection function is used to refine the boundary points or data blocks/windows. The second selection function may be better at determining the best places for boundary points or data blocks/windows in the data set or data stream. In various embodiments, data may be processed by a first, faster hash function and a second, slower, more discriminating hash function.
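The two-stage selection can be sketched as a cheap rolling function that pre-selects candidate boundary points, followed by a stronger hash that refines them. The additive rolling sum, the use of SHA-256 as the slower function, and the window and mask values are assumptions chosen for illustration and are not taken from the disclosure.

```python
import hashlib
from typing import List


def candidate_boundaries(data: bytes, window: int = 4, mask: int = 0x03) -> List[int]:
    """Fast pre-selection using an additive rolling sum over `window` bytes.

    Returns end offsets where the low bits of the window sum are zero.
    """
    out: List[int] = []
    s = sum(data[:window])
    for end in range(window, len(data) + 1):
        if s & mask == 0:
            out.append(end)
        if end < len(data):
            # Slide the window one byte: add the new byte, drop the oldest.
            s += data[end] - data[end - window]
    return out


def refine_boundaries(
    data: bytes, candidates: List[int], window: int = 4, mask: int = 0x01
) -> List[int]:
    """Slower refinement: keep candidates that also pass a SHA-256 test."""
    keep: List[int] = []
    for end in candidates:
        digest = hashlib.sha256(data[end - window:end]).digest()
        if digest[0] & mask == 0:
            keep.append(end)
    return keep
```

The expensive hash runs only at the offsets the cheap function lets through, so most positions in the stream cost a single addition and subtraction.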