Abstract:
Systems and methods for detecting data exfiltration using domain name system (DNS) queries include, in various embodiments, performing operations that include parsing a DNS query to determine whether that DNS query is likely to contain hidden data that is being exfiltrated from a system or network. Statistical methods can be used to analyze the DNS query to determine a likelihood that each of a plurality of segments of the DNS query is indicative of data exfiltration. If one or more DNS queries are deemed suspicious based on the analysis, a security action can be performed on the DNS query, including sending an alert and/or blocking the DNS query from being forwarded.
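As an illustration only, per-segment statistical analysis of a DNS query might be sketched as follows. The abstract does not name a specific statistic; Shannon entropy of each dot-separated label is assumed here as one plausible choice, and the function names, the 3.5-bit entropy threshold, and the 16-character minimum length are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def score_query(qname: str, entropy_threshold: float = 3.5, min_len: int = 16):
    """Score each label (segment) of a DNS query name.

    Assumption: a long label with high character entropy is treated as a
    possible carrier of encoded data. Both thresholds are illustrative.
    """
    labels = [label for label in qname.rstrip(".").split(".") if label]
    results = []
    for label in labels:
        h = shannon_entropy(label)
        suspicious = len(label) >= min_len and h >= entropy_threshold
        results.append((label, h, suspicious))
    return results


def is_suspicious(qname: str, **thresholds) -> bool:
    """Flag the whole query if any single segment is flagged."""
    return any(flag for _, _, flag in score_query(qname, **thresholds))
```

A caller could pass each observed query name to `is_suspicious` and, on a positive result, raise an alert or decline to forward the query. A production detector would likely combine several statistics and aggregate over multiple queries rather than rely on a single entropy cutoff.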
Abstract:
A system for discovering programming variants. The system analyzes system calls made while executing a program to generate programming code or an executable for a particular OS and/or CPU that would perform the same or similar actions as the program. The generated code is then mutated, augmented, and/or changed to create variations of the program that still function and/or achieve the same objectives as the original code.
Abstract:
Anomalies in a data set may be difficult to detect when individual items are not gross outliers from a population average. Disclosed is an anomaly detector that includes neural networks such as an auto-encoder and a discriminator. The auto-encoder and the discriminator may be trained on a training set that does not include anomalies. During training, the auto-encoder generates an internal representation from the training set, and reconstructs the training set from the internal representation. The training continues until data loss in the reconstructed training set is below a configurable threshold. The discriminator may be trained until the internal representation is constrained to a multivariable unit normal. Once trained, the auto-encoder and discriminator identify anomalies in an evaluation set. The identified anomalies in the evaluation set may be linked to transaction, security breach, or population trends, but broadly, the disclosed techniques can be used to identify anomalies in any suitable population.
Abstract:
Aspects of the present disclosure involve a system and method for malware detection. The system and method introduce a probabilistic model that can observe user transaction data over a predetermined window of time. Then, using posterior probability, the system can determine whether multiple users were present during the observed window.
Abstract:
Aspects of the present disclosure involve systems, methods, devices, and the like for generating compact tree representations applicable to machine learning. In one embodiment, a system is introduced that can retrieve a decision tree structure to generate a compact tree representation model. The compact tree representation model may come in the form of a matrix designed to maintain the relationships expressed by the decision tree structure.
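One way a decision tree could be flattened into a matrix while keeping its parent-child relationships is sketched below. The abstract does not specify the matrix layout; the five-column row format, the nested-dict input shape, and the function names here are assumptions made purely for illustration.

```python
# Hypothetical row layout, one row per tree node:
#   [feature_index, threshold, left_row, right_row, leaf_value]
# Leaf rows use feature_index == -1 and child indices of -1.


def tree_to_matrix(root: dict) -> list:
    """Flatten a nested-dict decision tree into a list of rows (pre-order).

    Internal nodes are assumed to look like
      {"feature": int, "threshold": float, "left": node, "right": node}
    and leaves like {"value": float}.
    """
    rows = []

    def visit(node: dict) -> int:
        idx = len(rows)
        rows.append(None)  # reserve this node's row before visiting children
        if "value" in node:
            rows[idx] = [-1, 0.0, -1, -1, node["value"]]
        else:
            left = visit(node["left"])
            right = visit(node["right"])
            rows[idx] = [node["feature"], node["threshold"], left, right, 0.0]
        return idx

    visit(root)
    return rows


def predict(matrix: list, x: list) -> float:
    """Walk the matrix from row 0 until a leaf row is reached."""
    i = 0
    while matrix[i][0] != -1:
        feature, threshold, left, right, _ = matrix[i]
        i = left if x[feature] <= threshold else right
    return matrix[i][4]


example_tree = {
    "feature": 0, "threshold": 5.0,
    "left": {"value": 0.0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"value": 1.0},
        "right": {"value": 2.0},
    },
}
example_matrix = tree_to_matrix(example_tree)
```

Because every relationship is encoded as a row index, the matrix can be stored or shipped as a plain numeric array and evaluated with a simple loop, without rebuilding the original tree objects.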
Abstract:
Computer system drift can occur when a computer system or a cluster of computer systems deviates from ideal and/or desired behavior. In a server farm, for example, many different machines may be identically configured to work in conjunction with each other to provide an electronic service (serving web pages, processing electronic payment transactions, etc.). Over time, however, one or more of these systems may drift from previous behavior. Early drift detection can be important, especially in large enterprises, for avoiding costly downtime. Changes in a computer's configuration files, network connections, and/or executable processes can indicate ongoing drift, but collecting this information at scale can be difficult. By using certain hashing and min-Hash techniques, however, drift detection can be streamlined and accomplished for large scale operations. Velocity of drift may also be tracked using a decay function.
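The min-Hash idea can be sketched as follows, under several assumptions not stated in the abstract: each machine snapshot is reduced to a non-empty set of strings (for example, configuration-file hashes or process names), signatures are built with salted BLAKE2b hashes, and drift velocity is tracked with a simple exponential decay. All function names and parameter values are hypothetical.

```python
import hashlib


def minhash_signature(items, num_hashes: int = 64) -> list:
    """Build a MinHash signature for a non-empty collection of strings.

    Each of the num_hashes positions uses a differently salted hash and
    keeps the minimum hash value seen across all items.
    """
    items = list(items)
    if not items:
        raise ValueError("items must be non-empty")
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_similarity(sig_a: list, sig_b: list) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature positions."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


def update_velocity(previous: float, drift: float, decay: float = 0.5) -> float:
    """Decay the previous velocity and add the newly observed drift.

    Assumption: drift is 1 - similarity between consecutive snapshots, so
    older changes fade geometrically while recent changes dominate.
    """
    return decay * previous + drift
```

Comparing fixed-length signatures instead of full snapshots keeps the per-machine cost constant, which is what makes this kind of comparison practical across a large fleet. A machine whose velocity stays elevated over several collection intervals would be a candidate for closer inspection.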
Abstract:
Methods and systems for creating and analyzing low-dimensional representations of webpage sequences are described. Network traffic history data associated with a particular website is retrieved, and a word embedding algorithm is applied to the network traffic history data to produce a low-dimensional embedding. A prediction model is created based on the low-dimensional embedding. Browsing activity on the particular website is monitored. A set of sessions in the monitored browsing activity is flagged based on a result of applying the prediction model to the monitored browsing activity.