Technologies for software basic block similarity analysis
Abstract:
Technologies for analyzing software similarity include a computing device having access to a collection of sample software. The computing device identifies a number of code segments, such as basic blocks, within the software. The computing device normalizes each code segment by extracting the first data element of each computer instruction within the code segment. The first data element may be the first byte. The computing device calculates a probabilistic feature hash signature for each normalized code segment. The computing device may filter out known-good code segments by comparing signatures with a probabilistic hash filter generated from a collection of known-good software. The computing device calculates a similarity value between each pair of unfiltered, normalized code segments. The computing device generates a graph including the normalized code segments and the similarity values. The computing device may cluster the graph using a force-based clustering algorithm.
Public/Granted literature
Information query
Patent Agency Ranking
0/0