BITY
Published in Github, 2018
BITY is a technique that recover the type information from Binary via machine learning. First, we present our approach to recovering type information in binary code. The idea is motivated by “duck typing”, where the type of a variable is determined by its features and properties. Our approach uses a combination of machine learning and program analysis. In detail, we first extract critical information form instruction-flow and data-flow, namely behaviors and features of variables in binary code. According to these behaviors and features, we learn a classifier with basic types as levels, using various machine learning methods, and then use this classifier to predict types for new, unseen binaries. For composite types, such as pointer and struct, we perform a point-to analysis to recover the target variables and use the classifier to recover the base type for these target variables, base on which, composite types are recovered.