Memory access patterns for malware detection
Malware poses a significant threat to modern digitized society. Malware developers put significant effort into evading detection and remaining unnoticed on victims' computers despite the number of malware detection techniques available. To eliminate known and noticeable traces in memory, network, or disk activity, they use encryption and obfuscation. There therefore remains a strong need for new malware detection methods, especially ones based on Machine Learning models, because processing large amounts of data is not a suitable task for a human. This paper presents a novel method that could potentially detect zero-day attacks and contribute to proactive malware detection. Our method is based on analysis of the sequences of memory access operations produced by a binary file during execution. To perform the experiments, we used an automated virtualized environment with binary instrumentation tools to trace the memory access sequences. Unlike other relevant papers, we focus only on the analysis of basic (Read and Write) memory access operations and their n-grams rather than on the mere presence or overall number of operations. Additionally, we performed a study of n-grams of memory accesses and tested them against real-world malware samples collected from open sources. The collected data and proposed feature construction methods resulted in an accuracy of up to 98.92% using Machine Learning methods such as k-NN and ANN. Thus, we believe that our proposed method will serve as a stepping stone toward better proactive malware detection techniques in the future.
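To make the notion of n-grams over Read and Write operations concrete, the following is a minimal sketch and not the paper's implementation. It assumes a trace is already available as a string of 'R' and 'W' characters (for example, exported from a binary instrumentation tool), and it assumes relative frequencies as the feature values; the function names `ngram_counts` and `feature_vector` and the sample trace are hypothetical.

```python
from collections import Counter
from itertools import product


def ngram_counts(trace, n):
    """Count overlapping n-grams in a string of 'R'/'W' operations."""
    return Counter(trace[i:i + n] for i in range(len(trace) - n + 1))


def feature_vector(trace, n):
    """Relative frequency of every possible R/W n-gram, in a fixed order.

    Using relative frequencies is an assumption made for this sketch;
    raw counts or presence flags would be equally simple to produce.
    """
    counts = ngram_counts(trace, n)
    total = max(sum(counts.values()), 1)  # avoid division by zero on short traces
    vocabulary = ["".join(p) for p in product("RW", repeat=n)]
    return [counts[gram] / total for gram in vocabulary]


if __name__ == "__main__":
    trace = "RRWRWWRR"  # hypothetical memory access trace
    print(ngram_counts(trace, 2))
    print(feature_vector(trace, 2))
```

A vector of this kind has 2^n entries for n-grams of length n, so it can be passed directly to a classifier such as k-NN or an ANN; the fixed vocabulary order keeps the vectors of different samples comparable.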