Obfuscated JavaScript Code Detection using Machine Learning with AST-based Syntactic and Lexical Analysis


Kiliç E., Sandıkkaya M. T.

8th International Conference on Smart and Sustainable Technologies, SpliTech 2023, Hybrid, Split/Bol, Croatia, 20 - 23 June 2023

  • Publication Type: Conference Paper / Full Text
  • DOI Number: 10.23919/splitech58164.2023.10193211
  • City: Hybrid, Split/Bol
  • Country: Croatia
  • Keywords: abstract syntax tree, binary classification, JavaScript, machine learning, natural language processing, obfuscation, obfuscation detection
  • Istanbul Technical University Affiliated: Yes

Abstract

Obfuscation has become a popular technique for attackers to hide malicious code in JavaScript applications, and detecting obfuscated JavaScript remains a challenging task. This paper surveys existing techniques for obfuscation detection in JavaScript, including static and dynamic analysis. Furthermore, we propose a novel approach that combines both static and dynamic analysis to improve the accuracy of obfuscation detection in JavaScript. Our approach detects suspicious code patterns that are commonly used in obfuscated code by means of syntactic and lexical analysis. Finally, we evaluate the proposed approach on a dataset of real-world JavaScript applications. The results show that our approach can achieve a high level of accuracy in detecting obfuscated code, outperforming existing techniques in both detection rate and false positive rate.
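
To make the idea of lexical analysis for obfuscation detection more concrete, the following is a minimal sketch of the kind of features such an analysis might compute from JavaScript source text. It is not the authors' method: the feature set (character entropy, average identifier length, escape-sequence density, `eval` call count), the function names, and the regular expressions are all assumptions chosen for illustration. The sketch also works on raw text only and does not build an abstract syntax tree, which the paper's syntactic analysis relies on.

```python
import math
import re
from collections import Counter

# Hypothetical patterns, chosen for illustration only (not taken from the paper).
# The identifier pattern is deliberately crude: it also matches keywords and
# identifier-like fragments inside string literals.
IDENTIFIER_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
ESCAPE_RE = re.compile(r"\\x[0-9A-Fa-f]{2}|\\u[0-9A-Fa-f]{4}")
EVAL_RE = re.compile(r"\beval\s*\(")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(source: str) -> dict:
    """Compute a small, assumed set of lexical features from JavaScript source."""
    identifiers = IDENTIFIER_RE.findall(source)
    escapes = ESCAPE_RE.findall(source)
    length = max(len(source), 1)  # avoid division by zero on empty input
    avg_len = sum(map(len, identifiers)) / len(identifiers) if identifiers else 0.0
    return {
        "entropy": shannon_entropy(source),
        "avg_identifier_length": avg_len,
        # Fraction of characters that belong to \xNN or \uNNNN escapes.
        "escape_density": sum(map(len, escapes)) / length,
        "eval_calls": len(EVAL_RE.findall(source)),
    }


if __name__ == "__main__":
    plain = "function add(a, b) { return a + b; }"
    # Hex-escaped string passed to eval, a pattern often seen in obfuscated code.
    obfuscated = 'eval("\\x61\\x6c\\x65\\x72\\x74\\x28\\x31\\x29");'
    print(lexical_features(plain))
    print(lexical_features(obfuscated))
```

In a machine learning setting, feature dictionaries like these would typically be turned into numeric vectors and passed to a binary classifier trained on labelled obfuscated and non-obfuscated samples. Which features and classifier the paper actually uses is not stated in the abstract.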