MOCCA: Multilayer One-Class Classification for Anomaly Detection

Massoli F. V., Falchi F., Kantarcı A., Aktı Ş., Ekenel H. K., Amato G.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, vol.33, no.6, pp.2313-2323, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 33 Issue: 6
  • Publication Date: 2022
  • Doi Number: 10.1109/TNNLS.2021.3130074
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, Civil Engineering Abstracts
  • Page Numbers: pp.2313-2323
  • Keywords: Anomaly detection (AD), deep learning (DL), one-class (OC) classification, intrusion detection
  • Istanbul Technical University Affiliated: Yes


Anomalies are ubiquitous in all scientific fields and can express an unexpected event due to incomplete knowledge about the data distribution or to an unknown process that suddenly comes into play and distorts the observations. Usually, due to such events' rarity, scientists train deep learning (DL) models for the anomaly detection (AD) task on "normal" data only, i.e., nonanomalous samples, letting the neural network infer the distribution underlying the input data. In this context, we propose a novel framework, named multilayer one-class classification (MOCCA), to train and test DL models on the AD task. Specifically, we apply our approach to autoencoders. A key novelty of our work stems from the explicit optimization of the intermediate representations for the task at hand. Indeed, differently from commonly used approaches that treat a neural network as a single computational block, i.e., using the output of the last layer only, MOCCA explicitly leverages the multilayer structure of deep architectures. Each layer's feature space is optimized for AD during training, while in the test phase, the deep representations extracted from the trained layers are combined to detect anomalies. With MOCCA, we split the training process into two steps. First, the autoencoder is trained on the reconstruction task only. Then, we retain only the encoder, tasked with minimizing the L2 distance between the output representation and a reference point, the anomaly-free training data centroid, at each considered layer. Subsequently, we combine the deep features extracted at the various trained layers of the encoder model to detect anomalies at inference time. To assess the performance of models trained with MOCCA, we conduct extensive experiments on publicly available datasets, namely CIFAR10, MVTec AD, and ShanghaiTech. We show that our proposed method reaches performance comparable or superior to state-of-the-art approaches available in the literature.
Finally, we provide a model analysis to give insights regarding the benefits of our training procedure.
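The multilayer scoring idea described in the abstract can be illustrated with a minimal sketch: per-layer centroids are estimated from anomaly-free training features, and at inference time a sample's anomaly score combines its L2 distances to those centroids across layers. The function names (`layer_centroids`, `anomaly_score`) and the plain unweighted sum are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def layer_centroids(layer_feats):
    # layer_feats: list of (n_samples, dim_l) arrays of encoder features
    # extracted from anomaly-free training data, one array per layer.
    # The reference point for each layer is the training-data centroid.
    return [f.mean(axis=0) for f in layer_feats]

def anomaly_score(sample_feats, centroids):
    # sample_feats: list of (dim_l,) feature vectors for one test sample,
    # one per considered layer. The score combines the per-layer L2
    # distances to the corresponding centroids (here, a simple sum).
    return sum(np.linalg.norm(f - c) for f, c in zip(sample_feats, centroids))
```

A sample whose representations sit close to every layer's centroid receives a low score; distorted observations drift away from the centroids in at least one layer and score higher.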