Deep RL-Based Spectrum Occupancy Prediction Exploiting Time and Frequency Correlations

Aygul M. A., Nazzal M., Arslan H.

IEEE Wireless Communications and Networking Conference (IEEE WCNC), Texas, United States Of America, 10 - 13 April 2022, pp.2399-2404

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/wcnc51071.2022.9771702
  • City: Texas
  • Country: United States Of America
  • Page Numbers: pp.2399-2404
  • Keywords: Cognitive radio, deep reinforcement learning, real world spectrum measurements, spectrum occupancy prediction, time and frequency correlations
  • Istanbul Technical University Affiliated: Yes


In cognitive radio systems, predicting spectrum occupancy is a convenient alternative to continuous spectrum sensing. It provides information on spectrum usage so that secondary users can exploit empty spectrum bands. The usage of spectrum bands is highly correlated over both time and frequency. Recently, machine learning algorithms have been used to predict spectrum occupancy by exploiting such correlations. However, this approach primarily assumes a supervised learning setting. Despite its outstanding performance, this setting requires sufficiently large labeled datasets and does not adapt to changes in the environment. In this paper, unlike the existing literature, a deep reinforcement learning (RL) algorithm is used to alleviate these shortcomings. We define the reward functions of the deep RL setting and its state and action spaces so that the algorithm can operate dynamically, in an online fashion, in real-world settings. Extensive experiments on real-world spectrum measurements validate the capability of the proposed algorithm in predicting spectrum occupancy. The measurements were carried out in the 832-862 MHz frequency bands, which are used by the leading Turkish telecom providers as private uplink bands. This is a significant step towards realizing standalone spectrum occupancy prediction that requires no control from the operator and minimizes memory requirements while alleviating the need for labeled datasets.
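
The abstract does not spell out the state space, action space, or reward functions, so the following is only an illustrative sketch of how occupancy prediction can be cast as an online RL problem, not the authors' method. It swaps the deep network for a small tabular Q-table and the real measurements for synthetic data. Everything in it is an assumption made for illustration: the names `make_occupancy`, `encode_state`, and `run_online`, the three-channel state, the +1/-1 reward, and all parameter values.

```python
import numpy as np


def make_occupancy(n_steps=5000, n_channels=8, stay=0.9, seed=0):
    """Synthetic binary occupancy grid (time x frequency); NOT real measurements.

    Each channel usually keeps its previous state (time correlation) and
    otherwise often copies its lower neighbour (frequency correlation).
    """
    rng = np.random.default_rng(seed)
    occ = np.zeros((n_steps, n_channels), dtype=int)
    occ[0] = rng.integers(0, 2, n_channels)
    for t in range(1, n_steps):
        for c in range(n_channels):
            if rng.random() < stay:
                occ[t, c] = occ[t - 1, c]
            elif c > 0 and rng.random() < 0.5:
                occ[t, c] = occ[t, c - 1]
            else:
                occ[t, c] = rng.integers(0, 2)
    return occ


def encode_state(occ, t, channel):
    """Assumed state: occupancy of the channel and its two neighbours at time t."""
    left, mid, right = occ[t, channel - 1], occ[t, channel], occ[t, channel + 1]
    return int(4 * left + 2 * mid + right)  # integer in 0..7


def run_online(occ, channel=3, alpha=0.1, gamma=0.0, epsilon=0.1, seed=1):
    """Online tabular Q-learning; the action is the predicted next occupancy.

    The prediction does not influence the spectrum, so gamma=0 is a
    reasonable default here (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((8, 2))  # 8 states x 2 actions (predict idle / predict busy)
    hits = []
    for t in range(len(occ) - 2):
        s = encode_state(occ, t, channel)
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = int(rng.integers(0, 2))
        else:
            a = int(np.argmax(q[s]))
        # Assumed reward: +1 for a correct prediction, -1 otherwise.
        correct = a == occ[t + 1, channel]
        reward = 1.0 if correct else -1.0
        s_next = encode_state(occ, t + 1, channel)
        q[s, a] += alpha * (reward + gamma * q[s_next].max() - q[s, a])
        hits.append(bool(correct))
    # Accuracy over the last 1000 online predictions (exploration included).
    return q, float(np.mean(hits[-1000:]))


occ = make_occupancy()
q_table, recent_accuracy = run_online(occ)
print("accuracy over the last 1000 steps:", recent_accuracy)
```

The agent learns from each new observation as it arrives and needs no labeled training set, which is the property the abstract emphasizes. A deep RL variant would replace the Q-table with a neural network so that the state can cover a longer time window and more channels.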