This study proposes a new state space representation of the protein folding problem for use with reinforcement learning methods. In existing studies, the state-action space is defined in a way that is valid only for one particular amino-acid sequence, which prevents the agent from learning a state space that applies to any amino-acid sequence. Moreover, in existing methods the size of the state space depends strictly on the length of the amino-acid sequence. The newly proposed state-action space reduces this dependency and allows the agent to find the optimal fold of any sequence of a given length. Additionally, by using an ant-based reinforcement learning algorithm, Ant-Q, the optimum fold of a protein is found more rapidly than with the standard Q-learning algorithm. Experiments showed that the new state-action space, combined with the ant-based reinforcement learning method, is much better suited to the protein folding problem in the two-dimensional lattice model. © 2014 Elsevier B.V. All rights reserved.