The automatic emotion recognition domain brings new methods and technologies that might be used to enhance therapy of children with autism. The paper aims at the exploration of methods and tools used to recognize emotions in children. It presents a literature review study that was performed using a systematic approach and PRISMA methodology for reporting quantitative and qualitative results. Diverse observation channels and modalities are used in the analyzed studies, including facial expressions, prosody of speech, and physiological signals. Regarding representation models, the basic emotions are the most frequently recognized, especially happiness, fear, and sadness. Both single-channel and multichannel approaches are applied, with a preference for the first one. For multimodal recognition, early fusion was the most frequently applied. SVM and neural networks were the most popular for building classifiers. Qualitative analysis revealed important clues on participant group construction and the most common combinations of modalities and methods. All channels are reported to be prone to some disturbance, and as a result, information on a specific symptoms of emotions might be temporarily or permanently unavailable. The challenges of proper stimuli, labelling methods, and the creation of open datasets were also identified.