In today's technology, data collection processes must handle simultaneous data sources. For the data to be analyzed or used by machine learning methods, data coming from different sources must first be aligned with each other. In this study, participants wore eye-tracker devices, and all segments watched in the prepared experimental environment were recorded as video by the eye-tracking device. At the same time, video of the participants wearing the eye-tracking devices was recorded via a webcam. To analyze a participant's behavior while watching the video, the two simultaneously recorded videos must be evaluated together. However, the start times of the videos recorded by the webcam and the eye-tracking device may differ, and there may also be differences of several seconds between the system clocks of these devices. Aligning the eye-tracker and webcam recordings and comparing the results will make it possible, in the next stage of the research, to determine what a user feels when looking at a video or an emotional image on the screen. To align the webcam and eye-tracker videos, the video files were converted into audio files. The cross-correlation method was used for the alignment process. Manual alignment was also performed in order to compare the accuracy of the results. In addition, features of the audio files were extracted, and an alignment was performed with the Marsyas open-source system by applying the Dynamic Time Warping (DTW) algorithm to these features.
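The cross-correlation step described above can be sketched as follows. This is a minimal illustration using NumPy, not the paper's actual implementation: the function name, signal lengths, and sample rate are assumptions, and the two "recordings" are synthetic signals where one is a delayed copy of the other.

```python
import numpy as np

def estimate_offset(ref, other, sample_rate):
    """Estimate the lag (in seconds) of `other` relative to `ref`
    by locating the peak of their cross-correlation."""
    corr = np.correlate(other, ref, mode="full")
    # Index of the peak, shifted so lags run from -(len(ref)-1) upward.
    lag = np.argmax(corr) - (len(ref) - 1)
    return lag / sample_rate

# Toy example: `other` is `ref` delayed by 100 samples.
rate = 1000  # samples per second (assumed)
rng = np.random.default_rng(0)
ref = rng.standard_normal(2000)
other = np.concatenate([np.zeros(100), ref])[:2000]
print(estimate_offset(ref, other, rate))  # -> 0.1 (other starts 0.1 s later)
```

In practice the same idea would be applied to the audio tracks extracted from the two video files: the peak of the cross-correlation gives the time offset by which one recording should be shifted to align it with the other.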