Self-similar behavior of aggregate traffic is a well-known phenomenon in networking. In this paper, we first study the self-similarity of empirically aggregated VoIP traffic in a heterogeneous wireless network testbed and then model it using fractional Gaussian noise (fGn). The heterogeneity arises from the use of different wireless technologies in the backbone and access networks: the backbone of the testbed is IEEE 802.16d WiMAX, whereas the access network is an IEEE 802.11b WiFi mesh. We evaluate self-similarity in terms of throughput and packet inter-arrival time, using VoIP calls generated by softphones and captured in the laboratory. We demonstrate the self-similar characteristics of the collected data through stochastic analysis based on autocorrelation functions, and we implement three well-known time-domain estimators to obtain Hurst parameter values for both metrics. We also propose an fGn model for the empirically aggregated VoIP data. The self-similarity analysis and modeling presented in this work can inform the design of quality-of-service frameworks and resource allocation mechanisms, such as buffer allocation, in heterogeneous wireless networks.
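
As a rough illustration of the kind of analysis summarized above, the following Python sketch computes the theoretical autocovariance of unit-variance fGn, draws a synthetic fGn sample, and estimates the Hurst parameter with the aggregated variance method. The abstract does not name the three time-domain estimators used, so the choice of the aggregated variance method is an assumption, as are the function names `fgn_autocovariance`, `sample_fgn`, and `aggregated_variance_hurst`, the Cholesky-based sampler, and the block sizes. It is not the authors' implementation and uses synthetic data rather than the captured VoIP traces.

```python
import numpy as np


def fgn_autocovariance(hurst, n):
    """Autocovariance of unit-variance fGn at lags 0..n-1:
    gamma(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H))."""
    k = np.arange(n, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (np.abs(k + 1) ** two_h
                  - 2.0 * np.abs(k) ** two_h
                  + np.abs(k - 1) ** two_h)


def sample_fgn(hurst, n, rng):
    """Draw n fGn samples via a Cholesky factor of the covariance matrix.
    Simple but O(n^3); an illustrative choice suited to short series only."""
    gamma = fgn_autocovariance(hurst, n)
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    chol = np.linalg.cholesky(gamma[lags])  # Toeplitz covariance matrix
    return chol @ rng.standard_normal(n)


def aggregated_variance_hurst(x, block_sizes):
    """Aggregated variance estimate of the Hurst parameter.
    For a self-similar series, the variance of block means scales as
    m^(2H - 2), so H = 1 + slope / 2 on a log-log plot."""
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(means.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0
```

In such a sketch, a throughput or inter-arrival series would take the place of the synthetic sample. An estimate near 0.5 indicates no long-range dependence, while values between 0.5 and 1 indicate self-similar behavior; the estimator is known to be biased on short series, so results are best compared across several estimators.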