Purpose Heterogeneous teams consisting of unmanned ground vehicles and unmanned aerial vehicles are being used for different types of missions such as surveillance, tracking and exploration. Exploration missions with heterogeneous robot teams (HeRTs) should acquire a common map for understanding the surroundings better. The purpose of this paper is to provide a unique approach with cooperative use of agents that provides a well-detailed observation over the environment where challenging details and complex structures are involved. Also, this method is suitable for real-time applications and autonomous path planning for exploration. Design/methodology/approach Lidar odometry and mapping and various similarity metrics such as Shannon entropy, Kullback-Leibler divergence, Jeffrey divergence, K divergence, Topsoe divergence, Jensen-Shannon divergence and Jensen divergence are used to construct a common height map of the environment. Furthermore, the authors presented the layering method that provides more accuracy and a better understanding of the common map. Findings In summary, with the experiments, the authors observed features located beneath the trees or the roofed top areas and above them without any need for global positioning system signal. Additionally, a more effective common map that enables planning trajectories for both vehicles is obtained with the determined similarity metric and the layering method. Originality/value In this study, the authors present a unique solution that implements various entropy-based similarity metrics with the aim of constructing common maps of the environment with HeRTs. To create common maps, Shannon entropy-based similarity metrics can be used, as it is the only one that holds the chain rule of conditional probability precisely. Seven distinct similarity metrics are compared, and the most effective one is chosen for getting a more comprehensive and valid common map. Moreover, different from all the studies in literature, the layering method is used to compute the similarities of each local map obtained by a HeRT. This method also provides the accuracy of the merged common map, as robots' sight of view prevents the same observations of the environment in features such as a roofed area or trees. This novel approach can also be used in global positioning system-denied and closed environments. The results are verified with experiments.