Background Coronavirus disease 2019 (COVID-19) has spread rapidly across the globe, seriously threatening public health worldwide. To reduce the diagnostic burden on front-line doctors, an accurate and automatic lesion segmentation method is highly desirable in clinical practice. Purpose Many existing two-dimensional (2D) methods for slice-based lesion segmentation cannot take full advantage of the spatial information in three-dimensional (3D) volume data, which limits their segmentation performance. Three-dimensional methods can exploit this spatial information but suffer from long training times and slow convergence. To address these problems, we propose an end-to-end hybrid-feature cross fusion network (HFCF-Net) that fuses 2D and 3D features at three scales for accurate segmentation of COVID-19 lesions. Methods The proposed HFCF-Net incorporates 2D and 3D subnets to extract features within and between slices effectively. A cross fusion module is then designed to bridge the 2D and 3D decoders at the same scale and fuse both types of features. The module consists of three cross fusion blocks, each of which contains a prior fusion path and a context fusion path to jointly learn better lesion representations. The former explicitly provides the 3D subnet with lesion-related prior knowledge, and the latter uses the 3D context information as attention guidance for the 2D subnet, which promotes precise segmentation of the lesion regions. Furthermore, we explore an imbalance-robust adaptive learning loss function that includes an image-level loss and a pixel-level loss to tackle the problems caused by the marked imbalance between the proportions of lesion and non-lesion voxels. It also provides a learning strategy that dynamically adjusts the learning focus between the 2D and 3D branches during training for effective supervision.
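The abstract does not give the formula of the adaptive learning loss, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a soft Dice term as the pixel-level loss, a binary cross-entropy on the maximum voxel probability as the image-level loss, and branch weights proportional to each branch's current loss. All function names and these three choices are hypothetical.

```python
import math

def soft_dice_loss(probs, target, eps=1e-6):
    """Pixel-level term (assumed): 1 - soft Dice over flattened voxels."""
    inter = sum(p * t for p, t in zip(probs, target))
    denom = sum(probs) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def image_level_bce(probs, target, eps=1e-6):
    """Image-level term (assumed): does the image contain any lesion?
    The image score is taken as the maximum voxel probability."""
    score = min(max(max(probs), eps), 1.0 - eps)
    label = 1.0 if any(target) else 0.0
    return -(label * math.log(score) + (1.0 - label) * math.log(1.0 - score))

def branch_loss(probs, target):
    """Loss of one branch: pixel-level plus image-level term."""
    return soft_dice_loss(probs, target) + image_level_bce(probs, target)

def adaptive_total_loss(probs_2d, probs_3d, target):
    """Weight each branch by its share of the summed loss (assumed rule),
    so the branch that is currently worse receives more learning focus."""
    l2d = branch_loss(probs_2d, target)
    l3d = branch_loss(probs_3d, target)
    total = l2d + l3d
    w2d = l2d / total if total > 0 else 0.5
    w3d = 1.0 - w2d
    return w2d * l2d + w3d * l3d, (w2d, w3d)

# Toy example: one lesion voxel among eight (strong class imbalance).
target = [0, 0, 0, 0, 0, 0, 0, 1]
probs_2d = [0.1] * 7 + [0.9]   # hypothetical confident 2D branch
probs_3d = [0.4] * 7 + [0.6]   # hypothetical less certain 3D branch
loss, (w2d, w3d) = adaptive_total_loss(probs_2d, probs_3d, target)
```

In this toy case the 3D branch has the larger loss and therefore receives the larger weight. The overlap-based pixel term also illustrates the imbalance issue: predicting background everywhere is correct for seven of eight voxels, yet its soft Dice loss is close to 1.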
Results Extensive experiments conducted on a publicly available dataset demonstrate that the proposed segmentation network significantly outperforms several state-of-the-art methods for COVID-19 lesion segmentation, yielding a Dice similarity coefficient of 74.85%. Visual comparison of the segmentation results also supports the superiority of the proposed network in segmenting lesions of different sizes. Conclusions In this paper, we propose a novel HFCF-Net for rapid and accurate COVID-19 lesion segmentation from chest computed tomography volume data. It fuses hybrid features in a cross manner for lesion segmentation, so that the advantages of the 2D and 3D subnets complement each other and enhance segmentation performance. Benefiting from the cross fusion mechanism, the proposed HFCF-Net can segment lesions more accurately with the knowledge acquired from both subnets.
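For reference, the Dice similarity coefficient quoted in the results is conventionally defined as below. The symbols $P$ (set of voxels predicted as lesion) and $G$ (set of voxels labeled as lesion in the ground truth) are introduced here for illustration and do not appear in the abstract.

```latex
\mathrm{DSC}(P, G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P \rvert + \lvert G \rvert}
```

As a small worked case, a prediction of 2 lesion voxels that overlaps a 1-voxel ground-truth lesion in 1 voxel gives $\mathrm{DSC} = 2 \cdot 1 / (2 + 1) \approx 0.667$.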