Keywords: Deep learning, multi-modal artificial intelligence, breast cancer prediction, medical imaging, artificial intelligence, literature review.
Breast cancer remains one of the most prevalent and life-threatening diseases worldwide, representing a major challenge for healthcare systems and a leading cause of cancer-related mortality among women. Early and accurate prediction is essential for improving patient outcomes, enabling timely treatment, and reducing mortality rates. However, conventional diagnostic approaches rely heavily on manual interpretation of medical data, particularly imaging examinations, which may be subjective and limited in capturing the complex, heterogeneous, and multi-factorial nature of breast cancer. These limitations have motivated the increasing adoption of artificial intelligence–based solutions in medical diagnostics.
In recent years, deep learning has emerged as an effective approach for breast cancer prediction, offering automated, data-driven analysis of large and complex datasets. This paper presents a literature review of state-of-the-art multi-modal deep learning approaches for breast cancer prediction, focusing on methods that jointly exploit information from multiple data sources to enhance predictive performance and robustness. The reviewed studies integrate diverse data modalities, including medical imaging techniques such as mammography, ultrasound, histopathology, and magnetic resonance imaging, as well as complementary clinical and genetic information. By combining these heterogeneous data sources, multi-modal deep learning frameworks aim to provide a more comprehensive representation of the disease than single-modality approaches.
The review analyzes recent methodological trends in multi-modal learning, including data fusion strategies at the feature level and the decision level, as well as commonly adopted training and evaluation practices. Reported performance improvements are discussed in terms of standard evaluation metrics such as accuracy, precision, recall, and F1-score, and they point to the potential of multi-modal approaches to improve diagnostic reliability. In addition, the review identifies key challenges reported in the literature, including data heterogeneity across modalities, limited availability of well-annotated multi-modal datasets, limited model interpretability, and barriers to clinical deployment.
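For readers less familiar with these terms, the difference between feature-level and decision-level fusion, and the four evaluation metrics named above, can be illustrated with a minimal Python sketch. It is not drawn from any reviewed study: the function names are hypothetical, and a single logistic scorer stands in for the deep networks used in practice. The sketch assumes a binary benign/malignant labelling with 1 as the positive class.

```python
from math import exp


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def feature_level_fusion(imaging_features, clinical_features, weights, bias=0.0):
    """Feature-level (early) fusion: concatenate the per-modality feature
    vectors and score the joint vector with one model.
    Assumption: a logistic scorer stands in for a deep network."""
    fused = list(imaging_features) + list(clinical_features)
    if len(fused) != len(weights):
        raise ValueError("weights must match the length of the fused vector")
    return sigmoid(sum(w * x for w, x in zip(weights, fused)) + bias)


def decision_level_fusion(modality_probs, modality_weights=None):
    """Decision-level (late) fusion: combine the probabilities predicted by
    separate per-modality models with a weighted average."""
    if modality_weights is None:
        modality_weights = [1.0] * len(modality_probs)
    if len(modality_weights) != len(modality_probs):
        raise ValueError("one weight is required per modality")
    total = sum(modality_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(modality_weights, modality_probs)) / total


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In this sketch, feature-level fusion lets one model learn interactions across modalities but requires every modality to be present for each patient, whereas decision-level fusion keeps the per-modality models independent and combines only their outputs. Which strategy performs better is an empirical question that depends on the dataset.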
Overall, this literature review provides an up-to-date overview of advances in multi-modal deep learning for breast cancer prediction and synthesizes current knowledge in this rapidly evolving field. The findings highlight promising research directions toward the development of more accurate, robust, and clinically meaningful AI-based diagnostic systems. Future research may focus on improving model transparency, validating approaches on large-scale real-world datasets, and extending multi-modal deep learning techniques to other cancer types and related medical applications.