Methods for developing and implementing large language models in healthcare: challenges and prospects in Russia

Eugeny Yu. Shchetinin; Tatyana R. Velieva; Lyubov A. Yurgina; Anastasia V. Demidova; Leonid A. Sevastianov

doi:10.22363/2658-4670-2025-33-3-327-344

Authors: Shchetinin E.Y.¹, Velieva T.R.², Yurgina L.A.², Demidova A.V.², Sevastianov L.A.²,³
Affiliations:
1. Sevastopol State University
2. RUDN University
3. Joint Institute for Nuclear Research
Issue: Vol 33, No 3 (2025)
Pages: 327-344
Section: Letters to the Editor
URL: https://bakhtiniada.ru/2658-4670/article/view/348826
DOI: https://doi.org/10.22363/2658-4670-2025-33-3-327-344
EDN: https://elibrary.ru/HJAJCB
ID: 348826

Abstract

Large language models (LLMs) are transforming healthcare by enabling the analysis of clinical texts, supporting diagnostics, and facilitating decision-making. This systematic review examines the evolution of LLMs from recurrent neural networks (RNNs) to transformer-based and multimodal architectures (e.g., BioBERT, Med-PaLM), with a focus on their application in medical practice and challenges in Russia. Based on 40 peer-reviewed articles from Scopus, PubMed, and other reliable sources (2019-2025), LLMs demonstrate high performance (e.g., Med-PaLM: F1-score 0.88 for binary pneumonia classification on MIMIC-CXR; Flamingo-CXR: 77.7% preference for in/outpatient X-ray reports). However, limitations include data scarcity, interpretability challenges, and privacy concerns. An adaptation of the Mixture of Experts (MoE) architecture for rare disease diagnostics and automated radiology report generation achieved promising results on synthetic datasets. Challenges in Russia include limited annotated data and compliance with Federal Law No. 152-FZ. LLMs enhance clinical workflows by automating routine tasks, such as report generation and patient triage, with advanced models like KARGEN improving radiology report quality. Russia’s focus on AI-driven healthcare aligns with global trends, yet linguistic and infrastructural barriers necessitate tailored solutions. Developing robust validation frameworks for LLMs will ensure their reliability in diverse clinical scenarios. Collaborative efforts with international AI research communities could accelerate Russia’s adoption of advanced medical AI technologies, particularly in radiology automation. Prospects involve integrating LLMs with healthcare systems and developing specialized models for Russian medical contexts. This study provides a foundation for advancing AI-driven healthcare in Russia.

Keywords

large language models, healthcare, deep learning, clinical text analysis, radiology report generation, interpretability, Russian healthcare

About the authors

Eugeny Yu. Shchetinin

Sevastopol State University

Author for correspondence.
Email: riviera-molto@mail.ru
ORCID iD: 0000-0003-3651-7629
Scopus Author ID: 16408533100
ResearcherId: O-8287-2017

Doctor of Physical and Mathematical Sciences, Professor at the Department of Information Technology and Systems

33 Universitetskaya Street, Sevastopol, 299053, Russian Federation

Tatyana R. Velieva

RUDN University

Email: velieva-tr@rudn.ru
ORCID iD: 0000-0003-4466-8531

Candidate of Physical and Mathematical Sciences, Assistant Professor at the Department of Probability Theory and Cyber Security

6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

Lyubov A. Yurgina

RUDN University

Email: yurgina_la@pfur.ru
ORCID iD: 0009-0004-4661-5059

Ph.D. in Pedagogical Sciences, Head of the Department of Mathematics and Information Technology of the Sochi branch

32 Kuibyshev St, Sochi, 354340, Russian Federation

Anastasia V. Demidova

RUDN University

Email: demidova-av@rudn.ru
ORCID iD: 0000-0003-1000-9650

Candidate of Physical and Mathematical Sciences, Associate Professor at the Department of Probability Theory and Cyber Security

6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation

Leonid A. Sevastianov

RUDN University; Joint Institute for Nuclear Research

Email: sevastianov-la@rudn.ru
ORCID iD: 0000-0002-1856-4643

Professor, Doctor of Sciences in Physics and Mathematics, Professor at the Department of Computational Mathematics and Artificial Intelligence of RUDN University, Leading Researcher of Bogoliubov Laboratory of Theoretical Physics, Joint Institute for Nuclear Research

6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation; 6 Joliot-Curie St, Dubna, 141980, Russian Federation
