Published on 24.Sep.2025 in Vol 14 (2025)

This is a member publication of Imperial College London (Jisc)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/77494, first published .
Machine Learning in Health Economic Evaluations: Protocol for a Scoping Review


Protocol

1Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom

2Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, United Kingdom

3Department of Primary Care and Public Health, School of Public Health, Imperial College London, London, United Kingdom

Corresponding Author:

Edward Meinert

Department of Primary Care and Public Health

School of Public Health

Imperial College London

Exhibition Rd, South Kensington

London, SW7 2AZ

United Kingdom

Phone: 44 20 7589 5111

Email: e.meinert14@imperial.ac.uk


Background: In recent years, the development of machine learning (ML) applications has increased substantially, indicating the potential role of ML in transforming health care. However, the integration of ML approaches into health economic evaluations is underexplored and has several challenges.

Objective: This scoping review aims to explore the applications of ML in health economic evaluations. This review will also seek to identify some potential challenges to the use of ML in health economic evaluations.

Methods: This review will use PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) methods. The search will be conducted on MEDLINE (Ovid), Embase (Ovid), IEEE Xplore, and Cochrane Library databases. The eligibility criteria of the selection process will be based on the study types, data sources, methods, and outcomes (SDMO) framework approach.

Results: The database search yielded 4141 records after removal of retractions and duplicates. Title and abstract screening of 3718 records has been completed, resulting in 30 reports retrieved for eligibility assessment. Data extraction and charting are currently in progress. The results will be published in peer-reviewed journals by the end of 2025.

Conclusions: This review will build on the current understanding of how ML applications are integrated into health economic evaluations. It will also explore the potential barriers to and challenges of using ML in health economic evaluations.

International Registered Report Identifier (IRRID): DERR1-10.2196/77494

JMIR Res Protoc 2025;14:e77494

doi:10.2196/77494



Background

Machine learning (ML) is increasingly used in the economic evaluation of health care interventions, where it can improve predictive accuracy and resource allocation [1,2]. ML is a subfield of artificial intelligence comprising statistical techniques that allow algorithms to learn from data and improve performance without being explicitly programmed [3,4]. Health economic evaluation, by contrast, provides a structured approach for comparing the costs and consequences of health care interventions to identify the most efficient use of resources [5,6]. ML models use extensive real-world data, such as patient demographics, clinical backgrounds, treatment responses, and health care resource usage, to evaluate numerous parameters. The speed of ML analysis also allows efficient and accurate estimation of the key parameters considered when comparing 2 health care interventions, such as the costs of medicines or technologies, the prevalence of illnesses, or the effectiveness of different treatments [7-10]. ML has significant potential to enhance health economic evaluations [9-19]; however, further exploration is required to fully understand its methodological integration and practical implications.

Despite the growing interest in ML across health research, its application within the domain of the health economic evaluation process remains limited and underexplored. Existing literature tends to focus broadly on ML applications in health outcomes research, often emphasizing prediction, diagnosis, or economic impact rather than incorporating it in the health economic evaluation process [1,2,20]. For example, a scoping review reported that 42% of studies using ML in economic evaluations focused on clinical event prediction, 22% on treatment outcomes, 16% on health care resource utilization, and 3% on cost prediction [1].

Understanding how ML is currently used in health economic evaluation is essential to ensure methodological transparency and to clarify how ML could be integrated into the evaluation process. One example is a study that used ML in its data analysis to compare the cost-effectiveness of high-flow nasal cannula therapy with that of continuous positive airway pressure for acutely ill children [10]. The analysis indicated that high-flow nasal cannula therapy is more cost-effective for male infant patients and patients without severe respiratory distress, as it has an incremental net monetary benefit of £5310 (US $6590) overall [10]. Another example is breast cancer screening, in which an ML-based risk-stratified model might save £60.4-85.3 million (US $74.9-105.9 million) per year while improving health outcomes compared to traditional screening [19]. Furthermore, applying risk assessment models can save costs while increasing value by identifying patients who can be safely discharged from the emergency room [18]. Additionally, ML techniques have shown promise in addressing nonlinear relationships and high-dimensional datasets that challenge traditional regression-based approaches [9,14,15]. For example, combining linear regression with feature selection methods such as lasso, random forest, and extreme gradient boosting has improved predictive performance in modelling complex health care costs [9,14,15]. There is growing recognition that ML could strengthen several dimensions of economic evaluation, including the ability to capture patient heterogeneity, perform risk-stratified analysis, optimize model inputs, and handle complex data [9-19].
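For readers less familiar with the outcome measures mentioned above, the following minimal sketch shows how an incremental cost-effectiveness ratio (ICER) and an incremental net monetary benefit (INMB) are conventionally computed from incremental costs and effects. The function names, input values, and willingness-to-pay threshold are hypothetical assumptions chosen for illustration; they are not taken from any of the cited studies.

```python
def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return delta_cost / delta_effect


def inmb(delta_cost: float, delta_effect: float, threshold: float) -> float:
    """Incremental net monetary benefit: threshold * delta_effect - delta_cost."""
    return threshold * delta_effect - delta_cost


# Hypothetical inputs (assumptions for illustration only)
delta_cost = 1000.0    # new intervention costs 1000 more per patient
delta_effect = 0.25    # and yields 0.25 additional QALYs per patient
threshold = 20000.0    # assumed willingness to pay per QALY gained

example_icer = icer(delta_cost, delta_effect)
example_inmb = inmb(delta_cost, delta_effect, threshold)
```

Under these assumed inputs, a positive INMB indicates that the new intervention would be considered cost-effective at the chosen threshold, which is equivalent to the ICER falling below that threshold when the incremental effect is positive.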

To the best of our knowledge, there are no previous reviews that directly explore the use of ML in health economic evaluation. However, ML has the ability to improve the health economic evaluation process through analysis of patient heterogeneity, model parameter optimization, and risk-stratified analysis [10,12,16,18]. To address this gap, this review aims to examine how ML has been applied in economic evaluations in health care and to explore the potential barriers and opportunities of using ML in the field of health economic evaluation. Guided by this aim, the review is structured around the following research questions:

  1. How have ML techniques been applied within the process of health economic evaluations?
  2. Which technical components of full economic evaluations (eg, model calibration, parameter estimation, heterogeneity analysis, uncertainty quantification, and metamodeling) have incorporated ML methods?
  3. What challenges and barriers have been reported in applying ML within health economic evaluations?

Objectives

The aim of this scoping review is to identify ML applications in the health economic evaluation process. Specifically, it maps how ML techniques are integrated into the technical components of full economic evaluations, such as model calibration, parameter estimation, heterogeneity analysis, uncertainty quantification, and metamodeling, regardless of the clinical or health sector context. The output will focus on the methods adopted and the challenges and barriers faced.


Overview

This scoping review follows Levac’s [21] recommended update to the Arksey and O’Malley scoping review framework to ensure a systematic and rigorous approach to evidence synthesis [22]. The PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines inform the reporting of the review [23]. The final report will include a PRISMA-ScR checklist (Multimedia Appendix 1).

Search Strategy and Study Selection

A comprehensive literature search will be conducted on MEDLINE (Ovid), Embase (Ovid), IEEE Xplore, and the Cochrane Library. We will consult with information specialists to improve the search strategy. The search will use Medical Subject Headings (MeSH) terms with Boolean operators (AND, OR). The preliminary search approach is shown in Multimedia Appendix 1.
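As a rough illustration of how concept blocks are combined with Boolean operators, the sketch below joins synonyms with OR and then joins the resulting blocks with AND. The helper names and the example terms are hypothetical and do not reproduce the review's actual search strategy, which is provided in Multimedia Appendix 1; database-specific syntax (eg, Ovid field tags or MeSH explosion) is deliberately left out.

```python
def or_block(terms):
    """Join the synonyms for one concept with OR, quoting multiword phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts):
    """Combine the concept blocks with AND."""
    return " AND ".join(or_block(terms) for terms in concepts)


# Hypothetical example terms; not the review's actual search strategy
ml_terms = ["machine learning", "deep learning", "random forest"]
econ_terms = ["cost-effectiveness analysis", "economic evaluation", "cost-utility"]

query = build_query([ml_terms, econ_terms])
print(query)
```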

Eligibility Criteria

This review will adopt a study types, data sources, methods, and outcomes (SDMO) framework approach [24]. This framework is used to make sure that the studies included in the selection process are relevant. The eligibility criteria are presented in Table 1. We included peer-reviewed, full-text articles and relevant conference abstracts that applied ML techniques within full economic evaluations. All health care settings were eligible, with no restrictions on publication year. Only studies published in English were considered.

Table 1. Eligibility criteria.
| Category | Inclusion criteria | Exclusion criteria |
| --- | --- | --- |
| Types of studies | Quantitative studies and economic evaluations. | Opinion pieces, editorials, and studies without economic evaluation. |
| Types of data | All types of data, including EHRs^a, claims databases, clinical trial data, administrative health care datasets, and simulated or hypothetical data. | None. |
| Types of methods | Studies that use supervised, unsupervised, reinforcement learning, or deep learning models for economic evaluations. | Studies using ML^b for disease prediction or clinical decision-making without economic evaluations. |
| Outcomes | Studies that evaluate health economic results such as ICERs^c, QALYs^d, utility scores, cost predictions, and health resources. | Studies without economic evaluation outcomes. |

^a EHR: electronic health record.

^b ML: machine learning.

^c ICER: incremental cost-effectiveness ratio.

^d QALY: quality-adjusted life year.

Screening and Article Selection

All references identified from database searching were first imported into EndNote [25] for reference management, with removal of duplicate references, and then uploaded to Rayyan [26]. Two independent reviewers (AK and HA) will screen titles and abstracts based on the eligibility criteria, working blinded and independently in different locations on separate copies of the database. Discrepancies will be resolved through discussion or consultation with a third reviewer.
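The general logic of duplicate removal can be sketched as follows. This is an illustrative assumption about how such matching might work (on DOI where available, otherwise on a normalized title); it is not a description of EndNote's or Rayyan's internal algorithms, and the record fields and example records are hypothetical.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation and whitespace so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI, or for each normalized title if no DOI."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").lower().strip()
        # Note: a record with a DOI will not match a copy of it that lacks one;
        # such cases would still need manual checking.
        key = ("doi", doi) if doi else ("title", normalize_title(rec["title"]))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records exported from two databases
records = [
    {"title": "ML for Cost Prediction", "doi": "10.1000/example1"},
    {"title": "ML for cost prediction.", "doi": "10.1000/EXAMPLE1"},
    {"title": "A Different Study", "doi": None},
    {"title": "a different study", "doi": ""},
]

unique = deduplicate(records)
```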

Although a formal calibration exercise was not conducted, both reviewers (AK and HA) discussed the initial records to ensure a shared understanding of the inclusion and exclusion criteria. These criteria served as the guiding reference throughout the screening process. Consistency was maintained throughout the screening process via ongoing communication, with discrepancies resolved through discussion and, when necessary, adjudication by a third reviewer. Interrater agreement statistics were not calculated, but reviewer alignment was tracked throughout.

Data Extraction and Analysis

A structured data extraction form will be developed a priori and piloted to ensure consistency and reproducibility in capturing relevant study characteristics and to understand the gap in applying ML in health economic evaluation. Data will be managed using Microsoft Excel, allowing for systematic organization, coding, and retrieval of key information.

Two independent reviewers will perform data extraction, ensuring interreviewer reliability. Any discrepancies in extracted data will be resolved through discussion or, if necessary, by consulting a third reviewer. The extracted information will include the key domains described in Table 2.

Table 2. Key domains for extracted information.
| Domain | Description |
| --- | --- |
| Study characteristics | Authors, year of publication, country, study design, study objective, study setting, and data sources. |
| Economic evaluation | Type of economic evaluation, such as cost-effectiveness analysis and cost-utility analysis, in addition to how the economic evaluation was carried out (alongside a clinical trial or economic decision models). |
| ML^a methodology | Type of ML approach used, such as supervised learning, deep learning, or reinforcement learning algorithms. |
| Outcomes | All economic evaluation outcomes, including ICERs^b, QALYs^c, utility scores, cost predictions, and health care resource allocation. |
| Challenges and limitations | Identified ML implementation barriers such as computational complexity, data security, and data availability. |

^a ML: machine learning.

^b ICER: incremental cost-effectiveness ratio.

^c QALY: quality-adjusted life year.
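One way to keep extraction consistent between reviewers is to fix the form's fields in advance. The sketch below mirrors the domains in Table 2 as a simple record type; the class name, field names, and the example entry are hypothetical and are not the review's actual extraction form, which will be managed in Microsoft Excel.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ExtractionRecord:
    # Study characteristics
    authors: str
    year: int
    country: str
    study_design: str
    # Economic evaluation
    evaluation_type: str
    # ML methodology
    ml_approach: str
    # Outcomes and challenges may hold several entries per study
    outcomes: List[str] = field(default_factory=list)
    challenges: List[str] = field(default_factory=list)


# Hypothetical entry for illustration only
record = ExtractionRecord(
    authors="Example et al",
    year=2024,
    country="United Kingdom",
    study_design="Model-based evaluation",
    evaluation_type="cost-utility analysis",
    ml_approach="supervised learning",
    outcomes=["ICER", "QALY"],
    challenges=["data availability"],
)

# A flat dictionary is convenient for writing one spreadsheet row per study
row = asdict(record)
```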

Data Synthesis

Data will be synthesized narratively using descriptive analysis. Studies will be categorized based on ML methodologies, economic evaluation type, and key outcomes. Narrative analysis will identify the themes that serve as opportunities or barriers to the application of ML in economic evaluations. In addition to narrative synthesis, we will also consider presenting findings using appropriate visual summaries, such as tables, flow charts, and descriptive figures.

The extracted data will be synthesized narratively using both descriptive and thematic approaches. Studies will first be categorized based on key characteristics, including study design, clinical area, data sources, ML methodology, type of economic evaluation, computational complexity, data security and availability concerns, and the context and stage of ML application within the economic evaluation process. Subsequently, a thematic analysis will be conducted to identify recurring methodological patterns, innovations, and implementation challenges. Themes will be developed inductively and refined through team discussion.
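The descriptive step of this synthesis amounts to counting studies within each category. A minimal sketch, assuming one dictionary per included study with hypothetical field names and entirely made-up entries, could look like this:

```python
from collections import Counter


def tabulate(studies, key):
    """Count how many studies fall into each category of the given field."""
    return Counter(study[key] for study in studies)


# Made-up studies for illustration only; not results of the review
studies = [
    {"id": "S1", "ml_approach": "supervised learning", "evaluation_type": "cost-effectiveness analysis"},
    {"id": "S2", "ml_approach": "supervised learning", "evaluation_type": "cost-utility analysis"},
    {"id": "S3", "ml_approach": "reinforcement learning", "evaluation_type": "cost-effectiveness analysis"},
]

by_ml_approach = tabulate(studies, "ml_approach")
by_evaluation_type = tabulate(studies, "evaluation_type")
```

Counts of this kind can feed the summary tables and descriptive figures, while the thematic analysis of challenges remains a qualitative step carried out by the review team.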


The preliminary search conducted in April 2024 yielded 4141 records from the selected databases, namely, MEDLINE (Ovid), Embase (Ovid), IEEE Xplore, and the Cochrane Library. Results will be disseminated through publication in a peer-reviewed journal by the end of 2025. The database searches are ongoing, and the PRISMA-ScR flowchart (Figure 1) illustrates the study selection process. The flowchart will be updated upon completion of full-text screening and data extraction. Based on an initial scan of titles and abstracts, the retrieved studies appeared to include a range of designs, including simulation-based modelling studies, retrospective cohorts, and hybrid designs that integrate trial data with economic models. Thematically, many studies applied ML for tasks such as model calibration, parameter estimation, heterogeneity analysis, uncertainty quantification, and metamodeling.

Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram of the study selection process.

Anticipated Findings

This scoping review is expected to identify how ML methods have been applied within health economic evaluations, particularly in areas such as model calibration, parameter estimation, heterogeneity analysis, uncertainty quantification, and metamodeling. By synthesizing this evidence, the review will map current practices and highlight methodological gaps that could inform future development of economic evaluation frameworks.

Most previous reviews examine how ML algorithms improve prediction, diagnosis, and economic impact; to the best of our knowledge, none have specifically explored ML within the health economic evaluation process [1,2,20]. This scoping review will seek to provide more comprehensive evidence on how ML is incorporated into the health economic evaluation process and on the opportunities and barriers it could face there.

In this review, we aim to explore the existing literature in an unbiased manner; however, there are several limitations that should be acknowledged. For instance, not including grey literature may lead to missing relevant evidence from conference proceedings and other nonindexed sources. This decision was made to enhance the rigor and reliability of findings. Moreover, restricting the search to English-language publications may introduce language bias and miss relevant studies published in other languages [27]. Furthermore, this review did not assess the quality or risk of bias of the included studies, which allowed for a broad inclusion of relevant literature.

The findings of this scoping review may guide the design of future systematic reviews and comparative analyses that directly assess the effectiveness of ML-based versus traditional approaches in health economic evaluation. It may also highlight methodological areas where further research is required, such as handling of heterogeneity, validation of ML-based models, and transparency in reporting.

Conclusions

This scoping review will explore how ML applications could be incorporated into the health economic evaluation process. It will identify opportunities for and barriers to this integration, clarifying the possibilities of applying ML in health economic evaluation. By mapping current practices, the review will contribute to a clearer understanding of the role of ML in areas such as model calibration, parameter estimation, heterogeneity analysis, and uncertainty quantification. The findings will help inform the design of future systematic reviews and comparative studies, and guide methodological research aimed at strengthening the integration of ML into economic evaluation frameworks. Results will be disseminated through a peer-reviewed journal article, academic conference presentations, and professional networks in health economics and data science, and will be communicated to relevant stakeholders, including policymakers and researchers interested in ML applications in health economics.

Acknowledgments

HD is supported by the King Salman Scholarship Program for Research and Development. CC, AA, VR, and EM are supported by the National Institute for Health and Care Research (NIHR) Newcastle Biomedical Research Centre, based on the Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle University, and the Cumbria, Northumberland, and Tyne and Wear (CNTW) NHS Foundation Trust. The views expressed in this publication are those of the authors and not necessarily those of the NIHR or any of the authors’ affiliated universities. The open-access publication fee was paid from the Imperial College London Open Access Fund. The funding body was not involved in the study design, data collection or analysis, or the writing and decision to submit the article for publication.

Authors' Contributions

Conceptualization: HD, GSS

Methodology: HD

Supervision: AK, EM, GSS

Writing – original draft: HD

Writing – review & editing: HD, AK, RB-S, RP, AA, CC, VR, EM, GSS

Conflicts of Interest

EM is an Editor in Chief of JMIRx Med. All other authors have no conflicts of interest.

Multimedia Appendix 1

PRISMA-ScR checklist and detailed search strategy.

DOCX File, 23 KB

  1. Lee W, Schwartz N, Bansal A, Khor S, Hammarlund N, Basu A, et al. A scoping review of the use of machine learning in health economics and outcomes research: part 2-data from nonwearables. Value Health. Dec 2022;25(12):2053-2061.
  2. Padula WV, Kreif N, Vanness DJ, Adamson B, Rueda J, Felizzi F, et al. Machine learning methods in health economics and outcomes research-the PALISADE checklist: a good practices report of an ISPOR task force. Value Health. Jul 2022;25(7):1063-1080.
  3. Habehh H, Gohel S. Machine learning in healthcare. Curr Genomics. Dec 16, 2021;22(4):291-300.
  4. Yadav DK, Gulati A, editors. Artificial Intelligence and Machine Learning in Healthcare. Singapore. Springer; 2024:187.
  5. Morris S, Devlin N, Parkin D, Spencer A, editors. Economic Analysis in Healthcare. Hoboken, New Jersey. Wiley & Sons; 2012.
  6. Drummond MF, Sculpher MJ, Claxton K, Stoddart GL, Torrance GW, editors. Methods for the Economic Evaluation of Health Care Programmes, 4th Edition. Oxford, England. Oxford University Press; 2015.
  7. Gong K, Xue Y, Kong L, Xie X. Cost prediction for ischemic heart disease hospitalization: interpretable feature extraction using network analysis. J Biomed Inform. Jun 2024;154:104652.
  8. Hautala AJ, Shavazipour B, Afsar B, Tulppo MP, Miettinen K. Machine learning models for assessing risk factors affecting health care costs: 12-month exercise-based cardiac rehabilitation. Front Public Health. 2024;12:1378349.
  9. Chen K, Huang Y, Liu C, Li S, Chen M. Machine learning-driven prediction of medical expenses in triple-vessel PCI patients using feature selection. BMC Health Serv Res. Jan 20, 2025;25(1):105.
  10. Hattab Z, Moler-Zapata S, Doherty E, Sadique Z, Ramnarayan P, O'Neill S. Exploring heterogeneity in the cost-effectiveness of high-flow nasal cannula therapy in acutely ill children-insights from the step-up first-line support for assistance in breathing in children trial using a machine learning method. Value Health. Jan 2025;28(1):60-69.
  11. Chalkou K, Hamza T, Benkert P, Kuhle J, Zecca C, Simoneau G, et al. Combining randomized and non-randomized data to predict heterogeneous effects of competing treatments. Res Synth Methods. Jul 2024;15(4):641-656.
  12. Sadique Z, Grieve R, Diaz-Ordaz K, Mouncey P, Lamontagne F, O'Neill S. A machine-learning approach for estimating subgroup- and individual-level treatment effects: an illustration using the 65 trial. Med Decis Making. Oct 2022;42(7):923-936.
  13. Knaus M, Lechner M, Strittmatter A. Machine learning estimation of heterogeneous causal effects: empirical Monte Carlo evidence. Econom J. 2021;24(1):161.
  14. Huang Y, Ho C, Chou W, Chen M. A framework to predict second primary lung cancer patients by using ensemble models. Ann Oper Res. Dec 14, 2023;348(1):373-397.
  15. Gonzalez-Rodriguez J, Franco C, Pinzón-Espitia O, Caballer V, Alfonso-Lizarazo E, Augusto V. Prediction of pharmaceutical and non-pharmaceutical expenditures associated with diabetes mellitus type II based on clinical risk. PLoS One. 2024;19(6):e0301860.
  16. Fouladi A, Asadi A, Sherer EA, Madadi M. Cost-effectiveness analysis of colorectal cancer screening strategies using active learning and monte carlo simulation. Med Decis Making. Jun 22, 2024;44(5):554-571.
  17. Padula WV, Pronovost PJ, Makic MBF, Wald HL, Moran D, Mishra MK, et al. Value of hospital resources for effective pressure injury prevention: a cost-effectiveness analysis. BMJ Qual Saf. Feb 2019;28(2):132-141.
  18. Shung DL, Lin JK, Laine L. Achieving value by risk stratification with machine learning model or clinical risk score in acute upper gastrointestinal bleeding: a cost minimization analysis. Am J Gastroenterol. Feb 01, 2024;119(2):371-373.
  19. Hill H, Roadevin C, Duffy S, Mandrik O, Brentnall A. Cost-effectiveness of AI for risk-stratified breast cancer screening. JAMA Netw Open. Sep 03, 2024;7(9):e2431715.
  20. Jiao W, Zhang X, D’Souza F. The economic value and clinical impact of artificial intelligence in healthcare: a scoping literature review. IEEE Access. 2023;11:123445-123457.
  21. Levac D, Colquhoun H, O'Brien KK. Scoping studies: advancing the methodology. Implement Sci. Sep 20, 2010;5:69.
  22. Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. Feb 2005;8(1):19-32.
  23. Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. Oct 02, 2018;169(7):467-473.
  24. Freitas de Mello N, Nascimento Silva S, Gomes DF, da Motta Girardi J, Barreto JOM. Models and frameworks for assessing the implementation of clinical practice guidelines: a systematic review. Implement Sci. Aug 07, 2024;19(1):59.
  25. Hupe M. EndNote X9. J Electron Resour Med Libr. Nov 26, 2019;16(3-4):117-119.
  26. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A. Rayyan-a web and mobile app for systematic reviews. Syst Rev. Dec 05, 2016;5(1):210.
  27. Alowais SA, Alghamdi SS, Alsuhebany N, Alqahtani T, Alshaya AI, Almohareb SN, et al. Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Med Educ. Sep 22, 2023;23(1):689.


Abbreviations

MeSH: Medical Subject Headings
ML: machine learning
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews
SDMO: study types, data sources, methods, and outcomes


Edited by J Sarvestan; submitted 14.May.2025; peer-reviewed by A Bartlett, S Heydari; comments to author 24.Jun.2025; revised version received 03.Aug.2025; accepted 15.Sep.2025; published 24.Sep.2025.

Copyright

©Hanan Daghash, Ashleigh Kernohan, Rosiered Brownson-Smith, Rohan Pandey, Ananya Ananthakrishnan, Cen Cong, Victoria Riccalton, Edward Meinert, Gurdeep S Sagoo. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 24.Sep.2025.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Research Protocols, is properly cited. The complete bibliographic information, a link to the original publication on https://www.researchprotocols.org, as well as this copyright and license information must be included.