Modelling in-hospital length of stay: A comparison of linear and ensemble models for competing risk analysis
| dc.contributor.author | Espinosa Moreno, Juan Carlos | |
| dc.contributor.author | García García, Fernando | |
| dc.contributor.author | Mas Bilbao, Naia | |
| dc.contributor.author | García Gutiérrez, Susana | |
| dc.contributor.author | Legarreta Olabarrieta, María José | |
| dc.contributor.author | Lee, Dae Jin | |
| dc.contributor.funder | Agencia Estatal de Investigación | |
| dc.contributor.funder | Fondo Europeo de Desarrollo Regional | |
| dc.contributor.funder | European Union | |
| dc.contributor.funder | Gobierno Vasco | |
| dc.contributor.funder | Departamento de Salud del Gobierno Vasco | |
| dc.contributor.funder | Ministerio de Ciencia, Innovación y Universidades | |
| dc.contributor.ror | https://ror.org/02jjdwm75 | |
| dc.date.accessioned | 2026-03-11T10:12:48Z | |
| dc.date.issued | 2025-08-26 | |
| dc.description.abstract | Length of Stay (LoS) for in-hospital patients is a relevant indicator of efficiency in healthcare. Moreover, it is often related to the occurrence of hospital-acquired complications. In this work, we explore time-to-event analysis for modelling LoS. We employed competing risk (CR) models, as we considered two mutually exclusive outcomes: favorable discharge and deterioration. The explanatory variables included the patient’s sex, age, and longitudinal vital signs collected from a dataset comprising admissions. To address sparse measurements, we transformed longitudinal vital signs into cross-sectional statistics. Our approach involved data pre-processing, imputation of missing data, and variable selection. We proposed four types of CR models: Cause-specific Cox, Sub-distribution hazard, and two variants of Random Survival Forests, one using the generalised Log-Rank test (cause-specific hazard estimates) and the other using Gray’s test (cumulative incidence estimates) as the node-splitting rule. The performance of the LoS CR models was evaluated over a time frame from 2 to 15 days. As baselines, we considered two well-established clinical early warning scores: the National Early Warning Score (NEWS) and the Modified Early Warning Score (MEWS). The best model was the Random Survival Forest with Gray’s test as the splitting rule, which achieved an Integrated Brier Score [×100] of 0.386, a C-Index above 99%, and a Brier Score below 0.006 across the entire time frame. Models built on cross-sectional statistics derived from vital signs, together with rigorous data pre-processing, modelled LoS more accurately than NEWS and MEWS. | |
| dc.description.peerreviewed | Yes | |
| dc.description.sponsorship | This research is supported by the Spanish State Research Agency AEI/10.13039/501100011033 and FEDER, UE, under the projects S3M1P4R: PID2020-115882RB-I00 and SPHERES: PID2023-153222OB-I00. It is also funded by the Basque Government (Eusko Jaurlaritza, EJ-GV) under the strategy ‘Mathematical Modelling Applied to Health’, the BERC 2022–2025 programme, and the Health Department of the Basque Government (Osasun Saila, Eusko Jaurlaritzako) under grant number 2018111094. Additionally, this research has received support from the Spanish Ministry of Science, Innovation, and Universities (Ministerio de Ciencia, Innovación y Universidades, MCIU) under the BCAM Severo Ochoa accreditation CEX2021-001142-S. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. | |
| dc.description.status | Published | |
| dc.format | application/pdf | |
| dc.identifier.citation | Espinosa-Moreno, J. C., García-García, F., Mas-Bilbao, N., García-Gutiérrez, S., Legarreta-Olabarrieta, M. J., & Lee, D. J. (2025). Modelling in-hospital length of stay: A comparison of linear and ensemble models for competing risk analysis. PLoS ONE, 20(8), e0322101. https://doi.org/10.1371/journal.pone.0322101 | |
| dc.identifier.doi | https://doi.org/10.1371/journal.pone.0322101 | |
| dc.identifier.issn | 1932-6203 | |
| dc.identifier.officialurl | https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0322101 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14417/4272 | |
| dc.issue.number | 8 | |
| dc.journal.title | PLoS ONE | |
| dc.language.iso | eng | |
| dc.page.total | 25 | |
| dc.publisher | Public Library of Science | |
| dc.relation.department | Sci Tech (Data Science) | |
| dc.relation.entity | IE University | |
| dc.relation.projectid | PID2020-115882RB-I00 | |
| dc.relation.projectid | 2018111094 | |
| dc.relation.projectid | CEX2021-001142-S | |
| dc.relation.projectid | PID2023-153222OB-I00 | |
| dc.relation.school | IE School of Science & Technology | |
| dc.rights | Attribution 4.0 International | |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject.ods | SDG 3 - Good Health and Well-being | |
| dc.subject.unesco | 32 Medical Sciences | |
| dc.title | Modelling in-hospital length of stay: A comparison of linear and ensemble models for competing risk analysis | |
| dc.type | info:eu-repo/semantics/article | |
| dc.version.type | info:eu-repo/semantics/publishedVersion | |
| dc.volume.number | 20 | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | c8601ce9-af35-48fa-bdb6-9875f25e6c1f | |
| relation.isAuthorOfPublication.latestForDiscovery | c8601ce9-af35-48fa-bdb6-9875f25e6c1f |
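
The abstract notes that sparse longitudinal vital signs were transformed into cross-sectional statistics before modelling. The record does not say which statistics were used, so the following is only a minimal sketch under assumptions: the tuple layout `(admission_id, vital, hours, value)`, the function name `summarise_vitals`, and the chosen statistics (count, mean, min, max, standard deviation, last reading) are all hypothetical and are not taken from the article.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarise_vitals(measurements):
    """Collapse longitudinal readings into one set of summary statistics
    per (admission, vital sign) pair.

    `measurements` is an iterable of (admission_id, vital, hours, value)
    tuples; this layout is an illustrative assumption.
    """
    grouped = defaultdict(list)
    for admission_id, vital, hours, value in measurements:
        grouped[(admission_id, vital)].append((hours, value))

    summary = {}
    for key, readings in grouped.items():
        readings.sort()  # chronological order, so "last" is well defined
        values = [value for _, value in readings]
        summary[key] = {
            "n": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            "sd": pstdev(values),  # population SD; 0.0 for a single reading
            "last": values[-1],
        }
    return summary


# Toy readings (invented for illustration, not data from the study).
readings = [
    ("adm-1", "heart_rate", 0.0, 80.0),
    ("adm-1", "heart_rate", 6.0, 90.0),
    ("adm-1", "heart_rate", 3.0, 100.0),
    ("adm-2", "heart_rate", 1.0, 70.0),
]
features = summarise_vitals(readings)
```

Each admission then contributes a fixed-length feature vector regardless of how many readings it has, which is what allows models that expect one row per subject, such as Cox-type models or survival forests, to be fitted on irregularly sampled data.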
