Bayesian clustering of multiple zero-inflated outcomes

dc.contributor.author: Franzolini, Beatrice
dc.contributor.author: Cremaschi, Andrea
dc.contributor.author: Boom, Willem van den
dc.contributor.author: Iorio, Maria De
dc.contributor.funder: NUS Centre for Trusted Internet and Community
dc.contributor.ror: https://ror.org/02jjdwm75
dc.date.accessioned: 2026-05-27T09:08:50Z
dc.date.issued: 2023-03-27
dc.description.abstract: Several applications involving counts present a large proportion of zeros (excess-of-zeros data). A popular model for such data is the hurdle model, which explicitly models the probability of a zero count, while assuming a sampling distribution on the positive integers. We consider data from multiple count processes. In this context, it is of interest to study the patterns of counts and cluster the subjects accordingly. We introduce a novel Bayesian approach to cluster multiple, possibly related, zero-inflated processes. We propose a joint model for zero-inflated counts, specifying a hurdle model for each process with a shifted Negative Binomial sampling distribution. Conditionally on the model parameters, the different processes are assumed independent, leading to a substantial reduction in the number of parameters as compared with traditional multivariate approaches. The subject-specific probabilities of zero-inflation and the parameters of the sampling distribution are flexibly modelled via an enriched finite mixture with a random number of components. This induces a two-level clustering of the subjects based on the zero/non-zero patterns (outer clustering) and on the sampling distribution (inner clustering). Posterior inference is performed through tailored Markov chain Monte Carlo schemes. We demonstrate the proposed approach on an application involving the use of the messaging service WhatsApp. This article is part of the theme issue ‘Bayesian inference: challenges, perspectives, and prospects’.
dc.description.peerreviewed: Yes
dc.description.sponsorship: This work was partially supported by the NUS Centre for Trusted Internet and Community (grant no. CTIC-RP-20-09).
dc.description.status: Published
dc.format: application/pdf
dc.identifier.citation: Franzolini, B., Cremaschi, A., Van Den Boom, W., & De Iorio, M. (2023). Bayesian clustering of multiple zero-inflated outcomes. Philosophical Transactions of the Royal Society A, 381(2247), 20220145. https://doi.org/10.1098/rsta.2022.0145
dc.identifier.doi: https://doi.org/10.1098/rsta.2022.0145
dc.identifier.issn: 1471-2962
dc.identifier.officialurl: https://royalsocietypublishing.org/rsta/article/381/2247/20220145/112453/Bayesian-clustering-of-multiple-zero-inflated
dc.identifier.uri: https://hdl.handle.net/20.500.14417/4369
dc.issue.number: 2247
dc.journal.title: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
dc.language.iso: eng
dc.page.total: 16
dc.publisher: The Royal Society
dc.relation.entity: IE University
dc.relation.projectid: CTIC-RP-20-09
dc.relation.school: IE School of Science & Technology
dc.rights: Attribution 4.0 International
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.keywords: conditional algorithm
dc.subject.keywords: excess-of-zeros data
dc.subject.keywords: enriched priors
dc.subject.keywords: hurdle model
dc.subject.keywords: finite mixtures
dc.subject.keywords: nested clustering
dc.subject.ods: SDG 3 - Good health and well-being
dc.subject.unesco: 12 Mathematics
dc.title: Bayesian clustering of multiple zero-inflated outcomes
dc.type: info:eu-repo/semantics/article
dc.version.type: info:eu-repo/semantics/publishedVersion
dc.volume.number: 381
dspace.entity.type: Publication
relation.isAuthorOfPublication: 976c8dd3-a3ba-4b1a-9273-72c7ee16c39e
relation.isAuthorOfPublication.latestForDiscovery: 976c8dd3-a3ba-4b1a-9273-72c7ee16c39e
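
The hurdle specification summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`neg_binomial_pmf`, `hurdle_pmf`, `joint_pmf`), the symbol `pi_zero` for the probability of a zero count, and the (r, q) parametrization of the Negative Binomial (number of failures before the r-th success, with success probability q) are assumptions made here for illustration. The sketch covers only the sampling model for fixed parameters; the enriched finite mixture prior and the Markov chain Monte Carlo schemes are not represented.

```python
import math


def neg_binomial_pmf(k, r, q):
    """P(X = k) for X ~ NegBin(r, q) on {0, 1, 2, ...}.

    Assumed parametrization: k failures before the r-th success,
    success probability 0 < q < 1, and r > 0 (possibly non-integer).
    """
    if k < 0:
        return 0.0
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(q) + k * math.log1p(-q)
    )
    return math.exp(log_pmf)


def hurdle_pmf(y, pi_zero, r, q):
    """P(Y = y) under a hurdle model with a shifted Negative Binomial.

    With probability pi_zero the count is zero; otherwise Y = 1 + X with
    X ~ NegBin(r, q), so the positive part lives on {1, 2, ...}.
    """
    if y == 0:
        return pi_zero
    return (1.0 - pi_zero) * neg_binomial_pmf(y - 1, r, q)


def joint_pmf(ys, pi_zeros, rs, qs):
    """Joint probability of one subject's counts across several processes.

    Conditionally on the parameters the processes are treated as
    independent, so the joint probability is a product of hurdle terms.
    """
    prob = 1.0
    for y, pi_zero, r, q in zip(ys, pi_zeros, rs, qs):
        prob *= hurdle_pmf(y, pi_zero, r, q)
    return prob
```

For example, `joint_pmf([0, 3], [0.6, 0.2], [2.5, 1.0], [0.3, 0.5])` evaluates the probability that a subject records no events in the first process and three in the second. Because the joint probability factorizes, each process contributes only its own three parameters, which is the parameter reduction the abstract contrasts with traditional multivariate approaches.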

Original bundle

Name: rsta.2022.0145.pdf
Size: 844.09 KB
Format: Adobe Portable Document Format