Zero-inflated boosted ensembles for rare event counts

Alexander Borisov, George Runger, Eugene Tuv, Nuttha Lurponglukana-Strand

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations


Two linked ensembles are used for a supervised learning problem with rare-event counts. When many target instances are zero, more traditional loss functions (such as squared error and class error) are often not relevant; instead, a statistical model leads to a likelihood with two related parameters from a zero-inflated Poisson (ZIP) distribution. In a new approach, a linked pair of gradient boosted tree ensembles is developed to handle the multiple parameters in a manner that can be generalized to other problems. The result is a unique learner that extends machine learning methods to data with nontraditional structures. We empirically compare the approach on two real data sets and two artificial data sets against a single-tree approach (ZIP-tree) and a statistical generalized linear model.
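As a rough illustration of the two-parameter likelihood the abstract refers to, the sketch below evaluates the negative log-likelihood of a ZIP model. It assumes the standard textbook parameterization, with a zero-inflation probability `pi` and a Poisson rate `lam`; the function names `zip_log_pmf` and `zip_nll` are hypothetical, and the abstract does not confirm that the paper uses exactly this form or these symbols.

```python
import math


def zip_log_pmf(y, lam, pi):
    """Log-probability of a count y under a zero-inflated Poisson.

    Assumed (textbook) parameterization, not taken from the paper:
      P(Y = 0) = pi + (1 - pi) * exp(-lam)
      P(Y = k) = (1 - pi) * exp(-lam) * lam**k / k!   for k >= 1
    """
    if y < 0 or lam <= 0 or not (0.0 <= pi < 1.0):
        raise ValueError("need y >= 0, lam > 0 and 0 <= pi < 1")
    if y == 0:
        # A zero can come from the inflation component or from the Poisson.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    # Positive counts can only come from the Poisson component.
    return math.log1p(-pi) - lam + y * math.log(lam) - math.lgamma(y + 1)


def zip_nll(ys, lams, pis):
    """Negative log-likelihood with per-instance parameters (lam_i, pi_i)."""
    return -sum(zip_log_pmf(y, lam, pi) for y, lam, pi in zip(ys, lams, pis))
```

In a linked-ensemble setting such as the one described, one would expect each instance's `lam` and `pi` to be produced by a separate model and this loss to be minimized jointly; how the paper couples the two boosted ensembles is not specified in the abstract.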

Original language: English (US)
Title of host publication: Advances in Intelligent Data Analysis VIII - 8th International Symposium on Intelligent Data Analysis, IDA 2009, Proceedings
Number of pages: 12
State: Published - 2009
Event: 8th International Symposium on Intelligent Data Analysis, IDA 2009 - Lyon, France
Duration: Aug 31 2009 → Sep 2 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5772 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 8th International Symposium on Intelligent Data Analysis, IDA 2009

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

