The subsemble package is an R implementation of the Subsemble algorithm. Subsemble is a general subset ensemble prediction method, which can be used for small, moderate, or large datasets.

Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a variant of k-fold cross-validation to learn how to combine the subset-specific fits into a single prediction function. An oracle result provides a theoretical performance guarantee: asymptotically, Subsemble performs as well as the best possible combination of the subset-specific fits.
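
The procedure described above can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it does not use or mirror the subsemble package's API. The names `fit_line` and `subsemble_sketch` are hypothetical, the underlying algorithm is assumed to be a one-dimensional least-squares line, and the combination step is a simplified stand-in (weights inversely proportional to each subset's cross-validated mean squared error) rather than whatever metalearner the package would fit.

```python
import random

def fit_line(xs, ys):
    """Hypothetical underlying learner: 1-D least squares, returned as a function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def subsemble_sketch(xs, ys, n_subsets=3, n_folds=4, seed=0):
    """Illustrative subset ensemble; not the package's implementation."""
    n = len(xs)
    order = list(range(n))
    random.Random(seed).shuffle(order)

    # Assign every observation to one subset and, within it, to one fold.
    subset, fold = [0] * n, [0] * n
    for pos, i in enumerate(order):
        subset[i] = pos % n_subsets
        fold[i] = (pos // n_subsets) % n_folds

    # cv_pred[i][j]: prediction for observation i from a subset-j fit
    # that was trained without any observation from i's fold.
    cv_pred = [[0.0] * n_subsets for _ in range(n)]
    for v in range(n_folds):
        for j in range(n_subsets):
            train = [i for i in range(n) if subset[i] == j and fold[i] != v]
            f = fit_line([xs[i] for i in train], [ys[i] for i in train])
            for i in range(n):
                if fold[i] == v:
                    cv_pred[i][j] = f(xs[i])

    # Stand-in combination step (an assumption of this sketch):
    # weight each subset by the inverse of its cross-validated MSE.
    mse = [sum((ys[i] - cv_pred[i][j]) ** 2 for i in range(n)) / n
           for j in range(n_subsets)]
    inv = [1.0 / (m + 1e-12) for m in mse]
    total = sum(inv)
    weights = [w / total for w in inv]

    # Refit the learner on each full subset for the final ensemble.
    fits = []
    for j in range(n_subsets):
        idx = [i for i in range(n) if subset[i] == j]
        fits.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        return sum(w * f(x) for w, f in zip(weights, fits))

    return predict, weights, fits

# Synthetic data for illustration only: y = 2x + 1 plus Gaussian noise.
rng = random.Random(1)
xs = [10 * i / 119 for i in range(120)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

predict, weights, fits = subsemble_sketch(xs, ys)
print("weights:", weights)
print("prediction at x = 5:", predict(5.0))
```

Because the stand-in weights are non-negative and sum to one, the ensemble prediction always lies between the smallest and largest subset-specific predictions. A regression-based combination step would not have that restriction.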

The subsemble package also provides an implementation of the Super Learner ensemble algorithm.

Author(s): Erin LeDell, Stephanie Sapp, Mark van der Laan

GitHub | CRAN