EnsembleSVM


A Library for Ensemble Learning Using Support Vector Machines




Introduction

EnsembleSVM is a free software machine learning project. The EnsembleSVM library offers functionality to perform ensemble learning using Support Vector Machine (SVM) base models. In particular, it provides routines for binary classification ensembles built from SVM base classifiers.

The library enables users to efficiently train models for large data sets. Through a divide-and-conquer strategy, base models are trained on subsets of the data, which makes training feasible even with nonlinear kernels. The base models are then combined into ensembles with high predictive performance through a bagging strategy. Experimental results have shown predictive performance comparable to that of standard SVM models, at drastically reduced training time (cf. our use cases).
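As a rough illustration of this divide-and-conquer bagging idea, the Python sketch below trains scikit-learn SVMs on bootstrap subsamples and aggregates their predictions by majority vote. It is not EnsembleSVM's API; the function names, subsample size, kernel and number of base models are arbitrary placeholders.

    # Illustrative sketch only: SVM base models trained on data subsets,
    # combined by majority voting (uses scikit-learn, not EnsembleSVM).
    import numpy as np
    from sklearn.svm import SVC

    def train_svm_ensemble(X, y, n_models=11, subsample=0.1, seed=0):
        """Train each base SVM on a bootstrap subsample of the data."""
        rng = np.random.default_rng(seed)
        n = len(y)
        size = max(1, int(subsample * n))
        models = []
        for _ in range(n_models):
            idx = rng.choice(n, size=size, replace=True)  # bootstrap subsample
            # assumes each subsample contains examples of both classes
            models.append(SVC(kernel="rbf").fit(X[idx], y[idx]))
        return models

    def predict_majority(models, X):
        """Unweighted majority vote over base predictions, labels in {-1, +1}."""
        votes = np.stack([m.predict(X) for m in models])
        return np.where(votes.sum(axis=0) >= 0, 1, -1)  # odd n_models avoids ties

With an odd number of base models and labels in {-1, +1}, ties cannot occur; more refined aggregation schemes are described in the version 2.0 release notes below.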

EnsembleSVM is licensed under the GNU Lesser General Public License (LGPL) version 3. The latest version of our software is 2.0.

EnsembleSVM has been accepted for publication in the Journal of Machine Learning Research (Open Source Software section).

Contact

We are excited to hear about your experience using EnsembleSVM! For ideas, applications, comments or support, please send an email to Marc Claesen at marc.claesen{at}esat.kuleuven.be. If you want to be informed about the latest news regarding the package, please let us know so we can add you to the news mailing list.

How to cite

If you use EnsembleSVM, please cite it as follows (BibTeX):
Marc Claesen, Frank De Smet, Johan A.K. Suykens, Bart De Moor. EnsembleSVM: A Library for Ensemble Learning Using Support Vector Machines. Journal of Machine Learning Research, 15:141-145, 2014.

News and Updates

  • December 2, 2013: added preliminary support for precomputed kernels.
    The implementation follows the LIBSVM philosophy (a format sketch appears after this list). To use this feature, please download the latest source files from GitHub (disclaimer: not yet fully tested).

  • October 3, 2013: release of version 2.0
    Support for multithreading in training and prediction with ensemble models. Since both tasks are embarrassingly parallel, this yields a significant speedup (3-fold on quad-core).
    Extensive programming framework for aggregating base model predictions, which allows highly efficient prototyping of new aggregation approaches. We also provide several predefined strategies, including (weighted) majority voting, logistic regression and nonlinear SVMs of your choice (be sure to check out the esvm-edit tool; an aggregation sketch appears after this list). The framework also lets you program your own, novel aggregation schemes.
    Full code transition to C++11, the latest C++ standard, which enabled various performance improvements. The new release requires moderately recent compilers, such as gcc 4.7.2+ or clang 3.2+.
    Generic implementations of convenient facilities have been added, such as thread pools, deserialization factories and more.
    The API and ABI have undergone significant changes, many of which are due to the transition to C++11.

  • March 30, 2013: release of version 1.2.
    Changes: fixed a bug in IndexedFile which caused esvm-train to fail when used without a bootstrap mask.
    The library API remains unchanged; the library revision has been increased.

  • March 25, 2013: release of version 1.1.
    Changes: removed a deprecated command-line argument related to cross-validation from the split-data tool.
    The library API and ABI remain unchanged.

  • March 22, 2013: initial release.
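For reference, LIBSVM's precomputed kernel convention (mentioned in the December 2, 2013 item above) stores kernel values explicitly: each instance is written as its label, a serial-number feature 0:i, and the kernel evaluations against all training instances. The Python sketch below writes a linear kernel matrix in that layout; it only illustrates the file format, it is not EnsembleSVM code, and the function name and output file are placeholders.

    # Rough illustration of LIBSVM's precomputed kernel format (not EnsembleSVM code).
    # Each row: <label> 0:<serial number> 1:K(x_i, x_1) ... L:K(x_i, x_L)
    import numpy as np

    def write_precomputed_kernel(X, y, path="kernel.txt"):
        K = X @ X.T                       # linear kernel as an example
        with open(path, "w") as f:
            for i, (label, row) in enumerate(zip(y, K), start=1):
                feats = " ".join(f"{j}:{v:.6f}" for j, v in enumerate(row, start=1))
                f.write(f"{label} 0:{i} {feats}\n")

    # Example: three tiny instances with labels -1 and +1.
    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    y = np.array([-1, -1, 1])
    write_precomputed_kernel(X, y)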
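The aggregation framework mentioned in the version 2.0 notes can be pictured as a stacking step on top of the base models. The sketch below fits a logistic regression aggregator on the decision values of base SVMs (such as those from the sketch in the introduction); it is an illustrative assumption, not the esvm-edit tool or EnsembleSVM's actual interface, and X_val/y_val stand for a held-out validation set.

    # Illustrative stacking-style aggregation of base SVM outputs via
    # logistic regression (scikit-learn; not EnsembleSVM's aggregation API).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_lr_aggregator(models, X_val, y_val):
        """Fit a logistic regression on the base models' decision values."""
        Z = np.column_stack([m.decision_function(X_val) for m in models])
        return LogisticRegression().fit(Z, y_val)

    def predict_aggregated(models, aggregator, X):
        """Predict by feeding base decision values to the trained aggregator."""
        Z = np.column_stack([m.decision_function(X) for m in models])
        return aggregator.predict(Z)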
Useful Links

EnsembleSVM @ mloss.org
EnsembleSVM @ GitHub
LIBSVM website

Developers' Corner

GitHub wiki
Doxygen documentation
Using LGPL v3