AnswerIQ Technology

Erin LeDell

Recent Posts

Benchmarking Random Forest Classification

Last month, wise.io co-founder Joey wrote a blog post on the principles of properly benchmarking machine learning frameworks. Here, I start to put those principles into practice with the first in a series of blog posts on ML benchmarking. If you are short on time, you can jump to the conclusions.

For this benchmark, I focus on comparing the accuracy and speed of four random forest® classifier implementations, including the high-performance WiseRF™. In follow-up posts we will cover random forest regression and benchmark against other machine learning algorithms. We will also benchmark memory usage across the different implementations, another important but often overlooked aspect of benchmarking, especially when it comes to “big data.”

Tools

We created a standardized benchmarking platform to compare the accuracy and speed of the following random forest implementations:

All of the tools, with the exception of the randomForest R package, are multi-threaded and were parallelized across all available cores. H2O can also run as a distributed (multi-machine) implementation, but for this benchmark I considered only the more common single-node workstation setup.
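
To make the idea of a standardized accuracy-and-speed comparison concrete, here is a minimal sketch in Python. It is not the benchmarking platform used for this post: the benchmark function and the MajorityClassifier stand-in are hypothetical names invented for illustration, and the tiny dataset is made up. The only assumption about a real implementation is that it can be wrapped in an object with fit and predict methods.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def benchmark(model, X_train, y_train, X_test, y_test):
    """Time training and prediction, and score accuracy on held-out data."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(X_test)
    predict_seconds = time.perf_counter() - start

    correct = sum(p == t for p, t in zip(predictions, y_test))
    return {
        "accuracy": correct / len(y_test),
        "train_seconds": train_seconds,
        "predict_seconds": predict_seconds,
    }


# Tiny made-up dataset, only to show the call shape.
X_train = [[0.0], [0.1], [0.9], [1.0], [0.2]]
y_train = ["a", "a", "b", "b", "a"]
X_test = [[0.05], [0.95], [0.15], [0.85]]
y_test = ["a", "b", "a", "b"]

result = benchmark(MajorityClassifier(), X_train, y_train, X_test, y_test)
print(result)
```

Running every implementation through the same function, on the same train/test split, is what keeps the comparison fair: each tool is timed and scored in exactly the same way, and only the model object changes.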

Topics: Machine Learning, Data Science