mlampros Organizing and Sharing thoughts, Receiving constructive feedback

Extreme Learning Machine

As of 2018-06-17 the elmNN package was archived and due to the fact that it was one of the machine learning functions that I used when I started learning R (it returns the output results pretty fast too) plus that I had to utilize the package last week for a personal task I decided to reimplement the R code in Rcpp. It didn’t take long because the R package was written, initially by the author, in a clear way. In the next lines I’ll explain the differences and the functionality just for reference.

Continue reading...

New functionality for the textTinyR package

This blog post discuss the new functionality, which is added in the textTinyR package (version 1.1.0). I’ll explain some of the functions by using the data and pre-processing steps of this blog-post.

Continue reading...

Non Metric Space (Approximate) Library in R

The nmslibR package is a wrapper of NMSLIB, which according to the authors “… is a similarity search library and a toolkit for evaluation of similarity search methods. The goal of the project is to create an effective and comprehensive toolkit for searching in generic non-metric spaces. Being comprehensive is important, because no single method is likely to be sufficient in all cases. Also note that exact solutions are hardly efficient in high dimensions and/or non-metric spaces. Hence, the main focus is on approximate methods”.

I’ve searched for some time (before wrapping NMSLIB) for a nearest neighbor library which can work with high dimensional data and can scale with big datasets. I’ve already written a package for k-nearest-neighbor search (KernelKnn), however, it’s based on brute force and unfortunately, it requires a certain computation time if the data consists of many rows. The nmslibR package, besides the main functionality of the NMSLIB python library, also includes an Approximate Kernel k-nearest function, which as I will show in the next lines is both fast and accurate. A comparison of NMSLIB with other popular approximate k-nearest-neighbor methods can be found here.

Continue reading...

Regularized Greedy Forest in R

This blog post is about my newly released RGF package (the blog post consists mainly of the package Vignette). The RGF package is a wrapper of the Regularized Greedy Forest python package, which also includes a Multi-core implementation (FastRGF). Portability from Python to R was made possible using the reticulate package and the installation requires basic knowledge of Python. Except for the Linux Operating System, the installation on Macintosh and Windows might be somehow cumbersome (on windows the package currently can be used only from within the command prompt). Detailed installation instructions for all three Operating Systems can be found in the README.md file and in the rgf_python Github repository.

Continue reading...

Statoil / C-CORE Iceberg Classifier Competition

For the last two months, I had participated in a machine learning competition organized by Kaggle (platform for predictive modeling and analytics), where I ended up in the top 1 % on the private leaderboard or 24th out of 3343 participants. I thought it would be worth writing a blog post in order to both share my experience / insights and keep a reference of key features for satellite imagery ( Sentinel-1 satellite data and specifically HH - transmit/receive horizontally - and HV - transmit horizontally and receive vertically ) in case it might be useful in the future.

Continue reading...