# Machine Learning - Data Mining (Software, Library and Framework)

## 1 - About

This section lists software, libraries, and frameworks that implement machine learning algorithms. See data_mining

Data science can't be point and click.

A list of tools and software for data miners and machine learning practitioners.

See also: Natural Language - Processing (NLP)

## 2 - Articles Related

## 3 - Analytics

## 4 - Framework

- H2O - Open Source Fast Scalable Machine Learning Platform For Smarter Applications (Deep Learning, Gradient Boosting, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), …) - written in Java, with bindings for R and Python

## 5 - Languages

- Matlab was built for matrix calculations (linear algebra).
- The R language is meant for statistics.
- Python is a good general-purpose language.

However, code written in these languages generally doesn't run as quickly as code written in languages like C and Java.

### 5.1 - Python

Python has an incredible open source ecosystem. Packages:

- scikit-learn (machine learning)
- NumPy (maths and arrays)
- SciPy (scientific tools) - provides many scientific routines that work on top of NumPy
- Pandas (Python Data Analysis Library) - handy for manipulating financial data
- matplotlib - plots graphics
- Keras - the Python deep learning library
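To give a feel for how these packages fit together, here is a minimal sketch that builds an array with NumPy and fits a model with scikit-learn. The data is synthetic and the helper name `fit_line` is made up for this illustration; it assumes both packages are installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_line(x, y):
    """Fit y = slope * x + intercept and return (slope, intercept).

    Hypothetical helper for this example, not part of any library.
    """
    # scikit-learn expects a 2-D feature matrix: one row per sample.
    X = np.asarray(x, dtype=float).reshape(-1, 1)
    model = LinearRegression().fit(X, np.asarray(y, dtype=float))
    return float(model.coef_[0]), float(model.intercept_)


# Synthetic, noise-free data generated from y = 2x + 1 (invented for the example).
x = np.arange(10)
y = 2.0 * x + 1.0

slope, intercept = fit_line(x, y)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

NumPy supplies the array type and scikit-learn consumes it directly, which is the usual division of labour between the two packages.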

### 5.2 - Java

- Stanford Classifier is a MaxEnt classifier. It shines when working with mainly textual data.
- Apache Mahout machine learning library
- Apache Gora: Big Data Persistence Framework (Column Store)

### 5.3 - Javascript

### 5.4 - R

R is a nice interactive data analysis tool, through things like RStudio.

### 5.5 - Go

https://github.com/chewxy/gorgonia/

Gorgonia is a library that helps facilitate machine learning in Go. It lets you write and evaluate mathematical equations involving multidimensional arrays easily. If this sounds like Theano or TensorFlow, it's because the idea is quite similar. Specifically, the library is pretty low-level, like Theano, but has higher goals, like TensorFlow.
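The "write an equation first, evaluate it later" idea shared by Theano, TensorFlow and Gorgonia can be sketched in a few lines. The example below is written in plain Python rather than Go and does not use Gorgonia's API; the classes `Var`, `Add` and `Mul` are hypothetical names invented for this illustration, and vectors are plain lists.

```python
from dataclasses import dataclass
from typing import Dict, List

Vector = List[float]


class Node:
    """Base class for expression-graph nodes; operators build new nodes."""

    def __add__(self, other: "Node") -> "Node":
        return Add(self, other)

    def __mul__(self, other: "Node") -> "Node":
        return Mul(self, other)


@dataclass
class Var(Node):
    """A named placeholder whose value is supplied at evaluation time."""
    name: str

    def eval(self, env: Dict[str, Vector]) -> Vector:
        return env[self.name]


@dataclass
class Add(Node):
    """Element-wise sum of two sub-expressions."""
    left: Node
    right: Node

    def eval(self, env: Dict[str, Vector]) -> Vector:
        return [a + b for a, b in zip(self.left.eval(env), self.right.eval(env))]


@dataclass
class Mul(Node):
    """Element-wise product of two sub-expressions."""
    left: Node
    right: Node

    def eval(self, env: Dict[str, Vector]) -> Vector:
        return [a * b for a, b in zip(self.left.eval(env), self.right.eval(env))]


# Step 1: describe the equation x * w + b. Nothing is computed yet.
expr = Var("x") * Var("w") + Var("b")

# Step 2: evaluate the graph by binding concrete values to the placeholders.
result = expr.eval({"x": [1.0, 2.0], "w": [3.0, 4.0], "b": [0.5, 0.5]})
print(result)
```

Separating the description of the equation from its evaluation is what allows such libraries to analyse the graph before running it, for example to differentiate it.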

### 5.6 - Oracle

### 5.7 - Microsoft

### 5.8 - Others

- MATLAB/Octave
- Julia: New language
- KNIME: KNIME [naim] is an open source workbench for the entire analysis process
- RapidMiner

### 5.9 - Tools

## 6 - Framework

- Uber Michelangelo consists of a mix of open source systems and components built in-house. The primary open source components used are HDFS, Spark, Samza, Cassandra, MLlib, XGBoost, and TensorFlow.