Machine Learning - Data Mining (Software, Library and Framework)
About
This section lists software libraries and frameworks that implement machine learning algorithms. See Data Mining
Data science can't be point and click
List of tools and software for data miners and machine learners.
Articles Related
Analytics
Framework
- H2O - Open Source Fast Scalable Machine Learning Platform For Smarter Applications (Deep Learning, Gradient Boosting, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), …) - written in Java, with bindings for R and Python
Languages
- Matlab was built for matrix calculations (linear algebra).
- The R language is meant for statistics.
- Python is a good general-purpose language.
But these languages don’t run as quickly as languages like C and Java.
Python
Python has an incredible open source ecosystem. Packages:
- scikit-learn (machine learning)
- NumPy (maths and arrays)
- SciPy (scientific tools): SciPy provides a lot of scientific routines that work on top of NumPy
- Pandas (Python Data Analysis Library) - handy to manipulate financial data
- matplotlib (plotting graphics)
- Keras: The Python Deep Learning library
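As a minimal sketch of how two of the packages above fit together, the following fits a linear model with scikit-learn on a NumPy array. The synthetic data (an exact line y = 2x + 1) and the variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2x + 1 exactly.
X = np.arange(10, dtype=float).reshape(-1, 1)  # 10 samples, 1 feature
y = 2.0 * X.ravel() + 1.0

# scikit-learn estimators take NumPy arrays directly.
model = LinearRegression().fit(X, y)

slope = float(model.coef_[0])
intercept = float(model.intercept_)
prediction = float(model.predict(np.array([[10.0]]))[0])

print(slope, intercept, prediction)
```

Because the data is noise-free, the fitted slope and intercept should recover 2 and 1 up to floating-point error.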
Java
- Stanford Classifier is a MaxEnt classifier. It shines when working with mainly textual data.
- Apache Mahout machine learning library
- Apache Gora: Big Data Persistence Framework (Column Store)
Javascript
R
R is a nice interactive data analysis tool, through things like RStudio.
Go
https://github.com/chewxy/gorgonia/
Gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily. If this sounds like Theano or TensorFlow, it's because the idea is quite similar. Specifically, the library is pretty low-level, like Theano, but has higher goals like TensorFlow.
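The "write, then evaluate" idea shared by these libraries can be sketched as a toy expression graph. This is a Python illustration only and not Gorgonia's API: the `Node`, `placeholder`, and `evaluate` names are hypothetical, and scalars stand in for multidimensional arrays to keep the sketch short.

```python
# Toy define-then-run expression graph (illustration only; not Gorgonia's API).

class Node:
    """A graph node: an input placeholder or an operation on other nodes."""

    def __init__(self, op=None, inputs=(), name=None):
        self.op = op          # callable for operation nodes, None for inputs
        self.inputs = inputs  # upstream nodes this node depends on
        self.name = name      # lookup key for input placeholders

    def __add__(self, other):
        return Node(op=lambda a, b: a + b, inputs=(self, other))

    def __mul__(self, other):
        return Node(op=lambda a, b: a * b, inputs=(self, other))


def placeholder(name):
    """Declare a named input whose value is supplied at evaluation time."""
    return Node(name=name)


def evaluate(node, feed):
    """Recursively compute a node's value from the fed input values."""
    if node.op is None:
        return feed[node.name]
    args = [evaluate(n, feed) for n in node.inputs]
    return node.op(*args)


x, w, b = placeholder("x"), placeholder("w"), placeholder("b")
y = x * w + b  # builds the graph; nothing is computed yet

result = evaluate(y, {"x": 3.0, "w": 2.0, "b": 1.0})
print(result)
```

Separating graph construction from evaluation is what lets such libraries inspect the whole equation before running it, for example to differentiate or optimize it.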
Oracle
Microsoft
Others
- MATLAB/Octave
- Julia: New language
- KNIME: KNIME [naim] is an open source workbench for the entire analysis process
- RapidMiner
Tools
Framework
- Uber Michelangelo consists of a mix of open source systems and components built in-house. The primary open sourced components used are HDFS, Spark, Samza, Cassandra, MLLib, XGBoost, and TensorFlow.