| Library | Language | Open Source | Note |
|---|---|---|---|
| NLTK | Python | Yes | |
| Gensim | Python | Yes | |
| spaCy (spacy.io) | Python | Yes | |
| Elasticsearch (Index and Search) | Java | Apache 2 | Based on Lucene. Crate (query / SQL layer on top of Elasticsearch) |
| Solr (Index and Search) | Java | Apache 2 | Based on Lucene |
| Apache OpenNLP | Java | Yes | |
| Deeplearning4j | Java, Scala | Yes | |
| Weka | Java | GPL | See https://github.com/fracpete/nlp-weka-package |
| Stanford NLP | Java | GPL | Demo (part-of-speech tagging, named entity recognition, coreference, basic dependencies, collapsed dependencies, collapsed CC-processed dependencies). GitHub: http://stanfordnlp.github.io/CoreNLP/ Run online: http://corenlp.run/ |
| LingPipe | Java | No | Topic Classification, Named Entity Recognition (NER), Sentiment Analysis, … |
| tm | R | Yes | |
| RWeka | R | Yes | rJava via JNI |
| openNLP | R | Yes | rJava via JNI |
| OCR Tesseract | | | |
| TweetNLP | Java | Yes | Tokenizer, part-of-speech tagger, hierarchical word clusters, and dependency parser for tweets |
| Smile | Java | LGPL | Statistical Machine Intelligence and Learning Engine |