NLP - (Software|API)
List
Apache Nutch: open source web crawler. Nutch can crawl pages and post them to Apache Solr for indexing and search.
Apache Tika: detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF)
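To make the crawl-then-index step concrete, here is a minimal sketch of what posting one crawled page to Solr's JSON update handler could look like. It is not Nutch's own indexer: the core name `nutch`, the host, and the field names `id`, `title` and `content` are assumptions for illustration and depend on the Solr schema in use. The request is only built, not sent.

```python
import json
import urllib.request


def build_solr_update_request(base_url, core, docs, commit=True):
    """Build (but do not send) a POST request that adds documents to a Solr core."""
    url = f"{base_url.rstrip('/')}/solr/{core}/update"
    if commit:
        # Ask Solr to make the documents searchable right away.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical crawled page; field names are assumptions, not a fixed schema.
page = {
    "id": "https://example.org/",
    "title": "Example",
    "content": "Example page text",
}
request = build_solr_update_request("http://localhost:8983", "nutch", [page])
print(request.get_method(), request.full_url)
```

With a Solr instance running, passing `request` to `urllib.request.urlopen` would send the document.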
Library | Language | Open Source | Note |
NLTK | Python | Yes | |
Gensim | Python | Yes | |
spaCy (spacy.io) | Python | Yes | |
Elasticsearch (Index and Search) | Java | Apache 2 | (based on Lucene) Guide, Crate (query / SQL layer on top of Elasticsearch) |
Solr (Index and Search) | Java | Apache 2 | (based on Lucene) Solr |
Apache OpenNLP | Java | Yes | |
Deeplearning4j | Java, Scala | Yes | |
Weka | Java | GPL | See https://github.com/fracpete/nlp-weka-package |
Stanford NLP | Java | GPL | Demo (Part of Speech, Named Entity Recognition, Coreference, Basic dependencies, Collapsed dependencies, Collapsed CC-processed dependencies). GitHub: http://stanfordnlp.github.io/CoreNLP/ Online run: http://corenlp.run/ |
LingPipe | Java | No | Topic Classification, Named Entity Recognition (NER), Sentiment Analysis, … |
tm | R | Yes | |
RWeka | R | Yes | rJava via JNI |
openNLP | R | Yes | rJava via JNI |
OCR Tesseract | | | |
TweetNLP | Java | Yes | Tokenizer, part-of-speech tagger, hierarchical word clusters, and dependency parser for tweets |
Smile | Java | LGPL | Statistical Machine Intelligence and Learning Engine |
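Several of the libraries above start with tokenization, and tweets need special handling because mentions, hashtags and URLs should stay whole. The sketch below illustrates the idea with a regular expression from the Python standard library. The pattern is a hypothetical simplification and is not the tokenizer used by TweetNLP or any other library in the table.

```python
import re

# Hypothetical pattern, tried left to right:
# URLs, then @mentions and #hashtags, then words (with an optional
# apostrophe part), then any single remaining punctuation character.
TWEET_TOKEN = re.compile(
    r"https?://\S+"
    r"|[@#]\w+"
    r"|\w+(?:'\w+)?"
    r"|[^\w\s]"
)


def tokenize_tweet(text):
    """Split a tweet into tokens, keeping URLs, mentions and hashtags intact."""
    return TWEET_TOKEN.findall(text)


tokens = tokenize_tweet("@nlp_fan loving #NLP, see https://example.org/a?b=1")
print(tokens)
# ['@nlp_fan', 'loving', '#NLP', ',', 'see', 'https://example.org/a?b=1']
```

A plain whitespace split would leave the comma attached to `#NLP`, while a generic word tokenizer would typically break the URL apart, which is why tweet-specific tokenizers exist.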
Oracle