Apache - Solr (Index and Search)



Apache Solr is a search platform built on Apache Lucene.

Built on Java, licensed under the Apache License 2.0.

Index data may be stored in HDFS (the Hadoop Distributed File System).

Elasticsearch is built on top of Lucene as well, not on Solr.


SolrJ is an API that makes it easy for applications written in Java (or any other JVM language) to talk to Solr.
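
Client libraries such as SolrJ ultimately send HTTP requests to Solr. As a minimal sketch of what such a request looks like, the Python snippet below builds a query URL for Solr's standard `/select` handler using the common `q`, `rows`, `fl` and `wt` parameters. The base URL `http://localhost:8983/solr`, the collection name `articles` and the helper name `build_select_url` are assumptions made for illustration only. The snippet only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, rows=10, fields=None):
    """Build a URL for the /select handler of a Solr collection.

    base_url and collection are hypothetical values supplied by the caller.
    """
    params = {
        "q": query,    # the query string, e.g. "title:lucene"
        "rows": rows,  # maximum number of documents to return
        "wt": "json",  # ask for a JSON response
    }
    if fields:
        # fl restricts the fields returned for each document
        params["fl"] = ",".join(fields)
    # urlencode percent-encodes reserved characters such as ':' and ','
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local Solr instance and collection name, for illustration only
    print(build_select_url(
        "http://localhost:8983/solr",
        "articles",
        "title:lucene",
        rows=5,
        fields=["id", "title"],
    ))
```

The resulting URL could be fetched with any HTTP client. In a Java application, SolrJ would normally take care of building and sending this kind of request.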


