Apache - Solr (Index and Search)

About

Apache Solr is a search platform built on Apache Lucene.

It is built in Java and licensed under the Apache License 2.0.

Index data may be stored in HDFS (the Hadoop Distributed File System).

Elasticsearch is not built on Solr: like Solr, it is built on top of Lucene.

API

SolrJ is an API that makes it easy for applications written in Java (or any other JVM language) to talk to Solr.
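
To show the kind of request a Java application ends up sending to Solr, here is a minimal sketch that uses only the JDK (SolrJ itself requires the solr-solrj dependency, where classes such as SolrClient and SolrQuery handle this for you). It builds the URL of a query against Solr's /select request handler. The base URL (8983 is Solr's default port), the collection name articles, and the field name title are assumptions made for illustration, not values from this page.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class SolrSelectUrl {

    // Builds the URI of a query against the /select handler of a collection.
    // baseUrl and collection are hypothetical values supplied by the caller.
    static URI build(String baseUrl, String collection, String query, int rows) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);                      // the query string
        params.put("rows", Integer.toString(rows));  // maximum number of results
        params.put("wt", "json");                    // response format

        // URL-encode each value and join the pairs with '&'.
        String queryString = params.entrySet().stream()
            .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));

        return URI.create(baseUrl + "/" + collection + "/select?" + queryString);
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and an illustrative collection and field.
        URI uri = build("http://localhost:8983/solr", "articles", "title:lucene", 10);
        System.out.println(uri);
        // prints http://localhost:8983/solr/articles/select?q=title%3Alucene&rows=10&wt=json
    }
}
```

The resulting URI can be passed to any HTTP client, for instance java.net.http.HttpClient. In a real application SolrJ is usually the better choice, since it also parses the response into Java objects.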
