Which is the best open source search engine?

- suitable for indexing large amounts of data
- suitable for fast retrieval
- support layer for NLP
- supports incremental updates
- out of the box: facets, auto-suggest, spell check, etc.

Question added by Hamzeh Abu Zakham, Director of Software Development, bayt.com
Date Posted: 2012/12/06
by Zaid Rabab'a, Technical Team Lead, ESKADENIA Software

In my opinion, Apache Solr. Solr features: it is a standalone enterprise search server with a REST-like API.
You put documents in it (called "indexing") via XML, JSON, CSV or binary over HTTP.
You query it via HTTP GET and receive XML, JSON, CSV or binary results.
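As a rough illustration of the index-then-query flow described above, here is a minimal Python sketch that only builds the HTTP requests and does not send them. The host, the core name `jobs`, and the field names `title` and `city` are assumptions for the example, and the exact update path can differ between Solr versions, so treat this as a sketch rather than a definitive client.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed host and core name; adjust to your own Solr installation.
SOLR_BASE = "http://localhost:8983/solr/jobs"


def build_index_request(docs, base=SOLR_BASE):
    """Build (but do not send) an HTTP POST that indexes docs as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        base + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_url(text, facet_field=None, base=SOLR_BASE):
    """Build the HTTP GET URL for a search, optionally faceting on one field."""
    params = [("q", text), ("wt", "json")]
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return base + "/select?" + urlencode(params)


if __name__ == "__main__":
    req = build_index_request([{"id": "1", "title": "Search engineer"}])
    print(req.get_method(), req.full_url)
    print(build_query_url("title:engineer", facet_field="city"))
```

Passing the request to `urllib.request.urlopen` against a running server would perform the actual indexing, and fetching the query URL would return the JSON results.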
- Advanced full-text search capabilities
- Optimized for high-volume web traffic
- Standards-based open interfaces (XML, JSON and HTTP)
- Comprehensive HTML administration interfaces
- Server statistics exposed over JMX for monitoring
- Linearly scalable, with auto index replication, auto failover and recovery
- Near real-time indexing
- Flexible and adaptable with XML configuration
- Extensible plugin architecture

For more information about this topic, see: A Comparison of Open Source Search Engines, by Christian Middleton and Ricardo Baeza-Yates.

by THIKRALLAH SHREAH, Technical Team Leader, bayt.com

Well, there are many open source engines available nowadays, and each one beats the others in one or more areas.
Solr/Lucene and Sphinx have both proven able to support large indexes (check their websites; you will see big names there).
Based on my personal experimentation, Sphinx is a little faster than Apache Solr, but it really depends on the application.
Solr in a clustered setup could beat Sphinx in some specific cases, and Sphinx could beat Solr on a single query.
Again, the requirements decide which one is suitable for you.
As for NLP, Xapian excels.
Other search engines may provide some NLP support, but nothing beats Xapian.
Real-time indexing is supported by most search engines.
Sphinx 2.0.x has excellent support for it, especially since Sphinx provides a MySQL-style interface, so it is really like working with a database.
Out-of-the-box features are Solr's strength.
Solr/Lucene is the open source search engine with the richest APIs.
In most standard usage, Sphinx and Solr should be sufficient for most requirements, but you need to fully understand how they run and how the driver APIs handle operations.
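To make the "like working with a database" point about Sphinx concrete, here is a small Python sketch that builds SQL-style statements for a real-time index. The index name `jobs_rt` and the `title` and `content` fields are hypothetical, and `conn` is assumed to be a DB-API-style connection from whichever MySQL client library you use, pointed at the port where searchd is configured to speak the MySQL protocol. It is a sketch under those assumptions, not a definitive driver.

```python
def insert_statement(index, doc_id, title, content):
    """Return (sql, params) that adds one document to a real-time index.

    The index name is inserted directly into the SQL text, so it must come
    from trusted configuration, never from user input.
    """
    sql = "INSERT INTO {} (id, title, content) VALUES (%s, %s, %s)".format(index)
    return sql, (doc_id, title, content)


def search_statement(index, text, limit=10):
    """Return (sql, params) for a full-text MATCH query on the index."""
    sql = "SELECT id FROM {} WHERE MATCH(%s) LIMIT %s".format(index)
    return sql, (text, limit)


def run(conn, statement):
    """Execute a (sql, params) pair on an assumed DB-API style connection."""
    sql, params = statement
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        cur.close()
```

With a live connection, `run(conn, insert_statement("jobs_rt", 1, "Search engineer", "Lucene, Sphinx"))` followed by `run(conn, search_statement("jobs_rt", "sphinx"))` would index a document and then search for it, much as one would insert and select rows in a database.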
