Discovery & Visualization
In today’s digital world, organizations are amassing petabytes of data but struggling to index and search it using traditional database approaches: RDBMSs simply cannot scale to this level. Apache Blur addresses this problem. Blur is a search engine capable of querying massive amounts of structured data at high speed. It combines the speed of document-oriented databases with the ability to build rich data models, delivering lightning-fast answers to complex queries.
Web and online social graphs have grown rapidly in size and scale during the past decade. In 2008, Google estimated that the number of web pages had reached over a trillion. Online social networking and email sites, including Yahoo!, Google, Microsoft, Facebook, LinkedIn, and Twitter, have hundreds of millions of users and are expected to grow much more in the future. Processing these graphs plays a big role in delivering relevant and personalized information to users, such as results from a search engine or news in an online social networking site.
The Apache Lucene™ project develops open-source search software, including:
- Lucene Core, our flagship sub-project, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
- Solr™ is a high-performance search server built using Lucene Core, with XML/HTTP and JSON/Python/Ruby APIs, hit highlighting, faceted search, caching, replication, and a web admin interface.
- Open Relevance Project is a subproject with the aim of collecting and distributing free materials for relevance testing and performance.
- PyLucene is a Python extension for accessing Lucene Core; it wraps the Java library rather than reimplementing it.