Welcome to a collection of resources that make the Apache Solr search engine more comprehensible to beginner and intermediate users. While Solr is very easy to get started with, tuning it is fairly complex, as it is for any search engine. This website tries to make that simpler by compiling information and creating tools that accelerate learning Solr. The currently available resources are linked in the menu bar above. More resources will be coming shortly.
Resources
There are currently three types of resources on the website. All are generated semi-automatically from the Solr source and distribution, so they are more complete than manually compiled lists.
Analyzers, Tokenizers, and Filters
Any processing of content in Solr is done through fields and their associated field types. Field type definitions consist of analyzer chains. Those chains can contain standalone analyzers. More commonly, however, they contain an optional sequence of Character Filters, followed by a compulsory Tokenizer, optionally followed by a sequence of Token Filters. There can be separate definitions for the index and query chains. Together with copyField instructions, this gives Solr great flexibility in how text is processed and searched.
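The order of the stages described above can be illustrated with a small, self-contained sketch. It is a conceptual model only: the class `AnalyzerChainSketch` and its stages are hypothetical and do not use the Lucene or Solr API. In a real installation, the chain is declared in the field type definition and built from Solr's own factories.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical model of an analyzer chain; NOT a Lucene/Solr class.
class AnalyzerChainSketch {
    final List<UnaryOperator<String>> charFilters;          // optional, zero or more
    final Function<String, List<String>> tokenizer;         // compulsory, exactly one
    final List<UnaryOperator<List<String>>> tokenFilters;   // optional, zero or more

    AnalyzerChainSketch(List<UnaryOperator<String>> charFilters,
                        Function<String, List<String>> tokenizer,
                        List<UnaryOperator<List<String>>> tokenFilters) {
        this.charFilters = charFilters;
        this.tokenizer = tokenizer;
        this.tokenFilters = tokenFilters;
    }

    // Runs the stages in the fixed order: char filters, tokenizer, token filters.
    List<String> analyze(String text) {
        for (UnaryOperator<String> filter : charFilters) {
            text = filter.apply(text);
        }
        List<String> tokens = tokenizer.apply(text);
        for (UnaryOperator<List<String>> filter : tokenFilters) {
            tokens = filter.apply(tokens);
        }
        return tokens;
    }

    // An illustrative chain: strip markup, split on whitespace, lowercase, drop stop words.
    static AnalyzerChainSketch example() {
        Set<String> stopWords = Set.of("the", "a", "an");
        UnaryOperator<String> stripTags = s -> s.replaceAll("<[^>]*>", " ");
        Function<String, List<String>> whitespace = s ->
                Arrays.stream(s.trim().split("\\s+"))
                      .filter(t -> !t.isEmpty())
                      .collect(Collectors.toList());
        UnaryOperator<List<String>> lowerCase = ts ->
                ts.stream().map(t -> t.toLowerCase(Locale.ROOT)).collect(Collectors.toList());
        UnaryOperator<List<String>> dropStopWords = ts ->
                ts.stream().filter(t -> !stopWords.contains(t)).collect(Collectors.toList());
        return new AnalyzerChainSketch(
                List.of(stripTags), whitespace, List.of(lowerCase, dropStopWords));
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox]
        System.out.println(example().analyze("<b>The</b> Quick Brown Fox"));
    }
}
```

Separate index and query chains would simply be two such chains attached to the same field type, each with its own choice of stages.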
Update Request Processors
Analyzer chains process the fields as they are indexed and searched. However, sometimes a submitted document needs to be processed before it reaches the indexing stage. This makes it possible to do things like automatically adding ID fields, implementing schemaless mode, or counting the number of values in a multi-valued field to speed up searches. This is done using custom Update Request Processor (URP) chains that are configured in solrconfig.xml. One URP chain can contain many individual URP Factories, allowing Solr to pre-process documents uniformly even if they originate from different clients outside Solr.
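The idea of a processor chain can be sketched in plain Java. This is a conceptual model under stated assumptions, not Solr's API: the class `UpdateChainSketch`, the field names `id`, `tags`, and `tags_count`, and the use of a `Map` as the document are all hypothetical. Real chains are assembled from URP factories declared in solrconfig.xml.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Consumer;

// Hypothetical model of an update processor chain; NOT a Solr class.
class UpdateChainSketch {

    // Processor 1: add an ID only when the client did not supply one.
    static final Consumer<Map<String, Object>> ADD_ID_IF_MISSING =
            doc -> doc.putIfAbsent("id", UUID.randomUUID().toString());

    // Processor 2: store the number of values of the (assumed) "tags" field.
    static final Consumer<Map<String, Object>> COUNT_TAGS = doc -> {
        Object tags = doc.get("tags");
        int count;
        if (tags instanceof List) {
            count = ((List<?>) tags).size();   // multi-valued field
        } else {
            count = (tags == null) ? 0 : 1;    // missing or single-valued field
        }
        doc.put("tags_count", count);
    };

    // The chain applies every processor, in order, to every incoming document.
    static final List<Consumer<Map<String, Object>>> CHAIN =
            List.of(ADD_ID_IF_MISSING, COUNT_TAGS);

    // Works on a copy so the caller's map is left untouched.
    static Map<String, Object> process(Map<String, Object> incoming) {
        Map<String, Object> doc = new LinkedHashMap<>(incoming);
        for (Consumer<Map<String, Object>> processor : CHAIN) {
            processor.accept(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "Hello");
        doc.put("tags", List.of("solr", "search"));
        // The printed document gains a generated "id" and "tags_count" set to 2.
        System.out.println(process(doc));
    }
}
```

Because every document passes through the same chain regardless of which client sent it, the derived fields end up consistent across all sources.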
Searchable Lucene and Solr Javadocs
Lucene and Solr Javadocs are available online. However, they are split into many packages, which often makes looking up a class somewhat difficult. And sometimes valuable configuration information is only available on a component's Javadoc page. To make this easier, this website provides combined Lucene and Solr Javadocs that are searchable using Solr-backed autocomplete. It also uses an alternative page layout (iframes instead of the usual frames) so that you can bookmark individual classes more easily.
The Analyzer and URP pages cross-link to the relevant Javadoc entries for all their components. In turn, the Javadoc cross-links to the source files for the listed version in the official Lucene/Solr GitHub repository. This way, it is possible to go from looking at a list of components, to checking an individual component's detailed documentation, to reviewing its source in just a couple of clicks.
Recent changes
- November 2018
  - Removed pop-up survey and Twitter embeds (annoying and slow)
  - Updated the home page to a more recent presentation
- March 2017
  - Moved to a new static site generator
  - Deleted archived versions of the information, as nobody looked at them
  - Deleted the least useful pages for now
  - Updated the home page to provide a bit more info on the website content
- February 2017
  - Updated all lists and Javadoc to Solr 6.4.0