Solr 4.10.1 UpdateRequestProcessor factories

Overview

UpdateRequestProcessors are a mechanism in Solr for changing documents as they are submitted for indexing. They provide advanced functions such as language identification, duplicate detection, intelligent defaults, integration with external text processing pipelines, and, most recently, dynamic schema definition.

UpdateRequestProcessor factories (a.k.a. Update Request Processors or URPs) can be chained, and multiple chains can be defined for one Solr collection. A chain is assigned to a request handler with the update.chain parameter, which can be defined in the configuration file or passed as part of the URL. For the full syntax, see the example solrconfig.xml or consult the Solr wiki.
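
As a rough illustration of chaining and of the update.chain parameter, the fragment below defines one chain and makes it the default for the /update handler. The chain name "mychain" is a placeholder chosen for this sketch, and the choice of processors is arbitrary; treat it as a starting point rather than a recommended setup.

```xml
<config>
  <!-- other solrconfig.xml settings omitted -->

  <!-- "mychain" is a hypothetical name; processors run in the order listed -->
  <updateRequestProcessorChain name="mychain">
    <processor class="solr.RemoveBlankFieldUpdateProcessorFactory" />
    <processor class="solr.LogUpdateProcessorFactory" />
    <!-- RunUpdateProcessorFactory executes the update, so it goes last -->
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

  <!-- Option 1: set the chain in the handler's defaults -->
  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">mychain</str>
    </lst>
  </requestHandler>

  <!-- Option 2: pass it per request in the URL, e.g. /update?update.chain=mychain -->
</config>
```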

Here you will find the full set of UpdateRequestProcessor factories, presented in their inheritance hierarchy. Abstract classes that you cannot use directly are marked with underlined italics. Clicking a class name opens the corresponding Javadoc page.

Most of the UpdateRequestProcessor factories are located in solr-core-4.10.1.jar ( example/solr-webapp/webapp/WEB-INF/lib/ ), so any entry without a location indicated can be found in that jar.

Factories

UpdateRequestProcessorFactory
A factory to generate an UpdateRequestProcessor for each request.

AbstractDefaultValueUpdateProcessorFactory
Base class that can be extended by any UpdateRequestProcessorFactory designed to add a default value to the document in an AddUpdateCommand when that field is not already specified.

DefaultValueUpdateProcessorFactory
An update processor that adds a constant default value to any document being added that does not already have a value in the specified field.

TimestampUpdateProcessorFactory
An update processor that adds a newly generated Date value of "NOW" to any document being added that does not already have a value in the specified field.
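
A minimal sketch of the two default-value processors used together. The field names status and indexed_at, the value "new", and the chain name are assumptions made up for this example; both fields would need to exist in your schema.

```xml
<updateRequestProcessorChain name="add-defaults">
  <!-- Adds status=new only when the document has no value for "status" -->
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">status</str>
    <str name="value">new</str>
  </processor>
  <!-- Adds the current time only when "indexed_at" has no value -->
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">indexed_at</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```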

AddSchemaFieldsUpdateProcessorFactory
This processor will dynamically add fields to the schema if an input document contains one or more fields that don't match any field or dynamic field in the schema.

CloneFieldUpdateProcessorFactory
Clones the values found in any matching source field into the configured dest field.
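
For instance, a processor entry along these lines would copy values from one field to another before later processors in the chain modify them. The field names title and title_copy are hypothetical.

```xml
<!-- Copies every value of "title" into "title_copy" (both names are placeholders) -->
<processor class="solr.CloneFieldUpdateProcessorFactory">
  <str name="source">title</str>
  <str name="dest">title_copy</str>
</processor>
```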

DistributedUpdateProcessorFactory
Factory for DistributedUpdateProcessor.

DocBasedVersionConstraintsProcessorFactory
This Factory generates an UpdateProcessor that helps to enforce Version constraints on documents based on per-document version numbers using a configured name of a versionField.

DocExpirationUpdateProcessorFactory
Update Processor Factory for managing automatic "expiration" of documents.

FieldMutatingUpdateProcessorFactory
Base class for implementing Factories for FieldMutatingUpdateProcessors and FieldValueMutatingUpdateProcessors.

ConcatFieldUpdateProcessorFactory
Concatenates multiple values for fields matching the specified conditions using a configurable delimiter which defaults to ", ".
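
A short sketch of how "fields matching the specified conditions" might be expressed. Field-mutating factories accept selector parameters such as fieldName, fieldRegex, typeName, and typeClass; the example below selects a single field by name. The field name tags and the delimiter are assumptions for illustration.

```xml
<!-- Joins all values of "tags" into one string, separated by "; " -->
<processor class="solr.ConcatFieldUpdateProcessorFactory">
  <str name="fieldName">tags</str>
  <str name="delimiter">; </str>
</processor>
```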

CountFieldValuesUpdateProcessorFactory
Replaces any list of values for a field matching the specified conditions with the count of the number of values for that field.

FieldLengthUpdateProcessorFactory
Replaces any CharSequence values found in fields matching the specified conditions with the lengths of those CharSequences (as an Integer).

FieldValueSubsetUpdateProcessorFactory
Base class for processors that want to mutate selected fields to only keep a subset of the original values.

FirstFieldValueUpdateProcessorFactory
Keeps only the first value of fields matching the specified conditions.

LastFieldValueUpdateProcessorFactory
Keeps only the last value of fields matching the specified conditions.

MaxFieldValueUpdateProcessorFactory
An update processor that keeps only the maximum value from any selected fields where multiple values are found.

MinFieldValueUpdateProcessorFactory
An update processor that keeps only the minimum value from any selected fields where multiple values are found.

UniqFieldsUpdateProcessorFactory
Removes duplicate values found in fields matching the specified conditions.

HTMLStripFieldUpdateProcessorFactory
Strips all HTML Markup in any CharSequence values found in fields matching the specified conditions.

IgnoreFieldUpdateProcessorFactory
Ignores and removes fields matching the specified conditions from any document being added to the index.

ParseBooleanFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Boolean values.

ParseDateFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Date values.

ParseNumericFieldUpdateProcessorFactory
Abstract base class for numeric parsing update processor factories.

ParseDoubleFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Double values.

ParseFloatFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Float values.

ParseIntFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Integer values.

ParseLongFieldUpdateProcessorFactory
Attempts to mutate selected fields that have only CharSequence-typed values into Long values.

PreAnalyzedUpdateProcessorFactory
An update processor that parses configured fields of any document being added using PreAnalyzedField with the configured format parser.

RegexReplaceProcessorFactory
An update processor that applies a configured regex to any CharSequence values found in the selected fields, and replaces any matches with the configured replacement string.
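
As an illustration, the entry below would collapse each run of whitespace in one field into a single space. The field name content is a placeholder; pattern and replacement carry the regex and its substitution.

```xml
<!-- Replaces every run of whitespace in "content" with one space -->
<processor class="solr.RegexReplaceProcessorFactory">
  <str name="fieldName">content</str>
  <str name="pattern">\s+</str>
  <str name="replacement"> </str>
</processor>
```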

RemoveBlankFieldUpdateProcessorFactory
Removes any values found which are CharSequence with a length of 0.

TrimFieldUpdateProcessorFactory
Trims leading and trailing whitespace from any CharSequence values found in fields matching the specified conditions and returns the resulting String.

TruncateFieldUpdateProcessorFactory
Truncates any CharSequence values found in fields matching the specified conditions to a maximum character length.
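
A possible configuration, selecting fields by schema type rather than by name. The limit of 100 characters is an arbitrary value chosen for this sketch.

```xml
<!-- Cuts values of all StrField-typed fields down to at most 100 characters -->
<processor class="solr.TruncateFieldUpdateProcessorFactory">
  <str name="typeClass">solr.StrField</str>
  <int name="maxLength">100</int>
</processor>
```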

LangDetectLanguageIdentifierUpdateProcessorFactory in solr-langid-4.10.1.jar ( dist/ )
Identifies the language of a set of input fields using http://code.google.com/p/language-detection. The UpdateProcessorChain config entry can take a number of parameters, which may also be passed as HTTP parameters on the update request to override the defaults.
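
A minimal sketch of such a config entry, assuming the solr-langid jar and its dependencies are on the classpath. The field names title, body, and language_s are placeholders, and only a few of the available langid.* parameters are shown.

```xml
<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <!-- Input fields to inspect (hypothetical names) -->
  <str name="langid.fl">title,body</str>
  <!-- Field that receives the detected language code -->
  <str name="langid.langField">language_s</str>
  <!-- Language to record when detection fails -->
  <str name="langid.fallback">en</str>
</processor>
```

The same parameter names can be sent with an individual update request, for example langid.fl=title, to override these defaults for that request.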

LogUpdateProcessorFactory
A logging processor.

NoOpDistributingUpdateProcessorFactory
A No-Op implementation of DistributingUpdateProcessorFactory that always returns null.

RegexpBoostProcessorFactory
Factory which creates RegexpBoostProcessor instances.

RunUpdateProcessorFactory
Executes the update commands using the underlying UpdateHandler.

SignatureUpdateProcessorFactory

StatelessScriptUpdateProcessorFactory
An update request processor factory that enables the use of update processors implemented as scripts which can be loaded by the SolrResourceLoader (usually via the conf dir for the SolrCore).
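
For example, an entry like the following would load a script from the core's conf directory. The file name update-script.js is an assumption for this sketch; the script itself must exist and define the processing functions the factory calls.

```xml
<!-- Loads conf/update-script.js via the SolrResourceLoader (file name is a placeholder) -->
<processor class="solr.StatelessScriptUpdateProcessorFactory">
  <str name="script">update-script.js</str>
</processor>
```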

TikaLanguageIdentifierUpdateProcessorFactory in solr-langid-4.10.1.jar ( dist/ )
Identifies the language of a set of input fields using Tika's LanguageIdentifier.

UIMAUpdateRequestProcessorFactory in solr-uima-4.10.1.jar ( dist/ )
Factory for UIMAUpdateRequestProcessor.

URLClassifyProcessorFactory
Creates URLClassifyProcessor.

UUIDUpdateProcessorFactory
An update processor that adds a newly generated UUID value to any document being added that does not already have a value in the specified field.
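
A short sketch of generating a key for documents that arrive without one. Using id as the target field is an assumption for this example.

```xml
<!-- Fills "id" with a new UUID only when the document has no "id" value -->
<processor class="solr.UUIDUpdateProcessorFactory">
  <str name="fieldName">id</str>
</processor>
```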

Short Names

Note that most UpdateRequestProcessor factories can be referenced by a short name, such as:
<processor class="solr.CustomUpdateRequestProcessorFactory" />
Only non-core URPs require the full class name, including the package name.
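
To make the difference concrete, the sketch below mixes both naming styles in one chain. The chain name and the field names are placeholders, and the second processor assumes the solr-langid jar is on the classpath.

```xml
<updateRequestProcessorChain name="names-demo">
  <!-- Core factory referenced by short name -->
  <processor class="solr.TrimFieldUpdateProcessorFactory" />
  <!-- Factory from solr-langid referenced by full class name, including package -->
  <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">title,body</str>
    <str name="langid.langField">language_s</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```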


Previous versions of this document

You can also find archive versions of this document for version 4.9.0, version 4.8.0, and version 4.7.0
