org.apache.solr.update.processor

Class RegexReplaceProcessorFactory

  • All Implemented Interfaces:
    NamedListInitializedPlugin, SolrCoreAware


    public final class RegexReplaceProcessorFactory
    extends FieldMutatingUpdateProcessorFactory
    An update processor that applies a configured regex to any CharSequence values found in the selected fields, and replaces any matches with the configured replacement string.

    By default this processor applies itself to no fields.

    By default, literalReplacement is set to true, in which case the replacement string is treated literally by quoting it via Matcher.quoteReplacement(String), so '\' and '$' characters in it have no special meaning. When literalReplacement is set to false, the replacement string can use backreferences to perform capture group substitutions (for example, $1).
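
    The difference between the two modes can be illustrated with plain java.util.regex calls. This is a minimal sketch, not the Solr implementation: the class name LiteralReplacementSketch, the helper replace, and the sample date value are hypothetical, and it assumes the replacement is applied with Matcher.replaceAll.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it only mimics the literalReplacement switch.
class LiteralReplacementSketch {

    // Assumption: when literal is true the replacement is quoted first,
    // otherwise it is passed to replaceAll unchanged.
    static String replace(String value, String regex, String replacement, boolean literal) {
        String effective = literal ? Matcher.quoteReplacement(replacement) : replacement;
        return Pattern.compile(regex).matcher(value).replaceAll(effective);
    }

    public static void main(String[] args) {
        String value = "2024-01-15";
        String regex = "(\\d{4})-(\\d{2})-(\\d{2})";

        // Literal mode: '$' is not interpreted, so the text is inserted as written.
        System.out.println(replace(value, regex, "$3/$2/$1", true));  // $3/$2/$1

        // Non-literal mode: $1, $2, $3 refer to the capture groups.
        System.out.println(replace(value, regex, "$3/$2/$1", false)); // 15/01/2024
    }
}
```

    With literal mode the replacement text appears verbatim in the output; without it, the same string reorders the captured date parts.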

    For example, with the configuration listed below, any sequence of one or more whitespace characters found in values for the fields named title or content will be replaced by a single space character.

     <processor class="solr.RegexReplaceProcessorFactory">
       <str name="fieldName">content</str>
       <str name="fieldName">title</str>
       <str name="pattern">\s+</str>
       <str name="replacement"> </str>
       <bool name="literalReplacement">true</bool>
     </processor>
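
    The effect of the pattern and replacement in the configuration above on a single field value can be approximated in plain Java. This is a sketch under assumptions: the class WhitespaceCollapseSketch and the method collapse are hypothetical names, and it reproduces only the regex step, not Solr's field selection or document handling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it mirrors only the pattern/replacement pair above.
class WhitespaceCollapseSketch {

    // Same pattern as <str name="pattern">\s+</str>
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Replaces every whitespace run with one space, treating the
    // replacement literally as literalReplacement=true would.
    static String collapse(CharSequence value) {
        return WHITESPACE.matcher(value).replaceAll(Matcher.quoteReplacement(" "));
    }

    public static void main(String[] args) {
        System.out.println(collapse("Solr   update\t\nprocessors")); // Solr update processors
    }
}
```

    Note that \s+ also matches a single whitespace character, so a lone tab or newline becomes a space, and leading or trailing whitespace is reduced to one space rather than trimmed.
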
    See Also:
    Pattern