org.apache.solr.update.processor

Class DocExpirationUpdateProcessorFactory

  • All Implemented Interfaces:
    NamedListInitializedPlugin, SolrCoreAware


    public final class DocExpirationUpdateProcessorFactory
    extends UpdateRequestProcessorFactory
    implements SolrCoreAware

    Update Processor Factory for managing automatic "expiration" of documents.

    The DocExpirationUpdateProcessorFactory provides two features related to the "expiration" of documents which can be used individually, or in combination:

    1. Computing expiration field values for documents from a "time to live" (TTL)
    2. Periodically deleting documents from the index based on an expiration field

    Documents with expiration field values computed from a TTL can be excluded from searchers using simple date based filters relative to NOW, or completely removed from the index using the periodic delete function of this factory. Alternatively, the periodic delete function of this factory can be used to remove any document with an expiration value - even if that expiration was explicitly set without leveraging the TTL feature of this factory.
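
    As a rough sketch of the "date based filter" approach, a search handler could append a filter query that hides expired documents from every request. The handler name /select and the field name _expire_at_ are assumptions made for illustration; substitute whatever is set as expirationFieldName.

     <requestHandler name="/select" class="solr.SearchHandler">
       <lst name="appends">
         <!-- assumed field name: exclude documents whose expiration is at or before NOW -->
         <str name="fq">-_expire_at_:[* TO NOW]</str>
       </lst>
     </requestHandler>

    With a filter like this, expired documents stay in the index but no longer match searches, which can be combined with the periodic delete or used on its own.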

    The following configuration options are supported:

    • expirationFieldName - The name of the expiration field to use in any operations (mandatory).
    • ttlFieldName - Name of a field this processor should look for in each document processed, defaulting to _ttl_. If the specified field name exists in a document, the document field value will be parsed as a Date Math Expression relative to NOW and the result will be added to the document using the expirationFieldName. Use <null name="ttlFieldName"/> to disable this feature.
    • ttlParamName - Name of an update request param this processor should look for in each request when processing document additions, defaulting to _ttl_. If the specified param name exists in an update request, the param value will be parsed as a Date Math Expression relative to NOW and the result will be used as a default for any document included in that request that does not already have a value in the field specified by ttlFieldName. Use <null name="ttlParamName"/> to disable this feature.
    • autoDeletePeriodSeconds - Optional numeric value indicating how often this factory should trigger a delete to remove documents. If this option is used, and specifies a positive numeric value, a background thread will be created that will execute recurring deleteByQuery commands using the specified period. The delete query will remove all documents whose expirationFieldName value is at or before NOW.
    • autoDeleteChainName - Optional name of an updateRequestProcessorChain to use when executing automatic deletes. If not specified, or <null/>, the default updateRequestProcessorChain for this collection is used. This option is ignored unless autoDeletePeriodSeconds is configured and is positive.

    For example: The configuration below will cause any document with a field named _ttl_ to have a Date field named _expire_at_ computed for it when added -- but no automatic deletion will happen.

     <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
       <str name="expirationFieldName">_expire_at_</str>
     </processor>
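
    To make the effect concrete, here is a sketch of an XML update message that would trigger the TTL computation under the configuration above. The id value and the +30SECONDS expression are illustrative assumptions, not values taken from this class.

     <add>
       <doc>
         <field name="id">example-doc-1</field>
         <!-- Date Math Expression relative to NOW; used to compute _expire_at_ -->
         <field name="_ttl_">+30SECONDS</field>
       </doc>
     </add>

    The processor would be expected to add an _expire_at_ value roughly 30 seconds after the time the document is processed.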

    Alternatively, in this configuration deletes will occur automatically against the _expire_at_ field every 5 minutes - but this processor will not automatically populate the _expire_at_ field using any sort of TTL expression. Only documents that were added with an explicit _expire_at_ field value will ever be deleted.

     <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
       <null name="ttlFieldName"/>
       <null name="ttlParamName"/>
       <int name="autoDeletePeriodSeconds">300</int>
       <str name="expirationFieldName">_expire_at_</str>
     </processor>

    This last example shows the combination of both features using a custom ttlFieldName: Documents with a my_ttl field will have an _expire_at_ field computed, and deletes will be triggered every 5 minutes to remove documents whose _expire_at_ field value is in the past.

     <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
       <int name="autoDeletePeriodSeconds">300</int>
       <str name="ttlFieldName">my_ttl</str>
       <null name="ttlParamName"/>
       <str name="expirationFieldName">_expire_at_</str>
     </processor>
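
    None of the examples above set autoDeleteChainName. The following sketch shows one way it might be used, so that the periodic deletes run through a separate, minimal chain. The chain names expire-deletes and add-with-expiry are hypothetical, and the choice of processors in each chain is an assumption.

     <!-- hypothetical chain used only for the automatic deletes -->
     <updateRequestProcessorChain name="expire-deletes">
       <processor class="solr.RunUpdateProcessorFactory"/>
     </updateRequestProcessorChain>

     <!-- hypothetical default chain that computes expirations on add -->
     <updateRequestProcessorChain name="add-with-expiry" default="true">
       <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
         <int name="autoDeletePeriodSeconds">300</int>
         <str name="autoDeleteChainName">expire-deletes</str>
         <str name="expirationFieldName">_expire_at_</str>
       </processor>
       <processor class="solr.RunUpdateProcessorFactory"/>
     </updateRequestProcessorChain>

    Routing deletes through their own chain keeps any processors in the default chain that are meant for document additions from being applied to the recurring deleteByQuery commands.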