Class DictionaryColumn

java.lang.Object
org.apache.lucene.document.column.Column
org.apache.lucene.document.column.DictionaryColumn

public abstract class DictionaryColumn extends Column
A Column that provides string or binary values via a pre-defined term dictionary plus per-doc ordinals into that dictionary. It is used for SORTED and SORTED_SET doc values, for stored binary or string fields, and for term inversion (tokenized or untokenized).

Iteration is performed via cursors. tuples() is always available and yields (docID, ordinal) pairs. values() is a bulk cursor over consecutive doc-ids; it must be overridden when Column.density() is DENSE and is only consulted in that case.

The caller supplies a fixed List<BytesRef> dictionary at construction. Per-doc ordinals returned by cursors index into this dictionary.
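As a rough illustration of the dictionary-plus-ordinals model, the following sketch resolves (docID, ordinal) pairs against a fixed dictionary. It deliberately does not use the DictionaryColumn API: the class DictionaryOrdinalSketch, the resolve method, and the parallel docIDs/ordinals arrays are hypothetical stand-ins, and plain byte[] is used in place of BytesRef.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of Lucene.
final class DictionaryOrdinalSketch {

    /** Looks up each per-doc ordinal in the dictionary and formats "docID=term". */
    static List<String> resolve(List<byte[]> dictionary, int[] docIDs, int[] ordinals) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < docIDs.length; i++) {
            // The ordinal is simply an index into the caller-supplied list.
            byte[] term = dictionary.get(ordinals[i]);
            out.add(docIDs[i] + "=" + new String(term, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> dictionary = Arrays.asList(
            "red".getBytes(StandardCharsets.UTF_8),
            "green".getBytes(StandardCharsets.UTF_8),
            "blue".getBytes(StandardCharsets.UTF_8));
        // Docs 0, 3 and 4 have values; docs 1 and 2 have none (a sparse column).
        int[] docIDs = {0, 3, 4};
        int[] ordinals = {2, 0, 2};
        System.out.println(resolve(dictionary, docIDs, ordinals));
    }
}
```

Each distinct value is stored once in the dictionary, and every document carries only a small integer.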

Duplicate dictionary entries are permitted; two slots with the same bytes will both resolve to the same Lucene-level ordinal. The dictionary may be in any order.
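One way to picture how duplicate, unordered slots can collapse onto shared ordinals is the sketch below. It is an assumption-laden illustration, not Lucene's internal code: SlotRemapSketch and slotToOrdinal are hypothetical names, String stands in for BytesRef, and the final ordinal is assumed to be the term's rank among the unique, sorted terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; not part of Lucene.
final class SlotRemapSketch {

    /** Maps each dictionary slot to the rank of its term among the unique, sorted terms. */
    static int[] slotToOrdinal(List<String> dictionary) {
        // TreeSet removes duplicates and sorts in one step.
        List<String> unique = new ArrayList<>(new TreeSet<>(dictionary));
        int[] map = new int[dictionary.size()];
        for (int slot = 0; slot < map.length; slot++) {
            // Every term is present in 'unique', so the search always returns its index.
            map[slot] = Collections.binarySearch(unique, dictionary.get(slot));
        }
        return map;
    }

    public static void main(String[] args) {
        // Slots 0 and 2 hold the same term and the list is unsorted.
        List<String> dictionary = Arrays.asList("pear", "apple", "pear", "fig");
        System.out.println(Arrays.toString(slotToOrdinal(dictionary))); // [2, 0, 2, 1]
    }
}
```

Both "pear" slots map to ordinal 2, so a cursor may return either slot for a document and the resolved value is the same.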

The dictionary list and the backing byte arrays of its entries must not be mutated after the column is constructed.

WARNING: This API is experimental and might change in incompatible ways in the next release.
  • Constructor Details

    • DictionaryColumn

      protected DictionaryColumn(String name, IndexableFieldType fieldType, Column.Density density, List<BytesRef> dictionary)
      Creates a DictionaryColumn.
      Parameters:
      name - the field name
      fieldType - describes how this field should be indexed
      density - whether every batch-local doc-id has a value
      dictionary - the term universe; entries must be non-null and no longer than ByteBlockPool.BYTE_BLOCK_SIZE - 2 bytes. The list must contain at least one entry. Duplicate entries are allowed but incur a minor per-batch cost.
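A caller might pre-check a dictionary against the constraints listed above before passing it to the constructor. The sketch below is only an illustration: DictionaryCheckSketch and check are hypothetical, byte[] stands in for BytesRef, the block size is hard-coded on the assumption that ByteBlockPool.BYTE_BLOCK_SIZE is 1 << 15, and the exception type actually thrown by the constructor is not stated in this documentation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of Lucene.
final class DictionaryCheckSketch {

    // Assumption: mirrors ByteBlockPool.BYTE_BLOCK_SIZE (believed to be 1 << 15).
    static final int BYTE_BLOCK_SIZE = 1 << 15;
    static final int MAX_ENTRY_LENGTH = BYTE_BLOCK_SIZE - 2;

    /** Throws IllegalArgumentException if the dictionary violates a documented constraint. */
    static void check(List<byte[]> dictionary) {
        if (dictionary == null || dictionary.isEmpty()) {
            throw new IllegalArgumentException("dictionary must contain at least one entry");
        }
        for (int slot = 0; slot < dictionary.size(); slot++) {
            byte[] entry = dictionary.get(slot);
            if (entry == null) {
                throw new IllegalArgumentException("null entry at slot " + slot);
            }
            if (entry.length > MAX_ENTRY_LENGTH) {
                throw new IllegalArgumentException(
                    "entry at slot " + slot + " is " + entry.length
                        + " bytes; limit is " + MAX_ENTRY_LENGTH);
            }
        }
    }

    public static void main(String[] args) {
        check(Arrays.asList(new byte[] {1, 2, 3}, new byte[MAX_ENTRY_LENGTH])); // passes
        try {
            check(Arrays.asList(new byte[MAX_ENTRY_LENGTH + 1]));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```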
  • Method Details