Lucene nightly benchmarks

Each night, an automated Python tool checks out the Lucene/Solr trunk source code and runs multiple benchmarks: indexing the entire English Wikipedia export three times (with different settings and document sizes), running a near-real-time latency test, and running a set of "hardish" auto-generated queries and tasks. The tests take around 2.5 hours to run; the results are verified against the previous run and then added to the graphs linked below.

The goal is to spot any long-term regressions (or gains!) in Lucene's performance that might otherwise slip past the committers unnoticed, hopefully avoiding the fate of the boiling frog.
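
To illustrate the "boiling frog" problem, here is a minimal sketch of how slow drift could be flagged in a series of nightly measurements. It is not the code the nightly tool uses: the function name `detect_drift`, the window size, and the 5% threshold are all hypothetical choices made for illustration. It compares the median of the oldest runs with the median of the newest runs, so a decline that is too small to notice from one night to the next still shows up.

```python
from statistics import median


def detect_drift(qps_history, window=7, threshold=0.05):
    """Return the relative change between the oldest and newest runs.

    qps_history: nightly queries-per-second values, oldest first.
    window:      how many runs to take from each end (hypothetical default).
    threshold:   minimum relative change worth reporting (hypothetical default).

    Returns a float (negative means slower) when the change reaches the
    threshold, or None when the history is too short or the change is small.
    """
    if len(qps_history) < 2 * window:
        return None  # not enough data for two non-overlapping windows

    # Medians are less sensitive to a single noisy night than means.
    baseline = median(qps_history[:window])
    recent = median(qps_history[-window:])

    change = (recent - baseline) / baseline
    return change if abs(change) >= threshold else None


# Usage: each night is only 0.5 QPS slower than the night before,
# yet the accumulated drift over 30 nights is flagged.
slow_decline = [100.0 - 0.5 * night for night in range(30)]
print(detect_drift(slow_decline))
```

A fixed early baseline is used here instead of a night-over-night comparison because a gradual regression never produces a large single-night step.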

See more details in this blog post.

See pretty flame charts from Java Flight Recorder profiling at blunders.io.



Indexing:
    Indexing throughput
    Analyzers throughput
    Near-real-time refresh latency
    GB/hour for medium docs index
    GB/hour for medium docs index with vectors
    GB/hour for medium docs index with int8 quantized vectors
    GB/hour for big docs index
    Indexing JIT/GC times
    Index disk usage

BooleanQuery:
    +high-freq +high-freq
    +high-freq +medium-freq
    high-freq high-freq
    high-freq medium-freq
    +high-freq +(medium-freq medium-freq)
    +medium-freq +(high-freq high-freq)
    Disjunction of 2 regular terms and 2 stop words
    Conjunction of 2 regular terms and 2 stop words
    Disjunction of 2 or more stop words
    Conjunction of 2 or more stop words
    Disjunction of 3 terms
    Conjunction of 3 terms
    Disjunction of a very frequent term and a very rare term
    Disjunction of many terms

CombinedFieldQuery:
    Combined high-freq
    Combined OR high-freq medium-freq
    Combined OR high-freq high-freq
    Combined AND high-freq medium-freq
    Combined AND high-freq high-freq

DisjunctionMaxQuery to take the maximum score across the title and body fields:
    Term query
    Disjunctive query on a high-frequency term and a medium-frequency term
    Disjunctive query on two high-frequency terms

Proximity queries:
    Exact phrase
    Sloppy (~4) phrase
    Span near (~10)
    Ordered intervals (MAXWIDTH/10)

FuzzyQuery:
    Edit distance 1
    Edit distance 2

Count:
    Count(Term)
    Count(Phrase)
    Count(+high-freq +high-freq)
    Count(+high-freq +medium-freq)
    Count(high-freq high-freq)
    Count(high-freq medium-freq)
    Count(Filtered(Phrase))
    Count(Filtered(high-freq high-freq))
    Count(Filtered(high-freq medium-freq))

Vector Search:
    VectorSearch (approximate KNN float 768-dimension vector search from word embeddings)
    Likewise, with a pre-filter
    Same filter, but applied as a post-filter rather than a pre-filter
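
The difference between a pre-filter and a post-filter can be sketched with a toy example. This is an assumption-laden illustration, not Lucene's implementation: it uses exact brute-force search over tiny 2-dimensional vectors instead of approximate KNN over 768-dimension embeddings, and the helper names (`knn`, `pre_filter_knn`, `post_filter_knn`) are hypothetical.

```python
import math


def knn(query, vectors, k, allowed=None):
    """Exact k-nearest neighbors by Euclidean distance.

    `allowed`, when given, restricts the candidate document ids.
    """
    ids = range(len(vectors)) if allowed is None else allowed
    return sorted(ids, key=lambda i: math.dist(query, vectors[i]))[:k]


def pre_filter_knn(query, vectors, k, accept):
    """Apply the filter first, then search only among accepted documents."""
    allowed = [i for i in range(len(vectors)) if accept(i)]
    return knn(query, vectors, k, allowed)


def post_filter_knn(query, vectors, k, accept):
    """Search all documents first, then drop hits the filter rejects."""
    return [i for i in knn(query, vectors, k) if accept(i)]


# Usage: five documents on a line; the filter accepts even ids only.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
query = (0.0, 0.0)
is_even = lambda doc_id: doc_id % 2 == 0

print(pre_filter_knn(query, vectors, 2, is_even))   # [0, 2]
print(post_filter_knn(query, vectors, 2, is_even))  # [0]
```

In this sketch the pre-filter always returns k hits when enough documents pass the filter, while the post-filter can return fewer than k because rejected documents have already used up slots in the top k.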

Other queries:
    TermQuery
    Respell (DirectSpellChecker)
    Primary key lookup
    WildcardQuery
    PrefixQuery (3 leading characters)
    Numeric range filtering on last-modified-datetime

Filtered queries where the filter matches 5% of the index:
    Filtered term query
    Filtered conjunctive query on two high-frequency terms
    Filtered conjunctive query on a high-frequency term and a medium-frequency term
    Filtered disjunctive query on two high-frequency terms
    Filtered disjunctive query on a high-frequency term and a medium-frequency term
    Filtered phrase query
    Filtered disjunction of 2 regular terms and 2 stop words
    Filtered conjunction of 2 regular terms and 2 stop words
    Filtered disjunction of 2 or more stop words
    Filtered conjunction of 2 or more stop words
    Filtered disjunction of 3 terms
    Filtered conjunction of 3 terms
    Filtered disjunction of many terms
    Filtered prefix query on a prefix of 3 chars
    Filtered numeric range query

Faceting:
    Term query + date hierarchy
    All dates hierarchy
    All dates hierarchy (doc values)
    All months
    All months (doc values)
    All dayOfYear
    All dayOfYear (doc values)
    medium-freq-term +dayOfYear taxo facets
    high-freq medium-freq +dayOfYear taxo facets
    +high-freq +high-freq +dayOfYear taxo facets
    +high-freq +medium-freq +dayOfYear taxo facets
    Random labels chosen from each doc
    Random labels chosen from each doc (doc values)
    NAD high cardinality faceting

Sorting (on TermQuery):
    Date/time (long, high cardinality)
    Title (string, high cardinality)
    Month (string, low cardinality)
    Day of year (int, medium cardinality)

Grouping (on TermQuery):
    100 groups
    10K groups
    1M groups
    1M groups (two pass block grouping)
    1M groups (single pass block grouping)

Others:
    Stored fields geonames benchmarks
    GC/JIT metrics during search benchmarks
    Geospatial benchmarks
    Sparse vs dense doc values performance on NYC taxi ride corpus
    "gradle -p lucene test" and "gradle precommit" time in Lucene
    CheckIndex time
    Lucene GitHub pull-request counts
    CombinedFieldQuery (OR, high-freq, high-freq term)
    CombinedFieldQuery (OR, high-freq, medium-freq term)


[last updated: 2024-12-07 07:04:13.979072; send questions to Mike McCandless]