Elasticlunr Query DSL
January 8th, 2022
Like most search engines, Elasticlunr lets you build more advanced search queries to suit your requirements. In the rest of this post, I will highlight the query types the library provides and how you can use them.
I would like to mention that Elasticlunr tries to replicate the popular Query DSL (Domain Specific Language) of Elasticsearch, with the same behavior, so the learning curve is gentler if you have experience with that search engine. Elasticlunr provides the `bool`, `match`, `match_all`, `not`, and `terms` query types, which you can use to retrieve insights about an index.
So, let's walk through these query types and their usage, using the blog posts index example from the previous blog post.
Bool
The `bool` query combines other queries to retrieve documents that match boolean combinations of clauses. Think of these clauses as everything that comes after the `SELECT` statement in relational databases. Note that the `bool` query is used under the hood when you pass a string as the query to `Index.search/2`.
The `bool` query is built from one or more clauses to achieve the desired results, and each clause has its own type, as shown below:
| Clause | Description |
|---|---|
| `must` | The clause must appear in the matching documents, and this affects the document's score. |
| `must_not` | The clause must not appear in the matching document. Scoring is ignored because the clause is executed in the filter context. |
| `filter` | Like `must`, the clause must appear in the matching documents, but scoring is ignored for the query. |
| `should` | The clause should appear in the matching document. |
It's important to note that only scores from the `must` and `should` clauses contribute to the final score of the matching document.
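To make the clause types above more concrete, here is a minimal sketch in plain Elixir. It does not use Elasticlunr and is not how the library is implemented: `BoolSketch`, the `contains` helper, and the sample document are all made up for illustration, and `minimum_should_match` is left out to keep it short. Each clause is modeled as a function that returns a score, where 0.0 means "no match".

```elixir
defmodule BoolSketch do
  # Toy model of bool clause semantics; not Elasticlunr code.
  # Each clause is a function: document -> score (0.0 means no match).
  def score(doc, clauses) do
    must = Map.get(clauses, :must, [])
    filter = Map.get(clauses, :filter, [])
    must_not = Map.get(clauses, :must_not, [])
    should = Map.get(clauses, :should, [])

    # must and filter clauses are both required to match.
    required? = Enum.all?(must ++ filter, fn clause -> clause.(doc) > 0 end)
    # Any matching must_not clause excludes the document.
    excluded? = Enum.any?(must_not, fn clause -> clause.(doc) > 0 end)

    if required? and not excluded? do
      # Only must and should clauses add to the final score.
      total = Enum.reduce(must ++ should, 0.0, fn clause, acc -> acc + clause.(doc) end)
      {:match, total}
    else
      :no_match
    end
  end
end

# Hypothetical clause helper: scores 1.0 when the field contains the word.
contains = fn field, word ->
  fn doc -> if word in String.split(doc[field]), do: 1.0, else: 0.0 end
end

doc = %{"content" => "how to use ecto", "category" => "elixir", "author" => "jane"}

clauses = %{
  must: [contains.("content", "use")],
  should: [contains.("category", "elixir")],
  filter: [contains.("content", "ecto")],
  must_not: [contains.("author", "mika")]
}

IO.inspect(BoolSketch.score(doc, clauses))
# => {:match, 2.0}
```

The `filter` clause has to match for the document to be returned, yet the score of 2.0 comes only from the `must` and `should` clauses.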
```elixir
# example bool query
Index.search(index, %{
  "query" => %{
    "bool" => %{
      "must" => %{
        "terms" => %{"content" => "use"}
      },
      "should" => %{
        "terms" => %{"category" => "elixir"}
      },
      "filter" => %{
        "match" => %{
          "id" => 3
        }
      },
      "must_not" => %{
        "match" => %{
          "author" => "mika"
        }
      },
      "minimum_should_match" => 1
    }
  }
})
```
You can use the `minimum_should_match` parameter to specify the number of `should` clauses that returned documents must match.
If the `bool` query includes at least one `should` clause and no `must` or `filter` clauses, the default value is 1. Otherwise, the default value is 0.
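The default rule is easy to misread, so here is a small sketch that spells it out as a plain Elixir function over a `bool` query map. The `MinimumShouldMatch` module is hypothetical and only encodes the rule as described in this post; it is not taken from the library's source.

```elixir
defmodule MinimumShouldMatch do
  # Hypothetical helper: returns the minimum_should_match value that applies
  # to a bool query map, following the rule described in the post.
  def effective(bool_query) do
    case Map.fetch(bool_query, "minimum_should_match") do
      # An explicit value always wins.
      {:ok, value} -> value
      :error -> default(bool_query)
    end
  end

  defp default(bool_query) do
    has_should? = present?(bool_query, "should")
    has_required? = present?(bool_query, "must") or present?(bool_query, "filter")

    # 1 only when there are should clauses and no must or filter clauses.
    if has_should? and not has_required?, do: 1, else: 0
  end

  # A clause counts as present when its key holds a non-empty map or list.
  defp present?(bool_query, key) do
    Map.get(bool_query, key) not in [nil, %{}, []]
  end
end

only_should = %{"should" => %{"terms" => %{"category" => "elixir"}}}
with_must = Map.put(only_should, "must", %{"terms" => %{"content" => "use"}})

IO.inspect(MinimumShouldMatch.effective(only_should))
# => 1
IO.inspect(MinimumShouldMatch.effective(with_must))
# => 0
```

In other words, `should` clauses become optional as soon as a `must` or `filter` clause is present, unless you set `minimum_should_match` yourself.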
Match
The `match` query is the standard query used for full-text search, including support for fuzzy matching. The provided text is analyzed before it is matched against documents.
```elixir
# example match query
Index.search(index, %{
  "query" => %{
    "match" => %{
      "content" => %{
        "query" => "liveview browser"
      }
    }
  }
})
```
A `match` query accepts one or more top-level fields you wish to search; in the example above, it's the `content` field. Note that when you have more than one top-level field, the `match` query is rewritten to a `bool` query internally by the library. Now, let's see which parameters the `match` query accepts:
| Parameter | Description |
|---|---|
| `query` | String you wish to find in the provided field. |
| `expand` | Increase token recall, see token expansion. |
| `fuzziness` | Maximum edit distance allowed for matching. |
| `boost` | Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. |
| `operator` | The boolean operator used to interpret the query value. Available values are `or` and `and`. Defaults to `or`. |
| `minimum_should_match` | Minimum number of clauses that a document must match for it to be returned. |
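As a rough illustration of the parameters above, the query below combines `operator`, `fuzziness`, and `boost`. It assumes that these parameters sit next to `query` inside the field's map, mirroring the earlier example; the values are arbitrary and only meant to show the shape.

```elixir
# Assumption: parameters live next to "query" inside the field's map,
# just like the "query" key in the earlier example. Values are illustrative.
query = %{
  "query" => %{
    "match" => %{
      "content" => %{
        "query" => "liveview browser",
        # Require both terms instead of the default "or".
        "operator" => "and",
        # Allow an edit distance of up to 1 when matching.
        "fuzziness" => 1,
        # Raise the relevance scores produced by this query.
        "boost" => 2.0
      }
    }
  }
}

# With the `index` from the previous post in scope, this would be run as:
# Index.search(index, query)
IO.inspect(query)
```

Building the query as a plain map first makes it easy to inspect or tweak before handing it to `Index.search`.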
Terms
The `terms` query returns documents that contain the exact terms in a given field. It should be used to find documents based on a precise value such as a price, a product ID, or a username.
```elixir
# example terms query
Index.search(index, %{
  "query" => %{
    "terms" => %{
      "content" => %{
        "value" => "think"
      }
    }
  }
})
```
Just like the `match` query, the `terms` query also accepts one or more top-level fields. The `terms` query accepts the following parameters:
| Parameter | Description |
|---|---|
| `value` | A term you wish to find in the provided field. The term must match exactly the field value to return a document. |
| `boost` | Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. |
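To illustrate the difference between an exact lookup and an analyzed one, here is a toy sketch in plain Elixir. It is not Elasticlunr code: the `ExactVsAnalyzed` module is made up, and its "analyzer" (lowercase, then split on whitespace) is only a stand-in for whatever the library's real pipeline does. The point is simply that a `terms`-style lookup uses the value as given, while a `match`-style lookup analyzes the text first.

```elixir
defmodule ExactVsAnalyzed do
  # Stand-in analyzer (an assumption): lowercase, then split on whitespace.
  def analyze(text), do: text |> String.downcase() |> String.split()

  # terms-style: the value is looked up verbatim among the indexed tokens.
  def terms_hit?(tokens, value), do: value in tokens

  # match-style: the query text is analyzed first, then any token may hit.
  def match_hit?(tokens, text), do: Enum.any?(analyze(text), &(&1 in tokens))
end

# Pretend these are the tokens stored for one document's content field.
tokens = ExactVsAnalyzed.analyze("I think LiveView is great")

IO.inspect(ExactVsAnalyzed.terms_hit?(tokens, "think"))
# => true
IO.inspect(ExactVsAnalyzed.terms_hit?(tokens, "Think"))
# => false
IO.inspect(ExactVsAnalyzed.match_hit?(tokens, "Think different"))
# => true
```

In this toy model, the capitalized `"Think"` misses with the exact lookup but hits once the text is analyzed, which is why precise values such as IDs or usernames suit the `terms` query.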
Match All
The simplest query: it matches all documents and gives each of them a score of 1.0.
```elixir
# example match all query
Index.search(index, %{
  "query" => %{
    "match_all" => %{}
  }
})
```
| Parameter | Description |
|---|---|
| `boost` | Floating point number used to decrease or increase the relevance scores of a query. Defaults to 1.0. |
Not
The `not` query inverts the result of the nested query, giving each matched document a score of 1.0.
```elixir
# example not query
Index.search(index, %{
  "query" => %{
    "not" => %{
      "match" => %{
        "content" => "ecto"
      }
    }
  }
})
```
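Conceptually, the inversion can be pictured as a set difference. The sketch below is plain Elixir, not the library's implementation; the `NotSketch` module and the `%{id: ..., score: ...}` result shape are made up for illustration.

```elixir
defmodule NotSketch do
  # Toy model of the not query: every document the nested query did NOT
  # match is returned with a fixed score of 1.0.
  def invert(all_ids, matched_ids) do
    matched = MapSet.new(matched_ids)

    all_ids
    |> Enum.reject(fn id -> MapSet.member?(matched, id) end)
    |> Enum.map(fn id -> %{id: id, score: 1.0} end)
  end
end

# Suppose the index holds documents 1..4 and the nested query matched 2 and 4.
IO.inspect(NotSketch.invert([1, 2, 3, 4], [2, 4]))
```

Here documents 1 and 3 come back, each with a score of 1.0, regardless of how the nested query scored the documents it matched.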
Wrap Up
Phew, we made it to the end. The above are the available query types you can use to build more advanced queries for your use case. In upcoming posts, I will write about how you can serialize your index and write it to any storage service of your choice.
And don't forget to have a look at the livebook document so that you can fiddle with each query and see how to tweak them to get the results you want.