[DOCS] Rewrite term query docs for new format #41498
Changes from all commits
5738ba3
d10254a
ed44215
da9123a
058ff63
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,168 +1,220 @@ | ||
[[query-dsl-term-query]] | ||
=== Term Query | ||
|
||
The `term` query finds documents that contain the *exact* term specified | ||
in the inverted index. For instance: | ||
Returns documents that contain an *exact* term in a provided field. | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
POST _search | ||
{ | ||
"query": { | ||
"term" : { "user" : "Kimchy" } <1> | ||
} | ||
} | ||
-------------------------------------------------- | ||
// CONSOLE | ||
<1> Finds documents which contain the exact term `Kimchy` in the inverted index | ||
of the `user` field. | ||
You can use the `term` query to find documents based on a precise value such as | ||
a price, a product ID, or a username. | ||
|
||
[WARNING] | ||
==== | ||
Avoid using the `term` query for <<text, `text`>> fields. | ||
|
||
By default, {es} changes the values of `text` fields as part of <<analysis, | ||
analysis>>. This can make finding exact matches for `text` field values | ||
difficult. | ||
|
||
A `boost` parameter can be specified to give this `term` query a higher | ||
relevance score than another query, for instance: | ||
To search `text` field values, use the <<query-dsl-match-query,`match`>> query | ||
instead. | ||
==== | ||
|
||
[[term-query-ex-request]] | ||
==== Example request | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
GET _search | ||
---- | ||
GET /_search | ||
{ | ||
"query": { | ||
"bool": { | ||
"should": [ | ||
{ | ||
"term": { | ||
"status": { | ||
"value": "urgent", | ||
"boost": 2.0 <1> | ||
"query": { | ||
"term": { | ||
"user": { | ||
"value": "Kimchy", | ||
"boost": 1.0 | ||
} | ||
} | ||
}, | ||
{ | ||
"term": { | ||
"status": "normal" <2> | ||
} | ||
} | ||
] | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
---- | ||
// CONSOLE | ||
|
||
<1> The `urgent` query clause has a boost of `2.0`, meaning it is twice as important | ||
as the query clause for `normal`. | ||
<2> The `normal` clause has the default neutral boost of `1.0`. | ||
|
||
A `term` query can also match against <<range, range data types>>. | ||
|
||
.Why doesn't the `term` query match my document? | ||
************************************************** | ||
|
||
String fields can be of type `text` (treated as full text, like the body of an | ||
email), or `keyword` (treated as exact values, like an email address or a | ||
zip code). Exact values (like numbers, dates, and keywords) have | ||
the exact value specified in the field added to the inverted index in order | ||
to make them searchable. | ||
|
||
However, `text` fields are `analyzed`. This means that their | ||
values are first passed through an <<analysis,analyzer>> to produce a list of | ||
terms, which are then added to the inverted index. | ||
|
||
There are many ways to analyze text: the default | ||
<<analysis-standard-analyzer,`standard` analyzer>> drops most punctuation, | ||
breaks up text into individual words, and lower cases them. For instance, | ||
the `standard` analyzer would turn the string ``Quick Brown Fox!'' into the | ||
terms [`quick`, `brown`, `fox`]. | ||
|
||
This analysis process makes it possible to search for individual words | ||
within a big block of full text. | ||
|
||
The `term` query looks for the *exact* term in the field's inverted index -- | ||
it doesn't know anything about the field's analyzer. This makes it useful for | ||
looking up values in keyword fields, or in numeric or date | ||
fields. When querying full text fields, use the | ||
<<query-dsl-match-query,`match` query>> instead, which understands how the field | ||
has been analyzed. | ||
|
||
|
||
To demonstrate, try out the example below. First, create an index, specifying the field mappings, and index a document: | ||
[[term-top-level-params]] | ||
==== Top-level parameters for `term` | ||
`<field>`:: | ||
Field you wish to search. | ||
|
||
[[term-field-params]] | ||
==== Parameters for `<field>` | ||
`value`:: | ||
Term you wish to find in the provided `<field>`. To return a document, the term | ||
must exactly match the field value, including whitespace and capitalization. | ||
|
||
`boost`:: | ||
Floating point number used to decrease or increase the | ||
<<query-filter-context, relevance scores>> of a query. Default is `1.0`. | ||
Optional. | ||
+ | ||
You can use the `boost` parameter to adjust relevance scores for searches | ||
containing two or more queries. | ||
+ | ||
Boost values are relative to the default value of `1.0`. A boost value between | ||
`0` and `1.0` decreases the relevance score. A value greater than `1.0` | ||
increases the relevance score. | ||
|
||
[[term-query-notes]] | ||
==== Notes | ||
|
||
[[avoid-term-query-text-fields]] | ||
===== Avoid using the `term` query for `text` fields | ||
By default, {es} changes the values of `text` fields during analysis. For | ||
example, the default <<analysis-standard-analyzer, standard analyzer>> changes | ||
`text` field values as follows: | ||
|
||
* Removes most punctuation | ||
* Divides the remaining content into individual words, called | ||
<<analysis-tokenizers, tokens>> | ||
* Lowercases the tokens | ||
|
||
To better search `text` fields, the `match` query also analyzes your provided | ||
search term before performing a search. This means the `match` query can search | ||
`text` fields for analyzed tokens rather than an exact term. | ||
|
||
The `term` query does *not* analyze the search term. The `term` query only | ||
searches for the *exact* term you provide. This means the `term` query may | ||
return poor or no results when searching `text` fields. | ||
|
||
To see the difference in search results, try the following example. | ||
|
||
. Create an index with a `text` field called `full_text`. | ||
+ | ||
-- | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
---- | ||
PUT my_index | ||
{ | ||
"mappings": { | ||
"properties": { | ||
"full_text": { | ||
"type": "text" <1> | ||
}, | ||
"exact_value": { | ||
"type": "keyword" <2> | ||
} | ||
"mappings" : { | ||
"properties" : { | ||
"full_text" : { "type" : "text" } | ||
} | ||
} | ||
} | ||
} | ||
---- | ||
// CONSOLE | ||
|
||
-- | ||
|
||
. Index a document with a value of `Quick Brown Foxes!` in the `full_text` | ||
field. | ||
+ | ||
-- | ||
|
||
[source,js] | ||
---- | ||
PUT my_index/_doc/1 | ||
{ | ||
"full_text": "Quick Foxes!", <3> | ||
"exact_value": "Quick Foxes!" <4> | ||
"full_text": "Quick Brown Foxes!" | ||
} | ||
-------------------------------------------------- | ||
---- | ||
// CONSOLE | ||
// TEST[continued] | ||
|
||
Because `full_text` is a `text` field, {es} changes `Quick Brown Foxes!` to | |
`[quick, brown, foxes]` during analysis. | |
|
||
<1> The `full_text` field is of type `text` and will be analyzed. | ||
<2> The `exact_value` field is of type `keyword` and will NOT be analyzed. | ||
<3> The `full_text` inverted index will contain the terms: [`quick`, `foxes`]. | ||
<4> The `exact_value` inverted index will contain the exact term: [`Quick Foxes!`]. | ||
-- | ||
|
||
Now, compare the results for the `term` query and the `match` query: | ||
Just out of curiosity: are there follow up plans to move these more detailed examples somewhere else? While I like the idea of having succinct, standardized docs for each query, I find these kind of examples quite useful and better to understand than merely the minimal snippet that remains. Not saying this shouldn't go away, I'm just curious what the plan is for examples like this in this rewriting effort.
Thanks for the feedback @cbuescher. For this particular set of examples, I think a concept-focused page like "Search structured data" or "Search full-text" would be a better fit. Those pages don't exist yet, but I'll work on creating them and add it to this PR. For other queries, I think we can include additional sections for detailed examples below parameter documentation. A good example of this would be the Let me know if you feel differently. I'm still new so there could be context I'm missing. |
||
. Use the `term` query to search for `Quick Brown Foxes!` in the `full_text` | ||
field. Include the `pretty` parameter so the response is more readable. | ||
+ | ||
-- | ||
|
||
[source,js] | ||
-------------------------------------------------- | ||
GET my_index/_search | ||
---- | ||
GET my_index/_search?pretty | ||
{ | ||
"query": { | ||
"term": { | ||
"exact_value": "Quick Foxes!" <1> | ||
"full_text": "Quick Brown Foxes!" | ||
} | ||
} | ||
} | ||
---- | ||
// CONSOLE | ||
// TEST[continued] | ||
|
||
GET my_index/_search | ||
{ | ||
"query": { | ||
"term": { | ||
"full_text": "Quick Foxes!" <2> | ||
} | ||
} | ||
} | ||
Because the `full_text` field no longer contains the *exact* term `Quick Brown | ||
Foxes!`, the `term` query search returns no results. | ||
|
||
GET my_index/_search | ||
{ | ||
"query": { | ||
"term": { | ||
"full_text": "foxes" <3> | ||
} | ||
} | ||
} | ||
-- | ||
|
||
. Use the `match` query to search for `Quick Brown Foxes!` in the `full_text` | ||
field. | ||
+ | ||
-- | ||
|
||
//// | ||
|
||
GET my_index/_search | ||
[source,js] | ||
---- | ||
POST my_index/_refresh | ||
---- | ||
// CONSOLE | ||
// TEST[continued] | ||
|
||
//// | ||
|
||
[source,js] | ||
---- | ||
GET my_index/_search?pretty | ||
{ | ||
"query": { | ||
"match": { | ||
"full_text": "Quick Foxes!" <4> | ||
"full_text": "Quick Brown Foxes!" | ||
} | ||
} | ||
} | ||
-------------------------------------------------- | ||
---- | ||
// CONSOLE | ||
// TEST[continued] | ||
|
||
<1> This query matches because the `exact_value` field contains the exact | ||
term `Quick Foxes!`. | ||
<2> This query does not match, because the `full_text` field only contains | ||
the terms `quick` and `foxes`. It does not contain the exact term | ||
`Quick Foxes!`. | ||
<3> A `term` query for the term `foxes` matches the `full_text` field. | ||
<4> This `match` query on the `full_text` field first analyzes the query string, | ||
then looks for documents containing `quick` or `foxes` or both. | ||
************************************************** | ||
Unlike the `term` query, the `match` query analyzes your provided search term, | ||
`Quick Brown Foxes!`, before performing a search. The `match` query then returns | ||
any documents containing the `quick`, `brown`, or `foxes` tokens in the | |
`full_text` field. | ||
|
||
Here's the response for the `match` query search containing the indexed document | ||
in the results. | ||
|
||
[source,js] | ||
---- | ||
{ | ||
"took" : 1, | ||
"timed_out" : false, | ||
"_shards" : { | ||
"total" : 1, | ||
"successful" : 1, | ||
"skipped" : 0, | ||
"failed" : 0 | ||
}, | ||
"hits" : { | ||
"total" : { | ||
"value" : 1, | ||
"relation" : "eq" | ||
}, | ||
"max_score" : 0.8630463, | ||
"hits" : [ | ||
{ | ||
"_index" : "my_index", | ||
"_type" : "_doc", | ||
"_id" : "1", | ||
"_score" : 0.8630463, | ||
"_source" : { | ||
"full_text" : "Quick Brown Foxes!" | ||
} | ||
} | ||
] | ||
} | ||
} | ||
---- | ||
// TESTRESPONSE[s/"took" : 1/"took" : $body.took/] | ||
-- |
Whilst I'm not wedded to having this explanation in this page of the documentation, I think it would be useful to make sure we explain the difference between searching with analyzed and non-analyzed queries somewhere, since it gives some understanding of how search works and how to avoid some pitfalls. wdyt?
Thanks for the feedback.
I think the warning provides enough information for users looking to get started quickly, but it doesn't explain why you should avoid using term-level queries for analyzed fields. The example in the aside is great for that.
I think that content would fit better in a concept-focused page like "Search structured data" or "Search full-text." Those don't exist yet, but I'll work on creating them and add it to this PR.
That sounds great, thanks
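For anyone following this thread, the analyzed versus non-analyzed distinction can be sketched outside of {es} in a few lines of Python. This is only a toy model, not how {es} or Lucene is implemented: the function names (`toy_standard_analyze`, `build_inverted_index`, `term_query`, `match_query`) are hypothetical, and the tokenizer is a rough approximation of the `standard` analyzer that splits on non-alphanumeric characters and lowercases each token.

```python
import re


def toy_standard_analyze(text):
    """Rough stand-in for the `standard` analyzer (an assumption, not the
    real implementation): split on non-alphanumeric characters, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def build_inverted_index(docs, field):
    """Map each analyzed term to the set of document IDs that contain it."""
    index = {}
    for doc_id, doc in docs.items():
        for term in toy_standard_analyze(doc[field]):
            index.setdefault(term, set()).add(doc_id)
    return index


def term_query(index, value):
    """Look up the value exactly as given, with no analysis applied."""
    return sorted(index.get(value, set()))


def match_query(index, text):
    """Analyze the search text first, then return docs containing any token."""
    hits = set()
    for term in toy_standard_analyze(text):
        hits |= index.get(term, set())
    return sorted(hits)


docs = {"1": {"full_text": "Quick Brown Foxes!"}}
index = build_inverted_index(docs, "full_text")

print(term_query(index, "Quick Brown Foxes!"))   # [] (no such exact term)
print(term_query(index, "foxes"))                # ['1']
print(match_query(index, "Quick Brown Foxes!"))  # ['1']
```

In this sketch the index only ever holds the lowercased tokens, so an unanalyzed lookup for the original string finds nothing, while the analyzed lookup finds the document. That is the same pitfall the `text` field warning describes.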