Allow all properties of an object to be queried and aggregated as top-level attributes #103567
Labels
>feature
:Search Foundations/Mapping
Index mappings, including merging and defining field types
:StorageEngine/TSDB
You know, for Metrics
Team:Search Foundations
Meta label for the Search Foundations team in Elasticsearch
Team:StorageEngine
Comments
Pinging @elastic/es-search (Team:Search)
kkrik-es added a commit to kkrik-es/elasticsearch that referenced this issue on Dec 21, 2023
`PassThroughObjectMapper` extends `ObjectMapper` to create a container for fields that also need to be referenced as if they were at the root level. This is done by creating aliases for all its subfields. It can also optionally annotate all its subfields as dimensions. This will be leveraged in TSDB, where dimension fields can be dynamically defined as nested under a passthrough object and still be referenced directly (i.e. without prefixes) in aggregation queries. Related to elastic#103567
kkrik-es added a commit that referenced this issue on Feb 1, 2024
* Introduce passthrough field type

`PassThroughObjectMapper` extends `ObjectMapper` to create a container for fields that also need to be referenced as if they were at the root level. This is done by creating aliases for all its subfields. It can also optionally annotate all its subfields as dimensions. This will be leveraged in TSDB, where dimension fields can be dynamically defined as nested under a passthrough object and still be referenced directly (i.e. without prefixes) in aggregation queries. Related to #103567

* Update docs/changelog/103648.yaml
* no subobjects
* create dimensions dynamically
* remove unused method
* restore ignoreAbove incompatibility with dimension
* fix test
* refactor, skip aliases on conflict
* fix branch
* fix branch
* add tests
* update test
* remove unused variable
* add yaml test for subobject
* minor refactoring
* add unittest for PassThroughObjectMapper
* suggested fixes
* suggested fixes
* update yaml with warning for duplicate alias
* updates from review
* add withoutMappers()
This was referenced Feb 2, 2024
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue on Feb 9, 2024
That also asserts routing aspects of indexing, searching and getting by id. Relates to elastic#103567
elasticsearchmachine pushed a commit that referenced this issue on Feb 9, 2024
That also asserts routing aspects of indexing, searching and getting by id. Relates to #103567
This was referenced Mar 5, 2024
elasticsearchmachine pushed a commit that referenced this issue on Mar 13, 2024
#106080)

Supporting non-keyword fields requires that non-keyword fields in the routing path be included in routing calculations. Routing is performed in coordinating nodes that lack mappings (or whose mappings haven't been created yet, for dynamically-defined dimensions), so the routing hash they calculate is passed to data nodes and stored in a new field, namely `_ts_routing_hash`. The hash is in turn included in the `_id` field, so that get-by-id and delete-by-id operations consistently reach the right shard.

A few interesting points:

- The hash is passed from the coordinating to data nodes using the `routing` field in `IndexRequest`; adding another field to the latter requires updating dozens of classes.
- We explicitly skip (double-) storing the hash to the routing field, as the latter is not optimized for storage using the TSDB codec.
- The routing hash may not be available in translog operations; in that case it can be retrieved from the `id` prefix.

Related to #103567
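To make the idea in that commit message concrete, here is a minimal Python sketch of carrying a routing hash inside a document id so that a later get-by-id can find the shard without knowing the mappings. It is not Elasticsearch's actual hashing or `_id` layout: the hash function (SHA-256 truncated to 32 bits), the 12-byte id layout, the function names and `NUM_SHARDS` are all assumptions made for illustration.

```python
import base64
import hashlib
import struct

NUM_SHARDS = 4  # assumed shard count, for illustration only


def routing_hash(dimensions):
    """Hash dimension names and values in a stable (sorted) order.

    Values are stringified, so non-keyword values such as integers
    contribute to the hash as well.
    """
    h = hashlib.sha256()
    for name in sorted(dimensions):
        h.update(name.encode())
        h.update(b"\0")
        h.update(str(dimensions[name]).encode())
        h.update(b"\0")
    # Keep the first 4 bytes as an unsigned 32-bit hash.
    return struct.unpack(">I", h.digest()[:4])[0]


def make_id(dimensions, timestamp_millis):
    """Build an id whose first 4 bytes are the routing hash."""
    raw = struct.pack(">IQ", routing_hash(dimensions), timestamp_millis)
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")


def shard_for_id(doc_id):
    """Recover the routing hash from the id prefix and pick a shard."""
    padded = doc_id + "=" * (-len(doc_id) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return struct.unpack(">I", raw[:4])[0] % NUM_SHARDS


dims = {"host.name": "my-host", "cpu": 3}
doc_id = make_id(dims, 1700000000000)
print(doc_id, shard_for_id(doc_id))
```

Because the hash travels in the id prefix, `shard_for_id` needs neither the dimension values nor the mappings, which mirrors the get-by-id and delete-by-id case described above.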
In order to properly map events that come in via the OpenTelemetry Protocol (OTLP), we'd like to have the ability to store attributes in an `attributes` and a `resource_attributes` object. For example:
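As a rough illustration, a document with both objects might be shaped like the Python dict below. The field names and values are assumptions, not taken from a real OTLP payload. The sketch also builds the naive flattened layout for comparison.

```python
import json

# Hypothetical event: resource attributes describe the emitting entity,
# attributes describe the individual event. All values are made up.
doc = {
    "@timestamp": "2024-01-01T00:00:00Z",
    "resource_attributes": {"service.name": "checkout", "host.name": "my-host"},
    "attributes": {"http.method": "GET"},
}


def flatten_to_top_level(document):
    """Naive alternative: hoist every attribute to the top level."""
    flat = {
        key: value
        for key, value in document.items()
        if key not in ("attributes", "resource_attributes")
    }
    flat.update(document.get("attributes", {}))
    flat.update(document.get("resource_attributes", {}))
    return flat


flat = flatten_to_top_level(doc)
print(json.dumps(doc, indent=2))
print(json.dumps(flat, indent=2))
```

In the flattened form nothing records which group `host.name` or `http.method` came from.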
We still want to be able to do queries and aggregations directly on `host.name`. While we could just store attributes at the top level, we'd be losing information on whether something is a resource attribute vs. an attribute, and it would be difficult to convert that document back into OTLP without loss of information.

Maybe we can build on the foundations of the alias field type. However, it would need to be a generic kind of alias that makes `attributes.*` and `resource_attributes.*` available at the top level. Ideally, this should not have an impact on the field limit of an index.

We could, for example, create a special object field type (maybe `root_object`) or have a mapping parameter for `object` field types that optionally makes the fields within a document available at the top level. Any defined subfield is also available as an alias at the same level that the object field is defined. These fields behave in queries and aggregations as if they had been defined at the root level. When returning `fields` from a search, they should probably only be returned with the prefix, for example `["attributes.host.name": ["my-host"]]`, rather than duplicated, such as `["attributes.host.name": ["my-host"], "host.name": ["my-host"]]`.
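The intended behaviour can be sketched in a few lines of Python. This is only a model of the proposal, not how Elasticsearch resolves fields; `resolve`, `fetch_fields` and `ROOT_OBJECTS` are hypothetical names introduced for the example.

```python
# Hypothetical stored fields of one document, keyed by full path.
stored = {
    "attributes.host.name": ["my-host"],
    "attributes.http.method": ["GET"],
}

# Containers whose subfields should also be addressable without the prefix.
ROOT_OBJECTS = ("attributes",)


def resolve(field, stored_fields, root_objects=ROOT_OBJECTS):
    """Map a field name used in a query to the full path it is stored under."""
    if field in stored_fields:
        return field
    for prefix in root_objects:
        candidate = f"{prefix}.{field}"
        if candidate in stored_fields:
            return candidate
    return None


def fetch_fields(requested, stored_fields):
    """Return requested fields keyed by full path, without alias duplicates."""
    result = {}
    for field in requested:
        path = resolve(field, stored_fields)
        if path is not None:
            result[path] = stored_fields[path]
    return result


print(resolve("host.name", stored))
print(fetch_fields(["host.name", "attributes.host.name"], stored))
```

Asking for both the short name and the full path yields a single entry under the prefixed name, which matches the non-duplicated response shape suggested above.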
When there are multiple `root_object` definitions, there needs to be a declared order of precedence in which attributes are resolved. For example, if both `attributes` and `resource_attributes` define a `service.name` field, the one in `resource_attributes` should win.

This can also help with #98384 as we can add dynamic templates that define `attributes.*` and `resource_attributes.*` as `time_series_dimension`. Alternatively, this new field type can be marked as a `time_series_dimension` which implies that all its sub-fields are also dimensions.

Example mapping:
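A sketch of what such a mapping and its resolution rules could look like, modelled in Python. The type name `root_object` is the tentative name from this proposal, `priority` is an assumed parameter name for the declared order of precedence (higher wins here), and `build_aliases` is a hypothetical helper. None of this is a confirmed Elasticsearch API.

```python
# Hypothetical mapping, written as a Python dict.
mapping = {
    "properties": {
        "attributes": {
            "type": "root_object",
            "priority": 1,
            "time_series_dimension": True,
            "properties": {
                "service.name": {"type": "keyword"},
                "http.method": {"type": "keyword"},
            },
        },
        "resource_attributes": {
            "type": "root_object",
            "priority": 2,  # assumed: higher value wins on conflicts
            "time_series_dimension": True,
            "properties": {
                "service.name": {"type": "keyword"},
                "host.name": {"type": "keyword"},
            },
        },
    }
}


def build_aliases(mapping):
    """Return (aliases, dimensions) for all root_object containers."""
    best = {}           # top-level name -> (priority, full path)
    dimensions = set()  # full paths that inherit the dimension flag
    for obj_name, obj in mapping["properties"].items():
        if obj.get("type") != "root_object":
            continue
        for sub_name in obj.get("properties", {}):
            path = f"{obj_name}.{sub_name}"
            if obj.get("time_series_dimension"):
                dimensions.add(path)
            current = best.get(sub_name)
            if current is None or obj["priority"] > current[0]:
                best[sub_name] = (obj["priority"], path)
    aliases = {name: path for name, (_, path) in best.items()}
    return aliases, dimensions


aliases, dimensions = build_aliases(mapping)
print(aliases["service.name"])
print(sorted(dimensions))
```

With these assumed priorities, `service.name` resolves to the `resource_attributes` copy, while every subfield of both containers is treated as a dimension.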
cc @elastic/opentelemetry-leads @elastic/es-analytics-geo