Today, we're counting all mappers towards the field limit, including
mappers for subfields that aren't explicitly added to the mapping.
This means that some field types, such as `search_as_you_type` or
`percolator`, count as more than one field, even though that's not
apparent to users because they define each of them as a single field in
the mapping.
This change makes it so that each field mapper only counts as one. We're
still counting multi-fields.
This makes it easier for users to understand why the field limit is hit.
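
To illustrate the difference between the two counting rules, here is a minimal, self-contained sketch. It is not the actual Elasticsearch implementation: `FieldCountSketch`, `FieldNode`, `countAllMappers`, and `countOnePerField` are hypothetical names, and the internal subfield names are only illustrative of what a `search_as_you_type` field creates behind the scenes.

```java
import java.util.List;

public class FieldCountSketch {

    // Hypothetical stand-in for a field mapper; NOT the real Elasticsearch Mapper API.
    public static final class FieldNode {
        final String name;
        final List<FieldNode> internalSubfields; // created implicitly by the field type
        final List<FieldNode> multiFields;       // declared explicitly by the user

        public FieldNode(String name, List<FieldNode> internalSubfields, List<FieldNode> multiFields) {
            this.name = name;
            this.internalSubfields = internalSubfields;
            this.multiFields = multiFields;
        }
    }

    // Convenience factory for a field without subfields or multi-fields.
    public static FieldNode leaf(String name) {
        return new FieldNode(name, List.of(), List.of());
    }

    // Old rule as described above: every mapper counts, including implicit subfields.
    public static int countAllMappers(FieldNode node) {
        int total = 1;
        for (FieldNode sub : node.internalSubfields) {
            total += countAllMappers(sub);
        }
        for (FieldNode multi : node.multiFields) {
            total += countAllMappers(multi);
        }
        return total;
    }

    // New rule as described above: each field mapper counts as one,
    // and multi-fields are still counted.
    public static int countOnePerField(FieldNode node) {
        int total = 1;
        for (FieldNode multi : node.multiFields) {
            total += countOnePerField(multi);
        }
        return total;
    }

    public static void main(String[] args) {
        // A search_as_you_type-like field with three implicit subfields
        // (illustrative names) and one user-defined multi-field.
        FieldNode title = new FieldNode(
            "title",
            List.of(leaf("title._2gram"), leaf("title._3gram"), leaf("title._index_prefix")),
            List.of(leaf("title.raw"))
        );
        System.out.println("old rule: " + countAllMappers(title));  // old rule: 5
        System.out.println("new rule: " + countOnePerField(title)); // new rule: 2
    }
}
```

Under the old rule, the user sees two fields in the mapping (`title` and `title.raw`) but five are counted; under the new rule, the count matches what the user defined.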
~In addition to that, it also simplifies
#96235 as it makes the
implementation of `Mapper.Builder#getTotalFieldsCount` much simpler and
easier to align with `Mapper#getTotalFieldsCount`. This reduces the risk
of over- or under-estimating the field count of a `Mapper.Builder` in
`DocumentParserContext#addDynamicMapper`, which in turn reduces the risk
of data loss due to the issue described here:
#96235 (comment)~
*Edit: due to #103865, we
don't need an implementation of `getTotalFieldsCount` or `mapperSize` in
`Mapper.Builder`. Still, this PR more closely aligns
`Mapper#getTotalFieldsCount` with `MappingLookup#getTotalFieldsCount`,
which `DocumentParserContext#addDynamicMapper` uses to determine
whether the field limit is hit.*
A potential risk of this is that we're now effectively allowing more
fields in the mapping. It may be surprising to users that more fields
can be added to a mapping. However, I wouldn't expect negative
consequences from that. Generally, I'd expect users to be happy about
any change that reduces the risk of data loss.
We could also think about whether to apply the new counting logic only
to new indices (depending on the `IndexVersion`). However, that would
add more complexity and I'm not convinced about the value. We'd then
need to maintain two different ways of counting fields and also require
passing in the `IndexVersion` to `MappingLookup`, which previously didn't
require it.
This PR is meant as a conversation starter. It would also simplify
#96235, but I don't think
this blocks that PR in any way.
I'm curious about the opinion of @javanna and @jpountz on this.