Move FieldMapper#valueFetcher to MappedFieldType #62974
Pinging @elastic/es-search (:Search/Mapping)
Some more refactoring from me :). We are getting some fairly unwieldy telescoping MappedFieldType constructors now, and there are also quite a few places where we end up sharing object parsing logic between the FieldMapper and the MappedFieldType, which I think can be cleaned up. I wanted to keep this PR as simple as possible, though, so it does only the absolute basics. Further improvements will follow.
@@ -122,7 +122,7 @@ Builder nullValue(double nullValue) {
    @Override
    public ScaledFloatFieldMapper build(BuilderContext context) {
        ScaledFloatFieldType type = new ScaledFloatFieldType(buildFullName(context), indexed.getValue(), stored.getValue(),
-           hasDocValues.getValue(), meta.getValue(), scalingFactor.getValue());
+           hasDocValues.getValue(), meta.getValue(), scalingFactor.getValue(), nullValue.getValue());
These telescoping constructors are getting too complicated now. One idea I have for a follow-up is to add a FieldCharacteristics wrapper object that takes the (indexed, docValues, stored, meta) tuple, analogous to TextSearchInfo, and give both of these 'holder' objects constructors that take Parameter sets directly, so that you can call FieldCharacteristics.build(index, hasDocValues, stored, meta) directly in build and remove some of the ceremony.
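To make the holder-object idea concrete, here is a minimal sketch of what such a wrapper might look like. `FieldCharacteristics` is only the name proposed in the comment above and does not exist in Elasticsearch; `SketchScaledFloatFieldType` is a hypothetical stand-in for a real field type, and plain values are passed where the proposal would pass Parameter objects.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the (indexed, docValues, stored, meta) tuple.
// This class is an illustration of the proposal, not an Elasticsearch API.
final class FieldCharacteristics {
    final boolean indexed;
    final boolean hasDocValues;
    final boolean stored;
    final Map<String, String> meta;

    private FieldCharacteristics(boolean indexed, boolean hasDocValues, boolean stored, Map<String, String> meta) {
        this.indexed = indexed;
        this.hasDocValues = hasDocValues;
        this.stored = stored;
        this.meta = meta;
    }

    // Argument order follows the call suggested in the review comment.
    static FieldCharacteristics build(boolean indexed, boolean hasDocValues, boolean stored, Map<String, String> meta) {
        // Defensive copy so later changes to the caller's map do not leak in.
        return new FieldCharacteristics(indexed, hasDocValues, stored,
            Collections.unmodifiableMap(new HashMap<>(meta)));
    }
}

// Simplified stand-in for a field type: four constructor arguments collapse
// into one, leaving only the type-specific ones in the signature.
class SketchScaledFloatFieldType {
    final String name;
    final FieldCharacteristics characteristics;
    final double scalingFactor;
    final Double nullValue;

    SketchScaledFloatFieldType(String name, FieldCharacteristics characteristics, double scalingFactor, Double nullValue) {
        this.name = name;
        this.characteristics = characteristics;
        this.scalingFactor = scalingFactor;
        this.nullValue = nullValue;
    }
}
```

With a holder like this, adding a type-specific argument such as `nullValue` would lengthen the constructor by one parameter rather than extending an already long positional list.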
    public CollationFieldType(String name, boolean isSearchable, boolean isStored, boolean hasDocValues,
-                             Collator collator, Map<String, String> meta) {
+                             Collator collator, String nullValue, int ignoreAbove, Map<String, String> meta) {
        super(name, isSearchable, isStored, hasDocValues, TextSearchInfo.SIMPLE_MATCH_ONLY, meta);
I think we can wrap this up more nicely as well in a follow-up, passing a Function<Object, String> parsing object here that is shared with the FieldMapper impl, so that we don't need to a) pass nullValue and ignoreAbove and b) duplicate the relevant logic.
And indeed, we can probably add a factory function to SourceValueFetcher that takes a parsing function as well and further reduce the boilerplate.
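As a rough illustration of the shared-parser idea, the sketch below captures `nullValue` and `ignoreAbove` once in a `Function<Object, String>` and reuses it when fetching values. `KeywordParsingSketch`, `parser` and `fetchValues` are hypothetical names invented for this example, and the null and length handling is a simplified assumption rather than the actual mapper logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only: none of these names exist in Elasticsearch.
final class KeywordParsingSketch {

    // Builds the parsing function once, closing over nullValue and ignoreAbove,
    // so neither setting has to be passed separately to the field type.
    static Function<Object, String> parser(String nullValue, int ignoreAbove) {
        return value -> {
            String text = value == null ? nullValue : value.toString();
            if (text == null || text.length() > ignoreAbove) {
                return null; // nothing to emit for this value
            }
            return text;
        };
    }

    // Assumed shape of a fetcher factory that takes the parsing function:
    // apply it to each source value and keep the non-null results.
    static List<String> fetchValues(List<Object> sourceValues, Function<Object, String> parser) {
        List<String> results = new ArrayList<>();
        for (Object value : sourceValues) {
            String parsed = parser.apply(value);
            if (parsed != null) {
                results.add(parsed);
            }
        }
        return results;
    }
}
```

In this shape the same function object could be handed to both the mapper and the field type, which is the duplication the comments above hope to remove.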
I tried this out when initially designing the change, and it proved difficult to share the parsing logic. I wrote up the details in this draft PR: #56473. But I would love to discuss further if you find a way forward!
I left some questions, thanks for taking care of this change!
...s/mapper-extras/src/main/java/org/elasticsearch/index/mapper/SearchAsYouTypeFieldMapper.java
.../mapper-murmur3/src/main/java/org/elasticsearch/index/mapper/murmur3/Murmur3FieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/FieldNamesFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/IndexFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/SeqNoFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/TextFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/TypeFieldMapper.java
server/src/main/java/org/elasticsearch/index/mapper/VersionFieldMapper.java
server/src/main/java/org/elasticsearch/search/fetch/subphase/FieldFetcher.java
    @Override
    public ValueFetcher valueFetcher(MapperService mapperService, SearchLookup searchLookup, String format) {
        throw new UnsupportedOperationException(); // TODO can we implement this?
I was wondering the same, or even whether we should do so. When would this field type be looked up, compared to the other one that implements valueFetcher?
This is a good question, I'll propose an idea in a follow-up PR.
Have updated @javanna and ready for another go-round.
LGTM
For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses needs to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.
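The shape of this change can be sketched with simplified stand-in types. Everything prefixed with `Sketch` below is hypothetical and much smaller than the real Elasticsearch classes (for instance, the real `valueFetcher` also takes a MapperService and a SearchLookup); the point is only that search-time code resolves a field type by name and asks it for a fetcher, without touching a mapper.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a value fetcher: reads values for one field from _source.
interface SketchValueFetcher {
    List<Object> fetchValues(Map<String, Object> source);
}

// Simplified stand-in for a field type that owns value fetching.
abstract class SketchFieldType {
    private final String name;

    SketchFieldType(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    abstract String typeName();

    // The method that, in this sketch, lives on the field type rather than the mapper.
    abstract SketchValueFetcher valueFetcher(String format);
}

class SketchKeywordFieldType extends SketchFieldType {
    SketchKeywordFieldType(String name) {
        super(name);
    }

    @Override
    String typeName() {
        return "keyword";
    }

    @Override
    SketchValueFetcher valueFetcher(String format) {
        if (format != null) {
            throw new IllegalArgumentException("Field [" + name() + "] of type [" + typeName() + "] doesn't support formats.");
        }
        return source -> {
            Object value = source.get(name());
            if (value == null) {
                return Collections.emptyList();
            }
            return Collections.singletonList((Object) value.toString());
        };
    }
}

// Search-time caller: only field types are consulted, never mappers.
final class SketchFieldFetcher {
    static List<Object> fetch(Map<String, SketchFieldType> fieldTypes, String field, Map<String, Object> source) {
        SketchFieldType type = fieldTypes.get(field);
        if (type == null) {
            return Collections.emptyList();
        }
        return type.valueFetcher(null).fetchValues(source);
    }
}
```

Under this arrangement a runtime field would only need to supply another `SketchFieldType` subclass; nothing in the fetch path depends on document-parsing code.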
@@ -129,20 +130,6 @@ protected void parseCreateField(ParseContext context) throws IOException {
        );
    }

    @Override
After this change, I think value parsing will always fail -- since it uses NumberFieldType, it will attempt to parse a piece of text as a number.
            continue;
        }

        if (mapper instanceof FieldAliasMapper) {
Very happy we removed this 🎉
Thanks @romseygeek for this refactor! Overall it turned out nicely -- it feels better conceptually for field types to own value fetching, since they're in charge of all other search-time operations.
        if (format != null) {
            throw new IllegalArgumentException("Field [" + name() + "] of type [" + typeName() + "] doesn't support formats.");
        }
        return new SourceValueFetcher(name(), mapperService, false) {
One other change I noticed -- previously we always passed the result of FieldMapper#parsesArrayValue instead of hard-coding the booleans. Now there is nothing really tying this to parsesArrayValue and it could be easily overlooked. I wonder how we could make this more robust.
An update: I opened #63354 with a small improvement.
When constructing a value fetcher, the `parsesArrayValue` flag must match `FieldMapper#parsesArrayValue`. However, there is nothing in code or tests to help enforce this. This PR reworks the value fetcher constructors so that `parsesArrayValue` is `false` by default. Just as for `FieldMapper#parsesArrayValue`, field types must explicitly set it to `true` and ensure the behavior is covered by tests. Follow-up to elastic#62974.
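A minimal sketch of that constructor arrangement follows. `SketchSourceValueFetcher` is a hypothetical, heavily simplified stand-in and not the real SourceValueFetcher; it assumes that `parsesArrayValue` means "hand a whole array to the parser" as opposed to "parse each array element separately".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in illustrating a "false by default" parsesArrayValue flag.
abstract class SketchSourceValueFetcher {
    private final boolean parsesArrayValue;

    // Default constructor: arrays are split and parsed element by element.
    SketchSourceValueFetcher() {
        this(false);
    }

    // Field types that consume whole arrays must opt in explicitly.
    SketchSourceValueFetcher(boolean parsesArrayValue) {
        this.parsesArrayValue = parsesArrayValue;
    }

    protected abstract Object parseSourceValue(Object value);

    List<Object> fetchValues(Object sourceValue) {
        List<Object> results = new ArrayList<>();
        if (sourceValue == null) {
            return results;
        }
        if (parsesArrayValue || !(sourceValue instanceof List)) {
            // Either a single value, or an array the parser handles as a unit.
            results.add(parseSourceValue(sourceValue));
        } else {
            for (Object element : (List<?>) sourceValue) {
                results.add(parseSourceValue(element));
            }
        }
        return results;
    }
}
```

Because the no-argument constructor picks `false`, a field type that forgets about the flag gets the per-element behavior, and only types that deliberately pass `true` receive the whole array.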