Fetch meta fields in FetchFieldsPhase using ValueFetcher #106325

Merged
Changes from 5 commits
89 commits
7f93d0a
refactor: extract meta fields fetching in separate fetch sub-phase
salvatore-campagna Mar 13, 2024
1006059
fix: inline return value of storedFieldSpec
salvatore-campagna Mar 13, 2024
535cada
fix: check stored fields not fetch fields
salvatore-campagna Mar 14, 2024
c840da4
refactor: just use FetchFieldsPhase to fetch both fields and stored_f…
salvatore-campagna Mar 14, 2024
6c31e57
fix: update skip version
salvatore-campagna Mar 14, 2024
503439f
fix: simplify code
salvatore-campagna Mar 14, 2024
99f129d
fix: prevent a NPE when retrieving the field type for metadata fields
salvatore-campagna Mar 18, 2024
ba0d9e7
Revert "fix: prevent a NPE when retrieving the field type for metadat…
salvatore-campagna Mar 18, 2024
9bb8dbc
fix: fetch _source using ValueFetcher
salvatore-campagna Mar 18, 2024
2369703
fix: allow _ignored, _routing and _type fields
salvatore-campagna Mar 18, 2024
9fa44ca
fix: return stored fields with their correct type not as strings
salvatore-campagna Mar 18, 2024
64dbaed
fix: process just metadata fields
salvatore-campagna Mar 18, 2024
cff6343
fix: restore original exception message
salvatore-campagna Mar 18, 2024
d335005
fix: hadling of alias fields
salvatore-campagna Mar 19, 2024
45c5bb9
nit: improve readability
salvatore-campagna Mar 19, 2024
d336fd5
fix: mock MappedFieldType#name
salvatore-campagna Mar 19, 2024
8889b95
fix: add a missing FetchFieldsPhase
salvatore-campagna Mar 19, 2024
180fce1
fix: use a visitor per each field instead of per each doc
salvatore-campagna Mar 20, 2024
c0ef98c
test: include a test to surface a bug
salvatore-campagna Mar 21, 2024
a1f9ad9
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Mar 21, 2024
89ddd1b
fix: merge conflict mistake
salvatore-campagna Mar 21, 2024
fed48ad
fix: remove empty line
salvatore-campagna Mar 21, 2024
fe91fc0
fix: remove empty line
salvatore-campagna Mar 21, 2024
1c3c141
fix: include a boolean to enable fetching of stored fields in FieldFe…
salvatore-campagna Mar 21, 2024
a1e1166
fix: handle _size metadata field
salvatore-campagna Mar 21, 2024
6bf58bb
fix: move the fetch fase earlier
salvatore-campagna Mar 21, 2024
495f91f
fix: call setNextReader before setting field loaders
salvatore-campagna Mar 21, 2024
66687ed
fix: skip unmapped fields
salvatore-campagna Mar 21, 2024
1fd6a9f
fix: load directly stored fields if they are not pre-loaded
salvatore-campagna Mar 21, 2024
556a0f6
fix: add missing return
salvatore-campagna Mar 21, 2024
e700c95
fix: revert FetchPhase ordering and setNextReader
salvatore-campagna Mar 22, 2024
4e27d44
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Mar 22, 2024
231f876
fix: revert changes to get stack traces
salvatore-campagna Mar 22, 2024
2c2429c
fix: clarify comment
salvatore-campagna Mar 22, 2024
6dc595e
fix: clarify null check for unmapped fields
salvatore-campagna Mar 22, 2024
7dcf8ce
fix: use the actual field name for sorting
salvatore-campagna Mar 23, 2024
b197018
Revert "fix: use the actual field name for sorting"
salvatore-campagna Mar 23, 2024
1ec9ad4
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Mar 23, 2024
77d6625
fix: possible NPE with storedFields
salvatore-campagna Mar 23, 2024
02e74de
fix: clarify why we skip null MappedFieldType
salvatore-campagna Mar 23, 2024
cf72f41
checkstyle: remove implmnote
salvatore-campagna Mar 24, 2024
7092e1c
fix: do not try fetching stored fields if there is none
salvatore-campagna Mar 25, 2024
60c4d9e
fix: revert storedFields == null check
salvatore-campagna Mar 25, 2024
2a71b9a
Revert "fix: revert storedFields == null check"
salvatore-campagna Mar 25, 2024
aa7982c
Revert "fix: do not try fetching stored fields if there is none"
salvatore-campagna Mar 25, 2024
a7c9f95
fix: require non-null loadedFields
salvatore-campagna Mar 26, 2024
d3d6711
nit: remove comment
salvatore-campagna Mar 26, 2024
ec29233
fix: remove storedFields null check and disable parallel execution fo…
salvatore-campagna Mar 28, 2024
3f5582e
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Mar 28, 2024
98d5c6a
fix: disallow top hits parallel execution
salvatore-campagna Mar 28, 2024
ab96ec1
fix: null check not required after #106862
salvatore-campagna Apr 2, 2024
d9b91b0
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Apr 2, 2024
74b20bb
fix: remove isMetadataField from StoredField
salvatore-campagna Apr 3, 2024
d6f59d7
fix: remove redundant check on field name
salvatore-campagna Apr 3, 2024
bdb7567
fix: use setDocumentField
salvatore-campagna Apr 3, 2024
5370b3e
fix: remoe logic behind fetching _size
salvatore-campagna Apr 3, 2024
d8d1d77
fix: reuse FieldFetcher as it was originally
salvatore-campagna Apr 3, 2024
eadfc90
fix: only deal with non-metadata fields that are stored
salvatore-campagna Apr 3, 2024
45c3af3
fix: remove unused method
salvatore-campagna Apr 3, 2024
acf09ee
fix: flip merge call
salvatore-campagna Apr 3, 2024
484e5d5
fix: just skip fields not stored
salvatore-campagna Apr 3, 2024
22cf0be
fix: hanlde none stored_fields
salvatore-campagna Apr 4, 2024
6e778a1
fix: do not include metadata stored fields specs
salvatore-campagna Apr 4, 2024
def5d35
fix: fetch fields phase not executed if no fields or stored fields
salvatore-campagna Apr 4, 2024
b4a1ec6
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Apr 4, 2024
5941d68
revert: parallel execution of top hits and child agg
salvatore-campagna Apr 4, 2024
a1ca138
revert: Objects.requireNotNull
salvatore-campagna Apr 4, 2024
674b354
test: yaml test _ignored stored_fields in get api
salvatore-campagna Apr 4, 2024
e0be2b7
fix: yaml test header
salvatore-campagna Apr 4, 2024
38cd4c2
fix: remove unnecessary skip
salvatore-campagna Apr 4, 2024
283cb78
fix: avoid creating a field fetcher if possible
salvatore-campagna Apr 4, 2024
dfc680a
fix: remove unecessary method mock
salvatore-campagna Apr 4, 2024
4290eac
fix: fetch fields and stored fields
salvatore-campagna Apr 5, 2024
6687daa
fix: always return default metadata fields
salvatore-campagna Apr 5, 2024
48f35fb
fix: remove unnecessary null check
salvatore-campagna Apr 5, 2024
ce16a81
fix: remove skip version
salvatore-campagna Apr 5, 2024
ac592e9
fix: remove reason as it is sligthly different for ccs
salvatore-campagna Apr 5, 2024
7c33c62
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Apr 5, 2024
06697a5
fix: _ignored fetched through stored_fields
salvatore-campagna Apr 5, 2024
f87dbf4
fix: do not fetch metadata fields in stored fields phase
salvatore-campagna Apr 5, 2024
c6d84fc
fix: remove check on stored_fields for get api
salvatore-campagna Apr 12, 2024
cf81e82
fix: exclude source and id before getting the field type
salvatore-campagna Apr 12, 2024
26b2463
test: add missing test requesting none stored_fields and a few fields
salvatore-campagna Apr 12, 2024
4ace7a0
fix: include a method to know if metadata fields are fetched via wild…
salvatore-campagna Apr 12, 2024
65bd1ec
Merge branch 'main' into refactor/meta-fields-sub-fetch
salvatore-campagna Apr 12, 2024
f8ed186
fix: skip test in v7 since it was http status 500 instead of 400
salvatore-campagna Apr 12, 2024
db1a8cb
Revert "fix: include a method to know if metadata fields are fetched …
salvatore-campagna Apr 12, 2024
56ea97a
fix: usae HashSet instead of TreeSet
salvatore-campagna Apr 12, 2024
866ce95
note: clarify why we need to check field is stored
salvatore-campagna Apr 12, 2024
22 changes: 22 additions & 0 deletions docs/reference/search/profile.asciidoc
@@ -197,6 +197,17 @@ The API returns the following result:
"stored_fields": ["_id", "_routing", "_source"]
},
"children": [
{
"type" : "FetchFieldsPhase",
"description" : "",
"time_in_nanos" : 238762,
"breakdown" : {
"process_count" : 5,
"process" : 227914,
"next_reader" : 10848,
"next_reader_count" : 1
}
},
{
"type": "FetchSourcePhase",
"description": "",
@@ -1043,6 +1054,17 @@ And here is the fetch profile:
"stored_fields": ["_id", "_routing", "_source"]
},
"children": [
{
"type" : "FetchFieldsPhase",
"description" : "",
"time_in_nanos" : 238762,
"breakdown" : {
"process_count" : 5,
"process" : 227914,
"next_reader" : 10848,
"next_reader_count" : 1
}
},
{
"type": "FetchSourcePhase",
"description": "",

@@ -22,8 +22,8 @@ setup:
---
fetch fields:
- skip:
version: ' - 8.5.99'
reason: stored fields phase added in 8.6
version: ' - 8.13.99'
reason: fetch fields and stored_fields using ValueFetcher

- do:
search:
@@ -44,17 +44,21 @@ fetch fields:
- match: { profile.shards.0.fetch.debug.stored_fields: [_id, _routing, _source] }
- length: { profile.shards.0.fetch.children: 2 }
- match: { profile.shards.0.fetch.children.0.type: FetchFieldsPhase }
- match: { profile.shards.0.fetch.children.1.type: StoredFieldsPhase }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.1.type: StoredFieldsPhase }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }

---
fetch source:
- skip:
version: ' - 8.5.99'
reason: stored fields phase added in 8.6
version: ' - 8.13.99'
reason: fetch fields and stored_fields using ValueFetcher

- do:
search:
@@ -71,20 +75,21 @@ fetch source:
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields_count: 0 }
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields: 0 }
- match: { profile.shards.0.fetch.debug.stored_fields: [_id, _routing, _source] }
- length: { profile.shards.0.fetch.children: 2 }
- match: { profile.shards.0.fetch.children.0.type: FetchSourcePhase }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.0.debug.fast_path: 1 }
- match: { profile.shards.0.fetch.children.1.type: StoredFieldsPhase }
- length: { profile.shards.0.fetch.children: 3 }
- match: { profile.shards.0.fetch.children.0.type: FetchFieldsPhase }
- match: { profile.shards.0.fetch.children.1.type: FetchSourcePhase }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.1.debug.fast_path: 1 }
- match: { profile.shards.0.fetch.children.2.type: StoredFieldsPhase }

---
fetch nested source:
- skip:
version: ' - 8.5.99'
reason: stored fields phase added in 8.6
version: ' - 8.13.99'
reason: fetch fields and stored_fields using ValueFetcher

- do:
indices.create:
@@ -135,24 +140,25 @@ fetch nested source:
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields_count: 0 }
- gt: { profile.shards.0.fetch.breakdown.load_stored_fields: 0 }
- match: { profile.shards.0.fetch.debug.stored_fields: [_id, _routing, _source] }
- length: { profile.shards.0.fetch.children: 3 }
- match: { profile.shards.0.fetch.children.0.type: FetchSourcePhase }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.0.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.1.type: InnerHitsPhase }
- length: { profile.shards.0.fetch.children: 4 }
- match: { profile.shards.0.fetch.children.0.type: FetchFieldsPhase }
- match: { profile.shards.0.fetch.children.1.type: FetchSourcePhase }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.1.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.2.type: StoredFieldsPhase }
- match: { profile.shards.0.fetch.children.2.type: InnerHitsPhase }
- gt: { profile.shards.0.fetch.children.2.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.2.breakdown.next_reader: 0 }
- gt: { profile.shards.0.fetch.children.2.breakdown.next_reader_count: 0 }
- gt: { profile.shards.0.fetch.children.2.breakdown.next_reader: 0 }
- match: { profile.shards.0.fetch.children.3.type: StoredFieldsPhase }

---
disabling stored fields removes fetch sub phases:
- skip:
version: ' - 7.15.99'
reason: fetch profiling implemented in 7.16.0
version: ' - 8.13.99'
reason: fetch fields and stored_fields using ValueFetcher
Member

this is no longer needed


- do:
search:
@@ -163,7 +169,8 @@ disabling stored fields removes fetch sub phases:

- match: { hits.hits.0._index: test }
- match: { profile.shards.0.fetch.debug.stored_fields: [] }
- is_false: profile.shards.0.fetch.children
- length: { profile.shards.0.fetch.children: 1 }
- match: { profile.shards.0.fetch.children.0.type: FetchFieldsPhase }
Member

I think that this is an oversight and should be reverted. The fact that this test needs to be adapted is a signal that we have done something wrong.

When we ask for stored_fields: _none_, StoredFieldsContext gets fetchFields set to false. In that case, I think we should skip retrieving all metadata fields in FetchFieldsPhase too, regardless of whether they are effectively stored or not. Because here there is no fields parameter and only stored_fields: _none_, we should skip the FetchFieldsPhase entirely, something like:

// with stored_fields: _none_ we don't fetch metadata fields either (regardless of whether they are stored or not, for bwc reasons)
if ((storedFieldsContext == null || storedFieldsContext.fetchFields() == false) && fetchFieldsContext == null) {
    return null;
}
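
To make the proposed condition easier to reason about, here is a minimal, self-contained sketch of the skip decision. The `StoredFieldsRequest` and `FetchFieldsRequest` records are hypothetical stand-ins for `StoredFieldsContext` and `FetchFieldsContext`, not the Elasticsearch classes, and `shouldSkipPhase` is an illustrative name; the boolean expression simply mirrors the one suggested in the comment.

```java
import java.util.List;

/** Sketch only: hypothetical stand-ins, not the Elasticsearch classes. */
final class FetchFieldsSkipSketch {

    /** Stand-in for StoredFieldsContext; fetchFields is false for stored_fields: _none_. */
    record StoredFieldsRequest(boolean fetchFields, List<String> fieldNames) {}

    /** Stand-in for FetchFieldsContext (the "fields" request parameter). */
    record FetchFieldsRequest(List<String> fields) {}

    /** Mirrors the condition from the review comment: true means the phase returns no processor. */
    static boolean shouldSkipPhase(StoredFieldsRequest stored, FetchFieldsRequest fetch) {
        boolean storedFieldsAbsentOrDisabled = stored == null || stored.fetchFields() == false;
        return storedFieldsAbsentOrDisabled && fetch == null;
    }

    public static void main(String[] args) {
        StoredFieldsRequest none = new StoredFieldsRequest(false, List.of());
        StoredFieldsRequest routing = new StoredFieldsRequest(true, List.of("_routing"));
        FetchFieldsRequest fields = new FetchFieldsRequest(List.of("title"));

        System.out.println(shouldSkipPhase(none, null));    // true:  stored_fields: _none_, no fields
        System.out.println(shouldSkipPhase(none, fields));  // false: fields were requested
        System.out.println(shouldSkipPhase(routing, null)); // false: stored fields were requested
        System.out.println(shouldSkipPhase(null, null));    // true:  neither parameter present
    }
}
```

Note that, read literally, the condition also skips the phase when neither parameter is present, which is the case the follow-up replies discuss.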

Contributor Author

salvatore-campagna Apr 3, 2024

That was the purpose of the fetchStoredFields flag, under the assumption that the FetchFieldsPhase is always executed.

Contributor Author

That would break another test in 370_profile when checking

- match: { profile.shards.0.fetch.debug.stored_fields: [_id, _routing, _source] }


---
dfs knn vector profiling:

@@ -51,13 +51,13 @@ public List<Object> fetchValues(Source source, int doc, List<Object> includedVal
for (Object entry : nestedValues) {
// add this one entry only to the stub and use this as source lookup
stub.put(nestedFieldName, entry);
Map<String, DocumentField> fetchResult = nestedFieldFetcher.fetch(
FieldFetcher.DocAndMetaFields fetchResult = nestedFieldFetcher.fetch(
Source.fromMap(filteredSource, source.sourceContentType()),
doc
);

Map<String, Object> nestedEntry = new HashMap<>();
for (DocumentField field : fetchResult.values()) {
for (DocumentField field : fetchResult.documentFields().values()) {
Contributor Author

Here I use only the document fields, because none of the metadata fields returned at the top level of the response need to be used in a nested context.

List<Object> fetchValues = field.getValues();
if (fetchValues.isEmpty() == false) {
String keyInNestedMap = field.getName().substring(nestedFieldPath.length() + 1);

@@ -10,29 +10,69 @@

import org.apache.lucene.index.LeafReaderContext;
import org.elasticsearch.common.document.DocumentField;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.index.mapper.IgnoredFieldMapper;
import org.elasticsearch.index.mapper.LegacyTypeFieldMapper;
import org.elasticsearch.index.mapper.RoutingFieldMapper;
import org.elasticsearch.search.fetch.FetchContext;
import org.elasticsearch.search.fetch.FetchSubPhase;
import org.elasticsearch.search.fetch.FetchSubPhaseProcessor;
import org.elasticsearch.search.fetch.StoredFieldsContext;
import org.elasticsearch.search.fetch.StoredFieldsSpec;

import java.io.IOException;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/**
* A fetch sub-phase for high-level field retrieval. Given a list of fields, it
* retrieves the field values through the relevant {@link org.elasticsearch.index.mapper.ValueFetcher}
* and returns them as document fields.
*/
public final class FetchFieldsPhase implements FetchSubPhase {

private static final List<String> ROOT_LEVEL_METADATA_FIELD_NAMES = List.of(
IgnoredFieldMapper.NAME,
RoutingFieldMapper.NAME,
LegacyTypeFieldMapper.NAME
);

private static final List<FieldAndFormat> ROOT_LEVEL_METADATA_FIELDS = ROOT_LEVEL_METADATA_FIELD_NAMES.stream()
.map(field -> new FieldAndFormat(field, null))
.toList();

public static boolean isMetadataField(final String field) {
return ROOT_LEVEL_METADATA_FIELD_NAMES.stream().anyMatch(f -> f.equals(field));
}

private static <T> List<T> emptyListIfNull(final List<T> theList) {
return theList == null ? Collections.emptyList() : theList;
}
Member

I think that this is problematic, because you have a separate definition of metadata fields within fetch fields. Plugins (like SizeMapperPlugin) can provide their own extensions of custom metadata fields. When they do so, we should treat those metadata fields like any other. We certainly should not have the knowledge of which metadata fields plugins may provide. Ideally, we would simply rely on SearchExecutionContext#isMetadataField here, like StoredFieldsPhase used to do. Additionally, you need to skip the _source field exactly like StoredFieldsPhase used to do.
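
A rough sketch of the idea in this comment, under stated assumptions: the metadata check is passed in as a `Predicate<String>`, which here stands in for a lookup such as `SearchExecutionContext#isMetadataField`, and the `registered` set is an invented example registry that includes a plugin-provided `_size` field. The class and method names are hypothetical; this is not the Elasticsearch implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Sketch only: selects metadata fields via an injected lookup instead of a hard-coded list. */
final class MetadataFieldSelectionSketch {

    /** Keeps fields the lookup reports as metadata, always skipping _source. */
    static List<String> metadataFieldsToFetch(List<String> requested, Predicate<String> isMetadataField) {
        List<String> result = new ArrayList<>();
        for (String field : requested) {
            if ("_source".equals(field)) {
                continue; // _source is handled elsewhere, so it is never fetched here
            }
            if (isMetadataField.test(field)) {
                result.add(field);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed registry: built-in metadata fields plus _size contributed by a plugin.
        Set<String> registered = Set.of("_id", "_source", "_routing", "_ignored", "_size");
        List<String> requested = List.of("_routing", "_size", "_source", "title");
        System.out.println(metadataFieldsToFetch(requested, registered::contains)); // [_routing, _size]
    }
}
```

Because the lookup is injected, a metadata field registered by a plugin is picked up without the fetch code knowing its name in advance.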


@Override
public FetchSubPhaseProcessor getProcessor(FetchContext fetchContext) {
FetchFieldsContext fetchFieldsContext = fetchContext.fetchFieldsContext();
if (fetchFieldsContext == null) {
return null;
}
final FetchFieldsContext fetchFieldsContext = fetchContext.fetchFieldsContext();
final StoredFieldsContext storedFieldsContext = fetchContext.storedFieldsContext();

boolean fetchStoredFields = storedFieldsContext != null && storedFieldsContext.fetchFields();
final List<FieldAndFormat> storedFields = storedFieldsContext == null
? Collections.emptyList()
: emptyListIfNull(storedFieldsContext.fieldNames()).stream()
.map(storedField -> new FieldAndFormat(storedField, null))
.distinct()
.toList();
final List<FieldAndFormat> storedFieldsIncludingDefaultMetadataFields = fetchStoredFields
? Stream.concat(ROOT_LEVEL_METADATA_FIELDS.stream(), storedFields.stream()).distinct().toList()
: Collections.emptyList();

FieldFetcher fieldFetcher = FieldFetcher.create(fetchContext.getSearchExecutionContext(), fetchFieldsContext.fields());
final List<FieldAndFormat> fetchFields = fetchFieldsContext == null
? Collections.emptyList()
: emptyListIfNull(fetchFieldsContext.fields());
final FieldFetcher fieldFetcher = FieldFetcher.create(
fetchContext.getSearchExecutionContext(),
Stream.concat(fetchFields.stream(), storedFieldsIncludingDefaultMetadataFields.stream()).toList()
);

return new FetchSubPhaseProcessor() {
@Override
@@ -47,11 +87,10 @@ public StoredFieldsSpec storedFieldsSpec() {

@Override
public void process(HitContext hitContext) throws IOException {
Map<String, DocumentField> documentFields = fieldFetcher.fetch(hitContext.source(), hitContext.docId());
SearchHit hit = hitContext.hit();
for (Map.Entry<String, DocumentField> entry : documentFields.entrySet()) {
hit.setDocumentField(entry.getKey(), entry.getValue());
}
final FieldFetcher.DocAndMetaFields fields = fieldFetcher.fetch(hitContext.source(), hitContext.docId());
Member

In order to simplify things, I would propose that we make no changes to FieldFetcher, and instead create two instances of it, one for ordinary fields (like we already do) and a second one for metadata fields. The choice of where to put the fields that each fetcher outputs can be made in FetchFieldsPhase. This simplifies things in that it removes the need to distinguish between metadata and not in FieldFetcher and removes the need for the fetchStoredFields flag too.
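
The two-fetcher proposal could look roughly like the following. This is a simplified sketch, not the actual change: `Fetcher` is a hypothetical stand-in for an unmodified `FieldFetcher`, `Hit` is a hypothetical stand-in for the search hit, and the field names and values are made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: the phase, not the fetcher, decides where each fetcher's output goes. */
final class TwoFetchersSketch {

    /** Hypothetical stand-in for an unmodified FieldFetcher: field name to values for one doc. */
    interface Fetcher {
        Map<String, List<Object>> fetch(int doc);
    }

    /** Hypothetical stand-in for a hit that keeps document and metadata fields apart. */
    static final class Hit {
        final Map<String, List<Object>> documentFields = new HashMap<>();
        final Map<String, List<Object>> metadataFields = new HashMap<>();
    }

    /** Runs both fetchers and routes each result to its own map on the hit. */
    static Hit process(int doc, Fetcher fieldFetcher, Fetcher metadataFieldFetcher) {
        Hit hit = new Hit();
        hit.documentFields.putAll(fieldFetcher.fetch(doc));
        hit.metadataFields.putAll(metadataFieldFetcher.fetch(doc));
        return hit;
    }

    public static void main(String[] args) {
        Fetcher ordinary = doc -> Map.of("title", List.<Object>of("doc-" + doc));
        Fetcher metadata = doc -> Map.of("_routing", List.<Object>of("shard-key"));
        Hit hit = process(7, ordinary, metadata);
        System.out.println(hit.documentFields); // {title=[doc-7]}
        System.out.println(hit.metadataFields); // {_routing=[shard-key]}
    }
}
```

With this split, neither fetcher needs an `isMetadataField` flag on its resolved fields, which is the simplification the comment is after.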

final Map<String, DocumentField> documentFields = fields.documentFields();
final Map<String, DocumentField> metadataFields = fields.metadataFields();
hitContext.hit().addDocumentFields(documentFields, metadataFields);
}
};
}

@@ -35,7 +35,7 @@
*/
public class FieldFetcher {

private record ResolvedField(String field, String matchingPattern, MappedFieldType ft, String format) {}
private record ResolvedField(String field, String matchingPattern, MappedFieldType ft, String format, boolean isMetadataField) {}

/**
* Build a FieldFetcher for a given search context and collection of fields and formats
@@ -58,7 +58,9 @@ public static FieldFetcher create(SearchExecutionContext context, Collection<Fie
if (context.isMetadataField(field) && matchingPattern != null) {
continue;
}
resolvedFields.add(new ResolvedField(field, matchingPattern, ft, fieldAndFormat.format));
resolvedFields.add(
new ResolvedField(field, matchingPattern, ft, fieldAndFormat.format, FetchFieldsPhase.isMetadataField(field))
);
}
}

@@ -127,7 +129,7 @@ private static Map<String, FieldContext> buildFieldContexts(
if (nestedScope.equals(scope)) {
// These are fields in the current scope, so add them directly to the output map
for (ResolvedField ff : fieldsByNestedMapper.get(nestedScope)) {
output.put(ff.field, new FieldContext(ff.field, buildValueFetcher(context, ff)));
output.put(ff.field, new FieldContext(ff.field, buildValueFetcher(context, ff), ff.isMetadataField));
}
} else {
// don't create nested fetchers if no children have been requested as part of the fields
@@ -147,7 +149,7 @@ private static Map<String, FieldContext> buildFieldContexts(
unmappedFetchPatterns
);
NestedValueFetcher nvf = new NestedValueFetcher(scope, new FieldFetcher(scopedFields, unmappedFieldFetcher));
output.put(scope, new FieldContext(scope, nvf));
output.put(scope, new FieldContext(scope, nvf, FetchFieldsPhase.isMetadataField(scope)));
}
}
}
@@ -161,25 +163,39 @@ private static Map<String, FieldContext> buildFieldContexts(
private FieldFetcher(Map<String, FieldContext> fieldContexts, UnmappedFieldFetcher unmappedFieldFetcher) {
this.fieldContexts = fieldContexts;
this.unmappedFieldFetcher = unmappedFieldFetcher;
this.storedFieldsSpec = StoredFieldsSpec.build(fieldContexts.values(), fc -> fc.valueFetcher.storedFieldsSpec());
this.storedFieldsSpec = StoredFieldsSpec.build(
fieldContexts.values().stream().filter(f -> f.isMetadataField == false).toList(),
fc -> fc.valueFetcher.storedFieldsSpec()
);
}

public StoredFieldsSpec storedFieldsSpec() {
return storedFieldsSpec;
}

public Map<String, DocumentField> fetch(Source source, int doc) throws IOException {
/**
* Collects document and metadata fields as two separate maps
* mapping the field name to the actual {@link DocumentField}.
*/
public record DocAndMetaFields(Map<String, DocumentField> documentFields, Map<String, DocumentField> metadataFields) {}

public DocAndMetaFields fetch(Source source, int doc) throws IOException {
Map<String, DocumentField> documentFields = new HashMap<>();
for (FieldContext context : fieldContexts.values()) {
String field = context.fieldName;
ValueFetcher valueFetcher = context.valueFetcher;
final DocumentField docField = valueFetcher.fetchDocumentField(field, source, doc);
if (docField != null) {
documentFields.put(field, docField);
Map<String, DocumentField> metadataFields = new HashMap<>();
for (FieldContext fieldContext : fieldContexts.values()) {
String field = fieldContext.fieldName;
ValueFetcher valueFetcher = fieldContext.valueFetcher;
final DocumentField value = valueFetcher.fetchDocumentField(field, source, doc);
if (value != null) {
if (fieldContext.isMetadataField) {
metadataFields.put(field, value);
} else {
documentFields.put(field, value);
}
}
}
unmappedFieldFetcher.collectUnmapped(documentFields, source);
return documentFields;
return new DocAndMetaFields(documentFields, metadataFields);
}

public void setNextReader(LeafReaderContext readerContext) {
@@ -188,5 +204,5 @@ public void setNextReader(LeafReaderContext readerContext) {
}
}

private record FieldContext(String fieldName, ValueFetcher valueFetcher) {}
private record FieldContext(String fieldName, ValueFetcher valueFetcher, boolean isMetadataField) {}
}