[system tests] Validate transforms based on the mappings #2347

mrodm · 2025-01-16T10:43:26Z

Relates #2341

Add support to validate transforms created in packages (if any) based on their mappings.

In some packages, documents are deleted once they have been processed, so it must also be possible to check whether or not the destination index has deleted docs.

Concerns:

Testing packages that define transforms is going to be slower, since the test needs to wait for docs (or deleted docs).
- This is required so that the destination index has its actual mappings, which are only in place after the docs have been processed.

Author's Checklist

Run tests in packages from integrations repository
Needs to be updated after merging "Validate fields based on their mappings and the dynamic templates set" #2285

mrodm · 2025-01-23T17:51:05Z

test integrations

elastic-vault-github-plugin-prod · 2025-01-23T17:57:05Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#12387

Do not validate transform fields based on mappings for those packages where the Kibana version is lower than 8.14.0. There is a known issue related to how mappings are generated. Related issue: elastic/kibana#175331

mrodm · 2025-01-24T14:55:14Z

internal/testrunner/runners/system/tester.go

+	// Before stack version 8.14.0, there are some issues generating the corresponding
+	// mappings for the fields defined in the transforms
+	// Forced to not use mappings to validate transforms before 8.14.0
+	// Related issue: https://github.com/elastic/kibana/issues/175331


Found some issues with the generated mappings for the transforms defined in some packages. If two group fields were defined (with different fields under them), they were not merged. This happened, for instance, in the github package.

It looks like this was solved in elastic/kibana#177608, and the fix is available starting in stack version 8.14.0.

For now, added a way to skip validation based on mappings (for transforms only).
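
For readers following along, the version gate can be sketched with the standard library only. This is a simplified illustration, not the PR's code: the PR compares a semver value with `stackVersion.LessThan(semver_8_14_0)`, whereas the `parseVersion`, `lessThan` and `validationMethodFor` helpers and the `"fields"`/`"mappings"` method names below are assumptions for the example. Note that this sketch ignores pre-release suffixes, which a real semver comparison would take into account.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion parses "major.minor.patch". Any pre-release suffix such as
// "-SNAPSHOT" is dropped, which is a simplification compared to semver.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	core, _, _ := strings.Cut(v, "-")
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// lessThan reports whether version a is lower than version b.
func lessThan(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// validationMethodFor falls back to field-based validation for transforms
// when the stack is older than 8.14.0; otherwise it keeps the requested method.
func validationMethodFor(stackVersion, requested string) (string, error) {
	v, err := parseVersion(stackVersion)
	if err != nil {
		return "", err
	}
	if lessThan(v, [3]int{8, 14, 0}) {
		return "fields", nil
	}
	return requested, nil
}

func main() {
	for _, v := range []string{"8.13.4", "8.14.0", "8.15.0-SNAPSHOT"} {
		method, err := validationMethodFor(v, "mappings")
		fmt.Println(v, method, err)
	}
}
```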

mrodm · 2025-01-27T12:29:37Z

test integrations

elastic-vault-github-plugin-prod · 2025-01-27T12:35:41Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#12387

mrodm · 2025-01-27T17:02:00Z

/test

mrodm · 2025-01-27T17:21:19Z

internal/testrunner/runners/system/tester.go

@@ -750,7 +752,7 @@ func (r *tester) getDocs(ctx context.Context, dataStream string) (*hits, error)
 		return &hits{}, nil
 	}
 	if resp.IsError() {
-		return nil, fmt.Errorf("failed to search docs for data stream %s: %s", dataStream, resp.String())
+		return nil, fmt.Errorf("failed to search docs for index or data stream %s: %s", dataStream, resp.String())


Should we keep just "data stream" to avoid being too verbose here?
"index" covers the case where the process waits for the docs to appear in transforms.

mrodm · 2025-01-27T17:42:39Z

/test

mrodm · 2025-01-27T18:19:08Z

test integrations

elastic-vault-github-plugin-prod · 2025-01-27T18:25:33Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#12387

mrodm · 2025-01-27T18:29:49Z

@jsoriano after some changes, this PR is ready for review again, thanks in advance!

jsoriano · 2025-01-28T19:01:29Z

internal/testrunner/runners/system/tester.go

+	// Related issue: https://github.com/elastic/kibana/issues/175331
+	validationMethod := r.fieldValidationMethod
+	if stackVersion.LessThan(semver_8_14_0) {
+		logger.Debugf("Forced to validate transforms based on fields, not available for stack versions < 8.14.0")


jsoriano · 2025-01-28T19:06:59Z

internal/testrunner/runners/system/tester.go

 			}
-		}
+			return processed >= len(transformDocs), nil


Looks like we are comparing different things here:

len(transformDocs) is the number of documents resulting from the preview.

processed is the number of processed documents from the source data stream/s.

So, if I understand correctly, processed can be bigger than len(transformDocs) before all the documents in the data stream are processed.

I would expect this to stop when processed >= number of docs in the source index at some point to be decided.

So, IIUC you mean to compare processed with the number of documents found in the source data stream just before validating the transform, is that right? @jsoriano

This number of documents in the data stream could be added as first step into validateTransformsWithMappings.

Yes, I mean to compare with the number of documents in the source data stream. What I am not so sure is when to check for this number.

This number of documents in the data stream could be added as first step into validateTransformsWithMappings.

If we are sure that we have enough documents at this point it would be perfect, yes.

This number of documents in the data stream could be added as first step into validateTransformsWithMappings.

If we are sure that we have enough documents at this point it would be perfect, yes.

I'm not 100% sure, but I currently cannot think of a better location. This could vary per package, since some packages ingest documents continuously, or have not yet ingested all the required documents by the time the data stream is validated (related to #2378).
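
To make the difference between the two stop conditions discussed in this thread concrete, here is a small sketch with made-up numbers. `stopOnPreview` and `stopOnSource` are hypothetical helpers for illustration only; they mirror the `processed >= len(transformDocs)` check from the diff and the alternative suggested in the review.

```go
package main

import "fmt"

// stopOnPreview mirrors the condition in the diff: processed is compared
// with the number of documents returned by the transform preview.
func stopOnPreview(processed, previewDocs int) bool {
	return processed >= previewDocs
}

// stopOnSource is the alternative from the review: processed is compared
// with the number of documents found in the source data stream.
func stopOnSource(processed, sourceDocs int) bool {
	return processed >= sourceDocs
}

func main() {
	// Assumed scenario: 100 source docs that the transform groups into
	// 5 preview docs, with only 10 source docs processed so far.
	processed, previewDocs, sourceDocs := 10, 5, 100

	fmt.Println("stop (preview):", stopOnPreview(processed, previewDocs))
	fmt.Println("stop (source):", stopOnSource(processed, sourceDocs))
}
```

With these numbers the preview-based condition would stop waiting while 90 source documents are still unprocessed, which is the mismatch pointed out above.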

mrodm · 2025-01-29T15:36:31Z

test integrations

jsoriano · 2025-01-29T15:37:25Z

internal/testrunner/runners/system/tester.go

-func (r *tester) validateTransformsWithMappings(ctx context.Context, transformId, transformName, destIndexTransform, indexTemplateTransform string, transformDocs []common.MapStr, fieldsValidator *fields.Validator) error {
+func (r *tester) validateTransformsWithMappings(ctx context.Context, sourceDataStream, transformId, transformName, destIndexTransform, indexTemplateTransform string, transformDocs []common.MapStr, fieldsValidator *fields.Validator) error {
+	logger.Debugf("Searching the number of documents found in source data stream %q before validating transform %q", sourceDataStream, transformId)
+	sourceDataStreamHits, err := r.getDocs(ctx, sourceDataStream)


Umm, thinking about #2341 (comment), we should probably use the source index here, because in cases where there are differences we would be also comparing different things.

IIUC the source index in this context (where the transform is reading the documents to process) is the source data stream of the running test. If so, this would mean the data stream that the scenario.dataStream variable refers to.

Is that what you meant here?

sourceDataStream is the scenario.dataStream, right?

The source index in a transform is a pattern that can match multiple indexes or data streams. As mentioned in #2341 (comment), there can be several data streams matching the same transform source index. In the case of system tests, I guess this means that the source index can match other data streams apart from sourceDataStream/scenario.dataStream.

So here we are getting the docs written by the test into its data stream, and using its count to decide when to stop the test. We are comparing the number of docs in the scenario data stream with the documents of all the matching data streams.

In the case where more than one data stream matches, we might be continuing even if none of the expected documents has been processed.

I think we should use the transform source index here to be sure that we check the transform after all the documents in all data streams (including the scenario one) are processed. But maybe this is quite a corner case, not sure about the cases we have.
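
A rough sketch of what counting against the transform source index could look like is shown below. It is only an approximation under stated assumptions: `expectedDocs` is a hypothetical helper, the data stream names and counts are invented, and `path.Match` is used as a stand-in for Elasticsearch index pattern matching (it also interprets `?` and `[...]`, which Elasticsearch index patterns do not).

```go
package main

import (
	"fmt"
	"path"
)

// expectedDocs sums the docs of every data stream whose name matches at
// least one of the transform source patterns.
func expectedDocs(patterns []string, docsByDataStream map[string]int) (int, error) {
	total := 0
	for name, count := range docsByDataStream {
		for _, p := range patterns {
			ok, err := path.Match(p, name)
			if err != nil {
				return 0, fmt.Errorf("invalid pattern %q: %w", p, err)
			}
			if ok {
				total += count
				break // count each data stream only once
			}
		}
	}
	return total, nil
}

func main() {
	// Invented example: two data streams match the source pattern, but
	// only the first one belongs to the scenario under test.
	docs := map[string]int{
		"logs-github.issues-default": 10,
		"logs-github.audit-default":  5,
		"logs-nginx.access-default":  7,
	}
	total, err := expectedDocs([]string{"logs-github.*-*"}, docs)
	fmt.Println(total, err)
}
```

Comparing the processed count with this total, rather than with the scenario data stream alone, avoids stopping before the scenario's own documents have been processed.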

elasticmachine · 2025-01-29T15:38:00Z

💚 Build Succeeded

Buildkite Build
Commit: 1d539ee

History

💚 Build #4752 succeeded e2fe72f
💚 Build #4743 succeeded afac6f3
💔 Build #4741 failed afac6f3
💔 Build #4739 failed 1e5a7ae
💚 Build #4736 succeeded a65efc3
💚 Build #4730 succeeded df0732a

cc @mrodm

elastic-vault-github-plugin-prod · 2025-01-29T15:43:27Z

Created or updated PR in integrations repository to test this version. Check elastic/integrations#12387

mrodm added 30 commits December 16, 2024 19:15

Add dynamic templates as parameter

2c3f76e

Add validation for each dynamic template depending on the parameters

41d7b88

Fixes for dynamic templates

ff5298e

Compare fields with dynamic templates

af04d66

Remove multi_fields from flattened fields

f21c712

multi_fields are now compared directly with dynamic templates or ECS fields. It was required to know whether or not that field was a multi_field to apply some specific validations.

Test without filtering dynamic templates

aa28db6

Ensure properties subfields are validated accordingly

3767111

Restore filtering and add tests

9fb80b4

Disable unmatch_mapping_type and match_mapping_type and continue look…

2ad28ac

…ing for other dynamic templates if parameters do not match

Merge upstream/main into validate-dynamic-mappings

8bdad65

Refactors and remove log statements

151e59e

Fix function naming

d0ff6b9

Add test with match_pattern regex

b474b4b

Add comment in tests

066b5a0

Remove loading schema from create validator for mappings method

a7eb191

Separate parsing and validation of dynamic templates

7666ac2

Support match_pattern for the other settings

a718796

fix test

0a5e775

Update tests for multi-fields

131cc9c

Review multi-fields logic

38db8df

Revisit multi-fields logic in ECS

9ff3d0c

Report all errors related to multi-fields comparing with ECS

5165612

Ignored validation of multi-fields with ECS

ce557c9

Fix multi-field test

ebda859

Rephrase errors

019ffd4

Validate fully dynamic objects (preview) with dynamic templates

8f36c18

Rephrase debug message

d96ffbf

Add logging - to be removed

dbca4fe

Merge remote-tracking branch 'upstream/main' into validate-dynamic-ma…

8fce0ec

…ppings

Validate transforms with mappings getting docs

aa5674f

mrodm added 2 commits January 23, 2025 11:20

Add transform name to test case

1cbe763

Extract function to validate transforms based on mappings

930bbbe

mrodm requested a review from jsoriano January 23, 2025 14:54

mrodm added 3 commits January 23, 2025 16:14

Remove TODO

e2eee93

Use processed docs as stop condition in transforms

a178a68

Remove unused function

aff903b

mrodm added 2 commits January 24, 2025 15:37

Add exception for transforms in stacks < 8.14.0

c385cb2

Do not validate transform fields based on mappings for those packages where the Kibana version is lower than 8.14.0. There is a known issue related to how mappings are generated. Related issue: elastic/kibana#175331

Merge upstream/main into validate-transforms-mappings

df0732a

mrodm commented Jan 24, 2025

Check for errors in docs when validation based on mappings is used

a65efc3

Rename function

1e5a7ae

Report any mapping error or any doc with an error

afac6f3

mrodm commented Jan 27, 2025

Merge upstream/main into validate-transforms-mappings

e2fe72f

jsoriano reviewed Jan 28, 2025

Use source data stream hits as a reference for transforms

1d539ee

jsoriano approved these changes Jan 29, 2025

jsoriano reviewed Jan 29, 2025
