Dynamic pipelines - a new `foreach` block #1480

ptodev · 2024-08-15T15:41:36Z

PR Description

This is a proposal for starting up Alloy pipelines dynamically based on data in transit. For example, if a discovery component comes up with 10 targets, Alloy can start 10 sub-pipelines, each dealing with a different target.

The PR consists of a design doc in docs/design/ and an experimental implementation. Users will need to run Alloy with the "experimental" cmd flag turned on to use it.

Which issue(s) this PR fixes

Fixes #1443

captncraig · 2024-08-20T18:40:54Z

I really like the idea of it being component based. I am comparing it to Terraform's for_each, where you make a "template" resource and give it an argument that tells it how to expand into multiple components:

resource "azurerm_resource_group" "rg" {
  for_each = tomap({
    a_group       = "eastus"
    another_group = "westus2"
  })
  name     = each.key
  location = each.value
}

for_each can be any expression that evaluates to something iterable (in our case limiting to an array would probably be fine, but maybe we could do maps in a similar way).
The each keyword is valid in other arguments to reference the "current" item for iteration.

I really like the notion that this is a single component that would be "expanded" based on the evaluation of some list. It could be a built-in component or a previously imported dynamic component.

I am not sure how naming of the expanded components would work, but we could figure something out. I'm also not sure if it is possible to reference a dynamically created subcomponent from elsewhere in the config.
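As a rough illustration of the expansion described in this comment, the following Go sketch applies a template once per entry of a map, binding the key and value the way each.key and each.value are bound in the Terraform example. The `Instance` type, the `Expand` function, and the `label["key"]` naming scheme are assumptions made for this sketch; they are not how Alloy or Terraform implement expansion.

```go
package main

import (
	"fmt"
	"sort"
)

// Instance is a hypothetical expanded component: one per collection entry.
type Instance struct {
	ID       string // derived name, here label["key"] (an assumed scheme)
	Name     string // plays the role of each.key
	Location string // plays the role of each.value
}

// Expand applies the template once per map entry. Keys are sorted so that
// the order of the resulting instances is stable across runs.
func Expand(label string, items map[string]string) []Instance {
	keys := make([]string, 0, len(items))
	for k := range items {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]Instance, 0, len(keys))
	for _, k := range keys {
		out = append(out, Instance{
			ID:       fmt.Sprintf("%s[%q]", label, k),
			Name:     k,
			Location: items[k],
		})
	}
	return out
}

func main() {
	items := map[string]string{
		"a_group":       "eastus",
		"another_group": "westus2",
	}
	for _, inst := range Expand("rg", items) {
		fmt.Printf("%s name=%s location=%s\n", inst.ID, inst.Name, inst.Location)
	}
}
```

Deriving the ID from the key gives each expanded instance a stable name, which is one possible answer to the naming question raised above. Whether such names could be referenced from elsewhere in a config is left open here.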

github-actions · 2024-10-07T00:01:33Z

This PR has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If you do not have enough time to follow up on this PR or you think it's no longer relevant, consider closing it.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your PR will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

ptodev · 2024-10-29T13:42:26Z

It looks likely that we are going to go with the foreach option. I updated the PR to add some draft code that I've been working on. As a first step, it'd be good to have a unit test that doesn't use the proposed var argument. The test could look something like what's proposed in internal/runtime/testdata/foreach/foreach_1.txtar.

As a second step, we will need to figure out how to refer to the values that are currently iterated on. Should we actually have a var argument? I think it might be unnecessary - having a predefined argument (e.g. val) which refers to the current thing being iterated on may be enough. I'm not yet sure how to add this reserved word into the Alloy syntax though.

By the way, apparently the Collector already has something like dynamic pipelines. There are observer extensions which you can hook up to a receiver creator component. For example, you can have a k8s observer which creates a receiver for each discovered pod.

github-actions · 2025-01-23T11:24:20Z

💻 Deploy preview available: https://deploy-preview-alloy-1480-zb444pucvq-vp.a.run.app/docs/alloy/latest/

* Add docs for foreach
* Apply suggestions from code review (Co-authored-by: Clayton Cornell)
* Add a shared experimental_feature snippet
* Addressing PR feedback
* Apply suggestions from code review (Co-authored-by: William Dumont)

Co-authored-by: Clayton Cornell
Co-authored-by: William Dumont

mattdurham

Fantastic work, gave it a first review and will go over it again. Added some comments on tests.

mattdurham · 2025-02-06T21:29:52Z

internal/runtime/internal/controller/node_config_foreach.go

+}
+
+func (fi *forEachChild) Hash() uint64 {
+	fnvHash := fnv.New64a()


Why are we using two different hash functions, sha256 and fnv? They don't collide in usage, but it feels odd.

wildum · 2025-02-11

The runner pkg only needs a 64-bit hash, but for the objects in the collection we use a 256-bit hash. I would be ok with using 64 bits for the items in the collection. A collision at this level could end up in missing metrics and duplicated metrics in unlucky scenarios, but with 1000 items in the collection (which would be a lot for Alloy) the probability is 2.71×10^-14. A collision in the runner pkg would be even worse, so I'm not sure that the extra security of a 256-bit hash is needed, but I also don't mind it so much because it should not matter much in terms of performance. @ptodev what do you think?
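The probability quoted above for 1000 items and a 64-bit hash is consistent with the standard birthday bound. The Go sketch below reproduces the estimate, assuming the hash output is uniformly distributed; `CollisionProbability` is a hypothetical helper written for this illustration and is not part of the PR.

```go
package main

import (
	"fmt"
	"math"
)

// CollisionProbability approximates the chance that at least two of n
// uniformly distributed hashes of the given bit width collide, using the
// birthday bound p ≈ 1 - exp(-n(n-1) / 2^(bits+1)).
func CollisionProbability(n float64, bits int) float64 {
	pairs := n * (n - 1) / 2
	space := math.Pow(2, float64(bits))
	// -Expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
	return -math.Expm1(-pairs / space)
}

func main() {
	fmt.Printf("64-bit,  1000 items: %.3g\n", CollisionProbability(1000, 64))
	fmt.Printf("256-bit, 1000 items: %.3g\n", CollisionProbability(1000, 256))
}
```

For 1000 items the 64-bit case evaluates to roughly 2.71e-14. The 256-bit case is smaller by many orders of magnitude, which is why the choice between the two widths is mostly a matter of taste at this collection size.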

ptodev · 2025-02-11

IIUC, for items in the collection the hash is 256 bits since there could be lots of things in each item in a collection, and it gives us extra protection against collisions. For the foreach ID we use a 64-bit hash since it's just a string. IMO it's ok to use different hashes, but we do need to document why each situation uses a different one.

wildum · 2025-02-11

We use the hash of the items in the collection for the foreachID, and the foreachID as a key: https://github.com/grafana/alloy/pull/1480/files#diff-9cebda46e5a40368c4b76027ff2bda36114d2737124b5390aa35fa6435d79127R193
The 64-bit hash is used for the runner that wraps around the foreach to run it.
What's in an item in a collection should not matter. Whether the object has 3 or 30 fields, I believe it should not have any influence on the collision probability. The only difference would be the number of items in the collection, and we would need billions of items for a collision to be likely.
I'm ok with keeping it as it is and adding a comment that we keep the 256-bit hash for extra security, and that if one day it becomes a performance bottleneck it would be ok to use a 64-bit hash instead.
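To make the two hash widths in this thread concrete, here is a minimal Go sketch that computes a 256-bit fingerprint for an item and a 64-bit FNV-1a hash of the resulting ID string. The helper names `fingerprint256` and `hash64` are hypothetical, and hashing the item's string form is an assumption for illustration; the PR's actual serialization of collection items is not shown in this thread.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash/fnv"
)

// fingerprint256 returns the hex-encoded SHA-256 digest of s (256 bits).
func fingerprint256(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// hash64 returns the 64-bit FNV-1a hash of s.
func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error.
	return h.Sum64()
}

func main() {
	// Hypothetical collection items, represented here as plain strings.
	for _, item := range []string{"target-a", "target-b"} {
		id := fingerprint256(item) // wide fingerprint used as the instance ID
		key := hash64(id)          // narrow hash of that ID string
		fmt.Printf("item=%s id=%s... key=%d\n", item, id[:12], key)
	}
}
```

Both functions are deterministic, so the same item always maps to the same ID and key. Only the width of the output, and therefore the collision probability, differs.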

mattdurham

Small comments but overall fantastic!

ptodev commented Aug 15, 2024

docs/design/1443-dynamic-pipelines.md

docs/design/1443-dynamic-pipelines.md

ptodev mentioned this pull request Aug 19, 2024

Integrate prometheus.exporter and discovery components #1443

Open

wildum reviewed Aug 20, 2024

docs/design/1443-dynamic-pipelines.md

wildum reviewed Aug 20, 2024

docs/design/1443-dynamic-pipelines.md

wildum reviewed Aug 20, 2024

docs/design/1443-dynamic-pipelines.md

ptodev mentioned this pull request Aug 23, 2024

Document a new "join" block in discovery.relabel #1541

Closed

4 tasks

thampiotr reviewed Sep 2, 2024

github-actions bot added the needs-attention label Oct 7, 2024

ptodev force-pushed the ptodev/dynamic-pipelines branch from dd602e6 to 378cb04 on October 31, 2024 19:35

ptodev force-pushed the ptodev/dynamic-pipelines branch from 378cb04 to cc02bfc on November 20, 2024 09:43

ptodev mentioned this pull request Nov 25, 2024

Flow: Add component for multi-tenant remote_write support #521

Open

thampiotr assigned ptodev and wildum Dec 10, 2024

ptodev force-pushed the ptodev/dynamic-pipelines branch from d8b2547 to d1591b1 on December 16, 2024 10:37

wildum force-pushed the ptodev/dynamic-pipelines branch from e8cd817 to 8217e86 on December 19, 2024 14:40

wildum force-pushed the ptodev/dynamic-pipelines branch 2 times, most recently from b041704 to 421204a on January 9, 2025 09:23

ptodev changed the title ~~Proposal for dynamic pipelines~~ Dynamic pipelines - a new foreach block Jan 27, 2025

ptodev and others added 8 commits January 27, 2025 17:10

Proposal for dynamic pipelines

94509e7

Foreach prototype

1f2ce8d

Initial implementation

247b68d

wip

afdeb23

Fixes to summation1 and summation2

b917da4

* summation1 only sends during run() * summation2 tracks the sum via metrics instead of an export

fix foreach run

439b300

foreach uses the value from the collection via the var

7de0aa1

compute an ID for the foreach instances and add tests

5174a33

ptodev and others added 6 commits January 27, 2025 17:10

Disable debug metrics for components inside foreach, and for foreach itself. (#2404)

0b9f400

Add stability lvl to config blocks (#2441)

6148600

* add stability lvl to config blocks * fix import git test

Add tests for types other than integers (#2436)

b91aee3

* Add tests for types other than integers * Minor fixes to string_receiver * Add a foreach test for maps which contain maps

use full hash on foreach instances and fix test

6bf628c

Add a changelog entry.

417bd4d

ptodev force-pushed the ptodev/dynamic-pipelines branch from e16e3e3 to 417bd4d on January 27, 2025 17:17

ptodev marked this pull request as ready for review January 27, 2025 17:18

ptodev requested review from clayton-cornell and a team as code owners January 27, 2025 17:18

mattdurham reviewed Jan 27, 2025

clayton-cornell added the type/docs label (Docs Squad label across all Grafana Labs repos) Jan 28, 2025

wildum and others added 5 commits January 31, 2025 11:12

typo

86435e4

allow non alphanum strings

0b42dc3

add test for wrong collection type

8611fa3

added capsule test

7db7859

Add more tests for non-alphanumeric strings.

1ab03b9

wildum reviewed Jan 31, 2025

internal/runtime/internal/controller/node_config_foreach_test.go

clayton-cornell reviewed Jan 31, 2025

mattdurham reviewed Feb 6, 2025

internal/runtime/internal/controller/node_config_foreach.go (outdated)

mattdurham reviewed Feb 6, 2025

internal/runtime/internal/controller/node_config_foreach.go (outdated)

mattdurham reviewed Feb 6, 2025

internal/runtime/alloy.go

mattdurham approved these changes Feb 6, 2025

thampiotr approved these changes Feb 10, 2025

ptodev and others added 3 commits February 10, 2025 17:01

Apply suggestions from code review

00c8dca

Co-authored-by: Clayton Cornell

Apply suggestions from code review

737cd0d

Co-authored-by: Clayton Cornell

Add comments regarding the override registry for modules

9781fab

clayton-cornell approved these changes Feb 10, 2025

Rename hashObject to objectFingerprint

5ec5380

mattdurham left a comment

mattdurham Feb 6, 2025

wildum Feb 11, 2025

ptodev Feb 11, 2025

wildum Feb 11, 2025

mattdurham left a comment
