Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tqlotel] Add priority keys parameter to the limit function. #12997

Merged
merged 2 commits into from
Sep 7, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 10 additions & 4 deletions pkg/telemetryquerylanguage/functions/tqlotel/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -101,18 +101,24 @@ Examples:

## limit

`limit(target, limit)`
`limit(target, limit, priority_keys)`

The `limit` function reduces the number of elements in a `pdata.Map` to be no greater than the limit.

`target` is a path expression to a `pdata.Map` type field. `limit` is a non-negative integer.
`target` is a path expression to a `pdata.Map` type field. `limit` is a non-negative integer.
`priority_keys` is a list of strings of attribute keys that won't be dropped during limiting.

Comment on lines +109 to +110
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not a list is a variadic argument. I mean this means we cannot add arguments after and we don't need to use join/concat/etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TylerHelmuth how do we call these arguments?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a limitation of the grammar at the moment. The grammar only supports 1 list parameter and it must be the last. Yes it is technically variadic, but I believe the docs currently call it a list. There is open issue to add full list capabilities.

The number of priority keys must be less than the supplied `limit`.

The map will be mutated such that the number of items does not exceed the limit.
kovrus marked this conversation as resolved.
Show resolved Hide resolved
The map is not copied or reallocated.

The map will be mutated such that the number of items does not exceed the limit. Which items are dropped is random.
Which items are dropped is random, provide `priority_keys` to preserve required keys.

Examples:

- `limit(attributes, 100)`
- `limit(resource.attributes, 50)`
- `limit(resource.attributes, 50, "http.host", "http.method")`

## replace_all_matches

Expand Down
32 changes: 24 additions & 8 deletions pkg/telemetryquerylanguage/functions/tqlotel/func_limit.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,21 @@ import (
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/telemetryquerylanguage/tql"
)

func Limit(target tql.GetSetter, limit int64) (tql.ExprFunc, error) {
func Limit(target tql.GetSetter, limit int64, priorityKeys []string) (tql.ExprFunc, error) {
if limit < 0 {
return nil, fmt.Errorf("invalid limit for limit function, %d cannot be negative", limit)
}
if limit < int64(len(priorityKeys)) {
return nil, fmt.Errorf(
"invalid limit for limit function, %d cannot be less than number of priority attributes %d",
limit, len(priorityKeys),
)
}
keep := make(map[string]struct{}, len(priorityKeys))
for _, key := range priorityKeys {
keep[key] = struct{}{}
}

return func(ctx tql.TransformContext) interface{} {
val := target.Get(ctx)
if val == nil {
Expand All @@ -37,18 +48,23 @@ func Limit(target tql.GetSetter, limit int64) (tql.ExprFunc, error) {
return nil
}

updated := pcommon.NewMap()
updated.EnsureCapacity(attrs.Len())
count := int64(0)
attrs.Range(func(key string, val pcommon.Value) bool {
for _, key := range priorityKeys {
if _, ok := attrs.Get(key); ok {
count++
}
}

attrs.RemoveIf(func(key string, value pcommon.Value) bool {
if _, ok := keep[key]; ok {
return false
}
if count < limit {
val.CopyTo(updated.UpsertEmpty(key))
count++
return true
return false
}
return false
return true
})
target.Set(ctx, updated)
// TODO: Write log when limiting is performed
// https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9730
}
Expand Down
63 changes: 57 additions & 6 deletions pkg/telemetryquerylanguage/functions/tqlotel/func_limit_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ func Test_limit(t *testing.T) {
name string
target tql.GetSetter
limit int64
keep []string
want func(pcommon.Map)
}{
{
Expand All @@ -60,7 +61,7 @@ func Test_limit(t *testing.T) {
target: target,
limit: int64(0),
want: func(expectedMap pcommon.Map) {
expectedMap.Clear()
expectedMap.EnsureCapacity(input.Len())
},
},
{
Expand All @@ -85,6 +86,49 @@ func Test_limit(t *testing.T) {
expectedMap.UpsertBool("test3", true)
},
},
{
name: "keep one key",
target: target,
limit: int64(2),
keep: []string{"test3"},
want: func(expectedMap pcommon.Map) {
expectedMap.Clear()
expectedMap.InsertString("test", "hello world")
expectedMap.InsertBool("test3", true)
},
},
{
name: "keep same # of keys as limit",
target: target,
limit: int64(2),
keep: []string{"test", "test3"},
want: func(expectedMap pcommon.Map) {
expectedMap.Clear()
expectedMap.InsertString("test", "hello world")
expectedMap.InsertBool("test3", true)
},
},
{
name: "keep not existing key",
target: target,
limit: int64(1),
keep: []string{"te"},
want: func(expectedMap pcommon.Map) {
expectedMap.Clear()
expectedMap.InsertString("test", "hello world")
},
},
{
name: "keep not-/existing keys",
target: target,
limit: int64(2),
keep: []string{"te", "test3"},
want: func(expectedMap pcommon.Map) {
expectedMap.Clear()
expectedMap.InsertString("test", "hello world")
expectedMap.InsertBool("test3", true)
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
Expand All @@ -95,7 +139,7 @@ func Test_limit(t *testing.T) {
Item: scenarioMap,
}

exprFunc, _ := Limit(tt.target, tt.limit)
exprFunc, _ := Limit(tt.target, tt.limit, tt.keep)
exprFunc(ctx)
actual := ctx.GetItem()

Expand All @@ -111,18 +155,25 @@ func Test_limit_validation(t *testing.T) {
tests := []struct {
name string
target tql.GetSetter
keep []string
limit int64
}{
{
name: "limit less than zero",
target: &tql.StandardGetSetter{},
limit: int64(-1),
},
{
name: "limit less than # of keep attrs",
target: &tql.StandardGetSetter{},
keep: []string{"test", "test"},
limit: int64(1),
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
_, err := Limit(tt.target, tt.limit)
assert.Error(t, err, "invalid limit for limit function, -1 cannot be negative")
_, err := Limit(tt.target, tt.limit, tt.keep)
assert.NotNil(t, err)
})
}
}
Expand All @@ -142,7 +193,7 @@ func Test_limit_bad_input(t *testing.T) {
},
}

exprFunc, _ := Limit(target, 1)
exprFunc, _ := Limit(target, 1, []string{})
exprFunc(ctx)

assert.Equal(t, pcommon.NewValueString("not a map"), input)
Expand All @@ -162,6 +213,6 @@ func Test_limit_get_nil(t *testing.T) {
},
}

exprFunc, _ := Limit(target, 1)
exprFunc, _ := Limit(target, 1, []string{})
exprFunc(ctx)
}
4 changes: 2 additions & 2 deletions processor/transformprocessor/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ transform:
queries:
- set(metric.description, "Sum") where metric.type == "Sum"
- keep_keys(resource.attributes, "host.name")
- limit(attributes, 100)
- limit(attributes, 100, "host.name")
- truncate_all(attributes, 4096)
- truncate_all(resource.attributes, 4096)
- convert_sum_to_gauge() where metric.name == "system.processes.count"
Expand Down Expand Up @@ -163,7 +163,7 @@ The transform processor's implementation of the [Telemetry Query Language](https
- Although the TQL allows the `set` function to be used with `metric.data_type`, its implementation in the transform processor is NOOP. To modify a data type you must use a function specific to that purpose.
- [Identity Conflict](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/standard-warnings.md#identity-conflict): Transformation of metrics have the potential to affect the identity of a metric leading to an Identity Crisis. Be especially cautious when transforming metric name and when reducing/changing existing attributes. Adding new attributes is safe.
- [Orphaned Telemetry](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/standard-warnings.md#orphaned-telemetry): The processor allows you to modify `span_id`, `trace_id`, and `parent_span_id` for traces and `span_id`, and `trace_id` logs. Modifying these fields could lead to orphaned spans or logs.
- The `limit` function drops attributes at random. If there are attributes that should never be dropped then this function should not be used. [#9734](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9734)
- The `limit` function drops attributes at random. If there are attributes that should never be dropped please provide them as function arguments, e.g. `limit(attibutes, 10, "http.host", "http.method")`

[alpha]: https://github.com/open-telemetry/opentelemetry-collector#alpha
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib
16 changes: 16 additions & 0 deletions unreleased/tql-limit-func-priority-keys.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: breaking

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: pkg/telemetryquerylanguage

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add ability to specify attribute keys in the `limit` function that aren't allowed to be dropped"

# One or more tracking issues related to the change
issues: [9734]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: This breaking change affects the transform processor since we don't have a way to default values in a function call.