
Cleanup API for consistency, make it more consumable (breaking changes) #213

Merged: 5 commits into main, Jun 2, 2022

Conversation

jotak
Member

@jotak jotak commented May 31, 2022

  • More consistent naming (camelCase), cf. "There are some inconsistencies in the API" #206
  • Add JSON bindings
  • New pipeline builder to make the API model easier to consume and reuse
  • Remove garbage files (?) that prevent importing as a lib from another
    project

Summary of renamings:

prom.metrics.valuekey => prom.metrics.valueKey
prom.expirytime => prom.expiryTime
kafka.addr => kafka.address
kafka.groupbalancers => kafka.groupBalancers
kafka.startoffset => kafka.startOffset
kafka.batchreadtimeout => kafka.batchReadTimeout
grpc.buffer_length => grpc.bufferLength
network.kubeconfigpath => network.kubeConfigPath
network.servicesfile => network.servicesFile
network.protocolsfile => network.protocolsFile
aggregates.Name => aggregates.name
aggregates.By => aggregates.by
aggregates.Operation => aggregates.operation
aggregates.RecordKey => aggregates.recordKey
aggregates.TopK => aggregates.topK
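To illustrate what these renamings mean for users, here is a hypothetical config fragment (the values shown are made up; only the keys come from the list above):

```yaml
# before this PR
kafka:
  addr: kafka:9092
  groupbalancers: [range]
  startoffset: FirstOffset

# after this PR
kafka:
  address: kafka:9092
  groupBalancers: [range]
  startOffset: FirstOffset
```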

@jotak jotak requested review from eranra, KalmanMeth and ronensc May 31, 2022 13:19
@codecov-commenter

codecov-commenter commented May 31, 2022

Codecov Report

Merging #213 (c7c1dab) into main (3d41231) will increase coverage by 0.51%.
The diff coverage is 82.17%.

@@            Coverage Diff             @@
##             main     #213      +/-   ##
==========================================
+ Coverage   59.32%   59.84%   +0.51%     
==========================================
  Files          59       60       +1     
  Lines        3469     3551      +82     
==========================================
+ Hits         2058     2125      +67     
- Misses       1278     1292      +14     
- Partials      133      134       +1     
Flag Coverage Δ
unittests 59.84% <82.17%> (+0.51%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/api/encode_kafka.go 100.00% <ø> (ø)
pkg/api/encode_prom.go 100.00% <ø> (ø)
pkg/api/transform_filter.go 100.00% <ø> (ø)
pkg/api/transform_generic.go 0.00% <ø> (ø)
pkg/api/transform_network.go 100.00% <ø> (ø)
pkg/api/write_loki.go 55.55% <ø> (ø)
pkg/config/config.go 50.00% <ø> (ø)
pkg/pipeline/ingest/ingest_collector.go 52.38% <0.00%> (-0.85%) ⬇️
pkg/pipeline/transform/transform.go 100.00% <ø> (ø)
pkg/pipeline/write/write_loki.go 64.24% <33.33%> (-1.30%) ⬇️
... and 16 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3d41231...c7c1dab.

const TagDoc = "doc"
const TagEnum = "enum"
const (
FileType = "file"
Collaborator

@jotak do you see a way to move all those constants into the "enum" mechanism? Doing so would allow us to have automatic documentation of the specific options and to make sure we maintain the list going forward.

Context: I was thinking that, going forward, we will be able to totally deprecate the free text in the README and rely on docs generated from the code (using the doc tags in structs).

Member Author

Not done yet; it seems to be a bigger refactoring.
Currently only the api package is covered by doc, right? The idea would be to cover config as well? (config is like a superset of api.) Also, I'm not sure how the enum codegen works: is it run on a specific package or on all sources? Again, I ask because these types are in config and not in api.

Makefile (outdated)
@@ -102,6 +102,10 @@ clean: ## Clean
tests-unit: validate_go ## Unit tests
go test -p 1 -race -coverpkg=./... -covermode=atomic -coverprofile=/tmp/coverage.out $$(go list ./... | grep -v /e2e)

.PHONY: tests-fast
tests-fast: validate_go ## Fast unit tests (no race test / coverage)
Collaborator

Can you merge this with the tests-unit stage?

Maybe by making -race -coverpkg=./... -covermode=atomic -coverprofile=/tmp/coverage.out a variable that is set in tests-unit and not set in tests-fast?
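One possible shape for this suggestion, as a sketch rather than the actual Makefile (TEST_FLAGS is an invented variable name), using a GNU make target-specific variable that propagates to the prerequisite:

```makefile
TEST_FLAGS ?=

.PHONY: tests-unit
tests-unit: TEST_FLAGS = -race -coverpkg=./... -covermode=atomic -coverprofile=/tmp/coverage.out
tests-unit: tests-fast ## Unit tests (race detection and coverage)

.PHONY: tests-fast
tests-fast: validate_go ## Fast unit tests (no race test / coverage)
	go test -p 1 $(TEST_FLAGS) $$(go list ./... | grep -v /e2e)
```

Because target-specific variables apply to prerequisites in GNU make, `make tests-unit` runs the shared recipe with the coverage flags set, while `make tests-fast` runs it with TEST_FLAGS empty.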

Member Author

done

jotak added 2 commits May 31, 2022 17:09
- More consistent naming (camel-case) cf netobserv#206
- Add json bindings
- New pipeline builder to make it easier to consume and reuse API model
- Remove garbage files (?) that prevent importing as a lib from another
  project
Comment on lines +26 to +27
RemoveEntryIfExists string `yaml:"remove_entry_if_exists" json:"remove_entry_if_exists" doc:"removes the entry if the field exists"`
RemoveEntryIfDoesntExist string `yaml:"remove_entry_if_doesnt_exist" json:"remove_entry_if_doesnt_exist" doc:"removes the entry if the field doesnt exist"`
Collaborator

In other locations, we use camelCase for the yaml and json entries.

Member Author

So, I didn't change the enums because they are JSON values, not JSON keys. I think it's fine to keep them as they are, no?
The changes focus only on keys.
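To make the key/value distinction concrete, here is a hypothetical config fragment (the surrounding keys are illustrative, not taken from the actual schema):

```yaml
transform:
  filter:
    rules:
      - input: SrcPort                  # key: camelCase after this PR
        type: remove_entry_if_exists    # enum value: stays snake_case
```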

Comment on lines +28 to +34
ConnTracking string `yaml:"conn_tracking" json:"conn_tracking" doc:"set output field to value of parameters field only for new flows by matching template in input field"`
AddRegExIf string `yaml:"add_regex_if" json:"add_regex_if" doc:"add output field if input field satisfies regex pattern from parameters field"`
AddIf string `yaml:"add_if" json:"add_if" doc:"add output field if input field satisfies criteria from parameters field"`
AddSubnet string `yaml:"add_subnet" json:"add_subnet" doc:"add output subnet field from input field and prefix length from parameters field"`
AddLocation string `yaml:"add_location" json:"add_location" doc:"add output location fields from input"`
AddService string `yaml:"add_service" json:"add_service" doc:"add output network service field from input port and parameters protocol field"`
AddKubernetes string `yaml:"add_kubernetes" json:"add_kubernetes" doc:"add output kubernetes fields from input"`
Collaborator

Elsewhere we use camelCase for the yaml and json definitions.

Member Author

Same explanation here: it's an enum, so these names are values.

pkg/api/transform_filter.go (outdated; resolved)
}

type TransformGenericOperationEnum struct {
PreserveOriginalKeys string `yaml:"preserve_original_keys" doc:"adds new keys in addition to existing keys (default)"`
ReplaceKeys string `yaml:"replace_keys" doc:"removes all old keys and uses only the new keys"`
PreserveOriginalKeys string `yaml:"preserve_original_keys" json:"preserveOriginalKeys" doc:"adds new keys in addition to existing keys (default)"`
Collaborator

Is the different format between yaml and json intended?
The json tags in the other enums are formatted in snake_case.
Do we want to stick with camelCase also for the enum values?

Member Author

Oops, good catch on the json here: it should be the same as the yaml.
I'd stick with snake_case here. My rationale is that it's typical to enforce case on data keys, but not on data values; values can be pretty much anything, so we don't have to enforce a style on them. And currently all enums are snake_case, so they are consistent. Sounds good?

Collaborator

I initially suggested camelCase to follow k8s YAML conventions, which apply camelCase also to enum values, for example: imagePullPolicy: IfNotPresent.
Anyway, we can do whatever we want, and I'm not against snake_case.

Member Author

Hmm, in that kube example the first letter is capitalized, so we'd have PreserveOriginalKeys here. Is that what we want? (I don't have a strong opinion.)

Collaborator

Oh right. I can't believe I gave an example that contradicts my claim 🤣
So, maybe we should keep our enums as they are for now as you suggest...

"github.com/netobserv/flowlogs-pipeline/pkg/api"
"github.com/stretchr/testify/require"
)

Collaborator

Should we have a test for NewGRPCPipeline() ?

Member Author

I haven't extensively tested all possible stages, but yes, I can add one.

require.Equal(t, `{"name":"prom","encode":{"type":"prom","prom":{"metrics":[{"name":"connections_per_source_as","type":"counter","filter":{"key":"name","value":"src_as_connection_count"},"valueKey":"recent_count","labels":["by","aggregate"],"buckets":[]}],"port":9090,"prefix":"flp_","expiryTime":0}}}`, string(b))
}

func TestForkPipeline(t *testing.T) {
Collaborator

Why is this called ...Fork...? I can't find the fork 😅

Member Author

It's because both WriteLoki and WriteStdout are called on the same plFork instance, so they take the same follows stage, resulting in a fork. If I wanted to chain WriteLoki and WriteStdout (even though it doesn't make sense and would produce an error at runtime), I would have written:

  next := plFork.WriteLoki("loki", api.GetWriteLokiDefaults())
  next.WriteStdout("stdout", api.WriteStdout{})

I guess I should add some doc

Collaborator

I see. thanks!

func (b *PipelineBuilder) next(name string, param StageParam) PipelineBuilder {
b.pipeline.stages = append(b.pipeline.stages, Stage{Name: name, Follows: b.lastStage})
b.pipeline.config = append(b.pipeline.config, param)
return PipelineBuilder{pipeline: b.pipeline, lastStage: name}
Collaborator

Should we create a new PipelineBuilder or return b?

Member Author

Returning a new PipelineBuilder allows the caller either to chain the next stage (using the new PipelineBuilder) or to create a fork by continuing to use the old one.
I can rename it to PipelineBuilderStage if that helps clarify.
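The chain-vs-fork semantics discussed here can be reduced to a self-contained toy (hypothetical, simplified names; the real builder holds StageParam config as well): next() appends a stage that follows lastStage and returns a new builder value, so chaining on the returned value extends the pipeline, while reusing the old value forks it.

```go
package main

import "fmt"

// stage records which stage each pipeline step follows.
type stage struct{ name, follows string }

// pipelineBuilder is a toy model of the builder under review:
// it shares the stage list by pointer but carries its own lastStage.
type pipelineBuilder struct {
	stages    *[]stage
	lastStage string
}

// next appends a stage following lastStage and returns a NEW builder
// positioned at that stage, leaving the receiver usable for forking.
func (b pipelineBuilder) next(name string) pipelineBuilder {
	*b.stages = append(*b.stages, stage{name: name, follows: b.lastStage})
	return pipelineBuilder{stages: b.stages, lastStage: name}
}

func build() []stage {
	var stages []stage
	root := pipelineBuilder{stages: &stages, lastStage: "ingest"}
	root.next("transform").next("loki") // chain: loki follows transform
	root.next("stdout")                 // fork: stdout also follows ingest
	return stages
}

func main() {
	for _, s := range build() {
		fmt.Printf("%s follows %s\n", s.name, s.follows)
	}
}
```

Running this prints that transform and stdout both follow ingest (the fork), while loki follows transform (the chain).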

@jotak
Member Author

jotak commented Jun 1, 2022

@ronensc comments addressed, some documentation added
