Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor datadog hostname/port env vars #9466

Closed

Conversation

chazdnato
Copy link

Summary

This addresses an issue reported here:
#7954

While the tests set a null value to instead inherit the environment variables, this does not work in a Kubernetes context. For example, if I have this config:

config:
  host: null
  port: 8125

The hostname still resolves to localhost once it gets through the schema.lua defaults.

This approach reverses the order of ENV var evaluation, so that environment variables are used if set, and config if not.

Full changelog

  • reverses order of logic determining if env vars or config is used for host/port
  • adds test to ensure config is used if env var is not set

Issue reference

Fix #7954

This addresses an issue reported here:
Kong#7954

While the tests set a null value to instead inherit the environment
variables, this does not work in a Kubernetes context. For example, if I
have this config:

```
config:
  host: null
  port: 8125
```

The hostname still resolves to `localhost` once it gets through the
schema.lua defaults.

This approach reverses the order of ENV var evaluation, so that
environment variables are used if set, and config if not.
@chazdnato chazdnato requested a review from a team as a code owner September 23, 2022 16:51
@Tieske
Copy link
Member

Tieske commented Sep 26, 2022

This changes behaviour, so is a breaking change.

@Tieske
Copy link
Member

Tieske commented Sep 26, 2022

the proper way to specify a null value in yaml is to leave the line blank, like this:

config:
  host: 
  port: 8125

@chazdnato
Copy link
Author

chazdnato commented Sep 26, 2022

the proper way to specify a null value in yaml is to leave the line blank, like this:

config:
  host: 
  port: 8125

Even with that setup, the plugin renders it as localhost. Here are my configs (this is using the helm charts). Helm will interpret any null as "don't put that key in place":

If you need to delete a key from the default values, you may override the value of the key to be null, in which case Helm will remove the key from the overridden values merge. [source] source

Helm chart:

apiVersion: configuration.konghq.com/v1
kind: KongClusterPlugin
metadata:
  name: global-datadog
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: "true"
config:
  host:
  port: 8125
  metrics:
  prefix: api-gateway
  service_name_tag: kong_upstream
plugin: datadog

Datadog config as rendered by k8s

❯ kc get kongclusterplugin global-datadog -o yaml
apiVersion: configuration.konghq.com/v1
config:
  host: null
  port: 8125
  prefix: api-gateway
  service_name_tag: kong_upstream
kind: KongClusterPlugin
plugin: datadog

And thus the Datadog config as interpreted by kong

{
  "consumer": null,
  "protocols": [
    "http",
    "https"
  ],
  "enabled": true,
  "route": null,
  "config": {
    "consumer_tag": "consumer",
    "status_tag": "status",
    "host": "localhost",
    "prefix": "api-gateway",
    "service_name_tag": "kong_upstream",
    "port": 8125
  },
  "name": "datadog",
  "id": "0524254d-9a97-5803-b2b3-dfd1233ec721",
  "created_at": 1664195179,
  "service": null,
  "tags": null
}

My solution as prescribed should leave those using ENV vars in non-k8s context still working as expected, and those using configs still working as expected. The only anticipated change in behaviour could be if someone has both env vars and config setup, and they expect their config to be used.

I'm open to other suggested approaches.

@Tieske
Copy link
Member

Tieske commented Sep 26, 2022

Even with that setup, the plugin renders it as localhost. Here are my configs (this is using the helm charts). Helm will interpret any null as "don't put that key in place":

If that is the case then this is a major issue. Why doesn't helm/k8s stick to the JSON semantics???? There are many places where the Kong admin API relies on a JSON-null value to be specified by the user to clear a configuration item, or a default value.

@chazdnato
Copy link
Author

chazdnato commented Sep 26, 2022

Even with that setup, the plugin renders it as localhost. Here are my configs (this is using the helm charts). Helm will interpret any null as "don't put that key in place":

If that is the case then this is a major issue. Why doesn't helm/k8s stick to the JSON semantics???? There are many places where the Kong admin API relies on a JSON-null value to be specified by the user to clear a configuration item, or a default value.

I would presume because it's YAML underneath, and not JSON. Be that as it may, the issue we're trying to fix issue number 7954 has some k8s specific work arounds, that aren't viable if one is unable to use k8s 1.22+ for whatever reason.

@Tieske
Copy link
Member

Tieske commented Sep 26, 2022

I would presume because it's YAML underneath, and not JSON.

Nope, the example before is the YAML way of expressing a JSON-null;

config:
  host: 
  port: 8125

@chazdnato
Copy link
Author

Would you be able to detail the scenarios under which this may be a breaking change? Do you have alternative solutions to the approach suggested?

@chazdnato
Copy link
Author

I would presume because it's YAML underneath, and not JSON.

Nope, the example before is the YAML way of expressing a JSON-null;

config:
  host: 
  port: 8125

Both are valid: https://yaml.org/spec/1.2.2/#10211-null

However, this is not germane, and unproductive.

@Tieske
Copy link
Member

Tieske commented Sep 27, 2022

Would you be able to detail the scenarios under which this may be a breaking change? Do you have alternative solutions to the approach suggested?

if both plugin config and envvars are set, it previously used plugin config, and will now pick up envvars. That's a breaking change.

@chazdnato
Copy link
Author

Would you be able to detail the scenarios under which this may be a breaking change? Do you have alternative solutions to the approach suggested?

if both plugin config and envvars are set, it previously used plugin config, and will now pick up envvars. That's a breaking change.

Would you accept a configuration parameter that would default the current logic, but if set apply the logic I've implemented?

@chazdnato
Copy link
Author

It appears this approach has been suggested in the past: #8106 (comment)

Have replied in that thread, with hope we can get a good direction going with this.

@kikito
Copy link
Member

kikito commented Oct 3, 2022

In my opinion there is two different problems at play here:

  1. Helm/K8s (it is unclear which one is at doing this) treat "a field that isn't set" as equivalent to "a field that is set to null", while Kong treats those differently in some cases (null means "don't use the default" in Kong)
  2. Kong schemas have no functionality for reading environment variables. Every time this is needed this is done after the schema has already finished doing its work. This issue (and fix: Remove the default host and port from datadog plugin schema #8106) result from the same missing functionality.

My personal opinion is that for 1, Helm/K8s is interpreting the YAML spec wrong. Admittedly, the spec mentions that:

Nodes with empty are interpreted as if they were plain scalars with an empty value. Such nodes are commonly resolved to a “null” value.

But commonly resolved is not the same as should always be resolved, which is what seems to be happening. In fact, on the description about null, this is addressed specifically.

Represents the lack of a value. This is typically bound to a native null-like value (e.g., undef in Perl, None in Python). Note that a null is different from an empty string. Also, a mapping entry with some key and a null value is valid and different from not having that key in the mapping.

Emphasis mine.

I think Helm/K8s is at fault; they should not be adding nulls to people's yaml files.

As for 2: Ideally we should have a way to handle environment variables inside schemas. At least we should have the option of making the default a piece of Lua code that can go and grab that:

{ host = typedefs.host({ default = function() return os.getenv("DATADOG_HOST") or "localhost" end }), }

Or even add a new exclusive schema field, just for this purpose:

{ host = typedefs.host({ default_env_var = "DATADOG_HOST", default = "localhost" }), }

I don't know which option is better yet, but it is clear to me that this shouldn't be resolved by individual plugins every time they to read env variables for default values. The change needs to happen on the schema level, and then all plugins will be able to use it.

@chazdnato
Copy link
Author

chazdnato commented Oct 3, 2022

  1. Helm/K8s (it is unclear which one is at doing this) treat "a field that isn't set" as equivalent to "a field that is set to null", while Kong treats those differently in some cases (null means "don't use the default" in Kong)

I think Helm/K8s is at fault; they should not be adding nulls to people's yaml files.

I did some digging, and I think it might be the kubernetes-ingress-controller (on top of what k8s does). I can use helm/k8s to create an object that, when empty in the config, ends up with a null value in k8s [agreed this isn't optimal, but it's better than removing the key!]:

> kubectl get kongclusterplugin global-datadog -o yaml
apiVersion: configuration.konghq.com/v1
config:
  host: null
    - app:kong
  port: 8125

This is the same if I look at the object as JSON. Note that if I edit the value to be non-null, then re-edit to be blank, the key is removed entirely. This sounds like a k8s-ism, though helm may be converting empty keys to null at first.

This said, when pulling this data to post to kong, the kubernetes-ingress-controller will omitempty:

https://github.com/Kong/kubernetes-ingress-controller/blob/963623d02453d49f284bcef4dfd078cf76b684d5/pkg/apis/configuration/v1/kongclusterplugin_types.go#L51

This means that, even if k8s did maintain an empty value (or null), it would remove that key from the parsed config anyway - this is likely where the problem is actually happening.

While the kong world may say null == not default, this sadly doesn't translate to a k8s world.

@bungle
Copy link
Member

bungle commented Oct 5, 2022

Kong now has Secret Management built in. So there is another option:

  1. mark a string field as referenceable=true
  2. use reference like {vault://env/datadog-host}

@chazdnato
Copy link
Author

Kong now has Secret Management built in. So there is another option:

  1. mark a string field as referenceable=true
  2. use reference like {vault://env/datadog-host}

Can you provide guidance (or documentation?) on how this would work in a k8s context? This might be a viable approach!

The workaround suggested here is also viable if one is able to upgrade to k8s 1.21+ - I think this is the approach we're going to have to take.

Is it worthwhile doing a documentation update on these behaviours with plugins that can have ENV vars, where there is an incompatibility with how k8s interprets blank/null values?

@chazdnato
Copy link
Author

I have found that this behaviour is actually (also) enforced by Kubernetes Ingress Controller. The addition of this goes pretty far back, but there isn't any description in the commit/PR as to why this particular method was chosen or what specific problem(s) it was trying to solve:

https://github.com/Kong/kubernetes-ingress-controller/blob/649298f33179a2a8e6dfe5942d466c58f7e3cdf0/internal/dataplane/sendconfig/sendconfig.go#L177

Copy link
Contributor

@StarlightIbuki StarlightIbuki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not change the precedence for kong.conf and the environment variable. I've suggested the other solution.

I assume your use case to be that leave host and port not set, and provide them via environment variable,


function statsd_mt:new(conf)
local sock = udp()
local host = conf.host or env_datadog_agent_host
local port = conf.port or env_datadog_agent_port
local host = env_datadog_agent_host or conf.host
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the environment variable takes precedence over kong.conf, which has different behavior compared to what this originally intended.

Maybe we could remove the default value for kong.conf, and:

  1. check if conf.host is set; use that if so;
  2. check if 'KONG_DATADOG_AGENT_HOST' env is set; use that if so;
  3. use the default value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check

{ host = typedefs.host({ default = "localhost" }), },

removing those default values will get us a nil when the option is omitted.

local host = conf.host or env_datadog_agent_host
local port = conf.port or env_datadog_agent_port
local host = env_datadog_agent_host or conf.host
local port = tonumber(env_datadog_agent_port or conf.port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
local port = tonumber(env_datadog_agent_port or conf.port)
local port = tonumber(env_datadog_agent_port) or conf.port

Comment on lines +157 to +158
host = "no-such-config.com", -- plugin takes above env var value, if env var is set
port = 8125, -- plugin takes above env var value, if env var is set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means env takes precedence over kong.conf.

@jaswanthikolla
Copy link

We currently have dead code in the plugin where conf.host or env_datadog_agent_host the later part of the value is never evaluated.

@StarlightIbuki
It's fine to push back on this PR with different solution. But, we should fix the apparent bug right? Once we do that, alternative solution can be done.

@fffonion
Copy link
Contributor

fffonion commented Feb 27, 2023

@jaswanthikolla this is not dead code, conf.KEY will only be nil if user explictly set it using the method we describe above.

@chazdnato As of current state, i would suggest use secrets management (env var: https://docs.konghq.com/gateway/latest/kong-enterprise/secrets-management/getting-started/) as a workaround; please also open an issue at https://github.com/Kong/kubernetes-ingress-controller to actually fix the default value issue.

@jaswanthikolla
Copy link

jaswanthikolla commented Feb 27, 2023

@fffonion

conf.host will never be nil because we have default value with { host = typedefs.host({ default = "localhost" }), },

So, the later part of the or condition will never evaluate, hence the dead code.

@fffonion
Copy link
Contributor

@jaswanthikolla It will, you can verify simply with curl localhost:8001/plugins -d name=datadog -d config.host=. Default values are only taken if user doesn't configure it; doesn't configure it vs leave it as empty is two different intents. The logic here is not a bug, the user experience is what needs improvement.
I understand the emotion that you might have when encounted an expected behaviour, but let's also interprete behaviour code with tests to avoid misleading reviewers or the original author.

@jaswanthikolla
Copy link

jaswanthikolla commented Feb 28, 2023

Default values are only taken if user doesn't configure it; doesn't configure it vs leave it as empty is two different intents.

Isn't that more confusing and against the mainstream? Let's go through it ( actually what you proposed doesn't work more on that later)

  • If I don't configure it, I will get default value
  • If I leave empty, I will get environment value.
  • Isn't very easy for different systems to make mistake such as leaving empty/nil, they inject default value? We're depending on this subtle difference which is already broken, and explained in the following.

Bug:

you can verify simply with curl localhost:8001/plugins -d name=datadog -d config.host=.

Sample Plugin config :

 fields = {
          { host = typedefs.host({ default = "localhost" }), },
  1. I tried configuring with empty with following, and I get error {plugins={[2]={config={host=\\\"length must be at least 1\\\"}}}}\")
config:
  host: ""
  1. I configured with null and I get default value. config: \n host: null
ngx_log(ngx_ERR, _log_prefix, " ----- HOST DEFAULT VAL:  " .. conf.host)
prints 
----- HOST DEFAULT VAL:  localhost

This bug is just a demonstration how fickle the contract of leaving empty vs configuring it. I think we should rather fix it by use_env or just remove the default configs of the datadog plugin.

@hbagdi I've raised another PR as I can't push to this PR.

@fffonion
Copy link
Contributor

fffonion commented Mar 2, 2023

@jaswanthikolla My previous comment was simply to point out the logic of code, that it's not a bug and code is executed correctly as designed. I do agree on the bad user experience here. IMO the proper way of solving this is not to going back and forth on designing how a null value should be handled in kong, but rather to provide a proper way for user to configure plugins and other kong entities. And the answer to this as @bungle has proposed:

Kong now has Secret Management built in. So there is another option:

- mark a string field as referenceable=true
- use reference like {vault://env/datadog-host}

@hbagdi
Copy link
Member

hbagdi commented Mar 2, 2023

@jaswanthikolla Apologies for misleading you and others.
The solution that @fffonion has proposed above is indeed the "right" solution so long as Kong is concerned and it also the easiest one to implement, maintain and test.

The way this will look to an operator is:

config:
  host: "{vault://env/DATADOG_HOST}"

The end result should be the same and this should provide a much better user experience to all the users.
Is that acceptable to you?

@chazdnato
Copy link
Author

chazdnato commented Mar 3, 2023

The way this will look to an operator is:


config:

  host: "{vault://env/DATADOG_HOST}"

The end result should be the same and this should provide a much better user experience to all the users.

Is that acceptable to you?

Will this work with non-enterprise Kong, and which version do we need to run to get this feature?

Does this work with the kubernetes-ingress-controller (which version)?

Thanks!

@hbagdi
Copy link
Member

hbagdi commented Mar 3, 2023

Will this work with non-enterprise Kong, and which version do we need to run to get this feature?

Absolutely. Environment variable based secrets are supported in open-source Kong.

Does this work with the kubernetes-ingress-controller (which version)?

The version of the ingress controller shouldn't matter but the Kong gateway version will matter (needs to be latest).

@hbagdi
Copy link
Member

hbagdi commented Mar 8, 2023

A gentle ping to move this forward @chazdnato @jaswanthikolla.
We would like to fix this problem for OpenTelemetery and Datadog plugins.
I'm not aware of any other plugins that face this problem but if there are, then we should consider including them as well.

@chazdnato
Copy link
Author

A gentle ping to move this forward @chazdnato @jaswanthikolla. We would like to fix this problem for OpenTelemetery and Datadog plugins. I'm not aware of any other plugins that face this problem but if there are, then we should consider including them as well.

I am content to close out this investigation, as the [workaround suggested previously] (#7954 (comment)) is fine. I would like to test this on Kong 3.x, but we're not upgraded yet. The only documentation I can find for vault://env/ usage in the Enterprise version of Kong. Will this be updated?

Thank you for looping back!

@hbagdi
Copy link
Member

hbagdi commented Mar 9, 2023

I am content to close out this investigation, as the [workaround suggested previously] (#7954 (comment)) is fine. I would like to test this on Kong 3.x, but we're not upgraded yet. The only documentation I can find for vault://env/ usage in the Enterprise version of Kong. Will this be updated?

I think there is some confusion.
The approach mentioned above is something can be implemented. It doesn't actually work this way today and there is some work to be done. Would be you be interested in submitting a patch for it?
It involves adding referenceable = true to the appropriate field schemas.

@jaswanthikolla
Copy link

jaswanthikolla commented Mar 13, 2023

@hbagdi @fffonion Sorry, I got confused that you are referring to k8s secrets. Yep, it's going to work.

It involves adding referenceable = true to the appropriate field schemas.

As you said, we need this PR#10398 to make it work. But, this secrets solution doesn't work for #10383

@chazdnato
Copy link
Author

Given the solutions discussed and the way forward, I am closing this in favour of #10484

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to set datadog hostname or port with environment variables
8 participants