Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate AWS target group hash to rollout metadata #1608

Closed
harikrongali opened this issue Oct 22, 2021 · 11 comments
Closed

Propagate AWS target group hash to rollout metadata #1608

harikrongali opened this issue Oct 22, 2021 · 11 comments
Labels
duplicate This issue or pull request already exists enhancement New feature or request
Milestone

Comments

@harikrongali
Copy link
Contributor

Summary

What change needs making?
Similar to canary/stable hash values, propagate AWS target group hashes for canary & stable
Will be available only if --aws-verify-target-group enabled

Use Cases

When would you use this?


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@harikrongali harikrongali added the enhancement New feature or request label Oct 22, 2021
@harikrongali harikrongali added this to the v1.2 milestone Oct 22, 2021
@harikrongali
Copy link
Contributor Author

harikrongali commented Oct 27, 2021

Similar to PodTemplateHashValue that can be referred in analysis templates, requirement is to refer targetGroup values for services.
For ex:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-count
spec:
  args:
  - name: targetgroup
  metrics:
    - name: error-count
      initialDelay: 1m
      interval: 1m
      count: 3
      failureCondition: result > 50
      provider:
        wavefront:
          address: intuit.wavefront.com
          query: >
            sum(default(0, ts(aws.alb.httpcode_target_5xx_count, targetgroup={{args.canary-targetgroup}})))
---
kind: Rollout
apiVersion: argoproj.io/v1alpha1
metadata:
  name: rollout
spec:
  # Set replicas to null when enabling HPA, sample: 'replicas: null'
  replicas: 2
  strategy:
    canary:
      canaryService: desired-service
      stableService: stable-service
      canaryMetadata:
        annotations:
          role: canary
        labels:
          role: canary
      stableMetadata:
        annotations:
          role: stable
        labels:
          role: stable
      steps:
        - setCanaryScale:
            replicas: 1
        - pause: { }
        - setWeight: 5
        - setCanaryScale:
            matchTrafficWeight: true
      - analysis:
          templates:
          - templateName: error-count
          args:
          - name: stable-targetgroup
            valueFrom:
              targetGroupValue: StableService
          - name: canary-targetgroup
            valueFrom:
              targetGroupValue: CanaryService

2 features to add.
First one is to populate services metadata labels with targetgroup values similar to rollouts-pod-template-hash
Second one is to add support ValueFrom targetGroupValue which is similar to #797

@harikrongali
Copy link
Contributor Author

harikrongali commented Oct 27, 2021

@jessesuen let me know if the above make sense or if there is a better way to do?

@noam-codefresh
Copy link
Contributor

Hello.
I am trying to work on this request, and here is what I have currently made so far:
In the awsVerifyTargetGroups func, I added some code that adds the targetGroupARN as a label on the service. However, so far I keep getting a !verifyRes.Verified response from aws.VerifyTargetGroupBinding
When I drilled into it, I saw that the service's endpoints Subsets.Address have the internal IPs of the matching Pod(s), while the code is later comparing them to the targetStr which is basically the EC2 instance Id that contains the node (If I understand correctly). Is my current setup or deployment somehow messed up? Should I do anything differently to get a Verified response?

Also - do you expect the full ARN to be persisted as a label on the service? Or just the unique part of it? And what should I do when there are several TargetGroupBindings on the same service? Is this even a valid scenario?

@harikrongali
Copy link
Contributor Author

you dont need full ARN, just the id is enough, for example arn:aws:elasticloadbalancing:us-east-2:795188202216:targetgroup/k8s-testargo-albrollo-2249f2dae0/7010986ede9433b1
--> k8s-testargo-albrollo-2249f2dae0 is enough

can you post the ingress and rollout manifest files?

@noam-codefresh
Copy link
Contributor

Sure. I was mostly using the ALB getting started guide.
Here is my ingress:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rollouts-demo-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: rollouts-demo-root
          servicePort: use-annotation

and Rollout:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 1
  strategy:
    canary:
      canaryService: rollouts-demo-canary
      stableService: rollouts-demo-stable
      trafficRouting:
        alb:
          ingress: rollouts-demo-ingress
          servicePort: 80
          rootService: rollouts-demo-root
      steps:
        - setWeight: 5
        - pause: {}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
        - name: rollouts-demo
          image: argoproj/rollouts-demo:blue
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          resources:
            requests:
              memory: 32Mi
              cpu: 5m

I installed the ALB resources and controller from v2.3, and after some fiddling got everything to work with internet-facing ingress (I had to manually set the availability zones of the ALB to use the public subnets, instead of the private ones).

@noam-codefresh
Copy link
Contributor

oh, I think my problem is related to the tgb.Spec.TargetType being instance and not ip. I made a little hack to get it to pass some condition in the code (for debugging purposes, as a start), but of course, there are consequences to doing such hacks.
Is this feature meant to work with instance type as well? if so, I will probably need to add some resolving from the endpoints' IP addresses to the instance ids, in order to make the match.
If it's supposed to work only on ip - how can I easily make the required changes to my simple setup?

@noam-codefresh
Copy link
Contributor

For the 2nd side of the task - I've been looking at adding the stable and canary (or active/preview) to the BuildArgumentsForRolloutAnalysisRun func. But it will require getting those two services from the rollout (there's a blueGreen implementation, I'll need to add a similar canary implementation as well - and wrap in a generic one that just returns both services maybe).
Maybe another option is to just put the targetGroupID values on the ReplicaSets directly, and then the BuildArgumentsForRolloutAnalysisRun func can keep its signature.

Also - do you expect the optional values of targetGroupValue to be StableService, CanaryService, or also allow ActiveService and PreviewService (depending on whether its blueGreen or canary strategy)?

@jessesuen
Copy link
Member

jessesuen commented Nov 2, 2021

@jessesuen let me know if the above make sense or if there is a better way to do?

Instead of this:

            valueFrom:
              targetGroupValue: StableService
          - name: canary-targetgroup
            valueFrom:
              targetGroupValue: CanaryService

When we spoke about this a while back, I was originally thinking we could have valueFrom do this in a more generic way, referencing field metadata in the status. This way it would not be specific to AWS. Also, it would allow the work items could be split into two tasks:

  1. populate AWS metadata into status (for debugging and/or to be passed to analysis)
  2. support supplying status fields to analysis. To make this easier, the feature could be allowing JSON strings to analysis and allowing analysis to use dot-notation referencing if the string is a json object.

@harikrongali
Copy link
Contributor Author

Thanks for the comment @jessesuen . I think this makes sense to have a more generic option without sticking to AWS.
So, it will be populating AWS metadata (targetGroup information) to rollout Status and then Analysis objects should be able to consume json objects. And values will be passed in using ValueFrom reference.

@noam-codefresh let me know above comment make sense.

@jessesuen
Copy link
Member

We have two issues for this already:

  1. Surface ALB information into rollout status #1241
  2. Allow analysis arguments to get valueFrom Rollout status #1242

I think we should dup this one

@harikrongali
Copy link
Contributor Author

@noam-codefresh I will close this one. Please comment on #1241 and I will assign it to you.

@harikrongali harikrongali added the duplicate This issue or pull request already exists label Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants