This repository has been archived by the owner on Oct 27, 2023. It is now read-only.

Add tool code #1

Merged
merged 34 commits into from
Jan 22, 2019

@corest corest commented Jan 17, 2019

@corest corest self-assigned this Jan 17, 2019
@corest corest requested review from a team January 17, 2019 17:34
response, _ = json.Marshal(healthCheckResponse{Status: "webhook request received"})
writeJSONResponse(w, http.StatusOK, response)

if h.Event.Ref == masterRef && stringInSlice(h.Event.Repository.Name, o.repositories) {
Contributor Author

rule is created only for merges into master and for configured repositories
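
A minimal sketch of that filter. The value of `masterRef` and the helper `shouldCreateRule` are assumptions for illustration; the diff only shows the identifiers `masterRef`, `stringInSlice` and `o.repositories`.

```go
package main

import "fmt"

// masterRef is an assumed value; GitHub push events report the master
// branch as "refs/heads/master".
const masterRef = "refs/heads/master"

// stringInSlice reports whether s is present in list.
func stringInSlice(s string, list []string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}

// shouldCreateRule (hypothetical name) mirrors the condition in the diff:
// only pushes to master of a configured repository qualify.
func shouldCreateRule(ref, repo string, repositories []string) bool {
	return ref == masterRef && stringInSlice(repo, repositories)
}

func main() {
	repos := []string{"aws-operator", "test-oncall"}
	fmt.Println(shouldCreateRule("refs/heads/master", "test-oncall", repos))
	fmt.Println(shouldCreateRule("refs/heads/feature", "test-oncall", repos))
}
```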

},
}

ttl := time.Now().Add(routingRuleTTL).UTC().Unix()
Contributor Author

default ttl for automatically created rules is 1h

return microerror.Maskf(userNotFoundError, event.Pusher.Name)
}
routingRule := &opsgenie.RoutingRule{
Name: fmt.Sprintf("autooncall-%s-%s", event.HeadCommit.ID[:5], strconv.FormatInt(ttl, 10)),
Contributor Author

The rule name is prefixed with autooncall, followed by the first 5 characters of the head commit ID and the TTL timestamp.
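
A small sketch of how the name and TTL fit together. `routingRuleTTL` is assumed to be one hour, per the comment above; the helpers `expiry` and `ruleName` are hypothetical, and the length guard is an addition that the diff does not show.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// routingRuleTTL is assumed to be 1h, as stated in the review comment.
const routingRuleTTL = time.Hour

// expiry returns the Unix timestamp at which a rule created at now expires.
func expiry(now time.Time) int64 {
	return now.Add(routingRuleTTL).UTC().Unix()
}

// ruleName builds "autooncall-<first 5 chars of commit ID>-<expiry>".
// Commit IDs shorter than 5 characters are used as-is instead of panicking.
func ruleName(commitID string, expiresAt int64) string {
	short := commitID
	if len(short) > 5 {
		short = short[:5]
	}
	return fmt.Sprintf("autooncall-%s-%s", short, strconv.FormatInt(expiresAt, 10))
}

func main() {
	fmt.Println(ruleName("1e93fa1c0ffee", expiry(time.Now())))
}
```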


conditions := []opsgenie.Rule{
opsgenie.Rule{
Value: event.Repository.Name,
Contributor Author

The conditions list includes only the repository name for now. That can be extended later.

Contributor Author

corest commented Jan 17, 2019

The helm chart will be added in a separate PR.

Contributor

@MarcelMue MarcelMue left a comment

I assume that you didn't want to use microkit because of the overhead? Would be cool to align this a bit more, also with some health endpoint maybe and then deploy it to ginger or something.

Nice work overall!

LICENSE Outdated
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright 2016-2017 Giant Swarm GmbH
Contributor

Should be something else, I think 2016-2019? Not sure right now.

main.go Outdated
var help = flag.Bool("help", false, "show help for this tool")

func main() {
flag.Set("logtostderr", "true")
Contributor

You could just use the viper / cobra standard magic boilerplate here. Could save you some hassle later.

Contributor Author

As there is only one flag I really need, defining flag structure for viper wasn't a time-saver

main.go Outdated
panic(fmt.Sprintf("%#v", err))
}

var config endpoint.Config
Contributor

It would also be neat to have a /version endpoint which we can use to check if the project is coming up with architect and so on.

Contributor Author

Frankly speaking, I didn't even plan to set up CI for this tool. Just build once, push to quay and run.

Member

Please set up CI and CD for this. It doesn't make sense to handle projects differently.

Contributor Author

I'm fine with CI, but not sure about CD. It is installed as an application into a tenant cluster, so imo docs in the wiki for the support process, with a configuration example, should be enough. I don't want to make it a part of the control-plane deployment.

Member

docs and website are also deployed automatically. imo CD is a must.

Contributor Author

corest commented Jan 17, 2019

I've deployed it into our production site cluster and configured a webhook handler for the repository https://github.com/giantswarm/test-oncall. You can play with it by merging your PRs into that repository and checking the routing rules afterwards.

Contributor Author

corest commented Jan 18, 2019

For health I've added a default handler, as it is also required for letsencrypt to validate certs.

@kopiczko kopiczko left a comment

First batch. I focused mostly on the Hook struct.

//
// Implements validation described in github's documentation:
// https://developer.github.com/webhooks/securing/
func (h *Hook) signedBy(secret []byte) bool {

This shouldn't be a method. This should be a function. We should avoid adding methods to pure data types. Exceptions may be things like String or MarshalToJson.
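
A sketch of what the function form could look like, assuming the check follows GitHub's documented scheme of an HMAC-SHA1 hex digest sent as `sha1=<hex>` in the `X-Hub-Signature` header. The fields of `Hook` are not visible in this thread, so the payload and signature are passed in directly; `sign` is a hypothetical helper used only for the demo.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// signedBy reports whether signature ("sha1=<hex digest>") is a valid
// HMAC-SHA1 of payload under secret. It is a plain function, so the
// data type carrying the payload stays free of behaviour.
func signedBy(payload []byte, signature string, secret []byte) bool {
	const prefix = "sha1="
	if !strings.HasPrefix(signature, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(signature, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha1.New, secret)
	mac.Write(payload)
	// hmac.Equal compares in constant time.
	return hmac.Equal(got, mac.Sum(nil))
}

// sign (hypothetical, demo only) produces the header value a sender would use.
func sign(payload, secret []byte) string {
	mac := hmac.New(sha1.New, secret)
	mac.Write(payload)
	return "sha1=" + hex.EncodeToString(mac.Sum(nil))
}

func main() {
	payload := []byte(`{"ref":"refs/heads/master"}`)
	secret := []byte("s3cret")
	fmt.Println(signedBy(payload, sign(payload, secret), secret))
}
```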

}

// New reads a Hook from an incoming HTTP Request.
func New(req *http.Request, secret []byte) (hook *Hook, err error) {

This should not return a pointer unless memory is a concern (which it isn't).

Contributor Author

corest commented Jan 18, 2019

As I understood it, @xh3b4sd meant using microkit/microserver with the server/service structure, e.g. following the api repo. I just didn’t want to bring in all that complexity, which I don’t need in this project. But for the sake of standardization, I’m fine with rewriting it using the general approach. Overall I didn’t want to invest that much time here, but it’s also true that it will be hard to maintain if we need to update something.

Thx for review points. I’ll refactor it following all the suggestions

Contributor

xh3b4sd commented Jan 18, 2019

Sounds good. Let me know when you need anything.

response.StatusCode = http.StatusOK
}

go e.Service.Webhook.Process(h)
Contributor Author

I'm not sure about this approach. I didn't find an example of async stuff. After receiving a hook I don't want to wait for opsgenie, and GitHub shouldn't care whether the routing rule was created. It just needs confirmation that the webhook was delivered.

Contributor

To me it looks like we should rather implement an operator. We either process a request synchronously or create a CR in the first place and let it reconcile. Turning the request here into some background job will cause magic failures even though responses from the endpoint were successful. Then users try to do it again because nothing happened, and then the system is in an inconsistent state.

What is the desired workflow this new service should be part of? Can you describe that or provide a simple diagram? We can maybe also hang and talk a bit to sort this out.

Contributor Author

Operator? No. That would be too much for this.
Deployment event happens -> github sends webhook to auto-oncall -> app dispatches payload and responds to github it received valid webhook -> async process creates routingrule

Contributor

I don't see why the processing has to be split out of the sync request. I think it is just fine then to remove the goroutine and add some backoff to be a bit more safe.

Contributor Author

@corest corest Jan 21, 2019

Processing is separated from the sync, because the success of creating a routing rule is not a part of the response to github about webhook state. Github cares about the delivery of the webhook with payload, not about internal processing. E.g., github shouldn't get error about webhook delivery if opsgenie is down. At least that's how I see it

Contributor

That makes sense. Then let's go with this and iterate on it. I think for the future it would make sense to create a CR synchronously and have an operator reconcile it accordingly. That way all logic is more or less guaranteed. Right now it is not: a failed attempt fails once and is forgotten.
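
The compromise discussed in this thread (acknowledge the webhook right away, process in the background, add some backoff) could be sketched roughly as follows. All names here (`retry`, `newHandler`, `deliver`, `errUnavailable`) and the retry settings are hypothetical; `httptest` and the `WaitGroup` are only there to make the sketch runnable and observable.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// Assumed retry settings, chosen only for the demo.
const retryAttempts = 3

var retryDelay = 10 * time.Millisecond

var errUnavailable = errors.New("opsgenie unavailable")

// retry calls fn up to attempts times, doubling the delay between tries.
func retry(attempts int, delay time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

// newHandler acknowledges the delivery immediately and runs process in a
// goroutine, so a failing downstream never turns into a failed delivery.
func newHandler(process func() error, wg *sync.WaitGroup) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := retry(retryAttempts, retryDelay, process); err != nil {
				log.Printf("creating routing rule failed: %v", err)
			}
		}()
		w.WriteHeader(http.StatusOK)
	}
}

// deliver simulates one webhook delivery and waits for the background work.
func deliver(process func() error) int {
	var wg sync.WaitGroup
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/webhook", nil)
	newHandler(process, &wg).ServeHTTP(rec, req)
	wg.Wait()
	return rec.Code
}

func main() {
	calls := 0
	code := deliver(func() error {
		calls++
		if calls < 2 {
			return errUnavailable
		}
		return nil
	})
	fmt.Println(code, calls)
}
```

The sender sees a successful delivery regardless of the outcome, so a rule that still fails after the last retry is only visible in the logs, which is the limitation noted above.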

Contributor Author

👍
all other review points have been addressed.

Contributor Author

corest commented Jan 18, 2019

Refactored a bit. Now I couldn't tell what this project is for. But the structure was followed :D

Contributor Author

corest commented Jan 19, 2019

I also changed the logic.
Now deployment events are used instead of push events.
If the deployment event was created by taylorbot, the app gets the author of the referenced commit and creates a routing rule for them. If you used your own GitHub API key and created the deployment event with opsctl, the routing rule is automatically created for your deployment.
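
The selection described above reduces to a small decision. This is a sketch under assumptions: `botLogin` and `responsibleUser` are hypothetical names, and the real code presumably reads these values from the deployment event payload and the GitHub API.

```go
package main

import "fmt"

// botLogin is the assumed GitHub login of the deployment bot.
const botLogin = "taylorbot"

// responsibleUser returns the GitHub login that should receive the routing
// rule: the commit author for bot-created deployments, otherwise whoever
// created the deployment event.
func responsibleUser(creator, commitAuthor string) string {
	if creator == botLogin {
		return commitAuthor
	}
	return creator
}

func main() {
	fmt.Println(responsibleUser("taylorbot", "alice"))
	fmt.Println(responsibleUser("bob", "alice"))
}
```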

Member

teemow commented Jan 21, 2019

Nice! 👍

Contributor Author

corest commented Jan 21, 2019

@xh3b4sd ptal

Contributor

@xh3b4sd xh3b4sd left a comment

Note the binary is committed. Let's remove it and have a .gitignore.

# go-tests = true
# unused-packages = true

[prune]
Contributor

There are no dependencies here at all. Maybe try the init again.

auto-oncall application is a webhook handler, responsible for creating new Opsgenie routing rules on every deployment event.

# configuration
Configuration requires next data to be configured in `values.yaml` of the helm chart:
Contributor

That looks like it should be a secret and not a configmap. How is it used?

Contributor Author

It is a secret. That's just a values file for helm, which is used for generating templates.

@@ -0,0 +1,29 @@
package webhook
Contributor

The naming of the file is a bit off. There is no Event type in event.go. We usually call files in which we gather specs and interfaces spec.go, especially when there are lots of different types.

func (s *Service) createRoutingRule(event DeploymentEvent) error {
var err error

var opsGenieService *opsgenie.OpsGenie
Contributor

The opsgenie service should be configured as a dependency at program boot. There is no reason to do it over and over again at runtime.
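
One way to read this suggestion, sketched with hypothetical types: the client is built once at boot and handed to the service through its config. `RuleCreator`, `Config`, `fakeClient` and the method signature are assumptions for illustration, not the Opsgenie client used in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// RuleCreator is a hypothetical interface standing in for the Opsgenie client.
type RuleCreator interface {
	CreateRoutingRule(name string) error
}

// Config carries the dependencies the service needs at construction time.
type Config struct {
	Opsgenie RuleCreator
}

// Service keeps the injected client for its whole lifetime.
type Service struct {
	opsgenie RuleCreator
}

// New validates the config once, at program boot.
func New(c Config) (*Service, error) {
	if c.Opsgenie == nil {
		return nil, errors.New("config.Opsgenie must not be empty")
	}
	return &Service{opsgenie: c.Opsgenie}, nil
}

// createRoutingRule reuses the injected client instead of building a new one.
func (s *Service) createRoutingRule(name string) error {
	return s.opsgenie.CreateRoutingRule(name)
}

// fakeClient records the created rules; it stands in for the real client.
type fakeClient struct {
	created []string
}

func (f *fakeClient) CreateRoutingRule(name string) error {
	f.created = append(f.created, name)
	return nil
}

func main() {
	client := &fakeClient{}
	svc, err := New(Config{Opsgenie: client})
	if err != nil {
		panic(err)
	}
	if err := svc.createRoutingRule("autooncall-1e93f-3600"); err != nil {
		panic(err)
	}
	fmt.Println(client.created)
}
```

A side effect of this shape is that the service can be tested with a fake client, as `main` does here.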

User: user,
}

err = opsGenieService.CreateEscalation(routingRule)
Contributor

The different calls could be performed in parallel if performance is an issue for the service.

Contributor Author

nope, escalation and routing rule are related entities. Routing rule depends on escalation

Contributor

@xh3b4sd xh3b4sd left a comment

Pretty nice work. I think that goes in the right direction.


[[constraint]]
name = "github.com/giantswarm/opsctl"
version = "9923241.0.0"
Contributor

Is that right? It does not look right.

Contributor Author

That is what dep generated for me

@tuommaki tuommaki left a comment

A bit late to the party, but I added a note on the possibility of using a string map for users. Not compulsory, but it could improve the maintainability of that user map.

daemonCommand := newCommand.DaemonCommand().CobraCommand()

daemonCommand.PersistentFlags().String(f.Service.Oncall.OpsgenieToken, "", "Opsgenie API token.")
daemonCommand.PersistentFlags().String(f.Service.Oncall.Users, "", "github_id:opsgenie_id mappings, separated by comma.")

Would it make sense to use map[string]string for Users? See https://godoc.org/github.com/spf13/pflag#FlagSet.GetStringToString and spf13/pflag#133 so it should work --service.oncall.users=github_id0=opsgenie_id0,github_id1=opsgenie_id1,... - then one can write user mappings as a YAML map in installations file and do something similar to what we did with string slices here: https://github.com/giantswarm/aws-operator/blob/master/helm/aws-operator-chart/templates/03-configmap.yaml#L18

Later when querying for value, I think you would just do: viper.GetStringMapString(f.Service.Oncall.Users) and get a map of users.

Contributor Author

I'm using map[string]string for Users on the service level. But with flag this magic didn't work for me. It just fails to automatically dispatch that flag into map[string]string
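
If the flag stays a plain string, the conversion to `map[string]string` at the service level could look roughly like this. It is a standard-library sketch of the `github_id:opsgenie_id` format from the flag's help text; `parseUsers` is a hypothetical name and not necessarily how the PR does it.

```go
package main

import (
	"fmt"
	"strings"
)

// parseUsers turns "github_id:opsgenie_id,github_id:opsgenie_id" into a map.
// An empty input yields an empty map; malformed pairs yield an error.
func parseUsers(raw string) (map[string]string, error) {
	users := map[string]string{}
	if strings.TrimSpace(raw) == "" {
		return users, nil
	}
	for _, pair := range strings.Split(raw, ",") {
		// Split on the first colon only, so the Opsgenie ID may contain colons.
		parts := strings.SplitN(strings.TrimSpace(pair), ":", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, fmt.Errorf("invalid user mapping %q, expected github_id:opsgenie_id", pair)
		}
		users[parts[0]] = parts[1]
	}
	return users, nil
}

func main() {
	users, err := parseUsers("alice:alice@example.com, bob:bob@example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(users["alice"], users["bob"])
}
```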

@corest corest merged commit 1e93fa1 into master Jan 22, 2019
@corest corest deleted the development branch January 22, 2019 09:19
6 participants