-
Notifications
You must be signed in to change notification settings - Fork 0
Conversation
endpoint/endpoint.go
Outdated
response, _ = json.Marshal(healthCheckResponse{Status: "webhook request received"}) | ||
writeJSONResponse(w, http.StatusOK, response) | ||
|
||
if h.Event.Ref == masterRef && stringInSlice(h.Event.Repository.Name, o.repositories) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rule is created only for merges into master and for configured repositories
endpoint/endpoint.go
Outdated
}, | ||
} | ||
|
||
ttl := time.Now().Add(routingRuleTTL).UTC().Unix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default ttl for automatically created rules is 1h
endpoint/endpoint.go
Outdated
return microerror.Maskf(userNotFoundError, event.Pusher.Name) | ||
} | ||
routingRule := &opsgenie.RoutingRule{ | ||
Name: fmt.Sprintf("autooncall-%s-%s", event.HeadCommit.ID[:5], strconv.FormatInt(ttl, 10)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rule name is prefixed with auto-oncall
+first 5 symbols of the head commit
endpoint/endpoint.go
Outdated
|
||
conditions := []opsgenie.Rule{ | ||
opsgenie.Rule{ | ||
Value: event.Repository.Name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conditions list include only the repository name for now. That can be extended later
helm chart be added in separate PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that you didn't want to use microkit because of the overhead? Would be cool to align this a bit more, also with some health endpoint maybe and then deploy it to ginger
or something.
Nice work overall!
LICENSE
Outdated
same "printed page" as the copyright notice for easier | ||
identification within third-party archives. | ||
|
||
Copyright 2016-2017 Giant Swarm GmbH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be something else, I think 2016-2019? Not sure right now.
main.go
Outdated
var help = flag.Bool("help", false, "show help for this tool") | ||
|
||
func main() { | ||
flag.Set("logtostderr", "true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just use the viper / cobra standard magic boilerplate here. Could safe you some hassle later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As there is only one flag I really need, defining flag structure for viper wasn't a time-saver
main.go
Outdated
panic(fmt.Sprintf("%#v", err)) | ||
} | ||
|
||
var config endpoint.Config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be neat to have a /version
endpoint which we can use to check if the project is coming up with architect and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly speaking, I didn't even plan to setup ci for this tool. Just build once, push into quay and run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please setup CI and CD for this. Doesn't make sense to handle projects differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with CI, but not sure about CD. It is installed as application into tenant cluster, so imo docs in wiki for support process with configuration example should be enough. I don't want to make a part of control-plane deployment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs and website are also deployed automatically. imo CD is a must.
Co-Authored-By: corest <[email protected]>
I've deployed it into our production site cluster and configured webhook handler for repository https://github.com/giantswarm/test-oncall. You can play with it by merging your PRs into that repository and check routingrules after |
For the health I've added default handler as it also required for letsencrypt to validate certs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fist batch. I focused mostly on Hook
struct.
service/githubhook/hook.go
Outdated
// | ||
// Implements validation described in github's documentation: | ||
// https://developer.github.com/webhooks/securing/ | ||
func (h *Hook) signedBy(secret []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be a method. This should be a function. We should avoid adding methods to pure data types. Exceptions may be things like String
or MarshalToJson
.
service/githubhook/hook.go
Outdated
} | ||
|
||
// New reads a Hook from an incoming HTTP Request. | ||
func New(req *http.Request, secret []byte) (hook *Hook, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not return a pointer unless memory is a concern (which it isn't).
As I understood @xh3b4sd meant using microkit/microserver, server/service structure eg following api repo. I just didn’t want to bring all that complexity I don’t need in this project. But for the sake of standardization, I’m fine to rewrite it with general approach. Overall didn’t want to invest here that much time, but that also true it will be hard to support that in case we’ll need to update something. Thx for review points. I’ll refactor it following all the suggestions |
Sounds good. Let me know when you need anything. |
response.StatusCode = http.StatusOK | ||
} | ||
|
||
go e.Service.Webhook.Process(h) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this approach. I didn't find an example of async stuff. After receiving a hook I don't want to wait for opsgenie and also Github shouldn't care if routingrule was created. It just needs approve that webhook was delivered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it looks like we should rather implement an operator. We either process a request synchronously or create a CR in the first place and let it reconcile. Turning the request here into some background job will cause magic failures eventhough responses from the endpoint where successful. Then users try to do it again because nothing happened and then the system is in an inconsistent state.
What is the desired workflow this new service should be part of? Can you describe that or provide a simple diagram? We can maybe also hang and talk a bit to sort this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Operator? No. That's would be too much for this.
Deployment event happens -> github sends webhook to auto-oncall -> app dispatches payload and responds to github it received valid webhook -> async process creates routingrule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why the processing has to be split out of the sync request. I think it is just fine then to remove the goroutine and add some backoff to be a bit more safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Processing is separated from the sync, because the success of creating a routing rule is not a part of the response to github about webhook state. Github cares about the delivery of the webhook with payload, not about internal processing. E.g., github shouldn't get error about webhook delivery if opsgenie is down. At least that's how I see it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. Then let's go with this and iterate in it. I think for the future it would make sense to create a CR synchronously and have an operator reconcile it accordingly. That way all logic is more or less guaranteed. Right now it is not, will fail once and be forgotten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
all other review points have been addressed.
Refactored a bit. Now I couldn't tell that is this project for. But structure was followed :D |
Also I changed logic. |
Nice! 👍 |
@xh3b4sd ptal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the binary is committed. Let's remove it and have a .gitignore
.
# go-tests = true | ||
# unused-packages = true | ||
|
||
[prune] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are no dependencies at all. Maybe try the init again.
auto-oncall application is a webhook handler, responsible for creating new Opsgenie routing rules on every deployment event. | ||
|
||
# configuration | ||
Configuration requires next data to be configured in `values.yaml` of the helm chart: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like it should be a secret and not a configmap. As which is it used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a secret, that's just a values file for helm, which used for generating templates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
helm comes in separate branch https://github.com/giantswarm/auto-oncall/blob/add_hel/helm/auto-oncall-chart/templates/secret.yaml
service/webhook/event.go
Outdated
@@ -0,0 +1,29 @@ | |||
package webhook |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming of the file is a bit off. There is no Event
type in event.go
. We usually call files in which we gather specs and interfaces spec.go
, especially when there are lots of different types.
service/webhook/service.go
Outdated
func (s *Service) createRoutingRule(event DeploymentEvent) error { | ||
var err error | ||
|
||
var opsGenieService *opsgenie.OpsGenie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The opsgenie service should be configured as a dependency at program boot. There is no reason to do it over and over again at runtime.
service/webhook/service.go
Outdated
User: user, | ||
} | ||
|
||
err = opsGenieService.CreateEscalation(routingRule) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The different calls could be performed in parallel if performance is an issue for the service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, escalation and routing rule are related entities. Routing rule depends on escalation
Co-Authored-By: corest <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty nice work. I think that goes into the right direction.
|
||
[[constraint]] | ||
name = "github.com/giantswarm/opsctl" | ||
version = "9923241.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that right? It does not look right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what dep generated for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit late to the parties, but added a notion of possibility of using string map for users. Not compulsory, but could improve maintainability of that user map.
daemonCommand := newCommand.DaemonCommand().CobraCommand() | ||
|
||
daemonCommand.PersistentFlags().String(f.Service.Oncall.OpsgenieToken, "", "Opsgenie API token.") | ||
daemonCommand.PersistentFlags().String(f.Service.Oncall.Users, "", "github_id:opsgenie_id mapppings, separated by comma.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use map[string]string
for Users
? See https://godoc.org/github.com/spf13/pflag#FlagSet.GetStringToString and spf13/pflag#133 so it should work --service.oncall.users=github_id0=opsgenie_id0,github_id1=opsgenie_id1,...
- then one can write user mappings as a YAML map in installations file and do something similar to what we did with string slices here: https://github.com/giantswarm/aws-operator/blob/master/helm/aws-operator-chart/templates/03-configmap.yaml#L18
Later when querying for value, I think you would just do: viper.GetStringMapString(f.Service.Oncall.Users)
and get a map of users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using map[string]string for Users on the service level. But with flag this magic didn't work for me. It just fails to automatically dispatch that flag into map[string]string
Towards https://github.com/giantswarm/giantswarm/issues/3103