Add health checks #2671

Merged: 7 commits, Feb 15, 2023
Changes from 2 commits
18 changes: 18 additions & 0 deletions injection/sharedmain/main.go
@@ -48,6 +48,7 @@ import (
	"knative.dev/pkg/logging"
	"knative.dev/pkg/logging/logkey"
	"knative.dev/pkg/metrics"
	"knative.dev/pkg/network"
	"knative.dev/pkg/profiling"
	"knative.dev/pkg/reconciler"
	"knative.dev/pkg/signals"
@@ -313,6 +314,11 @@ func MainWithConfig(ctx context.Context, component string, cfg *rest.Config, cto
		return controller.StartAll(ctx, controllers...)
	})

	// Setup default health checks to catch issues with cache sync etc.
	if !HealthProbesDisabled(ctx) {
		network.ServeHealthProbes(ctx)
	}

	// This will block until either a signal arrives or one of the grouped functions
	// returns an error.
	<-egCtx.Done()
@@ -324,6 +330,18 @@ func MainWithConfig(ctx context.Context, component string, cfg *rest.Config, cto
	}
}

type healthProbesDisabledKey struct{}

// WithHealthProbesDisabled signals to MainWithContext that it should disable default probes (readiness and liveness).
func WithHealthProbesDisabled(ctx context.Context) context.Context {
	return context.WithValue(ctx, healthProbesDisabledKey{}, struct{}{})
}

// HealthProbesDisabled checks if default health checks are disabled in the related context.
func HealthProbesDisabled(ctx context.Context) bool {
Member

Does this need to be an exported method, or can it be internal-only?

Contributor Author

Should be private.

	return ctx.Value(healthProbesDisabledKey{}) != nil
}

func flush(logger *zap.SugaredLogger) {
	logger.Sync()
	metrics.FlushExporter()
90 changes: 90 additions & 0 deletions network/health_check.go
@@ -0,0 +1,90 @@
/*
Copyright 2022 The Knative Authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package network

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"sync"
	"time"
)

// ServeHealthProbes sets up liveness and readiness probes.
func ServeHealthProbes(ctx context.Context) {
	port := os.Getenv("KNATIVE_HEALTH_PROBES_PORT")
	if port == "" {
		port = "8080"
	}
	handler := healthHandler{HealthCheck: newHealthCheck(ctx)}
	mux := http.NewServeMux()
	mux.HandleFunc("/", handler.handle)
	mux.HandleFunc("/health", handler.handle)
	mux.HandleFunc("/readiness", handler.handle)

	server := http.Server{ReadHeaderTimeout: time.Minute, Handler: mux, Addr: ":" + port}

	go func() {
		go func() {
			<-ctx.Done()
			_ = server.Shutdown(ctx)
		}()

		// start the web server on port and accept requests
		log.Printf("Probes server listening on port %s", port)

		if err := server.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Fatal(err)
		}
	}()
}

func newHealthCheck(sigCtx context.Context) func() error {
Member

Any thoughts about how we might let users customize this later?

Contributor Author (@skonto, Feb 6, 2023)

We could refactor this to accept a func and propagate it from ServeHealthProbes(ctx context.Context), or use the context to register another callback and retrieve it with a func such as healthCheckFromContext. WDYT?

	once := sync.Once{}
Member

Why do we need this once?

Contributor Author (@skonto, Feb 9, 2023)

Activator takes the same approach for its health check. Since the probe handler could become part of a handler chain (we might want to make it generic), we want the message to be printed only once. I can simplify it for now if we expect SIGTERM to be handled once.

	return func() error {
		select {
		// When we get SIGTERM (sigCtx done), let readiness probes start failing.
		case <-sigCtx.Done():
			once.Do(func() {
				log.Println("Signal context canceled")
			})
			return errors.New("received SIGTERM from kubelet")
		default:
			return nil
		}
	}
}

// healthHandler handles responding to kubelet probes with a provided health check.
type healthHandler struct {
	HealthCheck func() error
}

func (h *healthHandler) handle(w http.ResponseWriter, r *http.Request) {
	if IsKubeletProbe(r) {
Member

Checking IsKubeletProbe here keeps this health checker from being used for e.g. proxy or apiserver health checks.

I know we may not need them today, but I'm wondering if we should be passing the request to the health checking function rather than doing this check here. i.e. func(http.Request) error. This might be a half-baked idea, though.

		if err := h.HealthCheck(); err != nil {
			log.Println("Healthcheck failed: ", err.Error())
Member

Should this extract the logger from the context and use that?

			http.Error(w, err.Error(), http.StatusInternalServerError)
		} else {
			w.WriteHeader(http.StatusOK)
		}
		return
	}
	http.Error(w, "Unexpected request", http.StatusBadRequest)
}