Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP API support #4543

Merged
merged 14 commits into from
Aug 18, 2023
Merged

HTTP API support #4543

merged 14 commits into from
Aug 18, 2023

Conversation

cretz
Copy link
Member

@cretz cretz commented Jun 26, 2023

What changed?

Implements support for HTTP API as proposed at temporalio/proposals#79. Specifically:

  • Added httpPort and httpDisabled and httpAdditionalForwardedHeaders on config RPC
    • Only applicable to frontend service
    • Default port is gRPC port + 10, default httpDisabled is false (meaning this is on by default)
    • Default of port 0 now means HTTP disabled but all docker configs set to default to 7243
    • There is a default set of headers as visible from code
    • This config should be discussed to see if we're ok with it
  • Added HTTPAPIServer for frontend which runs HTTP API server using gRPC gateway with our custom modifications which include
    • Ability to support ?pretty to pretty print JSON and/or ?noPayloadShorthand to disable shorthand payloads
    • Uses direct gRPC server invocation via invocation of gRPC interceptors
    • Implements well-defined graceful stop and adds basic common Go HTTP server protection measures
    • Custom error handler due to gogo status vs Google gRPC status issues with Any details
    • Properly forwards headers and TLS auth info so claim mapper and authorizer works
  • Added HTTPAPIServerProvider for frontend which creates HTTPAPIServer
  • Updated frontend Service to support HTTPAPIServer
    • Make gRPC server and HTTP server start concurrently
    • Make gRPC server and HTTP server shutdown concurrently
  • Added tests for HTTP API
  • Added basic mTLS integration tests for gRPC and HTTP API

This is ready for review. This will not pass CI until the other PRs are merged. Related PRs:

Comment on lines 98 to 104
// HTTPPort is the port on which HTTP will listen. If unset/0 and HTTP is
// enabled (the default), this will be the gRPC port + 10. This setting only
// applies to the frontend service.
HTTPPort int `yaml:"httpPort"`
// HTTPDisabled can be set to true to disable HTTP API. This setting only
// applies to the frontend service.
HTTPDisabled bool `yaml:"httpDisabled"`
Copy link
Member Author

@cretz cretz Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatives to httpPort and httpDisabled in the frontend service config directly are to have a top-level HTTP configuration section. However this seemed cleanest.

EDIT: Note, httpDisabled is no longer present

Comment on lines +66 to +68
Details []struct {
RunID string `json:"runId"`
} `json:"details"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this style of details to work, I had to implement a custom error handler because the default reusing our marshaler doesn't work with Google "Any" wkt, and we want to use our marshaler.

_, respBody = s.httpPost(
http.StatusOK,
"/api/v1/namespaces/"+s.namespace+"/workflows/"+workflowID+"/query/some-query",
`{ "query": { "queryArgs": [{ "someField": "query-arg" }] } }`,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice the inconsistencies in our API come to the forefront here. What is query.queryArgs here is input for signal/workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are new APIs, though. Why not address this?

Copy link
Member Author

@cretz cretz Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot address this reasonably with gRPC gateway. Our HTTP API, as a rule and as a byproduct of our approach, exposes the RPCs we have made, it does not alter them. We chose to do this in our API, gRPC and HTTP users have to deal with it. This is covered a bit better in the proposal: https://github.com/temporalio/proposals/blob/master/api/http-api.md.

If we want to fix our RPCs to be more consistent, HTTP and gRPC will benefit. If not, HTTP and gRPC calls will be inconsistent across calls (but consistent across protocols).

Comment on lines +96 to +97
// Our version of gRPC gateway only supports integer enums in queries :-(
"/api/v1/namespaces/"+s.namespace+"/workflows/"+workflowID+"/history?historyEventFilterType=2",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note how the enum in the query param has to be an integer. This is because our continued use of Gogo proto means we have to support an outdated gRPC gateway which doesn't support enum names in query params. I can customize this, but it's basically copying the entirety of https://github.com/grpc-ecosystem/grpc-gateway/blob/v1.16.0/runtime/query.go and making slight alterations for enum-name support. If I did make this change, it'd be in temporalio/api-go#112.

@cretz cretz changed the title HTTP API support (early preview) HTTP API support Jun 30, 2023
@cretz cretz marked this pull request as ready for review June 30, 2023 15:49
@cretz cretz requested a review from a team as a code owner June 30, 2023 15:49
go.mod Outdated
@@ -148,3 +149,5 @@ require (
modernc.org/strutil v1.1.3 // indirect
modernc.org/token v1.1.0 // indirect
)

replace go.temporal.io/api => ../temporal-api-go
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is of course temporary pending merge of dependencies

Comment on lines 64 to 66
if tcpAddrRef == nil {
return nil, fmt.Errorf("must use TCP for gRPC listener to support HTTP API")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, I'd include a %T somewhere

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need ok, why use it? I don't think we need Go types on this error message.

tcpAddr := *tcpAddrRef
tcpAddr.Port = rpcConfig.HTTPPort
if tcpAddr.Port == 0 {
tcpAddr.Port = tcpAddrRef.Port + 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding 10 here to the RPC port might work for our static configs, but it could conflict with others--it's also just too much implicit behavior IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting I open on 7243 by default regardless of your gRPC port? I can do that, though would like others' opinions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The +10 feels too magical and surprising to me also. I'd rather just require an explicit port here. All port selection should be done in static config (or by something wrapping the server and generating a static config in memory)

Feel free to add an explicit port to all the development configs and dev servers

Copy link
Member Author

@cretz cretz Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we require an explicit port for gRPC today? What does the absence of a gRPC port in config mean today? Does requiring a port basically mean that we are not turning the HTTP server on by default? Or does it mean that we just bind it to any random port by default (arguably a security issue)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do always require an explicit port, afaik. (temporalite has code to pick a random port, but then it puts it in the static config.)

We can set a default port the same way we set ports for everything else: by putting them in all the static config files and docker template static config. (Or have it off by default but I think we decided to have it on by default?)

Copy link
Member Author

@cretz cretz Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! That's information I didn't know. Yes, I will make this required and set the default in static configs. I assume this is only docker/config_template.yaml? Can you confirm for me what happens if frontend gRPC port config is left off? (I can try to dig as well, but I assume it picks a random port which makes it not very usable)

EDIT: I will just change our code to assume 0 port (i.e. unset) means HTTP disabled and set it in the docker config and remove explicit HTTP disabling.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed to require an HTTP port since we require it for gRPC. The absence of an HTTP port (or 0) means disabled. Docker config templates have been updated to set to 7243 by default.

}
tcpAddr := *tcpAddrRef
tcpAddr.Port = rpcConfig.HTTPPort
if tcpAddr.Port == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ok to specify the zero port and get a random available port.

Copy link
Member Author

@cretz cretz Jun 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we support that for gRPC port today? How do you differentiate an unset YAML int from 0 with gRPC port today? Or do we actually default to a random port for gRPC? Are we ok w/ default as a random port for HTTP?


// GracefulStop stops the HTTP server. This will first attempt a graceful stop
// with a drain time, then will hard-stop. This will not return until stopped.
func (h *HTTPAPIServer) GracefulStop() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should accept a context because we have one at the top-level from fx (although you may have to plumb it down from there)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use the context for bounding gRPC server graceful stop, why is it a new requirement for HTTP server graceful stop? I intentionally chose to match what gRPC server does.

_, respBody = s.httpPost(
http.StatusOK,
"/api/v1/namespaces/"+s.namespace+"/workflows/"+workflowID+"/query/some-query",
`{ "query": { "queryArgs": [{ "someField": "query-arg" }] } }`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are new APIs, though. Why not address this?

tcpAddr := *tcpAddrRef
tcpAddr.Port = rpcConfig.HTTPPort
if tcpAddr.Port == 0 {
tcpAddr.Port = tcpAddrRef.Port + 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The +10 feels too magical and surprising to me also. I'd rather just require an explicit port here. All port selection should be done in static config (or by something wrapping the server and generating a static config in memory)

Feel free to add an explicit port to all the development configs and dev servers

return nil, fmt.Errorf("failed listening for HTTP API on %v: %w", &tcpAddr, err)
}
// Close the listener if anything else in this function fails
success := false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used named return values and check that instead of an extra local var?

Copy link
Member Author

@cretz cretz Jul 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually choose this approach because readers can get confused on when named return values are set and especially if you name it err it can often be swallowed in confusing ways. But I will change if preferred.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about potential confusion with err: elsewhere in the codebase we use retErr for that pattern. But no preference here

if conn, _ := r.Context().Value(httpRemoteAddrContextKey{}).(net.Conn); conn != nil {
addr = conn.RemoteAddr()
}
r = r.WithContext(peer.NewContext(r.Context(), &peer.Peer{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice. this means we don't have to change the authorization interceptor at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. And I have tests proving it.

if err := s.server.Serve(listener); err != nil {
logger.Fatal("Failed to serve on frontend listener", tag.Error(err))

// Start the gRPC and HTTP server (if any) in the background
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm changing this in #4584, so after that lands, you can just start two goroutines that check return value independently, and not have to do the channel here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If that lands first I will change when I conflict.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that one landed

Copy link
Member Author

@cretz cretz Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we now do the same for the HTTP server (start in background, don't block)

@cretz
Copy link
Member Author

cretz commented Jul 7, 2023

Added http_service_requests metric, made StartWorkflowExecutionRequest.request_id optional, and added general PR updates. I won't be constantly fixing conflicts from master until this is nearer approval.

logger log.Logger,
) (*HTTPAPIServer, error) {
// If HTTP API server not enabled, return nil
rpcConfig := config.Services[string(serviceName)].RPC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no reason internal-frontend should listen on http. Maybe we should exclude it here too? e.g.

if serviceName != primitives.FrontendService { return nil, nil }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I was not aware there was an internal-frontend reusing this same package/code. Will only apply conditional to ensure only the public frontend gets the HTTP API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

tcpAddr := *tcpAddrRef
tcpAddr.Port = rpcConfig.HTTPPort
if tcpAddr.Port == 0 {
tcpAddr.Port = tcpAddrRef.Port + 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do always require an explicit port, afaik. (temporalite has code to pick a random port, but then it puts it in the static config.)

We can set a default port the same way we set ports for everything else: by putting them in all the static config files and docker template static config. (Or have it off by default but I think we decided to have it on by default?)

return nil, fmt.Errorf("failed listening for HTTP API on %v: %w", &tcpAddr, err)
}
// Close the listener if anything else in this function fails
success := false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about potential confusion with err: elsewhere in the codebase we use retErr for that pattern. But no preference here

// moment using gRPC's default at
// https://github.com/grpc/grpc-go/blob/0673105ebcb956e8bf50b96e28209ab7845a65ad/server.go#L58.
// Max header bytes is defaulted by Go to 1MB.
r.Body = http.MaxBytesReader(w, r.Body, 1024*1024*4)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's been talk about increasing this limit for grpc. We already increase it for internode connections but not frontend: https://github.com/temporalio/temporal/blob/master/common/rpc/grpc.go#L58-L59

Can you put this is a constant so we can keep it in sync between grpc and http? Maybe next to that one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Will call it maxHttpRequestBytes for now, but would expect it to be updated to share with the gRPC one if/when y'all get around to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if err := s.server.Serve(listener); err != nil {
logger.Fatal("Failed to serve on frontend listener", tag.Error(err))

// Start the gRPC and HTTP server (if any) in the background
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that one landed

@@ -304,6 +304,7 @@ services:
grpcPort: {{ $temporalGrpcPort }}
membershipPort: {{ default .Env.FRONTEND_MEMBERSHIP_PORT "6933" }}
bindOnIP: {{ default .Env.BIND_ON_IP "127.0.0.1" }}
httpPort: {{ default .Env.FRONTEND_HTTP_PORT "7243" }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will open PRs to those upon merge here.

return nil, errRequestIDNotSet
// For easy direct API use, we default the request ID here but expect all
// SDKs and other auto-retrying clients to set it
request.RequestId = uuid.New()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth doing this only for http?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing any call-specific things for HTTP is not only a bit difficult, it violates the spirit of "generic HTTP interface over existing gRPC behavior". So far I have been able to avoid any call-specific HTTP behavior.

@cretz cretz merged commit c784796 into temporalio:main Aug 18, 2023
@cretz cretz deleted the http-api branch August 18, 2023 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants