Enable using gRPC client for GCS #92

Merged
merged 4 commits into from
Dec 20, 2023
Changes from 3 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -36,6 +36,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#51](https://github.com/thanos-io/objstore/pull/51) Azure: Support using connection string authentication.
- [#76](https://github.com/thanos-io/objstore/pull/76) GCS: Query for object names only in `Iter` to possibly improve performance when listing objects.
- [#85](https://github.com/thanos-io/objstore/pull/85) S3: Allow checksum algorithm to be configured
+- [#92](https://github.com/thanos-io/objstore/pull/92) GCS: Allow using a gRPC client.

### Changed
- [#38](https://github.com/thanos-io/objstore/pull/38) *: Upgrade minio-go version to `v7.0.45`.
25 changes: 22 additions & 3 deletions providers/gcs/gcs.go
@@ -19,6 +19,7 @@ import (
 	"golang.org/x/oauth2/google"
 	"google.golang.org/api/iterator"
 	"google.golang.org/api/option"
+	"google.golang.org/grpc"
 	"google.golang.org/grpc/codes"
 	"google.golang.org/grpc/status"
 	"gopkg.in/yaml.v2"
@@ -31,8 +32,10 @@ const DirDelim = "/"

 // Config stores the configuration for gcs bucket.
 type Config struct {
-	Bucket         string `yaml:"bucket"`
-	ServiceAccount string `yaml:"service_account"`
+	Bucket           string `yaml:"bucket"`
+	ServiceAccount   string `yaml:"service_account"`
+	UseGRPC          bool   `yaml:"use_grpc"`
+	GRPCConnPoolSize int    `yaml:"grpc_conn_pool_size"`
Contributor Author

I did a bit of testing by tweaking this parameter up and down. Performance peaks in the middle of the range: with 1 connection it is abysmally slow, even slower than HTTP; with a pool size of 512 it is better; and it seems to be best with 256.

We could consider using that (or some other number) as the default to make sure users do not need to guess in the beginning. We can still keep the value configurable if anyone needs further customization.
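
A default that stays overridable, as suggested above, could look roughly like the sketch below. The constant `defaultGRPCConnPoolSize` and the helper `poolSizeOrDefault` are hypothetical names that are not part of this PR, and 256 is only the value that did best in the informal test described above. The `Config` type here is a trimmed stand-in for the one in `providers/gcs/gcs.go`.

```go
package main

import "fmt"

// Config is a trimmed stand-in for the gcs Config in this PR;
// only the fields relevant to the sketch are included.
type Config struct {
	UseGRPC          bool
	GRPCConnPoolSize int
}

// defaultGRPCConnPoolSize is a hypothetical default, taken from the
// informal measurement in the review comment. It is an assumption,
// not a value defined by the PR.
const defaultGRPCConnPoolSize = 256

// poolSizeOrDefault returns the configured pool size, or the default
// when the field was left unset (zero) or set to a negative value.
func poolSizeOrDefault(c Config) int {
	if c.GRPCConnPoolSize > 0 {
		return c.GRPCConnPoolSize
	}
	return defaultGRPCConnPoolSize
}

func main() {
	fmt.Println(poolSizeOrDefault(Config{UseGRPC: true}))                       // prints 256
	fmt.Println(poolSizeOrDefault(Config{UseGRPC: true, GRPCConnPoolSize: 64})) // prints 64
}
```

Treating zero as "unset" keeps existing YAML configs working, since an omitted `grpc_conn_pool_size` decodes to the zero value.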

Contributor Author

@fpetkovski, Dec 6, 2023

It turns out we had to enable DirectPath, as documented here: https://cloud.google.com/go/docs/reference/cloud.google.com/go/storage/latest#hdr-Experimental_gRPC_API

If the application is running within GCP, users may get better performance by enabling DirectPath (enabling requests to skip some proxy steps). To enable, set the environment variable GOOGLE_CLOUD_ENABLE_DIRECT_PATH_XDS=true and add the following side-effect imports to your application:

import (
    _ "google.golang.org/grpc/balancer/rls"
    _ "google.golang.org/grpc/xds/googledirectpath"
)

With DirectPath enabled, the connection pool config becomes obsolete (as a matter of fact, any value higher than 1 would hang the client). The config option is still useful when DirectPath is not enabled, though.
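
One way to guard against the hang described above is to reject a pool size above 1 whenever DirectPath is switched on. The sketch below is only an illustration of that idea: `directPathEnabled` and `checkPoolSize` are hypothetical helpers that are not part of this PR, and comparing the environment variable against the literal string `true` is an assumption based on the documented value, not on how the storage library parses it.

```go
package main

import (
	"fmt"
	"os"
)

// directPathEnv is the environment variable named in the storage docs.
const directPathEnv = "GOOGLE_CLOUD_ENABLE_DIRECT_PATH_XDS"

// directPathEnabled reports whether the variable is set to "true".
// Assumption: only the documented value "true" is treated as enabled.
func directPathEnabled() bool {
	return os.Getenv(directPathEnv) == "true"
}

// checkPoolSize is a hypothetical validation helper. It returns an error
// when DirectPath is enabled and the pool size is greater than 1, which
// is the combination reported to hang the client in the comment above.
func checkPoolSize(size int, directPath bool) error {
	if directPath && size > 1 {
		return fmt.Errorf("grpc_conn_pool_size=%d is not supported with DirectPath; use 1 or leave it unset", size)
	}
	return nil
}

func main() {
	size := 4
	if err := checkPoolSize(size, directPathEnabled()); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("pool size accepted:", size)
}
```

Failing fast at construction time is a design choice; logging a warning and forcing the size to 1 would be an equally reasonable alternative.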

Member

Can we mention the DirectPath option in a code comment?

Contributor Author

Good point, updated.

}

// Bucket implements the store.Bucket and shipper.Bucket interfaces against GCS.
@@ -75,7 +78,23 @@ func NewBucketWithConfig(ctx context.Context, logger log.Logger, gc Config, comp
 		option.WithUserAgent(fmt.Sprintf("thanos-%s/%s (%s)", component, version.Version, runtime.Version())),
 	)

-	gcsClient, err := storage.NewClient(ctx, opts...)
+	return newBucket(ctx, logger, gc, opts)
+}
+
+func newBucket(ctx context.Context, logger log.Logger, gc Config, opts []option.ClientOption) (*Bucket, error) {
+	var (
+		err       error
+		gcsClient *storage.Client
+	)
+	if gc.UseGRPC {
+		opts = append(opts,
+			option.WithGRPCDialOption(grpc.WithRecvBufferPool(grpc.NewSharedBufferPool())),
+			option.WithGRPCConnectionPool(gc.GRPCConnPoolSize),
+		)
+		gcsClient, err = storage.NewGRPCClient(ctx, opts...)
+	} else {
+		gcsClient, err = storage.NewClient(ctx, opts...)
+	}
 	if err != nil {
 		return nil, err
 	}