Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Querier timeout when reaching ingesters #718

Closed
cyriltovena opened this issue Jul 5, 2019 · 4 comments · Fixed by #788
Closed

Querier timeout when reaching ingesters #718

cyriltovena opened this issue Jul 5, 2019 · 4 comments · Fixed by #788
Assignees
Labels
component/loki good first issue These are great first issues. If you are looking for a place to start, start here! help wanted We would love help on these issues. Please come help us! type/bug Somehing is not working as expected

Comments

@cyriltovena
Copy link
Contributor

Currently there is no timeout when doing GRPC between ingesters and querier.

This leads to the querier stacking and starving goroutines until it gets restarted.

It is only happening when ingesters are unresponsive.

We should have a timeout for that connection.

@cyriltovena cyriltovena added kind/bug component/loki help wanted We would love help on these issues. Please come help us! good first issue These are great first issues. If you are looking for a place to start, start here! labels Jul 5, 2019
@cyriltovena cyriltovena changed the title Querier time out between with ingester Querier timeout when reaching ingesters Jul 5, 2019
@sh0rez
Copy link
Member

sh0rez commented Jul 7, 2019

Sure?

It looks like the querier configures its IngesterClient using t.cfg.IngesterClient config:

loki/pkg/loki/modules.go

Lines 135 to 136 in 1c95a5b

func (t *Loki) initQuerier() (err error) {
t.querier, err = querier.New(t.cfg.Querier, t.cfg.IngesterClient, t.ring, t.store)

factory := func(addr string) (grpc_health_v1.HealthClient, error) {
return client.New(clientCfg, addr)
}

But this one does include a timeout:

RemoteTimeout time.Duration `yaml:"remote_timeout,omitempty"`

And this timeout defaults to 5s because of its flag:

f.DurationVar(&cfg.RemoteTimeout, "ingester.client.timeout", 5*time.Second, "Timeout for ingester client RPCs.")

So did I miss another place where a timeout is required?

@cyriltovena
Copy link
Contributor Author

cyriltovena commented Jul 8, 2019

I don't think this is used. But what is missing mostly is a query timeout for instance:

clientDeadline := time.Now().Add(time.Duration(*deadlineMs) * time.Millisecond)
ctx, cancel := context.WithDeadline(ctx, clientDeadline)

return client.Query(ctx, req)

There is no timeout here but it should be there for most of queries type. (Not sure about tailing)

@pracucci
Copy link
Contributor

pracucci commented Jul 18, 2019

@Kuqd I would be glad to pick from here and work on a PR. Few questions, please:

  1. Should we allow to set the query timeout via querier config flag, or should be possible to override it via HTTP request too?
  2. Should the querier.Label() underlying requests have a different timeout than querier.Query()? If yes, may -querier.logs-query-timeout and -querier.labels-query-timeout work?
  3. Any suggestion on how we should unit test it?

@cyriltovena
Copy link
Contributor Author

Yes please.

1 - Only via a config flag with a sane default.
2 - Label and Query should be the same.
3 - I'm not sure this would be necessary be if you achieve it, I'll take it.

Thanks @pracucci

@chaudum chaudum added the type/bug Somehing is not working as expected label Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/loki good first issue These are great first issues. If you are looking for a place to start, start here! help wanted We would love help on these issues. Please come help us! type/bug Somehing is not working as expected
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants