roachtest: retry fetching timeseries data in overload test
A test failure has been observed with the following error:

```
503 Service Unavailable, content-type: application/json, body: {
  "error": "all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: failed to write client preface: io: read/write on closed pipe\"",
  "message": "all SubConns are in TransientFailure, latest connection error: connection error: desc = \"transport: failed to write client preface: io: read/write on closed pipe\"",
  "code": 14,
  "details": [
  ]
}, error: <nil>
```

That error makes some sense if the DefaultClass connection has failed
for some reason. In the fullness of time we should get to the bottom of
why these gRPC connections still close when overloaded. We had hoped
that cockroachdb#39041 would be the end of that, but unfortunately it
still seems to happen sometimes. That being said, the situation resolves
itself rapidly when the load stops.

This PR adds a retry loop around the timeseries fetch to make the test
robust to these transient failures. It's also possible that we should
run timeseries queries as SystemClass operations, but I'll leave that
for another change.

Release note: None
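For illustration only, the kind of retry loop described above might look roughly like the sketch below. It is not the code from this change: `retryWithBackoff` and `errUnavailable` are hypothetical names, and the attempt count and backoff values are assumptions rather than what the test actually uses.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a hypothetical stand-in for a transient failure such as
// the 503 response shown in the commit message.
var errUnavailable = errors.New("503 Service Unavailable")

// retryWithBackoff (hypothetical helper) calls fn until it returns nil or
// maxAttempts calls have been made, doubling the wait between attempts.
func retryWithBackoff(maxAttempts int, initialBackoff time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	var lastErr error
	backoff := initialBackoff
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		// Only sleep if another attempt is coming.
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}
```

A caller would wrap the timeseries request in `fn` and return an error for any response it considers transient, so that a brief connection failure during overload does not fail the test outright. To run the sketch as a standalone program, add a `main` function that calls `retryWithBackoff`.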