Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected upstream response #397

Closed
chicknsoup opened this issue Mar 20, 2020 · 6 comments
Closed

Unexpected upstream response #397

chicknsoup opened this issue Mar 20, 2020 · 6 comments
Assignees
Labels
bug Something isn't working investigating assignee is investigating to determine root cause and fix

Comments

@chicknsoup
Copy link

I have to restart trickster once every week or so in order to normally query from Grafana. Checking the log file, there are only unexpected upstream response errors, nothing else.
Prometheus queries still ok, restarting trickster fixes the problem.

time=2020-03-20T05:14:12.615544864Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:14:12.620210271Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:14:12.624673167Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.277004078Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.281477037Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.28644297Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.614298131Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.620147753Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:15:12.62552363Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.276595385Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.28187261Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.286772955Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.611576218Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.616117145Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:16:12.62063816Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.277820947Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.283518923Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.287928425Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.612475642Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.616642372Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:17:12.62083934Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.276652316Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.282414148Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.287325991Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.61284092Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.617943482Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:18:12.622819376Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.278048109Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.284003523Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.289953829Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.612766929Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.61722679Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:19:12.621872603Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.277719554Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.282379063Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.287320726Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.613289757Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.617910763Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:20:12.622620485Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.277273852Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.282081338Z app=trickster caller=proxy/engines
/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.286657918Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.612752781Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.616835802Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:21:12.620943531Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.276771777Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.282310817Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.287833134Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.613380953Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.618206738Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
time=2020-03-20T05:22:12.622683351Z app=trickster caller=proxy/engines/deltaproxycache.go:379 level=error event="unexpected upstream response" statusCode=400
@jranson
Copy link
Member

jranson commented Mar 20, 2020

Thanks for this report. In the next release, we'll enhance this log entry to include the full inbound (from grafana) client request, and Trickster outbound response (to prom), and the Prometheus full response. So that way, you can provide us with further detail that might point towards a root cause. Separately we will look at reproducing and root causing the behavior.

@jranson jranson self-assigned this Mar 20, 2020
@jranson jranson added bug Something isn't working investigating assignee is investigating to determine root cause and fix labels Mar 20, 2020
@chicknsoup
Copy link
Author

Here's what i've found so far.

  • Unexpected upstream response error has been fixed (by removing invalid queries from Grafana) --> This is not the culprit.
  • Trickster stopped working after 7-10 days
  • By the time it stopped, tcp connection started leaking (below is the number of open files - most of them are socket file). It kept leaking until I restarted trickster:
    image

image

lsof -p 56359|grep  "protocol: TCPv6"|more
trickster 56359 bigmon    7u     sock                0,7      0t0 373506307 protocol: TCPv6
trickster 56359 bigmon   13u     sock                0,7      0t0 373500120 protocol: TCPv6
trickster 56359 bigmon   14u     sock                0,7      0t0 373504354 protocol: TCPv6
trickster 56359 bigmon   18u     sock                0,7      0t0 373506363 protocol: TCPv6
trickster 56359 bigmon   20u     sock                0,7      0t0 373504357 protocol: TCPv6
trickster 56359 bigmon   21u     sock                0,7      0t0 373504361 protocol: TCPv6
trickster 56359 bigmon   22u     sock                0,7      0t0 373504350 protocol: TCPv6
trickster 56359 bigmon   23u     sock                0,7      0t0 373513395 protocol: TCPv6
trickster 56359 bigmon   24u     sock                0,7      0t0 373499874 protocol: TCPv6
trickster 56359 bigmon   25u     sock                0,7      0t0 373513397 protocol: TCPv6
trickster 56359 bigmon   26u     sock                0,7      0t0 373513399 protocol: TCPv6
trickster 56359 bigmon   27u     sock                0,7      0t0 373459636 protocol: TCPv6
trickster 56359 bigmon   28u     sock                0,7      0t0 373459638 protocol: TCPv6
trickster 56359 bigmon   29u     sock                0,7      0t0 373500326 protocol: TCPv6
trickster 56359 bigmon   30u     sock                0,7      0t0 373504368 protocol: TCPv6
trickster 56359 bigmon   31u     sock                0,7      0t0 373504369 protocol: TCPv6
trickster 56359 bigmon   32u     sock                0,7      0t0 373514389 protocol: TCPv6
trickster 56359 bigmon   33u     sock                0,7      0t0 373503613 protocol: TCPv6
trickster 56359 bigmon   34u     sock                0,7      0t0 373500330 protocol: TCPv6
trickster 56359 bigmon   35u     sock                0,7      0t0 373504371 protocol: TCPv6
trickster 56359 bigmon   36u     sock                0,7      0t0 373504373 protocol: TCPv6
trickster 56359 bigmon   37u     sock                0,7      0t0 373507772 protocol: TCPv6
trickster 56359 bigmon   38u     sock                0,7      0t0 373507822 protocol: TCPv6
lsof -p 56359|grep  "protocol: TCPv6"|wc -l
10893
  • No more logs written to file
  • Restarting trickster fixes the issue.

@chicknsoup
Copy link
Author

Today it happened again (after nearly 1day running). Checking the log I can only see these lines:

length=0 ttl=30s is_direct=true
time=2020-04-06T00:41:41.429040456Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.a6497b43695b87bb2c38c0ab7d0311f9 length=0 ttl=30s is_direct=true
time=2020-04-06T00:41:41.429231187Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" length=0 ttl=30s is_direct=true cacheKey=10.0.0.41:8189.dfbc9e001d7880478ccb2d2b6f43bccc
time=2020-04-06T00:41:41.429919695Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.982e7e3473492cc384ac239ffab06c59 length=0 ttl=30s is_direct=true
time=2020-04-06T00:41:41.431294552Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.583533bb778ec4bfe3cba0c7e6b9d42d length=0 ttl=30s is_direct=true
time=2020-04-06T00:42:41.408106793Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" is_direct=true cacheKey=10.0.0.41:8189.13d0401d6e1b82260d60e90f61f16f5f length=0 ttl=30s
time=2020-04-06T00:42:41.408290309Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.465dead3a2ef62826af4c757c293a534 length=0 ttl=30s is_direct=true
time=2020-04-06T00:42:41.408333343Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.814a56cc1ef86db8c70f8084d87e7805 length=0 ttl=30s is_direct=true
time=2020-04-06T00:42:41.421912444Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" ttl=30s is_direct=true cacheKey=10.0.0.41:8189.af9efd836dbb250be01bf0b288541315 length=0
time=2020-04-06T00:43:41.413055032Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.6ed818ff1737bf5b97d5c6c650e6ef48 length=0 ttl=30s is_direct=true
time=2020-04-06T00:43:41.413289438Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" is_direct=true cacheKey=10.0.0.41:8189.fc1dae17c741994493d45f28a4615e30 length=0 ttl=30s
time=2020-04-06T00:43:41.41336599Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.dfc97ecac8d9e8f2d8899c611ae9c836 length=0 ttl=30s is_direct=true
time=2020-04-06T00:43:41.41334791Z app=trickster caller=runtime/asm_amd64.s:1357 level=debug event="memorycache cache store" cacheKey=10.0.0.41:8189.8eedf142f0a4c2d9f6daec00e06b98e1 length=0 ttl=30s is_direct=true

I don't see any event="delta routine completed"

@chicknsoup
Copy link
Author

I ended up using the previous version, and it seems working well so far (4 days)
https://github.com/tricksterproxy/trickster/releases/tag/v1.09-beta9

@jranson
Copy link
Member

jranson commented Apr 14, 2020

@chicknsoup thanks for this info on the file descriptors, after fixing your dashboard queries. Looks like this may now be a duplicate of #391, which we'll be issuing a patch version 1.0.3 shortly. Thanks for your information and helping us get to the bottom of it! (p.s. we will still include the additional detail for Unexpected Upstream Response in the 1.1 version)

@jranson jranson mentioned this issue Apr 16, 2020
@jranson
Copy link
Member

jranson commented Jun 2, 2020

we believe this to be resolved during the beta cycle, please reopen if the GA version of 1.1 still brings you to the max file limits.

@jranson jranson closed this as completed Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working investigating assignee is investigating to determine root cause and fix
Projects
None yet
Development

No branches or pull requests

2 participants