Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase socket timeout #247

Merged
merged 1 commit into from
Nov 4, 2019
Merged

Increase socket timeout #247

merged 1 commit into from
Nov 4, 2019

Conversation

adamchainz
Copy link
Contributor

Both @naohiro-t and I have seen the socket time out during startup only to reconnect shortly after:

2019-08-30T09:25:10+0100 DEBUG Parsing Core Agent manifest path: /tmp/scout_apm_core/scout_apm_core-v1.1.8-x86_64-apple-darwin/manifest.json
2019-08-30T09:25:10+0100 DEBUG Core Agent manifest json: {u'core_agent_version': u'1.1.8', u'core_agent_binary': u'core-agent', u'version': u'1.1.8', u'core_agent_binary_sha256': u'2896f584cb373c596164805510531a34b29e4eb34a13685edd4c6df364161414'}
[2019-08-30T08:25:10][core_agent][INFO] Initializing logger with log level: Info
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket thread exception: timeout('timed out',)
Traceback (most recent call last):
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 105, in run
    result = self._send(body)
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 174, in _send
    self._read_response()
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 184, in _read_response
    raw_size = self.socket.recv(4)
timeout: timed out
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket thread stopped.
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket attempt 1, connecting to /tmp/scout_apm_core/scout_apm_core-v1.1.8-x86_64-apple-darwin/scout-agent.sock, PID: 90625, Thread: <CoreAgentSocket(Thread-1, started daemon 123145439891456)>
2019-08-30T09:25:11+0100 DEBUG Monkey patched SQL
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket is connected

A bump up to a 3 second timeout seems to fix it - at least on my machine.

I guess the biggest risk of a too-fast timeout is continually reconnecting and re-registering when a server is low on CPU cycles, which would amplify the CPU usage of the library/agent. A longer timeout seems appropriate and unlikely to cause problems because this happens in a background thread.

@dlanderson
Copy link
Contributor

This may be papering over an issue with the core agent. Since the socket is already connected, communication with the core agent should be very fast. It's possible the core agent is blocked somewhere it shouldn't be. @cschneid

@adamchainz
Copy link
Contributor Author

Agree it's papering over an issue, but after debugging on Slack we don't seem to have found a cause? @cschneid what do you think about merging this?

Both @naohiro-t and I have seen the socket time out during startup only to reconnect shortly after:

```
2019-08-30T09:25:10+0100 DEBUG Parsing Core Agent manifest path: /tmp/scout_apm_core/scout_apm_core-v1.1.8-x86_64-apple-darwin/manifest.json
2019-08-30T09:25:10+0100 DEBUG Core Agent manifest json: {u'core_agent_version': u'1.1.8', u'core_agent_binary': u'core-agent', u'version': u'1.1.8', u'core_agent_binary_sha256': u'2896f584cb373c596164805510531a34b29e4eb34a13685edd4c6df364161414'}
[2019-08-30T08:25:10][core_agent][INFO] Initializing logger with log level: Info
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket thread exception: timeout('timed out',)
Traceback (most recent call last):
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 105, in run
    result = self._send(body)
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 174, in _send
    self._read_response()
  File "/Users/chainz/Documents/Projects/scout_apm_python/src/scout_apm/core/socket.py", line 184, in _read_response
    raw_size = self.socket.recv(4)
timeout: timed out
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket thread stopped.
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket attempt 1, connecting to /tmp/scout_apm_core/scout_apm_core-v1.1.8-x86_64-apple-darwin/scout-agent.sock, PID: 90625, Thread: <CoreAgentSocket(Thread-1, started daemon 123145439891456)>
2019-08-30T09:25:11+0100 DEBUG Monkey patched SQL
2019-08-30T09:25:11+0100 DEBUG CoreAgentSocket is connected
```

A bump up to a 3 second timeout seems to fix it - at least on my machine.

I guess the biggest risk of a too-fast timeout is continually reconnecting and re-registering when a server is low on CPU cycles, which would amplify the CPU usage of the library/agent. A longer timeout seems appropriate and unlikely to cause problems because this happens in a background thread.
@adamchainz adamchainz force-pushed the increase_socket_timeout branch from 74688f2 to c1b55b6 Compare November 4, 2019 16:48
@adamchainz adamchainz merged commit b2f21a0 into master Nov 4, 2019
@adamchainz adamchainz deleted the increase_socket_timeout branch November 4, 2019 17:06
@adamchainz
Copy link
Contributor Author

Merged after discussion on Slack - there may still be some sticking points in the core agent as @cschneid said.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants