You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
With the changes made in #4045, after a 24 hour grace period, a session's event stream is uploaded, whether the session is closed or not. This logic was built assuming that a after 24 hours a session must be abandoned. However, a session can remain open indefinitely, and the grace period logic misbehaves in this edge case.
With some of the recent changes made to session uploading and moderated sessions, the grace period's misbehavior has some varying effects depending on the version.
v8.3.4
If the server is closed during a session before the grace period, upon restart the server will successfully upload the session after the grace period, as expected.
But if the session remains open after the grace period upload, then the following problems arise:
If the session exits immediately, the session recording becomes unplayable. Hitting Play for the session in the webui leads to the message - Recording for this session is not available.
See the debug logs at the bottom for this case
If the server is restarted, the session recording doesn't show up in the webui (because it couldn't ensure session leave event?), but is on disk. I believe the recording on disk is only the grace period upload, anything after is lost.
If a new session chunk is made and attempts to write it to disk, it will fail and the session will freeze. Once closed, the session recording is does not show up in the webui, but is on disk, similar to the case above.
If less than a full chunk is made, then it will truncate the uploaded recording to the length of the additional chunk. So a portion of the initial session chunk is playable...
What should happen
When the grace period is reached on a pending session upload:
The pending upload parts should be uploaded (currently works)
The session should be forced to close, if it isn't already.
Although ungraceful, I don't see another way to do this without turning it into a "hybrid" sync/async session writer.
Alternatively we could alter the grace period logic to only complete the upload if the session is closed, but this may ruin the current separation of concerns.
This might be possible by improving the logic around ensuring the session.leave event.
Ideally, users would use the node-sync session streaming option, which would help to avoid this disruptive grace period behavior altogether, outside of the intended scenario where a session stream fails to upload at the end of the session's life time (network issue, etc.).
Debug logs - v8.3.4 - session exits immediately after grace period upload completes
...
2022-03-24T10:01:32-07:00 INFO [AUDIT] session.start addr.local:127.0.0.1:3022 addr.remote:127.0.0.1:36534 cluster_name:example.com code:T2000I ei:0 event:session.start login:bjoerger namespace:default server_addr:127.0.0.1:3022 server_hostname:server01 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 server_labels:map[arch:x86_64 cluster:example.com env:staging hostname:bjoerger-ThinkPad-X1-Extreme] session_recording:node sid:15a2c1a2-9913-48ba-ae53-651be25842ff size:280:25 time:2022-03-24T17:01:32.016Z uid:2be6d0cc-8122-4e22-9a2a-6a25d16dfeb1 user:dev events/emitter.go:325
# grace period ends (set to 10 seconds)
2022-03-24T10:01:42-07:00 DEBU [AUTH:COMP] Upload 0bb39db8-95b0-4b5e-a15a-e87b17e82905 grace period is over. Trying to complete. events/complete.go:140
2022-03-24T10:01:42-07:00 DEBU [AUTH:COMP] Completed upload Upload(session=15a2c1a2-9913-48ba-ae53-651be25842ff, id=0bb39db8-95b0-4b5e-a15a-e87b17e82905, initiated=2022-03-24 10:01:32.011363435 -0700 PDT). events/complete.go:144
2022-03-24T10:01:42-07:00 DEBU [AUTH:COMP] Found 1 active uploads, completed 1. events/complete.go:186
2022-03-24T10:01:42-07:00 DEBU [UPLOAD] Scanned 1 uploads, started 1 in /home/bjoerger/gravitational/teleport-config/local/storage/data/dir/main/log/upload/streaming/default. filesessions/fileasync.go:289
2022-03-24T10:01:42-07:00 DEBU [AUTH:GRPC] CreateAuditStream connection from f58992c2-576d-42bd-be86-498aa37bd3e3.example.com. auth/grpcserver.go:169
2022-03-24T10:01:42-07:00 DEBU [AUTH:GRPC] Created stream: <nil>. auth/grpcserver.go:212
2022-03-24T10:01:42-07:00 INFO [AUDIT] session.upload cluster_name:example.com code:T2005I ei:2.147483647e+09 event:session.upload sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:42.872Z url:file:///home/bjoerger/gravitational/teleport-config/local/storage/data/dir/main/log/records/multi/15a2c1a2-9913-48ba-ae53-651be25842ff events/emitter.go:325
2022-03-24T10:01:42-07:00 DEBU [AUTH:GRPC] Completed stream: <nil>. auth/grpcserver.go:259
2022-03-24T10:01:42-07:00 DEBU [AUTH:GRPC] Flushed and closed the stream. auth/grpcserver.go:190
2022-03-24T10:01:42-07:00 DEBU [UPLOAD] Session upload completed. duration:6.473477ms session-id:15a2c1a2-9913-48ba-ae53-651be25842ff filesessions/fileasync.go:443
# grace period upload completes
# exit session after upload
2022-03-24T10:01:49-07:00 DEBU [SESSION:N] Encountered a fatal error Copying from PTY to writer error:[read /dev/ptmx: input/output error] srv/sess.go:766
2022-03-24T10:01:49-07:00 DEBU [NODE] Exec request ("/home/bjoerger/gravitational/teleport/build/teleport") complete: 0 id:8 local:127.0.0.1:3022 login:bjoerger remote:127.0.0.1:36534 teleportUser:dev regular/sshserver.go:1377
2022-03-24T10:01:49-07:00 INFO [SESSION:N] Closing party cc89464a-2e7a-4848-99ab-16d54b84307e srv/sess.go:1392
2022-03-24T10:01:49-07:00 INFO [NODE] Removing party ServerContext(127.0.0.1:36534->127.0.0.1:3022, user=bjoerger, id=8) party(id=cc89464a-2e7a-4848-99ab-16d54b84307e) from session 15a2c1a2-9913-48ba-ae53-651be25842ff id:8 local:127.0.0.1:3022 login:bjoerger remote:127.0.0.1:36534 teleportUser:dev srv/sess.go:1077
2022-03-24T10:01:49-07:00 DEBU [NODE] Releasing associated resources - context has been closed. id:8 local:127.0.0.1:3022 login:bjoerger remote:127.0.0.1:36534 teleportUser:dev srv/monitor.go:253
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.data addr.local:127.0.0.1:3022 addr.remote:127.0.0.1:36534 code:T2006I ei:2.147483646e+09 event:session.data login:bjoerger namespace:default rx:9690 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.206Z tx:5879 uid:42a63caa-7db4-46cb-9a46-28ea720d8dce user:dev events/emitter.go:325
2022-03-24T10:01:49-07:00 INFO [SESSION:N] Session 15a2c1a2-9913-48ba-ae53-651be25842ff will be garbage collected. srv/sess.go:321
2022-03-24T10:01:49-07:00 INFO [SESSION:N] Party member cc89464a-2e7a-4848-99ab-16d54b84307e left session 15a2c1a2-9913-48ba-ae53-651be25842ff. srv/sess.go:1222
2022-03-24T10:01:49-07:00 DEBU [SSH:PROXY] Closed connection 127.0.0.1:36534. sshutils/server.go:471
2022-03-24T10:01:49-07:00 DEBU [PROXY] Client 127.0.0.1:36534 disconnected. id:6 local:127.0.1.1:3080 login:bjoerger remote:127.0.0.1:36534 teleportUser:dev regular/sshserver.go:1364
2022-03-24T10:01:49-07:00 DEBU [NODE] Subsystem proxySubsys(cluster=default/example.com, host=server01, port=0) finished with result: read tcp 127.0.0.1:60512->127.0.0.1:3022: use of closed network connection. regular/sshserver.go:1587
2022-03-24T10:01:49-07:00 DEBU [SSH:NODE] Closed connection 127.0.0.1:36534. sshutils/server.go:471
2022-03-24T10:01:49-07:00 DEBU [PROXY] Releasing associated resources - context has been closed. id:6 local:127.0.1.1:3080 login:bjoerger remote:127.0.0.1:36534 teleportUser:dev srv/monitor.go:253
2022-03-24T10:01:49-07:00 WARN Failed to upload part. error:[
ERROR REPORT:
Original Error: *trace.NotFoundError open /home/bjoerger/gravitational/teleport-config/local/storage/data/dir/main/log/upload/streaming/default/multi/0bb39db8-95b0-4b5e-a15a-e87b17e82905/15a2c1a2-9913-48ba-ae53-651be25842ff/1.part: no such file or directory
Stack Trace:
/home/bjoerger/gravitational/teleport/lib/events/filesessions/filestream.go:83 github.com/gravitational/teleport/lib/events/filesessions.(*Handler).UploadPart
/home/bjoerger/gravitational/teleport/lib/events/stream.go:674 github.com/gravitational/teleport/lib/events.(*sliceWriter).startUpload.func1
/home/bjoerger/.tools/go/src/runtime/asm_amd64.s:1581 runtime.goexit
User Message: open /home/bjoerger/gravitational/teleport-config/local/storage/data/dir/main/log/upload/streaming/default/multi/0bb39db8-95b0-4b5e-a15a-e87b17e82905/15a2c1a2-9913-48ba-ae53-651be25842ff/1.part: no such file or directory] events/stream.go:612
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.data addr.local:127.0.0.1:3022 addr.remote:127.0.0.1:36534 cluster_name:example.com code:T2006I ei:2.147483646e+09 event:session.data login:bjoerger namespace:default rx:9690 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.206Z tx:5879 uid:42a63caa-7db4-46cb-9a46-28ea720d8dce user:dev events/emitter.go:325
2022-03-24T10:01:49-07:00 DEBU [SESSION:N] Session has encountered 2 slow writes out of 30. Check disk and network on this server. events/auditwriter.go:384
2022-03-24T10:01:49-07:00 INFO [SESSION:N] Closing session 15a2c1a2-9913-48ba-ae53-651be25842ff. srv/sess.go:632
2022-03-24T10:01:49-07:00 DEBU [SESSION:N] Session has encountered 2 slow writes out of 30. Check disk and network on this server. events/auditwriter.go:384
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.leave cluster_name:example.com code:T2003I ei:28 event:session.leave namespace:default server_addr:127.0.0.1:3022 server_hostname:server01 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 server_labels:map[arch:x86_64 cluster:example.com env:staging hostname:bjoerger-ThinkPad-X1-Extreme] sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.206Z uid:9bc972c9-0825-430f-8fc1-05b48d2c91cf user:dev events/emitter.go:325
2022-03-24T10:01:49-07:00 DEBU [TERM:LOCA] Closed PTY srv/term.go:300
2022-03-24T10:01:49-07:00 DEBU [SESSION:N] Stopping poll and sync of terminal size to all parties. srv/sess.go:1176
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.leave cluster_name:example.com code:T2003I ei:28 event:session.leave namespace:default server_addr:127.0.0.1:3022 server_hostname:server01 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 server_labels:map[arch:x86_64 cluster:example.com env:staging hostname:bjoerger-ThinkPad-X1-Extreme] sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.206Z uid:9bc972c9-0825-430f-8fc1-05b48d2c91cf user:dev events/emitter.go:325
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.end cluster_name:example.com code:T2004I ei:29 enhanced_recording:false event:session.end interactive:true namespace:default participants:[dev] server_addr:127.0.0.1:3022 server_hostname:server01 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 server_labels:map[arch:x86_64 cluster:example.com env:staging hostname:bjoerger-ThinkPad-X1-Extreme] session_recording:node session_start:2022-03-24T17:01:32.013503299Z session_stop:2022-03-24T17:01:49.206600381Z sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.207Z uid:77ec8d9d-b73a-4d20-a8cd-21e4194f005e user:dev events/emitter.go:325
2022-03-24T10:01:49-07:00 INFO [AUDIT] session.end cluster_name:example.com code:T2004I ei:29 enhanced_recording:false event:session.end interactive:true namespace:default participants:[dev] server_addr:127.0.0.1:3022 server_hostname:server01 server_id:f58992c2-576d-42bd-be86-498aa37bd3e3 server_labels:map[arch:x86_64 cluster:example.com env:staging hostname:bjoerger-ThinkPad-X1-Extreme] session_recording:node session_start:2022-03-24T17:01:32.013503299Z session_stop:2022-03-24T17:01:49.206600381Z sid:15a2c1a2-9913-48ba-ae53-651be25842ff time:2022-03-24T17:01:49.207Z uid:77ec8d9d-b73a-4d20-a8cd-21e4194f005e user:dev events/emitter.go:325
# Attempt to play session recording
2022-03-24T10:03:59-07:00 DEBU [AUDIT] GetSessionEvents. afterN:0 printEvents:true sid:15a2c1a2-9913-48ba-ae53-651be25842ff events/auditlog.go:896
2022-03-24T10:03:59-07:00 DEBU [AUDIT] Recording 15a2c1a2-9913-48ba-ae53-651be25842ff is already downloaded and unpacked to /home/bjoerger/gravitational/teleport-config/local/storage/data/dir/main/log/playbacks/sessions/default/15a2c1a2-9913-48ba-ae53-651be25842ff.tar. events/auditlog.go:662
2022-03-24T10:03:59-07:00 DEBU [WEB] Unable to find events for session 15a2c1a2-9913-48ba-ae53-651be25842ff. error:[
ERROR REPORT:
Original Error: *trace.NotFoundError session "15a2c1a2-9913-48ba-ae53-651be25842ff" not found
Stack Trace:
/home/bjoerger/gravitational/teleport/lib/events/auditlog.go:586 github.com/gravitational/teleport/lib/events.readSessionIndex
/home/bjoerger/gravitational/teleport/lib/events/auditlog.go:535 github.com/gravitational/teleport/lib/events.(*AuditLog).readSessionIndex
/home/bjoerger/gravitational/teleport/lib/events/auditlog.go:916 github.com/gravitational/teleport/lib/events.(*AuditLog).GetSessionEvents
/home/bjoerger/gravitational/teleport/lib/auth/auth_with_roles.go:2170 github.com/gravitational/teleport/lib/auth.(*ServerWithRoles).GetSessionEvents
/home/bjoerger/gravitational/teleport/lib/auth/apiserver.go:2167 github.com/gravitational/teleport/lib/auth.(*APIServer).getSessionEvents
/home/bjoerger/gravitational/teleport/lib/auth/apiserver.go:305 github.com/gravitational/teleport/lib/auth.(*APIServer).withAuth.func1
/home/bjoerger/gravitational/teleport/lib/httplib/httplib.go:67 github.com/gravitational/teleport/lib/httplib.MakeHandlerWithErrorWriter.func1
/home/bjoerger/gravitational/teleport/vendor/github.com/julienschmidt/httprouter/router.go:387 github.com/julienschmidt/httprouter.(*Router).ServeHTTP
/home/bjoerger/gravitational/teleport/lib/httplib/httplib.go:184 github.com/gravitational/teleport/lib/httplib.RewritePaths.func1
/home/bjoerger/.tools/go/src/net/http/server.go:2047 net/http.HandlerFunc.ServeHTTP
/home/bjoerger/gravitational/teleport/lib/auth/middleware.go:557 github.com/gravitational/teleport/lib/auth.(*Middleware).ServeHTTP
/home/bjoerger/gravitational/teleport/vendor/github.com/gravitational/oxy/ratelimit/tokenlimiter.go:118 github.com/gravitational/oxy/ratelimit.(*TokenLimiter).ServeHTTP
/home/bjoerger/gravitational/teleport/vendor/github.com/gravitational/oxy/connlimit/connlimit.go:75 github.com/gravitational/oxy/connlimit.(*ConnLimiter).ServeHTTP
/home/bjoerger/.tools/go/src/net/http/server.go:2879 net/http.serverHandler.ServeHTTP
/home/bjoerger/.tools/go/src/net/http/server.go:1930 net/http.(*conn).serve
/home/bjoerger/.tools/go/src/runtime/asm_amd64.s:1581 runtime.goexit
Caught:
/home/bjoerger/gravitational/teleport/lib/httplib/httplib.go:142 github.com/gravitational/teleport/lib/httplib.ConvertResponse
/home/bjoerger/gravitational/teleport/lib/auth/clt.go:284 github.com/gravitational/teleport/lib/auth.(*Client).Get
/home/bjoerger/gravitational/teleport/lib/auth/clt.go:1363 github.com/gravitational/teleport/lib/auth.(*Client).GetSessionEvents
/home/bjoerger/gravitational/teleport/lib/web/apiserver.go:2316 github.com/gravitational/teleport/lib/web.(*Handler).siteSessionEventsGet
/home/bjoerger/gravitational/teleport/lib/web/apiserver.go:2479 github.com/gravitational/teleport/lib/web.(*Handler).WithClusterAuth.func1
/home/bjoerger/gravitational/teleport/lib/httplib/httplib.go:67 github.com/gravitational/teleport/lib/httplib.MakeHandlerWithErrorWriter.func1
/home/bjoerger/gravitational/teleport/vendor/github.com/julienschmidt/httprouter/router.go:387 github.com/julienschmidt/httprouter.(*Router).ServeHTTP
/home/bjoerger/.tools/go/src/net/http/server.go:2090 net/http.StripPrefix.func1
/home/bjoerger/.tools/go/src/net/http/server.go:2047 net/http.HandlerFunc.ServeHTTP
/home/bjoerger/gravitational/teleport/lib/web/apiserver.go:444 github.com/gravitational/teleport/lib/web.NewHandler.func1
/home/bjoerger/.tools/go/src/net/http/server.go:2047 net/http.HandlerFunc.ServeHTTP
/home/bjoerger/gravitational/teleport/vendor/github.com/julienschmidt/httprouter/router.go:448 github.com/julienschmidt/httprouter.(*Router).ServeHTTP
/home/bjoerger/gravitational/teleport/lib/web/apiserver.go:215 github.com/gravitational/teleport/lib/web.(*WebAPIHandler).ServeHTTP
/home/bjoerger/gravitational/teleport/vendor/github.com/gravitational/oxy/ratelimit/tokenlimiter.go:118 github.com/gravitational/oxy/ratelimit.(*TokenLimiter).ServeHTTP
/home/bjoerger/gravitational/teleport/vendor/github.com/gravitational/oxy/connlimit/connlimit.go:75 github.com/gravitational/oxy/connlimit.(*ConnLimiter).ServeHTTP
/home/bjoerger/.tools/go/src/net/http/server.go:2879 net/http.serverHandler.ServeHTTP
/home/bjoerger/.tools/go/src/net/http/server.go:1930 net/http.(*conn).serve
/home/bjoerger/.tools/go/src/runtime/asm_amd64.s:1581 runtime.goexit
User Message: session "15a2c1a2-9913-48ba-ae53-651be25842ff" not found
] web/apiserver.go:2318
``
Related to https://github.com/gravitational/teleport/issues/10983
The text was updated successfully, but these errors were encountered:
With the changes made in #4045, after a 24 hour grace period, a session's event stream is uploaded, whether the session is closed or not. This logic was built assuming that a after 24 hours a session must be abandoned. However, a session can remain open indefinitely, and the grace period logic misbehaves in this edge case.
teleport/lib/events/complete.go
Lines 140 to 143 in 506da87
With some of the recent changes made to session uploading and moderated sessions, the grace period's misbehavior has some varying effects depending on the version.
v8.3.4
If the server is closed during a session before the grace period, upon restart the server will successfully upload the session after the grace period, as expected.
But if the session remains open after the grace period upload, then the following problems arise:
Recording for this session is not available
.What should happen
When the grace period is reached on a pending session upload:
session.leave
event.Ideally, users would use the
node-sync
session streaming option, which would help to avoid this disruptive grace period behavior altogether, outside of the intended scenario where a session stream fails to upload at the end of the session's life time (network issue, etc.).Debug logs -
v8.3.4
- session exits immediately after grace period upload completesThe text was updated successfully, but these errors were encountered: