feat: send events when user leave the page #1146

devcorpio · 2022-01-20T19:54:36Z

Context

Currently the RUM agent sends events to the APM server at intervals of 500ms. Despite that, there are still scenarios where no information is sent.

Take into account the following scenarios:

user closes the tab
user closes the browser
user navigates to another page
user reloads the page

If any of them takes place before the interval fires, the information collected up to that moment is lost. That loss is precisely what the changes in this PR aim to reduce as much as possible.

Summary

New logic that combines the Page Lifecycle API events visibilitychange and pagehide with fetch and its keepalive flag in order to send events when the user leaves the page.

"The keepalive option can be used to allow the request to outlive the page. Fetch with the keepalive flag is a replacement for the Navigator.sendBeacon() API." (quoted from the MDN documentation for fetch)
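A minimal sketch of how these pieces could fit together. The names onPageLeave and sendWithKeepalive, and the injected doc, win and fetchImpl parameters, are assumptions made for illustration; they are not the agent's actual API.

```javascript
// Hypothetical helper: calls `flush` when the page is being hidden or unloaded.
// `doc` and `win` are injected (document and window in a browser) so the
// sketch can also be exercised outside a browser.
function onPageLeave(doc, win, flush) {
  const onVisibilityChange = () => {
    // visibilitychange also fires when the page becomes visible again
    if (doc.visibilityState === 'hidden') {
      flush()
    }
  }
  const onPageHide = () => flush()

  doc.addEventListener('visibilitychange', onVisibilityChange, true)
  win.addEventListener('pagehide', onPageHide, true)

  // Return a function that removes both listeners
  return () => {
    doc.removeEventListener('visibilitychange', onVisibilityChange, true)
    win.removeEventListener('pagehide', onPageHide, true)
  }
}

// Hypothetical sender: keepalive asks the browser to let the request
// outlive the page that started it.
function sendWithKeepalive(url, ndjsonPayload, fetchImpl = globalThis.fetch) {
  return fetchImpl(url, {
    method: 'POST',
    body: ndjsonPayload,
    headers: { 'Content-Type': 'application/x-ndjson' },
    keepalive: true
  })
}
```

In a browser this could be wired as onPageLeave(document, window, () => sendWithKeepalive(serverUrl, payload)), where serverUrl and payload are placeholders. Both events can fire for a single page exit, so flush should be safe to call when there is nothing left to send.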

Why aren't we using the Beacon API?

Since the Content-Type of our requests is application/x-ndjson, the browser performs a preflight request, and with the Beacon API the request's credentials mode is set to include. Hence, unless the APM Server includes Access-Control-Allow-Credentials: true in its response, the browser reports the following error: Response to preflight request doesn't pass access control check: The value of the 'Access-Control-Allow-Credentials' header in the response is '' which must be 'true' when the request's credentials mode is 'include'
The Beacon API does not allow adding custom headers, and we need to add Content-Encoding after compressing the payload with the Compression Streams API
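To illustrate the second point, here is a minimal sketch of compressing a payload with the Compression Streams API and producing the headers the request then needs. The function name compressPayload and its return shape are assumptions for illustration, not the agent's code.

```javascript
// Hypothetical helper: gzip the payload when the Compression Streams API is
// available and report the headers the request would need.
async function compressPayload(payload) {
  const headers = { 'Content-Type': 'application/x-ndjson' }

  if (typeof CompressionStream !== 'function') {
    // No Compression Streams API: send the payload as is
    return { body: payload, headers }
  }

  const compressed = new Blob([payload])
    .stream()
    .pipeThrough(new CompressionStream('gzip'))
  // Collect the compressed stream into a Blob usable as a request body
  const body = await new Response(compressed).blob()

  // navigator.sendBeacon() offers no way to set this header
  headers['Content-Encoding'] = 'gzip'
  return { body, headers }
}
```

With fetch, the returned headers can be passed along directly, for example fetch(url, { method: 'POST', body, headers, keepalive: true }), where url is a placeholder.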

Thoughts about this

Could it make sense to stop sending the Content-Type in our requests? (At the moment it is a mandatory field and value.)
Without that constraint, navigator.sendBeacon() could be used in all browsers except Chromium browsers (because they are currently the ones compressing the payload with gzip).

What browsers do not support the `keepalive` flag in fetch?

IE 11 (because this browser does not even support fetch)
Firefox performs the request as if the flag was set to false
Safari < 13 performs the request as if the flag was set to false

What browsers do support the `keepalive` flag in fetch?

Chrome, Edge (Chromium), and Safari >=13 support it properly
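One common way to detect the flag at runtime is to look for the keepalive property on Request. The sketch below assumes that approach; the function name and the scope parameter are hypothetical, and this is not necessarily the check used in this PR.

```javascript
// Hypothetical check: `scope` is window in a browser. It is a parameter here
// only so the sketch can be tried with stand-in objects.
function isKeepaliveSupported(scope = globalThis) {
  return (
    typeof scope.fetch === 'function' &&
    typeof scope.Request === 'function' &&
    'keepalive' in scope.Request.prototype
  )
}
```

A browser without fetch at all (such as IE 11) fails the first condition, and a browser that has fetch but does not expose the flag fails the last one.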

Unfortunately, there is an exception in Chromium browsers, where two scenarios are not covered:

page reload
navigation to a different page

Those scenarios are not covered because the compression process is not fast enough, so the APM server does not receive the request.

In Chromium browsers there are two more interesting nuances (in this case good ones):

When closing the tab or browser, the compression delay is not a problem and the request is sent
When closing the tab or browser, even XHR requests are sent!

When is all this logic executed?

When the events visibilitychange or pagehide are triggered.

Why aren't we using the `unload` event?

The details are well explained in this link. The main reasons are that the event is not reliable (especially on mobile) and that it prevents the bfcache from working as expected. In fact, because of that we might need to change our web-vitals implementation.

Why are we still using the `500ms` interval?

In SPAs, the information collected after navigating a few pages exceeds 64 KiB, which is the limit for fetch with keepalive (the Beacon API has the very same limit). We need to bear in mind that at the moment we only compress the payload in Chromium browsers, so payloads of 64 KiB would not be that uncommon. If we removed the interval, we would not send any information at all in the SPA scenario.

Is there any fallback?

XHR is the fallback in two different situations:

if the payload exceeds the size limit (in fact, I set the limit a bit lower since it may vary depending on the browser)
Chrome < 81 had a bug with preflight requests when using keepalive. So, more generally, if fetch fails with a network error, XHR is used as the fallback

But keep in mind that the XHR request will not be sent when:

user closes the tab
user closes the browser
user navigates to another page
user reloads the page
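The two fallback situations can be sketched as follows. The 60 KiB value comes from the commit list of this PR; the helper names, the injected sendFetch and sendXhr senders, and the assumption that the payload is a string are all hypothetical.

```javascript
// Assumed limit: a bit below the 64 KiB keepalive quota
const KEEPALIVE_LIMIT_BYTES = 60 * 1024

// Size of a string payload in bytes once encoded as UTF-8
function byteLength(payload) {
  return new TextEncoder().encode(payload).length
}

// Hypothetical decision helper: keepalive is only worth trying when the
// browser supports it and the payload fits under the limit.
function canUseKeepalive(payload, keepaliveSupported) {
  return keepaliveSupported && byteLength(payload) < KEEPALIVE_LIMIT_BYTES
}

// `sendFetch` and `sendXhr` are injected, hypothetical senders that each
// return a promise.
function sendWithFallback(payload, { keepaliveSupported, sendFetch, sendXhr }) {
  if (!canUseKeepalive(payload, keepaliveSupported)) {
    return sendXhr(payload)
  }
  // A rejected fetch (for example a network error) is retried over XHR
  return sendFetch(payload).catch(() => sendXhr(payload))
}
```

Injecting the senders keeps the decision logic separate from the transport details, which also makes it easy to exercise both paths in tests.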

dev-utils/karma.js

packages/rum-core/src/common/config-service.js

apmmachine · 2022-01-20T20:24:06Z

📦 Bundlesize report

Filename	Size(bundled)	Size(gzip)	Diff(gzip)
elastic-apm-opentracing.umd.min.js	66.0 KiB	21.0 KiB	⚠️ 611 Bytes
elastic-apm-rum.umd.min.js	59.8 KiB	19.4 KiB	⚠️ 608 Bytes

apmmachine · 2022-01-20T20:24:09Z

💚 Build Succeeded

Build stats

Start Time: 2022-02-10T12:46:30.348+0000
Duration: 97 min 36 sec

Test stats 🧪

Test	Results
Failed	0
Passed	770
Skipped	8
Total	778

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
run benchmark tests : Run the benchmark test.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

packages/rum-core/src/state.js

packages/rum-core/src/performance-monitoring/metrics.js

packages/rum-core/src/common/apm-server.js

packages/rum-core/src/common/http/fetch.js

packages/rum-core/src/common/page-visibility.js

packages/rum-core/src/common/patching/fetch-patch.js

packages/rum-core/src/opentracing/index.js

packages/rum-core/src/common/page-visibility.js

vigneshshanmugam · 2022-01-26T05:01:02Z

packages/rum/src/index.js

+    TRANSACTION_SERVICE
+  ])
+
+  const enabled = bootstrap(configService, transactionService)


The reason bootstrap was called before creating the service factory is that we need to account for the time at which the agent got initialized. We might need to split this from installing the page visibility handlers if we need the service factories to be created first.

Your comment just made me realise that a better place to add the page visibility handlers could be inside ApmBase.init; otherwise we would be adding listeners even when the agent is not initialised.

devcorpio · 2022-01-31T16:59:44Z

packages/rum-core/src/performance-monitoring/performance-monitoring.js

@@ -258,6 +260,18 @@ export default class PerformanceMonitoring {
    const configService = this._configService
    const transactionService = this._transactionService

+    // do not process calls to our own endpoints
+    if (task.data && task.data.url) {


If the request's url points to our own endpoints we will NOT process it. This new behaviour works for XHR and fetch.

P.S. Nevertheless, I'm not removing the XHR_IGNORE logic since it might be useful for other use cases (although I would advocate removing it in future PRs if there are no valid use cases)

P.S.2. This new logic also helps to cover the case where our users make use of fetch polyfills that rely on XHR

Can you make an issue and get rid of XHR_IGNORE in the next PRs? I don't think it's required, as we now have this logic, which can handle both.

Created! #1164
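For readers following the thread, the check under discussion can be sketched roughly like this. The helper name isOwnEndpoint and the endpoints array are hypothetical; only the task.data.url guard comes from the diff above.

```javascript
// Hypothetical helper: `endpoints` stands in for the URLs the agent itself
// posts to. Returns true when the task's request targets one of them.
function isOwnEndpoint(task, endpoints) {
  if (!task.data || !task.data.url) {
    return false
  }
  // Works the same for XHR and fetch tasks since only the URL is inspected
  return endpoints.some(endpoint => task.data.url.indexOf(endpoint) !== -1)
}
```

Because the decision depends only on the request URL, it also covers fetch polyfills that are built on top of XHR.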

vigneshshanmugam

Great work and the details are super helpful @devcorpio 🎉

Thanks for going the extra mile in adding lots of tests. I just added a couple of small points. Once we clear those, this PR is good to go.

packages/rum-core/src/common/page-visibility.js

vigneshshanmugam · 2022-02-08T20:06:42Z

packages/rum-core/src/common/page-visibility.js

+ * @param transactionService
+ */
+export function observePageVisibility(configService, transactionService) {
+  if (document.visibilityState === 'hidden') {


I am actually wondering: do we want to capture the lastHiddenStart time when the RUM agent is loaded, or when the RUM agent is initialized?

The listeners should still be added when the agent is initialized, though. Thoughts @devcorpio?

@vigneshshanmugam it is true that previously the lastHiddenStart value was updated even when the agent was not initialised, and now it is the opposite.

Umm, to be honest, I'm not able to see the benefit of allowing that value to be set with the agent uninitialised. Are you seeing any nuance that I'm missing?

And when it comes to the listeners, maybe I'm a bit of a purist (or stubborn 😄), but I don't like the idea of having code within the agent that reacts to user interactions while the agent is uninitialised

I couldn't think of a solid case either. I was just trying to remember why I did it this way before and wondering if it would cause any problems.

But let's move ahead, not a blocker for sure. We can revisit if we see any problems.

packages/rum-core/test/common/apm-server.spec.js

packages/rum-core/test/common/page-visibility.spec.js

vigneshshanmugam · 2022-02-08T21:00:47Z

packages/rum-core/src/performance-monitoring/performance-monitoring.js

@@ -258,6 +260,18 @@ export default class PerformanceMonitoring {
    const configService = this._configService
    const transactionService = this._transactionService

+    // do not process calls to our own endpoints
+    if (task.data && task.data.url) {


Can you make an issue and get rid of XHR_IGNORE in the next PRs? I don't think it's required, as we now have this logic, which can handle both.

vigneshshanmugam

LGTM

devcorpio requested a review from vigneshshanmugam January 20, 2022 19:57

devcorpio linked an issue Jan 20, 2022 that may be closed by this pull request

Consider using Beacon API for sending payload on unload event #195

Closed

devcorpio commented Jan 20, 2022

dev-utils/karma.js (resolved)

devcorpio commented Jan 20, 2022

packages/rum-core/src/common/config-service.js (resolved)

devcorpio added 3 commits January 25, 2022 12:27

feat: send events when user leaves the page

fc47823

chore: add tests for fetch

e141d76

chore: fix flaky tests

f7d6f99

devcorpio marked this pull request as ready for review January 25, 2022 15:07

vigneshshanmugam reviewed Jan 26, 2022

devcorpio added 6 commits January 27, 2022 16:03

chore: observe page visibility when initialising

b49df95

chore: set limit to 60 KiB

f3426c5

chore: describe reason behind the unobserve

0f24a58

chore: use existing constants for services

dc39ec6

chore: change global state restore strategy

b12b12c

chore: do not process calls to agent endpoints

57974a1

devcorpio commented Jan 31, 2022

devcorpio added 2 commits February 1, 2022 13:53

chore: improve keepalive check

a68e36f

chore: avoid executing keepalive if beforeSend defined

20788d9

devcorpio requested a review from vigneshshanmugam February 2, 2022 17:55

vigneshshanmugam approved these changes Feb 8, 2022

devcorpio mentioned this pull request Feb 10, 2022

Get rid of XHR_IGNORE flag #1164

Open

devcorpio added 3 commits February 10, 2022 13:11

chore: move variables inside function

c828a51

chore: use customEvent util

29d26d1

chore: remove needless spy

0251d52

devcorpio requested a review from vigneshshanmugam February 10, 2022 15:45

vigneshshanmugam approved these changes Feb 10, 2022

devcorpio merged commit 2429814 into elastic:main Feb 10, 2022

devcorpio deleted the send_beacon_unload branch February 10, 2022 15:57

devcorpio mentioned this pull request Feb 11, 2022

add Beacon API support #1166

Open

paulb-elastic mentioned this pull request Feb 23, 2022

Send RUM events for all activity (including partial page loads that don't reach onLoad) #970

Closed

devcorpio mentioned this pull request Feb 24, 2022

Use fetch rather than XHR for http requests #1171

Open

4 tasks

devcorpio mentioned this pull request Mar 28, 2022

fix: report LCP score properly #1190

Merged

mshustov mentioned this pull request Apr 21, 2022

bump RUM agent version elastic/kibana#130765

Merged

devcorpio mentioned this pull request May 23, 2022

Track custom unmanaged transactions #1231

Open
