Make getAvailableSpace return available space instead of max requested space #773

Open
deividasstr wants to merge 1 commit into main

Conversation

deividasstr

Hey, I'm fixing a bug where disk persistence is not used if the requested max cache size (set via DiskBufferingConfiguration.builder().setMaxCacheSize) is less than roughly 7 MB. The default of 60 MB sounds like a lot.

This log gave me a confusing hint: Insufficient folder cache size: -715243, it must be at least: 1048576.

The issue lies in the method fixed here, getAvailableSpace, which returns the available space only when the max needed space is higher than the available space.

It seems the fixed behavior is the intended one, as DiskManager, which uses this method, names the returned value availableCacheSize.
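
For context, a rough sketch of the behavior being changed, with assumed signatures (the real method lives in DiskManager and may look different):

```java
import java.io.File;

// Assumed signatures for illustration only; not the actual DiskManager code.
final class AvailableSpaceSketch {
    // Behavior as currently described: the requested max wins unless the device
    // has less free space than that, i.e. effectively min(usable, requested).
    static long before(File dir, long maxSpaceNeeded) {
        long usable = dir.getUsableSpace();
        return usable < maxSpaceNeeded ? usable : maxSpaceNeeded;
    }

    // Behavior the PR title proposes: report the space that is actually available.
    static long after(File dir, long maxSpaceNeeded) {
        return dir.getUsableSpace();
    }
}
```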

Screenshot of my debug breakpoint (requesting a max cache size of 1_000_000 bytes): [screenshot omitted]

From the numbers there, with the current implementation and effectively unlimited available disk space, more than 6_000_000 bytes have to be requested as the max cache size for if (calculatedSize < maxCacheFileSize) to be false (and for persistence to be enabled).
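
Working the numbers from the log and the screenshot (the per-signal math below is inferred from the observed values, not copied from the DiskManager source):

```java
public class CacheSizeMath {
    public static void main(String[] args) {
        long maxCacheSize = 1_000_000;      // requested via setMaxCacheSize
        long maxCacheFileSize = 1_048_576;  // 1 MB max per cache file
        long signalTypes = 3;               // spans, logs and metrics each get a folder

        // Per-signal folder size as it appears to be computed:
        long calculatedSize = maxCacheSize / signalTypes - maxCacheFileSize;
        System.out.println(calculatedSize); // -715243, the value from the log message

        // The validation requires room for at least one full cache file per folder,
        // so persistence stays disabled until maxCacheSize >= 6 * maxCacheFileSize (~6.3 MB).
        System.out.println(calculatedSize < maxCacheFileSize); // true -> persistence disabled
    }
}
```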

While writing this I got confused: why is there a minimum limit for the cache at all? Why does each signal's folder have to be bigger than maxCacheFileSize, with another maxCacheFileSize subtracted from its size, when CacheStorage already allocates the requested max bytes?

@deividasstr deividasstr requested a review from a team as a code owner January 28, 2025 15:32

linux-foundation-easycla bot commented Jan 28, 2025

CLA Signed: the committers listed above are authorized under a signed CLA.

@breedx-splk
Contributor

@deividasstr Thanks for the contribution! That error message sure is confusing and maybe even a little embarrassing. 😁 We will take a look at the PR soon, but in the meantime, can you please sign the contributor license agreement (CLA)? Thanks again!

@deividasstr
Author

Hey @breedx-splk, I have signed the CLA.

So to sum up, this PR has turned into a question: is a minimum cache size required or not?

If yes, I can update the methods and docs in a separate PR.
If no, please let me know how to proceed. Maybe my proposed changes aren't required and something around if (calculatedSize < maxCacheFileSize) needs to be adjusted instead.

@breedx-splk
Contributor

Hey @deividasstr. I thought @LikeTheSalad was going to comment on this, but I think the overall reason we tried to limit the amount of disk used was to be a good citizen on the device. If the instrumentation were to get too aggressive with disk usage (due to being offline, or for whatever other reason, such as excessive telemetry), it could limit the app's own ability to function. We tried to find a reasonable compromise that lets the app keep using the cache while the instrumentation also uses part of it.

The idea was to eventually drop telemetry or stop capturing data if the device is too limited...that way the app is unharmed.

is a minimum cache size required or not?

I think there is. Something too small would prevent us from correctly buffering even a few spans or events to disk. Curious how @LikeTheSalad thinks we should proceed.

@LikeTheSalad
Contributor

LikeTheSalad commented Jan 31, 2025

Hi @deividasstr, thank you for your PR!

Regarding a minimum cache size requirement: at first I thought the same as @breedx-splk, that it would be useful to ensure there is some space for the disk buffering functionality to work properly. The way the cache works here is that each signal has a folder in which we store multiple files of at most 1 MB each (iirc), though they can be smaller than that. So it made sense to try to ensure that there's at least 1 MB available per signal, plus 1 MB to account for the data that is read-only, so it doesn't get cleared before being exported.

However, after taking another look at it, and after adding some extra validations for when a file is corrupted, it seems that a minimum cache size might not be needed after all. If something goes wrong, the data will be sent to the exporter right away, which is what would happen anyway if disk buffering can't be properly configured because the max folder size was computed as zero, which can happen because of this validation.
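
A minimal sketch of that fallback, assuming a hypothetical setup helper rather than the actual wiring in this repo:

```java
import io.opentelemetry.sdk.trace.export.SpanExporter;

// If disk buffering can't be set up (for example the computed folder size is zero),
// telemetry is handed straight to the delegate exporter instead of being buffered.
final class ExporterSetupSketch {
    static SpanExporter chooseExporter(SpanExporter delegate) {
        try {
            return createDiskBufferedExporter(delegate); // hypothetical helper
        } catch (Exception e) {
            return delegate; // export right away, as described above
        }
    }

    // Placeholder standing in for the real disk buffering setup.
    static SpanExporter createDiskBufferedExporter(SpanExporter delegate) throws Exception {
        throw new UnsupportedOperationException("illustrative placeholder");
    }
}
```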

So to summarise, unless I missed something important, it seems like we can remove this validation altogether.

We still need to try to use as few resources as possible from the host app, though, so I don't like the idea of using the whole available disk space as the max cache size for the APM use case. I'm not sure what the best approach is here; one option could be to avoid executing this method and instead always use the getUsableSpace() approach, then take a fraction of what it returns.
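
A minimal sketch of that "fraction of usable space" idea; the helper name and the 10% factor are placeholders, not an agreed-upon design:

```java
import java.io.File;

// Cap the cache at a fraction of what the device currently has free,
// so the instrumentation never claims the whole disk.
final class UsableSpaceSketch {
    static long availableCacheSize(File cacheDir, long requestedMaxCacheSize) {
        long fractionOfFreeSpace = cacheDir.getUsableSpace() / 10; // ~10% of free disk
        return Math.min(requestedMaxCacheSize, fractionOfFreeSpace);
    }
}
```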

@deividasstr
Author

deividasstr commented Feb 3, 2025

Thanks for being so involved in the discussion!

My suggestion would be:

  • Remove the minimum size requirement (this validation).
  • Reduce the default cache size from 60 MB to 10 MB? Datadog's default is ~12 MB, Firebase's is 10 MB.
  • As a developer, I would like to see how often events are lost due to insufficient storage (and whether that's caused by a general storage shortage or because the max cache size was insufficient). Are there logs of persistence failures? Maybe even a callback for such cases could be useful (a rough shape is sketched below).
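
Purely illustrative shape for such a callback; nothing like this exists in the disk buffering artifact today, and every name here is hypothetical:

```java
public interface DiskBufferingListener {
    // Invoked when telemetry could not be persisted (cache full, disk error, ...).
    void onTelemetryDropped(String signalType, int droppedItems, Throwable cause);
}
```

Hypothetical wiring could look like DiskBufferingConfiguration.builder().setDiskBufferingListener(listener), but no such builder method exists yet.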

@deividasstr
Author

What do you think @LikeTheSalad @breedx-splk?

@LikeTheSalad
Contributor

What do you think @LikeTheSalad @breedx-splk?

I agree with the points you mentioned earlier; I just wanted to give people some time to suggest other ideas in case we missed something, though that doesn't seem to be the case. I've added this topic to tomorrow's Android SIG meeting to discuss the default cache size value, and if no other concerns are raised on the call, I think we can move forward with these changes. For your third item, I like the idea of adding a callback for those cases; that's something I can add to the disk buffering artifact.
