getProgramAccounts() large string error #73

Open
GustavAlbrecht opened this issue Jan 22, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@GustavAlbrecht

GustavAlbrecht commented Jan 22, 2025

Overview

The getProgramAccounts() method call throws an error when used on the Stake program:
Error: Cannot create a string longer than 0x1fffffe8 characters

Steps to reproduce

```js
const STAKE_PROGRAM_ID = "Stake11111111111111111111111111111111111111";

let rpcClient = createSolanaRpc(config.get_program_accounts_rpc_endpoint);

let data = await rpcClient.getProgramAccounts(STAKE_PROGRAM_ID, { encoding: "base64" }).send();
```

Edit: node --version reports v22.11.0.

Description of bug

The core issue is a Node-internal limit on string size. I previously worked around it with libraries like stream-json (see the sketch below). Since getProgramAccounts can be expected to return very large responses, I consider it a bug that this library doesn't handle heavy payloads coming from this call.
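For reference, a minimal sketch of that workaround, assuming axios plus the stream-json and stream-chain packages; the endpoint constant and the handling inside the data callback are placeholders:

```js
import axios from "axios";
import { chain } from "stream-chain";
import { parser } from "stream-json";
import { pick } from "stream-json/filters/Pick";
import { streamArray } from "stream-json/streamers/StreamArray";

const RPC_ENDPOINT = "https://api.mainnet-beta.solana.com"; // placeholder RPC URL

// POST the raw JSON-RPC request and ask axios for a Node stream instead of a string body.
const response = await axios.post(
  RPC_ENDPOINT,
  {
    jsonrpc: "2.0",
    id: 1,
    method: "getProgramAccounts",
    params: ["Stake11111111111111111111111111111111111111", { encoding: "base64" }],
  },
  { responseType: "stream" }
);

// Parse the body incrementally: pick the "result" array and emit one account at a time,
// so the full response text never has to exist as a single JS string.
const pipeline = chain([
  response.data,
  parser(),
  pick({ filter: "result" }),
  streamArray(),
]);

pipeline.on("data", ({ value: account }) => {
  // account.pubkey and account.account are available here, one element at a time.
});
pipeline.on("end", () => console.log("done"));
```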

GustavAlbrecht added the bug label on Jan 22, 2025
@steveluscher
Collaborator

Very interesting. What version of Node is this by the way?

@lorisleiva, do you think this is a limitation introduced by our new bigint-aware JSON parser, one that's not present in the native parser?

@GustavAlbrecht
Author

I'm on Node v22.11.0 (node --version).

@lorisleiva
Member

Ugh, yeah, it is likely that our custom JSON parser (the one that keeps bigint values above 2^53-1 from losing precision) ends up causing this limitation.

This is because we have to await response.text() instead of response.json(), and the latter likely makes use of data streams internally.

As such, we end up hitting the Node limitation on string length, which is set at 0x1fffffe8 characters.

However, that many characters amounts to roughly 512 MiB of data, and perhaps the pros of having safe u64 values (e.g. lamports) on the client outweigh the cons of having a half-gigabyte data size limit by default.

I say by default because anyone can customise their RPC object by providing custom transports and APIs, which, when dealing with such large amounts of data, might be the best way forward anyway.
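For context, the precision loss that the custom parser guards against looks like this; a tiny illustration, not library code:

```js
// Native JSON.parse coerces large integers to IEEE-754 doubles, so u64 values above
// Number.MAX_SAFE_INTEGER (2^53 - 1) silently lose precision.
const { lamports } = JSON.parse('{"lamports":9007199254740993}');
console.log(lamports); // 9007199254740992, off by one, with no error

// A bigint-aware parser can instead preserve the exact value (9007199254740993n),
// which is the safety property discussed above.
```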

@steveluscher
Collaborator

Is this the case, @GustavAlbrecht? Are you downloading more than 512 MB of data from an RPC?

@steveluscher
Collaborator

Do you think there's any advantage to changing the fromJson API to instead deal with the body directly, which is a ReadableStream? I suppose we could let fromJson take in the body instead, build up the JSON string with your parser in a streaming fashion, and then JSON.parse() it. At worst it would make the processor faster (i.e. be able to start sooner) and at best it might fix this problem?
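A rough sketch of reading the body incrementally, assuming a streaming-capable parser behind the hypothetical feedChunk/finish hooks (these names are illustrative, not part of the actual API):

```js
// Hypothetical: consume the ReadableStream body chunk by chunk instead of awaiting response.text().
async function parseJsonFromBody(body, feedChunk, finish) {
  const decoder = new TextDecoder();
  const reader = body.getReader(); // body is a ReadableStream<Uint8Array>
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Hand each decoded chunk to the streaming parser instead of concatenating one giant string.
    feedChunk(decoder.decode(value, { stream: true }));
  }
  feedChunk(decoder.decode()); // flush any bytes still buffered in the decoder
  return finish(); // the streaming parser produces the final parsed value
}
```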

@steveluscher
Collaborator

steveluscher commented Feb 20, 2025

We definitely won't be including something like this in the core library, but you might consider making a custom network transport that either disables our bigint parser (a sketch of that option is below), or uses something like https://github.com/karminski/streaming-json-js to convert it to JSON progressively without having to store the entire string.
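A minimal sketch of the first option, assuming a transport that posts the JSON-RPC payload itself and lets the platform's native JSON parsing handle the response. The exact RpcTransport signature should be checked against the library's types, and whether .json() actually sidesteps the string limit depends on how the runtime implements it:

```js
const RPC_ENDPOINT = "https://api.mainnet-beta.solana.com"; // placeholder RPC URL

// Assumed transport shape: receives the JSON-RPC payload, returns the parsed response.
// Verify against the library's RpcTransport type before wiring this in.
async function nativeJsonTransport({ payload, signal }) {
  const response = await fetch(RPC_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
    signal,
  });
  // Native parsing: values such as lamports come back as plain numbers and can lose
  // precision above 2^53 - 1, but the custom bigint-aware text parser is bypassed.
  return await response.json();
}
```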

@GustavAlbrecht
Author

GustavAlbrecht commented Feb 23, 2025

> Is this the case, @GustavAlbrecht? Are you downloading more than 512 MB of data from an RPC?

> We definitely won't be including something like this in the core library, but you might consider making a custom network transport that either disables our bigint parser, or uses something like https://github.com/karminski/streaming-json-js to convert it to JSON progressively without having to store the entire string.

To answer your first question: yes, I download more than 512 MB. I remember that in 2024 I had to switch to progressive parsing because the stake accounts response surpassed the 512 MB limit.

For the second question: to clarify, I was using axios back then, together with the stream-json and stream-chain packages, to fix the issue. I was recently looking to replace that setup with this library, but given this issue I didn't complete the migration. I haven't used stream-json with this library or modified anything, so I don't have a strong opinion on the best way to fix it.

I guess that, in theory, an implementation that parses the JSON text and yields account objects in chunks would be ideal in terms of memory and processing for many use cases, or at least for mine, where I fetch accounts, send them to the stake account parser, and then persist them in Postgres in batches. Right now I first have to wait until the whole text is parsed into account objects before I can start decoding account.data and persisting stake account objects in Postgres. Something along the lines of the sketch below is what I have in mind.
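A rough sketch of that shape, assuming an async iterable of already-parsed account objects coming out of a streaming parser; parseStakeAccount and persistBatch are placeholders for my own decoding and Postgres code:

```js
// Hypothetical consumer: accounts arrive one at a time from a streaming parser,
// get decoded, and are flushed to Postgres in fixed-size batches.
async function persistStakeAccounts(accountStream, { batchSize = 1000 } = {}) {
  let batch = [];
  for await (const account of accountStream) {
    batch.push(parseStakeAccount(account)); // placeholder for the stake account parser
    if (batch.length >= batchSize) {
      await persistBatch(batch); // placeholder for a batched INSERT into Postgres
      batch = [];
    }
  }
  if (batch.length > 0) {
    await persistBatch(batch); // flush the final partial batch
  }
}
```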
