Spanner *runStream* cause a lot memory consumption #934

Closed
hadson19 opened this issue May 4, 2020 · 17 comments · Fixed by #936
Assignees: olavloite
Labels: api: spanner (Issues related to the googleapis/nodejs-spanner API.), priority: p2 (Moderately-important priority. Fix may not be included in next release.), type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@hadson19

hadson19 commented May 4, 2020

Environment details

  • OS: Windows
  • Node.js version: 10.x
  • npm version: 6.9.0
  • @google-cloud/spanner version: @google-cloud/[email protected]

Steps to reproduce

const { Spanner } = require("@google-cloud/spanner");
const fs = require("fs");
const util = require('util');
const stream = require('stream');
const through = require('through');

const pipeline = util.promisify(stream.pipeline);

const spanner = {
  "instanceId": "..",
  "databaseId": "..",
  "projectId": ".."
};
const filePath = "./users.data.txt";
const keyFilename = "service-account.json";

const spannerDatabase = new Spanner({
  projectId: spanner.projectId,
  keyFilename,
})
  .instance(spanner.instanceId)
  .database(spanner.databaseId);

const spannerStream = spannerDatabase.runStream({
  sql: 'select * from user',
  json: true,
  gaxOptions: {
    pageSize: 1000,
  },
});

const run = async () => {
  await pipeline(
    spannerStream,
    through(function (data) {
      return this.queue(`${JSON.stringify(data)}\n`);
    }),
    fs.createWriteStream(filePath)
  );

  process.exit(0);
};

run();

The stream consumes a lot of memory (RAM); sometimes it takes up to 600 MB.
The gaxOptions (pageSize, maxResults, retry) don't seem to have any effect.

Thanks!

@product-auto-label product-auto-label bot added the api: spanner Issues related to the googleapis/nodejs-spanner API. label May 4, 2020
@hadson19 hadson19 changed the title Spanner *runStream* cause a lot mamory consumption Spanner *runStream* cause a lot memory consumption May 4, 2020
@olavloite olavloite self-assigned this May 4, 2020
@olavloite olavloite added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. type: question Request for information or clarification. Not an issue. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 4, 2020
@olavloite
Contributor

@hadson19
I'm not directly able to reproduce this. Could you elaborate a little bit on how you are measuring the memory consumption?

I've tried to reproduce the problem with the following script:

async function queryWithMemUsage(instanceId, databaseId, projectId) {
  // Imports the Google Cloud client library
  const {Spanner} = require('@google-cloud/spanner');

  // Creates a client
  const spanner = new Spanner({
    projectId: projectId,
  });

  // Gets a reference to a Cloud Spanner instance and database
  const instance = spanner.instance(instanceId);
  const database = instance.database(databaseId);

  const query = {
    sql: `SELECT *
          FROM TableWithAllColumnTypes
          ORDER BY ColInt64`,
    gaxOptions: {pageSize: 1000},
  };

  let count = 0;
  database
    .runStream(query)
    .on('data', row => {
      count++;
      if (count % 100 === 0) {
        console.log(`Current row: ${row.toJSON({wrapNumbers: true})}`);
        console.log(`Processed ${count} rows so far`);
        const used = process.memoryUsage().heapUsed / 1024 / 1024;
        console.log(`Current mem usage: ${Math.round(used * 100) / 100} MB`);
      }
    })
    .on('error', console.log)
    .on('end', () => {
      console.log(`Finished processing ${count} rows`);
      database.close();
    });
}

When running that script, the memory usage printed for every 100 rows varies between roughly 30 MB and 80 MB. The memory usage increases steadily to approximately 80 MB, at which point a garbage collection reduces it back to 30 MB.

Example log from a run of the script using a table which contains huge rows:

Current row: [object Object]
Processed 100 rows so far
Current mem usage: 32.75 MB
Current row: [object Object]
Processed 200 rows so far
Current mem usage: 38.73 MB
Current row: [object Object]
Processed 300 rows so far
Current mem usage: 48.99 MB
Current row: [object Object]
Processed 400 rows so far
Current mem usage: 34.41 MB
Current row: [object Object]
Processed 500 rows so far
Current mem usage: 32.81 MB
Current row: [object Object]
Processed 600 rows so far
Current mem usage: 31.85 MB
Current row: [object Object]
Processed 700 rows so far
Current mem usage: 37.9 MB
Current row: [object Object]
Processed 800 rows so far
Current mem usage: 39.96 MB
Current row: [object Object]
Processed 900 rows so far
Current mem usage: 54.63 MB
Current row: [object Object]
Processed 1000 rows so far
Current mem usage: 52.64 MB
Current row: [object Object]
Processed 1100 rows so far
Current mem usage: 62.48 MB
Current row: [object Object]
Processed 1200 rows so far
Current mem usage: 68.3 MB
Current row: [object Object]
Processed 1300 rows so far
Current mem usage: 79.03 MB
Current row: [object Object]
Processed 1400 rows so far
Current mem usage: 34.41 MB
Current row: [object Object]
Processed 1500 rows so far
Current mem usage: 39.45 MB
Current row: [object Object]
Processed 1600 rows so far
Current mem usage: 39.81 MB
Current row: [object Object]
Processed 1700 rows so far
Current mem usage: 49.65 MB
Current row: [object Object]
Processed 1800 rows so far
Current mem usage: 55.1 MB
Current row: [object Object]
Processed 1900 rows so far
Current mem usage: 65.69 MB
Current row: [object Object]
Processed 2000 rows so far
Current mem usage: 72.19 MB
Current row: [object Object]
Processed 2100 rows so far
Current mem usage: 72.7 MB
Current row: [object Object]
Processed 2200 rows so far
Current mem usage: 27.29 MB
Current row: [object Object]
Processed 2300 rows so far
Current mem usage: 33.06 MB
Current row: [object Object]
Processed 2400 rows so far
Current mem usage: 43.47 MB
Current row: [object Object]
Processed 2500 rows so far
Current mem usage: 48.39 MB
Current row: [object Object]
Processed 2600 rows so far
Current mem usage: 50.01 MB
Current row: [object Object]
Processed 2700 rows so far
Current mem usage: 59.59 MB
Current row: [object Object]
Processed 2800 rows so far
Current mem usage: 69.78 MB
Current row: [object Object]
Processed 2900 rows so far
Current mem usage: 75.44 MB
...

@hadson19
Author

hadson19 commented May 4, 2020

@olavloite
To measure memory usage I've used a mechanism similar to this:

 setInterval(() => {
   const used = process.memoryUsage().heapUsed / 1024 / 1024;
   console.log(`Current mem usage: ${Math.round(used * 100) / 100} MB`);
 }, 100);
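
(Side note, not from the thread: heapUsed only covers the V8 heap; Buffer-backed allocations, e.g. from the network layer, are reported under rss and external instead. A variant of the snippet above that logs those as well, as a rough sketch:)

setInterval(() => {
  const { rss, heapUsed, external } = process.memoryUsage();
  const mb = n => Math.round((n / 1024 / 1024) * 100) / 100;
  // rss is the total resident set size of the process; external covers
  // memory used by C++ objects and Buffers outside the V8 heap.
  console.log(`rss: ${mb(rss)} MB, heapUsed: ${mb(heapUsed)} MB, external: ${mb(external)} MB`);
}, 100);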

Can you try to write the result to a file?
It seems that backpressure in nodejs-spanner streams doesn't work.

Can you also check the pause method?
Consuming the data with a 'data' handler doesn't prove that the stream handles backpressure correctly.
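
(As an illustration, not from the thread: one way to exercise backpressure is to pipe the result stream into a deliberately slow Writable instead of attaching a 'data' handler, which switches the stream into flowing mode. The sketch below reuses spannerDatabase and the query from the reproduction script above; the highWaterMark and delay are arbitrary.)

const { Writable, pipeline } = require('stream');

const slowWriter = new Writable({
  objectMode: true,
  highWaterMark: 16,
  write(row, _enc, callback) {
    // Simulate a slow consumer: if backpressure is honored upstream,
    // heap usage should stay flat while rows trickle through here.
    setTimeout(callback, 50);
  },
});

pipeline(
  spannerDatabase.runStream({ sql: 'select * from user', json: true }),
  slowWriter,
  err => {
    if (err) console.error(err);
    const used = process.memoryUsage().heapUsed / 1024 / 1024;
    console.log(`Heap after run: ${Math.round(used * 100) / 100} MB`);
  }
);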

@olavloite
Contributor

@hadson19

I've changed my test case to the following:

async function queryWithMemUsage(instanceId, databaseId, projectId) {
  // Imports the Google Cloud client library
  const {Spanner} = require('@google-cloud/spanner');
  const fs = require('fs');
  const stream = require('stream');
  // eslint-disable-next-line node/no-extraneous-require
  const through = require('through');

  // Creates a client
  const spanner = new Spanner({
    projectId: projectId,
  });

  // Gets a reference to a Cloud Spanner instance and database
  const instance = spanner.instance(instanceId);
  const database = instance.database(databaseId);

  const query = {
    sql: `SELECT *
          FROM TableWithAllColumnTypes
          ORDER BY ColInt64`,
    gaxOptions: {pageSize: 1000},
  };

  let count = 0;
  const fileStream = fs.createWriteStream('/home/loite/rs.txt');
  const rs = database
    .runStream(query)
    .on('data', () => {
      count++;
      if (count % 100 === 0) {
        console.log(`Processed ${count} rows so far`);
        const used = process.memoryUsage().heapUsed / 1024 / 1024;
        console.log(`Current mem usage: ${Math.round(used * 100) / 100} MB`);
      }
    })
    .on('error', console.log)
    .on('end', () => {
      console.log('Finished writing file');
      database.close();
    });
  // eslint-disable-next-line node/no-unsupported-features/node-builtins
  await stream.pipeline(
    rs,
    through(function (data) {
      return this.queue(
        `${JSON.stringify(data.toJSON({wrapNumbers: true}))}\n`
      );
    }),
    fileStream
  );
}

This generates a result file of 2.1 GB. The memory usage is even lower than in the initial test; it now varies between approximately 30 MB and 55 MB.

Processed 100 rows so far
Current mem usage: 33.09 MB
Processed 200 rows so far
Current mem usage: 41.48 MB
Processed 300 rows so far
Current mem usage: 30.93 MB
Processed 400 rows so far
Current mem usage: 43.89 MB
Processed 500 rows so far
Current mem usage: 48.15 MB
Processed 600 rows so far
Current mem usage: 40.99 MB
Processed 700 rows so far
Current mem usage: 42.5 MB
Processed 800 rows so far
Current mem usage: 34.73 MB
Processed 900 rows so far
Current mem usage: 49.04 MB
Processed 1000 rows so far
Current mem usage: 35.42 MB
Processed 1100 rows so far
Current mem usage: 37.07 MB
Processed 1200 rows so far
Current mem usage: 26.57 MB
Processed 1300 rows so far
Current mem usage: 38.64 MB
Processed 1400 rows so far
Current mem usage: 47.67 MB
Processed 1500 rows so far
Current mem usage: 25.23 MB
Processed 1600 rows so far
Current mem usage: 36.7 MB
Processed 1700 rows so far
Current mem usage: 30.94 MB
Processed 1800 rows so far
Current mem usage: 42.25 MB
Processed 1900 rows so far
Current mem usage: 54.32 MB
Processed 2000 rows so far
Current mem usage: 38.51 MB
Processed 2100 rows so far
Current mem usage: 49.96 MB
Processed 2200 rows so far
Current mem usage: 32.65 MB
Processed 2300 rows so far
Current mem usage: 48.11 MB
Processed 2400 rows so far
Current mem usage: 32.9 MB
Processed 2500 rows so far
Current mem usage: 46.36 MB
Processed 2600 rows so far
Current mem usage: 43.44 MB
Processed 2700 rows so far
Current mem usage: 37.55 MB
Processed 2800 rows so far
Current mem usage: 49.52 MB
Processed 2900 rows so far
Current mem usage: 35.06 MB
Processed 3000 rows so far
Current mem usage: 44.52 MB
Processed 3100 rows so far
Current mem usage: 43.16 MB
Processed 3200 rows so far
Current mem usage: 41.11 MB
Processed 3300 rows so far
Current mem usage: 26.32 MB
Processed 3400 rows so far
Current mem usage: 37.54 MB
Processed 3500 rows so far
.....
Processed 49000 rows so far
Current mem usage: 42.61 MB
Processed 49100 rows so far
Current mem usage: 35.88 MB
Processed 49200 rows so far
Current mem usage: 47.36 MB
Processed 49300 rows so far
Current mem usage: 46.96 MB
Processed 49400 rows so far
Current mem usage: 31.99 MB
Processed 49500 rows so far
Current mem usage: 42.15 MB
Processed 49600 rows so far
Current mem usage: 33.06 MB
Processed 49700 rows so far
Current mem usage: 45.3 MB
Processed 49800 rows so far
Current mem usage: 31.49 MB
Processed 49900 rows so far
Current mem usage: 44.7 MB
Processed 50000 rows so far
Current mem usage: 48.18 MB
Finished writing file

[Screenshot from 2020-05-04 13-10-53]

@hadson19
Author

hadson19 commented May 4, 2020

Hmm... it seems that you have a fast SSD (the write latency is low).

Can you add a delaying Transform (to trigger backpressure)?

For example:

// Drop-in modification of the test script above: replace the pipeline call
// with the version below. Transform comes from the built-in stream module.
const { Transform } = require('stream');

// Delays every chunk to simulate a slow downstream consumer and trigger
// backpressure.
class PostponeTransform extends Transform {
  _transform(chunk, enc, next) {
    setTimeout(() => {
      this.push(chunk, enc);
      next();
    }, 100);
  }
}

await stream.pipeline(
  rs,
  new PostponeTransform({
    highWaterMark: 100,
    objectMode: true,
  }),
  through(function (data) {
    return this.queue(
      `${JSON.stringify(data.toJSON({wrapNumbers: true}))}\n`
    );
  }),
  fileStream
);

Because there is some latency to the storage (which emits a pause), and this pause doesn't have any effect on the nodejs-spanner library.

Thanks

@olavloite
Contributor

@hadson19 Yep, that seems to trigger the problem for me as well.

@stephenplusplus
Contributor

When you see the increased memory usage, is it beyond what the system is capable of handling? I have noticed while researching stream memory consumption issues in the past that Node's internals are in control, and while the usage gets high, it doesn't cause a crash, for example.

@olavloite
Contributor

When you see the increased memory usage, is it beyond what the system is capable of handling? I have noticed while researching stream memory consumption issues in the past that Node's internals are in control, and while the usage gets high, it doesn't cause a crash, for example.

I've been able to make it go out of memory on my local machine while testing this.

@olavloite olavloite added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed type: question Request for information or clarification. Not an issue. labels May 5, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label May 5, 2020
@olavloite olavloite added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed triage me I really want to be triaged. labels May 5, 2020
@olavloite
Contributor

I've been able to make it go out of memory on my local machine while testing this.

I should note that this only happens when I try to write my 2GB result set to a file and deliberately add the PostponeTransform to the pipeline. Without the PostponeTransform or with a smaller result set, it does not go out of memory.

I have a solution locally that will fix this now, but I want to do some additional testing first.

@hadson19
Author

hadson19 commented May 5, 2020

The PostponeTransform is a primitive simulation of network latency (streaming the records to an external destination). The main idea is that nodejs-spanner should handle Node.js backpressure. Please correct me if I'm wrong.

olavloite added a commit to olavloite/nodejs-spanner that referenced this issue May 5, 2020
The request stream should be paused if the downstream is indicating
that it cannot handle any more data at the moment. The request stream
should be resumed once the downstream does accept data again. This
reduces memory consumption and potentially out of memory errors when
a result stream is piped into a slow writer.

Fixes googleapis#934
@olavloite
Contributor

The PostponeTransform is a primitive simulation of network latency (streaming the records to an external destination). The main idea is that nodejs-spanner should handle Node.js backpressure. Please correct me if I'm wrong.

Well, both should handle backpressure correctly. If you were to add some custom transformer or stream to your pipeline that itself does not handle backpressure correctly, you could still run into problems.

In this case it seems that one of the internal streams used by nodejs-spanner was not handling it well in all cases. That does not necessarily mean that the proposed fix will change the actual memory consumption in your specific case, but it should at least prevent any unnecessary build up of buffered data internally. See #936 (comment) for more information on a number of tests that I ran for this.
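
(For context, the generic Node.js pause/resume-on-backpressure pattern that the fix describes looks roughly like the sketch below. This is an illustration only, not the actual nodejs-spanner implementation.)

function forwardWithBackpressure(source, destination) {
  source.on('data', chunk => {
    // write() returns false when the destination's internal buffer is full.
    if (!destination.write(chunk)) {
      source.pause();
      // Resume reading only once the destination has drained its buffer.
      destination.once('drain', () => source.resume());
    }
  });
  source.on('end', () => destination.end());
}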

@skuruppu
Contributor

skuruppu commented May 7, 2020

@hadson19 would you be willing to try #936 to see if it fixes your issue? Note that the Node.js client library is going through a major version update so using what's in the master branch may be tricky. You may also just be able to patch the change locally on the version you're using.

@hadson19
Author

hadson19 commented May 7, 2020

@skuruppu @olavloite
I've used that pull request and I'm still seeing a lot of memory consumption, more than 300 MB.
Thanks.

@olavloite
Contributor

@hadson19 What is the size of the result set that you are streaming, in terms of number of columns and rows, and what is the size of the data in the columns (or, maybe easier: the size of the total result set)?

@hadson19
Author

hadson19 commented May 7, 2020

@olavloite
The table has about 20 columns.
The total row size (stringified) is about 510 characters (UTF-8, simple characters).
My test table has about 200,000 rows.
I've used the same PostponeTransform to check this stream:

// Reuses stream, pipeline, spannerDatabase, fs and filePath from the
// reproduction script in the issue description.
class PostponeTransform extends stream.Transform {
  _transform(chunk, enc, next) {
    setTimeout(() => {
      this.push(JSON.stringify(chunk), enc);
      next();
    }, 80);
  }
}

await pipeline(
  spannerDatabase.runStream({
    sql: 'select * from user',
    json: true,
    gaxOptions: {
      pageSize: 10,
    },
  }),
  new PostponeTransform({
    highWaterMark: 10,
    objectMode: true,
  }),
  fs.createWriteStream(filePath)
);

and the library (nodejs-spanner) consumes a lot of memory.
I've noticed that it doesn't go above ~300 MB.

@hadson19
Author

@olavloite
Unfortunately, after inserting millions of rows into Spanner and running the query above in Node.js, the memory used is more than 900 MB.

@olavloite
Contributor

@hadson19

Unfortunately, after inserting millions of rows into Spanner and running the query above in Node.js, the memory used is more than 900 MB.

What happens if you run the same query on the same table without this change?

I agree that 900 MB is a lot, and I'll look into that as well, but for me this change at least reduced the total memory consumption for these kinds of very large result sets to an amount that could be handled. In other words: it did not run out of memory.

@hadson19
Author

hadson19 commented May 25, 2020

@olavloite
The behavior is a little bit different with and without your pull request. At the beginning, memory increases to nearly 400 MB and waits (until it is consumed by the transformer). But after the transform stream resumes it, the memory increases to nearly 900 MB.

And after nearly one minute:
used 1395 MB
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 00007FF7032D879A v8::internal::GCIdleTimeHandler::GCIdleTimeHandler+4506
2: 00007FF7032B3206 node::MakeCallback+4534
3: 00007FF7032B3B80 node_module_register+2032
4: 00007FF7035D1AFE v8::internal::FatalProcessOutOfMemory+846
5: 00007FF7035D1A2F v8::internal::FatalProcessOutOfMemory+639
6: 00007FF7037B8224 v8::internal::Heap::MaxHeapGrowingFactor+9620
7: 00007FF7037AF206 v8::internal::ScavengeJob::operator=+24550
8: 00007FF7037AD85C v8::internal::ScavengeJob::operator=+17980
9: 00007FF7037ACD25 v8::internal::ScavengeJob::operator=+15109
10: 00007FF7037B6684 v8::internal::Heap::MaxHeapGrowingFactor+2548
11: 00007FF7038E06F8 v8::internal::Factory::AllocateRawArray+56
12: 00007FF7038E1072 v8::internal::Factory::NewFixedArrayWithFiller+66
13: 00007FF70390B2B6 v8::internal::Factory::NewCallHandlerInfo+113510
14: 00007FF703C5D680 v8::internal::compiler::JSTypedLowering::simplified+29264

Thanks

olavloite added a commit that referenced this issue Jun 3, 2020
* fix: pause request stream on backpressure

The request stream should be paused if the downstream is indicating
that it cannot handle any more data at the moment. The request stream
should be resumed once the downstream does accept data again. This
reduces memory consumption and potentially out of memory errors when
a result stream is piped into a slow writer.

Fixes #934

* fix: do not retry stream indefinitely

PartialResultSetStream should stop retrying to push data into the
stream after a configurable number of retries have failed.

* fix: process review comments

* fix: remove unused code

* tests: add test for pause/resume

* fix: return after giving up retrying + add test
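
(For illustration only: the "do not retry stream indefinitely" change above describes capping retries at a configurable limit. A rough, hypothetical sketch of that general pattern follows; the names and default limit are made up and are not taken from the nodejs-spanner code.)

// Hypothetical sketch: retry an operation a bounded number of times and
// give up with an error afterwards, instead of looping forever.
const DEFAULT_MAX_RETRIES = 5;

async function retryBounded(operation, maxRetries = DEFAULT_MAX_RETRIES) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= maxRetries) {
        // Stop retrying and surface the error to the caller.
        throw err;
      }
      // Simple exponential backoff before the next attempt.
      await new Promise(resolve => setTimeout(resolve, 2 ** attempt * 100));
    }
  }
}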