Source S3: Cannot allocate memory #6606

harshithmullapudi · 2021-10-01T10:03:08Z

Environment

OS Version / Instance: EC2 t2.medium
Memory / Disk: 4 GB / 16 GB
Deployment: Docker
Airbyte Version: 0.29.22-alpha
Source name/version: S3 0.1.3
Destination name/version: Postgres 0.3.7
Step: Syncing

Current Behavior

One of my connections suddenly runs out of memory while inferring the schema. It works well for a couple of days or weeks and then can no longer sync. This already happened about two weeks ago; I "solved" the issue by upgrading to a newer version.
Upon further inspection, I can see that earlier this week some attempts failed, but the second or third attempt succeeded. Interestingly, the successful attempts usually still timed out at some point while inferring the schema. File size doesn't seem to have an impact (the biggest file was 147.4 KB and was synced last month; usually they are around 50 KB). The only difference from my other S3 connections is that this specific stream deals with multiple files per sync.
I tried reducing the block size as well, but to no avail. I also tried providing the schema as an input, with the same result.
Attached are logs of a successful run and of a failed run.
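
For readers trying to reason about why a multi-file stream might behave differently from a single-file one, here is a minimal sketch of schema inference that keeps peak memory bounded: it samples a limited number of rows per file, processes one file at a time, and merges the per-file schemas. This is not the S3 connector's actual implementation. The names `infer_type`, `widen`, `infer_schema`, `merge_schemas`, and the `max_rows` parameter are all hypothetical, and the three-type ordering is an assumption made for illustration.

```python
import csv
import io
from itertools import islice

# Hypothetical type lattice for illustration only: narrowest to widest.
TYPE_ORDER = ["integer", "number", "string"]


def infer_type(value: str) -> str:
    """Return the narrowest type that can represent a single CSV cell."""
    for type_name, caster in (("integer", int), ("number", float)):
        try:
            caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


def widen(a: str, b: str) -> str:
    """Return the wider of two types so merged schemas stay compatible."""
    return max(a, b, key=TYPE_ORDER.index)


def infer_schema(stream, max_rows: int = 1000) -> dict:
    """Infer column types from at most max_rows rows of one CSV text stream."""
    schema = {}
    for row in islice(csv.DictReader(stream), max_rows):
        for column, value in row.items():
            # Skip overflow columns (key None) and empty or missing cells.
            if column is None or not value:
                continue
            cell_type = infer_type(value)
            schema[column] = widen(schema.get(column, cell_type), cell_type)
    return schema


def merge_schemas(schemas) -> dict:
    """Merge per-file schemas one at a time; accepts a lazy iterable."""
    merged = {}
    for schema in schemas:
        for column, type_name in schema.items():
            merged[column] = widen(merged.get(column, type_name), type_name)
    return merged


# Usage: in-memory strings stand in for files fetched from a bucket.
files = ["id,price\n1,2.5\n2,3\n", "id,price\n3,n/a\n"]
merged = merge_schemas(infer_schema(io.StringIO(text)) for text in files)
```

Because `merge_schemas` consumes a generator, only one file's sample is held at a time, so memory use depends on `max_rows` rather than on the number of files in the sync.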

Expected Behavior


Logs


Steps to Reproduce

Are you willing to submit a PR?

Remove this with your answer.

logs-2724-2.txt
logs-2570-2.txt


harshithmullapudi added the type/bug label Oct 1, 2021

Phlair mentioned this issue Oct 1, 2021

🎉 Source S3 - memory & performance optimisations + advanced CSV options #6615

Merged

Phlair closed this as completed in #6615 Oct 19, 2021

igrankova added connectors/sources-files connectors/source/s3 labels Jan 14, 2022
