
Source Stripe: sync freeze with 80k records #6417

Closed
marcosmarxm opened this issue Sep 24, 2021 · 16 comments · Fixed by #10359 or #45143

Comments

@marcosmarxm
Member

Environment

  • Airbyte version: ~
  • OS Version / Instance: 4vCPU, 16GB RAM VM on Google Cloud
  • Deployment: Docker
  • Source Connector and version: Stripe
  • Destination Connector and version: Bigquery
  • Severity: Very Low / Low / Medium / High / Critical
  • Step where error happened: Deploy / Sync job / Setup new connection / Update connector / Upgrade Airbyte

Current Behavior

For one user, the Stripe connector can't finish the sync, or takes 17 hours to sync 80k records, while another user commented that he is able to sync 500k records in 4 hours. I opened this issue to record this and to investigate whether there is any Stripe option we need to tell users to enable.
Slack convo.

I already reduced the data range to one month, but the job halts at the very same 79,000 records. Which requests does start_date actually filter? It looks like Airbyte keeps syncing all 79,871 balance transactions regardless of the start date parameter.
Actually, it continued 17 hours later. 😮 But then it cannot find the BigQuery endpoint. Could that be due to the prolonged run time?

When limiting the scope to invoices, invoice_lines, and invoice_line_items, the job succeeded, although it took more than 6 hours to finish because it synced all the data.
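For context on the start_date question above: Stripe list endpoints filter by creation time via a `created[gte]` query parameter, which is what a connector's start_date would typically map to. A minimal sketch (hypothetical helper, not the connector's actual code; the date is an arbitrary example) of building such params:

```python
from datetime import datetime, timezone

def balance_transaction_params(start_date: str, limit: int = 100) -> dict:
    """Build query params for a Stripe list endpoint so only records
    created on or after start_date (YYYY-MM-DD, UTC) are returned.
    Hypothetical helper for illustration only."""
    ts = int(
        datetime.strptime(start_date, "%Y-%m-%d")
        .replace(tzinfo=timezone.utc)
        .timestamp()
    )
    # Stripe accepts created[gte] as a Unix timestamp.
    return {"created[gte]": ts, "limit": limit}

print(balance_transaction_params("2021-08-24"))
# → {'created[gte]': 1629763200, 'limit': 100}
```

If the connector omits this parameter for a stream (or the stream's endpoint doesn't support it), every record would be re-read on each sync regardless of start_date, which would match the behavior described above.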

Expected Behavior

Tell us what should happen.

Logs

If applicable, please upload the logs from the failing operation.
For sync jobs, you can download the full logs from the UI by going to the sync attempt page and
clicking the download logs button at the top right of the logs display window.

LOG

replace this with
your long log
output here

Steps to Reproduce

Are you willing to submit a PR?

Remove this with your answer.

@marcosmarxm marcosmarxm added type/bug Something isn't working area/connectors Connector related issues labels Sep 24, 2021
@sherifnada sherifnada added this to the Connectors 2021-11-12 milestone Oct 29, 2021
@htrueman
Contributor

htrueman commented Nov 16, 2021

I ran the connector locally and got the following output: logs-24-2.txt. I think it may be a destination issue. I'm debugging and testing to locate the exact source of the problem.
Another log: logs-25-0.txt.

@htrueman
Contributor

@sherifnada so the connector fails with the following (see extended logs in the files attached above).

2021-11-17 21:53:31 WARN () ActivityExecutionContextImpl(doHeartBeat):153 - Heartbeat failed
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999163700s. [closed=[], open=[[remote_addr=airbyte-temporal/172.18.0.8:7233]]]
	at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262) ~[grpc-stub-1.40.0.jar:1.40.0]
	at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243) ~[grpc-stub-1.40.0.jar:1.40.0]
	at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156) ~[grpc-stub-1.40.0.jar:1.40.0]
	at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.recordActivityTaskHeartbeat(WorkflowServiceGrpc.java:2710) ~[temporal-serviceclient-1.0.4.jar:?]
	at io.temporal.internal.sync.ActivityExecutionContextImpl.sendHeartbeatRequest(ActivityExecutionContextImpl.java:203) ~[temporal-sdk-1.0.4.jar:?]
	at io.temporal.internal.sync.ActivityExecutionContextImpl.doHeartBeat(ActivityExecutionContextImpl.java:147) ~[temporal-sdk-1.0.4.jar:?]
	at io.temporal.internal.sync.ActivityExecutionContextImpl.heartbeat(ActivityExecutionContextImpl.java:108) ~[temporal-sdk-1.0.4.jar:?]
	at io.airbyte.workers.temporal.CancellationHandler$TemporalCancellationHandler.checkAndHandleCancellation(CancellationHandler.java:46) ~[io.airbyte-airbyte-workers-0.32.0-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getCancellationChecker$4(TemporalAttemptExecution.java:216) ~[io.airbyte-airbyte-workers-0.32.0-alpha.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]

This failure happens after a different number of BQ tables have been created each time. It seems the issue is within the Java core, and I'm not sure how to fix it. I will also try to run normalization locally to make sure it's not the issue.

@htrueman
Contributor

htrueman commented Nov 18, 2021

I ran the normalization process separately with docker run --rm --init -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/24/2/normalize --network host --log-driver none airbyte/normalization:dev run --integration-type bigquery --config destination_config.json --catalog destination_catalog.json (I took destination_config.json and destination_catalog.json from the airbyte_workspace Docker volume).
It worked just fine, see bq-normalization-log.txt.

So in total: the source, destination, and normalization modules all work fine. It seems that the issue is in the Java core, and I'm not sure how to fix it. @sherifnada, perhaps you know who may help me with that?

@htrueman
Contributor

htrueman commented Nov 19, 2021

To reproduce the bug #6417 (comment), set up a stripe -> bigquery connection using the credentials from our LastPass account, then run the sync and wait for some time. It fails, but at a different point in the sync each time.

@sherifnada
Contributor

@htrueman thanks for the great summary. I'll pass this on to the Airbyte team.

@sherifnada sherifnada added area/platform issues related to the platform and removed blocked labels Nov 29, 2021
@sherifnada sherifnada removed this from the Connectors Dec 10 2021 milestone Nov 29, 2021
@davinchia
Contributor

@htrueman I'm not able to reproduce this.

I set up Stripe and BQ in prod and all my syncs have been successful.
Screen Shot 2021-11-30 at 12 13 03 AM

@davinchia
Contributor

Were you seeing this on all syncs?

@jrhizor
Contributor

jrhizor commented Nov 29, 2021

I'd also be curious to know if you're hitting max RAM, CPU, or disk space usage during your tests.
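A quick way to answer the disk-space part of that question from the host is a stdlib-only check (sketch for illustration; `psutil` or `docker stats` would be needed to also watch CPU and RAM of the sync containers):

```python
import shutil

def disk_usage_pct(path: str = "/") -> float:
    """Percentage of disk space used on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

# On a default Docker install the workspace volume lives under the
# host's root filesystem, so "/" is usually the mount to watch.
print(f"disk used: {disk_usage_pct('/'):.1f}%")
```

Sampling this periodically during a long sync would show whether the freeze coincides with the disk filling up.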

@htrueman
Contributor

htrueman commented Dec 1, 2021

@htrueman I'm not able to reproduce this.

I set up Stripe and BQ in prod and all my syncs have been successful. Screen Shot 2021-11-30 at 12 13 03 AM

I did all the same as described above and still get the same error: logs-28-0.txt. IDK, perhaps it's my local issue or smth.
I used both the Stripe and BQ creds from our LastPass account.

@davinchia
Contributor

@htrueman were you running this locally?

@htrueman
Contributor

htrueman commented Dec 2, 2021

@htrueman were you running this locally?

Yes. I don't have access to cloud to test it there.

@davinchia
Contributor

davinchia commented Dec 2, 2021

Cool, can you give me details of how you set it up? (Docker or Kube; if Kube, what kind of Kube cluster, and what resources did the cluster have?) I might be able to reproduce it this way.

I was confused since the original issue description mentions this was run in GCP.

@htrueman
Contributor

htrueman commented Dec 2, 2021

Cool, can you give me details of how you set it up? (Docker or Kube; if Kube, what kind of Kube cluster, and what resources did the cluster have?) I might be able to reproduce it this way.

I was confused since the original issue description mentions this was run in GCP.

Well, I've got a pretty simple setup:

  • I ran the local Docker env with docker-compose up.
  • Then started the front-end server with npm start.
  • Then I created a new connection with the latest Stripe and BQ versions, using the creds from LastPass.
  • Then I started the sync, waited for some time, and got the logs attached above.

@misteryeo
Contributor

@davinchia @htrueman are there any further steps for this ticket?

@antixar antixar moved this to Prioritized for scoping in GL Roadmap Jan 28, 2022
@antixar antixar moved this from Prioritized for scoping to Ready for implementation in GL Roadmap Jan 28, 2022
@davinchia
Contributor

This looks like it's on the connector roadmap, so I think the only thing left is to wait for implementation.

@oustynova oustynova assigned midavadim and unassigned htrueman Feb 9, 2022
@midavadim midavadim moved this from Ready for implementation (prioritized) to Implementation in progress in GL Roadmap Feb 10, 2022
@midavadim midavadim moved this from Implementation in progress to In review (internal) in GL Roadmap Feb 18, 2022
@midavadim midavadim moved this from In review (internal) to In review (Airbyte) in GL Roadmap Feb 28, 2022
@midavadim midavadim moved this from In review (Airbyte) to Done in GL Roadmap Mar 10, 2022
@midavadim
Contributor

We have improved performance for streams with substreams: invoice_line_items, subscription_items, and bank_accounts.
We significantly decreased the number of required requests for these streams; previously, one request was needed for each extracted record.
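To illustrate the scale of that change (illustrative sketch with hypothetical numbers, not the connector's exact code): Stripe embeds the first page of an invoice's `lines` in the invoice object itself, so a separate request is only needed for invoices whose line items overflow that embedded page:

```python
def count_requests_naive(n_invoices: int) -> int:
    """Old approach: one extra API request per parent record
    to fetch its line items."""
    return n_invoices

def count_requests_batched(n_invoices: int, lines_per_invoice: int,
                           embedded_page_size: int = 10) -> int:
    """Improved approach (sketch): the first page of line items arrives
    embedded in each invoice, so a follow-up request is needed only
    when an invoice has more lines than fit in that page."""
    return sum(
        1 for _ in range(n_invoices)
        if lines_per_invoice > embedded_page_size
    )

# For the 80k-record sync from this issue, with a few lines per invoice:
print(count_requests_naive(80_000))        # → 80000 extra requests
print(count_requests_batched(80_000, 5))   # → 0 extra requests
```

With most invoices holding only a handful of line items, the substream request count effectively drops to zero, which accounts for the large speedup.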
