Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes the Cypher files execution not happening sequentially #2707

Merged
merged 1 commit into from
Apr 11, 2022

Conversation

ncordon
Copy link
Contributor

@ncordon ncordon commented Apr 6, 2022

What

Fixes runFiles so that the execution happens sequentially, rather than in parallel.

It wasn't working because of two reasons:

  • We were creating a list with streams produced by runManyStatements, which starts a thread. Which means all threads were started effectively at the same time for the different files it receives as input.
  • Stream.reduce doesn't have good semantics to achieve the sequential execution either. On the one hand it's eager (or terminal as they call it in the docs), which means the threads are fired when we call the reduce phase and not when the stream is consumed. On the other hand it asks for associativity, which means we could group operations like (read the + as in execute the operation, not just concatenating them, since we have said it's eager):
a + b + c + d
as
(a + b) + (c + d)

flatMap on the other hand is an intermediate operation (it creates a lazy stream) and the threads will not be fired until the Stream is consumed.

Why

apoc.cypher.runFiles and apoc.cypher.runSchemaFile were not being executed sequentially.

That was a problem creating flakiness in the modified testRunFilesMultiple test.

It was reproducible doing apoc.cypher.runFiles with the following (added in the PR to the test) files:

UNWIND RANGE(0,2) as id
CREATE (n:Node {id:id})
RETURN n.id as id;

CALL apoc.util.sleep(2000)

MATCH (n) DELETE n; // The files definitely run in parallel cause I see 4 deleted here

and

UNWIND RANGE(0,2) as id
CREATE (n:Node {id:id})
RETURN n.id as id;

CALL apoc.util.sleep(1000)

MATCH (n) DELETE n;

@ncordon ncordon added bug team-cypher-surface Cypher Surface team should review the PR to-cherry-pick issues related to commits to cherry-pick 5.0 labels Apr 6, 2022
@AzuObs
Copy link
Contributor

AzuObs commented Apr 7, 2022

I think that you're right and that reduce was the problem. The java docs for Stream.reduce says that this function is not constrained to run sequentially.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5.0 bug team-cypher-surface Cypher Surface team should review the PR to-cherry-pick issues related to commits to cherry-pick
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants