Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code. #25052

AlexRadch · 2018-02-26T12:12:21Z

Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code.

Any version:

Recursive iterations code is compiled to quadratic time complexity code.

Steps to Reproduce:

Write sub-linear time recursive code. For example next code:

        static IEnumerable<int> RangeRecursive1(int b, int e)
        {
            if (b > e)
                yield break;
            yield return b;
            foreach(var v in RangeRecursive1(b + 1, e))
                yield return v;
        }

Expected Behavior:

Sub-linear time complexity.

Actual Behavior:

Quadratic time complexity.

Such code is compiled to loop in loop in loop and so on. All that they do is to return values from recursive loops but have quadratic time complexity.

Work around

Use separated loop and replace recursive calls by linear loops. Here https://github.com/AlexRadch/YieldForEachDN I created such loop (I called it Hyperloop). It support IHyperloop interface with two methods:

    public interface IHyperloop<in T>
    {
        void Add(IEnumerable<T> source);
        void Add(IEnumerator<T> loop);
    }

Then any recursive calls can be replaced by adding loops to IHyperloop.

For example next iterator:

        static IEnumerable<int> RangeRecursive1(int b, int e)
        {
            if (b > e)
                yield break;
            yield return b;
            foreach(var v in RangeRecursive1(b + 1, e))
                yield return v;
        }

Can be converted to iterator without loops in loops in loops and so on:

        static IEnumerable<int> RangeRecursive1_Hl(int b, int e)
        {
            var hl = new Hyperloop<int>();
            hl.Add(RangeRecursive1_HlImp(hl, b, e));
            return hl;
        }

        static IEnumerable<int> RangeRecursive1_HlImp(IHyperloop<int> hl, int b, int e)
        {
            if (b > e)
                yield break;
            yield return b;
            hl.Add(RangeRecursive1_HlImp(hl, b + 1, e));
        }

Here RangeRecursive1_Hl is proxy that create Hyperloop and add RangeRecursive1_HlImp iterator to Hyperloop. The RangeRecursive1_HlImp code is similar to original code except that loop code replaced by adding loops to IHyperloop.

            foreach(var v in RangeRecursive1(b + 1, e))
                yield return v;

Code converted to

            hl.Add(RangeRecursive1_HlImp(hl, b + 1, e));

Suggestion:

The small code transformation can reduce quadratic time complexity to sub-linear time complexity in recursive iterators.

So I suggest detect recursive iterators (it is easy) and use such small code transformation in roslyn compiler to reduce time complexity.

The text was updated successfully, but these errors were encountered:

svick · 2018-02-26T15:27:17Z

It's not clear to me how exactly would the translation work. For example, consider this code:

IEnumerable<T> DumpAndIterateFromBack<T>(LinkedListNode<T> node)
{
    if (node == null)
        yield break;
        
    foreach (var x in DumpAndIterateFromBack(node.Previous))
        yield return x;

    Console.WriteLine(node.Value);
    yield return node.Value;
}

How would you translate it to use your Hyperloop?

AlexRadch · 2018-02-26T15:54:54Z

@svick Now my Hyperloop implementation does not support loops with side effects like in your code but it can be extended to support such code. I will make support of code with side effects and show translation of your code.

AlexRadch · 2018-02-26T16:23:47Z

@svick I added support of code with side effects in Hyperloop class.

So next code

        static IEnumerable<int> DumpAndIterateFromBack(int e)
        {
            if (e <= 0)
                yield break;

            foreach (var x in DumpAndIterateFromBack(e - 1))
                yield return x;

            Console.WriteLine(e);
            yield return e;
        }

can be translated to next code:

        static IEnumerable<int> DumpAndIterateFromBack_Hl(int e)
        {
            var hl = new Hyperloop<int>();
            hl.Add(DumpAndIterateFromBack_HlImp(hl, e));
            return hl;
        }

        static IEnumerable<int> DumpAndIterateFromBack_HlImp(IHyperloop<int> hl, int e)
        {
            if (e <= 0)
                yield break;

            hl.Add(DumpAndIterateFromBack_HlImp(hl, e - 1));
            yield return default(int); // will be skiped to support code with side effects

            Console.WriteLine(e);
            yield return e;
        }

So only the loop part:

            foreach (var x in DumpAndIterateFromBack(e - 1))
                yield return x;

Translated to

            hl.Add(DumpAndIterateFromBack_HlImp(hl, e - 1));
            yield return default(int); // will be skiped to support code with side effects

all other code is the same.

KrisVandermotten · 2018-02-26T17:50:46Z

See also #15 and https://www.researchgate.net/publication/228625370_Iterators_revisited_Proof_rules_and_implementation

AlexRadch · 2018-02-27T04:18:55Z

@KrisVandermotten I think here https://github.com/AlexRadch/YieldForEachDN I implemented the same ideas as in https://www.researchgate.net/publication/228625370_Iterators_revisited_Proof_rules_and_implementation/.
I think my implementation is more optimal by memory and by performance and with more clear code transformations.

AlexRadch · 2018-02-28T17:10:08Z

@Wolfram-Schulte what are you think about my recursive iterators transformation for effective implementation?

gafter · 2019-08-04T01:19:30Z

We would consider a community contribution that addresses this feature request provided the implementation is sufficiently simple and well isolated in the compilers.

ranma42 · 2021-08-11T09:00:09Z

I would like to try and tackle this.

My current plan would be to use something like GeneratedSequenceBase to "flatten" nested enumerables.

What is the best way to recognize that a sequence has been generated according to these rules (or, which would be the same to me, supports both the "single result" and the "sequence result")?
I am thinking of using an interface, but I am unsure whether it is ok (it would be a somewhat special one, as it would have a meaning for the compiler). On the upside doing it with a "core" interface would make it possible for F# to efficiently yield! a C# enumerable and for C# to yield foreach an F# sequence (efficiently as in flattened, to avoid recursion costs).

CyrusNajmabadi · 2021-08-11T13:36:33Z

We would need a solution that came with no perf impact on any existing code.

ranma42 · 2021-08-11T17:23:11Z

Yup, the idea is to only change the codegen if the method contains a yield foreach

ArcadeMode · 2022-11-06T18:43:19Z

Hello, I'd like to have a crack at this.

My current understanding:

When writing an iterator function that iterates over another iterator function and yield returns the nested iterator's results, then the resulting generated code performs two operations for each of the nested iterator's returns. This appears to internally be two statemachines where the inner state-change causes the outer to state change which is simply propagating values. This propagation takes as many operations as nesting is deep, which in the examples above can be arbitrarily deep due to recursion, resulting in quadratic complexity. But it is also present in simple nested usages of such foreach + yield returning methods.

If I understand the research paper's intuition what it does is not propagate values 'upward' but instead have the outer-most iterator point directly to the nested iterator which is 'currently' active, however deep it may be. Thus avoiding the costly propagation.

There is a suggestion to enable new yield foreach syntax. I could try and implement this syntax which would (probably?) allow me to rewrite the expression such that there is no such propagation happening by pointing the outer iterator directly to the inner one untill it has completed. Existing foreach with yield return would remain how they are (and thus still enable e.g. side effects to be implemented for such inner-iterator value changes).

Is my understanding correct and would the suggested approach be desired?

CyrusNajmabadi · 2022-11-06T18:50:06Z

@ArcadeMode the critical thing is that any existing code that does a standard foreach over any existing IEnumerables (or anything foreach operates on) not have any negative impact whatsoever.

ArcadeMode · 2022-11-06T19:28:08Z

I found the discussion in the csharplang repo and noticed that the suggested approaches are unsuitable. Its a marvelously complex issue for sure! The related discussion about supporting yield foreach as syntax sugar is also very interesting. But I digress,

(I am still getting familar with this part of roslyn so if im about to spew nonsense dont hesitate to tell me).

What I am thinking is to use yield foreach as a means to have the enumerator block enumerate the inner enumerator directly. It would effectively mean when the outer enumerator's MoveNext() is invoked and its at a yield foreach instruction, the call is forwarded to that enumerator's MoveNext() and its yield return is also forwarded directly. This would avoid the state-change propagation by tapping into the nested enumerator directly. Since most of the behavior is coded at the Lowering stage I feel this may be possible.

Furthermore this would not touch foreach and thus would actually be opt-in behavior for avoiding the time complexity explosion that can arrise here.

I would love your feedback on this.

CyrusNajmabadi · 2022-11-06T19:34:44Z

Furthermore this would not touch foreach and thus would actually be opt-in behavior for avoiding the time complexity explosion that can arrise here.

Could you give an example of:

what the user would write inside the iterator.
what that code would actually get translated to.
what the user would write when consuming the iterator.
what that code would actually get translated to.

Thanks!

gafter added Area-Compilers Feature Request labels Apr 9, 2018

gafter added the help wanted The issue is "up for grabs" - add a comment if you are interested in working on it label Aug 4, 2019

gafter added this to the Backlog milestone Aug 4, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code. #25052

Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code. #25052

AlexRadch commented Feb 26, 2018 •

edited

Loading

svick commented Feb 26, 2018 •

edited

Loading

AlexRadch commented Feb 26, 2018

AlexRadch commented Feb 26, 2018 •

edited

Loading

KrisVandermotten commented Feb 26, 2018

AlexRadch commented Feb 27, 2018

AlexRadch commented Feb 28, 2018

gafter commented Aug 4, 2019

ranma42 commented Aug 11, 2021

CyrusNajmabadi commented Aug 11, 2021

ranma42 commented Aug 11, 2021

ArcadeMode commented Nov 6, 2022

CyrusNajmabadi commented Nov 6, 2022

ArcadeMode commented Nov 6, 2022

CyrusNajmabadi commented Nov 6, 2022

Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code. #25052

Compile recursive iterators to sub-linear time complexity code. Now it is compiled to quadratic time complexity code. #25052

Comments

AlexRadch commented Feb 26, 2018 • edited Loading

svick commented Feb 26, 2018 • edited Loading

AlexRadch commented Feb 26, 2018

AlexRadch commented Feb 26, 2018 • edited Loading

KrisVandermotten commented Feb 26, 2018

AlexRadch commented Feb 27, 2018

AlexRadch commented Feb 28, 2018

gafter commented Aug 4, 2019

ranma42 commented Aug 11, 2021

CyrusNajmabadi commented Aug 11, 2021

ranma42 commented Aug 11, 2021

ArcadeMode commented Nov 6, 2022

CyrusNajmabadi commented Nov 6, 2022

ArcadeMode commented Nov 6, 2022

CyrusNajmabadi commented Nov 6, 2022

AlexRadch commented Feb 26, 2018 •

edited

Loading

svick commented Feb 26, 2018 •

edited

Loading

AlexRadch commented Feb 26, 2018 •

edited

Loading