LanguageExt.Core ► Immutable Collections ► Iterator

Iterator<A> is a functional-wrapper for IEnumerator<A>. The abstraction leaks a little, so it's worth understanding how it works by reading the details below. On the whole it behaves like an immutable stream that caches values as it goes, but there's some footguns that you should be aware of so that they can be avoided.

Problem: `IEnumerator<A>`

It is mutable which means using it in immutable data-structures is problematic.
If you pass an IEnumerator reference to two threads, each thread can call MoveNextand it will move the enumeration position for other thread, or even worse, move past the end of the sequence due to race conditions.
Enumerators start before the first item and use a complicated mechanism for accessing and testing the validity of the element value.

Nobody in their right mind would invent an interface like IEnumerator<A> today.

Solution: `Iterator<A>`

Iterator<A> still uses IEnumerator<A> internally, but it makes it thread-safe and functional. From the outside the type acts and works exactly like any other immutable sequence, but internally it does some quite complex processing to achieve this with an IEnumerator<A> reference.

You may say "Why not just drop IEnumerator<A>?" - which is a completely valid position to hold. Unfortunately, IEnumerable and IEnumerator are baked into the CPS state-machine that is used for yield return and yield break. So, we don't get to ignore those types, and instead we need to make them play nice.

IEnumerable<A> has a method called GetEnumerator() which is used to access an IEnumerator<A>. A new extension method is available called GetIterator(), this will yield an Iterator<A>.

You can pattern-match an Iterator<A> like a functional 'cons' linked-list type:

static A Sum<A>(Iterator<A> iter) where A : INumber =>
    iter switch
    {
        Iterator<A>.Nil                 => A.Zero,
        Iterator<A>.Cons(var x, var xs) => x + Sum(xs),
    }

Or, use IsEmpty and Head:

static A Sum<A>(Iterator<A> iter) where A : INumber =>
    iter.IsEmpty
        ? A.Zero
        : iter.Head + Sum(iter.Tail);

Or, use built-in operators:

static A Sum<A>(Iterator<A> iter) where A : INumber =>
    iter.Fold(A.Zero, (s, x) => s + x);

Or, take an imperative approach:

static A Sum<A>(Iterator<A> iter) where A : INumber
{
    var total = A.Zero;
    while(!iter.IsEmpty)
    {
        total += iter.Head;
        iter = iter.Tail;
    }
    return total;
}

Iterator<A> is abstract and the first type returned from GetIterator() will be a Iterator<A>.ConsFirst, this type implements Iterator<A>. The internal fields that ConsFirst contains are these:

IEnumerable<A> enumerable;
int firstAcquired;
Iterator<A>? firstValue;

So, you can see that it doesn't actually have an IEnumerator<A> as this point.

The two key properties of Iterator<A> are Head (for accessing the current item) and Tail (for accessing the remaining items), so let's look at those for ConsFirst:

public override A Head =>
    First.Head;

public override Iterator<A> Tail =>
    First.Tail;

They both access First, which is:

Iterator<A> First
{
    get
    {
        if (firstAcquired == 2) return firstValue!;
        
        SpinWait sw = default;
        while (firstAcquired < 2)
        {
            if (Interlocked.CompareExchange(ref firstAcquired, 1, 0) == 0)
            {
                try
                {
                    var enumerator = enumerable.GetEnumerator();
                    if (enumerator.MoveNext())
                    {
                        firstValue = new ConsValueEnum(enumerator.Current, enumerator);
                    }
                    else
                    {
                        enumerator.Dispose();
                        firstValue = Nil.Default;
                    }

                    firstAcquired = 2;
                }
                catch (Exception)
                {
                    firstAcquired = 0;
                    throw;
                }
            }
            else
            {
                sw.SpinOnce();
            }
        }

        return firstValue!;
    }
}

This all looks quite complex, but you should be able to see that the Interlocked.CompareExchange then-block is where the IEnumerator is created. We then either set firstValue to a new ConsValueEnum with the head-item and the enumerator as arguments; or we set it to Nil.

Upon success we set firstAcquired to 2. So, subsequent calls to First will just return firstValue. This locking technique without using locks is a way to efficiently protect the enumerator from race-conditions.

So, upon first access to either Head or Tail we launch the IEnumerator and cache the first item in the sequence. All subsequent access goes to Head or Tail on either Nil or ConsValueEnum. We never touch the IEnumerator again in ConsFirst.

The Nil implementation isn't so surprising:

public override A Head =>
    throw new InvalidOperationException("Nil iterator has no head");

public override Iterator<A> Tail =>
    this;

ConsValueEnum is where it gets interesting. It has the following internal fields:

Exception? exception;
IEnumerator<A>? enumerator;
int tailAcquired;
Iterator<A>? tailValue;

It also has a Head property that is set in the constructor:

public override A Head { get; }

So, we can access the Head value at any time, but the Tail value isn't yet set:

public override Iterator<A> Tail
{
    get
    {
        if(tailAcquired == 2) return tailValue!;
        if(tailAcquired == 3) exception!.Rethrow();

        SpinWait sw = default;
        while (tailAcquired < 2)
        {
            if (Interlocked.CompareExchange(ref tailAcquired, 1, 0) == 0)
            {   
                try
                {
                    if (enumerator!.MoveNext())
                    {
                        tailValue = new ConsValueEnum(enumerator.Current, enumerator);
                    }
                    else
                    {
                        enumerator?.Dispose();
                        enumerator = null;
                        tailValue = Nil.Default;
                    }

                    tailAcquired = 2;
                }
                catch (Exception e)
                {
                    exception = e;
                    tailAcquired = 3;
                    throw;
                }
            }
            else
            {
                sw.SpinOnce();
            }
        }

        if(tailAcquired == 3) exception!.Rethrow();
        return tailValue!;
    }
}

This does a similar thing to ConsFirst of protecting a section with Interlocked.CompareExchange. So, we can only ever access the 'then' part [of that if statement] once. In that block we MoveNext the IEnumerator which will either return true or false.

If true then we create another ConsValueEnum, if false then we use Nil. Whichever is created gets assigned to tailValue and tailAcquired gets set to 2. That means subsequent calls to Tail will just return tailValue.

That process continues for each item of the sequence until the IEnumerator runs out of items to yield. The end result is a linked-list of ConsValueEnum objects that have a ConsFirst object at the head of the linked-list.

So, Iterator<A> effectively caches the sequence as you go. If you hold on to the head of the sequence then the whole list may end up in memory at once. This could be problematic when working with large lazy sequences or even infinite sequences.

This for example is fine:

for(var iter = Naturals.GetIterator(); !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

Because the iter reference keeps getting updated in-place, meaning that nothing is holding on to the head-item in the sequence, and so the garbage-collector can collect those unreferenced items.

Whereas this will cause memory-usage to grow and grow:

var start = Naturals.GetIterator();
for(var iter = start; !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

Because start is holding a reference to the first item, so it must hold a reference (indirectly) to every subsequent item. Meaning the garbage-collector can't collect.

To get around this you can use Clone:

var start = Naturals.GetIterator();
for(var iter = start.Clone(); !iter.IsEmpty; iter = iter.Tail)
{
    Console.WriteLine(iter.Head);
}

This creates a new 'head' for the sequence and so iter is the only reference, meaning updates to iter make the head elements free for garbage collection.

So, Iterator<A> is much, much more powerful than IEnumerator<A>. It is mostly useful for immutable data-types that need to carry an IEnumerator<A>, but can't due it its limitations. Iterator<A> has some limitations of its own, but they are relatively easy to workaround, whereas that isn't the case with IEnumerator<A> (without writing a type like Iterator<A>!).

Contents

Iterator <A>

Empty = new Nil()
Head
Tail
IsEmpty
Count
Clone ()
Split ()
AsEnumerable ()
AsIterable ()
Deconstruct (out A head, out Iterator<A> tail)
Select (Func<A, B> f)
Map (Func<A, B> f)
Bind (Func<A, Iterator> f)
SelectMany <B, C> (Func<A, Iterator> bind, Func<A, B, C> project)
Apply (Iterator<Func<A, B>> ff, Iterator<A> fa)
Concat (Iterator<A> other)
Fold <S> ( S state, Func<A, Func<S, S>> f)
Fold <S> ( S state, Func<S, A, S> f)
FoldBack <S> ( S state, Func<A, Func<S, S>> f)
FoldBack <S> ( S state, Func<S, A, S> f)
FoldWhile <S> ( S state, Func<A, Func<S, S>> f, Func<(S State, A Value), bool> predicate)
FoldWhile <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate)
FoldBackWhile <S> ( S state, Func<S, Func<A, S>> f, Func<(S State, A Value), bool> predicate)
FoldBackWhile <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate)
FoldUntil <S> ( S state, Func<A, Func<S, S>> f, Func<(S State, A Value), bool> predicate)
FoldUntil <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate)
FoldBackUntil <S> ( S state, Func<S, Func<A, S>> f, Func<(S State, A Value), bool> predicate)
FoldBackUntil <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate)
Merge (Iterator<A> other)
Zip (Iterator<A> other)
+ (Iterator<A> ma, Iterator<A> mb)
Dispose ()
Equals (object? obj)
Equals (Iterator<A>? rhs)
GetHashCode ()
GetEnumerator ()
ToString ()
ToFullString (string separator = ", ")
ToFullArrayString (string separator = ", ")

Nil

Default = new Nil()
ToString ()
Head
Tail
IsEmpty
Clone ()
Split ()
Count
Dispose ()

Cons

Deconstruct (out A head, out Iterator<A> tail)

IteratorExtensions

As <A> (this K<Iterator, A> ma)
GetIterator <A> (this IEnumerable<A> enumerable)

Iterator

from <A> (IEnumerable<A> enumerable)
singleton <A> (A head)
Cons <A> (A head, Iterator<A> tail)
Cons <A> (A head, Func<Iterator<A>> tail)
Nil <A> ()

Iterator

class Iterator <A> Source #

Wrapper for IEnumerator that makes it work like an immutable sequence.

It is thread-safe and impossible for any item in the sequence to be enumerated more than once.

IEnumerator from the .NET BCL has several problems:

It's very imperative
It's not thread-safe, two enumerators can't be shared

The lack of support for sharing of enumerators means that it's problematic using it internally in types like StreamT, or anything that needs to keep an IEnumerator alive for any period of time.

NOTE: There is a per-item allocation to hold the state of the iterator. These are discarded as you enumerate the sequence. However, technically it is possible to hold the initial Iterator value and subsequently gain a cached sequence of every item encountered in the enumerator.

That may well be valuable for circumstances where re-evaluation would be expensive. However, for infinite-streams this would be extremely problematic. So, make sure you discard any previous Iterator values as you walk the sequence.

Parameters

type

A

Item value type

Fields

field Iterator<A> Empty = new Nil() Source #

Empty iterator

Properties

property A Head Source #

Head element

property Iterator<A> Tail Source #

Tail of the sequence

property bool IsEmpty Source #

Return true if there are no elements in the sequence.

property long Count Source #

Return the number of items in the sequence.

Requires all items to be evaluated, this will happen only once however.

Methods

method Iterator<A> Clone () Source #

Clone the iterator so that we can consume it without having the head item referenced. This will stop any GC pressure when processing large or infinite sequences.

method Iterator<A> Split () Source #

When iterating a sequence, it is possible (before evaluation of the Tail) to Terminate the current iterator and to take a new iterator that continues on from the current location. The reasons for doing this are to break the linked-list chain so that there isn't a big linked-list of objects in memory that can't be garbage collected.

Any other iterator references that came before this one will terminate at this point. Splitting the previous and subsequent iterators here.

Parameters

returns

New iterator that starts from the current iterator position.

method IEnumerable<A> AsEnumerable () Source #

Create an IEnumerable from an Iterator

method Iterable<A> AsIterable () Source #

Create an Iterable from an Iterator

method void Deconstruct (out A head, out Iterator<A> tail) Source #

Deconstructor

method Iterator Select (Func<A, B> f) Source #

Functor map

method Iterator Map (Func<A, B> f) Source #

Functor map

method Iterator Bind (Func<A, Iterator> f) Source #

Monad bind

method Iterator<C> SelectMany <B, C> (Func<A, Iterator> bind, Func<A, B, C> project) Source #

Monad bind

method Iterator Apply (Iterator<Func<A, B>> ff, Iterator<A> fa) Source #

Applicative apply

method Iterator<A> Concat (Iterator<A> other) Source #

Concatenate two iterators

method S Fold <S> ( S state, Func<A, Func<S, S>> f) Source #

Fold the sequence while there are more items remaining

method S Fold <S> ( S state, Func<S, A, S> f) Source #

Fold the sequence while there are more items remaining

method S FoldBack <S> ( S state, Func<A, Func<S, S>> f) Source #

Fold the sequence in reverse while there are more items remaining

method S FoldBack <S> ( S state, Func<S, A, S> f) Source #

Fold the sequence in reverse while there are more items remaining

method S FoldWhile <S> ( S state, Func<A, Func<S, S>> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence while the predicate returns true and there are more items remaining

method S FoldWhile <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence while the predicate returns true and there are more items remaining

method S FoldBackWhile <S> ( S state, Func<S, Func<A, S>> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence in reverse while the predicate returns true and there are more items remaining

method S FoldBackWhile <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence in reverse while the predicate returns true and there are more items remaining

method S FoldUntil <S> ( S state, Func<A, Func<S, S>> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence until the predicate returns true or the sequence ends

method S FoldUntil <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence until the predicate returns true or the sequence ends

method S FoldBackUntil <S> ( S state, Func<S, Func<A, S>> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence in reverse until the predicate returns true or the sequence ends

method S FoldBackUntil <S> ( S state, Func<S, A, S> f, Func<(S State, A Value), bool> predicate) Source #

Fold the sequence in reverse until the predicate returns true or the sequence ends

method Iterator<A> Merge (Iterator<A> other) Source #

Interleave two iterator sequences together

Whilst there are items in both sequences, each is yielded after the other. Once one sequence runs out of items, the remaining items of the other sequence is yielded alone.

method Iterator<(A First , A Second)> Zip (Iterator<A> other) Source #

Zips the items of two sequences together

The output sequence will be as long as the shortest input sequence.

method void Dispose () Source #

Dispose

method bool Equals (object? obj) Source #

Equality test

Parameters

param	obj	Other iterator to compare against
returns	True if equal

method bool Equals (Iterator<A>? rhs) Source #

Equality test

Parameters

param	other	Other iterator to compare against
returns	True if equal

method int GetHashCode () Source #

method IEnumerator<A> GetEnumerator () Source #

Get enumerator

Parameters

returns

method string ToString () Source #

method string ToFullString (string separator = ", ") Source #

Format the collection as a, b, c, ...

method string ToFullArrayString (string separator = ", ") Source #

Format the collection as [a, b, c, ...]

Operators

operator + (Iterator<A> ma, Iterator<A> mb) Source #

Combine two sequences

class Nil Source #

Nil iterator case

The end of the sequence.

Fields

field Iterator<A> Default = new Nil() Source #

Properties

property A Head Source #

Head element

property Iterator<A> Tail Source #

Tail of the sequence

property bool IsEmpty Source #

Return true if there are no elements in the sequence.

property long Count Source #

Return the number of items in the sequence.

Requires all items to be evaluated, this will happen only once however.

Methods

method string ToString () Source #

method Iterator<A> Clone () Source #

Clone the iterator so that we can consume it without having the head item referenced. This will stop any GC pressure.

method Iterator<A> Split () Source #

When iterating a sequence, it is possible (before evaluation of the Tail) to Terminate the current iterator and to take a new iterator that continues on from the current location. The reasons for doing this are to break the linked-list chain so that there isn't a big linked-list of objects in memory that can't be garbage collected.

Parameters

returns

New iterator that starts from the current iterator position

method void Dispose () Source #

class Cons Source #

Cons iterator case.

Contains a head value and a tail that represents the rest of the sequence.

Methods

method void Deconstruct (out A head, out Iterator<A> tail) Source #

class IteratorExtensions Source #

Methods

method Iterator<A> As <A> (this K<Iterator, A> ma) Source #

Downcast operator

method Iterator<A> GetIterator <A> (this IEnumerable<A> enumerable) Source #

Get an iterator for any IEnumerable

class Iterator Source #

Methods

method Iterator<A> from <A> (IEnumerable<A> enumerable) Source #

Create an iterator from an IEnumerable

Parameters

type	A
param	enumerable
returns

method Iterator<A> singleton <A> (A head) Source #

Construct a singleton sequence

Parameters

type	A	Bound value type
param	head	Head item
returns	Iterator

method Iterator<A> Cons <A> (A head, Iterator<A> tail) Source #

Construct a sequence from a head item and a tail sequence

Parameters

type	A	Bound value type
param	head	Head item
param	tail	Tail sequences
returns	Iterator

method Iterator<A> Cons <A> (A head, Func<Iterator<A>> tail) Source #

Construct a sequence from a head item and a tail sequence

Parameters

type	A	Bound value type
param	head	Head item
param	tail	Tail sequences
returns	Iterator

method Iterator<A> Nil <A> () Source #

Empty sequence

Parameters

type	A	Bound value type
returns	Iterator

class Iterator Source #

Problem: IEnumerator<A>

Solution: Iterator<A>

Problem: `IEnumerator<A>`

Solution: `Iterator<A>`