What is the difference between ToArray and ToList?

Introduction


In the beginning of June I attended .NET Summit conference in Minsk, Belarus. Besides interesting presentations (recording are available on YouTube, language is Russian / English) there were interesting quizzes.

One of the questions was:

What is the difference between ToArray and ToList?

Software Engineer

Hmm… what could be the difference besides one returns T[] and another one returns List<T>?

If you don’t know the answer or just like digging into sources of dotnet/corefx then I invite you to join me and find out the answer!

Note

All code snippets were taken from release/3.0 branch of dotnet/corefx repository. All code snippets were truncated to make them more representative (removed input argument checks, contracts, …, etc.).

Tests were targeting .NET Core 3.0 preview 7. However most of the information should be relevant for .NET Core 2.0+ and .NET Framework 4.5+.

Benchmark


Every journey has it’s first step. I usually start with a simple benchmark.

Here is one I’ve created using BenchmarkDotNet:

[CoreJob]
[RPlotExporter, MemoryDiagnoser, RankColumn]
public class ToArrayToList
{
  private IEnumerable<int> data;

  [Params(10, 100, 1000, 10000)]
  public int N;

  [GlobalSetup]
  public void Setup()
  {
    this.data = new int[this.N];
  }

  [Benchmark]
  public int[] ToArray() => this.data.ToArray();

  [Benchmark]
  public List<int> ToList() => this.data.ToList();
}

public class Program
{
  public static void Main(string[] args)
  {
    var summary = BenchmarkRunner.Run<ToArrayToList>();
  }
}

On my machine this benchmark produces the following results:

MethodNMeanGen 0Gen 1Allocated
ToArray1049.527 ns0.015364 B
ToList1045.850 ns0.022996 B
ToArray10082.490 ns0.1013424 B
ToList10078.811 ns0.1090456 B
ToArray1000324.100 ns0.96134024 B
ToList1000321.039 ns0.96894056 B
ToArray100003,223.046 ns9.52151.190240024 B
ToList100003,278.019 ns9.52151.903540056 B

At the first glance there is no significant difference between two methods. However in such cases it is important to be absolutely sure and the only way to make sure is to examine the source code.

Examining ToArray and ToList methods


All required source code is on GitHub so it is easy to find implementations of ToArray and ToList methods inside dotnet/corefx repository:

public static partial class Enumerable
{
  // ...
  public static TSource[] ToArray<TSource>(
    this IEnumerable<TSource> source)
  {
    return source is IIListProvider<TSource> arrayProvider
      ? arrayProvider.ToArray()
      : EnumerableHelpers.ToArray(source);
  }
  // ...
  public static List<TSource> ToList<TSource>(
    this IEnumerable<TSource> source)
  {
    return source is IIListProvider<TSource> listProvider 
      ? listProvider.ToList() 
      : new List<TSource>(source);
  }
  // ...
}

* the code snippets for Enumerable.ToArray and Enumerable.ToList were obtained from dotnet/corefx repository

Both methods start from an attempt to convert input source to IIListProvider<TSource> (lines: 7, 15) and since in benchmark input sequence is initialized as int[] array:

// ...
[GlobalSetup]
public void Setup()
{
  this.data = new int[this.N];
}
// ...

… and in .NET all arrays are descendants of System.Array class which doesn’t implement IIListProvider<T> interface – we can be sure that execution proceeds right to the else branch.

Stepping into Enumerable.ToArray “else” branch


The else branch of Enumerable.ToArray continues with a call to EnumerableHelpers.ToArray method:

internal static partial class EnumerableHelpers
{
  // ...
  
  internal static T[] ToArray<T>(
    IEnumerable<T> source)
  {
    if (source is ICollection<T> collection)
    {
      int count = collection.Count;
      if (count == 0)
      {
        return Array.Empty<T>();
      }
      var result = new T[count];
      collection.CopyTo(result, arrayIndex: 0);
      return result;
    }
    var builder = new LargeArrayBuilder<T>(initialize: true);
    builder.AddRange(source);
    return builder.ToArray();
  }
  // ...
}

* the code snippet for EnumerableHelpers.ToArray was obtained from dotnet/corefx repository

Method starts with an attempt to convert input source to ICollection<T>. Successful conversion results into “happy path” where it allocates new array and copies all source (aliased as collection) items into it.

Stepping into Enumerable.ToList “else” branch


The else branch of Enumerable.ToList continues with an instantiation of a new List<T> object with source passed as constructor argument:

internal static partial class EnumerableHelpers
{
  // ...
  public List(IEnumerable<T> collection)
  {
    if (collection is ICollection<T> c)
    {
      int count = c.Count;
      if (count == 0)
      {
        _items = s_emptyArray;
      }
      else
      {
        _items = new T[count];
        c.CopyTo(_items, 0);
        _size = count;
      }
    }
    else
    {
      _size = 0;
      _items = s_emptyArray;
      using (IEnumerator<T> en = collection!.GetEnumerator())
      {
        while (en.MoveNext())
        {
          Add(en.Current);
        }
      }
    }
  }
  // ...
}

* the code snippet for List<T> was obtained from dotnet/corefx repository

Method starts with the attempt to convert input source to ICollection<T>. Successful conversion results into “happy path” where it allocates new array and copies all source (aliased as collection) items into it.

Unexpected discovery


You might already noticed the reason why benchmark results are so close 🙂 If not let’s take another look at the benchmark setup:

// ...
[GlobalSetup]
public void Setup()
{
  this.data = new int[this.N];
}
// ...

Here we initialize sequence with int[] array. As was mentioned previously all arrays are descendants of System.Array class which does implement ICollection<T> interface.

So both ToArray and ToList methods basically do the same thing:

// ...
var array = new T[collection.Count];
collection.CopyTo(array, 0);
// ...

While this isn’t quite what one can expect from benchmark it is always important to know the edge cases.

Benchmark #2


Let’s modify the benchmark to use Enumerable.Range method:

// ...
[GlobalSetup]
public void Setup()
{
  this.data = Enumerable.Range(0, this.N);
}
// ...

Executing this new benchmark on my machine produces the following results:

MethodNMeanGen 0Gen 1Allocated
ToArray1015.56 ns0.015364 B
ToList1031.97 ns0.022996 B
ToArray10087.93 ns0.1013424 B
ToList100176.19 ns0.1090456 B
ToArray1000771.31 ns0.96134024 B
ToList10001,568.32 ns0.96894056 B
ToArray100007,296.05 ns9.52150.007640024 B
ToList1000015,276.89 ns9.52150.015340056 B

Wooow! These results looks amazing! Do you feel this? The feeling of uncovered conspiracy theory? 🙂

Before starting the celebration… Have you noticed one very strange thing in these results?

Here is a union of two tables where (#1 are the results of the first benchmark and #2 are the results of current benchmark):

MethodN(#1) Mean(#2) Mean (#1) Allocated(#2) Allocated
ToArray1049.527 ns15.56 ns64 B64 B
ToList1045.850 ns31.97 ns96 B96 B
ToArray10082.490 ns87.93 ns424 B424 B
ToList10078.811 ns176.19 ns456 B456 B
ToArray1000324.100 ns771.31 ns4024 B4024 B
ToList1000321.039 ns1,568.32 ns4056 B4056 B
ToArray100003,223.046 ns7,296.05 ns40024 B40024 B
ToList100003,278.019 ns15,276.89 ns40056 B40056 B

Can you see it? Both benchmarks allocate absolutely the same amount of memory. This is suspicious because in “benchmark #1” we were allocating fixed size array for the entire input sequence which is a by the way a bare minimum of memory we could use to store the sequence, so any extra allocation (which definitely should happen in “benchmark #2”) should make a difference.

Unexpected discovery #2


To better understand what is happening let’s see what is exactly returned by Enumerable.Range method:

public static partial class Enumerable
{
  // ...
  public static IEnumerable<int> Range(
    int start, 
    int count)
  {
    long max = ((long)start) + count - 1;
    if (count == 0)
    {
      return Empty<int>();
    }
    return new RangeIterator(start, count);
  }
  // ...
}

* the code snippet for Enumerable.Range was obtained from dotnet/corefx repository

It returns a new instance of RangeIterator class which implements internal IPartition<T> interface:

private sealed partial class RangeIterator 
  : IPartition<int>
{
  // ...
}

* the code snippet for RangeIterator was obtained from dotnet/corefx repository

Which in turn extends IIListProvider<TElement> interface:

internal interface IPartition<TElement> 
  : IIListProvider<TElement>
{
  // ...
}

* the code snippet for IPartition<T> was obtained from dotnet/corefx repository

So basically the benchmark executes ToArray and ToList methods from RangeIterator class because now source is convertible to IIListProvider<TElement>:

public static partial class Enumerable
{
  // ...
  public static TSource[] ToArray<TSource>(
    this IEnumerable<TSource> source)
  {
    // ...
    return source is IIListProvider<TSource> arrayProvider
      ? arrayProvider.ToArray()
      : EnumerableHelpers.ToArray(source);
  }
  // ...
  public static List<TSource> ToList<TSource>(
    this IEnumerable<TSource> source)
  {
    // ...
    return source is IIListProvider<TSource> listProvider 
      ? listProvider.ToList() 
      : new List<TSource>(source);
  }
  // ...
}

* the code snippets for Enumerable.ToArray and Enumerable.ToList were obtained from dotnet/corefx repository

Implementation of RangeIterator clearly reveals the reason for such small memory footprint – both methods pre-allocate int[] and List<int>:

private sealed partial class RangeIterator : IPartition<int>
{
  // ...
  public int[] ToArray()
  {
    int[] array = new int[_end - _start];
    int cur = _start;
    for (int i = 0; i != array.Length; ++i)
    {
      array[i] = cur;
      ++cur;
    }
    return array;
  }
  // ...
  public List<int> ToList()
  {
    List<int> list = new List<int>(_end - _start);
    for (int cur = _start; cur != _end; cur++)
    {
      list.Add(cur);
    }
    return list;
  }
  // ...
}

* the code snippet for RangeIterator was obtained from dotnet/corefx repository

This benchmark demonstrates one very important moment – you never know where optimization is.

Benchmark #3


Let’s modify the benchmark to use compiler generated iterator:

// ...
[GlobalSetup]
public void Setup()
{
  this.data = this.GetEnumerable();
}

private IEnumerable<int> GetEnumerable()
{
  for (var i = 0; i < this.N; ++i)
  {
    yield return i;
  }
}
// ...

Executing this new benchmark on my machine produces the following results:

MethodNMeanGen 0Gen 1Allocated
ToArray10185.7 ns0.0610256 B
ToList10142.7 ns0.0610256 B
ToArray100804.8 ns0.28421192 B
ToList100761.6 ns0.29181224 B
ToArray10005,890.1 ns2.03708536 B
ToList10005,770.2 ns2.02188464 B
ToArray1000056,033.2 ns25.2686106224 B
ToList1000059,923.0 ns31.189010.3760131440 B

These results looks much more realistic. However there is no absolute winner here but rather a sign to continue our journey (there is no obvious explanation of such a significant difference in Allocated column in last benchmark).

Because compiler generated iterator doesn’t implement neither ICollection<T> nor IListProvider<T> interfaces we now can be sure these results are produced by this piece of EnumerableHelpers.ToArray method:

internal static partial class EnumerableHelpers
{
  // ...
  internal static T[] ToArray<T>(
    IEnumerable<T> source)
  {
    // ...
    var builder = new LargeArrayBuilder<T>(initialize: true);
    builder.AddRange(source);
    return builder.ToArray();
  }
  // ...
}

* the code snippet for EnumerableHelpers.ToArray was obtained from dotnet/corefx repository

… and this piece of List<T> constructor:

public List(IEnumerable<T> collection)
{
  // ...
  if (collection is ICollection<T> c)
  {
    // ...
  }
  else
  {
    _size = 0;
    _items = s_emptyArray;
    using (IEnumerator<T> en = collection!.GetEnumerator())
    {
      while (en.MoveNext())
      {
        Add(en.Current);
      }
    }
  }
}

* the code snippet for List<T> was obtained from dotnet/corefx repository

To better understand results of last benchmark we need to find out how exactly LargeArrayBuilder<T> and List<T> allocate memory.

How LargeArrayBuilder<T> allocate?


LargeArrayBuilder<T> is an internal struct used to create dynamically sized arrays. On the high-level what it does is it allocates multiple fixed sized arrays to store input items and then combines them into single array. This high-level overview is mostly enough to understand the big picture but in our case details are important.

The EnumerableHelpers.ToArray method invokes the following LargeArrayBuilder<T> methods:

  • LargeArrayBuilder(initialize: true)
  • LargeArrayBuilder<T>.AddRange(source)
  • LargeArrayBuilder<T>.ToArray()

Let’s see what each of these invocations does.

LargeArrayBuilder(initialize: true)


All instances of LargeArrayBuilder<T> structure are initialized with some maximum capacity value. This value limits the amount of memory instance can consume. In our case new instance is initialized with int.MaxValue value:

internal struct LargeArrayBuilder<T>
{
  // ...
  private const int StartingCapacity = 4;
  private const int ResizeLimit = 8;
  // ...
  private readonly int _maxCapacity;  
  private T[] _first;   
  private ArrayBuilder<T[]> _buffers; 
  private T[] _current;               
  private int _index;
  private int _count;
  // ...
  public LargeArrayBuilder(bool initialize)
    : this(maxCapacity: int.MaxValue)
  {
  }
  
  public LargeArrayBuilder(int maxCapacity)
    : this()
  {
    _first = _current = Array.Empty<T>();
    _maxCapacity = maxCapacity;
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T> was obtained from dotnet/corefx repository

Beside maximum capacity LargeArrayBuider<T> struct has a bunch of private fields and constants.

Let’s see what each of them does:

FieldDescription
ResizeLimitInteger constant, has value of 8.
StartingCapacityInteger constant, has value of 4.
_maxCapacity The maximum capacity builder can have.
_first The first buffer items are stored in. Resized until ResizeLimit.
_buffers After ResizeLimit * 2, previously filled out buffers are stored here.
_current Currently filling buffer. If _count <= ResizeLimit, this is _first.
_index Index into the current buffer.
_count Count of all of the items in this builder.

LargeArrayBuilder<T>.AddRange(source)


The next stop is LargeArrayBuilder<T>.AddRange method. This method iterates over input sequence and stores items in current buffer (aliased as destination (line: 8)):

internal struct LargeArrayBuilder<T>
{
  // ...
  public void AddRange(IEnumerable<T> items)
  {
    using (IEnumerator<T> enumerator = items.GetEnumerator())
    {
      T[] destination = _current;
      int index = _index;
      while (enumerator.MoveNext())
      {
        T item = enumerator.Current;
                    
        if ((uint)index >= (uint)destination.Length)
        {
          AddWithBufferAllocation(item, ref destination, ref index);
        }
        else
        {
          destination[index] = item;
        }
                    
        index++;
      }
      _count += index - _index;
      _index = index;
    }
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T>.AddRange was obtained from dotnet/corefx repository

When current buffer is filled out (line: 14) control is passed to LargeArrayBuilder<T>.AddWithBufferAllocation method (line: 16):

internal struct LargeArrayBuilder<T>
{
  // ...
  private void AddWithBufferAllocation(
    T item, 
    ref T[] destination, 
    ref int index)
  {
    _count += index - _index;
    _index = index;
    AllocateBuffer();
    // ...
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T>.AddWithBufferAllocation was obtained from dotnet/corefx repository

This method increases count of items stored in builder by number of items filled in the current buffer (line: 9) and calls LargeArrayBuilder<T>.AllocateBuffer method (line: 11):

internal struct LargeArrayBuilder<T>
{
  // ...
  private void AllocateBuffer()
  {
    if ((uint)_count < (uint)ResizeLimit)
    {  
      int nextCapacity = Math.Min(
        _count == 0 ? StartingCapacity : _count * 2,
        _maxCapacity);

      _current = new T[nextCapacity];
      Array.Copy(_first, 0, _current, 0, _count);

      _first = _current;
    }
    else
    {
      int nextCapacity;
      if (_count == ResizeLimit)
      {
        nextCapacity = ResizeLimit;
      }
      else
      {
        _buffers.Add(_current);
        nextCapacity = Math.Min(_count, _maxCapacity - _count);
      }
      _current = new T[nextCapacity];
      _index = 0;
    }
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T>.AllocateBuffer was obtained from dotnet/corefx repository

The LargeArrayBuilder<T>.AllocateBuffer method is responsible for allocating a new buffer. However, what exactly this method does depends on the current count of items stored in builder:

  • When adding 1st item (_count == 0) it allocates buffer of StartingCapacity size (which is 4) and stores it in _first and _current fields. This buffer is used to store first 4 items (lines: 9-15).
  • When adding 5th item (_count == 4) it allocates buffer of StartingCapacity * 2 (which is 8) size and stores it in _first and _current fields. The items from previous buffer are copied into new buffer. This buffer is used to store first 8 items (lines: 9-15).
  • When adding 9th item (_count == 8) it allocates buffer of ResizeLimit size (which is 8) and stores it in _current field while leaving previous buffer in the _first field. This new buffer is used to store next 8 items (lines: 22, 29).
  • When adding more items (_count > 8) they are stored in _current buffer. When _current buffer becomes full it is pushed into _buffers list and new buffer of _count size is allocated. This buffer is stored in _current field and used to store next _count items (lines: 26-27, 29).

On return LargeArrayBuilder<T>.AddWithBufferAllocation aliases current buffer (line: 11) and current index (line: 12) as destination and index and stores new item in current buffer (line: 13):

internal struct LargeArrayBuilder<T>
{
  // ...
  private void AddWithBufferAllocation(
    T item, 
    ref T[] destination, 
    ref int index)
  {
    // ...
    AllocateBuffer();
    destination = _current;
    index = _index;
    _current[index] = item;
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T>.AddWithBufferAllocation was obtained from dotnet/corefx repository

This way execution continues until the end of input sequence.

Looks a bit complicated? Let’s take a small example to see how this works for a sequence of 20 items:

Item #Trace_count_first_current_buffers
14 items buffer allocated (new T1[4])
1 item stored in T1[0]
_
3 items to store
0T1[4]T1[4](empty)
2-4Storing items in T1[1]-T1[3]1-4
58 items buffer allocated (new T2[8])
4 items copied from T1[0-3] to T2[0-3]
1 item stored in T2[4]
_
3 items to store
4T2[8]T2[8](empty)
6-8Storing items in T2[5]-T2[7]5-8
98 items buffer allocated (new T3[8])
1 item stored in T3[0]
_
7 items to store
8T2[8]T3[8](empty)
10-16Storing items in T3[1]-T3[7] 9-16
1716 items buffer allocated (new T4[16])
1 buffer stored (T3)
1 item stored in T4[0]
_
15 items to store
16T2[8]T4[16](T3[8])
18-20Storing items in T4[1]-T4[3] 17-20
2012 items to store20T2[8] T4[16] (T3[8])
Total4 buffers allocated
1 buffers stored
36 item allocated

As you can see storing sequence of 20 items requires LargeArrayBuilder<T> to allocate memory for 36 items (almost twice as more!). At first glance this may look quite inefficient, however before blaming LargeArrayBuilder<T> in inefficiency you should note that it will allocate the same 36 items for any input sequence of 17 to 32 items inclusively.

Now the questions is – How many items it will allocate for sequences of 100, 1000 or 10000 items?

We obviously can’t draw a table for such large sequences (we can but this isn’t a productive way of doing things) but we can calculate this as sum of capacities of all allocated buffers.

We already know how LargeArrayBuilder<T> allocates new buffers, so we can express last current buffer capacity (hereinafter C) as function of input sequence length (hereinafter S):

C = f(S) \rightarrow \begin{cases} 4, & S \geq 1 \text{ and } S \leq 4 \\ 8, &S > 4 \text{ and } S \leq 8 \\ 8 \times 2^{n - 1}, &S > 8 \end{cases}

Third case represents capacity as n-th term of geometric progression where 8 is a scale factory and 2 is a common ratio. Such representation is possible because when _count exceeds ResizeLimit value (which is 8) new buffer’s capacity is chosen to be equal to _count value which leads to a doubling of the capacity of each subsequent buffer allocation. This allows us to represent all possible capacity values as geometric progression.

Allocating same amount of items as we have already stored leads to another interesting effect – capacity of the last current buffer (hereinafter C_l) should be enough to store at least half of the input sequence:

C_l \geq S \div 2

Replacing C_l with a formula for n-th term:

8 \times 2^{n - 1} \geq S \div 2

Now we can express n – a position term representing capacity of last current buffer which also equals to count of buffers allocated after _count exceeds ResizeLimit:

n = \dfrac{\ln(S \div 16)}{\ln(2)} + 1

Here is a step by step explanation (feel free to skip is solutions is obvious to you):

Equation

8 \times 2^{n - 1} \geq S \div 2

Solution

Divide both sides by 8:

2^{n - 1} = S \div 16

Then take \log_{2} of both sides:

\log_{2}(2^{n - 1}) = \log_{2}(S \div 16)

Now use the property of logarithms (power rule) to simplify the left side:

(n - 1) \times \log_{2}(2) = \log_{2}(S \div 16)

Because \log_{2}(2) = 1, remove it:

n - 1 = \log_{2}(S \div 16)

Increment both sides by 1:

n = \log_{2}(S \div 16) + 1

And change logarithm base from 2 to e:

n = \dfrac{\ln(S \div 16)}{\ln(2)} + 1

Now we need to address a very important moment – because allocated buffer store multiple items the above formula should return the same n for a range of sequence lengths (i.e. it should return the same result for sequence of 9 to 16 items).

Such behavior can be achieved by ceiling rational parts or the equation:

n = \bigg\lceil\dfrac{\ln(\lceil S \div 16 \rceil)}{\ln(2)}\bigg\rceil + 1

Using n we now can express total count of buffers allocated, total count of items allocated and total count of buffers stored values:

Total count of buffers allocated

T_\text{buffers allocated} = \begin{cases} 1, &S \geq 1 \text{ and } S \leq 4 \\ 2, &S > 4 \text{ and } S \leq 8 \\ 2 + n, &S > 8 \end{cases}

The case for S > 8 is how many buffers was allocated after 8th item plus number of previously allocated buffers.

Author’s note

Total count of items allocated

T_\text{items allocated} = \begin{cases} 4, &S \geq 1 \text{ and } S \leq 4 \\ 12, &S > 4 \text{ and } S \leq 8 \\ 12 + (-8 \times (1-2^n)), &S > 8 \end{cases}

The case for S > 8 is a sum of geometric progression plus count of previously allocated items.

Author’s note

Total count of buffers stored

T_\text{buffers stored} = \begin{cases} 0, &S \geq 1 \text{ and } S \leq 16 \\ n - 1, &S > 16 \end{cases}

The last case for S > 16 is how many buffers was allocated after 8th item minus currently active buffer.

Author’s note

Here are the calculations for benchmark sequences:

Sequence length Total
count of buffers
allocated
Total
count of items
allocated
Total
count of buffers stored
103200
10061323
1000910286
10000131638810

LargeArrayBuilder<T>.ToArray()


The LargeArrayBuilder<T>.ToArray method is responsible for allocating final array (line: 7) and coping buffered items to it (line: 8). The copy part is delegated to LargeArrayBuilder<T>.CopyTo method:

internal struct LargeArrayBuilder<T>
{
  // ...
  public T[] ToArray()
  {
    // ...
    array = new T[_count];
    CopyTo(array, 0, _count);
    return array;
  }
  // ...
}

* the code snippet for LargeArrayBuilder<T>.ToArray and is obtained from dotnet/corefx repository

LargeArrayBuilder<T>.CopyTo method is very straightforward. It iterates over all allocated buffers and copies items to input array. Correct buffer is selected using LargeArrayBuilder<T>.GetBuffer method:

internal struct LargeArrayBuilder<T>
{
  // ...
  public void CopyTo(T[] array, int arrayIndex, int count)
  {
    for (int i = 0; count > 0; i++)
    {
      T[] buffer = GetBuffer(index: i);
      int toCopy = Math.Min(count, buffer.Length);
      Array.Copy(buffer, 0, array, arrayIndex, toCopy);
      count -= toCopy;
      arrayIndex += toCopy;
    }
  }
  // ...
  public T[] GetBuffer(int index)
  {
    return index == 0 
      ? _first
      : index <= _buffers.Count 
        ? _buffers[index - 1] 
        : _current;
  }
  // ...
}

* the code snippets for LargeArrayBuilder<T>.CopyTo and LargeArrayBuilder<T>.GetBuffer were obtained from dotnet/corefx repository

Summing up


Knowledge of how LargeArrayBuilder<T> allocates memory allows us to estimate how much memory will be required to build an array depending on input sequence length:

\begin{aligned} T_\text{estimated} = V_\text{buffers} + V_\text{final array} \end{aligned} \\\\\\ \begin{aligned} \text{where}\ &T_\text{estimated} &&-\ \text{is the estimated amount of memory required in bytes} \\ &V_\text{buffers} &&-\ \text{is the amount of memory required to hold all buffers} \\ &V_\text{final array} &&-\ \text{is the amount of memory required to hold final array} \end{aligned}

Memory required to hold a single array instance is calculated as memory required to hold System.Array object plus memory required to hold all array elements:

\begin{aligned}V_\text{array} = V_\text{object} + N_\text{elements} \times V_{element} \end{aligned} \\\\\\ \begin{aligned} \text{where}\ &V_\text{object} &&-\ \text{is the size of object in bytes} \\ &V_\text{element} &&-\ \text{is the size of array element in bytes} \\ &N_\text{elements} &&-\ \text{is the number of elements in the array}\end{aligned}

Hence, the amount of memory required to hold all buffers can be expressed through Total count of buffers allocated and Total count of items allocated values as:

V_\text{buffers} = T_\text{buffers allocated} \times V_\text{object} + T_\text{items allocated} \times V_\text{element}

… and amount of memory required to hold final array:

V_\text{final array} = V_\text{object} + S \times V_\text{element} \\\\\\ \text{where}\ S -\ \text{is the length of the input sequence}

So estimated amount of memory required to build an array can be expressed as:

\begin{aligned}T_\text{estimated} &= T_\text{buffers allocated} \times V_\text{object}\ + \\ &+ T_\text{items allocated} \times V_\text{element}\ + \\ &+  V_\text{object} + S \times V_\text{element}\end{aligned}

Considering the following values on 64 bit systems (in bytes):

\begin{aligned} &V_\text{object} &&= 24 \\  &V_\text{element} &&= \text{sizeof(int)} \rightarrow 4 \end{aligned}

The above formula can be simplified to:

T_\text{estimated} = 24 \times T_\text{buffers allocated} + 4 \times T_\text{items allocated} + (4 \times S + 24)

Beside memory allocated by instance of LargeArrayBuilder<T> benchmark results also include memory allocated by compiler generated enumerator (40 bytes on x64 system) and memory allocated by ArrayBuilder<T> structure (_buffers field). Because we don’t know how ArrayBuilder<T> allocates we can treat it’s allocations as estimation error, however because we know amount of memory consumed by compiler generated enumerator we should subtract its size from the benchmark results:

Sequence length Total count of buffers allocatedTotal count of items allocatedTotal bytes
estimated
Total bytes
allocated (iterator adjusted)
Estimation error %
103202162160
1006132109611520.0486
100091028835284960.0169
1000013163881058881061840.0027

How List<T> allocates?


When new instance of List<T> is created from the IEnumerable<T> (which isn’t ICollection<T>) all items from input sequence are sequentially added to empty List<T> instance (lines: 16-21):

public class List<T>
{
  // ...
  public List(IEnumerable<T> collection)
  {
    // ...
    if (collection is ICollection<T> c)
    {
      // ...
    }
    else
    {
      _size = 0;
      _items = s_emptyArray;
      using (IEnumerator<T> en = collection!.GetEnumerator())
      {
        while (en.MoveNext())
        {
          Add(en.Current);
        }
      }
    }
  }
  // ...
}

* the code snippet for List<T> was obtained from dotnet/corefx repository

Items are added using List<T>.Add method:

public class List<T>
{
  // ...
  public void Add(T item)
  {
    _version++;
    T[] array = _items;
    int size = _size;
    if ((uint)size < (uint)array.Length)
    {
      _size = size + 1;
      array[size] = item;
    }
    else
    {
      AddWithResize(item);
    }
  }  
  // ...
}

* the code snippet for List<T>.Add was obtained from dotnet/corefx repository

This method checks if there is enough capacity in the internal array to store new item (line: 9) and if not it switches execution to List<T>.AddWithResize method (line: 16):

public class List<T>
{
  // ...
  private void AddWithResize(T item)
  {
    int size = _size;
    EnsureCapacity(size + 1);
    _size = size + 1;
    _items[size] = item;
  }
  // ...
  private void EnsureCapacity(int min)
  {
    if (_items.Length < min)
    {
      int newCapacity = _items.Length == 0 
        ? DefaultCapacity 
        : _items.Length * 2;
      if ((uint)newCapacity > Array.MaxArrayLength) 
        newCapacity = Array.MaxArrayLength;
      if (newCapacity < min) 
        newCapacity = min;
      Capacity = newCapacity;
    } 
  }
  // ...
}

* the code snippets for List<T>.AddWithResize and List<T>.EnsureCapacity were obtained from dotnet/corefx repository

List<T>.AddWithResize works in pair with List<T>.EnsureCapacity method: the latter ensure there will be enough space to store one more item by recalculating internal array capacity by either setting it to List<T>.DefaultCapacity (which is 4 items) in case when current internal array is empty array (line: 17) or by doubling current capacity value (line: 18).

Internal array is reallocated inside of List<T>.Capacity property setter. Besides array allocation it also copies items from old array to the new one.

Author’s note

Looks quite simple. Right? I am sure it is, but still let’s take a small example and see how this works for a sequence of 20 items:

Item #TraceCapacity
14 items array allocated (T1[4])
1 item stored in T1[0]

3 items to store
4
2-4Storing items in T1[1]-T1[3]
58 items array allocated (T2[8])
4 items copied from T1[0-3] to T2[0-3]
1 item stored in T2[4]

3 items to store
8
6-8Storing items in T2[5]-T2[7]
916 items array allocated (T3[16])
8 items copied from T2[0-7] to T3[0-7]
1 item stored in T3[8]

7 items to store
16
10-16Storing items in T3[9]-T2[15]
1732 items array allocated (T4[32])
16 items copied from T3[0-15] to T4[0-15]
1 item stored in T4[16]

15 items to store
32
18-20Storing items in T4[17]-T4[19]
Total4 arrays allocated
60 items allocated

So List<T> allocates 60 items in order to store 20 items. It is much more than LargeArrayBuilder<T>, which allocates 36 items for the same input. Before making any decisions let’s calculate how many internal arrays and items are allocated for sequences of 10, 100, 1000 and 10000 items.

This can be easily done because as you might already noticed – List<T> capacity is always doubled… which means it’s value can be expressed through geometric progression where List<T>‘s default capacity 4 is a scale factor and 2 is a common ratio:

4, 4 \times 2, 4 \times 2^2, 4 \times 2^3, ..., etc

The n term of the progressions is calculated as:

C_n = 4 \times 2^{n - 1}

Because List<T>‘s internal array should be able to store all items stored in previous internal array we can say that last internal array capacity (hereinafter C_l) should be at least enough to hold all sequence items:

C_l \geq S

Replacing C_l with a formula for n-th term:

4 \times 2^{n - 1} \geq S

The n is then expressed as:

4 \times 2^{n - 1} \geq S \rightarrow n = \bigg\lceil\dfrac{\ln(\lceil S \div 4 \rceil)}{\ln(2)}\bigg\rceil + 1

I won’t repeat the solution here. It is absolutely the same as we did for LargeArrayBuilder<T>.

Author’s note

Using n we now can express total count of array allocated and total count of items allocated values:

Total count of arrays allocated is equal to:

T_\text{arrays allocated} = n

Total count of items allocated is equal to:

T_\text{items allocated} = -4 \times (1-2^n)

Here are the calculations for benchmark sequences:

Sequence length Total count of arrays allocated Total count of items allocated
10328
1006252
100092044
100001332764

We also can calculate exactly how much memory will be required to create List<T> depending on the input sequence length:

\begin{aligned} T_\text{exact} = V_\text{arrays} + V_\text{List of T} \end{aligned} \\\\\\ \begin{aligned} \text{where}\ &T_\text{exact} &&-\ \text{is the exact amount of memory required in bytes} \\ &V_\text{arrays} &&-\ \text{is the amount of memory required to hold all arrays} \\ &V_\text{List of T} &&-\ \text{is the amount of memory consumed by empty List of T object}\end{aligned}

We already know how to calculate memory required to hold a single array instance:

V_\text{array} = V_\text{object} + N_\text{elements} \times V_{element}

Hence, the amount of memory required to hold all arrays can be expressed through Total count of allocated arrays and Total count of allocated items values as:

V_\text{arrays} = T_\text{arrays allocated} \times V_\text{object} + T_\text{items allocated} \times V_\text{element}

Hence, exact amount of memory required is:

T_\text{exact} = T_\text{arrays allocated} \times V_\text{object} + T_\text{items allocated} \times V_\text{element} + V_\text{List of T}

Considering the following values on 64 bit systems (in bytes):

\begin{aligned} &V_\text{object} &&= 24 \\  &V_\text{element} &&= \text{sizeof(int)} \rightarrow 4 \\ &V_\text{List of T} &&= 32\end{aligned}

The above formula can be simplified to:

T_\text{exact} = 24 \times T_\text{arrays allocated} + 4 \times T_\text{items allocated} + 32

You might remember (or not) that in the first What is wrong with this code? post we have counted the size of empty List<T> as 40 bytes.

This was 100% correct for previous version of .NET Core. In .NET Core 3.0 most of the collection classes have their _syncRoot field removed (here is a PR) which resulted in smaller collection sizes.

Author’s note

Beside memory allocated by instance of List<T> benchmark results also include memory allocated by it’s enumerator (40 bytes on x64 system) which we should subtract from the benchmark results:

Sequence length Total count of arrays allocatedTotal count of items allocatedTotal bytes allocated (exact)Total bytes allocated (iterator adjusted)
10328216216
100625211841184
10009204484248424
100001332764131400131400

Unexpected discovery #3


Now when understand how both LargeArrayBuilder<T> and List<T> allocate we can compare how many items they allocate for the same amount of input elements:

Sequence length Total count of items allocated (List<T>)Total count of items allocated + count of items in final array (LargeArrayBuilder<T>)
102820 + 10 = 30
100252132 + 100 = 232
100020441028 + 1000 = 2028
100003276416388 + 10000 = 26388

This data explains the difference we saw in Allocated column for last benchmark – in that case List<T> allocated 6,376 items more than LargeArrayBuilder<T>.

Conclusion


If you are reading these lines then first of all – Thank you for reading this huge post. I hope you enjoyed it.

Here is a small reward in form of short summary:

Result #1

In cases when source is ICollection<T> both ToArray and ToList methods demonstrate almost the same performance characteristics because internally they execute almost identical code.

Result #2

In cases when input sequence is a product of multiple iterators the performance of both ToArray and ToList methods would depend on the optimizations implemented by those iterators and in some cases can differ significantly.

Result #3

In cases when input sequence is an enumerable both ToArray and ToList methods demonstrate very close performance characteristics in terms of speed. However for large sequences ToArray method could has smaller memory footprint. At the same time it is important to understand – because ToArray causes allocation of multiple independent arrays it could cause memory fragmentation (and even object promotion) because GC wouldn’t be able to clean these arrays until the ToArray operation is done. It is hard to say how critical these differences are but it is always good to know about them.

Bonus

Did you know that ArrayBuilder<T> class used inside of LargeArrayBuilder<T> class has the same allocation strategy as List<T>? 🙂

One thought on “What is the difference between ToArray and ToList?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s