Roll your own custom list with C# .NET

The generic list (https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1?view=netcore-3.1) is probably the most often used collection type in C#. It’s easy to use, it’s type safe, it offers a lot of manipulation methods and a myriad of LINQ functions for searching, sorting and much, much more. If you need a collection where you can add, remove, insert and set elements at an index then the List is your go-to collection type.

However, imagine that you’re sitting in a job interview and your interviewer tells you to build your own list-type collection without using any of the built-in collection types. Say you’re only allowed to use bare-bones language constructs from your C# 101 intro class. In this case what the interviewer really wants to know is how familiar you are with arrays.

Normally you wouldn’t do this in a real job assignment but quite many interviews revolve around data structures where they want to find out if you really know the ins and the outs of things like arrays, linked lists, queues, stacks etc.

Arrays and lists

What’s the difference between lists and arrays anyway? An array is a basic (primitive) data structure in .NET and many other computing languages to hold x number of contiguous data points such as here:

23769053149nullnullnull
Array with capacity 8 and size 5

This is an integer array with a capacity of 8 and a size of 5. This distinction is important. The array was declared as int[8] where 8 is the fixed size of the array. We must provide the array capacity, we cannot just write arr = new int[], we’ll get a compilation error in Java and C#. It would work in JavaScript though, like let arr = [], but arrays in JS are more like lists with ready-made utility functions such as pop, push, unshift etc.. In this post we’ll stick to arrays as they are implemented in C#.

Also, this capacity cannot be changed later on, it’s constant. Therefore expanding and shrinking the array is not possible just like that, out of the box. At this stage the array is empty, i.e. its size is 0 because we haven’t set any of the values in the various slots. We can set the value of each slot using a 0-based index, like arr[4] = 149 which will set slot #4 to 149 just like in the example above. In an array we can set the slots in any order, we don’t need to go in any particular order, like arr[0] = 23, arr[1] = 76, arr[2]=90 etc. We can have an array such as this one:

2376null53nullnull376null

A list on the other hand is a collection API that uses an array in the background for its work. It’s a utility or convenience class that makes working with arrays easier. It wraps an array to hide the implementation details from you so that you can conveniently call myList.Add(newItem), myList.Clear(), myList.RemoveAt(5) etc. without having to worry about stuff like copying and resizing an array. A List is also called a dynamic array which is a deceptive name, but there you go.

Implementation

The topic of this post is to come up with a possible implementation of a simple custom generic list using an array as its backup store. Let’s start with a skeleton. This skeleton also shows the implementation of the simplest methods:

using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Algorithms
{
    public class RollYourOwnList<T>
    {
        private T[] data;
        private int size = 0;
        private int capacity;

        public RollYourOwnList(int initialCapacity = 8)
        {
            if (initialCapacity < 1) initialCapacity = 1;
            this.capacity = initialCapacity;
            data = new T[initialCapacity];
        }

        public int Size { get { return size; } }
        public bool IsEmpty { get { return size == 0; } }

        public T GetAt(int index)
        {
            ThrowIfIndexOutOfRange(index);
            return data[index];
        }

        public void SetAt(T newElement, int index)
        {
            ThrowIfIndexOutOfRange(index);
            data[index] = newElement;
        }

        public void InsertAt(T newElement, int index)
        {
            ThrowIfIndexOutOfRange(index);
            //to be implemented...
        }

        public void DeleteAt(int index)
        {
            ThrowIfIndexOutOfRange(index);
            //to be implemented
        }

        public void Add(T newElement)
        {
            //to be implemented
        }

        public bool Contains(T value)
        {
            //to be implemented
            return false;
        }

        public void Clear()
        {
            //to be implemented
        }

        private void Resize()
        {
            //to be implemented
        }

        private void ThrowIfIndexOutOfRange(int index)
        {
            if (index > size - 1 || index < 0)
            {
                throw new ArgumentOutOfRangeException(string.Format("The current size of the array is {0}", size));
            }
        }        
    }
}

The class is generic and has three backing fields:

  • data: the array that will hold the values of the list
  • size: the number of initialised slots in the array
  • capacity: the total number of available slots in the array

We can conclude that size is less than or equal to the array capacity. The list is full once size = capacity. The constructor lets us set an initial capacity which defaults to 8. Why default to 8? I don’t know really. I remember reading about it somewhere, that the actual .NET List implementation sets the initial capacity to 8 but don’t take my word for it, it could have been a dream or something. If you know more then feel free to write in the comment section.

So now we have a backing array with available slots set by the initial capacity. The size variable is of course 0 to begin with. The Size and IsEmpty properties should be fairly easy to read: they return the actual size and whether the array size is 0.

The ThrowIfIndexOutOfRange function is called from various places in the code. It throws an ArgumentOutOfRangeException if the user tries to access a slot in the array that has not been initialised. We basically want to avoid a data store where we have nulls or other default values somewhere in the middle.

The GetAt and SetAt functions do exactly what you think they do: they access the internal array using the 0-based index and square brackets to either return the current value or rewrite an existing one.

Next, let’s look at the InsertAt function. It accepts the element to be added and the position index. This is not the same as SetAt. Here we want to make room for a new item in the array without overwriting any existing values. What we need to do is shift the range of values in the array from and including the requested index so that the position at index becomes free. Then we can set the value of that new, empty slot. However, before the move can happen, we also need to check whether we have reached the max capacity, in which case we must extend the array.

Say that we have the following array:

54342577

…and we want to insert a new integer at position 2, i.e. that is currently occupied by the number 25.

So at first we check if we have room for one more. We clearly don’t so we need to resize the array. This is simply done by creating a new, bigger array and copying the existing values from the existing array into the same slots where they were before:

54342577

How much bigger should the new array be? A good rule of thumb is to double its current capacity, so in this example the new capacity is 8. You see that copying and shifting values is a relatively time-consuming operation so we want to economise on that.

The next step is to shift the values 25 and 77, i.e. everything from index 2 until the size of the array to the right:

54342577

Now we have an empty slot at position 2 without any data loss. We can now set position 2 of the array to the required integer, say 165:

54341652577

Here’s the Resize() method:

private void Resize()
{
    T[] resized = new T[capacity * 2];
    for (int i = 0; i < capacity; i++)
    {
        resized[i] = data[i];
    }
    data = resized;
    capacity = capacity * 2;
}

We create the larger array and make data point to it. We also make sure that capacity reflects the new, extended capacity.

Here’s the implemented InsertAt function:

public void InsertAt(T newElement, int index)
{
    ThrowIfIndexOutOfRange(index);
    if (size == capacity)
    {
        Resize();
    }

    for (int i = size; i > index; i--)
    {
        data[i] = data[i - 1];
    }

    data[index] = newElement;
    size++;
}

As stated above: check if we need to resize the array, then shift the necessary range to the right and insert the new element. We also increase the size by 1 of course. Note that the shifting loop starts from the right, i.e. from size and moves each element one slot to the right. In our example we move 77 and then 25. If we started the shifting from the left, i.e. from index then we would overwrite the value in the position i + 1. In our example we would move 25 to the position of 77 and 77 would disappear, that’s clearly not what we want.

The DeleteAt function is also based on shifting values, but we don’t need to create any empty slot for a new element. It’s enough to shift the values by one from right to left. Here’s our starting array:

54341652577

Say we want to delete 34, i.e. the element at position 1. Then we first move 165 to the left, then 25 and 77. If we implement this loop up to this point only then our array will in fact look like this:

54165257777

Why 77 twice? We don’t really move the values of the array, we rather copy them to a different position. In this loop the final operation will be to copy 77 to position 3 from 4, but the value in position 4 is still available, its value is not overwritten by anything. So what can we do?

We can overwrite its value by the default of the T data type: null, 0, false etc. Also, we must make sure that size is decremented by one so that loops traversing the array disregard that default value. So we’ll end up with this array:

5416525770

…since the default value of integers is 0. C# has a neat shorthand to get the default value of a type which we’ll use in the implementation:

public void DeleteAt(int index)
{
    ThrowIfIndexOutOfRange(index);
    for (int i = index; i < size - 1; i++)
    {
        data[i] = data[i + 1];
    }

    data[size - 1] = default(T);
    size--;
}

The Add function adds a new element to the end (tail) of the array, i.e. at the size position. Remember that array positions are 0 based and sizes start with one. This is a lovely source of off-by-one and out-of-range errors that every normal developer has experienced a couple of times. Again, we need to resize the array before trying to add a new element to end of the array:

public void Add(T newElement)
{
    if (size == capacity)
    {
        Resize();
    }

    data[size] = newElement;
    size++;
}

The Contains and Clear functions are straightforward. In the former we iterate through the array and if we find a value that’s equal to the search term then we return true. In the latter we simply recreate the backing array and set the size to 0:

public bool Contains(T value)
{
    for (int i = 0; i < size; i++)
    {
        T currentValue = data[i];
        if (currentValue.Equals(value))
        {
            return true;
        }
    }
    return false;
}

public void Clear()
{
    data = new T[capacity];
    size = 0;
}

That’s it really. We could of course implement a lot of other functions on our custom list, such as AddRange but the ones above are probably the most commonly used besides LINQ functions. Also, if you think of the above assignment in an interview setting you probably won’t have time to implement anything else – unless it’s a day-long hackathon 🙂

Finally let me give you a set of unit tests that test the main functions of our own beautiful dynamic array:

using Microsoft.VisualStudio.TestTools.UnitTesting;
using Algorithms;
using System;
using System.Collections.Generic;
using System.Text;

namespace Algorithms.Tests
{
    [TestClass()]
    public class RollYourOwnListTests
    {
        [TestMethod()]
        public void RollYourOwnListSetupTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>();
            Assert.AreEqual(0, intList.Size);
            Assert.IsTrue(intList.IsEmpty);

            intList = new RollYourOwnList<int>(16);
            Assert.AreEqual(0, intList.Size);
            Assert.IsTrue(intList.IsEmpty);
        }

        [TestMethod()]
        public void RollYourOwnListAddTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            Assert.AreEqual(1, intList.Size);
            Assert.IsFalse(intList.IsEmpty);

            intList.Add(35);
            Assert.AreEqual(2, intList.Size);
            Assert.IsFalse(intList.IsEmpty);

            intList.Add(29);
            Assert.AreEqual(3, intList.Size);

            intList.Add(16);
            Assert.AreEqual(4, intList.Size);

            intList.Add(48);
            Assert.AreEqual(5, intList.Size);

            intList.Add(90);
            Assert.AreEqual(6, intList.Size);

            Assert.AreEqual(13, intList.GetAt(0));
            Assert.AreEqual(35, intList.GetAt(1));
            Assert.AreEqual(29, intList.GetAt(2));
            Assert.AreEqual(16, intList.GetAt(3));
            Assert.AreEqual(48, intList.GetAt(4));
            Assert.AreEqual(90, intList.GetAt(5));

            intList.SetAt(1000, 2);
            Assert.AreEqual(1000, intList.GetAt(2));

            intList.SetAt(1001, 0);
            Assert.AreEqual(1001, intList.GetAt(0));

            intList.SetAt(1002, 5);
            Assert.AreEqual(1002, intList.GetAt(5));           
        }

        [TestMethod()]
        public void RollYourOwnListInsertTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.InsertAt(24, 0);
            intList.InsertAt(48, 1);
            Assert.AreEqual(3, intList.Size);

            intList.InsertAt(87, 2);
            Assert.AreEqual(4, intList.Size);
            Assert.AreEqual(24, intList.GetAt(0));
            Assert.AreEqual(48, intList.GetAt(1));
            Assert.AreEqual(87, intList.GetAt(2));
            Assert.AreEqual(13, intList.GetAt(3));
        }

        [TestMethod()]
        public void RollYourOwnListDeleteTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.DeleteAt(0);

            Assert.AreEqual(0, intList.Size);
            Assert.IsTrue(intList.IsEmpty);

            intList.Add(85);
            intList.Add(43);
            intList.Add(8643);
            intList.Add(90);
            intList.Add(1000);
            intList.Add(1200);
            Assert.AreEqual(6, intList.Size);

            intList.DeleteAt(5);
            Assert.AreEqual(5, intList.Size);
            intList.DeleteAt(0);
            Assert.AreEqual(4, intList.Size);
            intList.DeleteAt(2);
            Assert.AreEqual(3, intList.Size);

            Assert.AreEqual(43, intList.GetAt(0));
            Assert.AreEqual(8643, intList.GetAt(1));
            Assert.AreEqual(1000, intList.GetAt(2));
        }

        [TestMethod()]
        public void RollYourOwnListClearTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(65);
            intList.Add(73);
            intList.Add(5678);
            intList.Add(987);
            intList.Clear();
            Assert.AreEqual(0, intList.Size);
            Assert.IsTrue(intList.IsEmpty);
        }

        [TestMethod()]
        public void RollYourOwnListContainsTests()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(65);
            intList.Add(73);
            intList.Add(5678);
            Assert.IsTrue(intList.Contains(73));
            Assert.IsTrue(intList.Contains(65));
            Assert.IsTrue(intList.Contains(13));
            Assert.IsTrue(intList.Contains(5678));
            Assert.IsFalse(intList.Contains(1001));
        }

        [TestMethod()]
        [ExpectedException(typeof(ArgumentOutOfRangeException))]
        public void ThrowExceptionIfGettingAtInvalidIndex()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(35);
            intList.Add(29);
            intList.GetAt(3);
        }

        [TestMethod()]
        [ExpectedException(typeof(ArgumentOutOfRangeException))]
        public void ThrowExceptionIfSettingAtInvalidIndex()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(35);
            intList.Add(29);
            intList.SetAt(45, 3);
        }

        [TestMethod()]
        [ExpectedException(typeof(ArgumentOutOfRangeException))]
        public void ThrowExceptionIfInsertingAtInvalidIndex()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(35);
            intList.Add(29);
            intList.InsertAt(45, 3);
        }

        [TestMethod()]
        [ExpectedException(typeof(ArgumentOutOfRangeException))]
        public void ThrowExceptionIfDeletinggAtInvalidIndex()
        {
            RollYourOwnList<int> intList = new RollYourOwnList<int>(2);
            intList.Add(13);
            intList.Add(35);
            intList.Add(29);
            intList.DeleteAt(3);
        }
    }
}
Advertisement

About Andras Nemes
I'm a .NET/Java developer living and working in Stockholm, Sweden.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Elliot Balynn's Blog

A directory of wonderful thoughts

Software Engineering

Web development

Disparate Opinions

Various tidbits

chsakell's Blog

WEB APPLICATION DEVELOPMENT TUTORIALS WITH OPEN-SOURCE PROJECTS

Once Upon a Camayoc

Bite-size insight on Cyber Security for the not too technical.

%d bloggers like this: