Tag: Algorithms/General

  • Array vs. Linked List

    Array vs. Linked List

    Arrays and linked lists are data structures that sequentially arrange elements of a particular type.

    However, there are mayor differences, and depending on the requirements, the choice of data structure significantly impacts the memory requirements and performance of the application.

    This article answers the following questions:

    • What are the differences between array and linked list?
    • What are the advantages and disadvantages of one data structure over the other?
    • What is the time complexity of the different operations (such as accessing an element, inserting, removing, and determining the size)?
    • When should you use which data structure?

    Let’s start with a comparison of both data structures…

    Difference between Array and Linked List

    The following image shows the basic layout of both data structures. I’ve included the linked list as both a singly and doubly linked list:

    Array – singly linked list – doubly linked list
    Array – singly linked list – doubly linked list

    An array is a contiguous block of memory that directly contains the data elements¹.

    A linked list consists of list nodes, each containing a data element¹ and a reference to the next node (and – in the case of a doubly linked list – to the previous node).

    The following sections compare the consequences of the layout of the two data structures in terms of the time required to insert and remove elements, the memory required, and the principle of locality (I’ll explain what this means in the corresponding section).

    ¹ A data element can be a primitive element, such as an int, double, or char – or a reference to an object.

    Array vs. Linked List: Time Complexity

    Let’s start with the cost of the various operations.

    (You can find an introduction to time complexity in the article “Big O Notation and Time Complexity – Easily Explained“.)

    Accessing a Specific Element (“Random Access”)

    With an array, we can address each element directly. In terms of effort, it makes no difference how long the array is or at which position we read or write an element.

    In the array example, it makes no difference whether we access the “a” or the “p”:

    Accessing a specific element in an array ("random access")
    Accessing a specific element in an array (“random access”)

    The time required is therefore constant. Thus, the time complexity for accessing (writing or reading) a particular element of an array is: O(1)

    In a linked list, in contrast, we can only access the first element directly. For all others, we have to follow the list node by node until we reach the desired element.

    In the linked list example, we need more steps to reach the “p” than to get to the “a”:

    Accessing a specific element of a linked list ("random access")
    Accessing a specific element of a linked list (“random access”)

    With randomly distributed access to the elements, the average cost is proportional to the length of the list. The time complexity is, therefore: O(n)

    Adding or Removing an Element

    In a linked list, we can insert and remove nodes at any position. The cost is always the same, regardless of how long the list is and at which location we insert (provided we have a reference to the node where we want to insert/remove).

    Inserting an element into a linked list: O(1)
    Inserting an element into a linked list: O(1)

    Thus, the time complexity for inserting into and removing from a linked list is: O(1)

    An array cannot change its size. To insert or remove an element, we always have to copy the array into a new, larger or smaller array:

    Inserting an element into an array: O(n)
    Inserting an element into an array: O(n)

    The time required is proportional to the array length. The time complexity is, therefore: O(n)

    Data structures such as Java’s ArrayList have a strategy for reducing the average time complexity of inserting and removing elements: By reserving space in the array for new elements, both when creating and when expanding, they can reduce the time complexity – at least for insertion and removal at the end of an array-based data structure – to O(1).

    With a circular array, we can also reduce the time complexity for insertion and removal at the beginning of an array-based data structure to O(1). That is how the Java ArrayDeque is implemented, for example.

    Determining Size

    The size of an array is known and can be queried, for example, in Java via array.length. The effort for this is independent of the length of the array, so it is constant.

    Thus, the time complexity for determining the length of an array is: O(1)

    In the case of a linked list, we have to run through the entire list and count the list nodes. The longer the list, the longer the counting takes.

    Thus, the time complexity for determining the length of a linked list is: O(n)

    Some data structures based on linked lists (e.g., the Java LinkedList) additionally store the size in a field, which they update on insertion and removal. Therefore, we can query the size of such data structures in constant time, i.e., O(1).

    Time Complexity Overview

    The following table summarizes the time complexities of the various operations:

    OperationArrayLinked List
    Accessing the nth element:O(1)O(n)
    Inserting an element:O(n)O(1)
    Removing an element:O(n)O(1)
    Determining the size:O(1)O(n)

    Thus, accessing an element (reading or writing) and determining length is cheaper with an array – inserting and removing, on the other hand, with a linked list.

    Array vs. Linked List: Memory Consumption

    In an array, each field requires as much memory as the data type it contains. For example, an array of int primitives requires 4 bytes per entry:

    Memory consumption of an int array: 4 bytes per entry
    Memory consumption of an int array: 4 bytes per entry

    In a linked list, we must store both the data element and references to each node’s successor (and possibly predecessor) nodes.

    If we stay with the int primitives and assume 4 bytes¹ per reference, we reach 8 bytes per element for a singly linked list.

    In JVM languages, however, 12 bytes are added per node for the header of the node object – plus 4 fill bytes since objects must occupy a multiple of 8 bytes of memory.¹ Thus, we need a total of 24 bytes per list node.

    Memory consumption of a single linked list in Java: 24 bytes per node
    Memory consumption of a single linked list in Java: 24 bytes per node

    We need one more reference for a doubly linked list, so we end up with 12 bytes per entry.

    For JVM-based languages, we add the 12 bytes for the object header. However, the total remains at 24 bytes, since the additional four bytes take up the space previously occupied by the fill bytes.

    Memory consumption of a doubly linked list in Java: 24 bytes per node
    Memory consumption of a doubly linked list in Java: 24 bytes per node

    The following table shows the memory requirements per field for an array and a linked list – each for C/C++ and JVM-based languages:

    LanguageArraySingly linked listDoubly linked list
    C/C++:4 bytes8 bytes12 bytes
    JVM language:4 bytes24 bytes¹24 bytes¹

    Up to this point, the memory consumption speaks for the array – especially in Java.

    (¹ For the memory considerations, I’m assuming a 64-bit JVM with Compressed Class Pointers and Compressed Oops nabled.)

    Memory Efficiency

    However, the comparison is that clear only if we know the size of the data structure in advance and it does not change.

    Array-based data structures whose size can change, e.g., the Java ArrayList, usually reserve additional array fields for new elements (as mentioned above). With a linked list, however, memory is allocated for each element separately only when an element is inserted.

    Array vs. linked list: memory efficiency
    Array vs. linked list: memory efficiency

    The same applies to removing elements. In an array-based data structure, the removed field is usually left free for future insert operations. For a linked list, it gets immediately deleted (or released for deletion by the garbage collector).

    Linked lists are thus more memory efficient than arrays.

    In summary: for the same length, a linked list requires at least twice as much memory as an array – and even six times as much in Java! However, with varying lengths, an array-based data structure can block unused memory, so you must weigh these two factors against each other.

    Array vs. Linked List: Locality

    To answer the question “Which is faster – an array or a linked list?” we need to consider one more factor: the principle of locality.

    Since the memory for an array is allocated in one piece, its elements are located at consecutive memory addresses. When accessing main memory, all array elements on the same memory page are loaded into the CPU cache simultaneously. Thus, once we have accessed one array element, we can access the neighboring elements very quickly.

    The nodes of a linked list, in contrast, are allocated at arbitrary locations in memory, i.e., they can be distributed over the entire memory. When traversing a linked list, a new memory page would have to be loaded for each element in the worst case.

    Advantages of Linked List over Array

    In this and the next section, I’ll summarize the advantages and disadvantages of arrays and linked lists.

    Why is a linked list better than an array?

    • Elements can be inserted and removed with constant time.
    • A linked list does not occupy any unused memory.

    Advantages of Array over Linked List

    And when is an array better than a linked list?

    • We can access any array element (“random access”) in constant time.
    • We can traverse an array from back to front – this is not possible with a singly linked list, only with a doubly linked one.
    • When containing the same number of elements, an array occupies significantly less memory than a linked list (C/C++: factor 2–3; Java: factor 6).
    • Due to the principle of locality, we can access elements close to each other much faster in an array.
    • The garbage collector can perform a reachability analysis much quicker on an array than on a linked list.
    • Deleting an array frees a contiguous memory area, while deleting a linked list leaves fragmented memory.

    Conclusion: When to Use an Array and When to Use a Linked List?

    The question “Which data structure is better – array or linked list?” can, like so many things, only be answered with an “It depends”.

    If elements are often inserted or removed in the middle of the data structure, then a linked list should be the better choice.

    For all other use cases, array-based data structures generally deliver better performance and a better memory footprint and should therefore be preferred.

    If you suspect that a linked list is better suited for your purpose, just try it out. Take measurements and make a decision based on the results.

    If you still have questions, please ask them via the comment function. Do you want to be informed about new tutorials and articles? Then click here to sign up for the HappyCoders.eu newsletter.

  • Binary Search (+ Java Code Examples)

    Binary Search (+ Java Code Examples)

    We developers are often faced with determining the position of a particular element in a sorted array (or in a list). The most straightforward approach would be to traverse the array from left to right, matching each element with the element we are looking for. This is called a “linear search”.

    “Binary search” is much faster. In this article, you will learn:

    • How does binary search work?
    • How to implement binary search in Java (recursive and iterative)?
    • Which binary search functions does the JDK provide?
    • How fast is binary search compared to linear search?
    • When does it make sense to run a binary search in a LinkedList?

    You can find the source code for the article in this GitHub repository.

    Binary Search – an Example

    In the past, if we wanted to translate an unknown word, we didn’t have an app for that. We had to look it up in a dictionary. In theory, we could search every page from the top left to the bottom right for the specific word, from front to back.

    If we were lucky, we would find the word on the first pages of the book. If we’re unlucky, we won’t find it until near the end of the book – or not at all (we wouldn’t find that out until the very last page). Even with words that are relatively far in front (such as “binary search”), we would have to search for quite a while this way.

    This approach is called “linear search”. The following image shows a simplified example with numbers instead of words. We want to find the position of the number 61 in the array shown.

    Linear search in an integer array
    Linear search in an integer array

    In this simplified example, we need six steps to find the 61.

    Of course, no one would look in a dictionary in this way. Instead, we open the book in the middle and see whether the word comes alphabetically before or after it. We thus know in which half of the book the word is located and can ignore the other half. After that, we search the middle again and narrow the search area to half once more (i.e., a quarter in total). With each additional search step, we halve the number of remaining pages. This way, we get to the target page – and on the target page to the word we are looking for – in relatively few steps.

    We call this a “binary search”. The following image clearly shows that the search leads to the result much faster than the linear search:

    Binary search in an integer array
    Binary search in an integer array

    With binary search, we only need three steps:

    1. In the first step, we compare the searched value 61 with the middle element 36. 61 is larger, so it must be to the right of 36.
    2. In the second step, we compare 61 with the middle element of the right subarray, 79. The value we are looking for is smaller, so it must be to the left of 79.
    3. There is only one element between 36 and 79. We have to compare this element with the searched element again. In this example, we have found the searched element 61. However, there could have been another number between 36 and 79. This would have meant that the array does not contain 61 at all.

    Of course, binary search only makes sense if the words in the dictionary are sorted (like the numbers in the example). If the words were printed in random order, we would have no choice but to search word by word – that is, linearly.

    Binary Search – Pseudocode

    In the following pseudocode, we denote the element we are looking for by “key”.

    1. Determine the middle position of the array range to be searched.
    2. Read the element at the middle position.
    3. Compare the key with the middle element:
      • If the key is equal to the middle element, then we have reached our goal. Return the middle position as result.
      • If the key is smaller than the middle element, perform a binary search in the subarray to the left of the middle position. However, if this subarray has a length of 0, the search ends without a result.
      • If the key is greater than the middle element, perform a binary search in the subarray to the right of the middle position. However, if this subarray has a length of 0, the search ends without a result.

    Implementing Binary Search in Java

    We can implement binary search recursively or iteratively.

    Recursive Binary Search

    The pseudocode for binary search from the previous chapter suggests a recursive implementation.

    The recursive implementation in Java for an array of int primitives looks like this:

    public static int binarySearchRecursively(int[] array, int key) {
      return binarySearchRecursively(array, 0, array.length, key);
    }
    
    public static int binarySearchRecursively(
        int[] array, int fromIndex, int toIndex, int key) {
      if (toIndex <= fromIndex) return -1;
    
      int mid = (fromIndex + toIndex) >>> 1;
      int midVal = array[mid];
    
      if (key == midVal) {
        return mid;
      } else if (key < midVal) {
        return binarySearchRecursively(array, fromIndex, mid, key);
      } else {
        return binarySearchRecursively(array, mid + 1, toIndex, key);
      }
    }Code language: Java (java)

    You can find the code in the GitHub repository in the class BinarySearch starting at line 12. Corresponding unit tests are provided in BinarySearchTest.

    It is important to calculate the middle position mid with an “unsigned right shift”:

    int mid = (fromIndex + toIndex) >>> 1

    And not as follows:

    int mid = (fromIndex + toIndex) / 2

    In case the sum is greater than Integer.MAX_VALUE, the second variant would lead to an overflow or a “roll over”, and the result would be a negative number.

    Without the >>> operator, the following method would also be correct:

    int mid = fromIndex + (toIndex - fromIndex) / 2;

    But that is nowhere near as cool ;-)

    Iterative Binary Search

    Recursion requires additional CPU cycles and additional memory on the heap. Therefore, iterative implementations are usually preferable.

    The corresponding iterative Java implementation for an int array looks like this:

    public static int binarySearchIteratively(int[] array, int key) {
      return binarySearchIteratively(array, 0, array.length, key);
    }
    
    public static int binarySearchIteratively(
        int[] array, int fromIndex, int toIndex, int key) {
      int low = fromIndex;
      int high = toIndex;
    
      while (low < high) {
        int mid = (low + high) >>> 1;
        int midVal = array[mid];
    
        if (key == midVal) {
          return mid;
        } else if (key < midVal) {
          high = mid;
        } else {
          low = mid + 1;
        }
      }
    
      return -1;
    }Code language: Java (java)

    The variables low and high are not absolutely necessary here. You could also change fromIndex and toIndex within the while loop. However, reassigning method parameters is usually considered unclean design.

    You can also find this code in the BinarySearch class starting at line 52 and the unit tests in BinarySearchTest starting at line 64.

    Binary Search in the JDK

    Of course, we do not have to implement binary search in arrays ourselves. The JDK provides appropriate methods for arrays of all primitive data types and for object arrays in the java.util.Arrays class. It also provides a method for binary search in lists in the java.util.Collections class.

    Arrays.binarySearch()

    For example, in an int array we can search as follows:

    int[] array = new int[] {10, 19, 23, 25, 36, 61, 79, 81, 99};
    int posOf36 = Arrays.binarySearch(array, 36);Code language: Java (java)

    Collections.binarySearch()

    In a corresponding ArrayList of Integer objects we can search as follows:

    List<Integer> list = new ArrayList<>(List.of(10, 19, 23, 25, 36, 61, 79, 81, 99));
    int posOf36 = Collections.binarySearch(list, 36);Code language: Java (java)

    Note: The Collections.binarySearch() method can be invoked for any class that implements the List interface. Thus, for example, also for LinkedList.

    In a linked list, however, a specific element cannot be accessed directly, but only by iteration. That brings us (almost) back to linear search. More about this – and why binary search on a LinkedList can still be useful – you’ll find out in the next chapter.

    Time Complexity of Binary Search

    In binary search, we halve the number of entries left to search with each search step. Or the other way around: if the number of entries doubles, we only need one more search step.

    This corresponds to logarithmic effort, i.e., O(log n).

    You can learn more about big O notation here: Big O Notation and Time Complexity – Easily Explained

    Binary Search Runtime

    We can verify the theoretically derived time complexity with the program BinarySearchRuntime from the GitHub repository. The program generates random arrays with 10,000 to 200,000,000 elements and searches them for a randomly selected element.

    Since the times are in the nanosecond range, each measurement consists of searches for 100 different keys. The measurement is repeated 100 times for each array size; then, the median is printed. The following graph shows the average runtime in relation to the array size:

    Runtime of binary search in relation to array size
    Runtime of binary search in relation to array size

    The logarithmic progression can be seen very well.

    Binary Search vs. Linear Search

    With linear search, the best case is finding the element we are looking for in the first step. In the worst case, we have to search the entire array. In the average case, half of the entries. With n entries, that is n/2 search steps. The duration of the search increases linearly with the number of entries. We say:

    The time complexity of the linear search is O(n).

    We can measure the runtime of linear search with the LinearSearchRuntime program. The following image shows the comparison of the runtimes of binary and linear search. I had to reduce the range to 100,000 elements to be able to recognize at least a minimal increase of the measured values for the binary search:

    Comparing the runtimes of binary and linear search
    Comparing the runtimes of binary and linear search

    We can see the linear time of the linear search very nicely. It is also apparent that the binary search is orders of magnitude faster than the linear search.

    Runtime of Binary Search for Small Arrays

    Due to the higher complexity of the binary search code, linear search can be faster for small arrays. The following diagram shows a section of the comparison of run times for up to 500 elements. Each measurement point is the median of 100 measurements with 10,000 repetitions each.

    Binary and linear search for small arrays
    Binary and linear search for small arrays

    That confirms the assumption. For arrays up to a maximum of about 230 elements, linear search is faster than binary search. Of course, this is not a general statement but applies only to my laptop and the JDK I currently use.

    You can once again nicely see the linear time – O(n) – compared to the logarithmic time – O(log n).

    Runtime of Binary Search in a LinkedList

    In the chapter Binary Search in the JDK, I mentioned that the Collections.binarySearch() method can also be applied to a LinkedList. Collections.binarySearch() distinguishes internally between lists that implement the RandomAccess interface, such as ArrayList, and other lists. For lists with “random access”, a regular binary search is performed.

    To access the middle element in lists without random access, we would have to follow the elements from the beginning to the middle, element by element. From there, we would again reach the center of the left or right half by following the list, element by element. The following diagram should illustrate this:

    Binary search in a doubly linked list
    Binary search in a doubly linked list

    For example, to find the position of 19, we would first have to follow the orange arrows to the center, then the blue arrows back to 23, and finally the yellow arrow to 19.

    That works only with a doubly linked list. For iterating left in a singly linked list, you would have to jump back to the beginning and, from there, follow the arrows to the right again.

    No matter if singly or doubly linked – in any case, we have to iterate over more elements than with linear search. While we have an average of n/2 search steps in the linear search in total, we already iterate over n/2 elements to reach the middle in the first step of the binary search. In the second step, we iterate over n/4 elements; in the third step, we iterate over n/8 elements, and so on.

    So at first glance, binary search makes absolutely no sense in a LinkedList.

    When Is Binary Search in a LinkedList Useful?

    Nevertheless, binary search in a LinkedList can be faster than linear search. Although we have to iterate over more elements (as shown in the previous section) – the number of comparisons remains in the order of O(log n)!

    Depending on the cost of the comparison function – which can be significantly higher for an object than for a primitive data type – this can make a considerable difference. So if you ever need to search in a LinkedList, it’s worth trying binary search with Collections.binarySearch() and comparing it to linear search.

    Summary

    This article has shown the principle of binary search and its advantages over linear search for sorted arrays and lists. I demonstrated the theoretically derived time complexity on an example. I also showed that binary search could be useful for a doubly linked list.

    A very similar technique is the search in a binary search tree.

  • Big O Notation and Time Complexity – Easily Explained

    Big O Notation and Time Complexity – Easily Explained

    The big O notation¹ is used to describe the complexity of algorithms.

    On Google and YouTube, you can find numerous articles and videos explaining the big O notation. But to understand most of them (like this Wikipedia article), you should have studied mathematics as a preparation. ;-)

    That’ s why, in this article, I will explain the big O notation (and the time and space complexity described with it) only using examples and diagrams – and entirely without mathematical formulas, proofs and symbols like θ, Ω, ω, ∈, ∀, ∃ and ε.

    You can find all source codes from this article in this GitHub repository.

    ¹ also known as “Bachmann-Landau notation” or “asymptotic notation”

    Types of Complexity

    Computational Time Complexity

    Computational time complexity describes the change in the runtime of an algorithm, depending on the change in the input data’s size.

    In other words: “How much does an algorithm degrade when the amount of input data increases?”

    Examples:

    • How much longer does it take to find an element within an unsorted array when the size of the array doubles? (Answer: twice as long)
    • How much longer does it take to find an element within a sorted array when the size of the array doubles? (Answer: one more step)

    Space Complexity

    Space complexity describes how much additional memory an algorithm needs depending on the size of the input data.

    This does not refer to the memory required for the input data itself (i.e., that twice as much space is naturally needed for an input array twice as large), but the additional memory needed by the algorithm for loop and helper variables, temporary data structures, and the call stack (e.g., due to recursion).

    Complexity Classes

    We divide algorithms into so-called complexity classes. A complexity class is identified by the Landau symbol O (“big O”).

    In the following section, I will explain the most common complexity classes, starting with the easy-to-understand classes and moving on to the more complex ones. Accordingly, the classes are not sorted by complexity.

    O(1) – Constant Time

    Pronounced: “Order 1”, “O of 1”, “big O of 1”

    The runtime is constant, i.e., independent of the number of input elements n.

    In the following graph, the horizontal axis represents the number of input elements n (or more generally: the size of the input problem), and the vertical axis represents the time required.

    Since complexity classes can only be used to classify algorithms, but not to calculate their exact running time, the axes are not labeled.

    Complexity class O(1) – constant time

    O(1) Examples

    The following two problems are examples of constant time:

    • Accessing a specific element of an array of size n: No matter how large the array is, accessing it via array[index] always takes the same time².
    • Inserting an element at the beginning of a linked list: This always requires setting one or two (for a doubly linked list) pointers (or references), regardless of the list’s size. (In an array, on the other hand, this would require moving all values one field to the right, which takes longer with a larger array than with a smaller one).

    ² This statement is not one hundred percent correct. Effects from CPU caches also come into play here: If the data block containing the element to be read is already (or still) in the CPU cache (which is more likely the smaller the array is), then access is faster than if it first has to be read from RAM.

    O(1) Example Source Code

    The following source code (class ConstantTimeSimpleDemo in the GitHub repository) shows a simple example to measure the time required to insert an element at the beginning of a linked list:

    public static void main(String[] args) {
      for (int n = 32; n <= 8_388_608; n *= 2) {
        LinkedList<Integer> list = createLinkedListOfSize(n);
    
        long time = System.nanoTime();
        list.add(0, 1);
        time = System.nanoTime() - time;
    
        System.out.printf("n = %d -> time = %d ns%n", n, time);
      }
    }
    
    private static LinkedList<Integer> createLinkedListOfSize(int n) {
      LinkedList<Integer> list = new LinkedList<>();
      for (int i = 0; i < n; i++) {
        list.add(i);
      }
      return list;
    }Code language: Java (java)

    On my system, the times are between 1,200 ns and 19,000 ns, unevenly distributed over the various measurements. This is sufficient for a quick test. But we don’t get particularly good measurement results here, as both the HotSpot compiler and the garbage collector can kick in at any time.

    The test program TimeComplexityDemo with the ConstantTime class provides better measurement results. The test program first runs several warmup rounds to allow the HotSpot compiler to optimize the code. Only after that are measurements performed five times, and the median of the measured values is displayed.

    Here is an extract of the results:

    --- ConstantTime (results 5 of 5) ---
    ConstantTime, n =        32 -> fastest: 31,700 ns, median: 44,900 ns
    ConstantTime, n =    16,384 -> fastest: 14,400 ns, median: 40,200 ns
    ConstantTime, n = 8,388,608 -> fastest: 34,000 ns, median: 51,100 nsCode language: plaintext (plaintext)

    The effort remains about the same, regardless of the size of the list. The complete test results can be found in the file test-results.txt.

    O(n) – Linear Time

    Pronounced: “Order n”, “O of n”, “big O of n”

    The time grows linearly with the number of input elements n: If n doubles, then the time approximately doubles, too.

    “Approximately” because the effort may also include components with lower complexity classes. These become insignificant if n is sufficiently large so they are omitted in the notation.

    In the following diagram, I have demonstrated this by starting the graph slightly above zero (meaning that the effort also contains a constant component):

    Complexity class O(n) – linear time

    O(n) Examples

    The following problems are examples for linear time:

    • Finding a specific element in an array: All elements of the array have to be examined – if there are twice as many elements, it takes twice as long.
    • Summing up all elements of an array: Again, all elements must be looked at once – if the array is twice as large, it takes twice as long.

    It is essential to understand that the complexity class makes no statement about the absolute time required, but only about the change in the time required depending on the change in the input size. The two examples above would take much longer with a linked list than with an array – but that is irrelevant for the complexity class.

    O(n) Example Source Code

    The following source code (class LinearTimeSimpleDemo) measures the time for summing up all elements of an array:

    public static void main(String[] args) {
      for (int n = 32; n <= 536_870_912; n *= 2) {
        int[] array = createArrayOfSize(n);
    
        long sum = 0;
    
        long time = System.nanoTime();
        for (int i = 0; i < n; i++) {
          sum += array[i];
        }
        time = System.nanoTime() - time;
    
        System.out.printf("n = %d -> time = %d ns%n", n, time);
      }
    }
    
    private static int[] createArrayOfSize(int n) {
      int[] array = new int[n];
      for (int i = 0; i < n; i++) {
        array[i] = i;
      }
      return array;
    }
    Code language: Java (java)

    On my system, the time degrades approximately linearly from 1,100 ns to 155,911,900 ns. Better measurement results are again provided by the test program TimeComplexityDemo and the LinearTime algorithm class. Here is an extract of the results:

    --- LinearTime (results 5 of 5) ---
    LinearTime, n =         512 -> fastest:         300 ns, median:         300 ns
    LinearTime, n =     524,288 -> fastest:     159,300 ns, median:     189,400 ns
    LinearTime, n = 536,870,912 -> fastest: 164,322,600 ns, median: 168,681,700 nsCode language: plaintext (plaintext)

    You can find the complete test results again in test-results.txt.

    What is the Difference Between “Linear” and “Proportional”?

    A function is linear if it can be represented by a straight line, e.g. f(x) = 5x + 3.

    Proportional is a particular case of linear, where the line passes through the point (0,0) of the coordinate system, for example, f(x) = 3x.

    As there may be a constant component in O(n), it’s time is linear.

    O(n²) – Quadratic Time

    Pronounced: “Order n squared”, “O of n squared”, “big O of n squared”

    The time grows linearly to the square of the number of input elements: If the number of input elements n doubles, then the time roughly quadruples. (And if the number of elements increases tenfold, the effort increases by a factor of one hundred!)

    Compleity class O(n²) – quadratic time

    O(n²) Examples

    Examples of quadratic time are simple sorting algorithms like Insertion Sort, Selection Sort, and Bubble Sort.

    O(n²) Example Source Code

    The following example (QuadraticTimeSimpleDemo) shows how the time for sorting an array using Insertion Sort changes depending on the size of the array:

    public static void main(String[] args) {
      for (int n = 32; n <= 262_144; n *= 2) {
        int[] array = createRandomArrayOfSize(n);
    
        long time = System.nanoTime();
        insertionSort(array);
        time = System.nanoTime() - time;
    
        System.out.printf("n = %d -> time = %d ns%n", n, time);
      }
    }
    
    private static int[] createRandomArrayOfSize(int n) {
      ThreadLocalRandom random = ThreadLocalRandom.current();
      int[] array = new int[n];
      for (int i = 0; i < n; i++) {
        array[i] = random.nextInt();
      }
      return array;
    }
    
    private static void insertionSort(int[] elements) {
      for (int i = 1; i < elements.length; i++) {
        int elementToSort = elements[i];
        int j = i;
        while (j > 0 && elementToSort < elements[j - 1]) {
          elements[j] = elements[j - 1];
          j--;
        }
        elements[j] = elementToSort;
      }
    }
    Code language: Java (java)

    We can obtain better results with the test program TimeComplexityDemo and the QuadraticTime class. Here is an excerpt of the results, where you can see the approximate quadrupling of the effort each time the problem size doubles:

    QuadraticTime, n =   8,192 -> fastest:     4,648,400 ns, median:     4,720,200 ns
    QuadraticTime, n =  16,384 -> fastest:    19,189,100 ns, median:    19,440,400 ns
    QuadraticTime, n =  32,768 -> fastest:    78,416,700 ns, median:    79,896,000 ns
    QuadraticTime, n =  65,536 -> fastest:   319,905,300 ns, median:   330,530,600 ns
    QuadraticTime, n = 131,072 -> fastest: 1,310,702,600 ns, median: 1,323,919,500 nsCode language: plaintext (plaintext)

    You can find the complete test results in test-results.txt.

    O(n) vs. O(n²)

    At this point, I would like to point out again that the effort can contain components of lower complexity classes and constant factors. Both are irrelevant for the big O notation since they are no longer of importance if n is sufficiently large.

    It is therefore possible that, for example, O(n²) is faster than O(n) – at least up to a certain size of n.

    The following diagram compares three fictitious algorithms: one with complexity class O(n²) and two with O(n), one of which is faster than the other. It is good to see how up to n = 4, the orange O(n²) algorithm takes less time than the yellow O(n) algorithm. And even up to n = 8, less time than the cyan O(n) algorithm.

    Above a sufficiently large n (that is n = 9), O(n²) is and remains the slowest algorithm.

    Big O notation - comparison of the complexity classes O(n) and O(n²)

    Let’s move on to two, not-so-intuitive complexity classes.

    O(log n) – Logarithmic Time

    Pronounced: “Order log n”, “O of log n”, “big O of log n”

    The effort increases approximately by a constant amount when the number of input elements doubles.

    For example, if the time increases by one second when the number of input elements increases from 1,000 to 2,000, it only increases by another second when the effort increases to 4,000. And again by one more second when the effort grows to 8,000.

    Complexity class O(n²) – logarithmic time

    O(log n) Example

    An example of logarithmic growth is the binary search for a specific element in a sorted array of size n.

    Since we halve the area to be searched with each search step, we can, in turn, search an array twice as large with only one more search step.

    (The older ones among us may remember searching the telephone book or an encyclopedia.)

    O(log n) Example Source Code

    The following example (LogarithmicTimeSimpleDemo) measures how the time for binary search changes in relation to the array size.

    public static void main(String[] args) {
      for (int n = 32; n <= 536_870_912; n *= 2) {
        int[] array = createArrayOfSize(n);
    
        long time = System.nanoTime();
        Arrays.binarySearch(array, 0);
        time = System.nanoTime() - time;
    
        System.out.printf("n = %d -> time = %d ns%n", n, time);
      }
    }
    
    private static int[] createArrayOfSize(int n) {
      int[] array = new int[n];
      for (int i = 0; i < n; i++) {
        array[i] = i;
      }
      return array;
    }Code language: Java (java)

    We get better measurement results with the test program TimeComplexityDemo and the class LogarithmicTime. Here are the results:

    LogarithmicTime, n =          32 -> fastest:  77,800 ns, median: 107,200 ns
    LogarithmicTime, n =       2,048 -> fastest: 173,500 ns, median: 257,400 ns
    LogarithmicTime, n =     131,072 -> fastest: 363,400 ns, median: 413,100 ns
    LogarithmicTime, n =   8,388,608 -> fastest: 661,100 ns, median: 670,800 ns
    LogarithmicTime, n = 536,870,912 -> fastest: 770,500 ns, median: 875,700 nsCode language: plaintext (plaintext)

    In each step, the problem size n increases by factor 64. The time does not always increase by exactly the same value, but it does so sufficiently precisely to demonstrate that logarithmic time is significantly cheaper than linear time (for which the time required would also increase by factor 64 each step).

    As before, you can find the complete test results in the file test-results.txt.

    O(n log n) – Quasilinear Time

    Pronounced: “Order n log n”, “O of n log n”, “big O of n log n”

    The effort grows slightly faster than linear because the linear component is multiplied by a logarithmic one. For clarification, you can also insert a multiplication sign: O(n × log n).

    This is best illustrated by the following graph. We see a curve whose gradient is visibly growing at the beginning, but soon approaches a straight line as n increases:

    Complexity class O(n log n) – quasilinear time

    O(n log n) Example

    Efficient sorting algorithms like Quicksort, Merge Sort, and Heapsort are examples for quasilinear time.

    O(n log n) Example Source Code

    The following sample code (class QuasiLinearTimeSimpleDemo) shows how the time for sorting an array with Quicksort³ grows in relation to the array size:

    public static void main(String[] args) {
      for (int n = 32; n <= 536_870_912; n *= 2) {
        int[] array = createArrayOfSize(n);
    
        long time = System.nanoTime();
        Arrays.binarySearch(array, 0);
        time = System.nanoTime() - time;
    
        System.out.printf("n = %d -> time = %d ns%n", n, time);
      }
    }
    
    private static int[] createArrayOfSize(int n) {
      int[] array = new int[n];
      for (int i = 0; i < n; i++) {
        array[i] = i;
      }
      return array;
    }Code language: Java (java)

    The test program TimeComplexityDemo with the class QuasiLinearTime delivers more precise results. Here is an extract:

    QuasiLinearTime, n =        256 -> fastest:        12,200 ns, med.:        12,500 ns
    QuasiLinearTime, n =      4,096 -> fastest:       228,600 ns, med.:       234,200 ns
    QuasiLinearTime, n =     65,536 -> fastest:     4,606,500 ns, med.:     4,679,800 ns
    QuasiLinearTime, n =  1,048,576 -> fastest:    93,933,500 ns, med.:    95,216,300 ns
    QuasiLinearTime, n = 16,777,216 -> fastest: 1,714,541,900 ns, med.: 1,755,715,000 nsCode language: plaintext (plaintext)

    The problem size increases each time by factor 16, and the time required by factor 18.5 to 20.3. You can find the complete test result, as always, in test-results.txt.

    ³ More precisely: Dual-Pivot Quicksort, which switches to Insertion Sort for arrays with less than 44 elements. For this reason, this test starts at 64 elements, not at 32 like the others.

    Big O Notation Order

    Here are, once again, the complexity classes, sorted in ascending order of complexity:

    • O(1) – constant time
    • O(log n) – logarithmic time
    • O(n) – linear time
    • O(n log n) – quasilinear time
    • O(n²) – quadratic time

    And here the comparison graphically:

    Big O notation – Comparison of complexity classes O(1), O(log n), O(n), O(n log n), O(n²)

    I intentionally shifted the curves along the time axis so that the worst complexity class O(n²) is fastest for low values of n, and the best complexity class O(1) is slowest. To then show how, for sufficiently high values of n, the efforts shift as expected.

    Other Complexity Classes

    Further complexity classes are, for example:

    • O(nm) – polynomial time
    • O(2n) – exponential time
    • O(n!) – factorial time

    However, these are so bad that we should avoid algorithms with these complexities, if possible.

    I have included these classes in the following diagram (O(nm) with m=3):

    Big O notation – Comparison of complexity classes O(1), O(log n), O(n), O(n log n), O(n²), O(n³), O(2ⁿ), O(n!)

    I had to compress the y-axis by factor 10 compared to the previous diagram to display the three new curves.

    Summary

    Time complexity describes how the runtime of an algorithm changes depending on the amount of input data. The most common complexity classes are (in ascending order of complexity): O(1), O(log n), O(n), O(n log n), O(n²).

    Algorithms with constant, logarithmic, linear, and quasilinear time usually lead to an end in a reasonable time for input sizes up to several billion elements. Algorithms with quadratic time can quickly reach theoretical execution times of several years for the same problem sizes⁴. You should, therefore, avoid them as far as possible.

    ⁴ Quicksort, for example, sorts a billion items in 90 seconds on my laptop; Insertion Sort, on the other hand, needs 85 seconds for a million items; that would be 85 million seconds for a billion items – or in other words: two years and eight months!