Algorithm Handbook

Updated on September 12, 2021


Welcome to the algorithm handbook wiki! In this wiki you will find a mini-cheat sheet overview of data structures, and examples of their usages in modern languages. Algorithm problems can be found here.

Time and Space Complexity

  • \(O(n)\)
    • Linear run time (ex: single for loop)
  • \(O(n^2)\)
    • Quadratic run time (ex: nested for loop)
  • \(O(2^n)\)
    • Exponential run time (ex: a power set)
    • The result for \(n\) elements is the result for \(n-1\) elements, multiplied by 2
    • Example:
    • \(\emptyset => \emptyset\) (1)
    • \(a => \emptyset, a\) (2)
    • \(a, b => \emptyset, a, b, ab\) (4)
    • \(a, b, c => \emptyset, a, b, ab, c, ac, bc, abc\) (8)
  • \(O(n!)\)
    • Factorial run time (ex: string permutations)
    • Example:
    • \(a => a\) (1)
    • \(ab => ab, ba\) (2)
    • \(abc => abc, acb, bac, bca, cab, cba\) (6)
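A quick sketch of the two growth rates above (`power_set` is an illustrative helper, not a standard function):

```python
from itertools import permutations

def power_set(items):
    """Build all subsets. Starting from the empty subset, every new
    element doubles the count: 1, 2, 4, 8, ... => O(2^n) subsets."""
    result = [[]]
    for item in items:
        result += [subset + [item] for subset in result]
    return result

print(len(power_set(list("ab"))))      # 4 subsets
print(len(power_set(list("abc"))))     # 8 subsets
print(len(list(permutations("abc"))))  # 6 orderings => O(n!)
```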

Bit Operations

  • Sequence of 1's to the right of position \(i\): \(0000\,0000\) => \(0001\,1111\), \(i = 5\)

    1. Shift 1 to the left by \(i\) times.
    2. Subtract by 1
  • Sequence of 1's at position \(i\) and to its left: \(0000\,0000\) => \(1110\,0000\), \(i = 5\)

    1. Shift 1 to the left by \(i\) times.
    2. Subtract by 1.
    3. Invert.
  • Set bit at position \(i\): \(0100\,0001\) => \(0100\,1001\), \(i=3\)

    1. Shift 1 to the left by \(i\) times. This is the mask.
    2. Use the \(OR\) operator on the original number with the above mask.
  • Clear bit at position \(i\): \(0100\,0001\) => \(0100\,0000\), \(i=0\)

    1. Shift 1 to the left by \(i\) times. This is the mask.
    2. This mask must now be inverted, so that 1's are flipped to 0 and 0's are flipped to 1. Use the ~ operator on the mask.
    3. Use the \(AND\) operator on the original number with the above mask.

\(AND\) applied to a 1 is harmless and retains the value of the old bit. On the other hand, \(AND\) applied to a 0 will clear the targeted bit.

  • Get bit at position \(i\): \(1000\,0011\) => the bit at position \(i\) is 0, where \(i = 3\)

    1. Shift 1 to the left by \(i\) times. This is the mask. Same as the mask for setting!
    2. Apply the \(AND\) operator on the original number and the mask, and compare the result to 0. If the result is 0, the bit at \(i\) is 0; otherwise, the bit at \(i\) is 1.
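The recipes above can be sketched in a few one-liners (function names are illustrative; Python ints are unbounded, so the left-mask example is truncated to 8 bits for display):

```python
def ones_to_right(i):
    # 1's strictly to the right of position i, e.g. i=5 -> 0b00011111
    return (1 << i) - 1

def ones_from_position(i):
    # 1's at position i and to its left; & 0xFF truncates to 8 bits
    return ~((1 << i) - 1) & 0xFF

def set_bit(x, i):
    return x | (1 << i)       # OR with the mask sets the bit

def clear_bit(x, i):
    return x & ~(1 << i)      # AND with the inverted mask clears the bit

def get_bit(x, i):
    # AND with the mask; a non-zero result means the bit is 1
    return 0 if x & (1 << i) == 0 else 1

assert ones_to_right(5) == 0b00011111
assert ones_from_position(5) == 0b11100000
assert set_bit(0b01000001, 3) == 0b01001001
assert clear_bit(0b01000001, 0) == 0b01000000
assert get_bit(0b10000011, 3) == 0
```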

Strings

  • When constructing strings, use a list in Python and add characters accordingly. The entire string can be returned via "".join(list_variable). This is similar to using a StringBuffer in Java.
  • Strings are immutable by nature
  • Palindromes are special strings whose contents are mirrored. Getting optimal solutions with a palindrome involves a clever technique of expanding from the middle. The expansion has to be done twice per position: once for an even-length center, and once for the case of a single character sitting in the middle (hence, a mirror of itself), such as the d in ada.
    • The expanding from the middle technique is done in a single loop. It is still \(O(n^2)\) time complexity, but space complexity can be kept \(O(1)\). This is actually faster than the DP / caching solutions for palindromes!
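A minimal sketch of the expand-from-the-middle technique (\(O(n^2)\) time, \(O(1)\) extra space; the function name is illustrative):

```python
def longest_palindrome(s):
    """Longest palindromic substring via expanding from the middle."""
    best = ""
    for i in range(len(s)):
        # Try an odd-length center (i, i) and an even-length center (i, i+1)
        for left, right in ((i, i), (i, i + 1)):
            while left >= 0 and right < len(s) and s[left] == s[right]:
                left -= 1
                right += 1
            candidate = s[left + 1:right]  # the loop overshoots by one
            if len(candidate) > len(best):
                best = candidate
    return best

print(longest_palindrome("xaday"))  # ada
```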

1D Arrays

  • Swapping elements is always efficient -- \(O(1)\) time complexity when the indices are given.
  • Think twice before calling .remove or .pop operations. Consider swapping elements to remove elements.

2D Arrays

  • Don't ever create a 2D matrix in Python like this: matrix = [[0] * len(col)] * len(row)
    • The outer * makes shallow copies: every row points to the same sub-list as the first one. Thus, if you set matrix[0][1] = 1, then every row's element at index 1 also becomes 1 - matrix[1][1] == 1, matrix[2][1] == 1, .... This will give you a headache later on as you debug why the matrix isn't working as you expect it to.
    • Make 2D matrices the right way: matrix = [[0 for _ in range(len(col))] for _ in range(len(row))]
  • Here is a nice trick for iterating through neighbors of a block (horizontal, vertical, diagonal)
    • Create an array of pairs that represents coordinates: coords = [(0, -1), (-1, 0), (1, 0), (0, 1), (-1, -1), (1, 1), (-1, 1), (1, -1)]
    • Iterate through each coordinate
    • Add the current indices to the coordinates to get the row and column that you're considering
    • Check the validity of the row and column variables:
            for c in coords:
                row, col = i + c[0], j + c[1]

                if 0 <= row < len(board) and 0 <= col < len(board[0]):
                    neighbors += board[row][col]
  • If we need to mutate the original 2D array but still need to know its old state, consider temporarily writing sentinel values that encode state transitions. For example, if a cell went from 0 to 1, we can mark it with 2; if it went from 1 to 0, we can mark it with -1. A final pass converts the sentinels back to their new values.
  • Heaps can be used for min/max-\(k\) problems by recording the row and column of each 2D matrix value as an entry in the heap, allowing you to pull values in sorted order in less than \(O(n^2)\) time.
  • For rotating a 2D array 90° clockwise, there is a 2-step procedure:
    1. Transpose the matrix (turn columns into rows): [[1, 2],[3, 4]] => [[1, 3],[2, 4]]
    2. Reverse each row (mirror): [1, 2, 3] => [3, 2, 1]
# Transpose a 2d matrix
for j in range(0, n):
    for i in range(j, n):
        matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
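Putting both steps together, a sketch of the full clockwise rotation (the function name is illustrative):

```python
def rotate_clockwise(matrix):
    """Rotate an n x n matrix 90 degrees clockwise in place."""
    n = len(matrix)
    # Step 1: transpose (turn columns into rows)
    for j in range(n):
        for i in range(j, n):
            matrix[i][j], matrix[j][i] = matrix[j][i], matrix[i][j]
    # Step 2: reverse each row (mirror)
    for row in matrix:
        row.reverse()
    return matrix

print(rotate_clockwise([[1, 2], [3, 4]]))  # [[3, 1], [4, 2]]
```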

Linked Lists

  • Use a slow and a fast pointer for linked list nodes
    • To get to the half-way point of a linked list efficiently, simply increment the fast pointer by 2 and the slow pointer by 1. When the fast pointer reaches the end of the linked list, the slow pointer will be at the middle
    • To get the \(k\) th item from the end of the linked list, simply increment the fast pointer ahead by \(k\), then increment the slow and fast pointer by 1. When the fast pointer reaches the end of the linked list, the slow pointer will be at the \(k\) th item from the end of the list.
    • Cycle: Detect a cycle by incrementing the fast pointer by 2 and slow pointer by 1. Eventually the two will meet.
    • Cycle: Find where the cycle begins by grabbing the pointer where the fast and slow met each other, and increment it along with a clean pointer that starts at position HEAD. Because there is a cycle present, the pointer starting at HEAD will eventually meet the other pointer.
  • You can interleave a linked list [1, 2, 3, 4, 5] into [1, 5, 2, 4, 3] by reversing the 2nd half of the linked list first, then have one pointer at the beginning of the list and another pointer at the middle of the list.
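A minimal sketch of the slow/fast pointer idea for finding the middle and detecting a cycle (the Node class is an assumption for illustration):

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def middle(head):
    """When the fast pointer reaches the end, the slow pointer is at the middle."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next        # +1
        fast = fast.next.next   # +2
    return slow

def has_cycle(head):
    """Floyd's cycle detection: the two pointers meet iff a cycle exists."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False

# 1 -> 2 -> 3 -> 4 -> 5
head = Node(1, Node(2, Node(3, Node(4, Node(5)))))
print(middle(head).val)  # 3
print(has_cycle(head))   # False
```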

Recursion

  • Solve for base cases first
  • Opt for inefficient solutions first, then optimize
  • Note that recursive functions take up \(O(n)\) space if the call stack grows to \(n\) frames. Iterative versions avoid the call-stack overhead and the risk of stack overflow, but they may be more complicated to implement.

Binary Trees

  • If both in-order traversal data and pre-order traversal data are provided (and node values are distinct), a unique binary tree can be reconstructed. If only one or the other is provided, many different binary trees can match it.
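A sketch of that reconstruction, assuming distinct values (a common interview exercise; not a library function):

```python
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

def build_tree(preorder, inorder):
    """Rebuild the unique binary tree from its pre-order and in-order traversals."""
    if not preorder:
        return None
    root = TreeNode(preorder[0])       # pre-order: the root comes first
    mid = inorder.index(preorder[0])   # in-order: left subtree | root | right subtree
    root.left = build_tree(preorder[1:mid + 1], inorder[:mid])
    root.right = build_tree(preorder[mid + 1:], inorder[mid + 1:])
    return root

root = build_tree([1, 2, 4, 5, 3], [4, 2, 5, 1, 3])
print(root.val, root.left.val, root.right.val)  # 1 2 3
```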

Binary Search Trees

  • Validate a BST
  • In-order traversal (LNR) provides a sorted traversal in ascending order
  • Reverse in-order traversal (RNL) provides a sorted traversal in descending order
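Validating a BST follows directly from the in-order property: an iterative LNR traversal must emit strictly ascending values. A minimal sketch (the TreeNode class is illustrative):

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_valid_bst(root):
    """In-order (LNR) traversal of a valid BST is strictly ascending."""
    stack, node, prev = [], root, None
    while stack or node:
        while node:              # L: push the left spine
            stack.append(node)
            node = node.left
        node = stack.pop()       # N: visit
        if prev is not None and node.val <= prev:
            return False
        prev = node.val
        node = node.right        # R: move right
    return True

print(is_valid_bst(TreeNode(2, TreeNode(1), TreeNode(3))))  # True
print(is_valid_bst(TreeNode(2, TreeNode(3), TreeNode(1))))  # False
```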

Hash Tables

  • There are multiple data structures built on hash tables, and they can behave quite differently. For example, a set and a dictionary (e.g. Java's HashMap) both use hash tables internally, but a set stores distinct keys only, while a dictionary maps values to distinct keys.
  • In general, mutable values should not be used as hash table keys. For that reason, you can't add a list to a set or use a list as a key in a dictionary.
  • A hash function should be \(O(1)\) in time complexity.
  • Be aware of collision resolution methods - separate chaining or open addressing.
  • Be aware of hash table implementations - an array of linked lists (separate chaining), or just an array of values (open addressing).
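A minimal separate-chaining sketch, assuming a fixed bucket count and no resizing (class and method names are illustrative):

```python
class HashTable:
    """Separate chaining: an array of buckets, each a list of (key, value) pairs."""
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # collision or empty bucket: chain it

    def get(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:
            if k == key:
                return v
        return None

table = HashTable()
table.put("a", 1)
table.put("a", 2)
print(table.get("a"))  # 2
print(table.get("b"))  # None
```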

Sorting


  • Selection Sort

    • A simple sort: You select the minimum element, swap it with the current index, and increment the index.
    • Bad performance regardless of input: \(O(n^2)\) best-case/average-case/worst-case performance.
    • Low number of swaps: There are at most \(n-1\) swaps in this sort. This is actually one of its better features, and it is worth considering when moving or copying data is a much more expensive operation than comparing it.
  • Insertion Sort

    • A simple sort: You insert the data at the proper place (like you do unconsciously with a hand of playing cards)
    • Bad worst-case performance: \(O(n^2)\) worst-case performance when the data is completely randomized (and unsorted)
    • Great performance with already sorted data: \(O(n)\) best-case performance when the data is already sorted. \(O(n)\) is very good, especially for such a simple sort! If your data is mostly sorted or presorted, consider this option over classics like Quicksort.
    • Great for very small datasets: The divide-and-conquer algorithms (Quicksort, Mergesort, etc.) have more overhead, so they aren't quite as fast when working with tiny datasets (fewer than 10 elements, for example). Consequently, many sorting libraries implement these algorithms as hybrids: when the number of elements is small enough, they opt for Insertion Sort instead of, say, Quicksort or Mergesort.
  • Bubble Sort

    • \(O(n^2)\) worst-case/average-case performance, \(O(n)\) best-case.
  • Quicksort

    • An attractive divide-and-conquer sorting algorithm.
    • Strong average-case performance: \(O(n\log n)\), especially if data is unknown and a generic efficient sort is needed.
    • Tuneable: A partitioning helper function is used to order the data in the list based on a pivot value. Then, the current list is separated into two sublists with the pivot marking the cutoff point. The influence of the partitioning/pivot selection on the actual sorting operation varies greatly - there is no magic algorithm that works best for all cases.
    • Bad worst-case performance (although rare in practice): \(O(n^2)\) worst-case performance when the chosen pivot is always the smallest or largest element (e.g., always picking the last element of an already-sorted list), dividing the list into two sublists where one is (nearly) empty and the other has \(n-1\) elements.
    • Little additional space required: Quicksort sorts in place, using only the \(O(\log n)\) recursion stack. This makes it much more attractive compared to Mergesort, which needs an auxiliary buffer that takes \(O(n)\) space.
    • General implementations are not a stable sort.
  • Mergesort

    • Also another attractive divide-and-conquer sorting algorithm.
    • Interview Advice 1: Split merge sort into two functions - one for dividing an array into subarrays and another for the merge operation. Keep the function that divides an array into left and right subarrays simple - no need to pass index counters into the function. If an array is given as an argument, you can fetch its size, split down the middle to grab the left and right subarrays, and pass them into recursive calls.
    • Interview Advice 2: Two base cases for the recursive divide step. On an empty array, just return an empty array. On a single-entry array, just return the array itself. We don't need other base cases (like left/right indices) because the function is simple in that it only takes an array as an argument.
    • Interview Advice 3: The merge function involves incrementing counters for array1 and array2, depending on which current value in either array is smaller. Once either array is exhausted, we need to drain the remaining array into the results list and return it.
    • Guaranteed fast worst-case: \(O(n\log n)\) worst-case/average-case/best-case. The worst-case time complexity of Mergesort is one of its stronger features.
    • Space expensive: An auxiliary (or temporary buffer/array) is needed to merge all the sublists together. This means \(O(n)\) space is required, which can be expensive if working with huge datasets. Note however, that for linked lists, this space is not required (do you know why?)
    • Useful for memory intensive situations: The actual merge operation is \(O(n)\), and is very useful. If you have a huge dataset that, as a whole, exceeds your memory capacity, then you can split the dataset into sorted chunks and merge them together. That will take \(O(n)\) time and is a better solution than naively sorting the dataset (which takes \(O(n \log n)\) time with the best sorting algorithm).
  • Heapsort

    • Binary heap based sorting algorithm, typically implemented with arrays and index arithmetic. A statically allocated array can be an issue for gigantic datasets; dynamically growing structures avoid this. Python's heapq implements a binary min-heap on top of a plain list.
    • Great performance: The time complexity is \(O(n\log n)\) best-case/worst-case/average-case. Smoothsort variation can hit \(O(n)\) in the best-case, which is very good.
    • No additional space required: No extra buffers are required, which makes Heapsort very attractive, especially compared to Mergesort, which needs \(O(n)\) space.
    • Fast heap building: Building the heap is surprisingly \(\Theta(n)\). (See here). If we want to convert an unordered binary tree to a heap, the cost is only \(O(n)\).
    • The first/last \(n\) elements...: Excellent choice for choosing the top \(n\) elements or bottom \(n\) elements. For the top elements, a max-heap can be used, and for the bottom elements, a min-heap can be used.
    • Complete binary tree: A binary heap is a complete tree, which means all levels are filled with nodes except possibly the last. This ensures \(O(\log n)\) worst-case time complexity for adding and deleting, which leads to a \(O(n\log n)\) sorting algorithm as a whole.
    • Automatically maintains the heap property: after each insertion or deletion, the heap adjusts itself (by sifting nodes upwards/downwards) so that every parent stays ordered relative to its children. Each sift costs \(O(\log n)\) time thanks to the complete binary tree structure.
    • Since adding and removing nodes each take up to \(O(\log n)\) and there are \(n\) nodes, a full sort is \(O(n\log n)\) in the worst case and average case.
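Following the three pieces of interview advice above, a sketch of Mergesort with a simple array-only divide step and a draining merge:

```python
def merge(left, right):
    """Merge two sorted lists by repeatedly taking the smaller front element."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # drain whichever side still has elements
    result.extend(right[j:])
    return result

def merge_sort(arr):
    # Base cases: empty or single-entry arrays are already sorted
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))

print(merge_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```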

Graphs

  • Undirected (Simple Graph)

    • DFS/BFS allows you to easily find connected components. This is useful for things like photoshop color fills, where each pixel in an image represents a vertex and edges are connected to all adjacent pixels.
    • DFS allows you to easily find cycles
    • BFS allows you to easily find the shortest path. Useful for things like solving a maze.
    • A vertex degree represents the number of edges connected to a vertex.
  • Directed (Digraph)

    • When implementing code, keep track of states. UNVISITED = 0, VISITING = 1, VISITED = 2. Pay attention here; VISITING and VISITED are not the same - VISITED is a node that is completely examined, while VISITING is a node where the neighbors are still being explored.
    • DAG = Directed Acyclic Graph; no cycles!
    • DFS allows you to find cycles and do topological sorts (a linear ordering of the vertices in which every edge points from earlier to later). Topological sorts can only be done on a DAG.
    • Topological sort is basically doing a DFS traversal in post-order, and adding the results to a stack.
    • Strongly connected components in digraphs are like connected components in simple graphs. If there is a directed path from vertex \(A\) to vertex \(B\), and a directed path from \(B\) back to \(A\), then \(A\) and \(B\) are strongly connected. This relation is transitive, reflexive, and symmetric (an equivalence relation), and its equivalence classes are the strongly connected components.
    • DFS also allows you to find strongly connected components with Kosaraju's algorithm. Run DFS on the reversed graph to compute a reverse postorder, then run DFS on the original graph, visiting vertices in that order; each DFS tree of the second pass is one strongly connected component.
    • You can convert a cyclic directed graph to a kernel DAG. A kernel DAG is basically the same graph as the original, except that each strong component is treated as a single vertex and each strong component has a directed edge to another strong component. This means all the cycles in the original graph become merged into a strong component, eliminating cycles in the kernel DAG.
    • If there are \(v\) vertices, then there are \(2^{v^2}\) different digraphs if you exclude parallel edges, since each of the \(v^2\) possible edges is either present or absent. If \(v = 3\), with three vertices named \((A, B, C)\), there are 9 possible edges: 3 self-loops (i.e. \(A => A\)), and 6 more; 2 from \(A => B\) and \(B => A\), 2 from \(B => C\) and \(C => B\), and 2 from \(A => C\) and \(C => A\). That gives \(2^9 = 512\) digraphs.
    • A vertex degree is separated by two categories: in-degree and out-degree.
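The state-tracking and post-order ideas above combine into a topological sort sketch (graph shape and names are illustrative; the graph is an adjacency dict):

```python
UNVISITED, VISITING, VISITED = 0, 1, 2

def topological_sort(graph):
    """Post-order DFS pushed onto a stack; raises if the graph is not a DAG."""
    state = {v: UNVISITED for v in graph}
    stack = []

    def dfs(v):
        state[v] = VISITING             # neighbors still being explored
        for w in graph[v]:
            if state[w] == VISITING:    # back edge => cycle
                raise ValueError("not a DAG")
            if state[w] == UNVISITED:
                dfs(w)
        state[v] = VISITED              # completely examined
        stack.append(v)                 # post-order position

    for v in graph:
        if state[v] == UNVISITED:
            dfs(v)
    return stack[::-1]                  # reverse post-order = topological order

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_sort(graph))
```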

Tries

  • Pronounced "try-es"
  • Faster alternative to string searching, but can be space heavy.
  • Basically a tree of nodes with \(R\) pointers each (an \(R\)-way trie), where \(R\) is the number of distinct characters that can be used (the radix). For example, to represent lowercase alphabetic characters, each node can hold 26 pointers (one per letter a-z).
  • Each node will also have a value flag. Typically, this flag marks the node as a valid word.
  • Example applications: A spell checker, auto-complete system.
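A minimal trie sketch (a dict stands in for the array of \(R\) pointers, and `is_word` plays the role of the value flag; class names are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps character -> child node
        self.is_word = False  # the value flag marking a valid word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True   # mark the final node as a complete word

    def search(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_word

trie = Trie()
trie.insert("car")
print(trie.search("car"))  # True
print(trie.search("ca"))   # False (a prefix, not a marked word)
```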


Python Tricks

  • Use the ~ unary operator to get the bitwise complement of an integer (\(\sim x = -(x+1)\) in Python). This is highly valuable when you want to access an index in an array starting from the back. Instead of accessing an array via A[len(A) - 1 - x], you can instead do A[~x], which is equivalent to A[-(x + 1)].
  • If you want to represent directions in a 2D array, consider defining a list of four tuples: [(0, 1), (1, 0), (-1, 0), (0, -1)] and iterate through it.
for direction in [(0, 1), (1, 0), (0, -1), (-1, 0)]:  
  • Use .sort() to sort a list in place (it returns None), and sorted() to get a new sorted list.
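A quick demo of the ~ indexing trick and the .sort()/sorted() distinction:

```python
A = [10, 20, 30, 40]

# ~x == -(x + 1): index from the back without len() arithmetic
assert A[~0] == 40   # last element
assert A[~1] == 30   # second from the end

# sorted() returns a new list and leaves the original untouched;
# .sort() mutates in place and returns None
B = [3, 1, 2]
assert sorted(B) == [1, 2, 3] and B == [3, 1, 2]
B.sort()
assert B == [1, 2, 3]
```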
