Union Find — Data Structure in Python

image

A Union-Find/Disjoint set data structure is used to maintain a set of elements partitioned into a number of mutually non-overlapping subsets.

Disjoint Set —A disjoint set is a group of sets where no item can be in more than one set. In other words, if we take any two sets from the Disjoint set then their intersection will result in an empty set. In even simpler terms a disjoint set is a data structure where similar elements are kept in a single group.

image

There are two main operations DSU performs on sets:

Find — Tells us an element/node belongs to which subset.

Union — Join two subsets, if they share a common attribute

In this approach, we represent the nodes using an array, and the value of the item at that index tells which set that node belongs to. Depending upon the strategy used the value may represent its parent or the reference to the representative of the set.

image

Quick Find

class QuickFind():
    def __init__(self,N):
        self._parents = list(range(0, N))def find(self, p):
        return(self._parents[p])

    def union(self, p, q):
        root_p, root_q = self._parents[p], self._parents[q]
        for i in range(0, len(self._parents)):
            if(self._parents[i] == root_p):
                self._parents[i] = root_q def connected(self,p,q):
        return self._parents[p] == self._parents[q]

Quick Union

class QuickUnion():
    def __init__(self,N):
        self._parents = list(range(0,N))

    def find(self, p):
        while(p != self._parents[p]):
            p = self._parents[p]
        return p

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        self._parents[i] = root_q def connected(self, p, q):
        return self.find(p)==self.find(q)

Optimized Disjoint Set

We can significantly improve the implementation of the disjoint set by using two techniques:

Union By Rank — Union by rank always attaches the shorter tree to the root of the taller tree.

Path Compression — By making every other node in the path point to its grandparent's node, we can constantly flatten the tree by altering the parent node to the root node and halving the path length.

image

Weighted Quick Union with optimization

class DisJointSets():
    def __init__(self,N):
        # Initially, all elements are single element subsets
        self._parents = [node for node in range(N)]
        self._ranks = [1 for _ in range(N)]

    def find(self, u):
        while u != self._parents[u]:
            # path compression technique
            self._parents[u] = self._parents[self._parents[u]]
            u = self._parents[u]
        return u

    def connected(self, u, v):
        return self.find(u) == self.find(v)

    def union(self, u, v):
        # Union by rank optimization
        root_u, root_v = self.find(u), self.find(v)
        if root_u == root_v:
            return True
        if self._ranks[root_u] > self._ranks[root_v]:
            self._parents[root_v] = root_u
        elif self._ranks[root_v] > self._ranks[root_u]:
            self._parents[root_u] = root_v
        else:
            self._parents[root_u] = root_v
            self._ranks[root_v] += 1
        return False

Time Complexity

Find — log(N)
Union — log(N)

Problem 1

N students applied for admission to ABC Academy. An array of integers B is given representing the strengths of N people i.e. B[i] represents the strength of the ith student. Among the N students, some of them knew each other. A matrix C of size M x 2 is given which represents relations where ith relations depict that C[i][0] and C[i][1] knew each other. All students who know each other are placed in one batch. The strength of a batch is equal to the sum of the strength of all the students in it. ABC sets criteria for selection: All those batches having a strength of at least D are selected. Find the number of batches selected.

Example:

N = 7
B = [1, 6, 7, 2, 9, 4, 5]
C = [[1, 2], [2, 3], [5, 6], [5, 7]]
D = 12Ans: 2
Initial Batches :
Batch 1 = {1, 2, 3} Batch Strength = 1 + 6 + 7 = 14
Batch 2 = {4} Batch Strength = 2
Batch 3 = {5, 6, 7} Batch Strength = 9 + 4 + 5 = 18
Selected Batches are Batch 1 and Batch 2.

We have to create subsets where only students who are friends are in a single subset. Disjoint Sets will be the right choice here.

Code Implementation

def solve(N, B, C, D):
    parent = list(range(N))
    def find(x):
        if x != parent[x]:
            parent[x] = find(parent[x])
        return parent[x]

    def union(x, y):
        parent_x = find(x)
        parent_y = find(y)
        if parent_x != parent_y:
            parent[parent_y] = parent_x

    for x, y in C:
        union(x-1, y-1)

    from collections import defaultdict
    dict_pair = defaultdict(list)
    for idx, val in enumerate(parent):
        dict_pair[find(val)].append(B[idx])

    res = 0
    for v in dict_pair.values():
        if sum(v) >= D:
            res += 1
    return res

Happy Coding!

Enjoyed this article?

Share it with your network to help others discover it

Continue Learning

Discover more articles on similar topics