1 of 65

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links.
The interconnected objects are represented by points termed as vertices, and the links that connect the vertices are called edges.

2 of 65

Graph ADT

We are going to see how to create a graph and add various data elements to it using a python program.
Following are the basic operations we perform on graphs.
Display graph vertices
Display graph edges
Add a vertex
Add an edge
Creating a graph

3 of 65

A graph can be easily presented using the python dictionary data types.
We represent the vertices as the keys of the dictionary and the connection between the vertices also called edges as the values in the dictionary.
V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

4 of 65

We can present this graph in a python program
# Create the dictionary with graph elements
graph = {
"a" : ["b","c"],
"b" : ["a", "d"],
"c" : ["a", "d"],
"d" : ["e"],
"e" : ["d"]
}
# Print the graph
print(graph)

5 of 65

Output
{'c': ['a', 'd'], 'a': ['b', 'c'], 'e': ['d'], 'd': ['e'], 'b': ['a', 'd']}

6 of 65

Display graph vertices

To display the graph vertices we simple find the keys of the graph dictionary. We use the keys() method.
class graph:
def __init__(self,gdict=None):
if gdict is None:
gdict = []
self.gdict = gdict
# Get the keys of the dictionary
def getVertices(self):
return list(self.gdict.keys())

7 of 65

Display graph vertices

# Create the dictionary with graph elements
graph_elements = {
"a" : ["b","c"],
"b" : ["a", "d"],
"c" : ["a", "d"],
"d" : ["e"],
"e" : ["d"]
}
g = graph(graph_elements)
print(g.getVertices())

8 of 65

Output
['d', 'b', 'e', 'c', 'a']

9 of 65

Display graph edges

Finding the graph edges is little tricker than the vertices as we have to find each of the pairs of vertices which have an edge in between them.
So we create an empty list of edges then iterate through the edge values associated with each of the vertices.
A list is formed containing the distinct group of edges found from the vertices.

10 of 65

class graph:
def __init__(self,gdict=None):
if gdict is None:
gdict = {}
self.gdict = gdict

11 of 65

def edges(self):
return self.findedges()
# Find the distinct list of edges
def findedges(self):
edgename = []
for vrtx in self.gdict:
for nxtvrtx in self.gdict[vrtx]:
if {nxtvrtx, vrtx} not in edgename:
edgename.append({vrtx, nxtvrtx})
return edgename

12 of 65

# Create the dictionary with graph elements
graph_elements = {
"a" : ["b","c"],
"b" : ["a", "d"],
"c" : ["a", "d"],
"d" : ["e"],
"e" : ["d"]
}
g = graph(graph_elements)
print(g.edges())

13 of 65

Output
[{'b', 'a'}, {'b', 'd'}, {'e', 'd'}, {'a', 'c'}, {'c', 'd'}]

14 of 65

Adding a vertex

Adding a vertex is straight forward where we add another additional key to the graph dictionary.
class graph:
def __init__(self,gdict=None):
if gdict is None:
gdict = {}
self.gdict = gdict
def getVertices(self):
return list(self.gdict.keys())
# Add the vertex as a key
def addVertex(self, vrtx):
if vrtx not in self.gdict:
self.gdict[vrtx] = []

15 of 65

Adding a vertex

# Create the dictionary with graph elements
graph_elements = {
"a" : ["b","c"],
"b" : ["a", "d"],
"c" : ["a", "d"],
"d" : ["e"],
"e" : ["d"]
}
g = graph(graph_elements)
g.addVertex("f")
print(g.getVertices())

16 of 65

Output
['a', 'b', 'c', 'd', 'e','f']

17 of 65

Graph Traversals

Graphs are very useful data structures in solving many important mathematical challenges.
For example computer network topology or analysing molecular structures of chemical compounds.
They are also used in city traffic or route planning and even in human languages and their grammar.
All these applications have a common challenge of traversing the graph using their edges and ensuring that all nodes of the graphs are visited.
There are two common established methods to do this traversal which is described below.

18 of 65

Depth First Traversal

Also called depth first search (DFS),this algorithm traverses a graph in a depth ward motion and uses a stack to remember to get the next vertex to start a search, when a dead end occurs in any iteration.
We implement DFS for a graph in python using the set data types as they provide the required functionalities to keep track of visited and unvisited nodes.

19 of 65

class graph:
def __init__(self,gdict=None):
if gdict is None:
gdict = {}
self.gdict = gdict

20 of 65

# Check for the visisted and unvisited nodes
def dfs(graph, start, visited = None):
if visited is None:
visited = set()
visited.add(start)
print(start)
for next in graph[start] - visited:
dfs(graph, next, visited)
return visited

21 of 65

gdict = {
"a" : set(["b","c"]),
"b" : set(["a", "d"]),
"c" : set(["a", "d"]),
"d" : set(["e"]),
"e" : set(["a"])
}
dfs(gdict, 'a')

22 of 65

Output
a
b
d
e
c

23 of 65

Breadth First Traversal

Also called breadth first search (BFS),this algorithm traverses a graph breadth ward motion and uses a queue to remember to get the next vertex to start a search, when a dead end occurs in any iteration.

24 of 65

Breadth First Traversal

When we keep visiting the adjacent unvisited nodes and keep adding it to the queue.
Then we start dequeue only the node which is left with no unvisited nodes.
We stop the program when there is no next adjacent node to be visited.

25 of 65

import collections
class graph:
def __init__(self,gdict=None):
if gdict is None:
gdict = {}
self.gdict = gdict

26 of 65

def bfs(graph, startnode):
# Track the visited and unvisited nodes using queue
seen, queue = set([startnode]), collections.deque([startnode])
while queue:
vertex = queue.popleft()
marked(vertex)
for node in graph[vertex]:
if node not in seen:
seen.add(node)
queue.append(node)

27 of 65

def marked(n):
print(n)
# The graph dictionary
gdict = {
"a" : set(["b","c"]),
"b" : set(["a", "d"]),
"c" : set(["a", "d"]),
"d" : set(["e"]),
"e" : set(["a"])
}
bfs(gdict, "a")

28 of 65

Output
a
c
b
d
e

29 of 65

Topological ordering

Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge uv, vertex u comes before v in the ordering.
Topological Sorting for a graph is not possible if the graph is not a DAG.�

30 of 65

For example, a topological sorting of the following graph is “5 4 2 3 1 0”.
There can be more than one topological sorting for a graph.
For example, another topological sorting of the following graph is “4 5 2 3 1 0”.
The first vertex in topological sorting is always a vertex with in-degree as 0 (a vertex with no in-coming edges).

32 of 65

Topological sorting can be implemented recursively and non-recursively.
First, we show the clearer recursive version, then provide the non-recursive version with analysis.

33 of 65

from collections import defaultdict
#Class to represent a graph
class Graph:
def __init__(self,vertices):
self.graph = defaultdict(list) #dictionary containing adjacency List
self.V = vertices #No. of vertices
# function to add an edge to graph
def addEdge(self,u,v):
self.graph[u].append(v)

34 of 65

# A recursive function used by topologicalSort
def topologicalSortUtil(self,v,visited,stack):
# Mark the current node as visited.
visited[v] = True
# Recur for all the vertices adjacent to this vertex
for i in self.graph[v]:
if visited[i] == False:
self.topologicalSortUtil(i,visited,stack)
# Push current vertex to stack which stores result
stack.insert(0,v)

35 of 65

def topologicalSort(self):
# Mark all the vertices as not visited
visited = [False]*self.V
stack =[]
# Call the recursive helper function to store Topological
# Sort starting from all vertices one by one
for i in range(self.V):
if visited[i] == False:
self.topologicalSortUtil(i,visited,stack)
# Print contents of stack
print (stack)

36 of 65

g= Graph(6)
g.addEdge(5, 2);
g.addEdge(5, 0);
g.addEdge(4, 0);
g.addEdge(4, 1);
g.addEdge(2, 3);
g.addEdge(3, 1);
print ("Following is a Topological Sort of the given graph")
g.topologicalSort()

37 of 65

Following is a Topological Sort of the given graph
[5, 4, 2, 3, 1, 0]
Complexity Analysis:
Time Complexity: O(V + E): The above algorithm is simply DFS with a working stack and a result stack. Unlike the recursive solution, recursion depth is not an issue here.
Auxiliary space: O(V): The extra space is needed for the 2 stacks used.

38 of 65

Single-source shortest-paths

Dijkstra’s Algorithm solves the single-source shortest-paths problem.
For a given vertex called the source in a weighted connected graph, find shortest paths to all its other vertices.
The single-source shortest-paths problem asks for a family of paths, each leading from the source to a different vertex in the graph, though some paths may, of course, have edges in common.
The most widely used applications are transportation planning and packet routing in communication networks including the Internet.
It also includes finding shortest paths in social networks, speech recognition, document formatting, robotics, compilers, and airline crew scheduling.
In the world of entertainment, one can mention pathfinding in video games and finding best solutions to puzzles using their state-space graphs.
Dijkstra’s algorithm is the best-known algorithm for the single-source shortest-paths problem.

41 of 65

The shortest paths (identified by following nonnumeric labels backward from a destination vertex in the left column to the source) and their lengths (given by numeric labels of the tree vertices) are as follows:
From a to b : a − b of length 3
From a to d : a − b − d of length 5
From a to c : a − b − c of length 7
From a to e : a − b − d − e of length 9

42 of 65

WARSHALL’S ALGORITHM (All-Pairs Path Existence Problem)

A directed graph (or digraph) is a graph, or set of vertices connected by edges, where the edges have a direction associated with them.
An Adjacency matrix A = {a_ij} of a directed graph is the boolean matrix that has 1 in its ith row and jth column if and only if there is a directed edge from the ith vertex to the jth vertex.
The transitive closure of a directed graph with n vertices can be defined as the n x n boolean matrix T = {t_ij }, in which the element in the ith row and the jth column is 1 if there exists a nontrivial path (i.e., directed path of a positive length) from the ith vertex to the jth vertex; otherwise, tij is 0.

45 of 65

1’s reflect the existence of paths with no intermediate vertices (R(0) is just the adjacency matrix); boxed row and column are used for getting R(1).
1’s reflect the existence of paths with intermediate vertices numbered not higher than 1, i.e., just vertex a (note a new path from d to b); boxed row and column are used for getting R(2).
1’s reflect the existence of paths with intermediate vertices numbered not higher than 2, i.e., a and b (note two new paths); boxed row and column are used for getting R(3).
1’s reflect the existence of paths with intermediate vertices numbered not higher than 3, i.e., a, b, and c (no new paths); boxed row and column are used for getting R(4).
1’s reflect the existence of paths with intermediate vertices numbered not higher than 4, i.e., a, b, c, and d (note five new paths).

46 of 65

FLOYD’S ALGORITHM (All-Pairs Shortest-Paths Problem)

Floyd’sl algorithm is an algorithm for finding shortest paths for all pairs in a weighted

connected graph (undirected or directed) with (+/-) edge weights.

A distance matrix is a matrix (two-dimensional array) containing the distances, taken pairwise, between the vertices of graph.
The lengths of shortest paths in an n × n matrix D called the distance matrix: the element d_ij in the ith row and the jth column of this matrix indicates the length of the shortest path from the ith vertex to the jth vertex.
We can generate the distance matrix with an algorithm that is very similar to Warshall’s algorithm is called Floyd’s algorithm.
Floyd’s algorithm computes the distance matrix of a weighted graph with n vertices through a series of n × n matrices:
D⁽⁰⁾, . . . , D^(k−1), D^(k), . . . , D⁽ⁿ⁾
The element d_ij^(k) in the ith row and the jth column of matrix D^(k) (i, j = 1, 2, . . . , n, k = 0, 1,. . . , n) is equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.

48 of 65

Lengths of the shortest paths with no intermediate vertices (D(0) is simply the weight matrix).
Lengths of the shortest paths with intermediate vertices numbered not higher than 1, i.e., just a (note two new shortest paths from b to c and from d to c ).
Lengths of the shortest paths with intermediate vertices numbered not higher than 2, i.e., a and b (note a new shortest path from c to a).
Lengths of the shortest paths with intermediate vertices numbered not higher than 3, i.e., a, b, and c (note four new shortest paths from a to b, from a to d, from b to d, and from d to b).
Lengths of the shortest paths with intermediate vertices numbered not higher than 4, i.e., a, b, c, and d (note a new shortest path from c to a).

49 of 65

Prim’s Algorithm

A spanning tree of an undirected connected graph is its connected acyclic subgraph (i.e., a tree) that contains all the vertices of the graph.
If such a graph has weights assigned to its edges, a minimum spanning tree is its spanning tree of the smallest weight, where the weight of a tree is defined as the sum of the weights on all its edges.
The minimum spanning tree problem is the problem of finding a minimum spanning tree for a given weighted connected graph.

50 of 65

Graph and its spanning trees, with T1 being the minimum spanning tree.

51 of 65

The minimum spanning tree is illustrated in Figure .
If we were to try constructing a minimum spanning tree by exhaustive search, we would face two serious obstacles.
First, the number of spanning trees grows exponentially with the graph size (at least for dense graphs).
Second, generating all spanning trees for a given graph is not easy; in fact, it is more difficult than finding a minimum spanning tree for a weighted graph.

52 of 65

Prim’s algorithm constructs a minimum spanning tree through a sequence of expanding subtrees.
The initial subtree in such a sequence consists of a single vertex selected arbitrarily from the set V of the graph’s vertices.
On each iteration, the algorithm expands the current tree in the greedy manner by simply attaching to it the nearest vertex not in that tree.
The algorithm stops after all the graph’s vertices have been included in the tree being constructed.

53 of 65

ALGORITHM Prim(G)
//Prim’s algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = {V, E}
//Output: E_T, the set of edges composing a minimum spanning tree of G V_T←{v₀} //the set of tree vertices can be initialized with any vertex E_T←Φ
for i ←1 to |V| − 1 do

find a minimum-weight edge e∗ = (v∗, u∗) among all the edges (v, u) such that v is in V_T and u is in V − V_T

V_T←V_T𝖴 {u^*}

E_T←E_T𝖴 {e^*}

return E_T

54 of 65

If a graph is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time of the algorithm is O(|E| log |V |) in a connected graph, where |V| − 1≤|E|.

55 of 65

Application of Prim’s algorithm

56 of 65

The parenthesized labels of a vertex in the middle column indicate the nearest tree vertex and edge weight.
Selected vertices and edges are in bold.

57 of 65

KRUSKAL'S ALGORITHM

Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected graph G= {V, E} as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the smallest.
The algorithm constructs a minimum spanning tree as an expanding sequence of subgraphs that are always acyclic but are not necessarily connected on the intermediate stages of the algorithm.

58 of 65

The algorithm begins by sorting the graph’s edges in nondecreasing order of their weights.
Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to the current subgraph if such an inclusion does not create a cycle and simply skipping the edge otherwise.

59 of 65

Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected graph G = (V, E) as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the smallest.

60 of 65

ALGORITHM Kruskal(G)
//Kruskal’s algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = ( V, E )
//Output: E_T, the set of edges composing a minimum spanning tree of G
sort E in nondecreasing order of the edge weights w(e_i₁) ≤ . . . ≤ w(e_i _|_E_|) E_T ← Φ; ecounter ← 0 //initialize the set of tree edges and its size
K ← 0 //initialize the number of processed edges
while ecounter < |V| − 1 do

k ← k + 1

if E_T 𝖴 {e_ik} is acyclic

E_T ← E_T 𝖴 {e_ik}; ecounter ← ecounter + 1

return E_T

61 of 65

The initial forest consists of |V | trivial trees, each comprising a single vertex of the graph.
The final forest consists of a single tree, which is a minimum spanning tree of the graph.
On each iteration, the algorithm takes the next edge (u, v) from the sorted list of the graph’s edges, finds the trees containing the vertices u and v, and, if these trees are not the same, unites them in a larger tree by adding the edge (u, v).

62 of 65

Fortunately, there are efficient algorithms for doing so, including the crucial check for whether two vertices belong to the same tree.
They are called union-find algorithms. With an efficient union-find algorithm, the running time of Kruskal’s algorithm will be O(|E| log |E|).