Agenda:
A graph $G$ consists of
A graph is way to represent objects and their relations
What are some things you might want to know about a graph? E.g., consider social networks, transport network, etc.
Who is friends with Billie Eilish?
find neighbors of a node:
$ N(v) =\{v_i ~ \mid (v, v_i) \in E ~~ \hbox{or} ~~(v_i, v) \in E\}$
How popular is Billie Eilish?
degree: number of neighbors of a node:
$ |N(v)|$
$ \begin{bmatrix} & A & B & C & D\\ A & 0 & 1 & 0 & 0\\ B & 1 & 0 & 1 & 1\\ C & 0 & 1 & 0 & 1\\ D & 0 & 1 & 1 & 0\\ \end{bmatrix} \begin{bmatrix} & A & B & C & D\\ A & 0 & 1 & 0 & 0\\ B & 0 & 0 & 1 & 0\\ C & 0 & 0 & 0 & 1\\ D & 0 & 1 & 0 & 0\\ \end{bmatrix} $
import numpy as np
from tabulate import tabulate
labels = ['A', 'B', 'C', 'D']
nodes = [0,1,2,3]
edges = [(0,1), (1,2), (2, 3), (3, 1)]
def make_graph(nodes, edges, directed=False):
graph = np.zeros((len(nodes), len(nodes)), dtype=int)
for e in edges:
graph[e[0], e[1]] = 1
if not directed: # add reverse direction
graph[e[1], e[0]] = 1
return graph
graph = make_graph(nodes, edges, directed=False)
print('undirected')
print(tabulate(graph, labels))
undirected A B C D --- --- --- --- 0 1 0 0 1 0 1 1 0 1 0 1 0 1 1 0
digraph = make_graph(nodes, edges, directed=True)
print('\ndirected')
print(tabulate(digraph, labels))
directed A B C D --- --- --- --- 0 1 0 0 0 0 1 0 0 0 0 1 0 1 0 0
def neighbors_adjacency(graph, node, labels):
result = []
i = labels.index(node)
print(graph[i])
for index, val in enumerate(graph[i]):
if val == 1:
result.append(labels[index])
return result
neighbors_adjacency(graph, 'B', labels)
[1 0 1 1]
['A', 'C', 'D']
Runtime to access neighbors of a node?
$O(|V|)$
If there are $|V|$ nodes, what is the maximum number of edges?
....but, if a graph is dense, then there's not really any value in representing the data as a graph.
Luckily, most real-world graphs are extremely sparse.
Source: Collective dynamics of 'small-world' networks. Duncan J. Watts & Steven H. Strogatz
We could simply store the graph as a set of edges.
edges = set([('A', 'B'),
('B', 'C'),
('C', 'D'),
('D', 'B')])
edges
{('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'B')}
What's the space requirement?
$O(|E|)$
How can we access the neighbors of a node?
def neighbors_set(edges, node):
# assuming an undirected graph
result = []
for e in edges:
if e[0] == node:
result.append(e[1])
elif e[1] == node:
result.append(e[0])
return result
neighbors_set(edges, 'B')
['A', 'D', 'C']
What is the work/span of accessing neighbors using an edge set?
Work: $O(|E|)$
Span: $O(\lg |E|)$ (using filter)
Can we do better?
We can use a hashmap (dict
) to store the neighbors of each node.
graph = {
'A': {'B'},
'B': {'A', 'C', 'D'},
'C': {'B', 'D'},
'D': {'B', 'C'}
}
graph
{'A': {'B'}, 'B': {'A', 'C', 'D'}, 'C': {'B', 'D'}, 'D': {'B', 'C'}}
What is work of accessing neighbors?
def neighbors_map(graph, node):
return graph[node]
neighbors_map(graph, 'B')
{'A', 'B', 'D'}
Constant time to lookup the neighbors of a node.
One of the fundamental operations over graphs
Can be used to solve a number of problem:
Consider the task of crawling every web page reachable from a starting page $s$.
How would you do this?
At any point in the search, a vertex can be in one of three sets:
We can then describe a generic search algorithm as follows:
while vertices remain to be visited:
- visit some unvisited nodes in the frontier
- update the three sets
We'll look at two common search algorithms, breadth-first search and depth-first search.
Just to get an idea of the problem, we'll start with an inefficient way of searching called vertex hopping
until all nodes visited:
pick a node
visit all its neighbors
def hop(graph):
for v in graph:
print('searching with %s' % v)
for neighbor in graph[v]:
print('...found %s' % neighbor)
hop(graph)
searching with A ...found B searching with B ...found A ...found C ...found D searching with C ...found B ...found D searching with D ...found B ...found C
This of course ignores the idea of paths in a graph.
However, we could still build on this to solve the reachability problem. How? We'll address this problem in recitation.