Graph Data Structure


A graph (undirected or directed) is an ordered pair of sets G = (V, E), where set V is a nonempty and finite set of elements known as vertices and E is the set of edges. The following properties define a graph:

  • Each edge connects exactly two vertices;
  • In the case of an undirected graph, the edge connects the vertices both ways.
  • In the case of directed graphs, the edge has a direction and is commonly referred to as an arc.
  • Here is a visual representation of an undirected graph:

Graph Data Structure 1

  • There are 5 vertices (1, 2, 3, 4, 5) and 4 edges ([1 2], [1 3], [1 5], [3 4]).
  • The degree of a vertex is given by the total number of edges incident to that vertex. For example, 1 has degree 3.
  • A path in a graph is a sequence of vertices connected by edges. For example, (2, 1, 3, 4) is a path in the given graph.
  • A cycle is a path that starts and ends in the same vertex. For example, if 1 and 4 were connected in the graph above, the sequence (1, 3, 4, 1) would be a cycle.
  • A complete graph is a graph if any two vertices are directly connected by an edge.
  • A connected graph is a graph in which there is a path between every 2 vertices. The graph drawn above is connected.

Graphs : Graph traversals

Traversing a graph requires systematic examination of the graph’s vertices in order to process the information associated with them. There are two basic methods of traversing a graph, as follows:

  1. Depth – first search (DFS)
  2. Breadth – firstSearch (BFS)

Depth first search is performed as follows:

  1. Browsingbegins with an initial vertex, called the starting vertex.
  2. Then the firstunvisitedneighbor of the starting vertex is visited. Vertex y is considered a neighbor of vertex x if there exists a direct edge (x, y) in the graph.
  3. Next the first unvisited neighbor of the first unvisited neighbor of the starting vertex is visited, and we repeat the process until there are no unvisited neighbors for a vertex. When we reach such a vertex, we go back to the vertex we visited previously and continue looking through the rest of the unvisited neighbors.
  4. The algorithm can be easily implemented with the use of a stack. We will look at an example so you can have a better understanding of the concept.

Suppose we want to perform a DFS traversal on the following graph:

Graph Data Structure 2

We start with an empty stack. If 1 is our starting vertex, then we firstly added it to the stack. Then we add its first unvisited neighbor, which is 2. We repeat the same for 2 and add 5 to the stack. At this point our stack is (1, 2, 5). Since there are no unvisited neighbors of 5 (2 and 1 have already been visited), we perform a pop() operation on the stack and go back to 2 and then to 1. At this point we add the next unvisited neighbor of 1 (3), and finally 4. The DFS traversal is 1 2 5 3 4.

Next, breath first search is done with the following steps:

  1. We choose a starting vertex and visit it first.
  2. We visit all unvisited neighbors of the starting vertex one by one.
  3. The simplest algorithm for BFS uses a queue. Let’s look at an example.

Suppose we want to perform a BFS on the same graph and we choose 1 as a starting node again. We firstly add all unvisited neighbors of 1 to an initially empty queue, which would look like (1, 2, 3, 5). We then perform a dequeue operation, obtaining (2, 3 ,5). We then expand 2, which has no more unvisited neighbor, so we perform another dequeue operation, removing it from the queue. At this point the queue is (3, 5). We finally expand 3, which has 4 as an unvisited neighbor. The final BFS traversal is therefore 1 2 3 5 4.

Graphs : Topological sort

Given a directed graph with no cycles, topological sorting performs a linear arrangement of nodes based on the edges between them. The edge orientation corresponds to a sequence relationship from the source node to the destination. Thus, if (u, v) is one of the edges of the graph, u must appear ahead of v in the listing. Topological sorting can be seen as placing nodes along a horizontal line so that all edges are directed from left to right.


The most common algorithm for topological sort uses DFS. Consider the following directed graph:

Graph Data Structure 3

The steps the algorithm would perform are as follows:

  • Perform a DFS traversal from any vertex (let’s suppose we start with 4).
  • As opposed to a regular traversal, we are now interested in all the nodes we run into during the search, even they have been marked as visited or not. For example, when we start from 4 we would output 4 3 1 2 5 5 5 (the last two 5s were outputted when visiting from 1 and 4, respectively).
  • We consider the output in the step above and we iterate through it from right to left, removing any duplicates. In the example above we would remove the first and second 5.
  • The topological sort in this case is 4 3 1 2 5.

Graphs : Example – task management

Suppose you are assigned a list of tasks, dependent on each other. For example, suppose you have 5 tasks, numbered 1, 2, 3, 4 and 5, respectively, and you know you must start with either task 3 or 5. Furthermore, you know that task 2 cannot be performed until task 3 is finished, while task 4 cannot be performed until task 2 and 5 are finished and task 5 depends on the result of task 1, so it must be performed after it. This is a classic example of a problem that can be solved using topological sort. The steps are as follows:

  1. We build a graph, with the following properties:
    1. Every task is assigned a vertex of the graph;
    2. There is an edge between vertex x and vertex y if x must be completed before y.

The graph for the problem above would look as follows:


Graph Data Structure 4

  1. We apply the topological sort algorithm from any of the starting nodes, which in this case are 3 or 5. Any solution obtained at this step is a valid one. For example, the solution 3 1 2 5 4 is valid.

Finally, it is also worth mentioning that topological sort is also often used in real-world contexts. For example, when installing packages under Linux, there are dependencies between them and we often need to install one package before another. A good order for installing these files is based on finding a topological sort with the algorithm we have just described.

Further Reads :

Previous articleB-TREES Data Structure
Next articleHash Table Data Structure
Since last 15 years in different geographical locations, Sumit prepared hiring format for several hiring managers/teams to hire the balanced talents and interviewed talents on the various stages of their selection process. He also interviewed by hundreds of companies in different geographical locations.His best conclusion for hiring teams and candidate is to prepare in advance. Here ‘advance’ means keep your interview book ready and continue to update it even you are not going to interview candidates or applying for any job in next six months.