The Erdos Renyi networks consist of nodes that have a probability $p$ to be connected to other nodes. This is otherwise known as a random network, and its statistical properties and characteristics are well studied. In this entry, we will be looking at the following:
Table of Contents
- Creating an Adjacency Matrix
- Average Degree
- Average Clustering Coefficient
- Average Shortest Path Length
- References
Creating an Adjacency Matrix
Before the days of networkx
, networks were defined using adjacency matrices.
The values of the adjacency matrix $A$ are given by
\begin{equation}
A_{ij} = 1
\end{equation}
when node $i$ is connected to node $j$.
In undirected networks, $A_{ij}=A_{ji}$, while this is not necessarily the case
for directed networks.
In [1]:
Average Degree
The degree of a node is defined to be the number of edges that node is a part of. For an Erdos-Renyi network, we expect the average degree to be \begin{equation}\langle k \rangle = Np, \end{equation} for large networks $N$.
In [2]:
<matplotlib.text.Text at 0x9cce7f0>
In [3]:
In [4]:
$\langle k \rangle = 0.300024590071 $ , p= 1.69578686191e-114
Average Clustering Coefficient
The clustering coefficient $C_i$ of a node $i$ is defined by the equation \begin{equation} C_i = \frac{2t_i}{k_i(k_i-1)}, \end{equation} where $t_i$ is the number of triangles that contain the node, and $k_i$ is the degree of the node.
For an Erdos-Renyi network, the average clustering coefficient for large $N$ reduces to \begin{equation}\langle C \rangle = p. \end{equation}
Counting the number of triangles is not a trivial task, therefore, we will
cheat and use networkx
to calculate the clustering coefficient of our
networks.
In [5]:
In [6]:
$p = 0.299735616273 $ , p= 0.743566602627
Note: obtaining the clustering coefficient can be really slow, with the naive implementation having computational complexity $\mathcal{O}(n^3)$ . This is a resource intensive process and for the sake of brevity, I calculated the clustering coefficient for a smaller network. I obtained a result of $\langle C \rangle = 0.3003$ which is close to $p=0.3$.
Average Shortest Path Length
The shortest path length between two nodes $i$ and $j$ is the path with the least number of edges connecting the two nodes. For an Erdos-Renyi network, the average shortest path length $\langle l \rangle$ is given by
\begin{equation}\langle l \rangle = \frac{\ln{N}}{\ln{\langle k \rangle}}. \end{equation}
In [7]:
$<l> = 1.00407716536 $ , p= 5.62192309984e-18
It’s clear form the figure that there is an exponential relationship between
$\langle k \rangle$ and $N$. The slope of the log-log plot obtained corresponds
to $\langle l \rangle =1.004$. To verify if the slope of the logarithmic plots
is indeed $\langle l \rangle$, we can use the built-in
nx.average_shortest_path_length
.
In [8]:
In [9]:
<matplotlib.text.Text at 0xd05ca90>
The obtained result isn’t quite what we are expecting. However, it should be
noted that $N=1000$ used is quite small, and the analytical trend was derived
for a large random network. It is quite possible that we are unable to replicate
the results since the networks sizes used are too small, due to limitations in
the memory inefficient implementation of networkx
.
References
- http://dx.doi.org/10.1103/PhysRevE.75.027105
- http://people.seas.harvard.edu/~babis/tsourICDM08.pdf