_{1}

^{*}

We study how the graph structure of the Internet at the Autonomous Systems (AS) level evolved during a decade. For each year of the period 2008-2017 we consider a snapshot of the AS graph and examine how many features related to structure, connectivity and centrality changed over time. The analysis of these metrics provides topological and data traffic information and allows to clarify some assumptions about the models concerning the evolution of the Internet graph structure. We find that the size of the Internet roughly doubled. The overall trend of the average connectivity is an increase over time, while that of the shortest path length is a decrease over time. The internal core of the Internet is composed of a small fraction of big AS and is more stable and connected the external cores. A hierarchical organization emerges where a small fraction of big hubs are connected to many regions with high internal cohesiveness, poorly connected among them and containing AS with low and medium numbers of links. Centrality measurements indicate that the average number of shortest paths crossing an AS or containing a link between two of them decreased over time.

The Internet is a highly engineered communication infrastructure continuously growing over time. It consists of Autonomous Systems (ASes) each of which can be considered a network, with its own routing policy, administrated by a single authority. ASes peer with each other to exchange traffic and use the Border Gateway Protocol (BGP) [

The structure of the Internet has been studied by many authors and the literature on the subject is vast. One of the most used methods is the statistical analysis of different metrics characterizing the AS graph [

The ASes graphs have been constructed from the publicly available IPv4 Routed /24 AS Links Dataset provided by CAIDA [

In this section, we introduce the metrics chosen for this analysis whose summary scheme is shown in

The degree distribution P ( k ) is the probability that a random chosen node

Year | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
---|---|---|---|---|---|---|---|---|---|---|

# Nodes | 28,838 | 31,892 | 35,149 | 38,550 | 41,527 | 47,407 | 47,581 | 50,856 | 51,736 | 52,361 |

# Edges | 135,723 | 152,447 | 184,071 | 213,870 | 281,596 | 282,939 | 298,355 | 347,518 | 379,652 | 414,501 |

Metric | Relevance | Importance |
---|---|---|

Degree distribution k-core decomposition | Structure | Scale-free, global properties. Nested hierarchical structure of tightly interlinked subgraphs. |

Clustering coefficient Shortest path length | Connectivity | Neighborhood connectivity. Hierarchical structure. Reachability (minimum number of hops between two ASes). |

Closeness centrality Node betweenness centrality Edge betweenness centrality | Centrality | Indicates the proximity of an AS to all others. Related to node traffic load. Related to link traffic load. |

has degree k. If a graph has N k nodes with degree k then P ( k ) = N k / N . Since P ( k ) is a probability distribution it satisfies the normalization condition ∑ k min k max P ( k ) = 1 where k min and k max are the minimum and maximum degree, respectively. From P ( k ) we can calculate the average degree k ^ = ∑ k min k max k P ( k ) . For a random network P ( k ) follows a binomial distribution and in the limit of sparse network δ ≪ 1 , where δ is the link density, it is well approximated by a Poissonian. The Internet, as many other real networks, can be considered sparse and, moreover, it is scale-free which means that it contains both small and very high degree nodes and this feature cannot be reproduced by a Poissonian. Many studies agree that the degree distribution follows a power law P ( k ) ~ k − α though deviations have been observed [

A k-core of a graph is obtained by removing all nodes with degree less than k. Therefore, the k-core is the maximal subgraph in which all nodes have at least degree k. The 0-core is the full graph and coincides with the 1-core if there are no isolated nodes, as in the case of the Internet. The k-core decomposition is a way of peeling the graph by progressively removing the outermost low degree layers up to the innermost high degree core which we call nucleus. We denote by k max c o r e the coreness of the nucleus, and by Ɲ_{n} (Ɲ_{k}) and E n ( E k ) the number of ASes and edges in the nucleus (in the k-core). In the case of the Internet the analysis of the k-core decomposition over time is useful for understanding whether its nucleus, composed of high degree ASes, evolves differently from its periphery.

The local clustering coefficient C i of a node i of degree k is the ratio of the actual number of edges E i connecting its neighbors to the maximum possible number of edges that could connect them. For an undirected graph C i = 2 E i / k ( k − 1 ) . By averaging over all nodes we obtain the global clustering coefficient C = ∑ i C i / N . For a random network, C is independent of the node’s degree and decreases with the size of the graph as C ~ N − 1 . Scale-free networks exhibit a quite different behavior. For example, the clustering coefficient of a scale-free network obtained from the Barabasi-Albert model. [

The shortest path length between two nodes is the minimum number of hops needed to connect them. Of course, for any pair of nodes there may be several shortest paths connecting them. The shortest path length distribution s ( h ) provides, for a given number h of hops, the number of shortest paths of length h. We call S the average shortest path length. The diameter D is the longest shortest path. The importance of the shortest paths is mainly related to routing. Many routing algorithms are based on the shortest path length. Adaptive algorithms allow changing routing decision to optimize traffic load and prevent incidences of congestions. The knowledge of the available shortest paths is then crucial for routing efficiency.

The closeness centrality Γ of a node i is the inverse of its average shortest path length to all other nodes: Γ ( i ) = ( N − 1 ) / ∑ j = 1 N − 1 σ ( i , j ) where σ ( i , j ) is the shortest path length between i and j. Nodes with high Γ are those closest to all others and can be considered central in the network. On the contrary, nodes with low Γ are, on average, far away from the others and can be considered peripheric.

The concept of betweenness centrality applies to both nodes and edges. The betweenness centrality of a node i is defined as B n ( i ) = ∑ j , k ∈ N σ ( j , k ; i ) / σ ( j , k ) where the sum is over all pairs of nodes, σ ( j , k ) is the number of shortest paths and σ ( j , k ; i ) is the number of those passing through i. If j = k then σ ( j , k ) = 1 and if i ∈ { j , k } , σ ( j , k ; i ) = 0 . The betweenness centrality B e of an edge e is defined in the same way. In this case σ ( j , k ; e ) is the number of shortest paths containing e. Efficient routing policies exploit as much as possible available shortest paths, hence a node (edge) with high betweenness centrality carries large traffic load. In [

In this section, we compare the measurements of the metrics concerning the Internet AS graphs obtained for each year of the decade 2008-2017 and report the corresponding results.

The left plot of

Year | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
---|---|---|---|---|---|---|---|---|---|---|

δ × 10 − 4 | 3.3 | 3.0 | 3.0 | 2.9 | 3.3 | 2.5 | 2.6 | 2.7 | 2.8 | 3.0 |

k ^ | 9.4 | 9.6 | 10.5 | 11.1 | 13.6 | 11.9 | 12.5 | 13.7 | 14.7 | 15.8 |

k max | 5430 | 6430 | 7684 | 8416 | 11,179 | 9838 | 10,682 | 11,739 | 18,110 | 13,725 |

k min P L | 23 | 8 | 44 | 30 | 16 | 15 | 17 | 17 | 14 | 14 |

α | 2.12 ± 0.03 | 2.13 ± 0.01 | 2.08 ± 0.03 | 2.07 ± 0.03 | 2.20 ± 0.02 | 2.10 ± 0.01 | 2.10 ± 0.02 | 2.11 ± 0.01 | 2.10 ± 0.01 | 2.14 ± 0.01 |

p ± 0.01 | 0.67 | 0.04 | 0.60 | 0.93 | 0 | 0 | 0 | 0 | 0 | 0 |

k-core both the number of ASes and edges increase over time. The evolution of the Internet nucleus is shown in the right plot of _{n} increases from 2008 to 2013, decreases in 2014 and 2015 and then increases again until 2017. We observe the same trend also for the number of edges in the nucleus as shown in

On average the nucleus contains ~0.4% of all ASes and ~4% of all edges. Carmi et al. [_{n} as a power of N on the base of a numerical simulation assuming a scale-free growing model with the same parameters of the real Internet. Instead, from the analysis of the Internet at the AS level during the period 2001-2006 Guo-Quing Zhang et al. [

We now examine in the left plot of

Year | k max c o r e | Ɲ_{n} | E n | K (core) | ∆Ɲ_{k}_{ }(%) | Δ E k ( % ) | δ k |
---|---|---|---|---|---|---|---|

2008 | 55 | 106 | 4040 | 2 | 4.82 | −0.4 | (42.1 ± 4.7) × 10^{−}^{5} |

2009 | 60 | 121 | 5185 | 3 | 10.22 | 5.48 | (79.8 ± 9.2) × 10^{−}^{5} |

2010 | 68 | 144 | 7077 | 4 | 11.24 | 9.49 | (13.0 ± 1.5) × 10^{−4} |

2011 | 75 | 163 | 8921 | 5 | 11.26 | 12.31 | (19.2 ± 2.3) × 10^{−4} |

2012 | 87 | 171 | 10383 | 6 | 11.00 | 14.49 | (26.5 ± 3.3) × 10^{−4} |

2013 | 92 | 198 | 13,132 | 7 | 10.56 | 16.13 | (35.0 ± 4.6) × 10^{−4} |

2014 | 97 | 181 | 12,020 | 8 | 10.08 | 17.44 | (44.6 ± 6.1) × 10^{−4} |

2015 | 104 | 177 | 12,008 | 9 | 9.57 | 19.3 | (55.3 ± 7.8) × 10^{−4} |

2016 | 106 | 196 | 14,084 | 10 | 9.22 | 19.38 | (6.8 ± 1.0) × 10^{−}^{3} |

2017 | 106 | 261 | 20,838 | k max c o r e | 0.13 | 2.05 | (64.1 ± 6.8) × 10^{−2} |

Compared to the evolution of the nucleus it is evident that the periphery evolves with a different dynamics. In

Tauro et al. [

The clustering coefficient has been used to investigate the hierarchical organization of real networks. The hierarchy could be a consequence of the particular role of the nodes in the network. A stub AS does not carry traffic outside itself and is connected to a transit AS that, on the contrary, is designed for this purpose. The hierarchy of the Internet is rooted in its geographical organization in international, national backbones, regional and local areas. This is the skeleton of the Internet. International and national backbones are connected to regional networks which finally connect local areas to the Internet, implementing in such a way a best and less expensive strategy. It is reasonable to suppose that this hierarchical structure introduces correlations in the connectivity of the ASes. A. Vázquez et al. [

In the left plot of

Year | S ± 0.6 | D | C |
---|---|---|---|

2008 | 3.1 | 6 | 0.59 |

2009 | 3.0 | 7 | 0.59 |

2010 | 3.0 | 7 | 0.61 |

2011 | 3.0 | 6 | 0.62 |

2012 | 2.9 | 7 | 0.65 |

2013 | 3.0 | 6 | 0.63 |

2014 | 3.0 | 6 | 0.63 |

2015 | 3.0 | 6 | 0.65 |

2016 | 2.9 | 7 | 0.68 |

2017 | 2.9 | 7 | 0.68 |

to the scaling C ( k ) ~ k − γ . However, data show that C ( k ) decreases with k especially for k > 100 . Low degree ASes have high neighbourhood connectivity and, on the contrary, neighbours of big hub ASes are slightly connected among them. This is consistent with a hierarchical organization in which big ASes are connected to many regions with high internal cohesiveness and composed of low or medium degree ASes, and these regions are poorly connected among them. Since the C ( k ) plots of the AS graph snapshots overlap, to study the evolution of the clustering coefficient over years we compare the CCDF of C ( k ) in

distributions are very intertwined indicating that during the decade 2008-2017 this region was rather static. In the medium degree region a clear separation emerges between the CCDF of the different years and for a given value of C ( k ) the CCDF increases over time. The gap is even more pronounced in the peripheric low degree region. This result suggests that the evolution of the Internet from 2008 to 2017 was not uniform and the most significant changes mainly affected its middle and even more its periphery, and the neighborhood connectivity in these regions increased over time.

The left plot of ^{−4} and found S ( 2001 ) = 3.4611 and S ( 2006 ) = 3.3352 . They noticed that simple power law and small world models, which predict a growth of S with the size of the Internet, fail to explain the overall slight decrease of S over time and argued that this might be due to the fact that the Internet expands according to many factors not considered by simple models like competitive and cooperative processes (like commercial relationships), policy-driven strategy and other human choices. From the comparison of our result with that of Zhao et al. there are indications the S has been slightly reduced during the period 2001-2017. This reinforces the fact that a pure power law model could not explain the evolution of the Internet because for

2 < α < 3 it predicts S ~ ln ln N [

The closeness centrality Γ as a function of the node’s degree k is shown in

The average node betweenness centrality as a function of the degree k is shown in ^{−5} and ~3.6 × 10^{−5} respectively. In

B n ( k ) | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
---|---|---|---|---|---|---|---|---|---|---|

k ≤ 100 | 3.3 × 10^{−4} | 3.0 × 10^{−4} | 2.2 × 10^{−4} | 1.8 × 10^{−4} | 1.2 × 10^{−4} | 1.2 × 10^{−4} | 1.2 × 10^{−4} | 8.9 × 10^{−5} | 7.7 × 10^{−5} | 7.3 × 10^{−5} |

100 < k ≤ 1000 | 3.0 × 10^{−3} | 2.8 × 10^{−3} | 2.2 × 10^{−3} | 2.0 × 10^{−3} | 1.5 × 10^{−3} | 1.6 × 10^{−3} | 1.5 × 10^{−3} | 1.3 × 10^{−3} | 1.0 × 10^{−3} | 0.9 × 10^{−3} |

k > 1000 | 4.5 × 10^{−2} | 4.0 × 10^{−2} | 3.1 × 10^{−2} | 2.9 × 10^{−2} | 1.9 × 10^{−2} | 2.0 × 10^{−2} | 1.8 × 10^{−2} | 1.5 × 10^{−2} | 1.4 × 10^{−2} | 1.4 × 10^{−2} |

shown, for each year of the decade 2008-2017, the colored 3D map of the average B e . The highest B e is associated to edges which have at least a high degree ( k > 1000 ) AS as a terminal. Edges connecting low or medium degree ASes have lower B e . This is what one would expect considering that high degree ASes are the backbone of the Internet and the most part of the shortest routes should cross them. We also observe a slight decrease of B e over time. The overall average B e was ~2.2 × 10^{−5} in 2008 and ~0.7 × 10^{−5} in 2017, indicating that somehow the Internet has become less congested although it has expanded. By looking at the colored contour maps shown on the right side of ^{−5} and ~5.4 × 10^{−5}. This is a further confirmation that during the evolution of the Internet the traffic load somehow decreases. This may be due to the adoption of more efficient routing policies and to infrastructural upgrades with more advanced network devices.

We studied the evolution of the Internet at the AS level during the decade 2008-2017. For each year of the decade we considered a snapshot of the AS undirected graph and analyzed how a wide range of metrics related to structure, connectivity and centrality varies over time. During the decade 2008-2017 the Internet almost doubled its size and became more connected. The Internet is a scale-free network because it contains both very high and low degree ASes. For all the years 2008-2017 the best fit of the degree distributions with a power law P ( k ) ~ k − α provides values of the exponent very close to each other and around α ≅ 2.1 . However, the statistical analysis shows that a pure power law model fails to explain the scale-free properties. The study of the k-core decomposition shows that Internet has a small internal nucleus composed of high degree ASes much more stable and connected than external cores. We investigated the hierarchical organization of the Internet by studying the average clustering coefficient C. We found that there are indications of an overall hierarchical organization of the Internet where a small fraction of big ASes are connected to many regions with high internal cohesiveness containing low and medium degree ASes and these regions are slightly connected among them. The average shortest path length S of the Internet slightly decreased during the decade 2008-2017 form ~3.1 to ~2.9 measured in 2008 and 2016-2017 respectively. Regardless of the analyzed year, the closeness centrality Γ of an AS increases with its degree. Hence, big ASes are in the center of the Internet and low degree ASes are in the periphery. It is reasonable to assume that the traffic load of an AS or carried by an edge is proportional to the number of shortest paths passing through the AS and containing the edge. These measurements can be quantified by the average node and edge betweenness centrality B n and B e . There is evidence of an overall slight decrease of both B n and B e during the decade 2008-2017, suggesting that during its evolution the Internet became less congested.

The computing resources and the related technical support used for this work have been provided by CRESCO/ENEAGR\-ID High Performance Computing infrastructure and its staff [

The authors declare no conflicts of interest regarding the publication of this paper.

Funel, A. (2019) The Graph Structure of the Internet at the Autonomous Systems Level during Ten Years. Journal of Computer and Communications, 7, 17-32. https://doi.org/10.4236/jcc.2019.78003