Earthquake Data
I got the earthquake data from the Phivolcs website for earthquakes that occurred between March 1-9, 2016. The data was saved to a csv file to be processed by python. As is, the data did not need much cleaning. I also used pandas to manage the data.
In creating the network, we will need information about the distance between two earthquake epicentres. The Euclidean distance is inappropriate for this, as the distances are measured on the surface of the Earth, which is not a flat plane. Instead, we will use the great circle distance, computed with the haversine formula.
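For reference, the haversine distance between two points with latitudes $\phi_1, \phi_2$ and longitude difference $\Delta\lambda$ on a sphere of radius $r$ is $d = 2r \arcsin\sqrt{\sin^2(\Delta\phi/2) + \cos\phi_1 \cos\phi_2 \sin^2(\Delta\lambda/2)}$. The snippet below is a minimal sketch of that formula, not the original notebook cell. The function name `haversine` and the mean Earth radius of 6371 km are my own assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, in kilometres


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for antipodal points.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```

The result is in the same unit as `radius`, so passing a radius in miles would give miles.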
Network construction
We will construct the earthquake network using locations of the earthquakes as nodes. The edges between the nodes will depend on the Nearest record breaking criterion which has the following rules:
- Two earthquakes $i$ and $j$ are connected if they occur consecutively in time.
- An earthquake $k$ is connected to a past earthquake $i$ if the new earthquake breaks the record $R_{ij}$ of previous earthquake pairs.
The record of an earthquake $i$ is the current shortest distance between $i$ and any other earthquake $j$. Thus a new earthquake $k$ is connected to an older earthquake $i$ if $R_{ik}<R_{ij}$.
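The two rules above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the original code: events are assumed to be a time-ordered list of `(lat, lon)` pairs, the record of $i$ is taken over earthquakes that occur after $i$, and the names `build_record_network` and `haversine` are hypothetical.

```python
import math


def haversine(lat1, lon1, lat2, lon2, radius=6371.0):
    """Great-circle distance in km (radius is an assumed mean Earth radius)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def build_record_network(events):
    """Return directed edges (i, k), i earlier than k, for time-ordered events."""
    records = [math.inf] * len(events)  # shortest distance seen so far per event
    edges = []
    for k in range(1, len(events)):
        for i in range(k):
            d = haversine(*events[i], *events[k])
            # Rule 1: consecutive events. Rule 2: k breaks the record of i.
            if i == k - 1 or d < records[i]:
                edges.append((i, k))
            records[i] = min(records[i], d)
    return edges
```

Each edge points from the older earthquake to the newer one. Because every record starts at infinity, the first earthquake after $i$ always breaks its record, so the consecutive-in-time rule is also a special case of the record rule in this sketch.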
Connections in this network are directed. The direction of a link suggests a possible causal relation, and it also encodes the relative sequence of events: past events can point to future events, but future events never point to past events.
Degree distributions
The degree distribution is one of the usual metrics by which networks are characterized. The earthquake network created in this entry is a directed graph, wherein the directionality of links is important. Thus, we characterize both the in-degree and out-degree distributions of the network.
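As a rough sketch of how the two distributions could be tallied, assume the network is stored as a list of directed edges `(i, k)` over nodes numbered `0` to `n_nodes - 1`. The function name `degree_distributions` and this edge-list representation are assumptions for illustration only.

```python
from collections import Counter


def degree_distributions(edges, n_nodes):
    """Return (in_hist, out_hist): how many nodes have each in/out degree."""
    out_deg = Counter(src for src, _ in edges)  # links leaving each node
    in_deg = Counter(dst for _, dst in edges)   # links arriving at each node
    # Include nodes with degree zero, which never appear in the counters.
    in_hist = Counter(in_deg.get(n, 0) for n in range(n_nodes))
    out_hist = Counter(out_deg.get(n, 0) for n in range(n_nodes))
    return dict(in_hist), dict(out_hist)
```

Each returned dictionary maps a degree value to the number of nodes with that degree, which can then be plotted with matplotlib.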
Unfortunately, the dataset only contained 69 earthquakes, which is too small to pull off a reliable distribution. The degrees themselves fall within a range of less than one order of magnitude.
At first glance, it appears that there are more high out-degree nodes than high in-degree ones. This tells us that a single earlier quake can be linked to many different subsequent quakes. On the other hand, fewer connections can be made from a given future earthquake back to the previous earthquakes it is connected to.
I believe that a possible insight we can get from this small dataset is that a single earthquake can cascade into more earthquakes. This is a reasonable claim since we have an idea of the concept of aftershocks produced by earthquakes. However, the size of the dataset is insufficient to back up any claims we draw from the data.
Magnitude distribution
Finally, we’ll look into the magnitude distribution of the earthquakes in the recorded period. This analysis is a purely statistical one, independent of the network structure of the earthquakes. Let’s start by plotting the complementary cumulative distribution function (CCDF) of the earthquake magnitudes.
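For a list of magnitudes, the empirical CCDF at a value $m$ is the fraction of earthquakes with magnitude at least $m$. The snippet below is a minimal sketch of that computation; the function name `ccdf` and the use of "at least" rather than "strictly greater than" are my own choices, not necessarily those of the original analysis.

```python
def ccdf(values):
    """Return (x, P(X >= x)) pairs for each distinct value, in ascending order."""
    n = len(values)
    return [(x, sum(v >= x for v in values) / n) for x in sorted(set(values))]
```

Plotting the resulting pairs on logarithmic axes is the usual way to check by eye whether the tail looks like a power law.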
At first glance, we cannot characterize the resulting distribution as a power law. The low slope at low magnitudes means that only a few low-magnitude earthquakes were recorded. This can most likely be attributed to the data gathering of Phivolcs, wherein not all earthquakes are recorded. It would appear that their cutoff is less than magnitude 1.9.
The sharp slope around magnitude 3 indicates that most of the recorded earthquakes are close to this magnitude. Unfortunately, as with the degree distributions, the dataset is too small to draw meaningful conclusions from the distribution.
Clustering coefficient distribution
The clustering coefficient distribution has a sharp increase at the center, flattening out at the extreme values. The form of this CDF resembles that of a Gaussian, shifted toward the lower clustering coefficients. The clustering coefficients obtained are indeed larger than those of a randomly connected network, implying that there is an underlying structure in the network. Naturally, we’d expect that this is a consequence of the network construct