Lab 3: Dynamics on and of Networks

Problem 1: Exploring the SIR mean-field model

In this problem, we will examine how the network structure affects how well the Erdos-Renyi SIR mean field model captures the dynamics. Feel free to adapt the code we used in class.

Figure 1. Example small-world network generated using the Watts-Strogatz algorithm (image from Wikipedia).

For this problem, we will use both the Erdos-Renyi graph, as well as the Watts-Strogatz small world network. As a reminder, the way the Watts-Strogatz method works is that we start with a ring of nodes, connected to their $k$ nearest neighbors. We then rewire one end of each edge to a new, randomly chosen neighbor with probability $\beta$. This results in a network that has mostly local connections, but some long range connections (depending on $\beta$), leading to the small world phenomenon of high clustering but low path lengths between nodes.

Another useful feature of this model is that as you adjust $\beta$ between $0$ and $1$, you can move between a very localized ring network when $\beta = 0$ (matching what we used in cellular automata), to a more random network when $\beta = 1$ (similar to an Erdos-Renyi network). Note that while the random network we achieve when $\beta = 1$ (i.e. we rewire all edges) is very similar to an Erdos-Renyi network, it is not quite the same. This is because we only rewire one end of each edge (e.g. we only rewire edges going to clockwise nodes). This rewiring rule means that every node must have at least $k/2$ edges (unlike in an Erdos-Renyi network, where the degree can potentially range from zero to all other nodes).

Problem 1A) Finding an ‘equivalent’ small-world network. Let’s see if we can find a sort of ‘equivalent’ small world network to match the Erdos-Renyi network we looked at in class. We can do this by setting the average degree of the small world network, $k$, equal to that of the Erdos-Renyi network. Calculate the average degree of an Erdos-Renyi network with $N$ nodes and edge probability $p_e$. Be sure to explain your reasoning as well as giving a formula for the average degree.

Problem 1B) Implementing the simulation on a small-world network. Now that we know what average degree to use, adapt the SIR model on an Erdos-Renyi network that we used in class to run on a Watts-Strogatz network. You can do this as follows:

In your initialize() function, change the network from the nx.erdos_renyi_graph() function, to use the nx.watts_strogatz_graph() function. This function takes three arguments: $N$, the number of nodes, $k$, the average degree (i.e. number of ring neighbors to connect to), and $\beta$, the rewiring probability.
For the number of neighbors to connect to, calculate the average degree using the formula you derived in 1A, using an edge probability of $p_e = 0.06$. Note that you will need to round your answer to the nearest integer, since you can only connect to an integer number of neighbors.
For the other parameters, take $p_i = p_r = 0.1$ and $N = 100$. We will try out a range of different $\beta$ values, so set it to whatever you like for now.

Implement the model and run it a couple of times for a few different $\beta$ values ranging from 0 to 1 (e.g. at least include $\beta$ values that are low, medium, and high, and you may want more values of $\beta$ to fully see how it works!). How well does the Erdos-Renyi mean field model perform as you change $\beta$ from 0 to 1? In your write-up, be sure to include a couple of plots of the prevalence (fraction of the population that is infected) over time, comparing the Erdos-Renyi mean field prediction to the Watts-Strogatz network simulation.

Problem 1C) Exploring the Erdos-Renyi mean-field model performance. Running one or two simulations is often not enough to really explore the model behavior, particularly for a stochastic model like this one. Adjust your code so that for a given parameter set, you run at least $100$ simulations for $100$ timesteps each. For this you will probably want to skip the visualization at each time step (and the pycxsimulator function if you’re using it), and instead run your model in a for loop without observing it until the simulation has finished. Some example code for this might be something like:

timesteps = 100  # number of timesteps per simulation
numsims = 100    # number of simulations to run

# this array will store the time series of fraction infected for each simulation
prevarray = np.zeros([timesteps,numsims])  

for i in range(0,numsims):        # loop over all simulations
    initialize()                  # initialize each simulation
    for j in range(1,timesteps):  # loop over the timesteps for that simulation
        update()                  # update
    prevarray[:,i] = prev         # store the resulting simulation in prevarray

Note however, that if you do it this way you will also need to move the prev = [], prev_mf = [], and susc_mf = [] lines into the initialize() function! Otherwise the code as written will keep appending new prevalences from one simulation to the next, so that when we try to store each run into our prevarray, the prev variable will be the wrong length.

Try running your code for at least five different values of $\beta$, and be sure to include 0 and 1 (i.e. no rewiring and 100% rewiring). For each choice of $\beta$ make a plot of all of your simulations, and plot the mean-field model prediction on top of this as a line (or use a distinct color, so we can tell the difference between the mean field and the network simulations). Include your plots in your writeup. You should see something like this (plotted for $\beta = 0.5$, with the black line as the mean field):

Figure 2. Example of a quantile plot. The dark blue filled area is the 5% to 95% quantiles, the light blue filled area is the 25% to 75% quantiles, the blue line is the median of the network simulations, and the black line is the mean-field model.

Problem 1D) Capturing uncertainty. To more effectively capture the range of uncertainty in our simulations, adjust the plots you made in Problem 1C to plot the median across your simulations, the 5% and 95% quantiles, and the mean-field model prediction. For this, you may want to use two python functions: np.quantile() (part of the numpy package) and fill_between() part of the pyplot package. Take a look at the documentation for these two functions to figure out how to use them with your model output (and feel free to ask me if you have trouble with them).

Redo the plots for the five (or more) different values of $\beta$ from Problem 1C. How well does the Erdos-Renyi mean field model capture the dynamics as $\beta$ changes? Would you feel comfortable using the mean field model for this system (and if so, for what values of $\beta$)?

Problem 1E) Comparing Watts-Strogatz vs. Erdos-Renyi. Finally, make one more plot like you did in Problem 1D, but now for the Erdos-Renyi graph with $p_e = 0.06$. Compare this to the Watts-Strogatz plot you made for $\beta = 1$ in Problem 1D. How well does the mean field model capture the model behavior for the most random Watts-Strogatz network vs. for the Erdos-Renyi network?

Problem 2: Cascading failure on a power grid

Figure 3. Degree distributions for two electrical power grid networks, from Chassin DP, Posse C. Evaluating North American electric grid reliability using the Barabási–Albert network model. Physica A: Statistical Mechanics and its Applications. 2005 Sep 15;355(2-4):667-77.

This problem is adapted from Sayama, Exercise 16.12. Here we will model the possibility of large scale blackouts (power failures) on a region’s electrical grid. Electrical grids have previously been shown to match a power law degree distribution, indicating a scale-free network, so we will use the Barabasi-Albert preferential attachment model to generate our power grid network. We will consider an undirected, unweighted network, where nodes represent electrical stations/substations, and edges represent high-voltage power transmission lines. While we are modeling this as a power grid in this example, it’s worth noting that similar models are used to examine failures in banks and other financial institutions, among other topics.

Problem 2A) Implement a power grid network model (you can adapt the SIR model code or one of the examples from PyCX for this if you like) with the following rules:

The underlying network is a Barabasi-Albert preferential attachment network with 100 nodes ($N = 100$), where each node attaches to two existing nodes ($m = 2$). You can implement this with the networkx function g = nx.barabasi_albert_graph(N,m).
Time will update synchronously (like we have been doing with the SIR model) in discrete time steps.
Each node is assigned a maximum capacity (the maximum power that node can handle) upon initialization, which we will draw from a normal distribution with mean 100 and standard deviation of 10.
Each node also carries an electrical load, which is determined by the amount of power being drawn from that node (i.e. by houses, businesses in the area, etc.). Because it is inefficient and potentially costly to design power stations with significantly more capacity than is typically needed, we will suppose that the power station loads can range up to but less than, the capacity, and set the initial loads to be drawn uniformly at random from 20% to 100% of each node’s capacity.
Each node also has a state associated with it, either ‘running’ or ‘failed’. The rules for the nodes are as follows:
- If the node is running but it’s load is greater than it’s capacity, the node will turn to a failed node at the next time step. When this happens, its load will be evenly distributed to its neighbors (realistically, this should probably go to only the neighbors that are still running, but you can just distribute it to all neighbors)
- If the node is already failed, nothing happens (we could actually implement failure as removal of the node from the network, but for ease of visualization we’ll keep the node but mark it as failed).

Implement the model and simulate time forward (say for around 50 timesteps, or however long until the dynamics settle). Then change the initial load so that one node chosen at random has a load greater than its capacity (say by setting its load equal to $1 + $ its capacity) and simulate the model again. What dynamics do you observe? You may need to try this a couple of times. Include any relevant plots to illustrate the dynamics you see.

For this problem, you may want to use two python functions to sample form a normal distribution and uniform distribution: np.random.normal(mean,SD) and np.random.uniform(min, max) (where you’ll also need to import numpy, e.g. import numpy as np)

Problem 2B) Next, try adjusting the initial load ranges to be closer or further from the maximum capacity (e.g. compare loads ranging from 10-100%, 50-100%, and 80-100%, etc.). How does the size and frequency of the cascade failures change?

Problem 2C) Now that we have the model running, let’s try it with a real-world power grid network. Download the Western United States power grid network from Mark Newman’s website (it’s listed as ‘Power grid’), from D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998). Load the network as a graph using g = nx.read_gml('power/power.gml',label='id'). Note that this network is somewhat large (4941 nodes)! Be careful running things on this network as it may run very slowly.

Plot the network using the spectral layout, pos=nx.spectral_layout(g), (note that because the network is large, the usual spring layout that we use can be very slow, so the spectral layout is often faster). How does the network structure look? Does it look like the simulated networks we generated earlier? (If not, what is different?) Then plot the network degree distribution (on a log-log scale). Does this network appear to be scale free?

Problem 2D) Run the model on the Western United States power grid network, again with a random node chosen to be over capacity when you initialize the system (you can use 50-100% initial load for this model). However, don’t try to visualize it at every time step! This will probably be too slow to run. Instead, just run the model in a for loop (similar to Problem 1) for at least 20 simulations (even 20 simulations may take a while depending on how you code this and how fast your computer/colab is), and at the end of each simulation tabulate the total number of failed nodes, i.e. the total size of the electrical blackout resulting from the node failure. Generate a histogram of the blackout sizes (i.e. the size of the cascades) from each simulation. How robust is this power network to random node failure? In what percent of simulations do more than 10% of the nodes fail?

Lab 3: Dynamics on and of Networks

CMPLXSYS 530 - Marisa Eisenberg

Due date: 3/7/2024

Problem 1: Exploring the SIR mean-field model

Problem 2: Cascading failure on a power grid