Problem 1: Sensitivity analysis of a predator-prey model

Problem 1A) Exploring the model. To explore parameter sampling and sensitivity analysis, we’ll use an agent-based predator-prey model from the PyCX module. This model (a variation of the classic Lotka-Volterra model) tracks rabbits (prey) and foxes (predators) as they move through space, and foxes can prey upon (eat) the rabbits with probability dr when the two come within a radius cd of one another.

In addition to movement and predation, the prey can reproduce, and predators can reproduce if there is food available. If there are no rabbits (prey) nearby, the predators may also die with probability df. For more information about the model, please see Sections 4.6 and 19.4 of Sayama’s book (Introduction to the Modeling and Analysis of Complex Systems).

Make a copy of the model abm-predator-prey-with-plot.py from the PyCX module (you can use the one I’ve provided here on Google Colab) and run it a few times, for at least 300 timesteps (you may need to run it for a while to observe the general behavior), exploring different parameter values by hand.

Be sure to run the model for long enough to see how the dynamics turn out (e.g. running to at least 200 or 300 timesteps).



Figure 1. Five runs of the predator-prey model with the default parameters. Note the general pattern of oscillations but also occasional population extinction (in this case of foxes, leading to an all-rabbit population). Blue lines indicate rabbits (prey) and red lines indicate foxes (predators). Brightness of the line indicates which simulation (i.e. the brightest red and brightest blue lines were from the same simulation, etc.).

Problem 1B) Examining stochastic variation. Next, run the model at the original, default parameters for at least 5 model simulations (something more like 20 is probably better), each for 400 timesteps. Plot all the trajectories for your model runs, either by hand (by running the model with PyCX and then saving the resulting plots), or you can write a for loop to run the model for 400 timesteps multiple times and plot the resulting simulations. An example of the type of plot you might create is shown in Figure 1. How much variation do you see in the overall trajectory of the model from one run to the next? Are there common patterns that you observe?

Note also that the model may take a while to run each time; if you use a loop, you may want to add a print statement that prints the run number (i.e. the for loop index) on each iteration, so that you can track the progress as it goes (and make sure your code is running correctly).
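As a starting point, here is a minimal sketch of such a loop. It assumes you have copied the initialize() and update() functions (and the parameter definitions) from abm-predator-prey-with-plot.py into your notebook, and that you keep (or add) lists recording the rabbit and fox counts at each timestep; the names rabbit_counts and fox_counts below are placeholders, so adjust them to match your copy of the code.

import matplotlib.pyplot as plt

n_runs = 5
n_steps = 400

plt.figure()
for run in range(n_runs):
    print('starting run', run)               # progress indicator, since runs are slow
    initialize()                             # reset the agents and the count lists
    for t in range(n_steps):
        update()
    plt.plot(rabbit_counts, 'b', alpha=0.5)  # prey (blue)
    plt.plot(fox_counts, 'r', alpha=0.5)     # predators (red)

plt.xlabel('timestep')
plt.ylabel('population size')
plt.show()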


Problem 1C) A 1-parameter sweep. Next, we will run a parameter sweep of a single parameter, the rabbit carrying capacity nr. Because the behavior of the model may vary stochastically from one simulation to the next, we will run the model at least 5-10 times for each parameter value.

Write a loop to simulate the model for nr values of 200, 300, 400, 500, 600, 700, and 800, remembering to run the model 5 times for each value (or more, depending on how long it takes on your computer; when I ran it, 10 simulations per parameter value gave a relatively smooth curve). At each parameter value, calculate the mean final value (after 400 timesteps) of the predators and prey across all 5 runs (or however many) for that parameter value. Note this may take a while to run! That said, if your computer is able to handle it, you may get cleaner results if you run more than 5 runs per parameter value, and if you take a finer-resolution sequence of values for nr (e.g. stepping by 50 instead of by 100).

Make a plot of the mean final predator and prey populations as a function of the carrying capacity (i.e. with the nr value on the x-axis and mean final predator and prey values on the y-axis). What patterns do you notice? Why do you think this is?
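One possible sketch of the sweep and plot is below. It assumes you have wrapped a single simulation in a (hypothetical) helper run_model(nr, n_steps) that sets the carrying capacity, runs the model for n_steps timesteps, and returns the final rabbit and fox counts; adapt it to however you have organized your model code.

import numpy as np
import matplotlib.pyplot as plt

nr_values = [200, 300, 400, 500, 600, 700, 800]
runs_per_value = 5
n_steps = 400

mean_final_rabbits = []
mean_final_foxes = []
for nr in nr_values:
    # run several simulations at this nr; each returns (final rabbits, final foxes)
    finals = np.array([run_model(nr, n_steps) for _ in range(runs_per_value)])
    mean_final_rabbits.append(finals[:, 0].mean())
    mean_final_foxes.append(finals[:, 1].mean())
    print('finished nr =', nr)    # progress indicator

plt.plot(nr_values, mean_final_rabbits, 'b.-', label='prey (rabbits)')
plt.plot(nr_values, mean_final_foxes, 'r.-', label='predators (foxes)')
plt.xlabel('rabbit carrying capacity nr')
plt.ylabel('mean final population')
plt.legend()
plt.show()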

Next, what potential issues might there be with using the mean final values of predators and prey as the output for our 1-parameter sensitivity analysis? Based on your observations of the model dynamics, what aspects of the system dynamics will this output capture or not capture?

Propose a different output measure that addresses one of the limitations you’ve identified and run a parameter sweep with that output instead. What do you observe? Show your plots and comments.


Problem 1D) Sampling parameter space and uncertainty quantification. Next, we’ll explore the parameter space further through Latin hypercube sampling. For this, we’ll use the pyDOE package (see the pyDOE documentation for more information and for commands to install it with conda/Anaconda; I’ve also included the install commands for Google Colab in the example model code). For this problem, we will sample the parameter space for four parameters: nr (the rabbit carrying capacity), dr (the probability a rabbit dies when it encounters a fox), df (the probability a fox dies when no food is available), and rf (the probability a fox reproduces when food is available).

If you’re feeling adventurous though, feel free to explore what happens to the model behavior with the other parameters too!

We will take the bounds for nr to be from \(200\) to \(800\), and the other parameters to be between \(0\) and \(1\). (However, if computation time is a problem, you may need to narrow these ranges further; see the notes below!)

Sample 100 parameter sets using Latin hypercube sampling, and run the model twice for each parameter set (we’ll just run the model twice to save on computation time). You can generate your parameter sets like this:

from pyDOE import lhs
import numpy.matlib as npm

nsamples = 100
reruns = 2
nparams = 4

# Set up parameter array: each row is one parameter set,
# stacked reruns times so that each set gets simulated twice
params = npm.repmat(lhs(nparams, samples=nsamples), reruns, 1)

The command lhs(nparams, samples = nsamples) generates an nsamples-by-nparams array in which each row is a new parameter set (and each column is a parameter), and the command npm.repmat stacks reruns copies of that array (i.e. two copies of the array), so that we run each parameter set twice.

Note that by default, lhs will sample your parameters uniformly on [0,1]. This is fine for most of the parameters, but we will have to rescale this range to go from 200 to 800 for nr, the rabbit carrying capacity. We can do that as follows:

# Run the simulations!
for i in range(nsamples*reruns):
    # note we rescale the [0, 1] range of params[i, 0] to match the [200, 800] range for nr!
    nr = params[i, 0]*(800 - 200) + 200  # carrying capacity for rabbits
    dr = params[i, 1]  # death probability for rabbits when encountering a fox
    df = params[i, 2]  # death probability for foxes when no food is available
    rf = params[i, 3]  # reproduction probability for foxes when food is available

    # and the rest of your code goes here!

Run the model for each of the parameter sets, and plot histograms of the resulting final predator and prey population totals across all your samples (you don’t need to average across each of the two runs in this case, since there are only two of them—just treat the two runs as separate samples).
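For example, here is a minimal sketch of the histograms, assuming that inside the simulation loop above you appended each run's final counts to lists final_rabbits and final_foxes (these names are placeholders; use whatever you collected).

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(final_rabbits, bins=20)
ax1.set_xlabel('final rabbit population')
ax1.set_ylabel('number of samples')
ax2.hist(final_foxes, bins=20)
ax2.set_xlabel('final fox population')
plt.show()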

Also generate a scatterplot of the final predator vs. final prey values across all samples (or you might try a 2d histogram using hist2d). What do you observe? What are the general types of behavior observed in the model?

Note that it may be useful to normalize the final prey population total to the carrying capacity parameter used for that run (i.e. calculate the final percent of rabbit carrying capacity for each simulation). This way you can more easily see when the rabbits stabilized at carrying capacity (otherwise the carrying capacities are different for each run).
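A sketch of the normalized scatterplot might look like the following; it again assumes the final_rabbits and final_foxes lists from the loop above, along with the params array, so that the nr column can be rescaled the same way it was inside the loop.

import numpy as np
import matplotlib.pyplot as plt

final_rabbits = np.array(final_rabbits)
final_foxes = np.array(final_foxes)
nr_used = params[:, 0]*(800 - 200) + 200   # the nr value used for each run
rabbit_fraction = final_rabbits/nr_used    # final rabbits as a fraction of carrying capacity

plt.scatter(rabbit_fraction, final_foxes, alpha=0.5)
plt.xlabel('final rabbits (fraction of carrying capacity)')
plt.ylabel('final foxes')
plt.show()

# or, as a 2D histogram:
# plt.hist2d(rabbit_fraction, final_foxes, bins=20)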

Some notes in case computation time is an issue:

The model can run extremely slowly on some parts of these parameter ranges for some machines. If you have problems with this, here are some edits I’d recommend:


Problem 1E) Visualizing parameter sensitivity. Finally, generate scatterplots for each of your parameters vs. each of the model outputs (i.e. final predator and final prey population sizes). This should generate 8 scatterplots total. What do you observe? What effect does each parameter have on the model dynamics? Which parameters is the model most/least sensitive to?

Also generate a heat map plot of your choice, plotting one of the model outputs as a function of two of the parameters (you may choose which). For the heatmap, you may find the pyplot function tricontourf convenient!
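One possible sketch of these plots is below. It assumes the params array and the final_rabbits / final_foxes outputs from Problem 1D, with the params columns in the order nr, dr, df, rf (and the nr column still on its raw 0-1 scale); adjust the names and which parameters you plot as you like.

import matplotlib.pyplot as plt

param_names = ['nr', 'dr', 'df', 'rf']
outputs = {'final rabbits': final_rabbits, 'final foxes': final_foxes}

# 2 outputs x 4 parameters = 8 scatterplots
fig, axes = plt.subplots(2, 4, figsize=(14, 6))
for row, (out_name, out_vals) in enumerate(outputs.items()):
    for col in range(4):
        axes[row, col].scatter(params[:, col], out_vals, s=5, alpha=0.5)
        axes[row, col].set_xlabel(param_names[col])
        axes[row, col].set_ylabel(out_name)
plt.tight_layout()
plt.show()

# heat map of one output as a function of two of the parameters (here dr and df)
plt.figure()
plt.tricontourf(params[:, 1], params[:, 2], final_foxes, levels=20)
plt.colorbar(label='final foxes')
plt.xlabel('dr')
plt.ylabel('df')
plt.show()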

Finally, comment on how the choice of model output may limit your ability to discern how the parameters affect the model dynamics. What alternative model outputs might you choose? You don’t have to implement them (though you’re welcome to!), just discuss.


Optional Problem 1F) Calculate correlation coefficients. (Note this problem is optional! Just for fun if you’re interested) From the scatterplots you generated in Problem 1E, evaluate a correlation measure of your choice between the parameters and the model outputs. This could be a linear trend slope, a correlation coefficient, etc. What do you observe? Does the correlation measure help to interpret the data?
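For example, here is a minimal sketch using Pearson's correlation coefficient (via numpy's corrcoef), again assuming the params array and the final_rabbits / final_foxes outputs from Problem 1D.

import numpy as np

param_names = ['nr', 'dr', 'df', 'rf']
for j, name in enumerate(param_names):
    # correlation between this parameter and each model output
    r_prey = np.corrcoef(params[:, j], final_rabbits)[0, 1]
    r_pred = np.corrcoef(params[:, j], final_foxes)[0, 1]
    print(name, ': r with final rabbits =', round(r_prey, 2),
          ', r with final foxes =', round(r_pred, 2))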



Problem 2: Project Proposal

For the second half of this lab, we’ll be working on more detailed project proposals, which we’ll share with the rest of the class. The project proposal is an opportunity for you to lay out a more detailed plan for your model (and the beginnings of a draft for your paper!), and to start to work out how to code up the model. I recommend you take a look at the PARTE framework and ODD protocol for some good guidelines for how to document your model.

For your project proposal, you may use the following proposal template based on the PARTE and ODD approaches. There is no particular page limit, but something on the order of 1-3 pages is likely about the right length, plus potentially code or pseudo-code depending on where you are in your project.