Problem 1: Sensitivity analysis of a predator-prey model

Problem 1A) Exploring the model. To explore parameter sampling and sensitivity analysis, we’ll use an agent-based predator-prey model from the PyCX module. This model (a variation of the classic Lotka-Volterra model) tracks rabbits (prey) and foxes (predators) as they move through space, and foxes can prey upon (eat) the rabbits with probability dr when the two come within a radius cd of one another.

In addition to movement and predation, the prey can reproduce, and predators can reproduce if there is food available. If there are no rabbits (prey) nearby, the predators may also die with probability df. For more information about the model, please see Sections 4.6 and 19.4 of Sayama’s book (Introduction to the Modeling and Analysis of Complex Systems).

Make a copy of the model abm-predator-prey-with-plot.py from the PyCX module (you can use the one I’ve provided here on Google Colab) and run it a few times, for at least 300 timesteps (you may need to run it for a while to observe the general behavior), exploring different parameter values by hand.

Be sure to run the model for long enough to see how the dynamics turn out (e.g. running to at least 200 or 300 timesteps).



Figure 1. Five runs of the predator-prey model with the default parameters. Note the general pattern of oscillations but also occasional population extinction (in this case of foxes, leading to an all-rabbit population). Blue lines indicate rabbits (prey) and red lines indicate foxes (predators). Brightness of the line indicates which simulation (i.e. the brightest red and brightest blue lines were from the same simulation, etc.).

Problem 1B) Examining stochastic variation. Next, run the model at the original, default parameters for at least 5 model simulations (something more like 20 is probably better), each for 400 timesteps. Plot all the trajectories for your model runs, either by hand (by running the model with PyCX and then saving the resulting plots), or you can write a for loop to run the model for 400 timesteps multiple times and plot the resulting simulations. An example of the type of plot you might create is shown in Figure 1. How much variation do you see in the overall trajectory of the model from one run to the next? Are there common patterns that you observe?

Note also that the model may take a while to run each time; if you use a loop, you may want to add a print statement that prints the run number (i.e. the for loop index) on each iteration, so that you can track the progress as it goes (and make sure your code is running correctly).
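As a starting point, here is a minimal sketch of such a loop. It assumes you have copied the initialize() and update() functions (and the parameter definitions) from abm-predator-prey-with-plot.py into your notebook, and that you keep (or add) lists recording the rabbit and fox counts at each timestep; the names rabbit_counts and fox_counts below are placeholders, so adjust them to match your copy of the code.

import matplotlib.pyplot as plt

n_runs = 5
n_steps = 400

plt.figure()
for run in range(n_runs):
    print('starting run', run)               # progress indicator, since runs are slow
    initialize()                             # reset the agents and the count lists
    for t in range(n_steps):
        update()
    plt.plot(rabbit_counts, 'b', alpha=0.5)  # prey (blue)
    plt.plot(fox_counts, 'r', alpha=0.5)     # predators (red)

plt.xlabel('timestep')
plt.ylabel('population size')
plt.show()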


Problem 1C) A 1-parameter sweep. Next, we will run a parameter sweep of a single parameter, the rabbit carrying capacity nr. Because the behavior of the model may vary stochastically from one simulation to the next, we will run the model at least 5-10 times for each parameter value.

Write a loop to simulate the model for nr values of 200, 300, 400, 500, 600, 700, and 800, remembering to run the model 5 times for each value (or more, depending on how long it takes on your computer; when I ran it, 10 simulations per parameter value gave a relatively smooth curve). At each parameter value, calculate the mean final value (after 400 timesteps) of the predators and prey across all 5 runs (or however many) for that parameter value. Note this may take a while to run! That said, if your computer is able to handle it, you may get cleaner results if you run more than 5 runs per parameter value, and if you take a finer-resolution sequence of values for nr (e.g. stepping by 50 instead of by 100).

Make a plot of the mean final predator and prey populations as a function of the carrying capacity (i.e. with the nr value on the x-axis and mean final predator and prey values on the y-axis). What patterns do you notice? Why do you think this is?
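One possible sketch of the sweep and plot is below. It assumes you have wrapped a single simulation in a (hypothetical) helper run_model(nr, n_steps) that sets the carrying capacity, runs the model for n_steps timesteps, and returns the final rabbit and fox counts; adapt it to however you have organized your model code.

import numpy as np
import matplotlib.pyplot as plt

nr_values = [200, 300, 400, 500, 600, 700, 800]
runs_per_value = 5
n_steps = 400

mean_final_rabbits = []
mean_final_foxes = []
for nr in nr_values:
    # run several simulations at this nr; each returns (final rabbits, final foxes)
    finals = np.array([run_model(nr, n_steps) for _ in range(runs_per_value)])
    mean_final_rabbits.append(finals[:, 0].mean())
    mean_final_foxes.append(finals[:, 1].mean())
    print('finished nr =', nr)    # progress indicator

plt.plot(nr_values, mean_final_rabbits, 'b.-', label='prey (rabbits)')
plt.plot(nr_values, mean_final_foxes, 'r.-', label='predators (foxes)')
plt.xlabel('rabbit carrying capacity nr')
plt.ylabel('mean final population')
plt.legend()
plt.show()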

Next, what potential issues might there be with using the mean final values of predators and prey as the output for our 1-parameter sensitivity analysis? Based on your observations of the model dynamics, what aspects of the system dynamics will this output capture or not capture?

Propose a different output measure that addresses one of the limitations you’ve identified and run a parameter sweep with that output instead. What do you observe? Show your plots and comments.


Problem 1D) Sampling parameter space and uncertainty quantification. Next, we’ll explore the parameter space further through Latin hypercube sampling. For this, we’ll use the pyDOE package (see the pyDOE documentation for more information and for commands to install it with conda/Anaconda; I’ve also included the install commands for Google Colab in the example model code). For this problem, we will sample the parameter space for four parameters: nr (the rabbit carrying capacity), dr (the probability a rabbit dies when it encounters a fox), df (the probability a fox dies when no food is available), and rf (the probability a fox reproduces when food is available).

If you’re feeling adventurous though, feel free to explore what happens to the model behavior with the other parameters too!

We will take the bounds for nr to be from \(200\) to \(800\), and the other parameters to be between \(0\) and \(1\). (However, if computation time is a problem, you may need to narrow these ranges further; see the notes below!)

Sample 100 parameter sets using Latin hypercube sampling, and run the model twice for each parameter set (we’ll just run the model twice to save on computation time). You can generate your parameter sets like this:

from pyDOE import lhs
import numpy.matlib as npm

nsamples = 100
reruns = 2
nparams = 4

# Set up parameter array: each row is one parameter set,
# stacked reruns times so that each set gets simulated twice
params = npm.repmat(lhs(nparams, samples=nsamples), reruns, 1)

The command lhs(nparams, samples = nsamples) generates an nsamples-by-nparams array in which each row is a new parameter set (and each column is a parameter), and the command npm.repmat stacks reruns copies of that array (i.e. two copies of the array), so that we run each parameter set twice.

Note that by default, lhs will sample your parameters uniformly on [0,1]. This is fine for most of the parameters, but we will have to rescale this range to go from 200 to 800 for nr, the rabbit carrying capacity. We can do that as follows:

# Run the simulations!
for i in range(nsamples*reruns):
    # note we rescale the [0, 1] range of params[i, 0] to match the [200, 800] range for nr!
    nr = params[i, 0]*(800 - 200) + 200  # carrying capacity for rabbits
    dr = params[i, 1]  # death probability for rabbits when encountering a fox
    df = params[i, 2]  # death probability for foxes when no food is available
    rf = params[i, 3]  # reproduction probability for foxes when food is available

    # and the rest of your code goes here!

Run the model for each of the parameter sets, and plot histograms of the resulting final predator and prey population totals across all your samples (you don’t need to average across each of the two runs in this case, since there are only two of them—just treat the two runs as separate samples).
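For example, here is a minimal sketch of the histograms, assuming that inside the simulation loop above you appended each run's final counts to lists final_rabbits and final_foxes (these names are placeholders; use whatever you collected).

import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(final_rabbits, bins=20)
ax1.set_xlabel('final rabbit population')
ax1.set_ylabel('number of samples')
ax2.hist(final_foxes, bins=20)
ax2.set_xlabel('final fox population')
plt.show()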

Also generate a scatterplot of the final predator vs. final prey values across all samples (or you might try a 2d histogram using hist2d). What do you observe? What are the general types of behavior observed in the model?

Note that it may be useful to normalize the final prey population total to the carrying capacity parameter used for that run (i.e. calculate the final percent of rabbit carrying capacity for each simulation). This way you can more easily see when the rabbits stabilized at carrying capacity (otherwise the carrying capacities are different for each run).
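A sketch of the normalized scatterplot might look like the following; it again assumes the final_rabbits and final_foxes lists from the loop above, along with the params array, so that the nr column can be rescaled the same way it was inside the loop.

import numpy as np
import matplotlib.pyplot as plt

final_rabbits = np.array(final_rabbits)
final_foxes = np.array(final_foxes)
nr_used = params[:, 0]*(800 - 200) + 200   # the nr value used for each run
rabbit_fraction = final_rabbits/nr_used    # final rabbits as a fraction of carrying capacity

plt.scatter(rabbit_fraction, final_foxes, alpha=0.5)
plt.xlabel('final rabbits (fraction of carrying capacity)')
plt.ylabel('final foxes')
plt.show()

# or, as a 2D histogram:
# plt.hist2d(rabbit_fraction, final_foxes, bins=20)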

Some notes in case computation time is an issue:

The model can run extremely slowly on some parts of these parameter ranges for some machines. If you have problems with this, here are some edits I’d recommend:


Problem 1E) Visualizing parameter sensitivity. Finally, generate scatterplots for each of your parameters vs. each of the model outputs (i.e. final predator and final prey population sizes). This should generate 8 scatterplots total. What do you observe? What effect does each parameter have on the model dynamics? Which parameters is the model most/least sensitive to?

Also generate a heat map plot of your choice, plotting one of the model outputs as a function of two of the parameters (you may choose which). For the heatmap, you may find the pyplot function tricontourf convenient!
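One possible sketch of these plots is below. It assumes the params array and the final_rabbits / final_foxes outputs from Problem 1D, with the params columns in the order nr, dr, df, rf (and the nr column still on its raw 0-1 scale); adjust the names and which parameters you plot as you like.

import matplotlib.pyplot as plt

param_names = ['nr', 'dr', 'df', 'rf']
outputs = {'final rabbits': final_rabbits, 'final foxes': final_foxes}

# 2 outputs x 4 parameters = 8 scatterplots
fig, axes = plt.subplots(2, 4, figsize=(14, 6))
for row, (out_name, out_vals) in enumerate(outputs.items()):
    for col in range(4):
        axes[row, col].scatter(params[:, col], out_vals, s=5, alpha=0.5)
        axes[row, col].set_xlabel(param_names[col])
        axes[row, col].set_ylabel(out_name)
plt.tight_layout()
plt.show()

# heat map of one output as a function of two of the parameters (here dr and df)
plt.figure()
plt.tricontourf(params[:, 1], params[:, 2], final_foxes, levels=20)
plt.colorbar(label='final foxes')
plt.xlabel('dr')
plt.ylabel('df')
plt.show()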

Finally, comment on how the choice of model output may limit your ability to discern how the parameters affect the model dynamics. What alternative model outputs might you choose? You don’t have to implement them (though you’re welcome to!), just discuss.


Optional Problem 1F) Calculate correlation coefficients. (Note this problem is optional! Just for fun if you’re interested) From the scatterplots you generated in Problem 1E, evaluate a correlation measure of your choice between the parameters and the model outputs. This could be a linear trend slope, a correlation coefficient, etc. What do you observe? Does the correlation measure help to interpret the data?
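For example, here is a minimal sketch using Pearson's correlation coefficient (via numpy's corrcoef), again assuming the params array and the final_rabbits / final_foxes outputs from Problem 1D.

import numpy as np

param_names = ['nr', 'dr', 'df', 'rf']
for j, name in enumerate(param_names):
    # correlation between this parameter and each model output
    r_prey = np.corrcoef(params[:, j], final_rabbits)[0, 1]
    r_pred = np.corrcoef(params[:, j], final_foxes)[0, 1]
    print(name, ': r with final rabbits =', round(r_prey, 2),
          ', r with final foxes =', round(r_pred, 2))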



Problem 2: Project Proposal

For the second half of this lab, we’ll be working on more detailed project proposals, which we’ll share with the rest of the class. The project proposal is an opportunity for you to lay out a more detailed plan for your model (and the beginnings of a draft for your paper!), and to start to work out how to code up the model. I recommend you take a look at the PARTE framework and ODD protocol for some good guidelines for how to document your model.

For your project proposal, you may use the following proposal template based on the PARTE and ODD approaches. There is no particular page limit, but something on the order of 1-3 pages is likely about the right length, plus potentially code or pseudo-code depending on where you are in your project.