# Continuous time, discrete event models

Recently, I’ve been exposed to situations where I am trying to model discrete, binary events (i.e., 0 or 1, like heads or tails). My knee-jerk response has been to use a logistic regression or another model with a binomial outcome. The jack-of-all-trades generalized linear model usually serves me well in these situations. However, my recent events have had continuous-time predictors. Although a Cox proportional hazards model can be used if the event is something like survival, this did not seem appropriate for my situation because I had multiple events occurring per individual. Enter continuous-time, discrete-event models.

A Poisson regression is similar to a binomial regression if the probability of an event occurring is small enough. Enter the Poisson regression as a method for modeling animal behavior. I first saw this in a mathematical statistics paper describing models of animal movement, but found another paper by some of the co-authors that was more accessible. From this, I learned I needed to use the following version of the Poisson regression:

y ~ Poisson(μ)

μ = τ exp( β x’).

I was able to program this in Stan by adapting code I found online. This model can also be modified to treat individuals as a random effect (and prevent pseudoreplication) if the data allow or require it.
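To make the model above concrete, here is a minimal sketch in Python with NumPy. It is not the Stan code used for the analysis. It assumes that τ is the length of time each observation covers (an exposure term), which the post does not state explicitly, and it fits β by maximum likelihood with Newton's method rather than by Bayesian sampling. The function name `fit_poisson_exposure` and the simulated data are hypothetical.

```python
import numpy as np


def fit_poisson_exposure(X, y, tau, n_iter=50, tol=1e-10):
    """Maximum-likelihood fit of y ~ Poisson(tau * exp(X @ beta)).

    Assumption: tau is the observation time for each row, so log(tau)
    enters the linear predictor as a fixed offset.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    log_tau = np.log(np.asarray(tau, dtype=float))
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(log_tau + X @ beta)      # expected counts
        score = X.T @ (y - mu)               # gradient of the log-likelihood
        info = X.T @ (X * mu[:, None])       # Fisher information matrix
        step = np.linalg.solve(info, score)  # Newton step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated example (all values are made up for illustration)
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])         # intercept + one predictor
tau = rng.uniform(0.5, 5.0, size=n)          # observation times
beta_true = np.array([-1.0, 0.5])
y = rng.poisson(tau * np.exp(X @ beta_true))

beta_hat = fit_poisson_exposure(X, y, tau)
print("true beta:     ", beta_true)
print("estimated beta:", beta_hat)
```

With enough data the estimates should land close to the simulated coefficients. A random effect for individuals would need a mixed-model or Bayesian fit, which this sketch does not attempt.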

# The Population Dynamics of Disturbance Specialist Plant Populations

In my second post for Quantitative Dynamics I’m going to discuss a topic that I have studied since my graduate work at the University of Nebraska. In 2010 Brigitte Tenhumberg, Richard Rebarber, Diana Pilson and I embarked on a journey studying the long-term, stochastic dynamics of wild sunflower, a disturbance specialist plant population that uses a seed bank to buffer against the randomness of disturbances.

Because the seeds of disturbance specialist plants cannot germinate without a soil disturbance, there are many periods of time for which these populations will have zero or few above-ground plants, and hence no new members of the population from one season to the next. As such, much like a freelance worker with uncertain pay, a seed bank (account) is necessary for long-term viability.

In our work (which you can find here, here and here) we created an integral projection model with stochasticity modeling 1) the presence of a disturbance and 2) the depth of a disturbance. We found through mathematical analyses and simulations that the presence of disturbances increased population viability (as you would expect), but the intensity, depth and autocorrelation of disturbances had a different effect on populations depending on their viability. For populations that were viable, increasingly intense and positively-autocorrelated disturbances enhanced long-term population sizes, whereas when populations were near extinction levels both dynamics were actually harmful to population viability. These results were novel and surprising. You can find my blog post on the topic in The American Naturalist as well.

In subsequent work we would like to study transient dynamics of such systems. Transient dynamics, to this point, have not garnered the attention of long-term dynamics in stochastic systems. However, my friend Iain Stott and colleagues have gotten the ball rolling in that direction, and it’s only a matter of time.

# Trend analysis from aggregate data

Often, people collect data through time with replication. For example, the LTRM program collects fish, aquatic vegetation, and water quality data through time, with multiple samples collected each year. However, these observations are not independent, and failing to account for this would be pseudoreplication. Aggregating the data within a year (e.g., taking the mean) is one way to prevent pseudoreplication, but it comes with the trade-off of losing information about the raw data. State-space models may be a method to recover this information.
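The aggregation step can be sketched in a few lines of Python with NumPy. This is only an illustration on simulated data, not the LTRM data or the analysis from the publication. The number of years, the number of replicates, the noise levels, and the trend are all made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical design: 20 years with 8 replicate samples per year
years = np.arange(2000, 2020)
n_reps = 8
true_slope = 0.3

# Replicates within a year share a year-level deviation,
# which is why they are not independent observations.
year_effect = rng.normal(0.0, 0.5, size=years.size)
year_col = np.repeat(years, n_reps)
values = (10.0
          + true_slope * (year_col - years[0])
          + np.repeat(year_effect, n_reps)
          + rng.normal(0.0, 1.0, size=year_col.size))

# Aggregate: one mean per year, so each year contributes a single data point
yearly_mean = np.array([values[year_col == yr].mean() for yr in years])

# Simple linear trend fitted to the aggregated series
slope, intercept = np.polyfit(years - years[0], yearly_mean, deg=1)
print(f"estimated trend: {slope:.3f} per year (simulated truth: {true_slope})")
```

The regression now sees 20 points instead of 160, which avoids treating replicates as independent but discards the within-year spread.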

State-space models describe a true but unknown and unmeasurable “state” (e.g., the “true” population of catfish in the Upper Mississippi River) and the observation error associated with collecting the data. Kalman filters can be used to fit these models, for example with the MARSS package in R.

We were interested in comparing state-space models from the MARSS package to other methods such as simple linear regression and auto-regressive models (publication here). Using simulated data and observed data from the LTRM, we found that the simpler models performed better than the state-space models, likely because the LTRM time series were not long enough for the state-space models.

# Integral projection models

Matrix population models describe populations as discrete life-, size-, or age-stages. Scientists apply these models to understand population ecology and guide conservation. However, some species have continuous life histories. For example, thistles grow continuously as presented within this paper.

Fish also grow continuously. We sought to understand how different management approaches could be used to control grass carp. This species impacts native ecosystems by out-competing native fish. Managers were interested in evaluating the use of YY-males to control populations. YY-males work because they spawn and only produce male offspring. Thus, it is possible in theory to cause a population to crash by biasing the sex ratio.

We constructed an integral projection model for grass carp and compared different YY-male release methods. We found the life history of grass carp does not work well with the YY-male strategy because the species is long-lived and females produce many offspring.

# An Introduction to Galton-Watson Processes

Howdy! I’m Eric Eager, and I’m an associate professor of mathematical biology at the University of Wisconsin – La Crosse.  I’m also a data scientist for Pro Football Focus and Orca Pacific.  In my first post for Quantitative Dynamics, I’m going to discuss a topic near and dear to my heart: Branching processes (thanks Sebastian Schreiber for teaching me these five years ago).

Branching processes are a great bridge between the continuous-space population models that permeate the ecological literature (e.g. Caswell 2001, Ellner, Childs and Rees 2016) and the individual-based realities that drive ecological systems (Railsback and Grimm 2011). All branching process models specify an absorbing state (usually extinction in ecology) and model the probability of reaching the absorbing state by creating an iterative map from one generation to the next. This allows you to work with a model whose space is in a set of discrete values (individual-based), but with a resulting model that’s a difference equation (traditional ecological models).

The most famous example of a branching process is the Galton-Watson process. Francis Galton was concerned about the eventual fate of surnames (a quaint artifact of the past), especially among the aristocracy. Below are a couple of videos I made, one deriving the Galton-Watson process and one solving it. Enjoy!
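For readers who want something to run, here is a minimal Python sketch of the "iterative map" idea for a Galton-Watson process. It uses the standard result that the probability of extinction by generation n, starting from one individual, is obtained by applying the offspring probability generating function n times starting from 0. The offspring distributions below are made-up examples, and the function names are hypothetical.

```python
def pgf(s, offspring_probs):
    """Probability generating function f(s) = sum_k p_k * s**k."""
    return sum(p * s**k for k, p in enumerate(offspring_probs))


def extinction_probability(offspring_probs, n_generations=200):
    """Iterate q_{n+1} = f(q_n) from q_0 = 0.

    q_n is the probability that a lineage started by one individual
    is extinct by generation n; the limit is the extinction probability.
    """
    q = 0.0
    for _ in range(n_generations):
        q = pgf(q, offspring_probs)
    return q


def mean_offspring(offspring_probs):
    return sum(k * p for k, p in enumerate(offspring_probs))


# Hypothetical offspring distributions: P(0), P(1), P(2) offspring
supercritical = [0.25, 0.25, 0.50]   # mean 1.25 offspring per individual
subcritical = [0.50, 0.25, 0.25]     # mean 0.75 offspring per individual

q_super = extinction_probability(supercritical)
q_sub = extinction_probability(subcritical)

print(f"mean {mean_offspring(supercritical):.2f}: extinction probability {q_super:.4f}")
print(f"mean {mean_offspring(subcritical):.2f}: extinction probability {q_sub:.4f}")
```

For the first distribution the fixed points of f(s) = 0.25 + 0.25 s + 0.5 s² are s = 0.5 and s = 1, and the iteration settles on the smaller one. When the mean number of offspring is at most 1, the only fixed point in [0, 1] is s = 1 and extinction is certain.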

# Review of Deep Work

I recently read the book Deep Work by Cal Newport. In a short sentence, the book is anti-multitasking. More specifically, the book discusses the importance of focusing and concentrating in order to work deeply on one topic and do it well.

I like the book, probably because it meets my preconceived notions about how to be effective. However, the book did provide new ideas for me so I decided to implement some of the book’s suggestions at work. Specifically, I scheduled my time and focused on one project at a time. By doing this, I got a great deal done. I suspect this was for 3 reasons:

1. First, I was being intentional about being productive;

I plan on adding this book to my “re-read every couple of years” shelf and I highly recommend it. I’ve also become aware of Cal’s blog because of the book and recommend that as well (Cal runs a life hacks blog on working more efficiently). I particularly like his workflow re-engineering post. Maybe I’ll post an update on what works for me in a future post.

# tikz in LaTeX and Structural Equation Modeling

During grad school, I attended an ESA Workshop on Structural Equation Modeling (SEM) led by Jim Grace. The approach allows for multivariate analysis with multiple predictors, multiple response variables, and latent variables. Up until now, my research never required the method, and I never bought the GUI program Grace recommended at the time because it was too expensive for my limited needs.

Recently, I had a need to use SEM at work. We had two response variables: environmental DNA (eDNA) and the ash-free dry weight of an aquatic organism (AFDW). Both were predicted by multiple environmental variables and AFDW predicted eDNA. A perfect problem for SEM.

To refresh my memory of SEM, I revisited Grace’s work. I discovered that he maintains an excellent tutorial about SEM. The pages provide a nice introduction, as does his (slightly outdated) book, his classic book, and a recent Ecosphere article.

However, I did not have a nice way to plot my results. I did not want to use a WYSIWYG tool like Inkscape or PowerPoint. But I remembered the tikz package in LaTeX. Here’s the figure I created:

Example SEM plot.

I created the figure using this LaTeX code:

\documentclass{article}

\usepackage[paperheight =11.3cm, paperwidth =9.5cm, margin = 0.1cm]{geometry}

\usepackage{tikz}
\usetikzlibrary{arrows}
\usetikzlibrary{positioning}

\begin{document}

\pagenumbering{gobble}

\begin{tikzpicture}[ -> , >=stealth',auto,node distance=3.5cm,
thick,main node/.style={rectangle,draw, font=\sffamily}]

\node[main node] (1) {Lake};
\node[main node] (2) [below of=1] {Depth};
\node[main node] (3) [below of=2] {Non-habitat};
\node[main node] (4) [below of=3] {Habitat};

\node[main node] (6) [below right of=2, align = center] {AFDW\\ $r^2 = 0.223$};
\node[main node] (7) [right of=6, align = center] {eDNA\\ $r^2 = 0.384$};

\path[every node/.style={font=\sffamily\small}]
(1) edge node [above = 40pt] {\textbf{0.497}} (6)
(2) edge node [left = 10pt] {\textbf{-0.370}} (6)
(3) edge node [above] {0.094} (6)
(4) edge node [left = 10pt] {0.116} (6)

(1) edge[bend left] node [above = 10 pt] {\textbf{0.385}} (7)
(2) edge[bend left] node [above = 5pt ] {0.197} (7)
(3) edge[bend right] node [above = 0pt] {-0.298} (7)
(4) edge[bend right] node [below = 5pt] {0.204} (7)

(6) edge node [ ] {-0.180} (7);

\end{tikzpicture}

\end{document}

# Using RcppArmadillo for a matrix population model

I’ve had a busy winter and spring with training for the Birkie and then recovering from the Birkie, hence the few posts. One of the things I’ve been doing is teaching myself the Rcpp package in R. This package lets people easily use C++ code within R. In this post, I demonstrate how I used Rcpp and specifically the RcppArmadillo package to create a population model.

Matrix models are popular in ecology. These models are a series of difference equations (i.e., discrete time). I was interested in coding a simple example for a two life-stage species with the projection matrix A, whose rows are (0, 2) and (0.25, 0.5).
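Written out, the difference equation and its first step look like this. The matrix and starting vector are the values used in the code in this post, and the symbols P_t and A follow the variable names in that code.

```latex
\begin{align*}
P_{t+1} &= A P_t, \qquad
A = \begin{pmatrix} 0 & 2 \\ 0.25 & 0.5 \end{pmatrix}, \qquad
P_0 = \begin{pmatrix} 0 \\ 50 \end{pmatrix}, \\
P_1 &= A P_0
     = \begin{pmatrix} 0 \cdot 0 + 2 \cdot 50 \\ 0.25 \cdot 0 + 0.5 \cdot 50 \end{pmatrix}
     = \begin{pmatrix} 100 \\ 25 \end{pmatrix}.
\end{align*}
```

Each later year repeats the same multiplication on the previous year's vector.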

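To code this up in R, I use the following code:

# Project a population forward nYears using projection matrix A
popModelR <- function(nYears,
                      A = matrix(c(0, 2, 0.25, 0.5), nrow = 2, byrow = TRUE),
                      P0 = c(0, 50)){
  # One column per year, plus the initial year
  P = matrix(0, nrow = dim(A)[1], ncol = nYears + 1)
  P[, 1] = P0
  for( t in 1:nYears){
    P[ , t + 1] = A  %*% P[, t]
  }
  return(P)
}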

nYears = 10
A = matrix(c(0, 2, 0.25, 0.5),
nrow = 2, byrow = TRUE)
P0 = matrix(c(0, 50), nrow = 2)
P0
popModelROut <- popModelR(nYears = 10,
A = A,
P0 = P0)

Obviously, this simple model runs fast, but how would one code this with Rcpp? In order to get matrix operators, I needed to use the RcppArmadillo package, so my code looks like this:

library("inline")
library("Rcpp")

# C++ body of the function; the RcppArmadillo plugin supplies the Armadillo headers
src1 <- '
int nYearsX = Rcpp::as<int>(nYears);
arma::mat P0X = Rcpp::as<arma::mat>(P0);
arma::mat AX  = Rcpp::as<arma::mat>(A);
arma::mat PX(AX.n_cols, nYearsX + 1);
PX.col(0) = P0X;

for(int t = 0; t < nYearsX; t++) {
PX.col(t + 1) =  AX * PX.col(t);
}

return Rcpp::wrap(PX);
'

popModelRcpp <- cxxfunction(signature(nYears = "integer",
A = "matrix",
P0 = "matrix"),
body = src1, plugin = "RcppArmadillo")

popModelRcpp(nYears, A, P0)

Now, to compare the two functions, I use the rbenchmark package and run the model for 100 simulated years:

library(rbenchmark)
nYears = 100
res <- benchmark(popModelRcpp(nYears, A, P0),
popModelR(nYears, A, P0),
columns = c("test", "replications", "elapsed",
"relative", "user.self", "sys.self"),
order = "relative",
replications = 1000)
print(res)

test replications elapsed relative user.self sys.self
1 popModelRcpp(nYears, A, P0) 1000 0.02 1.00 0.00 0.00
2 popModelR(nYears, A, P0) 1000 0.53 29.61 0.64 0.00

The Rcpp code is almost 30 times quicker than the base code in R!

# Review of “How Not to be Wrong: The Power of Mathematical Thinking”

I just got done reading “How Not to be Wrong: The Power of Mathematical Thinking” by Jordan Ellenberg. My younger brother lent me his copy. In a sentence, the book can be summed up by this phrase from the book: “Mathematics is the extension of common sense by other means.” The book does a great job of explaining how mathematics and statistics can be used to understand the world around us. The book is filled with many good examples, such as Wald’s WWII analysis of where to place extra armor on planes. Wald was given data on where planes got shot and asked where extra armor should be placed (answer: the places without the holes!).

The only downside to the book is that it can become long and drawn out at times. Also, I was familiar with many of the examples and had seen them before. Finally, I would add that the book is “math lite”, which is a strength for many potential readers.

# 6 tips for a new LaTeX user

Recently a coworker started using LaTeX and asked for some tips. Here are my 6 tips for starting to use LaTeX:
1. Finish what you start (i.e., close environments or else you get errors or weird bugs), for example, an opening $ needs a closing $
2. Every document needs 3 things: \documentclass{<class>}, \begin{document}, and \end{document}
3. For equations, use $ ... $ for inline math and \begin{eqnarray} ... \end{eqnarray} for displayed equations. \\ creates a new line. Use \\ \nonumber to continue an equation on an unnumbered line, and &=& to align multiple equations, for example:
<code>
\begin{eqnarray}
a &=& b +c \\
a & =& b \\ \nonumber
& & c
\end{eqnarray}
</code>
Gives you something like:
a = b + c   (1)
a = b       (2)
    c

4. Bib files are your friend for citations. Use Google Scholar to populate new citations.
5. \textit{My italics text} and \textbf{my bold text} should cover most of your formatting. Do NOT use the deprecated {\bf bold} or {\it italics} style. (cf http://tex.stackexchange.com/questions/41681/correct-way-to-bold-italicize-text for more details on the second point)
6. {} can be very helpful, especially for complicated math functions, when order of operation is important. For example, \sigma_{\pi^2}^{2}