0.1 Aim

To compare the form of the function between distance to most closely related species and the probability of co-occurrence if the evolution of an occurrence trait had simultaneous effects of trait conservatism and repulsion among closely related species.

1 Simulate Phylogeny

Start with a simple balanced phylogeny with constructed branch lengths from 0 to 1. There needs to be a sufficient number of species to distinguish patterns from random, but not too many as to make the analysis huge and unwieldy. Let’s begin with 16 species.

Create Phylogeny
Simulate branch lengths
Compute cophenetic distance matrix

2 Simulation Parameters

Parameter	Description
tree	phylo object
scape.size	edge dimension of square landscape
g.center	strength of phylogenetic signal in species range centers
g.range	strength of phylogenetic signal in species range sizes
g.repulse	strength of phylogenetic repulsion
wd.all	niche width, larger values simulate broader range sizes
signal.center	simulate with phylosignal in range centers
signal.range	simulate with phylosignal in range size
same.range	make all range sizes equal
repulse	include phylogenetic repulsion in range centers
center.scale	adjust strength of phylogenetic attraction in range centers independent of g.center
range.scale	adjust strength of phylogenetic signal in range size independent of g.range
repulse.scale	adjust strength of phylogenetic repulsion independent of g.repulse
site.stoch.scale	adjust strength of random variation in species richness across sites
sd.center	sd in rnorm for the range centers, increase to get more variation in center values across species
sd.range	sd rnorm for the range sizes, increase to get more variation in range sizes across gradients
rho	Grafen branch adjustment of phylogenetic tree see corGrafen
th	probability threshold 10^-th above which species are considered present at a site

3 Paramater choices

For the simulations the scape size will be a 20*20 grid, with species able to occupy any part of the grid. This allows the species to have enough space to occupy their full range, without forcing overlap because not enough available cells. This value doesn’t really matter, its the ratio of grid size to range size. I’ve chosen a relatively small number because I’m iterating twenty times in the model and it creates a relatively large data frame (n=20,000).

For our purposes, we care most about the parameters g.center and g. repulse. The help functions reads:

The amount and type of structure is determened by the signal parameters g.center, g.range and g.repulse. Parameters are based on an Ornstein-Uhlenbeck model of evolution with stabilizing selection. Values of g=1 indicate no stabilizing selection and correspond to the Brownian motion model of evolution; **0<g<1 represents stabilizing selection**; and **g>1 corresponds to disruptive selection** where phylogenetic signal for the supplied tree is amplified. See corBlomberg. Communities are simulated along two gradients where the positions along those gradients, g.center and range sizes g.range, can exhibit phylogenetic signal. Phylogenetic attraction is simulated in the g.center parameter, while repulsion in g.repulse.

3.1 View function

## function (tree, scape.size = 10, g.center = 1, g.range = 1, g.repulse = 1, 
##     wd.all = 150, signal.center = TRUE, signal.range = TRUE, 
##     same.range = TRUE, repulse = TRUE, center.scale = 1, range.scale = 1, 
##     repulse.scale = 1, site.stoch.scale = 0.5, sd.center = 1, 
##     sd.range = 1, rho = NULL, th = 8) 
## {
##     if (!inherits(tree, "phylo")) 
##         stop("'tree' must be of class 'phylo'")
##     if (is.null(tree$edge.length)) 
##         stop("'tree' must have branch lengths")
##     if (!is.ultrametric(tree)) 
##         stop("'tree' must be ultrametric")
##     V <- vcv.phylo(tree, corr = TRUE)
##     Vinit <- V
##     nspp <- dim(V)[1]
##     bspp2 <- NULL
##     Vcomp <- NULL
##     Xscale <- 1
##     Mscale <- site.stoch.scale
##     Vscale1 <- center.scale
##     Vscale2 <- center.scale
##     if (!is.null(rho)) {
##         V <- 1 - (1 - Vinit)^rho
##         V <- V/max(V)
##     }
##     mx <- t(as.matrix((-(scape.size)/2):(scape.size/2)))
##     m <- length(mx)
##     if (signal.center) {
##         g <- abs(g.center)
##         V.a <- vcv(corBlomberg(g, tree), corr = T)
##         iD <- t(chol(V.a))
##     }
##     else {
##         V.a <- V
##         iD <- diag(nspp)
##     }
##     if (signal.range) {
##         g <- abs(g.range)
##         V.w <- vcv(corBlomberg(g, tree), corr = T)
##         iD.w <- t(chol(V.w))
##     }
##     else {
##         V.w <- V
##         iD.w <- diag(nspp)
##     }
##     e <- iD %*% rnorm(nspp, sd = sd.center)
##     e <- Vscale1 * (e - mean(e))/apply(e, 2, sd)
##     bspp1 <- e
##     if (!same.range) {
##         spmx <- t((array(1, c(nspp, 1))) %*% mx)
##         mxsp <- max(mx) * ((array(1, c(length(mx), 1))) %*% t(e))
##         wd <- range.scale * iD.w %*% rnorm(nspp, sd = sd.range)
##         wd <- wd + (abs(min(wd)))
##         wd <- wd/max(wd)
##         dif <- sort(wd)[-1] - sort(wd)[-length(wd)]
##         rat <- mean(dif/sort(wd)[-1])
##         wd[wd == 0] <- sort(wd)[2] - sort(wd)[2] * rat
##         wd <- wd.all * wd
##         X <- exp(-((spmx - mxsp)^2)/t(matrix(wd, nspp, m)))
##     }
##     else {
##         spmx <- t((array(1, c(nspp, 1))) %*% mx)
##         mxsp <- max(mx) * ((array(1, c(length(mx), 1))) %*% t(e))
##         X <- exp(-((spmx - mxsp)^2)/wd.all)
##     }
##     X <- Xscale * X
##     X1 <- diag(1 - Mscale * runif(m)) %*% X
##     e <- iD %*% rnorm(nspp, sd = sd.center)
##     e <- Vscale2 * (e - mean(e))/apply(e, 2, sd)
##     bspp2 <- e
##     if (!same.range) {
##         spmx <- t((array(1, c(nspp, 1))) %*% mx)
##         mxsp <- max(mx) * ((array(1, c(length(mx), 1))) %*% t(e))
##         wd <- range.scale * iD.w %*% rnorm(nspp, sd = sd.range)
##         wd <- wd + (abs(min(wd)))
##         wd <- wd/max(wd)
##         dif <- sort(wd)[-1] - sort(wd)[-length(wd)]
##         rat <- mean(dif/sort(wd)[-1])
##         wd[wd == 0] <- sort(wd)[2] - sort(wd)[2] * rat
##         wd <- wd.all * wd
##         X <- exp(-((spmx - mxsp)^2)/t(matrix(wd, nspp, m)))
##     }
##     else {
##         spmx <- t((array(1, c(nspp, 1))) %*% mx)
##         mxsp <- max(mx) * ((array(1, c(length(mx), 1))) %*% t(e))
##         X <- exp(-((spmx - mxsp)^2)/wd.all)
##     }
##     X <- Xscale * X
##     X2 <- diag(1 - Mscale * runif(m)) %*% X
##     X.repulse <- NULL
##     if (repulse) {
##         compscale <- repulse.scale
##         b0scale <- 0
##         g <- abs(g.repulse)
##         V.r <- vcv(corBlomberg(g, tree), corr = T)
##         Vcomp <- solve(V.r, diag(nspp))
##         Vcomp <- Vcomp/max(Vcomp)
##         Vcomp <- compscale * Vcomp
##         iDcomp <- t(chol(Vcomp))
##         colnames(Vcomp) <- rownames(Vcomp)
##         bcomp <- NULL
##         for (i in 1:m) {
##             bcomp <- cbind(bcomp, iDcomp %*% rnorm(nspp))
##         }
##         bcomp0 <- 0
##         Xcomp <- exp(bcomp0 + bcomp)/(1 + exp(bcomp0 + bcomp))
##         X1 <- X1 * t(Xcomp)
##         X2 <- X2 * t(Xcomp)
##         X.repulse <- t(Xcomp)
##     }
##     X. <- NULL
##     spp.Xs <- array(NA, dim = c(m, m, nspp))
##     for (i in 1:nspp) {
##         sppX <- matrix((X1[, i]) %*% t(X2[, i]))
##         spp.Xs[, , i] <- sppX
##         X. <- cbind(X., matrix(sppX))
##     }
##     m. <- dim(X.)[1]
##     Y <- matrix(0, ncol = nspp, nrow = m.)
##     Y[10^-th < X.] <- 1
##     colnames(Y) <- colnames(X.)
##     index <- NULL
##     index <- cbind(matrix(sapply(1:m, rep, times = m)), matrix(rep(1:m, 
##         times = m)))
##     colnames(index) <- c("X1", "X2")
##     rownames(Y) <- paste(index[, 1], index[, 2], sep = ".")
##     colnames(Y) <- tree$tip.label
##     cc <- comparative.comm(tree, Y)
##     return(list(cc = cc, Y = Y, X.joint = X., X1 = X1, X2 = X2, 
##         sppXs = spp.Xs, V.phylo = Vinit, V.phylo.rho = V, V.center = V.a, 
##         V.range = V.w, V.repulse = Vcomp, bspp1 = bspp1, bspp2 = bspp2, 
##         u = mx, wd = wd.all))
## }
## <environment: namespace:pez>

4 Visualizing the simulation

The easiest way to quickly tell the phylogenetic structure of a run would be to color each species range on a scale from red to blue. Let’s say i wanted to color by relatedness. One simple way to approximate this is to number the tips from 1 to n, and assume a divergent scale. This is not quite true, since the relationship is non linear, but its a good start.

**For each visualization, we have one clade 'blue' and one clade 'red'.**

5 Simulation 1: No Phylogenetic Signal Followed by Strong Repulsion in Occurrence Trait

We would expect red and blue clades to separate, but then fragment to create the checkerboard pattern of distribution.

Using a logistic model, what would the functional form be between probability of co-occurrence and phylogenetic distance to closest related species, if we knew that both niche conservatism and competition were acting simultaneously?

6 Simulation 2: Strong Phylogenetic Signal Followed by Strong Repulsion in Occurrence Trait

7 Simulation 3: No Phylogenetic Signal Followed by No Repulsion in Occurrence Trait

8 Simulation 4: Strong Phylogenetic Signal Followed by No Repulsion in Occurrence Trait

9 Comparison of functional form

10 Varying levels of repulsion and signal

One important question would be the sensitivity of the level of repulsion and range signal in creating our predicted pattern.

11 Sensitivity to Tree Shape and Size

The grid size needs to increase to accommodate more species, so we are not artificially packing them into a tight space.

11.1 8 Tips

11.1.1 Balanced

11.1.2 Ladder

Ladder trees show a slightly weaker pattern because there are fewer closely related species. Only a few species have short branch lengths between them.

11.2 16 Tips

As we increase the number of tips, the range size of any one species needs to increase to get potential overlap with all other species.

11.2.1 Balanced

11.2.2 Ladder

11.3 32 Tips

11.3.1 Balanced

11.3.2 Ladder

Increase the level of repulsion.

As species # increase, the level of repulsion and signal needs to increase to get the predicted pattern. Once we increase the signal and repulsion, we recover the polynomial pattern we see with smaller trees.

12 Conclusion

The polynomial pattern is robust to tree size, although the level of repulsion needs to be increased as there are more species in the tree. This is due to more time since most recent common ancestry. The pattern is not very robust to ladder trees. This is due to very few closely related species in a ladder tree. We would expect this pattern to be more possible in tippy trees that have many late branching taxa and a more even distribution of tip distances.

Relatedness and Co-occurrence Simulation

Ben Weinstein

Friday, Feb 13, 2015