Auk and Seal: Bayesian Mark-Recapture: Parsimony and Pollock's Closed Robust Design

At the Bienniel Marine Mammal Conference 2016, I presented a Hierachical Bayesian version of Pollock's Closed Robust Design (see presentation here). This was motivated by some bad behaviour of Program Mark in estimating detection probabilities and migration parameters for "sparse data" -- I say sparse, but for my sample sizes with dolphins, failure is all too common. In both real data, and through an extensive simulation exercise, I found that Mark was highly prone to "boundary-level" MLEs for survival and migration and sometimes detection probabilities, leading to erratic abundance estimates.

Interestingly, mildy weak Bayesian priors (Beta(1,1)) help smooth over this bad behaviour ("Objective" priors do not).

The image to the right shows simulations of the PCRD, comparing Maximum Likelihood estimation versus a Bayesian model... notice the infinite CI's and boundary level MLEs for Program MARK! (See the full article link below)

Hierarchical Bayesian PCRD
The world is going Bayesian, but few people understand why. Here, the true power of Bayesian Inference is revealed in a hierarchical specification for the survival, detection and migration parameters: in particular, we shrink between "time-invariant theta(.)" models and "time-varying theta(t)" models. I found that applying a particular type of shrinkage-inducing hyperprior, called the "Scaled Half Student-t distribution", lead to somewhat similar estimates as "model-averaging" over many MLE-based fixed effect models (run in Mark).

Hierarchical Bayes provides a type of automatic "Occam's Razor" (Berger 2004), in that when there is sparse/weak data, the hyprprior shrinks the model to a simple model, but when there is a lot of evidence in the data or a large sample size, the hyperprior is easily overwhelmed and the model takes on a lot of complexity. Reassuringly, the final estimates seem insensitive to exact hyper-parameter values. As shown in the following animation...

I decided to upload an R & JAGS tutorial on github: https://github.com/faraway1nspace/PCRD_JAGS_demo Users can run simple fixed effects Bayesian PCRD models, or they can run a Hierarchical Bayesian version that includes: i) Individual Heterogeneity in detection probabilities; and ii) recruitment processes; and iii) uses a hyperprior to apply a type of "multimodel" inference. All things seem to unify in HB.

JAGS code: Fixed effects Bayesian PCRD
Here JAGS code to run the simple fixed model of the PCRD: phi(.)gamma'(.)gamma''(t)p(t,s). This is just for interested readers. Serious users should consult the github demonstration
. This implements the PCRD as a Bayesian HMM. Anyone who uses this JAGS code should cite: "Rankin RW, Nicholson K, Allen SJ, Krützen M, Bejder L, Pollock KH. 2016. A full-capture Hierarchical Bayesian model of Pollock's Closed Robust Design and application to dolphins. Frontiers in Marine Science, in Press. DOI: 10.3389/fmars.2016.00025". The article is available here


model{

  # priors

  phi ~ dbeta(1,1) # apparent survival probability (adjust for ecological 

  gamma1 ~ dbeta(1.2,1,2) #  temp migration: probabilty stays out conditional on being out

  # loop through primary periods: parameters for detection probability pd_t, and gamma2_t

  for(t_ in 1:T){ # T primary periods...

     pd_a[t_] ~ dgamma(3,2) # hyperprior on detect probs

     pd_b[t_] ~ dgamma(3,2) # hyperprior on  detect probs

     for(tt_ in 1:T2[t_]){ # loop through secondary periods...

        pd[t_,tt_] ~ dbeta(pd_a[t_],pd_b[t_]) # secondard-period-level detection probs

     }

     gamma2[t_] ~ dbeta(1,1) # temp migration: prob migrates outside conditional on being inside

     # recruitment process from 'eigenvector decomposition'

     lambda[1,t_] <- (1-gamma1)/(gamma2[t_]-gamma1+1) # recruitment ratio, or long-term prob of being inside

     lambda[2,t_] <- 1-lambda[1,t_] # 

     psi[t_] ~ dunif(0,1) # inclusion probability

     # trmat: transition matrix for Markovian latent-states

     # 1 =not yet in population;2=dead;3=offsite;4=onsite (only observable state)

     # transition are from the column --> rows

     # trmat[row,column,time] = [state at time=t_; state at time t_-1; time=t_]

     trmat[1,1,t_] <- 1-psi[t_] # excluded from pop

     trmat[2,1,t_] <- 0 # dead

     trmat[3,1,t_] <- psi[t_]*lambda[2,t_] # inclusion into pop, outside study are

     trmat[4,1,t_] <- psi[t_]*lambda[1,t_] # inclusion into pop, inside study area

     trmat[1,2,t_]<- 0

     trmat[2,2,t_]<- 1 # stay dead

     trmat[3,2,t_]<- 0

     trmat[4,2,t_]<- 0

     trmat[1,3,t_]<- 0

     trmat[2,3,t_]<- 1-phi # dies outside

     trmat[3,3,t_]<- gamma1*phi # stays outside | outside

     trmat[4,3,t_]<- (1-gamma1)*phi # reenters study area | outside

     trmat[1,4,t_]<- 0 # 

     trmat[2,4,t_]<- 1-phi # dies inside

     trmat[3,4,t_]<- gamma2[t_]*phi # leaves study area | inside

     trmat[4,4,t_]<- (1-gamma2[t_])*phi # stays inside | inside

  } # t_ 

  # likelihood: loop through M individuals, both real and pseudo-individuals

  for (i in 1:M){ 

    #draw latent state at primary period 1: 

    # ... by definition, everyone starts in z=1 (not-in-population) at time=0 

    z[i,1]~ dcat(trmat[1:4,1,1]) # first z strictly excluded from pop

    # likelihood for first primary period

    for(tt_ in 1:T2[1]){ # loop through secondary periods

         # Bernouilli process, conditional on z=4, otherwise no observation

         y[i,tt_,1] ~ dbern(pd[1,tt_]*equals(z[i,1],4)) 

    }

    alive_i[i,1] <- step(z[i,1]-3) # count if i is alive or not

    Nin_i[i,1] <- equals(z[i,1],4) # count if i is within study area

    # loop through primary periods after 1st primary periods

    for(t_ in 2:T){

      # state process: draw z(t_) conditional on z(t_-1)

      z[i,t_] ~ dcat(trmat[1:4, z[i,t_-1] , t_])

      # likelihood: loop through secondary period observations

      for(tt_ in 1:T2[t_]){

         # Bernouilli observation error, conditional on z=4

         y[i,tt_,t_] ~ dbern(pd[t_,tt_]*equals(z[i,t_],4)) 

      }

      # tally population size 

      alive_i[i,t_] <- step(z[i,t_]-3) # check alive or not

      Nin_i[i,t_] <- equals(z[i,t_],4) # count if i is within study area

    } # t_

  } # i

  # estimate population size per primary periods

  for(t_ in 1:T){

     alive[t_] <- sum(alive_i[,t_]) # number alive

     Nin[t_] <- sum(Nin_i[,t_]) # number in study area

  } # t_

}

JAGS code: Hierarchical Bayesian PCRD
Here is JAGS code for a Hierarchical Bayesian Model, phi(t)gamma'(t)gamma''(t)p(t,s). The Hierarchical Part uses 'hyperpriors' to deliberately shrink time-varying specifications, like phi(t), to a simpler time-invariant specifications, phi(.). The result is a type of "model parsimony", with a similar type of smoothing as "model averaging". The following is just a visual demo for the casual reader; serious users should refer to the github page and download the full R code. Anyone who uses this JAGS code should cite: "Rankin RW, Nicholson K, Allen SJ, Krützen M, Bejder L, Pollock KH. 2016. A full-capture Hierarchical Bayesian model of Pollock's Closed Robust Design and application to dolphins. Frontiers in Marine Science, in Press. DOI:10.3389/fmars.2016.00025". The article is available here


model{

  #hyperpriors: logit(theta_mu)~Normal and theta_sd ~ half-t

  phi.mu ~ dunif(pr.phiunif[1],pr.phiunif[2]) #mean survival with a Uniform prior

  sigma.phi ~ dt(0,pr.tauphi[1],pr.tauphi[2]) T(0,) #mean survival dispersion, half-t hyperprior

  g1.mu ~ dnorm(0,pr.g1mu) #mean gamma1, temp migration out | out

  sigma.g1~dt(0,pr.taug1[1],pr.taug1[2]) T(0,) #mean gamma1 dispersion, half-t hyperprior

  g2.mu ~ dnorm(0,pr.g2mu) #mean gamma2,  temp migration out | in

  sigma.g2~dt(0,pr.taug2[1],pr.taug2[2]) T(0,) #mean gamma2 dispersion, half-t hyperprior

  pd.mu ~ dnorm(0,pr.pdmu) #mean detection prob, overall 

  sigma.pdmu~dt(0,pr.taupdmu[1],pr.taupdmu[2]) T(0,) #primary period detection prob dispersion

  sigma.pd2nd~dt(0,pr.taupd2nd[1],pr.taupd2nd[2]) T(0,) #secondary periods detection prob dispersion 

  sigma.eps ~ dt(0,pr.taueps[1],pr.taueps[2]) T(0,) #individual detection prob dispersion

  #time-variant parameters 

  for(t_ in 1:T){ #loop through primary periods

    pd_mu[t_]~dnorm(pd.mu,pow(sigma.pdmu,-2)) #primary period mean detaction prob (logit)

    lgamma1[t_]~dnorm(g1.mu,pow(sigma.g1,-2)) #prob migrate out|out (logit)

    gamma1[t_] <- ilogit(lgamma1[t_]) #prob migrate out|out (probability)

    lgamma2[t_]~dnorm(g2.mu,pow(sigma.g2,-2)) #prob migrate out|in (logit)

    gamma2[t_] <- ilogit(lgamma2[t_]) #prob migrate out|in (probability)

    #RECRUITMENT: psi is the 'inclusion probability' and lambda is the 'recruitment ratio'

    psi[t_]~dunif(0,1) #inclusion probability

    lambda[t_] <- (1-gamma1[t_])/(gamma2[t_]-gamma1[t_]+1) #long-term probability inside study area

    #NOTE, lambda could also be a random variable with a beta prior

    #secondary-period detection probabilities

    for(tt_ in 1:T2[t_]){ #loop through secondary periods

      pd[t_,tt_] ~ dnorm(pd_mu[t_],pow(sigma.pd2nd,-2))

    } #tt_

  } #t_

  #first state transition (pure nusance; strictly from outside-pop to part of marked-pop)

  trmat0[1] <- (1-psi[1]) #remains not-yet-in-pop

  trmat0[2] <- 0

  trmat0[3] <- psi[1]*(1-lambda[1]) #inclusion into pop, goes outside study are

  trmat0[4] <- psi[1]*lambda[1] #inclusion into pop, goes inside study

  #state transitions (2:T)

  for(t_ in 1:(T-1)){

    lphi[t_]~dnorm(log(phi.mu/(1-phi.mu)), pow(sigma.phi,-2)) #survival prob (logit)

    phi[t_]<-ilogit(lphi[t_])

    #state transitions 

     #trmat: transition matrix for Markovian latent-states

     #1 =not yet in population; 2=dead;3=offsite;4=onsite (only observable state)

     #transition are from the column --> rows

     #trmat[row,column,time] = [state at time=t_; state at time t_-1; time=t_]

     #notice that the primary periods are offset by 1 (because we already dealt with T=1)

     trmat[1,1,t_]<- 1-psi[t_+1] #excluded from pop

     trmat[2,1,t_] <- 0 #dead

     trmat[3,1,t_] <- psi[t_+1]*(1-lambda[t_+1]) #inclusion into pop,outside study

     trmat[4,1,t_] <- psi[t_+1]*lambda[t_+1] #inclusion into pop,inside study

     trmat[1,2,t_]<- 0

     trmat[2,2,t_]<- 1 #stay dead

     trmat[3,2,t_]<- 0

     trmat[4,2,t_]<- 0

     trmat[1,3,t_]<- 0

     trmat[2,3,t_]<- 1-phi[t_] #dies outside

     trmat[3,3,t_]<- gamma1[t_+1]*phi[t_] #stays outside | outside

     trmat[4,3,t_]<- (1-gamma1[t_+1])*phi[t_] #reenters study area | outside

     trmat[1,4,t_]<- 0 #

     trmat[2,4,t_]<- 1-phi[t_] #dies inside

     trmat[3,4,t_]<- gamma2[t_+1]*phi[t_] #leaves study area | inside

     trmat[4,4,t_]<- (1-gamma2[t_+1])*phi[t_] #stays inside | inside

  } #t_

  #loop through M individuals

  for (i in 1:M){

    #state transitions and likelihood for the first primary period

    z[i,1]~ dcat(trmat0) #z at time 0 is strictly 'not-yet-in-pop'

    alive_i[i,1] <- step(z[i,1]-3) #count if i is alive or not

    Nin_i[i,1] <- equals(z[i,1],4) #count if i is within study area

    eps_i[i] ~ dnorm(0,pow(sigma.eps,-2)) #random effects at individual levels

    #Observation error y[i,tt_,t_] ~ Bernoulli conditional on being inside z=4

    for(tt_ in 1:T2[1]){  #loop through secondary periods

       y[i,tt_,1] ~ dbern(equals(z[i,1],4)/(1+exp(-pd[1,tt_]-eps_i[i]))) #inverse-logit transform

    }

    #state transition and likelihood for primary periods 2:T

    for(t_ in 2:T){ 

      #State process: draw z(t_) conditional on  z(t_-1)

      z[i,t_] ~ dcat(trmat[1:4, z[i,t_-1] , t_-1])

      #Observation error y[i,tt_,t_] ~ Bernoulli condition on being inside z=4

      for(tt_ in 1:T2[t_]){ #loop through secondary periods

        y[i,tt_,t_] ~ dbern(equals(z[i,t_],4)/(1+exp(-pd[t_,tt_]-eps_[i]))) #inverse-logit transform

      }

      #check whether i individual is alive and inside

      alive_i[i,t_] <- step(z[i,t_]-3) #check alive

      Nin_i[i,t_] <- equals(z[i,t_],4) #count if i is within study area

    } #t_

  } #i

  #tally population size

  for(t_ in 1:T){

     alive[t_] <- sum(alive_i[,t_]) #number alive

     Nin[t_] <- sum(Nin_i[,t_]) #number in study area

  } #t_

} #end model

Tuesday, 23 February 2016

Bayesian Mark-Recapture: Parsimony and Pollock's Closed Robust Design

No comments:

Post a Comment