Introduction to Generalized linear mixed models?

Harnessing non-linearity and random effects

Julien Martin

BIO 8940 - Lecture 8

2024-09-19

Questions after reading Bolker et al 2009

Difference between fixed and random effects
When to transform data?
- if you have a funky looking distribution of continuous data, is it always ok to transform to achieve normality if you don’t violate any test assumptions?
Walk through Figure 1?
Get rid of non-significant fixed effects?
- If important for my hypothesis, should I always keep them?
- What if I have a fairly small dataset?
How to choose a link function? Why not use the default?
Can we go through the example in Box 1?

GLMM: What are they?

Gacha Life Mini Movie

Video game allowing you to dress up anime-style characters

Generalized linear mixed model

An extension of the generalized linear model and of the linear mixed model

A GLMM expresses the transformed conditional expectation of the dependent variable y as a linear combination of the regression variables X, plus random effects

The model has 3 components

a structural component or additive expression: \(\beta_0 + \beta_1 X_1 + ... + \beta_k X_k\)
a link function: \(g(\mu)\)
a response distribution: Gaussian, binomial, Bernoulli, Poisson, negative binomial, zero-inflated …, zero-truncated …, …

\[ g(\mu_i) = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k \]

and

\[ \mu_i = E(y_i | x_i) = g^{-1}(\beta_0 + \beta_1 X_1 + ... + \beta_k X_k) \]
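To make the link and its inverse concrete, here is a minimal base R sketch for the logit case. The coefficients and covariate values are hypothetical and chosen only for illustration, and the sketch covers the fixed part of the model only (no random effects).

```r
# Logit link and its inverse, taken from the binomial family object in base R
fam <- binomial(link = "logit")

# Hypothetical coefficients and covariate values (illustration only)
beta0 <- -1
beta1 <- 0.5
x <- c(-2, 0, 2, 4)

eta <- beta0 + beta1 * x   # linear predictor, unbounded
mu <- fam$linkinv(eta)     # expected value, inside (0, 1)

# Applying the link to mu recovers the linear predictor
all.equal(fam$linkfun(mu), eta)
```

The same pattern should work for other families, e.g. poisson(link = "log"), since each family object carries its own linkfun and linkinv.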

How do you fit them?

In R:

glmer() from lme4 📦: same as lmer() but with a family argument
glmmPQL() from MASS 📦 (based on lme())
glmmadmb() from glmmADMB 📦: works well and is flexible, but beware
glmmTMB() from glmmTMB 📦: works well and is flexible, but beware
asreml() from asreml 📦: great but not free
MCMCglmm() from MCMCglmm 📦: great but Bayesian
Choose your Bayesian flavor 📦:
- Stan: brms, rethinking, rstan, …
- BUGS: runjags, rjags, …
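As a starting point, a binomial GLMM with glmer() could look like the sketch below. It assumes lme4 is installed and uses the cbpp dataset that ships with it. The model structure (period as a fixed effect, herd as a random intercept) is just one reasonable choice for illustration, not a recommendation.

```r
library(lme4)

# cbpp ships with lme4: incidence of a cattle disease per herd and period
data(cbpp)

# Binomial GLMM with a random intercept for herd
fit <- glmer(
  cbind(incidence, size - incidence) ~ period + (1 | herd),
  data = cbpp,
  family = binomial
)

summary(fit)
fixef(fit)     # fixed effects on the logit (latent) scale
VarCorr(fit)   # among-herd variance on the logit scale
```

The only differences from an lmer() call are the family argument and, for binomial counts, the two-column response.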

Model assumptions

Easy answer: none, or really few
More advanced answer: I am not sure, it is complicated
Just check residuals as usual

Technically only 3 assumptions:
- variance is a function of the mean, specific to the distribution used
- observations are independent (conditional on the random effects)
- linear relation on the latent scale

Warning

Generalized Linear Models do not care if the residual errors are normally distributed as long as the specified mean-variance relationship is satisfied by the data
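One rough way to check the mean-variance relationship is the Pearson dispersion statistic. The sketch below uses a plain Poisson GLM on simulated data to keep things short. The sample size, coefficients, and seed are arbitrary assumptions.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Simulated Poisson counts with a log link (hypothetical coefficients)
n <- 500
x <- rnorm(n)
y <- rpois(n, lambda = exp(0.5 + 0.3 * x))

fit <- glm(y ~ x, family = poisson)

# Pearson dispersion statistic: close to 1 when variance equals the mean
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion
```

Values well above 1 would suggest overdispersion, i.e. more variance than the Poisson mean-variance relationship allows.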

Choosing a link function

A link function should connect the structural component, which lives on \((-\infty,\infty)\), to the range allowed for the mean of the distribution (e.g. (0,1) for binomial)

So the number of possible link functions is extremely large.

Choice of link function is heavily influenced by field tradition

For binomial models

logit: models the log-odds of an observation being one
probit: assumes the binary outcome arises from a hidden Gaussian variable (i.e. threshold model)
logit & probit are really similar: both are symmetric, but probit tapers faster. Logit coefficients are easier to interpret directly
cloglog (complementary log-log): not symmetric
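The three links can be compared numerically through their inverses. This is a small base R sketch, and the grid of latent values is arbitrary.

```r
eta <- c(-2, -1, 0, 1, 2)  # values on the latent (linear predictor) scale

p_logit   <- plogis(eta)          # inverse logit
p_probit  <- pnorm(eta)           # inverse probit
p_cloglog <- 1 - exp(-exp(eta))   # inverse complementary log-log

round(cbind(eta, p_logit, p_probit, p_cloglog), 3)

# Symmetric links satisfy p(-eta) = 1 - p(eta)
all.equal(plogis(-eta), 1 - plogis(eta))               # TRUE
all.equal(pnorm(-eta), 1 - pnorm(eta))                 # TRUE
isTRUE(all.equal(1 - exp(-exp(-eta)), 1 - p_cloglog))  # FALSE
```

At the same latent value the probit probabilities sit closer to 0 or 1 than the logit ones, which is what "tapers faster" refers to.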

Estimating repeatability ?

Latent scale

Business as usual ?

Observed scale ??????

Using rptR 📦 is the easiest, or QGglmm 📦 (see associated citations for reference and explanations)
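On the latent scale, a commonly used approximation for a binomial model with a logit link adds the variance of the standard logistic distribution, \(\pi^2/3\), as the residual term. The sketch below assumes that approximation, with no additional overdispersion term. The among-individual variance is a hypothetical number, not an estimate from real data.

```r
# Hypothetical among-individual variance on the logit scale,
# e.g. as extracted from VarCorr() of a fitted binomial GLMM
var_id <- 0.8

# Distribution-specific variance of the logistic distribution (logit link)
var_logit <- pi^2 / 3

# Latent-scale repeatability
R_latent <- var_id / (var_id + var_logit)
R_latent
```

Observed-scale repeatability needs more than this ratio, which is where the packages above come in.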

Marginalized vs Conditioned estimates

Difference between marginalized and conditioned coefficients?

GLMMadaptive 📦 is the only way I know to easily get marginalized coefficients
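The difference can be seen numerically without fitting anything. The sketch below assumes a random-intercept logit model with made-up parameter values and compares the probability for a typical group (conditional, u = 0) with the probability averaged over all groups (marginal). The function names are introduced here for illustration only.

```r
# Hypothetical random-intercept logit model: logit(p) = beta0 + beta1 * x + u,
# with u ~ Normal(0, sd_u^2). All values below are illustrative.
beta0 <- 0
beta1 <- 1
sd_u  <- 2

# Conditional probability at x, for a typical group (u = 0)
p_conditional <- function(x) plogis(beta0 + beta1 * x)

# Marginal probability at x, averaged over the distribution of u
p_marginal <- function(x) {
  integrate(
    function(u) plogis(beta0 + beta1 * x + u) * dnorm(u, sd = sd_u),
    lower = -Inf, upper = Inf
  )$value
}

p_conditional(1)
p_marginal(1)

# Implied marginal slope between x = 0 and x = 1 on the logit scale
qlogis(p_marginal(1)) - qlogis(p_marginal(0))
```

With a non-zero random-effect variance the marginal probability is pulled towards 0.5, so the implied marginal slope is smaller than the conditional coefficient beta1.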

Practical

Walk through the example in Box 1

Happy modelling