
## Introduction

A generalized linear model for binary response data has the form

$$\Pr(y = 1 \mid x) = g^{-1}(x^\top \beta)$$

where $y$ is the 0/1 response variable, $x$ is the $n$-vector of predictor variables, $\beta$ is the vector of regression coefficients, and $g$ is the link function. In the Stan modeling language this would be written as

    y ~ bernoulli(p);
    g(p) <- dot_product(x, beta);

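with `g` replaced by the name of a link function, and similarly for the BUGS modeling language. The notation is schematic: Stan cannot assign to `g(p)`, so working Stan code applies the inverse link to the linear predictor instead, for example with `inv_logit`, `Phi`, or `inv_cloglog`.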

The most common choices for the link function are

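- logit: $g(p) = \log\frac{p}{1 - p}$;
- probit: $g(p) = \Phi^{-1}(p)$, where $\Phi$ is the cumulative distribution function for the standard normal distribution; and
- complementary log-log (cloglog): $g(p) = \log(-\log(1 - p))$.

All three of these are strictly increasing, continuous functions on $(0, 1)$ with $g(p) \to -\infty$ as $p \to 0$ and $g(p) \to \infty$ as $p \to 1$.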
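To make the three links concrete, here is a minimal Python sketch of each link and its inverse, using only the standard library. The function names and the `link_table` helper are illustrative choices, not part of Stan or BUGS.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def logit(p):
    """Log-odds of p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Inverse standard normal CDF of p, for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(p)


def cloglog(p):
    """Complementary log-log of p, for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))


# Each entry pairs a link with its inverse.
LINKS = {
    "logit": (logit, inv_logit),
    "probit": (probit, inv_probit),
    "cloglog": (cloglog, inv_cloglog),
}


def link_table(probs=(0.01, 0.1, 0.5, 0.9, 0.99)):
    """Evaluate every link on a grid of probabilities (hypothetical helper)."""
    return {name: [g(p) for p in probs] for name, (g, _) in LINKS.items()}
```

Calling `link_table()` returns one increasing list of values per link. Logit and probit are antisymmetric about $p = 0.5$, while cloglog is not, which is one practical way the three differ.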

In this note we’ll discuss when to use each of these link functions.

## Probit

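The probit link function is appropriate when it makes sense to think of $y$ as obtained by thresholding a normally distributed latent variable $z$:

$$z \sim \mathrm{Normal}(x^\top \gamma, \sigma), \qquad y = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

Defining $\beta = \gamma / \sigma$, this yields

$$\Pr(y = 1 \mid x) = \Pr(z \ge 0 \mid x) = \Phi\!\left(\frac{x^\top \gamma}{\sigma}\right) = \Phi(x^\top \beta).$$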
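The thresholding argument can be checked by simulation. The sketch below assumes a latent variable with mean equal to the linear predictor under unscaled coefficients (called `gamma` here), standard deviation `sigma`, and a threshold at zero; these names and the example values are illustrative assumptions.

```python
import random
from statistics import NormalDist


def dot(x, coef):
    return sum(xi * ci for xi, ci in zip(x, coef))


def simulate_threshold(x, gamma, sigma, n_draws=200_000, seed=0):
    """Monte Carlo estimate of Pr(y = 1 | x) when y = 1 iff the latent z >= 0."""
    rng = random.Random(seed)
    mean = dot(x, gamma)
    hits = sum(1 for _ in range(n_draws) if rng.gauss(mean, sigma) >= 0.0)
    return hits / n_draws


def probit_probability(x, gamma, sigma):
    """Closed form: standard normal CDF at the linear predictor with beta = gamma / sigma."""
    beta = [g / sigma for g in gamma]
    return NormalDist().cdf(dot(x, beta))


# Example values (assumed for illustration only).
x = [1.0, 0.5]
gamma = [0.4, -1.2]
sigma = 2.0
estimate = simulate_threshold(x, gamma, sigma)
exact = probit_probability(x, gamma, sigma)
```

With enough draws, `estimate` and `exact` should agree to within Monte Carlo error, and rescaling `gamma` and `sigma` by the same positive factor leaves `exact` unchanged, which is why only the ratio is identifiable.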

## Logit

Logit is the default link function to use when you have no specific reason to choose one of the others. There is a specific technical sense in which use of logit corresponds to minimal assumptions about the relationship between $y$ and $x$. Suppose that we describe the joint distribution for $x$ and $y$ by giving

- the marginal distribution for $x$, and
- the expected value of $y x_i$ for each predictor variable $x_i$.

Then the maximum-entropy (most spread-out, diffuse, least concentrated) joint distribution for $x$ and $y$ satisfying the above description has a pdf of form

$$p(x, y) = \frac{1}{Z} \exp\!\left(f(x) + y \, x^\top \beta\right)$$

for some function $f$, coefficient vector $\beta$ and normalizing constant $Z$. The conditional distribution for $y$ is then

$$p(y \mid x) = \frac{\exp(y \, x^\top \beta)}{1 + \exp(x^\top \beta)}$$

and so

$$\operatorname{logit}\bigl(\Pr(y = 1 \mid x)\bigr) = x^\top \beta.$$
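The step from the joint density to the logit form can be verified numerically. The following sketch assumes a joint density proportional to the exponential of an arbitrary function of the predictors plus the response times the linear predictor; the functions `f_quadratic` and `f_sine` are arbitrary stand-ins chosen only to show that the choice does not matter.

```python
import math


def dot(x, coef):
    return sum(xi * ci for xi, ci in zip(x, coef))


def joint_unnormalized(x, y, f, beta):
    """exp(f(x) + y * x.beta); the normalizing constant cancels in the conditional."""
    return math.exp(f(x) + y * dot(x, beta))


def conditional_y1(x, f, beta):
    """Pr(y = 1 | x), obtained by normalizing the joint over y in {0, 1}."""
    w0 = joint_unnormalized(x, 0, f, beta)
    w1 = joint_unnormalized(x, 1, f, beta)
    return w1 / (w0 + w1)


def logit(p):
    return math.log(p / (1.0 - p))


# Two arbitrary choices of f (hypothetical, for illustration).
def f_quadratic(x):
    return -0.5 * sum(xi * xi for xi in x)


def f_sine(x):
    return math.sin(sum(x))


beta = [0.7, -0.3]
x = [1.5, 2.0]
p_quadratic = conditional_y1(x, f_quadratic, beta)
p_sine = conditional_y1(x, f_sine, beta)
```

Both `p_quadratic` and `p_sine` have logit equal to `dot(x, beta)`, because the factor involving `f` appears in both terms of the ratio and cancels.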

## Cloglog

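The complementary log-log link function arises when

$$y = \begin{cases} 1 & \text{if } z > 0 \\ 0 & \text{if } z = 0 \end{cases}$$

where $z$ is a count having a Poisson distribution:

$$z \sim \mathrm{Poisson}(\lambda), \qquad \lambda = \exp(x^\top \beta).$$

To see this, let

$$p = \Pr(y = 1 \mid x) = \Pr(z > 0 \mid x).$$

Then

$$p = 1 - \Pr(z = 0 \mid x) = 1 - \exp(-\lambda)$$

and so

$$\operatorname{cloglog}(p) = \log(-\log(1 - p)) = \log \lambda = x^\top \beta.$$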
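The Poisson argument can also be checked in a few lines. This sketch assumes the Poisson rate is the exponential of the linear predictor; the sampler uses Knuth's multiplication method, which is adequate for small rates, and all names and example values are illustrative.

```python
import math
import random


def dot(x, coef):
    return sum(xi * ci for xi, ci in zip(x, coef))


def cloglog(p):
    return math.log(-math.log(1.0 - p))


def prob_nonzero(x, beta):
    """Pr(count > 0) = 1 - exp(-rate), with rate = exp(x.beta)."""
    rate = math.exp(dot(x, beta))
    return 1.0 - math.exp(-rate)


def poisson_draw(rate, rng):
    """One Poisson draw via Knuth's multiplication method (suitable for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_nonzero(x, beta, n_draws=100_000, seed=0):
    """Fraction of simulated counts that are nonzero."""
    rng = random.Random(seed)
    rate = math.exp(dot(x, beta))
    return sum(1 for _ in range(n_draws) if poisson_draw(rate, rng) > 0) / n_draws


# Example values (assumed for illustration only).
x = [1.0, 2.0]
beta = [-0.5, 0.3]
exact = prob_nonzero(x, beta)
estimate = simulate_nonzero(x, beta)
```

Applying `cloglog` to `exact` recovers `dot(x, beta)` up to floating-point error, and `estimate` should match `exact` to within Monte Carlo error.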

## Conclusion

In summary, here is when to use each of the link functions:

- Use probit when you can think of $y$ as obtained by thresholding a normally distributed latent variable.
- Use cloglog when $y$ indicates whether a count is nonzero, and the count can be modeled with a Poisson distribution.
- Use logit if you have no specific reason to choose some other link function.