Bayesium Analytics

Practical Bayesian Data Analysis

August 14, 2015 by kevin@ksvanhorn.com

Which Link Function — Logit, Probit, or Cloglog?

Introduction

A generalized linear model for binary response data has the form

\Pr\left(y=1\mid x\right)=g^{-1}\left(x^{\prime}\beta\right)

where y is the 0/1 response variable, x is the n-vector of predictor variables, \beta is the vector of regression coefficients, and g is the link function. In the Stan modeling language this would be written as

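  // the inverse link maps the linear predictor to a probability
  p = g_inv(dot_product(x, beta));
  y ~ bernoulli(p);

with g_inv replaced by the name of the inverse link function (inv_logit, Phi, or inv_cloglog), and similarly for the BUGS modeling language, which also lets the link function itself appear on the left-hand side of the assignment.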

The most common choices for the link function are

  • logit: g(p)=\log\left(\frac{p}{1-p}\right);
  • probit: g^{-1}(\eta)=\Phi(\eta)

    where \Phi is the cumulative distribution function for the standard normal distribution; and

  • complementary log-log (cloglog): g(p)=\log\left(-\log\left(1-p\right)\right).

All three of these are strictly increasing, continuous functions that map the interval (0,1) onto the whole real line, with g(p)\to-\infty as p\to0 and g(p)\to+\infty as p\to1.
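The three definitions above can be sketched in a few lines of Python for readers who want to experiment with them. This is only an illustrative sketch: the function names and the `LINKS` table are my own choices, not part of Stan or BUGS, and it assumes Python 3.8 or later for `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution: mean 0, sd 1


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def probit(p):
    # g(p) is the inverse of the standard normal CDF
    return _STD_NORMAL.inv_cdf(p)


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def cloglog(p):
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    return 1.0 - math.exp(-math.exp(eta))


# Illustrative lookup table: name -> (link, inverse link)
LINKS = {
    "logit": (logit, inv_logit),
    "probit": (probit, inv_probit),
    "cloglog": (cloglog, inv_cloglog),
}

# Each inverse link undoes its link on a grid of probabilities.
for name, (g, g_inv) in LINKS.items():
    for p in (0.01, 0.25, 0.5, 0.75, 0.99):
        assert abs(g_inv(g(p)) - p) < 1e-9
    print(name, [round(g(p), 3) for p in (0.1, 0.5, 0.9)])
```

One thing the printout makes visible is that logit and probit are symmetric about p = 0.5 (both give g(0.5) = 0), whereas cloglog is not: cloglog(0.5) = log(log 2), which is negative.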

In this note we’ll discuss when to use each of these link functions.

Probit

The probit link function is appropriate when it makes sense to think of y as obtained by thresholding a normally distributed latent variable z:

\begin{array}{rcl} z & = & x^{\prime}\beta^{*}+\varepsilon\\ \varepsilon & \sim & \text{Normal}\left(0,\sigma\right)\\ y & = & \begin{cases} 1 & \text{if }z\geq0\\ 0 & \text{otherwise}. \end{cases} \end{array}

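Defining \beta=\beta^{*}/\sigma, and using the facts that \varepsilon and -\varepsilon have the same distribution and that \varepsilon/\sigma is standard normal, this yields

\begin{array}{rcl} \Pr\left(y=1\mid x\right) & = & \Pr\left(x^{\prime}\beta^{*}+\varepsilon\geq0\right)\\ & = & \Pr\left(-\varepsilon\leq x^{\prime}\beta^{*}\right)\\ & = & \Pr\left(\varepsilon\leq x^{\prime}\beta^{*}\right)\\ & = & \Pr\left(\varepsilon/\sigma\leq x^{\prime}\beta^{*}/\sigma\right)\\ & = & \Phi\left(x^{\prime}\beta\right). \end{array}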
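The latent-variable argument can be checked by simulation. The sketch below draws the noise term many times, thresholds z at zero, and compares the observed fraction of y = 1 with \Phi(x^{\prime}\beta). The values of `x`, `beta_star`, and `sigma` are hypothetical, chosen only for illustration, and the helper names are my own.

```python
import random
from statistics import NormalDist


def simulate_threshold_rate(x, beta_star, sigma, n_draws=200_000, seed=0):
    """Fraction of draws with z = x'beta* + eps >= 0, where eps ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    mean = sum(xi * bi for xi, bi in zip(x, beta_star))
    hits = sum(1 for _ in range(n_draws) if mean + rng.gauss(0.0, sigma) >= 0.0)
    return hits / n_draws


def probit_probability(x, beta_star, sigma):
    """Phi(x'beta) with the rescaled coefficients beta = beta* / sigma."""
    eta = sum(xi * (bi / sigma) for xi, bi in zip(x, beta_star))
    return NormalDist().cdf(eta)


x = [1.0, 0.5]           # hypothetical predictors (first entry acts as an intercept)
beta_star = [-0.4, 1.2]  # hypothetical latent-scale coefficients
sigma = 2.0              # hypothetical noise standard deviation

simulated = simulate_threshold_rate(x, beta_star, sigma)
exact = probit_probability(x, beta_star, sigma)
print(f"simulated {simulated:.4f}  vs  Phi(x'beta) {exact:.4f}")
```

The two numbers should agree up to Monte Carlo error. Note that only the ratio \beta^{*}/\sigma affects the result, which is why \sigma cannot be estimated separately from binary data alone.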

Logit

Logit is the default link function to use when you have no specific reason to choose one of the others. There is a specific technical sense in which use of logit corresponds to minimal assumptions about the relationship between y and x. Suppose that we describe the joint distribution for x and y by giving

  • the marginal distribution for x, and
  • the expected value of x_{i}y for each predictor variable x_{i}.

Then the maximum-entropy (most spread-out, diffuse, least concentrated) joint distribution for x and y satisfying the above description has a pdf of the form

p\left(x,y\right)=\frac{1}{Z}f(x)\exp\left(\sum_{i=1}^{n}\beta_{i}x_{i}y\right)

for some function f, coefficient vector \beta and normalizing constant Z. The conditional distribution for y is then

\begin{array}{rcl} p\left(y\mid x\right) & = & \frac{p(x,y)}{p(x,0)+p(x,1)}\\ & = & \frac{\exp\left(\left(x^{\prime}\beta\right)y\right)}{1+\exp\left(x^{\prime}\beta\right)} \end{array}

and so

\begin{array}{rcl} \Pr\left(y=1\mid x\right) & = & \frac{\exp\left(x^{\prime}\beta\right)}{1+\exp\left(x^{\prime}\beta\right)}\\ & = & \text{logit}^{-1}\left(x^{\prime}\beta\right). \end{array}
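The key point of this derivation is that both f(x) and Z cancel when conditioning on x, leaving only the inverse logit. The following sketch checks that numerically. The particular `f` and `beta` are arbitrary, hypothetical choices; any positive f should give the same conditional probabilities.

```python
import math


def joint_unnormalized(x, y, beta, f):
    """f(x) * exp(sum_i beta_i * x_i * y); the constant factor 1/Z is omitted."""
    return f(x) * math.exp(sum(b * xi for b, xi in zip(beta, x)) * y)


def conditional_y1(x, beta, f):
    """Pr(y = 1 | x) = p(x, 1) / (p(x, 0) + p(x, 1))."""
    p1 = joint_unnormalized(x, 1, beta, f)
    p0 = joint_unnormalized(x, 0, beta, f)
    return p1 / (p0 + p1)


def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def f(x):
    # Hypothetical positive function of x; its exact form does not matter.
    return math.exp(-0.5 * sum(xi * xi for xi in x)) + 0.1


beta = [0.3, -1.1, 0.7]  # hypothetical coefficients

for x in ([1.0, 0.0, 0.0], [1.0, 2.0, -1.0], [1.0, -0.5, 3.0]):
    eta = sum(b * xi for b, xi in zip(beta, x))
    assert math.isclose(conditional_y1(x, beta, f), inv_logit(eta), rel_tol=1e-12)
    print(x, round(conditional_y1(x, beta, f), 4))
```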

Cloglog

The complementary log-log link function arises when

y=\begin{cases} 1 & \text{if }z > 0\\ 0 & \text{if }z=0 \end{cases}

where z is a count having a Poisson distribution:

\begin{array}{rcl} z & \sim & \text{Poisson}\left(\lambda\right)\\ \lambda & = & \exp\left(x^{\prime}\beta\right). \end{array}

To see this, let

p=\Pr\left(z > 0\mid x\right).

Then

\begin{array}{rcl} p & = & 1-\text{Poisson}\left(0\mid\lambda\right)\\ & = & 1-\exp\left(-\lambda\right)\\ & = & 1-\exp\left(-\exp\left(x^{\prime}\beta\right)\right) \end{array}

and so

\begin{array}{rcl} \text{cloglog}\left(p\right) & = & \log\left(-\log\left(1-p\right)\right)\\ & = & \log\left(-\log\left(\exp\left(-\exp\left(x^{\prime}\beta\right)\right)\right)\right)\\ & = & x^{\prime}\beta. \end{array}
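As a rough check of this result, the sketch below simulates Poisson counts, records how often the count is nonzero, and compares that fraction with 1-\exp(-\lambda). The values of `x` and `beta` are hypothetical, and the Poisson sampler is Knuth's simple multiplication method, used here only to keep the example within the standard library.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def cloglog(p):
    return math.log(-math.log(1.0 - p))


x = [1.0, 0.8]      # hypothetical predictors (first entry acts as an intercept)
beta = [-0.5, 0.9]  # hypothetical coefficients

eta = sum(b * xi for b, xi in zip(beta, x))
lam = math.exp(eta)
p_exact = 1.0 - math.exp(-lam)  # Pr(z > 0 | x)

rng = random.Random(0)
n_draws = 100_000
p_sim = sum(1 for _ in range(n_draws) if poisson_draw(lam, rng) > 0) / n_draws

print(f"simulated {p_sim:.4f}  vs  1 - exp(-lambda) {p_exact:.4f}")
print(f"cloglog(p) {cloglog(p_exact):.6f}  vs  x'beta {eta:.6f}")
```

Applying cloglog to the exact probability recovers the linear predictor, and the simulated fraction should match the exact probability up to Monte Carlo error.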

Conclusion

In summary, here is when to use each of the link functions:

  • Use probit when you can think of y as obtained by thresholding a normally distributed latent variable.
  • Use cloglog when y indicates whether a count is nonzero, and the count can be modeled with a Poisson distribution.
  • Use logit if you have no specific reason to choose some other link function.
