Scott Alexander’s blog Slate Star Codex recently carried the results of a survey of over 850 users of nootropics (cognitive enhancers) such as caffeine, Adderall, and Modafinil. The survey asked respondents to subjectively rate each substance on a scale of 0 to 10, with 0 meaning useless, 1-4 meaning subtle effects, 5-9 meaning strong effects, and 10 meaning life-changing.
There are several difficulties in analyzing this kind of data:
- Discretization. Actual effects vary over a continuum, but respondents are asked to choose from a discrete set of choices.
- Heterogeneous scale usage. People vary in how they use the scale. Some may spread their responses out more than others. Some may tend to give higher answers than others for the same underlying effect. There may be nonlinearities in how people use the scale.
- Meaning. Just what do these ratings mean? How do they translate into specific effects on a person’s mind?
In this note I tackle the first two problems, discretization and heterogeneous scale usage, which are purely technical; for the third, meaning, I have no answer. On scale usage I restrict my attention to bias and scaling of responses.
The analytic approach used here is a simplified version of the methodology described in this paper.
One final caveat: the survey subjects are a self-selected sample, and hence may not be representative of the general populace. One way of dealing with that issue would be to regress the nootropic effects on various subject characteristics that might be predictive of nootropic effect. I did not do this, although the survey includes questions that could be used for this purpose.
Summary of Results
I did a Bayesian analysis that used a hierarchical prior for the nootropic effects and accounted for scale usage heterogeneity and discretization of responses. The first figure shows estimates of the population mean effect (as defined in the model described below) for each nootropic. The black point is the posterior median, the red line is the 80% posterior credible interval, and the black line is the 90% interval.
The picture is considerably murkier if you look at the posterior predictive distribution for each nootropic. The effect for an individual is $\mu_j + \varepsilon$, where $\mu_j$ is the population mean effect of nootropic $j$ and $\varepsilon$ is an individual deviation from the population mean, with nootropic-specific variance $\sigma_j^2$. The next figure shows credible intervals for this individual effect, for each nootropic. These individual effects are quite uncertain: the values vary from around 1.9 to around 2.8.
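As a rough illustration of how a posterior median and central credible intervals like those in the figures can be read off a set of posterior draws, here is a minimal sketch. The draws are made-up numbers, not survey results, and `central_interval` is a hypothetical helper, not part of the original analysis.

```python
import random
import statistics


def central_interval(samples, mass):
    """Return (lower, upper) bounds of the central interval holding `mass` of the samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0

    def quantile(p):
        # Linear interpolation between neighbouring order statistics.
        pos = p * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return quantile(tail), quantile(1.0 - tail)


random.seed(0)
# Hypothetical posterior draws for one nootropic's mean effect (illustrative only).
draws = [random.gauss(4.0, 0.5) for _ in range(4000)]

median = statistics.median(draws)
inner = central_interval(draws, 0.80)  # 80% credible interval
outer = central_interval(draws, 0.90)  # 90% credible interval
```

The 90% interval always contains the 80% interval, which is why the figures can overlay the two on one line.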
The data are provided as a table with one row per subject (survey respondent), and one column per nootropic. There were 36 nootropics mentioned in the survey, but subjects only gave ratings to those they had actually used. The first step is to reshape the data from this “wide” format into a “long” format with columns for subject, nootropic, and response, with each case being some subject’s experience with some nootropic. Then
- $i$ indexes cases;
- $s_i$ is the subject for case $i$;
- $n_i$ is the nootropic for case $i$;
- $r_i$ is the rating subject $s_i$ gave for nootropic $n_i$.
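The wide-to-long reshaping can be sketched in plain Python as follows. The column names, the use of `None` for an unrated nootropic, and the helper `wide_to_long` are assumptions made for illustration; the actual survey file may be laid out differently.

```python
def wide_to_long(rows, id_column="subject"):
    """Turn one-row-per-subject records into (subject, nootropic, rating) cases."""
    long_rows = []
    for row in rows:
        subject = row[id_column]
        for column, value in row.items():
            # Skip the identifier column and nootropics the subject did not rate.
            if column == id_column or value is None:
                continue
            long_rows.append((subject, column, value))
    return long_rows


# Hypothetical wide-format data: None marks a nootropic the subject never used.
wide = [
    {"subject": 1, "Caffeine": 6, "Modafinil": None},
    {"subject": 2, "Caffeine": 3, "Modafinil": 8},
]
cases = wide_to_long(wide)
```

Each tuple in `cases` is one case: a subject, a nootropic, and the rating that subject gave it.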
Take each rating $r_i$ to be the binned version of a continuous latent variable $y_i$: the lowest rating means that $y_i$ falls below the first breakpoint, the highest rating means that it falls beyond the last breakpoint, and each rating in between means that it falls between two consecutive breakpoints.
This approach uses a fixed, equally-spaced set of breakpoints; a further refinement, which I did not explore, would be to infer the breakpoints themselves, perhaps restricting them to some parametric form such as a quadratic in the breakpoint index.
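To make the binning concrete, here is a small sketch that maps a latent value to a 0 to 10 rating. It assumes half-integer breakpoints (0.5, 1.5, ..., 9.5) and lower-inclusive bins; both are one natural choice for equally spaced bins on this scale, adopted here for illustration rather than taken from the analysis.

```python
from bisect import bisect_right

# Assumed breakpoints: 0.5, 1.5, ..., 9.5 (ten cut points for eleven ratings).
BREAKPOINTS = [k + 0.5 for k in range(10)]


def rating_from_latent(y):
    """Return the discrete rating (0..10) whose bin contains the latent value y."""
    # bisect_right counts how many breakpoints are <= y, which is the bin index.
    return bisect_right(BREAKPOINTS, y)
```

Under these assumptions a latent value of 3.49 becomes a rating of 3, and anything at or above 9.5 becomes a 10.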
One can expect that survey respondents will vary in how they translate their response to a nootropic into a continuous rating . Assume that an underlying continuous response gets translated into a continuous rating via an individual bias term and scaling factor:
Hierarchical priors for the scale-usage parameters are appropriate:
It probably would have made sense to use a bivariate normal prior on and to allow for correlations between them in the population, but I did not explore this option.
The priors on , and are weakly informative, based on the 0 to 10 scale used:
- Values of outside the range 0 to 10 are implausible.
- A value of is implausible, as it allows quite extreme values for to be common.
- A value of is implausible, as it means that it would be common to see a factor of difference in the scaling used by two different subjects.
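The effectiveness of a nootropic will vary over the population; letting $\mu_j$ be the mean effect of nootropic $j$, and $\sigma_j^2$ its variance over the population, we have, for the underlying response $x_i$ in a case $i$ involving nootropic $j$,
$$x_i \sim \mathrm{Normal}(\mu_j, \sigma_j^2).$$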
Since there are 36 different nootropics in the study, I used hierarchical priors for the mean effects and variances:
The priors for , , and are again intended to be weakly informative: given the 10-point scale, values of larger than 7, values of larger than 2, and values of larger than 2 all seem implausibly extreme.
The normal distribution for induces a normal distribution for , conditional on the other model variables:
The likelihood for case in the data set is then given by
where $\Phi$ is the CDF of the standard normal distribution.
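A likelihood of this kind, for a normally distributed latent rating observed only through its bin, generally takes the form of a difference of two normal CDF values. The sketch below shows that generic form. It treats the conditional mean and standard deviation of the latent rating as given inputs and assumes half-integer breakpoints; how those inputs are built from the bias, scaling, and effect parameters is not reproduced here.

```python
import math

# Assumed breakpoints: 0.5, 1.5, ..., 9.5.
BREAKPOINTS = [k + 0.5 for k in range(10)]


def std_normal_cdf(z):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rating_probability(rating, mean, sd):
    """Probability that a Normal(mean, sd) latent rating lands in the bin for `rating`."""
    # The lowest and highest ratings have open-ended bins.
    lower = -math.inf if rating == 0 else BREAKPOINTS[rating - 1]
    upper = math.inf if rating == 10 else BREAKPOINTS[rating]
    return std_normal_cdf((upper - mean) / sd) - std_normal_cdf((lower - mean) / sd)
```

Summing `rating_probability` over all eleven ratings gives 1, since the bins partition the real line.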