Uncertain Attitudes

It’s been a while since I posted anything in the bad statistics file, but an article in today’s Grauniad has now given me an opportunity to rectify that omission.
The piece concerned, entitled Racism on the rise in Britain, is based on some new data from the British Social Attitudes survey; the full report can be found here (PDF). The main result is shown in this graph:

[Figure: percentage of respondents describing themselves as racially prejudiced, by year, with the five-year moving-average trend shown as a blue curve]

The version of this plot shown in the Guardian piece omits the smoothed long-term trend (the blue curve, a five-year moving average of the data, which has clearly been generally downward since 1986).

In any case the report, as is sadly almost always the case in surveys of this kind, makes no mention of the statistical uncertainty in its results. In fact the last point is based on a sample of 2149 respondents. Suppose the fraction of the population describing themselves as having some prejudice is p. For a sample of size n in which x respondents describe themselves as “very prejudiced or a little prejudiced”, one can straightforwardly estimate p \simeq x/n. So far so good, as long as no bias is induced either by the form of the question asked or by the selection of the sample…
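As a quick numerical sketch of that point estimate: the exact count x is not given in the report, so the value below is illustrative, chosen to be consistent with the reported 30% of the 2149 respondents in the 2013 sample.

```python
# Point estimate of the prejudiced fraction from survey counts.
n = 2149   # 2013 sample size, from the report
x = 645    # illustrative count, consistent with the reported ~30%
p_hat = x / n
print(round(p_hat, 3))  # ~0.3
```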

However, a little bit of mathematics involving the binomial distribution yields an answer for the uncertainty in this estimate of p in terms of the sampling error:

\sigma = \sqrt{\frac{p(1-p)}{n}}

For the sample size given, and a value p \simeq 0.35, this amounts to a standard error of about 1%. About 95% of samples drawn from a population in which the true fraction is p will yield an estimate within p \pm 2\sigma, i.e. within about 2% of the true figure. This is consistent with the “noise” on the unsmoothed curve, and it shows that the year-on-year variation in the unsmoothed graph is largely attributable to sampling uncertainty; note that the sample sizes also vary from year to year. The results for 2012 and 2013 are exactly 26% and 30%, which differ by 4% and are therefore explicable in terms of sampling fluctuations alone.
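The arithmetic behind those figures can be checked in a few lines; the numbers (n = 2149, p = 0.35) are taken from the discussion above.

```python
import math

# Binomial sampling error for the last (2013) data point.
n = 2149   # sample size for the final survey point
p = 0.35   # approximate fraction reporting some prejudice
sigma = math.sqrt(p * (1 - p) / n)
print(f"standard error:   {sigma:.4f}")      # ~0.0103, i.e. about 1%
print(f"95% half-width:   {2 * sigma:.4f}")  # ~0.0206, i.e. about 2%
```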

I don’t know whether racial prejudice is on the rise in the UK or not, nor even how accurately such attitudes are measured by such surveys in the first place, but there’s no evidence in these data of any significant change over the past year. Given the behaviour of the smoothed data however, there is evidence that in the very long term the fraction of population identifying themselves as prejudiced is actually falling.

Newspapers however rarely let proper statistics get in the way of a good story, even to the extent of removing evidence that contradicts their own prejudice.

5 Responses to “Uncertain Attitudes”

  1. Anton Garrett Says:

    I would describe myself as very prejudiced against frequentists. (But is it prejudice if Bayesians are right and they are wrong?)

    • telescoper Says:

      Prejudice is defined as “preconceived opinion that is not based on reason or actual experience”. Since both reason and actual experience confirm your position I don’t think it counts as prejudice!

  2. This is the sort of dataset it can be fun to model with a particle filter. Below is an R script implementing a somewhat simplistic model for the observed data: a 5-year moving-average model with a beta-binomial likelihood function. Both the particle filter and the model could be made more sophisticated (e.g. use particle Gibbs, allow for piece-wise linear trends in time, or add an inertial term to the time series), but it’s a good example for anyone interested in learning how to program particle filters. I find 66% of posterior samples increasing in prejudice over the past year and 34% decreasing, btw.

    ### R script to perform SIR particle filtering of prejudice dataset

    library(VGAM) # For beta-binomial pdf [with rho param.]

    # Data hand entered from report at http://www.natcen.ac.uk/media/338770/selfreported-racial-prejudice-datafinal.pdf
    years <- 1983:2013
    n.sampled <- c(1761,1675,1804,3100,2847,0,3029,1397,0,1473,0,2302,0,2399,0,1035,2718,2293,2188,1123,4432,1062,1075,3213,2022,1128,1128,2216,2205,2172,2149)
    percent.very.prej <- c(4,3,5,4,4,0,4,4,0,2,0,2,0,2,0,2,2,2,2,1,3,2,3,2,3,3,3,3,4,2,3)
    percent.little.prej <- c(32,34,29,32,34,0,32,32,0,29,0,34,0,27,0,24,26,23,23,30,27,25,30,28,29,32,30,26,34,24,27)
    n.prej <- as.integer((percent.very.prej+percent.little.prej)/100*n.sampled)
    n.obs <- length(n.prej)

    # Draw vectors of initial (hidden) states and parameters
    n.particles <- 100000
    particles.theta <- matrix(rexp(n.particles*5),nrow=n.particles,ncol=5)
    particles.mu <- rbeta(n.particles,mean(n.prej[1:3])+1,mean(n.sampled[1:3])-mean(n.prej[1:3])+1)
    particles.innovation.sd <- rexp(n.particles,100)
    particles.innovations <- matrix(0,nrow=n.particles,ncol=n.obs)
    particles.rho <- runif(n.particles,0.0025,0.01)
    lg.weights <- numeric(n.particles)
    p.implicit <- matrix(0,nrow=n.particles,ncol=n.obs)

    # SIR Particle Filter
    for (i in 1:n.obs) {

    # Sampling (Propagation)
    particles.innovations[,i] <- rnorm(n.particles,mean=0,sd=particles.innovation.sd)
    p.current <- particles.mu + particles.innovations[,i]
    if (i > 1) {for (j in 1:min(i-1,5)) {p.current <- p.current + particles.theta[,j]*particles.innovations[,i-j]}}
    p.current[p.current > 0.9999] <- 0.9999
    p.current[p.current < 0.0001] <- 0.0001
    if (n.sampled[i] > 0) {lg.weights <- lg.weights + dbetabinom(n.prej[i],n.sampled[i],p.current,rho=particles.rho,log=TRUE)}
    p.implicit[,i] <- p.current

    # Importance Resampling
    resampled <- sample(1:n.particles,n.particles,prob=exp(lg.weights),replace=T)
    lg.weights <- lg.weights*0+1
    p.implicit <- p.implicit[resampled,]
    particles.innovations <- particles.innovations[resampled,]
    particles.innovation.sd <- particles.innovation.sd[resampled]
    particles.mu <- particles.mu[resampled]
    particles.theta <- particles.theta[resampled,]
    particles.rho <- particles.rho[resampled]
    }

    # Plot results
    plot(0,0,xlim=range(years),ylim=c(20,45),xlab="",ylab="",xaxt='n',yaxt='n')
    for (i in 1:min(n.particles,500)) {lines(years,p.implicit[i,]*100,col=hsv(0,alpha=0.05))}
    points(years,n.prej/n.sampled*100)
    axis(1)
    axis(2,las=2)
    mtext("Year",side=1,line=2.3)
    mtext("Underlying Prejudice %",side=2,line=2.8)
    title("MA(5) Model for Underlying Prejudice Level with Beta-Binomial Likelihood Function")

    hist((p.implicit[,n.obs]-p.implicit[,n.obs-1])*100,xlab="Percentage Change in Past Year",ylab="",yaxt='n',main="Posterior for MA(5) Model w/ Beta-Binomial LF")
