## Uncertain Attitudes

It’s been a while since I posted anything in the bad statistics file, but an article in today’s* Grauniad* has now given me an opportunity to rectify that omission.

The piece concerned, entitled *Racism on the rise in Britain* is based on some new data from the British Social Attitudes survey; the full report can be found here (PDF). The main result is shown in this graph:

The version of this plot shown in the Guardian piece has the smoothed long-term trend (the blue curve, based on a five-year moving average of the data and clearly generally *downward* since 1986) removed.

In any case the report, as is sadly almost always the case in surveys of this kind, neglects any mention of the statistical uncertainty in the survey. In fact the last point is based on a sample of 2149 respondents. Suppose the fraction of the population describing themselves as having some prejudice is . For a sample of size with respondents indicating that they describe themselves as “very prejudiced or a little prejudiced” then one can straightforwardly estimate . So far so good, as long as there is no bias induced by the form of the question asked nor in the selection of the sample…

However, a little bit of mathematics involving the binomial distribution yields an answer for the uncertainty in this estimate of p in terms of the sampling error:

For the sample size given, and a value this amounts to a standard error of about 1%. About 95% of samples drawn from a population in which the true fraction is will yield an estimate within , i.e. within about 2% of the true figure. This is consistent with the “noise” on the unsmoothed curve and it shows that the year-on-year variation shown in the unsmoothed graph is largely attributable to sampling uncertainty; note that the sample sizes vary from year to year too. The results for 2012 and 2013 are 26% and 30% exactly, which differ by 4% and are therefore explicable solely in terms of sampling fluctuations.

I don’t know whether racial prejudice is on the rise in the UK or not, nor even how accurately such attitudes are measured by such surveys in the first place, but there’s no evidence in these data of any significant change over the past year. Given the behaviour of the smoothed data however, there is evidence that in the very long term the fraction of population identifying themselves as prejudiced is actually falling.

Newspapers however rarely let proper statistics get in the way of a good story, even to the extent of removing evidence that contradicts their own prejudice.

Follow @telescoper*Related*

This entry was posted on May 28, 2014 at 2:20 pm and is filed under Bad Statistics, Politics with tags binomial distribution, British Social Attitudes Survey, racism, sampling uncertainty, statistics. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

May 28, 2014 at 6:09 pm

I would describe myself as very prejudiced against frequentists. (But is it prejudice if Bayesians are right and they are wrong?)

May 29, 2014 at 8:51 am

Prejudice is defined as “preconceived opinion that is not based on reason or actual experience”. Since both reason and actual experience confirm your position I don’t think it counts as prejudice!

May 30, 2014 at 3:35 pm

This is the sort of dataset it can be fun to model with a particle filter. Below is an R code implementing a somewhat simplistic model for the observed data: a 5 year moving average model with beta-binomial likelihood function. Both the particle filter and model could be made more sophisticated (e.g. use particle Gibbs, and allow for e.g. some piece-wise linear trends in time or add an inertial term in the time series) but it’s a good example for anyone interested in learning how to program particle filters. I find 66% of posterior samples increasing in prejudice over the past year, 34% decreasing, btw.

### R script to perform SIR particle filtering of prejudice dataset

library(VGAM) # For beta-binomial pdf [with rho param.]

# Data hand entered from report at http://www.natcen.ac.uk/media/338770/selfreported-racial-prejudice-datafinal.pdf

years <- 1983:2013

n.sampled <- c(1761,1675,1804,3100,2847,0,3029,1397,0,1473,0,2302,0,2399,0,1035,2718,2293,2188,1123,4432,1062,1075,3213,2022,1128,1128,2216,2205,2172,2149)

percent.very.prej <- c(4,3,5,4,4,0,4,4,0,2,0,2,0,2,0,2,2,2,2,1,3,2,3,2,3,3,3,3,4,2,3)

percent.little.prej <- c(32,34,29,32,34,0,32,32,0,29,0,34,0,27,0,24,26,23,23,30,27,25,30,28,29,32,30,26,34,24,27)

n.prej <- as.integer((percent.very.prej+percent.little.prej)/100*n.sampled)

n.obs <- length(n.prej)

# Draw vectors of initial (hidden) states and parameters

n.particles <- 100000

particles.theta <- matrix(rexp(n.particles*5),nrow=n.particles,ncol=5)

particles.mu <- rbeta(n.particles,mean(n.prej[1:3])+1,mean(n.sampled[1:3])-mean(n.prej[1:3])+1)

particles.innovation.sd <- rexp(n.particles,100)

particles.innovations <- matrix(0,nrow=n.particles,ncol=n.obs)

particles.rho <- runif(n.particles,0.0025,0.01)

lg.weights <- numeric(n.particles)

p.implicit <- matrix(0,nrow=n.particles,ncol=n.obs)

# SIR Particle Filter

for (i in 1:n.obs) {

# Sampling (Propogation)

particles.innovations[,i] <- rnorm(n.particles,mean=0,sd=particles.innovation.sd)

p.current 1) {for (j in 1:min(i-1,5)) {p.current 0.9999] <- 0.9999

p.current[p.current < 0.0001] 0) {lg.weights <- lg.weights+dbetabinom(n.prej[i],n.sampled[i],p.current,rho=particles.rho,log=T)}

p.implicit[,i] <- p.current

# Importance Resampling

resampled <- sample(1:n.particles,n.particles,prob=exp(lg.weights),replace=T)

lg.weights <- lg.weights*0+1

p.implicit <- p.implicit[resampled,]

particles.innovations <- particles.innovations[resampled,]

particles.innovation.sd <- particles.innovation.sd[resampled]

particles.mu <- particles.mu[resampled]

particles.theta <- particles.theta[resampled,]

particles.rho <- particles.rho[resampled]

}

# Plot results

plot(0,0,xlim=range(years),ylim=c(20,45),xlab="",ylab="",xaxt='n',yaxt='n')

for (i in 1:min(n.particles,500)) {lines(years,p.implicit[i,]*100,col=hsv(0,alpha=0.05))}

points(years,n.prej/n.sampled*100)

axis(1)

axis(2,las=2)

mtext("Year",side=1,line=2.3)

mtext("Underlying Prejudice %",side=2,line=2.8)

title("MA(5) Model for Underlying Prejudice Level with Beta-Binomial Likelihood Function")

hist((p.implicit[,n.obs]-p.implicit[,n.obs-1])*100,xlab="Percentage Change in Past Year",ylab="",yaxt='n',main="Posterior for MA(5) Model w/ Beta-Binomial LF")

May 30, 2014 at 4:58 pm

So, is there evidence for supersymmetry in these data?

May 30, 2014 at 8:39 pm

unfortunately not that type of particle! š