Weighted variance and weighted coefficient of variation

Often we want to compare the variability of a variable in different contexts, say the variability of unemployment in different countries over time, or the variability of height in two populations. The most commonly used measures of variability are the variance and the standard deviation (which is just the square root of the variance). However, for some types of data, these measures are not entirely appropriate. For example, when data are generated by a Poisson process (e.g. when you have counts of rare events), the mean equals the variance. Clearly, comparing the variability of two Poisson distributions using the variance or the standard deviation would not work if the means of these populations differ. A common and easy fix is to use the coefficient of variation instead, which is simply the standard deviation divided by the mean. So far, so good.
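To make the definition concrete, here is a minimal Python sketch of the coefficient of variation. The function name and the two toy samples are illustrative assumptions, not data discussed in this post.

```python
import math

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumes a non-zero mean)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance) / mean

# Two made-up samples: the second is the first multiplied by 10.
small = [2, 4, 6]
large = [20, 40, 60]

cv_small = coefficient_of_variation(small)
cv_large = coefficient_of_variation(large)
print(cv_small, cv_large)
```

The standard deviations of the two samples differ by a factor of ten (2 versus 20), but the coefficient of variation is 0.5 in both cases, which is what makes it usable for comparisons across different means.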

Things get tricky, however, when we want to calculate the weighted coefficient of variation. The weighted mean is just the mean, except that some data points contribute more than others. For example, the mean of 0.4 and 0.8 is 0.6. If we assign the weight 0.9 to the first observation [0.4] and 0.1 to the second [0.8], the weighted mean is (0.9*0.4+0.1*0.8)/1, which equals 0.44. You would guess that we can compute the weighted variance by analogy, and you would be wrong.
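The weighted-mean arithmetic above can be checked with a few lines of Python. This is only a sketch, and the function name is my own choice for illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

values = [0.4, 0.8]
plain = weighted_mean(values, [1, 1])         # equal weights: the ordinary mean
weighted = weighted_mean(values, [0.9, 0.1])  # the weights used in the text
print(round(plain, 2), round(weighted, 2))    # prints 0.6 0.44
```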

For example, the sample variance of {0.4,0.8} is given by [Wikipedia]:

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$

or in our example ((0.4-0.6)^2+(0.8-0.6)^2) / (2-1), which equals 0.08. But the weighted sample variance cannot be computed by simply adding the weights to the above formula, as in (0.9*(0.4-0.6)^2+0.1*(0.8-0.6)^2) / (2-1). The formula for the weighted variance is different [Wikipedia]:

$$ s_w^2 = \frac{V_1}{V_1^2 - V_2} \sum_{i=1}^{n} w_i (x_i - \mu^*)^2 $$

where \mu^* is the weighted mean, V1 is the sum of the weights and V2 is the sum of squared weights:

$$ V_1 = \sum_{i=1}^{n} w_i, \qquad V_2 = \sum_{i=1}^{n} w_i^2 $$

The next steps are straightforward: the weighted standard deviation is the square root of the above, and the weighted coefficient of variation is the weighted standard deviation divided by the weighted mean.
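The whole chain (weighted mean, weighted variance, weighted standard deviation, weighted coefficient of variation) can be sketched in Python as follows. This assumes the unbiased weighted variance is V1 / (V1^2 - V2) times the weighted sum of squared deviations from the weighted mean, with V1 and V2 as defined above. The function names are hypothetical, and this is not the R function mentioned later in this post.

```python
import math

def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def weighted_variance(values, weights):
    """Unbiased weighted variance: V1 / (V1^2 - V2) * sum of w_i * (x_i - mu)^2."""
    v1 = sum(weights)                 # V1: sum of the weights
    v2 = sum(w * w for w in weights)  # V2: sum of the squared weights
    mu = weighted_mean(values, weights)
    ss = sum(w * (x - mu) ** 2 for x, w in zip(values, weights))
    return v1 / (v1 ** 2 - v2) * ss

def weighted_cv(values, weights):
    """Weighted standard deviation divided by the weighted mean."""
    sd = math.sqrt(weighted_variance(values, weights))
    return sd / weighted_mean(values, weights)

values, weights = [0.4, 0.8], [0.9, 0.1]
equal_var = weighted_variance(values, [1, 1])  # equal weights
w_var = weighted_variance(values, weights)
w_cv = weighted_cv(values, weights)
print(round(equal_var, 4), round(w_var, 4), round(w_cv, 4))
```

A useful sanity check on any implementation is that equal weights reproduce the ordinary sample variance with its n-1 denominator, as `equal_var` does here.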

Although there is nothing new here, I thought it would be a good idea to put it together because it appears to be causing some confusion. For example, in the latest issue of European Union Politics you can find the article ‘Measuring common standards and equal responsibility-sharing in EU asylum outcome data’ by a team of scientists from LSE. On page 74, you can read that:

The weighted variance [of the set p={0.38, 0.42} with weights W={0.50, 0.50}] equals 0.5(0.38-0.40)^2+0.5(0.42-0.40)^2 = 0.0004.

As explained above, this is correct only if the biased (population) weighted variance is meant, rather than the unbiased (sample) one. When the unbiased formula is used, the weighted variance turns out to be 0.0008. Gavin Simpson has provided a function for calculating the weighted variance in R, so you can try it for yourself.
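For readers without R at hand, the two numbers can be reproduced with a short Python sketch. It assumes the biased version divides the weighted sum of squared deviations by the sum of the weights, and the unbiased version multiplies it by V1 / (V1^2 - V2). The function name is mine, and this is not a port of the R function mentioned above.

```python
def weighted_variances(values, weights):
    """Return (biased, unbiased) weighted variances of the data."""
    v1 = sum(weights)                 # sum of the weights
    v2 = sum(w * w for w in weights)  # sum of the squared weights
    mu = sum(w * x for x, w in zip(values, weights)) / v1
    ss = sum(w * (x - mu) ** 2 for x, w in zip(values, weights))
    biased = ss / v1
    unbiased = ss * v1 / (v1 ** 2 - v2)
    return biased, unbiased

biased, unbiased = weighted_variances([0.38, 0.42], [0.5, 0.5])
print(round(biased, 6), round(unbiased, 6))  # prints 0.0004 0.0008
```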

P.S. To be clear, the weighted variance issue is not central to the argument of the article cited above, but it is significant because the authors discuss at length the methodology for estimating variability in data and introduce the so-called Coffey-Feingold-Bromberg measure of variability, which they deem more appropriate for proportions.

P.P.S. On the internet, there is yet more confusion: for example, this document (which ranks high in the Google results) has yet another formula, shown in a slightly different form here as well.

Disclaimer. I have a forthcoming paper on the same topic (asylum policy) as the EUP article mentioned above.

When ‘just looking’ beats regression

In a draft paper currently under review I argue that the institutionalization of a common EU asylum policy has not led to a race to the bottom with respect to asylum applications, refugee status grants, and some other indicators. The graph below traces the number of asylum applications lodged in 29 European countries since 1997:

My conclusion is that there is no evidence in support of the theoretical expectation of a race to the bottom (an ever-declining rate of registered applications). One of the reviewers insists that I use a regression model to quantify the change and to estimate the uncertainty of the conclusion. While in general I couldn’t agree more that being open about the uncertainty of your inferences is a fundamental part of scientific practice, in this particular case I refused to fit a regression model and calculate standard errors or confidence intervals. Why?

In my opinion, just looking at the graph is enough to show that there is no race to the bottom: application rates have gone down and then up again, while the institutionalization of a common EU policy has only strengthened over the last decade. Calculating standard errors would be superficial because it is hard to think of the yearly averages as samples from some underlying population. Estimating a regression to quantify the EU effect would only work if the model were good enough to capture the fundamental dynamics of asylum applications before isolating the EU effect, and there is no such model. But most importantly, I just didn’t feel that a regression coefficient or a standard error would improve on the inference you get by just looking at the graph: applications have been all over the place since the late 1990s and you don’t need a confidence interval to see that! But the issue has bugged me ever since. After all, the reviewer was just asking for what would be the standard way of approaching an empirical question.

Then two days ago I read this blog post by William M. Briggs who (unlike myself) is a professional statistician. After showing that by manipulating the start and end points of a time series you can get any regression coefficient that you want even with randomly generated data, he concludes ‘The lesson is, of course, that straight lines should not be fit to time series.’  But here is the real punch line:
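The point about start and end dates is easy to reproduce. The sketch below is my own illustration, not Briggs’s code: instead of randomly generated data it uses a made-up sine wave, so that the result is the same on every run, and it fits an ordinary least-squares line to two different windows of the same series.

```python
import math

def ols_slope(ys):
    """Least-squares slope of ys regressed on the time index 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

# Hypothetical oscillating series: a sine wave with a period of 20 time steps.
series = [math.sin(2 * math.pi * t / 20) for t in range(41)]

rising = ols_slope(series[0:6])    # window that covers only an upswing
falling = ols_slope(series[5:16])  # window that covers only a downswing
print(rising > 0, falling < 0)     # prints True True
```

The same series yields a positive or a negative ‘trend’ depending only on which window is chosen, which is exactly why a fitted line adds little to what the plot already shows.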

If we want to know if there has been a change from the start to the end dates, all we have to do is look! I’m tempted to add a dozen more exclamation points to that sentence, it is that important. We do not have to model what we can see. No statistical test is needed to say whether the data has changed. We can just look.

But what about hypothesis testing? We need a statistical test to refute a hypothesis, right? Let me quote some more:

It is true that you can look at the data and ponder a “null hypothesis” of “no change” and then fit a model to kill off this straw man. But why? If the model you fit is any good, it will be able to skillfully predict new data…. And if it’s a bad model, why clutter up the picture with spurious, misleading lines?

In the inimitable prose of Prof. Briggs, ‘if you want to claim that the data has gone up, down, did a swirl, or any other damn thing, just look at it!’